Computes FDR estimates and confidence intervals for a sequence of potential significance thresholds.
fdrTbl(obs_vec, perm_list, pname, ntests, lowerbound, upperbound, incr = 0.1, cl = 0.95, c1 = NA)
| Argument | Description |
|---|---|
| obs_vec | observed vector of p-values. |
| perm_list | list of dataframes, each of which includes a column of permutation p-values (or statistics). The length of the list equals the number of permutations. |
| pname | name of column in each list component dataframe that includes p-values (or statistics). |
| ntests | total number of observed tests, which is usually the same as the length of obs_vec and the number of rows in each perm_list dataframe. However, this may not be the case if results were filtered by a p-value threshold or statistic threshold. If filtering was conducted then lowerbound must be greater (more extreme) than the filtering criterion. |
| lowerbound | lower end of the -log10(p-value) range over which FDR is computed for a sequence of thresholds. |
| upperbound | upper end of the -log10(p-value) range over which FDR is computed for a sequence of thresholds. |
| incr | value by which to increment the sequence from lowerbound to upperbound on a -log10(p-value) scale. Default is 0.1. |
| cl | confidence level (default is .95). |
| c1 | overdispersion parameter. If this parameter is not specified (default initial value is NA), then the parameter is estimated from the data. If all tests are known to be independent, then this parameter should be set to 1. |
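
To make the roles of lowerbound, upperbound, and incr concrete, the sketch below builds a threshold grid with base R's `seq()` and converts it back to the p-value scale. It assumes the grid is a plain arithmetic sequence on the -log10(p-value) scale, as the argument descriptions suggest. The variable names `thresholds` and `p_cutoffs` are illustrative and not taken from the package source.

```r
# Illustrative values matching the example call fdrTbl(..., 1, 2)
lowerbound <- 1
upperbound <- 2
incr <- 0.1

# Assumed grid: an arithmetic sequence on the -log10(p-value) scale
thresholds <- seq(lowerbound, upperbound, by = incr)

# Equivalent cutoffs on the p-value scale (a larger threshold means a smaller p-value)
p_cutoffs <- 10^(-thresholds)

data.frame(threshold = thresholds, p_cutoff = p_cutoffs)
```

With these values the grid runs from a p-value cutoff of 0.1 down to 0.01, which is why a filtering criterion, if one was applied, has to be less extreme than lowerbound.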
A dataframe is returned where rows correspond to p-value thresholds in the sequence from lowerbound to upperbound. The columns are:

| Column | Description |
|---|---|
| threshold | p-value threshold chosen to define positive tests, on the -log10(p-value) scale |
| fdr | estimated FDR at the chosen p-value threshold |
| ll | estimated lower confidence bound for the FDR estimate (95% with the default cl) |
| ul | estimated upper confidence bound for the FDR estimate (95% with the default cl) |
| pi0 | estimated proportion of true null hypotheses |
| odp | estimated over-dispersion parameter |
| S | observed number of positive tests |
| Sp | total number of positive tests summed across all permuted result sets |
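
The relationship between S, Sp, pi0, and fdr can be checked by hand. The sketch below is a hypothetical helper, not the package's implementation. It assumes that pi0 is estimated as (ntests - S) / (ntests - Sp/nperm), that fdr is pi0 times the mean number of permuted positives divided by S, and that both are capped at 1. These assumptions reproduce the fdr and pi0 columns of the example output on this page, but the confidence bounds and odp are not covered.

```r
# Hypothetical helper (not part of the package): point estimates only.
# Assumes S > 0; S = 0 would divide by zero and is not handled here.
fdr_point <- function(S, Sp, nperm, ntests) {
  mean_perm_pos <- Sp / nperm                       # average positives per permutation
  pi0 <- min(1, (ntests - S) / (ntests - mean_perm_pos))
  fdr <- min(1, pi0 * mean_perm_pos / S)
  c(fdr = fdr, pi0 = pi0)
}

# First row of the example output: S = 14, Sp = 99, 10 permutations, 100 tests
est <- fdr_point(S = 14, Sp = 99, nperm = 10, ntests = 100)
round(est, 7)
```

For the first row this gives values that agree with the reported fdr of 0.6749643 and pi0 of 0.9544950. Rows where the permuted positives exceed the observed ones, such as S = 1 with Sp = 22, hit the cap and return 1 for both.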
fdrTbl calls fdr_od. Output from fdrTbl() can be used for FDRplot() input.
Millstein J, Volfson D. 2013. Computationally efficient permutation-based confidence interval estimation for tail-area FDR. Frontiers in Genetics | Statistical Genetics and Methodology 4(179):1-11.
    library(fdrci)

    nrow_ = 100
    ncol_ = 100
    X = as.data.frame(matrix(rnorm(nrow_*ncol_), nrow=nrow_, ncol=ncol_))
    Y = as.data.frame(matrix(rnorm(nrow_*ncol_), nrow=nrow_, ncol=ncol_))
    nperm = 10

    ## One correlation test per column pair; returns a dataframe of p-values
    myanalysis = function(X, Y){
      ntests = ncol(X)
      rslts = as.data.frame(matrix(NA, nrow=ntests, ncol=2))
      names(rslts) = c("ID", "pvalue")
      rslts[,"ID"] = 1:ntests
      for(i in 1:ntests){
        fit = cor.test(X[,i], Y[,i], na.action="na.exclude",
                       alternative="two.sided", method="pearson")
        rslts[i,"pvalue"] = fit$p.value
      }
      return(rslts)
    } # End myanalysis

    ## Generate observed results
    obs = myanalysis(X, Y)

    ## Generate permuted results
    perml = vector('list', nperm)
    for(p_ in 1:nperm){
      X1 = X[order(runif(nrow_)),]  # shuffle the rows of X, so the count is nrow_
      perml[[p_]] = myanalysis(X1, Y)
    }

    ## FDR results table (values vary between runs because the data are random)
    fdrTbl(obs$pvalue, perml, "pvalue", ncol_, 1, 2)
    #>    threshold       fdr         ll ul       pi0      odp  S Sp
    #> 1        1.0 0.6749643 0.36977410  1 0.9544950 1.000000 14 99
    #> 2        1.1 0.6277429 0.28199382  1 0.9590517 1.423478 11 72
    #> 3        1.2 0.6568453 0.30853828  1 0.9691161 1.064963  9 61
    #> 4        1.3 0.5969548 0.26116160  1 0.9717868 1.000000  7 43
    #> 5        1.4 0.5850622 0.24029943  1 0.9751037 1.000000  6 36
    #> 6        1.5 0.7167868 0.24679799  1 0.9886715 1.000000  4 29
    #> 7        1.6 1.0000000 0.29340855  1 1.0000000 1.000000  1 22
    #> 8        1.7 1.0000000 0.15549945  1 1.0000000 1.240835  1 15
    #> 9        1.8 1.0000000 0.18217647  1 1.0000000 1.000000  1 14
    #> 10       1.9 1.0000000 0.11853618  1 1.0000000 1.274554  1 12
    #> 11       2.0 0.8990918 0.07999064  1 0.9989909 1.357900  1  9