R/accuracy_sifter.R
accuracy_sifter.Rd
Uses SIFTER's 2011 definition of accuracy, where a protein is tagged as accurately predicted if its highest-ranked prediction matches one of its observed annotations.
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", ...)
# S3 method for aphylo_estimates
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", ...)
# S3 method for default
accuracy_sifter(pred, lab, tol = 1e-10, highlight = "", nine_na = TRUE, ...)
pred: A matrix of predictions, or an aphylo_estimates object.
lab: A matrix of labels (0, 1, NA, or 9 if nine_na = TRUE).
tol: Numeric scalar. Predictions within tol of the maximum score will be tagged as the prediction made by the model (see details).
highlight: Pattern passed to sprintf used to highlight predicted functions that match the observed ones.
...: Further arguments passed to the method. In the case of aphylo_estimates, the arguments are passed to predict.aphylo_estimates().
nine_na: Treat 9 as NA.
A data frame with Ntip() rows and four variables. The variables are:
Gene: Label of the gene
Predicted: The assigned gene function.
Observed: The true set of gene functions.
Accuracy: The measurement of accuracy according to Engelhardt et al. (2011).
The analysis is done at the protein level. For each protein, the function
compares the YES annotations of that protein with those predicted by the model.
The algorithm selects as predicted annotations those whose scores are within
tol of the maximum score.
This algorithm doesn't take into account NOT annotations (0s), which are excluded from the analysis.
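The per-protein rule described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: sifter_accuracy_sketch is a hypothetical helper, the toy pred and lab matrices are made up, and returning NA for a protein with no YES annotations is an assumption of this sketch.

```r
# Hypothetical helper (not part of aphylo): one accuracy value per protein.
sifter_accuracy_sketch <- function(pred, lab, tol = 1e-10, nine_na = TRUE) {
  # Optionally treat 9 as a missing label
  if (nine_na) lab[which(lab == 9)] <- NA

  vapply(seq_len(nrow(pred)), function(i) {
    # Predicted functions: scores within `tol` of the row maximum
    top <- which(abs(pred[i, ] - max(pred[i, ])) <= tol)
    # Observed functions: YES annotations only; 0s and NAs are ignored
    yes <- which(lab[i, ] == 1)
    # Assumption of this sketch: no YES annotations means no score
    if (length(yes) == 0) return(NA_real_)
    as.numeric(length(intersect(top, yes)) > 0)
  }, numeric(1))
}

# Toy data: 3 proteins (rows) by 3 functions (columns)
pred <- matrix(c(0.9, 0.2, 0.1,
                 0.3, 0.3, 0.8,
                 0.6, 0.6, 0.1), nrow = 3, byrow = TRUE)
lab <- matrix(c(1, 0, 9,
                1, 1, 0,
                0, 1, NA), nrow = 3, byrow = TRUE)

acc <- sifter_accuracy_sketch(pred, lab)
acc
```

In the third row two functions tie for the maximum score, so both count as predictions, and the protein is scored as accurate because one of them is a YES annotation.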
When highlight = "", no highlight is done.
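As a rough illustration of what a highlight pattern might look like, the snippet below wraps matching function names using sprintf. It assumes the pattern contains a single %s placeholder that receives the function name; how the package applies the pattern internally is not stated here, and the vectors are invented for the example.

```r
# Assumed pattern: a single %s placeholder that receives the function name
highlight <- "*%s*"
predicted <- c("fun0001", "fun0002")
observed  <- c("fun0000", "fun0001")

# Wrap only the predicted functions that also appear among the observed ones
hit    <- predicted %in% observed
marked <- ifelse(hit, sprintf(highlight, predicted), predicted)
out    <- paste(marked, collapse = ",")
out
```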
set.seed(81231)
atree <- raphylo(50, psi = c(0,0), P = 3)
ans <- aphylo_mcmc(atree ~ mu_d + mu_s + Pi)
#> Warning: While using multiple chains, a single initial point has been passed via `initial`: c(0.9, 0.5, 0.1, 0.05, 0.5). The values will be recycled. Ideally you would want to start each chain from different locations.
#> Convergence has been reached with 10000 steps. Gelman-Rubin's R: 1.0134. (500 final count of samples).
accuracy_sifter(ans)
#> Gene Predicted Observed Accuracy
#> 1 1 fun0001 fun0000,fun0001,fun0002 1
#> 2 2 fun0001 fun0000,fun0001 1
#> 3 3 fun0002 fun0002 1
#> 4 4 fun0002 fun0001,fun0002 1
#> 5 5 fun0002 fun0002 1
#> 6 6 fun0002 fun0002 1
#> 7 7 fun0002 fun0001,fun0002 1
#> 8 8 fun0001 fun0000,fun0001 1
#> 9 9 fun0002 fun0000,fun0002 1
#> 10 10 fun0001 fun0001,fun0002 1
#> 11 11 fun0002 fun0001,fun0002 1
#> 12 12 fun0002 fun0002 1
#> 13 13 fun0002 fun0002 1
#> 14 14 fun0002 fun0002 1
#> 15 15 fun0002 fun0002 1
#> 16 16 fun0002 fun0002 1
#> 17 17 fun0001 fun0000,fun0001 1
#> 18 18 fun0002 fun0002 1
#> 19 19 fun0002 fun0000,fun0002 1
#> 20 20 fun0001 fun0000,fun0001,fun0002 1
#> 21 21 fun0002 fun0002 1
#> 22 22 fun0001 fun0000,fun0001 1
#> 23 23 fun0002 fun0002 1
#> 24 24 fun0002 fun0001,fun0002 1
#> 25 25 fun0002 fun0001,fun0002 1
#> 26 26 fun0000 fun0000,fun0001 1
#> 27 27 fun0002 fun0000,fun0001,fun0002 1
#> 28 28 fun0002 fun0001,fun0002 1
#> 29 29 fun0002 fun0000,fun0001,fun0002 1
#> 30 30 fun0000 fun0000,fun0001,fun0002 1
#> 31 31 fun0002 fun0000,fun0001,fun0002 1
#> 32 32 fun0002 fun0002 1
#> 33 33 fun0002 fun0000,fun0002 1
#> 34 34 fun0001 fun0001 1
#> 35 35 fun0002 fun0002 1
#> 36 36 fun0002 fun0002 1
#> 37 37 fun0002 fun0002 1
#> 38 38 fun0002 fun0000,fun0002 1
#> 39 39 fun0002 fun0002 1
#> 40 40 fun0002 fun0002 1
#> 41 41 fun0001 fun0000,fun0002 0
#> 42 42 fun0001 fun0000,fun0001 1
#> 43 43 fun0002 fun0002 1
#> 44 44 fun0001 fun0001 1
#> 45 45 fun0000 fun0000,fun0001 1
#> 46 46 fun0001 fun0000,fun0001 1
#> 47 47 fun0002 fun0001,fun0002 1
#> 48 48 fun0002 fun0000,fun0001,fun0002 1
#> 49 49 fun0002 fun0001,fun0002 1
#> 50 50 fun0002 fun0002 1