Posterior probabilities based on parameter estimates — posterior-probabilities • aphylo

The function predict_pre_order uses a pre-order algorithm to compute the posterior probabilities, whereas the predict_brute_force computes posterior probabilities generating all possible cases.

# S3 method for aphylo_estimates
predict(
  object,
  which.tree = NULL,
  ids = NULL,
  newdata = NULL,
  params = stats::coef(object),
  loo = TRUE,
  nsamples = 1L,
  centiles = c(0.025, 0.5, 0.975),
  cl = NULL,
  ...
)

predict_pre_order(x, ...)

# S3 method for aphylo_estimates
predict_pre_order(
  x,
  params = stats::coef(x),
  which.tree = 1:Ntrees(x),
  ids = lapply(Ntip(x)[which.tree], seq_len),
  loo = TRUE,
  nsamples = 1L,
  centiles = c(0.025, 0.5, 0.975),
  ncores = 1L,
  cl = NULL,
  ...
)

# S3 method for aphylo
predict_pre_order(x, psi, mu_d, mu_s, eta, Pi, ...)

predict_brute_force(atree, psi, mu_d, mu_s, Pi, force = FALSE)

Arguments

which.tree: Integer scalar. Which tree to include in the prediction.
ids: Integer vector. Ids (positions) of the nodes that need to be predicted (see details.)
newdata: (optional) An aphylo object.
params: A numeric vector with the corresponding parameters.
loo: Logical scalar. When loo = TRUE, predictions are preformed similar to what a leave-one-out cross-validation scheme would be done (see predict.aphylo_estimates).
nsamples: Integer scalar. When greater than one, the prediction is done using a random sample from the MCMC chain. This only works if the model was fitted using MCMC, of course.
centiles: Used together with nsamples, this indicates the centiles to be computed from the distribution of outcomes.
...: Ignored.
ncores, cl: Passed to parallel::makeCluster().
psi: Numeric vector of length 2. Misclasification probabilities. (see LogLike).
mu_d, mu_s: Numeric vector of length 2. Gain/loss probabilities (see LogLike).
eta: Numeric vector of length 2. Annotation bias probabilities (see LogLike).
Pi: Numeric scalar. Root node probability of having the function (see LogLike).
atree, x, object: Either a tree of class aphylo or an object of class aphylo_estimates
force: Logical scalar. When TRUE it will try to compute the brute-force probabilities for trees with more than 7 nodes.

Value

In the case of the predict method, a P column numeric matrix with values between \([0,1]\) (probabilities).

Details

The function predict_brute_force is only intended for testing. For predictions after estimating the model, see predict.aphylo_estimates.

In the case of the parameter loo (leave-one-out), while making tip-level predictions, at each leaf the algorithm will drop annotations regarding that leaf, making its prediction using all the available information except the one include in such leaf.

The predict_brute_force function makes the (obviously) brute force calculation of the probabilities. It will perform It returns a list with the following:

Pr The conditional probabilities of observing a tree given a particular state of the leave nodes. The size is given by (2^nnodes x 2^nleaves), each entry is read as "The probability of observing scenario i (row) given that the leaves have state j (colum)." The scenarios are specified in the row matrix returned by the function.
row Indicates the state of each node (columns) per scenario (row).
col Indicates the state of each leaf node (columns) per potential leaf scenario.

Prediction on specific nodes

The ids parameter indicates for which nodes, both internal and tips, the predictions should be made. By default, the function will only make predictions on the leaf nodes.

The ids follow ape's convention, this is, 1:Ntips(x) are the leaf nodes, Ntips(x) + 1L is the root node, and everything else are the interior nodes.

Although the prediction algorithm is fast, indicating only a subset of the nodes could make a difference when loo = TRUE and/or nsamples > 1 (calculating a Credible/Confidence Interval.)

In the case of multiAphylo, ids should be passed as a list of length Ntrees(x), with each element indicating the nodes. Otherwise, ids are passed as an integer vector.

Examples

# Single tree ---------------------------------------------------------------
set.seed(123)
atree <- raphylo(10)

# Fitting the model with MLE
ans <- aphylo_mle(atree ~ psi + mu_d + mu_s + Pi)

# Prediction on leaves
predict(ans)
#>         fun0000
#> 1  9.999300e-01
#> 2  1.666676e-05
#> 3  1.500028e-05
#> 4  9.999700e-01
#> 5  1.666676e-05
#> 6  2.500090e-01
#> 7  4.999991e-01
#> 8  9.999300e-01
#> 9  1.500028e-05
#> 10 9.999700e-01
#> 11           NA
#> 12           NA
#> 13           NA
#> 14           NA
#> 15           NA
#> 16           NA
#> 17           NA
#> 18           NA
#> 19           NA

# Prediction on all nodes (including root and interior)
predict(ans, ids = 1:Nnode(ans, internal.only = FALSE))
#>         fun0000
#> 1  9.999300e-01
#> 2  1.666676e-05
#> 3  1.500028e-05
#> 4  9.999700e-01
#> 5  1.666676e-05
#> 6  2.500090e-01
#> 7  4.999991e-01
#> 8  9.999300e-01
#> 9  1.500028e-05
#> 10 9.999700e-01
#> 11 9.999941e-01
#> 12 9.999807e-01
#> 13 9.999896e-01
#> 14 1.333401e-10
#> 15 2.500205e-06
#> 16 1.000016e-05
#> 17 1.000000e+00
#> 18 1.000000e+00
#> 19 1.000092e-10

# Multiple trees (multiAphylo) ----------------------------------------------
atree <- c(raphylo(10), raphylo(5))

# Fitting the model with MLE
ans <- aphylo_mle(atree ~ psi + mu_d + mu_s + Pi)

# Prediction on leaves
predict(ans)
#> [[1]]
#>         fun0000
#> 1  1.748979e-05
#> 2  5.801845e-01
#> 3  5.263530e-01
#> 4  9.999810e-01
#> 5  1.000050e-05
#> 6  1.652290e-05
#> 7  1.000050e-05
#> 8  9.999900e-01
#> 9  9.999900e-01
#> 10 1.000023e-05
#> 11           NA
#> 12           NA
#> 13           NA
#> 14           NA
#> 15           NA
#> 16           NA
#> 17           NA
#> 18           NA
#> 19           NA
#> 
#> [[2]]
#>       fun0000
#> 1 1.00001e-05
#> 2 1.00001e-05
#> 3 1.00002e-05
#> 4 1.00002e-05
#> 5 1.00001e-05
#> 6          NA
#> 7          NA
#> 8          NA
#> 9          NA
#> 

# Predicting only interior nodes
predict(ans, ids = list(11:19, 6:9))
#> [[1]]
#>         fun0000
#> 1            NA
#> 2            NA
#> 3            NA
#> 4            NA
#> 5            NA
#> 6            NA
#> 7            NA
#> 8            NA
#> 9            NA
#> 10           NA
#> 11 4.609290e-15
#> 12 1.304628e-10
#> 13 6.522976e-06
#> 14 5.307827e-01
#> 15 5.307823e-01
#> 16 1.498014e-10
#> 17 9.991994e-15
#> 18 1.000000e+00
#> 19 1.000000e+00
#> 
#> [[2]]
#>       fun0000
#> 1          NA
#> 2          NA
#> 3          NA
#> 4          NA
#> 5          NA
#> 6 2.00010e-15
#> 7 2.00014e-15
#> 8 2.00018e-15
#> 9 4.00020e-15
#>