Simulate functions on a given tree

sim_fun_on_tree(
  tree,
  tip.type,
  node.type,
  psi,
  mu_d,
  mu_s,
  eta,
  Pi,
  P = 1L,
  informative = getOption("aphylo_informative", FALSE),
  maxtries = 20L
)

Arguments

tree

An object of class phylo

tip.type, node.type

Integer vectors with values 0 or 1, where 0 denotes a duplication node and 1 a speciation node. These are used in LogLike.

psi

Numeric vector of length 2. Misclassification probabilities (see LogLike).

mu_d, mu_s

Numeric vectors of length 2. Gain/loss probabilities (see LogLike).

eta

Numeric vector of length 2. Annotation bias probabilities (see LogLike).

Pi

Numeric scalar. Root node probability of having the function (see LogLike).

P

Integer scalar. Number of functions to simulate.

informative

Logical scalar. When TRUE, the function re-runs the simulation algorithm until both 0s and 1s appear in the leaf nodes of the tree. Defaults to the value of the option aphylo_informative, or FALSE if that option is not set.

maxtries

Integer scalar. If informative = TRUE, then the function will try at most maxtries times.

Value

A matrix of size length(offspring)*P with values 9, 0 and 1 indicating "no information", "no function" and "function", respectively.
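The coding of the returned values can be illustrated with a small base R sketch. The matrix below is made up for illustration and is not output from sim_fun_on_tree; the names ann, decoded and ann_na are hypothetical.

```r
# Toy annotation matrix (made-up values): 3 nodes by 2 functions
ann <- matrix(c(0L, 1L, 9L, 1L, 1L, 0L), nrow = 3L, ncol = 2L)

# Map the integer codes to the labels used in this page
decoded <- factor(
  ann,
  levels = c(0L, 1L, 9L),
  labels = c("no function", "function", "no information")
)
counts <- table(decoded)
print(counts)

# Treat 9 ("no information") as missing before computing summaries
ann_na <- replace(ann, ann == 9L, NA)
prop_function <- colMeans(ann_na, na.rm = TRUE)
print(prop_function)
```

Recoding 9 as NA before summarizing keeps the "no information" entries from being mistaken for a numeric value.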

Details

The simulation uses the model described in the vignette peeling_phylo.html.

The option informative was created so that, when needed, the function can be forced to simulate annotations while making sure (or at least trying maxtries times) that the leaves have both 0s and 1s. From what we've learned while conducting simulation studies, using this option may indirectly bias the data generating process.
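The retry behaviour described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: simulate_leaves() is a hypothetical stand-in that draws independent 0/1 leaf annotations, whereas the real function simulates along the tree, and sim_informative() and its return structure are likewise made up for this sketch.

```r
# Hypothetical stand-in for one simulation run: independent 0/1 draws
# for n leaves. The actual model is the one described in the vignette.
simulate_leaves <- function(n, p) {
  rbinom(n, size = 1L, prob = p)
}

# Re-run the simulation until both 0s and 1s appear among the leaves,
# trying at most maxtries times (assumed to be at least 1).
sim_informative <- function(n, p, maxtries = 20L) {
  stopifnot(maxtries >= 1L)
  for (i in seq_len(maxtries)) {
    leaves <- simulate_leaves(n, p)
    if (all(c(0L, 1L) %in% leaves)) {
      return(list(leaves = leaves, tries = i, informative = TRUE))
    }
  }
  # Gave up: return the last draw even though it has a single state
  list(leaves = leaves, tries = maxtries, informative = FALSE)
}

set.seed(1)
res <- sim_informative(n = 10L, p = 0.5)
```

In this sketch, draws with a single state are discarded, so the retained sample is conditioned on both states being present. That conditioning is one plausible way the bias mentioned above could arise.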

Examples

# Example 1 ----------------------------------------------------------------
# We need to simulate a tree
set.seed(1231)
newtree <- sim_tree(1e3)

# Preprocessing the data

# Simulating
ans <- sim_fun_on_tree(
  newtree,
  psi  = c(.01, .05),
  mu_d = c(.90, .80),
  mu_s = c(.1, .05),
  Pi   = .5,
  eta  = c(1, 1)
)

# Tabulating results
table(ans)
#> ans
#>    0    1 
#>  951 1048