Title: | Rapidly Estimates Phylogeny from Large Allele Frequency Data Using Root Distances Method |
---|---|
Description: | Rapidly estimates tree-topology from large allele frequency data using Root Distances Method, under a Brownian Motion Model. See Peng et al. (2021) <doi:10.1016/j.ympev.2021.107142>. |
Authors: | Arindam RoyChoudhury [aut, cre, cph], Jing Peng [aut], Ying Li [aut], Laura Kubatko [aut, ths] |
Maintainer: | Arindam RoyChoudhury <[email protected]> |
License: | AGPL-3 |
Version: | 0.1.2 |
Built: | 2025-02-21 03:29:00 UTC |
Source: | https://github.com/arindamroychoudhury/rapidphylo |
The dataset “Human_Allele_Frequencies” is a 5 × 31,000 matrix of allele frequencies at 31,000 single nucleotide polymorphisms on Chromosomes 1-10 in 5 human populations. The last population, “San”, is intended to be used as the outgroup. The allele frequencies were compiled from the ALFRED database at Yale University. The analysis of this dataset was published in Peng et al. (2021).
Human_Allele_Frequencies
An object of class matrix (inherits from array) with 5 rows and 31000 columns.
RDM() estimates a tree-topology from allele frequencies.
RDM( mat_allele_freq, outgroup, use = c("complete.obs", "pairwise.complete.obs", "everything", "all.obs", "na.or.complete") )
mat_allele_freq |
A |
outgroup |
A variable that can be either the population name or a numerical row number of the outgroup data. |
use |
Specify which part of the data is used to compute the covariance matrix. The options are "complete.obs", "pairwise.complete.obs", "everything", "all.obs", and "na.or.complete". |
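
The option names for use match those of the use argument of stats::cov() in base R. The sketch below illustrates what three of them mean on a small made-up matrix with one missing value. It assumes, without confirmation from the package source, that RDM() handles use the same way cov() does; the matrix x and its column names are hypothetical, with populations placed in columns only because cov() treats columns as variables.

```r
# Toy transformed data: 4 loci (rows) by 3 populations (columns).
# The values are invented; one entry is missing.
x <- cbind(
  popA = c(0.2, -1.1, 0.7, 1.5),
  popB = c(0.1, -0.9, NA, 1.2),
  popC = c(-0.3, 0.4, 0.6, 0.9)
)

# "complete.obs": drop every row that contains an NA, then compute
# all covariances from the remaining rows (rows 1, 2 and 4 here).
cov_complete <- cov(x, use = "complete.obs")

# "pairwise.complete.obs": for each pair of columns, use every row in
# which both columns are observed, so popA/popC still uses all 4 rows.
cov_pairwise <- cov(x, use = "pairwise.complete.obs")

# "everything": NAs propagate, so any entry involving popB is NA.
cov_everything <- cov(x, use = "everything")

print(cov_complete)
print(cov_pairwise)
print(cov_everything)
```

With many loci and scattered missing values, "pairwise.complete.obs" keeps more data per population pair than "complete.obs", at the cost of each covariance being computed from a different subset of loci.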
The input matrix is the observed values of the frequencies at tips.
A logit transformation is performed on the allele frequency data, so that the observed values are approximately normal. (The logit transformation of $r$ refers to $\log\left(\frac{r}{1-r}\right)$.) The transformed matrix is converted into a data frame for further analyses.
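
The transformation described above can be sketched in base R as follows. This is an illustration only, not the package's internal code: the matrix freq and its row and column names are hypothetical, and how RDM() treats frequencies of exactly 0 or 1 is not stated here.

```r
# Hypothetical 2 x 4 allele-frequency matrix
# (populations in rows, loci in columns, as in Human_Allele_Frequencies).
freq <- matrix(
  c(0.10, 0.50, 0.80, 0.25,
    0.30, 0.60, 0.90, 0.75),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("pop1", "pop2"), paste0("snp", 1:4))
)

# Logit transform, applied elementwise: log(r / (1 - r)).
logit_freq <- log(freq / (1 - freq))

# stats::qlogis() computes the same quantity.
logit_builtin <- qlogis(freq)

# Frequencies of exactly 0 or 1 map to -Inf and Inf on the logit scale.
logit_edge <- qlogis(c(0, 1))

# Convert the transformed matrix to a data frame.
logit_df <- as.data.frame(logit_freq)
print(logit_df)
```

A frequency of 0.5 maps to 0, and frequencies near 0 or 1 are stretched toward large negative or positive values, which is what removes the bounded range of the raw frequencies.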
An estimated tree-topology in Newick format.
Peng J, Rajeevan H, Kubatko L, and RoyChoudhury A (2021). A fast likelihood approach for estimation of large phylogenies from continuous trait data. Molecular Phylogenetics and Evolution 161: 107142.
# A dataset "Human_Allele_Frequencies" is loaded with the package;
# it has allele frequencies in 31,000 sites for
# 4 human populations and one outgroup human population.

# check data dimension
dim(Human_Allele_Frequencies)

# run RDM function
rd_tre <- RDM(Human_Allele_Frequencies, outgroup = "San", use = "pairwise.complete.obs")

# result visualization
plot(rd_tre, use.edge.length = FALSE, cex = 0.5)