DocumentCode
761590
Title
Nonparametric Estimation of the Number of Unique Sequences in Biological Samples
Author
Xu, Changjiang ; Xu, Luzhou ; Yu, Fahong ; Tan, Weihong ; Moroz, Leonid L. ; Li, Jian
Author_Institution
Dept. of Telecommun. Eng., Nanjing Univ. of Posts & Telecommun.
Volume
54
Issue
10
fYear
2006
Firstpage
3759
Lastpage
3767
Abstract
Large-scale determination of uniquely expressed genes (or mRNAs) in specific cells and tissues is a challenging problem in computational and functional genomics. We consider nonparametric approaches for estimating the number of unique, nonredundant sequences in biological samples. By introducing the moments of species' abundance in a population, we analyze the relative abundance of species in the population and present a lower bound estimator and a so-called medial estimator for the number of distinct species in the population. The lower bound estimate is applicable to populations with small coefficients of variation (CV). The medial estimator works well for populations with relatively large CV, especially gene expression data. Simulation analysis shows that the medial estimator performs better than existing methods. Finally, we apply our nonparametric approaches to estimate the number of expressed mRNAs in a normal colon epithelial tissue as well as unique clones in an amplified cDNA sample prepared from the CNS of the sea-slug Aplysia.
Keywords
DNA; biological tissues; genetics; sequences; statistical analysis; amplified cDNA sample; biological samples; coefficients of variation; computational genomics; functional genomics; gene expression data; lower bound estimator; mRNA; medial estimator; nonparametric estimation; normal colon epithelial tissue; sea-slug Aplysia; species abundance; unique sequences; uniquely expressed genes; Analytical models; Bioinformatics; Biology computing; Cloning; Colon; Data analysis; Gene expression; Genomics; Large-scale systems; Performance analysis; Aplysia; expressed sequence tags; genomics; nonparametric estimation; relative abundance of species; transcriptome
fLanguage
English
Journal_Title
Signal Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1053-587X
Type
jour
DOI
10.1109/TSP.2006.880211
Filename
1703845