Overlapping probabilities of top ranking gene lists, hypergeometric distribution, and stringency of gene selection criterion
- PMID: 17947148
- DOI: 10.1109/IEMBS.2006.260828
Abstract
When the same set of genes appears in the top-ranking gene lists of two different studies, it is often of interest to estimate the probability that this is a chance event. This overlapping probability is well known to follow the hypergeometric distribution. Usually, the lengths of the top-ranking gene lists are assumed to be fixed by a preset criterion, e.g., a p-value threshold for the t-test. We investigate how the overlapping probability changes with the gene selection criterion or, simply, with the length of the top-ranking gene lists. We conclude that the overlapping probability is indeed a function of the gene list length, and that its statistical significance should be quoted in the context of the gene selection criterion.
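The hypergeometric calculation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `overlap_pvalue`, the symbols `N` (genes measured in both studies), `n1` and `n2` (lengths of the two top-ranking lists), and `k` (observed overlap), and all numbers in the usage loop are assumptions made for the example. It assumes both lists are drawn from the same pool of `N` genes.

```python
from math import comb


def overlap_pvalue(N, n1, n2, k):
    """Chance probability of sharing at least k genes between two lists.

    Assumes list 1 (n1 genes) is fixed and list 2 (n2 genes) is a uniform
    random draw without replacement from the same N genes, so the overlap
    X follows a hypergeometric distribution. Returns P(X >= k).
    """
    upper = min(n1, n2)            # largest possible overlap
    if k > upper:
        return 0.0
    lower = max(0, n1 + n2 - N)    # smallest possible overlap
    start = max(k, lower)
    total = comb(N, n2)            # number of ways to pick list 2
    tail = sum(comb(n1, i) * comb(N - n1, n2 - i)
               for i in range(start, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical setting: 10,000 genes and an observed overlap of 10 genes,
    # evaluated for several (equal) list lengths.
    N, k = 10_000, 10
    for length in (50, 100, 200, 500):
        p = overlap_pvalue(N, length, length, k)
        print(f"list length {length:4d}: P(overlap >= {k}) = {p:.3e}")
```

For a fixed overlap `k`, the tail probability grows as the lists get longer, which is why a reported significance is only meaningful together with the selection criterion that set the list lengths.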