Permutations and computational power: A molecular cascade analysis to approach big data in psychiatry - 08/07/17
Abstract
Over the last few years, we have conducted a number of molecular pathway analyses on the genetic samples provided by the NIMH. The molecular pathway approach accounts for the polygenic nature of most psychiatric disorders. Nevertheless, this approach has limits that demand statistical control and biostatistical knowledge: knowledge of gene function is limited, longer genes have a higher probability of harbouring variations significantly associated with the phenotype under analysis, and single variations can yield false positive associations. Permutations are a method for controlling false positive associations, but their implementation requires that a number of criteria be taken into account: 1) the same number of genes and the same number of variations as in the index pathway must be simulated, in order to limit the bias of selecting significantly longer or shorter genes; 2) a sufficient number of permuted pathways must be created (10E5 to 10E6, depending on computational resources), which demands high computational power; 3) the correct statistical thresholds must be identified and discussed; 4) some pathways might be over-represented, and the source of information must be constantly updated. We describe and discuss the tools for running a molecular pathway analysis (R Foundation for Statistical Computing, 2013) when interacting with a supercluster PC and the international bioinformatic datasets (Embase, NIMH and others), together with the critical steps of bioinformatics scripting (bash language).
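The matching requirement in criterion 1 and the empirical p-value behind criterion 2 can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the authors' R and bash pipeline. All names (`pathway_score`, `permutation_pvalue`, `variant_pvalues`) are hypothetical, the pathway statistic (fraction of variations with p below `alpha`) is an assumed stand-in, and the `tolerance` on the variation count is an assumption added because an exact match is rarely available when sampling random gene sets.

```python
import random


def n_variants(genes, variant_pvalues):
    """Total number of variations carried by a set of genes."""
    return sum(len(variant_pvalues[g]) for g in genes)


def pathway_score(genes, variant_pvalues, alpha=0.05):
    """Assumed statistic: fraction of variations with p-value below alpha.

    Assumes every gene in the set carries at least one variation.
    """
    pvals = [p for g in genes for p in variant_pvalues[g]]
    return sum(p < alpha for p in pvals) / len(pvals)


def permutation_pvalue(index_genes, variant_pvalues, n_perm=1000,
                       tolerance=0.1, alpha=0.05, seed=0, max_draws=1000):
    """Empirical p-value of the index pathway against matched random pathways.

    Each random pathway has the same number of genes as the index pathway
    and a total variation count within `tolerance` (relative) of it.
    """
    rng = random.Random(seed)
    all_genes = sorted(variant_pvalues)
    k = len(index_genes)
    target = n_variants(index_genes, variant_pvalues)
    observed = pathway_score(index_genes, variant_pvalues, alpha)

    hits = 0
    for _ in range(n_perm):
        # Redraw until the random gene set matches the variation count.
        for _ in range(max_draws):
            candidate = rng.sample(all_genes, k)
            size = n_variants(candidate, variant_pvalues)
            if abs(size - target) <= tolerance * target:
                break
        else:
            raise RuntimeError("no matched random pathway found; "
                               "consider a larger tolerance")
        if pathway_score(candidate, variant_pvalues, alpha) >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Here `variant_pvalues` is assumed to map each gene name to the list of association p-values of its variations. With `n_perm` permutations the smallest attainable p-value is 1 / (`n_perm` + 1), which is one reason a large number of permuted pathways, and hence considerable computing power, is needed before strict thresholds can be reached.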
Vol 41 - N° S
P. S55 - April 2017