Tuesday, August 26, 2014

GWAS meta analysis

Gilman et al., Neuron, 2011, pp. 898-907. NetBag on autism.
NetBag is a greedy approach. The clustering method starts with one or two genes in CNVs as ‘seeds’.

Gilman11 generated a weighted background human gene network for their study.
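
To make the seed-and-extend idea concrete, here is a minimal sketch of a greedy cluster search on a weighted network. It is not the NetBag scoring function, which these notes do not reproduce. The toy genes, the edge weights, the `greedy_cluster` name, and the score (sum of edge weights inside the cluster) are all assumptions for illustration.

```python
from itertools import combinations

def cluster_score(genes, weights):
    """Assumed score: sum of edge weights over all gene pairs in the cluster."""
    return sum(weights.get(frozenset(pair), 0.0)
               for pair in combinations(sorted(genes), 2))

def greedy_cluster(seed, candidates, weights, max_size):
    """Grow a cluster from `seed`, adding at each step the candidate gene
    that raises the score the most; stop when no gene improves the score."""
    cluster = set(seed)
    while len(cluster) < max_size:
        best_gene, best_gain = None, 0.0
        base = cluster_score(cluster, weights)
        for gene in sorted(candidates - cluster):
            gain = cluster_score(cluster | {gene}, weights) - base
            if gain > best_gain:
                best_gene, best_gain = gene, gain
        if best_gene is None:
            break
        cluster.add(best_gene)
    return cluster

# Hypothetical toy network: made-up genes and edge weights.
toy_weights = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.7,
    frozenset(("C", "D")): 0.1,
}
cnv_genes = {"A", "B", "C", "D", "E"}
result = greedy_cluster({"A"}, cnv_genes, toy_weights, max_size=3)
```

Being greedy, the search can settle on a locally good cluster that depends on the seed, which is one reason the resulting score needs to be checked against random networks.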

Gilman11 compared each cluster's raw p-value, called the local p-value, to the p-values obtained from random networks. The adjusted p-value is called the global p-value.
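
One common way to turn a local p-value into a global one is an empirical comparison against randomized runs. The sketch below assumes the global p-value is the fraction of random networks whose best local p-value is at least as small as the observed one; the exact procedure in Gilman11 may differ, and the function name and numbers are hypothetical.

```python
def global_p_value(observed_local_p, null_best_local_ps):
    """Empirical p-value: share of randomized runs whose best local p-value
    is <= the observed local p-value, with a +1 correction so the result
    is never exactly zero."""
    hits = sum(1 for p in null_best_local_ps if p <= observed_local_p)
    return (hits + 1) / (len(null_best_local_ps) + 1)

# Hypothetical best local p-values from nine random networks.
null_ps = [0.20, 0.03, 0.50, 0.008, 0.11, 0.07, 0.30, 0.04, 0.15]
global_p = global_p_value(0.01, null_ps)
```

With only nine randomizations the smallest attainable global p-value is 0.1, so a real analysis would need many more random networks.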



-----------------------------------------------------------------------------------------------------------------------

AIS13 categorizes pathway association methods into canonical and de novo pathway methods.

For de novo pathway discovery, an integer linear program (ILP) is used in Leiserson, Blokh, PLoS Comput Biol, Simultaneous identification of multiple driver pathways in cancer.

Another formulation is the Steiner tree problem, where one seeks the lowest-cost pathway that connects the associated genes. See Liu et al., BMC Syst Biol, 2012, Gene, pathway and network frameworks to identify epistatic interactions of single nucleotide polymorphisms derived from GWAS data.
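
The Steiner tree problem is NP-hard in general, so practical tools rely on heuristics or approximations. As a sketch only, the code below uses a simple shortest-path heuristic: start from one associated gene and repeatedly attach the nearest unconnected associated gene via its shortest path. This is an assumed stand-in, not the method of Liu et al.; the graph, gene names, and edge costs are made up.

```python
import heapq

def nearest_terminal_path(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree; return (cost, path)
    to the closest node in `targets`."""
    dist = {n: 0.0 for n in tree_nodes}
    prev = {}
    heap = [(0.0, n) for n in sorted(tree_nodes)]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node in targets:
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    raise ValueError("some terminals are not reachable")

def steiner_tree_heuristic(graph, terminals):
    """Greedy shortest-path heuristic; returns (edge set, total cost).
    The result is a valid connecting tree but not necessarily optimal."""
    terminals = sorted(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    total = 0.0
    while remaining:
        cost, path = nearest_terminal_path(graph, tree_nodes, remaining)
        for u, v in zip(path, path[1:]):
            tree_edges.add(frozenset((u, v)))
        tree_nodes.update(path)
        remaining -= set(path)
        total += cost
    return tree_edges, total

# Hypothetical undirected network: G1-G3 are associated genes, X is a linker.
toy_graph = {
    "G1": {"X": 1.0, "G2": 3.0},
    "G2": {"X": 1.0, "G1": 3.0, "G3": 3.0},
    "G3": {"X": 1.0, "G2": 3.0},
    "X": {"G1": 1.0, "G2": 1.0, "G3": 1.0},
}
edges, total_cost = steiner_tree_heuristic(toy_graph, {"G1", "G2", "G3"})
```

In this toy case the tree routes all three associated genes through the non-associated linker X, which is the kind of intermediate gene a Steiner formulation is meant to surface.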

-----------------------------------------------------------------------------------------------------------------------
LERR13 reviews methods that use protein-protein and protein-DNA networks to identify 'causal' genetic variants. (Their definition of 'causal' is a narrow one.)

LERR13 argues that GWAS mostly find SNPs that are in LD with the actual 'causal' gene. This problem is of less concern in CNV analysis. One solution is to use a network to 'rank' the genes in a haplotype known to be associated with the phenotype of interest or similar phenotypes. (This method is in the spirit of our recent CNV paper.)

The green square represents the 'known' 'causal' gene. So, this is largely a method based on traversal measures.
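
A minimal sketch of such a traversal-based ranking follows, assuming candidates in the associated haplotype are scored by their hop distance to a known 'causal' gene in an unweighted interaction network. The methods reviewed in LERR13 may use other measures (for example, random walks), and all gene names here are hypothetical.

```python
from collections import deque

def hop_distances(adjacency, sources):
    """Breadth-first search: fewest interaction hops from any source gene."""
    dist = {s: 0 for s in sources}
    queue = deque(sorted(sources))
    while queue:
        node = queue.popleft()
        for nbr in adjacency.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def rank_candidates(adjacency, known_causal, candidates):
    """Order candidate genes from closest to farthest from the known genes;
    unreachable genes go last, ties are broken alphabetically."""
    dist = hop_distances(adjacency, known_causal)
    return sorted(candidates, key=lambda g: (dist.get(g, float("inf")), g))

# Hypothetical network: K1 is the known gene, C1-C3 sit in the associated block.
toy_adjacency = {
    "K1": ["P"],
    "P": ["K1", "C2"],
    "C2": ["P", "Q"],
    "Q": ["C2", "C1"],
    "C1": ["Q"],
    "C3": [],
}
ranking = rank_candidates(toy_adjacency, {"K1"}, ["C1", "C2", "C3"])
```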

LERR13 argues that networks contribute to 'missing heritability'.

LERR13 seems to suggest that protein-DNA networks are better suited for expression QTL (eQTL).

LERR13 shows that OMIM is the source of 'causal' gene information for most network-based GWAS (Table 1). Only one paper uses GeneCards as an alternative source.

LERR13 cites several pathway enrichment analyses of GWAS. It argues that interactions are treated equally in these enrichment analyses. (This can be cited in our CNV replies.) The authors then show that several methods use weighted networks to identify network modules with an iterative 'seed and extend' method. (For comparison, our CNV paper did not use seeds explicitly, and so avoids some 'prejudice'.)

LERR13 also discussed subnetwork modules with mutation hotspots in cancer genomes.

----------------------------------------------------------------------------------------------------------

BGTF12: GWAS use meta-analysis of multiple data sets to reduce false positives and increase statistical power.

A major concern in GWAS meta-analysis is heterogeneity among the data sets, such as LD differences and genotyping chip differences. However, Lin and Zeng 2010 (Genet Epidemiol) show, using simulation studies, that heterogeneity is not a significant factor.

Combination across data sets is the frequentist approach; cumulative analysis of studies is the Bayesian approach.
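
As an illustration of the frequentist combination, here is a minimal sketch of a standard inverse-variance fixed-effect meta-analysis for one SNP, together with Cochran's Q and I² as heterogeneity measures. BGTF12 discusses several approaches, and this is only one textbook estimator; the per-study effect sizes and standard errors below are made up.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted pooling of per-study effect estimates.
    Returns pooled effect, its standard error, two-sided p-value,
    Cochran's Q, and I^2 (share of variation attributed to heterogeneity)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, p_value, q, i2

# Hypothetical log odds ratios and standard errors from three studies.
study_betas = [0.10, 0.20, 0.15]
study_ses = [0.05, 0.10, 0.05]
pooled, pooled_se, p_value, q, i2 = fixed_effect_meta(study_betas, study_ses)
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-square distribution with (number of studies - 1) degrees of freedom; a large Q or I² would argue for a random-effects model instead.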


In R, GWAS meta-analysis packages include metafor, rmeta, and catmap.

BGTF12 argues that GWAS data should be 'cleaned' and imputed before meta-analysis.

References:
[AIS13] Atias, Istrail, Sharan, 2013, Current Opinion in Genetics and Development. Pathway-based analysis of genomic variation data.

[LERR13] Leiserson, Eldridge, Ramachandran, Raphael, 2013, Current Opinion in Genetics and Development. Network analysis of GWAS data.

[BGTF12] Begum, Ghosh, Tseng, Feingold, 2012, NAR. Comprehensive literature review and statistical considerations for GWAS meta-analysis.
