Showing posts with label gene network. Show all posts

Thursday, April 15, 2021

GTEx tissue and cell specific gene networks in humans


Some of the best multi-view data are at the GTEx Portal.

https://www.gtexportal.org/home/datasets

 

These data sets, however, are quite complicated and need substantial analysis before they can be fed into deep learning models.

 

nano -w GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_reads.gct 

This file seems to show Ensembl gene ids and counts
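As a rough sketch of how such a file could be read without opening it in an editor, the snippet below parses a table in the standard GCT 1.2 layout (a version line, a dimension line, then a tab-separated table whose first two columns are Name and Description). It assumes the GTEx file follows that layout; the function name `read_gct` and the demo gene IDs and counts are made up for illustration.

```python
import csv
import io

def read_gct(handle):
    """Parse a GCT 1.2 text stream into (sample_ids, {gene_id: [counts]})."""
    handle.readline()                                  # version line, e.g. "#1.2"
    n_rows, n_cols = map(int, handle.readline().split()[:2])
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)                              # Name, Description, samples...
    samples = header[2:]
    counts = {}
    for row in reader:
        if not row:
            continue                                   # skip blank lines
        # Column 0 is the gene ID, column 1 the gene symbol, the rest are counts.
        counts[row[0]] = [int(float(x)) for x in row[2:]]
    if len(counts) != n_rows or len(samples) != n_cols:
        raise ValueError("dimension line does not match table contents")
    return samples, counts

# Tiny made-up table in the same layout (IDs and counts are illustrative only).
demo = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tS1\tS2\tS3\n"
    "ENSG_FAKE_1.1\tGENEA\t0\t12\t7\n"
    "ENSG_FAKE_2.1\tGENEB\t5\t0\t3\n"
)
samples, counts = read_gct(io.StringIO(demo))
print(samples)
print(counts["ENSG_FAKE_1.1"])
```

For the real file, the same function could be called on an open text-mode file handle. The full matrix is large, so reading it row by row like this is slow but keeps the logic easy to follow.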

There are also genotype data, so we can infer how SNPs -> expression -> phenotypic changes.

The GTEx Consortium atlas of genetic regulatory effects across human tissues

https://www.biorxiv.org/content/10.1101/787903v1

ML papers using GTEx
* standard ML outperforms deep learning
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-3427-8 

Wednesday, March 17, 2021

temporal and spatial gene activities and diseases

 temporal and spatial issues of gene activities -> diseases

context dependent function, interactions 




Tuesday, March 16, 2021

CSHL network biology,

 

Effects of genetic perturbation on protein concentration; PPI from the STRING database.


Weiqun Li, George Washington Univ., HiCHub


D-Script

http://cb.csail.mit.edu/cb/dscript/


Wednesday, February 24, 2021

Similarity network fusion for aggregating data types on a genomic scale

 

I found that this is not a machine learning method. It seems to be a graph-based heuristic.

https://www.nature.com/articles/nmeth.2810.pdf?origin=ppub


Sunday, January 3, 2021

Friday, February 16, 2018

Monday, June 26, 2017

Yeast genetic map, thecellmap.org


http://thecellmap.org/costanzo2016/

Three zip files
-rw-r--r--@  1 hqin  staff   497M Jun 26 09:58 Raw genetic interaction datasets- Pair-wise interaction format.zip
-rw-r--r--@  1 hqin  staff    34M Jun 26 09:58 Raw genetic interaction datasets- Matrix format.zip
-rw-r--r--@  1 hqin  staff   147M Jun 26 09:59 Genetic interaction profile similarity matrices.zip

Expand to three folders
drwxr-xr-x@  7 hqin  staff   238B Dec  6  2016 Data File S1. Raw genetic interaction datasets: Pair-wise interaction format
drwxr-xr-x@ 18 hqin  staff   612B Dec  6  2016 Data File S2. Raw genetic interaction datasets: Matrix format
drwxr-xr-x@  5 hqin  staff   170B Oct 20  2016 Data File S3. Genetic interaction profile similarity matrices


The global interaction dataset is based on the construction and analysis of ~23 million double mutants which identified 550,000 negative and 350,000 positive genetic interactions and covers ~90% of all yeast genes as either array and/or query mutants. The global genetic interaction dataset includes three different genetic interaction maps. First, 3,589 nonessential deletion query mutant strains were screened against the deletion mutant array covering 3,892 nonessential genes to generate a nonessential x nonessential (NxN) network. Second, 1,162 TS query mutant strains representing 804 essential genes were also screened against the nonessential deletion mutant array to generate an essential x nonessential (ExN) network. Finally, 2,241 nonessential deletion mutant query strains and 1,108 TS query mutant strains, corresponding to 795 essential genes, were crossed to an array of 792 TS strains, spanning 561 unique essential genes, to generate an expanded ExN network and an essential x essential (ExE) network. The data can be downloaded from the links below. Note that we continue to map genetic interactions for remaining gene pairs not represented in this dataset and we will update the data and networks as new interactions are generated.


Correction: e should be epsilon.  abs(epsilon)>0.08 should be used for intermediate criteria. 
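A minimal sketch of applying this cutoff to a pair-wise interaction table is shown below. The |epsilon| > 0.08 threshold comes from the note above; the additional p-value < 0.05 requirement is my assumption about how the intermediate criteria are defined, and the gene names and scores are made up.

```python
def passes_intermediate(epsilon, p_value, eps_cutoff=0.08, p_cutoff=0.05):
    """Return True if an interaction score passes the intermediate criteria.

    eps_cutoff is the |epsilon| threshold from the note; p_cutoff is an
    assumed significance threshold, not taken from the note.
    """
    return p_value < p_cutoff and abs(epsilon) > eps_cutoff

# Made-up rows: (query gene, array gene, epsilon, p-value).
rows = [
    ("geneA", "geneB", -0.15, 0.001),   # strong negative, significant
    ("geneA", "geneC", 0.05, 0.001),    # |epsilon| too small
    ("geneB", "geneD", 0.20, 0.300),    # not significant
    ("geneC", "geneD", 0.09, 0.040),    # weak positive, significant
]
kept = [r for r in rows if passes_intermediate(r[2], r[3])]
for query, array, eps, p in kept:
    sign = "negative" if eps < 0 else "positive"
    print(query, array, sign)
```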


Reference
http://hongqinlab.blogspot.com/2013/06/notes-costanzo-sga-2009.html


20180131Wed
No self-interactions were found in the cellmap network (using the stringent criteria). Therefore, to prepare the networks for permutation, I ordered the genes in each pair alphabetically (i.e., both [A,B] and [B,A] become [A,B]) and then removed the redundant pairs (i.e., only one [A,B] was left).
However, after that I found 878,704 interactions overall (positive and negative combined), 540,396 negative interactions, and 353,117 positive interactions.
Now the problem is that neg + pos - all = 14,809, i.e., ~15k interactions are found in both the negative and the positive sets.
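The pair ordering, de-duplication, and overlap check described above can be sketched as follows. The gene names are toy placeholders and `canonical_pairs` is a hypothetical helper, not the code actually used for the analysis.

```python
def canonical_pairs(pairs):
    """Order each pair alphabetically and drop duplicates."""
    return {tuple(sorted(pair)) for pair in pairs}

# Toy interaction lists; [B,A] and [A,B] collapse to a single (A,B).
negative = canonical_pairs([("B", "A"), ("A", "B"), ("C", "D")])
positive = canonical_pairs([("A", "B"), ("E", "F")])

all_pairs = negative | positive          # union: every distinct interacting pair
in_both = negative & positive            # pairs scored both negative and positive

# Inclusion-exclusion: neg + pos - all equals the size of the overlap.
overlap_by_counts = len(negative) + len(positive) - len(all_pairs)
print(sorted(in_both), overlap_by_counts)
```

Pairs that land in both sets probably come from the same gene pair being measured more than once (for example as query-array and array-query, or with different alleles) with scores of opposite sign, so listing `in_both` is a quick way to inspect them.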

Thursday, February 16, 2017

controversies on driver node identification


In a PLOS ONE paper on driver nodes in cancer networks, a reader commented on different ways to identify driver nodes.

http://journals.plos.org/plosone/article/comment?id=10.1371/annotation/fa7b59e2-c5b0-4e34-b2bc-da5d4eef0ee5


Nacher J, Akutsu T (2012) Dominating scale-free networks with variable scaling exponent: heterogeneous networks are not difficult to control. New Journal of Physics 14.

There are more recent papers on controllability from outside the Liu and Barabasi group.





Friday, December 18, 2015

toread, fruit fly gene network on cell shape

https://medium.com/lifes-building-blocks/building-gene-networks-67069be4fcb0#.1gutmoyw0

Monday, December 8, 2014

Li14, BMC Medical Genomics, predict disease genes using weighted tissue-specific networks

Li et al. BMC Medical Genomics 2014, 7(Suppl 2):S4 Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation

Min Li, Jiayi Zhang, Qing Liu, Jianxin Wang, Fang-Xiang Wu

From IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2013)

 "Considering the fact that the majority of genetic disorders tend to manifest only in a single or a few tissues, we constructed tissue-specific networks (TSN) by integrating PIN and tissue-specific data." Qin: The observation is true, but it does not mean that disease genes are found only in tissues with clinical phenotypes. These tissues probably show phenotypes only because the disease-causing genes play limiting roles in them.

 "In this paper, we treated known aberrant methylation genes as seed nodes and set initial quantity value with the use of the seed set, which will enhance the importance of seed nodes in network and solve defects of initial PageRank algorithm. The aberrant methylation data related to specific diseases in PubMeth database [44] were used in this paper." 

Li14 used the PageRank method to predict disease genes, but seems to have modified it with centrality-related measures.
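To make the seed-node idea concrete, here is a minimal sketch of generic personalized PageRank on an undirected network, where the restart probability is concentrated on the seed genes. This is the textbook variant, not Li14's exact algorithm (their edge weights and centrality modifications are not reproduced), and the gene names, damping factor, and iteration count are illustrative assumptions.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Power-iteration PageRank that restarts only at the seed nodes."""
    # Build an undirected adjacency list from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)
    seed_nodes = [n for n in nodes if n in seeds]
    if not seed_nodes:
        raise ValueError("no seed node is present in the network")
    # The restart vector puts equal mass on every seed and zero elsewhere.
    restart = {n: (1.0 / len(seed_nodes) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for u in nodes:
            share = damping * rank[u] / len(neighbors[u])
            for v in neighbors[u]:
                new_rank[v] += share     # u passes its score evenly to neighbors
        rank = new_rank
    return rank

# Toy chain network A - B - C - D with gene A as the only seed.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
scores = personalized_pagerank(edges, seeds={"A"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 3))
```

Genes close to the seed accumulate more score than distant ones, which is the property a seed-based ranking relies on when proposing candidate disease genes.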

Li14 calculated precision, how?

[52] Culhane AC, Schröder MS, Sultana R, et al: GeneSigDB: a manually curated database and resource for analysis of gene expression signatures. Nucleic acids research 2012, 40(D1):1060-1066.
Qin: They used a known database to evaluate the predictions. The precision is 60-80%. Did the authors address over-fitting problems?

Tissue specific networks were constructed by removing unexpressed nodes.
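A minimal sketch of this node-removal approach, assuming a per-tissue expression value for each gene and a simple cutoff for "expressed". The function name, threshold, gene names, and values are hypothetical and are not taken from Li14.

```python
def tissue_specific_network(edges, expression, threshold=0.0):
    """Keep only the edges whose two endpoints are both expressed in the tissue."""
    expressed = {gene for gene, level in expression.items() if level > threshold}
    return [(u, v) for u, v in edges if u in expressed and v in expressed]

# Global protein interaction network (toy example).
pin = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD")]

# Made-up expression levels in one tissue; geneC is not expressed there,
# and geneD is missing from the table, so it is treated as unexpressed.
liver_expression = {"geneA": 12.5, "geneB": 3.1, "geneC": 0.0}

liver_network = tissue_specific_network(pin, liver_expression)
print(liver_network)
```

Removing a node also removes every edge attached to it, so a single unexpressed hub can disconnect a large part of the network in that tissue.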

References on removal method to generate tissue-specific networks
Waldman YY, Tuller T, Shlomi T, et al: Translation efficiency in humans: tissue specificity global optimization and differences between developmental stages. Nucleic Acids Research 2010, 38(9):2964-2974.

Bossi A, Lehner B: Tissue specificity and the human protein interaction network. Molecular Systems Biology 2009, 5(1):260.

Lopes TJ, Schaefer M, Shoemaker J, et al: Tissue-specific subnetworks and characteristics of publicly available human protein interaction databases. Bioinformatics 2011, 27(17):2414-2421.