Wednesday, February 17, 2021

Integrating multiple genomic data sets for deep learning, Yang20BiB

 

Multi-view and multi-label classification is used in Yang, Deng, ... Wu, Briefings in Bioinformatics, 2020, 1-13, "RNA-binding protein recognition based on multi-view deep feature and multi-label learning."

Yang20BiB used a separate CNN model for each of three views: the RNA-sequence matrix, the amino-acid matrix, and the dipeptide matrix. Each model makes a multi-label prediction, and the final result is decided by majority vote. This might be adapted to predicting other phenotypes, such as fitness, cell fate, or essentiality?
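To make the voting step concrete, here is a minimal sketch of a per-label majority vote over three views. The function name, the 0/1 label format, and the example predictions are all my assumptions for illustration; this is not the paper's code.

```python
from typing import List


def majority_vote(view_predictions: List[List[int]]) -> List[int]:
    """Combine binary multi-label predictions from several views.

    Each inner list holds one view's 0/1 prediction per label.
    A label is kept when more than half of the views predict it.
    """
    n_views = len(view_predictions)
    n_labels = len(view_predictions[0])
    combined = []
    for j in range(n_labels):
        votes = sum(view[j] for view in view_predictions)
        combined.append(1 if 2 * votes > n_views else 0)
    return combined


# Hypothetical outputs of the three view-specific models for one RNA,
# over four made-up RBP labels.
rna_view = [1, 0, 1, 0]
amino_acid_view = [1, 1, 0, 0]
dipeptide_view = [0, 1, 1, 0]

final_labels = majority_vote([rna_view, amino_acid_view, dipeptide_view])
print(final_labels)
```

With an odd number of views there are no ties, which is probably one reason three views is a convenient choice.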

When reading the Python code, it seems the RNA UTRs were all translated into amino acids in 3 reading frames!? This does not make biological sense, since UTRs are by definition untranslated regions!
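For reference, this is roughly what a three-frame translation does. It is a sketch using the standard genetic code, with a function name of my own choosing; it is not copied from the paper's scripts, and I have not checked whether their version handles stop codons or incomplete codons the same way.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_three_frames(rna: str) -> list:
    """Translate an RNA string in the three forward reading frames.

    Stop codons are written as '*', unknown codons as 'X'.
    Trailing bases that do not fill a codon are dropped.
    """
    rna = rna.upper().replace("T", "U")
    frames = []
    for offset in range(3):
        codons = [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
        frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


print(translate_three_frames("AUGGCCUAA"))
```

Applied to a UTR, all three outputs are hypothetical peptides that the cell never makes, which is the source of my objection.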

One-hot encoding was used to label the RBP-motif binding.
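A small sketch of what such label encoding could look like. The RBP names are placeholders and the helper names are mine; I am assuming one vector position per RBP, so an RNA bound by a single RBP gets a one-hot vector and an RNA bound by several gets a vector with several ones.

```python
def one_hot(index: int, size: int) -> list:
    """Return a vector of length `size` with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def encode_binding_labels(bound_rbps, all_rbps) -> list:
    """Return a 0/1 indicator vector with one position per known RBP."""
    bound = set(bound_rbps)
    return [1 if rbp in bound else 0 for rbp in all_rbps]


# Placeholder RBP names, for illustration only.
all_rbps = ["RBP_A", "RBP_B", "RBP_C", "RBP_D"]

single = encode_binding_labels(["RBP_C"], all_rbps)
multiple = encode_binding_labels(["RBP_A", "RBP_D"], all_rbps)
print(single, multiple)
```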

Figure 5 suggests that multi-label classification is done by sequential binary classifiers.

Question: How are the separate CNNs co-trained?

One way is to learn parameters that make the predictions from two views as highly correlated as possible.
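As a rough sketch of that idea, the correlation between two views' prediction scores can be turned into a loss by negating it, so that minimizing the loss pushes the views to agree. The function names and example scores are my assumptions, and I do not know whether Yang20BiB trains its CNNs this way.

```python
import math


def pearson_correlation(x, y) -> float:
    """Pearson correlation of two equal-length score lists.

    Returns 0.0 when either list has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def correlation_loss(view_a_scores, view_b_scores) -> float:
    """Negative correlation: lower is better, -1.0 means perfect agreement."""
    return -pearson_correlation(view_a_scores, view_b_scores)


# Hypothetical prediction scores from two views on the same five RNAs.
view_a = [0.9, 0.2, 0.7, 0.1, 0.6]
view_b = [0.8, 0.3, 0.6, 0.2, 0.5]
print(correlation_loss(view_a, view_b))
```

In an actual training loop this term would be computed on differentiable tensors and added to each view's classification loss; the plain-Python version above only shows the quantity being optimized.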

https://en.wikipedia.org/wiki/One-hot 
