## Sunday, July 3, 2016

### Marks11PlosOne: protein 3D structure using maximum entropy

In Marks11PlosOne, my understanding is that M1 is the consistency condition on the marginal probabilities (the model's marginals must match the observed ones). M2 is the information entropy. M3 is, in theory, the maximization of M2 subject to the M1 constraints, using a vector of Lagrange multipliers. It seems to me that M1, M2, and M3 describe what is supposed to work in theory. The authors then propose an approximation method to solve M3.
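As a compact restatement of these three ingredients, the maximum-entropy setup can be sketched as follows. The symbols f_i and f_ij (observed single-site and pair frequencies), h_i, and Z are notation assumed here for illustration, and the grouping into constraints, entropy, and solution follows the reading in this note rather than the paper's exact equation labels.

```latex
% Constraints: model marginals match the observed frequencies
\sum_{\{A_k \,:\, k \neq i\}} P(A_1,\dots,A_L) = f_i(A_i), \qquad
\sum_{\{A_k \,:\, k \neq i,j\}} P(A_1,\dots,A_L) = f_{ij}(A_i,A_j)

% Entropy to be maximized
S = -\sum_{A_1,\dots,A_L} P(A_1,\dots,A_L)\,\ln P(A_1,\dots,A_L)

% Stationary point of the Lagrangian: a pairwise (Potts-like) form
P(A_1,\dots,A_L) = \frac{1}{Z}\exp\Big\{ \sum_{i<j} e_{ij}(A_i,A_j) + \sum_{i} h_i(A_i) \Big\}
```

In this form, e_ij and h_i play the role of the Lagrange multipliers for the pair and single-site constraints, and Z enforces normalization.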

The calculation of e_ij is described in Text S1 and is based on a mean-field approximation.
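For reference, the mean-field result commonly used in this line of work can be sketched as below. The notation C_ij, f_i, and f_ij is assumed here, and details such as pseudocounts, sequence reweighting, and the choice of reference state should be checked against Text S1.

```latex
% Connected correlation between sites i and j
C_{ij}(A,B) = f_{ij}(A,B) - f_i(A)\,f_j(B)

% Mean-field estimate of the couplings
e_{ij}(A,B) \approx -\big(C^{-1}\big)_{ij}(A,B)
```

The inverse is taken over the full matrix indexed by (site, amino acid) pairs, with one amino-acid state per site dropped so that C is invertible.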

The authors then calculated the "residue pair coupling" using direct information (DI).

MATLAB code for DI:

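    function DI = compute_di(i, j, W, mu1, mu2, Pia)
        % Direct information between sites i and j.
        tiny = 1.0e-100;                   % guards against log(0)
        Pdir = W .* (mu1' * mu2);          % unnormalized two-site direct distribution
        Pdir = Pdir / sum(sum(Pdir));      % normalize so the entries sum to 1
        Pfac = Pia(i, :)' * Pia(j, :);     % outer product of rows i and j of Pia
        % trace(X' * Y) equals sum(sum(X .* Y))
        DI = trace(Pdir' * log((Pdir + tiny) ./ (Pfac + tiny)));
    end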
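The code above appears to evaluate a Kullback-Leibler-type sum between a two-site "direct" distribution and the product of single-site marginals. A sketch of the corresponding formula follows, assuming that W holds exp(e_ij(A,B)), that mu1 and mu2 hold auxiliary single-site factors, and that Pia holds the single-site frequencies f_i(A). These interpretations are inferred from the code and are not stated in this note.

```latex
% Two-site direct distribution (W .* (mu1' * mu2), then normalized)
P^{\mathrm{dir}}_{ij}(A,B) = \frac{1}{Z_{ij}}
  \exp\big\{ e_{ij}(A,B) + \tilde h_i(A) + \tilde h_j(B) \big\}

% Direct information
\mathrm{DI}_{ij} = \sum_{A,B} P^{\mathrm{dir}}_{ij}(A,B)\,
  \ln \frac{P^{\mathrm{dir}}_{ij}(A,B)}{f_i(A)\, f_j(B)}
```

As a sanity check on this reading: if W is all ones and mu1, mu2 equal rows i and j of Pia, then Pdir equals Pfac and DI is zero, as expected for two uncoupled sites.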

Question: I thought e_ij describes all pairwise coupling strengths? Why "DI"?
A: The authors seem to suggest that DI "reduced the set of empirically correlated residue pairs to the minimal set of pairs most likely to co-vary due to evolution constraints."

Shannon's information entropy:
https://en.wikipedia.org/wiki/Entropy_(information_theory)#Definition

http://mp.weixin.qq.com/s?__biz=MzA5ODUxOTA5Mg%3D%3D&mid=2652549508&idx=1&sn=ee9d3efbbb0cf896b47d841e6a7b67f9&scene=5&srcid=06067TI9HUkXJ7r084noc7RS#rd

MATLAB code:
http://evfold.org/evfold-web/code.do

Note: This is basically network inference.