Monday, December 17, 2012

Generate pruned phylogeny using Biopython

Biopython Phylo

Its cookbook (http://biopython.org/wiki/Phylo_cookbook) offers some useful examples. At the time of writing, Biopython is still tied to Python 2.7.

The following code reads a Newick tree, builds a lookup dictionary of leaf nodes keyed by name, and then removes individual leaf nodes one at a time to generate a pruned tree.

from Bio import Phylo
#from Bio.Phylo import PhyloXML, NewickIO

def lookup_by_names(tree):
    names = {}
    for clade in tree.get_terminals():
        if clade.name:
            if clade.name in names:
                raise ValueError("Duplicate key: %s" % clade.name)
            names[clade.name] = clade
    return names


EX_NEWICK = 'spec.nwk'
treeA = Phylo.read(EX_NEWICK, 'newick')
print(treeA)
names = lookup_by_names(treeA)

# prune() takes a clade object from the same tree and modifies the tree in place
for name in ['Bcl', 'Bha', 'Bsu', 'Bpu']:
    treeA.prune(names[name])
    print(treeA.count_terminals())  # one leaf fewer after each prune
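
To try the same steps without a tree file, here is a minimal sketch that reads a small Newick string from memory, prunes two leaves, and writes the pruned tree back out as Newick. The five-taxon tree and the helper name prune_taxa are made up for illustration (the contents of spec.nwk are not shown here), and the sketch assumes a Biopython release that runs on Python 3.

```python
from io import StringIO

from Bio import Phylo

# Hypothetical five-taxon tree; the real contents of spec.nwk are not shown.
NEWICK = "((Bcl:1,Bha:1):1,(Bsu:1,Bpu:1):1,Ban:2);"


def prune_taxa(tree, taxa):
    """Prune each named leaf in place; return the leaf count after each step.

    prune_taxa is an illustrative helper, not part of Biopython.
    """
    counts = []
    for name in taxa:
        # Look the leaf up in the current tree so we always pass prune()
        # a clade object that belongs to this tree.
        leaf = tree.find_any(name=name)
        if leaf is None or not leaf.is_terminal():
            raise ValueError("No leaf named %s" % name)
        tree.prune(leaf)
        counts.append(tree.count_terminals())
    return counts


tree = Phylo.read(StringIO(NEWICK), "newick")
before = tree.count_terminals()
counts = prune_taxa(tree, ["Bcl", "Bha"])
print(before, counts)

# Serialize the pruned tree back to Newick text.
out = StringIO()
Phylo.write(tree, out, "newick")
pruned_newick = out.getvalue().strip()
print(pruned_newick)
```

Looking each leaf up by name just before pruning avoids keeping a separate dictionary, at the cost of one tree search per taxon. For a long list of taxa, the lookup dictionary above is likely the better choice.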


