Wednesday, August 21, 2013

Biopython and SNP


References:
http://comments.gmane.org/gmane.comp.python.bio.devel/8928

https://github.com/ngopal/23andMe


http://biopython.org/pipermail/biopython/2010-April/006416.html
2010/4/13 Tiago Antão <tiagoantao at gmail.com>:
> Hi,
>
> Just a simple question:
> Entrez SNP seems to return ASN.1 format only.
> Is there any way to parse this in biopython? I've looked at SeqIO and
> found nothing...
> I can think of tools to process this outside, but I am just curious if
> this is processed natively with Biopython (being an exposed NCBI
> format...)
>
> Many thanks,
> Tiago
> PS - You can easily try this with:
> from Bio import Entrez
> hdl = Entrez.efetch(db="snp", id="3739022")
> print(hdl.read())

Hi Tiago,

No, we don't support ASN.1, and I don't see any good reason to - I
think it would only be NCBI ASN.1 we'd be interested in, and I think
that all their resources are available in other easier to use formats
like XML these days.

See also http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One

Instead ask Entrez to give you the SNP data as XML:

Entrez.efetch(db="snp", id="3739022", retmode="xml")

Hopefully the SNP XML file has everything in it.

You have a choice of Python XML parsers to use. However, the
Bio.Entrez parser doesn't like this XML. This appears to be related
to (or caused by) a known NCBI bug. See
http://bugzilla.open-bio.org/show_bug.cgi?id=2771

Peter
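
Since Peter suggests falling back on a general Python XML parser, here is a minimal sketch using the standard library's xml.etree.ElementTree. It only shows how to explore an unfamiliar XML document by element name while ignoring namespaces. The helper names (local_name, count_tags, find_attributes) and the SAMPLE document are made up for illustration; SAMPLE is not the real dbSNP schema, so the element and attribute names would need to be adapted to whatever Entrez actually returns.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def count_tags(xml_text):
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


def find_attributes(xml_text, name):
    """Return the attribute dicts of every element with the given local name."""
    root = ET.fromstring(xml_text)
    return [dict(el.attrib) for el in root.iter() if local_name(el.tag) == name]


# Hypothetical stand-in document; NOT the real dbSNP XML layout.
SAMPLE = """<Records xmlns="http://example.org/hypothetical">
  <Record id="3739022" kind="snp"><Note>first</Note></Record>
  <Record id="42" kind="snp"/>
</Records>"""

if __name__ == "__main__":
    print(count_tags(SAMPLE))
    print(find_attributes(SAMPLE, "Record"))
```

In practice the text passed to these helpers would be whatever hdl.read() returns from the Entrez.efetch(..., retmode="xml") call above; ET.fromstring accepts both str and bytes. Running count_tags first is a cheap way to see which elements are present before writing more specific extraction code.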
