This site serves as my notebook and as a way to communicate effectively with my students and collaborators. Every now and then, a post may be of interest to other researchers or teachers. Views in this blog are my own. All rights to research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
Showing posts with label SNPs. Show all posts
Thursday, May 14, 2020
from alignment to snp
https://www.researchgate.net/post/Does_anyone_know_a_software_for_SNPs_analysis_from_FASTA_sequences
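A minimal sketch of the "alignment to SNP" idea, assuming the input is a multi-FASTA file in which all sequences are already aligned to the same length. It is not how any particular tool does it; the function names `parse_fasta` and `variable_sites` and the toy sequences are made up for illustration. A column counts as a SNP when it holds more than one distinct base after gaps and N are ignored.

```python
def parse_fasta(text):
    """Return a dict of sequence name -> sequence from multi-FASTA text."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(parts) for key, parts in chunks.items()}


def variable_sites(records, ignore="-N"):
    """Return (1-based position, column) for every polymorphic column.

    Assumption: gaps and N are treated as missing data, not as alleles.
    """
    seqs = list(records.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i in range(length):
        column = "".join(seq[i] for seq in seqs)
        alleles = set(column) - set(ignore)
        if len(alleles) > 1:
            sites.append((i + 1, column))
    return sites


# Toy alignment (made-up sequences): only column 5 varies.
EXAMPLE = ">s1\nACGTAC\n>s2\nACGTTC\n>s3\nAC-TAC\n"
print(variable_sites(parse_fasta(EXAMPLE)))  # [(5, 'ATA')]
```

Column 3 contains a gap in s3 but only one real base, so it is not reported.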
Saturday, May 9, 2020
convert multiple seq alignment into snp
SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5320690/
Friday, January 8, 2016
COSMIC genotype
v75
~/genotype/
gunzip 1240121_complexGenotypes.csv.gz
Byte-4:genotypes hqin$ wc -l 1240121_complexGenotypes.csv
884149 1240121_complexGenotypes.csv
There are 884K rows of SNPs in this file.
Genotypes
---------
Files listing the SNP calls for each cell line identified by PICNIC analysis of
Affymetrix SNP6.0 array data. Both a simple genotype (AA, BB – homozygous or AB
– heterozygous) and a complex interpretation of the genotype are given (for
example, in a triploid region of the genome the genotype may be AAB).
Download from genotypes directory.
File Description
Chr - Chromosome GRCh38/hg38
pos - Genome Position GRCh38/hg38
ncopies.A - Number of copies of allele A
ncopies.B - Number of copies of allele B
Probe.Set.ID - SNP6.0 probe ID
dbSNP.RS.ID - dbSNP reference ID
Allele.A - genotype 'A' nucleotide
Allele.B - genotype 'B' nucleotide
chr_b36 - Chromosome NCBI36/hg18
pos_b36 - Genome Position NCBI36/hg18
chr_b37 - Chromosome GRCh37/hg19
pos_b37 - Genome Position GRCh37/hg19
complexGenotype - a complex interpretation of the genotype, e.g. in a triploid
region the genotype may be AAB
simpleGenotype - a simple genotype, e.g. AA, BB – homozygous or AB –
heterozygous
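A rough sketch for reading a file laid out as in the description above. I am assuming the file is comma-separated with a header row that uses exactly the column names listed (I have not confirmed this here), and that the complex genotype string can be rebuilt from ncopies.A and ncopies.B, which is my reading of the AAB example rather than something the README states. The sample rows and function names are made up.

```python
import csv
import io
from collections import Counter


def tally_simple_genotypes(handle):
    """Count AA / AB / BB calls in the simpleGenotype column."""
    reader = csv.DictReader(handle)
    return Counter(row["simpleGenotype"] for row in reader)


def genotype_from_copies(n_a, n_b):
    """Assumed mapping: 2 copies of A and 1 copy of B gives 'AAB'."""
    return "A" * int(n_a) + "B" * int(n_b)


# Made-up rows that only mimic the documented column names.
SAMPLE = (
    "Chr,pos,ncopies.A,ncopies.B,simpleGenotype\n"
    "1,100,2,1,AB\n"
    "1,200,2,0,AA\n"
    "2,300,1,1,AB\n"
)

print(tally_simple_genotypes(io.StringIO(SAMPLE)))
for row in csv.DictReader(io.StringIO(SAMPLE)):
    print(row["Chr"], row["pos"],
          genotype_from_copies(row["ncopies.A"], row["ncopies.B"]))
```

For the real file, replace `io.StringIO(SAMPLE)` with `open("1240121_complexGenotypes.csv", newline="")`.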
Wednesday, April 30, 2014
Li Ma defense, haplotype, Qing Song lab, Morehouse medical school
Fu and Ma in prep
Teri Manolio,
Schizophrenia, Lee 2012, GWAS
http://www.nature.com/ng/journal/v44/n3/abs/ng.1108.html
Smemo et al., Nature 2014, long-range haplotype, FTO, IRX3
http://www.nature.com/nature/journal/v507/n7492/full/nature13138.html
haplotyping methods
GMP 2001, Nat Genet
Qiagen 2005
Polony 2006, Nat Genet
Barcode 2009, Nat Methods
Fosmid 2011, Nat Biotech
HiC 2013, Nat Biotech
Illumina 2014, Nat Biotech
Laser microdissection of chromosomes: take half of the 46 chromosomes; by chance we can get some single chromosomes, which can be used for haplotyping.
23 chromosomes contain 3.5 pg of DNA, which is amplified to 5-8 µg for high-throughput sequencing.
Use heterozygosity to identify diploid and haploid chromosomes.
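A small sketch of the heterozygosity idea in the note above: a chromosome captured as a single copy should show almost no heterozygous calls, while a chromosome captured as a pair should show many. I am assuming genotype calls are already available as AA / AB / BB strings; the 2% cutoff and the function names are hypothetical and only there to make the example concrete.

```python
def heterozygosity(genotypes):
    """Fraction of called sites that are heterozygous (AB)."""
    called = [g for g in genotypes if g in ("AA", "AB", "BB")]
    if not called:
        raise ValueError("no usable genotype calls")
    return sum(g == "AB" for g in called) / len(called)


def looks_haploid(genotypes, max_het=0.02):
    """Hypothetical rule: call a chromosome haploid if AB calls are rare.

    max_het is an illustrative allowance for genotyping error, not a
    value taken from the talk.
    """
    return heterozygosity(genotypes) <= max_het


# Made-up calls for two microdissected chromosomes.
single_copy = ["AA", "BB", "AA", "BB", "AA"]
two_copies = ["AA", "AB", "BB", "AB", "AB"]
print(looks_haploid(single_copy), looks_haploid(two_copies))  # True False
```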
HiFi software
http://www.cs.gsu.edu/?q=node/536
Quake Dataset
http://www.cbcb.umd.edu/software/quake/
Friday, October 11, 2013
notes on dbSNP, in progress
Multiple genome locations can be mapped to a single SNP. One reason is the ambiguity of alignment.
http://www.biostars.org/p/2323/
https://cgsmd.isi.edu/dbsnpq/downloads.php
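A quick sketch for spotting SNP IDs that map to more than one genome location, assuming the mapping has already been reduced to (rs ID, chromosome, position) tuples. The IDs and coordinates below are placeholders, not real dbSNP records, and `multi_mapped` is a name I made up.

```python
from collections import defaultdict


def multi_mapped(records):
    """Return {rs_id: sorted locations} for IDs with more than one location."""
    locations = defaultdict(set)
    for rs_id, chrom, pos in records:
        locations[rs_id].add((chrom, pos))
    return {
        rs_id: sorted(locs)
        for rs_id, locs in locations.items()
        if len(locs) > 1
    }


# Placeholder records for illustration only.
RECORDS = [
    ("rs_example_1", "1", 1000),
    ("rs_example_2", "2", 5000),
    ("rs_example_2", "7", 8000),
    ("rs_example_1", "1", 1000),  # exact duplicate, not a second location
]
print(multi_mapped(RECORDS))  # {'rs_example_2': [('2', 5000), ('7', 8000)]}
```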
Wednesday, August 21, 2013
Biopython and SNP
References:
http://comments.gmane.org/gmane.comp.python.bio.devel/8928
https://github.com/ngopal/23andMe
http://biopython.org/pipermail/biopython/2010-April/006416.html
2010/4/13 Tiago Antão <tiagoantao at gmail.com>:
> Hi,
>
> Just a simple question:
> Entrez SNP seems to return ASN.1 format only.
> Is there any way to parse this in biopython? I've looked at SeqIO and
> found nothing...
> I can think of tools to process this outside, but I am just curious if
> this is processed natively with Biopython (being an exposed NCBI
> format...)
>
> Many thanks,
> Tiago
> PS - You can easily try this with:
> hdl = Entrez.efetch(db="snp", id="3739022")
> print hdl.read()

Hi Tiago,

No, we don't support ASN.1, and I don't see any good reason to - I think it would only be NCBI ASN.1 we'd we interested in, and I think that all their resources are available in other easier to use formats like XML these days. See also http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One

Instead ask Entrez to give you the SNP data as XML:

Entrez.efetch(db="snp", id="3739022", retmode="xml")

Hopefully the SNP XML file has everything in it. You have a choice of Python XML parsers to use. However, the Bio.Entrez parser doesn't like this XML. This appears to be related (or caused by) a known NCBI bug. See http://bugzilla.open-bio.org/show_bug.cgi?id=2771

Peter
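A Python 3 sketch of the suggestion in the thread above (ask Entrez for XML instead of ASN.1). The first function uses Biopython's `Entrez.efetch` as quoted in the email; the other two are a standard-library alternative that calls the NCBI E-utilities efetch URL directly. The function names and the email placeholder are mine, nothing here parses the returned XML, and what NCBI returns for db="snp" today may differ from what it was in 2010.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fetch_with_biopython(uid, email):
    """Fetch one SNP record as XML through Bio.Entrez (needs Biopython)."""
    from Bio import Entrez  # imported lazily so the rest runs without it

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.efetch(db="snp", id=uid, retmode="xml")
    try:
        return handle.read()
    finally:
        handle.close()


def build_efetch_url(db, uid, retmode="xml"):
    """Build the efetch URL for one record."""
    return EFETCH_URL + "?" + urlencode({"db": db, "id": uid, "retmode": retmode})


def fetch_with_stdlib(uid):
    """Same request without Biopython; returns the raw response text."""
    with urlopen(build_efetch_url("snp", uid)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Printing the URL needs no network access; the fetch calls do.
    print(build_efetch_url("snp", "3739022"))
    # print(fetch_with_biopython("3739022", "you@example.org"))
```

Since the thread notes that the Bio.Entrez parser did not handle this XML, the sketch stops at returning the raw text, which can then be handed to whichever Python XML parser you prefer.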