Thursday, December 18, 2014

NGS method, RNA seq

Method in Lei 2013, Gene, Diminishing returns in next-generation sequencing (NGS)
transcriptome data. 

The sequencing files downloaded from NCBI SRA database were initially
converted from SRA format to FASTQ format using SRA toolkit
(http://www.ncbi.nlm.nih.gov/Traces/sra/?view=software). Then, the
raw data were filtered using the following criteria: (1) the number of
unknown bases (N) was no more than two for each read; and (2) the
fraction of low quality sites (Q < 5) was no more than 50% for each
read. The data that passed this quality control were then used to map
back to their respective genome sequences using bowtie2 (Langmead
and Salzberg, 2012). Only uniquely mapped reads with no more than
two mismatches were retained for further analysis. After mapping, the
counts for each gene were summarized using HTSeq
(http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html). In the simulation,
a predetermined-sized subset of reads was randomly selected
from the original file. Using the same mapping procedure as mentioned
above, the RPKM for each gene and depth of coverage were calculated
and compared with those from original data. In-house Perl and R scripts
were developed for data analysis and graphing (available upon request).
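
The two read-filtering criteria quoted above can be sketched in a few lines of Python. This is only an illustration, not the authors' in-house script. The function names, the Phred+33 quality encoding, the 4-line FASTQ record layout, and the reading of the low-quality threshold as "Phred score below 5" are all assumptions made for this sketch.

```python
import io


def passes_filter(seq, qual, max_n=2, low_q=5, max_low_frac=0.5, phred_offset=33):
    """Return True if a read satisfies both filtering criteria.

    Assumes Phred+33 quality encoding (hypothetical default) and treats a
    site as low quality when its Phred score is below `low_q`.
    """
    # Criterion 1: no more than `max_n` unknown bases (N) in the read.
    if seq.upper().count("N") > max_n:
        return False
    # An empty quality string cannot be evaluated, so reject the read.
    if not qual:
        return False
    # Criterion 2: fraction of low-quality sites is no more than `max_low_frac`.
    low = sum(1 for ch in qual if ord(ch) - phred_offset < low_q)
    return low / len(qual) <= max_low_frac


def filter_fastq(handle):
    """Yield (header, seq, qual) for reads that pass the filter.

    Assumes simple 4-line FASTQ records with no line wrapping.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        if passes_filter(seq, qual):
            yield header, seq, qual


# Tiny in-memory example: the second read has four N bases and is dropped.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nNNNN\n+\nIIII\n")
kept = list(filter_fastq(example))
```

For real data the same generator could be fed an open FASTQ file handle instead of the in-memory example.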

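The simulation step (drawing a fixed-size random subset of reads, then recomputing RPKM and depth of coverage) might look roughly like the sketch below. It uses the standard RPKM definition, 10^9 x C / (N x L), where C is the read count for a gene, N the total number of mapped reads, and L the gene length in bp. The function names are hypothetical. Taking N as the sum of per-gene counts when no total is supplied, and computing depth of coverage as reads x read length / target length, are assumptions; the quoted text does not give the authors' exact definitions.

```python
import random


def subsample_reads(reads, n, seed=0):
    """Randomly select `n` reads without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(list(reads), n)


def rpkm(gene_counts, gene_lengths, total_mapped=None):
    """Reads Per Kilobase of transcript per Million mapped reads, per gene.

    If `total_mapped` is not given, the sum of the per-gene counts is used
    as the number of mapped reads (an assumption of this sketch).
    """
    if total_mapped is None:
        total_mapped = sum(gene_counts.values())
    if total_mapped == 0:
        return {gene: 0.0 for gene in gene_counts}
    return {
        gene: 1e9 * count / (total_mapped * gene_lengths[gene])
        for gene, count in gene_counts.items()
    }


def depth_of_coverage(n_reads, read_length, target_length):
    """Average depth: total sequenced bases divided by the target length."""
    return n_reads * read_length / target_length


# Toy example with made-up numbers, for illustration only.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths)
subset = subsample_reads(["read%d" % i for i in range(100)], 10)
```

Holding a whole read set in a list is fine for a toy case; for full-size FASTQ files a streaming approach such as reservoir sampling would avoid loading everything into memory.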