Saturday, March 30, 2013

Cases for student study and discussion (in progress)

Handedness is not genetic

Vaccines and autism

Is most published medical research wrong?
PLoS Medicine Bayesian paper.

Sir2 controversy

Myopia and night-time light during sleep: correlation or causation?
Studies that include this type of error are published even in leading biomedical journals. For example, a 1999 Nature study found a strong association between myopia (near-sightedness) and night-time ambient light exposure during sleep in children. The authors concluded that it seemed prudent for infants and young children to sleep at night without artificial lighting in the bedroom. A later study refuted these findings and reported that, in this case, the cause of myopia was genetic, not environmental, as many of the study participants' parents also suffered from the condition. Of course, the fact that "correlation does not imply causation" should not lead to the diametrically opposite conclusion that a correlation can never point to possible causality. Correlations, especially those with a high linear correlation coefficient, may point to causality, but that conclusion requires systematic examination.
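A minimal simulated sketch of how a shared cause can produce such a correlation. All probabilities below are made-up assumptions for illustration, not estimates from either study: parental myopia raises both the chance that a night light is used and the chance that the child is myopic, while the light itself has no effect at all.

```python
import random

def simulate_children(n=40000, seed=1):
    """Return (parent_myopic, night_light, child_myopic) tuples."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        parent_myopic = rng.random() < 0.3
        # Hypothetical: myopic parents leave a light on more often.
        night_light = rng.random() < (0.6 if parent_myopic else 0.2)
        # Child myopia depends only on the parent, never on the light.
        child_myopic = rng.random() < (0.5 if parent_myopic else 0.1)
        rows.append((parent_myopic, night_light, child_myopic))
    return rows

def myopia_rate(rows, night_light, parent_myopic=None):
    """Fraction of myopic children among rows matching the filters."""
    selected = [child for parent, light, child in rows
                if light == night_light
                and (parent_myopic is None or parent == parent_myopic)]
    return sum(selected) / len(selected)

rows = simulate_children()
# Crude comparison: light vs. no light, ignoring the parents.
crude_gap = myopia_rate(rows, True) - myopia_rate(rows, False)
# Stratified comparison: same contrast within each parental group.
gap_myopic_parents = myopia_rate(rows, True, True) - myopia_rate(rows, False, True)
gap_other_parents = myopia_rate(rows, True, False) - myopia_rate(rows, False, False)
print(f"crude gap: {crude_gap:.3f}")
print(f"gap, myopic parents: {gap_myopic_parents:.3f}")
print(f"gap, non-myopic parents: {gap_other_parents:.3f}")
```

The crude gap is clearly positive, while the gaps within each parental group stay near zero. Stratifying on a suspected confounder like this is one simple form of the "systematic examination" mentioned above.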
Chocolate and brain development
One example of the ecological inference fallacy is a 2012 paper in the New England Journal of Medicine: the study author found a close and significant linear correlation between chocolate consumption per capita and the number of Nobel laureates per 10 million persons across a total of 23 countries. On the basis of this finding, he concluded that chocolate consumption enhances cognitive function and closely correlates with the number of Nobel laureates in each country. But without accurate data at the individual level, it is impossible to draw such a conclusion. For example, it was unknown whether, and how much, the Nobel laureates themselves consumed chocolate.
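A toy simulation of the ecological fallacy, under stated assumptions: each of 23 hypothetical countries has a country-level factor (called `wealth` here, a stand-in for something like GDP per capita) that shifts both chocolate consumption and an outcome score. Within a country the two are independent. None of the numbers come from the paper.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(n_countries=23, n_people=500, seed=7):
    """Return (correlation of country means, mean within-country correlation)."""
    rng = random.Random(seed)
    mean_choc, mean_score, within_r = [], [], []
    for _ in range(n_countries):
        wealth = rng.uniform(0, 10)  # hypothetical country-level factor
        # Individual values share the country factor but are otherwise independent.
        choc = [wealth + rng.gauss(0, 3) for _ in range(n_people)]
        score = [wealth + rng.gauss(0, 3) for _ in range(n_people)]
        mean_choc.append(mean(choc))
        mean_score.append(mean(score))
        within_r.append(pearson(choc, score))
    return pearson(mean_choc, mean_score), mean(within_r)

between_r, avg_within_r = simulate()
print(f"correlation of country averages: {between_r:.2f}")
print(f"average within-country correlation: {avg_within_r:.2f}")
```

The country averages correlate strongly even though no individual's outcome depends on that individual's chocolate consumption, which is why the per-country data cannot support an individual-level claim.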
GDP per capita

Friday, March 29, 2013


Wednesday, March 27, 2013

How to find student registration pins, 2012 Nov version

Saturday, March 23, 2013

Anaerobic growth of microbes (in progress)

Anaerobic growth of microbes

Wednesday, March 20, 2013

NGS services (in progress)

Beckman Coulter Genomics: $899 per sample, based on a batch of 32 samples, including library construction, 2x75 bp sequencing on an Illumina HiSeq 2000, and primary data analysis. This means 32 x $899 = $28,768, or roughly $29K per order.

Friday, March 15, 2013

Probability of radioactive isotope survival and lifetime

Based on Wikipedia, particle decay is a Poisson process. The probability that a particle survives for time t before decaying is the Poisson waiting time, i.e., an exponential distribution:

       P(t) = e^{-t / (\gamma \tau)}

where tau is the mean lifetime of the particle and gamma is the Lorentz factor of the particle. This is basically first-order reaction kinetics, and it is strikingly similar to coalescent simulation in population genetics.
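A small sketch of the exponential survival law, assuming the form P(t) = exp(-t / (gamma * tau)), with tau the mean lifetime and gamma the Lorentz factor. The function names and the values tau = 2.0 and t = 2.0 are arbitrary choices for illustration. The simulation draws exponential lifetimes and counts the fraction still present at time t.

```python
import math
import random

def survival_probability(t, tau, gamma=1.0):
    """Analytic probability of surviving past time t."""
    return math.exp(-t / (gamma * tau))

def simulated_survival(t, tau, gamma=1.0, n=20000, seed=42):
    """Fraction of n simulated particles whose lifetime exceeds t."""
    rng = random.Random(seed)
    rate = 1.0 / (gamma * tau)  # decay rate of the Poisson process
    lifetimes = (rng.expovariate(rate) for _ in range(n))
    return sum(life > t for life in lifetimes) / n

tau, t = 2.0, 2.0  # arbitrary time units
print("analytic :", survival_probability(t, tau))
print("simulated:", simulated_survival(t, tau))
# A Lorentz factor above 1 stretches the lifetime seen in the lab frame.
print("gamma = 2:", survival_probability(t, tau, gamma=2.0))
```

The same waiting-time draw (`expovariate`) is what a coalescent simulation uses for the time to the next coalescence event, with a rate that depends on the number of lineages.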

Spelman international travel application link

For students to take international trips, please apply through this link:

Contact person is Dr. Dimeji Togunde

Faculty should contact HR about approval, travel insurance, and policies on taking international trips.

Notes on ZipfR

zipfR is for Large Number of Rare Events (LNRE) modeling of lexical distributions, so its parameters are named in lexical terms.

N = sample size.
V = vocabulary size.
Vm = number of types per frequency class m.

The question is how to specify gamma for the power-law network model.
zipfR limits alpha in the ZM model to between 0 and 1. This seems to be a major problem for simulating biological networks, which often have power-law exponents between 2 and 3.

zipfR, an R text package, and igraph could probably be combined for some interesting projects.

Power law, Zipf, Pareto distribution, and gene network

A power-law distribution refers to the general form y = a * x^k.

In gene interaction networks with an infinite number of genes, the power-law distribution of connectivity is
       P(k) = k^{-\gamma} / \zeta(\gamma)       (Eq 1)
where k is the connectivity, gamma is the power-law exponent, and \zeta(\gamma) is the Riemann zeta function. This definition implies that when gamma <= 3 the distribution has an infinite variance, and when gamma <= 2 the mean does not exist either.

Zipf's law is defined in a finite population of N elements with discrete values:
       f(k; s, N) = (1 / k^s) / \sum_{n=1}^{N} (1 / n^s)
where k is the rank and s is the exponent. The above is basically the probability mass function. In this form, the variance is always finite, so its theoretical properties differ from those of the Eq 1 definition.

Zipf's law can be generalized as the Zipf–Mandelbrot law:
       f(k; N, q, s) = (1 / (k + q)^s) / \sum_{i=1}^{N} (1 / (i + q)^s)
where k is the rank of the data, and q and s are distribution parameters. This generalized Zipf–Mandelbrot law also covers the specialized form for a 'pure' gene/protein network in power-law configuration as in Eq 1 (q = 0, s = gamma, and N going to infinity).
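A short numerical sketch of the Zipf–Mandelbrot probability mass function, assuming the standard form in which the weight of rank k is proportional to 1 / (k + q)^s and normalized over k = 1..N. The function names are made up for this note. With q = 0 and s = 2.5 it shows the first moment settling down as N grows while the second moment keeps growing, which is the finite-N counterpart of the infinite variance in Eq 1.

```python
def zipf_mandelbrot_pmf(N, s, q=0.0):
    """Probabilities for ranks 1..N; q = 0 gives the plain Zipf law."""
    weights = [(k + q) ** -s for k in range(1, N + 1)]
    total = sum(weights)  # finite-N normalizing constant
    return [w / total for w in weights]

def moment(pmf, order):
    """Raw moment E[k^order] of a pmf over ranks 1..len(pmf)."""
    return sum((k ** order) * p for k, p in enumerate(pmf, start=1))

for N in (10**2, 10**3, 10**4, 10**5):
    pmf = zipf_mandelbrot_pmf(N, s=2.5)
    print(N, round(pmf[0], 5), round(moment(pmf, 1), 3), round(moment(pmf, 2), 1))
```

For large N, the first probability approaches 1 / zeta(2.5), matching Eq 1 at k = 1.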

In R, the zipfR package is available.  See discussion and

library(zipfR)
Z <- lnre("zm", alpha=.8, B=1e-3)   # Zipf-Mandelbrot LNRE model
hist(as.integer(rlnre(Z, 1000)))    # integer codes of 1000 sampled tokens

As an alternative, the "rmutil" package by Jim Lindsey provides Pareto distributions.

It seems netmodels also provides power-law distributions, but this package has been orphaned.

dist <- degree(
alpha <- calc.alpha(dist),5)

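The Pareto distribution is defined for a continuous variable x. It seems to have been designed to describe tail distributions in economics. Its cumulative distribution function, as given by Wikipedia, is:

       F(x) = 1 - (x_m / x)^{\alpha}   for x >= x_m,   and F(x) = 0 for x < x_m

Its PDF is
       f(x) = \alpha x_m^{\alpha} / x^{\alpha + 1}   for x >= x_m
in which alpha and x_m^alpha are constants.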

The Zipf distribution can be viewed as a discrete Pareto distribution. So, I can use 'rpareto{extremevalues}' or 'rpareto{VGAM}' to generate the random numbers and then discretize them. (See
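The "sample Pareto, then discretize" idea can be sketched without any package by inverse-transform sampling, assuming the Pareto CDF F(x) = 1 - (x_m / x)^alpha. The function name and default arguments below are illustrative choices, not taken from the R packages mentioned above.

```python
import math
import random

def discrete_pareto_sample(n, alpha, xm=1.0, seed=0):
    """Draw n Pareto(alpha, xm) values and floor them to integers."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        u = 1.0 - rng.random()        # uniform on (0, 1], avoids dividing by zero
        x = xm / u ** (1.0 / alpha)   # solves u = (xm / x)^alpha for x
        samples.append(math.floor(x))
    return samples

degrees = discrete_pareto_sample(20000, alpha=1.5)
frac_ge_2 = sum(k >= 2 for k in degrees) / len(degrees)
print("P(K >= 2) simulated:", round(frac_ge_2, 3), "expected:", round(2 ** -1.5, 3))
```

With x_m = 1 the floored variable satisfies P(K >= k) = k^(-alpha), so its probability mass falls off roughly as k^(-(alpha + 1)). Under this construction, a degree exponent gamma between 2 and 3 corresponds to a Pareto alpha between 1 and 2.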

Rich Wash wrote some R scripts to provide dpowerlaw(), ppowerlaw(), rpowerlaw() functions.

The igraph package provides a power-law fit function that seems to use mle().
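For reference, a sketch of the closed-form maximum-likelihood estimate of a continuous power-law exponent, gamma_hat = 1 + n / sum(ln(x_i / x_min)). This is not igraph's implementation, only the textbook estimator for continuous data with a known x_min. The synthetic data and the function name are assumptions made for this check.

```python
import math
import random

def powerlaw_mle(xs, xmin):
    """MLE of gamma for density p(x) ~ x^(-gamma), x >= xmin (continuous case)."""
    tail = [x for x in xs if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    return 1.0 + len(tail) / log_sum

# Synthetic check: sample from a power law with a known exponent.
rng = random.Random(3)
true_gamma = 2.5
xs = [(1.0 - rng.random()) ** (-1.0 / (true_gamma - 1.0)) for _ in range(20000)]
gamma_hat = powerlaw_mle(xs, xmin=1.0)
print("true gamma:", true_gamma, "estimate:", round(gamma_hat, 3))
```

Integer degree data need a discrete likelihood or a correction to this formula, so the estimate above should be read as a rough guide for network degrees.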

References and links:

Thursday, March 14, 2013

Notes on introduction to systems biology (in progress)

ODE models


Cell cycle model

Modeling butanol production by Clostridium beijerinckii,
Shinto, H., Tashiro, Y., Yamashita, Kobayashi, ..., 2008, Kinetic study of substrate dependency for higher butanol production in acetone-butanol-ethanol fermentation. Process Biochemistry, 43, 1452-1461.  (XPPAUT ODE model is included.)

Notes on chemostat (in progress)

  A mini bioreactor system.

Maitreya Dunham lab on chemostat
Botstein lab on chemostat
   Botstein lab chemostat manual 

Notes on microfluidics in biology (in progress)

Ruth Williams, sticky lithography, The Scientist. 

Amber Dance, live-cell imaging system, The Scientist

Carl Zeiss Cell observer

Perfusion system
  Perfusion pump, Bioptechs, ~$1.2K,

Pulse and change yeast growth media in microfluidics, e.g., with radicicol.
Screen of pooled mutants and hybrids.

high throughput phenotyping (lifespan assays)

Candidate projects and data sources for course projects (in progress)


 GEO includes gene expression, and NGS data
John Snow's cholera data

GHO Raw Data Download Web Service

WHO childhood hunger data, used by Jeff Leek's course

Oxidative stress and mutation

The loan data from data analysis course: 
For this analysis you will use the loans data available from here:

There is a code book for the variables in the data set available here:



cancer genome atlas


Scientific survey data

Wednesday, March 6, 2013

Bacterial aging (in progress)

Draft in progress.

Francois Taddei

Asymmetric segregation of protein aggregates is associated with cellular aging and rejuvenation

Dukan S, Nystrom T (1998) Bacterial senescence: stasis results in increased and differential oxidation of cytoplasmic proteins leading to developmental induction of the heat shock regulon. Genes Dev 12: 3431–3441.
Dukan S, Nystrom T (1999) Oxidative stress defense and deterioration of growth-arrested Escherichia coli cells. J Biol Chem 274: 26027-26032.

Topics on food and/or health disparity (in progress)

Ron Finley: A guerilla gardener, TED Talk
  "Food is the problem and the solution". 
  "Growing your food is like printing your money."

Type 2 diabetes declines along out-of-Africa.


Prostate cancer

Monday, March 4, 2013

Radiation-dose survival curve of D. radiodurans is similar to Gompertz curve

The radiation-dose survival curve of D. radiodurans is similar to a Gompertz curve. Its broad shoulder suggests a scenario with a very small 'initial mortality rate'.
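A quick numerical illustration of the shoulder argument, assuming the usual Gompertz hazard m(x) = A * exp(G * x), which gives survival S(x) = exp(-(A / G) * (exp(G * x) - 1)). Treating radiation dose as the "age" axis x is the analogy made in this note, and the parameter values are arbitrary.

```python
import math

def gompertz_survival(x, A, G):
    """Survival fraction at dose (or age) x for initial rate A and slope G."""
    return math.exp(-(A / G) * (math.exp(G * x) - 1.0))

def shoulder_width(A, G, level=0.9):
    """Dose at which survival first falls to `level` (S solved for x)."""
    return math.log(1.0 - G * math.log(level) / A) / G

# Same slope G, decreasing initial mortality rate A.
for A in (1e-2, 1e-4, 1e-6):
    print(f"A = {A:g}: survival stays above 90% up to x = {shoulder_width(A, G=1.0):.2f}")
```

With G held fixed, lowering A pushes the point where survival starts to drop further out, which is the broad-shoulder shape.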

Daly, M.J., 2012, DNA Repair, Death by protein damage in irradiated cells.

Protein carbonylation and aging (in progress)

Maisonneuve09 stated that "Carbonyl derivatives are mainly formed by direct metal-catalysed oxidation (MCO) attacks on the amino-acid side chains of proline, arginine, lysine and threonine residues." Maisonneuve09 then focuses on the RKPT-enriched regions in proteins.
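A rough sketch of how RKPT-enriched regions could be flagged in a protein sequence with a sliding window. The window width, the threshold, and the toy sequence are hypothetical choices for illustration and are not the criteria used by Maisonneuve09.

```python
RKPT = frozenset("RKPT")  # arginine, lysine, proline, threonine

def rkpt_fraction(segment):
    """Fraction of residues in the segment that are R, K, P or T."""
    return sum(aa in RKPT for aa in segment) / len(segment)

def rkpt_enriched_windows(sequence, width=10, threshold=0.75):
    """Return (start, fraction) for every window at or above the threshold."""
    hits = []
    for start in range(len(sequence) - width + 1):
        fraction = rkpt_fraction(sequence[start:start + width])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits

# Toy sequence: an RKPT-rich stretch flanked by glycines.
toy = "GGGGG" + "RKPTRKPTRK" + "GGGGG"
hits = rkpt_enriched_windows(toy)
for start, fraction in hits:
    print(start, toy[start:start + 10], fraction)
```

Start positions are 0-based. Overlapping hits would normally be merged into regions before comparing proteins.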

More genomics methods have been applied to identify carbonylation sites.
Connection to ROS damage and aging? Do aging hub genes tend to avoid carbonylation motifs?