Friday, August 29, 2014

Proton gradient

BIO386, E coli aging, endosymbiosis, mitochondrial aging

Mitochondria have a proteobacterial origin.

E. coli (genus Escherichia) is in the phylum Proteobacteria.

Thursday, August 28, 2014

FYE, meeting 1, discussion of Americanah

Five sections meet in sci 134

Students talked about their culture shock experiences when they came to Atlanta or traveled abroad.

Two students mentioned their trip to China.

bio233 chapter 2, microscopy, intro to microbe world

Preparation: Plastic sheets, pencils, dark markers,

=> About 20 minutes: plastic-sheet drawing exercise on contrast (group project).
Each group drew one bacterial cell with pencil and one cell with a dark marker, to compare contrast.
Most groups drew very nice pictures.

=> Went through slides and used clicker based questions.

A student asked what should be remembered.


  • I forgot to share the PPT screen after showing the video.
  • YouTube live recording does not allow the 'pen' feature in PPT.
  • The Bamboo pen does not reach the Monitor window. When I synchronized my desktop and the monitor window, I lost my screen-sharing windows.
  • I had trouble locating my live-event control window. I had two of them open at the same time for a while.

Tuesday, August 26, 2014

bio233, chapter 1, intro to microbe, 20140826

lecture slides
 history of planet earth
 clicker quiz on figures

video intro to microbiology
video on Pasteur, spontaneous generation

Assignments (14 out of 25)

Ask volunteer for Virus tradeoff paper

Treats for students that finished the pre-survey

Summary at the end of class. Asking students for lessons learned.

GWAS meta analysis

Gilman et al, Neuron, 2011. p898-907. NetBag on Autism
NetBag is a greedy approach. The clustering method starts with one or two genes in a CNV as 'seeds'.

Gilman11 generated a weighted background human gene network for their study.

Gilman11 compared the raw p-value of a cluster, called the local p-value, to the p-values obtained from random networks. The adjusted p-value is called the global p-value.
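A minimal sketch of how a local p-value can be turned into an empirical 'global' p-value by comparison against random networks. This is a generic permutation-style calculation, not necessarily the exact procedure in Gilman11. The function name global.p and the simulated null values are assumptions for illustration only.

```r
# Empirical 'global' p-value: the fraction of random networks whose best
# local p-value is at least as small as the observed local p-value.
# The +1 in numerator and denominator avoids reporting p = 0.
global.p <- function(local.p, null.local.p) {
  (1 + sum(null.local.p <= local.p)) / (1 + length(null.local.p))
}

set.seed(1)
# Hypothetical stand-in for the best local p-values from 999 random networks.
null.local.p <- runif(999)
global.p(0.001, null.local.p)
```

With N random networks the smallest attainable global p-value is 1/(N+1), so the number of random networks sets the resolution of the test.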


AIS13 categorizes pathway association methods into canonical and de novo pathway methods.

For de novo pathway discovery, an integer linear program (ILP) is used in Leiserson, Blokh, PLoS Comput Biol. Simultaneous identification of multiple driver pathways in cancer.

The Steiner tree problem seeks the lowest-cost pathway that connects the associated genes. See Liu et al, BMC Syst Biol 2012, Gene, pathway and network frameworks to identify epistatic interactions of single nucleotide polymorphisms derived from GWAS data.

LERR13 reviews methods that use protein-protein and protein-DNA networks to identify 'causal' genetic variants. (Their definition of 'causal' is a narrow one.)

LERR13 argues that GWAS mostly finds SNPs that are in LD with the actual 'causal' gene. This problem is of less concern in CNV analysis. One solution is to use a network to 'rank' genes in the same haplotype known to be associated with the phenotype of interest or with similar phenotypes. (This method is in the spirit of our recent CNV paper.)

The green square represents the 'known' 'causal' gene. So, this is largely a traversal-measures based method.

LERR13 argues that networks contribute to 'missing heritability'.

LERR13 seems to suggest that protein-DNA networks are better suited for expression QTL (eQTL).

LERR13 shows that OMIM is the source of 'causal' gene information for most network-based GWAS (table 1). Only one paper uses GeneCards as an alternative source.

LERR13 cites several pathway enrichment analyses of GWAS. It argues that interactions are treated equally in these enrichment analyses. (This can be cited in our CNV replies.) The authors then show that several methods use weighted networks to identify network modules with an iterative 'seed and extend' method. (For comparison, our CNV paper did not use seeds explicitly, and so avoids some 'prejudice'.)

LERR13 also discussed subnetwork modules with mutation hotspots in cancer genomes.


BGTF12: GWAS use meta-analysis of multiple data sets to reduce false positives and increase statistical power.

A major concern of GWAS meta-analysis is heterogeneity among the data sets, such as differences in LD or in genotyping chips. However, Lin and Zeng 2010 (Genet Epidemiol) show with simulation studies that heterogeneity is not a significant factor.

Combining across data sets is the frequentist approach; cumulative analysis of studies is the Bayesian approach.

In R, GWAS meta-analysis packages: metafor, rmeta, and catmap.
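To make the frequentist combination concrete, here is a minimal base-R sketch of a fixed-effect, inverse-variance-weighted meta-analysis for a single SNP, plus Cochran's Q as a heterogeneity check. The effect sizes and standard errors are made-up numbers, and this is one common scheme rather than the specific method recommended in BGTF12.

```r
# Hypothetical per-study estimates for one SNP (log odds ratios and their SEs).
beta <- c(0.12, 0.08, 0.15)
se   <- c(0.05, 0.04, 0.07)

# Fixed-effect (inverse-variance) combination.
w         <- 1 / se^2                      # study weights
beta.meta <- sum(w * beta) / sum(w)        # pooled effect
se.meta   <- sqrt(1 / sum(w))              # pooled standard error
z         <- beta.meta / se.meta
p.meta    <- 2 * pnorm(-abs(z))            # two-sided p-value

# Cochran's Q test for heterogeneity across studies.
Q     <- sum(w * (beta - beta.meta)^2)
p.het <- pchisq(Q, df = length(beta) - 1, lower.tail = FALSE)

c(beta = beta.meta, se = se.meta, p = p.meta, Q = Q, p.het = p.het)
```

The pooled standard error is always smaller than the smallest single-study standard error, which is where the gain in power comes from.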

BGTF12 argues that GWAS data should be 'cleaned' and imputed before meta-analysis.

[AIS13] Atias, Istrail, Sharan 2013, Current Opinion in Genetics and Development. Pathway-based analysis of genomic variation data.

[LERR13] Leiserson, Eldridge, Ramachandran, Raphael, 2013, Current Opinion in Genetics and Development. Network analysis of GWAS data.

[BGTF12] Begum, Ghosh, Tseng, Feingold, 2012, Nucleic Acids Research. Comprehensive literature review and statistical considerations for GWAS meta-analysis.

Monday, August 25, 2014

Moodle 2.5 resources

Q: How to order questions in assignment by their performance?
A: Quiz administration -> Results-> Statistics

bio233, lab, sampling microbes on campus, 20140825 monday

bio233 lab,

24 students, 6 groups, 9 plates each,

read the protocols

demos, sterile technique,

sample the same size of area,
make a serial number of each plate date-group#-plate#
label the plates on the sides in the back,

take picture of locations

parafilm seal the plates, leave at room temperature

AUC cross registration, fall 2014


Friday, August 22, 2014

Bio386 day 2, unix shell, computer basics

Had students work on the unix shell.
Apple computer basics

ls -a -l

Students are not aware of proper posture for using computers.

Todo: Computers in room 277 use OS 10.6.8. They need to be updated.

audio recording at

Bio386 todo list

=> Network permutation analysis

=> R igraph to bio386 on network analysis; 

Thursday, August 21, 2014

BIO233 day 1, Fall 2014

Went over the syllabus and grading policies.

Asked students to read the lab safety policies.

Mentioned that readings and assignments will be due before class.

IRB, academic integrity form, photo and video release forms.

I then used PowerPoint to test clickers.

I live-recorded my lecture on YouTube. PowerPoint was set to individual window mode. YouTube live events can recognize the monitor window.

I used name tags to remember student names. I went through 3 rounds of calling each student's name.


I tried to show the spring 2014 bio233 final exam. The second part of the final had a problem, with an error message saying that only a programmer can fix it. I later found that the last question had an error. After deleting the last question, the exam could be shown.

Three out of the 22 students could not read the words on the screen. They seem to have eyesight problems but were not wearing glasses.

Wednesday, August 20, 2014

FYE advising, comparative women's study, fall 2014, FYE advising problem

Comparative women's study, fall 2014. Instructor does not take first year students.

dbGap ?

dbGap, GWAS
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype.

bio386, day 1, fall 2014

Irb release forms  Laptop setup
R installation
R studio installation
Property leasing form
Presurvey on computing attitude

Set up a GitHub account
Create a new repo for namefall2014
Find and join qinlab on GitHub

Basic Apple computer usage, double-click for right-click, command button, screen shot, Finder, Download, Application, command-tab to switch between windows,
Previous student projects

R for Dummies, R by examples to students. 

Monday, August 18, 2014

Open source animation tool

Janet Iwasa: How animations can help scientists test a hypothesis

Molecular classification of cancer is more informative than locations

animation on algorithm

SpelFolio tutorials


meta analysis of GWAS

NGR review by  Evangelos Evangelou and John P. A. Ioannidis, 2013

EI13 wrote that FDR has been rarely used in GWAS, though popular in omics. 

When phenotype is hard to define, GWAS often has problems. 

LRT test on Gompertz-Makeham model,

# I used the single-fitting results as initial values for the double-fitting. It worked.
# Hessian does seem to be available when lower bounds are specified in optim()

#20140818 Nested model test on LS1 RLS using the Gompertz-Makeham model
# Conclusions:
# H0: all I, G, M are the same between control and LS1
# H1: M are different between control and LS1
# p = 1.285958e-05, indicating that H1 fits significantly better than the null hypothesis

# H0 I = 1.000000e-05, G= 0.35, M = 0.021
# H1: I = 5.597089e-07, G = 0.45, M1 = 0.0063, M2 = 0.052


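library(xlsx)      # read.xlsx()
library(flexsurv)  # flexsurvreg(); also attaches survival for Surv()
library(nlme)      # gnls()
tb = read.xlsx( "LS1 MD data.xlsx", 1 );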
ctl = tb[,2]
ls1 = tb[,3]

fitCtlGom = flexsurvreg(formula = Surv(ctl) ~ 1, dist="gompertz")

S = calculate.s(ctl)
s= S$s; t=S$t;
g1 = gnls( s ~  exp( (I/G) *(1 - exp(G* t)) - M*t ), start=list( I=0.003, G=0.16, M=0.01) )
retGM = optim ( g1$coef,, lifespan=ctl,
               lower = c(1E-10, 1E-5, 0), upper = c(0.2, 2, 0.1), method="L-BFGS-B");
retG = optim ( c(3.6E-5, 0.3),, lifespan=ctl,
               lower = c(1E-10, 1E-5), upper = c(0.2, 2 ), method="L-BFGS-B");

fitLS1Gom = flexsurvreg(formula = Surv(ls1) ~ 1, dist="gompertz")

S2 = calculate.s(ls1)
s = S2$s; t = S2$t
g2 = gnls( s ~  exp( (I/G) *(1 - exp(G* t)) - M*t ), start=list( I=0.003, G=0.16, M=0.01) )
retGM2 = optim ( g2$coef,, lifespan=ls1, lower = c(1E-10, 1E-5, 0), upper = c(0.2, 2, 0.1), method="L-BFGS-B");
retG2 = optim ( c(0.0166, 0.1),, lifespan=ls1,
               lower = c(1E-10, 1E-5), upper = c(0.2, 2 ), method="L-BFGS-B");

initIGM = c(retGM$par, retGM2$par)

##### LRT to examine whether the two data sets share I, G, M.
#                            rawIGM =c( I1,     G1,          M1,      I2,         G2,     M2 )
H0   <- function( rawIGM ) { IG <- c(rawIGM[1], rawIGM[2], rawIGM[3], rawIGM[1],  rawIGM[2], rawIGM[3]) }  #all the same
H1m  <- function( rawIGM ) { IG <- c(rawIGM[1], rawIGM[2], rawIGM[3], rawIGM[1],  rawIGM[2], rawIGM[6]) } # M different <- function( rawIGM, model, lifespan1, lifespan2 ) {
  IGM = model(rawIGM);
  I1 = IGM[1]; G1 = IGM[2]; M1=IGM[3]; I2 = IGM[4]; G2 = IGM[5]; M2=IGM[6];
  my.lh1 =[1:3], lifespan1)
  my.lh2 =[4:6], lifespan2)
  my.lh = my.lh1 + my.lh2
  print (IGM ); #trace the convergence
  ret = my.lh

rawIGM = c(retGM$par, retGM2$par)
llh.H0   = optim( initIGM,, model=H0,   lifespan1=ctl, lifespan2=ls1,
                  method="L-BFGS-B", lower=c(1E-5,1E-5,1E-5, 1E-5,1E-5,1E-5) );
llh.H1m  = optim( initIGM,, model=H1m,  lifespan1=ctl, lifespan2=ls1,
                  method="L-BFGS-B", lower=c(1E-7,1E-7,1E-7, 1E-7,1E-7,1E-7) );

rbind( llh.H0$par, llh.H1m$par)
deltaLH = llh.H0$value - rbind( llh.H0$value, llh.H1m$value)

1 - pchisq( 2*deltaLH, df =1 );

###### End of LRT    ############
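For reference, a self-contained sketch of the same kind of nested-model test on simulated lifespans, using the survival function from the gnls() formula above, s(t) = exp((I/G)(1 - exp(G t)) - M t). This is not the original analysis: the function names nllh.gm and sim.gm, the simulated parameter values, and the bounds are all assumptions for illustration, and the data are uncensored by construction.

```r
# Negative log-likelihood of the Gompertz-Makeham model, uncensored lifespans.
# Hazard h(t) = M + I*exp(G*t); survival s(t) = exp((I/G)*(1 - exp(G*t)) - M*t).
nllh.gm <- function(IGM, lifespan) {
  I <- IGM[1]; G <- IGM[2]; M <- IGM[3]
  log.h <- log(M + I * exp(G * lifespan))
  log.s <- (I / G) * (1 - exp(G * lifespan)) - M * lifespan
  -sum(log.h + log.s)
}

# Simulate lifespans as the minimum of a Gompertz time and an exponential
# (Makeham) time, which follows the Gompertz-Makeham distribution.
sim.gm <- function(n, I, G, M) {
  t.gompertz <- log(1 - G * log(runif(n)) / I) / G
  t.makeham  <- rexp(n, rate = M)
  pmin(t.gompertz, t.makeham)
}

set.seed(20140818)
x1 <- sim.gm(200, I = 0.002, G = 0.15, M = 0.005)  # hypothetical 'control'
x2 <- sim.gm(200, I = 0.002, G = 0.15, M = 0.05)   # hypothetical 'mutant'

# H0: one (I, G, M) for both samples. H1: shared I and G, separate M1 and M2.
nllh.H0 <- function(p) nllh.gm(p[1:3], x1) + nllh.gm(p[1:3], x2)
nllh.H1 <- function(p) nllh.gm(p[1:3], x1) + nllh.gm(p[c(1, 2, 4)], x2)

fit.H0 <- optim(c(0.003, 0.16, 0.01), nllh.H0, method = "L-BFGS-B",
                lower = c(1e-10, 1e-5, 0), upper = c(0.2, 2, 0.5))
# Start H1 from the H0 optimum so the larger model starts no worse than H0.
fit.H1 <- optim(c(fit.H0$par, fit.H0$par[3]), nllh.H1, method = "L-BFGS-B",
                lower = c(1e-10, 1e-5, 0, 0), upper = c(0.2, 2, 0.5, 0.5))

# Likelihood ratio statistic; H1 has one extra parameter, so df = 1.
lrt.stat <- max(0, 2 * (fit.H0$value - fit.H1$value))
p.value  <- pchisq(lrt.stat, df = 1, lower.tail = FALSE)
p.value
```

Because I is orders of magnitude smaller than G, the default finite-difference step in optim() is coarse for it; setting control = list(parscale = ...) or fitting on a log scale may give tighter estimates.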

Custom distribution for flexsurv(), example

library(flexsurv)  ## flexsurvreg(); the ovarian data come with survival

## Compare generalized gamma fit with Weibull
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="gengamma")
fitw <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")
plot(fitg)                 ## draw the generalized gamma fit first
lines(fitw, col="blue")    ## overlay the Weibull fit
## Identical AIC, probably not enough data in this simple example for a
## very flexible model to be worthwhile.

## Custom distribution
library(eha)  ## make "dllogis" and "pllogis" available to the working environment
custom.llogis <- list(name="llogis",
                      pars=c("shape","scale"),  ## parameter names, required by flexsurvreg
                      location="scale",         ## the parameter that covariates would act on
                      transforms=c(log, log),
                      inv.transforms=c(exp, exp),
                      inits=function(t){ c(1, median(t)) })
fitl <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.llogis)


Thursday, August 14, 2014

Generate a random file with fixed size

using 'dd'

dd if=/dev/urandom of=a.log bs=2097152 count=1

dd if=/dev/urandom of=a.log bs=106954752 count=1


auc woodruff library, remote access

git push errors for large files (100 MB is the limit)

Files larger than 100 MB are rejected by GitHub.


helen:courses_archives hqin$ git push origin master
Counting objects: 670, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (644/644), done.
Writing objects: 100% (669/669), 789.22 MiB | 916.00 KiB/s, done.
Total 669 (delta 44), reused 0 (delta 0)
remote: error: GH001: Large files detected.
remote: error: Trace: 340050046f102fa5e1320f3fc47e3a80
remote: error: See for more information.
remote: error: File cshlccb2011/ is 172.19 MB; this exceeds GitHub's file size limit of 100 MB
 ! [remote rejected] master -> master (pre-receive hook declined)

error: failed to push some refs to ''

Wednesday, August 13, 2014

git push file size problem, config http.postbuffer

To fix file size problem

#for 500M
git config http.postBuffer 524288000

git config http.postBuffer 9524288000

PSC blacklight

module load R
R -f test.R

blacklight user guide

Advising fall 2014

Spelman Fall advising

Advising, placement analysis in R


library(xlsx)  # read.xlsx2(), write.xlsx()

# Language placement results, all columns read as character
lang = read.xlsx2( "language Placement Results 2014Fall.xlsx", 1,
                  colClasses=c("character", "character", "character", "character", "character") )
for(i in 1:5) {
  lang[,i] = as.character(lang[,i])
}

# Advisee worksheet
tb = read.xlsx2( "qin-FYE-2014-worksheets.xlsx", 1 )
for(i in 1:4) {
  tb[,i] = as.character(tb[,i])
}

tb$SCID %in% lang$SCID   # check which advisees have language results
tb$Language = lang$Language[match(tb$SCID, lang$SCID)]
tb$LangPlacement = lang$placement[match(tb$SCID, lang$SCID)]


# Math placement results
math = read.xlsx2( "Advisor Placements Fall 2014.xlsx", 1)
for(i in 1:4) {
  math[,i] = as.character(math[,i])
}
tb$SCID %in% math$Student.ID.   # check which advisees have math results
tb$MathCode. = math$Code.[match(tb$SCID, math$Student.ID.)]
tb$MathPlacement = math[match(tb$SCID, math$Student.ID.), 4]
tb$MathOther = math[match(tb$SCID, math$Student.ID.), 5]

write.xlsx(tb, "qin-FYE-20140813.xlsx")

Tuesday, August 12, 2014

R excel: match student language placement

For advising


library(xlsx)  # read.xlsx2()

lang = read.xlsx2( "language Placement Results 2014Fall.xlsx", 1,
                  colClasses=c("character", "character", "character", "character", "character") )
for(i in 1:5) {
  lang[,i] = as.character(lang[,i])
}

tb = read.xlsx2( "qin-FYE-2014-worksheets.xlsx", 1 )
for(i in 1:4) {
  tb[,i] = as.character(tb[,i])
}

tb$SCID %in% lang$SCID   # check which advisees have language results
tb$Language = lang$Language[match(tb$SCID, lang$SCID)]
tb$placement = lang$placement[match(tb$SCID, lang$SCID)]

Powerpoint presentation for GoogleHangout Live-event on YouTube

The key is to set the PowerPoint presentation to "by individual windows".

Choose "Setup Slide Show"

 Choose (Browsed by an individual window)

see :

Quotes for teaching

If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.  -- John von Neumann

Faculty Institute, August 12, 2014

Critical inquiry

High Impact Practices

Digital Pedagogy: mobile device in classrooms, learning management system, social media in classrooms, crowd funding research
Ref: future trends in technology and education, Bryan Alexander

Student Learning Outcomes
SLO should be mapped in syllabi. For major core courses, assessment should be provided.

Forum for Education Abroad

To set up Spelfolio, a template and rubric need to be set up, often several weeks ahead.

Millennial Professor

SACSCOC (Southern Association of Colleges and Schools Commission on Colleges)
SACSCOC uses a peer review model (outside of the home state). This is the 5th year report for Spelman.

How should we address SACS requirements?
Identify SLO, evaluate them, improve curriculum based on SLO assessment.

Curriculum Mapping Quick Reference Guide (handout)

Effective rubric

IBL: Inquiry based learning

Teachers studio

transparency, accessibility, collaboration, outreach, speed

President search
Recruiting in Sep-Oct (3 possible search firms), screen and evaluation in Nov-Jan. Interview till May.

What are the characteristics of a future college president? -> Position profile.

J Ehme: Teacher explains, encourages, evaluates, ... ...
WebEx meet for class MWF 2-2:50
WebAssign for homework, quizzes and tests, automatic grading

Caveats: can't read faces and gestures,

Hibbard on lecture capture, 
Don't obsess with perfection.

ShowMe App
Educreation App
Adobe Captivate ($)
Adaptive learning, Moodle learning quizzes

Lecture capture by GoogleHangOut

Monday, August 11, 2014

gradient descent for optimization

Notes from Andrew Ng's Machine learning, Coursera. 
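A minimal sketch of batch gradient descent for least-squares linear regression, the first setting in which the course introduces the method. The toy data, the learning rate alpha, and the iteration count are assumptions chosen for illustration, not values from the course.

```r
# Toy data: y is roughly 2 + 3*x plus noise (hypothetical).
set.seed(1)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.1)

X     <- cbind(1, x)   # design matrix with an intercept column
theta <- c(0, 0)       # initial parameters
alpha <- 0.5           # learning rate (assumed; too large a value diverges)

# Cost J(theta) = sum((X theta - y)^2) / (2m); its gradient is X'(X theta - y) / m.
for (iter in 1:5000) {
  grad  <- t(X) %*% (X %*% theta - y) / length(y)
  theta <- theta - alpha * as.vector(grad)
}

theta               # gradient descent estimate
coef(lm(y ~ x))     # closed-form least-squares fit for comparison
```

For this convex cost the iterates approach the same solution as lm(); in practice the learning rate is tuned by watching the cost decrease across iterations.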

bizhub 754 windows dell laptop installation

12:40pm, download the PCL driver for Windows 7

Install local printer with a new TCP/IP port, did not work. 

Saturday, August 9, 2014

submitR test using 48states

Files in the submitted jobs should be put in the same folder.

Folder pushed to github.

Friday, August 8, 2014

Undergraduate Research Conference, NIMBioS, UTK, Nov 1-2, 2014

meeting page

Abstract guidelines


2014 Undergraduate Research Conference

Abstract Information

Abstract submission deadlines:
  • For those requesting NIMBioS funding: September 10, 2014
  • For all other attendees: October 24, 2014
Abstract Guidelines:
  • For those applying for funding, please submit your abstract(s) with your funding application.
  • For all others, attach abstract(s) as a Microsoft Word document to an email and send to Kelly Sturner. Please use the subject line: URC 2014 Abstract(s).
  • Abstract should be no more than 200 words.
  • Follow style and formatting guidelines of this sample abstract (.doc).
Is this your first abstract? Need some advice? See our Writing the Abstract  PREZI.

Poster guidelines

To request funding

Sunday, August 3, 2014

Predictive power of network aging model

DG asked about the "predictive power/accuracy in its current form" of my network model of aging.

Synthetic 'short'-lived cells in a double-null mutant when both single nulls are long-lived.

Synthetic lethality is credited to Haldane by David Botstein.

Saturday, August 2, 2014

todo: glucose, tor1d, sir2 overexpression,

Which one is closer to DR, SIR2 overexpression or tor1D?

ploidy evolution in raffinose, David Pellman lab

Anna Selmecki

haploids and diploids are stable during controlled evolution in raffinose
triploids and tetraploids tend to become aneuploid, leading to faster evolution.

CGH detected segmental chromosome aneuploidies

sir2, SIR2 overexpression, LRT

(1) sir2D and WT have different R and G. 
(2) sir2D and SIR2 overexpression share the same G. 

Black: by4742; Red: sir2D, Blue: SIR2-overexpression.

spatial sensing in cerevisiae

pheromone receptors polarize and concentrate at one spot?
Cdc42 patch determines the point of growth

positive feedback loop


Normally, hyphae occur in diploid a/alpha and only on solid media.

Occur more in sigma background, rare in S288c.

Friday, August 1, 2014

heterogeneity in yeast colonies,

=> Palková group argues that the yeast colony is a model of a multicellular organism.

=> prion-like proteins in response to environmental cues such as bacteria

=> TF stochastically binds to DNA

=> limited macro-molecules, crowded space within cells

yeast14, august 1, afternoon

=> temperature sensitive genetic interactions
4.4x more than the 2010 version.

negative essential interactions are functionally informative (are locally important).
In other words, positive genetic interactions are long-range.

=> sirtuin substrates, David Paul Toczyski
nicotinamide treatment, quantitative mass-spec

52 substrates of sirtuins, many are TFs, growth and ribosome biogenesis

=> Ohnuki, Ohya, phenotyping of haploinsufficiency in essential genes

ribosomal complex
chaperonin CCT complex

positively correlated GO terms:
negatively correlated GO terms: proteasome <--> tRNA synthesis

=> disordered proteins,  Daniel Jarosz
20 yeast proteins with prion domain (N/Q rich)
screens show many more, suggest more non-canonical prion-domains

GAR+ lives longer than gar-
bacteria elicit [GAR+] via a small molecule (unknown)

GAR+ coexist with bacteria

=> Hofmann
Ure2, Cyc8, Sfp1, Mot3 --> FLO11, HXT2

predicted prion TFs

FLO11 bistable
Holmes 2013 Cell

fermentable carbon ~ nonfermentable carbon

azide blocks electron transfer chain

FLO11 is heavily glycosylated, a glucose sink

Iodine staining of colony (glycogen)

50 nM R3a-5a small molecule inhibitor of manno-transferase?

Dan G: why cheater? Just two modes of developmental state? (It is the same genome.)

=> Ben Tu's lab, mitochondria biogenesis
 yeast metabolic cycles
glucose depletion and ethanol + glycerol induce mitochondria biogenesis

yeast14, August 1, Friday

=> Vacuole pH in replicative aging, Gottschling lab.
Vacuole acidity is reduced during aging and is asymmetric. Hughes Nature 2012

Plasma membrane H+-ATPase has the opposite asymmetric pattern from vacuole acidity between mother and daughter cells.


Quinacrine, vacuole acidity

1 million Pma1 on the plasma membrane, 100K ATPase on the vacuole. Only 3000 free protons in the cytoplasm ??

Glucose activates Pma1 (ATPase), so low glucose decreases ATPase activity.

Protons diffuse very fast, so cytoplasmic proton levels equilibrate almost instantaneously. Yet proton gradients are observed.

Tor1 activity affects vacuole activity.

=> Peroxisome
major ROS producers, similar to liver bile function. peroxisomes always split, but do not fuse.

older peroxisome have different membrane compositions
older peroxisomes are retained in older mother cells.

Older peroxisomes are removed by autophagy.

Peroxisome division is similar to mitochondrial division. Drp1

=> translocation of cyclin C (cnc1) to mitochondria mediates stress-induced fission and programmed cell death
Randy Strich
Chen JCB 2003, AY JCS 2010, mitochondrial fission and fusion

Good: fusion, Bad: fission
H2O2 induces mitochondrial fission

Drp1 controls mitochondrial fission

Cyclin C does not cycle and does not control the cell cycle. Instead, it regulates transcription of stress genes.