Saturday, January 31, 2015

REU comparisons

Wustl, $1200 for housing or $500 for travel, on-campus housing $1300 for 11 weeks

http://reu.cse.wustl.edu/reu/FAQ.html

Friday, January 30, 2015

todo, SVM project. find out the support vectors. verify the predictions in ken rls database.








This project could take a long time

Sequence exercises in R, occurrence of DNA words

R code exercise on occurrence of DNA words.

Learning outcomes:
Longer words should occur less often in DNA.
Restriction enzymes with longer recognition sites should cut DNA less frequently.
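
The second learning outcome can be made concrete with a rough calculation. The Python sketch below assumes a random sequence with equal and independent base frequencies (an idealization that real genomes only approximate), under which a specific site of length n is expected about once every 4^n bases. The function names and the 1,000,000-base example length are hypothetical, chosen only for illustration.

```python
def expected_spacing(site_length):
    """Expected distance (in bases) between occurrences of one specific word,
    assuming equal and independent base frequencies."""
    return 4 ** site_length


def expected_sites(sequence_length, site_length):
    """Expected number of occurrences of one specific word in a random
    sequence, counting every overlapping start position."""
    positions = max(sequence_length - site_length + 1, 0)
    return positions / 4 ** site_length


if __name__ == "__main__":
    # Hypothetical 1,000,000-base random sequence; 4-, 6- and 8-base sites.
    for n in (4, 6, 8):
        print(n, expected_spacing(n), round(expected_sites(1_000_000, n), 1))
```

Under this assumption a 6-base site is expected roughly every 4096 bases and an 8-base site roughly every 65536 bases.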

Reference
http://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/src/chapter1.html
http://www.bioconductor.org/packages/release/bioc/html/REDseq.html



# Exercise to study how the occurrence of DNA words is influenced by their length.
# What are the occurrences of 1-letter, 2-letter, 3-letter, ... 8-letter DNA words?
# Learning outcome: longer words should occur less often in DNA
# by Hong Qin, Jan 30, 2015, for Bio125 @ Spelman College

library("seqinr");

# read in some bacterial 16s rDNA sequences
seqs = read.fasta( "http://www.bioinformatics.org/ctls/download/data/16srDNA.fasta",seqtype="DNA");

# look at the first sequence
seq1 = seqs[[1]]
count(seq1, 1) #nucleotide composition
mean( count(seq1, 1) )

count(seq1, 2) # occurrence of 2-letter DNA words
mean( count(seq1, 2) )

count(seq1, 3) # occurrence of 3-letter DNA words
mean( count(seq1, 3) )
results = count(seq1, 3)
results['agc']

# ?? # occurrence of 4-letter DNA words?
# ?? # occurrence of 5-letter DNA words?
# ?? # occurrence of 6-letter DNA words?

count(seq1, 8) # occurrence of 8-letter DNA words
mean( count(seq1, 8) )
median( count(seq1, 8) )
max( count(seq1, 8) )
hist(count(seq1, 8), breaks=30)

results = count(seq1, 8)
results['agccgacc']
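
For readers without R at hand, the same word-counting idea can be sketched in Python. This is a minimal sketch, not a port of seqinr: it assumes, as I understand count(seq1, k), that words are counted at every overlapping position and that all 4^k possible words are reported, including those with zero occurrences. The names count_words and mean_count and the toy sequence are hypothetical.

```python
from itertools import product


def count_words(seq, k, alphabet="acgt"):
    """Count overlapping k-letter words, reporting all 4**k possible words
    (including those that never occur)."""
    counts = {"".join(w): 0 for w in product(alphabet, repeat=k)}
    seq = seq.lower()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as 'n'
            counts[word] += 1
    return counts


def mean_count(seq, k):
    """Mean occurrence over all possible k-letter words."""
    counts = count_words(seq, k)
    return sum(counts.values()) / len(counts)


if __name__ == "__main__":
    toy = "agccgaccagccgacc" * 10  # hypothetical toy sequence, not the 16S data
    for k in (1, 2, 3, 8):
        print(k, mean_count(toy, k))
```

Because the number of possible words grows as 4^k while the number of windows stays near the sequence length, the mean count per word necessarily drops as k increases.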


*** Instructional Technology request on Lotus Notes

To request adding a student TA to a Moodle course.

The following link only worked on Lotus Notes 9 on Windows. (It did not work on Lotus Notes 8 on my Apple laptop.)

To Submit an Instructional Technology Request:

1. From the Lotus Notes Dashboard, select MIT Requests in the Category section
2. Select MIT Requests in the Applications section in the right column
3. Click on the Submit Requests folder
4. Click on Service Request
5. Complete the request
6. Click Submit

You can also email your support request to the Service Desk at help@spelman.edu.

For Moodle course request: 
  • Locate the MIT Request category
  • Select Instructional Technology Request form
  • Click Open Selected App button
  • Open Submit Request folder
  • Click Moodle Course link
  • Complete form as instructed








For Moodle User access 




Thursday, January 29, 2015

Bizhub c754 to osX 10.9.1 laptop



http://onyxweb.mykonicaminolta.com/OneStopProductSupport/SearchResults?products=1603&fileTypes=0&OSs=39




Gram stain lab

Preparation:

To prepare fresh bacteria, I only need to focus on the gram-positive ones, because the age of gram-negative bacteria does not influence the Gram stain outcome.

Note that Bacillus subtilis can take 2 days of 30C incubation to form colonies.

Materials:
Crystal violet, 95% EtOH, Gram Iodine, Safranin.

Problems: Some of the 95% EtOH was contaminated by iodine, and the rubber was oxidized and cracked.

Georgia Academy of Science

James A Nienow, Treasurer.


todo: PCA analysis to virus data


Data driven investigator, Moore



http://ged.msu.edu/downloads/2014-moore-ddd-preapp.pdf

http://www.moore.org/programs/science/data-driven-discovery/ddd-investigators

Eventbrite

"release tickets" on waiting list:

bio233, epidemiology,

print london maps
students scatter play dough on the map




Clicker presenter card usage


Ref 
www.turningtechnologies.com/pdf/UserGuides/PresenterCard_1.1.pdf

2015: Skipped clickers; adopted Socrative and Moodle online tests.

The application of statistical physics to evolutionary biology, Sella and Hirsh, 2005 PNAS


This is a quite influential paper in many aspects.

It seems to provide a theoretical explanation for the multiplicative and additive fitness measures.


Some of the words that I found interesting are:
Energy is an additive quantity in physical systems, so it is not surprising that its counterpart in the evolutionary process should be the additive fitness.

Maximization of free fitness is precisely analogous to the second law of thermodynamics.









Organismal complexity measure, Tenaillon et al 2007

Quantifying Organismal Complexity using a Population Genetic Approach
PLoS ONE 2007
Olivier Tenaillon, Olin K. Silander, Jean-Philippe Uzan, Lin Chao


Collegiate Learning Assessment




http://en.wikipedia.org/wiki/Collegiate_Learning_Assessment#Criticisms

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.134.383&rep=rep1&type=pdf&utm_source=buffer&utm_campaign=Buffer&utm_content=buffer5a240&utm_medium=twitter


bio125, Thu, 20150129, mini prep, plasmid map, ApE

Section 1

8-8:20am. review mutation assignment
DNA mutation assignment, Wikipedia is wrong.

review mini prep protocol ApE

absorption spectrum of DNA and protein

Endo wash is meant to improve transformation efficiency by removing endotoxin and endonuclease. (Though this is probably not a problem for yeast, Kioko found that it improves the 260/280 ratio.)

Student asked why the elution buffer is used as blank control to measure DNA concentration.

9am. mini prep lab started.
by 10am. Students finished the first round of centrifugation (binding DNA to the Zyppy columns)

11am. Some students came back from the NanoDrop measurement. DNA yields and 260/280 ratios were good.
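
The NanoDrop numbers can be cross-checked by hand, which also shows why the elution buffer serves as the blank: its absorbance is subtracted so that only the DNA contributes to the reading. The Python sketch below assumes the standard conversion of 50 ng/ul per A260 unit for double-stranded DNA at a 1 cm path length. The function names and the example readings are hypothetical, not values recorded in class.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard conversion, dsDNA, 1 cm path length


def dna_concentration(a260, blank_a260=0.0, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/ul) from a blank-corrected A260."""
    return (a260 - blank_a260) * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280, blank_a260=0.0, blank_a280=0.0):
    """Blank-corrected 260/280 ratio; about 1.8 is typical of clean DNA."""
    return (a260 - blank_a260) / (a280 - blank_a280)


if __name__ == "__main__":
    # Hypothetical readings for one miniprep sample and its buffer blank.
    print(dna_concentration(0.52, blank_a260=0.02))
    print(round(purity_ratio(0.92, 0.52, blank_a260=0.02, blank_a280=0.02), 2))
```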


Problems: 
Students were not sure whether they should change to a new tip every time.
One student held the micropipette with the tip upside down.
Most students do not know how to resuspend properly. In large Falcon tubes, Kioko prefers to pipette up and down.
Many students were not clear about when lysis is done.
Many students did not use racks when they picked up tubes from the centrifuge.
Students did not realize there were two microcentrifuges in room 351.

Tips: Kioko handed out TE, lysis buffer, neutralization buffer, wash buffer, and elution buffer separately but in order, to minimize tube mix-ups. (This seemed to slow things down, but avoided chaos.)
Kioko asked students to do one additional spin to dry the columns.



Section 2:
15 min review preclass lab assignment
30 min, students went through the protocol
10 minutes, show video from morning section

by 1:55 pm, miniprep started.

by 3:09 pm, most groups finished eluting plasmid DNA. Students went to 2nd floor core facility for nanodrop DNA concentration measurement.

by 4pm. All groups finished nanodrop measurement. So, nanodrop took 1 hour.

Problems:
Similar problems to section 1.
A group arranged 15ml tubes in unbalanced positions.

Concerns:
Kioko thinks the elution buffer volume should be increased to 50 ul, given that we used 5 ml of E. coli.







Tuesday, January 27, 2015

bio125, Jan 27, Tue, central dogma, dna repair, msh2 overview

Section 1:
Go over assignment and Student presentations 
=>Repair problem #4 not right. #5 not right? 

8:38am replication 3 in class, drawing 
bottom 2, helicase or topoisomerase?

9:15am,   mutation, HSV paper
http://www.ncbi.nlm.nih.gov/pubmed/20026654 ATM S1981 reference

9:15-10am, student demo
ApE on gene X preclass assignment

10-10:10am Rstudio usage on standard curves, led by a student
math4

10:15, msh2 video presentation of spring 2014.

Section 2:


Go over assignment and Student presentations 
=>mutation, #4 #5

by 1:50 pm, finished NSV1 problem set in class

by 2pm, finished replication picture quiz, (This seems really helpful).

2:15pm, standard curve
2:43pm, ApE
by 3pm, reviewed math 4 assignment

3pm, show msh2 student presentation video:
Ask students to identify gene, cancer, subtype, mechanism, model organism.


===============
pre-class: central dogma, DNA repair, MSH2 overview (abstract reading)

In-class: 
(1) Replication (in class, drawings-mutation), 
(2) mutation, HSV paper

(3) Math problem 5, standard curve

(4) MCAT



Running R code on linear regression, generating plot and save figures

ApE usage

replication assignment 3

mutation&repair  assignment 1


MCAT problem set on DNA replication

Review lab report

Math problem review

MSH2 project overview, Socrative quiz using old problems

http://highered.mheducation.com/sites/007353224x/student_view0/chapter11/index.html

Saturday, January 24, 2015

Thursday, January 22, 2015

bio125, Thu, bradford protein concentration determination

Section 1:
8am. Announcement:
office hours, wed 1-5pm. 
Books on SpelElearn common site
Student demo, recording
How to use pipette
Go over serial dilution protocol.

Students were asked to figure out how to perform the experiment on their own, only asking for help when they are 'challenged'.


by 8:30, students finished the protocol presentation and demonstrated pipette usage.
I then went over R code on standard curve preparation and analysis. I explained that sample code will always be provided to students in bio125.
R and RStudio demo for data analysis


I explained how to save pictures in Rstudio

Serial dilution
Bradford protein concentration determination

9:10am. Lab instructor started the lab. BSA stock is in Eppendorf tubes. Unknown samples are labeled by numbers.

Some students were still working at 10:45am.

Problems: Students were not clear about 5X and 1X. This convention was not explained in the protocol. (Wang said 5X is covered in Math4.)
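
The 5X/1X convention is just the dilution relation C1*V1 = C2*V2, with concentration expressed as a multiple of the working strength. A minimal Python sketch follows; the function name and the 10 ml example volume are hypothetical.

```python
def dilute_stock(stock_x, final_x, final_volume):
    """Return (stock_volume, diluent_volume) needed to make final_volume of a
    final_x solution from a stock_x stock, using C1*V1 = C2*V2."""
    if final_x > stock_x:
        raise ValueError("cannot dilute to a higher concentration")
    stock_volume = final_x * final_volume / stock_x
    return stock_volume, final_volume - stock_volume


if __name__ == "__main__":
    # Hypothetical example: 10 ml of 1X reagent from a 5X stock.
    stock_ml, water_ml = dilute_stock(5, 1, 10)
    print(stock_ml, water_ml)
```

In words: a 5X stock is five times the working strength, so one part stock plus four parts diluent gives 1X.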

Students were not sure how to label cuvettes. 
The wrong wavelength was used by one student.
Wrong orientation of the cuvette in the spectrophotometer: one group measured OD with the cuvette placed sideways. 
Some BSA stock does not contain glycerol. Kioko thought the protein may fall out of solution and lead to a low concentration. 

Many students had trouble figuring out the concentration of the original unknown solution. I used a figure to explain it to them.





Kioko: Bradford staining is irreversible. In other words, stained, over-concentrated protein in the cuvette cannot be diluted. So, if a measurement overshoots the standard curve, students have to dilute from the concentrated protein stock again to make it land within the range.
Kioko: Add the Bradford stock as the last solution to the tube, ensuring that the staining reaction occurs to the same extent in every tube.  
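
The back-calculation that many students struggled with can be sketched in Python: fit a line to the BSA standards, read the diluted unknown off the line, then multiply by the dilution factor to recover the original concentration. This is a minimal sketch, assuming a linear standard curve. The function names and all numbers are hypothetical, not class data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def unknown_concentration(absorbance, slope, intercept,
                          dilution_factor, max_standard):
    """Concentration of the original (undiluted) unknown sample."""
    diluted = (absorbance - intercept) / slope
    if diluted > max_standard:
        # Mirrors Kioko's point: an over-range reading cannot be rescued.
        raise ValueError("above the standard curve; dilute from the stock again")
    return diluted * dilution_factor


if __name__ == "__main__":
    standards = [0.0, 0.25, 0.5, 1.0]    # hypothetical BSA standards, mg/ml
    readings = [0.00, 0.10, 0.20, 0.40]  # hypothetical absorbances
    slope, intercept = fit_line(standards, readings)
    # Hypothetical unknown measured after a 1:10 dilution.
    print(unknown_concentration(0.30, slope, intercept, 10, max(standards)))
```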

Kioko also said the students did the serial dilution experiment in bio120.

skipped: Linear fitting, R2 and p-value
Review homework and assignment. No time, leave for next class.

Section 2:
1/3 of the students did not read the protocol or finish the quiz for the lab.



Spelman Factbook


http://www.spelman.edu/docs/FactBook/facts-figsbook_103114.pdf?sfvrsn=2

Tuesday, January 20, 2015

toread, single cell genome and transcriptome sequencing

http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3129.html


Target of rapamycin signalling mediates the lifespan-extending effects of dietary restriction by essential amino acid alteration.



Abstract

Dietary restriction (DR), defined as a moderate reduction in food intake short of malnutrition, has been shown to extend healthy lifespan in a diverse range of organisms, from yeast to primates. Reduced signalling through the insulin/IGF-like (IIS) and Target of Rapamycin (TOR) signalling pathways also extend lifespan. In Drosophila melanogaster the lifespan benefits of DR can be reproduced by modulating only the essential amino acids in yeast based food. Here, we show that pharmacological downregulation of TOR signalling, but not reduced IIS, modulates the lifespan response to DR by amino acid alteration. Of the physiological responses flies exhibit upon DR, only increased body fat and decreased heat stress resistance phenotypes correlated with longevity via reduced TOR signalling. These data indicate that lowered dietary amino acids promote longevity via TOR, not by enhanced resistance to molecular damage, but through modified physiological conditions that favour fat accumulation.

Ubuntu LTS 14.04 installation and basic usage


Installed Ubuntu LTS 14.04 on an old Dell laptop running LTS 12.

Ubuntu LTS 14.04 was burned to a DVD.

The installer asked for an internet connection, and gave the option to upgrade from LTS v12. I was asked to enter a username and password. Presumably the old directory would be overwritten? (No, this was not the case. The old directory was kept.)

Near the end of the installation, a warning said that some applications needed to be reinstalled. Then it asked for a restart.

Somehow, the language was set to Chinese. I figured out that "shift" switches between language input modes.


Georgia Tech Campus map



Guest parking areas are labeled as Area 2, 3, and 4.

bio125, Tue, Jan 20, 2015, DNA structure, replication

_Camcorder to record student performance.

Section 1: 
8-8:20am
_Go over student assignments

8:20-9am, do exercise in class on nucleotide structure, chromosome, and replication.

One student asked the difference between nucleosome and chromosome. I used a cable and some balls made from playdough to illustrate the nucleosome made of histone complex wrapped by DNA strands.

section 1, 9-9:30
_NCBI nucleotide database, video capture
mRNA (Kozak sequence, translation initiation)

section 1: 9:30-10:00
_ApE
_ApE, CDS, reverse complementation

section 1, 10am
_R and Rstudio for the assignment,
_simple R demo code
_make solution demo code
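
The reverse complementation step demonstrated in ApE can also be written in a few lines of code. The Python sketch below handles only the four unambiguous bases and leaves any other character unchanged. The function name and example sequence are hypothetical.

```python
# Translation table for the four unambiguous bases, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the string, so the result reads
    5' to 3' on the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    cds = "ATGGCC"  # hypothetical fragment, not a real gene sequence
    print(reverse_complement(cds))
```

Applying the function twice returns the original sequence, which is a quick sanity check when a CDS sits on the minus strand.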

Lessons: 
The top section of the YouTube video was cropped off during iMovie editing.

What worked: 
I screencast many lectures and demos and uploaded them in time for section 2.

Section 2: 
30 min:
_Go over student assignments. Several groups in this section had problems with math problems.

_NCBI nucleotide database

1:45-2:13, let students help each other.
_ApE
2:13pm-2:20
_ApE, CDS, reverse complementation

by 2:30pm
_ build nucleotide exercise

2:30-3:20pm
_R and Rstudio for the assignment,
_simple R demo code

_make solution demo code 
One student asked about "#" in the R code. Some students asked about the parentheses.

3:20, ask students to summarize the class.

For homework assignment:  nucleotide structure, chromosome, and replication.

Lessons: 
Some students had trouble downloading RStudio. 

Not used:
_Central dogma review (concept map, group activity)
_Prepare R and Rstudio on flashdrives

_DNA structure and replication, with MSH2's role mentioned briefly
human genome 3 billion DNAbase
yeast, how big?
Brooker chapter 11 slides

Go over Brooker questions in class

Past student presentations on MSH2 project
Past project poster

Note, in 2015 spring, students are said to have learned micro pipette usage in BIO120.

Optional: Serial dilution exercise using colored papers and petri dishes

skipped: DNA double struck in SPDB, based bio233 materials on DNA.

Monday, January 19, 2015

*** sample proposals, open research



http://ged.msu.edu/research.html

workshop proposal, including report
http://www.spatial.cs.umn.edu/few/epilogue.html

NIH funding announcement

http://grants.nih.gov/grants/guide/index.html?CFID=189952485&CFTOKEN=76131536&jsessionid=f630742d3881e9f40e3c7f23302b3234753b

NIH forms, templates

NIH forms, format
http://grants.nih.gov/grants/funding/424/index.htm#inst

http://grants.nih.gov/grants/forms.htm#reserach

http://grants.nih.gov/grants/forms_page_limits.htm
Page limits vary among different categories. For R25, it can be 25 pages.

Emerging researcher meeting

http://www.emerging-researchers.org/
Good meeting for students.

I did not see specific requirements for the poster.

Todo: Need to add an introduction of yeast aging. SVM diagram.

R/Rstudio tutorial page for BIO125, Spring 2015

This is a dynamic page and will change frequently during Spring semester of 2015.

1. What is R? 

Wikipedia entry on R

Why R by Courtney Brown at Emory. 

Why R and beyond.

R blogger that provides recent and often interesting development about R.  

What is R video (Added after the class).

2. Install R to your own computers.

Instructions to download R. 

Install RStudio.  RStudio provides a nice GUI to R.

Install packages to R: Video for Windows Version.

3. Introduction to R.


Hong Qin's slides: Overview of R; Basic programming in R; Input & Output in R

Lydon Walker, getting started with R, an accelerated primer


 4. Simple exercises in R.

Multiple regression demo

Hierarchical clustering using cities. Code. Video.

Lady Gaga and clustering analysis. Code. Video.

Bioconductor workshop materials.




http://cran.r-project.org/doc/contrib/Seefeld_StatsRBio.pdf



Saturday, January 17, 2015

NIH glossary

http://grants.nih.gov/grants/glossary.htm#F

Federal Pell Grant


The Federal Pell Grant Program provides need-based grants to low-income undergraduate and certain postbaccalaureate students to promote access to postsecondary education. Students may use their grants at any one of approximately 5,400 participating postsecondary institutions. Grant amounts are dependent on: the student's expected family contribution (EFC) (see below); the cost of attendance (as determined by the institution); the student's enrollment status (full-time or part-time); and whether the student attends for a full academic year or less.


http://www2.ed.gov/programs/fpg/index.html

Thursday, January 15, 2015

Socrative, report student results

It seems Socrative can only require student names in pre-defined quizzes.

I solved this problem by writing a generic quiz on Socrative. I then started Socrative in teacher mode on the Mac tower and tried the student login on two other computers. I logged into Socrative through Gmail directly. After finishing the quiz, I saved the quiz results directly to my Google Drive as a ZIP file that contains the students' names and their answers in an Excel file.

Apparently, Socrative does not allow students to change their answers after their initial submission.

Bio125, day 1, week 1,

Names tags,
Syllabus, flipped classroom, assignments are mostly due before class, 
Learning objectives
Use a seat map to connect student names, faces and seatings.

Signatures, notepad and pens
lab safety form (I added the students names to the list).
IRB, academic integrity, photo video release form (Need to add Yes or No on the form)

pre-assessment
bring incentives

HP laptop log, wireless connections
Bring ethernet cables 

Socrative login. Go over slides to test Socrative. (Students had trouble generating accounts. Later, I figured out that student names are only available in pre-defined quizzes.)

ApE installation. On Yosemite, when the security setting is not changed, OS X gave a warning as if the downloaded software is damaged.

Presentation orders. natural group order

Did Group building on cancer.
PubMed search
Primary literature: original work versus reviews and commentaries.

group 1, 2: colorectal cancer, MSH2
group 3: breast cancer, BRCA1
group 4: leukemia
group 5: oral cancer, smoking, alcohol PMID 25564114
group 6: ovarian cancer: BRCA1 and 2
group 7: prostate cancer:
group 8: stomach cancer
group 9: pancreatic cancer
group 10: liver cancer

Summary: presentation groups of next class on homework assignments.




Wednesday, January 14, 2015

Snowball microphone test, level setting for lecture capturing

Level 3 gave the highest volume recorded in QuickTime. Level 1 is second. Level 2 somehow gave the weakest voice.

I used OS X Yosemite, laptop 'ace'.

NIH Big Data to Knowledge (BD2K) Enhancing Diversity in Biomedical Data Science (R25)

RFA
http://grants.nih.gov/grants/guide/rfa-files/RFA-MD-15-005.html


On January 13, 2015, a new funding opportunity announcement was released entitled NIH Big Data to Knowledge (BD2K) Enhancing Diversity in Biomedical Data Science (R25). The over-arching goal of this BD2K R25 program is to support educational activities that enhance the diversity of the biomedical, behavioral, and clinical research workforce. To accomplish the stated over-arching goal, this FOA will support creative educational activities with a primary focus on research experiences for students and faculty, and for curriculum development. 

The primary purpose of the NIH BD2K Enhancing Diversity in Biomedical Data Science program is to provide resources for eligible institutions to implement innovative approaches to research education for diverse students in Big Data science, including those from underrepresented backgrounds in biomedical research. Higher education institutions listed in the FOA are eligible to apply. Some institutions provide unique opportunities for access to students from diverse backgrounds underrepresented in biomedical and behavioral research. Accordingly, the NIH Big Data to Knowledge (BD2K) program strongly encourages applications from the following institutions: Historically Black Colleges and Universities (HBCUs), Tribally Controlled Colleges and Universities (TCCUs), Hispanic-Serving Institutions (HSIs), Alaska Native and Native Hawaiian-Serving Institutions, and institutions serving individuals living with disabilities. Applicants must collaborate with at least one NIH BD2K Center [NIH BD2K Centers] across the nation to develop the BD2K R25 program at the applicant institution. Refer to RFA-MD-15-005 for details. 



"Collaborative activities with the NIH BD2K Centers may include, but are not limited to: short-term research experiences for students and faculty at the NIH BD2K Centers, and hands-on projects; developing and/or disseminating curriculum materials that will be used at the applicant institution, and/or in a joint-instructional capacity with BD2K faculty."



List of BD2K centers:
http://bd2k.nih.gov/FY14/COE/COE.html#sthash.NGUHPDVC.nZubDRF2.dpbs

UNIVERSITY OF PITTSBURGH (Super computing center?)
http://projectreporter.nih.gov/project_info_description.cfm?aid=8932078&icde=22003109
http://www.dbmi.pitt.edu/person/gregory-cooper-md-phd

UW Madison (past Spelman REU program)
http://projectreporter.nih.gov/project_info_description.cfm?aid=8921373&icde=22003161


References:
http://bd2k.nih.gov/FY14/Ed/Ed.html#sthash.5HdqQJNh.dpbs


Report of a workshop, very informative
http://bd2k.nih.gov/pdf/bd2k_training_workshop_report.pdf

Who to Train: The BD2K workforce will need both quantitative (statistical and computational) expertise and biomedical domain expertise, taken together as “data science” expertise. Examples of biomedical fields that already incorporate varying amounts and mixtures of quantitative expertise are bioinformatics, computational biology, biomedical informatics, biostatistics, and quantitative biology. Both basic and clinical researchers at all career levels need to receive training.

When to Train: Training is needed at all career stages: exposure courses for undergraduates, cross-training for graduate students and postdoctoral fellows, training as needed for researchers at all levels to facilitate their work, refresher courses or certificates in specific competencies for mid-level researchers, and relevant continuing medical education courses for clinical professionals.

What to Train: Both long- and short-term training is needed, and efforts should be guided by the competency level required for the technical knowledge and skills to be gained. The technical knowledge and skills needed include: (1) computational and informatics skills; (2) mathematics and statistics expertise; and (3) domain science knowledge.

How to Train: Several ways to cross-train biomedical and quantitative scientists were suggested, including through (1) new or expansion of existing long-term research training programs (which can incorporate activities such as boot camps, joint and team coursework, delayed laboratory rotations, dual or team mentoring, clinical and industrial externships, and team challenges); (2) short-term courses and hands-on immersive experiments (which can span short courses, certificate programs, immersive workshops, summer institutes, clinical immersion and shadowing, and continuing medical education opportunities); (3) curricula for biomedical Big Data; (4) technology-enabled learning systems and environments (e.g., web-based courses and Massive Open Online Courses (MOOCs)) to offer training to a much larger audience; and (5) a training laboratory that has tools and resources for self-directed learning and exploration. 


Moodle 2.5 customization, dock panel

The Dock panel is a little triangle sign on the left.


Bamboo Wacom driver installation, OS X, Yosemite

The older driver on the CD in the shipment does not work on Yosemite anymore. I found a legacy support page for Bamboo
http://us.wacom.com/en/support/legacy-drivers/

My Bamboo Pen tablet is Model CTL-470.

I connected my Bamboo tablet and it worked.

Tuesday, January 13, 2015

Useful teaching/coaching strategies

From
http://www.scientificamerican.com/article/the-secret-to-raising-smart-kids1/?WT.mc_id=SA_Twitter

You did a good job drawing. I like the detail you added to the people's faces.

You really studied for your social studies test. You read the material over several times, outlined it and tested yourself on it. It really worked!

I like the way you tried a lot of different strategies on that math problem until you finally got it.

That was a hard English assignment, but you stuck with it until you got it done. You stayed at your desk and kept your concentration. That's great!


I like that you took on that challenging project for your science class. It will take a lot of work—doing the research, designing the apparatus, making the parts and building it. You are going to learn a lot of great things.

Oh, sorry, that was too easy—no fun. Let's do something more challenging that you can learn from.

Let's all talk about what we struggled with today and learned from. I'll go first.

Mistakes are so interesting. Here's a wonderful mistake.


Let's see what we can learn from it.


PSC blacklight trial, 20150113


Instructions: 
"Once you login you will be in your $HOME directory (/usr/users/1/hqin2) which is backed up but has a quota of 5 Gbytes. You also have access to a $SCRATCH directory (/brashear/hqin2) which has essentially unlimited storage  and is not backed up. Files in $SCRATCH may be removed, oldest first, to make room when needed, though we try to keep  them for 2-weeks at least.

There is a file archiver, you can access it as the directory /arc/users/hqin2/ from the login node, where you can store whatever you need to keep long-term (while your allocation is active, of course). You can also connect to the archiver via sftp, at data.psc.edu. You can use Fugu or any other graphical user interface if you prefer. This is the simplest way to transfer files to PSC, you can see them in the /arc directory from the login node and copy them to/from the $HOME or $SCRATCH directory as needed.

When you run and write data, we prefer that you write to $SCRATCH, which is a distributed file system and can handle the load, and not to $HOME."


hqin2@tg-login1:~> echo $HOME
/usr/users/1/hqin2
hqin2@tg-login1:~> echo $SCRATCH
/brashear/hqin2
hqin2@tg-login1:~> du /arc/users/hqin2
2 /arc/users/hqin2
hqin2@tg-login1:~> df /arc/users/hqin2
Filesystem           1K-blocks      Used Available Use% Mounted on
/arc                 3656882477312 2021505932032 1635376545280  56% /arc
hqin2@tg-login1:~> df -h /arc/users/hqin2
Filesystem            Size  Used Avail Use% Mounted on
/arc                  3.4P  1.9P  1.5P  56% /arc
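
The two df listings above describe the same numbers in different units, which can be checked with a little arithmetic. The Python sketch below assumes that the 1K-blocks are 1024-byte units, that the P suffix means 1024^5 bytes, and that df rounds up when printing (an assumption that is consistent with the values shown). The function names are hypothetical.

```python
import math

KIB_PER_PIB = 1024 ** 4  # 1 PiB = 1024**5 bytes = 1024**4 blocks of 1 KiB


def kib_to_pib_ceil(kib_blocks):
    """Convert 1K-blocks to PiB, rounded up to one decimal place."""
    return math.ceil(kib_blocks / KIB_PER_PIB * 10) / 10


def use_percent(used, available):
    """Percentage used, rounded up to a whole percent."""
    return math.ceil(100 * used / (used + available))


if __name__ == "__main__":
    # Values copied from the 'df /arc/users/hqin2' listing.
    size, used, avail = 3656882477312, 2021505932032, 1635376545280
    print(kib_to_pib_ceil(size), kib_to_pib_ceil(used), kib_to_pib_ceil(avail))
    print(use_percent(used, avail))
```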


Instructions:
"Look at this webpage:
http://www.psc.edu/index.php/computing-resources/blacklight

it has examples of scripts for running batch jobs, in particular I think you will want to run an 'interactive batch job' to check that your code works.

    qsub -I -l ncpus=16 -l walltime=0:30:00 -q debug

once you get a prompt, you are on the 'backend', or 'compute node', i.e. Blacklight proper, and everything runs there, not on the login node.

Let's say  I have a trivial R example:

y <- rnorm(10)
print(y)

this is saved in a file (example.R), and I want to run it. So I type the 'qsub ....' command above, and  after I got an interactive prompt, enter the following;

source /usr/share/modules/init/bash
module load R

R --slave CMD BATCH ./example.R

and the output appears in 'example.Rout'.  OK, so I'm done. To get out of the 'compute node', I type 'exit' and press enter.

The first two lines (source ... ; module ...) load the definition of the 'module' command, the second uses the module command to put (a version of) R in my path, and the last executes the R script in batch mode.  

Once I have figured out that everything is working, I can run the script in full batch mode (non-interactively) by putting this into a PBS script, i.e. a file, let's call it 'R.pbs':

#!/bin/bash
#PBS -q batch
#PBS -l ncpus=16
#PBS -l walltime=0:03:00

source /usr/share/modules/init/bash
module load R
cd $PBS_O_WORKDIR

ja
R --slave CMD BATCH ./example.R
ja -chlst

So you are just entering the commands you typed interactively,  after a line that indicates what 'shell' you want to run under, and some  options to the batch scheduler (the number of cores, and the minutes, which you had entered on the command line before).   What is new is the "cd $PBS_O_WORKDIR" which makes the script start on whatever directory you were when you submitted the command. Also, the couple of lines "ja" and "ja -chlst"  surrounding the call to R. They are not essential, but  collect useful information on the job (maximum amount of memory, time spent, cpu time used, etc.)

So you have this script called 'R.pbs',  and you can submit it to the scheduler with the command

    qsub R.pbs

The scheduler will reply with something like:
394363.tg-login1.blacklight.psc.teragrid.org

the number is the 'job ID' of your PBS job, which you can use to ask for more information from the scheduler.  You can always ask it 'what jobs do I have in the queue' like this:

    qstat -u hqin2

and it will list them all, together with the state (R means running, Q means it still in the queue).  If it lists nothing, it means all your jobs completed.  After the job completed, there should appear a couple of files in the directory where you put the script. Since I didn't use any option to give the job a name, the files would be named {script name}.e#### and {script name}.o####, in the example that would be R.pbs.o########## and R.pbs.e#######. The 'o' file has any output that the job would write to the standard output, the 'e' file anything that would normally go to the standard error file.   You can also redirect output from any command in the job script to a file. "
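
The naming rule described above ({script name}.o#### and {script name}.e####) makes it easy to predict where a job's output will land. The Python sketch below derives both file names from the scheduler's reply to qsub. The helper names are hypothetical, and it assumes the reply always starts with the numeric job ID followed by a dot, as in the example reply.

```python
def job_number(qsub_reply):
    """Extract the numeric job ID from a reply such as
    '394363.tg-login1.blacklight.psc.teragrid.org'."""
    head = qsub_reply.strip().split(".", 1)[0]
    if not head.isdigit():
        raise ValueError("unexpected qsub reply: %r" % qsub_reply)
    return head


def default_output_files(script_name, qsub_reply):
    """Return the (stdout, stderr) file names PBS uses when the job is not
    given an explicit name."""
    number = job_number(qsub_reply)
    return "%s.o%s" % (script_name, number), "%s.e%s" % (script_name, number)


if __name__ == "__main__":
    reply = "394363.tg-login1.blacklight.psc.teragrid.org"
    print(default_output_files("R.pbs", reply))
```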

source /usr/share/modules/init/bash
module load R
R --slave CMD BATCH ./example.R
hqin2@tg-login1:~> ll example.R* #output is example.Rout
-rw-r--r-- 1 hqin2 mc48o9p  24 2015-01-13 20:47 example.R
-rw-r--r-- 1 hqin2 mc48o9p 942 2015-01-13 20:48 example.Rout
hqin2@tg-login1:~> nano -w R.pbs
hqin2@tg-login1:~> pwd
/usr/users/1/hqin2
hqin2@tg-login1:~> qsub R.pbs 
418673.tg-login1.blacklight.psc.teragrid.org

hqin2@tg-login1:~> qstat -u hqin2

tg-login1.blacklight.psc.teragrid.org: 
                                                                    Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID  NDS  TSK  Memory Time  S Time
-------------------- -------- -------- ---------- ------- ---- ---- ------ ----- - -----
418673.tg-login1     hqin2    batch_r  R.pbs          --   --    16    --  00:03 Q   -- 
hqin2@tg-login1:~> 

Nothing was in the PBS output file, presumably because R CMD BATCH writes its output to example.Rout rather than to standard output. So, I changed the run line to "R -f example.R"

hqin2@tg-login1:~/test> ls
example.R  R2.pbs
hqin2@tg-login1:~/test> ll
total 8
-rw-r--r-- 1 hqin2 mc48o9p  24 2015-01-13 22:33 example.R
-rw-r--r-- 1 hqin2 mc48o9p 199 2015-01-13 22:33 R2.pbs
hqin2@tg-login1:~/test> qsub R2.pbs 
418692.tg-login1.blacklight.psc.teragrid.org
hqin2@tg-login1:~/test> qstat -u hqin2

tg-login1.blacklight.psc.teragrid.org: 
                                                                    Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID  NDS  TSK  Memory Time  S Time
-------------------- -------- -------- ---------- ------- ---- ---- ------ ----- - -----
418692.tg-login1     hqin2    batch_r  R2.pbs         --   --    16    --  00:03 Q   -- 
hqin2@tg-login1:~/test> cat R2.pbs 
#!/bin/bash
#PBS -q batch
#PBS -l ncpus=16
#PBS -l walltime=0:03:00

source /usr/share/modules/init/bash
module load R
cd $PBS_O_WORKDIR

ja
#R --slave CMD BATCH ./example.R
R -f example.R

ja -chlst


hqin2@tg-login1:~/test> ll
total 16
-rw-r--r-- 1 hqin2 mc48o9p   24 2015-01-13 22:33 example.R
-rw-r--r-- 1 hqin2 mc48o9p  199 2015-01-13 22:33 R2.pbs
-rw------- 1 hqin2 mc48o9p    0 2015-01-13 23:13 R2.pbs.e418692
-rw------- 1 hqin2 mc48o9p 4905 2015-01-13 23:13 R2.pbs.o418692
hqin2@tg-login1:~/test> cat R2.pbs.o418692 

R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> y = rnorm(10)
> print (y)
 [1] -0.46271891  0.34547494 -0.97556883 -0.64659599  0.01052027  0.06472313
 [7]  0.43858725  0.83961732 -0.74945123  0.15012829


Job Accounting - Command Report
===============================

    Command       Started    Elapsed    User CPU    Sys CPU       CPU      Block I/O    Swap In      CPU MEM        Characters           Logical I/O      CoreMem   VirtMem   Ex
     Name           At       Seconds    Seconds     Seconds    Delay Secs  Delay Secs  Delay Secs  Avg Mbytes     Read     Written     Read      Write    HiValue   HiValue   St   Ni  Fl   SBU's 
===============  ========  ==========  ==========  ==========  ==========  ==========  ==========  ==========  =========  =========  ========  ========  ========  ========  ===  ===  ==  =======
# CFG   ON(    1) (    7)  23:13:32 01/13/2015  System:  Linux bl0.psc.teragrid.org 2.6.32.49-0.3-default #1 SMP 2011-12-02 11:28:04 +0100 x86_64
ja               23:13:32        0.31        0.00        0.00        0.00        0.00        0.00        0.85      0.019      0.000        19         3      1064     23780    0    0         0.00
uname            23:13:32        0.00        0.00        0.00        0.00        0.00        0.00       12.64      0.004      0.000         8         1       664      5316    0    0         0.00
R                23:13:32        0.00        0.00        0.01        0.00        0.00        0.00        0.00      0.000      0.000         0         1       884     12616    0    0  F      0.00
sed              23:13:32        0.00        0.00        0.01        0.00        0.00        0.00        0.00      0.004      0.000        10         1       816      5396    0    0         0.00
R                23:13:32        0.00        0.00        0.01        0.00        0.00        0.00        0.00      0.000      0.000         0         1       888     12616    0    0  F      0.00
sed              23:13:32        0.00        0.00        0.01        0.00        0.00        0.00        0.00      0.004      0.000        10         1       812      5396    0    0         0.00
R                23:13:32        0.00        0.00        0.01        0.00        0.00        0.00        0.00      0.000      0.000         0         0       856     12612    0    0  F      0.00
rm               23:13:33        0.01        0.00        0.00        0.00        0.00        0.00        0.96      0.012      0.000        20         0       712      5336    0    0         0.00
R                23:13:33        0.35        0.22        0.08        0.00        0.00        0.00       70.16      4.166      0.001       190        25     32412     75240    0    0         0.00


Job CSA Accounting - Summary Report
====================================

Job Accounting File Name         : /dev/tmpfs/418692/.jacct65df3
Operating System                 : Linux bl0.psc.teragrid.org 2.6.32.49-0.3-default #1 SMP 2011-12-02 11:28:04 +0100 x86_64
User Name (ID)                   : hqin2 (51231)
Group Name (ID)                  : mc48o9p (15132)
Project Name (ID)                : ? (0)
Job ID                           : 0x65df3
Report Starts                    : 01/13/15 23:13:32
Report Ends                      : 01/13/15 23:13:33
Elapsed Time                     :            1      Seconds
User CPU Time                    :            0.2200 Seconds
System CPU Time                  :            0.1090 Seconds
CPU Time Core Memory Integral    :            5.2741 Mbyte-seconds
CPU Time Virtual Memory Integral :           15.2699 Mbyte-seconds
Maximum Core Memory Used         :           31.6523 Mbytes
Maximum Virtual Memory Used      :           73.4766 Mbytes
Characters Read                  :            4.2103 Mbytes
Characters Written               :            0.0012 Mbytes
Logical I/O Read Requests        :          257
Logical I/O Write Requests       :           33
CPU Delay                        :            0.0030 Seconds
Block I/O Delay                  :            0.0002 Seconds
Swap In Delay                    :            0.0000 Seconds
Number of Commands               :            9
System Billing Units             :            0.0000

hqin2@tg-login1:~/test> 


Note: I compared today's R.pbs with job1.sh from 20150112.
The line "source /usr/share/modules/init/bash" seems to be critical: it makes sure that the "module" command is recognized.
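
A job script can check for the "module" command before using it, so that a missing init line fails with a clear message. This is a sketch, not a tested Blacklight recipe: `have_command` and `ensure_module` are hypothetical helper names, and the init path is the one from R2.pbs, which may differ on other systems.

```shell
#!/bin/bash
# Path taken from the job script in these notes; an assumption on any other system.
MODULES_INIT=/usr/share/modules/init/bash

# Succeed if the named command, shell function, or alias is defined.
have_command() {
    type "$1" >/dev/null 2>&1
}

# Source the modules init file only when "module" is missing and the file is readable,
# then report whether "module" is available.
ensure_module() {
    if ! have_command module && [ -r "$MODULES_INIT" ]; then
        . "$MODULES_INIT"
    fi
    have_command module
}

if ensure_module; then
    module load R || echo "module load R failed" >&2
else
    echo "module command not available; is $MODULES_INIT present?" >&2
fi
```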