Tuesday, January 31, 2017

cpsc 1100 inner loop, outer loop

For the nested loop demo, I tried to print a triangle tree, but Visual Logic under wine gave a runtime error. Later, I tried it on Windows, and it worked. So wine did not emulate Windows perfectly.

Used Post-it notes and drawings on the board to explain inner and outer loops.
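
The nested-loop idea can also be shown in text form. Below is a minimal Python sketch (not the Visual Logic flowchart used in class); the function name `triangle_tree` and the choice of asterisks are assumptions made for illustration. The outer loop picks the row, and the inner loops build that row one character at a time.

```python
def triangle_tree(height):
    """Return the rows of a centered triangle made of asterisks."""
    rows = []
    for row in range(1, height + 1):      # outer loop: one pass per row
        line = ""
        for _ in range(height - row):     # inner loop 1: leading spaces
            line += " "
        for _ in range(2 * row - 1):      # inner loop 2: stars in this row
            line += "*"
        rows.append(line)
    return rows


for line in triangle_tree(4):
    print(line)
```

Each time the outer loop advances by one row, the inner loops start over from scratch, which is the point the Post-it demonstration is meant to make.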


bio3250 chi-sq, pedigree, 20170131Tue

Go over ppt presentation again.

Socrative
YouTube (make sure screen is captured)
Announce homework assignment

80 minutes on chi-sq analysis. Asked students to work at the board, and used Socrative to get all students to work through the chi-sq calculation procedure.



10 minutes on pedigree analysis (finished just the first example, the autosomal dominant case).


Save socrative.
Save video

Friday, January 27, 2017

flow charts software for education

Sent by JD

Runs on Windows. There is apparently a version for Ubuntu Linux that
doesn't have all the features. May be runnable on a Mac using Wine or
other Windows emulation software. No support for iOS. Can do
object-oriented flowcharting and can generate Java code. Doesn't appear
to have the graphical capabilities of Visual Logic.

Runs on Windows. Will generate code in a number of high-level languages,
but can execute directly from the flowchart like Visual Logic.

Runs on Windows. Looks to be more of a pseudocode-type language, but
generates flowcharts.

None of the above appears to be runnable on iOS devices (we discussed
the desirability of being able to run on iPads). So far, the only
similar product I have found that runs on iOS is Hopscotch:


It is intended to give kids exposure to coding concepts. I'm not sure if
it can do all the same things as Visual Logic. Does anyone else have
experience with it?



UTC Envision


http://www.utc.edu/research-sponsored-programs/tera-help.php

Thursday, January 26, 2017

bio3250 dihybrid cross, pedigree, chi-square, p-value


Socrative
Youtube
Dihybrid cross:
 4 kinds of colored Post-it notes.
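
A dihybrid cross can also be enumerated in code. This is a minimal sketch, assuming two unlinked loci with hypothetical allele letters A/a and B/b and complete dominance; the function names are made up for illustration and are not from the class material.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """List the four gametes of a two-locus genotype such as 'AaBb'."""
    locus1, locus2 = genotype[:2], genotype[2:]
    return [a + b for a, b in product(locus1, locus2)]


def dihybrid_phenotypes(parent1="AaBb", parent2="AaBb"):
    """Count offspring phenotype classes over all 16 gamete pairings."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # A trait is dominant if either gamete carries the uppercase allele.
        trait_a = "A_" if "A" in (g1[0], g2[0]) else "aa"
        trait_b = "B_" if "B" in (g1[1], g2[1]) else "bb"
        counts[trait_a + trait_b] += 1
    return counts


print(dihybrid_phenotypes())
```

For AaBb x AaBb the counts come out as 9 A_B_, 3 A_bb, 3 aaB_, and 1 aabb, the familiar 9:3:3:1 ratio.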



chi-square, degrees of freedom, p-value (using a fake die with three sixes)

1 hr on dihybrid cross, chi-sq and p-value
I was shocked that the entire class picked a wrong answer for the chi-sq p-value interpretation. Later, after careful reading, I realized that the problems were phrased in misleading ways.
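
The die demonstration can be written out as a goodness-of-fit calculation. This is a minimal sketch: the roll counts are hypothetical (not data from class), and the faces of the fake die are assumed to be 1, 2, 3, 6, 6, 6. The null hypothesis is that the die is fair. The critical value 11.070 is the standard chi-square table entry for 5 degrees of freedom at the 0.05 level.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 from 60 rolls of the fake die.
observed = [10, 9, 11, 0, 0, 30]
total = sum(observed)
expected = [total / 6] * 6            # a fair die: 10 rolls per face

statistic = chi_square(observed, expected)
df = len(observed) - 1                # 6 categories - 1 = 5
CRITICAL_05_DF5 = 11.070              # table value, alpha = 0.05, df = 5

reject_fair_die = statistic > CRITICAL_05_DF5
print(statistic, df, reject_fair_die)
```

On interpretation: a statistic above the critical value corresponds to p < 0.05, meaning counts this far from expectation would be unlikely if the die were fair, so the fair-die hypothesis is rejected. A large p-value would only mean the data are consistent with the hypothesis, not that the hypothesis is proven.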

20 min on examples, going over problems in the homework.

Left over: chi-sq exercises, pedigree, probability

Save Socrative
Save YouTube





Wednesday, January 25, 2017

CPSC1100 VL lab2

VL lab2 requires the WHILE loop, which I had not mentioned in class. I had to go over WHILE and FOR loops before the lab.
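
For a quick side-by-side of the two loop types, here is a minimal Python sketch (the lab itself uses Visual Logic flowcharts; the function names are made up for illustration). A FOR loop fits when the number of repetitions is known in advance; a WHILE loop fits when the loop runs until a condition changes.

```python
def sum_with_for(n):
    """Add 1..n with a FOR loop: the repetition count is known up front."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_with_while(n):
    """Same sum with a WHILE loop: the counter is managed by hand."""
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1                       # forgetting this line loops forever
    return total


def count_halvings(n):
    """WHILE loop with an unknown repetition count: halve until n reaches 1."""
    steps = 0
    while n > 1:
        n = n // 2
        steps += 1
    return steps


print(sum_with_for(10), sum_with_while(10), count_halvings(100))
```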



docker deepsea error on macpro



https://github.com/gifford-lab/deepsea-docker

docker run -it --rm \
--device /dev/nvidia0 \
--device /dev/nvidia1 \
--device /dev/nvidia2 \
--device /dev/nvidia-uvm \
--device /dev/nvidiactl \
giffordlab/deepsea-docker






Tuesday, January 24, 2017

CPSC1100 20170124Tue

Start video recording.

Review the phone bill diagram from last class using single conditions and nested IF statements

Compound conditions. Socrative.

Use compound conditions to explain the phone bill.
Smallest number example.

Students had trouble understanding the placeholder variable. I explained it as a "label": I labeled myself with a Post-it, then looked for a person with a different hair color, keeping track of the 'label'.
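
The placeholder idea in the smallest-number example can be sketched in Python as follows. The variable name `smallest_so_far` plays the role of the Post-it "label"; the function name and sample list are assumptions made for illustration.

```python
def smallest(numbers):
    """Return the smallest value in a non-empty list."""
    smallest_so_far = numbers[0]          # the label starts on the first value
    for value in numbers[1:]:
        if value < smallest_so_far:
            smallest_so_far = value       # move the label to the new smallest
    return smallest_so_far


print(smallest([7, 3, 9, 1, 4]))
```

The label never holds more than one value at a time; it only moves when a smaller value shows up.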

Lab2/Hw 2 announcement
Video links on UTCLearn announcement.

Current grade



bio3250 genetics, heredity chapter 3, part2

Socrative

Punnett square of RR x rr cross
 Post-it R and r drawings for Punnett square explanations.
 explain chromosomes
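
The RR x rr square can be enumerated in a few lines. This is a minimal sketch for a single locus; the function name `punnett` is made up for illustration.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """Count offspring genotypes for a single-locus cross such as 'RR' x 'rr'."""
    counts = Counter()
    for a, b in product(parent1, parent2):
        # Sort the alleles so that 'rR' and 'Rr' count as the same genotype.
        counts["".join(sorted(a + b))] += 1
    return counts


print(punnett("RR", "rr"))   # F1 generation
print(punnett("Rr", "Rr"))   # F2 generation from crossing two F1 plants
```

Every cell of the RR x rr square is Rr, while the Rr x Rr square gives the 1:2:1 genotype ratio (RR, Rr, rr).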



1pm, Go over questions in homework of chapter 3. Punnett squares of dog, plant.

1:10pm,
explain Video assignment, rubric
Current_grade explanation
Youtube channel announcement

Things to go over: two-locus Punnett square, pedigree, probability

1:20, end of class. save socrative and youtube video.

Monday, January 23, 2017

FANTOM5 RIKEN project


Time-series expression data?
http://fantom.gsc.riken.jp/papers/

Install theano on applejack




applejack:~ hqin$ which pip

/Users/hqin/anaconda2/bin/pip

anaconda py2.7 commandline installation, applejack

laptop: applejack,

Download commandline Anaconda2 installation using python 2.7.

Under 'hqin'
bash Anaconda2-4.2.0-MacOSX-x86_64.sh
Anaconda2 will now be installed into this location:

/Users/hqin/anaconda2

Sunday, January 22, 2017

DeepSEA


DeepSEA on non-coding variation and TF prediction in human genomes.


Saturday, January 21, 2017

Genenetwork API test

Gene network server API

https://github.com/genenetwork/gn_server/blob/master/doc/API.md


For mouse cross information
http://www.genenetwork.org/mouseCross.html

GEO time series data



(time series cancer) AND "Mus musculus"[porgn:__txid10090] AND cancer

RNN notes



http://karpathy.github.io/2015/05/21/rnn-effectiveness/

https://github.com/h2oai/deepwater

http://machinelearningmastery.com/crash-course-recurrent-neural-networks-deep-learning/

https://www.ncbi.nlm.nih.gov/pubmed/17975278

http://www2.fiit.stuba.sk/~cernans/main/download.html


diving into R h2o



https://www.r-bloggers.com/diving-into-h2o/


R h2o test (passed)

# in bash
# Byte:java hqin$ pwd
# cd /Users/hqin/Library/R/3.3/library/h2o/java
# java -jar h2o.jar 

library(h2o)
localH2O = h2o.init(startH2O = FALSE)

demo(h2o.glm)
#it worked


R H2O initialization problem, Java and versions (errors)


See
https://www.r-bloggers.com/diving-into-h2o/

# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
 
# Next, we download, install and initialize the H2O package for R.
install.packages("h2o", repos=(c("http://s3.amazonaws.com/h2o-release/h2o/rel-kahan/5/R", getOption("repos"))))
 
library(h2o)
localH2O = h2o.init()
# I was reminded to update Java JDK to 64 bit. 
# http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html  download and install. 
# however, h2o still does not detect the newest java. 


step 1:  from a command line prompt, try to start the h2o server manually: 

java -jar h2o.jar 

You can give the full path to java if you want to.  H2O works with both Java 7 and Java 8. 

If needed, you can download a standalone H2O from here: 
        http://h2o.ai/download/ 


step 2:  from R, connect to the already-running h2o server: 

h2o.init(startH2O = FALSE) 

Download H2O
http://s3.amazonaws.com/h2o-release/h2o/master/3751/index.html

H2O version mismatch?
# Error in h2o.init() :
#   Version mismatch! H2O is running version 3.11.0.3751 but h2o-R package is version 3.10.2.2.
#   Install the matching h2o-R version from - http://h2o-release.s3.amazonaws.com/h2o/master/3751/index.html



Try: 
http://s3.amazonaws.com/h2o-release/h2o/master/3751/index.html?aliId=2451206

Questions for BXD


genotypes link not working:
ftp://atlas.uthsc.edu/Public/BXD_WebQTL_Genotypes

BXD mouse cross


http://www.genenetwork.org/mouseCross.html


     BXD:
The BXD family of recombinant inbred (RI) strains were derived by crossing C57BL/6J (B6) and DBA/2J (D2) and inbreeding progeny for 20 or more generations. This genetic reference panel is a remarkable resource because data for thousands of phenotypes and nearly 100 gene, protein, and metabolite expression data sets have been acquired over nearly a 40-year period. Another advantage of the BXD family is that both parents have been sequenced (C57BL/6J as part of a public effort, and DBA/2J by Celera Genomics, by the UTHSC group and by Sanger). Based on our analysis of the sequence data, these two strains differ at approximately 4.8 million SNPs. Variants (mostly single nucleotide polymorphisms and about 500,000 insertion-deletions) that produce interesting phenotypes can be located efficiently. The zoomable physical maps in GeneNetwork can display the positions of B versus D-type SNPs at high resolution.
Our DBA/2J sequence data (from Wang et al. 2016) have been used to generate a virtual genome for this strain using a C57BL/6J framework. In other words, all SNPs and small DBA/2J indels have been inserted in place of the original C57BL/6J sequence. This DBA/2J genome is available via the GeneNetwork page linked above.

EPOCH DIFFERENCES or "Batch Effect" among BXD strains. BXD strains (1 through 103) were produced as at least four separate groups or subfamilies. BXD1 through BXD30 were produced by Benjamin A. Taylor starting in about 1971, with the first publication using early generation BXD lines at F7 to F10 in 1973 (Taylor et al., 1973; Taylor et al., 1975; Womack et al., 1975). A distinction is made between an RI line, which is not necessarily fully inbred (<20 F generations of inbreeding), and an RI strain, which should be the progeny of 20 or more sequential sib matings.
BXD31 and BXD32 are exceptional BXD strains. They were created by the Mouse Mutant Resource at The Jackson Laboratory in the propagation of different visible mutations. Mutations arose in C57BL/6J or DBA/2 and then outcrossed to the other strain and subsequently inbred while maintaining the mutation. In the case of BXD31, inbreeding was followed after the F2 generation. In BXD32, there was a backcross to DBA/2 before sib mating. BXD32 is therefore approximately 75% DBA/2 and 25% B6. When Taylor learned about these strains he decided that they would be a useful addition to the BXD RI set. He eliminated the visible mutations, continued inbreeding, and added them to the BXDs set. The first publication using these two strains was in 1980. Both are usually lumped together with BXD1 through BXD30 as a "single" first cohort, but obviously, they should perhaps be excluded from all cohorts or BXD "epochs." BXD32 is also exceptional because it has a D mitochondrial genome and a B-type Y chromosome. (Information in this paragraph mainly from BA Taylor, email of Aug 17, 2014 to RWW).
BXD33 through BXD42 were also produced by Benjamin Taylor (Taylor et al. 1999), but from a new set of F2 crosses initiated in the early 1990s.
BXD43 through BXD102 were produced by Lu Lu, Jeremy Peirce, Lee M. Silver, and Robert W. Williams in the late 1990s and early 2000s using advanced intercross progeny (Peirce et al. 2004). These strains have roughly twice the number of recombinations as conventional F2-derived RI strains.
BXD104 through BXD157 were produced by Lu Lu and Robert W. Williams starting in 2008 using standard F2 stock.
BXD160 through BXD186 were produced by Lu Lu and Robert W. Williams starting in 2010 using G8 and G9 advanced intercross stock donated by Abraham Palmer.
BXD187 through BXD220 were produced by Lu Lu and Robert W. Williams starting in 2014 using F2 stock.
Initial genotypes in 2008 highlighted breeding errors that resulted in sets of very closely related sister substrains that are almost genetically identical. In collaboration with the Jackson Laboratory, we have made the following changes in strain names to clarify the strong genetic relations among the newer BXD strains.
In general, data for the following sister substrains needs to be handled with special care prior to or during statistical analysis or gene mapping for the simple reason that the substrains are not genetically independent. However, if investigators do discover significant phenotype differences among strains, then data can be treated independently. (The latest version of GeneNetwork (GN 2) includes a mapping method called Python Linear Mixed Model (pyLMM) that corrects for the effects of shared pedigrees and corrects mapping results (code written by Nick Furlotte in 2011-2013). )
  1. BXD73, BXD73a (originally known as BXD80), and BXD73b (originally known as BXD103) are genetically very similar. BXD73 and BXD80 are genetically identical at 82264 of 100290 markers (82% identical by descent). BXD73 keeps its original name and JAX identifier number (JR#7117), whereas BXD80 is now referred to as BXD73a (JR#7124). BXD73 and BXD103 are genetically identical at 90917 of 100290 markers (90.6% identical by descent). BXD103 is now referred to as BXD73b (JR#7146).
  2. BXD48 and BXD48a (originally known as BXD96) are sister substrains, and are genetically identical at 93485 of 100290 markers (93.2% identical by descent). BXD48 retains its original name and JAX identifier number (JAX JR#7097) whereas BXD96 is now referred to as BXD48a (JR#7139).
  3. BXD65, BXD65a (originally known as BXD97), and BXD65b (originally known as BXD92) are sister substrains. BXD65 and BXD97 are genetically identical at 92225 of 100290 markers (92% identical by descent). BXD97 is now referred to as BXD65a (JR#7140). BXD65 and BXD92 are genetically identical at 6155 of 6459 markers (95.3% identical by descent). BXD65 retains its original name and JAX identifier (JR#7110) whereas BXD92 is now referred to as BXD65b (JR#9677).
While the strains used to generate these subsets of BXDs have the same official names and were all made using stock from the Jackson Laboratory, the individual parents are not genetically identical due to inevitable genetic drift and mutation. Shifman and colleagues detected a surprisingly large number of new SNPs (n = 47 out of about 13000 SNPs studied) in the set of strains generated by BA Taylor in the early 1990s, and a small number (n = 5) of even newer SNPs in the set of BXD strains generated at UTHSC in the late 1990s (see Shifman et al., 2006).







"In the BXD set, 52 SNPs showed variation in genotypes that corresponded to the different phases of development of the BXD RIs [24–26] (Table S4). Forty-seven SNPs are not polymorphic in the 26 BXD strains established from a single cross of a C57BL/6J female to a DBA/2J male, but are polymorphic in similar BXD strains established more than 20 y later. Five SNPs are not polymorphic in the first 36 BXD strains, but are polymorphic in the newest set of 53 BXD lines (BXD43–100)."

Correction for Family or Epoch Substructure
The BXDs have the following epoch substructure:
  1. BXD1 through 30 make up the first epoch. Breeding for this group of BXD strains started in about 1970, with the first publication of fully inbred BXD strains in 1975 (Taylor et al., 1983, see Trait ID 10715). In fact, BXD32 has a mitochondrion that is inherited from DBA/2J. BXD32 could be considered the first DXB strain (DXB32).
  2. BXD33 to BXD42 make up Ben Taylor's final addition to the BXD strains (Taylor et al, 2001, Trait ID 10645).
  3. BXD43 to BXD103. This is a complex cohort of strains generated at UTHSC from advanced intercross progeny (Peirce et al., 2004).
  4. BXD104 to BXD157. This is a single cohort of strains generated at UTHSC from F2 intercross stock.
  5. BXD160 to BXD186. This is a single cohort of strains generated at UTHSC from G8 and G9 advanced intercross progeny donated to RW Williams by Abraham Palmer in 2008.
  6. BXD187 to BXD220. This is a single cohort of strains generated at UTHSC from F2 intercross stock.
Users of the expanded BXD panel should take this epoch substructure into account. This is easy to do using the "Epoch" traits that are included in the BXD Phenotype database. For example, BXD Phenotype 12688 (BXD epoch batch trait 1) provides a simple code for the major phases of BXD production using the code of -1 for the first set through to BXD32, 0 for the second set (33 to 42), and +1 for the newer UTHSC set (43 to 103).
  1. Determine whether your trait covaries well with any one of the three Epoch traits in GeneNetwork. Also check the status of BXD31 and BXD32. They may not belong to any group.
  2. Determine if your trait maps extremely well to Chr 4 at 62 Mb (near the ALAD segmental duplication in DBA/2J).
Strain nomenclature: Some of the BXD strains have accumulated new mutations that have recently been characterized. When these mutations are known, the full nomenclature of the strain is now being modified. For example, BXD24/TyJ (aka BXD24 in most GeneNetwork databases), suffered a mutation in the Cep290 gene in the late 1980s. The mutant allele (rd16) is associated with autosomal recessive retinal degeneration. The original BXD strain was briefly referred to as BXD24a/TyJ, while the blind co-isogenic mutant was referred to as BXD24b/TyJ. The great majority of phenotype, expression, and genotype data in GeneNetwork was generated using these blind BXD24b/TyJ animals. However, in 2010, the nomenclature was changed again and the blind variant (JAX stock 000031) is now known as BXD24/TyJ-Cep290rd16/J. The original BXD with normal vision was rederived from frozen stock and is now known once again as BXD24/TyJ, although the stock number has now been changed to 005243.
BXD29/TyJ was also known as BXD29/TyJ-Tlr4, but is now formally BXD29-Tlr4lps-2J/J (JAX stock 000029). The original non-mutant stock is currently known as BXD29/TyJ again but the stock number of these rederived non-mutants has been changed to 010981.
The mitochondrial DNA of all BXD strains were typed by Jing Gu and Shuhua Qi (Nov 2004) using DNAs obtained from the Jackson Laboratory (BXD1 through 42) or from the UTHSC colony. This typing relied on a SNP marker identified by Jan Jiao in Weikuan Gu's laboratory at nucleotide position 9461 in the reference C57BL/6J mitochondrial sequence. Most strains have inherited mitochondria from C57BL/6J. However, the following strains have mitochondria with a D allele at the UT-M-9461 SNP: BXD32, 61, 74, 76, 82, 89, 90, 91, 95, and BXD99. These ten strains could be considered DXB recombinant inbred strains.
Genotypes of these strains: All BXD strains were genotyped in the first half of 2005 at 13377 markers as part of a CTC-Wellcome Trust collaboration. When combined with previous markers, there are a total of 7636 informative markers that differ between the parental strains and that are useful for mapping with the BXD strains. The locations of these markers are known on the latest assembly of the mouse genome (Build 34, mm6). The median distance between these informative markers is 178,831 bp. The mean distance is 324,493 bp. There are only 26 intervals between markers that are longer than 5 Mb. No interval is greater than 10 Mb except on Chr X. These long intervals are essentially monomorphic between the parental strains.
The BXD genotype file used in GeneNetwork up to 2014 includes a selected subset of approximately 3795 markers (out of 7636) and includes all those markers with unique strain distribution patterns (SDP) as well as pairs of markers--the most proximal and most distal--for SDPs represented by two or more markers. Slightly updated versions of this BXD genotype data set can be downloaded by ftp at ftp://atlas.uthsc.edu/Public/BXD_WebQTL_Genotypes.
There are a total of 1848 known recombinations in the 36 older (JAX) BXD set; an average of 48.1 recombinations per strain.
There are a total of 4366 known recombinations in the 53 of the first set of UTHSC BXD strains (BXD43 to BXD102); an average of 82.4 recombinations per strain (Shifman et al., 2006). These RI strains were generated from an advanced intercross, and this accounts for the higher recombination load (Peirce et al., 2005).
The "classic" genotypes of the BXD strains (used through Dec 2016 in GeneNetwork) rely mainly on the Mouse Universal Genotyping Array (MUGA) genotyping platform. They also rely to some extent on earlier genotyping in 2008 using the Affymetrix Mouse Diversity array and even earlier work by Williams et al (2001) using microsatellite markers.
In January 2017, the genotypes of most extant BXD strains were updated. The new data also include initial genotypes for the newest cohort of BXDs (BXD104 to BXD220). This January 2017 genotype file provides consensus genotypes for 198 BXD strains. Of the 198 BXD strains, 191 are independent, whereas 7 are substrains (e.g., BXD48 and BXD48a). This file provides approximate locations of 11500 recombinations, an average of 58 per strain. Genotypes were generated using Affymetrix, MUGA, MegaMUGA, and GigaMUGA Illumina platforms. Microsatellites and eQTL genotypes were generated by the Williams and Lu laboratory. Unknown genotypes were imputed as B or D, or were called as H (heterozygous) if the genotype was uncertain. Genotypes were manually curated by RW Williams. Genotypes were smoothed to remove unlikely recombination events. Almost all recombinations are supported by multiple markers, although only one or two representative markers may be provided in this file. The original parent file (BXD_El_Grande_Master_Used_to_Proof_Final_Genotypes_2016.xlxs) contains data for approximately 37000 markers. A subset of the most informative 7300 markers are included in the 2017 genotype file. Genotypes for Chr Y and Chr M are provisional and will be verified in 2017. As of 2016, many strains with higher numbers (BXD100 and above) are not fully inbred.
Approximately 5100 phenotypes are currently included in the BXD Phenotype database in GeneNetwork as of Jan 2017. You can get an update on this number by typing in an asterisk (*) in the search box.
How to obtain these strains: Please see http://jaxmice.jax.org/strain/000105.html. Cost of the JAX BXD strains was approximately $65.40 each in 2008, $135.00 each in 2014, and $139.90 in 2017. To obtain strains BXD43 and higher please contact Rob Williams. All strains through BXD102 are now fully inbred. We expect to generate as many as 160 viable BXD strains by 2020. All extant BXD strains are being sequenced in early 2017 at 30X using 10X Chromium libraries at Hudson Alpha (supported by the UTHSC CITG, Williams, Lu, and colleagues at UTHSC, Abraham Palmer and colleagues at UCSD, and by Jonathan Pritchard at Stanford).
For more details on the history, generation, and use of RI strains as genetic reference populations for systems genetics please see Silver (1995)

Q: Can the BXD lineage be inferred from their genomic sequences?



toread: mutation accumulation in human aging tissue




Tissue-specific mutation accumulation in human adult stem cells during life

http://www.nature.com/nature/journal/v538/n7624/full/nature19768.html

Thursday, January 19, 2017

hadoop installation on single-node Ubuntu cluster



http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

cpsc1100, "IF" statement

video introduction

socrative

Phone-bill example:
<=600 min, free
(600, 1800], 1 cent per min
(1800, infinite), 2 cents per min
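
The tiers above can be sketched in Python two ways: with nested IF statements and with a compound condition. One assumption is labeled loudly here: the notes do not say whether the per-minute rate applies to all minutes or only to minutes above the threshold, so this sketch charges the rate on all minutes used. The function names are made up for illustration, and charges are returned in cents.

```python
def bill_nested(minutes):
    """Nested IF version; returns the charge in cents."""
    if minutes <= 600:
        return 0
    else:
        if minutes <= 1800:
            return minutes * 1        # ASSUMPTION: rate applies to all minutes
        else:
            return minutes * 2


def bill_compound(minutes):
    """Compound-condition version of the same rule."""
    if minutes <= 600:
        return 0
    if minutes > 600 and minutes <= 1800:
        return minutes * 1            # ASSUMPTION: rate applies to all minutes
    return minutes * 2


for m in (600, 601, 1800, 1801):
    print(m, bill_nested(m), bill_compound(m))
```

Testing the boundary values 600, 601, 1800, and 1801 is the quickest way to check that the `<=` and `>` comparisons match the interval notation.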



wine Respondus, macOS, network problem

Installed Respondus through wine on the OS X laptop.


Byte-5:windows hqin$ wine Respond.exe 

#It runs, but publishing to UTClearn failed.