Friday, May 25, 2018

NSF no-deadline calls

Update on No-Deadline

by dbiblogger
Following up on a previous post, “Upcoming Changes for Fiscal Year 2019 Award Submission,” new solicitations to replace previous solicitations in the DBI Research Resources Cluster will be made available after mid-year 2018.  Programs affected include:
  • Advances in Biological Informatics (ABI - NSF 15-582)
  • Collections in Support of Biological Research (CSBR - NSF 15-577)
  • Improvements in Facilities, Communications, and Equipment at Biological Field Stations and Marine Laboratories (FSML - NSF 16-506)
  • Instrument Development for Biological Research (IDBR - NSF 13-561)
For more information on the move to "no-deadline", please see the Dear Colleague Letter (NSF 18-011) and the Frequently Asked Questions.
Subscribe to the BLOG and stay tuned for updates!

Wednesday, May 23, 2018

informatics tools for cancer research

R03 data analysis

bugs, R random seed problem and RData

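I had a problem with the random seed not resetting: results stayed the same even after I commented out the set.seed() line. Later, I realized that I had saved the working environment in an RData file. The saved workspace includes .Random.seed, so the generator state is restored every time the workspace is loaded.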
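
The mechanism can be illustrated with a small Python analogy (not R itself): if the generator state is saved with the session and restored later, the "new" session repeats the same draws even though no seed is set. The helper names save_workspace and restore_workspace are hypothetical, and pickle only stands in for the RData file.

```python
import pickle
import random


def save_workspace(rng):
    """Serialize the generator state, loosely like saving a workspace to disk."""
    return pickle.dumps(rng.getstate())


def restore_workspace(blob):
    """Create a generator and load the saved state into it."""
    rng = random.Random()
    rng.setstate(pickle.loads(blob))
    return rng


# Session 1: seed once, save the "workspace", then draw some numbers.
rng = random.Random(2018)
blob = save_workspace(rng)
first_session = [rng.random() for _ in range(3)]

# Session 2: no seed is set, but the restored state replays the same stream.
restored_rng = restore_workspace(blob)
second_session = [restored_rng.random() for _ in range(3)]

print(first_session == second_session)
```

In R, the corresponding fix would presumably be to start without restoring the saved workspace, or to remove .Random.seed from it before drawing.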

NIA aging cell repository

The NIA Aging Cell Repository is located at the Coriell Institute for Medical Research in Camden, New Jersey, where it provides biological samples from older animals and people to researchers investigating the science behind aging. Cells and DNA samples are collected using strict diagnostic criteria and banked under high-quality standards of cell culture and DNA purification. Scientists from more than 40 countries have used the highly characterized, viable, and contaminant-free cell cultures from this collection for cellular and molecular research on the biology of aging.
Examples of potential uses: Cells from young and old mice can be compared to determine specific differences in how they harvest energy, make new proteins, and dispose of waste. DNA collected from octogenarians, nonagenarians, and centenarians can be used to search for potential biomarkers of aging in people.

How to obtain cell and tissue samples

You can order samples directly from the Coriell Institute. The Institute ships an average of 1,200 cell cultures and over 400 DNA samples or panels from the aging cell bank each year. About 90 percent of the shipments go to investigators at academic, nonprofit, or government institutions; at present, cells and DNA are available to these institutions at no cost. Instructions for ordering cells and additional information on pricing are available at

Friday, May 11, 2018

Aging in S. pombe

It seems that S. pombe in stress-free conditions has a constant mortality rate of 0.3% and is non-aging in the sense that mortality does not increase with age. However, under stress conditions, aging can occur due to asymmetric separation of protein aggregates. This suggests that S. pombe can hedge its 'dying' modes in different environments, a bet-hedging method of living and dying.
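
To make "non-aging" concrete, here is a minimal sketch that contrasts a constant hazard with an age-dependent one. It assumes the 0.3% figure is a rate per unit of age (for example per division; the unit is not stated above) and uses a continuous-time approximation. The Gompertz form and its growth parameter of 0.1 are illustrative assumptions, not fitted values.

```python
import math


def hazard_constant(rate, age):
    """Non-aging: the mortality rate is the same at every age."""
    return rate


def survival_constant(rate, age):
    """Fraction surviving to `age` under a constant hazard (exponential decay)."""
    return math.exp(-rate * age)


def hazard_gompertz(initial_rate, growth, age):
    """Aging: the mortality rate grows exponentially with age."""
    return initial_rate * math.exp(growth * age)


def survival_gompertz(initial_rate, growth, age):
    """Fraction surviving to `age` under a Gompertz hazard."""
    return math.exp(-(initial_rate / growth) * (math.exp(growth * age) - 1.0))


RATE = 0.003    # 0.3% per unit of age (unit assumed)
GROWTH = 0.1    # hypothetical Gompertz growth parameter

for age in (0, 10, 50, 100):
    print(age,
          round(survival_constant(RATE, age), 4),
          round(survival_gompertz(RATE, GROWTH, age), 4))
```

Under the constant hazard, the survival curve is a straight line on a log scale; any curvature in log-survival is the signature of aging.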

PRIDE: proteomics data repository

Thursday, May 10, 2018

generate github repo for publication support, network aging manuscript

applejack:network_aging_ms_draft hqin$ pwd


After porting the old repo to the new one, I fixed some variable names and cleaned up unnecessary lines in the code. "diff" shows that the old publication file and the newly generated publication file are the same (except for one column name). So the ported repo reproduces the old results for the natural isolate RLS fitting.

applejack:github hqin$ ls network_aging_ms_draft/0.nat.rls.fitting/sandbox/Bootstrap_summary_for_publication.csv 
applejack:github hqin$ ls bmc_netwk_aging_manuscript/R1/0.nat.rls.fitting/sandbox/Bootstrap_summary_for_publication.csv 
applejack:github hqin$ diff  bmc_netwk_aging_manuscript/R1/0.nat.rls.fitting/sandbox/Bootstrap_summary_for_publication.csv  network_aging_ms_draft/0.nat.rls.fitting/sandbox/Bootstrap_summary_for_publication.csv 
< "","RwithStd","t0withStd","nwithStd","GwithStd","avgLSwithStd"
> "BootstrapMean...c..strains...","RwithStd","t0withStd","nwithStd","GwithStd","avgLSwithStd"
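
The same check can be sketched in Python so that header differences and body differences are reported separately. The function names are hypothetical, and the single data row in the demo is a made-up placeholder; only the two header lines come from the diff output above.

```python
import csv
import io


def compare_csv_text(old_text, new_text):
    """Return (header_changes, body_equal) for two CSV documents given as text.

    header_changes lists (column_index, old_name, new_name) for renamed columns,
    comparing columns pairwise up to the shorter header.
    """
    old_rows = list(csv.reader(io.StringIO(old_text)))
    new_rows = list(csv.reader(io.StringIO(new_text)))
    header_changes = [
        (i, old_name, new_name)
        for i, (old_name, new_name) in enumerate(zip(old_rows[0], new_rows[0]))
        if old_name != new_name
    ]
    body_equal = old_rows[1:] == new_rows[1:]
    return header_changes, body_equal


def compare_csv_files(old_path, new_path):
    """Same comparison, reading the two files from disk."""
    with open(old_path, newline="") as f_old, open(new_path, newline="") as f_new:
        return compare_csv_text(f_old.read(), f_new.read())


# Headers taken from the diff above; the data row is a placeholder.
old_text = '"","RwithStd","t0withStd","nwithStd","GwithStd","avgLSwithStd"\n"s1","a","b","c","d","e"\n'
new_text = '"BootstrapMean...c..strains...","RwithStd","t0withStd","nwithStd","GwithStd","avgLSwithStd"\n"s1","a","b","c","d","e"\n'

header_changes, body_equal = compare_csv_text(old_text, new_text)
print(header_changes, body_equal)
```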

20180531. Trying to find the old script for the histogram overlay from the Jan 13, 2018 commit.

Tuesday, May 8, 2018

multiple monitor setting, ridgeside

nvidia-settings, X-config for multiple monitors.

LDAP password change

Just a quick reminder: the LDAP authentication upgrade is happening this evening. Several reminders below:

This will require a desktop reboot as we apply updates and switch over to LDAP.
After the upgrade, the first time you log in, you'll need to take the following steps:
  1. Do not try to log in using the normal log in screen. If you do this, the login will fail.
  2. Instead of logging in the normal way, press ctrl-alt-F1 and enter your current username and password at the console screen you'll get. If you log in successfully, you'll be prompted to create your new password; remember it must be 14 characters long.
  3. Then, press ctrl-alt-F7 to get back to the normal login screen. You should now be able to log in normally using your new password.

Wednesday, May 2, 2018

machine learning, hyper parameters

Hyperparameters, such as how the data are partitioned into training, validation, and test sets, or the number of iterations, generally have no theoretically fixed way to be chosen.

Hyperparameters are those that must be fixed before learning starts.

Hyperparameter optimization may be done by comparing tuples of hyperparameters, based on a predefined loss function evaluated on independent (held-out) data.
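
A minimal sketch of this idea is a grid search: each tuple (learning rate, number of iterations) is fixed before training, the model is fit on the training split, and the tuples are ranked by mean squared error on a separate validation split. The toy data, the grid values, and all function names here are assumptions chosen for illustration.

```python
import itertools


def fit_line(xs, ys, learning_rate, n_iterations):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(params, xs, ys):
    """Predefined loss: mean squared error of the fitted line on (xs, ys)."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, valid, learning_rates, iteration_counts):
    """Score every hyperparameter tuple on the validation split."""
    results = {}
    for lr, n_iter in itertools.product(learning_rates, iteration_counts):
        params = fit_line(train[0], train[1], lr, n_iter)
        results[(lr, n_iter)] = mse(params, valid[0], valid[1])
    best = min(results, key=results.get)
    return best, results


# Toy data from y = 2x + 1, split into alternating training and validation points.
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]
train = (xs[0::2], ys[0::2])
valid = (xs[1::2], ys[1::2])

best, results = grid_search(train, valid, [0.01, 0.1], [10, 100])
print(best, results[best])
```

A final test split, untouched during the search, would still be needed to report an unbiased error for the chosen tuple.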

Tuesday, May 1, 2018


bash: n: command not found
[hqin2@r748 ~]$ pyspark
Python 2.7.11 (default, Feb 23 2016, 17:47:07) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/05/01 14:09:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/01 14:09:24 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
18/05/01 14:09:27 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.0

Using Python version 2.7.11 (default, Feb 23 2016 17:47:07)
SparkSession available as 'spark'.
>>> sc.textFile("nana_19950801.tsv")
nana_19950801.tsv MapPartitionsRDD[1] at textFile at
>>> lines_rdd = sc.textFile("nana_19950801.tsv")
>>> lines_rdd
nana_19950801.tsv MapPartitionsRDD[3] at textFile at
>>> dir(lines_rdd)
['__add__', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getnewargs__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_computeFractionForSampleSize', '_defaultReducePartitions', '_id', '_jrdd', '_jrdd_deserializer', '_memory_limit', '_pickled', '_reserialize', '_to_java_object_rdd', 'aggregate', 'aggregateByKey', 'cache', 'cartesian', 'checkpoint', 'coalesce', 'cogroup', 'collect', 'collectAsMap', 'combineByKey', 'context', 'count', 'countApprox', 'countApproxDistinct', 'countByKey', 'countByValue', 'ctx', 'distinct', 'filter', 'first', 'flatMap', 'flatMapValues', 'fold', 'foldByKey', 'foreach', 'foreachPartition', 'fullOuterJoin', 'getCheckpointFile', 'getNumPartitions', 'getStorageLevel', 'glom', 'groupBy', 'groupByKey', 'groupWith', 'histogram', 'id', 'intersection', 'isCheckpointed', 'isEmpty', 'isLocallyCheckpointed', 'is_cached', 'is_checkpointed', 'join', 'keyBy', 'keys', 'leftOuterJoin', 'localCheckpoint', 'lookup', 'map', 'mapPartitions', 'mapPartitionsWithIndex', 'mapPartitionsWithSplit', 'mapValues', 'max', 'mean', 'meanApprox', 'min', 'name', 'partitionBy', 'partitioner', 'persist', 'pipe', 'randomSplit', 'reduce', 'reduceByKey', 'reduceByKeyLocally', 'repartition', 'repartitionAndSortWithinPartitions', 'rightOuterJoin', 'sample', 'sampleByKey', 'sampleStdev', 'sampleVariance', 'saveAsHadoopDataset', 'saveAsHadoopFile', 'saveAsNewAPIHadoopDataset', 'saveAsNewAPIHadoopFile', 'saveAsPickleFile', 'saveAsSequenceFile', 'saveAsTextFile', 'setName', 'sortBy', 'sortByKey', 'stats', 'stdev', 'subtract', 'subtractByKey', 'sum', 'sumApprox', 'take', 'takeOrdered', 'takeSample', 'toDF', 'toDebugString', 'toLocalIterator', 'top', 'treeAggregate', 'treeReduce', 'union', 'unpersist', 'values', 'variance', 'zip', 'zipWithIndex', 'zipWithUniqueId']
>>> lines_rdd.filter(lambda line: "stanford" in line)
PythonRDD[4] at RDD at PythonRDD.scala:48
>>> rdd = sc.textFile("Complete_Shakespeare.txt")
>>> rdd
Complete_Shakespeare.txt MapPartitionsRDD[6] at textFile at
>>> rdd
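
Note on the session above: textFile and filter only build RDDs. Transformations are lazy, so nothing is read until an action such as count() or take() runs. A minimal sketch of the next step follows. The helper names are hypothetical, the sample lines are made up, and the Spark function is defined but not called so that the sketch also loads on a machine without pyspark.

```python
def mentions(keyword):
    """Build a line predicate; it works with RDD.filter and with the built-in filter()."""
    return lambda line: keyword in line


def count_matches_with_spark(path, keyword):
    """Count lines containing `keyword` using Spark (requires pyspark; not called here)."""
    from pyspark import SparkContext  # imported lazily so the sketch loads without Spark

    sc = SparkContext.getOrCreate()
    lines_rdd = sc.textFile(path)                    # lazy: nothing is read yet
    matches = lines_rdd.filter(mentions(keyword))    # still lazy: only extends the lineage
    return matches.count()                           # action: triggers the actual job


# The same predicate applied locally to made-up sample lines.
sample_lines = [
    "host1.stanford.edu\tGET\t/index.html",
    "example.org\tGET\t/about.html",
    "cs.stanford.edu\tGET\t/data.tsv",
]
local_matches = list(filter(mentions("stanford"), sample_lines))
print(len(local_matches))
```

Inside the pyspark shell, the equivalent would be calling count() or take(5) on the filtered RDD from the session.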

graph for biological big data

graph for species interaction networks?

Spark GraphX for EOL data