NOAA dewpoint correlates weakly with RH (R2 = 0.11) for Hamilton, TN
ERA5 dewpoint correlates well with NOAA dewpoint (R2 = 0.97)
This site is to serve as my note-book and to effectively communicate with my students and collaborators. Every now and then, a blog may be of interest to other researchers or teachers. Views in this blog are my own. All rights of research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
Call:
lm(formula = tb_comp$K ~ tb_comp$air_temp)

Residuals:
    Min      1Q  Median      3Q     Max
-4.7346 -0.9967 -0.0165  0.9527  7.6325

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)      271.41583    0.14303  1897.6   <2e-16 ***
tb_comp$air_temp   0.96368    0.00688   140.1   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.492 on 597 degrees of freedom
Multiple R-squared: 0.9705, Adjusted R-squared: 0.9704
F-statistic: 1.962e+04 on 1 and 597 DF, p-value: < 2.2e-16
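The R2 values in this post come from simple linear regressions (the output above is from R's lm). As a rough Python analogue, here is a minimal sketch of how the slope, intercept, and R2 of such a fit could be computed by hand. The function name `r_squared` and the sample numbers are made up for illustration; they are not the ERA5 or NOAA data.

```python
def r_squared(x, y):
    """Return (slope, intercept, R^2) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor regression, R^2 is the squared Pearson correlation
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical paired readings (Celsius vs Kelvin), NOT the real data
celsius = [0.0, 5.0, 10.0, 15.0, 20.0]
kelvin = [273.15, 278.15, 283.15, 288.15, 293.15]
slope, intercept, r2 = r_squared(celsius, kelvin)
print(slope, intercept, r2)
```

With real station data the slope would not be exactly 1 and R2 would fall below 1, as in the fit above.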
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
Give time for students to take course surveys:
* Show my ERA5 land weather data, PBS jobs, run errors and outputs.
Arts and Science Dean cited some books on STEM in general education.
https://mitpress.mit.edu/books/robot-proof
from: https://machinelearningmastery.com/difference-test-validation-datasets/
– Training set: A set of examples used for learning, that is to fit the parameters of the classifier.
– Validation set: A set of examples used to tune the parameters of a classifier, for example to choose the number of hidden units in a neural network.
– Test set: A set of examples used only to assess the performance of a fully-specified classifier.
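A minimal sketch of the three-way split described above, assuming a simple random shuffle is acceptable for the data at hand. The function name `split_dataset` and the 60/20/20 fractions are my own choices for illustration, not from the cited page.

```python
import random


def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the examples and cut them into training, validation, and test lists."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]                 # used to fit model parameters
    val = items[n_train:n_train + n_val]    # used to tune hyperparameters
    test = items[n_train + n_val:]          # used only for the final assessment
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The test list should be set aside and looked at only once, after all tuning on the validation list is finished.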
=======lesson plan ==========
15 minutes:
Go over Socrative review questions. while loop, csv handler
=> datePython.py
regular expression in Python
https://docs.python.org/3/howto/regex.html
https://github.com/AstunTechnology/python-basics/blob/master/regular-expressions.md
19-year-old human stage, 21-year-old human stage, 22-year-old human stage, 23-year-old human stage, 24-year-old human stage, 25-year-old human stage, 26-year-old human stage, 27-year-old human stage, 28-year-old human stage, 29-year-old human stage, 30-year-old human stage, 31-year-old human stage, 32-year-old human stage, 33-year-old human stage, 34-year-old human stage, 35-year-old human stage, 36-year-old human stage, 37-year-old human stage, 38-year-old human stage, 39-year-old human stage, 40-year-old human stage, 41-year-old human stage, 42-year-old human stage, 43-year-old human stage, 44-year-old human stage, 45-year-old human stage, 46-year-old human stage, 47-year-old human stage, 48-year-old human stage, 49-year-old human stage, 50-year-old human stage, 51-year-old human stage, 52-year-old human stage, 53-year-old human stage, 54-year-old human stage, 55-year-old human stage, 56-year-old human stage, 57-year-old human stage, 58-year-old human stage, 59-year-old human stage, 60-year-old human stage, 61-year-old human stage, 62-year-old human stage, 63-year-old human stage, 64-year-old human stage, 65-year-old human stage, 66-year-old human stage, 68-year-old human stage, 71-year-old human stage, 73-year-old human stage
https://data.humancellatlas.org/explore/projects/f0f89c14-7460-4bab-9d42-22228a91f185
37069 cells
https://singlecell.broadinstitute.org/single_cell/study/SCP263/aging-mouse-brain
455M is too big to download by browser.
Trying to get the certificate on the local server
openssl s_client -showcerts -servername singlecell.broadinstitute.org/single_cell/api/v1 -connect server:443 > cacert.pem
https://singlecell.broadinstitute.org/
singlecell.broadinstitute.org/single_cell/api/v1
(base) hqin@CS313BQin q-sandbox % which openssl
/Users/hqin/opt/anaconda3/bin/openssl
(base) hqin@CS313BQin q-sandbox % openssl s_client -showcerts -servername https://singlecell.broadinstitute.org -connect server:443 > cacert.pem
4504972800:error:2008F002:BIO routines:BIO_lookup_ex:system lib:crypto/bio/b_addr.c:730:nodename nor servname provided, or not known
connect:errno=0
CURL download error
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
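The "nodename nor servname provided, or not known" error above is consistent with the literal placeholder `server:443` being passed to -connect, and with -servername being given a URL instead of a bare host name. Below is a minimal Python sketch, assuming the goal is only to extract the host name and save that server's certificate in PEM form. The helper names are hypothetical. Note that `ssl.get_server_certificate` returns only the server's own certificate, not the full chain, so this alone may not resolve curl error 60.

```python
import ssl
from urllib.parse import urlparse


def host_from_url(url):
    """Return the bare host name from a URL or a host/path string."""
    if "://" not in url:
        url = "https://" + url  # urlparse needs a scheme to find the host
    return urlparse(url).hostname


def fetch_server_cert(url, port=443):
    """Fetch the server's certificate as PEM text (needs network access)."""
    return ssl.get_server_certificate((host_from_url(url), port))


# The host name is what -servername and -connect expect, e.g. host:443
print(host_from_url("singlecell.broadinstitute.org/single_cell/api/v1"))

# Usage sketch (not run here because it needs the network):
# with open("cacert.pem", "w") as fout:
#     fout.write(fetch_server_cert("https://singlecell.broadinstitute.org"))
```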
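i is the imaginary unit and ħ (h-bar) is the reduced Planck constant; both are constants.
H hat is the Hamiltonian operator, which represents the total energy of the system.
So, the Schrödinger equation indicates that the rate of change of a quantum state in time, multiplied by iħ, equals the Hamiltonian (total energy) acting on that state?!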
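For reference, here is the time-dependent Schrödinger equation these notes refer to, in standard notation (the symbol $\Psi$ for the wavefunction is my labeling, not from the notes), followed by the special case of a state with definite energy $E$:

```latex
i\hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat{H}\,\Psi(\mathbf{r},t)

% Special case: if \hat{H}\psi = E\psi, the state only rotates in phase
\Psi(\mathbf{r},t) = \psi(\mathbf{r})\, e^{-iEt/\hbar}
```

In the special case the energy sets how fast the phase of the state rotates, which is one way to read the "change ~ iħ of the total energy" remark.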
betacoronavirus | GR | B.1.1.529 | 2021-11-18 | (NSP5_P132H,Spike_H69del,Spike_T95I,Spike_A67V,Spike_S373P,Spike_Q493R,Spike_H655Y,N_R203K,Spike_N969K,Spike_N856K,Spike_G142D,NSP3_A1892T,Spike_Q954H,N_P13L,NSP3_L1266I,Spike_N501Y,N_R32del,M_Q19E,Spike_N440K,NSP4_T492I,NSP6_L105del,Spike_N679K,Spike_N764K,Spike_L212I,Spike_Y505H,NSP6_G107del,NSP6_I189V,Spike_T547K,M_D3G,Spike_D796Y,N_G204R,Spike_T478K,Spike_V143del,M_A63T,Spike_G496S,NSP3_V1069I,Spike_K417N,NSP6_S106del,Spike_S371L,Spike_G339D,NSP3_S1265del,NSP14_I42V,Spike_P681H,Spike_Y144del,Spike_ins214EPE,N_S33del,Spike_S375F,Spike_Q498R,Spike_G446S,Spike_S477N,N_E31del,NSP3_K38R,Spike_N211del,Spike_E484A,E_T9I,Spike_V70del,Spike_L981F,NSP12_P323L,Spike_D614G,Spike_Y145del) | VOC Omicron GR/484A (B.1.1.529) first detected in Botswana/Hong Kong/South Africa | 2021-11-11 | Africa / Botswana / South East / Greater Gaborone / Gaborone | Human | 2021-11-23 | TRUE | 0.0131383910524 | 0.379838874855 | 29684 |
https://english.alarabiya.net/coronavirus/2021/11/26/Explainer-What-is-the-new-B-1-1-529-COVID-19-variant-
Follow https://github.com/justinmclark/utc_lookout_remote_jupyter
I used 'grib' environment, and reinstalled jupyter in 'grib'.
I then used a browser on localhost:8888 on the ECS313 desktop, and it worked.
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
Give time for students to take course surveys
* ask students to share projects. Need to work on writing.
* Show my ERA5 land weather data, PBS jobs, run errors and outputs.
To apply for Fall 2022 enrollment of PhD program, please apply at https://www.utc.edu/
The deadline for fall 2022 application is February 1 (for students interested in graduate assistantship)
Our graduate school application requires GRE and TOEFL/IELTS scores.
Thanks again for your interest. Please let me know if I can answer some of your questions,
There are 32 Apple counties not found in counties2/
{'AguadillaMunicipio_PuertoRico_US_v2.csv', 'AnchorageMunicipality_Alaska_US_v2.csv', 'AreciboMunicipio_PuertoRico_US_v2.csv', 'BayamónMunicipio_PuertoRico_US_v2.csv', 'BoxElder_Utah_US_v2.csv', 'CaguasMunicipio_PuertoRico_US_v2.csv', 'CarolinaMunicipio_PuertoRico_US_v2.csv', 'Carson_Nevada_US_v2.csv', 'DoradoMunicipio_PuertoRico_US_v2.csv', 'DoñaAna_NewMexico_US_v2.csv', 'Dukes_Massachusetts_US_v2.csv', 'FairbanksNorthStarBorough_Alaska_US_v2.csv', 'Guam_Guam_US_v2.csv', 'GuaynaboMunicipio_PuertoRico_US_v2.csv', 'GuraboMunicipio_PuertoRico_US_v2.csv', 'HumacaoMunicipio_PuertoRico_US_v2.csv', 'James_Virginia_US_v2.csv', 'Juab_Utah_US_v2.csv', 'JuneauandBorough_Alaska_US_v2.csv', 'KenaiPeninsulaBorough_Alaska_US_v2.csv', 'Matanuska-SusitnaBorough_Alaska_US_v2.csv', 'MayagüezMunicipio_PuertoRico_US_v2.csv', 'Millard_Utah_US_v2.csv', 'PonceMunicipio_PuertoRico_US_v2.csv', 'RíoGrandeMunicipio_PuertoRico_US_v2.csv', 'SanJuanMunicipio_PuertoRico_US_v2.csv', 'Sevier_Utah_US_v2.csv', 'St.CroixIsland_VirginIslands_US_v2.csv', 'St.ThomasIsland_VirginIslands_US_v2.csv', 'ToaBajaMunicipio_PuertoRico_US_v2.csv', 'TrujilloAltoMunicipio_PuertoRico_US_v2.csv', 'Weber_Utah_US_v2.csv'}
Sevier, Utah is an unincorporated community.
After removing "and", "City", "Municipio", and "Borough" from some names, there are 18 left.
There are 15 Apple counties not found in counties2/
{'Bayamón_PuertoRico_US_v2.csv', 'BoxElder_Utah_US_v2.csv', 'Carson_Nevada_US_v2.csv', 'DoñaAna_NewMexico_US_v2.csv', 'Dukes_Massachusetts_US_v2.csv', 'Guam_Guam_US_v2.csv', 'James_Virginia_US_v2.csv', 'Juab_Utah_US_v2.csv', 'Mayagüez_PuertoRico_US_v2.csv', 'Millard_Utah_US_v2.csv', 'RíoGrande_PuertoRico_US_v2.csv', 'Sevier_Utah_US_v2.csv', 'St.CroixIsland_VirginIslands_US_v2.csv', 'St.ThomasIsland_VirginIslands_US_v2.csv', 'Weber_Utah_US_v2.csv'}
Dukes_Massachusetts_US_v2.csv is in Apple. In JHU, this seems to be DukesandNantucket_Massachusetts_US_v2.csv
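A minimal sketch of the name clean-up and set comparison described above. This is a heuristic of my own, not the code actually used. It assumes file names have the form County_State_US_v2.csv, and it removes a joining "and" only when a capital letter follows, so that names such as RíoGrande are not damaged. The JHU file name in the example is hypothetical.

```python
import re

# Words stripped from the county part of the file name (from the note above)
SUFFIXES = ("City", "Municipio", "Borough")


def normalize(filename):
    """Strip 'and', 'City', 'Municipio', 'Borough' from the county part of a file name."""
    county, rest = filename.split("_", 1)
    # Drop a joining "and" only when it precedes a capitalised word
    county = re.sub(r"and(?=[A-Z])", "", county)
    for suffix in SUFFIXES:
        county = county.replace(suffix, "")
    return county + "_" + rest


# Apple names are taken from the list above; the JHU set is made up for illustration
apple_files = {
    "JuneauandBorough_Alaska_US_v2.csv",
    "RíoGrandeMunicipio_PuertoRico_US_v2.csv",
    "Weber_Utah_US_v2.csv",
}
jhu_files = {"Juneau_Alaska_US_v2.csv"}

missing = {normalize(name) for name in apple_files} - jhu_files
print(len(missing), sorted(missing))
```

Cases like Dukes versus DukesandNantucket still need a manual lookup table, since no string clean-up can recover a merged reporting unit.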
https://www.deepl.com/en/translator
From https://en.wikipedia.org/wiki/Deviance_information_criterion :
There are two calculations in common usage for the effective number of parameters of the model. The first, as described in Spiegelhalter et al. (2002, p. 587), is $p_D = \overline{D(\theta)} - D(\bar{\theta})$, where $\bar{\theta}$ is the expectation of $\theta$. The second, as described in Gelman et al. (2004, p. 182), is $p_D = p_V = \frac{1}{2}\widehat{\operatorname{var}}\left(D(\theta)\right)$. The larger the effective number of parameters is, the easier it is for the model to fit the data, and so the deviance needs to be penalized.
The deviance information criterion is calculated as
$$\mathrm{DIC} = p_D + \overline{D(\theta)},$$
or equivalently as
$$\mathrm{DIC} = D(\bar{\theta}) + 2 p_D.$$
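A minimal sketch of the two DIC calculations, assuming the deviance has already been evaluated at each MCMC draw. The first version takes the effective number of parameters as the mean deviance minus the deviance at the posterior mean; the second takes half the variance of the deviance. The function names and the three deviance values are made up for illustration.

```python
from statistics import mean, variance


def dic_spiegelhalter(deviance_samples, deviance_at_posterior_mean):
    """Return (p_D, DIC) with p_D = mean deviance - deviance at the posterior mean."""
    d_bar = mean(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return p_d, p_d + d_bar


def dic_gelman(deviance_samples):
    """Return (p_V, DIC) with p_V = half the sample variance of the deviance."""
    d_bar = mean(deviance_samples)
    p_v = 0.5 * variance(deviance_samples)
    return p_v, p_v + d_bar


# Hypothetical deviance values at three posterior draws
deviances = [100.0, 102.0, 104.0]
p_d, dic1 = dic_spiegelhalter(deviances, deviance_at_posterior_mean=101.0)
p_v, dic2 = dic_gelman(deviances)
print(p_d, dic1, p_v, dic2)
```

The two versions generally give different numbers, so a comparison between models should use the same version throughout.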
https://www.coursera.org/learn/mcmc-bayesian-statistics/home/week/3
R code
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
Give time for students to take course surveys
* ask students to share projects. Need to work on writing.
* discussion covid19 transmission
https://www.nature.com/articles/s41467-021-21358-2
Easy MCMC
https://stephens999.github.io/fiveMinuteStats/MH-examples1.html
Socrative
Slides
ask students to share their screen
Last time, I let students work on replacingExample1.py
=======lesson plan ==========
15 minutes:
Go over Socrative review questions.
10 minutes:
whileExample1.py :: line.strip()
5 minutes:
whileExample2.py :: line.split()
10 minutes:
=> ask students to modify whileExample2 to process gleep2. This is whileExample3.py
15 minutes
csvHandler.py
20 minutes:
=> Lab exercise 1. datetime.datetime
=> datePython.py
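The pieces in this lesson plan (a while loop with line.strip() and line.split(), a csv handler, and datetime.datetime) can be sketched together as follows. This is not the content of the course files named above; the CSV text and column names are made up for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical CSV text standing in for a course data file
raw = "date,count\n2021-11-01,5\n2021-11-08,7\n"

# csv handler: parse each row and convert the first field to a datetime
rows = []
reader = csv.reader(io.StringIO(raw))
header = next(reader)  # skip the header row
for fields in reader:
    day = datetime.strptime(fields[0].strip(), "%Y-%m-%d")
    rows.append((day, int(fields[1])))

# while-loop version: strip() removes the newline, split() cuts on commas
lines = raw.splitlines()
i = 1  # start after the header line
total = 0
while i < len(lines):
    parts = lines[i].strip().split(",")
    total += int(parts[1])
    i += 1

# Subtracting two datetime objects gives a timedelta
elapsed_days = (rows[-1][0] - rows[0][0]).days
print(total, elapsed_days)
```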
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
Give time for students to take course surveys
* ask students to share projects. Need to work on writing.
* Show my ERA5 land weather data
https://cyberedwiki.org/index.php?title=Welcome_to_CyberEd_Wiki
https://www.caeresource.directory/home
The template pbs is:
-bash-4.2$ cat ts_grib_job_template.pbs
#!/bin/bash -l
#$ -S /bin/bash
#$ -N pygrib_NUMBER
#$ -cwd
. /etc/profile.d/modules.sh
module load anaconda
source activate grib
python parse_grib_4JHU-debug.py STARTINDEX ENDINDEX
-bash-4.2$
Qin used a Python script to generate the PBS jobs, and then submitted them using a shell script.
-bash-4.2$ cat generate_PBS_bash.py
import os
import re
filebatchshell = "submit_PBS.sh"
fshell = open(filebatchshell, "w")
step = 15
for i in range(0, 600, step):
    # read the template anew for each job
    fin = open("ts_grib_job_template.pbs", "r")
    LinesIn = fin.readlines()
    fin.close()
    # line 3 of the template holds the job name, line 8 the python command
    LinesIn[2] = "#$ -N pygrib_" + str(i) + "\n"
    LinesIn[7] = "python parse_grib_4JHU.py " + str(i) + " " + str(i + step) + " \n"
    fileout = "ts_" + str(i) + ".pbs"
    fout = open(fileout, "w")
    for line in LinesIn:
        fout.write(line)
    fout.close()
    # add one qsub line per job to the submission script
    buffer = "qsub " + fileout + "\n"
    fshell.write(buffer)
fshell.close()
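The script above overwrites template lines by index, which breaks if a line is ever added to the template. An alternative sketch, assuming the placeholders NUMBER, STARTINDEX, and ENDINDEX from the template shown earlier are meant to be substituted: replace them by name. The function `render_job` is hypothetical, and the jobs are built in memory here rather than written to disk. The sketch keeps the script name from the template (parse_grib_4JHU-debug.py), whereas the generator above writes parse_grib_4JHU.py.

```python
TEMPLATE = """#!/bin/bash -l
#$ -S /bin/bash
#$ -N pygrib_NUMBER
#$ -cwd
. /etc/profile.d/modules.sh
module load anaconda
source activate grib
python parse_grib_4JHU-debug.py STARTINDEX ENDINDEX
"""


def render_job(template, start, step):
    """Fill the three placeholders for the job covering [start, start + step)."""
    return (template.replace("NUMBER", str(start))
                    .replace("STARTINDEX", str(start))
                    .replace("ENDINDEX", str(start + step)))


step = 15
# One job text per batch, keyed by the file name it would be written to
jobs = {"ts_" + str(i) + ".pbs": render_job(TEMPLATE, i, step)
        for i in range(0, 600, step)}
# Lines for the submission shell script
submit_lines = ["qsub " + name for name in jobs]
print(len(jobs), submit_lines[0])
```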
To parse the 2m-dew point, I used generate_PBS_bash_dewpoint.py
This is a 7-day run. The last t2m output is on Nov 20, and the last dewpoint file output is on Nov 21.
On Nov 21, I used GitHub to back up the running logs and output csv files.
grbs = pygrib.open('2020-2021NovT2m-dewpoint.grib') #does not work
grbs = pygrib.open('2019-2020June10.grib') #works
It would be strange if the grib format had changed. I need to download the grib file again.
At 6 pm, Qin found that after removing the time selection, the code runs on the 2021 GRIB file.
DoD 8140 certification replaces 8570 certification.
https://www.comptia.org/blog/what-is-dod-8140-cybersecurity-certifications-and-requirements
https://webapp.utc.edu/common/faculty-ratings/index2.php
https://www.computer.org/csdl/journal/bd
https://conferenceindex.org/disciplines
Springer: International Journal of Data Science and Analytics
ACM Transactions on Knowledge Discovery from Data
UTC course scheduling, time slots
Aquarium fish tank video with Yolo
https://github.com/Developer-Y/cs-video-courses#software-engineering
Generate PBS submission jobs
* python generate_PBS_bash.py
* sh submit_PBS_small.sh
-bash-4.2$ cat submit_PBS_small.sh
qsub PBS/ts_1_2.pbs
qsub PBS/ts_2_3.pbs
qsub PBS/ts_3_4.pbs
qsub PBS/ts_4_5.pbs
qsub PBS/ts_5_6.pbs
qsub PBS/ts_6_7.pbs
qsub PBS/ts_7_8.pbs
qsub PBS/ts_8_9.pbs
qsub PBS/ts_9_10.pbs
qsub PBS/ts_10_11.pbs
-bash-4.2$ qstat | head
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
40904 0.50500 epi_job1 hqin r 11/10/2021 01:16:14 all.q@ts26.cm.cluster 1
40905 0.50500 epi_job2 hqin r 11/10/2021 01:16:14 all.q@ts19.cm.cluster 1
40906 0.50500 epi_job3 hqin r 11/10/2021 01:16:14 all.q@ts01.cm.cluster 1
40907 0.50500 epi_job4 hqin r 11/10/2021 01:16:14 all.q@ts33.cm.cluster 1
40909 0.50500 epi_job6 hqin r 11/10/2021 01:16:14 all.q@ts19.cm.cluster 1
40910 0.50500 epi_job7 hqin r 11/10/2021 01:16:14 all.q@ts01.cm.cluster 1
40912 0.50500 epi_job9 hqin r 11/10/2021 01:16:14 all.q@ts29.cm.cluster 1
40914 0.50500 epi_job11 hqin r 11/10/2021 01:16:14 all.q@ts26.cm.cluster 1
total 209792
-rw-r--r-- 1 hqin simctr 23771 Nov 14 04:43 Suffolk_NewYork_US_v2.csv
This runtime error seems to be just a warning, because Rt was still reported.
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Logging threshold set at INFO for the EpiNow2 logger
Writing EpiNow2 logs to the console and: /var/folders/bw/k6_tkc2142v1wh5r_1yhkqhc0000gp/T//Rtmpb4xcT4/regional-epinow/2020-10-27.log
Logging threshold set at INFO for the EpiNow2.epinow logger
Writing EpiNow2.epinow logs to the console and: /var/folders/bw/k6_tkc2142v1wh5r_1yhkqhc0000gp/T//Rtmpb4xcT4/epinow/2020-10-27.log
WARN [2021-11-09 19:42:32] epinow: There were 2 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. -
WARN [2021-11-09 19:42:32] epinow: Examine the pairs() plot to diagnose sampling problems
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
* two students presented: one on tweet sentiment and elections; one on Kaggle heart disease prediction.
VAR model
https://en.wikipedia.org/wiki/Vector_autoregression
A VAR model includes a set of $k$ variables.
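Written out in standard notation (the symbols $c$, $A_i$, and $e_t$ are the usual textbook labels, not from these notes), a VAR of order $p$, which uses the last $p$ time periods, for the $k$ variables collected in the vector $y_t$ is:

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t
% y_t : k x 1 vector of the variables at time t
% c   : k x 1 vector of intercepts
% A_i : k x k coefficient matrix for lag i
% e_t : k x 1 vector of error terms
```

Each added lag brings another $k \times k$ matrix of coefficients, which is one reason the choice of $p$ matters.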
"VAR models are characterized by their order, which refers to the number of earlier time periods the model will use. Continuing the above example, a 5th-order VAR would model each year's wheat price as a linear combination of the last five years of wheat prices. A lag is the value of a variable in a previous time period. So in general a pth-order VAR refers to a VAR model which includes lags for the last p time periods. A pth-order VAR is denoted "VAR(p)" and sometimes called "a VAR with p lags".
"The process of choosing the maximum lag p in the VAR model requires special attention because inference is dependent on correctness of the selected lag order.[2][3]"
After discussion with Prof W and his student M., I tried a permutation test to see how Rt and tweet sentiment co-integrate. To my surprise, Johansen's test became more significant after permutation. So, HQ suspects that a signal is likely to co-integrate with a random signal. This means that a significant co-integration of Rt with factors that we thought were significant may be good news.
We first need to demonstrate that neither factor is stationary on its own, and that only their combination is stationary.
According to https://en.wikipedia.org/wiki/Johansen_test , the Johansen test seems to consider only co-integration of I(1) series, that is, series integrated of order 1.
For the number of $k$ time series, "The null hypothesis for the trace test is that the number of cointegration vectors is r = r* < k, vs. the alternative that r = k. Testing proceeds sequentially for r* = 1,2, etc. and the first non-rejection of the null is taken as an estimate of r. The null hypothesis for the "maximum eigenvalue" test is as for the trace test but the alternative is r = r* + 1 and, again, testing proceeds sequentially for r* = 1,2,etc., with the first non-rejection used as an estimator for r."
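A time series is integrated of order d, written I(d), if differencing it d times yields a stationary series, that is, if $(1-L)^d X_t$ is stationary, where $L$ is the lag operator.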
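A small sketch of what order of integration means in practice, using only the standard library. A random walk is the textbook I(1) series: it is not stationary, but differencing it once gives back its stationary increments. The function name `difference` and the simulated data are made up for illustration. This shows differencing only; it is not a stationarity test.

```python
import random


def difference(series, d=1):
    """Difference a series d times: each pass replaces x[t] with x[t] - x[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# Simulate a random walk (cumulative sum of independent Gaussian steps)
rng = random.Random(1)
steps = [rng.gauss(0, 1) for _ in range(200)]
walk = []
level = 0.0
for s in steps:
    level += s
    walk.append(level)

# One round of differencing recovers the steps (all except the first)
recovered = difference(walk, d=1)
print(len(walk), len(recovered))
```

Each round of differencing shortens the series by one point.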
GitHub raw URLs use a token to allow a private CSV to be read into R.
Read private GitHub csv into R
This might be a security issue.
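On the security concern: a token embedded in a URL can end up in scripts, logs, and shared notebooks. One possible alternative, sketched here in Python rather than R, is to keep the token in an environment variable and send it in a request header. This assumes a personal access token with read access to the repository and that the raw endpoint accepts an "Authorization: token ..." header. The helper names, the environment variable name, and the URL are all hypothetical.

```python
import csv
import io
import os
import urllib.request


def build_request(raw_url, token):
    """Build a request that carries the token in a header, not in the URL."""
    req = urllib.request.Request(raw_url)
    req.add_header("Authorization", "token " + token)
    return req


def read_private_csv(raw_url, token):
    """Download the CSV and return its rows as lists of strings (needs network)."""
    with urllib.request.urlopen(build_request(raw_url, token)) as resp:
        text = resp.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


# Usage sketch (not run here); OWNER/REPO and the path are placeholders:
# token = os.environ["GITHUB_TOKEN"]
# rows = read_private_csv(
#     "https://raw.githubusercontent.com/OWNER/REPO/main/data.csv", token)
```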
https://news.softpedia.com/news/top-30-critical-security-flaws-most-used-by-cybercriminals-533616.shtml
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
* three students volunteered: one on traffic accidents, one on covid and the entertainment industry, one on medical images.
A student asked about the life cycle of a data analysis project. HQ answered that any project should have a life cycle, because it has a given amount of time.
program of study
Total 72 credit hours (24 courses), 24 in doctoral research and dissertation
2x3 credit hours from: CPSC 5210: algorithm, 5260: parallel algorithm, 5240 principle of data analytics, 5440: intro to machine learning.
2x3 of math: MATH 5210: linear algebra and matrix theory; MATH 5600: numerical analysis I; MATH 5610: numerical analysis II
For MS students, 24 hours credits are given.
== pre-class to do:
calendar email invitation: including guests; done.
socrative questions (midterm exam, questions on contents from last lecture ). done
update Canvas course materials, update learning objectives. assignments as needed. done
* make sure grades are NOT shown on Canvas.
Test-run code: ipynb. NA
learning objectives: updated on canvas, NA
== In-class to do:
clean up desktop space, calendars:
ZOOM, live transcript (start video recording). Turn the computer speaker on.
Socrative sign in
* start breakout rooms; ask students to explain their final projects to others. After the breakout room discussion, ask a student to explain another student's topic and what is interesting about it.