Burrows–Abadi–Needham (BAN) logic
https://en.wikipedia.org/wiki/Burrows%E2%80%93Abadi%E2%80%93Needham_logic#:~:text=Burrows%E2%80%93Abadi%E2%80%93Needham%20logic%20(,secured%20against%20eavesdropping%2C%20or%20both.
This site is to serve as my note-book and to effectively communicate with my students and collaborators. Every now and then, a blog may be of interest to other researchers or teachers. Views in this blog are my own. All rights of research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
Reference: https://www.drivendata.org/competitions/98/nist-federated-learning-1/rules/
privacy-preserving federated learning (PPFL) solutions
democracy-affirming technologies.
the global federated model is trained, the parameters related to the local models could be used to learn about the sensitive information contained in the training data of each client. Similarly, the released global model could also be used to infer sensitive information about the training datasets used.
1.4 GOALS AND OBJECTIVES:
Organizers seek to mature federated learning approaches and build trust in adoption by accelerating the development of efficient PPFL solutions that leverage a combination of input and output privacy techniques to:
Phase 1: Concept Paper. Blue Team Participants will produce a technical white paper (“Concept Paper” or “White Paper”) setting out their proposed solution approach. Technical papers will be evaluated by a panel of judges across a set of weighted criteria. Participants will be eligible to win prizes awarded to the top technical papers, ranked by points awarded.
As you propose your technical solutions, be prepared to clearly describe the technical approaches and sketch out proof of or justification for privacy guarantees. Participants should consider a broad range of privacy threats during the model training and model use phases and consider technical and process aspects including but not limited to cryptographic and non-cryptographic methods, and protection needed within the deployment environment.
Successful technical approaches and proofs of privacy guarantees will include the design of any algorithms, protocols, etc. utilized, as well as formal or informal arguments of how the solution will provide privacy guarantees. Participants will clearly list any additional privacy issues specific to the technological approaches used and justify initial enhancements or novelties compared to the current state-of-the-art. Participant submissions must describe how the solution will cater to the types of data provided to participants and how generalizable the solution is to multiple domains. Expected efficiency/scalability of improvements, privacy vs. utility trade off should be articulated, if possible, at this conceptual stage.
Q: what is the definition of privacy guarantee?
a one-page abstract and a Concept Paper.
Abstract: The one-page abstract must include a title and a brief description of the proposed solution, including the proposed privacy mechanisms and architecture of the federated model. The description should also describe the proposed machine learning model and expected results with regard to accuracy. Successful abstracts will outline how solutions will achieve privacy while minimizing loss to accuracy, a proposed solution, and the anticipated results, as more fully described on the Challenge Website. Abstracts must be submitted by following the instructions on the Challenge Website. Abstracts will be screened by the DrivenData and Organizers’ staff for contest eligibility and used to ensure the composition of the judging panel’s expertise aligns to proposed solutions that will be evaluated throughout the course of the Challenge. Feedback will not be provided.
Concept Paper: The Concept Paper should conceptualize solutions that describe the technical approaches and lay out the proof of privacy guarantees that solve a set of predictive or analytic tasks that support the use cases. Successful Concept Papers will incorporate the originally submitted abstract and be no more than ten pages in length. References will not count towards page length. Participant submissions shall:
I submitted a travel reimbursement request in Concur for the ICIBM meeting in Philadelphia. When I uploaded the hotel, there were allowance errors. I then created an Itinerary to add per diem, and the allowance errors went away.
So, next time, I should create the Itinerary first, and then upload the hotel.
zoom, turn live caption on, record
https://ai.facebook.com/blog/crypten-a-new-research-tool-for-secure-machine-learning-with-pytorch/
https://crypten.readthedocs.io/en/latest/
Homomorphic encryption:
Enc(m1) + Enc(m2) = Enc( m1 + m2)
Enc(m1) x Enc(m2) = Enc( m1 x m2)
So, an untrusted entity can compute addition or multiplication without decryption.
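To make the two identities above concrete, here is a minimal sketch using toy parameters. It assumes two textbook schemes that are not named in these notes: Paillier for the additive case and unpadded RSA for the multiplicative case. Note that the operation applied to the ciphertexts need not be the same as the operation on the plaintexts: in Paillier, multiplying ciphertexts adds the plaintexts. The tiny primes (61 and 53) and all function names are illustrative only and offer no security.

```python
import math
import random

# --- Additive case: toy Paillier (needs Python 3.8+ for pow(x, -1, n)) ---
def paillier_keygen(p, q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # L(x) = (x - 1) // n ; mu is the inverse of L(g^lam mod n^2) modulo n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def paillier_encrypt(pub, m, rng=random):
    n, g = pub
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be invertible modulo n
        r = rng.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def paillier_decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

pub, priv = paillier_keygen(61, 53)  # toy primes, illustration only
n_sq = pub[0] ** 2
c1 = paillier_encrypt(pub, 15)
c2 = paillier_encrypt(pub, 27)
# The untrusted party multiplies ciphertexts; the plaintexts are added.
paillier_sum = paillier_decrypt(pub, priv, (c1 * c2) % n_sq)

# --- Multiplicative case: unpadded textbook RSA with the same toy primes ---
rsa_n, rsa_e, rsa_d = 3233, 17, 2753
e1 = pow(7, rsa_e, rsa_n)
e2 = pow(6, rsa_e, rsa_n)
# Multiplying ciphertexts multiplies the plaintexts (modulo rsa_n).
rsa_product = pow((e1 * e2) % rsa_n, rsa_d, rsa_n)

print(paillier_sum, rsa_product)
```

Each of these schemes supports only one of the two operations; a fully homomorphic scheme supports both on the same ciphertexts.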
https://en.wikipedia.org/wiki/Homomorphic_encryption
Fully homomorphic encryption (FHE)
From Wikipedia:
In 2016, Cheon, Kim, Kim and Song (CKKS)[35] proposed an approximate homomorphic encryption scheme that supports a special kind of fixed-point arithmetic that is commonly referred to as block floating point arithmetic. The CKKS scheme includes an efficient rescaling operation that scales down an encrypted message after a multiplication. For comparison, such rescaling requires bootstrapping in the BGV and BFV schemes. The rescaling operation makes CKKS scheme the most efficient method for evaluating polynomial approximations, and is the preferred approach for implementing privacy-preserving machine learning applications. The scheme introduces several approximation errors, both nondeterministic and deterministic, that require special handling in practice.[36]
A 2020 article by Baiyu Li and Daniele Micciancio discusses passive attacks against CKKS, suggesting that the standard IND-CPA definition may not be sufficient in scenarios where decryption results are shared.[37] The authors apply the attack to four modern homomorphic encryption libraries (HEAAN, SEAL, HElib and PALISADE) and report that it is possible to recover the secret key from decryption results in several parameter configurations. The authors also propose mitigation strategies for these attacks, and include a Responsible Disclosure in the paper suggesting that the homomorphic encryption libraries already implemented mitigations for the attacks before the article became publicly available. Further information on the mitigation strategies implemented in the homomorphic encryption libraries has also been published.[38][39]
Leaf-aggregator, intermediate aggregator, master aggregator, in hierarchical tree based aggregation
https://arxiv.org/pdf/2203.12163.pdf
AdaFed takes associativity one step further. AdaFed mitigates issues with aggregation overlays by avoiding the construction of actual/physical tree topology.
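The associativity that hierarchical aggregation relies on can be sketched as follows. This is not AdaFed's implementation; it only assumes weighted averaging of model updates, where each party contributes a pair (weighted sum, sample count). Because combining such pairs is associative, leaf aggregators can merge subsets of parties and a master aggregator can merge the leaf results, giving the same average as a single flat aggregator. The function names and the example updates are hypothetical.

```python
from functools import reduce

def to_partial(update, num_samples):
    """Turn a party's update into a (weighted sum, weight) pair."""
    return ([w * num_samples for w in update], num_samples)

def combine(a, b):
    """Associative merge of two partial aggregates."""
    (sum_a, n_a), (sum_b, n_b) = a, b
    return ([x + y for x, y in zip(sum_a, sum_b)], n_a + n_b)

def finalize(partial):
    """Convert a partial aggregate into the weighted average."""
    total, n = partial
    return [x / n for x in total]

# Hypothetical party updates: (model update, number of local samples)
parties = [([1.0, 2.0], 10), ([3.0, 4.0], 30), ([5.0, 6.0], 20), ([7.0, 8.0], 40)]
partials = [to_partial(u, n) for u, n in parties]

# Flat aggregation: one aggregator sees every party.
flat = finalize(reduce(combine, partials))

# Tree aggregation: two leaf aggregators, then a master aggregator.
leaf_1 = reduce(combine, partials[:2])
leaf_2 = reduce(combine, partials[2:])
tree = finalize(combine(leaf_1, leaf_2))

print(flat, tree)
```

Since only `combine` needs to be associative, the grouping of parties into leaves can change from round to round without changing the result.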
Book series
Federated Learning (FL) requires an aggregator and parties to exchange model updates. (Page 285)
vulnerable to the inference of private data
System entities of the FL system
the attack surface is used to refer to the exposed parameters and data
against data leak
FL-specific attacks often take advantage of the information transmission during FL.
Differential privacy: differential privacy at the party side or the aggregator side.
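A rough sketch of the two placements, assuming the common recipe of clipping each update to a fixed L2 norm and adding Gaussian noise. The names `clip_norm` and `noise_multiplier` and the example updates are my own assumptions, and no privacy budget (epsilon, delta) is computed here; that would need a separate accounting step.

```python
import math
import random

def clip_l2(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def privatize(update, clip_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_l2(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]

def mean(updates):
    return [sum(col) / len(updates) for col in zip(*updates)]

rng = random.Random(0)
updates = [[3.0, 4.0], [0.3, 0.4], [-1.0, 2.0]]  # hypothetical party updates

# Party side: every party perturbs its own update before sending it,
# so the aggregator never sees an exact update.
party_side = mean([privatize(u, 1.0, 0.5, rng) for u in updates])

# Aggregator side: parties send clipped updates; a trusted aggregator
# adds noise once to the sum before averaging.
clipped_sum = [sum(col) for col in zip(*(clip_l2(u, 1.0) for u in updates))]
aggregator_side = [(s + rng.gauss(0.0, 0.5 * 1.0)) / len(updates) for s in clipped_sum]

print(party_side, aggregator_side)
```

The trade-off in this sketch is trust versus noise: party-side noise does not require trusting the aggregator, while aggregator-side noise is added only once to the sum.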
For healthcare data and personal information, there are regulation and compliance requirements [14, 63]
page 285: In FL, training data is not explicitly shared.
https://mp.weixin.qq.com/s/7fb4tx0sKXE26IX1nXw4qg
zoom, turn live caption on,
introduce myself
syllabus
Socrative, Room HongQin, anonymous icebreaker,
datacamp registration
participatory coding and video requirement
video submission with hyper-link. Examples of past student submissions.
sample student videos,
past student final report
why R and python
x Email list to calendar invitation
agile at scale
Rally BroadCom
Use IBM Z as the mainframe
According to the student intern, Agile gives employees work-life balance. Software engineers no longer have to work on weekends since Agile replaced the waterfall model.
UNUM plans software engineering work one year ahead in order to request a budget.
To apply for Spring 2023 enrollment in the PhD program, please apply at https://www.utc.edu/
https://www.utc.edu/enrollment-management-and-student-affairs/center-for-global-education/international-student-and-scholar-services/prospective-students
https://www.utc.edu/sites/default/files/2021-03/how-to-apply-f12.pdf
The deadline for fall application is February 15 (for students interested in graduate assistantship)
For spring application, the website gives Nov 1 as the deadline, but for an assistantship, it is advised to complete the application as early as possible.
Our graduate school application requires GRE and TOEFL/IELTS scores. For students with US degrees, please contact Lora Cook to verify whether foreign language exams might be waived.
Thanks again for your interest. Please let me know if I can answer any of your questions.
Dear XYZ
Currently you have not submitted an application to our university. Our PhD programs have a firm deadline, as noted below. Applicants who received a degree from within the U.S. in the last two years are waived from the English requirements.
Thank you for your interest in the University of Tennessee at Chattanooga. My name is Lora, and I will assist you in the application process.
The following requirements must be fulfilled by February 1st for fall semesters and by September 1st for spring semesters. We do not begin new students in the summer semester.
Submit official proof of funding. This may be submitted after acceptance by the program.
Visit www.utc.edu/international for more information regarding housing, orientation, student fees, health insurance, and the admissions process.
If you have any questions, please don’t hesitate to ask.
Sincerely,
Use the "anyone can view" share link on Google Drive.
sh-4.2$ gdown --fuzzy https://drive.google.com/drive/folders/14pXSOuFngosZ5BxVAr_e37Zoeq1pBeDS?usp=sharing
Downloading...
From: https://drive.google.com/drive/folders/14pXSOuFngosZ5BxVAr_e37Zoeq1pBeDS?usp=sharing
To: /gpfs/gsfs1/scr/qinlab/test/14pXSOuFngosZ5BxVAr_e37Zoeq1pBeDS?usp=sharing
895kB [00:00, 43.8MB/s]
-bash-4.2$ gdown --fuzzy https://drive.google.com/file/d/155zRiE0U-VGh40uUdgHSdnIxK0_1GSBa/view?usp=sharing
https://www.usenix.org/conference/usenixsecurity22/fall-accepted-papers
model poisoning
privacy implication of forging?
data reconstruction attack, threat to distributed machine learning
attacks often solve the gradient matching problem via optimization
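A toy sketch of gradient matching, under assumptions that are mine rather than from any particular paper: a single training sample, a linear model with squared loss, a label known to the attacker, and finite-difference gradient descent with backtracking. The attacker observes the shared gradient and searches for an input whose gradient matches it. All names and numbers are hypothetical, and the search is not guaranteed to recover the true input.

```python
def gradients(w, b, x, y):
    """Gradients of 0.5 * (w.x + b - y)^2 with respect to w and b."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    return [residual * xi for xi in x], residual

def matching_loss(w, b, y, x_guess, observed):
    """Squared distance between the guess's gradient and the observed one."""
    g_w, g_b = gradients(w, b, x_guess, y)
    obs_w, obs_b = observed
    return sum((a - c) ** 2 for a, c in zip(g_w, obs_w)) + (g_b - obs_b) ** 2

def reconstruct(w, b, y, observed, x_init, steps=500, eps=1e-6):
    """Minimise the matching loss over the input by gradient descent."""
    x = list(x_init)
    loss = matching_loss(w, b, y, x, observed)
    for _ in range(steps):
        # Finite-difference estimate of d(loss)/dx.
        grad = []
        for i in range(len(x)):
            bumped = x[:i] + [x[i] + eps] + x[i + 1:]
            grad.append((matching_loss(w, b, y, bumped, observed) - loss) / eps)
        # Backtracking: halve the step until the loss actually decreases.
        step = 1.0
        while step > 1e-12:
            candidate = [xi - step * gi for xi, gi in zip(x, grad)]
            cand_loss = matching_loss(w, b, y, candidate, observed)
            if cand_loss < loss:
                x, loss = candidate, cand_loss
                break
            step *= 0.5
    return x, loss

# Hypothetical setting: the model (w, b) is public and the label y is known.
w, b, y = [0.5, -1.0], 0.1, 1.0
x_private = [2.0, -1.0]                      # the party's private input
observed = gradients(w, b, x_private, y)     # what the party shares

x_init = [0.0, 0.0]
initial_loss = matching_loss(w, b, y, x_init, observed)
x_rec, final_loss = reconstruct(w, b, y, observed, x_init)
print(initial_loss, final_loss, x_rec)
```

For deep networks the same idea is applied with automatic differentiation in place of finite differences.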
A deep siamese neural network improves metagenome-assembled genomes in microbiome datasets across different environments
https://github.com/BigDataBiology/SemiBin/
[hqin@firefly02 alphafold-2]$ bash alphafold-singularity-run.sh --fasta_paths T1050.fasta
Mounting /scr -> /mnt/data_dir
Mounting /scr/alphafold-data/uniref90 -> /mnt/uniref90_database_path
Mounting /scr/alphafold-data/mgnify -> /mnt/mgnify_database_path
Mounting /scr/alphafold-data/bfd -> /mnt/bfd_database_path
Mounting /scr/alphafold-data/uniclust30/uniclust30_2018_08 -> /mnt/uniclust30_database_path
Mounting
-> /mnt/pdb70_database_path
Mounting /scr/alphafold-data/pdb_mmcif -> /mnt/template_mmcif_dir
Mounting /scr/alphafold-data/pdb_mmcif -> /mnt/obsolete_pdbs_path
Mounting /home/hqin/alphafold-2 -> /mnt/fasta_path_0
--data_dir=/mnt/data_dir/alphafold-data --uniref90_database_path=/mnt/uniref90_database_path/uniref90.fasta --mgnify_database_path=/mnt/mgnify_database_path/mgy_clusters_2018_12.fa --bfd_database_path=/mnt/bfd_database_path/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniclust30_database_path=/mnt/uniclust30_database_path/uniclust30_2018_08 --pdb70_database_path=/mnt/pdb70_database_path/pdb70 --template_mmcif_dir=/mnt/template_mmcif_dir/mmcif_files --obsolete_pdbs_path=/mnt/obsolete_pdbs_path/obsolete.dat --fasta_paths=/mnt/fasta_path_0/T1050.fasta --output_dir=/mnt/output --benchmark=0 --logtostderr --max_template_date=2021-12-31
/scr:/mnt/data_dir,/scr/alphafold-data/uniref90:/mnt/uniref90_database_path,/scr/alphafold-data/mgnify:/mnt/mgnify_database_path,/scr/alphafold-data/bfd:/mnt/bfd_database_path,/scr/alphafold-data/uniclust30/uniclust30_2018_08:/mnt/uniclust30_database_path,/scr/alphafold-data/pdb70:/mnt/pdb70_database_path,/scr/alphafold-data/pdb_mmcif:/mnt/template_mmcif_dir,/scr/alphafold-data/pdb_mmcif:/mnt/obsolete_pdbs_path,/home/hqin/alphafold-2:/mnt/fasta_path_0,/home/hqin/alphafold-2:/mnt/output
I0804 20:07:16.598631 140258065744832 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I0804 20:07:16.809164 140258065744832 xla_bridge.py:244] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0804 20:07:17.299607 140258065744832 xla_bridge.py:244] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
I0804 20:07:21.937146 140258065744832 run_alphafold.py:385] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
I0804 20:07:21.937372 140258065744832 run_alphafold.py:397] Using random seed 1266311393757702950 for the data pipeline
I0804 20:07:21.937706 140258065744832 run_alphafold.py:150] Predicting T1050
I0804 20:07:21.938605 140258065744832 jackhmmer.py:130] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp8lfb38hd/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/T1050.fasta /mnt/uniref90_database_path/uniref90.fasta"
I0804 20:07:21.984835 140258065744832 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0804 20:13:41.682038 140258065744832 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 379.697 seconds
I0804 20:13:44.308897 140258065744832 jackhmmer.py:130] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpqgyb_oib/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/T1050.fasta /mnt/mgnify_database_path/mgy_clusters_2018_12.fa"
I0804 20:13:44.343975 140258065744832 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0804 20:20:40.947294 140258065744832 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 416.603 seconds
I0804 20:21:01.298550 140258065744832 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpwk1zvn0h/query.a3m -o /tmp/tmpwk1zvn0h/output.hhr -maxseq 1000000 -d /mnt/pdb70_database_path/pdb70"
I0804 20:21:01.353056 140258065744832 utils.py:36] Started HHsearch query
I0804 20:21:01.404658 140258065744832 utils.py:40] Finished HHsearch query in 0.051 seconds
Traceback (most recent call last):
File "/app/alphafold/run_alphafold.py", line 427, in <module>
app.run(main)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/alphafold/run_alphafold.py", line 412, in main
is_prokaryote=is_prokaryote)
File "/app/alphafold/run_alphafold.py", line 164, in predict_structure
msa_output_dir=msa_output_dir)
File "/app/alphafold/alphafold/data/pipeline.py", line 179, in process
pdb_templates_result = self.template_searcher.query(uniref90_msa_as_a3m)
File "/app/alphafold/alphafold/data/tools/hhsearch.py", line 96, in query
stdout.decode('utf-8'), stderr[:100_000].decode('utf-8')))
RuntimeError: HHSearch failed:
stdout:
stderr:
- 20:21:01.404 ERROR: In /tmp/hh-suite/src/ffindexdatabase.cpp:11: FFindexDatabase:
- 20:21:01.404 ERROR: could not open file '/mnt/pdb70_database_path/pdb70_cs219.ffdata'
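Since the search step takes about 14 minutes before HHsearch fails, a pre-flight check on the host may save time. The sketch below only tests whether files are present and readable. The host path is an assumption inferred from the bind mount shown in the log (/scr/alphafold-data/pdb70 mounted as /mnt/pdb70_database_path), and the helper name is hypothetical.

```python
import os

def missing_files(paths):
    """Return the paths that are not readable regular files."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]

# Host-side location of the file named in the error message (assumed from
# the bind mount /scr/alphafold-data/pdb70 -> /mnt/pdb70_database_path).
required = ["/scr/alphafold-data/pdb70/pdb70_cs219.ffdata"]

for path in missing_files(required):
    print("missing or unreadable:", path)
```

If the file is reported missing, the pdb70 download is probably incomplete; if it is present, the mount or its permissions are the next thing to check.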
This works on a MacBook Pro with my account. Briefly:
Log in with X-window forwarding:
ssh -X -Y user@firefly.simcenter.utc.edu
Then start MATLAB:
/opt/ohpc/pub/MATLAB/R2020b/bin/glnxa64/MATLAB
An X window should pop up.
SFS interview reference
QUESTIONNAIRE FOR NATIONAL SECURITY POSITIONS
https://www.opm.gov/forms/pdf_fill/sf86.pdf
(base) hqin@CS313BQin ~ % d.sh
(base) hqin@CS313BQin ~ % ssh xxx@firefly.simcenter.utc.edu
xxxx@firefly.simcenter.utc.edu's password:
Activate the web console with: systemctl enable --now cockpit.socket
Last login: Wed Jul 27 10:24:27 2022 from 10.63.72.189
[xxxx@firefly ~]$ ssh firefly01
The authenticity of host 'firefly01 (192.168.223.3)' can't be established.
ECDSA key fingerprint is asdfasfasfasdf
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'firefly01,192.168.223.3' (ECDSA) to the list of known hosts.
xxxx@firefly01's password:
Activate the web console with: systemctl enable --now cockpit.socket
[xxxx@firefly01 ~]$ ls
2022-06-16_unmasked.fa Downloads test-scp
anaconda3 github test-ts-hello.tar
'Anaconda3-2021.05-Linux-x86_64 (1).sh' hellosts.txt tmp
bin ibm tmp.Rmd
bindings.txt lib tmp.txt
cdsapp old tools
commands.txt R trash
csh.backup R-4.0.3 unison.log
-cwd.e8271 readme.txt 'VirtualBox VMs'
-cwd.o8271 scr.hqin wget-log
demo simcenter-qinlab
Desktop T1050
[xxxx@firefly01 ~]$ bash /opt/ohpc/pub/singularity/test-scripts/alphafold-2/alphafold-singularity-run.sh
Mounting /scr -> /mnt/data_dir
Mounting /scr/alphafold-data/uniref90 -> /mnt/uniref90_database_path
Mounting /scr/alphafold-data/mgnify -> /mnt/mgnify_database_path
Mounting /scr/alphafold-data/bfd -> /mnt/bfd_database_path
Mounting /scr/alphafold-data/uniclust30/uniclust30_2018_08 -> /mnt/uniclust30_database_path
Mounting /scr/alphafold-data/pdb70 -> /mnt/pdb70_database_path
Mounting /scr/alphafold-data/pdb_mmcif -> /mnt/template_mmcif_dir
Mounting /scr/alphafold-data/pdb_mmcif -> /mnt/obsolete_pdbs_path
--data_dir=/mnt/data_dir/alphafold-data --uniref90_database_path=/mnt/uniref90_database_path/uniref90.fasta --mgnify_database_path=/mnt/mgnify_database_path/mgy_clusters_2018_12.fa --bfd_database_path=/mnt/bfd_database_path/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniclust30_database_path=/mnt/uniclust30_database_path/uniclust30_2018_08 --pdb70_database_path=/mnt/pdb70_database_path/pdb70 --template_mmcif_dir=/mnt/template_mmcif_dir/mmcif_files --obsolete_pdbs_path=/mnt/obsolete_pdbs_path/obsolete.dat --output_dir=/mnt/output --benchmark=0 --logtostderr --fasta_paths=/opt/ohpc/pub/singularity/test-scripts/alphafold-2/T1050.fasta --max_template_date=2021-12-31
/scr:/mnt/data_dir,/scr/alphafold-data/uniref90:/mnt/uniref90_database_path,/scr/alphafold-data/mgnify:/mnt/mgnify_database_path,/scr/alphafold-data/bfd:/mnt/bfd_database_path,/scr/alphafold-data/uniclust30/uniclust30_2018_08:/mnt/uniclust30_database_path,/scr/alphafold-data/pdb70:/mnt/pdb70_database_path,/scr/alphafold-data/pdb_mmcif:/mnt/template_mmcif_dir,/scr/alphafold-data/pdb_mmcif:/mnt/obsolete_pdbs_path,/home/hqin:/mnt/output
I0801 11:48:55.799806 140491381822400 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I0801 11:48:56.409429 140491381822400 xla_bridge.py:244] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0801 11:48:57.418743 140491381822400 xla_bridge.py:244] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
I0801 11:49:03.148635 140491381822400 run_alphafold.py:385] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
I0801 11:49:03.148812 140491381822400 run_alphafold.py:397] Using random seed 725758993269421837 for the data pipeline
I0801 11:49:03.149077 140491381822400 run_alphafold.py:150] Predicting T1050
Traceback (most recent call last):
File "/app/alphafold/run_alphafold.py", line 427, in <module>
app.run(main)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/alphafold/run_alphafold.py", line 412, in main
is_prokaryote=is_prokaryote)
File "/app/alphafold/run_alphafold.py", line 164, in predict_structure
msa_output_dir=msa_output_dir)
File "/app/alphafold/alphafold/data/pipeline.py", line 148, in process
with open(input_fasta_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/ohpc/pub/singularity/test-scripts/alphafold-2/T1050.fasta'