GWU James Hahn: AI, pediatrics
https://engineering.gwu.edu/james-hahn
This site is to serve as my note-book and to effectively communicate with my students and collaborators. Every now and then, a blog may be of interest to other researchers or teachers. Views in this blog are my own. All rights of research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
https://ezharjan.github.io/AutoSceneGen/
https://github.com/Ezharjan/AutoSceneGen
https://github.com/Ezharjan?tab=repositories
https://www.nvidia.com/en-us/high-performance-computing/earth-2/
AI to biology is like math to physics
George Church, Dyno, Manifold
| Company | Focus / AI-relevance | Notes |
|---|---|---|
| Lila Sciences | AI-agent platform startup | George Church joined Lila Sciences as Chief Scientist in 2025; the company is described as an “AI agent platform” startup. (Wikipedia) |
| GC Therapeutics | Cell-therapy company, includes machine-learning / “plug & play” tech | Although primarily a biotech/cell-therapy company, GC Therapeutics uses a platform that combines synthetic biology, gene editing, cell engineering and machine learning. (Fierce Biotech) |
David Baker, cofounder
aridunah duvan, CEO
Peter Korte, Siemens
Bret, Figure AI
Foxconn, Liu, CEO
CPU general purpose to GPU accelerated computing platform shift
reinvent new algorithms
CUDA-X libraries, 300+
medical imaging framework
genomic processing
codesign
mixture of experts on the chip, thinking GPU, 32 expert GPUs
== pre-class to do:
post video:
calendar email invitation:
homework assignment, data camp,
socrative sign in
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in,
transformer gpt.ipynb on vertex ai,
decoder only
causal masks,
temperature
student presentation: Google Day practice
breakout rooms: course project.
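For the decoder-only and causal-mask items in the in-class list above, here is a minimal sketch of how a causal mask restricts attention so that position i only sees positions j <= i. The helper names `causal_mask` and `masked_softmax` and the toy score matrix are illustrative assumptions, not code from transformer gpt.ipynb.

```python
import math

def causal_mask(n):
    """n x n mask: True where position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores; masked-out positions get weight 0."""
    out = []
    for row, mrow in zip(scores, mask):
        top = max(s for s, m in zip(row, mrow) if m)  # for numerical stability
        exps = [math.exp(s - top) if m else 0.0 for s, m in zip(row, mrow)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Toy attention scores for a 3-token sequence (hypothetical values).
scores = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 1.0],
          [2.0, 1.0, 0.0]]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print([round(w, 3) for w in row])
```

The first token can only attend to itself, so its row is all weight on position 0; later rows spread weight over earlier positions but never over future ones.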
https://www.politico.com/newsletters/future-pulse/2025/10/22/arpa-h-new-director-00618080
Alicia Jackson’s DARPA roots could profoundly reshape how ARPA-H approaches artificial intelligence — especially generative AI in the life sciences.
While POLITICO reports that the Trump administration cut several ARPA-H AI programs in areas like AI-driven cancer detection and preventive care, that doesn’t mean a retreat from AI. It signals a strategic pivot — from broad, exploratory projects to mission-focused, biologically grounded applications.
At DARPA’s Biological Technologies Office, Jackson led visionary programs such as Living Foundries and BRICS, advancing programmable biology and biosafety.
Her philosophy treats AI as a design engine — a tool for creating biological systems, not just interpreting them.
| Jackson’s Focus | How Generative AI Fits In |
|---|---|
| Programmable biology | AI models that design new enzymes, antibodies, or pathways. |
| Biomanufacturing efficiency | Reinforcement learning to optimize cell or microbial production. |
| Predictable, controllable systems | AI that forecasts biological stability and detects anomalies in real time. |
Jackson’s entrepreneurial work — from Evernow to Drawbridge Health — points to a leader focused on translation and commercialization.
Expect ARPA-H to favor AI that accelerates real-world deployment, not theoretical modeling.
Likely directions:
Digital biomanufacturing twins for faster FDA qualification
Human-in-the-loop generative design for explainable AI innovation
Regulatory-ready AI models aligned with FDA’s evolving digital-health framework
Jackson’s history with Safe Genes and BRICS highlights her awareness of biosecurity and dual-use risks.
Her ARPA-H will likely push for “safe and governed” AI, emphasizing:
Explainable generative models for biology
Ethical-control frameworks for AI that manipulates living systems
Red-teaming and validation pipelines — directly inspired by DARPA safety protocols
In practice, that means generative tools will need built-in containment logic to prevent unintended or dangerous outputs.
| ARPA-H Priority Area | AI Application Example | Strategic Outcome |
|---|---|---|
| Rapid Bio-Design Platforms | Foundation models for proteins and RNA | Faster molecule discovery for health and defense |
| Scalable Biomanufacturing | Generative control of microbial or cell-free systems | On-demand vaccines, hormones, or nutrients |
| Neuro-Restoration Interfaces | Generative neural encoding | Brain recovery and adaptive prosthetics |
| Women’s Health & Aging | Personalized AI for hormonal and aging biomarkers | Precision-health insights with consumer impact |
| AI Safety in Biotechnology | Red-team and governance frameworks | Mitigate dual-use and biosecurity risks |
Under Jackson’s leadership:
AI won’t vanish — it will integrate deeply into bioengineering.
Generative AI will fund tangible biological prototypes, not abstract tools.
Open-ended “AI-for-everything” research will give way to DARPA-style challenges — measurable, outcome-driven, and safety-conscious.
In short, ARPA-H’s next AI chapter will likely merge engineering discipline with biological imagination — turning AI into a creative partner for the life sciences, not just an observer.
A team of researchers at the Changchun Veterinary Research Institute in China has zeroed in on a specific strain of the influenza virus, known as D/HY11, which emerged in cattle in northeast China in 2023.
model the evolution and species jump using viralGPT.
BMS meeting.
Robert Bruno, 3D bioprint and cancer
Lifang Yang
Larry Sanford
Frank Lattanzio
Patrick Sachs
Siqi Guo
Ebony Clark
Lisa Shollenberger
Peter Mollica
== pre-class to do:
post video
calendar email invitation:
socrative sign in
update Canvas course materials, update learning objectives. assignments as needed:
kindle book.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
** Google Day presentation;
Socrative sign in,
student presentation;
breakout rooms: course project poster work;
== pre-class to do:
post video of lect 4 LSTM
calendar email invitation:
homework assignment, data camp,
socrative sign in
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in,
student presentation; Terry and ??
breakout rooms: course project
“Humanoids are the next frontier of physical AI, requiring the ability to reason, adapt and act safely in an unpredictable world,” said Rev Lebaredian, vice president of Omniverse and simulation technology at NVIDIA. “With these latest updates, developers now have the three computers to bring robots from research into everyday life — with Isaac GR00T serving as robot’s brains, Newton simulating their body and NVIDIA Omniverse as their training ground.”
ODU faculty profile link, where profile image and some information can be updated
GCP Google Earth Engine
Pratap Ramamurthy
Jeremy Malczyk
Ranadheer Mettu
https://scholarsarchive.byu.edu/etd/10705/
== pre-class to do:
post video of lect 4 LSTM
calendar email invitation:
homework assignment, data camp,
paper selection, high quality, primary research paper.
potential project (agentic bioinformatics analysis, agentic lab report?, pretraining of transformer, word embedding)
socrative questions (questions on contents from last lecture): TF on VAE
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in, skipped
Anton: presentation
Normalizing flow
breakout rooms: student Kris;
https://www.modusponensinstitute.com/
The first Middle School Ethics Olympiad in the Western Hemisphere.
November 22nd, 9am PST.
Qualify for the International Middle School Olympiad!
== pre-class to do:
post video of lect 3 GAN
calendar email invitation:
homework assignment, data camp,
paper selection, high quality, primary research paper.
potential project (agentic bioinformatics analysis, agentic lab report?, pretraining of transformer, word embedding)
socrative questions (questions on contents from last lecture): TF on VAE
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in, skipped
Sergio presentation on DoRA
GAN, principle in pdf, then kindle textbook,
breakout rooms: discuss course projects.
Meeting assets for 202510_CS_795_21992_CS 795 gAI Thu Evening are ready!
Meeting summary
The professor outlined requirements for a generative AI course project focusing on ethical and legal considerations, with students needing to submit proposals within a week. The discussion covered various aspects of autoregression models, tokenization in NLP, and the structure and operation of LSTM cells, including how temperature parameters affect model behavior and creativity. The session concluded with a student presentation on weight decomposed low-rank adaptation methods and their performance compared to full fine-tuning, followed by an announcement about breakout rooms for project discussions.
The professor explained that the course project is the only project in the course and must be related to generative AI, with a focus on ethical and legal considerations. He clarified that the project can involve training discriminators or jailbreaking models, but emphasized that the end result should have a generative aspect. The professor also discussed the project proposal requirements, including team composition, background, data sets, AI approaches, and resource needs, and mentioned that students have one week to submit their proposals.
Hong discussed the autoregression model, highlighting that while ChatGPT is a popular example, there are many other models including LSTM and GRU. He explained the concept of tokenization in natural language processing, noting its importance in splitting text into smaller units for analysis. Hong also described the process of embedding, where tokens are represented by continuous floating-point numbers, often trained in the context of input or output, and how this can lead to interesting and meaningful representations.
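As a rough illustration of the tokenization and embedding steps described here, the sketch below splits text on whitespace, maps tokens to integer ids, and looks up a small random embedding table. Whitespace splitting and the names `tokenize`, `build_vocab`, and `init_embeddings` are simplifying assumptions; production models typically use subword tokenizers, and embeddings are learned during training rather than left at random values.

```python
import random

def tokenize(text):
    """Split text into lowercase whitespace-delimited tokens (a simplification)."""
    return text.lower().split()

def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for t in tokens:
        if t not in vocab:
            vocab[t] = len(vocab)
    return vocab

def init_embeddings(vocab_size, dim, seed=0):
    """Random starting embedding table; training would adjust these floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

tokens = tokenize("the cell reads the gene")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
table = init_embeddings(len(vocab), dim=4)
vectors = [table[i] for i in ids]  # one continuous vector per token

print(ids)
```

Both occurrences of "the" share id 0 and therefore the same embedding vector, which is what lets training pool information about a token across contexts.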
Hong explained the structure and operation of LSTM (Long Short-Term Memory) cells, focusing on how they differ from traditional recurrent neural networks. He described how LSTMs use a cell state that maintains memory through weighted matrices shared across all time steps, and detailed the four key operations within each LSTM cell: forget, input, cell state update, and output. Hong noted that while LSTMs were a significant improvement over simple RNNs when introduced 28 years ago, they are now considered less efficient than Transformers due to their fixed weight matrices.
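The four operations named above (forget, input, cell state update, output) can be sketched as a single time step in plain Python. This follows the standard LSTM formulation; the parameter names (`Wf`, `Wi`, `Wc`, `Wo` and matching biases) and the all-zero demo weights are assumptions for illustration, not the lecture's code.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step; the same parameters p are reused at every step."""
    z = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]    # forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]    # input gate
    g = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]  # candidate values
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]    # output gate
    # Cell state update: keep part of the old memory, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Demo with hidden size 2, input size 1, and all-zero weights (hypothetical).
hidden, n_in = 2, 1
params = {k: [[0.0] * (hidden + n_in) for _ in range(hidden)]
          for k in ("Wf", "Wi", "Wc", "Wo")}
params.update({k: [0.0] * hidden for k in ("bf", "bi", "bc", "bo")})
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h, c)
```

With zero weights every gate evaluates to 0.5 and the candidate to 0, so the step simply halves the previous cell state, which makes the role of the forget gate easy to see.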
The discussion focused on the implementation and behavior of temperature parameters in language models, particularly LSTM models. Hong explained how temperature affects the stochastic nature of model predictions, with lower temperatures leading to more deterministic outputs and higher temperatures increasing randomness. Hamza and Evan clarified that temperature controls the creativity and randomness of model outputs, with Evan confirming this through research. The group also discussed the limitations of Keras for modifying AI models compared to PyTorch, noting its industrial nature and declining usage.
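A small sketch of temperature scaling may help here: logits are divided by the temperature T before the softmax, so a small T sharpens the distribution toward the top token and a large T flattens it. The function name and the example logits are assumptions for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature (temperature must be > 0)."""
    scaled = [l / temperature for l in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.2)     # nearly deterministic
neutral = softmax_with_temperature(logits, 1.0)  # plain softmax
hot = softmax_with_temperature(logits, 5.0)      # close to uniform

for name, probs in (("T=0.2", cold), ("T=1.0", neutral), ("T=5.0", hot)):
    print(name, [round(p, 3) for p in probs])
```

Sampling from `cold` almost always returns the top token, while sampling from `hot` picks lower-ranked tokens far more often.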
Hong discussed the evolution and modifications of recurrent neural networks, particularly focusing on the LSTM (Long Short-Term Memory) and its limitations. He explained how a modified version of the recurrent unit, known as the Gated Recurrent Unit (GRU), lacks a separate cell state and instead carries memory in its hidden state, which can make it faster to train but potentially less effective at retaining long-term dependencies. Hong also introduced the concept of quantum machine learning, highlighting a recent development where a classical LSTM was combined with a quantum encoder to create a quantum LSTM. He emphasized the potential for quantum computing to revolutionize machine learning and suggested that future generations might need to learn quantum machine learning, even though it is still in its early stages.
Hong and Sergio discussed the first student presentation of the day, agreeing to take a 5-minute break before resuming at 7:05. Sergio confirmed he was ready to present and successfully shared his screen for the presentation.
Sergio presented on weight-decomposed low-rank adaptation (DoRA), introduced by the NVIDIA group, which builds upon LoRA (Low-Rank Adaptation). He explained that while full fine-tuning adjusts all parameters, parameter-efficient methods like LoRA and DoRA only modify specific components, aiming to replicate full fine-tuning results with fewer computations. Sergio detailed how DoRA decomposes weights into magnitude and direction, allowing independent updates, which leads to a learning pattern closer to full fine-tuning compared to LoRA. Hong and Evan asked clarifying questions about the decomposition process and the implications of the negative slope in the learning trajectory, which Sergio explained as a lack of correlation between magnitude and direction changes. Terry inquired about training time differences between methods, which Sergio did not fully address in the transcript.
Sergio explained the concept of DoRA, a parameter-efficient tuning method that reduces the number of trainable parameters by decomposing the weight matrix, resulting in faster training times compared to full fine-tuning. He highlighted that while DoRA introduces some computational cost during tuning, it does not affect model latency during inference. Hong inquired about the meaning of scores in the results, and Sergio clarified that higher scores indicate better performance, though the specific metrics are not clearly defined for generative AI. They also discussed the hyperparameters used, including rank (R), which is a key tuning parameter, and Sergio explained how R is chosen based on the results and dimensions of the weight matrix.
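The magnitude and direction decomposition can be sketched in a few lines. This assumes the commonly cited formulation in which the adapted weight is W' = m * (W0 + BA) / ||W0 + BA||, with the norm taken per column, m initialized to the column norms of W0, and B initialized to zero as in LoRA. The matrices and function names below are toy assumptions, not the presenter's code.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def column_norms(W):
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*W)]

def dora_weight(W0, B, A, m):
    """Normalize each column of (W0 + B @ A), then scale it by magnitude m[j]."""
    V = add(W0, matmul(B, A))  # direction: frozen weight plus low-rank update
    norms = column_norms(V)
    return [[m[j] * v / norms[j] for j, v in enumerate(row)] for row in V]

W0 = [[3.0, 0.0],
      [4.0, 2.0]]        # frozen pretrained weight (d_out x d_in)
A = [[0.5, -0.5]]        # r x d_in with rank r = 1 (trainable)
B = [[0.0], [0.0]]       # d_out x r, zero at initialization (trainable)
m = column_norms(W0)     # trainable magnitude vector, one entry per column

W_start = dora_weight(W0, B, A, m)   # equals W0 before any training
B_trained = [[1.0], [0.0]]           # hypothetical value after some updates
W_new = dora_weight(W0, B_trained, A, m)
print(W_start, column_norms(W_new))
```

Because the direction is renormalized, changing B and A alone leaves each column's length fixed at m; lengths change only when m itself is updated, which is what allows magnitude and direction to be trained independently.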
The group discussed the performance of DoRA and LoRA models, focusing on their efficiency and accuracy compared to full fine-tuning. Sergio explained that DoRA can achieve similar or better accuracy than full fine-tuning with fewer parameters, while LoRA performs better with higher ranks but requires more computational resources. The team also explored the concept of quantized models, where the pre-trained model is compressed to reduce memory demands. Hong clarified questions about the ranking system and parameter usage, and the group discussed the implications of different parameter-efficient tuning methods on model latency. Finally, Hong announced that breakout rooms would be set up for students to discuss potential course projects, with 10 rooms available for participants to join.
== pre-class to do:
post video of lec 2 VAE.
calendar email invitation:
homework assignment, data camp,
paper selection, high quality, primary research paper.
potential project (agentic bioinformatics analysis, agentic lab report?, pretraining of transformer, word embedding)
socrative questions (questions on contents from last lecture): TF on VAE
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in, review VAE
== summary, review VAE
GAN, principle in pdf, then kindle textbook,
breakout rooms,
Meeting summary
The meeting began with a review session on variational autoencoders, where students demonstrated good understanding of key concepts including the variational loss function and reparameterization trick. The discussion then moved to Generative Adversarial Networks (GANs), covering their fundamental components, mathematical framework, and training processes, including the challenges and advancements in model training. The latter part of the meeting focused on practical aspects, including the implementation of GANs for image generation, the use of Google Cloud Platform resources like Vertex AI for machine learning applications, and guidelines for course presentations and storage of work.
Hong led a review session on variational autoencoders, confirming that the encoder maps input data to a single latent vector with randomness introduced through auxiliary parameters. Students demonstrated good understanding of concepts like the variational loss function, which includes both reconstruction loss and a regularization term (KL divergence), and the reparameterization trick that allows backpropagation through sampling steps. Hong noted that while some students hadn't signed in, there were 9 confirmed participants, and mentioned that AI meeting note-taking tools were being used by many attendees. The session concluded with a brief mention of moving on to Generative Adversarial Networks in the next lecture.
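The two ideas reviewed here, the loss with a reconstruction term plus a KL regularizer and the reparameterization trick, can be sketched as follows. The sketch assumes a diagonal Gaussian encoder, a standard normal prior, and squared error as the reconstruction term (binary cross-entropy is another common choice); the function names are illustrative.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps with eps ~ N(0, 1); randomness sits in eps,
    so gradients can flow through mu and log_var."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL regularization."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * kl_to_standard_normal(mu, log_var)

rng = random.Random(0)
mu, log_var = [1.0], [0.0]           # hypothetical encoder outputs
z = reparameterize(mu, log_var, rng)
loss = vae_loss([1.0, 2.0], [1.0, 1.0], mu, log_var)
print(z, loss)
```

The KL term is zero only when the encoder outputs mu = 0 and log_var = 0, so it pulls the latent codes toward the prior while the reconstruction term pulls them toward being informative.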
Hong explained the concept of Generative Adversarial Networks (GANs), which involve a discriminator and a generator. The discriminator aims to distinguish between real and fake data, while the generator creates synthetic data to fool the discriminator. The goal is to reach an equilibrium where the discriminator cannot reliably identify fake data, achieving a 50-50 chance of correct classification. Hong also described the mathematical framework of the value function that guides the training process, highlighting the adversarial nature of the optimization procedure.
Hong explained the mathematical foundation of a binary classifier using cross entropy loss, describing how the value function can be expressed in terms of Kullback-Leibler divergence and Jensen-Shannon divergence between real data and generated distributions. He outlined the training process as a two-step procedure: first maximizing the discriminator using the full loss function, and then minimizing the generator using a simplified version of the loss.
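For reference, the value function and its link to the Jensen-Shannon divergence can be written out as below. This is the standard formulation from the original GAN paper, given as a reminder rather than a transcription of the lecture slides; the symbols p_data, p_z, and p_g (data, noise, and generator distributions) are the usual notation and are assumed here.

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% For a fixed generator, the optimal discriminator is
D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% Substituting D^* back into V gives
V(D^*, G) = -\log 4 + 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_g\big)
```

At the equilibrium p_g = p_data the divergence vanishes and D*(x) = 1/2 everywhere, which matches the 50-50 classification outcome described above.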
Hong discussed the challenges and advancements in training generative models, focusing on the WGAN with gradient penalty as the current state-of-the-art method. He explained the technical details of the WGAN, including its use of the Earth mover's distance and the introduction of the epsilon parameter for balancing real and fake data. Hong also highlighted the practical implementation of the WGAN using a real-world example involving the detection of fake bricks, which was demonstrated using a dataset of Lego bricks.
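The critic loss with gradient penalty, including the role of epsilon, can be summarized as follows. This is the standard WGAN-GP formulation, written from the published method rather than from the lecture code; the penalty weight lambda and the distribution names p_r (real) and p_g (generated) are the usual notation and are assumed here.

```latex
L_{\text{critic}}
  = \mathbb{E}_{\tilde{x} \sim p_g}\big[D(\tilde{x})\big]
  - \mathbb{E}_{x \sim p_r}\big[D(x)\big]
  + \lambda\, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]

% Interpolated samples mix one real and one generated example:
\hat{x} = \epsilon\, x + (1 - \epsilon)\, \tilde{x},
\qquad \epsilon \sim \mathrm{Uniform}[0, 1]
```

The penalty pushes the critic's gradient norm toward 1 at points between real and generated samples, a soft way of enforcing the Lipschitz constraint that the Earth mover's distance requires.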
Hong explained the structure of a discriminator and generator model for image generation, noting that the discriminator is a convolutional neural network with a sigmoid output for binary classification, while the generator is similar to a variational autoencoder. Hong outlined the training process, which involves computing binary cross-entropy loss for both the discriminator and generator, and mentioned that the optimizer is specified elsewhere in the code. The discussion touched on the technical details of image expansion methods and the inclusion of noise in the loss function to improve model performance.
Hong discussed the implementation and effectiveness of a generative adversarial network (GAN) with a gradient penalty (GP) for image generation. They explained how the GP is calculated and its role in improving the quality of generated images compared to traditional GANs. Hong also introduced the concept of conditional GANs, which concatenate label information to the input and showed that this simple modification can significantly enhance performance.
Hong discussed the evolution of generative AI methods, noting that while the generative adversarial network (GAN) approach was a significant milestone in 2014, the field has since shifted with the advent of agentic AI, which allows for more specialized and sophisticated critiques. Hong also addressed the use of Google Cloud Platform (GCP) and Vertex AI for students in the class, explaining that while GCP provides a range of industrial-level AI tools, the Vertex AI environment is still in its early stages and may require further development. Evan pointed out that the current GCP course focuses mainly on knowledge checks rather than practical use, and Hamza inquired about the speed and capabilities of Vertex AI compared to ODU's supercomputers, to which Hong clarified that the platforms serve different purposes and are not directly comparable.
Terry demonstrated how to access and use Vertex AI, a Google Cloud service for machine learning and AI applications. He explained the difference between on-premises clusters and cloud resources, emphasizing that Vertex AI provides a managed service for model development, training, and deployment. Terry showed the class how to log into Google Cloud using their ODU student accounts and navigate the Vertex AI interface, highlighting key features like the model garden, Vertex AI studio, notebooks, and deployment options.
The meeting focused on discussing the use of Google Cloud Platform (GCP) resources for the course, particularly Vertex AI and storage solutions. Terry explained that a shared project exists for the class, but students should be cautious about deleting each other's work. He demonstrated how to use buckets for storage and recommended copying important data to Git if needed. The group discussed potential future changes to permissions and the possibility of creating individual projects for each student. Hong clarified that presentations should be individual, not group projects, and explained the format and content expectations for presentations. The class was reminded to save their work before the semester ends, as resources may be deleted afterward.
human variant prediction
https://deepmind.google/discover/blog/alphagenome-ai-for-better-understanding-the-genome/
== pre-class to do:
post video of lec 1. done
calendar email invitation: done
homework assignment, data camp,
paper selection:
potential project (agentic bioinformatics analysis, agentic lab report?, pretraining of transformer, word embedding)
socrative questions (questions on contents from last lecture ):
update Canvas course materials, update learning objectives. assignments as needed:
Test-run code: skip.
kindle book. using ipad to highlight key points.
== In-class to do:
clean up desktop space, calendars,
ZOOM, live transcript (start video recording).
Socrative sign in
=> go over assignments, video, datacamp
=> Kingma and Welling, 2013 arXiv
=> hqin's proof work
=> further reading, Kingma 2019 tutorial
=> play student videos, setup random breakout rooms to discuss presentation papers
Fall'24 Lecture Videos: https://lnkd.in/efSvp7hY
Fall'24 Lecture Notes: https://lnkd.in/eWBAxQHk
(a) Genomes: Statistical genomics, gene regulation, genome language models, chromatin structure, 3D genome topology, epigenomics, regulatory networks.
(b) Proteins: Protein language models, structure and folding, protein design, cryo-EM, AlphaFold2, transformers, multimodal joint representation learning.
(c) Therapeutics: Chemical landscapes, small-molecule representation, docking, structure-function embeddings, agentic drug discovery, disease circuitry, and target identification.
(d) Patients: Electronic health records, medical genomics, genetic variation, comparative genomics, evolutionary evidence, patient latent representation, AI-driven systems biology.
Foundations and frontiers of computational biology, combining theory with practice. Generative AI, foundation models, machine learning, algorithm design, influential problems and techniques, analysis of large-scale biological datasets, applications to human disease and drug discovery.
First Lecture: Thu Sept 4 at 1pm in 32-144
With: Prof. Manolis Kellis, Prof. Eric Alm, TAs: Ananth Shyamal, Shitong Luo
Course website: https://lnkd.in/eemavz6J