== Pre-class to do:
post video of lecture 3 (GAN)
calendar email invitation
homework assignment, DataCamp
paper selection: a high-quality, primary research paper
potential projects (agentic bioinformatics analysis, agentic lab report?, pretraining of a transformer, word embeddings)
Socrative questions (on content from the last lecture): T/F questions on VAE
update Canvas course materials, learning objectives, and assignments as needed
Test-run code: skip.
Kindle book: use iPad to highlight key points.
== In-class to do:
clean up desktop space, calendars
Zoom: live transcript (start video recording)
Socrative sign-in: skipped
Sergio presentation on DoRA
GAN: principles in PDF, then Kindle textbook
breakout rooms: discuss course projects.
Meeting summary: 202510_CS_795_21992_CS 795 gAI Thu Evening
Quick recap
The professor outlined requirements for a generative AI course project focusing on ethical and legal considerations, with students needing to submit proposals within a week. The discussion covered various aspects of autoregression models, tokenization in NLP, and the structure and operation of LSTM cells, including how temperature parameters affect model behavior and creativity. The session concluded with a student presentation on weight decomposed low-rank adaptation methods and their performance compared to full fine-tuning, followed by an announcement about breakout rooms for project discussions.
Next steps
- All students: Submit project proposals with title, team members, brief background, motivation, available datasets, AI approach, references, and resource requirements
- Students: Limit project teams to a maximum of two members
- Students: Ensure each team member makes meaningful contributions to the project if working in pairs
- Students: Contact the ODU GCP team for support if needing high-end GPUs
- Evan: Proceed with his project on jailbreaking large language models
Summary
Generative AI Course Project Requirements
The professor explained that the course project is the only project in the course and must be related to generative AI, with a focus on ethical and legal considerations. He clarified that the project can involve training discriminators or jailbreaking models, but emphasized that the end result should have a generative aspect. The professor also discussed the project proposal requirements, including team composition, background, data sets, AI approaches, and resource needs, and mentioned that students have one week to submit their proposals.
Understanding Autoregression and Tokenization
Hong discussed autoregressive models, highlighting that while ChatGPT is a popular example, there are many other models, including LSTM and GRU. He explained the concept of tokenization in natural language processing, noting its importance in splitting text into smaller units for analysis. Hong also described the process of embedding, where tokens are represented as vectors of continuous floating-point numbers, typically trained in the context of the model's inputs or outputs, and how this can lead to interesting and meaningful representations.
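For concreteness, a minimal sketch of these two steps, assuming simple whitespace tokenization and a random toy embedding table rather than the actual pipeline discussed in class:

# Toy illustration of tokenization and embedding lookup (not the exact
# pipeline from the lecture): whitespace tokenization, an integer id per
# token, and an embedding table of random floats standing in for trained values.
import numpy as np

text = "the cat sat on the mat"
tokens = text.split()                          # whitespace tokenization
vocab = {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}
ids = [vocab[tok] for tok in tokens]           # token ids, e.g. [0, 1, 2, 3, 0, 4]

rng = np.random.default_rng(0)
embed_dim = 8
embedding_table = rng.normal(size=(len(vocab), embed_dim))  # one row per token id
vectors = embedding_table[ids]                 # continuous representation of the text
print(ids)
print(vectors.shape)                           # (6, 8)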
LSTM Cell Operations and Memory
Hong explained the structure and operation of LSTM (Long Short-Term Memory) cells, focusing on how they differ from traditional recurrent neural networks. He described how LSTMs use a cell state that maintains memory through weight matrices shared across all time steps, and detailed the four key operations within each LSTM cell: forget, input, cell-state update, and output. Hong noted that while LSTMs were a significant improvement over simple RNNs when introduced 28 years ago, they are now considered less efficient than Transformers due to their fixed weight matrices.
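A minimal NumPy sketch of one LSTM time step, with random placeholder weights, illustrating the four operations in the order described (forget, input, cell-state update, output):

# One LSTM time step in NumPy. Weight matrices are random placeholders;
# in a trained model they are shared across all time steps.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b each hold parameters for the f, i, g (candidate), o blocks.
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell state
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    c = f * c_prev + i * g                               # cell-state update (memory)
    h = o * np.tanh(c)                                   # hidden state / output
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = {k: rng.normal(size=(n_hid, n_in)) for k in "figo"}
U = {k: rng.normal(size=(n_hid, n_hid)) for k in "figo"}
b = {k: np.zeros(n_hid) for k in "figo"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(h.shape, c.shape)   # (3,) (3,)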
Temperature Parameters in Language Models
The discussion focused on the implementation and behavior of the temperature parameter in language models, particularly LSTM models. Hong explained how temperature affects the stochastic nature of model predictions, with higher temperatures increasing randomness and lower temperatures making outputs more deterministic. Hamza and Evan clarified that temperature controls the creativity and randomness of model outputs, with Evan confirming this through research. The group also discussed the limitations of Keras for modifying AI models compared to PyTorch, noting its industrial orientation and declining usage.
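A small sketch of temperature scaling with toy logits, showing how dividing by the temperature changes the sampling distribution:

# Temperature scaling of logits before sampling. Dividing the logits by a
# temperature T flattens the softmax as T grows (more random/creative output)
# and sharpens it as T shrinks toward 0 (more deterministic output).
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())      # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]                       # toy next-token scores
for T in (0.1, 1.0, 2.0):
    draws = [sample_with_temperature(logits, T, rng) for _ in range(1000)]
    print(T, np.bincount(draws, minlength=3) / 1000)  # empirical token frequencies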
Quantum LSTM and Machine Learning
Hong discussed the evolution and modifications of recurrent neural networks, particularly focusing on the LSTM (Long Short-Term Memory) and its limitations. He explained how a modified version of the recurrent unit, known as the Gated Recurrent Unit (GRU), lacks a cell state and memory, which could potentially make it faster to train but less effective in retaining long-term dependencies. Hong also introduced the concept of quantum machine learning, highlighting a recent development where a classical LSTM was combined with a quantum encoder to create a quantum LSTM. He emphasized the potential for quantum computing to revolutionize machine learning and suggested that future generations might need to learn quantum machine learning, even though it is still in its early stages.
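For comparison with the LSTM step above, a minimal GRU time step (again with random placeholder weights), showing that there is no separate cell state, only update and reset gates acting on the hidden state:

# One GRU time step in NumPy: no cell state, just a hidden state controlled
# by update (z) and reset (r) gates. Weights are random placeholders.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    z = sigmoid(W["z"] @ x + U["z"] @ h_prev + b["z"])              # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h_prev + b["r"])              # reset gate
    h_tilde = np.tanh(W["h"] @ x + U["h"] @ (r * h_prev) + b["h"])  # candidate state
    return (1 - z) * h_prev + z * h_tilde                           # new hidden state

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = {k: rng.normal(size=(n_hid, n_in)) for k in "zrh"}
U = {k: rng.normal(size=(n_hid, n_hid)) for k in "zrh"}
b = {k: np.zeros(n_hid) for k in "zrh"}
print(gru_step(rng.normal(size=n_in), np.zeros(n_hid), W, U, b))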
Student Presentation Break Discussion
Hong and Sergio discussed the first student presentation of the day, agreeing to take a 5-minute break before resuming at 7:05. Sergio confirmed he was ready to present and successfully shared his screen for the presentation.
Decomposed Low-Rank Adaptation Techniques
Sergio presented DoRA (Weight-Decomposed Low-Rank Adaptation), introduced by an NVIDIA group, which builds on LoRA (Low-Rank Adaptation). He explained that while full fine-tuning adjusts all parameters, parameter-efficient methods like LoRA and DoRA only modify specific components, aiming to replicate full fine-tuning results with fewer computations. Sergio detailed how DoRA decomposes weights into magnitude and direction, allowing independent updates, which leads to a learning pattern closer to full fine-tuning than LoRA's. Hong and Evan asked clarifying questions about the decomposition process and the implications of the negative slope in the learning trajectory, which Sergio explained as a lack of correlation between magnitude and direction changes. Terry inquired about training-time differences between the methods, which Sergio did not fully address.
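A rough sketch of the decomposition idea, assuming the magnitude/direction formulation from the DoRA paper with placeholder dimensions and random values (m is a per-column magnitude, and the direction is the frozen weight plus a LoRA-style low-rank update):

# Sketch of the weight decomposition Sergio described: the merged weight is a
# learnable magnitude vector times a column-normalized direction, where the
# direction is the frozen weight plus a low-rank update B @ A.
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 6, 4, 2                              # output dim, input dim, rank
W0 = rng.normal(size=(d, k))                   # frozen pre-trained weight
B = np.zeros((d, r))                           # low-rank factors (B starts at zero,
A = rng.normal(size=(r, k))                    #  so the initial update is zero)
m = np.linalg.norm(W0, axis=0)                 # per-column magnitude, learnable

V = W0 + B @ A                                 # direction component
W_adapted = m * (V / np.linalg.norm(V, axis=0))  # magnitude * unit direction
print(np.allclose(W_adapted, W0))              # True before any training step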
DoRA: Parameter-Efficient Model Tuning
Sergio explained the concept of DoRA, a parameter-efficient tuning method that reduces the number of trainable parameters by decomposing the weight matrix, resulting in faster training than full fine-tuning. He highlighted that while DoRA introduces some computational cost during tuning, it does not affect model latency during inference. Hong inquired about the meaning of the scores in the results, and Sergio clarified that higher scores indicate better performance, though the specific metrics are not clearly defined for generative AI. They also discussed the hyperparameters used, including the rank (R), a key tuning parameter, and Sergio explained how R is chosen based on the results and the dimensions of the weight matrix.
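A back-of-the-envelope comparison of trainable parameter counts for a single weight matrix, using illustrative dimensions rather than figures from the presentation:

# Full fine-tuning updates all d*k entries, LoRA trains r*(d+k), and a
# DoRA-style method adds roughly k magnitude values on top of the LoRA factors.
d, k, r = 4096, 4096, 8            # illustrative dimensions and rank

full = d * k                       # every entry of the weight matrix
lora = r * (d + k)                 # factors B (d x r) and A (r x k)
dora = lora + k                    # LoRA factors plus a per-column magnitude vector

print(f"full fine-tuning: {full:,} params")    # 16,777,216
print(f"LoRA (r={r}): {lora:,} params")        # 65,536
print(f"DoRA-style: {dora:,} params")          # 69,632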
Parameter-Efficient Model Tuning Discussion
The group discussed the performance of DoRA and LoRA models, focusing on their efficiency and accuracy compared to full fine-tuning. Sergio explained that DoRA can achieve similar or better accuracy than full fine-tuning with fewer trainable parameters, while LoRA performs better at higher ranks but requires more computational resources. The team also explored the concept of quantized models, where the pre-trained model is compressed to reduce memory demands. Hong asked clarifying questions about the ranking system and parameter usage, and the group discussed the implications of different parameter-efficient tuning methods for model latency. Finally, Hong announced that breakout rooms would be set up for students to discuss potential course projects, with 10 rooms available for participants to join.
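A rough memory estimate illustrating why quantizing the frozen base model helps; the 7B parameter count and adapter size below are hypothetical, not numbers from the discussion:

# Memory for a quantized base model plus a small higher-precision adapter
# (QLoRA-style setup): the frozen weights are stored at low precision while
# the LoRA/DoRA adapter stays in 16-bit.
base_params = 7e9                  # hypothetical base model size
adapter_params = 40e6              # hypothetical adapter size

def gib(n_params, bits):
    return n_params * bits / 8 / 2**30

print(f"base fp16:   {gib(base_params, 16):6.1f} GiB")    # ~13.0 GiB
print(f"base int4:   {gib(base_params, 4):6.1f} GiB")     # ~3.3 GiB
print(f"adapter fp16:{gib(adapter_params, 16):6.2f} GiB") # ~0.07 GiB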
Fall'24 Lecture Videos: https://lnkd.in/efSvp7hY
Fall'24 Lecture Notes: https://lnkd.in/eWBAxQHk
(a) Genomes: Statistical genomics, gene regulation, genome language models, chromatin structure, 3D genome topology, epigenomics, regulatory networks.
(b) Proteins: Protein language models, structure and folding, protein design, cryo-EM, AlphaFold2, transformers, multimodal joint representation learning.
(c) Therapeutics: Chemical landscapes, small-molecule representation, docking, structure-function embeddings, agentic drug discovery, disease circuitry, and target identification.
(d) Patients: Electronic health records, medical genomics, genetic variation, comparative genomics, evolutionary evidence, patient latent representation, AI-driven systems biology.
Foundations and frontiers of computational biology, combining theory with practice. Generative AI, foundation models, machine learning, algorithm design, influential problems and techniques, analysis of large-scale biological datasets, applications to human disease and drug discovery.
First Lecture: Thu Sept 4 at 1pm in 32-144
With: Prof. Manolis Kellis, Prof. Eric Alm, TAs: Ananth Shyamal, Shitong Luo
Course website: https://lnkd.in/eemavz6J