Friday, March 7, 2025

2025 0307 Friday AAAI review

 Zoom, start recording

AAAI review (Underline) 

Andrew Ng (talk), other talks

https://underline.io/events/473/sessions/19421/lecture/115522-ai-agents-and-applications


TH05: Multi-modal Foundation Model for Scientific Discovery: With Applications in Chemistry, Material, and Biology

AI for science: 

https://zoom.us/rec/play/oFu2Qj2JC6LuGbcYfz8jK1hbY-wGHysPBB1CNJo258SAtBKxc8GMK8Z8gbIy3Oi2_jvAn0EEbXD5F4R3.L3JBr3BHdKD6w6pS?autoplay=true&startTime=1740489296000


B1: AI for Medicine and Healthcare

https://zoom.us/rec/play/XQJMoV7A45ly7ggk8jVZHxcIzb1K4GkU6LSrtB4brwJHtlMyTw_QT5GpWaNAmZYyCmwaixlThXX4kRjk.7xy0OeOZokUB5JMr?autoplay=true&startTime=1740491090000


=> student talk and project update, milestone discussion



Colab Enterprise exercise materials

 

https://cloud.google.com/blog/products/ai-machine-learning/running-alphafold-on-vertexai?_gl=1*1jnafks*_ga*MTgyMTg3NzI4OS4xNzQxMjg5MTMy*_ga_WH2QY8WWF5*MTc0MTI4ODUwNC4yLjEuMTc0MTI4ODUyOS4zNS4wLjA.&e=48754805


GitHub in Colab directly

 

Quick correction:
# Replace the placeholders below with your GitHub username
# and a Personal Access Token (PAT) that has repo access.

# Set your GitHub credentials
GITHUB_USERNAME = "YOUR_GITHUB_USERNAME"
GITHUB_TOKEN = "YOUR_GITHUB_TOKEN_HERE"

# Clone the private repo over HTTPS; in Colab, {var} inside a
# !-command is expanded from the Python variables above.
!git clone https://{GITHUB_USERNAME}:{GITHUB_TOKEN}@github.com/repo_name/team_123.git
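Hardcoding a token in a notebook cell risks leaking it when the notebook is shared. A minimal sketch of a safer variant that prompts for the token at runtime instead; `build_clone_url` is a helper name of my own, and the repo path placeholder is kept from above:

```python
# Prompt for the token at runtime rather than storing it in the notebook,
# then build the authenticated HTTPS clone URL.
from getpass import getpass

def build_clone_url(username: str, token: str, repo_path: str) -> str:
    """Return an HTTPS clone URL with embedded credentials."""
    return f"https://{username}:{token}@github.com/{repo_path}.git"

# In Colab you would then run:
#   token = getpass("GitHub token: ")
#   !git clone {build_clone_url("YOUR_GITHUB_USERNAME", token, "repo_name/team_123")}
```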

Wednesday, March 5, 2025

math formulas: differences between the transformer encoder and decoder


The transformer architecture splits into an encoder and a decoder, and while both use the scaled dot‐product attention mechanism, they differ in how and where this mechanism is applied.


Scaled Dot-Product Attention (Common to Both)

At the core of both components is the scaled dot‐product attention defined as:

\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right) V

where:

  • Q (queries),
  • K (keys),
  • V (values), and
  • d_k is the dimensionality of the keys.
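The formula above can be sketched directly in NumPy; shapes and variable names here are illustrative, not from any particular library:

```python
# Minimal sketch of scaled dot-product attention.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (n_queries, n_keys)
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Each row of the softmax output sums to 1, so every output vector is a convex combination of the value vectors.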

Encoder

In the encoder, every layer performs self-attention on the input sequence. Here, the queries, keys, and values are all derived from the same input x:

Q = xW^Q,\quad K = xW^K,\quad V = xW^V

Thus, the encoder’s self-attention is computed as:

\text{SelfAttention}_{\text{enc}}(x) = \text{softmax}\!\left(\frac{(xW^Q)(xW^K)^T}{\sqrt{d_k}}\right)(xW^V)

This mechanism allows each token in the input to attend to all other tokens, integrating contextual information across the entire sequence.
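The projection step can be added to the earlier sketch; the weight matrices are random here just to show the wiring (in a real model they are learned), and all names and dimensions are illustrative:

```python
# Encoder self-attention: Q, K, V all projected from the same input x.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, W_Q, W_K, W_V):
    Q, K, V = x @ W_Q, x @ W_K, x @ W_V   # shared input, separate projections
    d_k = K.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k), axis=-1) @ V

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 16))                      # 5 tokens, model dim 16
W_Q, W_K, W_V = (rng.normal(size=(16, 8)) for _ in range(3))
y = self_attention(x, W_Q, W_K, W_V)
print(y.shape)  # (5, 8)
```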


Decoder

The decoder is more complex because it has to generate an output sequence while incorporating information from the encoder. It uses two main attention mechanisms:

  1. Masked Self-Attention:

    The decoder first applies self-attention to its own previous outputs. To maintain the autoregressive property (i.e., ensuring a token only depends on earlier tokens), a mask M is applied. This mask typically sets the upper triangular part of the attention matrix to a very negative value, so that the softmax zeroes out any attention weights corresponding to future tokens. Formally:

    \text{MaskedAttention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}} + M\right)V

    Here, Q, K, and V are derived from the decoder’s own input (previously generated tokens).

  2. Encoder-Decoder (Cross) Attention:

    After the masked self-attention, the decoder incorporates information from the encoder. In this step, the queries come from the decoder (from the output of the masked self-attention), while the keys and values come from the encoder’s final output. The formula is:

    \text{EncDecAttention}(Q_{\text{dec}}, K_{\text{enc}}, V_{\text{enc}}) = \text{softmax}\!\left(\frac{Q_{\text{dec}} K_{\text{enc}}^T}{\sqrt{d_k}}\right)V_{\text{enc}}

    This step allows the decoder to "look" at the input sequence and incorporate context from the encoder into the output generation.
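Both decoder mechanisms can be sketched with the same attention function: the causal mask puts -inf above the diagonal so the softmax assigns zero weight to future positions, and cross-attention simply takes K and V from the encoder output. Shapes and names here are illustrative only:

```python
# Decoder-side attention: causal masked self-attention, then cross-attention.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_mask(n):
    """-inf above the diagonal (future tokens), 0 on and below it."""
    return np.triu(np.full((n, n), -np.inf), k=1)

def masked_attention(Q, K, V, M):
    d_k = K.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k) + M, axis=-1) @ V

rng = np.random.default_rng(2)
dec = rng.normal(size=(4, 8))                     # decoder states
M = causal_mask(4)
self_out = masked_attention(dec, dec, dec, M)     # masked self-attention

enc = rng.normal(size=(6, 8))                     # encoder output
# Cross-attention: queries from the decoder, keys/values from the encoder,
# and no mask (a zero matrix) since the full input may be attended to.
cross = masked_attention(self_out, enc, enc, np.zeros((4, 6)))
print(self_out.shape, cross.shape)  # (4, 8) (4, 8)
```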


Summary of Differences

  • Input Sources:

    • Encoder: Uses self-attention where Q, K, and V are all derived from the same input x.
    • Decoder: Uses two attention mechanisms: masked self-attention on its own output (with a mask M) and cross-attention that uses the encoder's outputs for K and V.
  • Masking:

    • Encoder: No masking is necessary; all tokens attend to each other.
    • Decoder: Uses a mask in the self-attention to prevent future tokens from being attended to, preserving the autoregressive property.
  • Attention Layers:

    • Encoder: A single self-attention layer per encoder block.
    • Decoder: Two sequential attention layers (masked self-attention followed by encoder-decoder attention) in each decoder block.

These differences in the attention formulas are key to enabling the decoder to generate coherent output sequences while leveraging the complete context provided by the encoder. 


Monday, March 3, 2025

alphafold 3 on ODU wahab

 

  • Load modules
    • module load alphafold/3.0.1
  • Request a high-memory GPU node
    • salloc -p high-gpu-mem --gres=gpu:1
  • Run the software, replacing the paths with your own directories
    •  crun.alphafold run_alphafold.py --output_dir=/home/tstil004/alphafold3/output --json_path=/home/tstil004/alphafold3/gmp.json --model_dir=/home/tstil004/alphafold3/models
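The interactive steps above can also be submitted as a batch job. A minimal Slurm script sketch, assuming the same module, partition, and flags as above; the job name and home-relative paths are placeholders to adjust for your own account:

```shell
#!/bin/bash
#SBATCH --job-name=af3
#SBATCH --partition=high-gpu-mem
#SBATCH --gres=gpu:1

module load alphafold/3.0.1

crun.alphafold run_alphafold.py \
    --output_dir=$HOME/alphafold3/output \
    --json_path=$HOME/alphafold3/gmp.json \
    --model_dir=$HOME/alphafold3/models
```

Submit with `sbatch` instead of holding an interactive `salloc` session open.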

MIT flow matching

 

Introduction to Flow Matching and Diffusion Models

MIT Computer Science Class 6.S184: Generative AI with Stochastic Differential Equations

https://diffusion.csail.mit.edu/

https://github.com/eje24/iap-diffusion-class/tree/main/docs


ODU HPC

Since HPC is a type of computing environment that is unfamiliar to many researchers, we offer "Intro to HPC" workshops to help people onboard to the system. We held the most recent one last week. We recorded the workshop; anyone is welcome to watch it to learn:


“Introduction to HPC -- 2025-02-25 Workshop raw recording”
Length: 03hr 32m 45s


Our complete documentation can be read at https://wiki.hpc.odu.edu/.

Biomedical Data Science: Mining and Modeling (Spring 2025), Yale

 

https://cbb752b25.gersteinlab.org/syllabus

Syllabus

Please see last year’s syllabus (with slide packs at the bottom) for previews of this year’s lectures, which will be slightly different. For this year, an updated slide pack will be posted after the lecture. (If it is substantially different from ‘22, an updated video will also be posted.) Video recordings for this year’s lectures can be found in the Media Library tab in Canvas.

Date | Topic | 2025 Spring’s Lecture | 2023 Spring’s Lecture | Comment | Lecture Summary
---- | ----- | --------------------- | --------------------- | ------- | ---------------
1/13 | *YALE* Spring term classes begin, 8.20 a.m. | | | |
1/13 | Introduction | 25i1-25i2a | 23i1, 23i2a | |
1/15 | DATA - Proteomics I | 25d3 | 23d3 | Suggested Reading | 25d3
1/22 | DATA - Proteomics II | 25d4 | 23d4 | Suggested Reading | 25d4
1/27 | DATA - Genomics I | 25d1 | 23d1 | | 25d1
1/29 | DATA - Genomics II | 25d2 | 23d2 | | 25d2
2/3 | DATA - Knowledge Representation & Databases | 25d5 | 23d5 | | 25d5
2/5 | MINING - Personal Genomes + Seq. Comparison | 25i2b, 25m3 | 23i2b, 23m3 | |
2/10 | MINING - Seq. Comparison (con’t) + Multi-seq Alignment + Fast Alignment | 25m3, 25m4, 25m5 | 23m3, 23m4, 23m5 | | 2512b, 25m3-part1
2/12 | MINING - Variant Calling (incl. a focused section on SVs) + Basic Multi-Omics | 25m6a, 23m6b, 23m7-pt1 | 23m6a, 23m6b, 23m7 | |
2/17 | Quiz on 1st Half | quiz1 study guide | | |
2/19 | MINING - Supervised Mining #2 + Deep Learning Fundamentals #1 | | 23m8b, 23m8c | |
2/24 | MINING - Deep Learning Fundamentals #2 + Unsupervised Mining #1 | | 23m9a, 23m9c | |
2/26 | MINING - Unsupervised Mining #2 + Single-Cell Analysis #1 | | 23m9d, 23m9e | |
3/3 | MINING - Single-Cell Analysis #2 + Biomedical Image Analysis | | 23t1 | |
3/5 | MINING - Network Analysis | | 23m10a, 23m10b, 23m10c, 23m10d, 23m10e | |
3/7 | Spring break begins | | | |
3/24 | MINING - Privacy | | 23m11a, 23m11b | |
3/26 | MINING/MODELING - Deep Learning Advanced I | | 23m12a | |
3/31 | MINING/MODELING - Deep Learning Advanced II | | 23m12b | |
4/2 | SIMULATION - Protein Simulation I | | 23s1 | |
4/7 | SIMULATION - Protein Simulation II | | 23s2 | |
4/9 | SIMULATION - Protein Simulation III | | 23s3 | |
4/14 | SIMULATION - Protein Simulation IV | | 23s4 | |
4/16 | SIMULATION - Protein Simulation V | | 23s5 | |
4/21 | Quiz on 2nd Half | | | |
4/23 | Final Presentations | | | |
4/25 | *YALE* Classes end; Reading period begins | | | |
5/1 | *YALE* Final examinations begin | | | |
5/7 | *YALE* Final examinations end | | | |