
Saturday, January 4, 2025

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

 

https://arxiv.org/abs/2412.14135


Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

OpenAI o1 represents a significant milestone in artificial intelligence, achieving expert-level performance on many challenging tasks that require strong reasoning ability. OpenAI has claimed that the main technique behind o1 is reinforcement learning. Recent works use alternative approaches such as knowledge distillation to imitate o1's reasoning style, but their effectiveness is limited by the capability ceiling of the teacher model. This paper therefore analyzes the roadmap to achieving o1 from the perspective of reinforcement learning, focusing on four key components: policy initialization, reward design, search, and learning. Policy initialization enables models to develop human-like reasoning behaviors, equipping them with the ability to effectively explore solution spaces for complex problems. Reward design provides dense and effective signals via reward shaping or reward modeling, guiding both search and learning. Search plays a crucial role in generating high-quality solutions during both training and testing, and can produce better solutions with more computation. Learning uses the data generated by search to improve the policy, achieving better performance with more parameters and more searched data. Existing open-source projects that attempt to reproduce o1 can be seen as a part or a variant of this roadmap. Collectively, these components underscore how learning and search drive o1's advancement, making meaningful contributions to the development of LLMs.
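
The roadmap's core loop can be made concrete with a toy sketch: a policy is initialized, search generates candidate trajectories, a shaped reward picks the best ones, and learning moves the policy toward them. Everything below (the action set, reward function, and update rule) is a hypothetical placeholder standing in for an LLM policy, a learned reward model, and tree or sequence search; it only illustrates the control flow, not the paper's method.

```python
import random

ACTIONS = ["decompose", "compute", "check", "answer"]

def init_policy():
    # Policy initialization: start from a uniform "pretrained" policy.
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}

def reward(trajectory):
    # Reward design: a dense, shaped reward that prefers verifying before answering.
    r = 0.1 * trajectory.count("decompose")
    if "check" in trajectory and trajectory[-1] == "answer":
        r += 1.0
    return r

def search(policy, n_samples=16, horizon=4):
    # Search: sample several candidate trajectories and keep the best-scoring one.
    actions, weights = zip(*policy.items())
    candidates = [[random.choices(actions, weights)[0] for _ in range(horizon)]
                  for _ in range(n_samples)]
    return max(candidates, key=reward)

def learn(policy, trajectory, lr=0.05):
    # Learning: nudge the policy toward actions that appear in high-reward trajectories.
    for a in trajectory:
        policy[a] += lr
    total = sum(policy.values())
    return {a: p / total for a, p in policy.items()}

policy = init_policy()
for _ in range(50):
    policy = learn(policy, search(policy))
print(policy)  # probability mass shifts toward decompose/check/answer
```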


AI model editing techniques

 ChatGPT output, unedited. Some wrong links and errors are apparent.


1. Fine-Tuning


2. Prompt Engineering


3. Model Editing via Retrieval-Augmented Generation (RAG)

  • Definition: Integrating external databases or retrieval systems to improve or adapt the model's outputs without direct parameter changes.
  • References:

4. Knowledge Injection


5. Soft Prompt Tuning


6. Modular Transfer Learning


7. Dynamic Reweighting


8. Model Surgery


9. Continual Learning


10. Gradient Editing


11. Reinforcement Learning from Human Feedback (RLHF)

  • Definition: Using human evaluations to fine-tune models, particularly for aligning AI with desired ethical or stylistic outcomes.
  • References:

12. Model Patching


13. Parameter-Free Updating


14. Memory Editing

  • Definition: Directly modifying or updating specific "memories" in a model, allowing it to adjust responses to certain inputs or queries without retraining (a toy sketch of the rank-one idea behind these methods appears after this list).
  • Techniques:
    • MEMIT (Mass-Editing Memory in a Transformer)
    • ROME (Rank-One Model Editing)
  • References:

15. Multi-Modal Model Editing


16. Federated Learning Adjustments


17. Meta-Learning (Learning to Learn)


This version now includes Memory Editing with references to emerging techniques like MEMIT and ROME for direct manipulation of model-specific knowledge.
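
To make the memory-editing entry more concrete: methods like ROME modify a single weight matrix with a rank-one update so that a chosen key vector (the representation of an edited prompt) maps to a new value vector (the representation encoding the new fact). The numpy sketch below shows only that algebraic core under simplifying assumptions; the real ROME/MEMIT methods locate the target layer via causal tracing and use a covariance-weighted, least-squares-constrained update.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))   # stand-in for an MLP projection matrix inside a transformer
k = rng.normal(size=d)        # "key": activation computed for the edited prompt
v = rng.normal(size=d)        # "value": activation that would produce the new fact

# Rank-one edit: after the update, W_new @ k == v, while W is otherwise
# perturbed as little as possible in the unweighted least-squares sense.
W_new = W + np.outer(v - W @ k, k) / (k @ k)

print(np.allclose(W_new @ k, v))   # True: the new association is now stored in the weights
```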

Friday, January 3, 2025

editable neural networks in health science

25. Meng, K., Bau, D., Andonian, A. & Belinkov, Y. Locating and editing factual associations in GPT. Adv. Neural Inf. Process. Syst. (2022). at <https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html>

26. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-editing memory in a transformer. in International Conference on Learning Representations (2023). at <https://arxiv.org/abs/2210.07229>

27. Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. Memory-Based Model Editing at Scale. in Proceedings of the 39th International Conference on Machine Learning (eds. Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G. & Sabato, S.) 162, 15817–15831 (PMLR, 2022).

28. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors. in Advances in Neural Information Processing Systems (2023). at <https://arxiv.org/abs/2211.11031>

29. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. Fast model editing at scale. in International Conference on Learning Representations (2022). at <https://arxiv.org/abs/2110.11309>

30. Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. Editable Neural Networks. in International Conference on Learning Representations (2020). at <http://arxiv.org/abs/2004.00345>

31. De Cao, N., Aziz, W. & Titov, I. Editing Factual Knowledge in Language Models. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 6491–6506 (Association for Computational Linguistics, 2021).

32. Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.14795>

33. Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. Evaluating the ripple effects of knowledge editing in language models. Trans. Assoc. Comput. Linguist. 12, 283–298 (2023).

34. De Cao, N., Aziz, W. & Titov, I. Editing Factual Knowledge in Language Models. arXiv [cs.CL] (2021). at <http://arxiv.org/abs/2104.08164>

35. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-Editing Memory in a Transformer. arXiv [cs.CL] (2022). at <http://arxiv.org/abs/2210.07229>

36. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. Fast Model Editing at Scale. arXiv [cs.LG] (2021). at <http://arxiv.org/abs/2110.11309>

37. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong Model Editing with Key-Value Adaptors. (2022). at <https://openreview.net/pdf?id=ngCT1EelZk>

Language Models: Problems, Methods, and Opportunities. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.13172>

41. Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. Fundamental problems with model editing: How should rational belief revision work in LLMs? arXiv [cs.CL] (2024). at <https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAJ:M3ejUd6NZC8C>

42. Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. Can We Edit Multimodal Large Language Models? in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (eds. Bouamor, H., Pino, J. & Bali, K.) 13877–13888 (Association for Computational Linguistics, 2023).



Here are the URLs for the specified papers:


1. **Locating and editing factual associations in GPT**  

   Meng, K., Bau, D., Andonian, A. & Belinkov, Y. (2022).  

   [Link to Paper](https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html)


2. **Mass-editing memory in a transformer**  

   Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. (2023).  

   [Link to Paper](https://arxiv.org/abs/2210.07229)


3. **Memory-Based Model Editing at Scale**  

   Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. (2022).  

   [Link to Paper](https://proceedings.mlr.press/v162/mitchell22a.html)


4. **Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors**  

   Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. (2023).  

   [Link to Paper](https://arxiv.org/abs/2211.11031)


5. **Fast model editing at scale**  

   Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. (2022).  

   [Link to Paper](https://arxiv.org/abs/2110.11309)


6. **Editable Neural Networks**  

   Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. (2020).  

   [Link to Paper](http://arxiv.org/abs/2004.00345)


7. **Editing Factual Knowledge in Language Models**  

   De Cao, N., Aziz, W. & Titov, I. (2021).  

   [Link to Paper](http://arxiv.org/abs/2104.08164)


8. **MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions**  

   Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. (2023).  

   [Link to Paper](http://arxiv.org/abs/2305.14795)


9. **Evaluating the ripple effects of knowledge editing in language models**  

   Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. (2023).  

   [Link to Paper](https://transacl.org/ojs/index.php/tacl/article/view/3736)


10. **Language Models: Problems, Methods, and Opportunities**  

    (2023).  

    [Link to Paper](http://arxiv.org/abs/2305.13172)


11. **Fundamental problems with model editing: How should rational belief revision work in LLMs?**  

    Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. (2024).  

    [Link to Paper](https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAAJ:M3ejUd6NZC8C)


12. **Can We Edit Multimodal Large Language Models?**  

    Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. (2023).  

    [Link to Paper](https://arxiv.org/abs/2305.14795)




Monday, December 9, 2024

newer AI techniques and trends that have emerged or gained significant traction after 2023:

 Here are some of the newer AI techniques and trends that have emerged or gained significant traction after 2023:


## Multimodal AI

Multimodal AI combines multiple modalities such as text, images, audio, and video to create more versatile and effective AI models. This approach allows models like GPT-4 to generate text from various inputs, including images and audio, and to convert between different modalities seamlessly. This trend is expected to enhance applications in fields like financial services, customer analytics, and marketing[1][5][7].


## Small Language Models (SLMs)

SLMs are smaller versions of large language models (LLMs) that can operate efficiently on fewer computing resources, making them accessible on devices like smartphones. These models, such as Microsoft's Phi and Orca, offer similar or sometimes better performance than LLMs in certain areas, democratizing AI use and reducing the need for significant financial investments[1].


## Customizable Generative AI

Customizable AI models are designed to cater to specific industries and user needs, offering more personalization and control over data. This is particularly beneficial in sectors like healthcare, legal, and financial services, where specialized terminology and practices are crucial. Customizable models also enhance privacy and security by reducing reliance on third-party data processing[1].


## Decoupled Contrastive Learning (DCL)

DCL is a new approach to contrastive learning that improves learning efficiency by removing the negative-positive-coupling (NPC) effect present in traditional contrastive learning methods like InfoNCE. This method requires fewer computational resources, smaller batch sizes, and shorter training epochs, yet achieves competitive performance with state-of-the-art models[2][4].
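
The decoupling itself is a one-line change to the loss. In the sketch below (a simplified cross-view formulation, not the paper's exact implementation), InfoNCE keeps the positive pair inside the softmax denominator, while the decoupled loss masks it out, removing the negative-positive coupling.

```python
import torch
import torch.nn.functional as F

def contrastive_losses(z1, z2, tau=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same N samples."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau            # (N, N) cosine similarities over temperature
    pos = sim.diag()                   # positives sit on the diagonal
    eye = torch.eye(len(z1), dtype=torch.bool)

    # InfoNCE: the positive term stays in the denominator (coupled).
    infonce = (-pos + torch.logsumexp(sim, dim=1)).mean()
    # Decoupled loss: the positive term is removed from the denominator.
    dcl = (-pos + torch.logsumexp(sim.masked_fill(eye, float("-inf")), dim=1)).mean()
    return infonce, dcl

print(contrastive_losses(torch.randn(256, 128), torch.randn(256, 128)))
```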


## Explainable AI (XAI)

XAI focuses on making AI models more transparent and interpretable by providing insights into how the models arrive at their decisions. Techniques such as decision trees, linear models, and rule-based systems are used to ensure that AI-driven decisions align with human values and expectations. This trend is gaining popularity as it builds trust and understanding in AI-generated outcomes[3].
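
As a small illustration of the "interpretable by construction" end of XAI (a generic scikit-learn example, not tied to any particular XAI framework), a shallow decision tree can print its entire decision process as explicit rules plus feature importances.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# Keep the tree deliberately shallow so the whole decision process stays readable.
data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

print(export_text(tree, feature_names=list(data.feature_names)))  # explicit if/else rules
for name, imp in sorted(zip(data.feature_names, tree.feature_importances_),
                        key=lambda t: -t[1])[:5]:
    print(f"{name}: {imp:.3f}")  # which measurements drive the prediction
```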


## Agentic AI

Agentic AI represents a shift from reactive to proactive AI systems. These AI agents exhibit autonomy, proactivity, and the ability to act independently, setting goals and taking actions without direct human intervention. Applications include environmental monitoring, financial portfolio management, and other areas where autonomous decision-making is beneficial[7].


## Multi-view Graph Contrastive Learning

This approach adapts contrastive learning to recommendation systems by incorporating multiple views of user data. Techniques such as node dropout, edge dropout, and random walks are used to generate diverse views, enhancing the model's ability to capture underlying preferences and behaviors[6].
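
To make "edge dropout" concrete, the minimal numpy sketch below (an illustration, not any specific recommender implementation) builds two corrupted views of the same user-item interaction graph; a graph contrastive objective would then pull the two views' node embeddings together.

```python
import numpy as np

def edge_dropout_view(edges, drop_prob, rng):
    """edges: (E, 2) array of (user, item) interactions; returns a randomly thinned view."""
    keep = rng.random(len(edges)) >= drop_prob
    return edges[keep]

rng = np.random.default_rng(0)
edges = np.array([[0, 10], [0, 11], [1, 10], [2, 12], [2, 13], [3, 11]])
view_a = edge_dropout_view(edges, drop_prob=0.2, rng=rng)
view_b = edge_dropout_view(edges, drop_prob=0.2, rng=rng)
print(len(view_a), len(view_b))  # two different subgraphs of the same interactions
```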


## Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning (CLEFT)

CLEFT is a novel method that combines efficient large language models with prompt fine-tuning for language-image contrastive learning. This approach reduces the need for extensive GPU resources and prolonged training times, making it suitable for applications with limited datasets, such as medical imaging[9].


## Retrieval-Augmented Generation

This trend involves combining generative AI models with retrieval systems to enhance the accuracy and relevance of generated content. By retrieving relevant information from a database and integrating it into the generation process, models can produce more informed and contextually accurate outputs[7].
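
A bare-bones sketch of the retrieve-then-generate pattern follows; the toy corpus and TF-IDF retriever are illustrative stand-ins (a production system would use dense embeddings, a vector database, and an actual LLM call on the assembled prompt).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for an external knowledge base.
docs = [
    "ROME edits a single factual association with a rank-one weight update.",
    "RFdiffusion designs protein binders with a guided diffusion model.",
    "Decoupled contrastive learning removes the positive term from the denominator.",
]
vectorizer = TfidfVectorizer().fit(docs)
doc_vecs = vectorizer.transform(docs)

def retrieve(query, k=2):
    # TF-IDF vectors are L2-normalized, so the dot product is cosine similarity.
    scores = (vectorizer.transform([query]) @ doc_vecs.T).toarray()[0]
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

query = "How does rank-one model editing work?"
context = "\n".join(retrieve(query))
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # this augmented prompt would then be sent to the generator model
```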


These techniques and trends highlight the rapid evolution and diversification of AI, enabling more efficient, versatile, and interpretable AI applications across various domains.


Citations:

[1] https://khoros.com/blog/ai-trends

[2] https://ai.meta.com/research/publications/decoupled-contrastive-learning/

[3] https://devabit.com/blog/top-11-new-technologies-in-ai-exploring-the-latest-trends/

[4] https://www.amazon.science/blog/new-contrastive-learning-methods-for-better-data-representation

[5] https://www.ibm.com/think/insights/artificial-intelligence-trends

[6] https://www.nature.com/articles/s41598-024-73336-5

[7] https://www.techtarget.com/searchenterpriseai/tip/9-top-AI-and-machine-learning-trends

[8] https://gram-blogposts.github.io/blog/2024/contrast-learning/

[9] https://arxiv.org/abs/2407.21011

Tuesday, July 16, 2024

protein design

 

https://www.ipd.uw.edu/software/

https://www.nature.com/articles/d41586-024-02214-x

https://www.nature.com/articles/d41586-023-02227-y


https://github.com/universvm/how_to_create_a_protein

RFdiffusion generates ultra-high affinity binders through several key mechanisms:


1. Guided diffusion approach: RFdiffusion uses a diffusion model to gradually sculpt protein structures from a random distribution of atoms. This allows it to generate novel protein structures that can bind tightly to target antigens[2].


2. Fine-tuning for specific tasks: The model can be fine-tuned for tasks like binder design, allowing it to generate proteins tailored for binding to specific targets[2].


3. Conditioning on interface hotspots: RFdiffusion can be provided with information about desired binding sites on the target protein, allowing it to focus on generating binders that interact with those specific regions[2].


4. Scaffold topology control: For some design challenges, the model can be conditioned on secondary structure and block-adjacency information to control the overall topology of the binder[2].


5. Direct generation in context: Unlike some other methods, RFdiffusion can generate binding proteins directly in the context of the target protein, optimizing the interface as it builds the structure[2].


6. High shape complementarity: The "build to fit" approach of RFdiffusion allows it to create binders with very high shape complementarity to the target, which contributes to high affinity[5].


7. Flexible backbone design: RFdiffusion can generate binders while allowing flexibility in the target peptide backbone, potentially finding novel binding modes[5].


8. Iterative refinement: The model can be used to refine existing designs through partial noising and denoising, allowing for fine-tuning of the binding interface[5].


These capabilities have led to remarkable results. For example, RFdiffusion has generated binders with picomolar affinity to several helical peptides, including some that are reported to be the highest-affinity binders achieved directly by computational design without experimental optimization[5].


The power of RFdiffusion lies in its ability to simultaneously optimize the overall protein structure and the binding interface, leading to binders that are both stable and highly complementary to their targets. This approach has significantly increased the success rate of computational protein design, often requiring testing of only dozens of designs to find high-affinity binders, rather than the tens of thousands that might be needed with previous methods[2][5].
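
Schematically, the "build to fit" loop reads like the toy sketch below: binder coordinates start as pure noise and are denoised step by step while being guided toward hotspot residues on the target, with partial noising and re-denoising as the refinement mode (point 8 above). The denoiser here is a trivial placeholder, not RFdiffusion's fine-tuned RoseTTAFold network, so the sketch only conveys the control flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hotspot residues on the target that the binder interface should pack against
# (arbitrary 3D coordinates, purely for illustration).
hotspots = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])

def toy_denoiser(x, hotspots, strength=0.1):
    # Placeholder for the learned structure network: pull each residue a little
    # toward its nearest hotspot, standing in for learned interface guidance.
    dists = np.linalg.norm(x[:, None, :] - hotspots[None, :, :], axis=-1)
    nearest = hotspots[np.argmin(dists, axis=1)]
    return x + strength * (nearest - x)

# Reverse diffusion: start from noise, denoise in the context of the target.
x = rng.normal(scale=10.0, size=(20, 3))
for _ in range(50):
    x = toy_denoiser(x, hotspots) + rng.normal(scale=0.05, size=x.shape)

# Iterative refinement: partially re-noise an existing design and denoise again.
x_refined = x + rng.normal(scale=1.0, size=x.shape)
for _ in range(10):
    x_refined = toy_denoiser(x_refined, hotspots)

print(np.round(x_refined[:3], 2))  # residues have collapsed toward the hotspot interface
```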


Citations:

[1] https://fold.it

[2] https://www.nature.com/articles/s41586-023-06415-8

[3] https://www.ipd.uw.edu/2022/12/a-diffusion-model-for-protein-design/

[4] https://www.bakerlab.org/2023/12/19/designing-binders-with-the-highest-affinity-ever-reported/

[5] https://www.nature.com/articles/s41586-023-06953-1

Wednesday, June 26, 2024

TinyML and Efficient Deep Learning Computing

 

https://hanlab.mit.edu/courses/2023-fall-65940

Efficient AI Computing,
Transforming the Future.

TinyML and Efficient Deep Learning Computing

6.5940 · Fall 2023

https://efficientml.ai

This course focuses on efficient machine learning and systems. This is a crucial area, as deep neural networks demand extraordinary levels of computation, hindering their deployment on everyday devices and burdening cloud infrastructure. This course introduces efficient AI computing techniques that enable powerful deep learning applications on resource-constrained devices. Topics include model compression, pruning, quantization, neural architecture search, distributed training, data/model parallelism, gradient compression, and on-device fine-tuning. It also introduces application-specific acceleration techniques for large language models and diffusion models. Students will get hands-on experience implementing model compression techniques and deploying large language models (Llama2-7B) on a laptop.
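
As a taste of the Chapter I material, the snippet below applies unstructured magnitude pruning and naive symmetric int8 quantization to a single weight tensor. It is a from-scratch numpy illustration of the two ideas, not the course's lab or TinyEngine code.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64)).astype(np.float32)        # a toy layer's weights

# Unstructured magnitude pruning: zero out the 70% of weights with smallest magnitude.
threshold = np.quantile(np.abs(W), 0.70)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0).astype(np.float32)

# Symmetric per-tensor int8 quantization: map the largest magnitude to 127.
scale = np.abs(W_pruned).max() / 127.0
W_int8 = np.clip(np.round(W_pruned / scale), -127, 127).astype(np.int8)
W_dequant = W_int8.astype(np.float32) * scale

print(f"sparsity = {(W_pruned == 0).mean():.2f}, "
      f"max dequantization error = {np.abs(W_dequant - W_pruned).max():.4f}")
```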

  • Time: Tuesday/Thursday 3:35-5:00pm Eastern Time
  • Location: 36-156
  • Office Hour: Thursday 5:00-6:00pm Eastern Time, 38-344 Meeting Room
  • Discussion: Piazza
  • Homework Submission: Canvas
  • Contact:
    • For external inquiries, personal matters, or emergencies, you can email us at efficientml-staff [at] mit.edu.
    • If you are interested in getting updates, please sign up here to join our mailing list to get notified!

Instructor

Associate Professor

Teaching Assistants

Announcements

  • 2023-12-14: Final report and course evaluation due
  • 2023-10-31: Lab 5 is out.

Schedule

Sep 7 · Lecture 1: Introduction
Sep 12 · Lecture 2: Basics of Deep Learning

Chapter I: Efficient Inference

Sep 14 · Lecture 3: Pruning and Sparsity (Part I)
Sep 19 · Lecture 4: Pruning and Sparsity (Part II)
Sep 21 · Lecture 5: Quantization (Part I) (Lab 0 due)
Sep 26 · Lecture 6: Quantization (Part II)
Sep 28 · Lecture 7: Neural Architecture Search (Part I) (Lab 1 due, extended to Sep 30 at 11:59 p.m.; Lab 2 out)
Oct 3 · Lecture 8: Neural Architecture Search (Part II)
Oct 5 · Lecture 9: Knowledge Distillation
Oct 10 · Student Holiday (no class)
Oct 12 · Lecture 10: MCUNet: TinyML on Microcontrollers (Lab 2 due)
Oct 17 · Lecture 11: TinyEngine and Parallel Processing

Chapter II: Domain-Specific Optimization

Oct 19 · Lecture 12: Transformer and LLM (Part I) (Lab 3 due; Lab 4 out)
Oct 24 · Lecture 13: Transformer and LLM (Part II)
Oct 26 · Lecture 14: Vision Transformer (project ideas out on Canvas)
Oct 31 · Lecture 15: GAN, Video, and Point Cloud (Lab 4 due; Lab 5 out)
Nov 2 · Lecture 16: Diffusion Model

Chapter III: Efficient Training

Nov 7 · Lecture 17: Distributed Training (Part I)
Nov 9 · Lecture 18: Distributed Training (Part II)
Nov 14 · Lecture 19: On-Device Training and Transfer Learning (Lab 5 due)
Nov 16 · Lecture 20: Efficient Fine-tuning and Prompt Engineering
Nov 21 · Lecture 21: Basics of Quantum Computing (project proposal due)
Nov 23 · Thanksgiving (no class)

Chapter IV: Advanced Topics

Nov 28 · Lecture 22: Quantum Machine Learning
Nov 30 · Lecture 23: Noise Robust Quantum ML
Dec 5 · Lecture 24: Final Project Presentation
Dec 7 · Lecture 25: Final Project Presentation
Dec 12 · Lecture 26: Final Project Presentation + Course Summary
Dec 14 · Project report and course evaluation due

Course Videos

Lecture 1: Introduction
Lecture 12: Transformer and LLM (Part I)
Lecture 13: Transformer and LLM (Part II)
Lecture 16: Diffusion Model