This site serves as my notebook and as a way to communicate effectively with my students and collaborators. Every now and then, a post may be of interest to other researchers or teachers. Views on this blog are my own. All rights to research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin

Monday, January 6, 2025
human scRNA aging data
There are some human single-cell aging data.
Saturday, January 4, 2025
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
https://arxiv.org/abs/2412.14135
OpenAI o1 represents a significant milestone in Artificial Intelligence, achieving expert-level performance on many challenging tasks that require strong reasoning ability. OpenAI has claimed that the main technique behind o1 is reinforcement learning. Recent works use alternative approaches like knowledge distillation to imitate o1's reasoning style, but their effectiveness is limited by the capability ceiling of the teacher model. Therefore, this paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning, focusing on four key components: policy initialization, reward design, search, and learning. Policy initialization enables models to develop human-like reasoning behaviors, equipping them with the ability to effectively explore solution spaces for complex problems. Reward design provides dense and effective signals via reward shaping or reward modeling, which guide both search and learning. Search plays a crucial role in generating high-quality solutions during both training and testing, and can produce better solutions with more computation. Learning uses the data generated by search to improve the policy, and can achieve better performance with more parameters and more searched data. Existing open-source projects that attempt to reproduce o1 can be seen as a part or a variant of our roadmap. Collectively, these components underscore how learning and search drive o1's advancement, making meaningful contributions to the development of LLMs.
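The abstract's claim that search "can produce better solutions with more computation" is easiest to see in the simplest search strategy, best-of-N sampling. Below is a toy Python sketch; `generate()` and `score()` are hypothetical stand-ins for a policy model and a reward model, not anything from the paper.

```python
# Toy best-of-N search: a minimal sketch of the "search" component,
# assuming a hypothetical generate() sampler and score() reward model.
import random

random.seed(0)

def generate(prompt: str) -> str:
    """Stand-in for sampling one candidate solution from a policy model."""
    return f"{prompt} -> candidate #{random.randint(0, 999)}"

def score(solution: str) -> float:
    """Stand-in for a learned reward model assigning a scalar score."""
    return random.random()

def best_of_n(prompt: str, n: int) -> str:
    """Sample n candidates and keep the one the reward model prefers."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("Prove that 2+2=4", n=16))
```

As N grows, the best of N reward-model scores can only improve in expectation, which is the compute-for-quality trade the roadmap describes; the same search outputs can then be fed back as training data for the learning component.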
AI model editing techniques
ChatGPT output, lightly edited. Some wrong links and errors are apparent in the raw output.
1. Fine-Tuning
- Definition: Adjusting pre-trained models by retraining them on a specific dataset to tailor them to a particular task or domain.
2. Prompt Engineering
- Definition: Crafting specific inputs (prompts) to guide the behavior of large language models without altering their parameters.
3. Model Editing via Retrieval-Augmented Generation (RAG)
- Definition: Integrating external databases or retrieval systems to improve or adapt the model's outputs without direct parameter changes. (See the retrieval sketch after this list.)
4. Knowledge Injection
- Definition: Incorporating domain-specific knowledge into a model post-training.
5. Soft Prompt Tuning
- Definition: Learning a set of prompt tokens that adjust the behavior of pre-trained models without altering core weights. (See the tuning sketch after this list.)
6. Modular Transfer Learning
- Definition: Dividing models into modules (e.g., embeddings, encoders, decoders) and only updating or replacing specific components.
7. Dynamic Reweighting
- Definition: Adjusting the influence of certain parts of the model during inference based on specific tasks or inputs.
8. Model Surgery
- Definition: Directly modifying neural network weights, layers, or architectures post-training.
9. Continual Learning
- Definition: Allowing a model to learn new information over time without forgetting prior knowledge.
10. Gradient Editing
- Definition: Directly modifying gradients during training to induce specific behaviors or rectify known issues.
11. Reinforcement Learning from Human Feedback (RLHF)
- Definition: Using human evaluations to fine-tune models, particularly for aligning AI with desired ethical or stylistic outcomes. (See the reward-model sketch after this list.)
12. Model Patching
- Definition: Adding or replacing specific components in a model with updated or improved modules.
13. Parameter-Free Updating
- Definition: Techniques like black-box optimization or external decision systems that modify behavior without changing core parameters.
14. Memory Editing
- Definition: Directly modifying or updating specific "memories" in a model, allowing it to adjust responses to certain inputs or queries without retraining. (See the rank-one edit sketch after this list.)
- Techniques:
- MEMIT (Mass-Editing Memory in a Transformer)
- ROME (Rank-One Model Editing)
15. Multi-Modal Model Editing
- Definition: Modifying models trained on multi-modal data (e.g., text and images) for domain-specific applications.
16. Federated Learning Adjustments
- Definition: Decentralized learning where updates are based on data from multiple users without directly sharing datasets.
17. Meta-Learning (Learning to Learn)
- Definition: Training models to quickly adapt to new tasks with minimal data by leveraging meta-learning algorithms.
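To make item 3 (RAG) concrete, here is a minimal retrieval sketch in Python. Everything in it is a toy assumption: `embed()` stands in for a trained encoder and `docs` is a three-document stand-in corpus; only the retrieve-then-prompt pattern is the point.

```python
# Minimal retrieval-augmented generation sketch: cosine-similarity
# retrieval over toy document embeddings, then prompt assembly.
import numpy as np

docs = [
    "ROME edits a single factual association in a GPT model.",
    "MEMIT scales memory editing to thousands of facts at once.",
    "Soft prompts steer a frozen model without weight updates.",
]

rng = np.random.default_rng(0)
vocab: dict[str, np.ndarray] = {}

def embed(text: str) -> np.ndarray:
    """Toy bag-of-words embedding: average of random per-word vectors."""
    vecs = [vocab.setdefault(w, rng.normal(size=32)) for w in text.lower().split()]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    sims = doc_vecs @ embed(query)  # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

query = "How does memory editing change stored facts?"
context = "\n".join(retrieve(query))
print(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")  # prompt for the generator
```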
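For item 5 (soft prompt tuning), a minimal PyTorch sketch of the idea: the base model's weights are frozen and only a small matrix of prompt embeddings receives gradients. The frozen `embed`/`body` modules are stand-ins for a real pretrained model, not any particular library API.

```python
# Soft prompt tuning sketch: only the prompt embeddings train;
# the (stand-in) pretrained model stays frozen.
import torch
import torch.nn as nn

d_model, n_prompt, vocab_size = 16, 4, 100

embed = nn.Embedding(vocab_size, d_model)  # pretend-pretrained, frozen
body = nn.Linear(d_model, vocab_size)      # pretend-pretrained, frozen
for p in list(embed.parameters()) + list(body.parameters()):
    p.requires_grad_(False)

soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)  # trainable
opt = torch.optim.Adam([soft_prompt], lr=1e-2)

tokens = torch.randint(0, vocab_size, (8, 6))  # toy batch of token ids
targets = torch.randint(0, vocab_size, (8,))   # toy labels

for step in range(100):
    x = embed(tokens)                                      # (8, 6, d)
    x = torch.cat([soft_prompt.expand(8, -1, -1), x], 1)   # prepend prompt
    logits = body(x.mean(dim=1))                           # pool, then predict
    loss = nn.functional.cross_entropy(logits, targets)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the optimizer only sees `soft_prompt`, the pretrained weights are untouched, which is what distinguishes this from ordinary fine-tuning.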
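For item 11 (RLHF), a sketch of just the reward-model half, using the standard pairwise preference loss -log sigmoid(r_chosen - r_rejected). The random feature tensors are placeholders for encoded responses, and the subsequent policy-optimization step (e.g., PPO) is omitted.

```python
# RLHF sketch, reward-model step only: fit a reward model to human
# preference pairs with the Bradley-Terry pairwise loss.
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Toy features for (chosen, rejected) response pairs; a real setup
# would encode text with the language model itself.
chosen = torch.randn(16, 32)
rejected = torch.randn(16, 32)

for step in range(200):
    margin = reward_model(chosen) - reward_model(rejected)
    loss = -nn.functional.logsigmoid(margin).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```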
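For item 14 (memory editing), the algebraic core that ROME-style methods build on: a rank-one update that forces a chosen key vector to map to a new value while leaving orthogonal directions untouched. This is a simplified sketch that assumes we already know which weight matrix and key to edit; the actual ROME method locates the responsible layer with causal tracing and weights the update by key covariance statistics.

```python
# Rank-one memory edit: W' = W + (v - W k) k^T / (k^T k), so that
# W' k = v exactly, while W' x = W x for any x orthogonal to k.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))  # weight matrix storing associations
k = rng.normal(size=8)       # key: the edited fact's lookup direction
v = rng.normal(size=8)       # value: the new desired output for k

W_edited = W + np.outer(v - W @ k, k) / (k @ k)

assert np.allclose(W_edited @ k, v)  # the edit took hold
x = rng.normal(size=8)
x_perp = x - (x @ k) / (k @ k) * k   # component orthogonal to k
assert np.allclose(W_edited @ x_perp, W @ x_perp)  # everything else untouched
```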
Friday, January 3, 2025
editable neural networks in health science
25. Meng, K., Bau, D., Andonian, A. & Belinkov, Y. Locating and editing factual associations in GPT. Adv. Neural Inf. Process. Syst. (2022). at <https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html>
26. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-editing memory in a transformer. in International Conference on Learning Representations (2023). at <https://arxiv.org/abs/2210.07229>
27. Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. Memory-based model editing at scale. in Proceedings of the 39th International Conference on Machine Learning (eds. Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G. & Sabato, S.) 162, 15817–15831 (PMLR, 17–23 Jul 2022).
28. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong model editing with discrete key-value adaptors. in Advances in Neural Information Processing Systems (2023). at <https://arxiv.org/abs/2211.11031>
29. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. Fast model editing at scale. in International Conference on Learning Representations (2022). at <https://arxiv.org/abs/2110.11309>
30. Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. Editable neural networks. in International Conference on Learning Representations (2020). at <http://arxiv.org/abs/2004.00345>
31. De Cao, N., Aziz, W. & Titov, I. Editing factual knowledge in language models. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 6491–6506 (Association for Computational Linguistics, 2021).
32. Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. MQuAKE: Assessing knowledge editing in language models via multi-hop questions. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.14795>
33. Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. Evaluating the ripple effects of knowledge editing in language models. Trans. Assoc. Comput. Linguist. 12, 283–298 (2023).
34. De Cao, N., Aziz, W. & Titov, I. Editing factual knowledge in language models. arXiv [cs.CL] (2021). at <http://arxiv.org/abs/2104.08164>
35. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-editing memory in a transformer. arXiv [cs.CL] (2022). at <http://arxiv.org/abs/2210.07229>
36. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. Fast model editing at scale. arXiv [cs.LG] (2021). at <http://arxiv.org/abs/2110.11309>
37. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong model editing with key-value adaptors. (2022). at <https://openreview.net/pdf?id=ngCT1EelZk>
40. Yao, Y., Wang, P., Tian, B., Cheng, S., Li, Z., Deng, S., Chen, H. & Zhang, N. Editing large language models: Problems, methods, and opportunities. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.13172>
41. Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. Fundamental problems with model editing: How should rational belief revision work in LLMs? arXiv [cs.CL] (2024). at <https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAJ:M3ejUd6NZC8C>
42. Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. Can we edit multimodal large language models? in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (eds. Bouamor, H., Pino, J. & Bali, K.) 13877–13888 (Association for Computational Linguistics, 2023).
Here are the URLs for the specified papers:
1. **Locating and editing factual associations in GPT**
Meng, K., Bau, D., Andonian, A. & Belinkov, Y. (2022).
[Link to Paper](https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html)
2. **Mass-editing memory in a transformer**
Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. (2023).
[Link to Paper](https://arxiv.org/abs/2210.07229)
3. **Memory-Based Model Editing at Scale**
Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. (2022).
[Link to Paper](https://proceedings.mlr.press/v162/mitchell22a.html)
4. **Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors**
Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. (2023).
[Link to Paper](https://arxiv.org/abs/2211.11031)
5. **Fast model editing at scale**
Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. (2022).
[Link to Paper](https://arxiv.org/abs/2110.11309)
6. **Editable Neural Networks**
Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. (2020).
[Link to Paper](http://arxiv.org/abs/2004.00345)
7. **Editing Factual Knowledge in Language Models**
De Cao, N., Aziz, W. & Titov, I. (2021).
[Link to Paper](http://arxiv.org/abs/2104.08164)
8. **MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions**
Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. (2023).
[Link to Paper](http://arxiv.org/abs/2305.14795)
9. **Evaluating the ripple effects of knowledge editing in language models**
Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. (2023).
[Link to Paper](https://transacl.org/ojs/index.php/tacl/article/view/3736)
10. **Editing Large Language Models: Problems, Methods, and Opportunities**
Yao, Y., Wang, P., Tian, B., Cheng, S., Li, Z., Deng, S., Chen, H. & Zhang, N. (2023).
[Link to Paper](http://arxiv.org/abs/2305.13172)
11. **Fundamental problems with model editing: How should rational belief revision work in LLMs?**
Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. (2024).
[Link to Paper](https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAJ:M3ejUd6NZC8C)
12. **Can We Edit Multimodal Large Language Models?**
Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. (2023).
[Link to Paper](https://arxiv.org/abs/2310.08475)
Thursday, January 2, 2025
Autonomous AI-Driven Drug Discovery
A Framework for Autonomous AI-Driven Drug Discovery
https://www.biorxiv.org/content/10.1101/2024.12.17.629024v2