Friday, January 31, 2025

CS795 lecture 3.

Zoom, start recording

Datacamp review, 

= A primer on deep learning in genetics, classification model, continued 

Review: ask a student to run it and explain.

pytorch

transformer

Office hours: breakout room with each student. Expectations; what to learn.

 What are your short- and long-term career goals? How does the course align with them?

  What kinds of topics do you suggest?

kaggle analysis, logistic regression

Tuesday, January 28, 2025

Prior-knowledge-defined attention masks for transformers

 

Prior-knowledge-defined attention masks for transformers involve incorporating domain-specific information or constraints into the attention mechanism. This approach can offer several advantages and disadvantages:


## Advantages


1. Enhanced Interpretability: By incorporating prior knowledge, the model's attention patterns become more aligned with human understanding, making the model's decision-making process more transparent[2].


2. Improved Performance: In specific domains, prior knowledge can guide the model to focus on relevant information, potentially leading to better performance on targeted tasks[2].


3. Reduced Computational Complexity: By limiting attention to specific areas defined by prior knowledge, the model may require fewer computations, especially for long sequences[4].


4. Task-Specific Adaptation: Prior-knowledge masks can be tailored to specific tasks or domains, allowing for more efficient fine-tuning of pre-trained models[4].


## Disadvantages


1. Limited Flexibility: Rigid prior-knowledge masks might constrain the model's ability to learn unexpected patterns or relationships in the data[2].


2. Potential for Bias: If the prior knowledge is incomplete or biased, it may lead the model to make suboptimal decisions or reinforce existing biases in the data[4].


3. Increased Complexity in Design: Creating effective prior-knowledge masks requires domain expertise and careful design, which can be time-consuming and challenging[2].


4. Reduced Generalization: Highly specific prior-knowledge masks might limit the model's ability to generalize across different tasks or domains[4].


To implement prior-knowledge-defined attention masks:


1. Define the Mask: Create a binary or continuous mask based on domain knowledge or task-specific requirements[2].


2. Integration: Incorporate the mask into the attention mechanism, typically by setting the attention scores of masked-out positions to a large negative value before the softmax (additive masking), or by zeroing out attention weights after the softmax (multiplicative masking)[7] (see the sketch after this list).


3. Training: Fine-tune the model with the integrated mask, allowing it to learn within the constraints of the prior knowledge[4].


4. Evaluation: Assess the model's performance and interpretability to ensure the prior-knowledge mask enhances rather than hinders the model's capabilities[2].
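
For concreteness, here is a minimal PyTorch sketch of step 2. It is illustrative only and not taken from the cited papers; the function name and the block-diagonal "pathway" prior are made up for this example.

```python
import torch

def attention_with_prior_mask(q, k, v, prior_mask):
    # q, k, v: (batch, heads, seq_len, d_k)
    # prior_mask: (seq_len, seq_len); 1 = attention allowed by prior knowledge, 0 = blocked
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5                 # raw attention logits
    scores = scores.masked_fill(prior_mask == 0, float("-inf"))   # additive masking of disallowed pairs
    weights = torch.softmax(scores, dim=-1)                       # blocked positions receive ~0 weight
    return weights @ v, weights

# Toy example: a block-diagonal prior that only lets tokens attend within their own group
seq_len, d_k = 6, 8
prior_mask = torch.block_diag(torch.ones(3, 3), torch.ones(3, 3))
q = k = v = torch.randn(1, 1, seq_len, d_k)
out, attn = attention_with_prior_mask(q, k, v, prior_mask)
```

A continuous (soft) prior can be applied in the same place by adding log-prior values to the scores instead of hard masking.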


By carefully designing and implementing prior-knowledge-defined attention masks, researchers can potentially create more efficient, interpretable, and task-specific transformer models. However, it's crucial to balance the benefits of prior knowledge with the need for model flexibility and generalization.


Citations:

[1] https://stackoverflow.blog/2024/09/26/masked-self-attention-how-llms-learn-relationships-between-tokens/

[2] https://arxiv.org/html/2406.02761v1

[3] https://stackoverflow.com/questions/58127059/how-to-understand-masked-multi-head-attention-in-transformer/59713254

[4] https://openreview.net/forum?id=abHtkQkumD

[5] https://www.reddit.com/r/MLQuestions/comments/1fqjdrf/understanding_masked_attention_in_transformer/

[6] https://blog.pangeanic.com/what-are-transformers-in-nlp

[7] https://datascience.stackexchange.com/questions/65067/proper-masking-in-the-transformer-model

[8] https://www.turing.com/kb/brief-introduction-to-transformers-and-their-power

Conformity in machine learning model prediction evaluation

 Conformity in machine learning model prediction evaluation is calculated using a measure called the nonconformity score. This score quantifies how different or "nonconforming" a new data point is compared to the patterns observed in the training data[2]. The process of calculating conformity involves several steps:


1. Training Phase:

   - Split the dataset into a proper training set and a calibration set.

   - Train the model on the proper training set.

   - Use the trained model to make predictions on the calibration set.


2. Nonconformity Calculation:

   - For each instance in the calibration set, calculate a nonconformity score.

   - This score measures how different the prediction is from the actual value.


3. Prediction Phase:

   - For a new data point, calculate its nonconformity score using the trained model.

   - Compare this score to the distribution of nonconformity scores from the calibration set.


The nonconformity score can be calculated in various ways, depending on the type of problem:


- For regression: It could be the absolute difference between the predicted and actual values.

- For classification: It might be based on the probability assigned to the correct class.


The key idea is that instances with higher nonconformity scores are less conforming to the training patterns and are therefore associated with higher uncertainty[2].


By using this approach, Inductive Conformal Prediction (ICP) can generate prediction intervals or sets that capture the uncertainty associated with individual predictions. This allows for a more nuanced evaluation of model performance, going beyond simple point predictions to provide a measure of confidence in each prediction.
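
As a concrete illustration of the steps above, here is a minimal split-conformal regression sketch on synthetic data. It assumes NumPy and scikit-learn and is not taken from the cited sources.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=500)

# 1. Training phase: split into a proper training set and a calibration set, then train.
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)
model = LinearRegression().fit(X_train, y_train)

# 2. Nonconformity scores on the calibration set (absolute residuals for regression).
scores = np.abs(y_cal - model.predict(X_cal))

# 3. Prediction phase: compare a new point against the calibration score distribution.
#    (A finite-sample correction of ceil((n+1)(1-alpha))/n is often used in practice.)
alpha = 0.1
q = np.quantile(scores, 1 - alpha)
x_new = rng.normal(size=(1, 3))
y_hat = model.predict(x_new)[0]
print(f"~90% prediction interval: [{y_hat - q:.2f}, {y_hat + q:.2f}]")
```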


Citations:

[1] https://www.geeksforgeeks.org/metrics-for-machine-learning-model/

[2] https://www.linkedin.com/pulse/inductive-conformal-prediction-yeshwanth-n

[3] https://kanerika.com/glossary/model-evaluation-metrics/

[4] https://www.youtube.com/watch?v=oqK6rM8fbkk

[5] https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234?gi=7cd5f38faaf8

[6] https://www.datasource.ai/en/data-science-articles/model-evaluation-metrics-in-machine-learning

[7] https://www.nature.com/articles/s41598-024-56706-x

[8] https://towardsdatascience.com/all-you-need-is-conformal-prediction-726f18920241?gi=6dd1cfc4136e


Conformity and SHAP (SHapley Additive exPlanations) value analysis are related in the context of machine learning model interpretation and uncertainty quantification. Both approaches aim to provide insights into model behavior, but they focus on different aspects:


1. Uncertainty Quantification: Conformity measures, particularly in the form of nonconformity scores, are used to quantify the uncertainty of model predictions. SHAP values, on the other hand, explain the impact of individual features on model outputs[1][3].


2. Shapley-value Conformity Scores: Recent research has explored combining Shapley values with conformal prediction to create more informative prediction sets. This approach uses Shapley values as conformity scores, resulting in smaller prediction sets for certain significance levels compared to traditional methods[5].


3. Complementary Information: While SHAP values provide feature importance and impact on model predictions, conformity measures offer insights into the reliability and uncertainty of those predictions. Together, they can provide a more comprehensive understanding of model behavior[2].


4. Uncertainty in SHAP Values: Research has also focused on quantifying uncertainty in SHAP value estimations. This includes using Shapley Residuals, Mean-Standard-Error, and Bayesian SHAP to capture different sources of uncertainty in SHAP explanations[6].


5. Application to Uncertainty Explanation: Recent work has adapted the Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of model outputs[4].


By combining conformity measures with SHAP value analysis, researchers and practitioners can gain a more nuanced understanding of both model predictions and their associated uncertainties, leading to more reliable and interpretable machine learning applications.


Citations:

[1] https://proceedings.neurips.cc/paper_files/paper/2023/file/16e4be78e61a3897665fa01504e9f452-Paper-Conference.pdf

[2] https://papers.phmsociety.org/index.php/phmap/article/download/3694/2161

[3] https://mindfulmodeler.substack.com/p/shap-is-not-all-you-need

[4] https://arxiv.org/abs/2306.05724

[5] https://proceedings.mlr.press/v152/jaramillo21a.html

[6] https://scholarship.tricolib.brynmawr.edu/items/1c209352-e4ab-454e-822c-1fe30211b92d

[7] https://pmc.ncbi.nlm.nih.gov/articles/PMC10985608/

[8] https://soil.copernicus.org/articles/10/679/2024/


Conformity and attention masks in transformers can be combined in innovative ways to enhance model performance and uncertainty quantification. Here are some key approaches:


1. Uncertainty-Guided Transformer (UGT): This approach uses conformity measures to guide the attention mechanism. An uncertainty-guided random masking algorithm (UGRM) assigns a higher probability of masking to uncertain regions during training, forcing the transformer to become more efficient at inferring and recovering content in uncertain regions by exploiting contextual information.[2]


2. Stochastic Attention: Instead of using deterministic attention distributions, the attention mechanism can be made stochastic. This involves sampling attention from a Gumbel-Softmax distribution, which controls the concentration over values. Additionally, key heads in self-attention can be regularized to attend to a set of learnable centroids, effectively performing clustering over keys or hidden states.[4] (A toy sketch of this idea appears at the end of this entry.)


3. Probabilistic Transformer: This approach uses probabilistic attention scores to quantify epistemic uncertainties in model predictions. It involves training two models - a majority model focusing on low-uncertainty samples and a minority model focusing on high-uncertainty samples. During inference, these models are dynamically combined based on the input uncertainty to make the final prediction.[6]


4. Transformer Conformal Prediction: This method uses the Transformer architecture, particularly the decoder, as a conditional quantile estimator to predict the quantiles of prediction residuals. These quantiles are then used to estimate prediction intervals. The Transformer's ability to learn temporal dependencies across past prediction residuals benefits the estimation of prediction intervals.[5]


5. Topological Feature Extraction: This approach extracts topological features from attention matrices, providing a low-dimensional, interpretable representation of the model's internal dynamics. This can be used to estimate uncertainty in the transformer's predictions.[8]


These approaches demonstrate how conformity measures and attention masks can be combined to improve uncertainty quantification, enhance model interpretability, and potentially boost performance in various tasks. By integrating these concepts, researchers can develop more robust and reliable transformer models that not only make accurate predictions but also provide valuable insights into their confidence levels.
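
As a toy illustration of the stochastic-attention idea in item 2 above (not the cited authors' implementation; the function name and shapes are made up), attention weights can be sampled from a Gumbel-Softmax distribution, and the spread across repeated samples used as a rough uncertainty signal.

```python
import torch
import torch.nn.functional as F

def stochastic_attention(q, k, v, tau=1.0):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d_k ** 0.5
    # Sample attention weights instead of taking a deterministic softmax.
    weights = F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 4, 10, 16)
samples = torch.stack([stochastic_attention(q, k, v) for _ in range(20)])
uncertainty = samples.var(dim=0).mean()   # crude proxy: variance across sampled attention outputs
```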


Citations:

[1] https://www.reddit.com/r/MLQuestions/comments/1fqjdrf/understanding_masked_attention_in_transformer/

[2] https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Uncertainty-Guided_Transformer_Reasoning_for_Camouflaged_Object_Detection_ICCV_2021_paper.pdf

[3] https://proceedings.mlr.press/v206/seedat23a/seedat23a.pdf

[4] https://cdn.aaai.org/ojs/21364/21364-13-25377-1-2-20220628.pdf

[5] https://arxiv.org/html/2406.05332v1

[6] https://sites.ecse.rpi.edu/~cvrl/Publication/pdf/Guo2022.pdf

[7] https://nejsds.nestat.org/journal/NEJSDS/article/10/text

[8] https://arxiv.org/abs/2308.11295

Friday, January 24, 2025

CS795 20250124 lecture 2

Zoom, start recording

Datacamp review, 

slides, 

= A primer on deep learning in genetics, classification model

https://colab.research.google.com/github/hongqin/Python-CoLab-bootcamp/blob/master/A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb

= CoLab

= Github - Wahab

= Wahab ondemand, 

todo: kaggle analysis, logistic regression

Thursday, January 23, 2025

recent papers exploring the mathematical foundations of artificial intelligence

 Here are some recent papers exploring the mathematical foundations of artificial intelligence:

  1. "Formal Mathematical Reasoning: A New Frontier in AI" (December 2024)

    • Authors: Kaiyu Yang, Gabriel Poesia, Jingxuan He, Wenda Li, Kristin Lauter, Swarat Chaudhuri, Dawn Song
    • Summary: This position paper advocates for the integration of formal mathematical reasoning in AI, emphasizing its importance for advancing AI-driven discoveries in science and engineering. The authors discuss the role of formal systems, such as proof assistants, in verifying the correctness of reasoning and providing automatic feedback. citeturn0search1
    • https://arxiv.org/abs/2412.16075?utm_source=chatgpt.com 
  2. "Artificial Intelligence: Advanced Mathematical Constructs and Applications" (November 2024)

    • Authors: A. Sultan, S. Sridevi, A. Rohini
    • Summary: This paper explores the mathematical foundations underpinning AI, machine learning (ML), and deep learning (DL). It highlights the significance of calculus, linear algebra, probability, and statistics in developing and optimizing AI algorithms. The authors demonstrate how these mathematical concepts are essential for data representation and model optimization in AI systems. citeturn0search3
    • https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5066774&utm_source=chatgpt.com 
  3. "Integrating Discrete Mathematics in Artificial Intelligence: A Computational Perspective with a Vision for Future Technologies" (June 2024)

    • Authors: Shalini Mishra, Garima Singh, Manju Prabhakar
    • Summary: This research article discusses the integration of discrete mathematics into AI, focusing on its computational aspects and potential impact on future technologies. The authors examine how discrete mathematical structures can enhance the development and analysis of AI algorithms. citeturn0search4
    • https://www.isroset.org/pub_paper/IJSRMSS/10-ISROSET-IJSRMSS-09633.pdf?utm_source=chatgpt.com

  4. "A Mathematical Framework of Intelligence and Consciousness Based on Riemannian Geometry" (July 2024)

    • Author: Meng Lu
    • Summary: This manuscript proposes a mathematical framework using Riemannian geometry to describe the structure and dynamics of intelligence and consciousness. It conceptualizes intelligence elements as tokens embedded in a high-dimensional space, with thought processes depicted as sequential activations along geodesics within these manifolds. citeturn0academia13
  5. "Artificial Intelligence and Inherent Mathematical Difficulty" (August 2024)

    • Author: Anonymous
    • Summary: This paper explores the relationship between artificial intelligence and the inherent challenges in resolving open mathematical questions. It presents arguments based on computability and complexity theory, illustrating the difficulties in proof discovery and the application of AI-inspired methods to mathematical problem-solving. citeturn0search10
    • https://arxiv.org/abs/2408.03345?utm_source=chatgpt.com

These papers provide valuable insights into the mathematical underpinnings of AI, highlighting the ongoing efforts to establish rigorous theoretical foundations for the field.

Tuesday, January 21, 2025

whisperx wahab, warnings

 

[hqin@wahab-01 ~]$ module load pytorch-gpu/2.2
[hqin@wahab-01 ~]$ crun -p ~/envs/whisperx python
Python 3.10.13 (tags/v3.10.13-25-g07fbd8e9251-dirty:07fbd8e9251, Sep 27 2023, 23:32:09) [GCC 13.2.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import whisperx
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:628: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
INFO:speechbrain.utils.quirks:Applied quirks (see `speechbrain.utils.quirks`): [disable_jit_profiling, allow_tf32]
INFO:speechbrain.utils.quirks:Excluded quirks specified by the `SB_DISABLE_QUIRKS` environment (comma-separated list): []

Monday, January 20, 2025

whisperx ubuntu workstation

Tried to install whisperx to run on GPU, but had trouble with the CUDA libraries, so defaulted to CPU instead.
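
For reference, a minimal CPU-only transcription call in the style of the whisperx README. The model size, file name, and batch size below are placeholders, and exact argument names can vary across whisperx versions.

```python
import whisperx

device = "cpu"            # GPU/CUDA unavailable, so fall back to CPU
compute_type = "int8"     # lighter compute type commonly used for CPU runs

model = whisperx.load_model("large-v2", device, compute_type=compute_type)
audio = whisperx.load_audio("audio.mp3")        # placeholder file name
result = model.transcribe(audio, batch_size=4)
print(result["segments"])
```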

Sunday, January 19, 2025

wahab whisperx install try

 On Wahab, crun only works after module load xxxx

[hqin@wahab-01 ~]$ ls

[hqin@wahab-01 ~]$ module load container_env

[hqin@wahab-01 ~]$ python -m venv whisperx_env

python: Command not found.

[hqin@wahab-01 ~]$ crun python -m venv whisperx_env

[hqin@wahab-01 ~]$ ls whisperx_env
[hqin@wahab-01 ~]$

Did not work. 







Saturday, January 18, 2025

wahab github clone

 On ODU Wahab, GitHub can be used with an RSA public key.

For example

git clone git@github.com:hongqin/AI4Health.git

Friday, January 17, 2025

CS795 - day1- 2025 Jan 17 Friday

Zoom, start recording

Datacamp: registration

HPC survey (5 minutes)

 The Research & Cloud Computing group (RCC) recently launched a survey regarding the need for training for research computing users. We would like to ask you to promote this survey among your students in classes and research groups as well as your colleagues, postdocs and other staff. The survey link is:

 

https://odu.co1.qualtrics.com/jfe/form/SV_9zCyC5peVHeQgl0

 

Please encourage them to submit responses by the end of January so we can use the findings to adjust offerings for this semester. Your help will be greatly appreciated!


CoLab

syllabus, 

Socrative ice breaker, anonymous

Github


Let students introduce each other in breakout rooms. Then student A introduces student B.


AI101, TensorFlow Playground.


== did not finish. leave for next class. 

Skipped self-introduction video.

project team, 

ChatGPT, Anthropic,

All of Us account

A primer on deep learning in genetics, classification model

https://colab.research.google.com/github/hongqin/Python-CoLab-bootcamp/blob/master/A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb


Thursday, January 16, 2025

Rsync

 

rsync

https://www.cisecurity.org/advisory/multiple-vulnerabilities-in-rsync-could-allow-for-remote-code-execution_2025-007



Wednesday, January 8, 2025

CVE-2024-27322 Should Never Have Been Assigned And R Data Files Are Still Super Risky Even In R 4.4.0


 

https://rud.is/b/2024/05/03/cve-2024-27322-should-never-have-been-assigned-and-r-data-files-are-still-super-risky-even-in-r-4-4-0/


Spring 2025 course schedule

 CS 795/895 DASC, AI for health and life sciences. 

Scheduled Meeting Times
Type: Scheduled In-Class Meetings
Time: 4:30 pm - 7:10 pm
Days: F
Where: ENGINEERING & COMP SCI BLDG 2120
Date Range: Jan 11, 2025 - Apr 28, 2025
Schedule Type: LECTURE
Instructors: HONG QIN (P)

Monday, January 6, 2025

human scRNA aging data

 There are some human single-cell aging datasets, for example:

 
 
 
 
https://pmc.ncbi.nlm.nih.gov/articles/PMC10306289/#_ad93_

Human PBMC scRNA-seq–based aging clocks reveal ribosome to inflammation balance as a single-cell aging hallmark and super longevity


Saturday, January 4, 2025

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

 

https://arxiv.org/abs/2412.14135


Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

OpenAI o1 represents a significant milestone in artificial intelligence, achieving expert-level performance on many challenging tasks that require strong reasoning ability. OpenAI has claimed that the main technique behind o1 is reinforcement learning. Recent works use alternative approaches like knowledge distillation to imitate o1's reasoning style, but their effectiveness is limited by the capability ceiling of the teacher model. Therefore, this paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning, focusing on four key components: policy initialization, reward design, search, and learning. Policy initialization enables models to develop human-like reasoning behaviors, equipping them with the ability to effectively explore solution spaces for complex problems. Reward design provides dense and effective signals via reward shaping or reward modeling, which guide both search and learning. Search plays a crucial role in generating high-quality solutions during both training and testing phases, and can produce better solutions with more computation. Learning utilizes the data generated by search to improve the policy, which can achieve better performance with more parameters and more searched data. Existing open-source projects that attempt to reproduce o1 can be seen as a part or a variant of our roadmap. Collectively, these components underscore how learning and search drive o1's advancement, making meaningful contributions to the development of LLMs.


AI model editing techniques

 ChatGPT output, unedited. Some wrong links and errors are apparent.


1. Fine-Tuning


2. Prompt Engineering


3. Model Editing via Retrieval-Augmented Generation (RAG)

  • Definition: Integrating external databases or retrieval systems to improve or adapt the model's outputs without direct parameter changes.
  • References:

4. Knowledge Injection


5. Soft Prompt Tuning


6. Modular Transfer Learning


7. Dynamic Reweighting


8. Model Surgery


9. Continual Learning


10. Gradient Editing


11. Reinforcement Learning from Human Feedback (RLHF)

  • Definition: Using human evaluations to fine-tune models, particularly for aligning AI with desired ethical or stylistic outcomes.
  • References:

12. Model Patching


13. Parameter-Free Updating


14. Memory Editing

  • Definition: Directly modifying or updating specific "memories" in a model, allowing it to adjust responses to certain inputs or queries without retraining.
  • Techniques:
    • MEMIT (Mass-Editing Memory in a Transformer)
    • ROME (Rank-One Model Editing)
  • References:

15. Multi-Modal Model Editing


16. Federated Learning Adjustments


17. Meta-Learning (Learning to Learn)


This version now includes Memory Editing with references to emerging techniques like MEMIT and ROME for direct manipulation of model-specific knowledge.

Friday, January 3, 2025

editable neural networks in health science

Meng, K., Bau, D., Andonian, A. & Belinkov, Y. Locating and editing factual associations in GPT. Adv. Neural Inf. Process. Syst. (2022). at <https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html>

26. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-editing memory in a transformer. in International Conference on Learning Representations (arxiv.org, 2023). at <https://arxiv.org/abs/2210.07229>

27. Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. Memory-Based Model Editing at Scale. in Proceedings of the 39th International Conference on Machine Learning (eds. Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G. & Sabato, S.) 162, 15817–15831 (PMLR, 17--23 Jul 2022).

28. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors. in Advances in Neural Information Processing Systems (2023). at <https://arxiv.org/abs/2211.11031>

29. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. Fast model editing at scale. in International Conference on Learning Representations (arxiv.org, 2022). at <https://arxiv.org/abs/2110.11309>

30. Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. Editable Neural Networks. in International Conference on Learning Representations (2020). at <http://arxiv.org/abs/2004.00345>

31. De Cao, N., Aziz, W. & Titov, I. Editing Factual Knowledge in Language Models. in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 6491–6506 (Association for Computational Linguistics, 2021).

32. Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.14795>

33. Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. Evaluating the ripple effects of knowledge editing in language models. Trans. Assoc. Comput. Linguist. 12, 283–298 (2023).

De Cao, N., Aziz, W. & Titov, I. Editing Factual Knowledge in Language Models. arXiv [cs.CL] (2021). at <http://arxiv.org/abs/2104.08164>

35. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. Mass-Editing Memory in a Transformer. arXiv [cs.CL] (2022). at <http://arxiv.org/abs/2210.07229>

36. Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. Fast Model Editing at Scale. arXiv [cs.LG] (2021). at <http://arxiv.org/abs/2110.11309>

37. Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. Aging with GRACE: Lifelong Model Editing with Key-Value Adaptors. (2022). at <https://openreview.net/pdf?id=ngCT1EelZk>

Language Models: Problems, Methods, and Opportunities. arXiv [cs.CL] (2023). at <http://arxiv.org/abs/2305.13172>

41. Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. Fundamental problems with model editing: How should rational belief revision work in LLMs? arXiv [cs.CL] (2024). at <https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAJ:M3ejUd6NZC8C>

42. Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. Can We Edit Multimodal Large Language Models? in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (eds. Bouamor, H., Pino, J. & Bali, K.) 13877–13888 (Association for Computational Linguistics, 2023).



Here are the URLs for the specified papers:


1. **Locating and editing factual associations in GPT**  

   Meng, K., Bau, D., Andonian, A. & Belinkov, Y. (2022).  

   [Link to Paper](https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b0665b33bf3a182-Abstract-Conference.html)


2. **Mass-editing memory in a transformer**  

   Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y. & Bau, D. (2023).  

   [Link to Paper](https://arxiv.org/abs/2210.07229)


3. **Memory-Based Model Editing at Scale**  

   Mitchell, E., Lin, C., Bosselut, A., Manning, C. D. & Finn, C. (2022).  

   [Link to Paper](https://proceedings.mlr.press/v162/mitchell22a.html)


4. **Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors**  

   Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y. & Ghassemi, M. (2023).  

   [Link to Paper](https://arxiv.org/abs/2211.11031)


5. **Fast model editing at scale**  

   Mitchell, E., Lin, C., Bosselut, A., Finn, C. & Manning, C. D. (2022).  

   [Link to Paper](https://arxiv.org/abs/2110.11309)


6. **Editable Neural Networks**  

   Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S. & Babenko, A. (2020).  

   [Link to Paper](http://arxiv.org/abs/2004.00345)


7. **Editing Factual Knowledge in Language Models**  

   De Cao, N., Aziz, W. & Titov, I. (2021).  

   [Link to Paper](http://arxiv.org/abs/2104.08164)


8. **MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions**  

   Zhong, Z., Wu, Z., Manning, C. D., Potts, C. & Chen, D. (2023).  

   [Link to Paper](http://arxiv.org/abs/2305.14795)


9. **Evaluating the ripple effects of knowledge editing in language models**  

   Cohen, R., Biran, E., Yoran, O., Globerson, A. & Geva, M. (2023).  

   [Link to Paper](https://transacl.org/ojs/index.php/tacl/article/view/3736)


10. **Language Models: Problems, Methods, and Opportunities**  

    (2023).  

    [Link to Paper](http://arxiv.org/abs/2305.13172)


11. **Fundamental problems with model editing: How should rational belief revision work in LLMs?**  

    Hase, P., Hofweber, T., Zhou, X., Stengel-Eskin, E. & Bansal, M. (2024).  

    [Link to Paper](https://scholar.google.com/citations?view_op=view_citation&hl=en&citation_for_view=FO90FgMAAAAAJ:M3ejUd6NZC8C)


12. **Can We Edit Multimodal Large Language Models?**  

    Cheng, S., Tian, B., Liu, Q., Chen, X., Wang, Y., Chen, H. & Zhang, N. (2023).  

    [Link to Paper](https://arxiv.org/abs/2305.14795)


Citations:

[1] https://proceedings.neurips.cc/paper_files/paper/2022/hash/6f1d43d5a82a37e89b


Thursday, January 2, 2025

Autonomous AI-Driven Drug Discovery

 

A Framework for Autonomous AI-Driven Drug Discovery

Douglas W Selinger, Timothy R Wall, Eleni Stylianou, Ehab M Khalil, Jedidiah Gaetz, Oren Levy

https://www.biorxiv.org/content/10.1101/2024.12.17.629024v2