Thursday, February 20, 2025

Wednesday, February 19, 2025

A threshold of 0.997 in inductive conformal prediction

A threshold of 0.997 means that, when forming conformal prediction sets, we only include a class if the model’s predicted probability for it is at least 0.997 (99.7%). In other words, the nonconformity score cutoff is so strict that only extremely confident predictions will make it into the prediction set.

Here’s what that implies:

Very High Confidence Requirement:
With a 0.997 threshold, the model must be almost certain (≥ 99.7% probability) about a class before it’s included in the prediction set. For many examples, this might result in a prediction set with only one class (if that class’s probability exceeds 0.997) or—even worse—an empty set if no class meets that bar.

Coverage vs. Set Size Tradeoff:
Conformal prediction is designed to guarantee that the true label is included in the prediction set at a desired rate (coverage). If you set the threshold so high, you risk lowering the coverage (i.e. many instances might not have the true label in their prediction set) or you might get very sparse (tiny) prediction sets. In practical terms, if most of your examples end up with empty or overly “confident” (but possibly incorrect) prediction sets, then the threshold is too strict.

Is it Good or Bad?
  • If the model is extremely well-calibrated and truly confident: a threshold of 0.997 could indicate that the model is rarely uncertain and its predictions are reliable. In such a rare scenario, you might see high coverage (almost every true label is included) and prediction sets that almost always have a single label.
  • In most realistic settings: such a high threshold is likely too conservative. It may lead to prediction sets that are too small (or even empty), failing to capture the uncertainty inherent in the data. That would be “bad” because it undermines one of the strengths of conformal prediction: providing informative prediction sets that reflect the model’s uncertainty.

In summary, unless your model is known to be extremely confident and well-calibrated (so that nearly every correct prediction is given with ≥ 99.7% probability), a threshold of 0.997 is likely too strict. You would typically aim for a threshold that balances having reasonably sized prediction sets (capturing uncertainty) while still meeting your desired coverage rate (for instance, 90% coverage when targeting a 10% error rate).
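
As a rough sketch of the tradeoff described above (using a hypothetical probability matrix and labels, not real model output), here is how a fixed probability threshold translates into prediction sets, empirical coverage, and average set size:

import numpy as np

# Hypothetical softmax outputs for 5 examples over 3 classes, plus true labels.
probs = np.array([
    [0.999, 0.0005, 0.0005],
    [0.60,  0.30,   0.10],
    [0.998, 0.001,  0.001],
    [0.45,  0.45,   0.10],
    [0.85,  0.10,   0.05],
])
y_true = np.array([0, 0, 0, 1, 0])

threshold = 0.997  # include a class only if its predicted probability is >= threshold
pred_sets = [np.where(p >= threshold)[0] for p in probs]

coverage = np.mean([y in s for y, s in zip(y_true, pred_sets)])
avg_size = np.mean([len(s) for s in pred_sets])
print(pred_sets)                 # several sets are empty because the cutoff is so strict
print("coverage:", coverage)     # true-label coverage drops well below a typical 90% target
print("average set size:", avg_size)

With made-up numbers like these, only the two near-certain examples get a (singleton) prediction set, so coverage falls to 40%; lowering the threshold enlarges the sets and restores coverage.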

Wednesday, February 12, 2025

YouTube video on biological AI

 

https://www.youtube.com/@valence_labs


https://pmc.ncbi.nlm.nih.gov/articles/PMC11118704/


Multimodal Learning for Mapping the Genotype-Phenotype Dynamics

PMCID: PMC11118704  PMID: 38798675

Predicting gene expression from histone marks using chromatin deep learning models depends on histone mark function, regulatory distance and cellular states


https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae1212/7921050?login=false#498065910


Tuesday, February 11, 2025

ESM evolutionary ai

 https://github.com/evolutionaryscale/esm/tree/main

write a tutorial for AI for health science course. 

Sunday, February 9, 2025

Friday, February 7, 2025

ants image classification

 https://images.cv/dataset/ant-image-classification-dataset


https://github.com/naturalis/FormicID



CS795 lecture biomedical image

 Zoom, start recording

unit 2. biomedical image

project topics, presentation

use ipad to illustrate data organization


TEAM drive for data sharing



Wednesday, February 5, 2025

Heavy.AI

 https://www.heavy.ai/

why the first epoch takes the longest time during training

The first epoch takes the longest time during training, the second epoch takes slightly less, and later epochs are even faster because of several factors, including data preprocessing, caching, and optimization mechanisms. Let's break it down:


1. Initial Data Loading & Preprocessing

  • First Epoch:
    • During the first epoch, the dataset needs to be fully loaded from disk and passed through the preprocessing pipeline.
    • If you are using ImageDataGenerator or tf.data pipelines, the images may be resized, augmented, normalized, and converted to tensors for the first time.
    • This initial processing is computationally expensive.
  • Later Epochs:
    • TensorFlow and Keras utilize caching and preloaded batches to speed up data access in subsequent epochs.
    • Once the images are processed in the first epoch, they may be cached in RAM (or SSD) and reused, reducing load times.

2. CPU/GPU Warm-Up and Optimization

  • First Epoch:
    • When training starts, TensorFlow initializes GPU memory allocation, CUDA kernels, and computational graph optimizations.
    • This setup process introduces additional overhead, making the first epoch the slowest.
  • Later Epochs:
    • By the second epoch, TensorFlow has already optimized its execution path.
    • GPU computation becomes more efficient as TensorFlow compiles kernels and caches intermediate computations.

3. Adaptive Learning Rate & Optimizer State

  • Some optimizers, like Adam, RMSprop, and SGD with momentum, dynamically adjust learning rates and gradients over epochs.
  • This results in:
    • First Epoch: Gradients might be noisier, and initial computations may take longer.
    • Later Epochs: Optimizers become more stable, requiring slightly less computation.

4. Disk I/O Bottleneck

  • If you are loading images from a slow disk (HDD) or network storage, the first epoch takes longer due to disk read latency.
  • In later epochs, data might be cached in RAM or SSD, making loading much faster.

Conclusion

  • First epoch is slow due to data preprocessing, GPU warm-up, and disk I/O.
  • Second epoch is faster because some of these operations are now cached or optimized.
  • Later epochs become even more efficient as optimizations and caching mechanisms continue to improve performance.
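
For reference, here is a minimal tf.data sketch (image_paths, labels, and the image size are hypothetical placeholders) of the cache-and-prefetch pattern that makes epochs after the first one faster:

import tensorflow as tf

def preprocess(path, label):
    # Decode, resize, and normalize one image; in epoch 1 this runs for every image.
    img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    img = tf.image.resize(img, [224, 224]) / 255.0
    return img, label

# image_paths and labels are placeholders for your own file list and class labels.
ds = tf.data.Dataset.from_tensor_slices((image_paths, labels))
ds = (ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
        .cache()                        # epoch 2+ reuses the already-preprocessed tensors
        .shuffle(1000)
        .batch(32)
        .prefetch(tf.data.AUTOTUNE))    # overlap input loading with GPU computation

# model.fit(ds, epochs=20)  # epoch 1 pays the decode/resize cost; later epochs read from the cache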


Monday, February 3, 2025

explain training progress printout



Below is an overview of what you’re seeing in each line of the training log and why the number “163” appears:

  1. Epoch n/20
    This indicates which epoch (out of 20 total) is currently being trained.

  2. 163/163

    • This means your dataset has been split into 163 mini-batches (steps) for each epoch.
    • In other words, every epoch processes 163 batches of data before finishing.
    • For instance, if you have a total of 16,300 samples in your training set and you set the batch size to 100, you would end up with 163 steps per epoch.
  3. Time/Step Info
    For example: 80s 415ms/step. The first number (80s) usually shows how long the entire epoch took. Then 415ms/step is roughly how long each training step took on average.

  4. acc: 0.7738 (Training Accuracy)
    The model’s accuracy on the training set at the end of the epoch.

  5. f1_score: 0.6252 (Training F1)
    The model’s F1 score on the training set at the end of the epoch, if you’ve added F1 as a metric.

  6. loss: 0.5913 (Training Loss)
    The training loss calculated over all batches in that epoch.

  7. val_acc, val_f1_score, val_loss (Validation Metrics)
    These show how the model performed on the validation set during that epoch:

    • val_acc: Validation accuracy.
    • val_f1_score: Validation F1 score.
    • val_loss: Loss on the validation set.

Putting it all together: each epoch processes 163 mini-batches (hence 163/163), and at the end, TensorFlow displays the final metrics on both the training set (acc, f1_score, loss) and the validation set (val_acc, val_f1_score, val_loss).
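
As a quick arithmetic check of where a number like 163 comes from (using the 16,300-sample, batch-size-100 example above):

import math

num_samples = 16_300   # size of the training set in the example above
batch_size = 100
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 163 mini-batches, shown by Keras as 163/163 each epoch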

 



Friday, January 31, 2025

CS795 lecture 3.

Zoom, start recording

Datacamp review, 

= A primer on deep learning in genetics, classification model, continued 

Review. ask a student to run it and explain. 

pytorch

transformer

Office hours: breakout room with each student. expectation. what to learn. 

What are your short- and long-term career goals? How does the course align with them?

  what kind of topics do you suggest? 

kaggle analysis, logistic regression

Tuesday, January 28, 2025

Prior-knowledge-defined attention masks for transformers

 

Prior-knowledge-defined attention masks for transformers involve incorporating domain-specific information or constraints into the attention mechanism. This approach can offer several advantages and disadvantages:


## Advantages


1. Enhanced Interpretability: By incorporating prior knowledge, the model's attention patterns become more aligned with human understanding, making the model's decision-making process more transparent[2].


2. Improved Performance: In specific domains, prior knowledge can guide the model to focus on relevant information, potentially leading to better performance on targeted tasks[2].


3. Reduced Computational Complexity: By limiting attention to specific areas defined by prior knowledge, the model may require fewer computations, especially for long sequences[4].


4. Task-Specific Adaptation: Prior-knowledge masks can be tailored to specific tasks or domains, allowing for more efficient fine-tuning of pre-trained models[4].


## Disadvantages


1. Limited Flexibility: Rigid prior-knowledge masks might constrain the model's ability to learn unexpected patterns or relationships in the data[2].


2. Potential for Bias: If the prior knowledge is incomplete or biased, it may lead the model to make suboptimal decisions or reinforce existing biases in the data[4].


3. Increased Complexity in Design: Creating effective prior-knowledge masks requires domain expertise and careful design, which can be time-consuming and challenging[2].


4. Reduced Generalization: Highly specific prior-knowledge masks might limit the model's ability to generalize across different tasks or domains[4].


To implement prior-knowledge-defined attention masks:


1. Define the Mask: Create a binary or continuous mask based on domain knowledge or task-specific requirements[2].


2. Integration: Incorporate the mask into the attention mechanism, typically by element-wise multiplication with the attention scores before the softmax, or by adding a large negative value to the blocked positions so that they receive near-zero weight after the softmax[7].


3. Training: Fine-tune the model with the integrated mask, allowing it to learn within the constraints of the prior knowledge[4].


4. Evaluation: Assess the model's performance and interpretability to ensure the prior-knowledge mask enhances rather than hinders the model's capabilities[2].


By carefully designing and implementing prior-knowledge-defined attention masks, researchers can potentially create more efficient, interpretable, and task-specific transformer models. However, it's crucial to balance the benefits of prior knowledge with the need for model flexibility and generalization.
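
As a minimal PyTorch sketch (the tensor shapes and the banded mask are made-up assumptions, not any particular published method), one common way to enforce such a mask is to set the blocked score positions to -inf before the softmax so they receive zero attention weight:

import torch
import torch.nn.functional as F

def prior_masked_attention(q, k, v, prior_mask):
    # q, k, v: (batch, heads, seq, d_k); prior_mask: (seq, seq) with True/1 = allowed, False/0 = blocked.
    d_k = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    scores = scores.masked_fill(prior_mask == 0, float("-inf"))  # blocked pairs get zero weight after softmax
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v), weights

# Toy usage: prior knowledge says each position may attend only to itself and its neighbors.
seq, d_k = 6, 8
q = k = v = torch.randn(1, 1, seq, d_k)
band = (torch.arange(seq)[:, None] - torch.arange(seq)[None, :]).abs() <= 1
out, attn = prior_masked_attention(q, k, v, band)

Note that every row of the mask must allow at least one position; a fully blocked row would produce NaNs after the softmax.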


Citations:

[1] https://stackoverflow.blog/2024/09/26/masked-self-attention-how-llms-learn-relationships-between-tokens/

[2] https://arxiv.org/html/2406.02761v1

[3] https://stackoverflow.com/questions/58127059/how-to-understand-masked-multi-head-attention-in-transformer/59713254

[4] https://openreview.net/forum?id=abHtkQkumD

[5] https://www.reddit.com/r/MLQuestions/comments/1fqjdrf/understanding_masked_attention_in_transformer/

[6] https://blog.pangeanic.com/what-are-transformers-in-nlp

[7] https://datascience.stackexchange.com/questions/65067/proper-masking-in-the-transformer-model

[8] https://www.turing.com/kb/brief-introduction-to-transformers-and-their-power

Conformity in machine learning model prediction evaluation

 Conformity in machine learning model prediction evaluation is calculated using a measure called the nonconformity score. This score quantifies how different or "nonconforming" a new data point is compared to the patterns observed in the training data[2]. The process of calculating conformity involves several steps:


1. Training Phase:

   - Split the dataset into a proper training set and a calibration set.

   - Train the model on the proper training set.

   - Use the trained model to make predictions on the calibration set.


2. Nonconformity Calculation:

   - For each instance in the calibration set, calculate a nonconformity score.

   - This score measures how different the prediction is from the actual value.


3. Prediction Phase:

   - For a new data point, calculate its nonconformity score using the trained model.

   - Compare this score to the distribution of nonconformity scores from the calibration set.


The nonconformity score can be calculated in various ways, depending on the type of problem:


- For regression: It could be the absolute difference between the predicted and actual values.

- For classification: It might be based on the probability assigned to the correct class.


The key idea is that instances with higher nonconformity scores are less conforming to the training patterns and are therefore associated with higher uncertainty[2].


By using this approach, Inductive Conformal Prediction (ICP) can generate prediction intervals or sets that capture the uncertainty associated with individual predictions. This allows for a more nuanced evaluation of model performance, going beyond simple point predictions to provide a measure of confidence in each prediction.
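
A minimal NumPy sketch of the calibration and prediction steps described above for classification (the calibration probabilities and labels are random placeholders), using 1 minus the probability of the true class as the nonconformity score:

import numpy as np

alpha = 0.10  # target 90% coverage
rng = np.random.default_rng(0)

# Placeholder calibration-set outputs: softmax probabilities and true labels.
cal_probs = rng.dirichlet(np.ones(3), size=500)
cal_labels = rng.integers(0, 3, size=500)

# Nonconformity score: 1 - probability assigned to the true class.
cal_scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]

# Conformal quantile of the calibration scores, with the finite-sample correction.
n = len(cal_scores)
qhat = np.quantile(cal_scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction set for a new example: keep every class whose nonconformity score is <= qhat.
new_probs = np.array([0.70, 0.25, 0.05])
pred_set = np.where(1.0 - new_probs <= qhat)[0]
print("threshold:", qhat, "prediction set:", pred_set)

With real calibration data, the resulting probability cutoff is 1 - qhat, which is how a number like 0.997 can arise in the earlier entry.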


Citations:

[1] https://www.geeksforgeeks.org/metrics-for-machine-learning-model/

[2] https://www.linkedin.com/pulse/inductive-conformal-prediction-yeshwanth-n

[3] https://kanerika.com/glossary/model-evaluation-metrics/

[4] https://www.youtube.com/watch?v=oqK6rM8fbkk

[5] https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234?gi=7cd5f38faaf8

[6] https://www.datasource.ai/en/data-science-articles/model-evaluation-metrics-in-machine-learning

[7] https://www.nature.com/articles/s41598-024-56706-x

[8] https://towardsdatascience.com/all-you-need-is-conformal-prediction-726f18920241?gi=6dd1cfc4136e


Conformity and SHAP (SHapley Additive exPlanations) value analysis are related in the context of machine learning model interpretation and uncertainty quantification. Both approaches aim to provide insights into model behavior, but they focus on different aspects:


1. Uncertainty Quantification: Conformity measures, particularly in the form of nonconformity scores, are used to quantify the uncertainty of model predictions. SHAP values, on the other hand, explain the impact of individual features on model outputs[1][3].


2. Shapley-value Conformity Scores: Recent research has explored combining Shapley values with conformal prediction to create more informative prediction sets. This approach uses Shapley values as conformity scores, resulting in smaller prediction sets for certain significance levels compared to traditional methods[5].


3. Complementary Information: While SHAP values provide feature importance and impact on model predictions, conformity measures offer insights into the reliability and uncertainty of those predictions. Together, they can provide a more comprehensive understanding of model behavior[2].


4. Uncertainty in SHAP Values: Research has also focused on quantifying uncertainty in SHAP value estimations. This includes using Shapley Residuals, Mean-Standard-Error, and Bayesian SHAP to capture different sources of uncertainty in SHAP explanations[6].


5. Application to Uncertainty Explanation: Recent work has adapted the Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's contribution to the conditional entropy of model outputs[4].


By combining conformity measures with SHAP value analysis, researchers and practitioners can gain a more nuanced understanding of both model predictions and their associated uncertainties, leading to more reliable and interpretable machine learning applications.


Citations:

[1] https://proceedings.neurips.cc/paper_files/paper/2023/file/16e4be78e61a3897665fa01504e9f452-Paper-Conference.pdf

[2] https://papers.phmsociety.org/index.php/phmap/article/download/3694/2161

[3] https://mindfulmodeler.substack.com/p/shap-is-not-all-you-need

[4] https://arxiv.org/abs/2306.05724

[5] https://proceedings.mlr.press/v152/jaramillo21a.html

[6] https://scholarship.tricolib.brynmawr.edu/items/1c209352-e4ab-454e-822c-1fe30211b92d

[7] https://pmc.ncbi.nlm.nih.gov/articles/PMC10985608/

[8] https://soil.copernicus.org/articles/10/679/2024/


Conformity and attention masks in transformers can be combined in innovative ways to enhance model performance and uncertainty quantification. Here are some key approaches:


1. Uncertainty-Guided Transformer (UGT): This approach uses conformity measures to guide the attention mechanism. By introducing an uncertainty-guided random masking algorithm (UGRM), higher probability of masking is assigned to uncertain regions during training. This forces the transformer to become more efficient at inferring and recovering content in uncertain regions by exploiting contextual information.[2]


2. Stochastic Attention: Instead of using deterministic attention distributions, the attention mechanism can be made stochastic. This involves sampling attention from a Gumbel-Softmax distribution, which controls the concentration over values. Additionally, key heads in self-attention can be regularized to attend to a set of learnable centroids, effectively performing clustering over keys or hidden states.[4]


3. Probabilistic Transformer: This approach uses probabilistic attention scores to quantify epistemic uncertainties in model predictions. It involves training two models - a majority model focusing on low-uncertainty samples and a minority model focusing on high-uncertainty samples. During inference, these models are dynamically combined based on the input uncertainty to make the final prediction.[6]


4. Transformer Conformal Prediction: This method uses the Transformer architecture, particularly the decoder, as a conditional quantile estimator to predict the quantiles of prediction residuals. These quantiles are then used to estimate prediction intervals. The Transformer's ability to learn temporal dependencies across past prediction residuals benefits the estimation of prediction intervals.[5]


5. Topological Feature Extraction: This approach extracts topological features from attention matrices, providing a low-dimensional, interpretable representation of the model's internal dynamics. This can be used to estimate uncertainty in the transformer's predictions.[8]


These approaches demonstrate how conformity measures and attention masks can be combined to improve uncertainty quantification, enhance model interpretability, and potentially boost performance in various tasks. By integrating these concepts, researchers can develop more robust and reliable transformer models that not only make accurate predictions but also provide valuable insights into their confidence levels.


Citations:

[1] https://www.reddit.com/r/MLQuestions/comments/1fqjdrf/understanding_masked_attention_in_transformer/

[2] https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Uncertainty-Guided_Transformer_Reasoning_for_Camouflaged_Object_Detection_ICCV_2021_paper.pdf

[3] https://proceedings.mlr.press/v206/seedat23a/seedat23a.pdf

[4] https://cdn.aaai.org/ojs/21364/21364-13-25377-1-2-20220628.pdf

[5] https://arxiv.org/html/2406.05332v1

[6] https://sites.ecse.rpi.edu/~cvrl/Publication/pdf/Guo2022.pdf

[7] https://nejsds.nestat.org/journal/NEJSDS/article/10/text

[8] https://arxiv.org/abs/2308.11295

Friday, January 24, 2025

CS795 20250124 lecture 2

Zoom, start recording

Datacamp review, 

slides, 

= A primer on deep learning in genetics, classification model

https://colab.research.google.com/github/hongqin/Python-CoLab-bootcamp/blob/master/A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb

= CoLab

= Github - Wahab

= Wahab ondemand, 

todo: kaggle analysis, logistic regression

Thursday, January 23, 2025

recent papers exploring the mathematical foundations of artificial intelligence

 Here are some recent papers exploring the mathematical foundations of artificial intelligence:

  1. "Formal Mathematical Reasoning: A New Frontier in AI" (December 2024)

    • Authors: Kaiyu Yang, Gabriel Poesia, Jingxuan He, Wenda Li, Kristin Lauter, Swarat Chaudhuri, Dawn Song
    • Summary: This position paper advocates for the integration of formal mathematical reasoning in AI, emphasizing its importance for advancing AI-driven discoveries in science and engineering. The authors discuss the role of formal systems, such as proof assistants, in verifying the correctness of reasoning and providing automatic feedback. citeturn0search1
    • https://arxiv.org/abs/2412.16075?utm_source=chatgpt.com 
  2. "Artificial Intelligence: Advanced Mathematical Constructs and Applications" (November 2024)

    • Authors: A. Sultan, S. Sridevi, A. Rohini
    • Summary: This paper explores the mathematical foundations underpinning AI, machine learning (ML), and deep learning (DL). It highlights the significance of calculus, linear algebra, probability, and statistics in developing and optimizing AI algorithms. The authors demonstrate how these mathematical concepts are essential for data representation and model optimization in AI systems. citeturn0search3
    • https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5066774&utm_source=chatgpt.com 
  3. "Integrating Discrete Mathematics in Artificial Intelligence: A Computational Perspective with a Vision for Future Technologies" (June 2024)

    • Authors: Shalini Mishra, Garima Singh, Manju Prabhakar
    • Summary: This research article discusses the integration of discrete mathematics into AI, focusing on its computational aspects and potential impact on future technologies. The authors examine how discrete mathematical structures can enhance the development and analysis of AI algorithms. citeturn0search4
    • chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/https://www.isroset.org/pub_paper/IJSRMSS/10-ISROSET-IJSRMSS-09633.pdf?utm_source=chatgpt.com

  4. "A Mathematical Framework of Intelligence and Consciousness Based on Riemannian Geometry" (July 2024)

    • Author: Meng Lu
    • Summary: This manuscript proposes a mathematical framework using Riemannian geometry to describe the structure and dynamics of intelligence and consciousness. It conceptualizes intelligence elements as tokens embedded in a high-dimensional space, with thought processes depicted as sequential activations along geodesics within these manifolds. citeturn0academia13
  5. "Artificial Intelligence and Inherent Mathematical Difficulty" (August 2024)

    • Author: Anonymous
    • Summary: This paper explores the relationship between artificial intelligence and the inherent challenges in resolving open mathematical questions. It presents arguments based on computability and complexity theory, illustrating the difficulties in proof discovery and the application of AI-inspired methods to mathematical problem-solving. citeturn0search10
    • https://arxiv.org/abs/2408.03345?utm_source=chatgpt.com

These papers provide valuable insights into the mathematical underpinnings of AI, highlighting the ongoing efforts to establish rigorous theoretical foundations for the field.

Tuesday, January 21, 2025

whisperx wahab, warnings

 

[hqin@wahab-01 ~]$ module load pytorch-gpu/2.2
[hqin@wahab-01 ~]$ crun -p ~/envs/whisperx python
Python 3.10.13 (tags/v3.10.13-25-g07fbd8e9251-dirty:07fbd8e9251, Sep 27 2023, 23:32:09) [GCC 13.2.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import whisperx
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:628: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
INFO:speechbrain.utils.quirks:Applied quirks (see `speechbrain.utils.quirks`): [disable_jit_profiling, allow_tf32]
INFO:speechbrain.utils.quirks:Excluded quirks specified by the `SB_DISABLE_QUIRKS` environment (comma-separated list): []

Monday, January 20, 2025

whisperx ubuntu workstation

Tried to install whisperx to run on GPU, but had trouble with CUDA libraries, so defaulted to CPU instead.

Sunday, January 19, 2025

wahab whisperx install try

On Wahab, crun only works after module load xxxx

[hqin@wahab-01 ~]$ ls

[hqin@wahab-01 ~]$ module load container_env

[hqin@wahab-01 ~]$ python -m venv whisperx_env

python: Command not found.

[hqin@wahab-01 ~]$ crun python -m venv whisperx_env

[hqin@wahab-01 ~]$ ls whisperx_env
[hqin@wahab-01 ~]$

Did not work. 







Saturday, January 18, 2025

wahab github clone

On ODU Wahab, GitHub can be used with an RSA public key.

For example

git clone git@github.com:hongqin/AI4Health.git

Friday, January 17, 2025

CS795 - day1- 2025 Jan 17 Friday

Zoom, start recording

Datacamp: registration

HPC survey (5 minutes)

 The Research & Cloud Computing group (RCC) recently launched a survey regarding the need for training for research computing users. We would like to ask you to promote this survey among your students in classes and research groups as well as your colleagues, postdocs and other staff. The survey link is:

 

https://odu.co1.qualtrics.com/jfe/form/SV_9zCyC5peVHeQgl0

 

Please encourage them to submit responses by the end of January so we can use the findings to adjust offerings for this semester. Your help will be greatly appreciated!


CoLab

syllabus, 

Socrative ice breaker, anonymous

Github


Let students introduce each other in breakout rooms. Then student A introduces student B.


AI101, TensorFlow Playground.


== did not finish. leave for next class. 

skip self-introduction video.

project team, 

ChatGPT, anthropic, 

all of us account

A primer on deep learning in genetics, classification model

https://colab.research.google.com/github/hongqin/Python-CoLab-bootcamp/blob/master/A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb


Thursday, January 16, 2025

Rsync

 

rsync

https://www.cisecurity.org/advisory/multiple-vulnerabilities-in-rsync-could-allow-for-remote-code-execution_2025-007



Wednesday, January 8, 2025

CVE-2024-27322 Should Never Have Been Assigned And R Data Files Are Still Super Risky Even In R 4.4.0


 

https://rud.is/b/2024/05/03/cve-2024-27322-should-never-have-been-assigned-and-r-data-files-are-still-super-risky-even-in-r-4-4-0/


Spring 2025 course schedule

 CS 795/895 DASC, AI for health and life sciences. 

Scheduled Meeting Times
Type: Scheduled In-Class Meetings
Time: 4:30 pm - 7:10 pm
Days: F
Where: ENGINEERING & COMP SCI BLDG 2120
Date Range: Jan 11, 2025 - Apr 28, 2025
Schedule Type: LECTURE
Instructors: HONG QIN (P)

Monday, January 6, 2025

human scRNA aging data

There are some human single-cell aging datasets, for example:

 
 
 
 
https://pmc.ncbi.nlm.nih.gov/articles/PMC10306289/#_ad93_

Human PBMC scRNA-seq–based aging clocks reveal ribosome to inflammation balance as a single-cell aging hallmark and super longevity


Saturday, January 4, 2025

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

 

https://arxiv.org/abs/2412.14135


Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

OpenAI o1 represents a significant milestone in Artificial Intelligence, which achieves expert-level performances on many challenging tasks that require strong reasoning ability. OpenAI has claimed that the main technique behind o1 is reinforcement learning. Recent works use alternative approaches like knowledge distillation to imitate o1's reasoning style, but their effectiveness is limited by the capability ceiling of the teacher model. Therefore, this paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning, focusing on four key components: policy initialization, reward design, search, and learning. Policy initialization enables models to develop human-like reasoning behaviors, equipping them with the ability to effectively explore solution spaces for complex problems. Reward design provides dense and effective signals via reward shaping or reward modeling, which is the guidance for both search and learning. Search plays a crucial role in generating high-quality solutions during both training and testing phases, which can produce better solutions with more computation. Learning utilizes the data generated by search for improving policy, which can achieve better performance with more parameters and more searched data. Existing open-source projects that attempt to reproduce o1 can be seen as a part or a variant of our roadmap. Collectively, these components underscore how learning and search drive o1's advancement, making meaningful contributions to the development of LLM.