This site is to serve as my note-book and to effectively communicate with my students and collaborators. Every now and then, a blog may be of interest to other researchers or teachers. Views in this blog are my own. All rights of research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
Thursday, July 3, 2025
CPSim: Simulation Toolbox for Security Problems in Cyber-Physical Systems.
A2 consortium 2024 talk links
Links to the Symposium talks are below; feel free to share them with interested colleagues.
- Full list of Symposium talks
- Welcome Remarks
- Pattie Maes, PhD – Opportunities for AI and Wearables to Support Healthy Aging
- Jianying Hu, PhD – Harnessing AI for Advancing Neurodegenerative Disease Therapeutics
- Martin Sliwinski, PhD – Managing Cognitive Impairment – From Early Diagnosis to Supportive Care
- Michael Kahana, PhD – Managing Cognitive Impairment – From Early Diagnosis to Supportive Care
- Conor Walsh, PhD – Machine Learning and AI Methods for Aging and Dementia Care
- Marzyeh Ghassemi, PhD – Machine Learning and AI Methods for Aging and Dementia Care
- Jordan Smoller, MD – Big Health Data: Challenges and Opportunities
- Griffin Weber, MD, PhD – Big Health Data: Challenges and Opportunities
Wednesday, July 2, 2025
National Survey of Family Growth (NSFG)
https://www.cdc.gov/nchs/nsfg/index.htm
todo: request access to the restricted-use variables.
Saturday, June 28, 2025
Fall 2025 schedule
August 23 – December 12, 2025; Thursdays, 6:00–8:40 pm.
| Type | Time | Days | Where | Date Range | Schedule Type | Instructors |
|---|---|---|---|---|---|---|
| Scheduled In-Class Meetings | 6:00 pm - 8:40 pm | R | ENGINEERING & COMP SCI BLDG 2120 | Aug 23, 2025 - Dec 12, 2025 | LECTURE | |
Tuesday, June 24, 2025
ZIP Code RUCA Approximation
https://depts.washington.edu/uwruca/ruca-approx.php
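A minimal pandas sketch of how the downloaded ZIP-to-RUCA approximation file could be used is below; the file name, the column names (zipcode, RUCA2), and the rural/urban cutoff are my assumptions and should be checked against the actual download.
import pandas as pd

# Hypothetical export from the UW ZIP-code RUCA approximation page;
# adjust the file name and column names to match the real file.
crosswalk = pd.read_csv("zip_ruca_approximation.csv", dtype={"zipcode": str})

# One common grouping: RUCA codes 1-3 as metropolitan, 4-10 as nonmetro/rural.
crosswalk["rural"] = crosswalk["RUCA2"] >= 4

# Example lookup of a few ZIP codes of interest.
print(crosswalk[crosswalk["zipcode"].isin(["36005", "36104"])])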
All of Us survey, data codebooks
https://docs.google.com/spreadsheets/d/1pODkE2bFN-kmVtYp89rtrJg7oXck4Fsex58237x47mA/edit?usp=sharing
Friday, June 20, 2025
A model for the assembly map of bordism-invariant functors
The paper "A model for the assembly map of bordism-invariant functors" by Levin, Nocera, and Saunier (2025) develops advanced categorical frameworks for algebraic topology, particularly through oplax colimits of stable/hermitian/Poincaré categories and bordism-invariant functors123. While not directly addressing machine learning (ML) or large language models (LLMs), its contributions could indirectly influence these fields through three key pathways:
1. Enhanced Categorical Frameworks for ML
The paper's formalization of oplax colimits and Poincaré-Verdier localizing invariants [1][3] provides new mathematical tools for structuring compositional systems. This could advance:
- Model Architecture Design: abstracting relationships between components (e.g., neural network layers) as bordism-invariant functors could enable more rigorous analysis of model behavior under transformations [5].
- Geometric Deep Learning: topological invariants and assembly maps could refine methods for learning on non-Euclidean data (e.g., graphs, manifolds) by encoding persistence of features under deformations [5].
2. Invariance Learning and Equivalence
The bordism-invariance concept—where structures remain unchanged under continuous deformations—offers a mathematical foundation for invariance principles in ML:
- Data Augmentation: formalizing "bordism equivalence" could guide the design of augmentation strategies that preserve semantic content (e.g., image rotations as "topological bordisms") [5]; see the toy sketch after this list.
- Robust Feature Extraction: kernels of Verdier projections [1][3] might model noise subspaces to exclude during feature learning, improving adversarial robustness.
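As a loose, concrete illustration of the augmentation-invariance idea in the Data Augmentation bullet above, here is a toy NumPy sketch (my own example, not from the paper): a histogram-of-pixel-values feature is unchanged under a 90-degree rotation, while the raw pixel vector is not.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))      # toy "image"
rotated = np.rot90(image)       # augmentation: 90-degree rotation

def histogram_feature(x, bins=4):
    # Ignores spatial arrangement, so any rotation leaves it unchanged.
    counts, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
    return counts

print(np.array_equal(histogram_feature(image), histogram_feature(rotated)))  # True
print(np.array_equal(image.flatten(), rotated.flatten()))                    # False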
3. LLMs for Structured Reasoning
The paper's explicit decomposition of complex functors (e.g., Shaneson splittings with twists [1][3]) parallels challenges in LLM-based reasoning:
- Program Invariant Prediction: LLMs that infer program invariants [6] could adopt categorical decompositions to handle twisted or hierarchical constraints (e.g., loop invariants in code).
- Categorical Data Embeddings: LLM-generated numerical representations of categorical data [4] might leverage bordism-invariance to ensure embeddings respect equivalence classes (e.g., "color" as a deformation-invariant attribute).
Limitations and Future Directions
The work is highly theoretical, with no direct ML/LLM applications in the paper. Bridging this gap requires:
- Translating topological bordisms into data-augmentation pipelines.
- Implementing Poincaré-Verdier invariants as regularization terms in loss functions.
- Extending LLM-based invariant predictors [6] to handle categorical assembly maps.
While speculative, these connections highlight how advanced category theory could enrich ML’s theoretical foundations and LLMs’ reasoning capabilities.
1. https://arxiv.org/abs/2506.05238
2. https://arxiv.org/pdf/2506.05238.pdf
3. https://www.arxiv.org/pdf/2506.05238.pdf
4. https://pubmed.ncbi.nlm.nih.gov/39348252/
5. https://www.aimodels.fyi/papers/arxiv/category-theoretical-topos-theoretical-frameworks-machine-learning
6. https://openreview.net/pdf?id=mXv2aVqUGG
7. https://x.com/CTpreprintBot
8. https://keik.org/profile/mathat-bot.bsky.social
9. https://www.alphaxiv.org/abs/2506.05238
10. https://publications.mfo.de/bitstream/handle/mfo/4263/OWR_2024_47.pdf?sequence=1&isAllowed=y
11. https://x.com/CTpreprintBot/status/1930943445977518380
12. https://www.themoonlight.io/en/review/a-model-for-the-assembly-map-of-bordism-invariant-functors
13. https://library.slmath.org/books/Book69/files/wholebook.pdf
14. https://www.reed.edu/math-stats/thesis.html
15. https://math.mit.edu/events/talbot/2020/syllabus2020.pdf
16. https://webhomes.maths.ed.ac.uk/~v1ranick/papers/quinnass.pdf
17. https://msp.org/agt/2009/9-4/agt-v9-n4-p16-s.pdf
18. https://webhomes.maths.ed.ac.uk/~v1ranick/papers/owsem.pdf
Friday, June 6, 2025
Mendeley preprint error
In Mendeley, if an article has an unspecified type, it is often listed as "preprint".
To fix this, change the document type to 'journal article', 'conference proceedings', or another appropriate type.
Thursday, May 29, 2025
Beyond Attention: Toward Machines with Intrinsic Higher Mental States
https://techxplore.com/news/2025-05-architecture-emulates-higher-human-mental.html#google_vignette
Wednesday, May 28, 2025
USDA Rural-Urban Commuting Area Codes
https://www.ers.usda.gov/data-products/rural-urban-commuting-area-codes
Three-digit ZIP code 360 in Alabama
Based on our discussion, I looked further into the '360' ZIP code region and found that it contains the 14 counties below:
counties = [ "Autauga County", "Barbour County", "Bullock County", "Butler County",
"Chilton County", "Coosa County", "Covington County", "Crenshaw County",
"Elmore County", "Lowndes County", "Macon County", "Montgomery County",
"Pike County", "Tallapoosa County"]
Among them are Barbour, Bullock, and Macon counties.
So I am not sure how relevant the three-digit ZIP code '360' is to the TU MCH project.
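For reference, a minimal pandas sketch of the zip3 idea (the dataframe and ZIP codes below are made up for illustration): keep the first three digits of a 5-digit ZIP code and filter to the '360' region.
import pandas as pd

# Hypothetical participant table with 5-digit ZIP codes (illustration only).
df = pd.DataFrame({"person_id": [1, 2, 3],
                   "zipcode": ["36104", "36830", "35203"]})

# zip3 = first three digits of the ZIP code.
df["zip3"] = df["zipcode"].str[:3]

# Keep only the '360' region discussed above.
print(df[df["zip3"] == "360"])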
pastebin
https://pastebin.com/uBcFUXCA
# Notebook cell on the All of Us Researcher Workbench (Jupyter); the %env magic
# reads the CDR dataset name from the workspace environment.
import pandas as pd

dataset = %env WORKSPACE_CDR

# Pull each participant's state of residence from the person/concept tables.
query = """
SELECT
    p.person_id AS person_id,
    c.concept_name AS state_name
FROM `{dataset}.person` AS p
LEFT JOIN `{dataset}.concept` AS c
    ON p.state_of_residence_concept_id = c.concept_id
WHERE c.concept_name LIKE 'PII State: %'
"""
state_df = pd.read_gbq(query.format(dataset=dataset))

# Strip the 'PII State: ' prefix to keep just the state name.
state_df['state_of_residence'] = state_df['state_name'].str.replace('PII State: ', '')
state_df.head()
state_df.shape

# homeless_respiratory_status is a cohort dataframe built in an earlier cell.
homeless_state_df = state_df[state_df['person_id'].isin(homeless_respiratory_status['person_id'])]
homeless_state_df['state_of_residence'].value_counts()
All of Us research platform, data exploration for rural health research.
Zip3, observational table
Workspaces > Beginner Intro to AoU Data and the Workbench > Analysis >
In ‘test-state_data_v060523’, I found sample code covering state, county, and zip3.
Monday, May 26, 2025
ODU CS and DSC courses taught by Hong Qin
https://catalog.odu.edu/courses/cs/#graduatecoursestext
https://catalog.odu.edu/courses/dasc/
This course explores the application of AI in health sciences, focusing on machine learning, NLP, computer vision, generative AI techniques for diagnostics, treatment planning, patient monitoring, and biomedical research. It covers precision medicine, ethical AI, and the integration of AI into practice. Students will gain a deep understanding and practical skills to develop innovative AI solutions that address real-world challenges in health sciences.
This course provides a deep dive into the foundations and current advancements in generative AI. It covers key concepts such as transformer models, GANs, VAEs, LLMs, and their applications across various fields, emphasizing both theory and hands-on learning, including ethical considerations such as fairness and bias mitigation. Students will develop a comprehensive understanding of generative AI and gain practical experience.
Saturday, May 24, 2025
logit, sigmoid function, and probability
Here is a step-by-step walkthrough of how the sigmoid function turns a logit into a probability, with both math and plain English at each step.
✅ Step 1: What is a logit?
The logit of a probability $p$ is defined as:
$\text{logit}(p) = \ln\left(\frac{p}{1-p}\right)$
This is called the log-odds, because it is the natural log of the odds $\frac{p}{1-p}$.
✅ Step 2: What if we are given the logit and want to get the probability?
Let's say we are given a number $z$, and we know it is a logit. That means:
$\ln\left(\frac{p}{1-p}\right) = z$
We want to solve for $p$, the probability.
✅ Step 3: Exponentiate both sides
To get rid of the logarithm, we exponentiate both sides:
$\frac{p}{1-p} = e^{z}$
This tells us the odds in terms of $z$.
✅ Step 4: Solve for $p$
Now we solve this equation for $p$. Start with:
$\frac{p}{1-p} = e^{z}$
Multiply both sides by $(1-p)$:
$p = e^{z}(1-p)$
Distribute on the right-hand side:
$p = e^{z} - e^{z}p$
Now get all the $p$'s on one side:
$p + e^{z}p = e^{z}$
Factor out $p$ on the left-hand side:
$p(1 + e^{z}) = e^{z}$
Now divide both sides by $(1 + e^{z})$:
$p = \frac{e^{z}}{1 + e^{z}}$
✅ Step 5: Simplify into the sigmoid form
We now simplify that expression by dividing the numerator and denominator by $e^{z}$:
$p = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}$
That's the sigmoid function:
$\sigma(z) = \frac{1}{1 + e^{-z}}$
✅ Final Result
So if you're given a logit value $z$, the sigmoid function
$\sigma(z) = \frac{1}{1 + e^{-z}}$
turns it into a probability $p$.
🔁 Why this is useful
- In logistic regression, we compute the linear score $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$.
- Then we use the sigmoid to convert that score (the logit) into a probability of class 1: $P(y = 1 \mid x) = \sigma(z) = \frac{1}{1 + e^{-z}}$.
This way, the model's output is interpretable as a probability, suitable for binary classification.
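As a quick numerical check of the derivation above, here is a small Python sketch (NumPy only; the example logit values are arbitrary) that converts logits to probabilities with the sigmoid and confirms that the logit function inverts it.
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z}) maps any real logit to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # Log-odds: ln(p / (1 - p)), the inverse of the sigmoid.
    return np.log(p / (1.0 - p))

z = np.array([-2.0, 0.0, 0.5, 3.0])   # arbitrary example logit values
p = sigmoid(z)
print(p)                               # e.g. sigmoid(0) = 0.5
print(np.allclose(logit(p), z))        # True: logit inverts the sigmoid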