Sunday, November 26, 2023

dynamic networks and epidemiological modeling

 Dynamic networks are networks that change over time, such as social networks, recommender systems, and epidemiological networks. Dynamic graphs are mathematical representations of dynamic networks, where nodes represent entities and edges represent interactions or relationships that vary over time. Dynamic graph neural networks (DGNNs) are machine learning models that can learn from dynamic graph data and perform various tasks, such as node classification, link prediction, and graph generation.

Epidemiological modeling is the process of using mathematical models to understand and predict the spread of infectious diseases, such as COVID-19. Epidemiological models can help inform public health policies and interventions to prevent or mitigate the impact of pandemics. One of the challenges of epidemiological modeling is to account for the complex and dynamic nature of human contact networks, which influence how diseases transmit among individuals and populations.

Dynamic networks and dynamic graphs can be used for epidemiological modeling in several ways. One way is to use dynamic graph data as input to DGNNs and train them to learn the temporal and structural patterns of disease transmission. For example, [3] proposed an epidemiological neural network (ENN) that exploits dynamic graph-structured data and applied it to the COVID-19 outbreak. The ENN uses a recurrent neural network (RNN) to capture the temporal evolution of the dynamic graph, and a graph convolutional network (GCN) to capture the spatial features of the nodes and edges. The ENN can predict the number of infected, recovered, and deceased cases at the node and graph level, as well as generate synthetic dynamic graphs that simulate the disease spread under different scenarios.

Another way is to use DGNNs to generate dynamic graphs that serve as input for epidemiological models. For example, [1] surveyed several DGNN models that can generate dynamic graphs based on different objectives, such as preserving the graph properties, optimizing a reward function, or matching a target distribution. These models can be used to create realistic and diverse dynamic graphs that represent human contact networks. Epidemiological models, such as the susceptible-infectious-recovered (SIR) model, can then be run on these graphs to simulate the disease dynamics. This can help evaluate the performance and robustness of different epidemiological models, as well as explore the effects of various factors, such as network structure, intervention strategies, and disease parameters, on the disease spread.
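To make the idea of running an SIR model on a dynamic graph concrete, here is a minimal sketch in plain Python. It is not taken from the cited works: the function name `simulate_sir`, the snapshot representation (one edge list per time step), and the per-contact infection probability `beta` and per-step recovery probability `gamma` are all assumptions chosen for illustration.

```python
import random


def simulate_sir(snapshots, initial_infected, beta, gamma, seed=0):
    """Discrete-time SIR on a sequence of contact-graph snapshots.

    snapshots: list of edge lists, one per time step; each edge is (u, v).
    beta: probability that an infectious node infects a susceptible contact.
    gamma: probability that an infectious node recovers in a time step.
    Returns a list of (S, I, R) counts: the start plus one entry per step.
    """
    rng = random.Random(seed)
    nodes = {n for edges in snapshots for edge in edges for n in edge}
    nodes |= set(initial_infected)
    state = {n: "S" for n in sorted(nodes)}
    for n in initial_infected:
        state[n] = "I"

    def counts():
        values = list(state.values())
        return values.count("S"), values.count("I"), values.count("R")

    history = [counts()]
    for edges in snapshots:
        # Infections are decided from the state at the start of the step.
        newly_infected = set()
        for u, v in edges:
            for src, dst in ((u, v), (v, u)):
                if state[src] == "I" and state[dst] == "S" and rng.random() < beta:
                    newly_infected.add(dst)
        newly_recovered = {
            n for n, s in state.items() if s == "I" and rng.random() < gamma
        }
        for n in newly_infected:
            state[n] = "I"
        for n in newly_recovered:
            state[n] = "R"
        history.append(counts())
    return history


# Toy example: the same three contacts, in two different temporal orders.
forward = [[(0, 1)], [(1, 2)], [(2, 3)]]
backward = list(reversed(forward))
print(simulate_sir(forward, {0}, beta=1.0, gamma=0.0)[-1])
print(simulate_sir(backward, {0}, beta=1.0, gamma=0.0)[-1])
```

The two toy runs use identical edges but end with different outbreak sizes, which is the reason the temporal order of contacts matters and a static graph would not capture it.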

Saturday, November 25, 2023

transformer for large social networks

 There are several possible ways to use transformers to model large dynamic social networks, depending on the goals and challenges of the task.

Thursday, November 23, 2023

basic circuit

 https://qiskit.org/documentation/tutorials/circuits/01_circuit_basics.html

evolutionary landscape

 The concept of the evolutionary landscape offers a compelling visualization of evolution's mechanisms as they act upon biological entities, encompassing genes, proteins, populations, or entire species. This concept, as outlined by Richter (2023), enables us to conceptualize these entities as navigating through a vast search space that encapsulates all possible variations. Here, the fitness of each variant is denoted by its position on the landscape, indicative of the entity's ability to survive and reproduce within its environment. Intriguingly, these landscapes may exhibit either smooth or rugged terrains, a feature that hinges on the impact of minor alterations in the entity on its overall fitness (Kauffman & Levin, 1987).


Additionally, evolutionary landscapes are dynamic: they shift with changes in the environment or in the entities themselves, a notion first put forward by Wright (1932). The fitness landscape, a specific subtype of evolutionary landscape, focuses on the relationship between genotypes and reproductive success. Since Wright introduced it, the concept has become important both in evolutionary biology and as a tool for tackling optimization problems (Stadler, 2002).


A fitness landscape serves as a pivotal analytical tool to examine how populations adapt to their environments. It sheds light on the influence of various evolutionary processes, such as natural selection, genetic drift, mutation, and recombination, on the evolutionary trajectory of these populations (Gavrilets, 2004). Moreover, the fitness landscape paradigm is instrumental in assessing the efficacy of different evolutionary algorithms. These algorithms, inspired by the tenets of natural evolution, are designed to identify optimal or near-optimal solutions to complex problems (Reeves & Rowe, 2003).
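The contrast between smooth and rugged landscapes, and the adaptive walks studied by Kauffman and Levin, can be illustrated with a small NK-style model. The sketch below is an illustration under stated assumptions rather than a reproduction of any cited implementation: genotypes are bit tuples, each locus interacts with its `k` right-hand neighbours (wrapping around), and the function names are hypothetical.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function over length-n bit tuples (NK-style sketch).

    Each locus contributes a random value that depends on its own bit and
    on the bits of its k right-hand neighbours (with wrap-around).
    """
    rng = random.Random(seed)
    tables = [
        {key: rng.random() for key in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = tuple(genotype[(i + j) % n] for j in range(k + 1))
            total += tables[i][key]
        return total / n

    return fitness


def neighbours(genotype):
    """All genotypes that differ from the given one by a single bit flip."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def count_local_optima(n, k, seed=0):
    """Count genotypes that are strictly fitter than all one-flip neighbours."""
    fitness = make_nk_landscape(n, k, seed)
    count = 0
    for g in itertools.product((0, 1), repeat=n):
        f = fitness(g)
        if all(fitness(h) < f for h in neighbours(g)):
            count += 1
    return count


def adaptive_walk(fitness, start):
    """Greedy hill climb: move to the fittest neighbour until none is better."""
    current = start
    while True:
        best = max(neighbours(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current
        current = best


if __name__ == "__main__":
    print("local optima, k=0:", count_local_optima(8, 0))
    print("local optima, k=7:", count_local_optima(8, 7))
    walk_end = adaptive_walk(make_nk_landscape(8, 7), (0,) * 8)
    print("walk from all-zeros ends at:", walk_end)
```

With `k = 0` every locus contributes independently, so the landscape is smooth and has a single peak. Raising `k` couples the loci, which typically produces many local peaks on which a greedy adaptive walk can get stuck.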


References: 


- Kauffman, S. A., & Levin, S. (1987). Towards a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology, 128(1), 11-45.

- Gavrilets, S. (2004). Fitness landscapes and the origin of species (MPB-41) (Vol. 41). Princeton University Press.

- Reeves, C., & Rowe, J. (2003). Genetic algorithms: principles and perspectives: a guide to GA theory. Kluwer Academic Publishers.

- Stadler, P. F. (2002). Fitness landscapes. In Biological evolution and statistical physics (pp. 183-204). Springer, Berlin, Heidelberg.

- Wright, S. (1932). The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Proceedings of the sixth international congress of genetics, 1, 356-366.

Wednesday, November 22, 2023

qiskit primitives

 

https://qiskit.org/ecosystem/ibm-runtime/primitives.html

Sampler: Generates quasi-probability distributions from input circuits.

Estimator: Calculates expectation values from input circuits and observables.
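As a rough picture of what the two primitives return, here is a plain-Python sketch that does not use Qiskit at all. The dictionary representation of the state, the function names, and the restriction to observables built only from I and Z are assumptions made for illustration; the real primitives take circuits and observables and run them on a simulator or backend.

```python
import math

# Bell state (|00> + |11>) / sqrt(2), stored as bitstring -> amplitude.
bell_state = {"00": 1 / math.sqrt(2), "11": 1 / math.sqrt(2)}


def sample_distribution(amplitudes):
    """Sampler-like output: ideal probability of each measured bitstring."""
    return {bits: abs(amp) ** 2 for bits, amp in amplitudes.items()}


def expectation_of_z_string(amplitudes, pauli):
    """Estimator-like output for an observable made only of I and Z factors.

    pauli is a string such as "ZZ" or "ZI", read in the same order as the
    bitstrings (a simplification that ignores Qiskit's qubit ordering).
    """
    total = 0.0
    for bits, prob in sample_distribution(amplitudes).items():
        sign = 1
        for bit, op in zip(bits, pauli):
            if op == "Z" and bit == "1":
                sign = -sign
        total += sign * prob
    return total


print(sample_distribution(bell_state))
print(expectation_of_z_string(bell_state, "ZZ"))
print(expectation_of_z_string(bell_state, "ZI"))
```

In this noise-free picture the distribution is an ordinary probability distribution; the "quasi" in the Sampler description reflects that error-mitigated results from real hardware can contain small negative values.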

https://learning.quantum-computing.ibm.com/tutorial/working-with-the-qiskit-runtime-sampler-primitive#primitives



Monday, November 20, 2023

EMBER (Elastic Malware Benchmark for Empowering Researchers)

 The EMBER (Elastic Malware Benchmark for Empowering Researchers) dataset is an important resource for machine learning in the context of cybersecurity, specifically for malware detection. Here are the details:


- **Overview**: EMBER is a collection of features extracted from PE (Portable Executable) files, serving as a benchmark dataset for training static PE malware machine learning models. The dataset includes features from PE files scanned in or before 2017 (EMBER2017) and 2018 (EMBER2018).


- **Contents**: 

  - EMBER2017 contains features from 1.1 million PE files.

  - EMBER2018 includes features from 1 million PE files.


- **URLs for Download**:

  - EMBER2017 (Feature Version 1): [Download Link](https://ember.elastic.co/ember_dataset.tar.bz2)

  - EMBER2017 (Feature Version 2): [Download Link](https://ember.elastic.co/ember_dataset_2017_2.tar.bz2)

  - EMBER2018 (Feature Version 2): [Download Link](https://ember.elastic.co/ember_dataset_2018_2.tar.bz2)


- **Repository**: The GitHub repository for EMBER provides additional resources and tools to train benchmark models, extend the feature set, or classify new PE files using these models.


This dataset is particularly useful for researchers and professionals in the field of cybersecurity who are focusing on developing and enhancing machine learning models for malware detection and analysis.
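After unpacking one of the archives, a quick sanity check is to count the labels in a raw feature file. The sketch below uses only the standard library and rests on assumptions that should be checked against the repository: that each line of a raw feature file is a JSON object, and that it carries a `label` field (commonly described as 0 for benign, 1 for malicious, and -1 for unlabeled). The function name and the example path are hypothetical.

```python
import json
from collections import Counter


def count_labels(jsonl_path):
    """Count the values of the 'label' field in a JSON-lines feature file.

    Assumes one JSON object per line; records without a 'label' key are
    counted under None.
    """
    counts = Counter()
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            counts[record.get("label")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python count_labels.py ember2018/train_features_0.jsonl
    if len(sys.argv) > 1:
        for label, n in sorted(count_labels(sys.argv[1]).items(), key=str):
            print(label, n)
```

Reading line by line keeps memory use flat, which matters because the feature files are large.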

recent network intrusion datasets

 Here are some recent network intrusion datasets suitable for machine learning training, along with their sources and URLs:


1. **UNSW-NB15 Dataset**: This dataset contains nine different types of attacks, including DoS, worms, backdoors, and fuzzers, along with raw network packets. The training set includes 175,341 records and the testing set 82,332 records, covering both attack and normal traffic.

   - URL: [UNSW-NB15 Dataset](https://paperswithcode.com/dataset/unsw-nb15)


2. **CICIDS2017 Dataset**: The CICIDS2017 dataset provides full packet captures in pcap format together with labeled network flows as CSV files for machine and deep learning purposes.

   - URL: [CICIDS2017 Dataset](https://paperswithcode.com/dataset/cicids2017)


3. **UQ NIDS Datasets (FlowMeter Format)**: Introduced by Sarhan et al., this dataset is specifically formatted for use with FlowMeter, a tool for extracting flow-based features from network traffic.

   - URL: [UQ NIDS Datasets (FlowMeter Format)](https://paperswithcode.com/dataset/uq-nids-datasets-flowmeter-format)


4. **CIC IoT Dataset 2022**: Aimed at profiling, behavioral analysis, and vulnerability testing of different IoT devices, this dataset encompasses various experiments capturing network traffic of IoT devices under different conditions, including power, idle, interactions, scenarios, active use, and attacks.

   - URL: [CIC IoT Dataset 2022](https://paperswithcode.com/dataset/cic-iot-dataset-2022)


5. **IoT Benign and Attack Traces Dataset**: This dataset includes data collected for research on detecting volumetric attacks on IoT devices. It contains flow data and annotations for both benign and attack scenarios.

   - URL: [IoT Benign and Attack Traces Dataset](https://paperswithcode.com/dataset/iot-benign-and-attack-traces)


These datasets are valuable resources for training and evaluating machine learning models for network intrusion detection, especially considering the diverse nature of network attacks and behaviors they encompass.

Tuesday, November 7, 2023

Wednesday, November 1, 2023

UK Biobank

 

https://www.kaggle.com/code/hongqin/using-uk-biobank-to-scale-up-your-research-python/edit


Evolution of the primate trypanolytic factor APOL1

 


https://www.pnas.org/doi/10.1073/pnas.1400699111#:~:text=African%20trypanosomes%20are%20parasites%20that,kidney%20disease%20in%20African%20Americans.


allofus, smoking study

 

https://academic.oup.com/jamia/advance-article/doi/10.1093/jamia/ocad205/7330649?utm_source=advanceaccess&utm_campaign=jamia&utm_medium=email&nbd=35144805721&nbd_source=campaigner&login=false


all of us training


https://allofus.nih.gov/about 

mission

https://support.researchallofus.org/hc/en-us/articles/14927774297620-The-new-VariantDataset-VDS-format-for-All-of-Us-short-read-WGS-data


genomic data

https://twitter.com/AllofUsCEO/status/1514996980661071879

CEO, Josh Denny