Friday, September 20, 2024

Scope of Work examples

 


Here are a couple of examples: 

https://shsu.edu/dept/office-of-research-and-sponsored-programs/submissions-and-awards-pages/SOW-Examples.pdf?language_id=1


Statement of Work (SOW) Examples

Solid SOW Example #1

Throughout years one and two, Dr. Susan Scientist will be responsible for recruiting 120 study subjects from the State University Autism Center and collecting all Health-Related Quality of Life (HRQoL) data from this site. Dr. Scientist will also be responsible for monitoring the quality of the HRQoL data in accordance with the attached protocol and for reporting the data collected to Dr. Peter Physician of Massachusetts General Hospital following a schedule to be mutually determined by the end of year one. Beginning in year three and continuing for the next six months, Dr. Scientist will create a parent focus group at the State University Autism Center and lead a series of meetings with this group. She will analyze the focus group data and, working with co-Investigators at MGH, will take the lead in drafting an enhanced autism-specific HRQoL tool. Dr. Scientist will be responsible for psychometric testing of this new tool at all collaborating sites and for analyzing and reporting these data to Dr. Physician by the middle of year four. Throughout the course of the project, Dr. Scientist and her team will work with co-Investigators at MGH and all study sites to plan for and participate in the project’s Advisory Committee meetings, in addition to participating in monthly conference calls to discuss progress and providing written reports to Dr. Physician on an annual basis for inclusion in the annual progress report and final report to the sponsor. Dr. Scientist and her team will also work with Dr. Physician on the preparation of conference presentations and manuscripts reporting on the study.

Solid SOW Example #2

The Kendall Square Scientific Institute is a recognized international leader in bringing the power of genomics to medicine and, since its founding in 1990, has been a leader in developing automated methods for data production and analysis and managing these in a high-throughput production environment. Dr. Paul Physician will be the PI of the project at KSSI and will be responsible for performing a whole genome analysis scan (WSAS) using over 650,000 single nucleotide polymorphisms (SNPs) in 100 aviremic controllers, 1000 viremic controllers, and 500 age-gender-ethnicity matched control individuals with progressive viremic HIV-1 infection. Dr. Sarah Scientist will lead the sequencing component. She will be responsible for generating viral genomes from 1000 aviremic subjects, 250 progressors, and longitudinal data from 160 acutely infected individuals. She will also conduct bioinformatics analyses of these data to identify HLA-associated mutations and other sequence polymorphisms associated with the control of HIV. In year one, the KSSI will complete a WSAS of 300 controllers and 500 controls. This will be expanded in years two and three to 1000 aviremic and 1000 viremic controllers, as well as progressors. In years four and five, KSSI will perform extensive re-sequencing to identify the causal allele by resequencing the genomic region and dense genotyping of all known genetic variation in the region of interest across all samples. In years four and five, the KSSI will also perform replication experiments in 3000 ACTG samples to determine the degree of effect of the initially identified genes on viral load. Resequencing will occur using the Sequenom genotyping platform. Throughout the project, Drs. Physician and Scientist will participate with the MGH co-Investigators in monthly conference calls and/or meetings to discuss progress and analyze data. In addition, on an annual basis they will provide the MGH PI with written progress reports for inclusion in the annual report and then final report to the sponsor.

Weak SOW Example #1

Dr. Sam Scientist, at the University of Hawaii, has extensive experience with avian influenza infection at clinical bases in addition to his numerous researches on the avian influenza in basic studies. Dr. Scientist as a Program Director will be responsible for projects in Hawaii, in particular focusing on investigations using avian flu by comparing with human influenza viruses.

Weak SOW Example #2

Dr. Patrick Physician, at the Beth Israel Deaconess Medical Center in Boston, has an extensive experience and knowledge in alveolar macrophage research, as he is a Director of Lung Macrophage Research Laboratory. He will be responsible for providing expertise about lectins on human alveolar macrophages and viral infection at the cellular level.

ODU Smarter Proctoring

 

When we set up the SmarterProctoring options for the exams, we as faculty get to designate which proctoring options we consider acceptable.  The ones I generally recommend are:

  • Institution Testing Centers [this includes our own College of Sciences Testing Center in the Perry Library]
  • NCTA Testing Centers
  • Professional Testing Centers
  • Military Base Testing Center
  • High School Testing Center
  • Testing Administrator (College, University or Private Testing Service)
  • Librarian (Public Library)
  • Military Personnel
  • Professional Education Center
  • Live Online Proctoring    [a SmarterProctoring staff member watching them through their webcam and screen capture software]

 

The SmarterProctoring staff at ODU can assist students with finding a proctor convenient to their geographic area.  

 

Thursday, September 19, 2024

Python for data science, Murach book

 

https://www.murachforinstructors.com/shop/murach-s-python-for-data-science-2nd-edition-detail

Murach’s Python for Data Science (2nd Edition)

by Scott McCoy
15 chapters, 588 pages, 240 illustrations
Published May 2024
ISBN 978-1-943873-17-3

Murach’s Python for Data Science starts by covering everything your students need to hit the ground running when using Python for data science. First, it presents a crash course in using the Pandas and Seaborn libraries for data analysis and visualization. Then, it presents a thorough course in data analysis, including how to use the Scikit-learn library to create statistical models that make predictions. Finally, it presents four real-world case studies that tie all the coursework together.
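
For context, here is a minimal sketch in the spirit of the pandas → seaborn → scikit-learn workflow the blurb describes. The "tips" example dataset and the simple linear model are illustrative choices, not taken from the book (loading the dataset fetches it over the network on first use):

```python
# Hypothetical mini-workflow: load data, visualize it, then fit a predictive model.
import pandas as pd
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load one of seaborn's example datasets.
tips = sns.load_dataset("tips")

# Quick visualization of the relationship we want to model.
sns.scatterplot(data=tips, x="total_bill", y="tip")

# Fit a simple statistical model that makes predictions.
X_train, X_test, y_train, y_test = train_test_split(
    tips[["total_bill", "size"]], tips["tip"], random_state=0)
model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```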

Now available as a Canvas course!

The Canvas course file contains all the objectives, quizzes, assignments, and slides that you need to run an effective course. It only takes a few clicks to import it into the Canvas LMS, and you can then customize it for your course.

“The text was perfect for my class. It provided a solid foundation for my students in using the Pandas and Seaborn libraries. I really appreciated the four case studies. They were a big help for my students as they illustrated all phases of data analysis and visualization.”

J. Jasperson – Texas A&M University


Wednesday, September 18, 2024

alphafold and viral evolution

 

Where did viruses come from? AlphaFold and other AIs are finding answers


https://www.nature.com/articles/d41586-024-02970-w?utm_source=Live+Audience&utm_campaign=f4bc88ad07-nature-briefing-daily-20240917&utm_medium=email&utm_term=0_b27a691814-f4bc88ad07-49233615

climate change, mosquito borne disease

 


Mosquito-borne disease cases rising

Rising temperatures and heavy rainfall are turning Europe into a breeding ground for mosquito-borne diseases, researchers warn. New figures show there have been 715 locally acquired cases of West Nile virus across 15 European countries this year. Climate change is creating cosy conditions for Culex pipiens and Aedes albopictus mosquitoes in places they couldn’t previously thrive, extending their range and the transmission period of the diseases they carry. “We are faced with a problem where new places could become hotspots of transmission that were not prepared for this before,” says genetic epidemiologist Houriiyah Tegally.


Llama 3.1 fine-tuning

Fine-tuning large models like Meta's Llama 3.1, especially the largest 405B-parameter version, requires substantial hardware resources. Here's an estimate of the recommended hardware requirements:

1. GPU (VRAM) Requirements:

For the 405-billion-parameter model, the VRAM requirement is significant because the model needs to fit into GPU memory during training. Typically:

  • VRAM requirement: For inference, the weights alone take roughly 200 GB (4-bit quantized), 400 GB (8-bit), or 800 GB (bfloat16) of VRAM; full-precision float32 roughly doubles that again.

  • Multi-GPU setup: Since no single GPU currently offers such a large amount of VRAM, a multi-GPU setup with 8 GPUs, such as NVIDIA A100 (80GB) or H100, is recommended. With model parallelism, you can spread the model across several GPUs.

    GPU Recommendation:

    • 8x NVIDIA A100 (80GB) or NVIDIA H100 (80GB, or 94GB in the NVL variant), with NVLink for fast inter-GPU communication.
    • Alternatively, 4x NVIDIA H100 NVL (94GB) could work for heavily quantized setups, but 8 GPUs will reduce training time.
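
As a rough sanity check, the back-of-the-envelope arithmetic behind these VRAM figures can be sketched as follows. This is a simplified estimate that counts only the model weights; real fine-tuning also needs memory for gradients, optimizer states, and activations:

```python
# Rough weight-memory estimate for a large language model.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the memory (in GB) needed just to hold the model weights."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

params = 405e9  # Llama 3.1's largest published size (405B parameters)
for precision in BYTES_PER_PARAM:
    print(f"{precision:>8}: ~{weight_memory_gb(params, precision):,.0f} GB of weights")

# A common rule of thumb (see the RAM section below) is to provision roughly
# 4-6x this amount in system RAM for data handling and checkpointing.
```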

2. CPU Requirements:

While GPU handles most of the fine-tuning, the CPU is crucial for data preprocessing and managing the training pipeline.

  • Recommended CPU: A high-core-count CPU like the AMD EPYC or Intel Xeon line would be ideal.
  • Cores and Threads: Aim for 64 cores or more. Large LLaMA models benefit from parallelized data loading and preprocessing, so multiple threads are essential to prevent bottlenecks.

3. RAM Requirements:

LLaMA models require a substantial amount of system RAM for the CPU to handle data efficiently.

  • RAM requirement: For fine-tuning the 405B model, you'll need at least 1.5 TB to 2 TB of RAM.
  • The general rule is that you'll need 4x to 6x the VRAM in system RAM, especially for large-scale models.

4. Storage:

  • Fast storage is crucial. NVMe drives with high read/write speeds are recommended.
  • You'll need at least 10-20 TB of storage to store datasets, checkpoints, and logs during the fine-tuning process. For ultra-fast read/write during training, NVMe SSDs are preferred.
  • RAID arrays or direct-attached storage (DAS) with NVMe drives for even higher throughput are an option for larger datasets.

5. Networking:

  • If you're using multiple nodes (for distributed fine-tuning), you'll want high-speed networking between nodes, such as InfiniBand or 10/100 Gbps Ethernet, for fast data transfers and synchronization.

6. Power Supply:

  • The GPUs and high-core CPUs will require significant power. Depending on the number of GPUs, 3000-5000W power supplies, preferably redundant, are necessary to handle the load reliably.

7. Cooling:

  • Effective cooling, such as liquid cooling or advanced air cooling, is needed for the GPUs and CPU to run optimally without throttling.

Summary of Requirements:

  • GPU (VRAM): 8x NVIDIA A100 (80GB) or 8x H100 (80GB, or 94GB NVL) GPUs.
  • CPU: 64-core or more, high-end AMD EPYC or Intel Xeon.
  • RAM: 1.5 TB to 2 TB.
  • Storage: 10-20 TB NVMe storage for fast read/write.
  • Networking: InfiniBand or 10/100 Gbps Ethernet for distributed setups.
  • Power Supply: 3000-5000W with redundancy.
  • Cooling: Liquid cooling recommended for stability under long fine-tuning runs.

If you're considering fine-tuning the Llama 3.1 405B model, make sure your system can handle these requirements.

Sunday, September 15, 2024

top issues to consider for AI application in healthcare:

 Based on the search results, here are the top issues to consider for AI application in healthcare:


## Data Privacy and Security


Healthcare AI requires access to vast amounts of sensitive patient data, raising significant privacy and security concerns[5][6]. There are risks of data breaches, unauthorized access, and potential misuse of personal health information. Ensuring robust data protection measures and compliance with privacy regulations is crucial.


## Algorithmic Bias and Fairness 


AI systems trained on historical data may perpetuate or exacerbate existing biases and health disparities, particularly affecting minority groups and underrepresented populations[5][7]. Addressing bias in algorithms and ensuring fairness across diverse patient groups is a major challenge.


## Clinical Integration and Adoption


There are significant challenges in integrating AI tools into existing clinical workflows and electronic health record systems[5]. Gaining physician buy-in, overcoming resistance to change, and ensuring usability are key considerations for successful adoption.


## Transparency and Explainability


Many advanced AI systems, especially deep learning models, operate as "black boxes," making it difficult to understand how they arrive at decisions or recommendations[2][3]. This lack of transparency raises concerns about accountability and trust in AI-generated insights.


## Ethical and Regulatory Considerations


The use of AI in healthcare raises various ethical issues around patient autonomy, informed consent, and accountability for AI-assisted decisions[6]. Navigating the evolving regulatory landscape for healthcare AI is also a significant challenge.


## Data Quality and Interoperability


Ensuring high-quality, diverse, and interoperable data for training and validating AI models is crucial[1][5]. Differences in data formats and completeness across healthcare systems pose challenges for developing robust AI solutions.


## Patient Trust and Expectations


Building patient trust in AI technologies and managing expectations about their capabilities and limitations is important[6]. Clear communication about the role of AI in patient care is essential.


## Validation and Ongoing Monitoring


Rigorous validation of AI models in real-world clinical settings and continuous monitoring of their performance over time are necessary to ensure safety and effectiveness[7].


## Liability and Accountability


Determining responsibility and liability in cases where AI systems contribute to medical errors or adverse outcomes is a complex issue that needs to be addressed[3].


## Workforce Impact and Training


Considering the impact of AI on healthcare jobs and ensuring proper training for healthcare professionals to effectively use AI tools are important considerations[5].


Citations:

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6616181/

[2] https://emeritus.org/blog/healthcare-challenges-of-ai-in-healthcare/

[3] https://www.medpro.com/challenges-risks-artificial-intelligence

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8966801/

[5] https://prsglobal.com/blog/6-common-healthcare-ai-mistakes

[6] https://www.thomsonreuters.com/en-us/posts/technology/ai-usage-healthcare/

[7] https://psnet.ahrq.gov/perspective/artificial-intelligence-and-patient-safety-promise-and-challenges

[8] https://www.brookings.edu/articles/generative-ai-in-health-care-opportunities-challenges-and-policy/

Saturday, September 14, 2024

traffic prediction

 

https://arxiv.org/abs/2409.03282

Interpretable mixture of experts for time series prediction under recurrent and non-recurrent conditions


temporal fusion transformer (TFT)

Time Features:

  • Time of day (cyclic encoding)
  • Day of week (cyclic encoding)
  • Month of year (cyclic encoding)
  • Holiday indicator (binary)
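
A minimal sketch of what the cyclic encodings listed above might look like in practice (assuming pandas; the timestamp column and calendar logic are hypothetical):

```python
# Cyclic (sine/cosine) encoding maps periodic features onto a circle so that,
# e.g., 23:00 and 00:00 end up close together instead of numerically far apart.
import numpy as np
import pandas as pd

def cyclic_encode(values: pd.Series, period: float) -> pd.DataFrame:
    """Return sin/cos components for a periodic feature."""
    angle = 2 * np.pi * values / period
    return pd.DataFrame({f"{values.name}_sin": np.sin(angle),
                         f"{values.name}_cos": np.cos(angle)})

# Hypothetical hourly timestamps.
df = pd.DataFrame({"timestamp": pd.date_range("2024-01-01", periods=48, freq="h")})
ts = df["timestamp"].dt
features = pd.concat([
    cyclic_encode(ts.hour.rename("hour"), 24),        # time of day
    cyclic_encode(ts.dayofweek.rename("dow"), 7),      # day of week
    cyclic_encode(ts.month.rename("month"), 12),       # month of year
], axis=1)
features["is_holiday"] = 0  # binary indicator (would come from a holiday calendar)
print(features.head())
```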

The Temporal Fusion Transformer (TFT) is a deep learning model designed for multi-horizon time series forecasting, which means it predicts future values over multiple time steps. It is particularly useful in scenarios involving complex temporal dependencies and multiple input features. TFT is both powerful for prediction and interpretable, making it stand out among time series forecasting models.

Key Components of TFT:

  1. Variable Selection:

    • TFT dynamically selects relevant features (both static and time-varying) that are important for making predictions at each time step. This is done using a gated residual network which assigns importance weights to different features. It allows the model to focus on the most relevant inputs for prediction, enhancing interpretability.
  2. LSTM-based Encoder-Decoder:

    • TFT employs long short-term memory (LSTM) networks, a type of recurrent neural network, for encoding past data (in a context window) and decoding future data (in the prediction window). The LSTM captures temporal patterns from the input data, which are crucial for accurate forecasting.
  3. Multi-Head Attention:

    • One of the standout features of TFT is the use of multi-head attention, inspired by the Transformer model. This mechanism helps the model focus on different parts of the time series and various time steps. Attention helps identify important temporal dependencies, such as sudden changes or long-term trends, at multiple time points.
  4. Gating Mechanisms:

    • TFT uses gating mechanisms throughout the model to regulate how information flows through its layers. These gates help prevent irrelevant information from propagating forward, improving efficiency and reducing noise in predictions.
  5. Quantile Regression:

    • Instead of just predicting a single point estimate, TFT can output quantile predictions (e.g., predictions at the 10th, 50th, and 90th percentiles), making it possible to estimate uncertainties in the forecast. This is particularly helpful when making forecasts under uncertain or volatile conditions.
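
To make item 5 concrete, here is a minimal sketch of the quantile (pinball) loss commonly used to train quantile forecasters like TFT (plain NumPy; the numbers are hypothetical):

```python
# Pinball (quantile) loss: under-prediction is penalized more when the target
# quantile q is high, and over-prediction more when q is low.
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Average quantile loss for quantile level q in (0, 1)."""
    error = y_true - y_pred
    return np.mean(np.maximum(q * error, (q - 1) * error))

y_true = np.array([52.0, 47.0, 60.0])        # e.g., observed traffic speeds
y_pred_p10 = np.array([40.0, 38.0, 45.0])    # 10th-percentile forecast
y_pred_p90 = np.array([65.0, 58.0, 72.0])    # 90th-percentile forecast

print("q=0.1 loss:", pinball_loss(y_true, y_pred_p10, 0.1))
print("q=0.9 loss:", pinball_loss(y_true, y_pred_p90, 0.9))
```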

Interpretability in TFT:

TFT is designed with interpretability in mind. Two main methods of interpretation are:

  1. Feature Importance: TFT quantifies the importance of each input feature in predicting the target value. This allows users to understand which features, such as weather conditions, traffic incidents, or the time of day, play the most crucial role in predictions.
  2. Temporal Attention: By utilizing multi-head attention, TFT can show which time steps in the past (within the context window) are the most influential for making predictions at future time steps.
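
As a rough illustration of how attention weights can be read as temporal importance, here is a toy scaled dot-product attention sketch in NumPy (random data; not TFT's actual implementation):

```python
# Toy attention over a short context window: the resulting weights can be read
# as "how much each past time step matters" for the current prediction.
import numpy as np

def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and each key."""
    scores = keys @ query / np.sqrt(query.shape[0])
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
d = 8                                 # embedding size
context = rng.normal(size=(12, d))    # representations of 12 past time steps
query = rng.normal(size=d)            # representation of the step being predicted

w = attention_weights(query, context)
print("attention over past steps:", np.round(w, 3))
print("most influential past step:", int(np.argmax(w)))
```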

Why TFT is Suitable for Traffic Prediction:

  • Capturing Complex Temporal Dependencies: Traffic patterns often involve recurring trends (like rush hours) as well as non-recurring events (like accidents or severe weather). TFT’s attention mechanism helps capture both short-term and long-term dependencies between these events and traffic speed.
  • Interpretability: Understanding the factors that influence traffic speeds, such as weather or incidents, is crucial for decision-making. TFT’s interpretability allows for insights into how these features affect predictions in different conditions (recurrent vs. non-recurrent).
  • Multi-Source Inputs: TFT can efficiently handle multiple sources of data (like traffic incidents, weather conditions, etc.), making it well-suited for multi-variable prediction problems like traffic speed forecasting.

In this paper, TFT is used as the backbone for expert models in both recurrent and non-recurrent traffic prediction, benefiting from its ability to handle temporal dependencies and provide interpretability.

Friday, September 13, 2024

ODU travel form, travel policy

 

ODU travel form                                                                                                                             

If your conference requires travel, please do not forget to also complete the ATA form.  


Instructions on Travel Policies (See also attached flowchart)

S. Kuhn, August 2024 (no guarantee of correctness)

There are specific rules concerning ALL professional travel (i.e., any travel that is not strictly personal, like vacations, but in your capacity as an ODU or ODURF-affiliated employee), both by you and by your students/postdocs (and visitors), whether on grant money or state money. Make sure you learn about these rules before making travel arrangements (including buying airline tickets), either from your GCA (grant administrator) at ODURF or from the department staff (for state-supported travel). At a minimum, keep in mind:

- For ALL travel abroad (ODU-supported and ODURF-supported), University Policy 1007 (https://ww1.odu.edu/about/policiesandprocedures/university/1000/1007) requires you to register through the ODU Travel Registry System https://travelregistry.odu.edu/.

- If your research is sponsored by the US Dept. of Energy or NIH, travel that is (partially) paid for, sponsored, reimbursed, or otherwise funded by outside entities may have to be reported on the COI interface of the ODURF Portal https://hera.odurf.odu.edu/RFPortal, even outside the annual COI disclosure cycle (typically after travel but close to completion of the trip). Professional societies, foreign institutions of higher education, for-profit entities, and non-profit entities are a few examples of outside entities that require disclosures. Travel paid for by the University; federal, state, or local government agencies; US higher education institutions; US academic teaching hospitals; or US medical centers or research institutes affiliated with a US institution of higher education generally does NOT need to be disclosed.

- For STATE-SUPPORTED travel, you (and/or your students, postdocs, visitors…) MUST get prior authorization through Chrome River.

- For practically all travel supported/reimbursed by ODURF (including from discretionary or incentive accounts), you must fill out and submit a request for an Advance Travel Authorization (ATA) and receive an ATA number.*) This includes all international travel and any domestic travel supported by your grant or discretionary account if the travel is overnight or if you plan to rent a car. Go to https://researchfoundation.odu.edu/forms/, scroll to the bottom, and click on “Advance Travel Authorization (ATA) Request Form”. Fill out the form in Adobe and submit it*) to your Grant Officer (GCA). If you (or your postdoc/student/…) are an employee of ODU (as opposed to only ODURF), the form will require your immediate ODU supervisor’s (usually the chair’s) signature – send it to the chair after you have completed it. Note that you must have an “export control checklist” on file for your grant and/or fill one out for all international travel. Note that there is a 2nd page – you must fill that out for international travel! Also be advised that the approval process for international travel through ODURF can be rather lengthy (2 weeks is by no means atypical), as the Director of Secure Research and Regulated Activities (presently David Flanagan) will have to check and confirm that there are no concerns connected with your travel. This is true even if you travel to another NATO country or western democracy. Plan accordingly and complete all of the steps listed above as early as possible.

*) The only exception is travel supported by the DEPARTMENTAL discretionary account, in which case the Department Chair (as the PI) will submit the ATA form for you – you still need to fill it out completely, though!

- It is recommended that your travel bookings (in particular international airline tickets) go through “CI Azumano Travel”. They are contracted with ODURF and can issue a ticket at no cost to you once you have an ODURF ATA#. CI Azumano will need confirmation from your GCA once you have approved the itinerary; they will directly invoice ODURF for the cost of the ticket. Of course, you can also book your travel yourself and get reimbursed. However, in particular for international travel, you must make sure to follow all federal rules, like the “Fly America Act” and export control regulations (even if all you “export” is yourself!). Your GCA will help you with this – this can take some time, so start early! Some other costs that you have to prepay (registration fees, hotel, etc.) can be submitted for reimbursement right away through a Travel Advance/Expense Report form.

- After concluding any grant-supported travel, you must fill out (or complete, if you received any advances) the Travel Advance/Expense form.

- For state-supported travel, including travel by students, postdocs, or visitors (e.g., colloquium speakers), you must get approval through Chrome River. Furthermore, you must have receipts for everything that you plan to get reimbursed for (including each meal, taxi cab fare, etc.). You even have to document which hotel you stayed in (whether the State pays for it or not!) and your flight itinerary and boarding passes. Also, you cannot exceed the officially approved per diem rate for meals and lodging, even if you have receipts. (All of this includes travel by visitors that you want to have reimbursed.) In other words: work closely with Department staff both before and after your travel!

- Keeping all receipts is a good idea for any kind of travel.

Here is the ODURF web page with all of their travel policies: https://researchfoundation.odu.edu/travel/. Ask your GCA if you have any questions.

Instructions on ODURF procurement

Any direct purchase of items or services (including tickets, registration, publication fees, etc.) from your grant requires a purchase requisition. A separate requisition must be submitted for each order/vendor. (You can get reimbursement for travel expenditures from your own private funds before or after your travel through the Travel Advance/Expense Report at https://researchfoundation.odu.edu/wp-content/uploads/2020/10/Travel-Advance-Expense-Report.pdf.)

The fastest way to initiate a purchase requisition from ODURF is to go to https://researchfoundation.odu.edu/forms/ and pick the form “Purchase Requisition” (see instructions there if you don’t have access). The form is mostly self-explanatory. More detailed instructions can be found at https://researchfoundation.odu.edu/procurement/.
