COMP 5960/6960 - 091 Introductory BioMedical Data Analysis (Python)

University of Utah

Semester: Spring 2026
Time: Self-Paced
Location: Online

Instructor: Tingying He (tingyinghe@sci.utah.edu)
Faculty Coordinator: Jeff Phillips (jeff.m.phillips@utah.edu)

General Information

This course provides an introduction to programming in Python, with topics and pace designed for biomedical students interested in data science. Prior programming experience is not required. Students will learn how to write code for handling data, focusing on dataframe representations. Using these common representations, students will learn to prepare data for analysis starting from various formats, visualize its contents, and perform basic analysis to evaluate data veracity. This course is structured as a series of stackable short courses; students need to complete four short courses in the semester to fulfill the requirements for this credit-earning course.

Short Courses

This is a composite course consisting of the four self-paced online short courses listed below.

You should complete all four of these courses sequentially.

If you finish a course early, you are welcome to start the next course (you do not need to wait until after the previous course’s due date).

1. Introduction to Python for Data Analysis

Description: This course introduces the Python programming language, with a focus on using Python for data analysis, and is designed for beginners who are new to Python and coding. Specific topics include using VS Code, working with Jupyter notebooks, basic coding principles, defining variables, loading libraries, working with pandas DataFrames, and visualizing data using the seaborn library. This course involves self-paced learning with pre-recorded tutorials and practice exercises that can be completed at your convenience.

2. Machine Learning with Python

Description: This course is designed to introduce you to common Machine Learning (ML) algorithms that you can directly apply to compute predictions on real-world data. Machine learning involves building models that learn patterns from data in order to make predictions or decisions—often automatically—on new, unseen data.

In this course, we will focus on biomedical data. For example, you might use the skills from this course to predict patient outcomes based on hospital records, identify individuals at risk for a particular condition, or estimate the likely success of a treatment for new patients based on past results. The goal is to develop models that generalize well from historical data to future cases.

However, real data is often messy, incomplete, and biased. The performance and fairness of your ML models depend not only on the algorithms you use but also on the quality and representativeness of your data. This course will teach you both the technical tools and the critical thinking skills needed to build trustworthy ML models—from data preprocessing and feature engineering to model training, validation, and evaluation. You’ll also learn to recognize the limitations of your models and to assess when predictions may or may not be reliable in practice.

3. Data Science Ethics

Description: The course is structured around two main topics: data and algorithms. We will explore ethical issues from the perspective of both the data subject (the individual whose data is collected) and the data scientist (the person responsible for handling this data). The first part of the course focuses on the rights and responsibilities related to personal data, while the second part addresses the ethical considerations surrounding algorithmic decision-making, such as the accuracy and fairness of algorithms that impact real-world decisions.

4. Unsupervised Learning with Python

Description: This course introduces the basics of unsupervised learning, also known as data mining: analyzing data without predefined labels to guide prediction and understanding. This more challenging setting still allows for clustering and dimensionality reduction. The course covers both center-based clustering, such as k-means, and density-based clustering, such as DBSCAN. It then covers both linear (such as principal component analysis) and non-linear (such as UMAP) dimensionality reduction approaches. Students will learn how to apply these techniques using the most common Python libraries, and will receive guidance on when to apply each one.

Grading

Grade: Each short course contains a number of projects, which appear at the end of each “module”.

Your grade for each short course will be based on the average of your individual project scores.

Your grade for the entire composite course will correspond to the average score you receive across the four short courses.

Late policy: If you do not complete a short course by its due date (see above), you will lose 2% of your grade for that short course per day late, unless the instructor grants permission.

Final exam: There is no final exam for this course.
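
Worked example: The grading rules above can be sketched in a few lines of Python. This is only an illustration, not the official grade calculation. It assumes that scores are on a 0-100 scale, that the late penalty is subtracted as 2 percentage points per day (the policy could also be read as a relative 2% reduction), and that a grade cannot drop below zero. The function names and all scores are hypothetical.

```python
from statistics import mean

# Assumption: the 2% late penalty is read as 2 percentage points per day late.
LATE_PENALTY_PER_DAY = 2.0


def short_course_grade(project_scores, days_late=0):
    """Average the project scores of one short course, then apply the late penalty."""
    base = mean(project_scores)
    penalty = LATE_PENALTY_PER_DAY * max(days_late, 0)
    # Assumption: a grade is floored at zero rather than going negative.
    return max(base - penalty, 0.0)


def composite_grade(short_course_grades):
    """Average the grades of the four short courses."""
    return mean(short_course_grades)


# Hypothetical scores for illustration only.
on_time = short_course_grade([90, 80, 100])               # 90.0
late = short_course_grade([90, 80, 100], days_late=3)     # 84.0
overall = composite_grade([on_time, late, 70.0, 100.0])   # 86.0
```

In this sketch the penalty is applied per short course before the four grades are averaged, following the order in which the rules are stated above.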