BIO 462: Computational Cancer Biology
BIO 462 is a class designed to teach computational cancer biology using Python and real cancer datasets.
The module demonstrates how modern omics data are used in precision oncology.
Each lesson explores fundamental cancer concepts and supporting molecular data. Below are links to the Google Colab notebooks for homework assignments
1 through 6. Each lesson contains a week’s worth of instructional material, and the computational exercises in each
notebook are estimated to take approximately 6 hours. Each Colab notebook begins with a section describing the topic and relevant literature
to be discussed in the classroom. As the module is completely online, it is appropriate for either traditional in-person instruction
or online courses.
- Lesson 1: Introduction to Cancer Datasets
- Introduces students to cancer datasets through the use of Pandas dataframes.
Teaches basic dataframe manipulation and cancer clinical data exploration.
- Lesson 2: Missense Mutation
- Teaches students to identify and interpret missense mutations.
Integrates protein domain classification with CPTAC data using the UniProt API.
- Lesson 3: Truncation Mutation
- Teaches students analysis techniques for identifying mutations in genes and how those mutations can lead to truncated proteins.
Introduces extended UniProt integration and visualization of truncating mutations with lollipop plots.
- Lesson 4: Copy Number Variation
- Teaches students the difference between focal and arm-level CNV events and how they affect individuals versus populations.
Teaches visualization techniques for identifying frequent events in a population.
- Lesson 5: Transcriptomics
- Teaches students how to identify differentially expressed transcripts through t-testing with a t-test wrapper function.
Teaches students how to perform a pathway enrichment analysis on transcriptomics data through GSEA/Enrichr and g:Profiler.
- Lesson 6: Proteomics
- Teaches students to identify differentially expressed proteins and co-expression networks.
Introduces advanced UniProt integration.
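
To give a flavor of the per-gene t-testing described in Lesson 5, here is a minimal sketch of a t-test wrapper over a pandas dataframe. It is not the function used in the course notebooks: the name `per_gene_ttest`, the gene names, the sample IDs, and the expression values are all hypothetical, and the choice of Welch's t-test with a Bonferroni cutoff is an assumption made for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats


def per_gene_ttest(expr, labels, group_a, group_b, alpha=0.05):
    """Run Welch's t-test on every column of `expr` (hypothetical helper).

    expr   : DataFrame with samples as rows and genes as columns
    labels : Series of group names sharing the index of `expr`
    Returns one row per gene, sorted by p-value, with a Bonferroni flag.
    """
    a = expr.loc[labels == group_a].to_numpy()
    b = expr.loc[labels == group_b].to_numpy()
    # equal_var=False selects Welch's t-test; missing values are skipped
    t_stat, p_val = stats.ttest_ind(
        a, b, axis=0, equal_var=False, nan_policy="omit"
    )
    result = pd.DataFrame(
        {"t_stat": np.asarray(t_stat), "p_value": np.asarray(p_val)},
        index=expr.columns,
    )
    # Bonferroni correction: divide alpha by the number of genes tested
    result["significant"] = result["p_value"] < alpha / len(result)
    return result.sort_values("p_value")


# Toy data (made up): five tumor and five normal samples, two genes
samples = [f"S{i}" for i in range(1, 11)]
expr = pd.DataFrame(
    {
        "GENE_A": [8.1, 7.9, 8.3, 8.0, 7.8, 5.0, 5.2, 4.9, 5.1, 4.8],
        "GENE_B": [3.0, 3.2, 2.9, 3.1, 3.0, 3.1, 2.9, 3.0, 3.2, 3.0],
    },
    index=samples,
)
labels = pd.Series(["Tumor"] * 5 + ["Normal"] * 5, index=samples)

result = per_gene_ttest(expr, labels, "Tumor", "Normal")
print(result)
```

In this toy example, `GENE_A` is clearly higher in the tumor group and is flagged as significant, while `GENE_B` has nearly identical values in both groups and is not. With real transcriptomics data, thousands of genes are tested at once, so some form of multiple-testing correction matters before the results are passed to a pathway enrichment tool.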