Day 5. Unified Analytical Group Projects

On the final day of this bootcamp, we will team up with the other members of our table and apply what we have learned this past week to explore a large-scale biological data set collaboratively. We will use data from the Geuvadis Project, which combines data on genetic variation from the 1000 Genomes Project with gene expression measurements derived from RNA sequencing (Lappalainen et al., Nature 2013), to detect and visualize potential expression quantitative trait loci (eQTLs).

Generating the data and analyzing the results for a research project of this kind typically takes a very long time. For the purposes of this bootcamp, we will use a small subset of these data and attempt to recreate the published results for the regions that subset covers. Our goal is to give you a taste of the kinds of data exploration now available to you with the simple yet powerful biocomputing tools you have learned, and to provide a foundation for your future research endeavors.


Schedule:

Session  Time               Topics
I        9:00-10:15 AM      Introduction to eQTLs and Overview of Project
         10:15-10:30 AM     Coffee Break
II       10:30 AM-12:00 PM  Obtaining, Parsing and Formatting Data
         12:00-1:00 PM      Lunch
III      1:00-2:15 PM       Parallel Association Testing and Visualization
         2:15-2:30 PM       Coffee Break
IV       2:30-4:00 PM       Group Presentations and Discussion


Instructors:

Ryan Mills (RM)
Jacob Kitzman (JK)
Barry Grant (BG)
Hui Jiang (HJ)

Class Questionnaire

Please help us improve this course by completing this questionnaire.


Data Sets:

  • Genotype data for 465 individuals
    • Remote site
    • Local FLUX directory: /scratch/biobootcamp_fluxod/remills/bioboot/geuvadis/genotypes
  • Expression data for 465 individuals
    • Remote site
    • Local FLUX directory: /scratch/biobootcamp_fluxod/remills/bioboot/geuvadis/analysis_results


Project Resources


Analysis notebooks


  • Using what you’ve learned over the previous days, log in to FLUX, navigate to your personal /scratch folder, and make a new directory called ‘day5’.
  • Download today’s ipython notebook into this directory using wget:
    • wget https://raw.githubusercontent.com/bioboot/web-2016/gh-pages/class-material/Day5.ipynb
  • For this exercise, we will need to run ipython notebook on FLUX. As on Day 4, we can use an internal University of Michigan tool called ARC Connect to do all of this for us. Navigate to this URL and log in with your UM account. When prompted, complete the 2-factor authentication. From the ARC Connect screen, choose:

    • Select biobootcamp_fluxod under Account
    • Select Jupyter Notebook under Session type.
    • All other values can remain at default
    • Press Submit your job and wait for it to be allocated (it might take a few minutes)
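
The directory setup and notebook download steps listed earlier can be sketched as two small shell functions. This is only an illustration, not an official course script: the function names make_day5_dir and fetch_notebook and the variable SCRATCH_BASE are hypothetical, and the default scratch location is an assumption based on the pattern of the shared data directories above. Substitute your actual personal /scratch folder if it differs.

```shell
#!/usr/bin/env bash
# Assumption: personal scratch folders follow the same pattern as the shared
# data paths (/scratch/biobootcamp_fluxod/<username>). Override SCRATCH_BASE
# if your folder lives elsewhere.
SCRATCH_BASE="${SCRATCH_BASE:-/scratch/biobootcamp_fluxod/$USER}"

# URL of today's notebook, as given in the instructions above.
NOTEBOOK_URL="https://raw.githubusercontent.com/bioboot/web-2016/gh-pages/class-material/Day5.ipynb"

# Create a 'day5' directory under the given base folder and print its path.
# (Hypothetical helper name, used here for illustration only.)
make_day5_dir() {
    mkdir -p "$1/day5" && printf '%s\n' "$1/day5"
}

# Download the notebook into the given directory using wget.
# The subshell keeps the caller's working directory unchanged.
fetch_notebook() {
    (cd "$1" && wget "$NOTEBOOK_URL")
}
```

Once logged in to FLUX, these could be run as `day5_dir=$(make_day5_dir "$SCRATCH_BASE") && fetch_notebook "$day5_dir"`, after which Day5.ipynb should be available when the Jupyter session starts.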