How do we do Bioinformatics?

A description of the computational setup for “Foundations of Bioinformatics” (BGGN-213).

Notes:

This third screen-cast addresses the question of how we actually do bioinformatics.

Many of the best-known bioinformatics tools and databases are online and free to use, and we will start this course with these online tools and resources. However, we will quickly find that bioinformatics often requires analyzing large, complex datasets, and it is simply not feasible to do this online. The recommended approach to more advanced analysis, particularly of larger datasets, is to work on a computer that provides a Unix environment.

Unix is an operating system that is particularly well suited to working with bioinformatics data files. It has many powerful and flexible commands that can be strung together to process your data for you. The real strength of learning Unix is that most of these commands can be combined in an almost unlimited fashion, so if you learn just five Unix commands, you will be able to do a lot more than just five things. Our objective here is to learn a subset of Unix and to become productive Unix users without knowing or using every program and feature.
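
As a minimal sketch of what "strung together" means in practice, the commands below chain a few standard Unix tools with the pipe symbol (|), which sends the output of one command to the next. The file name example.fasta and the three short sequences in it are made up purely for illustration; they are not course data.

```shell
# Create a tiny FASTA file with made-up sequences (illustrative data only).
printf '>seq1\nATGCGT\n>seq2\nGGCTA\n>seq3\nATGCGT\n' > example.fasta

# Count the sequences: grep keeps the header lines (those containing '>'),
# and wc -l counts how many lines it was given.
grep '>' example.fasta | wc -l

# Tally identical sequences: grep -v drops the header lines, sort places
# identical sequences next to each other, and uniq -c counts each distinct one.
grep -v '>' example.fasta | sort | uniq -c
```

Each command here does one small job. The same four tools (grep, wc, sort, uniq) could be recombined in other orders to answer different questions, which is the point made above about learning only a handful of commands.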

In this course we will also introduce you to the R programming language and environment for data analysis and graphics. R is an extremely powerful tool for exploring and understanding data in an open-ended, highly interactive, iterative way. Learning R will give you the freedom to experiment and solve problems during data analysis, which is exactly what we need as bioinformaticians and data scientists.

To overcome the bottleneck of limited computing resources, we can turn to high-performance computing. In this course we will learn to use cloud computing and the supercomputing resources offered by XSEDE, which provide much faster processing, larger storage capacity, wider availability of resources, and often unique computing capabilities.



Other screen-cast videos

Using Jetstream for Bioinformatics

Introduction to the Jetstream on-demand virtual machine system.

What is Bioinformatics?

What will students learn in this course?