CPSC 301 - Computing for the Life Sciences

2016W2 Section 201

Syllabus

External quicklinks: Connect, Piazza

Course Description: Basic concepts, tools, and techniques for working with scientific data at larger scales, higher speeds, and lower costs than would otherwise be possible. Applications and examples are drawn from the life sciences. No prior computing background is required.

Meeting Times: Tuesday, Thursday, 9:30 AM – 11:00 AM
First Class: Tuesday January 3, 2017
Location: DMP 310
Instructor: Jessica Dawson (jqdawson@cs.ubc.ca)
Instructor’s Office Location: ICCS 227
Instructor’s Office Hours: Thursdays 11:30 AM – 12:30 PM, or by appointment
TAs (usual labs):

  • Angela Chang (L2B, L2C)
  • Aaron Monga (L2B, L2D)
  • Alex Wong (L2B, L2C)
  • Adam Ziada (L2A, L2E, L2D)
  • Bo Hu (L2E, L2C)
  • David Johnson (L2A, L2F)
  • Jolene Wong (L2E, L2F)
  • Oliver Zhang (L2D)
  • Paul McDade (L2A, L2F)
  • Racquel Singh (Lectures, L2D)
You can reach your TAs in Piazza via private message.

Prerequisites: Third-year standing or higher.

Course Communication: Most information will be disseminated through this web site or Piazza. Students are expected to check Piazza regularly for announcements, and this website for updated course content.

  • We'll use Piazza for the course discussion board. This is the best forum for asking questions about logistics or course content. The instructor and TAs will monitor the discussion board regularly, and will respond more quickly to questions posted there than to questions asked by email. Students are expected to monitor the Piazza discussions regularly as well.
  • Questions regarding lab work can be directed to your TA during the labs or via a private message on Piazza. Questions regarding lab grading should be directed to the TA first.
  • For personal issues talk to or email your course instructor.

Tentative Grading Scheme

In order to pass the course, a student must receive a passing grade on the final exam. It’s possible I’ll make changes to the exact percentage breakdowns shown here:
Course Element               Fraction of Grade
Clickers                     4%
In-class group exercises     4%
Reading Quizzes              4%
Labs                         25%
Midterm Exam (80 minutes)    20%
Final Exam (2.5 hours)       43%

To pass the course you must obtain a 50% overall grade in the course, and in addition you must:

  • obtain an overall grade of 50% for labs
  • pass the final examination

If you fail the labs or the final exam, your course grade will be the lower of 45% and the grade you would otherwise have earned in the course (including the labs and final exam).
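As a rough illustration of how these rules combine, here is a minimal Python sketch. It uses the tentative weights from the table above, which may change. The function name `course_grade`, the dictionary keys, and the assumption that "passing" a component means scoring at least 50% on it are illustrative only, not the official grade calculation.

```python
# Tentative weights from the grading scheme (percent of the course grade).
WEIGHTS = {
    "clickers": 4,
    "group_exercises": 4,
    "reading_quizzes": 4,
    "labs": 25,
    "midterm": 20,
    "final": 43,
}

PASS_MARK = 50.0  # assumption: passing a component means at least 50% on it
FAIL_CAP = 45.0   # cap applied when the labs or the final exam are failed


def course_grade(scores):
    """Return the course grade (0-100) from component scores given in percent."""
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0
    failed_required = scores["labs"] < PASS_MARK or scores["final"] < PASS_MARK
    if failed_required:
        # Lower of 45% and the grade otherwise earned.
        return min(FAIL_CAP, weighted)
    return weighted
```

For example, a student with 90% on every component except a 40% final exam would otherwise earn 68.5%, but under this rule the grade becomes 45%.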

Pre-class reading and quizzes

Before each lecture I will ask you to read the pages from the notes and the textbook that are relevant to the topics we will discuss in the lecture. You will complete a short quiz in Connect to test that you have understood the key terms and concepts. You'll have two attempts to complete each quiz - your score will be the average of the attempts. It is important to complete the pre-class reading and quiz to be prepared for lectures.

Lectures

Attendance and participation in lectures are an important part of the course! In the lecture, I will assume that you know the basic concepts from the assigned reading, and I will focus more on examples and your questions. I will use clicker questions to gauge your understanding of the concepts you read about, and I will discuss those that are not clear to you. Occasionally I will ask you to work with a partner on some exercises (or small problems) and submit your solution for marking shortly after class. I will regularly post the lecture notes on the course web site. You may download the PDF slides and bring them to class so that you can add your own annotations, along with any explanations and additional examples I present during the lecture.

Labs

There will be nine or ten labs. All of the labs have before-lab and in-lab sections. Some labs may also have some after-lab assignments. The labs are the main homework assignments in the course.

A lab is usually due at 9:00am on the Sunday that follows the day the lab started. We will NOT accept late submissions except in exceptional circumstances.

For more information on the labs see the labs home page on this site.

If you miss your lab section due to illness:

  • Try to attend a different lab section during the same week.
  • If you are unable to attend any lab section during a particular week, submit your solution through handin before the deadline if you can, and bring your work to your next lab section to demonstrate it to the TA. If you cannot submit the assignment on time, email it to your TA with an explanation.
  • If you are unable to complete a lab before the next lab starts, you should talk to your instructor. 

Exams

There is a midterm and a final in CPSC 301.

  • There will be no makeup for the midterm. The final exam will absorb the midterm's component of the grade, should the midterm be missed.
  • If you miss the final exam, you must follow the procedure of your Faculty to request a deferred exam. Note that you may be refused permission to sit a deferred exam if you have not demonstrated sufficient prior work in the course.
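To make the midterm rule concrete, the following small Python sketch shows how the exam weights would shift if the midterm is missed. It assumes the tentative weights from the grading scheme (midterm 20%, final 43%); the function name `exam_weights` is hypothetical.

```python
def exam_weights(wrote_midterm):
    """Return the exam weights (percent of course grade) for one student."""
    midterm, final = 20, 43  # tentative weights from the grading scheme
    if not wrote_midterm:
        # The final exam absorbs the midterm's share of the grade.
        final += midterm
        midterm = 0
    return {"midterm": midterm, "final": final}
```

Under these assumptions, a student who misses the midterm would have a final exam worth 63% of the course grade.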

Late submission policies

Labs: Contact the instructor or your TA promptly (i.e., as soon as you are aware of the problem) if a medical or family reason prevents you from handing in your labs on time. In extraordinary circumstances, we may allow late turn-in of some assignments if you contact course staff (post a private message on Piazza) with a clear explanation of the problem well in advance of the deadline (i.e., at least 48 hours). Poor planning or procrastination does not constitute extraordinary circumstances.

Quizzes, clickers and lecture exercises: At the end of the course, we will drop your lowest quiz score, your lowest lecture exercise score, and your lowest lecture clicker score. Consider this permission, given in advance, to not submit a quiz or to miss a class because of illness, travel, starting the course late, conflicts with other courses, etc. No further allowance will be made for failure to submit quizzes or attend class except in truly exceptional circumstances such as a prolonged and serious illness.
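The effect of dropping the lowest score can be sketched in a few lines of Python. This example assumes that a missed item counts as 0 and that the remaining scores in a category are averaged with equal weight; neither detail is stated above, and the function name `average_after_drop` is hypothetical.

```python
def average_after_drop(scores):
    """Average a category's scores (in percent) after dropping the lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the single lowest score
    return sum(kept) / len(kept)
```

For instance, with quiz scores of 100, 0 (a missed quiz), 80 and 90, the 0 is dropped and the remaining three scores average to 90.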

A student must pass the final exam in order to pass the course.

Text and Instructional Resources

Textbook: The main textbook for the Python component of the course is: Practical Programming: An Introduction to Computer Science Using Python 3 (2nd Edition), by Paul Gries, Jennifer Campbell, Jason Montojo.

You can buy the paper or the online version or both at the publisher's website or on Amazon.ca. Class readings will refer to this edition.

Other books that introduce programming using Python 3 can be used in the course, but they may not contain all the topics we'll cover. A book which can be used for this course is: How to Think Like a Computer Scientist - Learning with Python 3 by Peter Wentworth, Jeffrey Elkner, Allen B. Downey, and Chris Meyers, which is available free online from the Open Book Project. This site also has a number of tutorials and other useful tips for writing Python programs.

Connect: Grades, quizzes and in-class exercise submissions will all be on Connect.

Piazza: Course discussion board.

iClickers: If you don’t already have a clicker, you can buy an i>Clicker from the bookstore. You need to register your clicker on Connect to start earning your clicker grades; if you register late, you’ll miss out on some marks. If the ID on your clicker is worn off, don’t despair. You can drop by the help desk at Chapman Learning Commons, on the 3rd floor of the Irving K. Barber Learning Centre, to have it retrieved.

Academic Honesty and Plagiarism: I expect students to be aware of and adhere to the UBC policy on academic integrity and plagiarism in all their work in this course. Academic misconduct of any kind, including cheating on quizzes, lab assignments or exams, will not be tolerated. The consequences for academic misconduct will include a grade of zero, and you could also face expulsion from the course or suspension.

Course Content

The course is targeted at students with little or no programming experience, but it is expected that you know how to use email, a web browser (such as Microsoft Internet Explorer or Mozilla Firefox) and a word processor (such as Microsoft Word). If you found this web page, you probably know enough. The rest we will teach you. Here are some of the things that we are thinking about including:

You can program! We will spend a few weeks exploring the idea of programming using a simple and fun graphical programming language called Scratch. Every student will create their own personal computer animation in Scratch in the very first lab. In fact, if you know how to install software on your computer, try it out now -- the Scratch web page has lots of examples of neat animations and video games that first-time programmers wrote within a few hours of first trying the system.

A little bit of Excel. You may have already used Microsoft Excel (or another spreadsheet program) in your courses or summer jobs. In CPSC 301 we will use Excel for a couple of weeks to explore ideas about graphing data, storing data and databases. Examples of data that we are thinking about using include statistical information from the epidemic simulation, or gene expression information for normal, pre-cancer and cancerous cells from the NCI Cancer Genome Anatomy Project.

Programming with Python. Scratch is fun, and spreadsheets (such as Excel) are very useful for certain types of problems, but to really harness the enormous power and flexibility of computers and networks, you need a more general way of telling them what to do. Python is an easy to learn, powerful, adaptable programming language used in virtually every computer application domain. For example, if you surf the Web, you have (probably unknowingly) run many Python programs. In CPSC 301 we will learn to use Python to perform image manipulations (such as what you might do now in Photoshop or iPhoto), download gene sequence and protein activation data from Internet bioinformatics databases (such as NCBI's GenBank), translate files between different formats and transport them across the Internet, and call into libraries of code written by other people (such as BioPython) to perform powerful data analyses (such as gene sequence matching using BLAST). That may sound ambitious, but with a modern programming language like Python and the power of the Internet, you can learn how to do a lot in a very short time.

And Some Other Stuff that You May Find Useful. We cannot teach you all of computer science in a three credit course, but there are some topics that we are considering and that are likely to prove useful in your future work with computers:

  • How do we compare two programs that accomplish the same task to determine which one is "better"?
  • What are all those different file name extensions when you choose to "Save As," and why would you choose one over another?
  • How can you use Python to call programs written in other languages (such as BioPerl)?
  • In what different formats can we store data in the computer, and what are the advantages and disadvantages of each?
  • What makes Google rank one web page higher than another?
  • Why are there so many programming languages?
Do you have other questions that you would like answered?

Course Level Learning Outcomes

Students who complete this course will be able to:

  • Create, identify, view and modify common data storage formats using common applications.
  • Choose an appropriate data format for a specified task, explain why that format was chosen over alternatives, and identify potential shortcomings of the chosen format.
  • Write, modify, debug, analyze and execute simple programs to create, collect, transform, transmit, manage, retrieve, analyze or visualize data in simple ways.
  • Download, install, execute, call into and validate well-designed scientific software from the Internet for more complex data and its manipulation.
  • Develop and debug programs in both an integrated development environment (e.g., Scratch, possibly Python) and a text-based setting (e.g., Python).
  • Discuss how the capabilities and limitations of computers and networks might influence their use in a specified scientific task.

Relationship with Other Computing Courses

This course is really targeted at students with little or no programming experience. Officially, it is not for credit for students who have credit for any of the following: APSC 160, Computer Science AP, CPSC 110, CPSC 111, EOSC 211, or transfer credit equivalent to CPSC 111. Unofficially, even if you have not taken one of these courses, you may be bored if you have any significant experience programming in a text-based language like Python, Perl, Java, C/C++, Basic, Fortran, Pascal, Matlab, etc.

CPSC 301 is not a course in bioinformatics, although we will briefly discuss some topics from bioinformatics. If you have programming experience and are interested in learning more about bioinformatics, consider CPSC 445 and/or MICB 405 instead.

Students who have taken CPSC 101 or 100 (but not CPSC 110, 111 or 103) are very welcome in CPSC 301.

Rationale for CPSC 301

Although they are very good at it, computers are not just for word processing and networks are not just for downloading music. Computer science is the study of process, and when a process is automated the results can be revolutionary; for example, consider the explosion of gene sequence data since the advent of automated sequencing. Computer algorithms increasingly control our lives, in ways that are obvious (such as Google) and sometimes not so obvious (such as consumer credit rating). The inevitable conclusion is that computers and their networks are vital tools for the creation, collection, analysis, visualization and storage of data throughout all fields of the sciences, arts, business and engineering.

Unfortunately, many students have limited experience working with computers beyond tools like word processors and web browsers. A symptom of this lack of basic computing experience is the slow, frustrating and error-prone but all too common practice of manually transcribing data between applications, either by cut and paste or by typing numbers, simply because the two applications use different storage formats.

While there are other computing courses throughout the Faculty of Science, those classes are focused on programming skills, are discipline dependent, and/or require numerous prerequisites. Designed for students with no previous university-level computing experience, this course will introduce general concepts of information storage, transmission, retrieval, manipulation and visualization. The emphasis will be on practical tools and techniques, with applications and examples drawn from throughout science but in particular from the life sciences.

The underlying story of the course is how computers and networks allow us to work with data at scales, speeds and costs that would otherwise be impossible; however, to achieve these capabilities, we must be rigorous in how we specify the data and the manipulations that will be applied. The course will begin with simple static data, such as files, and progress to programs which construct and interact with dynamic data. The course will also investigate how a computational viewpoint can lead to questions of scientific interest which might not otherwise have arisen.