
SANBI, University of the Western Cape

Mar 7-11, 2016

9:00 am - 1:00 pm

Instructors: Peter van Heusden

General Information

The material in this workshop is largely based on material produced by Software Carpentry. Software Carpentry's mission is to help scientists and engineers get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. This hands-on workshop will cover basic concepts and tools, including program design, version control, data management, and task automation. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what Software Carpentry teaches and why, please see our paper "Best Practices for Scientific Computing".

Who: The course is aimed at graduate students in bioinformatics and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where: 5th floor, Life Sciences Building. Get directions with OpenStreetMap or Google Maps.

Requirements: Participants must bring a laptop with a few specific software packages installed (listed below). They are also required to abide by a Code of Conduct (once again borrowed from Software Carpentry).

Contact: Please mail pvh@sanbi.ac.za for more information.


Schedule

Day 1 - Monday

09:00 Workshop start and laptop check
09:30 Automating tasks with the Unix shell
11:00 Coffee
11:30 Automating tasks with the Unix shell
13:00 Wrap-up

Day 2 - Tuesday

09:30 Using the SANBI compute cluster
11:00 Coffee
11:30 Version control with Git
13:00 Wrap-up

Day 3 - Wednesday

09:30 Programming with Python
11:00 Coffee
11:30 Programming with Python
13:00 Wrap-up

Day 4 - Thursday

11:30 Programming with Python
13:00 Lunch break
14:00 Programming with Python
15:30 Wrap-up

Day 5 - Friday

09:30 Programming with Python
11:00 Coffee
11:30 Programming with Python
13:00 Wrap-up

Etherpad: http://pad.software-carpentry.org/computing-sanbi-032016.
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


Syllabus

The Unix Shell

  • Files and directories
  • History and tab completion
  • Pipes and redirection
  • Looping over files
  • Creating and running shell scripts
  • Finding things
  • Notes...
  • Reference...

Programming in Python

Using the SANBI compute cluster

  • Finding software
  • Creating and submitting jobs
  • Monitoring jobs on the cluster
  • Examining the state of the cluster
  • Notes

Version Control with Git

  • Creating a repository
  • Recording changes to files: add, commit, ...
  • Viewing changes: status, diff, ...
  • Ignoring files
  • Working on the web: clone, pull, push, ...
  • Resolving conflicts
  • Open licenses
  • Where to host work, and why
  • Reference...

Setup

To participate in a Software Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

Software Carpentry maintains a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful too.

Text Editor

When you're writing code, it's nice to have a text editor that is optimized for writing code, with features like automatic color-coding of key words. The default text editor on Mac OS X and Linux is usually set to Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in it, press the escape key, then type :q! (colon, lower-case 'q', exclamation mark) and hit Return to get back to the shell.

Windows

nano is a basic editor and the default that instructors use in the workshop. To install it, download the Software Carpentry Windows installer and double click on the file to run it. This installer requires an active internet connection.

Other editors that you can use are Notepad++, Sublime Text, or Atom. Be aware that you must add the editor's installation directory to your system path. Please ask your instructor to help you do this.

Mac OS X

nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are TextWrangler, Sublime Text, or Atom.

Linux

nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are Gedit, Kate, Sublime Text, or Atom.

Python

Python is a popular language for scientific computing, and great for general-purpose programming as well. Installing all of its scientific packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer.

Regardless of how you choose to install it, please make sure you install Python version 3.x (e.g., 3.4 is fine).
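
One quick way to confirm the version after installing is to ask Python itself. The snippet below is a minimal sketch; the helper name `is_python3` is our own and not part of Anaconda or the Software Carpentry material.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3.x."""
    # The first element of the version tuple is the major version number.
    return version_info[0] == 3


if __name__ == "__main__":
    # Show the version of the interpreter that is running this script.
    print("Python", sys.version.split()[0])
    print("OK" if is_python3() else "Please install Python 3.x")
```

Save it to a file and run it with the `python` command you plan to use at the workshop. If it reports a 2.x version, a different Python is being picked up first.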

We will teach Python using the IPython notebook, a programming environment that runs in a web browser. For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).
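
If you would like to check ahead of time that the notebook environment is available, the sketch below tests whether a package can be found for import without actually importing it. It assumes Python 3.4 or later and that the package's import name is `IPython`; the helper `missing_modules` is a hypothetical name used only for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import.

    Only top-level module names (no dots) are expected here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "IPython" is assumed to be the import name of the IPython package.
    missing = missing_modules(["IPython"])
    print("All found" if not missing else "Missing: " + ", ".join(missing))
```

A "Missing" result usually means the script ran under a Python other than the Anaconda one, or that the installation did not complete.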

Windows

  1. Open http://continuum.io/downloads with your web browser.
  2. Download the Python 3 installer for Windows.
  3. Install Python 3 using all of the defaults for installation, except make sure to check the "Make Anaconda the default Python" option.

Mac OS X

  1. Open http://continuum.io/downloads with your web browser.
  2. Download the Python 3 installer for OS X.
  3. Install Python 3 using all of the defaults for installation.

Linux

  1. Open http://continuum.io/downloads with your web browser.
  2. Download the Python 3 installer for Linux.
  3. Install Python 3 using all of the defaults for installation. (Installation requires using the shell. If you aren't comfortable doing the installation yourself stop here and request help at the workshop.)
  4. Open a terminal window.
  5. Type
    bash Anaconda-
    and then press tab. The name of the file you just downloaded should appear.
  6. Press enter. You will follow the text-only prompts. When there is a colon at the bottom of the screen press the down arrow to move down through the text. Type yes and press enter to approve the license. Press enter to approve the default location for the files. Type yes and press enter to prepend Anaconda to your PATH (this makes the Anaconda distribution the default Python).
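
After the installer finishes, you can open a new terminal and check which `python` is found first on your PATH. The following is a minimal sketch using only the standard library; `first_on_path` is a hypothetical helper name, not something the installer provides.

```python
import shutil
import sys


def first_on_path(command):
    """Return the full path of the first `command` found on PATH, or None."""
    return shutil.which(command)


if __name__ == "__main__":
    # The command the shell would run when you type "python".
    print("python on PATH:     ", first_on_path("python"))
    # The interpreter that is actually executing this script.
    print("running interpreter:", sys.executable)
```

If Anaconda was prepended to your PATH, the first line should point inside the install location you approved in step 6. If it points elsewhere, ask for help at the workshop.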

SSH and SCP

The Secure (Remote) Shell (SSH) and Secure Copy (SCP) programs allow you to log in to remote machines and copy data to and from them.

Windows

Download the PuTTY SSH client (it runs from wherever you download it, so no installation is necessary) and the WinSCP SCP software.

Mac OS X

Mac OS X comes with both SSH and SCP clients built in that can be run from the Terminal. The Terminal application also has SCP support, as explained in this StackExchange answer.

Linux

Linux comes with both SSH and SCP clients built in that can be run from the Terminal.