Software Carpentry - How good are the Best Practices?


Adapted from Software Carpentry Best Practices in Scientific Computing


Good programmers are 10X more productive than average

Good practices are 10X more productive than average

Rule 1: Write Programs for People not Computers

Rule 1.1: Keep it simple

Rule 1.2: Make names consistent, distinctive, and meaningful.

Rule 1.3: Make code style and formatting consistent.

Rule 2: Let the Computer Do the Work

Rule 2.1: Make the computer repeat tasks.

Rule 2.2: Save recent commands in a file for re-use.

Rule 2.3: Use a build tool to automate workflows.

Rule 3: Make Incremental Changes

Rule 3.1: Small steps with frequent feedback

Rule 3.2: Use a version control system.

Rule 3.3: Version control EVERYTHING

Rule 4: Don't Repeat Yourself (or Others)

Rule 4.1: There can be only one

Rule 4.2: Modularize code rather than copying and pasting.
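A minimal sketch of the idea, assuming two small data sets that need the same summary. The function `summarize` and the lists `control` and `treated` are hypothetical names made up for illustration.

```python
def summarize(values):
    """Return (minimum, maximum, mean) for a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    return min(values), max(values), sum(values) / len(values)


# Hypothetical measurements; the same function serves both data sets,
# so a fix to summarize() reaches every caller at once.
control = [2.0, 4.0, 6.0]
treated = [3.0, 5.0, 10.0]

for label, data in (("control", control), ("treated", treated)):
    low, high, average = summarize(data)
    print(f"{label}: min={low} max={high} mean={average}")
```

Had the three summary lines been pasted once per data set, a later correction would have to be repeated in every copy.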

Rule 4.3: Re-use code instead of rewriting it.

Rule 5: Plan for Mistakes

Note: improving quality increases productivity

Rule 5.1: Don't trust. Verify

Rule 5.2: Use an off-the-shelf unit testing library.
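One possible illustration in Python uses the standard-library `unittest` module. The function `gc_content` is a hypothetical example chosen for this sketch; it does not come from the original paper.

```python
import unittest


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (hypothetical example)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


class TestGCContent(unittest.TestCase):
    def test_half_gc(self):
        self.assertAlmostEqual(gc_content("ATGC"), 0.5)

    def test_lowercase_input(self):
        self.assertAlmostEqual(gc_content("ggcc"), 1.0)

    def test_empty_sequence_is_rejected(self):
        with self.assertRaises(ValueError):
            gc_content("")
```

Saved in a file such as `test_gc.py` (an assumed name), the tests can be run with `python -m unittest test_gc`.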

Testing is Hard

Rule 5.3: Turn bugs into test cases.
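A sketch of how a fixed bug might be pinned down by a test, again with `unittest`. The function `normalize` and the bug it once had are both invented for illustration.

```python
import unittest


def normalize(values):
    """Scale values so they sum to 1 (hypothetical function).

    An earlier, imagined version raised ZeroDivisionError when every
    value was zero; the guard below is the fix.
    """
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]
    return [v / total for v in values]


class TestNormalizeRegression(unittest.TestCase):
    def test_all_zero_input_does_not_crash(self):
        # Reproduces the input from the bug report so the bug cannot return unnoticed.
        self.assertEqual(normalize([0, 0, 0]), [0.0, 0.0, 0.0])

    def test_regular_input_still_works(self):
        self.assertEqual(normalize([1, 1, 2]), [0.25, 0.25, 0.5])
```

The first test fails against the buggy version and passes against the fixed one, which is what makes it useful as a permanent check.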

Test-Driven Development

Rule 5.4: Use a symbolic debugger.

Rule 6: Optimize Software Only After It Works Correctly

Rule 6.1: Use a profiler to identify bottlenecks.
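In Python, one way to follow this rule is the standard-library `cProfile` module together with `pstats`. The sketch below is only an illustration; `slow_sum_of_squares` and `profile_call` are hypothetical helper names.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop used as the profiling target (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in memory, sorted by cumulative time, top 5 entries.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum_of_squares, 100_000)
    print(value)
    print(report)
```

For a whole script, `python -m cProfile -s cumulative script.py` gives a similar report without changing the code.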

Rule 6.2: Write code in the highest-level language possible.

Rule 7: Document Design and Purpose not Mechanics

Rule 7.1: Document interfaces and reasons not implementations.

Rule 7.2: Refactor code in preference to explaining how it works.

Rule 7.3: Embed the documentation for a piece of software in that software.
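In Python, embedded documentation usually means a docstring. The sketch below assumes a hypothetical function `celsius_to_kelvin`; the examples inside the docstring can be checked with the standard-library `doctest` module.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin.

    The docstring states what the function is for, not how the
    addition is carried out.

    >>> celsius_to_kelvin(0.0)
    273.15
    >>> celsius_to_kelvin(-273.15)
    0.0
    """
    return temp_c + 273.15


if __name__ == "__main__":
    import doctest

    # Runs the examples embedded in the docstrings above.
    doctest.testmod()
```

Because the text lives next to the code, `help(celsius_to_kelvin)` shows it in an interactive session.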

Rule 8: Collaborate

Rule 8.1: Use pre-merge code reviews.

Rule 8.2: Use pair programming.

Rule 8.3: Use an issue tracking tool.

Gosh, That's a Lot

One step at a time.

  1. Use text-based interfaces
  2. Turn history into scripts
  3. Put everything in version control
  4. Use test-driven development

Citation: "Best Practices for Scientific Computing", PLOS Biology, Jan. 2014.


Annotated Best Practices

Edit the markdown document: web/2016/day2/docs/

Add your choices below. Write them in the following format.

by ialbert

KEEP Rule 1.1: Keep it simple

Simplicity is the most powerful virtue that any process can have. There is only one problem: it is kind of difficult to keep it simple.

KEEP Rule 3.1: Small steps with frequent feedback

There is great value in keeping the entire pipeline working at most times. Save often, commit often, rerun often.

KEEP Rule 3.3: Version control EVERYTHING

While git was designed for software, you should keep everything (other than large datasets) in it. You get free backup and replication with it!

TOSS Rule 5.4: Use a symbolic debugger.

There is nothing wrong with print statements. Symbolic debuggers promote writing complex programs. If you can't debug a program with simple print statements, your program may already be too complicated.

TOSS Rule 8.1: Use pre-merge code reviews

This is a concept borrowed from software engineering, where it is assumed that all people on a team work on the exact same and relatively simple problem. This rarely happens in the sciences. This rule is one of those "feel good" rules that are just unrealistic in scientific practice.

TOSS Rule 8.2: Use pair programming

Pair programming is again a concept borrowed from software engineering. But it disregards the fact that most software engineers need to solve far simpler and far better defined problems than scientists do. It is sort of a pipe dream that we can do this.

Penn State • 2016 • bootcamp-central via pyblue