Open writing projects/Scientific Programming with Python and Subversion/Outline: Difference between revisions

From OpenWetWare
(removed 'modular' and moved subversion to section 2 and intro to python to section 3)
Revision as of 11:45, 24 March 2008

Outline

0 Introduction

  • Why this book?
    • Motivation - There's lots of information about what you can do with computers in biology, chemistry, and physics, but little training in how to do it
    • Assumes no prior knowledge of Python; introduces computing tools as they are needed in the context of a typical scientific investigation. This makes it useful to both beginners and more experienced users
    • goal - to make managing projects easier, but more importantly to promote good scientific practice using computing methods
  • Introduce scientific themes throughout the book
    • Covers themes from biology, informatics, and physics? - for informatics, maybe use examples from one of the NCBI coffee breaks


Part I: Intro to scientific programming using python

1 Why use python for scientific programming?

  • What is python?
    • a programming language that offers easy access to high-level functions and has a large, growing community of scientific users
  • Why build scientific applications in python?
    • Python code looks clean - easy to understand your own or your collaborators' code a week later
    • everything from data generation to analysis to plots can be done in Python, making every aspect of your project consistent. Together, readable code and a single consistent toolchain promote good scientific practices (data integrity, data reproducibility)

2 Source Control Management with Subversion

  • What is source control?
    • Similar to Word 'track changes' or wiki 'history' but for all the files in a project.
    • A way to keep a history of every step in a process.
    • Not only for computer code, but for data, plots, paper manuscripts, etc.
  • An introduction to Subversion
    • What is a repository?
    • How to create a repository
    • How to make basic commits
    • Seeing differences between versions
    • Retrieving past versions
    • Collaboration using Subversion
  • Advanced Topics
    • Branching and Merging


3 A brief introduction to python

  • What the scientist needs to know to get started (References to Programming Python for more programming detail?)
    • variable assignment
    • basic control structures
    • functions
    • package structure and import
    • objects (just like packages)
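
A minimal sketch of how the five topics above might fit into one short script. The measurement values and the names `mean`, `std_dev`, and `Sample` are invented for illustration and are not taken from the book.

```python
import math  # import: make the standard library's math module available

# Variable assignment: bind a name to a list of (made-up) measurements.
measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


# Functions: package a calculation so it can be reused and tested.
def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """Population standard deviation of a list of numbers."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Basic control structures: a for loop with an if test inside it.
center = mean(measurements)
spread = std_dev(measurements)
outliers = []
for value in measurements:
    if abs(value - center) > 1.5 * spread:
        outliers.append(value)


# Objects: bundle data with the functions that act on it. Attributes are
# reached with a dot, the same way names inside a module are (math.sqrt).
class Sample:
    def __init__(self, name, values):
        self.name = name
        self.values = values

    def summary(self):
        return "%s: mean=%.2f, sd=%.2f" % (
            self.name, mean(self.values), std_dev(self.values))


sample = Sample("control", measurements)
print(sample.summary())  # prints "control: mean=5.00, sd=2.00"
print(outliers)          # prints [9.0]
```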


4 Making scientific plots with python

  • An introduction to matplotlib
    • basic functionality - simple line, bar, histogram plots
    • more sophisticated graphics - insets, labeling with text, drawing arrows
    • interactive graphics - adjusting parameters for real-time fitting
  • An example project use of matplotlib
    • bioinformatics
    • physics
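
As a rough illustration of the "basic functionality" and "labeling with text, drawing arrows" items above, the sketch below draws a line plot, a bar chart, and a histogram with matplotlib. The data are made up, the function name `make_overview_figure` is hypothetical, and the `Agg` backend is chosen only so that the script runs without a display.

```python
import io
import math
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt


def make_overview_figure(seed=0):
    """Return a figure holding a line plot, a bar chart, and a histogram."""
    rng = random.Random(seed)
    xs = [i * 0.1 for i in range(100)]
    ys = [math.sin(x) for x in xs]
    counts = {"A": 12, "C": 7, "G": 9, "T": 14}  # invented base counts
    samples = [rng.gauss(0.0, 1.0) for _ in range(500)]

    fig, (ax_line, ax_bar, ax_hist) = plt.subplots(1, 3, figsize=(10, 3))

    # Simple line plot, labeled with text and an arrow.
    ax_line.plot(xs, ys)
    ax_line.set_xlabel("x")
    ax_line.set_ylabel("sin(x)")
    ax_line.annotate("first peak", xy=(math.pi / 2, 1.0), xytext=(4.0, 0.6),
                     arrowprops=dict(arrowstyle="->"))

    # Bar chart of categorical counts.
    ax_bar.bar(list(counts.keys()), list(counts.values()))
    ax_bar.set_ylabel("count")

    # Histogram of random samples.
    ax_hist.hist(samples, bins=20)
    ax_hist.set_xlabel("value")

    fig.tight_layout()
    return fig


fig = make_overview_figure()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")  # a file name such as "overview.png" also works
```

Writing to an in-memory buffer keeps the example free of side effects; in a real project the figure would more likely be saved to a file that is committed alongside the code that produced it.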


5 Crunching numbers with python

  • Python community modules
    • using numpy for matrix manipulations
    • using the scipy project tools
    • interacting with the GNU Scientific Library
  • An example project
    • bioinformatics
    • physics
    • others?
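
For the "using numpy for matrix manipulations" item, a small sketch of what such an example could look like: a transpose, a matrix product, and the solution of a linear system. The matrix `A` and vector `b` are arbitrary values chosen for illustration.

```python
import numpy as np

# A small linear system A x = b with invented coefficients.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

A_T = A.T          # transpose
product = A @ A_T  # matrix product (not element-wise multiplication)

# Solve A x = b directly rather than forming the inverse of A.
x = np.linalg.solve(A, b)

# The residual should be zero up to floating-point rounding.
residual = A @ x - b

print(product)
print(x)
print(np.allclose(residual, 0.0))
```

Using `np.linalg.solve` instead of multiplying by `np.linalg.inv(A)` is the usual recommendation, since it is generally both faster and more accurate.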


6 Unit testing for scientists

  • What is unit testing?
    • A way to generate automated tests of small units of code
  • Why do unit testing?
    • example: switching a sorting algorithm - how do you know the code still works the same way?
      • typically checked by 'eye', by running the code manually and looking at the output
      • with unit tests, you can see whether the code failed and, if it did, exactly where
  • Using python and nose to write unit tests?
    • example of test code, and how to run the tests
      • bioinformatics
      • physics
  • How do I know which tests to write?
    • (This one is hard)
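
The sorting example above could be sketched along these lines: a hand-written sort is checked against Python's built-in `sorted`, so that swapping the algorithm later is caught by the tests rather than by eye. The function `insertion_sort` and the test cases are hypothetical; the tests are plain functions whose names start with `test_`, which is the naming pattern nose collects.

```python
def insertion_sort(values):
    """Return a new sorted list, leaving the input unchanged."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger items one slot right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def test_matches_builtin_sort():
    cases = [[3, 1, 2], [5, 5, 1], [-2, 0, -7, 4], [1.5, 0.25, 1.5]]
    for case in cases:
        assert insertion_sort(case) == sorted(case)


def test_empty_and_single_element():
    assert insertion_sort([]) == []
    assert insertion_sort([42]) == [42]


def test_input_is_not_modified():
    data = [3, 1, 2]
    insertion_sort(data)
    assert data == [3, 1, 2]
```

Assuming the code is saved as `test_sorting.py` (a made-up file name), running `nosetests test_sorting.py` should report each failing test by name together with the assertion that failed.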


7 Advanced topics - using SWIG and psyco to speed up python code

  • (this section could be omitted initially)
  • What if python is not fast enough for my project?
    • Several options:
      • Use psyco to 'compile' the python code
      • Identify the slow parts and write them in C/C++ and bind them to python using SWIG
  • Using psyco
  • Using C with SWIG


Part II: Examples

  • Ideally we could have an svn repo set up for people to pull from to look at the code examples at each step of the way
  • A complete case study of [blah] from start to finish
    • Creating a code repository
    • Writing your first code [Be more specific: Is this just the code for the case study or will you also talk about how to approach the scientific problem before writing the code?]
    • Writing your first tests
    • Moving on [To what?]