Open writing projects/Scientific Programming with Python and Subversion/Outline: Difference between revisions
From OpenWetWare
Revision as of 08:34, 24 March 2008
Outline
- sections marked with '(modular)' can be rewritten using a different technology (e.g. git instead of svn)
0 Introduction
- Why this book?
- motivation - A classic problem in the sciences is that there's lots of training in the science you can do with computers, but little training in how to actually do it
- Assumes no prior knowledge of Python; introduces computing tools as they are needed in the context of a typical scientific investigation. This makes it useful to both beginners and more experienced users
- goal - to make managing projects easier, but more importantly to promote good scientific practice using computing methods
- Introduce scientific themes throughout the book
- Covers themes from biology, informatics, and physics? - for informatics, maybe use examples from one of the NCBI coffee breaks (http://www.ncbi.nlm.nih.gov/Coffeebreak/)
Part I: Intro to scientific programming using python
1 Why use python for scientific programming
- What is python?
- computer language that offers easy access to high-level functions, and has a large and growing community of scientific users
- Why build scientific applications in python?
- python code looks clean - easy to understand your own or your collaborators' code a week later
- everything from data generation to analysis to plots can be done in python, making every aspect of your project consistent. Together, these promote good scientific practices (data integrity, data reproducibility)
2 A brief introduction to python
- What the scientist needs to know to get started (References to Programming Python for more programming detail?)
- variable assignment
- basic control structures
- functions
- package structure and import
- objects (just like packages)
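
The topics listed for this chapter can be illustrated in a few lines. The following is a minimal sketch only, using the standard library; the readings, the 1.5-standard-deviation cutoff, and the `Sample` class are hypothetical choices made for illustration, not material from the book.

```python
import math  # 'import' makes a module's functions available


def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def std_dev(values):
    """Return the population standard deviation of a non-empty sequence."""
    centre = mean(values)
    return math.sqrt(sum((v - centre) ** 2 for v in values) / len(values))


# Variable assignment: hypothetical absorbance readings (invented data)
readings = [0.12, 0.15, 0.11, 0.94, 0.13]

# Basic control structures: a for loop containing an if statement.
# The 1.5 * std_dev cutoff is an arbitrary illustrative threshold.
centre = mean(readings)
spread = std_dev(readings)
outliers = []
for value in readings:
    if abs(value - centre) > 1.5 * spread:
        outliers.append(value)


class Sample:
    """An object bundles data (name, values) with functions that use it."""

    def __init__(self, name, values):
        self.name = name
        self.values = values

    def average(self):
        return mean(self.values)


sample = Sample("plate_1", readings)
print(sample.name, round(sample.average(), 3), outliers)
```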
3 Source Control Management with Subversion
- What is source control?
- Similar to Word 'track changes' or wiki 'history' but for all the files in a project.
- A way to keep a history of every step in a process.
- Not only for computer code, but for data, plots, paper manuscripts, etc.
- A modular introduction to Subversion
- What is a repository?
- How to create a repository
- How to make basic commits
- Seeing differences between versions
- Retrieving past versions
- Collaboration using subversion
- Advanced Topics
- Branching and Merging
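
The basic Subversion steps above (create a repository, commit, see differences, retrieve a past version) map onto a handful of command-line calls. The sketch below only builds and prints those commands from Python; the directory names, the file name `analysis.py`, and the commit message are hypothetical. Actually running the commands assumes Subversion is installed and that the file has been created in the working copy before `svn add`.

```python
from pathlib import Path


def svn_workflow_commands(repo_dir, work_dir, filename="analysis.py"):
    """Return the command lines for a first Subversion session.

    All paths and the file name are illustrative assumptions.
    """
    # A local repository is addressed with a file:// URL
    repo_url = Path(repo_dir).resolve().as_uri()
    work_dir = str(work_dir)
    return [
        # Create an empty repository
        ["svnadmin", "create", str(repo_dir)],
        # Check out a working copy of it
        ["svn", "checkout", repo_url, work_dir],
        # Put a new file under version control
        ["svn", "add", str(Path(work_dir) / filename)],
        # Record the change in the repository
        ["svn", "commit", work_dir, "-m", "Add first analysis script"],
        # Show differences between revision 1 and the latest revision
        ["svn", "diff", "-r", "1:HEAD", work_dir],
        # Bring the working copy back to how it looked at revision 1
        ["svn", "update", "-r", "1", work_dir],
    ]


for command in svn_workflow_commands("demo_repo", "demo_working_copy"):
    print(" ".join(command))
```

Each inner list could be passed to `subprocess.run` one at a time; keeping the commands as data makes it easy to swap in another tool for a '(modular)' section.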
4 Making scientific plots with python
- A modular introduction to matplotlib
- basic functionality - simple line, bar, histogram plots
- more sophisticated graphics - insets, labeling with text, drawing arrows
- interactive graphics - adjusting parameters for real-time fitting
- An example project use of matplotlib
- bioinformatics
- physics
5 Crunching numbers with python
- Python community modules (modular)
- using numpy for matrix manipulations
- using the scipy project tools
- interacting with the GNU Scientific Library
- An example project
- bioinformatics
- physics
- others?
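
As an illustration of the "numpy for matrix manipulations" item, here is a minimal sketch that multiplies matrices and solves a small linear system. It assumes numpy is installed; the 2x2 system is invented for the example.

```python
import numpy as np

# Hypothetical linear system A x = b (values chosen for illustration)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve for x without forming the inverse of A explicitly
x = np.linalg.solve(A, b)

# Matrix product of A with its transpose
product = A @ A.T

# Check the solution by substituting it back into the system
residual = A @ x - b

print("x =", x)
print("A A^T =", product)
print("residual =", residual)
```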
6 Unit testing for scientists
- What is unit testing?
- A way to generate automated tests of small units of code
- Why do unit testing?
- example: switching a sorting algorithm - how do you know the code still works the same way?
- typically checked 'by eye': running the code manually and looking at the output
- with unit tests you can see whether the code failed and, if it did, exactly where
- Using python and nose to write unit tests? (modular)
- example of test code, and how to run the tests
- bioinformatics
- physics
- How do I know which tests to write?
- (This one is hard)
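
The sorting example mentioned above could look roughly like this. It is a sketch under stated assumptions: `insertion_sort` stands in for the old implementation and `merge_sort` for its replacement, and both functions and the test cases are hypothetical. The tests are plain functions whose names start with `test_`, which is the naming convention nose uses to discover tests.

```python
def insertion_sort(values):
    """Old implementation: simple, but slow for long lists."""
    result = []
    for v in values:
        i = len(result)
        # Walk left until the insertion point for v is found
        while i > 0 and result[i - 1] > v:
            i -= 1
        result.insert(i, v)
    return result


def merge_sort(values):
    """New implementation that should give identical results."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element of the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Small inputs covering empty, single, duplicate, and negative values
CASES = [[], [1], [3, 1, 2], [5, 5, 1], [-2.5, 0, 10, -7]]


def test_merge_sort_matches_insertion_sort():
    for case in CASES:
        assert merge_sort(case) == insertion_sort(case), case


def test_merge_sort_does_not_modify_input():
    data = [3, 1, 2]
    merge_sort(data)
    assert data == [3, 1, 2]
```

If this were saved as, say, `test_sorting.py` (a hypothetical file name), running `nosetests test_sorting.py` would execute both tests and report which assertion failed, if any.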
7 Advanced topic - using SWIG and psyco to speed up python code
- this section could be omitted initially
- What if python is not fast enough for my project?
- Several options:
- Use psyco to 'compile' the python code
- Identify the slow parts and write them in C/C++ and bind them to python using SWIG
- Using psyco
- Using C with SWIG
Part II: Examples
- Case studies: This section could be omitted initially - ideally we could have an svn repo set up for people to pull from to look at the code examples at each step of the way
- go through from start to finish
- initially create a repository
- the first code
- the first tests
- moving on