Wilke:Creating figures: Difference between revisions

From OpenWetWare

Revision as of 15:51, 6 July 2010

Notice: The Wilke Lab page has moved to http://wilkelab.org.
The page you are looking at is kept for archival purposes and will not be further updated.

Creating Publication-Quality Figures

Many different programs can be used to generate figures. Unfortunately, almost all graphing programs have poor default settings. Therefore, if you create a figure without changing the defaults, you can be almost certain that your figure is not ready to be published.

Here are a few general guidelines:

  • Add labels to all axes.
  • Make axis labels and any other labels in the figure sufficiently large. By default, most graphing programs use labels that are far too small. Note that labeling that looks good on the computer screen is often too small for print, because we tend to zoom figures to a large size when we prepare them. To test whether labels are of a good size, zoom out until the figure spans only 3-4 inches in width. If you can still comfortably read the labels, then they are of a good size.
  • Minimize visual clutter by maximizing the amount of ink used to convey data relative to the total amount of ink. Therefore, remove any distracting background (such as a grid in the figure). Use half-open figures where there are axes at the bottom and the left, but not at the right and the top. Remove any lines that don't convey any information. Make sure that the lines that represent data are thicker than the axis lines.
  • Don't put a title on top of the figure. The title belongs in the figure caption.
  • Be mindful of color usage. Many people are color blind and may not be able to distinguish some of the colors you are using. In general, if at all possible, a figure should still convey all its information when printed in black and white.
  • If possible, avoid overly busy line styles, such as dotted or dashed lines, in particular many different types of dotted or dashed lines. Always avoid patterned fill styles in bar graphs.
  • Avoid pie charts, and in particular 3D pie charts. These types of graphs do not accurately convey quantitative information.
  • In general, MS Excel cannot produce acceptable figures and should be avoided. MS Excel also makes it difficult to export figures into commonly used formats such as EPS, PDF, or SVG. Many people achieve excellent results with the programs R, gnuplot, Grace, or MATLAB.

Figure: Example of a poorly designed figure.
Figure: An improved version of the same figure.
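
Several of the guidelines above can be applied in a few lines of base R. The following is a minimal sketch, not the code behind the example figures: the data, axis labels, and output file name are made up for illustration. It shows labeled axes, enlarged label text, a half-open frame, no title, and a data line thicker than the axis lines.

```r
# Hypothetical example data, not taken from any real experiment.
x <- 1:10
y <- c(2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9, 9.1, 9.8, 11.2)

# Hypothetical output file; written to the session's temporary directory.
out.file <- file.path(tempdir(), "guidelines_demo.pdf")

pdf(out.file, width=4, height=3.5, useDingbats=FALSE)
par(mai=c(0.7, 0.7, 0.1, 0.1),   # small margins, no room reserved for a title
    cex.lab=1.3, cex.axis=1.1)   # enlarge axis labels and tick labels
plot(x, y, type='l',
     lwd=2.5,                    # data line thicker than the axis lines
     bty='l',                    # half-open frame: bottom and left sides only
     xlab='Time (h)',            # hypothetical axis labels
     ylab='Signal (a.u.)')
dev.off()
```

To check label sizes as suggested above, open the resulting PDF and zoom out until it is about 3-4 inches wide on screen.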

Creating figures with plain R

We produce most of our figures in the lab with R. The advantage of making figures with R is that creation of the figure is tightly integrated with the data analysis process, and that we can script and automate figure creation. The latter point is particularly important for reproducibility; a data file plus associated R script is all that is needed to regenerate the exact published figure.

Below follows an example R script to generate a typical figure. You can use this script as a template to generate similar figures. If you need to place two or more figures next to each other, the simplest way to achieve that with the template script is to make use of the split.screen() function.

require(Hmisc) # for function errbar()

# The data to plot.
mean.zdg <- c( 0.35208113, 0.07153585, -0.04377547, -0.12779811,
    -0.25646981, -0.18000377, -0.17827170, 0.03797358, -0.10975094,
     0.04821887, 0.03103208, 0.12747170, 0.07016604 ) # means
se.zdg <- c( 0.1192534, 0.1421630, 0.1408142, 0.1497453, 0.1508856,
     0.1492282, 0.1563277, 0.1174525, 0.1337940, 0.1261310,
     0.1473556, 0.1263988, 0.1258108 ) # standard errors 
window.start <- c( 1, 11, 21, 31, 41, 51, 61, 71, 81, 91, 101, 111,
     121 ) # start position of the analysis window (in nucleotides)

# The code to generate the figure. Output file will be "T7_zdg.pdf".
# The option "useDingbats=FALSE" fixes a font problem that some
# open-source pdf readers experience.
pdf( "T7_zdg.pdf", width=4.5, height=4, useDingbats=FALSE )
par( mai=c(0.65, 0.65, 0.1, 0.05), mgp=c(2, 0.5, 0), tck=-0.03 )
plot( window.start, mean.zdg, type= 'l', col='black',
      ylim=c(-0.45, 0.45), axes=FALSE,
      xlab='Window start position (nt)',
      ylab=expression(bar(Z)[Delta][G]))
errbar( window.start, mean.zdg, mean.zdg+se.zdg, mean.zdg-se.zdg,
        add=TRUE, bg='grey60', pch=18, cex=1, xlab='', ylab='' )
abline( h=0, col='grey60', lty=2 )
axis( 1,
      at=c( 1, 11, 21, 31, 41, 51, 61, 71, 81, 91, 101, 111, 121 ),
      c(1, NA, 21, NA, 41, NA, 61, NA, 81, NA, 101, NA, 121) )
axis( 2,
      at=c(-.4, -.3, -.2, -.1, 0, .1, .2, .3, .4),
      c(-0.4, NA, -0.2, NA, 0, NA, 0.2, NA, 0.4) )
legend( "topright", "Phage T7", pch=c(18), col=c('black'), bty='n' )
dev.off()
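
For placing two panels next to each other with split.screen(), as mentioned above, one possible layout is sketched below. The data, the y-axis labels, and the output file name are hypothetical placeholders; only the par() settings are borrowed from the template script.

```r
# Hypothetical data for two panels; replace with your own.
x  <- seq(1, 121, by=10)
y1 <- sin(x / 20)
y2 <- cos(x / 20)

# Hypothetical output file; written to the session's temporary directory.
out.file <- file.path(tempdir(), "two_panels.pdf")

# Twice the width of the single-panel template, same height.
pdf(out.file, width=9, height=4, useDingbats=FALSE)
screens <- split.screen(c(1, 2))  # one row, two columns; returns the screen numbers

screen(screens[1])                # activate the left panel
par(mai=c(0.65, 0.65, 0.1, 0.05), mgp=c(2, 0.5, 0), tck=-0.03)
plot(x, y1, type='l', bty='l',
     xlab='Window start position (nt)', ylab='Panel A value')

screen(screens[2])                # activate the right panel
par(mai=c(0.65, 0.65, 0.1, 0.05), mgp=c(2, 0.5, 0), tck=-0.03)
plot(x, y2, type='l', bty='l',
     xlab='Window start position (nt)', ylab='Panel B value')

close.screen(all.screens=TRUE)    # leave split-screen mode
dev.off()
```

Each panel is drawn completely before the next screen is selected, which is the usage pattern the split.screen() documentation recommends.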

Notes on using R

  • There are two well-known packages that automate much of the work of preparing multivariate graphics: lattice and ggplot2.
  • The ColorBrewer website can be helpful in selecting colors that reinforce the story your data tells. The website also helps you select color schemes that print in black and white. The RColorBrewer package then provides palettes to use these color schemes in your R plots.
  • If you are using Linux, use the graphics devices in the Cairo package (e.g., CairoSVG(), CairoPS()) to produce your graphics. The points in graphics produced with the default device sometimes render as letters in document viewers.
  • If you are using ggplot2, you may increase the relative amount of ink in the graph that conveys data by setting custom theme options. The following options remove the background grid and the boxes around panel labels:
g <- g + theme_bw()
g <- g + theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank(), panel.background=element_blank())
g <- g + theme(strip.background=element_blank(), strip.text.y=element_text())
  • If you are using ggplot2, it should also be possible to find some option or edit the source code to produce half-open plot borders. A workaround is to export an SVG graphic and then edit the paths in the SVG file that describe the plot borders.
  • Exporting the graphic to SVG also allows you to use a graphical editor such as Inkscape to make a few final touches on the graph.
  • If there is something you want to do, search the web. Often a tutorial or mailing-list exchange will quickly come up and provide you with useful information. Here are a few notable websites:
    • R Graphical Manual: This website allows you to browse through the graphics produced by the example code of a large number of R packages.
    • Learning R: This blog has a lot of tips and examples of ggplot2 usage. There is also a series summarized and distributed in this post with many examples of how to make multivariate figures with both lattice and ggplot2.
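
For the half-open plot borders mentioned in the ggplot2 notes above, theme settings may be enough, without editing an SVG file. The following is a minimal sketch under two assumptions: a ggplot2 release that provides the theme(), element_blank(), and element_line() interface, and made-up example data.

```r
library(ggplot2)

# Hypothetical example data.
d <- data.frame(x=1:10,
                y=c(2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9, 9.1, 9.8, 11.2))

g <- ggplot(d, aes(x=x, y=y)) + geom_line()
g <- g + theme_bw()
g <- g + theme(panel.grid.major=element_blank(),        # no background grid
               panel.grid.minor=element_blank(),
               panel.border=element_blank(),            # drop the full box
               axis.line=element_line(colour='black'))  # lines along the drawn axes only
```

With the default axis placement this leaves lines at the bottom and left only. The built-in theme_classic() gives a similar look with less typing.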