McClean: Plotting Stacked Histograms: Difference between revisions

From OpenWetWare
Latest revision as of 11:55, 4 June 2014

Summary

This page explains the basics of plotting histograms stacked vertically, which lets you see a shift in, for instance, the fluorescence of a population of cells analyzed by flow cytometry. Example code and data are under "Files" at the end of the page.

Example

Your data could be anything. In this example, the variable "Data" (stored in MakeStackHistData.mat) contains five (5) rows, each of which contains 9000 fluorescence readings from a flow cytometry experiment. Each row represents a time point, with induction of GFP increasing with time.


%% Preliminaries:

close all; clear all;
load('MakeStackHistData.mat')

Choose bins (you probably want to use the same bins for every plot, since you will be stacking them along one y-axis) and then bin your data using the Matlab "hist" command. We also keep track of the distributions' means, since we use these to color the histograms later.

%Set up bins (we are making histograms of flow cytometry data so we chose logarithmically spaced bins):
bins=logspace(0,4,60);
x=bins;

%Bin the data using "hist" and keep track of the number of elements "n" in each bin "x" for each row in "Data".  Also keep track of the mean of each row of "Data":

HistData=[];  
Means=[];

for i=1:5
    [n,x]=hist(Data(i,:),x);
    HistData=[HistData; n./sum(n)];
    Means=[Means mean(Data(i,:))];
end
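
If you do not have MakeStackHistData.mat at hand, the binning step can be tried on synthetic data. The sketch below is an illustration only: the log-normal random numbers and their parameters are made up and merely stand in for real flow cytometry readings, and the arrays are preallocated rather than grown inside the loop. It also checks two properties of the approach: logspace(0,4,60) gives 60 bin centers from 1 to 10^4 with a constant ratio between neighbours, and each normalized row of HistData sums to 1.

```matlab
% Synthetic stand-in for "Data": 5 rows of 9000 log-normally distributed values.
% All distribution parameters are arbitrary assumptions for illustration.
nRows  = 5;
nCells = 9000;
mu   = linspace(log(50), log(800), nRows)';            % log-mean rises with time point
Data = exp(repmat(mu, 1, nCells) + 0.4*randn(nRows, nCells));

bins = logspace(0, 4, 60);                             % same bins as in the example

HistData = zeros(nRows, numel(bins));
Means    = zeros(1, nRows);
for i = 1:nRows
    n = hist(Data(i,:), bins);       % a vector argument is treated as bin centers
    HistData(i,:) = n ./ sum(n);     % normalize to the fraction of the population
    Means(i) = mean(Data(i,:));
end

ratio   = bins(2:end) ./ bins(1:end-1);   % constant for logarithmically spaced bins
rowSums = sum(HistData, 2);               % each entry should be 1
```

Because "hist" places values that fall outside the bin range into the first or last bin, every reading is counted and the normalization by sum(n) always yields fractions of the whole population.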

We set up a colormap so that our histograms change in color as the mean of their distribution increases:


%% Define a colormap for the histograms that will make the histograms brighter as the mean of the distribution increases

% In this case we chose to make the histograms brighter green at higher
% mean values since the flow cytometry data is of GFP.
 
%Define a color map
MMColorMap=zeros(5,3);

%Define colors so that they scale with the difference between the mean
%fluorescence at a given timepoint and the mean at time 0


MM=sort(Means);
MMdiff=Means-Means(1);
MMdiff=MMdiff./(max(MMdiff));


MMColorMap(1:end,2)=MMdiff;  %The colormap is RGB, so changing the second column changes the green values.

%Set up the figure and axis properties:
h=figure; hold on; colors=colormap;

set(gca,'XScale','log')
set(gca,'XLim',[10,2000])
set(gca,'PlotBoxAspectRatioMode','manual')
set(gca,'PlotBoxAspectRatio',[1 3 1])
set(gca,'FontSize',12)
set(gca,'XTick',[100 1000 10000 100000])
set(gca,'YTick',[0 1])
ylabel('Fraction of Cell Population','FontSize',14)
xlabel('Fluorescence [a.u.]','FontSize',14)
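
The color scaling defined above assumes that the first row has the lowest mean, so that MMdiff runs from 0 to 1. As a worked example with made-up means (not taken from the example data), the sketch below shows the resulting green values. It also adds a clamp to [0, 1], which is not present in the original script, for data in which a later time point falls below the first.

```matlab
Means = [100 150 300 600 1100];    % hypothetical mean fluorescence per time point

MMdiff = Means - Means(1);         % difference from the first time point
MMdiff = MMdiff ./ max(MMdiff);    % scale so the largest difference maps to 1
MMdiff = min(max(MMdiff, 0), 1);   % clamp (added here): RGB values must lie in [0, 1]

MMColorMap = zeros(numel(Means), 3);
MMColorMap(:, 2) = MMdiff(:);      % the second column is the green channel

% The green values are now 0, 0.05, 0.2, 0.5 and 1 for the five time points.
disp(MMColorMap)
```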


Plot the histograms along the y-axis. We choose the spacing variable empirically so that the plot "looks good":


spacing=.15;  %Spacing along the y-axis chosen empirically 

for i=1:5
    fill([x(1);x'; x'],[i*spacing; (HistData(i,:)+i*spacing)'; ones(1,length(x))'*i*spacing],MMColorMap(i,:),'LineStyle','none')
    semilogx(x,HistData(i,:)+i*spacing,'LineWidth',3,'Color','k');
end
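
As one possible refinement of the loop above (a sketch, not part of the original script), the spacing can be derived from the data instead of being set by hand, and the fill outline can return along the baseline from right to left, so that the polygon closes without a diagonal edge. The toy HistData matrix, the seven bin centers, and the overlap factor are assumptions chosen only to make the example self-contained.

```matlab
x = logspace(0, 4, 7);                  % a few bin centers, for illustration only

% Toy normalized histograms (each row sums to 1).
HistData = [0 0.1 0.6 0.3 0.0 0.0 0
            0 0.0 0.2 0.5 0.3 0.0 0
            0 0.0 0.0 0.2 0.5 0.3 0];
nRows = size(HistData, 1);

overlap = 0.75;                         % assumed value; below 1 lets neighbours overlap slightly
spacing = overlap * max(HistData(:));   % baseline offset derived from the tallest peak

h = figure; hold on;
set(gca, 'XScale', 'log')
for i = 1:nRows
    base = i * spacing;
    % Outline: left to right along the histogram, then right to left along the baseline.
    px = [x, fliplr(x)];
    py = [HistData(i,:) + base, base * ones(1, numel(x))];
    fill(px, py, [0 i/nRows 0], 'LineStyle', 'none')
    plot(x, HistData(i,:) + base, 'k', 'LineWidth', 3)
end
```

Tying the spacing to the tallest peak keeps the layout stable when the number of bins changes, since finer bins lower every bar of a normalized histogram.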

Save your figure in a variety of formats for later use (recall that we made h our figure handle):

saveas(h,'ExampleStackedHistograms','fig')
saveas(h,'ExampleStackedHistograms','png')
saveas(h,'ExampleStackedHistograms','ai')
saveas(h,'ExampleStackedHistograms','pdf')
 

Code

You can copy and paste the code below into a Matlab m-file to run all of the examples shown above. You will also need the example data file "MakeStackHistData.mat" (listed under "Files"):

%% Preliminaries:

close all; clear all;
load('MakeStackHistData.mat')

%% Define the bins to use for our data (you will need to adjust this depending on your data):

%In this case we are using the same bins for each data set.  You probably
%want to do this when you are plotting stacked histograms.

bins=logspace(0,4,60);
x=bins;

%% Bin your data using Matlab's "hist" function.

%The variable "n" will be the number in each bin described by the variable
%"x".  HistData will become a matrix of the normalized bins (normalized to
%the total number of elements).  Means will become a vector of the mean
%value for each distribution, which we will use when coloring our
%histograms (so that colors roughly correspond to the mean of the
%distribution).

HistData=[];  
Means=[];

for i=1:5
    [n,x]=hist(Data(i,:),x);
    HistData=[HistData; n./sum(n)];
    Means=[Means mean(Data(i,:))];
end



%% Define a colormap for the histograms that will make the histograms brighter as the mean of the distribution increases

% In this case we chose to make the histograms brighter green at higher
% mean values since the flow cytometry data is of GFP.
 
%Define a color map
MMColorMap=zeros(5,3);

%Define colors so that they scale with the difference between the mean
%fluorescence at a given timepoint and the mean at time 0


MM=sort(Means);
MMdiff=Means-Means(1);
MMdiff=MMdiff./(max(MMdiff));


MMColorMap(1:end,2)=MMdiff;

%Set up the figure and axis properties:
h=figure; hold on; colors=colormap;

set(gca,'XScale','log')
set(gca,'XLim',[10,2000])
set(gca,'PlotBoxAspectRatioMode','manual')
set(gca,'PlotBoxAspectRatio',[1 3 1])
set(gca,'FontSize',12)
set(gca,'XTick',[100 1000 10000 100000])
set(gca,'YTick',[0 1])
ylabel('Fraction of Cell Population','FontSize',14)
xlabel('Fluorescence [a.u.]','FontSize',14)

%% Plot the histograms along the y-axis

spacing=.15;  %Spacing along the y-axis chosen empirically 

for i=1:5
    fill([x(1);x'; x'],[i*spacing; (HistData(i,:)+i*spacing)'; ones(1,length(x))'*i*spacing],MMColorMap(i,:),'LineStyle','none')
    semilogx(x,HistData(i,:)+i*spacing,'LineWidth',3,'Color','k');
end


%% Save the histogram figure
saveas(h,'ExampleStackedHistograms','fig')
saveas(h,'ExampleStackedHistograms','png')
saveas(h,'ExampleStackedHistograms','ai')
saveas(h,'ExampleStackedHistograms','pdf')

Notes

Please feel free to post comments, questions, or improvements to this protocol. Happy to have your input!

  • Megan N McClean 17:27, 17 July 2013(EDT): This ought to get you started. There are many improvements that could be made. For instance, a more sophisticated/attractive color scheme or automatic selection of the spacing along the y-axis. Knock yourselves out!

Files

Script for stacked histogram example

Data for stacked histogram example
