Preparing to Use STEM
- First, I downloaded the STEM software and registered with the website.
- After unzipping the file (7-zip > Extract Here), I launched the program using the command window.
- To do this, I went to the start menu and clicked Programs > Accessories > Command Prompt.
- Then I entered the following commands in the window that appeared:
java -mx512M -jar stem.jar -d defaults.txt
Preparing the Spreadsheet
- I opened my master spreadsheet and inserted a new worksheet and named it "stem".
- I copied over data from the "final" worksheet to the "stem" worksheet.
- Renamed the columns: "MasterIndex"→ "SPOT" and "ID"→ "Gene Symbol"
- Deleted all of the data columns except AvgLogFC columns.
- Renamed the data columns with time and units for simplicity.
- Saved the worksheet as Text (Tab-delimited) (*.txt).
- Expression Data Info: selected my file, chose "No normalization/add 0", and checked "Spot IDs included in the data file"
- Gene Info: "Saccharomyces cerevisiae (SGD)", with no cross references and no gene locations
- Options: STEM Clustering Method was selected, no changes
- Execute: Run the program
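The spreadsheet preparation above could also be scripted instead of done by hand in Excel. This is only a minimal sketch, not what I actually did: it assumes the "final" worksheet has been exported as rows of text, and the data column names in the example (such as "wt_AvgLogFC_t15") are hypothetical. It renames "MasterIndex" and "ID", keeps only the AvgLogFC columns, and writes tab-delimited text. It does not rename the data columns with time and units.

```python
import csv
import io

# Column renames taken from the notebook steps above.
RENAMES = {"MasterIndex": "SPOT", "ID": "Gene Symbol"}


def prepare_stem_table(header, rows, data_marker="AvgLogFC"):
    """Keep the two ID columns plus every column whose name contains data_marker."""
    keep = [i for i, name in enumerate(header)
            if name in RENAMES or data_marker in name]
    new_header = [RENAMES.get(header[i], header[i]) for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


def to_tab_delimited(header, rows):
    """Serialize the table as tab-delimited text, one row per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical example data; real column names and values will differ.
header = ["MasterIndex", "ID", "wt_AvgLogFC_t15", "wt_Pval_t15", "wt_AvgLogFC_t30"]
rows = [["1", "YAL001C", "-0.52", "0.01", "-1.10"]]

new_header, new_rows = prepare_stem_table(header, rows)
stem_text = to_tab_delimited(new_header, new_rows)
print(stem_text)
```

The resulting text could be written to a .txt file and selected under Expression Data Info in the same way as the file saved from Excel.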
Viewing and Saving STEM Results
- Changed to "Based on real time" from Interface Options and took a screenshot of this window (saved to PowerPoint)
- Opened detailed plots of each profile and took individual screenshots (saved to PowerPoint)
- "Profile Gene Table" and "Profile GO Table": saved tables and uploaded to Lionshare with correct names
Analyzing and Interpreting STEM Results
- Selected profile 9 for further interpretation, which was mostly down-regulated and never up-regulated. I chose this profile because I find down-regulated genes easier to understand and explain. It also followed a simple pattern that I knew I would be able to put into context.
- 221 genes were assigned to this profile.
- 55.9 (56) genes were expected to be assigned to this profile.
- The p value is 1.5E-65 (significant).
- Opened the GO list and selected the third row. From the menu, I clicked Data > Filter > Autofilter.
- Looked at terms with a p value of < 0.05: 38 terms
- Looked at terms with a corrected p value of < 0.05: 2 terms
- GO:0005737 cytoplasm: All of the contents of a cell excluding the plasma membrane and nucleus, but including other subcellular structures.
- GO:0044445 cytosolic part: Any constituent part of cytosol, that part of the cytoplasm that does not contain membranous or particulate subcellular components.
- GO:0005829 cytosol: The part of the cytoplasm that does not contain organelles but which does contain other particulate matter, such as protein complexes.
- GO:0005622 intracellular: The living contents of a cell; the matter contained within (but not including) the plasma membrane, usually taken to exclude large vacuoles and masses of secretory or ingested material. In eukaryotes it includes the nucleus and cytoplasm.
- GO:0044424 intracellular part: Any constituent part of the living contents of a cell; the matter contained within (but not including) the plasma membrane, usually taken to exclude large vacuoles and masses of secretory or ingested material. In eukaryotes it includes the nucleus and cytoplasm.
- GO:0008483 transaminase activity: Catalysis of the transfer of an amino group to an acceptor, usually a 2-oxo acid.
- GO:0016769 transferase activity, transferring nitrogenous groups: Catalysis of the transfer of a nitrogenous group from one compound (donor) to another (acceptor).
- GO:0034637 cellular carbohydrate biosynthetic process: The chemical reactions and pathways resulting in the formation of carbohydrates, any of a group of organic compounds based on the general formula Cx(H2O)y, carried out by individual cells.
- GO:0009063 cellular amino acid catabolic process: The chemical reactions and pathways resulting in the breakdown of amino acids, organic acids containing one or more amino substituents.
- GO:0044444 cytoplasmic part: Any constituent part of the cytoplasm, all of the contents of a cell excluding the plasma membrane and nucleus, but including other subcellular structures.
- SOURCE: http://geneontology.org
- The pattern for my profile was 0.0, -1.0, -2.0, -2.0, -1.0, 0.0. Most of these most significant GO terms have to do with the cytoplasm or cytosol. A couple relate to forming carbohydrates and breaking down amino acids, but I was truly surprised to see that the cytoplasm/cytosol/intracellular parts would matter so much in this profile. To be honest, I still don't understand how these elements would be regulated, or for what purpose or advantage. I can only imagine that changing the interior makeup of the cell might affect the surface area to volume ratio and help a down-regulated cell conserve heat and energy.
- Opened web window .
- Opened gene list in Excel for profile 9.
- Copied the list of gene IDs into web box for ORFs/Genes.
- Checked the box for Check for all TFs.
- Unchecked the box for Indirect Evidence.
- Clicked the Search button.
- Top 10 transcription factors: Ste12p (36.2%), Rap1p (29.9%), Fhl1p (18.6%), Cin5p (14.0%), Phd1p (13.6%), Sok2p (13.1%), Yap6p (12.7%), Yap5p (11.3%), Skn7p (10.4%), Yap1p (9.5%).
- GLN3 is on the list, representing 3.2% and 7 genes: YDR210w, YEL007w, UGA1, CPS1, PUT1, ZEO1, WTM1.
- Transcription factors I used to generate the matrix and diagram: CIN5, CUP9, FHL1, GTS1, HSF1, MSN1, MSN4, NRG1, RAP1, RCS1, REB1, ROX1, RPH1, YAP1, YAP6, GLN3, STE12, PHD1, SOK2, YAP5, SKN7
- I added the top five (non-overlapping) transcription factors to the list because I figured that if they regulate such a large portion of the genes in the profile, they should be included in the map.
- Before I generated the figures, I unchecked the box for "Indirect Evidence" and selected "JPEG" from the drop-down menu for the "Output Image".
- Clicked "Generate"
- Saved RegulationMatrix to Lionshare.
- Clicked on the "Image" link to see the diagram of the network.
- Pasted image into my PowerPoint file and uploaded to Lionshare.
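As a quick sanity check on the numbers recorded above, here is a short sketch of the arithmetic. It assumes (my assumption, not something the tools state here) that the transcription factor percentages are the share of the 221 profile 9 genes regulated by each factor. Under that assumption, the 7 GLN3 targets should come out to 3.2%. It also computes how many times more genes were assigned to profile 9 than expected. The function names are hypothetical helpers for illustration.

```python
# Values copied from the notebook entries above.
assigned = 221      # genes assigned to profile 9
expected = 55.9     # genes expected to be assigned to profile 9
gln3_targets = ["YDR210w", "YEL007w", "UGA1", "CPS1", "PUT1", "ZEO1", "WTM1"]


def percent_of_profile(n_targets, n_profile_genes):
    """Share of the profile's genes regulated by one factor, as a percentage."""
    return round(100.0 * n_targets / n_profile_genes, 1)


def fold_over_expected(n_assigned, n_expected):
    """Ratio of genes assigned to the profile versus the expected count."""
    return round(n_assigned / n_expected, 2)


gln3_percent = percent_of_profile(len(gln3_targets), assigned)
enrichment = fold_over_expected(assigned, expected)

print(gln3_percent)  # 3.2, matching the reported GLN3 percentage
print(enrichment)    # 3.95, roughly four times the expected number of genes
```

The GLN3 figure agreeing with the reported 3.2% is consistent with the percentages being taken over the full 221-gene list, though this does not prove that is how they were calculated.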
Screenshots for Week 12
Regulation Matrix for Dahlquist_wt Profile 9