Haynes:GalaxyChiP

From OpenWetWare

Revision as of 16:20, 28 March 2014


Getting Started: Create a Galaxy Account

Intro: Databases usually do a very good job of collecting and sharing published data and making it searchable and usable by scientists outside of the publishing author's lab. However, it is almost impossible for databases to keep up with all of the data that different labs are constantly generating. You therefore still need some DIY skills to make use of the most recent data. Not every scientist can become a software developer and build custom tools on the fly, and no single software package can do everything you want. One solution is the Galaxy platform, a suite of tools available online for free. This platform balances ease of use with customizability.


  1. Go to http://usegalaxy.org.
  2. Under User in the top menu, select Register to create a new account. It's free.



View Human Genes as Map (Track)

When an organism's entire genetic content has been "read" by biochemical sequencing techniques, every single A, T, C, and G is recorded in a database in the order that each letter (nucleotide) appears in the organism. The collective data is referred to as a sequenced genome. Because each nucleotide has a position, each one gets a number, or coordinate. Since genes are made up of nucleotides, each gene is assigned a range of coordinates that marks its start and end. All of this data is stored and shared with scientists so that we can all make discoveries that link back to a universal set of coordinates in whatever genome we are investigating. The data are usually stored as tables with thousands of rows of numbers. Galaxy can be used to turn the numerical data into easy-to-view illustrated genomic maps.

Part 1: Transfer gene mapping data to Galaxy

  1. Log in to Galaxy.
  2. Click Get Data > UCSC Main Table Browser in the left menu.
  3. In the window titled Table Browser, set the parameters to the following values:
    1. clade: mammal, genome: human, assembly: Mar. 2006 (NCBI36/hg18)*
    2. group: Genes and Gene predictions, track: RefSeq Genes
    3. table: refGene
    4. region: genome
    5. output format: BED - Browser Extensible Data, Send output to: Galaxy
    6. output file: (leave blank)
    7. file type returned: plain text
  4. Click "Get Output."
  5. In the window titled Output refGene as BED, set the parameters to the following values:
    1. name = tb_refGene, description = table browser query on refGene, visibility = full, url = (leave blank)
    2. Create one BED record per: Whole Gene
  6. Click "Send query to Galaxy"
  7. A new file will appear under History (right side menu). Wait until the job has completed.
  8. Under History, click the name of the file to open up more options.
  9. Click the eye (View Data) icon. You should see a huge table with many rows that look something like this:
    chr1 67051159 67163158 NM_024763 0 - 67052400 67163102 0 17
  • *Note - 3/28/14 - Newer releases of the human genome (hg19, hg38) have been made available. Much of the genomic data that you will be using is most likely based on the hg18 map. Use hg19 or another release instead if that is the release your data are based on.
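
To make the table preview easier to read, here is a minimal Python sketch that splits the example row from step 9 into named fields. It assumes the standard BED column order (chromosome, start, end, name, score, strand) and the BED convention that the start is 0-based and the end is not included, so end minus start gives the length. The names BedRecord and parse_bed_line are made up for this illustration and are not part of Galaxy.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    """First six columns of a BED row (hypothetical helper, not a Galaxy API)."""
    chrom: str
    start: int   # 0-based start coordinate
    end: int     # end coordinate, not included in the feature
    name: str
    score: str
    strand: str


def parse_bed_line(line: str) -> BedRecord:
    """Split one whitespace-separated BED row into named fields."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 columns, got %d" % len(fields))
    return BedRecord(fields[0], int(fields[1]), int(fields[2]),
                     fields[3], fields[4], fields[5])


# The example row shown in the table preview above.
record = parse_bed_line(
    "chr1 67051159 67163158 NM_024763 0 - 67052400 67163102 0 17"
)
print(record.name, record.chrom, record.strand, record.end - record.start)
# prints: NM_024763 chr1 - 111999
```

In words: transcript NM_024763 lies on the minus strand of chromosome 1 and spans 111,999 nucleotides. The columns after the strand are ignored in this sketch.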

Part 2: View the data as a map

  1. Click Visualization in the top menu. In the pop-up menu, select New Track Browser.
  2. Set the Browser name to Human Genome.
  3. Set the Reference genome build to Human Mar. 2006 (NCBI36/hg18), or whichever genome is consistent with the data you imported in the previous step.
  4. Click Create. This sets up an empty framework for displaying data that maps to the coordinates of the reference genome you selected. Now you must populate the map with data.
  5. Click Add Datasets to Visualization.
  6. Select the refGene data and click Add. Wait for the data to load.
  7. After the data has loaded, you should see some colors within the track. Zoom in until you see thin lines and thick bars that look like gene annotations.
  8. Move your cursor over the track label on the left to open display options. Click the downward arrow (Set display options) icon. Set the display to Pack. You should now see transcript labels (e.g., NM_12345678).
  9. Use the slider at the bottom of your browser window to scroll across the map.
  10. Use the drop-down menu at the top to switch between chromosomes.
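
Scrolling and zooming amount to choosing a chromosome and a window of coordinates, then drawing only the features that fall inside that window. The Python sketch below illustrates that idea only; it is not how Galaxy draws tracks internally. The function names and the two rows marked hypothetical are invented for the example.

```python
def overlaps(start, end, window_start, window_end):
    """True if the range [start, end) shares at least one position with the window."""
    return start < window_end and end > window_start


def features_in_view(features, chrom, window_start, window_end):
    """Keep features on the chosen chromosome that overlap the viewing window."""
    return [
        f for f in features
        if f[0] == chrom and overlaps(f[1], f[2], window_start, window_end)
    ]


features = [
    ("chr1", 67051159, 67163158, "NM_024763"),  # row from the refGene preview
    ("chr1", 1000, 2000, "example_A"),          # hypothetical row
    ("chr2", 67100000, 67200000, "example_B"),  # hypothetical row
]

# View chr1 from coordinate 67,000,000 to 67,100,000.
visible = features_in_view(features, "chr1", 67000000, 67100000)
print([f[3] for f in visible])
# prints: ['NM_024763']
```

Switching chromosomes in the drop-down menu corresponds to changing the chrom argument, and moving the slider corresponds to shifting the window start and end.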




Add ChIP Data to a Genome Map (Track)

These steps walk you through uploading BED data from a ChIP experiment and viewing them alongside a gene map.

  1. If you have not already done so, store one or more BED files on your local hard drive. If you are interested in using data from another lab, make sure that the lab provides the data to you in BED format.
  2. Log in to Galaxy.
  3. Click Get Data > Upload File in the left menu.
  4. Set the parameters to the following values:
    1. File: browse for the BED file on your hard drive
    2. Genome: Select a genome that corresponds to your BED data. If you are using the human genome, be sure to select the hg number (e.g., hg18) that corresponds to your BED data.
    3. Convert spaces to tabs: yes
  5. Click the [Execute] button.
  6. Under History on the right side of the page you should see some indication of the data being uploaded to Galaxy. BED files are typically very large. Please be patient while the file uploads.
  7. When the upload is finished, this item will be labeled with a #1 and the file name under History.
  8. Click the eye icon to preview the data. It should look similar to the example below:
  9. Click the pencil icon to edit the file. You can rename the data file here. Importantly, make sure that the Chrom column, Start column, End column, Name/Identifier column, and Strand column are set to 1, 2, 3, 4, and 6, respectively. Column 5 is usually the "score" of the row in a BED file. Select the “Datatype” tab and make sure it is set to interval. Click “Save.”
  10. Your data is now ready to be analyzed.
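
If you would like to check a BED file on your own computer before uploading it, the Python sketch below shows one way to do so. It assumes the six-column layout described in step 9 (chromosome, start, end, name, score, strand). The function names are made up for this example, and the whitespace handling is only an approximation of what a spaces-to-tabs conversion does; it would break names that contain spaces.

```python
def normalize_bed_line(line):
    """Replace each run of spaces or tabs with a single tab."""
    return "\t".join(line.split())


def check_bed_line(line):
    """Return a list of problems found in one tab-separated BED row."""
    fields = line.split("\t")
    if len(fields) < 6:
        return ["fewer than 6 columns"]
    chrom, start, end, name, score, strand = fields[:6]
    problems = []
    if not (start.isdigit() and end.isdigit()):
        problems.append("start or end is not a whole number")
    elif int(start) >= int(end):
        problems.append("start is not less than end")
    if strand not in ("+", "-", "."):
        problems.append("strand is not '+', '-' or '.'")
    return problems


good = normalize_bed_line("chr1 67051159 67163158 NM_024763 0 -")
bad = normalize_bed_line("chr1 500 100 peak_1 0 *")  # hypothetical faulty row

print(check_bed_line(good))
print(check_bed_line(bad))
```

An empty list means that no problems were found in that row. The second row is reported twice: once because its start is larger than its end, and once because of the unrecognized strand symbol.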

Note: Because you are a registered user, Galaxy will save this data under History until you delete it (even after you log off and log back in). Note that you have a space limit, so you should delete any files under History that are incorrect, or that you no longer need.

Visualize the data as a track

  1. You must start with uploaded data. See "Getting Started, Part 2".
  2. Select Visualization in the top menu. In the pop-up menu, select New Track Browser.
  3. Click on the name of the data file under History to open up the options.
  4. Click the histogram (bar graph) icon. In the pop-up menu, select Trackster.