LncRNA ReleaseNotes: Difference between revisions

Latest revision as of 17:08, 20 January 2012

September 16, 2011

Notes

2012 Jan 19

PRC2 core function: Suz12, Eed, Ezh2, Jarid2, Pcl2/Mtf2

Function of Suz12? Comparison of Suz12gt and WT ChIP data: 2420 genes lose H3K27me3, 970 genes maintain H3K27me3.

pick prc2 components from Joe and other RNA-seq and chip-seq datasets

generate a sub-network for all the components

hoxD cluster genes? 1,

suz12gt, suz12gt-bgal-kd, ezh1 kd, ezh2 kd,





Paul 2012 Jan 10

intro: active, silent, poised; expression figure

project

H2A.Z interacting partners: HMGN2 interaction with H2A.Z

a. Use HMGN2 expression in as many tissues as possible
b. Use mass spectrometry to identify protein interaction partners
c.


HMG proteins: HMGN1, 2, 3a, 3b, 4, and 5. Why HMGN?

HMGN2 ChIP-seq peaked at TTS. How is it calculated?

Distribution of ChIP-seq data, percentage of binding sites:

  promoter 1000bp: 42.4%
  promoter 1000-2000bp: 2.4%
  distal intergenic: 19.5%
  intron: 14.3%
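One possible way to calculate such a distribution, as a minimal sketch only: bin each peak by its distance to the nearest TSS and report the percentage per bin. The coordinates below are made up, the function names are hypothetical, a single chromosome is assumed, and the intron and distal intergenic categories (which need full gene models) are collapsed into one "other" bin. The bin boundaries are an assumption about how the promoter categories above were defined.

```python
from bisect import bisect_left
from collections import Counter

def nearest_tss_distance(pos, sorted_tss):
    """Absolute distance from pos to the closest TSS (sorted_tss must be sorted)."""
    i = bisect_left(sorted_tss, pos)
    # Only the TSS just before and just after pos can be the nearest one.
    candidates = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(pos - t) for t in candidates)

def bin_peak(distance):
    """Assumed bins: within 1 kb, 1-2 kb, and everything else."""
    if distance <= 1000:
        return "promoter <=1000bp"
    if distance <= 2000:
        return "promoter 1000-2000bp"
    return "other"

def binding_distribution(peaks, tss):
    """Percentage of peaks falling in each distance bin."""
    sorted_tss = sorted(tss)
    counts = Counter(bin_peak(nearest_tss_distance(p, sorted_tss)) for p in peaks)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

# Made-up TSS and peak positions on one chromosome.
tss = [10_000, 50_000, 90_000]
peaks = [10_200, 49_500, 51_500, 70_000]
distribution = binding_distribution(peaks, tss)
```

A real analysis would work per chromosome and per strand and would use an annotation-aware tool for the intron and intergenic calls.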

H2A.Z knockdown ChIP-seq analysis: use ChIP-seq - great for enrichment analysis. k-means clustering: use 6 or 7 clusters, genes vs TSS distance. Enrichment analysis of clusters not done yet.
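The clustering step could look roughly like the sketch below: a plain Lloyd's k-means over a genes-by-distance-bin signal matrix. The matrix values, the bin layout, and the function name are illustrative assumptions. The toy example uses k=2 so the result is easy to check by eye; the notes suggest 6 or 7 clusters for the real data.

```python
import random

def kmeans(rows, k, iterations=50, seed=0):
    """Lloyd's k-means on equal-length numeric rows; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid (squared Euclidean).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its member rows.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical signal per gene in bins at -2kb, -1kb, TSS, +1kb, +2kb.
profiles = [
    [0, 1, 9, 1, 0],
    [0, 2, 8, 2, 0],
    [1, 1, 10, 1, 1],
    [3, 3, 3, 3, 3],
    [4, 3, 3, 3, 4],
    [3, 4, 3, 4, 3],
]
labels, centroids = kmeans(profiles, k=2)
```

Here the TSS-peaked genes and the flat-profile genes end up in separate clusters. On real data, results depend on the starting centroids, so running several seeds and comparing is a sensible precaution.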

A small compendium of ChIP-seq cluster analysis? Try mutual information.
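As one way to try mutual information, the sketch below compares two cluster assignments of the same genes (for example, clusterings from two ChIP-seq datasets). The labels are made up and the function name is hypothetical. Mutual information is measured in bits: 0 means the two clusterings are independent, and higher values mean that knowing one assignment tells you more about the other.

```python
from collections import Counter
from math import log2

def mutual_information(labels_a, labels_b):
    """Mutual information (in bits) between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi

# Same partition with swapped cluster names: fully informative (1 bit).
same = mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
# Unrelated partitions: no shared information (0 bits).
unrelated = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the measure ignores cluster names, it does not matter that k-means numbers its clusters arbitrarily.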

Meet with Paul and the team to answer questions:

Consequence of H2A.Z and HMGN? Mechanism of H2A.Z and HMGN: TSS? Promoter? Time or location? Confirmation: take a target set to do PCR?

August 30, 2011

lncRNA Annotation

Description

Identify differentially regulated lncRNAs during cardiomyocyte differentiation using the sequencing files from the cardiomyocyte differentiation time course. Analyze these data further to produce what is essentially Figure 1 of the next paper.

Goals

Global analysis of differential lncRNA expression during cardiomyocyte differentiation.

Datasets

The lncRNA annotation file, compiled from all lncRNAs contained in Ensembl and from the Guttman et al. Nature paper, can be used to identify lncRNAs in the cardiomyocyte expression datasets.

Methods

Some ideas for moving forward:

  1. Cluster lncRNA data to find stage-specific expression patterns for the lncRNAs. What is the best representation for this?
  2. Cluster lncRNA with the rest of the expression data to determine broader clusters.
  3. Determine potential pathways (GO, Ingenuity, Gene Set Enrichment Analysis?) based on the broader clusters.
  4. Compare chromatin patterns to the lncRNA expression cluster data (we have these data as part of the cardiac consortium)
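As a first, very simple representation for idea 1, each lncRNA can be z-scored across the time course and grouped by the stage at which it peaks. This is a stand-in for proper clustering, not a replacement for it. The lncRNA names and FPKM values below are invented, and the stage labels assume the D0, D4, D5.3, and D10 time points of the cardiomyocyte series.

```python
from statistics import mean, pstdev

STAGES = ["D0", "D4", "D5.3", "D10"]  # assumed time points

def zscore(values):
    """Scale one gene's profile to mean 0 and unit variance (flat profiles become all zeros)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]

def peak_stage(values, stages=STAGES):
    """Stage with the highest z-score, or None for a flat profile."""
    z = zscore(values)
    if max(z) == 0.0:
        return None
    return stages[z.index(max(z))]

def group_by_stage(expression):
    """Map each peak stage to the list of genes peaking there."""
    groups = {}
    for gene, values in expression.items():
        groups.setdefault(peak_stage(values), []).append(gene)
    return groups

# Hypothetical FPKM values per lncRNA at D0, D4, D5.3, D10.
expression = {
    "lncA": [50, 5, 4, 3],
    "lncB": [1, 2, 30, 8],
    "lncC": [2, 3, 25, 6],
    "lncD": [7, 7, 7, 7],
}
groups = group_by_stage(expression)
```

The z-scored profiles are also a reasonable input for k-means or hierarchical clustering, since they put highly and lowly expressed lncRNAs on the same scale.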

Use these data to identify candidate lncRNAs for further genetic analysis, to derive informative data from the expression analysis that reveal more about the function of these lncRNAs, and possibly to learn more about their regulation.

References

August 29, 2011

EB differentiation time course

Progress

A report was delivered on Aug. 29, 2011. The missing D4 experiment will be conducted and its dataset included in the next round of analysis.

August 16, 2011

EB differentiation time course

Description

Analyze the RNA-Seq data generated for the lncRNA (lnc011) knockdown (kd) and control ESC lines (0d) that were differentiated into EBs (6d and 9d).

Goals

The goal is to display the differentially expressed genes in a figure and to further analyze the genes that are mis-regulated as a function of the kd relative to the control (scrambled). A heat map or a more informative presentation of the data is expected.

A further goal is to learn more about how the lncRNA (lnc11) may control cell fate specification by "plotting" the expression of the mis-regulated genes (in the GO categories) along the cardiomyocyte differentiation pathway using the RNA-Seq data for the various time points (namely D0, D4, D5.3, and D10).

Datasets

Tables attached including fold change relative to control.

The file (all.ComparisonExpn.txt) contains counts and FPKM values generated from the sequencing files.

The large spreadsheet contains all the tests for differential expression. There are some semi-redundant columns for counts and FPKMs. If a gene has no reads at d0 and d4, no counts or statistical test are reported for it. If it does have reads at d5.3, however, d0 and d4 are shown as 0 and a statistical test is conducted.
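The zero-handling rule described for the spreadsheet can be sketched as below. This illustrates the stated behavior only and is not the code that produced the file: the function names are hypothetical, `None` stands in for "no reads seen", and the pseudocount used to keep the fold change defined when a count is 0 is an added assumption.

```python
from math import log2

def comparison_row(counts):
    """counts: dict of time point -> read count, with None where no reads were seen.

    Returns None when no time point has reads (no counts, no test);
    otherwise fills the missing time points with 0 so a test can be run.
    """
    if all(c is None or c == 0 for c in counts.values()):
        return None
    return {tp: (c or 0) for tp, c in counts.items()}

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio with a pseudocount (an assumption) so zeros do not divide by zero."""
    return log2((treated + pseudocount) / (control + pseudocount))

# Reads only at d5.3: d0 and d4 are reported as 0 and the gene is tested.
tested = comparison_row({"d0": None, "d4": None, "d5.3": 12})
# No reads at any time point: nothing is reported.
skipped = comparison_row({"d0": None, "d4": None, "d5.3": None})
lfc = log2_fold_change(3, 0)
```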

Methods

Ideally, generate a GO-type figure and a gene network figure, possibly by selecting the genes included in over-represented GO categories. The preliminary GO analysis via GOStat (attached), using a 1.5x cut-off for "down-regulated genes", shows enrichment for categories that have roles in heart function, including muscle contraction, sarcomere function, heart development, blood circulation, etc.
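For orientation, the two steps behind such an analysis can be sketched as follows: select down-regulated genes with the 1.5x cut-off, then score a GO category for over-representation with a one-sided hypergeometric test. This is a generic sketch and not GOStat's implementation; the gene names, fold changes, and category sizes are invented, and fold change is assumed to be the kd/control ratio.

```python
from math import comb

def down_regulated(fold_changes, cutoff=1.5):
    """Genes whose kd/control ratio is at or below 1/cutoff."""
    return [gene for gene, fc in fold_changes.items() if fc <= 1.0 / cutoff]

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical kd/control fold changes.
fold_changes = {"geneA": 0.5, "geneB": 0.9, "geneC": 0.6}
selected = down_regulated(fold_changes)

# Toy numbers: 10 genes in total, 5 annotated to the category,
# 5 genes selected, and all 5 selected genes carry the annotation.
p_value = hypergeom_pvalue(k=5, n=5, K=5, N=10)
```

With many GO categories tested at once, the p-values need a multiple-testing correction before any category is called enriched.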

Progress

The report was delivered on Aug. 29, 2011. The missing D4 experiment will be conducted and its dataset included in the next round of analysis.