User:Robert M. MacCallum/WTFGSB Reportback
Wellcome Trust Functional Genomics and Systems Biology Workshop
30 November to 1 December 2009
A few talks have no notes, usually because they were too specific.
Edison Liu: ‘Integrative Study of Estrogen Receptor Biology in Human Cancer’
Estrogen (or is it EGF) receptor (ER) binding site analysis (ChIP and bioinf) - "Cosmic" score, correlation with RNA PolII binding and H3K4meX marks.
Some functional binding is 1Mb away from gene!! Only 9% in 5k "promoter".
Looping for efficient transcription, grouping of coregulated genes ("looped out" genes don't respond to ER)
Johan Rung: ‘A multi-stage genome-wide association study detects a novel risk locus near IRS1 for type 2 diabetes, insulin resistance, and hyperinsulinemia’
GWAS for type 2 diabetes
F Pradezynski: ‘Systems Level Approach of Hepatitis C Virus Infection’
Y2H between various virus proteomes and human proteins.
Many human pathways are interfered with, in particular the ones you'd expect (interferon response)
Chris Bakal: ‘Describing the Systems Architecture of Cell Morphogenesis’
Wounding, cell morphology, image analysis -> 100+ feature profile of each cell's morphology.
"canalised" morphology space (jumps between states)
Keith Baggerly: ‘The Importance Of Reproducibility In High-Throughput Biology: A Case Study’
Reproducibility in hi-thru biology
The data was in GEO but when analysed again, the gene lists, heat maps etc were completely different.
Eventually an "off by one" error was found, caused in equal measure by pasting data from Excel and by the lack of documentation for the software (an R package).
Later papers from the offending authors had further errors (mislabeled drugs, repeated figures from earlier work). Letters to the editor were answered with "we did it again and got the same results" (can you believe it!).
In the end, the study had led to clinical trials and so Baggerly and colleagues published a proper paper exposing the problems in a statistical journal. Soon after that the medical journals were on the case and the trials were stopped.
Nick Luscombe: ‘Nucleoporins, chromosomal organisation and gene regulation.’
Nuclear lamins known to tether transcriptionally inactive DNA
Nucleoporins now shown to be assoc with active gene expression.
Also through ChIP some proteins bind to enable X chromosome dosage compensation.
Mark Gerstein: ‘Understanding Protein Function on a Genome-scale using Networks’
A review of several years' network work. Including some Venter ocean sample sequence analysis (map to pathways, correlate with environmental factors with some canonical ..... method (is this like bi-clustering?))
Yoram Louzoun: ‘Immunomic analysis of viruses CD8+ T cell epitope repertoire’
Not in programme.
Mentioned an epitope prediction approach called SIR (Size of Immune Repertoire score) which models MHC peptide binding.
Some scheduled speakers didn't speak in this session.
Day two, session one
Seth Grant: ‘System Biology of The Synapse and Behaviour’
Complexity of post-synaptic molecular machinery (several thousand proteins). Conserved in invertebrates (50% of prots) and single celled (25%). Evolution of the machinery (including plasticity) preceded evolution of synapses.
Very slow evolution.
Caleb Webber: ‘Identifying CNV genes that contribute to developmental delay and autism’
CNV in mouse
What's special about pathological CNVs? (vs. benign)
Human CNVs look up mouse phenotypes (somehow!)
(Didn't follow this very frenetic talk 100%)
Florian Markowetz: ‘Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells’
ES cell histone modifications
days 1, 3, 5 of ES development - 4 analyses:
Protein MS; ChIP-chip histone; RNA Pol II; Microarrays
day 0 nanog TF downreg -> network of TFs
clustering of smoothed histone profiles (around TSS)
when mRNA upreg, small local acetylation around TSS; when mRNA down, wider deacetylation around TSS.
increased correlation between H acet and gene expression through time (more at day 5 than day 1) genome-wide
predict gene expr from histone acetylation using LOTS of ML methods (in R)
Grant Belgard: ‘Transcriptome-Wide Functional Anatomy of Mouse Cortical Layering Revealed Through Deep Sequencing’
6 layers of neocortex
many cell types spanning several layers
paired end 50bp reads
(you get some intronic reads)
some intergenic regions detected (a few percent of reads)
layer specific genes, various layers show various GO enrichments.
John Hogenesch: ‘A journey through the clock network’
Circadian clock genes through hi-thru func genomics. nice robot video.
siRNA screen (seems to be tunable to desired knockdown level)
clock pathway is robust - surprising lack of lethal knock outs
Day two, session two
Peter Hoen: ‘Functional Genomics as a Readout In Therapy Development’
(standing in for Gert-Jan van Ommen)
Duchenne muscular dystrophy
Andrew Teschendorff: ‘Pathway-Centric Classification of Breast Cancer’
classification of breast cancer
Dan Geschwind: ‘Human-Specific Transcriptional Regulation of Cns Development Genes By Foxp2’
transcriptional regulation of CNS development genes by FoxP2
looked at human vs chimp regulation of genes (microarray) in a cell line.
many genes respond differently (up and down)
But why? The 2 AA diffs are not in known DNA binding domain
6 genes regulated via proximal promoter (luciferase reporter)
validated in vivo
haNCS human accelerated non coding sequences (look this up)
Recent paper showing two mitochondrial network types in neurons (synaptic and cell body)
Compare human vs chimp networks
Douglas Kell: ‘The cellular uptake of pharmaceutical drugs: a problem not of biophysics but of systems biology’
Suit and tie alert!
networks described in unambiguous fashion, SBML, ChEBI SMILES etc for small molecules.
uptake of drugs, via transporters (proteins).
Day two, session three
Genevieve Konopka: ‘Comparative Gene Expression in Primate Brain Using Nextgen Sequencing’
Can't do multi-species (human, chimp, macaque) on a human affy chip.
Next gen sequencing! Four brain regions.
Networks from WGCNA
Tom Freeman: ‘Identification of Expression Networks in Immunity’
Networks in immunity
mentioned proteasome (did I see that on VB expression map wrt immunity?)
graphical markup for pathways
some kind of flow simulations through them
Day two, session four
Frank Holstege: ‘Understanding regulatory circuitry through expression-profile phenotypes’
1200 regulatory components, TFs, kinases, ch remodelers, RNA processing -> mutations and expression microarrays
done so far deletome
some kinases have no diff expr, is it because they are inactive in standard conditions or is it because of redundancy?
They use some synthetic genetic interaction prediction to choose pairs
some kinases redundant with phosphatase! it's cross talk between two pathways (somehow).
different types of redundancy:
- quantitative (double has more effect than single(s))
- incongruent (effects in single are not in double)
also used the data for protein complex prediction
Stefan Wiemann: ‘Modeling and Experimental Testing of Cell Cycle Regulation by the Erbb- Protein and Mirna Network in Breast Cancer’
new targets for drug resistant breast cancer
ErbB signalling network
the drug is an ErbB2 antibody
Luis Serrano: ‘Systems Biology of a Small Bacterium’
689 ORFs + 44 RNAs
maybe only 10-11 TFs (E. coli 100 or so)
full complement of chromatin remodelling
plan was to do loads of -omics + electron microscopy
transcriptomics: arrays 62 conditions, tiling array
detailed look at transcripts (reverse strand ncRNA, no idea of mechanism) multiple TSSs
where you have operons encoding 4 genes, you don't just see mRNA of all four, you get different levels of each gene, somehow...
plenty of regulatory complexity
metabolome: KEGG didn't work out, had to do lots of manual work to build metabolic map. defined minimal medium.
know reactions are there, but 10-12 enzymes are not known
200 molecules per protein per cell
Day three, session one
Jurg Bahler: ‘Differential marking of intronic and exonic DNA regions with respect to RNA polymerase II occupancy, histone density, and H3K36me3 modification patterns’
Pre-post splicing levels measured with RNA seq.
Splicing efficiency regulated
Co-transcriptional splicing. Look for relationship between splicing and chromatin - H3K36me3 lower in introns.
Measured transcript levels after transcription blocking compound - measure decay, however drugs have side-effects. Better to measure PolII occupancy and RNA abundance and estimate decay with a formula.
Looked at response to oxidative stress. See patterns of expression with stable transcription.
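The "estimate decay with a formula" idea can be sketched roughly as follows. The actual formula was not given in the talk notes, so this is only a steady-state guess: assume synthesis is proportional to PolII occupancy and that RNA abundance is at steady state, so the decay rate is proportional to occupancy / abundance. The function names and the `scale` constant are hypothetical, and the result is only a relative rate unless `scale` is calibrated.

```python
import math


def relative_decay_rate(pol2_occupancy, rna_abundance, scale=1.0):
    """Steady-state estimate of a transcript's decay rate.

    Assumption (not from the talk): dR/dt = scale * P - d * R = 0,
    so d = scale * P / R, where P is PolII occupancy and R is RNA abundance.
    """
    if rna_abundance <= 0:
        raise ValueError("RNA abundance must be positive")
    return scale * pol2_occupancy / rna_abundance


def half_life(decay_rate):
    """Half-life implied by a first-order decay rate."""
    if decay_rate <= 0:
        raise ValueError("decay rate must be positive")
    return math.log(2) / decay_rate
```

Two genes with the same PolII occupancy but different RNA abundance then come out with different decay rates, with no transcription-blocking drug involved.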
Jaak Vilo: ‘Network reconstruction and mining of high-throughput data’
Network reconstruction. Analysis tools. GraphWeb NAR db issue.
"MEM" query similar expression in multiple datasets. You could put everything in together (like VB expr maps) but there could be crap data included. Instead they do a post-analysis of separate queries (one per dataset) using ranks. P-value for enrichment of low ranks. "Low" depends on a rank threshold - try all and find lowest p-value.
web tool may have anopheles affy data (they get it from ArrayExpress) - it does but can't figure out which gene or probe symbols to query with!
really nice annotation cloud mouse-over!
Adler et al Genome Biology 2009 in press
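A rough sketch of the rank-threshold idea behind MEM as noted above: for each candidate threshold, count the datasets where the gene ranks at or below it, compute a p-value for seeing that many, and keep the smallest. This is not the MEM implementation; the binomial test, the assumption that every dataset ranks the same number of genes, and the function names are all stand-ins for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_threshold_p_value(ranks, n_genes):
    """Try every observed rank as the "low" threshold; return (min p, threshold).

    ranks   -- rank of the candidate gene in each dataset's query result
               (1 = most similar to the query gene)
    n_genes -- number of genes ranked per dataset (assumed equal across datasets)

    Under the null, a rank is uniform on 1..n_genes, so the chance of a
    rank <= t in any one dataset is t / n_genes.
    """
    best = (1.0, None)
    for t in sorted(set(ranks)):
        hits = sum(1 for r in ranks if r <= t)
        p = binomial_tail(hits, len(ranks), t / n_genes)
        if p < best[0]:
            best = (p, t)
    return best
```

Note that the minimum over thresholds is not corrected for having tried several thresholds, so it is a score for ranking genes rather than a calibrated p-value. Datasets where the gene ranks badly (e.g. crap data) simply fall above the chosen threshold instead of diluting the signal.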
Annelies Fieuw: ‘Integrative analysis of coding and non-coding gene expression and copy numbers in neuroblastoma’
look for more genes implicated in pathology
looked for correlated m(i)RNA expression and genome copy number
Geoffrey Faulkner: ‘Transposed Elements are Massively Transcribed in Mammalian Cells’
1/2 human genome. only 100 mobile though. mostly Alu SINEs and L1 LINEs. something about neurons recently in Nature
plenty more immobile
CAGE - 25bp 5' tags, somehow find TE promoters through sequencing.
see correlated expression between TE and nearby gene
likely to be positive regulators
RNAi against TE transcripts - have phenotypes (myoblast morphology)
Eileen Furlong: ‘Making global predictions of cis-regulatory activity’
need map of cis-regulatory elements and their inputs
lots of chip-chip through development.
usually two antibodies for each TF (for consensus) and strict FDR
look for combinatorial binding
they've found 8000 CRMs (for mesoderm development) 2000 target genes
further expts determine +ve or -ve effect of CRMs
recommends Nature Genetics v36 2006, Reinitz group - models of TF networks for eve stripe 2 enhancer
80% of literature CRMs are in the chip set ("atlas")
predict 5 different expression patterns from binding signatures with SVM; predict expression of chip CRMs and, I guess, validate experimentally: 4 classes, 80% validated.
Day three, session two
Wolfgang Huber: ‘Detecting genetic interactions and multiparametric dynamic phenotypes in RNAi perturbation microscopy imaging assays’
transcriptome characterisation in yeast
3' nucleosome depleted regions
nucleosome depleted regions are shared in bidirectional promoters.
in yeast, antisense transcripts interfere with sense, also interfere with H modification (somehow) and also activating transcription
48 (wild?) strains and transcriptomics ((look for association with SNPs))
genes with antisense are more often OFF
anticorrelation of sense/antisense xscripts
is it "opportunistic" transcription? (pol finds open chromatin and xscribes) (Stuhl review?)
more TF binding sites for coding than antisense
strain data QTL for expression
- coding: 75% distal effects
- antisense: 50/50 local/distal (not sure of interpretation)
transcripts usually extend 100bp into the promoter of the opposite transcript. this is necessary for anti-correlated xscript levels
TATA also contributes to regulated expression
no evidence for translation of reverse xscripts so far
Felix Naef: ‘Rhythmic protein-DNA interactomes and circadian transcription regulatory networks’
understand gene expression programs under circadian control (e.g. genes expressed in heart at 10am, same kind of thing in liver)
known "E box" element (driven by TF BMAL1?) driven by circadian system, but it can't drive all downstream effects because of their timing - is there another element?
something special about tandem arrangement of E-box element. CLOCK/BMAL1 heterodimer binding
Bussemaker 2001 cis-reg network algorithm?
Phase specificity problem not solved yet.
Caroline Brorsson: ‘A Genome-Wide SNPxSNP Search for Epistasis Identifies Gene-Gene Interactions in Type 1 Diabetes’
Type 1 diabetes
12000 cases, 13000 controls -> 40 regions associated with T1D
mostly immune genes
small effects of each variant
looking for epistasis in GWAS
Mark McCarthy: ‘The End of the Beginning: Genetic Success and the Long Road to Functional Inference’
plenty of loci found now (~20?) but no decent prediction, AUC with variants only = 0.6, with BMI+Age=0.78! explains around 5% of predisposition
agilent array for CNVs, none found for T2D or some other diseases.
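For reference, the AUC figures quoted above (0.6 for variants only, 0.78 for BMI+Age) can be read as the probability that a randomly chosen case scores higher than a randomly chosen control. A minimal sketch of that definition, with a hypothetical function name and no connection to the speaker's actual analysis:

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher risk score; ties count as half. 0.5 is chance, 1.0 is perfect.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

On this reading, 0.6 means the variant-based score puts the case above the control in only about 6 of 10 random case/control pairs.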
Day three, session three
Matthias Uhlen: ‘Human Proteome Atlas’
goal: antibodies against all proteins http://proteinatlas.org
have got an epitope predictor - training data available
working towards a subcellular "index" for all proteins.
first draft proteome by 2014
long term goal, paired antibodies (shouldn't rely on one!)
all Abs available through sigma.
antibody specificity; proteins 10^7 dynamic range in cells; paired antibodies against different epitopes make it more specific
looking into FRET paired antibodies - seems to work.
don't have paired for the majority
some kind of validation with commercial abs, 40% work nicely??? wiki based community based validation of antibodies.
systems level analysis:
how many proteins are tissue specific (one cell type) < 2%
or at a larger level (say, brain) = 10%
some new tissue specific genes found though.
how many prots in a cell? ~65% of all
brain had fewest
levels of proteins do distinguish cell types
antibody "array" using beads, multiplexed 384 abs x 384 samples in one run!
used on blood plasma - personalised medicine and biomarkers
going to do 20 diseases (400 patients per disease) biobanks
George Koumbaris: ‘X-chromosome disorders: Identification of underlying mechanisms’
breakpoint analysis of X-chromosome disorders
Jean-Baptiste Cazier: ‘Methodological Aspects of Metabonome Quantitative Trait Locus Mapping in Organ Extracts using Nuclear Magnetic Resonance Profiling’
metabolome from fat, plasma, urine, etc with NMR spectra (40,000 points)
- through time: (no liver)
- multiple animals
found a locus (on chr14) and a metabolite (benzoate, a gut microbial metabolite)
Gavin Sherlock: ‘Molecular Characterization of the Fitness Landscape in Asexually Evolving Populations of Saccharomyces Cerevisiae’
yeast evolution (in the lab)
count and isolate clones in a population.
R, Y and G dyes in population
equal proportions, glucose limitation (for selection)
haploid, no sex, several 100 generations (I think)
observe "clonal interference"
isolate clones (facs and then split into 7 then fitness measure then solexa sequencing)
see mutants and amplifications (saw hexose transporter amplic)
clones arising later have more mutations
each of 5 lineages were distinct (no shared mutations) even if same colour
some mutations may be hitchhikers (not adaptive) ; some kind of sex under selection somehow lets you figure out which mutations are adaptive
Day three, session four
Steve Oliver: ‘Conservatism and Innovation in the Design and Evolution of a Simple Eukaryote’
more yeast evolution. hemizygous mutants in competition with each other. most genes in one copy give happy yeast, some show haploinsufficiency. some others are haploproficient.
these are "high flux" genes
after whole genome duplication, they are more likely to stay in two copies.
Chris Pacheco: ‘Missplicing of Cyclin-G Associated Kinase is a Risk Factor for Developing Parkinson’s Disease’
Alvis Brazma: ‘A global map of major transcriptional states of the human genome’
9000 raw data files from Affy U133A in GEO and ArrayExpress
After QC, 5372 samples remained (206 studies, 163 labs -> 369 conditions)
only about 25% "normal"
cell lines are very different
3rd is tissue of origin
first 3 components explain 37% of the variability
also a MDS (like Tom Freeman's graphs)
6 main classes: brain, muscle, x, y, z, q (beyond this, the signal is weak - maybe lab effects)
take a leukaemia cluster, figure out genes
also introduced gene expression atlas at ebi