User:Lindenb/Notebook/UMR915/20101119

From OpenWetWare


Indexing the genome

Adding the whole sequence of each chromosome to the BDB. Log output and final size on disk:

 Found >chr10 (overflows: 0 keys:0 time=0mins)
 Found >chr10_random (overflows: 0 keys:914598 time=4mins)
 Found >chr11 (overflows: 0 keys:914688 time=4mins)
 (...)
 overflows: 394766 keys: 1047086 time=593mins
 du -h bdb/
 7.8G    bdb/

GATK

Updated GATK from version 1.0.3471 to 1.0.4705.

The new version has changed the way it scans the reference sequence. See also Brad Chapman's observation: http://friendfeed.com/yokofakun/a0715b86/updated-gatk-it-doesn-t-want-my-reference-genome

For hg18, one must use the GATK-supplied Homo_sapiens_assembly18.fasta rather than the UCSC file (see http://getsatisfaction.com/gsa/topics/differences_in_gatk_supplied_hg18_reference_file_vs_ucsc).
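
One way to see how two reference files differ in sequence names or order is to compare their FASTA headers. This is a minimal sketch, assuming a POSIX shell with grep, sed, diff and mktemp available. The helper names `fasta_contigs` and `compare_contigs` are hypothetical and not part of GATK.

```shell
# fasta_contigs (hypothetical helper): print the sequence names of a FASTA
# file, one per line, in the order they appear in the file.
fasta_contigs() {
    # Keep header lines, drop the leading '>' and anything after the first whitespace.
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//'
}

# compare_contigs (hypothetical helper): show where two references disagree
# in contig names or contig order. Prints nothing and returns 0 when they agree.
compare_contigs() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    fasta_contigs "$1" > "$tmp_a"
    fasta_contigs "$2" > "$tmp_b"
    diff "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

For the two files used on this page, the call would be `compare_contigs /GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa /GENOTYPAGE/data/pubdb/broad.mit.edu/gsa/resources/Homo_sapiens_assembly18.fasta`. It only compares names and order, not the sequences themselves.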

Belgium data

No indel results for sample1 with the following command:

 java -jar GenomeAnalysisTK-1.0.4705/GenomeAnalysisTK.jar -T IndelGenotyperV2 -R /GENOTYPAGE/data/pubdb/broad.mit.edu/gsa/resources/Homo_sapiens_assembly18.fasta -I jeter.bam -bed jeter.bed -verbose jeter2.txt -o jeter3.vcf
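
To confirm that an output VCF really holds no calls, one can count the lines that are not header lines. This is a minimal sketch, assuming a plain-text (uncompressed) VCF. The helper name `vcf_record_count` is hypothetical.

```shell
# vcf_record_count (hypothetical helper): count the variant records in a VCF,
# i.e. the lines that do not start with '#'.
vcf_record_count() {
    # grep exits with status 1 when the count is 0; '|| true' keeps that
    # from aborting scripts run with 'set -e'.
    grep -v -c '^#' "$1" || true
}
```

Running `vcf_record_count jeter3.vcf` prints the number of records. A count of 0 shows that the file has no records, but says nothing about why no indels were called.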

Freebayes

https://github.com/ekg/freebayes

Had a problem compiling the program (an error involving convert.h; I changed the source, but I am not sure the fix is correct). The command line was:

 bin/freebayes --fasta-reference /GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa -i /GENOTYPAGE/data/users/lindenb/20101108_belgium/jeter.bam