User:Lindenb/Notebook/UMR915/20100618

From OpenWetWare
Jump to navigationJump to search

20100617        Top        20100621       


mysql

creating a table for the samples

 create table sample(
      id int unsigned primary key auto_increment,
      name varchar(20) unique,
      description text,
      created datetime,
      modified timestamp
      );

SNP calling

filtering on depth( min/max) and Q: minimum RMS mapping quality for SNP

 samtools pileup -vcf $(HG18) recal_bwa_rmdup_XXXX1.bam |\
  samtools.pl varFilter -d 4 -D 1200 -Q 25 |\
  java -jar ~/src/code915/dist/pileupfilter.jar -bed PCCR.bed  | wc -l
 43209

 gunzip -c daily/20100601/for_pierre_010610/vcf_maq_pccr_changed_filters/XXXX1.vcf.gz | wc -l
 35784

some filters are missing. OK, now just keep the single nucleotide mutations:

 (...) | awk -F "  " '{if(length($4)==1) print;}' | wc -l
 37121

adding filter for the snp quality:

 samtools pileup -vcf  $(HG18) recal_bwa_rmdup_XXXX1.bam | samtools.pl varFilter -d 4 -D 1200 -Q 25 |\
 java -jar ~/src/code915/dist/pileupfilter.jar -bed PCCR.bed -min-sq 25 |\
 awk -F "        " '{if(length($4)==1) print;}' > jeter.pileup
 wc jeter.pileup
 
 35286

comparison SS2/my data

 comm -12: 34187
 comm -13:  1569  (uniq to SS2 data) eg: 10:1036884,10:114049332
 comm -23:  1099  (uniq to MY data) e.g: 10:101153184,10,101283025

Comparaison Dindel/samtools

same as yesterday but remove non-PCCR from dindel (update: bug found in my src by SS2)

  • count dindel :1492
  • count samtools :3739
  • uniq dindel :889
  • uniq samtools :3136
  • comm dindel/samtools :603

script was: sh 20100618.sh

 cat ${HOME}/daily/20100604/XXXXX_variantCalls.VCF |\
 grep -v "#" | awk -F '	' '{if(length($5)!=1 && $7==".") print; }' |\
 java -jar  /home/lindenb/src/code915/dist/posfilter.jar -db XXX_0809_PCCR.bed |\
 cut -d '	' -f 1,2 | sort | uniq > jeter1.txt
 
 
 cat ${HOME}/daily/20100528/indels_samtools_bwa/PCCR/XXXXXXsamtoolsIndel.vcf |\
 grep -v "#" | awk -F '	' '{if(length($5)!=1) print; }' |\
 cut -d '	' -f 1,2 | sort | uniq > jeter2.txt
 
 echo -n "count dindel :"
 wc -l jeter1.txt
 echo -n "count samtools :"
 wc -l jeter2.txt
 echo -n "uniq dindel :"
 comm --check-order -23 jeter1.txt jeter2.txt | wc -l
 echo -n "uniq samtools :"
 comm --check-order -13 jeter1.txt jeter2.txt | wc -l
 echo -n "comm dindel/samtools :"
 comm --check-order -12 jeter1.txt jeter2.txt | wc -l
 
 rm jeter1.txt jeter2.txt