Wayne:High Throughput Sequencing Resources
From OpenWetWare
Revision as of 21:40, 15 February 2013
Basic server commands (for Sirius)
Here is a list of commonly used Linux commands:
| Command | Usage |
| pwd | Print working directory (your current location) |
| ls | List (all contents of current location) |
| ls options | ls -a (hidden files), ls -l (long/detailed list), ls -t (sorted by time modified instead of name) |
| cd /give/path | Change directories |
| cd .. | Go up one directory |
| mkdir directoryName | Make a new directory |
| rmdir directoryName | Remove directory (must be empty). Remember that you cannot undo this! |
| rm -r directoryName | Recursively remove directory and the files it contains. Remember that you cannot undo this! |
| rm filename | Remove specified file. Remember that you cannot undo this! |
| head filename | Print to screen the first lines of the specified file (10 by default) |
| tail filename | Print to screen the last lines of the specified file (10 by default) |
| more filename | Allows file contents or piped output to be sent to the screen one page at a time |
| less filename | Pager similar to more, but also allows scrolling backward through the file |
| wc filename | Print line, word, and byte counts |
| wc [options] filename | -c (bytes); -l (lines); -w (words, delimited by whitespace or newline) |
| whereis command | Locates the binary, source, and manual page files for the command |
| mv | Move (akin to cut/paste): the file is removed from its current location. Also used to rename a file if the destination is in the same path; Usage: mv current/path/filename destination/path/filename |
| cp | Copy (akin to copy/paste): the original stays in the current path; Usage: cp current/path/filename destination/path/filename |
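| nohup commands & | Start a no-hangup background job that keeps running after you log out |
| screen | Start a new screen session, in which jobs keep running after you detach or disconnect |
| tar -xzf filename.tar.gz | Decompress and extract a tar.gz archive |
| gzip -c filename > filename.gz | Compress file with gzip (this does not create a tar archive); -c writes the compressed data to standard output and ">" redirects it to the outfile filename.gz, leaving the original file in place |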
Here is a list of commonly used Linux commands for monitoring CPU utilization:
| Command | Usage |
| top | Displays the most CPU-intensive processes/jobs and provides an ongoing, real-time view of processor activity. It offers an interactive interface for manipulating processes and can sort tasks by CPU usage, memory usage, and runtime. |
| mpstat | Reports processor-related statistics, showing the utilization of each CPU individually. |
| mpstat -P ALL | Displays activity for each available processor (processor 0 being the first one), along with the global average across all processors. |
| sar | Displays the contents of selected cumulative activity counters in the operating system |
High throughput (HT) platform and read types
- ABI-SOLiD
- Illumina single-end vs. paired-end
- Ion Torrent
- MiSeq
- Roche-454
- Solexa
CBI Collaboratory
UCLA's Computational Biosciences Institute (CBI) Collaboratory hosts a variety of 3-day workshops, ranging from a general introduction to genome/bioinformatic sciences to more advanced (focus) workshops (e.g. ChIP-Seq, BS-Seq, exome sequencing). The CBI Collaboratory focuses on a set of publicly available resources: the web-based bioinformatic tool Galaxy/UCLA (a resource for HT workflows and a central location for a variety of HT tools covering multiple platforms and data types), as well as tools such as R and Matlab. The introductory workshops do not require any programming experience, and the Collaboratory Fellows additionally serve as a counseling resource for data analysis.
File formats and conversions
- bcl
- qseq
- fastq
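As a rough illustration of the fastq format listed above, here is a minimal Python sketch of a reader. It assumes the common four-lines-per-record layout (an "@" header line, the sequence, a "+" separator line, and a quality string of the same length as the sequence); the function name read_fastq and the example records are made up for illustration, and multi-line FASTQ records are not handled.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header.strip())
        yield header[1:].strip(), sequence, quality


# Hypothetical two-read example held in memory instead of a file.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n#I\n")
for read_id, sequence, quality in read_fastq(example):
    print(read_id, sequence, quality)
```

For real files, the same function can be given an open file handle, e.g. `with open("reads.fastq") as fh:` (file name hypothetical).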
Deplexing using barcoded sequence tags
- Edit (or Hamming) distance
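The idea of deplexing by distance can be sketched in Python as follows. This is a minimal sketch, not a production demultiplexer: it uses the Hamming distance (substitutions only, equal-length tags), and the function names, the sample names, the barcode sequences, and the max_mismatches threshold are all assumptions chosen for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read_tag, barcodes, max_mismatches=1):
    """Return the sample whose barcode is the unique closest match, else None."""
    if not barcodes:
        return None
    distances = sorted(
        (hamming_distance(read_tag, barcode), sample)
        for sample, barcode in barcodes.items()
    )
    best_distance, best_sample = distances[0]
    if best_distance > max_mismatches:
        return None  # too many mismatches to trust the assignment
    if len(distances) > 1 and distances[1][0] == best_distance:
        return None  # tie between two barcodes: ambiguous
    return best_sample


# Hypothetical barcode sheet (sample name -> barcode sequence).
barcodes = {"sampleA": "ACGTAC", "sampleB": "TGCATG"}
print(assign_barcode("ACGTAA", barcodes))
print(assign_barcode("GGGGGG", barcodes))
```

Barcode sets are usually designed so that any two barcodes differ at several positions, which is what makes it safe to tolerate a mismatch or two when assigning reads.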
Quality control
- Fastx tools
- Using mapping as the quality control for reads
Trimming and clipping
- Trim based on low quality scores per nucleotide position within a read
- Clip sequence artefacts (e.g. adapters, primers)
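The two operations above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: quality strings are taken to be Phred+33 encoded (older Illumina pipelines used Phred+64), trimming only removes low-quality bases from the 3' end, and adapter matching is exact, whereas dedicated tools tolerate mismatches. The function names, the min_quality and min_overlap thresholds, and the adapter sequence in the example are chosen for illustration only.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_low_quality_tail(seq, qual, min_quality=20, offset=33):
    """Drop bases from the 3' end until the last base meets min_quality."""
    scores = phred_scores(qual, offset)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def clip_adapter(seq, qual, adapter, min_overlap=3):
    """Remove an exact adapter match, or an adapter prefix at the 3' end."""
    pos = seq.find(adapter)
    if pos == -1:
        # Look for the longest adapter prefix that overlaps the end of the read.
        longest = min(len(adapter), len(seq)) - 1
        for overlap in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:overlap]):
                pos = len(seq) - overlap
                break
    if pos == -1:
        return seq, qual  # no adapter found
    return seq[:pos], qual[:pos]


# Illustrative read: the last three bases have low quality ('#' is Phred 2).
print(trim_low_quality_tail("ACGTACGT", "IIIII###"))
# Illustrative read ending in the first 7 bases of an example adapter sequence.
print(clip_adapter("ACGTACGTAGATCGG", "I" * 15, "AGATCGGAAGAGC"))
```

Clipping before trimming, or the reverse, can give slightly different reads; whichever order is used should be applied consistently across samples.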
FASTQC and FASTX tools
BED and SAM tools
- BED tools (http://code.google.com/p/bedtools/)
- SAMtools (http://samtools.sourceforge.net)
GATK variant calling
R basics
HT sequence analysis using R (and Bioconductor)
DNA sequence analysis
RNA-seq analysis
Common objectives of transcriptome analysis:
- Quantifying and annotating aligned reads
- Normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG) (R packages):
  - easyRNASeq (simplifies read counting per genome feature)
  - DEXSeq (inference of differential exon usage)
  - baySeq (also see: segmentSeq)
  - Genominator (Bullard et al. 2010)
- Detection of alternative splice junctions
SOLiD software tools


