PGP and Tranche: Difference between revisions
Andrea Loehr (talk | contribs) |
Revision as of 17:04, 19 April 2009
People
- User:Andrea Loehr (PGP)
- User:Alexander Wait Zaranek (PGP)
- User:James A. Hill (Tranche)
- User:Bryan E. Smith (Tranche)
Tranche
To increase the utility of project data and make more of it available to the public, the Personal Genome Project (PGP) has launched PersonalGenomes@Home. This effort uses ProteomeCommons.org's Tranche Network for persistent storage. The Tranche Project is a free and open source file sharing tool that enables collections of computers to easily share and cite scientific data sets. Designed and built with scientists and researchers in mind, Tranche aims to make data sharing secure and scalable.
Tranche User Account
To apply for a user account, fill out the form for a ProteomeCommons User Account. Pending applications are reviewed each business day.
System Requirements
Java Runtime Environment 5.0 or later; see System Requirements.
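One way to check this requirement from a shell is sketched below. The helper name java_major is hypothetical and not part of Tranche; it only reduces a version string of the kind printed by java -version to a major version number, assuming the legacy 1.x numbering (1.5 means Java 5) for older runtimes.

```shell
# Hypothetical helper (not part of Tranche): print the major Java version
# for a version string such as 1.5.0_22 or 11.0.2.
java_major() {
    case $1 in
        1.*) v=${1#1.}; echo "${v%%[._]*}" ;;   # legacy scheme: 1.5.x is Java 5
        *)   echo "${1%%[._-]*}" ;;             # newer scheme: 11.0.2 is Java 11
    esac
}

# Usage sketch (not run here): java -version writes to stderr and puts the
# version string in double quotes on its first line.
#   v=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
#   [ "$(java_major "$v")" -ge 5 ] && echo "Java is new enough for Tranche"
```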
Tranche User Guide and Instructions for Up- and Downloads
A detailed user guide can be found in the Tranche User Guide.
There are three ways to add or get data from the network:
- GUI: Go to the Tranche homepage and click "Launch Tranche". (Requires Java 5+ with Web Start)
- Command-line tools: See below
- Java API: For custom tool development
The most popular of the three is the GUI, as it is easy to use. The command-line tools are useful for automating tasks or working in headless environments, and the API is useful when integrating Tranche into a software project or creating a custom tool.
Tranche uploads and downloads can be run from the command line using the upload tool and the download tool.
wget --no-check-certificate https://proteomecommons.org/tranche/files/CommandLineAddFileTool.zip
wget --no-check-certificate https://proteomecommons.org/tranche/files/CommandLineGetFileTool.zip
In order to use these tools you also need a login, which you can get at ProteomeCommons.org.
Download each tool, unzip the file, change into the unzipped directory, and type java -jar NAME.jar --help to obtain usage information. (If java is not in your system path, add it to your path or type the full path: /path/to/java -jar NAME.jar --help.)
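The steps above can be sketched as a small shell function. This is only an illustration: zip_name and fetch_tool are hypothetical helper names, and because the text does not say which directory or jar each archive unpacks to, the sketch lists the archive contents before unpacking instead of guessing them.

```shell
# Sketch only. The URLs are the ones given in the text; the names of the
# unpacked directory and jar are not stated there, so they are not assumed.
zip_name() {
    # Print the file name part of a URL.
    echo "${1##*/}"
}

fetch_tool() {
    # $1 = URL of a tool archive: download it, list its contents, unpack it.
    zip=$(zip_name "$1")
    wget --no-check-certificate "$1" &&
    unzip -l "$zip" &&
    unzip "$zip"
}

# Usage (not run here):
#   fetch_tool https://proteomecommons.org/tranche/files/CommandLineGetFileTool.zip
#   then change into the unpacked directory and run: java -jar NAME.jar --help
```

Note that --no-check-certificate, taken from the commands above, turns off certificate verification for the download.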
For usage information: java -jar Tranche-Downloader.jar --help
Download a project with a certain hash: java -jar Tranche-Downloader.jar HASH
For usage information: java -jar Tranche-Uploader.jar --help
Upload a file:
java -Xmx512m -jar Tranche-Uploader.jar -u USER.zip.encrypted -p PASSWORD -c true -t "MY TITLE" -d "MY DESCRIPTION" /home/DataForUpload
There is the option to download/upload encrypted data:
java -jar Tranche-Downloader.jar -e supersecret HASH
java -jar Tranche-Uploader.jar -u FILE.zip.encrypted -p supersecret /home/DataForUpload
Example scripts are provided: download script and upload script.
To be notified about changes and upgrades, join the automated tool group for the command-line tools and API.
Transferring Data onto Tranche
For the initial data transfer, we could ship (two?) USB drives to the Biopolymers Facility (BPF):
Attn: Andrew Gagne
Biopolymers Facility
77 Ave. Louis Pasteur
Room 0088
Boston, MA 02115
We have:
- PGP2 - FC37_2 - http://genomerator.freelogy.org/~awz/pgp2-FC_00037_L002/
Note: Other data sets could appear on the hard disk(s) with this directory structure. On arrival, data could be loaded into Tranche as 100 data "bundles" per data set (i.e., per Illumina lane).
We need:
- PGP1 - FC37_3
- PGP3 - FC35_3
- PGP5 - FC44_2
- PGP7 - FC44_4
- PGP8 - FC37_1,FC51_2,FC51_6
- PGP9 - FC43_3,FC51_3,FC51_7
- PGP10 - FC41_3
Also, could use:
- CONTROL - FC35, FC37, FC41, FC43, FC44, FC51.
For each of the above there is a top-level directory (e.g., pgp2-FC_00037_L002) with exactly 36 directories below it. Each of those directories contains 4x100 files. For this release, it would be ideal if the data were organized in Tranche as 18x100 "randomly addressable" data sets that a volunteer computer could request as desired. Each addressable "bundle" of data would then contain 4x36 files.
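The bundle sizes follow from simple arithmetic, worked through below in shell. The variable names are illustrative, and reading "18" as the number of lanes (the 12 PGP lanes listed above plus 6 CONTROL flow cells) is an inference from the lists, not something the text states outright.

```shell
dirs_per_lane=36        # directories under each top-level lane directory
files_per_dir=400       # 4 x 100 files in each directory
bundles_per_lane=100    # proposed addressable bundles per lane
lanes=18                # assumed: 12 PGP lanes + 6 CONTROL flow cells

files_per_lane=$((dirs_per_lane * files_per_dir))         # 36 * 400
files_per_bundle=$((files_per_lane / bundles_per_lane))   # equals 4 * 36
total_bundles=$((lanes * bundles_per_lane))

echo "files per lane:   $files_per_lane"
echo "files per bundle: $files_per_bundle"
echo "total bundles:    $total_bundles"
```

Under these assumptions each lane holds 14400 files, each bundle 144 files, and the release would consist of 1800 bundles.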
Example: Upload project, download a portion using command-line tools
- Get a directory to test with.
besmit@besmit-kubuntu:~/PGP-Test$ wget -r -l 1 http://genomerator.freelogy.org/~awz/pgp2-FC_00037_L002/C36.1/
- Move the downloaded directory contents to C36.1/, then upload this directory to Tranche. A login is required to upload. See -h or --help for information about parameters. The very last argument is the directory to upload.
besmit@besmit-kubuntu:~/Desktop/TrancheLabs/Upload$ java -Xmx512m -jar Tranche-Uploader.jar -U bryan -P ********** -d "This is my description. Passphrase required for download." -t "This is my title: C35.1 encrypted" -e pgptest4 -c true C36.1/
- This is the stderr for the project. Intended for debugging, etc.
Using batch chunk upload?: yes
Started total of 10 file encoding threads.
- This is the stdout for the project - the hash used to identify the project. This should be saved.
uiRL5wtqG5FyzE9PnJG47dbxuU3PqpX3aE2Gq9SNJa5vRvlgn14hwUEBW8UZyXIeQWLP9B49sb6/W8dBOz1+QfRC5UkAAAAAAAEnnA==
- Download only the TIFF files whose names refer to G or C nucleotides.
besmit@besmit-kubuntu:~/Desktop/TrancheLabs/Download$ java -Xmx512m -jar Tranche-Downloader.jar -e pgptest4 -r '_[gc].tif.gz$' uiRL5wtqG5FyzE9PnJG47dbxuU3PqpX3aE2Gq9SNJa5vRvlgn14hwUEBW8UZyXIeQWLP9B49sb6/W8dBOz1+QfRC5UkAAAAAAAEnnA==
- The only output is the path to the download directory, shown when the download completes.
/home/besmit/Desktop/TrancheLabs/Download/tranche-downloads/C36.1
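To see what a filter like the one in this example selects before starting a long download, the same expression can be tried on a few file names with grep -E. This is a sketch under two assumptions: the file names below are made up for illustration, and the downloader's -r option is assumed to search each file path with the regular expression, which the example suggests but does not confirm.

```shell
# Hypothetical helper: read names on stdin, print those the expression selects.
preview_filter() {
    grep -E -- "$1"
}

# Made-up file names, one per nucleotide channel.
names='s_2_0001_a.tif.gz
s_2_0001_c.tif.gz
s_2_0001_g.tif.gz
s_2_0001_t.tif.gz'

# Quote the expression so the shell passes it on unchanged.
printf '%s\n' "$names" | preview_filter '_[gc].tif.gz$'
```

With these names, only the two ending in _c.tif.gz and _g.tif.gz are printed. The unescaped dots in the expression match any character; writing them as \. would make the filter stricter.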
PGP on Tranche
The public PGP data are now available on Tranche.
To download using the command line tool:
java -Xmx512m -jar Tranche-Downloader.jar -r 'PATH' PGP_HASH
The following data are available on Tranche. Use these paths as 'PATH' in the command above:
PGP/PGP_1/PGP_1_FC_00037/PGP_1_FC_00037_L003/
PGP/PGP_3/PGP_3_FC_00035/PGP_3_FC_00035_L003/
PGP/PGP_3/PGP_3_FC_00037/PGP_3_FC_00037_L002/
PGP/PGP_5/PGP_5_FC_00044/PGP_5_FC_00044_L002/
PGP/PGP_7/PGP_7_FC_00044/PGP_7_FC_00044_L004/
PGP/PGP_8/PGP_8_FC_00037/PGP_8_FC_00037_L001/
PGP/PGP_8/PGP_8_FC_00051/PGP_8_FC_00051_L002/
PGP/PGP_8/PGP_8_FC_00051/PGP_8_FC_00051_L006/
PGP/PGP_9/PGP_9_FC_00043/PGP_9_FC_00043_L001/
PGP/PGP_9/PGP_9_FC_00051/PGP_9_FC_00051_L003/
PGP/PGP_9/PGP_9_FC_00051/PGP_9_FC_00051_L007/
PGP/PGP_10/PGP_10_FC_00041/PGP_10_FC_00041_L003/
CONTROL/CONTROL_FC00035/CONTROL_FC00035_L001/
CONTROL/CONTROL_FC00037/CONTROL_FC00037_L008/
CONTROL/CONTROL_FC00041/CONTROL_FC00041_L001/
CONTROL/CONTROL_FC00043/CONTROL_FC00043_L001/
CONTROL/CONTROL_FC00044/CONTROL_FC00044_L001/
CONTROL/CONTROL_FC00051/CONTROL_FC00051_L001/
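The paths above follow a regular pattern, so they can be generated rather than typed. The sketch below uses two hypothetical helper functions, pgp_path and control_path, whose format strings are inferred from the listing; PGP_HASH remains the placeholder used in the download command above and must be replaced with the real hash.

```shell
# Hypothetical helpers: build a PATH value from plain decimal numbers.
# Pass numbers without leading zeros (51, not 00051); printf adds the padding.
pgp_path() {
    # $1 = participant number, $2 = flow cell number, $3 = lane number
    printf 'PGP/PGP_%d/PGP_%d_FC_%05d/PGP_%d_FC_%05d_L%03d/\n' \
        "$1" "$1" "$2" "$1" "$2" "$3"
}

control_path() {
    # $1 = flow cell number, $2 = lane number
    printf 'CONTROL/CONTROL_FC%05d/CONTROL_FC%05d_L%03d/\n' "$1" "$1" "$2"
}

# Print (without running) the download commands for both PGP_8 lanes
# on flow cell 51.
for lane in 2 6; do
    echo "java -Xmx512m -jar Tranche-Downloader.jar -r '$(pgp_path 8 51 "$lane")' PGP_HASH"
done
```

Printing the commands first makes it easy to review them before running anything against the network.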