Polysat: Difference between revisions

From OpenWetWare

Revision as of 15:38, 27 September 2010

polysat is an R package for polyploid microsatellite analysis in ecological genetics. Version 1.0, the second public release, has been available on CRAN since September 2010.

What polysat does

  • Assumes allele copy number ambiguity in partial heterozygotes
  • Handles data of any ploidy, including mixed ploidy samples
  • Stores genotype data in a simple format that can be easily manipulated to exclude or add samples and loci
  • Imports and exports data in ABI GeneMapper Genotypes Table, GenoDive, Structure, SPAGeDi, ATetra, Tetrasat/Tetra, and binary presence/absence formats.
  • Calculates pairwise distances between individuals using a stepwise mutation model or infinite alleles model
  • Counts alleles to assist user in estimating ploidy
  • Estimates allele frequencies and calculates pairwise FST based on these estimates. Mixed ploidy population size is measured in genomes rather than individuals.
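
To illustrate what allele copy number ambiguity means in the first item above, here is a minimal sketch in base R. It is not code from polysat; the function name possible_genotypes and the allele sizes are made up for illustration. Given the alleles observed in a partial heterozygote and an assumed ploidy, it lists every fully resolved genotype that is consistent with the observation.

```r
# Hypothetical helper, not part of polysat: enumerate the fully resolved
# genotypes consistent with a set of observed alleles at a given ploidy.
possible_genotypes <- function(alleles, ploidy) {
  alleles <- sort(unique(alleles))
  extra <- ploidy - length(alleles)   # copies whose identity is unknown
  if (extra < 0) stop("more distinct alleles than the ploidy allows")
  if (extra == 0) return(list(alleles))   # full heterozygote: no ambiguity
  # Every ordered way to fill the unknown copies from the observed alleles
  grid <- as.matrix(expand.grid(rep(list(alleles), extra)))
  # Sort each fill so that order is ignored, then drop the repeats
  fills <- unique(lapply(seq_len(nrow(grid)),
                         function(i) sort(unname(grid[i, ]))))
  # Each observed allele is present at least once, plus one of the fills
  lapply(fills, function(f) sort(c(alleles, f)))
}

# A tetraploid showing only alleles 102 and 106 has three candidate genotypes
possible_genotypes(c(102, 106), ploidy = 4)
```

Because the number of candidates grows with ploidy, analyses that need one exact genotype per individual cannot be applied directly to data like these, which is why the package assumes the ambiguity instead of resolving it.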

Author and Maintainer

Lindsay V. Clark

Obtaining polysat

If you don't already have R, download it from CRAN and install it.

At the prompt in the R console, type:

install.packages("combinat")

install.packages("polysat")

library(polysat)
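
If you repeat these steps on several computers, the following sketch installs only what is missing. The helper missing_packages is a made-up name, not something supplied by R or polysat, and the install step is limited to interactive sessions because install.packages may ask you to choose a CRAN mirror.

```r
# Hypothetical helper: return the names of packages that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("combinat", "polysat"))
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)   # may prompt for a CRAN mirror
}

# Load polysat and report its version, if it is available
if (requireNamespace("polysat", quietly = TRUE)) {
  library(polysat)
  print(packageVersion("polysat"))
}
```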

Documentation

Tutorial manual: Most users will want to read this first to get a general idea of how to use the package. It starts with a broad tutorial to familiarize users with the package, then goes into more detail about how data are stored in polysat and which analyses are appropriate for autopolyploid and allopolyploid data.

R code from tutorial manual: You can copy and paste this code into the R console in order to follow along with the tutorial, or edit it to work with your own data. Emacs Speaks Statistics is a really handy program for editing this type of file and sending lines directly to R, but you can also use a simpler text editor such as Notepad to view and edit this file.

Reference manual: This is an alphabetized collection of all of the help files provided with the package. It contains more details about each function, as well as additional examples.

How to cite polysat

Clark, LV and Jasieniuk, M, 2011. POLYSAT: an R package for polyploid microsatellite analysis. Molecular Ecology Resources (in review).

Wish List

Since I have just recently released polysat, I am very interested in getting feedback!

This section lists additional functionality that I'm thinking of adding to polysat. If you have any additional requests (please be specific), or would like to "vote" for one of the items below to be a top priority, just send me an email! If you have created your own functions to interface with the package and would like to be added as a contributor, I am open to that as well.

  • For allopolyploids, assign alleles to one genome or the other based on what genotypes are found in the population. (This is a complex problem and not on the to-do list for my dissertation, but could be very useful.) Use these allele assignments to re-code allopolyploid data into autopolyploid data by splitting each locus into two or more loci.
  • On a related note, test whether genotype distributions in a population are consistent with autopolyploid or allopolyploid inheritance.
  • Use allele frequency estimations to randomly generate unambiguous genotypes for a dataset with partial heterozygotes. These could then be passed to software such as adegenet that allows for polyploidy but not allele copy number ambiguity.
  • Make a graphical front end for the package. I lack the programming expertise to do this, but am open to collaborating with someone else on the project if there are any volunteers.
  • More population statistics (Weir and Cockerham 1984, etc.).
  • Parentage analysis
  • A method (other than genetic distance distributions as in GenoType) to quantitatively distinguish asexual and sexual progeny. I'm doing a study on apomixis in blackberries, so I have a bunch of notes jotted down on this, although I have abandoned the idea, at least temporarily.

Frequently asked questions

If you have never used R before, particularly if you find command-line software to be intimidating, you may need to spend a day or two just learning R before you even touch polysat. (Look for the manual "An Introduction to R" on the CRAN website.) I have tried to make polysat as user-friendly as possible, but that cannot substitute for a basic understanding of how R works. Trust me, learning R is worth it! R is very powerful and efficient software for data analysis, and if you take the time to learn it for the sake of using polysat, you may find yourself using R in other areas of your research. If you are not sure how something works, try experimenting to see if it does what you think it does.

  • I have made my PCA plot. Can I add a label for each sample? Yes. See ?text.
  • In read.GeneMapper I got the error "line 2 did not have X elements". Each line of the file needs to have the same number of tab stops. You can add these manually in a text editor, or if you open and save the file in a spreadsheet program it should automatically insert the right number of tab stops.
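
Both answers above can be tried out in base R without any real data. The sketch below is only an illustration: the distance matrix, sample names, and file contents are invented, and it assumes the plot was made from principal coordinates computed with cmdscale. The first part labels each point with text(); the second part uses count.fields() to find the lines of a tab-delimited file that have the wrong number of fields.

```r
# Part 1: label each sample on an ordination plot.
# 'd' is a made-up matrix standing in for pairwise distances between samples.
d <- matrix(c(0.0, 0.2, 0.7,
              0.2, 0.0, 0.6,
              0.7, 0.6, 0.0), nrow = 3,
            dimnames = list(c("A1", "A2", "B1"), c("A1", "A2", "B1")))
pca <- cmdscale(d)                       # one row of coordinates per sample
plot(pca[, 1], pca[, 2], xlab = "PC1", ylab = "PC2")
text(pca[, 1], pca[, 2], labels = rownames(pca), pos = 3)  # pos = 3: above the point

# Part 2: find lines with too few (or too many) tab stops.
# The file written here is a made-up example whose third line is one field short.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Sample Name\tMarker\tAllele 1\tAllele 2",
             "A1\tloc1\t102\t106",
             "A2\tloc1\t102"), tmp)
n_fields <- count.fields(tmp, sep = "\t")   # number of fields on each line
bad_lines <- which(n_fields != n_fields[1]) # lines that disagree with the header
bad_lines
```

With your own data, replace tmp with the path to your file; the line numbers in bad_lines are the ones to fix in a text editor or spreadsheet program.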

Known issues

  • In version 0.1, read.SPAGeDi will not work with missing=0, missing=00, etc. This should not be an issue in version 1.0 because of the change in data structure. (In either version, even if the missing data symbol is at the default, -9, the software still knows that zero indicates missing data in a SPAGeDi file.)

Source code

For advanced R users, here is the source code for the functions in the package, so that you may tweak them or create new functions for your own use:

Current version (1.0)

Older versions

Media: polysat_0.1_functions.R.txt

External links

  • You can rate and review polysat on Crantastic. (I am of course also open to questions and comments via email.)
  • CRAN page with source and binary downloads.