R Statistics: Difference between revisions

From OpenWetWare
Line 2: Line 2:


==What is R?==
R is a system for statistical analyses and graphics created by Ross Ihaka
and Robert Gentleman[1]. R is both software and a language, considered a
dialect of the S language created by AT&T Bell Laboratories. S is available
as the software S-PLUS commercialized by Insightful[2]. There are important
differences in the designs of R and of S: those who want to know more on this
point can read the paper by Ihaka & Gentleman (1996) or the R-FAQ[3], a copy
of which is also distributed with R.
Line 13: Line 12:
its development and distribution are carried out by several statisticians known
as the R Development Core Team.
R is available in several forms: the sources (written mainly in C and
some routines in Fortran), essentially for Unix and Linux machines, or some
pre-compiled binaries for Windows, Linux, and Macintosh. The files needed
to install R, either from the sources or from the pre-compiled binaries, are
distributed from the internet site of the Comprehensive R Archive Network
Line 21: Line 21:
the distributions of Linux (Debian, ...), the binaries are generally
available for the most recent versions; look at the CRAN site if necessary.
R has many functions for statistical analyses and graphics; the latter are
visualized immediately in their own window and can be saved in various formats
(jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may
depend on the operating system). The results from a statistical analysis are
displayed on the screen; some intermediate results (P-values, regression
coefficients, residuals, ...) can be saved, written to a file, or used in
subsequent analyses.
The R language allows the user, for instance, to program loops to successively
analyse several data sets. It is also possible to combine in a single
program different statistical functions to perform more complex analyses.
R users may benefit from a large number of programs written for S and available
on the internet[6]; most of these programs can be used directly with R.

[1] Ihaka R. & Gentleman R. 1996. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 5: 299-314.
[2] See http://www.insightful.com/products/splus/default.asp for more information
[3] http://cran.r-project.org/doc/FAQ/R-FAQ.html
[4] For more information: http://www.gnu.org/
[5] http://cran.r-project.org/

At first, R could seem too complex for a non-specialist. This may not
actually be true. In fact, a prominent feature of R is its flexibility. Whereas
classical software immediately displays the results of an analysis, R stores
these results in an "object", so that an analysis can be done with no result
displayed. The user may be surprised by this, but such a feature is very useful.
Indeed, the user can extract only the part of the results which is of interest.
For example, if one runs a series of 20 regressions and wants to compare the
different regression coefficients, R can display only the estimated coefficients:
thus the results may take a single line, whereas classical software could well
open 20 results windows. We will see other examples illustrating the flexibility
of a system such as R compared to traditional software.
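
The series-of-regressions case described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the 20 data sets are simulated here, and the names `datasets`, `fits`, and `coefs` are hypothetical, not taken from the original text.

```r
# Simulate 20 small data sets (hypothetical data, for illustration only).
set.seed(1)
datasets <- lapply(1:20, function(i) {
  x <- rnorm(30)
  data.frame(x = x, y = 2 + 0.5 * x + rnorm(30))
})

# Fit one regression per data set; nothing is printed at this stage,
# each result is simply stored as an object in the list 'fits'.
fits <- lapply(datasets, function(d) lm(y ~ x, data = d))

# Extract only the estimated coefficients: one column per regression.
coefs <- sapply(fits, coef)

# Display only the 20 estimated slopes, rounded for compactness.
round(coefs["x", ], 2)
```

Because every fit is kept in `fits`, other parts of the results (residuals, standard errors, and so on) can still be extracted later without refitting.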


==Download and Install R==

Revision as of 06:57, 20 June 2006

R for Statistical Computing

What is R?

R is a system for statistical analyses and graphics created by Ross Ihaka and Robert Gentleman[1]. R is both software and a language, considered a dialect of the S language created by AT&T Bell Laboratories. S is available as the software S-PLUS commercialized by Insightful[2]. There are important differences in the designs of R and of S: those who want to know more on this point can read the paper by Ihaka & Gentleman (1996) or the R-FAQ[3], a copy of which is also distributed with R. R is freely distributed under the terms of the GNU General Public Licence[4]; its development and distribution are carried out by several statisticians known as the R Development Core Team.

R is available in several forms: the sources (written mainly in C and some routines in Fortran), essentially for Unix and Linux machines, or some pre-compiled binaries for Windows, Linux, and Macintosh. The files needed to install R, either from the sources or from the pre-compiled binaries, are distributed from the internet site of the Comprehensive R Archive Network (CRAN)[5] where the instructions for the installation are also available. Regarding the distributions of Linux (Debian, ...), the binaries are generally available for the most recent versions; look at the CRAN site if necessary.

R has many functions for statistical analyses and graphics; the latter are visualized immediately in their own window and can be saved in various formats (jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may depend on the operating system). The results from a statistical analysis are displayed on the screen; some intermediate results (P-values, regression coefficients, residuals, ...) can be saved, written to a file, or used in subsequent analyses.
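
As a rough illustration of saving a graphic and some intermediate results, the sketch below uses the built-in `cars` data set. The file names and the choice of the PDF format are assumptions made for this example only; files are written to R's temporary directory.

```r
# Example data: the built-in 'cars' data set (speed and stopping distance).
fit <- lm(dist ~ speed, data = cars)

# Save a graphic to a PDF file instead of the on-screen window.
plot_file <- file.path(tempdir(), "cars.pdf")   # assumed file name
pdf(plot_file)
plot(dist ~ speed, data = cars)
abline(fit)          # add the fitted regression line
dev.off()            # close the device so the file is written

# Keep intermediate results as objects for use in subsequent analyses.
coefs  <- coef(fit)
pvals  <- summary(fit)$coefficients[, "Pr(>|t|)"]
resids <- residuals(fit)

# Write the coefficients and their P-values to a file.
out_file <- file.path(tempdir(), "cars_coefficients.csv")   # assumed file name
write.csv(data.frame(estimate = coefs, p.value = pvals), out_file)
```

Other devices such as `png()` or `postscript()` follow the same open, plot, `dev.off()` pattern, subject to what the operating system supports.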

The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses. R users may benefit from a large number of programs written for S and available on the internet[6]; most of these programs can be used directly with R.

At first, R could seem too complex for a non-specialist. This may not actually be true. In fact, a prominent feature of R is its flexibility. Whereas classical software immediately displays the results of an analysis, R stores these results in an "object", so that an analysis can be done with no result displayed. The user may be surprised by this, but such a feature is very useful. Indeed, the user can extract only the part of the results which is of interest.
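
A minimal sketch of this behaviour, assuming the built-in `PlantGrowth` data set as example data (it is not mentioned in the original text): the model is fitted silently, and parts of the stored object are displayed only on request.

```r
# Fitting the model prints nothing; the result is stored in the object 'fit'.
fit <- lm(weight ~ group, data = PlantGrowth)

# Inspect the object and extract only the parts of interest.
class(fit)                # the class of the stored object
names(fit)                # components kept inside the object
coef(fit)                 # only the estimated coefficients
summary(fit)$r.squared    # only the R-squared value
```

Typing `fit` or `summary(fit)` at the prompt would display the fuller output when it is wanted.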

Download and Install R

Links to tutorials

Examples for commonly used statistics

Bioconductor & Microarray data Analysis