Imperial College/Courses/Fall2009/Synthetic Biology (MRes class)/'R' Tutorial/Basic Commands: Difference between revisions

From OpenWetWare
Revision as of 03:28, 6 October 2009

Fall 2009 - Synthetic Biology (MRes class)


Introduction to 'R'




Useful Commands and Functions

Program management

  • q() # quit
  • help(…),?…,?help,find # help manual
  • help.start() # help in html format
  • ; # cmd separator
  • # # comment mark
  • ls(), objects() # see which R objects are in the R workspace
  • rm(x,y) # remove x,y from workspace
  • source("file.R") # runs file.R from working directory
  • sink("file.lis") # sends output to file.lis in working dir
  • sink() # output reverts to console
  • .Last.value # value from previous expression
  • save(), dump(), write(), dput(), dget() # write objects to a file (dget reads back what dput wrote)
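
The workspace commands above can be tried with a minimal sketch like the following. It assumes throwaway files created with tempfile() so that nothing is written to the working directory; the object names x, y and x_copy are purely illustrative.

```r
# Create two objects and list the workspace
x <- c(1, 2, 3)
y <- "hello"
ls()                               # names of the objects currently defined

# Write x to a text file with dput() and read it back with dget()
tmp <- tempfile(fileext = ".R")    # assumption: temporary path, not the working directory
dput(x, file = tmp)
x_copy <- dget(tmp)
identical(x, x_copy)               # checks that the round trip preserved the object

# Redirect printed output to a file, then restore the console
out <- tempfile(fileext = ".lis")
sink(out)
print(x)
sink()
readLines(out)                     # the captured output

rm(y)                              # remove y from the workspace
"y" %in% ls()                      # y is no longer listed
```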

Data management

  • read.table("file.dat",header=TRUE,row.names=1)
  • scan("ex.data", skip = 1) # reading fixed formatted input
  • names(islands) # print the names attribute of the islands data set
  • table(rpois(100,5)) # build a contingency table of the counts at each combination of factor levels
  • make.names(…)
  • matrix(data,nrow = 1,ncol = 1,byrow = FALSE,dimnames) #creates a matrix
  • data() # list all available data sets
  • data(package = "datasets") # list the data sets in the datasets package
  • data(women) # load the data set women
  • file.show # view file
  • attach(women) # attaches database to search path
  • detach("women") # remove database from search path
  • library() # list all available packages
  • library(MASS) # load package 'MASS' (the old 'eda' package is now part of 'stats')
  • print(x) # prints its argument and returns it invisibly (generic)
  • edit(…) # edit a data frame or matrix
  • summary(height) # a generic function used to produce result summaries
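
As a rough illustration of the reading commands above, the sketch below writes a small table to a temporary file and reads it back. The file contents and the column names id, height and weight are made up for the example and stand in for "file.dat".

```r
# Write a small whitespace-separated table to a temporary file
tmp <- tempfile(fileext = ".dat")  # assumption: stands in for "file.dat"
writeLines(c("id height weight",
             "a 58 115",
             "b 59 117",
             "c 60 120"), tmp)

# header = TRUE takes column names from the first line;
# row.names = 1 uses the first column as row names
d <- read.table(tmp, header = TRUE, row.names = 1)
names(d)                           # "height" "weight"
rownames(d)                        # "a" "b" "c"

# Built-in data sets can be loaded by name
data(women)
names(women)
summary(women$height)

table(c("x", "y", "x"))            # counts per distinct value
```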


Data manipulation

  • mode(object), length(object) # returns mode and length of object
  • str() # displays structure of an arbitrary R object
  • c(1:5, 10.5, "next") # generic fnc which combines args into a vector
  • x[1:10] # indexes vector
  • paste(c("a","b"),1:10) # combine one by one into char vector
  • dim(x) or dim(x) <- c(3,4) # retrieve or set the dimension of an object
  • array # creates or tests for arrays
  • as.matrix(x) # attempts to turn x into a matrix
  • is.matrix(x) # tests if x is a (strict) matrix
  • numeric(3) # produces vector of zeroes of length 3
  • list(x=cars[,1], y=cars[,2]) # collects items together (of different types)
  • unlist # flattens list
  • factor # used to encode a vector as a factor; defines a partition into groups
  • cbind(0, rbind(1, 1:3)) # combine args by columns or rows
  • as.****(x) (e.g. as.matrix(x)) # coerce, e.g. a numerical data frame to a numerical matrix
  • is.****(x) (e.g. is.matrix(x)) # test the type of the argument
  • args(t.test) # displays the argument names of a function
  • margin.table(m,1) # give margin totals of array
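
A short sketch of several of the manipulation commands above. The variable names v, p, m and f are illustrative only.

```r
v <- c(1:5, 10.5, "next")          # mixing types coerces everything to character
mode(v)                            # "character"
length(v)                          # 7

p <- paste(c("a", "b"), 1:4)       # the shorter vector is recycled: "a 1" "b 2" "a 3" "b 4"

m <- 1:12
dim(m) <- c(3, 4)                  # reshape the vector into a 3 x 4 matrix (filled column-wise)
is.matrix(m)                       # TRUE

f <- factor(c("lo", "hi", "lo"))   # encode a character vector as a factor
levels(f)                          # the distinct groups

cbind(0, rbind(1, 1:3))            # 2 x 4 matrix; single values are recycled to fit
margin.table(m, 1)                 # row totals of m
```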

Program control

  • function( arglist ) expr
  • return(value)
  • if(cond) cons.expr else alt.expr
  • for(var in seq) expr
  • while(cond) expr
  • repeat expr
  • break
  • next
  • tapply(1:n, fac, sum) # apply function to each comb of factor levels
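
The control constructs above can be combined as in the following sketch. The function name count_even and the factor fac are hypothetical names chosen for the example.

```r
# A function containing a loop and a conditional
count_even <- function(v) {
  n <- 0
  for (x in v) {
    if (x %% 2 == 0) n <- n + 1 else next   # skip odd values
  }
  return(n)
}
count_even(1:10)                   # 5

# repeat runs until break is reached
i <- 0
repeat {
  i <- i + 1
  if (i >= 3) break
}

# tapply applies a function to the values within each factor level
fac <- factor(c("a", "b", "a", "b", "a", "b"))
tapply(1:6, fac, sum)              # a = 1+3+5 = 9, b = 2+4+6 = 12
```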

Operators

  • + - * / ^ (element by element operations with recycling)
  • %% (mod)
  • %/% (integer division)
  • crossprod
  • %*% (matrix prod, inner product)
  • outer %o% (outer product)
  • a&b (and), a|b (a or b), !a (not a)
  • precedence (highest to lowest): $ [] ^ unary- : (%% %/% %*%) (* /) (+ -) (< > <= >= == !=) ! (& &&) (| ||) ~ (-> <-) ?
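
A few one-liners showing how the operators above behave. The matrix A is an arbitrary example.

```r
c(1, 2, 3, 4) + c(10, 20)          # recycling: 11 22 13 24
7 %% 3                             # remainder: 1
7 %/% 3                            # integer quotient: 2

A <- matrix(1:4, nrow = 2)         # columns are (1, 2) and (3, 4)
A * A                              # element-by-element product
A %*% A                            # matrix product
crossprod(A)                       # same as t(A) %*% A
1:3 %o% 1:2                        # outer product, a 3 x 2 matrix

-2^2                               # ^ binds tighter than unary minus: -4
1:3 * 2                            # : binds tighter than *: 2 4 6
```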

Mathematical functions

  • solve backsolve forwardsolve t(transpose)
  • uniroot polyroot optimize nlm deriv
  • log log10 sqrt exp sin cos tan acos asin atan cosh sinh tanh gamma lgamma choose lchoose bessel
  • abs sign sum prod diff cumsum cumprod min max pmax pmin range length
  • diag scale nrow ncol length append drop
  • det eigen svd qr chol chol2inv
  • eigen(cbind(c(1,-1),c(-1,1))) # computes eigenvalues and eigenvectors
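
As an illustration of a few of the functions above, the sketch below solves a small linear system, computes an eigen decomposition and finds a root numerically. The matrix A, the vector b and the function x^2 - 2 are arbitrary choices for the example.

```r
A <- cbind(c(2, 1), c(1, 3))
b <- c(3, 5)
x <- solve(A, b)                   # solves A x = b
A %*% x                            # reproduces b

e <- eigen(cbind(c(1, -1), c(-1, 1)))
e$values                           # eigenvalues 2 and 0 for this symmetric matrix
e$vectors                          # corresponding eigenvectors, one per column

# Root of f(x) = x^2 - 2 on [0, 2], i.e. an approximation to sqrt(2)
r <- uniroot(function(x) x^2 - 2, interval = c(0, 2))
r$root
```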


Statistical functions

  • mean var cov cor sd mad median range IQR fivenum quantile mahalanobis
  • sort rev order rank sort.list
  • ceiling floor round trunc signif zapsmall jitter all duplicated unique any lower.tri upper.tri
  • approx approxfun spline splinefun curve
  • mean(x, trim = .10) # (trimmed) mean
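
The effect of trimming, and the difference between order and rank, can be seen in a small sketch. The data vector x is made up and contains one deliberate outlier.

```r
x <- c(2, 4, 4, 5, 7, 9, 100)      # illustrative data with one outlier
mean(x)                            # pulled upward by the outlier
mean(x, trim = 0.2)                # drops one value from each end here, then averages: 5.8
median(x)                          # 5
fivenum(x)                         # min, lower hinge, median, upper hinge, max

order(c(30, 10, 20))               # 2 3 1: indices that would sort the vector
rank(c(30, 10, 20))                # 3 1 2: rank of each element
unique(c(1, 2, 2, 3))              # 1 2 3
```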

Graphics

  • par(mfrow=c(2,3)) # create 2x3 array of figs filled row-wise
  • plot pairs coplot boxplot boxplot.stats hist stem density pie barplot dotchart qqplot qqnorm qqline ppoints interaction.plot lowess contour persp image stars symbols
  • par axis box lines abline segments points text mtext title labels legend plotmath arrows polygon Hershey plot.window xy.coords rug
  • colors hsv rgb rainbow gray palette
  • mfrow mfcol (multifigure parameters of par)
  • graphics devices: postscript pictex windows png jpeg bmp xfig bitmap
  • locator() # read position of graphics cursor
  • identify() # identifies near point in graphic
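
A minimal sketch of a multi-panel figure using par(mfrow=...) and some of the plotting functions above. It assumes the built-in women data set and writes to a temporary PDF file; the pdf device is an assumption of this example and is not in the device list above.

```r
out <- tempfile(fileext = ".pdf")  # assumption: temporary output file
pdf(out, width = 9, height = 6)    # open a graphics device
par(mfrow = c(2, 3))               # 2 x 3 grid of figures, filled row-wise

x <- women$height
y <- women$weight
plot(x, y, main = "plot")
hist(y, main = "hist")
boxplot(y, main = "boxplot")
qqnorm(y); qqline(y)               # normal Q-Q plot with reference line
barplot(table(cut(y, 3)), main = "barplot")
plot(density(y), main = "density")

dev.off()                          # close the device and write the file
file.exists(out)
```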

Statistical distributions & sampling

  • sample(n) # random permutation
  • sample(x,replace=TRUE) # bootstrap sample
  • set.seed RNGkind .Random.seed
  • Prefixes: d (density) p (distribution function) q (quantile function) r (random deviates)
  • chisq t F norm binom pois exp beta gamma lnorm unif geom cauchy logis hyper nbinom weibull wilcox
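
The prefix convention above combines with a distribution name, for example norm, to give dnorm, pnorm, qnorm and rnorm. A brief sketch, with an arbitrary seed and an illustrative vector x:

```r
set.seed(1)                        # make the random draws reproducible
rnorm(3)                           # r: three random deviates from N(0, 1)
dnorm(0)                           # d: density at 0, equal to 1/sqrt(2*pi)
pnorm(1.96)                        # p: P(X <= 1.96), about 0.975
qnorm(0.975)                       # q: the inverse of pnorm, about 1.96

perm <- sample(5)                  # random permutation of 1..5
x <- c(3, 1, 4, 1, 5)
boot <- sample(x, replace = TRUE)  # bootstrap resample, same length as x
```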

Statistical tests

  • t.test prop.test binom.test wilcox.test kruskal.test ansari.test bartlett.test cor.test fisher.test fligner.test friedman.test ks.test mantelhaen.test mcnemar.test mood.test pairwise.prop.test pairwise.t.test pairwise.wilcox.test print.pairwise.htest prop.trend.test quade.test shapiro.test var.test
  • chisq.gof ks.gof (S-PLUS names; in R use chisq.test and ks.test)
  • contrast contrasts p.adjust pairwise.t.test pairwise.table ptukey qtukey
  • power.prop.test power.t.test print.power.htest
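
Most of the tests above return an object of class "htest" whose components can be extracted by name. The sketch below uses simulated data; the samples a and b and the seed are assumptions made only for illustration.

```r
set.seed(42)
a <- rnorm(20, mean = 0)           # simulated sample 1
b <- rnorm(20, mean = 1)           # simulated sample 2

tt <- t.test(a, b)                 # Welch two-sample t test by default
tt$p.value
tt$conf.int                        # confidence interval for the difference in means

wilcox.test(a, b)$p.value          # rank-based alternative
shapiro.test(a)$p.value            # test of normality

# Adjust a set of p-values for multiple comparisons
p.adjust(c(0.01, 0.02, 0.03), method = "bonferroni")   # 0.03 0.06 0.09
```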

Statistical procedures

  • anova aov lm glm loglin manova fitted add1 drop1 resid deviance predict coef effect dummy.coef fitted.values alias step factor interaction model.tables proj plot summary
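
A typical use of the modelling functions above is to fit a model and then query it with the extractor functions. A minimal sketch, assuming the built-in women data set and a simple linear regression of weight on height:

```r
fit <- lm(weight ~ height, data = women)   # simple linear regression
coef(fit)                          # intercept and slope
head(resid(fit))                   # first few residuals
head(fitted(fit))                  # first few fitted values
predict(fit, newdata = data.frame(height = 66))   # prediction for a new height
anova(fit)                         # analysis of variance table
summary(fit)$r.squared             # proportion of variance explained
```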


Data entry and manipulation (x can be any of several types; y and z are vectors)

Command                             Meaning
x <- c(1, 2, 3, 4)                  Create a vector of numbers
x                                   Print the contents of x
y[2:5]                              Return the 2nd to 5th elements of vector y
y[-3]                               Return all elements of y except the 3rd
y[y<10]                             Sub-vector of all entries in y less than 10
z[y<10]                             Sub-vector of all entries in z for which the corresponding entries in y are less than 10 (y and z must be the same length)
x <- list(y=y, z=z); x$y; x$z       Construct a list holding two vectors; return vector y; return vector z
x <- data.frame(y, z); x$y; x$z     Construct a data frame holding two vectors; return vector y; return vector z
x <- factor(y)                      Convert numeric vector y into a factor
is.factor(y)                        Return TRUE if y is a factor (numeric or symbolic levels)
is.numeric(y)                       Return TRUE if y contains numeric data
is.na(y)                            Return TRUE for each entry of y that is missing (NA)
dimnames(x)                         List the row and column names of an array or data frame
levels(x) <- c("a", "b",…)          Assign names to the levels of factor x
x <- read.table(file="inp.txt")     Read a data set from an ASCII text file. Add header=TRUE if the file contains descriptive headers
load("filename")                    Load R data from filename
save(x, file="filename")            Save R object x into filename
save.image("filename")              Save all current R objects into filename
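
The indexing and storage commands in the table above can be tried with a small sketch like this one. The vectors y, z and g are made-up data, and a temporary file stands in for "filename".

```r
y <- c(12, 3, 8, NA, 15, 7)
z <- c("a", "b", "c", "d", "e", "f")

y[2:5]                             # 2nd to 5th elements
y[-3]                              # everything except the 3rd element
is.na(y)                           # TRUE only at position 4
keep <- !is.na(y) & y < 10         # logical index that also excludes the missing value
z[keep]                            # "b" "c" "f"

x <- data.frame(y, z)              # columns are named after the vectors
x$z

g <- factor(c(1, 2, 1))
levels(g) <- c("ctrl", "treat")    # rename the factor levels

f <- tempfile(fileext = ".RData")  # assumption: temporary file instead of "filename"
save(x, file = f)                  # the file name must be passed as file=
rm(x)
load(f)                            # x is restored
```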


Descriptive statistics (x can be a vector or data frame; y and z are vectors)

Command                             Meaning
mean(x)                             Mean of vector x (for each column of a data frame, use sapply(x, mean))
median(x)                           Median of vector x (for each column of a data frame, use sapply(x, median))
sd(x)                               Standard deviation of vector x (for each column of a data frame, use sapply(x, sd))
var(x)                              Variance of vector x (for a data frame, the covariance matrix of its columns)
summary(x)                          Summary of vector x (or of all vectors in data frame x)
boxplot(x)                          Create boxplot of vector x (or of all vectors in data frame x)
boxplot(x~y)                        Create multiple boxplots of data in x, based on categories in y
stripchart(x)                       Create stripchart of vector x (or of all vectors in data frame x)
stripchart(x~y)                     Create multiple stripcharts of data in x, based on categories in y
hist(y)                             Create histogram of vector y (will not work on a data frame)
qqnorm(y)                           Create a normal quantile-quantile plot of y; used to check whether the data in y are normally distributed
plot(z~y)                           Make an x-y plot of vector z against vector y
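
A short sketch of the descriptive commands above, assuming the built-in women data set. The grouping factor grp is invented for the example, and plots are sent to a temporary PDF file.

```r
df <- women                        # built-in data: height and weight of 15 women
mean(df$height)                    # 65
sd(df$height)
sapply(df, mean)                   # mean of every column
summary(df)

# Hypothetical grouping: first 7 rows "short", remaining 8 rows "tall"
grp <- factor(rep(c("short", "tall"), times = c(7, 8)))

out <- tempfile(fileext = ".pdf")  # assumption: temporary output file
pdf(out)
boxplot(weight ~ grp, data = df)   # one box per category
stripchart(weight ~ grp, data = df)
hist(df$weight)
qqnorm(df$weight)                  # points near a straight line suggest normality
plot(weight ~ height, data = df)   # x-y plot of weight against height
dev.off()
```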