A FIRST TUTORIAL ON R 
text version of a document by, very likely, Chris Raphael
with unauthorized changes by Don Byrd, early Feb. 2008


To get R:

1. Download R (it's free) from the website http://cran.r-project.org . There are
versions for Macintosh, Linux, and Windows.
2. From the same website, click on "Manuals" (at the left, near the bottom) to
find documentation.

R can be used either to run pre-written programs or purely interactively, as a calculator.
Try typing each of the following expressions -- except for the initial ">", R's command
prompt -- to R (followed by the return key). Note: anything on a line following '#' is a
comment and is ignored by R.
> 5+3
> (12*7)+4				# No. of keys on a piano (7 octaves, plus 3 keys at the bottom and 1 at the top)
> log(81/80)
> exp(20)
> help("exp")			# Just what is "exp", anyway?
> exp(exp(exp(20))) 	# R is only human! :-)

R has almost any mathematical function you can think of: sqrt(), sin() ... mostly with
easily guessable names. Expressions using the comparison operators   ==  !=  <  >   give
_Boolean_ values, namely TRUE or FALSE (which R lets you abbreviate as T and F):
> 4>3					# this evaluates to TRUE
> sqrt(4)==2			# so does this
> sqrt(5)==2 			# this evaluates to FALSE
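
One caution about == that the examples above don't show: R stores most numbers in binary floating point, so two results that "should" be equal can differ in the last digit. Here is a minimal sketch of the problem and of one common way around it, the built-in all.equal() function, which compares with a small tolerance. The variable name almostTwo is made up for this illustration.

```r
# Exact comparison can surprise you with non-integer arithmetic.
almostTwo <- sqrt(2) * sqrt(2)      # hypothetical name; mathematically this is 2
print(almostTwo == 2)               # FALSE: the last binary digit differs
print(0.1 + 0.2 == 0.3)             # FALSE for the same reason

# all.equal() compares with a small tolerance; wrap it in isTRUE()
# because it returns a description (not FALSE) when the values differ.
print(isTRUE(all.equal(almostTwo, 2)))      # TRUE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))    # TRUE
```

Comparisons of whole numbers, like sqrt(4)==2 above, are generally safe.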

It is possible to have variables even when you use R as a calculator. Most strings
beginning with an alphabetic character will be treated as variables. Try typing some of
the following lines in succession; no need to include the comments, of course. Or, you
can copy and paste them -- not a bad habit to get into.
> x <- 3 						# set x to 3
> y <- x*x+x
> y								# print the value of y
> freq <- 440 * 2^((m-69)/12)	# What does this do? Not much unless you set m first
> m <- 60
> freq <- 440 * 2^((m-69)/12)	# It works better this time
> freq
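
The formula above converts a MIDI note number m to a frequency in Hz, with note 69 (the A above middle C) tuned to 440 Hz. If you find yourself retyping it, you can wrap it in a function. This is only a sketch: the name midiToFreq is made up for this example, and functions go a little beyond what this tutorial covers.

```r
# Hypothetical helper: frequency in Hz of MIDI note number m,
# assuming equal temperament with note 69 (A) = 440 Hz.
midiToFreq <- function(m) {
  440 * 2^((m - 69) / 12)
}

print(midiToFreq(69))   # 440
print(midiToFreq(57))   # 220, the A one octave lower
print(midiToFreq(60))   # middle C, about 261.63
```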


Vectors

One of the nicest aspects of R is the way it handles _vectors_ (sometimes called
one-dimensional arrays). However, it can be tricky, e.g., when you use vectors of different
lengths together -- intentionally or otherwise -- because R quietly reuses ("recycles") the
elements of the shorter one. Here are several ways to create and use vectors:
> xV <- seq(1,50)				# xV is now the vector (1, 2, ..., 50)
> xV <- 1:50					# same thing
> yV <- seq(-pi,pi,length=50)	# yV consists of 50 evenly spaced values from -pi to pi
> sqV <- c(1,4,9,16)			# c means "combine"; sqV is now the vector (1,4,9,16)
> sumV <- xV+yV					# vectors of same length can be added, multiplied, etc.
> combV <- c(sqV, xV)			# anything, even vectors of any length, can be combined
> xVMag <- 4*xV					# a number times a vector multiplies every element: (4, 8, ..., 200)
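
What happens when you combine vectors of different lengths in arithmetic? R reuses the elements of the shorter vector as often as needed, which is convenient when you mean it and confusing when you don't. A small sketch follows; the names longV, shortV and msg are made up for this illustration.

```r
longV  <- c(1, 2, 3, 4)
shortV <- c(10, 20)

# shortV is reused as (10, 20, 10, 20); R gives no warning because
# 4 is a multiple of 2.
print(longV + shortV)     # 11 22 13 24

# A single number is just a vector of length 1, reused for every element.
print(4 * longV)          # 4 8 12 16

# If the longer length is not a multiple of the shorter one, R still
# computes a result but issues a warning. Here we catch the warning
# and print its text instead.
msg <- tryCatch(longV + c(1, 2, 3),
                warning = function(w) conditionMessage(w))
print(msg)
```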


Random Number Generation

Random numbers are useful for many things, especially in probability theory and
statistics (R's original raison d'etre), as well as many areas of music informatics.
There are many different _distributions_ of random numbers; the most important for us
are the _uniform_ (all possible values are equally likely) and the _normal_ (which
produces the bell-shaped curve you've probably encountered before). R has lots of
built-in functions for generating and working with random numbers. For instance:
> xV <- runif(100)	# creates a vector of 100 (uniformly distributed) random numbers between 0 and 1.
> v <- .3			# pick any number for v
> punif(v)			# the probability that a Unif(0,1) random number is less than v

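There are similar functions for a variety of other distributions, including the
normal(0,1) (rnorm, pnorm, qnorm), Exponential, Binomial, Poisson, Cauchy (rcauchy,
pcauchy, qcauchy), and others. The first letter tells you what the function does: r
generates random numbers, p gives the probability of a value less than its argument
(as punif does above), and q goes the other way, from a probability to a value.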
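
Here is a short sketch of the normal-distribution functions named above. The call to set.seed() is an addition for this example: it makes the "random" numbers come out the same on every run, which helps when you are debugging. The seed value 1 and the variable name zV are arbitrary choices.

```r
set.seed(1)             # arbitrary seed, so the run is repeatable
zV <- rnorm(1000)       # 1000 normal(0,1) random numbers
print(mean(zV))         # should be close to 0
print(sd(zV))           # should be close to 1

print(pnorm(0))         # 0.5: half the probability lies below 0
print(pnorm(1.96))      # about 0.975
print(qnorm(0.975))     # about 1.96, the inverse of the line above
print(punif(0.3))       # 0.3 for the Unif(0,1) distribution
```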


Subsets

> xV <- runif(100)		# creates a vector of 100 Unif(0,1) random numbers
> xV[1]					# the first element of xV
> xV[c(1,3,5)]			# a vector containing 1st, 3rd and 5th elements of xV
> yV <- xV>.5			# a 100-long vector of Boolean values; yV[i] is T iff xV[i] > .5
> bigxV <- xV[xV>.8]	# the "xV's" that are greater than .8
> bigxV
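
A few more subsetting idioms that build on the lines above, offered as a sketch. The functions which(), sum() and mean() are standard R; the seed and the variable names bigPosV and restV are made up for this illustration.

```r
set.seed(42)                  # arbitrary seed, so the run is repeatable
xV <- runif(100)

bigPosV <- which(xV > .8)     # the _positions_ of the elements greater than .8
print(bigPosV)
print(xV[bigPosV])            # same values as xV[xV > .8]

restV <- xV[-1]               # a negative index drops elements: all but the first
print(length(restV))          # 99

# In arithmetic a Boolean vector counts TRUE as 1 and FALSE as 0, so:
print(sum(xV > .5))           # how many elements are greater than .5
print(mean(xV > .5))          # what proportion are greater than .5
```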


Simple Graphs. Try the following. NB: if you do more than one "plot" at a time, you'll
see only the last one!
> xV <- seq(0,1,length=100)
> yV <- xV^2			# yV = xV squared
> plot(xV,yV)			# plot with (xV[1],yV[1]) ... (xV[100],yV[100])
> plot(xV,yV,"l")		# plot has lots of options: say help("plot") to find out
> plot(yV,xV,"l")
> plot(yV,xV,"s")
> plot(yV)				# same as plot(1:length(yV),yV)
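
Since each "plot" replaces the previous one, here is a sketch of two standard ways to see more than one curve: lines() adds a curve to the plot that is already there, and par(mfrow=...) splits the window into a grid of plots. The titles, labels and the name oldPar are arbitrary choices for this example.

```r
xV <- seq(0, 1, length.out = 100)

# Two curves on one set of axes: plot() draws the first, lines() adds to it.
plot(xV, xV^2, type = "l", xlab = "x", ylab = "y", main = "Powers of x")
lines(xV, xV^3, lty = 2)                        # dashed line for x cubed
legend("topleft", legend = c("x^2", "x^3"), lty = c(1, 2))

# Two plots side by side: 1 row, 2 columns.
oldPar <- par(mfrow = c(1, 2))                  # remember the old setting
plot(xV, xV^2, type = "l", main = "x squared")
plot(xV, sqrt(xV), type = "l", main = "square root of x")
par(oldPar)                                     # back to one plot per window
```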


Source Files. You will need to write simple programs in R, and getting programs working
almost always requires some trial-and-error iteration. It's not practical to do that
just by typing statements to R; you have to write your programs in a text editor and
save them in files. On a Mac, the easiest solution is probably to use R's built-in text
editor. In the File menu, use New Document (or Cmd-N). As an added bonus, it knows the
syntax of R, which helps in several ways (it automatically inserts matching braces and
parentheses, highlights one when its "partner" is selected, etc.). However, you can also
use a program like BBEdit or the free TextWrangler. On Windows, you can use
Notepad, Notepad2, or TextPad. (CAUTION! Do _not_ use a word processor like MS Word or
WordPad, or OS X's TextEdit, which -- despite its name -- is really a word processor.
Word processors have several problems for writing programs; the worst is that unless
you're very careful, you'll have a file with invisible formatting information that will
cause syntax errors in R!) Suppose you create a file containing these lines:
nDays <- 90
xV <- runif(nDays,-.5,.4)
yV <- cumsum(xV)			# yV[1] = xV[1], yV[2] = xV[1]+xV[2], etc.
priceV <- 100*exp(yV)
plot(priceV, type="l", xlab="day", ylab="sale price ($1000)")
title("Average Home Value")
print("Price history ($1000): ")
print(priceV)

Save it with the name "PriceCrash.r". To run it, type this in the R Console window:
> source("PriceCrash.r")

Important: that assumes that PriceCrash.r is in R's current "working directory"! If it's
not, you must give the correct path to the file, for example:
> source("/Users/donbyrd/Documents/WebSiteDon/Teach/RTools+Docs/PriceCrash.r")

R understands the common abbreviation "~" for the user's home directory, so this works too:

> source("~/Documents/WebSiteDon/Teach/RTools+Docs/PriceCrash.r")

This technique allows you to write a program in the usual incremental way. If you want
to get a hard copy of the printout and the plot (for example, to submit as your
homework), do the following:
> postscript("myplot.ps")	# write plot to the PostScript file "myplot.ps"
> sink("myout.txt")			# write text output to "myout.txt"
> source("PriceCrash.r")	# run the program you created
> dev.off()					# close the plot file; plots go back to the screen. Don't forget this!
> sink()					# send text output back to the screen. Ditto.
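
If you would rather have a PDF than a PostScript file, the same recipe works with R's pdf() device. The sketch below makes a few assumptions of its own: it writes into R's temporary directory (tempdir()) so it won't clutter your folders, it uses pdf() in place of postscript(), and it runs a few statements from the price program directly where you would normally call source("PriceCrash.r"). The names outDir, plotFile and textFile are made up.

```r
outDir   <- tempdir()                          # a scratch directory R creates for you
plotFile <- file.path(outDir, "myplot.pdf")
textFile <- file.path(outDir, "myout.txt")

pdf(plotFile)                                  # plots now go to the PDF file
sink(textFile)                                 # text output now goes to the text file

# Stand-in for source("PriceCrash.r"):
priceV <- 100 * exp(cumsum(runif(90, -.5, .4)))
plot(priceV, type = "l", xlab = "day", ylab = "sale price ($1000)")
print(priceV)

sink()                                         # text output back to the screen
dev.off()                                      # close the PDF file

print(file.exists(c(plotFile, textFile)))      # both files should now exist
```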


A Fun Example. Suppose two decks of cards are shuffled; then the cards are lined up side
by side, and you count the number of places where the two decks have the same card. What
is the probability that there are no matches? This is a hard calculation to do, but you could
estimate the probability by running the experiment many times and observing the proportion of
times no match occurs. Here's how. The 52 cards are represented simply by integers from 1 to 52. Each
time through the loop, the program "shuffles" both decks, then counts the matches; if
there aren't any, it adds one to the "no match" count.

nTrials <- 1000				# number of trials: try nos. like 10, 100, 1000, 10,000
nZeros <- 0					# to count the number of times the "no match" event happens
for (i in 1:nTrials) {
	xV <- 1:52
	deck1V <- sample(xV, 52, replace=FALSE)	# a random permutation of the "cards"
	deck2V <- sample(xV, 52, replace=FALSE)	# another random permutation
	nMatches <- sum(deck1V==deck2V)		# number of matching cards this time
	if (nMatches==0) nZeros <- nZeros+1
	#cat("nMatches=", nMatches, "nZeros=", nZeros, "\n")
}
cat("estimated probability of no matches=", nZeros/nTrials, "\n")	# the result


This is also one of my R Example Programs, namely CardMatchingFunExample.r. One tricky
thing is the statement "nMatches <- sum(deck1V==deck2V)"; what does that actually do?
To find out, after running the program, you might tell R the following:

> deck1V
> deck2V
> deck1V==deck2V
> sum(deck1V==deck2V)

This will give you an idea of what happened the last time through the loop. You
can also un-comment the '#cat("nMatches="...' line to see what's happening _every_
time through, but you might want to reduce nTrials to a small number first!
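
To see the sum(deck1V==deck2V) trick on something small enough to check by eye, and to check the simulation against theory, here is a sketch. The tiny "decks" deckA and deckB are made up. The second half is a more compact variant of the program above, not a replacement for it: it uses replicate() in place of the for loop and sample(52), which is shorthand for a random permutation of 1:52. For a deck of 52 cards the exact no-match probability is extremely close to 1/e, about 0.368, so the estimate should land near that.

```r
# Part 1: what sum(a == b) does.
deckA <- c(3, 1, 4, 2)
deckB <- c(3, 4, 1, 2)
print(deckA == deckB)         # TRUE FALSE FALSE TRUE
print(sum(deckA == deckB))    # 2, because TRUE counts as 1 and FALSE as 0

# Part 2: a compact variant of the simulation, compared with 1/e.
set.seed(2008)                # arbitrary seed, so the run is repeatable
nTrials <- 10000
noMatchV <- replicate(nTrials, {
  deck1V <- sample(52)        # a random permutation of 1:52
  deck2V <- sample(52)        # another one
  sum(deck1V == deck2V) == 0  # TRUE if this trial had no matches
})
estimate <- mean(noMatchV)    # proportion of trials with no matches
cat("estimated probability of no matches=", estimate, "\n")
cat("1/e=", exp(-1), "\n")
```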


Working Directories

Anytime you refer to a file -- either with the "source" command, to run a program,
or to read data from or write data to a file -- R needs to know the directory or
folder to use for the file. If you don't give a complete path, it uses the current
_working directory_. Here's how to set the working directory or find out what it's
set to.

> setwd("~/Documents/WebSiteDon/Teach/RTools+Docs")
> getwd()

The source command has a nifty "chdir" option that temporarily changes to the
directory containing the file being run:

> source("~/Documents/WebSiteDon/Teach/RTools+Docs/FancyProgram.r", chdir=TRUE)
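
One handy detail, shown in the sketch below: setwd() hands back the directory you were in before, so you can save it and return later. To keep the example harmless it works in R's temporary directory (tempdir()) and creates a one-line program there; the file name Hello.r and the variable name oldWd are made up.

```r
oldWd <- setwd(tempdir())       # change directory, remembering the old one
print(getwd())                  # now the temporary directory

# Create a tiny program in the working directory, then run it by name alone.
writeLines('cat("hello from a sourced file\\n")', "Hello.r")
print(file.exists("Hello.r"))   # TRUE: R looks in the working directory
source("Hello.r")

setwd(oldWd)                    # go back to where we started
print(getwd())
```

If source() ever complains that it cannot open a file, file.exists() with the same name is a quick way to check whether R is looking where you think it is.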


Demos, Help, & Quitting

> demo()					# get info about built-in demos
> help("rnorm")				# gives information about the function rnorm. Of course this works
							# for other functions too, and even for operators, e.g. help("+").
> q() 						# quit R. If it asks if you want to save the workspace,
							# just say no.