
Updating versions of R can be a pain in terms of getting all those packages you had on the old version onto the new version. Some people suggest copying the library folder from one installation to another and running update.packages(). Other people have other methods.

Here’s my simple method:

On the old installation make a vector of names of packages and write it to a file:

packs <- row.names(installed.packages())
write.table(packs, "packs.txt") # could also use dump or save or....

On the new version of R just read in the file and run install.packages:

packs <- read.table("packs.txt")
install.packages(as.character(packs$x)) # *



Select the mirror and wait a few minutes.

Simple!

* Because this method leaves packs as a table with one column (interpreted as a factor in R versions before 4.0.0), we have to force R to treat it as a character vector.
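A variation on the same idea, as a minimal sketch rather than a replacement for the method above: writing the names with writeLines() and reading them back with readLines() gives a character vector directly, so no conversion is needed. The helper name missing_packages is my own invention for illustration; it just skips anything the new installation already has.

```r
# Hypothetical helper (not part of any package): which saved packages are
# not yet installed on this machine?
missing_packages <- function(saved, installed = rownames(installed.packages())) {
  setdiff(saved, installed)
}

# On the old installation: one package name per line
writeLines(rownames(installed.packages()), "packs.txt")

# On the new installation: readLines() returns a character vector
packs <- readLines("packs.txt")
to_install <- missing_packages(packs)
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Packages that did not come from CRAN will still need installing by hand.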

One of the great things about R is that if you use scripts, you have a record of what you’ve done. If you copy the console output into the script then you also have a copy of that. Brilliant…until you forget or get lazy.

But what if you could make a PDF of your work? Using Sweave (S being the language that R is based on and weave being the verb), you can. But it does use LaTeX, so you have to learn a little bit of that too, as well as install it. Check out CTAN, the LaTeX equivalent of CRAN.

If you use RStudio, this is really easy though. You open a new “R Sweave” file which already has most of what you need to begin – the bones of the LaTeX document:

\documentclass{article}
\begin{document}
\SweaveOpts{concordance=TRUE}
\end{document}

After the \SweaveOpts line you can start typing any description of the analysis you're doing. To start an “R chunk” (some code for R to interpret) you type

<<>>=

enter your R code and then type

@

to end it. So a short file might look like this:

\documentclass{article}
\usepackage[top=1in, bottom=1in, left=1in, right=1in]{geometry}
\usepackage[noae]{Sweave}
\title{Cars}
\begin{document}
\SweaveOpts{concordance=TRUE}
\maketitle
Open the cars dataset:
<<>>=
data(cars)
@
Show a summary of the dataset:
<<>>=
summary(cars)
@
Make a figure
<<fig=TRUE>>=
plot(cars[,1], cars[,2])
@
\end{document}

I added a couple of lines to the code to make it look a little different. The line with top, bottom etc. just alters the margins using the geometry package. I also added \usepackage[noae]{Sweave} because ‘ symbols stop it working; the [noae] option allows it to include them. For the figure I included fig=TRUE between the << and >> to tell Sweave to include the figure. There are other arguments to tell it to ignore the section or just return the result etc.

Once you've got that, you just hit the “Compile PDF” button on the RStudio toolbar.

If you don't use RStudio, you have to use the Sweave function:

help("Sweave", package="utils")
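For anyone going the non-RStudio route, here is a rough sketch of what that might look like. The file name cars.Rnw and the use of a temporary directory are my own choices for illustration; Sweave() writes its .tex output into the current working directory, hence the setwd().

```r
# Write a tiny Sweave file to a temporary directory (file name is arbitrary)
rnw_lines <- c(
  "\\documentclass{article}",
  "\\usepackage[noae]{Sweave}",
  "\\begin{document}",
  "Show a summary of the dataset:",
  "<<>>=",
  "summary(cars)",
  "@",
  "\\end{document}"
)
writeLines(rnw_lines, file.path(tempdir(), "cars.Rnw"))

# Sweave() runs the R chunks and writes cars.tex next to the working directory
old_wd <- setwd(tempdir())
utils::Sweave("cars.Rnw")
setwd(old_wd)

file.exists(file.path(tempdir(), "cars.tex"))
```

From there, tools::texi2pdf("cars.tex") (run in the same directory) should turn the .tex file into a PDF, provided a LaTeX installation is available.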

I hope someone finds this helpful!!

As I've already written, getting data into R from your precious xlsx files is really handy. No need to clutter up your computer with txt or csv files. In the previous post I wrote about the gdata package for importing data from xlsx files and was pointed to, among others, the xlsx package. xlsx seems to be a good package: easy to use and, importantly, fast. It's based on Java, but it comes with all the relevant jar files in an accompanying package, which installs on its own if you have the install dependencies setting set to TRUE.

Reading in with xlsx works the same as any other read function; you just need to tell it which sheet to read, by either name (sheetName argument) or number (sheetIndex):

library(xlsx)
dat <- read.xlsx("testfile.xlsx", sheetName="") # put the name of your sheet between the quotes

There are various other options that other packages for importing Excel files don't seem to have, such as rowIndex and colIndex for specifying which rows or columns you want to import. There is also a second function (read.xlsx2) which is apparently an order of magnitude faster for those particularly big files. Once you've selected the data and run the code, you can happily work with the data.

Writing to xlsx files might be useful too, for storage or for sharing data with people who don't use R, for instance. This is dead easy with xlsx!
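As a small sketch of those options (the file name cars_demo.xlsx is made up, and the whole thing is wrapped in a check so it only runs if xlsx and its Java dependency are installed), this writes the cars data to a temporary workbook and reads part of it back:

```r
if (requireNamespace("xlsx", quietly = TRUE)) {
  # Create a small workbook to read from (hypothetical file name)
  path <- file.path(tempdir(), "cars_demo.xlsx")
  xlsx::write.xlsx(cars, path, row.names = FALSE)

  # Only the header row plus the first 10 data rows, both columns
  first_rows <- xlsx::read.xlsx(path, sheetIndex = 1,
                                rowIndex = 1:11, colIndex = 1:2)

  # read.xlsx2 reads everything as character unless told otherwise
  fast <- xlsx::read.xlsx2(path, sheetIndex = 1,
                           colClasses = c("numeric", "numeric"))
  str(first_rows)
  str(fast)
}
```

Note that rowIndex counts spreadsheet rows, so the header row has to be included if you want the column names.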

If you want just a single dataframe in the workbook you simply do something like the following:

data(cars)
write.xlsx(cars, "cars_dataframe.xlsx")

To create a new file containing multiple dataframes from R, you first create the workbook, add sheets to that workbook and then add the dataframes to the sheets and save the workbook to whatever file you want.

wb <- createWorkbook() # keep the workbook under its own name so data(cars) cannot overwrite it
cars1 <- createSheet(wb=wb, sheetName="Cars")
cars2 <- createSheet(wb=wb, sheetName="MTCars")
data(cars); data(mtcars)
addDataFrame(x=cars, sheet=cars1)
addDataFrame(x=mtcars, sheet=cars2)
saveWorkbook(wb, "Cars_datasets.xlsx")

By default this will add both column and row names, but this can be overridden using the row.names or col.names arguments in the addDataFrame function. You can also add the dataframes to a particular starting place in the sheet using the startRow and startColumn arguments to the addDataFrame function.

There's also some funky styling stuff you can do using the CellStyle, Fill, Alignment, Font and setCellStyle functions of the following sort (from ?CellStyle).
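A quick sketch of those arguments in use, assuming the xlsx package is installed (the output file name is made up). As far as I can tell the column offset argument is spelled startColumn in full, although R's partial matching means a shorter form also gets through.

```r
if (requireNamespace("xlsx", quietly = TRUE)) {
  wb <- xlsx::createWorkbook()
  sheet <- xlsx::createSheet(wb, sheetName = "Cars")

  # Drop the row names and start the table at cell B3
  xlsx::addDataFrame(cars, sheet, row.names = FALSE,
                     startRow = 3, startColumn = 2)

  out <- file.path(tempdir(), "cars_offset.xlsx")  # hypothetical file name
  xlsx::saveWorkbook(wb, out)
}
```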

wb <- createWorkbook()
sheet <- createSheet(wb, "Sheet1")
rows <- createRow(sheet, rowIndex=1)
cell.1 <- createCell(rows, colIndex=1)[[1,1]]
setCellValue(cell.1, "Hello R!")
cs <- CellStyle(wb) +
  Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE,
       name="Courier New", color="orange") +
  Fill(backgroundColor="lavender", foregroundColor="lavender",
       pattern="SOLID_FOREGROUND") +
  Alignment(h="ALIGN_RIGHT")
setCellStyle(cell.1, cs)
# you need to save the workbook now if you want to see this art

Enjoy!

R2 is a useful tool for determining how strong the relationship between two variables is. Unfortunately, the definition of R2 for mixed effects models is difficult: do you include the random variable or just the fixed effects? Including just the fixed effects is essentially a standard linear model, while including the random effects could confuse some readers (you have a much higher R2). So which do you report? Nakagawa and Schielzeth (2013) say both! They even provide the formulae for their calculation.

Over on the sample(ECOLOGY) blog, an R function has been written for lme and lmer models and reports both R2 based on just the fixed effects (marginal R2) and that incorporating the random effects (conditional R2). Simply copy the code into the console, hit enter and give the function a list of your models.

e.g. (from the sample(ECOLOGY) page)

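# Example
# (rsquared.lme comes from the sample(ECOLOGY) code, which must be pasted in first)
library(lme4) # provides lmer
mod1=lmer(rnorm(100,5,10)~rnorm(100,20,100)+(1|rep(c("A","B"),50)))
mod2=lmer(rnorm(100,5,10)~rnorm(100,20,100)+rnorm(100,0.5,2)+(1|rep(c("A","B"),50)))
rsquared.lme(list(mod1,mod2))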

A couple of warnings for lmer users though:

1. you might have to tweak the code if you only have a single random effect in your lmer models. If you have multiple random effect levels or lme models, you should be fine
2. the function is currently written for “mer” class models from lmer – the newer development versions of lmer use the “merMod” class and do away with @ as a slot
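
To make the two quantities concrete, here is a minimal sketch for the simplest case only: a Gaussian model with random intercepts, fitted with a “merMod” version of lme4. It is not the sample(ECOLOGY) function. The helper name r2_gaussian is my own, and the sleepstudy data that ships with lme4 is used purely as an example. Marginal R2 is the fixed-effect variance over the total variance; conditional R2 adds the random-effect variance to the numerator.

```r
# Hypothetical helper: marginal and conditional R2 from the three
# variance components of a Gaussian mixed model
r2_gaussian <- function(var_fixed, var_random, var_resid) {
  total <- var_fixed + var_random + var_resid
  c(marginal = var_fixed / total,
    conditional = (var_fixed + var_random) / total)
}

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)

  # Variance of the predictions made from the fixed effects alone
  var_fixed <- var(as.vector(lme4::getME(fit, "X") %*% lme4::fixef(fit)))
  # Random-intercept variance, summed over grouping factors
  var_random <- sum(sapply(lme4::VarCorr(fit), function(v) v[1, 1]))
  # Residual variance
  var_resid <- stats::sigma(fit)^2

  print(r2_gaussian(var_fixed, var_random, var_resid))
}
```

Random slopes and non-Gaussian models need the fuller treatment in the paper, so treat this as an illustration of the idea only.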

Nakagawa, S., and H. Schielzeth. 2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution 4(2): 133-142.

Some remote sensing folks use IDL and Excel to analyse their data. They take the ratio of a pair of reflectance values and correlate it against known values of some parameter. However, because they have a large number of bands, they compare a large number of band ratios to the known values, comparing the models with r2 or some such criterion.

I helped a friend do this over the last couple of days. Using a number of for loops, we created the ratios and then calculated linear, polynomial and non-linear models. Using some simple dataframe reordering, we sorted the models by fit (using AIC, as r2 doesn't make sense for nonlinear models) and refit the best 20 or so models, validating them against some data not used to construct the original model.

The code (about 500 lines including quite a lot of comments) took about 2-3 hours to write (with me teaching what bits of the code do along the way, so perhaps an hour and a half's coding normally) and runs in about 5 minutes. I have no idea how long the IDL equivalent code takes to run, but doing this in the usual software environments, a mixture of IDL and Excel I'm told, would take about a week.
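
The real script isn't shown here, but the core loop can be sketched in a few lines. Everything below is simulated: the five bands, the response chl and the column names are all invented for illustration, and only simple linear models are fitted (the actual analysis also used polynomial and non-linear ones).

```r
set.seed(1)
n <- 40

# Simulated reflectance for five bands (made-up data)
bands <- as.data.frame(matrix(runif(n * 5, 0.1, 1), ncol = 5))
names(bands) <- paste0("b", 1:5)

# Simulated known parameter, driven by the b2/b4 ratio plus noise
chl <- 2 + 3 * bands$b2 / bands$b4 + rnorm(n, sd = 0.2)

# Every ordered pair of different bands
results <- expand.grid(num = names(bands), den = names(bands),
                       stringsAsFactors = FALSE)
results <- results[results$num != results$den, ]
results$aic <- NA_real_

# Fit one linear model per ratio and record its AIC
for (i in seq_len(nrow(results))) {
  ratio <- bands[[results$num[i]]] / bands[[results$den[i]]]
  results$aic[i] <- AIC(lm(chl ~ ratio))
}

# Lowest AIC first
results <- results[order(results$aic), ]
head(results)
```

Since the simulated response was built from b2/b4, that ratio should come out at the top of the table; the best few rows are then the candidates to refit and validate on held-back data.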

I wonder what/how many other scripts written in other languages that are in common use would be much quicker in R (not that R is an especially fast language).

A post over on Dang, another error (show me yours and I’ll show you mine) has a method of working with R which uses an IDE called Eclipse in conjunction with a plugin called StatET. Eclipse is one of a number of IDEs that I’m aware of (Tinn-R being another, but this Sciviews page has an enormous list).

I like R-Studio and have recommended it to a number of other people, many of whom have also taken up using it (one even saying that once you've gotten used to it, you can't imagine using R without it). Similar to other IDEs, it has various panes for assorted purposes. I have it set up so that I have scripts top-left, the R console bottom-left, workspace, history and package selection top-right and then system navigator, plots and help bottom-right. These different panels make working with R-Studio a pleasure: you set them up as you want them and can shift them around if you need to.

One of the most useful features is the workspace. This shows the name, class and size of objects and lets you see at a glance what you have in your workspace. Super useful!!! I also really like that everything is all in one window, rather than having to switch between the internet browser for help, a plot window, code and console windows (yes, I am aware that you can make R place console, script and plots in a single window, but I dislike the text editor and the way that you sometimes have to search for the window, especially on small screens).

R-Studio can also link in with Sweave, R Markdown and HTML so you can make some nice PDFs straight from the same program. It even has version control if you have Git or SVN. It's not a feature that I use, but you can also join objects and scripts together into projects so that you just load the project and all of the code and objects are there.

A great piece of software!

R is great for making graphics! You have an infinite amount of control over your plots, the only limits being your imagination (and perhaps your R coding ability). Some things are easy to do, others not quite so easy of course. One of the trickier parts, that's not limited to R, is positioning labels on a figure. What do I mean? Well, quite often in journals or wherever, you have one figure that's split into multiple panels, each showing something different (e.g. Figure 1a shows the relationship between miles per gallon and displacement; Fig 1b shows the relationship between mpg and the number of carburettors). Quite often in these situations it can be difficult to get the label (the “a” or the “b”) in the same location relative to the figure region, which can look crappy. It can be time consuming to get it right.

data(mtcars)
par(mfrow=c(1,2), mai=c(1,0.8,0.3,0.2))
plot(mpg ~ disp, mtcars)
text(420, 31, "a")
plot(mpg ~ carb, mtcars)
text(7.8, 31, "b")

Note how much closer to the right hand side the “b” is in comparison with the “a”.

I've written a function that makes it easy and reproducible. One simple line of code that can be repeated for each panel and the problem is solved! It's available from HERE!! Or via sourcing as below:

source("http://db.tt/FxQnTn55")
par(mfrow=c(1,2), mai=c(1,0.8,0.3,0.2))
# (mfrow makes the panels, mai manipulates the margin size)
plot(mpg ~ disp, mtcars)
panellab(perc.x=10, lab="a")
plot(mpg ~ carb, mtcars)
panellab(perc.x=10, lab="b")

The function allows you to specify how far from the corner you want your label in percentage of the graph area. You can specify the percentage for x and y individually (arguments perc.x and perc.y, respectively; specifying only perc.x makes perc.y the same as perc.x). You can also specify which corner using the pos argument (pos = “TR” – top right, “TL” – top left, “BR” – bottom right, “BL” – bottom left; defaults to “TR”). You can also specify any other regular text arguments such as col or font.
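
In case the link ever disappears, here is a rough sketch of how a function like this can work. It is not the code behind the link: the names corner_xy and corner_label and the 5% default are mine, and it assumes ordinary (non-log) axes. The trick is that par("usr") gives the extent of the plot region in data coordinates, so a percentage of that range lands in the same relative spot in every panel.

```r
# Hypothetical helper: coordinates a given percentage in from a corner.
# usr is c(x1, x2, y1, y2), as returned by par("usr")
corner_xy <- function(usr, perc.x = 5, perc.y = perc.x, pos = "TR") {
  dx <- diff(usr[1:2]) * perc.x / 100
  dy <- diff(usr[3:4]) * perc.y / 100
  x <- if (pos %in% c("TR", "BR")) usr[2] - dx else usr[1] + dx
  y <- if (pos %in% c("TR", "TL")) usr[4] - dy else usr[3] + dy
  c(x = x, y = y)
}

# Hypothetical wrapper: draw the label on the current panel;
# extra arguments (col, font, ...) are passed on to text()
corner_label <- function(lab, perc.x = 5, perc.y = perc.x, pos = "TR", ...) {
  xy <- corner_xy(par("usr"), perc.x, perc.y, pos)
  text(xy[["x"]], xy[["y"]], lab, ...)
}

par(mfrow = c(1, 2), mai = c(1, 0.8, 0.3, 0.2))
plot(mpg ~ disp, mtcars)
corner_label("a", perc.x = 10)
plot(mpg ~ carb, mtcars)
corner_label("b", perc.x = 10)
```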

Hope it's useful to someone!!!