Sometimes it is preferable to label data series directly instead of using a legend. This post demonstrates one way of using labels instead of a legend in a ggplot2 plot.

 `> library(ggplot2)`
 ```> p <- ggplot(dfm, aes(month, value, group = City, colour = City)) + geom_line(size = 1) + theme(legend.position = "none")```
 ```> p + geom_text(data = dfm[dfm$month == "Dec", ], aes(label = City), hjust = 0.7, vjust = 1)```

The addition of labels requires manual calculation of the label positions, which are then passed on to geom_text(). If one wanted to move the labels around, the code would need manual adjustment, because the label positions would have to be recalculated.
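To make the manual step concrete, here is a minimal sketch. The data frame below is a made-up stand-in for the post's `dfm` (the city names and values are assumptions, not the original data), and `label_pos` is a hypothetical name for the rows that hold the label positions.

```r
# Made-up stand-in for `dfm`: monthly values for three hypothetical cities.
months <- factor(month.abb, levels = month.abb)
dfm <- expand.grid(month = months, City = c("A", "B", "C"))
dfm$value <- round(10 + 5 * sin(seq_len(nrow(dfm)) / 3), 1)

# Manual step: pick one row per series to serve as the label position.
# Here each label sits at the last month of its series.
last_month <- levels(dfm$month)[nlevels(dfm$month)]
label_pos <- dfm[dfm$month == last_month, ]

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dfm, aes(month, value, group = City, colour = City)) +
    geom_line() +
    theme(legend.position = "none") +
    geom_text(data = label_pos, aes(label = City), hjust = 0.7, vjust = 1)
}
```

Moving the labels to, say, the first month means changing how `label_pos` is computed, which is exactly the manual adjustment described above. Call `print(p)` to draw the plot.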

This problem is easily solved with the help of the directlabels package by Toby Dylan Hocking, which “is an attempt to make direct labeling a reality in everyday statistical practice by making available a body of useful functions that make direct labeling of common plots easy to do with high-level plotting systems such as lattice and ggplot2”.

 `> install.packages("directlabels", repos = "http://r-forge.r-project.org")`
 `> library(directlabels)`

The above plot can be reproduced with one line of code.

 ```> direct.label(p, list(last.points, hjust = 0.7, vjust = 1))```

In addition to using the several predefined positioning functions, one can also write one's own. For example, the following places rotated labels at the starting value of each series.

 ```> angled.firstpoints <- list("first.points", rot = 45, hjust = 0, vjust = -0.7)```
 ```> direct.label(p, angled.firstpoints)```

I agree with the author’s conclusion that the directlabels package simplifies and makes more convenient the labeling of data series in both lattice and ggplot2.

Thanks to Baptiste for bringing this package to my attention.


At the 2006 useR! conference, Jim Porzak gave a presentation on data profiling with R. He showed how to draw summary panels of the data using a combination of grid and base graphics.

Unfortunately the code has not (yet) been released as a package, so when I recently needed to quickly review several datasets at the beginning of an analysis project I started to look for alternatives. A quick search revealed two options that offer similar functionality: the r2lUniv package and the describe() function in the Hmisc package.

Hadley Wickham recently shared a nice tip on how to get a faceted scatterplot with all points in the background of each panel.

This technique makes clever use of setting the faceting variable to NULL in the data for the background layer, so that all points are plotted in light grey in every facet.

 `> library(ggplot2)`
 ```> ggplot(mtcars, aes(cyl, mpg)) + geom_point(data = transform(mtcars, gear = NULL), colour = "grey80") + geom_point() + facet_grid(~gear) + theme_bw()```
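A quick check of what the `transform()` call does may help here. The sketch below only inspects the data (`background` is a name introduced for illustration): assigning NULL drops the `gear` column, and a layer whose data lacks the faceting variable is drawn in every panel.

```r
# Setting a column to NULL inside transform() removes it from the copy.
background <- transform(mtcars, gear = NULL)

"gear" %in% names(background)       # FALSE: the faceting variable is gone
nrow(background) == nrow(mtcars)    # TRUE: all rows are still present
```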

Update 17 May 2010

bch asked in the comments below how to achieve the same effect when there are two faceting variables. The method is the same, except that both faceting variables need to be excluded from the dataset used to draw the light grey points.

 ```> ggplot(mtcars, aes(cyl, mpg)) + geom_point(data = mtcars[, !names(mtcars) %in% c("am", "gear")], colour = "grey80") + geom_point() + facet_grid(am ~ gear) + theme_bw()```
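If this pattern is needed often, the column exclusion can be wrapped in a small helper. This is only a sketch: `drop_facet_vars()` is a hypothetical function written for this example, not part of ggplot2.

```r
# Hypothetical helper: return a copy of `data` without the faceting columns.
drop_facet_vars <- function(data, vars) {
  data[, !names(data) %in% vars, drop = FALSE]
}

bg <- drop_facet_vars(mtcars, c("am", "gear"))

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(cyl, mpg)) +
    geom_point(data = bg, colour = "grey80") +
    geom_point() +
    facet_grid(am ~ gear) +
    theme_bw()
}
```

Using `drop = FALSE` keeps the result a data frame even if only one column remains.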

The David@Work blog shows how to fill in the area between two crossing lines in an Excel chart. This post was also published as a guest post on the PTS blog.

Let’s try to replicate this graph in ggplot2.

A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output csv-files from R and then import them manually or with the help of VBA into Excel, I was after a more streamlined solution, as I would need to repeat this process quite regularly in the future.

CRAN has several packages that can create an Excel file, but several of them provide only very basic functionality. The R-wiki page on exchanging data between R and Windows applications focuses mainly on the data import problem.

My objective was to find an export method that would allow me to easily split a larger dataframe by values of a given variable so that each subset would be exported to its own worksheet in the same Excel file. I tried out the different ways of achieving this and documented my findings below.
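Whichever export package is used, the splitting step itself can be done in base R. The sketch below uses `mtcars` as stand-in data and the `cyl_` sheet-name prefix as an arbitrary assumption; the actual writing of worksheets is left out because it depends on the package chosen.

```r
# Split a data frame by the values of one variable.
# The result is a named list with one data frame per value.
by_cyl <- split(mtcars, mtcars$cyl)

# Candidate worksheet names derived from the list names.
sheet_names <- paste0("cyl_", names(by_cyl))

# Number of rows that would go to each worksheet.
rows_per_sheet <- sapply(by_cyl, nrow)
```

Each element of `by_cyl` would then be passed, together with its entry in `sheet_names`, to the export function of the chosen package.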

A few people have emailed me and enquired about the use of tools mentioned at the end of this post to make blogposts with embedded R-commands. Below is a small step-by-step walkthrough of how to accomplish this.

On the ggplot2 mailing-list the following question was asked:

How to create a back-to-back bar chart with ggplot2?

For anyone who don’t know what I am talking about, have a look on a recent paper from the EU. I’d like to create plots like the graphs 5,6,18 in the paper.

An example graph from the above report is below:

Let’s create the same graph in ggplot2.
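One common way to get a back-to-back layout, sketched here with made-up data rather than the figures from the report, is to negate the values of one group so that its bars extend in the opposite direction. The data frame `pyramid` and its columns are assumptions introduced for this example, and this is not necessarily the approach the full post takes.

```r
# Made-up data: counts for two groups across four bands.
bands <- c("0-19", "20-39", "40-59", "60+")
pyramid <- data.frame(
  band  = factor(rep(bands, times = 2), levels = bands),
  group = rep(c("Female", "Male"), each = 4),
  count = c(12, 18, 15, 9, 13, 17, 14, 7)
)

# Mirror one group around zero by negating its values.
pyramid$signed <- ifelse(pyramid$group == "Male", -pyramid$count, pyramid$count)

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pyramid, aes(band, signed, fill = group)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    scale_y_continuous(labels = abs)  # show magnitudes on both sides
}
```

Passing `abs` as the label function hides the minus signs on the mirrored side of the axis.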