
ggplot2: Labelling Data Series and Adding a Data Table

April 29, 2009

Stephen Few has posted on his website a few design examples on how to improve the presentation of quantitative information.

One of the examples depicts the average monthly temperature in three cities.

https://learnr.wordpress.com/wp-content/uploads/2009/04/example2improvedsolution.gif

This post replicates the graph in ggplot2 and demonstrates how to label data series and how to add a data table to the plot.

The first step after importing the data is to convert it from wide format to long format, and replace the long month names with abbreviations, after which it is time to have a first look at the data.

> library(ggplot2)
> library(reshape)
> library(grid)
> df <- structure(list(City = structure(c(2L,
     3L, 1L), .Label = c("Minneapolis", "Phoenix",
     "Raleigh"), class = "factor"), January = c(52.1,
     40.5, 12.2), February = c(55.1, 42.2, 16.5),
     March = c(59.7, 49.2, 28.3), April = c(67.7,
         59.5, 45.1), May = c(76.3, 67.4, 57.1),
     June = c(84.6, 74.4, 66.9), July = c(91.2,
         77.5, 71.9), August = c(89.1, 76.5,
         70.2), September = c(83.8, 70.6, 60),
     October = c(72.2, 60.2, 50), November = c(59.8,
         50, 32.4), December = c(52.5, 41.2,
         18.6)), .Names = c("City", "January",
     "February", "March", "April", "May", "June",
     "July", "August", "September", "October",
     "November", "December"), class = "data.frame",
     row.names = c(NA, -3L))
> dfm <- melt(df, variable_name = "month")
> levels(dfm$month) <- month.abb
> p <- ggplot(dfm, aes(month, value, group = City,
     colour = City))
> (p1 <- p + geom_line(size = 1))
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature1.png
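
As a rough illustration of what the wide-to-long step does, the sketch below reshapes a two-city, three-month subset of the data by hand in base R. The names `wide`, `month_cols` and `long` are made up for this example and are not part of the original code; `melt()` performs the equivalent reshaping in a single call.

```r
# Two cities and three months taken from the data above (illustration only)
wide <- data.frame(City = c("Phoenix", "Raleigh"),
                   January = c(52.1, 40.5),
                   February = c(55.1, 42.2),
                   March = c(59.7, 49.2))

# Every column except the identifier becomes one level of "month"
month_cols <- setdiff(names(wide), "City")

# Stack the month columns: one row per city/month combination
long <- data.frame(
    City = rep(wide$City, times = length(month_cols)),
    month = factor(rep(month_cols, each = nrow(wide)), levels = month_cols),
    value = unlist(wide[month_cols], use.names = FALSE)
)

# Replace the long month names with abbreviations, as in the post
levels(long$month) <- month.abb[seq_along(month_cols)]
long
```

Keeping `month` as a factor with levels in calendar order is what makes the months appear in the right order on the x-axis.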

A few plot elements need changing: use the black-and-white theme, format the y-axis labels, add a plot title, and remove the gridlines, axis titles, and plot background.

> dgr_fmt <- function(x, ...) {
     parse(text = paste(x, "*degree", sep = ""))
 }
> none <- theme_blank()
> p2 <- p1 + theme_bw() + scale_y_continuous(formatter = dgr_fmt,
     limits = c(0, 100), expand = c(0, 0)) +
     opts(title = expression("Average Monthly Temperatures (" *
         degree * "F)")) + opts(panel.grid.major = none,
     panel.grid.minor = none) + opts(legend.position = "none") +
     opts(panel.background = none) + opts(panel.border = none) +
     opts(axis.line = theme_segment(colour = "grey50")) +
     xlab(NULL) + ylab(NULL)
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature2.png
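
To see what the label formatter returns, it can be called on its own, outside of any plot. The snippet below is only a sketch: `breaks` and `lab` are names chosen for this example, and the break values are arbitrary.

```r
# Same formatter as above: pastes "*degree" onto each value and parses the
# result, giving plotmath expressions that are drawn with a degree sign
dgr_fmt <- function(x, ...) {
    parse(text = paste(x, "*degree", sep = ""))
}

breaks <- c(0, 25, 50)   # arbitrary example values
lab <- dgr_fmt(breaks)

class(lab)               # an "expression" vector, one element per break
length(lab)
deparse(lab[[2]])        # the unevaluated call behind the second label
```

Because the result is an expression vector rather than a character vector, the axis shows the degree symbol instead of the literal text "*degree".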

Next add the reference lines.

> (p3 <- p2 + geom_vline(xintercept = c(2.9,
     5.9, 8.9, 11.9), colour = "grey85", alpha = 0.5) +
     geom_hline(yintercept = 32, colour = "grey80",
         alpha = 0.5) + annotate("text", x = 1.2,
     y = 35, label = "Freezing", colour = "grey80",
     size = 4) + annotate("text", x = c(1.5,
     4.5, 7.5, 10.5), y = 97, label = c("Winter",
     "Spring", "Summer", "Autumn"), colour = "grey70",
     size = 4))
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature3.png
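
The numeric x positions used above work because ggplot2 places the levels of a discrete axis at the integers 1, 2, 3 and so on. The following sketch shows one way the same numbers could be derived from month names rather than typed in; `month_pos`, `season_starts`, `boundaries` and `label_x` are hypothetical names used only here.

```r
# Position of each month on the discrete x-axis: Jan = 1, ..., Dec = 12
month_pos <- setNames(seq_along(month.abb), month.abb)

# Each reference line sits just to the left of these months
season_starts <- c("Mar", "Jun", "Sep", "Dec")
boundaries <- month_pos[season_starts] - 0.1
unname(boundaries)   # 2.9 5.9 8.9 11.9

# The season labels in the post sit 1.5 positions to the left of the same months
label_x <- month_pos[season_starts] - 1.5
unname(label_x)      # 1.5 4.5 7.5 10.5
```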

And finally, the series labels. Note that geom_text() is given a different dataset here: a subset containing only the December rows, which supply the label positions.

> (p4 <- p3 + geom_text(data = dfm[dfm$month ==
     "Dec", ], aes(label = City), hjust = 0.7,
     vjust = 1))
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature4.png

The original graph also includes a data table with all the values. It is possible to include a table of values on the plot using grid.text; however, using geom_text() allows for more flexibility. Essentially, all the values are plotted on a graph, and all the background elements are then removed.

> data_table <- ggplot(dfm, aes(x = month, y = factor(City),
     label = format(value, nsmall = 1), colour = City)) +
     geom_text(size = 3.5) + theme_bw() + scale_y_discrete(formatter = abbreviate,
     limits = c("Minneapolis", "Raleigh", "Phoenix")) +
     opts(panel.grid.major = none, legend.position = "none",
         panel.border = none, axis.text.x = none,
         axis.ticks = none) + opts(plot.margin = unit(c(-0.5,
     1, 0, 0.5), "lines")) + xlab(NULL) + ylab(NULL)
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature5.png
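
The `format(value, nsmall = 1)` call in the label aesthetic is what keeps the table columns tidy. A small sketch with three values from the data (the variable name `temps` is made up for this example):

```r
temps <- c(60, 52.1, 12.2)    # three values taken from the data above

as.character(temps)           # "60"   "52.1" "12.2"
format(temps, nsmall = 1)     # "60.0" "52.1" "12.2"
```

With `nsmall = 1`, whole numbers such as 60 are shown with one decimal place, so every entry in the table has the same width.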

Now the only step remaining is to set up the viewports and combine the two plots into one.

> Layout <- grid.layout(nrow = 2, ncol = 1, heights = unit(c(2,
     0.25), c("null", "null")))
> grid.show.layout(Layout)
> vplayout <- function(...) {
     grid.newpage()
     pushViewport(viewport(layout = Layout))
 }
> subplot <- function(x, y) viewport(layout.pos.row = x,
     layout.pos.col = y)
> mmplot <- function(a, b) {
     vplayout()
     print(a, vp = subplot(1, 1))
     print(b, vp = subplot(2, 1))
 }
> mmplot(p4, data_table)
https://learnr.wordpress.com/wp-content/uploads/2009/04/temperature6.png
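
The heights in the layout are given in "null" units, which divide the available space in proportion to their values. As a rough check of how much room each plot receives (the names `heights` and `share` are used only in this sketch):

```r
library(grid)

# Same relative heights as in the layout above
heights <- c(2, 0.25)
Layout <- grid.layout(nrow = 2, ncol = 1, heights = unit(heights, "null"))

# Fraction of the page height given to the main plot and to the data table
share <- heights / sum(heights)
round(share, 3)   # 0.889 0.111
```

Increasing the second value gives the table more room, which may help if its rows look cramped.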
21 Comments
  1. April 30, 2009 12:19 am

    Very great! I didn’t think that it was possible to add a table below a plot… Your method is clever, thanks.

  2. Dimitri
    December 24, 2009 4:38 pm

    Beautiful !!
    Thank you very much.

  3. baptiste
    December 31, 2009 3:43 pm

    Nice!

    You may be interested to try the directlabels package on R-forge which provides ggplot2 functions to replace the legend with coloured text alongside the curves as you did here.

  4. gd047
    August 23, 2010 11:23 am

    The command for adding the reference lines produces the error:

    Error: When _setting_ aesthetics, they may only take one value. Problems: label

    Could you please correct the code?

    • November 16, 2011 5:30 am

      Getting the same bug here… Error: When _setting_ aesthetics, they may only take one value. Problems: label

    • James Dalrymple
      April 6, 2012 11:25 pm

      The code is correct; you need to load the grid library before it works.
      library(grid)

      • May 9, 2012 1:33 am

        I still have this issue when I’m trying to plot my own graph, even after loading grid. Is there something new I might be missing?

      • learnr
        May 9, 2012 1:19 pm

        What sort of error messages do you get?

  5. December 29, 2011 12:07 am

    Damn… nicely done. I’m going to trace through this example as I need to do something similar for some of my graphs.

  6. shobhit
    April 12, 2012 11:05 am

    awesome!

  7. ambarrio
    June 2, 2012 2:00 pm

    Hi, I was following this great example and I found that you have to make two changes if you work with R 2.15.0:

    p2 <- p1 + theme_bw() + scale_y_continuous(labels=math_format(.x * degree)) +
    opts(title = expression("Average Monthly Temperatures (" * degree * "F)")) +
    opts(panel.grid.major = none, panel.grid.minor = none) + opts(legend.position = "none") +
    opts(panel.background = none) + opts(panel.border = none) + opts(axis.line = theme_segment(colour = "grey50")) +
    xlab(NULL) + ylab(NULL)

    p3 <- p2 + geom_vline(xintercept = c(2.9, 5.9, 8.9, 11.9), colour = "grey85", alpha = 0.5) +
    geom_hline(yintercept = 32, colour = "grey80", alpha = 0.5) +
    annotate("text", x = 1.2, y = 35, label = "Freezing", colour = "grey80", size = 4) +
    annotate("text", x = c(1.5), y = 97, label = c("Winter"), colour = "grey70", size = 4) +
    annotate("text", x = c(4.5), y = 97, label = c("Spring"), colour = "grey70", size = 4) +
    annotate("text", x = c(7.5), y = 97, label = c("Summer"), colour = "grey70", size = 4) +
    annotate("text", x = c(10.5), y = 97, label = c("Autumn"), colour = "grey70", size = 4)

    and

    data_table <- ggplot(dfm, aes(x = month, y = factor(City), label = format(value, nsmall = 1), colour = City)) +
    geom_text(size = 3.5) + theme_bw() +
    scale_y_discrete(labels = abbreviate, limits = c("Minneapolis", "Raleigh", "Phoenix")) +
    opts(panel.grid.major = none, legend.position = "none", panel.border = none, axis.text.x = none, axis.ticks = none) +
    opts(plot.margin = unit(c(-0.5, 1, 0, 0.5), "lines")) + xlab(NULL) + ylab(NULL)

    LearnR, thanks for your cool tutorials.

    /ambarrio

  8. September 11, 2012 12:05 pm

    # for ggplot2 0.9.2 release
    library(ggplot2)
    library(reshape)
    library(grid)
    df <- structure(list(City = structure(c(2L, 3L, 1L), .Label = c("Minneapolis", "Phoenix", "Raleigh"), class = "factor"), January = c(52.1, 40.5, 12.2), February = c(55.1, 42.2, 16.5), March = c(59.7, 49.2, 28.3), April = c(67.7, 59.5, 45.1), May = c(76.3, 67.4, 57.1), June = c(84.6, 74.4, 66.9), July = c(91.2, 77.5, 71.9), August = c(89.1, 76.5, 70.2), September = c(83.8, 70.6, 60), October = c(72.2, 60.2, 50), November = c(59.8, 50, 32.4), December = c(52.5, 41.2, 18.6)), .Names = c("City", "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"), class = "data.frame", row.names = c(NA, -3L))
    dfm <- melt(df, variable_name = 'month')
    levels(dfm$month) <- month.abb
    p <- ggplot(dfm, aes(month, value, group = City, colour = City))
    p1 <- p + geom_line(size = 1)
    ## formatter function
    dgr_fmt <- function(x, ...){
    parse(text = paste(x, "*degree", sep = ''))
    }
    ## end dgr_fmt
    ## blank element used to switch off theme components
    none <- element_blank()
    p2 <- p1 + theme_bw() + scale_y_continuous(labels = dgr_fmt, limits = c(0, 100), expand = c(0, 0) ) + labs(title = expression("Average Monthly Temperatures (" * degree * "F)"), x = NULL, y = NULL) + theme(panel.grid.major = none, panel.grid.minor = none, legend.position = "none", panel.background = none, panel.border = none, axis.line = element_line(colour = "grey50"))
    p3 <- p2 + geom_vline(xintercept = c(2.9, 5.9, 8.9, 11.9), colour = "grey85", alpha = 0.5) + geom_hline(yintercept = 32, colour = "grey80", alpha = 0.5) + annotate("text", x = 1.2, y = 35, label = "Freezing", colour = "grey80", size = 4) + annotate("text", x = c(1.5, 4.5, 7.5, 10.5), y = 97, label = c("Winter", "Spring", "Summer", "Autumn"), colour = "grey70", size = 4)
    p4 <- p3 + geom_text(data = dfm[dfm$month == "Dec", ], aes(label = City), hjust = 0.7, vjust = 1)
    data_table <- ggplot(dfm, aes(x = month, y = factor(City), label = format(value, nsmall = 1), colour = City)) + geom_text(size = 3.5) + theme_bw() + scale_y_discrete(labels = abbreviate, limits = c("Minneapolis", "Raleigh", "Phoenix")) + theme(panel.grid.major = none, legend.position = "none", panel.border = none, axis.text.x = none, axis.ticks = none, plot.margin = unit(c(-0.5, 1, 0, 0.5), "lines")) + labs(x = NULL, y = NULL)
    Layout <- grid.layout(nrow = 2, ncol = 1, heights = unit(c(2, 0.25), c("null", "null")))
    grid.show.layout(Layout)
    vplayout <- function(...) {
    grid.newpage()
    pushViewport(viewport(layout = Layout))
    }
    subplot <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)
    mmplot <- function(a, b) {
    vplayout()
    print(a, vp = subplot(1, 1))
    print(b, vp = subplot(2, 1))
    }
    mmplot(p4, data_table)

  9. M.Adams
    February 6, 2013 11:56 pm

    That was a great example. Thanks for sharing with us. I am trying to add minor grid lines to the data table but am having a hard time with it. Is it because the x and y axes are not continuous?

    data_table + theme(panel.grid.minor.x = element_line(size = 2, color = "black"))
    data_table + theme(panel.grid.minor.y = element_line(size = 2, color = "black"))

  10. M.Adams
    February 7, 2013 6:38 pm

    Great post. Thanks for sharing with us. I am trying to add minor grid lines to the data_table but they don’t show up on the graph. Any suggestions on how to do it?

    data_table + theme(panel.grid.minor.x = element_line(size = 2, color = "white"))
    data_table + theme(panel.grid.minor.y = element_line(size = 2, color = "white"))

    • learnr
      June 6, 2013 9:57 am

      This seems correct to me, so maybe try a different colour?

  11. lost
    January 14, 2014 3:11 am

    Great post. I can’t seem to get the table to line up with the plot, though, without a great deal of manual fussing with the margins for the table (it takes up either too little or too much room). Do you know how to set these programmatically?

  12. markelgl
    March 22, 2018 5:43 pm

    Great post, it was really useful. Nonetheless, some functions are outdated. I have updated the code with some minor changes, in case it is helpful for someone:

    library(ggplot2)
    library(reshape2)
    library(grid)

    df <- structure(list(City = structure(c(2L,3L, 1L), .Label = c("Minneapolis", "Phoenix", "Raleigh"), class = "factor"),
    January = c(52.1, 40.5, 12.2),
    February = c(55.1, 42.2, 16.5),
    March = c(59.7, 49.2, 28.3),
    April = c(67.7, 59.5, 45.1),
    May = c(76.3, 67.4, 57.1),
    June = c(84.6, 74.4, 66.9),
    July = c(91.2, 77.5, 71.9),
    August = c(89.1, 76.5, 70.2),
    September = c(83.8, 70.6, 60),
    October = c(72.2, 60.2, 50),
    November = c(59.8, 50, 32.4),
    December = c(52.5, 41.2, 18.6)),
    .Names = c("City", "January","February", "March", "April", "May", "June","July", "August",
    "September", "October","November", "December"),
    class = "data.frame",
    row.names = c(NA, -3L))

    dfm <- melt(df, variable.name = "month")

    levels(dfm$month) <- month.abb

    p <- ggplot(dfm, aes(month, value, group = City, colour = City))

    (p1 <- p + geom_line(size = 1))

    dgr_fmt <- function(x, ...) {
    parse(text = paste(x, "*degree", sep = ""))
    }
    none <- element_blank()

    p2 <- p1 + scale_y_continuous(labels = dgr_fmt, limits = c(0, 100), expand = c(0, 0) ) +
    ggtitle(expression("Average Monthly Temperatures (" * degree * "F)"))+
    scale_color_discrete(guide = FALSE)+
    xlab(NULL) + ylab(NULL)+
    theme_bw() +
    theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    panel.border = element_blank(),
    axis.line = element_line(colour = "grey50"))

    (p3 <- p2 + geom_vline(xintercept = c(2.9,5.9, 8.9, 11.9), colour = "grey85", alpha = 0.5) +
    geom_hline(yintercept = 32, colour = "grey80",alpha = 0.5) +
    annotate("text", x = 1.2, y = 35, label = "Freezing", colour = "grey80", size = 4) +
    annotate("text", x = c(1.5, 4.5, 7.5, 10.5), y = 97,
    label = c("Winter","Spring", "Summer", "Autumn"), colour = "grey70", size = 4))

    (p4 <- p3 + geom_text(data = dfm[dfm$month == "Dec", ], aes(label = City), hjust = 0.7, vjust = 1))

    data_table <- ggplot(dfm, aes(x = month, y = factor(City), label = format(value, nsmall = 1), colour = City)) +
    geom_text(size = 3.5) +
    scale_color_discrete(guide = FALSE)+
    scale_y_discrete(labels = abbreviate, limits = c("Minneapolis", "Raleigh", "Phoenix")) +
    xlab(NULL) + ylab(NULL)+
    theme_bw() +
    theme(panel.grid.major = element_blank(),
    panel.border = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks = element_blank(),
    plot.margin = unit(c(-0.5, 1, 0, 0.5), "lines"))

    Layout <- grid.layout(nrow = 2, ncol = 1, heights = unit(c(2, 0.25), c("null", "null")))

    grid.show.layout(Layout)

    vplayout <- function(...) {
    grid.newpage()
    pushViewport(viewport(layout = Layout))
    }

    subplot <- function(x, y) viewport(layout.pos.row = x,
    layout.pos.col = y)
    mmplot <- function(a, b) {
    vplayout()
    print(a, vp = subplot(1, 1))
    print(b, vp = subplot(2, 1))
    }

    mmplot(p4, data_table)

  13. S.Stender
    July 31, 2018 1:40 am

    Thanks Markelgl – very helpful post! Everything worked, except I had to change function(…) to function(), and function(x, …) to function(x) for it to work in my R session (RStudio, version 3.4.4).
