
ggplot2: Bump Chart

May 6, 2009

The Charts & Graphs blog has posted a 2009 Kentucky Derby bump chart made in Excel.

Below, a similar bump chart is built with ggplot2.

After the horse positions are typed into a csv file and imported into R, the bump chart with ggplot2 default settings looks like this.

> library(ggplot2)
> library(reshape2)  # provides melt(); must be loaded explicitly with current ggplot2
> df <- read.csv("derby.csv")
> df$Horse <- with(df, reorder(Horse, Finish))
> dfm <- melt(df)
> p <- ggplot(dfm, aes(factor(variable), value,
     group = Horse, colour = Horse, label = Horse))
> p1 <- p + geom_line() + geom_text(data = subset(dfm,
     variable == "Finish"), aes(x = factor(variable +
     0.5)), size = 3.5, hjust = 0.8)
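
For readers wondering what melt() does here: it stacks the six position columns into long format, one row per horse per call, with the column names Horse, variable and value that the ggplot() call relies on. The following is a minimal base-R sketch of that reshaping, not the code used for the chart. It uses only the first three finishers, with values copied from the data posted in the comments; the helper name stages is introduced purely for illustration.

```r
# First three finishers in wide format (values copied from the data in the comments)
df <- data.frame(
  Horse  = c("Mine That Bird", "Pioneer of the Nile", "Musket Man"),
  X1.4   = c(19, 3, 8),
  X1.2   = c(19, 3, 8),
  X3.4   = c(19, 4, 7),
  X1m    = c(12, 2, 7),
  Str    = c(1, 2, 4),
  Finish = c(1, 2, 3)
)

# Names of the position columns, in race order (illustrative helper)
stages <- names(df)[-1]

# Stack the position columns: one row per horse per call
dfm <- data.frame(
  Horse    = rep(df$Horse, times = length(stages)),
  variable = factor(rep(stages, each = nrow(df)), levels = stages),
  value    = unlist(df[stages], use.names = FALSE)
)

head(dfm, 4)
```

Keeping variable as a factor with levels in race order is what keeps the calls in the right sequence on the x-axis.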

Not very nice. The impression improves after a few cosmetic changes to the default formatting: format the x-axis labels, remove the legend and axis ticks, and reverse the y-axis so that first place is at the top.

> labels <- c(expression(1/4), expression(1/2),
     expression(3/4), "1m", "Str", "Finish",
     "")
> p1 + theme_bw() + opts(legend.position = "none",
     panel.border = theme_blank(), axis.ticks = theme_blank()) +
     scale_x_discrete(breaks = c(levels(dfm$variable),
         ""), labels = labels) + scale_y_continuous(breaks = NA,
     trans = "reverse") + xlab(NULL) + ylab(NULL)
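
Later ggplot2 releases replaced opts() and theme_blank() with theme() and element_blank(). The sketch below is one possible way to get a similar chart with the newer interface, assuming ggplot2 3.3.0 or later (for expansion()). It is not the code used for the chart in this post. Room for the horse names is made with nudge_x and extra x-axis expansion, which is a swapped-in alternative to the factor(variable + 0.5) trick above. Only the first three finishers are included, with values copied from the data posted in the comments, and the x-axis labels are plain strings rather than expressions.

```r
library(ggplot2)

# First three finishers, already in long format (values from the data in the comments)
stages <- c("X1.4", "X1.2", "X3.4", "X1m", "Str", "Finish")
dfm <- data.frame(
  Horse    = rep(c("Mine That Bird", "Pioneer of the Nile", "Musket Man"), each = 6),
  variable = factor(rep(stages, times = 3), levels = stages),
  value    = c(19, 19, 19, 12, 1, 1,
               3, 3, 4, 2, 2, 2,
               8, 8, 7, 7, 4, 3)
)

p <- ggplot(dfm, aes(variable, value, group = Horse, colour = Horse)) +
  geom_line() +
  # Label each line at the finish, nudged slightly to the right of the last point
  geom_text(data = subset(dfm, variable == "Finish"),
            aes(label = Horse), hjust = 0, nudge_x = 0.1, size = 3.5) +
  # Readable call names, plus extra space on the right for the labels
  scale_x_discrete(labels = c("1/4", "1/2", "3/4", "1m", "Str", "Finish"),
                   expand = expansion(add = c(0.5, 2))) +
  # First place at the top, no y-axis breaks
  scale_y_reverse(breaks = NULL) +
  labs(x = NULL, y = NULL) +
  theme_bw() +
  theme(legend.position = "none",
        panel.border = element_blank(),
        axis.ticks = element_blank())
```

Typing p at the console draws the chart. For all 19 horses, dfm can be built with melt() as shown earlier instead of being typed in by hand.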

13 Comments
  1. Kye Gilder
    May 15, 2009 11:54 pm

    Where can I obtain a copy of the “derby.csv” file (horse positions for the 2009 Kentucky Derby)? I’d like to try to recreate this plot with ggplot2. Thanks!

    • learnr
      May 16, 2009 11:33 am

      Unfortunately, csv files cannot be uploaded here.
      However, you can copy and paste the code below in place of df <- read.csv("derby.csv").

      df <- structure(list(Horse = structure(c(11L, 16L, 13L, 15L, 3L, 18L, 10L, 17L, 19L, 8L, 5L, 9L, 1L, 4L, 12L, 2L, 14L, 7L, 6L), .Label = c("Advice", "Atomic Rain", "Chocolate Candy", "Desert Party", "Dunkirk", "Flying Private", "Friesan Fire", "General Quarters", "Hold Me Back", "Join in the Dance", "Mine That Bird", "Mr. Hot Stuff", "Musket Man", "Nowhere to Hide", "Papa Clem", "Pioneer of the Nile", "Regal Ransom", "Summer Bird", "West Side Bernie"), class = "factor"), X1.4 = c(19L, 3L, 8L, 5L, 17L, 16L, 1L, 2L, 13L, 12L, 9L, 14L, 15L, 4L, 18L, 10L, 11L, 6L, 7L), X1.2 = c(19L, 3L, 8L, 4L, 12L, 16L, 1L, 2L, 17L, 13L, 10L, 5L, 15L, 6L, 18L, 9L, 14L, 7L, 11L), X3.4 = c(19L, 4L, 7L, 3L, 15L, 16L, 1L, 2L, 14L, 11L, 9L, 6L, 17L, 5L, 18L, 10L, 12L, 8L, 13L), X1m = c(12L, 2L, 7L, 4L, 8L, 15L, 1L, 3L, 17L, 10L, 11L, 5L, 13L, 6L, 16L, 9L, 18L, 14L, 19L), Str = c(1L, 2L, 4L, 3L, 7L, 9L, 5L, 6L, 13L, 10L, 12L, 8L, 14L, 11L, 16L, 15L, 18L, 17L, 19L), Finish = 1:19), .Names = c("Horse", "X1.4", "X1.2", "X3.4", "X1m", "Str", "Finish"), class = "data.frame", row.names = c(NA, -19L))

  2. May 18, 2009 6:37 am

    Ohhhhhh. The geek in me is mesmerized by the possibilities. Thanks!

  3. June 20, 2011 7:39 pm

    Great tutorial. I’m trying to adapt it to some college football rankings, showing the college name on the y axis (cf. your horse name) at both the start and the finish. The x axis is the weekly ranking, where the number of weeks may vary by year of ranking.
    I use scale_x_continuous(breaks=c(1:(max(cfRanking$week)))) to show each week, but I am not sure how to set the college names at each end. Currently they are on the main graph, and I am having a bit of a problem duplicating your labels method of adding an extra field (or two in my case).

    • learnr
      June 21, 2011 2:38 am

      It’s a bit tricky to respond to you without seeing the actual code that is causing you problems.

  4. June 21, 2011 5:00 pm

    Thanks for reply. The relevant code from the script is here

    cfRanking <- sqlQuery(channel,paste(
    "select rank,college,week
    from collegefootballrankings
    where season = 2010 and rank <16
    order by week desc, rank"))

    p <- ggplot(cfRanking, aes(week, rank, group = college, colour = college, label = college))

    p1 <- p + geom_line() + geom_text(data = subset(cfRanking,
    week == min(cfRanking$week)), aes(x = week), size = 2.5, hjust = 0.2) + geom_text(data = subset(cfRanking,
    week == max(cfRanking$week)), aes(x = week), size = 2.5, hjust = 0.3)

    p2 <-p1 + theme_bw() + opts(legend.position = "none",
    panel.border = theme_blank(), axis.ticks = theme_blank()) +
    scale_x_continuous(breaks=c(1:(max(cfRanking$week)))) + scale_y_continuous(breaks=c(1:25),
    trans = "reverse") + xlab(NULL) +ylab(NULL) + opts(axis.text.y = theme_text(size = 8))
    p3 <- p2 + opts(title = expression(" Massey College Football Rankings 2010"))

    It can be seen here


  5. June 14, 2015 11:50 pm

    Check out these charts for the 2002-2005 Kentucky Derbies.

  6. Bedrose P. Lipschitz
    April 7, 2017 7:55 pm

    “melt” function doesn’t compute in R 3.3.3 with ggplot2 loaded. What library is that in, or what is the function? Thanks.

    • Bedrose P. Lipschitz
      April 7, 2017 11:17 pm

      Also, your code in the first window looks hosed to me.


