
ggplot2: Marimekko/Mosaic Chart

March 29, 2009

Another variation of the variable-width column chart is the Marimekko chart, also known as a mosaic chart or eikosogram. Many marketing departments use it to show both the share of a market held by each region and the share of each region held by each product. Jon Peltier’s tutorial demonstrates the technique to create one in Excel.

http://learnr.files.wordpress.com/2009/03/marimekko12.png?w=384&h=323

Let’s attempt to do the same in ggplot2.

We start by creating the data frame to work with:

> df <- data.frame(segment = c("A", "B", "C",
     "D"), segpct = c(40, 30, 20, 10), Alpha = c(60,
     40, 30, 25), Beta = c(25, 30, 30, 25),
     Gamma = c(10, 20, 20, 25), Delta = c(5,
         10, 20, 25))
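
Because the chart treats every row as shares of a whole, it may be worth confirming that the percentages add up before going further. The following is only a minimal sketch: it recreates the data frame so that it runs on its own, and the names products, widths_ok and heights_ok are introduced here for illustration and are not part of the original walkthrough.

```r
# Recreate the data frame from the step above so this snippet runs on its own.
df <- data.frame(
  segment = c("A", "B", "C", "D"),
  segpct  = c(40, 30, 20, 10),
  Alpha   = c(60, 40, 30, 25),
  Beta    = c(25, 30, 30, 25),
  Gamma   = c(10, 20, 20, 25),
  Delta   = c(5, 10, 20, 25)
)

# 'products' is a helper name assumed for this sketch only.
products <- c("Alpha", "Beta", "Gamma", "Delta")

# The segment widths should cover the whole x-axis (0 to 100).
widths_ok <- sum(df$segpct) == 100

# The product shares within every segment should fill the column (0 to 100).
heights_ok <- all(rowSums(df[, products]) == 100)

# Stop early if either assumption does not hold.
stopifnot(widths_ok, heights_ok)
```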

Next, I add a few helper variables and convert the data to long format.

Calculate the cumulative width along the x-axis and the starting point of each column, then drop the segpct variable:

> df$xmax <- cumsum(df$segpct)
> df$xmin <- df$xmax - df$segpct
> df$segpct <- NULL

Data looks like this before the long-format conversion:

> head(df)
  segment Alpha Beta Gamma Delta xmax xmin
1       A    60   25    10     5   40    0
2       B    40   30    20    10   70   40
3       C    30   30    20    20   90   70
4       D    25   25    25    25  100   90

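Melting the data converts it to long format. melt() is provided by the reshape2 package and ddply(), used further down, by plyr; current ggplot2 releases do not attach either, so both are loaded explicitly:

> library(ggplot2)
> library(reshape2)
> library(plyr)
> dfm <- melt(df, id = c("segment", "xmin", "xmax"))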
> head(dfm)
  segment xmin xmax variable value
1       A    0   40    Alpha    60
2       B   40   70    Alpha    40
3       C   70   90    Alpha    30
4       D   90  100    Alpha    25
5       A    0   40     Beta    25
6       B   40   70     Beta    30
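
If installing a reshaping package is not an option, roughly the same long-format table can be put together with base R only. This is a sketch rather than a drop-in replacement for melt(): it assumes the four product columns are named as above, and products, ids, long and dfm_base are names made up for this example.

```r
# Wide data as it stands after the xmin/xmax step (segpct already removed).
df <- data.frame(
  segment = c("A", "B", "C", "D"),
  Alpha   = c(60, 40, 30, 25),
  Beta    = c(25, 30, 30, 25),
  Gamma   = c(10, 20, 20, 25),
  Delta   = c(5, 10, 20, 25),
  xmax    = c(40, 70, 90, 100),
  xmin    = c(0, 40, 70, 90)
)

# Columns to be stacked (assumed names, matching the data above).
products <- c("Alpha", "Beta", "Gamma", "Delta")

# stack() puts the product columns on top of each other:
# 'values' holds the numbers, 'ind' the name of the source column.
long <- stack(df[, products])

# Repeat the identifier columns once per product so the rows line up.
ids <- df[rep(seq_len(nrow(df)), times = length(products)),
          c("segment", "xmin", "xmax")]

dfm_base <- cbind(
  ids,
  # Fix the level order explicitly so fill colours follow column order.
  variable = factor(as.character(long$ind), levels = products),
  value    = long$values
)
rownames(dfm_base) <- NULL

head(dfm_base)
```

The rows come out in the same order as in the melted table above (all Alpha rows first, then Beta, and so on), so the later steps should work on dfm_base unchanged.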

Now we need to determine how the columns are stacked and where to position the text labels.

Calculate ymin and ymax:

> dfm1 <- ddply(dfm, .(segment), transform, ymax = cumsum(value))
> dfm1 <- ddply(dfm1, .(segment), transform,
     ymin = ymax - value)
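
The same running totals can also be sketched in base R with ave(), which applies a function within groups and returns the results in the original row order. This is an alternative to the ddply() calls above, not the method used in this post; the small dfm built here is a stand-in for the melted data so that the snippet is self-contained.

```r
# Stand-in for the melted data: one row per segment/product combination.
product_levels <- c("Alpha", "Beta", "Gamma", "Delta")
dfm <- data.frame(
  segment  = rep(c("A", "B", "C", "D"), times = 4),
  variable = factor(rep(product_levels, each = 4), levels = product_levels),
  value    = c(60, 40, 30, 25,   # Alpha
               25, 30, 30, 25,   # Beta
               10, 20, 20, 25,   # Gamma
               5, 10, 20, 25)    # Delta
)

# Top edge of each rectangle: running total of value within each segment.
dfm$ymax <- ave(dfm$value, dfm$segment, FUN = cumsum)

# Bottom edge: the top edge minus the rectangle's own height.
dfm$ymin <- dfm$ymax - dfm$value

# Segment A stacks as 0-60, 60-85, 85-95, 95-100.
dfm[dfm$segment == "A", ]
```

One difference to keep in mind: ddply() returns the rows grouped by segment, whereas ave() leaves them in their original order. The stacking order within a segment follows the row order in both cases.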

Positioning of text:

> dfm1$xtext <- with(dfm1, xmin + (xmax - xmin)/2)
> dfm1$ytext <- with(dfm1, ymin + (ymax - ymin)/2)
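
A property of this layout that is easy to verify: column width encodes a segment's share of the market and rectangle height encodes a product's share within that segment, so the area of each rectangle is proportional to that segment/product pair's share of the whole market. The short calculation below works this out from the original percentages; products and overall are names introduced for this illustration.

```r
# Original percentages, before segpct was dropped.
df <- data.frame(
  segment = c("A", "B", "C", "D"),
  segpct  = c(40, 30, 20, 10),
  Alpha   = c(60, 40, 30, 25),
  Beta    = c(25, 30, 30, 25),
  Gamma   = c(10, 20, 20, 25),
  Delta   = c(5, 10, 20, 25)
)
products <- c("Alpha", "Beta", "Gamma", "Delta")

# width (segpct) x height (product share) / 100 = share of the total market.
# Multiplying a matrix by a vector of length nrow scales each row i by segpct[i].
overall <- as.matrix(df[, products]) * df$segpct / 100
rownames(overall) <- df$segment

overall["A", "Alpha"]  # 40% of the market x 60% Alpha = 24
sum(overall)           # all rectangles together cover 100
```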

Finally, we are ready to start the plotting process:

> p <- ggplot(dfm1, aes(ymin = ymin, ymax = ymax,
     xmin = xmin, xmax = xmax, fill = variable))

Use a grey border to distinguish between the segments:

> p1 <- p + geom_rect(colour = I("grey"))
http://learnr.files.wordpress.com/2009/03/marimekko-chart-p11.png?w=599&h=417

Instead of a legend, the fill colours are explained in the text labels of Segment A, which are built with the ifelse function.

> p2 <- p1 + geom_text(aes(x = xtext, y = ytext,
     label = ifelse(segment == "A", paste(variable,
         " - ", value, "%", sep = ""), paste(value,
         "%", sep = ""))), size = 3.5)
http://learnr.files.wordpress.com/2009/03/marimekko-chart-p21.png?w=599&h=417

The maximum y-axis value is 100 (as in 100%), so to add the segment description above each column I manually set the label position to y = 103.

> p3 <- p2 + geom_text(aes(x = xtext, y = 103,
     label = paste("Seg", segment)), size = 4)
http://learnr.files.wordpress.com/2009/03/marimekko-chart-p31.png?w=599&h=417

Some last-minute changes to the default formatting: remove axis labels, legend and gridlines.

> p3 + theme_bw() + labs(x = NULL, y = NULL,
     fill = NULL) + theme(legend.position = "none",
     panel.grid.major = element_blank(),
     panel.grid.minor = element_blank())
http://learnr.files.wordpress.com/2009/03/marimekko-chart-p41.png?w=599&h=417

Or, if the default palette is not to one’s liking, it is easy to use a ColorBrewer palette instead:

> last_plot() + scale_fill_brewer(palette = "Set2")
http://learnr.files.wordpress.com/2009/03/marimekko-chart-p51.png?w=599&h=417
Comments
  1. baptiste
    March 31, 2009 2:46 pm

    Nice, i just wonder if perhaps you shouldn’t set expand=c(0,0) on both axes (percent wouldn’t make much sense beyond this range). I wouldn’t know how to place the labels outside the plot area in ggplot2 though.

  2. Eduardo
    June 25, 2009 11:06 am

    Should the area of each segment be proportional to its size (instead of to the x size)?

  3. Jill
    July 5, 2012 5:59 pm

    This is really helpful! I do have a question about applying colors to the chart though. I’m trying to apply scale_colour_grey to the chart, but I get a warning (45: In grid.Call.graphics(L_lines, x$x, x$y, index, x$arrow) :supplied color is not numeric nor character) and it doesn’t actually apply the colors. I’m wondering if there’s a limit on the number of colors in a scale as there are 14 sections within the segments (equivalent to your Alpha/Beta/Gamma/Delta). I’m not quite sure how to fix it. Any ideas?

    • learnr
      July 17, 2012 11:41 am

      Check whether your fill variable is continuous and not discrete. As far as I can tell there are no limits on the number of shades of grey in a scale.

