ggplot2: Changing the Default Order of Legend Labels and Stacking of Data
“How to change the order of legend labels” is a question that gets asked relatively often on the ggplot2 mailing list. A variation of this question is how to change the order of series in stacked bar or line plots.
While these two questions seem to be related, in fact they are separate as the legend is controlled by scales, whereas stacking is controlled by the order of values in the data.
Recently I spent some time getting my head around this, and below is a quick recap.
Changing the Ordering of Legend Labels
The standard stacked barplot looks like this:
> library(ggplot2)
> ggplot(diamonds, aes(clarity, fill = cut)) + geom_bar()

You notice that in the legend “Fair” is at the top and “Ideal” at the bottom. But what if I would like to order the labels in the reverse order, so that “Ideal” would be at the top?
The order of legend labels can be manipulated by reordering the factor levels of the cut variable mapped to fill aesthetic.
> levels(diamonds$cut)
[1] "Fair"      "Good"      "Very Good" "Premium"
[5] "Ideal"
> diamonds$cut <- factor(diamonds$cut, levels = rev(levels(diamonds$cut)))
> levels(diamonds$cut)
[1] "Ideal"     "Premium"   "Very Good" "Good"
[5] "Fair"
> ggplot(diamonds, aes(clarity, fill = cut)) + geom_bar()

The legend entries are now in reverse order (and so is the stacking).
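Reversing the levels only changes the order in which the categories are listed; the number of diamonds in each category stays the same. A minimal sketch that checks this, working on a copy of the data (the names `d`, `counts_before` and `counts_after` are made up for illustration):

```r
library(ggplot2)  # provides the diamonds dataset

d <- diamonds                      # work on a copy, leave diamonds untouched
counts_before <- table(d$cut)      # counts in the original level order

d$cut <- factor(d$cut, levels = rev(levels(d$cut)))
counts_after <- table(d$cut)       # counts in the reversed level order

# Look up the old counts by level name and compare them with the new ones
all(counts_before[levels(d$cut)] == counts_after)
```

If the comparison is true, only the ordering has changed, not the data.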
Changing Data Stacking Order
The order aesthetic changes the order in which the areas are stacked on top of each other.
The following aligns the order of both the labels and the stacking.
> ggplot(diamonds, aes(clarity, fill = cut, order = -as.numeric(cut))) +
+   geom_bar()

Or, alternatively, reordering the factor levels again:
> diamonds$cut <- factor(diamonds$cut, levels = rev(levels(diamonds$cut)))
> ggplot(diamonds, aes(clarity, fill = cut, order = -as.numeric(cut))) +
+   geom_bar()
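In later ggplot2 releases the order aesthetic is no longer available (it was deprecated in 2.0.0, to my knowledge). A minimal sketch of one alternative, assuming ggplot2 2.2.0 or newer, where position_stack() accepts a reverse argument. It flips the stacking while leaving the factor levels, and therefore the legend, as they are:

```r
library(ggplot2)

# Default stacking, for comparison
p_default <- ggplot(diamonds, aes(clarity, fill = cut)) +
  geom_bar()

# Assumption: ggplot2 >= 2.2.0, where position_stack() has a reverse argument
p_reversed <- ggplot(diamonds, aes(clarity, fill = cut)) +
  geom_bar(position = position_stack(reverse = TRUE))
```

To flip the legend as well, so that it matches the stacking, guides(fill = guide_legend(reverse = TRUE)) can be added to the plot.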

Things are not so simple. You are messing up the data by changing the levels the way you do. In the first plot the number of Ideal diamonds is the smallest; in the second plot it is the largest. So the reordering of the levels would have to be done a bit more cleverly. However, I always get into trouble when I have to do this, so it would be nice to see your solution.
Strange, on all the plots I can see the Ideal diamond count is the largest.
Please refresh your browser (image) cache, as initially the wrong image files were uploaded, but I have deleted these since.
What’s the incantation to manually specify the order of the levels? I’d rather just type it in.
How do you change the order of facets in a faceted plot?
There’s nothing wrong with manually specifying the order of the levels, but it becomes more prone to errors if the level names are long.
Similarly to changing the legend keys, you can change the order of facets by changing the order of the underlying faceting factor.
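As a rough sketch of the facet case: the panels of facet_wrap() follow the level order of the faceting factor, so reordering the levels reorders the panels. The level order below is arbitrary and chosen purely for illustration, and `d` is a made-up name for a copy of the data:

```r
library(ggplot2)

d <- diamonds  # copy, so the original data stays untouched

# Arbitrary illustrative order; every existing level must be listed,
# otherwise the omitted values turn into NA
d$cut <- factor(d$cut,
                levels = c("Ideal", "Fair", "Premium", "Good", "Very Good"))

# Panels appear in the order of levels(d$cut)
p_facets <- ggplot(d, aes(clarity)) +
  geom_bar() +
  facet_wrap(~ cut)
```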
I would like to argue against using this kind of visualization. Obviously, this isn’t a forum for visualization but I feel compelled to write.
There is an inherent ambiguity whether the values represented by colored areas are absolute or relative. If absolute, the story the data is telling would be significantly different; if they’re relative, determining values is nearly impossible without relative scales.
So whether or not you can move labels around is, in my opinion, eclipsed by whether you should be trying to in the first place.
I agree that stacked plots should be used with caution and care.
I also think that they are a good tool for exploratory data analysis. Consider the following two charts:
qplot(clarity, data=diamonds)
vs
qplot(clarity, data=diamonds, fill=cut)
In my view the second plot (which is a stacked plot as well) conveys considerably more information about the composition of the dataset compared to the first plot.
I’ve been using this method with great pleasure. But in version 0.8.8 the order parameter doesn’t affect the order of the bars so this technique doesn’t work. It has been reported as a bug and presumably will be fixed in a future version of ggplot2.
Regarding reordering of legend labels, I’m not having much luck with the approach described above when using qplot (below). Should I be doing this in ggplot instead? I’ve yet to master that one… 🙂
Cheers,
Yannick
qplot(as.character(PRR), Activity, fill=factor(Assay), data=final5, geom="boxplot", position="dodge") + theme_bw() + scale_x_discrete(name='Binarized PRR',breaks=c(0,1), labels=c("0","1")) + scale_y_continuous(name = 'Activity (Z-score)') + opts(legend.key = theme_blank())
I suggest you have another look at your source data, and see if the factors are ordered properly. What works with qplot works with ggplot as well, I just have decided that setting up ggplot layers is clearer even if it requires more typing.
This is a wonderfully useful explanation of how to address the need to order stuff in graphs.
I wonder if you would consider elaborating further on two points.
First, in your diamond example “cut” has a kind of natural ranking “Fair”..”Ideal” (or the other way around), so the question I pose will sound a bit odd, but what if I wanted to order cut as “Premium”, “Good”, “Ideal”, “Fair”, “Very Good” (in terms of stacking order and legend order both).
Second, there is another “ordering surprise” that I have recently encountered, and that is ordering of facets in a multi-facet graph.
And a closing comment: it seems to me that the majority of stacked bar graphs and facets that I would prepare would have a categorical ordering that I would want to explicitly control. Seldom do I have categories that have any kind of “natural ordering” (like “Fair”..”Ideal”); most often my categories need to be ordered in aid of presentation – in other words, I need to plot the graph and then figure out how to arrange the ordering of stacking (especially) and faceting to present the comparison in its most visually compelling fashion.
Therefore, boldly assuming the rest of the bar / facet graphers out there are like me 🙂 I humbly suggest that this topic be forever after covered in great and gory detail in all explanatory writings on ggplot 🙂 🙂
Thanks again for this wonderful set of examples!
All you need to do is to manually change the order of the factor levels.
For example, something like this:
diamonds$cut <- factor(diamonds$cut, levels = c("Fair", "Ideal", "Premium", "Very Good", "Good"))
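One caveat with typing the levels by hand: a misspelled or forgotten level silently turns the affected values into NA. A small sketch of a guard against that, where `reorder_levels` is a hypothetical helper written for this example and not part of ggplot2 or base R:

```r
# Hypothetical helper: reorder factor levels, but stop with an error
# if the new levels are not exactly the same set as the old ones
reorder_levels <- function(x, new_levels) {
  stopifnot(is.factor(x), setequal(levels(x), new_levels))
  factor(x, levels = new_levels)
}

f <- factor(c("Good", "Fair", "Ideal", "Fair"))
g <- reorder_levels(f, c("Ideal", "Fair", "Good"))
levels(g)

# A typo such as "Idael" is caught instead of producing NA values
typo_result <- try(reorder_levels(f, c("Idael", "Fair", "Good")), silent = TRUE)
```

The same helper could be applied to diamonds$cut with whatever order suits the presentation.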
Thanks, this helped me a lot! And the distinction between the two cases saved me a lot of trial and error trying to fix what I initially thought was an issue with the legend, but was really an issue with the order of the series I added to the plot.
I think we need a way to change the legend order, so that factor level 1 appears at the bottom.
I understand the solutions given above, but for a graph where I want the most significant item at the bottom, it makes sense for the legend order of colors to follow the graph order of colors (level 1 at the bottom).
To ensure that the color of the most significant item stays stable, regardless of the relative levels of the others and of changes in the number of factor levels, I need the most significant item to have factor level 1 (not n).
In a plot with a color scheme (tile plot), why does the legend go from low number at top to high number at bottom? How do I reverse that? And who decided that? That goes contrary to every number line I have ever seen! It is contrary to convention. Using order, and negating whatever is used in this example won’t work for me because the scale is set by ggplot automatically for me.
You can change the order of legend labels either manually in ggplot2 or by reordering the underlying factor.
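For a continuous fill, such as in a tile plot, there is no factor to reorder. A possible sketch, assuming a ggplot2 version that provides guides() and guide_legend() (0.9.0 or later), is to show the scale as a legend and reverse its keys. The data frame `grid` is invented for this example:

```r
library(ggplot2)

# Made-up data for illustration: a 5 x 5 grid with a numeric value per tile
grid <- expand.grid(x = 1:5, y = 1:5)
grid$z <- grid$x * grid$y

# guide_legend(reverse = TRUE) flips the order of the legend keys
p_tiles <- ggplot(grid, aes(x, y, fill = z)) +
  geom_tile() +
  guides(fill = guide_legend(reverse = TRUE))
```

Whether the default runs from low to high or the other way round depends on the guide type and the ggplot2 version, so it is worth checking the plot before and after adding the reversal.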
Thank you. Thank you. This was needed.
If you want to change order of the legend only just type
+ scale_fill_hue(guide = guide_legend(reverse=TRUE))
This works for ggplot2 versions 0.9.0 and above.
@ Marco: Great 🙂 This new addition to ggplot2 should be on the top of this page.
In the cookbook it also says: http://www.cookbook-r.com/Graphs/Legends_%28ggplot2%29/
# These two methods are equivalent:
bp + guides(fill = guide_legend(reverse=TRUE))
bp + scale_fill_discrete(guide = guide_legend(reverse=TRUE))
# You can also modify the scale directly:
bp + scale_fill_discrete(breaks = rev(levels(PlantGrowth$group)))
If you want to avoid modifying your data and/or want to modify the order manually (in my case I have a line graph and I want the legend to be ordered in the same way as the last values of the lines), check the cookbook for version 0.9.3: http://www.cookbook-r.com/Graphs/Legends_%28ggplot2%29/
# These two methods are equivalent:
bp + guides(fill = guide_legend(reverse=TRUE))
bp + scale_fill_discrete(guide = guide_legend(reverse=TRUE))
# You can also modify the scale directly:
bp + scale_fill_discrete(breaks = rev(levels(PlantGrowth$group)))
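The snippets above rely on a plot object `bp` that is not defined here. A self-contained sketch, assuming `bp` is a boxplot of the built-in PlantGrowth data (as I recall it from the cookbook page), with a manually chosen legend order passed to breaks. The order itself is arbitrary:

```r
library(ggplot2)

# Assumed definition of bp, based on the built-in PlantGrowth dataset
bp <- ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot()

# Arbitrary manual legend order; the data and the factor levels are unchanged
legend_order <- c("trt1", "ctrl", "trt2")
bp_manual <- bp + scale_fill_discrete(breaks = legend_order)
```

Only the legend is affected here; the boxes on the x axis keep the order of the factor levels.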
Thank you, the `aes(order=)` syntax no longer looks to be supported, but this works.
Great stuff. Just what I needed. Thanks for the R wizardry!