This is the 5th post in a series attempting to recreate the figures in Lattice: Multivariate Data Visualization with R (R code) with ggplot2.

Previous parts in this series: Part 1, Part 2, Part 3, Part 4.

## Chapter 5 – Scatter Plots and Extensions

Topics covered:

• The standard scatter plot
• Using subscripts
• Using the type argument
• Variants for large data
• Scatter plot matrix
• Parallel coordinate plot

### Figure 5.1

 ```> library(lattice) > library(ggplot2)```

lattice

 ```> pl <- xyplot(lat ~ long | cut(depth, 2), data = quakes) > print(pl)```

ggplot2

 `> quakes\$Depth <- with(quakes, cut(depth, 2))`
 ```> pg <- ggplot(quakes, aes(long, lat)) + geom_point(shape = 1) + + facet_grid(~Depth) + opts(aspect.ratio = 1) > print(pg)```

### Figure 5.2

lattice

 ```> pl <- xyplot(lat ~ long | cut(depth, 3), data = quakes, + aspect = "iso", pch = ".", cex = 2, type = c("p", + "g"), xlab = "Longitude", ylab = "Latitude", + strip = strip.custom(strip.names = TRUE, var.name = "Depth")) > print(pl)```

ggplot2

 `> quakes\$Depth <- with(quakes, cut(depth, 3))`
 ```> pg <- ggplot(quakes, aes(long, lat)) + geom_point() + + facet_grid(~Depth, labeller = label_both) + coord_equal() + + labs(x = "Longitude", y = "Latitude") > print(pg)```

### Figure 5.3

lattice

 ```> pl <- xyplot(lat ~ long, data = quakes, aspect = "iso", + groups = cut(depth, breaks = quantile(depth, ppoints(4, + 1))), auto.key = list(columns = 3, title = "Depth"), + xlab = "Longitude", ylab = "Latitude") > print(pl)```

ggplot2

 ```> quakes\$Depth <- with(quakes, cut(depth, breaks = quantile(depth, + ppoints(4, 1)), include.lowest = TRUE))```
 ```> pg <- ggplot(quakes, aes(long, lat, colour = Depth)) + + geom_point() + coord_equal() + labs(x = "Longitude", + y = "Latitude") + opts(title = "Depth") > print(pg)```

### Figure 5.4

lattice

 ```> depth.col <- gray.colors(100)[cut(quakes\$depth, 100, + label = FALSE)] > depth.ord <- rev(order(quakes\$depth))```
 ```> pl <- xyplot(lat ~ long, data = quakes[depth.ord, ], + aspect = "iso", type = c("p", "g"), pch = 21, fill = depth.col[depth.ord], + cex = 2, xlab = "Longitude", ylab = "Latitude") > print(pl)```

ggplot2

 ```> pg <- ggplot(quakes, aes(long, lat, colour = factor(cut(quakes\$depth, + 100, label = FALSE)))) + geom_point(size = 4) + geom_point(size = 4, + shape = 1, colour = "steelblue", alpha = 0.4) + labs(x = "Longitude", + y = "Latitude") + scale_colour_grey() + theme_bw() + + opts(legend.position = "none") + coord_equal() > print(pg)```

### Figure 5.5

lattice

 ```> quakes\$Magnitude <- equal.count(quakes\$mag, 4) > quakes\$color <- depth.col > quakes.ordered <- quakes[depth.ord, ]```
 ```> pl <- xyplot(lat ~ long | Magnitude, data = quakes.ordered, + aspect = "iso", fill.color = quakes.ordered\$color, + cex = 2, panel = function(x, y, fill.color, ..., + subscripts) { + fill <- fill.color[subscripts] + panel.grid(h = -1, v = -1) + panel.xyplot(x, y, pch = 21, fill = fill, ...) + }, xlab = "Longitude", ylab = "Latitude") > print(pl)```

ggplot2

 ```> fn <- function(data = quakes\$mag, number = 4, ...) { + intrv <<- as.data.frame(co.intervals(data, number, + ...)) + mag <- sort(unique(data)) + intervals <- ldply(mag, function(x) { + t(as.numeric(x < intrv\$V2 & x > intrv\$V1)) + }) + tmp <- melt(cbind(mag, intervals), id.var = 1) + tmp[tmp\$value > 0, 1:2] + } > quakes.ordered <- merge(quakes, fn()) > intrv <- with(intrv, paste(V1, V2, sep = "-")) > quakes.ordered <- rename(quakes.ordered, c(variable = "magnitude")) > quakes.ordered\$magnitude <- factor(quakes.ordered\$magnitude, + labels = intrv)```
 ```> pg <- ggplot(quakes.ordered, aes(long, lat, colour = factor(cut(depth, + 100, label = FALSE)))) + geom_point(size = 4) + facet_grid(~magnitude, + labeller = label_both) + scale_colour_grey() + theme_bw() + + labs(x = "Longitude", y = "Latitude") + opts(legend.position = "none") + + coord_equal() > print(pg)```
 Note Custom wrapper function fn() used to break the data into intervals.

### Figure 5.6

lattice

 ```> depth.breaks <- do.breaks(range(quakes.ordered\$depth), + 50) > quakes.ordered\$color <- level.colors(quakes.ordered\$depth, + at = depth.breaks, col.regions = gray.colors)```
 ```> pl <- xyplot(lat ~ long | Magnitude, data = quakes.ordered, + aspect = "iso", groups = color, cex = 2, panel = function(x, + y, groups, ..., subscripts) { + fill <- groups[subscripts] + panel.grid(h = -1, v = -1) + panel.xyplot(x, y, pch = 21, fill = fill, ...) + }, legend = list(right = list(fun = draw.colorkey, + args = list(key = list(col = gray.colors, at = depth.breaks), + draw = FALSE))), xlab = "Longitude", ylab = "Latitude") > print(pl)```

ggplot2

 ```> pg <- ggplot(quakes.ordered, aes(long, lat, colour = depth)) + + geom_point(size = 4) + facet_grid(~magnitude, labeller = label_both) + + coord_equal() + scale_colour_gradient(low = "grey30", + high = "grey90") + labs(x = "Longitude", y = "Latitude") + + theme_bw() > print(pg)```

### Figure 5.7

lattice

 ```> types.plain <- c("p", "l", "o", "r", "g", "s", "S", "h", + "a", "smooth") > types.horiz <- c("s", "S", "h", "a", "smooth") > horiz <- rep(c(FALSE, TRUE), c(length(types.plain), length(types.horiz))) > types <- c(types.plain, types.horiz) > set.seed(2007041) > x <- sample(seq(-10, 10, length = 15), 30, TRUE) > y <- x + 0.25 * (x + 1)^2 + rnorm(length(x), sd = 5)```
 ```> pl <- xyplot(y ~ x | gl(1, length(types)), xlab = "type", + ylab = list(c("horizontal=TRUE", "horizontal=FALSE"), + y = c(1/6, 4/6)), as.table = TRUE, layout = c(5, + 3), between = list(y = c(0, 1)), strip = function(...) { + panel.fill(trellis.par.get("strip.background")\$col[1]) + type <- types[panel.number()] + grid.text(lab = sprintf("\"%s\"", type), x = 0.5, + y = 0.5) + grid.rect() + }, scales = list(alternating = c(0, 2), tck = c(0, + 0.7), draw = FALSE), par.settings = list(layout.widths = list(strip.left = c(1, + 0, 0, 0, 0))), panel = function(...) { + type <- types[panel.number()] + horizontal <- horiz[panel.number()] + panel.xyplot(..., type = type, horizontal = horizontal) + })[rep(1, length(types))] > print(pl)```

ggplot2

 `No direct support - one would need to draw 15 separate graphs and combine these into one using grid.page()`

### Figure 5.8

 `> data(Earthquake, package = "MEMSS")`

lattice

 ```> pl <- xyplot(accel ~ distance, data = Earthquake, panel = function(...) { + panel.grid(h = -1, v = -1) + panel.xyplot(...) + panel.loess(...) + }, xlab = "Distance From Epicenter (km)", ylab = "Maximum Horizontal Acceleration (g)") > print(pl)```

ggplot2

 ```> pg <- ggplot(Earthquake, aes(distance, accel)) + geom_point() + + geom_smooth(method = "loess", se = FALSE) + xlab("Distance From Epicenter (km)") + + ylab("Maximum Horizontal Acceleration (g)") > print(pg)```

### Figure 5.9

lattice

 ```> pl <- xyplot(accel ~ distance, data = Earthquake, type = c("g", + "p", "smooth"), scales = list(log = 2), xlab = "Distance From Epicenter (km)", + ylab = "Maximum Horizontal Acceleration (g)") > print(pl)```

ggplot2

 ```> pg <- pg + scale_x_log2() + scale_y_log2() > print(pg)```

### Figure 5.10

 `> library(locfit)`

lattice

 ```> Earthquake\$Magnitude <- equal.count(Earthquake\$Richter, + 3, overlap = 0.1) > coef <- coef(lm(log2(accel) ~ log2(distance), data = Earthquake))```
 ```> pl <- xyplot(accel ~ distance | Magnitude, data = Earthquake, + scales = list(log = 2), col.line = "grey", lwd = 2, + panel = function(...) { + panel.abline(reg = coef) + panel.locfit(...) + }, xlab = "Distance From Epicenter (km)", ylab = "Maximum Horizontal Acceleration (g)") > print(pl)```

ggplot2

 ```> Earthquake2 <- merge(Earthquake, fn(Earthquake\$Richter, + 3, overlap = 0.1), by.x = "Richter", by.y = "mag", + all.x = TRUE)```
 ```> pg <- ggplot(Earthquake2, aes(distance, accel)) + facet_grid(~variable, + labeller = label_both) + geom_smooth(method = "lm", + se = F, fullrange = T, colour = "steelblue", size = 1) + + geom_smooth(method = "locfit", formula = y ~ x, se = F) + + geom_point() + scale_x_log2() + scale_y_log2() + + xlab("Distance From Epicenter (km)") + ylab("Maximum Horizontal Acceleration (g)") > print(pg)```
 Note Custom wrapper function fn() used to break the data into intervals. See Figure 5.5.

### Figure 5.11

 `> data(SeatacWeather, package = "latticeExtra")`

lattice

 ```> pl <- xyplot(min.temp + max.temp + precip ~ day | month, + ylab = "Temperature and Rainfall", data = SeatacWeather, + type = "l", lty = 1, col = "black") > print(pl)```

ggplot2

 ```> p.precip <- ggplot(SeatacWeather, aes(day)) + facet_grid(~month) + + geom_line(aes(y = min.temp)) + geom_line(aes(y = max.temp)) + + ylab("Temperature and Rainfall") > pg <- p.precip + geom_line(aes(y = precip)) > print(pg)```

### Figure 5.12

 `> maxp <- max(SeatacWeather\$precip, na.rm = TRUE)`

lattice

 ```> pl <- xyplot(min.temp + max.temp + I(80 * precip/maxp) ~ + day | month, data = SeatacWeather, lty = 1, col = "black", + ylab = "Temperature and Rainfall", type = c("l", + "l", "h"), distribute.type = TRUE) > print(pl)```

ggplot2

 ```> pg <- p.precip + geom_linerange(aes(ymin = 0, ymax = 80 * + precip/maxp)) > print(pg)```

### Figure 5.13

lattice

 ```> pl <- update(trellis.last.object(), ylab = "Temperature (Fahrenheit) \n and Rainfall (inches)", + panel = function(...) { + panel.xyplot(...) + if (panel.number() == 2) { + at <- pretty(c(0, maxp)) + panel.axis("right", half = FALSE, at = at * + 80/maxp, labels = at) + } + }) > print(pl)```

ggplot2

 `ggplot2 does not support the addition of a secondary axis.`

### Figure 5.14

 ```> library(hexbin) > data(gvhd10, package = "latticeExtra")```

lattice

 ```> pl <- xyplot(asinh(SSC.H) ~ asinh(FL2.H) | Days, gvhd10, + aspect = 1, panel = panel.hexbinplot, .aspect.ratio = 1, + trans = sqrt) > print(pl)```

ggplot2

 ```> pg <- ggplot(gvhd10, aes(asinh(FL2.H), asinh(SSC.H), + fill = sqrt(..count..))) + geom_hex() + facet_wrap(~Days, + nrow = 2) + opts(legend.position = "none") > print(pg)```

### Figure 5.15 – Scatter Plot Matrix

lattice

 ```> pl <- splom(USArrests) > print(pl)```

ggplot2

 ```> pg <- plotmatrix(USArrests) > print(pg)```
 Note plotmatrix function is still at experimental stage.

### Figure 5.16

lattice

 ```> pl <- splom(~USArrests[c(3, 1, 2, 4)] | state.region, + pscales = 0, type = c("g", "p", "smooth")) > print(pl)```

ggplot2

 `There is currently no easy way of achieving the same in ggplot2`

### Figure 5.17

lattice

 ```> pl <- splom(~data.frame(mpg, disp, hp, drat, wt, qsec), + data = mtcars, groups = cyl, pscales = 0, varnames = c("Miles\nper\ngallon", + "Displacement\n(cu. in.)", "Gross\nhorsepower", + "Rear\naxle\nratio", "Weight", "1/4 mile\ntime"), + auto.key = list(columns = 3, title = "Number of Cylinders")) > print(pl)```

ggplot2

 ```> pg <- plotmatrix(with(mtcars, data.frame(mpg, disp, hp, + drat, wt, qsec))) > print(pg)```
 Note plotmatrix function is still at experimental stage. Colour mapping is a planned future feature.

### Figure 5.18

lattice

 ```> pl <- parallel(~mtcars[c(1, 3, 4, 5, 6, 7)] | factor(cyl), + mtcars, groups = carb, key = simpleKey(levels(factor(mtcars\$carb)), + points = FALSE, lines = TRUE, space = "top", + columns = 3), layout = c(3, 1)) > print(pl)```

ggplot2

 ```> mtcars <- namerows(mtcars, col.name = "car") > df <- melt(mtcars[c(-8:-10)], id.var = c("cyl", "carb", + "car")) > dfm <- ddply(df, .(variable), transform, rng = rescaler(value, + type = "range"))```
 ```> pg <- ggplot(dfm, aes(group = car, colour = factor(carb))) + + geom_line(aes(variable, rng)) + facet_grid(~cyl) + + coord_flip() > print(pg)```

### Figure 5.19

lattice

 ```> pl <- parallel(~asinh(gvhd10[c(3, 2, 4, 1, 5)]), data = gvhd10, + subset = Days == "13", alpha = 0.01, lty = 1) > print(pl)```

ggplot2

 ```> df <- gvhd10[gvhd10\$Days == "13", c(1:5)] > df\$id <- seq_along(df[, 1]) > df <- melt(df, id.vars = c("id")) > df\$variable <- factor(df\$variable, levels = names(gvhd10)[c(3, + 2, 4, 1, 5)]) > df <- ddply(df, .(variable), transform, value = rescaler(asinh(value), + type = "range"))```
 ```> pg <- ggplot(df, aes(value, variable, group = id)) + + geom_path(alpha = 0.01) + theme_bw() > print(pg)```
 Note Built-in black and white theme is used, otherwise the thin grey lines would be invisible on a grey background.

July 15, 2009 1:52 pm

This is so great! You seem to learn R very fast! There was one minor problem with 5.14:

> pg print(pg)

July 16, 2009 12:44 pm

Strange, I don’t seem to get this error.

July 15, 2009 7:20 pm

Nice comparisons. Aspect ratio is important. Is there not an aspect=”iso”, for example, in ggplot?

July 16, 2009 1:01 pm

You can set aspect ratio using one of the two options:
`opts(aspect.ratio = 1)`
or
`coord_equal(ratio = 1)`

July 21, 2009 12:21 am

Thanks. In your plots above, you are using “opts(aspect.ratio = 1)” to mimic lattice plots that use “aspect=’iso'”. This is not correct (ok, it would be if the values in both variables spanned the same range). “opts(aspect.ratio = 1)” only causes the bounding box of each facet to be a square. The idea behind aspect=’iso’, however, is, quoting the documentation: “where the relation between physical distance on the device and distance in the data scale are forced to be the same for both axes”.

So the alternative you gave me – “coord_equal(ratio = 1)” is the ggplot way to get isometric scales. But playing around with this I see that it still forces the facets to be square, extending the axis limits of whatever axis comes up short. This is very strange and constraining behavior if you ask me.

3. July 15, 2009 7:38 pm

Great Job! I wonder if at the end of the series you are going to collect all the ‘episodes’ in a single pdf so one can read it offline easily.

July 16, 2009 12:46 pm

That’s a good idea. I need to think how to best accomplish it.

• July 17, 2009 4:42 pm

I suggest to use the literate programming approach that allows to produce both the data analysis (the images in you case) and the Report at the same time.
I use it extensively for reporting taking advantage of the R Sweave function:

http://www.stat.uni-muenchen.de/~leisch/Sweave/

If you need some help don’t hesitate to ask! :-)

• July 16, 2009 5:41 pm

Very good suggestion!

July 16, 2009 1:54 am

Nice work! It would help if you could cut & paste your code without the “>” symbols to facilitate our cutting/pasting into an R session…

July 16, 2009 12:48 pm

In Windows version of R you could paste the code using “Paste commands only” option from the right-click menu. This option gets rid of the “>” symbols.

October 2, 2009 4:09 am

Thanks for the tutorials, they are great:

I was wondering if you could help me, how I can modify the number of tickmarcks in a ggplot in the x-axis?. I have 10 scatterplots placed in two rows, and the tickmarcks of the x-axis are overlapping.

Regards,

Roberto

p <- ggplot(bdhier, aes(lescudo.dorsal, lperna2b)) +
opts(axis.test.x= theme_text(angle = 90, size=4, colour = "black"))+
opts(aspect.ratio = 1)+
facet_wrap(~pop.num,nrow=2) + geom_point() +
labs(x= "Log Dorsal Plate length (mm)", y = "Log Leg 2 length (mm)")

October 4, 2009 9:33 am

One option would be to specify the breaks/labels manually.

Have a look, there was a thread on a similar topic on ggplot2 mailing list just last week.