# ggplot2: LOESS smoothing

Jon Peltier writes about LOESS smoothing in Excel and presents a utility that makes it easier to add smoothers to data. He goes on to show how smoothing can help analyze the body mass indexes (BMI) of Playboy playmates, a topic recently discussed in the FlowingData forums.

He also created the following graph in Excel with the help of a user-defined function (UDF).

Now, let’s try to recreate this chart in ggplot2.

First, download the Excel datafile from Wired. I have tried to comment the code so that it is easier to follow.

Load the necessary libraries: `ggplot2` for graphing and `xlsReadWrite` for Excel import (the original data on the wired.com website is in an Excel file).

    library(ggplot2)
    library(xlsReadWrite)

Import data, and select the first 662 rows, as the data also contains some summaries we don’t need.

    df <- read.xls("1702_Infoporn_Playmate_Data.xls")
    df <- df[1:662, ]

Revise variable names, convert to lower case.

    names(df)[3] <- "bust"
    names(df)[4] <- "cupsize"
    names(df) <- tolower(names(df))

Combine year and month + convert to date format

    df$date <- as.Date(paste(1, df$month, df$year), "%d %B %Y")
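The `%B` format matches full month names in the current locale, so the conversion above can return `NA` on a non-English system. Here is a minimal sketch of a locale-independent variant. It assumes the `month` column holds full English month names (which the `%B` format suggests); the `toy` data frame is made up for illustration and is not the Wired data.

```r
# Toy stand-in for the imported data (hypothetical values)
toy <- data.frame(month = c("January", "June", "December"),
                  year  = c(1954L, 1978L, 2008L),
                  stringsAsFactors = FALSE)

# month.name is a base R constant holding the English month names,
# so match() turns "June" into 6 regardless of the session locale
month_number <- match(toy$month, month.name)

# Build an ISO date string for the first day of each month and convert it
toy$date <- as.Date(sprintf("%d-%02d-01", toy$year, month_number))

print(toy)
```

On an English-locale session the two approaches should give the same dates.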

**First try**

    qplot(data = df, date, bmi)

Replace zero bmi values with NA

    df[, "bmi"] <- ifelse(df[, "bmi"] == 0, NA, df[, "bmi"])
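The zeros are missing measurements rather than real BMI values, which is why they are recoded before plotting. A small sketch of the same recoding with logical indexing, on made-up numbers (the `toy` data frame is hypothetical):

```r
# Hypothetical BMI values; 0 stands for "not recorded"
toy <- data.frame(bmi = c(19.4, 0, 18.2, 0, 20.1))

# which() returns only the positions where the test is TRUE,
# so rows that are already NA are left untouched
toy$bmi[which(toy$bmi == 0)] <- NA

# Summaries now need na.rm = TRUE to skip the missing values
mean(toy$bmi, na.rm = TRUE)
```

Either form works; the indexing version just avoids repeating the column name three times.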

**Second try**

    (p <- qplot(data = df, date, bmi, xlab = "", ylab = ""))

Add LOESS smoother

    (p1 <- p + geom_smooth(method = "loess", size = 1.5))
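To see roughly what the smoother does outside of ggplot2, the same kind of fit can be sketched with `loess()` from base R's `stats` package. The data below are synthetic (a made-up downward trend plus noise), and the `span` and `degree` values are simply the `loess()` defaults, which I assume are close to what `geom_smooth()` uses here.

```r
set.seed(1)

# Synthetic data: a gentle downward trend with noise (not the Wired data)
x <- seq(1954, 2008, length.out = 200)
y <- 20 - 0.03 * (x - 1954) + rnorm(200, sd = 1)

# Local quadratic regression; span is the fraction of points used in each local fit
fit <- loess(y ~ x, span = 0.75, degree = 2)

# Fitted curve plus standard errors, which can be used to draw a confidence band
smoothed <- predict(fit, newdata = data.frame(x = x), se = TRUE)

head(data.frame(x = x, fit = smoothed$fit, se = smoothed$se.fit))
```

A smaller `span` follows the points more closely; a larger one gives a smoother, flatter curve.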

Add black&white theme, add title, remove axis labels

    p1 + theme_bw() + opts(title = "Playmate BMI - Loess Analysis")
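The code above targets the ggplot2 version that was current when this was written; `opts()` was removed in later releases. Here is a hedged sketch of how the final plot might look with `ggplot()` and `labs()` instead. It uses synthetic data in place of the Wired file, so the column values are assumptions and only the structure carries over.

```r
library(ggplot2)

set.seed(1)

# Synthetic stand-in for the playmate data (hypothetical values)
toy <- data.frame(
  date = seq(as.Date("1954-01-01"), as.Date("2008-12-01"), length.out = 200)
)
toy$bmi <- 20 - 0.03 * as.numeric(format(toy$date, "%Y")) + 0.03 * 1954 +
  rnorm(200, sd = 1)

p <- ggplot(toy, aes(x = date, y = bmi)) +
  geom_point(na.rm = TRUE) +                     # scatter of raw values
  geom_smooth(method = "loess", na.rm = TRUE) +  # LOESS curve with confidence band
  theme_bw() +                                   # black-and-white theme
  labs(title = "Playmate BMI - Loess Analysis", x = NULL, y = NULL)

print(p)
```

Setting `x = NULL` and `y = NULL` in `labs()` drops the axis labels, which is what the empty `xlab` and `ylab` strings did in the `qplot()` call.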

I came to the same conclusion regarding the original linear regression analysis, and got similar results to yours by fitting a kernel regression in MATLAB. It is worth noting, though, the substantial spread about any of these trend curves.

I think the reason that the theme_bw() looks much more readable on these webpages is that it matches the background of the page (i.e. white)? With thick text theme_gray() works ok. But it’s great to know that one can alter things easily.

Do you by any chance have the dataset to recreate this example? The Wired website is no longer available.

There was a problem with the link in the post – I have now corrected this.

The Wired story is available here.

And the data can be downloaded from here.

Nice blog! Thanks for sharing your experience with R.

I found the playmate stats revealing (pun). Keep in mind that a BMI under 20 is considered unhealthy. The trend has been from around 20 in 1953 to 18 today. This contrasts with the general population’s increasing BMI.

Another interesting graph from the data is the waist-to-hip ratio. The Greeks considered the golden ratio the standard for beauty at 0.61. As the playmates’ BMI decreases, the ratio is shifting to 0.7. We were closer to the Greek ideal in the 1950s. (http://en.wikipedia.org/wiki/Golden_ratio)

There are a number of interesting insights from this data set. A couple come to mind:

1. our ideal of beauty and the golden ratio

2. dropping BMI in the models vs the population’s increasing BMI

3. health implications of idealizing an unhealthy & unobtainable ideal for women

Someone should write an article.