4.5 Line plots/time series plots
Let’s plot the life expectancy in the United Kingdom over time (Figure 4.7):
gapdata %>%
  filter(country == "United Kingdom") %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_line()
As a recap, the steps in the code above are:
- we send gapdata into filter();
- inside filter(), our condition is country == "United Kingdom";
- we initialise ggplot() and define our main variables: aes(x = year, y = lifeExp);
- we are using a new geom - geom_line().
This is identical to how we used geom_point(). In fact, just changing line to point in the code above works - and instead of a continuous line you’ll get a point every 5 years, which is how the observations are spaced in the dataset.
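As a sketch of the swap described above, here is the same pipeline with geom_point() in place of geom_line(). It assumes the tidyverse and gapminder packages are installed, and that gapdata is simply a copy of the gapminder dataset (an assumption about how gapdata was created earlier):

```r
library(tidyverse)
library(gapminder)

# Assumption: gapdata is a copy of the gapminder dataset
gapdata <- gapminder

p_points <- gapdata %>%
  filter(country == "United Kingdom") %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_point()  # one point per observation instead of a connected line

p_points
```

Everything apart from the geom stays the same, which is why switching between the two plot types is a one-word change.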
But what if we want to draw multiple lines, e.g., for each country in the dataset?
Let’s send the whole dataset to ggplot() and geom_line(), that is, run the same code as above with the filter() step removed.
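A minimal sketch of this step, assuming as before that gapdata is a copy of the gapminder dataset: the earlier pipeline with the filter() call dropped, so that every country's rows reach geom_line().

```r
library(tidyverse)
library(gapminder)

# Assumption: gapdata is a copy of the gapminder dataset
gapdata <- gapminder

p_all <- gapdata %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_line()  # no group aesthetic, so all rows are joined into one line

p_all
```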
The reason you see this weird zigzag in Figure 4.8 (1) is that, using the above code,
ggplot() does not know which points to connect with which.
Yes, you know you want a line for each country, but you haven’t told it that.
So to draw multiple lines, we need to add a group aesthetic, in this case group = country, so that the mapping becomes aes(x = year, y = lifeExp, group = country).
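A sketch of the grouped version, under the same assumption that gapdata is a copy of the gapminder dataset. The only change from the ungrouped plot is group = country inside aes():

```r
library(tidyverse)
library(gapminder)

# Assumption: gapdata is a copy of the gapminder dataset
gapdata <- gapminder

p_grouped <- gapdata %>%
  ggplot(aes(x = year, y = lifeExp, group = country)) +
  geom_line()  # one line per country, because rows are grouped by country

p_grouped
```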
This code works as expected (Figure 4.8 (2)). Yes, there is a lot of overplotting, but that’s just because we’ve included 142 lines on a single plot.
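The number of lines can be checked by counting the distinct countries in the data. A small sketch, again assuming that gapdata is a copy of the gapminder dataset:

```r
library(tidyverse)
library(gapminder)

# Assumption: gapdata is a copy of the gapminder dataset
gapdata <- gapminder

# Number of distinct countries, i.e. the number of lines drawn with group = country
n_countries <- gapdata %>%
  summarise(n = n_distinct(country)) %>%
  pull(n)

n_countries
```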