Labelling plot using data.frame in ggplot2

Just a quick technical post, mainly for my future self. The next time I am googling for "ggplot label with dataframe data.frame", I will hopefully find this.

ggplot2 is a popular library for making static graphics in R. Functions such as geom_label and geom_text (as well as geom_sf_text and geom_sf_label, and geom_text_repel and geom_label_repel from the ggrepel package) are neat tools for annotating plots with text. The most common cases are labelling the series in a line plot, adding values on top of the bars in a bar chart, or adding the name and population of a region to a map.
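
As a minimal sketch of the bar chart case, the snippet below labels each bar with its value using geom_text. The data frame df and its columns region and value are made up for illustration and are not part of the map example.

```r
library(ggplot2)

# Hypothetical toy data, not from the map example
df <- data.frame(region = c("A", "B", "C"), value = c(3, 7, 5))

p <- ggplot(df, aes(x = region, y = value, label = value)) +
  geom_col() +
  geom_text(vjust = -0.5)  # nudge each label just above its bar
```

Printing p draws the chart. The label aesthetic is read from the same data frame as the bars, which is the idea the map below builds on.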

In the past week I needed to create a map with a small table on top of each region. I am used to using the leaflet package and interactive maps, as in Eduskuntavaalit 2019: Helsingin äänestysalueiden top 3, but with growing pressure to make things accessible, I have switched back to static maps and simple alt texts. Below is example code for doing this at the level of election constituencies in Finland, so that each label shows the name of the constituency, its number of municipalities, and its area.

library(dplyr)
library(glue)
library(ggplot2)
library(tidyr)
library(readr)
library(geofi)
library(sf)

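# Municipality polygons for Finland from geofi
muni <- get_municipalities()
# Merge municipalities into constituencies and count the municipalities in each
vaalipiiri <- muni %>% 
  group_by(vaalipiiri_name_en) %>% 
  summarise(n_muni = n())
# st_area() returns square metres here; convert to whole square kilometres
vaalipiiri$area <- as.integer(round(sf::st_area(vaalipiiri)/1000000))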

print(vaalipiiri)
## Simple feature collection with 13 features and 3 fields
## geometry type:  GEOMETRY
## dimension:      XY
## bbox:           xmin: 83747.59 ymin: 6637032 xmax: 732907.7 ymax: 7776431
## projected CRS:  ETRS89 / TM35FIN(E,N)
## # A tibble: 13 x 4
##    vaalipiiri_name_en    n_muni                                       geom  area
##  * <chr>                  <int>                             <GEOMETRY [m]> <int>
##  1 Åland constituency        16 MULTIPOLYGON (((90854.78 6694278, 89955.1…  1525
##  2 Central Finland cons…     22 POLYGON ((434977 6839711, 430753.7 684303… 18926
##  3 Häme constituency         21 POLYGON ((290386.6 6750853, 290311.6 6755… 12682
##  4 Helsinki constituency      1 MULTIPOLYGON (((402737.7 6680700, 402069.…   264
##  5 Lapland constituency      21 POLYGON ((337502.3 7581690, 332585.3 7585… 98930
##  6 Oulu constituency         38 MULTIPOLYGON (((334215.3 7125046, 334345.… 61918
##  7 Pirkanmaa constituen…     23 POLYGON ((262460.4 6807729, 263748.3 6815… 15513
##  8 Satakunta constituen…     16 MULTIPOLYGON (((195782.2 6798446, 196072 …  8354
##  9 Savo-Karelia constit…     32 POLYGON ((451260.4 6978835, 455573.9 6980… 43900
## 10 Southeast Finland co…     27 MULTIPOLYGON (((470374.8 6722711, 464319.… 28966
## 11 Uusimaa constituency      25 MULTIPOLYGON (((280618.9 6639331, 277760 …  9335
## 12 Vaasa constituency        40 MULTIPOLYGON (((189333.9 7001075, 190233.… 27208
## 13 Varsinais-Suomi cons…     27 MULTIPOLYGON (((180938.5 6714890, 181988.… 10444
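
The area column relies on st_area() returning square metres for a projected CRS such as ETRS89 / TM35FIN, so dividing by 1000000 gives square kilometres. Below is a small sanity check of that conversion, assuming a hypothetical 2 km by 3 km rectangle (not a real region) in EPSG:3067.

```r
library(sf)

# Hypothetical 2 km x 3 km rectangle in ETRS89 / TM35FIN (EPSG:3067)
toy_rect <- st_polygon(list(rbind(
  c(300000, 6700000), c(302000, 6700000),
  c(302000, 6703000), c(300000, 6703000),
  c(300000, 6700000)
)))
toy_area <- st_sfc(toy_rect, crs = 3067)

# st_area() gives m^2 as a units object; convert to whole km^2
area_km2 <- as.integer(round(as.numeric(st_area(toy_area)) / 1e6))
```

Here area_km2 should come out as 6, matching 2 km times 3 km.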

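Next we need to create a small data frame for each constituency and turn it into a text label. For Vaasa, for example, the label looks like this:

Vaasa constituency

vpiiri   Vaasa
n_muni      40
area     27208

# Pad every string in x to the length of the longest one, plus padplus characters
pad_same_length_plus_2 <- function(x,
                                   padchar = " ",
                                   padside = "right",
                                   padplus = 2) {
  stringr::str_pad(x, max(nchar(x)) + padplus, side = padside, pad = padchar)
}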
vaalipiiri_df <- st_drop_geometry(vaalipiiri) %>% 
  setNames(c("vpiiri","n_muni","area"))

# Build one text label per constituency
vpt <- unique(vaalipiiri_df$vpiiri)
vplista <- list()
for (i in seq_along(vpt)){
  tmpdat <- tibble(vaalipiiri = vpt[i],
                   label = vaalipiiri_df %>%
                     filter(vpiiri == vpt[i]) %>%
                     mutate(across(everything(), as.character)) %>%
                     # shorten the name shown inside the table
                     mutate(vpiiri = gsub(" constituency| maakunnan", "", vpiiri)) %>% 
                     # one row per variable, in name and value columns
                     pivot_longer(1:3) %>% 
                     mutate(name = pad_same_length_plus_2(name, padplus = 0),
                            value = pad_same_length_plus_2(value, padside = "left", padplus = 0))  %>%
                     # blank column names so the header row is empty
                     setNames(c(" "," ")) %>% 
                     # collapse the table into a single tab-separated string
                     format_tsv() %>%  
                     sub("\n$", "", .)) %>%
    # put the full constituency name on the first line
    mutate(label = paste0(vaalipiiri,"\n",label))
  vplista[[vpt[i]]] <-  tmpdat
}
label_data <- bind_rows(vplista)
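
The same kind of label can probably be built without an explicit loop. The following is a sketch on toy data, not the code used for the map: toy stands in for vaalipiiri_df, with values copied from the printout above, and the padding is done with stringr::str_pad directly instead of the helper function. Unlike the loop, it joins name and value with a single space rather than a tab.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy stand-in for vaalipiiri_df (two rows copied from the printout)
toy <- tibble(
  vpiiri = c("Vaasa constituency", "Helsinki constituency"),
  n_muni = c(40L, 1L),
  area   = c(27208L, 264L)
)

toy_labels <- toy %>%
  # keep the full name as the grouping key, shorten the one shown in the table
  mutate(vaalipiiri = vpiiri,
         vpiiri = sub(" constituency", "", vpiiri)) %>%
  mutate(across(c(vpiiri, n_muni, area), as.character)) %>%
  pivot_longer(c(vpiiri, n_muni, area)) %>%
  group_by(vaalipiiri) %>%
  # pad names on the right and values on the left, one line per variable
  summarise(label = paste(
    str_pad(name, max(nchar(name)), side = "right"),
    str_pad(value, max(nchar(value)), side = "left"),
    collapse = "\n"
  )) %>%
  # full constituency name on the first line
  mutate(label = paste0(vaalipiiri, "\n", label))
```

Running cat(toy_labels$label[2]) prints the Vaasa label as a small aligned table, provided the output uses a monospaced font.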
# Attach the labels to the constituency polygons
mapdata2 <- left_join(vaalipiiri, label_data, by = c("vaalipiiri_name_en" = "vaalipiiri"))
ggplot(mapdata2, aes(fill = area,
                     label = label)) +
  labs(title = "Election constituencies in comparison",
       fill = "Constituency area (km²)") +
  geom_sf() +
  # place each label at the centroid of its constituency
  ggrepel::geom_label_repel(data = mapdata2 %>%
                              sf::st_set_geometry(NULL) %>%
                              bind_cols(mapdata2 %>% 
                                          sf::st_centroid() %>% 
                                          sf::st_coordinates() %>% as_tibble()),
                            aes(label = label,
                                x = X,
                                y = Y),
                            fill = alpha("white", 2/3),
                            color = "black",
                            # a monospaced font keeps the padded columns aligned
                            family = "Space Mono",
                            size = 3.5,
                            max.overlaps = 35) +
  scale_fill_viridis_c()
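
The label positions in the plot come from polygon centroids, converted to plain X and Y columns. Below is a minimal sketch of that step on two made-up unit squares; toy and the helper sq are hypothetical and only there to make the example self-contained.

```r
library(sf)

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two toy polygons standing in for the constituencies
toy <- st_sf(name = c("a", "b"),
             geometry = st_sfc(sq(0, 0), sq(2, 2)))

# Centroid of each polygon as a matrix with columns X and Y
xy <- st_coordinates(st_centroid(st_geometry(toy)))

# Plain data frame with attributes plus label coordinates
label_df <- cbind(st_drop_geometry(toy), xy)
```

Calling st_centroid() on the geometry column, as here, should avoid the warning about attributes being assumed constant that st_centroid() gives on a full sf object.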
