In a couple days, I’m going to drive across the country to Utah, my home state. I haven’t been out west with my whole family in four years—not since 2019 when we moved from Spanish Fork, Utah to Atlanta, Georgia. According to Google Maps, it’s a mere 1,900 miles (3,000 kilometers) through the middle of the United States and should take 28 hours, assuming no stops.

Google Maps route from Atlanta, Georgia to Spanish Fork, Utah

For bonus fun though, my wife and I decided to make this trip even more adventurous and hit as many exciting stops as possible. So rather than drive through the middle of Kansas, we’re going along the southern border on the way there, visiting New Orleans, Louisiana; San Antonio, Texas; Carlsbad Caverns, New Mexico; the Grand Canyon in Arizona; and then camping for a couple days at Capitol Reef National Park in Southern Utah before finally arriving at my aunt’s house in Spanish Fork, Utah. On the way home, we’re going across the northern part of the United States, visiting my sister in Idaho before heading towards Yellowstone in Wyoming; Devil’s Tower in Wyoming; Mount Rushmore in South Dakota; and Nauvoo, Illinois (Mormons!). It’s going to be an awesome—but way longer—mega road trip.

To help keep six kids entertained beyond just plugging them into iPads, my wife made some road trip journals to make the experience more memorable and educational, and we’re bringing a bunch of books and extra materials about each of the stops we’ll be making so they can do research along the way.

I just finished revamping my online data visualization class this week since the summer semester is starting next week, so visualizing data through maps has been on my mind. In my session on visualizing maps, I added a section on making and visualizing your own geocoded data (i.e. plotting specific cities on a map), so I figured it would be neat to visualize this huge road trip somehow to make some maps to include in the kids’ journals.

Getting the latitude and longitude coordinates for specific cities or addresses is super easy—you can either right click on Google Maps and copy specific coordinates, or you can automate it with the {tidygeocoder} package in R.

Getting route coordinates is a lot trickier though, since it involves all sorts of complex algorithms (accounting for road locations, speed limits, traffic, etc.). This is why Google Maps is so invaluable—it does an excellent job of figuring this out. But Google’s data is proprietary.

There’s an R package named {mapsapi} that makes it really easy to get data from the Google Maps API, and if you get an API key from Google (following these instructions), you can extract geocoded {sf}-compatible route data from Google with the mp_directions() function. It’s neat and quick and easy, but it’s expensive. When I first started playing with the maps API in R, I racked up $0.50 in charges just with testing a bunch of routing options. The Google Maps API gives you $200 in free credits every month and I was well below that amount, but it still was a little nervewracking to see that it was that expensive in just an hour of tinkering. Also, setting up the API key was fairly complicated (creating a Google Cloud project, setting up billing, enabling specific services, storing the API key securely on my computer, etc.), and I don’t want to go through that hassle in the future.

I’m a fan of the OpenStreetMap project, and it’s one of the backends (through Nominatim) for {tidygeocoder} (and I used it in a previous blog post showing some coordinates in interactive Leaflet maps). In searching around the internet for open source alternatives to the Google Maps API, I discovered that OpenStreetMap can also do directions and routing through the Open Source Routing Machine (OSRM) project. You can use OSRM’s public demo server for small-scale routing things, or you can install your own local instance through Docker. And super conveniently, there’s also an R package called {osrm} for accessing the OSRM API. This means that it’s possible to pull geocoded routing and directions data into R in an open source (and free!) way.

So let’s see how to use {osrm} and make some neat road trip maps with {sf} and {ggplot2}!

Getting started

Who this post is for

Here’s what I assume you know:

You’re familiar with R and the tidyverse (particularly {dplyr} and {ggplot2}).
You’re somewhat familiar with {sf} for working with geographic data. I have a whole tutorial here and a simplified one here and the {sf} documentation has a ton of helpful vignettes and blog posts, and there are also two free books about it: Spatial Data Science and Geocomputation with R. Also check this fantastic post out to learn more about the anatomy of a geometry column with {sf}.

Packages and functions

Before officially getting started, let’s load all the packages we need and create some helpful functions and variables:

Code

library(tidyverse)     # ggplot, dplyr, and friends
library(sf)            # Handle spatial data in R in a tidy way
library(tigris)        # Access geographic data from the US Census
library(tidygeocoder)  # Automated geocoding
library(osrm)          # Access OSRM through R
library(ggrepel)       # Nicer non-overlapping labels
library(glue)          # Easier string interpolation
library(scales)        # Nicer labeling functions
library(patchwork)     # Combine plots nicely
library(ggspatial)     # Nicer map features like scale bars

# Custom ggplot theme to make pretty plots
# Get the font at https://fonts.google.com/specimen/Overpass
theme_roadtrip <- function() {
  theme_void(base_family = "Overpass Light") +
    theme(
      plot.title = element_text(family = "Overpass", face = "bold", hjust = 0.5),
      strip.text = element_text(
        family = "Overpass ExtraBold", face = "plain",
        size = rel(1.1), hjust = 0.5)
    )
}

# Make labels use Overpass by default
update_geom_defaults("label_repel", 
                     list(family = "Overpass",
                          fontface = "plain"))
update_geom_defaults("label", 
                     list(family = "Overpass",
                          fontface = "plain"))

update_geom_defaults("text_repel", 
                     list(family = "Overpass",
                          fontface = "plain"))
update_geom_defaults("text", 
                     list(family = "Overpass",
                          fontface = "plain"))

# Yellowstone colors
clrs <- NatParksPalettes::natparks.pals("Yellowstone")

Code

#' Format duration in minutes and hours
#'
#' This function takes a numeric input \code{x} representing a duration in minutes,
#' rounds it to the nearest 15 minutes, and formats the result as a string
#' indicating the number of hours and minutes in the duration.
#'
#' @param x A numeric input representing a duration in minutes.
#' @return A character vector of formatted duration strings.
#' @examples
#' fmt_duration(c(93, 1007, 3056))
fmt_duration <- function(x) {
  # Round to the nearest 15 minutes
  n_seconds <- round(seconds(x * 60) / (15 * 60)) * (15 * 60)
  n_seconds <- seconds_to_period(n_seconds)
  
  out <- map_chr(n_seconds, \(n) {
    if (seconds(n) <= 59) {
      # If this is less than an hour, don't format anything with hours
      glue("{MM} minutes", MM = minute(n))
    } else {
      # I only want to format this as a number of hours. If the duration is
      # longer than 24 hours, seconds_to_period() rolls over into days (i.e.
      # seconds_to_period(60 * 60 * 24) returns "1d 0H 0M 0S"), and it shows
      # zero hours. So we extract the day part of the period, multiply it by 24,
      # and add it to the hour component that we want to display
      extra_day_hours <- day(n) * 24
  
      glue("{HH} hour{s} {MM} minutes",
        HH = scales::label_comma()(hour(n) + extra_day_hours),
        MM = minute(n),
        s = ifelse(hour(n) == 1, "", "s")
      )
    }
  })
  
  return(out)
}

fmt_miles <- scales::label_number(accuracy = 10, suffix = " miles", big.mark = ",")

miles_to_meters <- function(x) {
  x * 1609.344
}

meters_to_miles <- function(x) {
  x / 1609.344
}

km_to_miles <- function(x) {
  meters_to_miles(x * 1000)
}

State data

First we need state boundaries. We can get these from the US Census with states() from {tigris}:

Code

us_states <- states(resolution = "20m", year = 2022, cb = TRUE)

lower_48 <- us_states %>% 
  filter(!(NAME %in% c("Alaska", "Hawaii", "Puerto Rico")))

Here’s what they look like (using the Albers projection for the contiguous USA):

Code

ggplot() + 
  geom_sf(data = lower_48) +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip()

Super quick and easy.

Stops data

Next we need to geocode and plot all the stops we’ll be making on the trip. I’ll stick them all in a data frame with columns indicating the direction (with a nod to Bilbo Baggins’s “There and Back Again” journey with the dwarves in The Hobbit) and the day of the leg of the trip:

Code

stops_raw <- tribble(
  ~direction,   ~day, ~city,
  "There",      1,    "Atlanta, Georgia", 
  "There",      2,    "New Orleans, Louisiana", 
  "There",      3,    "San Antonio, Texas", 
  "There",      4,    "Carlsbad, New Mexico", 
  "There",      5,    "Grand Canyon NP, Arizona",
  "There",      6,    "Grover, Utah",  
  "Back again", 1,    "Spanish Fork, Utah", 
  "Back again", 1,    "Shelley, Idaho",  
  "Back again", 2,    "Yellowstone NP, Wyoming",
  "Back again", 2,    "Devil's Tower, Wyoming",  
  "Back again", 3,    "Mount Rushmore, South Dakota", 
  "Back again", 4,    "Albert Lea, Minnesota", 
  "Back again", 5,    "Nauvoo, Illinois", 
  "Back again", 6,    "Atlanta, Georgia"
)  %>% 
  mutate(direction = fct_inorder(direction))
stops_raw
## # A tibble: 14 × 3
##    direction    day city                        
##    <fct>      <dbl> <chr>                       
##  1 There          1 Atlanta, Georgia            
##  2 There          2 New Orleans, Louisiana      
##  3 There          3 San Antonio, Texas          
##  4 There          4 Carlsbad, New Mexico        
##  5 There          5 Grand Canyon NP, Arizona    
##  6 There          6 Grover, Utah                
##  7 Back again     1 Spanish Fork, Utah          
##  8 Back again     1 Shelley, Idaho              
##  9 Back again     2 Yellowstone NP, Wyoming     
## 10 Back again     2 Devil's Tower, Wyoming      
## 11 Back again     3 Mount Rushmore, South Dakota
## 12 Back again     4 Albert Lea, Minnesota       
## 13 Back again     5 Nauvoo, Illinois            
## 14 Back again     6 Atlanta, Georgia

I could go to Google Maps and right click on each of these places and copy the latitude/longitude coordinates, but that’s not very reproducible and involves a lot of clicking around. We can do it programmatically with geocode() from {tidygeocoder}, but first we need to help the API a little. With most of these places, OpenStreetMap will return the center of the city (like with Atlanta), and that’s fine. With larger places like the Grand Canyon and Yellowstone, though, OpenStreetMap will give coordinates for the middle of the parks, which are literally in the middle of nowhere and will mess up our directions (since it’s not really possible to drive to the middle of the whole Grand Canyon). To fix this we’ll make a little dataset of some more specific addresses and join it to the list of stops. We’ll geocode the more specific address column:

Code

stops_addresses <- tribble(
  ~city, ~address,
  "Grand Canyon NP, Arizona", "Grand Canyon Visitor Center, Arizona",
  "Yellowstone NP, Wyoming", "Old Faithful, Teton County, Wyoming",
)

stops_to_geocode <- stops_raw %>% 
  left_join(stops_addresses, by = join_by(city)) %>% 
  # Combine the address and city columns, with a preference for address
  mutate(address = coalesce(address, city))

stops_to_geocode
## # A tibble: 14 × 4
##    direction    day city                         address                             
##    <fct>      <dbl> <chr>                        <chr>                               
##  1 There          1 Atlanta, Georgia             Atlanta, Georgia                    
##  2 There          2 New Orleans, Louisiana       New Orleans, Louisiana              
##  3 There          3 San Antonio, Texas           San Antonio, Texas                  
##  4 There          4 Carlsbad, New Mexico         Carlsbad, New Mexico                
##  5 There          5 Grand Canyon NP, Arizona     Grand Canyon Visitor Center, Arizona
##  6 There          6 Grover, Utah                 Grover, Utah                        
##  7 Back again     1 Spanish Fork, Utah           Spanish Fork, Utah                  
##  8 Back again     1 Shelley, Idaho               Shelley, Idaho                      
##  9 Back again     2 Yellowstone NP, Wyoming      Old Faithful, Teton County, Wyoming 
## 10 Back again     2 Devil's Tower, Wyoming       Devil's Tower, Wyoming              
## 11 Back again     3 Mount Rushmore, South Dakota Mount Rushmore, South Dakota        
## 12 Back again     4 Albert Lea, Minnesota        Albert Lea, Minnesota               
## 13 Back again     5 Nauvoo, Illinois             Nauvoo, Illinois                    
## 14 Back again     6 Atlanta, Georgia             Atlanta, Georgia

To get the coordinates, we’ll feed the address column to geocode() and then convert the resulting numeric long and lat into an official sf geometry column (with the WGS 84 (−180 to 180) scale):

Code

stops_geocoded <- stops_to_geocode %>% 
  geocode(address, method = "osm") %>%
  st_as_sf(coords = c("long", "lat"), crs = st_crs("EPSG:4326"))

Code

stops_geocoded
## Simple feature collection with 14 features and 4 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS:  WGS 84
## # A tibble: 14 × 5
##    direction    day city                         address                                  geometry
##  * <fct>      <dbl> <chr>                        <chr>                                 <POINT [°]>
##  1 There          1 Atlanta, Georgia             Atlanta, Georgia                     (-84.4 33.7)
##  2 There          2 New Orleans, Louisiana       New Orleans, Louisiana                 (-90.1 30)
##  3 There          3 San Antonio, Texas           San Antonio, Texas                   (-98.5 29.4)
##  4 There          4 Carlsbad, New Mexico         Carlsbad, New Mexico                  (-104 32.4)
##  5 There          5 Grand Canyon NP, Arizona     Grand Canyon Visitor Center, Arizona  (-112 36.1)
##  6 There          6 Grover, Utah                 Grover, Utah                          (-111 38.2)
##  7 Back again     1 Spanish Fork, Utah           Spanish Fork, Utah                    (-112 40.1)
##  8 Back again     1 Shelley, Idaho               Shelley, Idaho                        (-112 43.4)
##  9 Back again     2 Yellowstone NP, Wyoming      Old Faithful, Teton County, Wyoming   (-111 44.5)
## 10 Back again     2 Devil's Tower, Wyoming       Devil's Tower, Wyoming                (-105 44.6)
## 11 Back again     3 Mount Rushmore, South Dakota Mount Rushmore, South Dakota          (-103 43.9)
## 12 Back again     4 Albert Lea, Minnesota        Albert Lea, Minnesota                (-93.4 43.6)
## 13 Back again     5 Nauvoo, Illinois             Nauvoo, Illinois                     (-91.4 40.6)
## 14 Back again     6 Atlanta, Georgia             Atlanta, Georgia                     (-84.4 33.7)

Atlanta is in there twice, so we’ll need to drop the last row so that we don’t plot the points and labels for it twice:

Code

all_stops_unique <- stops_geocoded %>% 
  slice(1:(n() - 1))

And let’s plot it all:

Code

ggplot() +
  geom_sf(data = lower_48) +
  geom_sf(data = all_stops_unique) +
  geom_label_repel(
    data = all_stops_unique,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates", seed = 1234,
    size = 3, segment.color = clrs[3], min.segment.length = 0
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip()

Routing

Next we need to figure out the routes between each of these points. The osrmRoute() function in {osrm} takes two main arguments: (1) coordinates for the source location (src) and (2) coordinates for the destination location (dst). To make it easy to work with routing data in a dataframe, we need to manipulate the data a bit so that we get a row per route, with columns for the origin and destination cities.

We’ll create a copy of the geometry column in stops_geocoded and shift it forward one row with lead() and drop the last row because it doesn’t have a destination:

Code

routes_raw <- stops_geocoded %>% 
  select(-address) %>% 
  rename(
    origin_geometry = geometry,
    origin_city = city
  ) %>% 
  mutate(
    destination_geometry = lead(origin_geometry),
    destination_city = lead(origin_city)
  ) %>% 
  filter(row_number() != n())
routes_raw
## Simple feature collection with 13 features and 4 fields
## Active geometry column: origin_geometry
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS:  WGS 84
## # A tibble: 13 × 6
##    direction    day origin_city                  origin_geometry destination_geometry destination_city            
##  * <fct>      <dbl> <chr>                            <POINT [°]>          <POINT [°]> <chr>                       
##  1 There          1 Atlanta, Georgia                (-84.4 33.7)           (-90.1 30) New Orleans, Louisiana      
##  2 There          2 New Orleans, Louisiana            (-90.1 30)         (-98.5 29.4) San Antonio, Texas          
##  3 There          3 San Antonio, Texas              (-98.5 29.4)          (-104 32.4) Carlsbad, New Mexico        
##  4 There          4 Carlsbad, New Mexico             (-104 32.4)          (-112 36.1) Grand Canyon NP, Arizona    
##  5 There          5 Grand Canyon NP, Arizona         (-112 36.1)          (-111 38.2) Grover, Utah                
##  6 There          6 Grover, Utah                     (-111 38.2)          (-112 40.1) Spanish Fork, Utah          
##  7 Back again     1 Spanish Fork, Utah               (-112 40.1)          (-112 43.4) Shelley, Idaho              
##  8 Back again     1 Shelley, Idaho                   (-112 43.4)          (-111 44.5) Yellowstone NP, Wyoming     
##  9 Back again     2 Yellowstone NP, Wyoming          (-111 44.5)          (-105 44.6) Devil's Tower, Wyoming      
## 10 Back again     2 Devil's Tower, Wyoming           (-105 44.6)          (-103 43.9) Mount Rushmore, South Dakota
## 11 Back again     3 Mount Rushmore, South Dakota     (-103 43.9)         (-93.4 43.6) Albert Lea, Minnesota       
## 12 Back again     4 Albert Lea, Minnesota           (-93.4 43.6)         (-91.4 40.6) Nauvoo, Illinois            
## 13 Back again     5 Nauvoo, Illinois                (-91.4 40.6)         (-84.4 33.7) Atlanta, Georgia

With the data like this, we can now feed each row individually to osrmRoute and get geocoded paths between each origin and destination:

Code

routes_geocoded_raw <- routes_raw %>% 
  rowwise() %>% 
  mutate(route = osrmRoute(
    src = origin_geometry, 
    dst = destination_geometry)
  )

Code

routes_geocoded_raw
## Simple feature collection with 13 features and 5 fields
## Active geometry column: origin_geometry
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS:  WGS 84
## # A tibble: 13 × 7
## # Rowwise: 
##    direction    day origin_city                  origin_geometry destination_geometry destination_city             route$src $dst  $duration $distance                                                                                 $geometry
##  * <fct>      <dbl> <chr>                            <POINT [°]>          <POINT [°]> <chr>                        <chr>     <chr>     <dbl>     <dbl>                                                                          <LINESTRING [°]>
##  1 There          1 Atlanta, Georgia                (-84.4 33.7)           (-90.1 30) New Orleans, Louisiana       src       dst        513.      753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...
##  2 There          2 New Orleans, Louisiana            (-90.1 30)         (-98.5 29.4) San Antonio, Texas           src       dst        596.      870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2...
##  3 There          3 San Antonio, Texas              (-98.5 29.4)          (-104 32.4) Carlsbad, New Mexico         src       dst        458.      727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101...
##  4 There          4 Carlsbad, New Mexico             (-104 32.4)          (-112 36.1) Grand Canyon NP, Arizona     src       dst        749.     1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33....
##  5 There          5 Grand Canyon NP, Arizona         (-112 36.1)          (-111 38.2) Grover, Utah                 src       dst        510.      628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ...
##  6 There          6 Grover, Utah                     (-111 38.2)          (-112 40.1) Spanish Fork, Utah           src       dst        197.      273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38....
##  7 Back again     1 Spanish Fork, Utah               (-112 40.1)          (-112 43.4) Shelley, Idaho               src       dst        262.      411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...
##  8 Back again     1 Shelley, Idaho                   (-112 43.4)          (-111 44.5) Yellowstone NP, Wyoming      src       dst        199.      240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -...
##  9 Back again     2 Yellowstone NP, Wyoming          (-111 44.5)          (-105 44.6) Devil's Tower, Wyoming       src       dst        552.      694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44....
## 10 Back again     2 Devil's Tower, Wyoming           (-105 44.6)          (-103 43.9) Mount Rushmore, South Dakota src       dst        161.      215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44....
## 11 Back again     3 Mount Rushmore, South Dakota     (-103 43.9)         (-93.4 43.6) Albert Lea, Minnesota        src       dst        548.      866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -...
## 12 Back again     4 Albert Lea, Minnesota           (-93.4 43.6)         (-91.4 40.6) Nauvoo, Illinois             src       dst        359.      487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4...
## 13 Back again     5 Nauvoo, Illinois                (-91.4 40.6)         (-84.4 33.7) Atlanta, Georgia             src       dst        847.     1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90...

The route details are included in a nested data frame list column named route, so we’ll need to unnest it. The resulting data frame now has three different geometry columns (!) for the origin point, destination point, and the route, so we’ll set the route column as the data frame’s primary geometry column (which makes it so we can just use geom_sf(data = routes_geocoded) instead of geom_sf(data = routes_geocoded, aes(geometry = whatever))). The route data also includes some helpful details about the distance (in kilometers) and duration (in minutes), so we’ll create nicely formatted text-based versions of these too:

Code

routes_geocoded <- routes_geocoded_raw %>% 
  unnest(route, names_sep = "_") %>% 
  st_set_geometry("route_geometry") %>% 
  mutate(
    distance_miles = km_to_miles(route_distance),
    distance_text = fmt_miles(distance_miles),
    duration_text = fmt_duration(route_duration)
  )
routes_geocoded
## Simple feature collection with 13 features and 11 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: -113 ymin: 29.4 xmax: -84.4 ymax: 44.9
## Geodetic CRS:  WGS 84
## # A tibble: 13 × 14
##    direction    day origin_city                  origin_geometry destination_geometry destination_city             route_src route_dst route_duration route_distance                                                                            route_geometry distance_miles distance_text duration_text   
##  * <fct>      <dbl> <chr>                            <POINT [°]>          <POINT [°]> <chr>                        <chr>     <chr>              <dbl>          <dbl>                                                                          <LINESTRING [°]>          <dbl> <chr>         <chr>           
##  1 There          1 Atlanta, Georgia                (-84.4 33.7)           (-90.1 30) New Orleans, Louisiana       src       dst                 513.           753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...           468. 470 miles     8 hours 30 minu…
##  2 There          2 New Orleans, Louisiana            (-90.1 30)         (-98.5 29.4) San Antonio, Texas           src       dst                 596.           870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2...           541. 540 miles     10 hours 0 minu…
##  3 There          3 San Antonio, Texas              (-98.5 29.4)          (-104 32.4) Carlsbad, New Mexico         src       dst                 458.           727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101...           452. 450 miles     7 hours 45 minu…
##  4 There          4 Carlsbad, New Mexico             (-104 32.4)          (-112 36.1) Grand Canyon NP, Arizona     src       dst                 749.          1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33....           684. 680 miles     12 hours 30 min…
##  5 There          5 Grand Canyon NP, Arizona         (-112 36.1)          (-111 38.2) Grover, Utah                 src       dst                 510.           628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ...           390. 390 miles     8 hours 30 minu…
##  6 There          6 Grover, Utah                     (-111 38.2)          (-112 40.1) Spanish Fork, Utah           src       dst                 197.           273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38....           169. 170 miles     3 hours 15 minu…
##  7 Back again     1 Spanish Fork, Utah               (-112 40.1)          (-112 43.4) Shelley, Idaho               src       dst                 262.           411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...           256. 260 miles     4 hours 15 minu…
##  8 Back again     1 Shelley, Idaho                   (-112 43.4)          (-111 44.5) Yellowstone NP, Wyoming      src       dst                 199.           240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -...           149. 150 miles     3 hours 15 minu…
##  9 Back again     2 Yellowstone NP, Wyoming          (-111 44.5)          (-105 44.6) Devil's Tower, Wyoming       src       dst                 552.           694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44....           431. 430 miles     9 hours 15 minu…
## 10 Back again     2 Devil's Tower, Wyoming           (-105 44.6)          (-103 43.9) Mount Rushmore, South Dakota src       dst                 161.           215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44....           133. 130 miles     2 hours 45 minu…
## 11 Back again     3 Mount Rushmore, South Dakota     (-103 43.9)         (-93.4 43.6) Albert Lea, Minnesota        src       dst                 548.           866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -...           538. 540 miles     9 hours 15 minu…
## 12 Back again     4 Albert Lea, Minnesota           (-93.4 43.6)         (-91.4 40.6) Nauvoo, Illinois             src       dst                 359.           487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4...           303. 300 miles     6 hours 0 minut…
## 13 Back again     5 Nauvoo, Illinois                (-91.4 40.6)         (-84.4 33.7) Atlanta, Georgia             src       dst                 847.          1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90...           745. 750 miles     14 hours 0 minu…

Let’s see how this looks! Because we set the default geometry of routes_geocoded to the route, geom_sf(data = routes_geocoded) will automatically plot the LINESTRING geometries for the routes as paths:

Code

ggplot() +
  geom_sf(data = lower_48) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_unique) +
  geom_label_repel(
    data = all_stops_unique,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates", seed = 1234,
    size = 3, segment.color = clrs[3], min.segment.length = 0
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip()

AHH that’s so cool! It works!

We’re not done yet though—let’s make this even fancier!

States crossed through

We’re going to be driving through a lot of states. Let’s highlight each of the states we’ll cross through to see how many there are. To do this we can use st_intersection() to find which of the lower_48 states contain a part of the different routes. We’ll then make a copy of lower_48 with a new column indicating if we’ll visit the state:

Code

states_crossed_through <- st_intersection(
  st_transform(lower_48, st_crs(routes_geocoded)),
  routes_geocoded
)

# There are 32 rows here, but 18 unique states (i.e. one day will end in a state
# and start the next day in the same state, so it gets counted twice)
states_crossed_through %>% 
  select(STATEFP, NAME, direction, day)
## Simple feature collection with 32 features and 4 fields
## Geometry type: GEOMETRY
## Dimension:     XY
## Bounding box:  xmin: -113 ymin: 29.4 xmax: -84.4 ymax: 44.9
## Geodetic CRS:  WGS 84
## First 10 features:
##      STATEFP        NAME direction day                       geometry
## 4         13     Georgia     There   1 LINESTRING (-84.4 33.7, -84...
## 10        22   Louisiana     There   1 LINESTRING (-89.6 30.3, -89...
## 32        28 Mississippi     There   1 LINESTRING (-88.4 30.5, -88...
## 41        01     Alabama     There   1 LINESTRING (-85.2 32.9, -85...
## 1         48       Texas     There   2 LINESTRING (-93.7 30.1, -93...
## 10.1      22   Louisiana     There   2 LINESTRING (-90.1 30, -90.3...
## 1.1       48       Texas     There   3 LINESTRING (-98.5 29.4, -98...
## 36        35  New Mexico     There   3 LINESTRING (-104 32, -104 3...
## 36.1      35  New Mexico     There   4 LINESTRING (-104 32.4, -104...
## 47        04     Arizona     There   4 LINESTRING (-109 35.4, -109...
unique(states_crossed_through$NAME)
##  [1] "Georgia"      "Louisiana"    "Mississippi"  "Alabama"      "Texas"        "New Mexico"   "Arizona"      "Utah"         "Idaho"        "Montana"      "Wyoming"      "South Dakota" "Minnesota"    "Illinois"     "Iowa"         "Kentucky"     "Missouri"     "Tennessee"

# Create a column that flags if the state is cross through
lower_48_highlighted <- lower_48 %>% 
  mutate(visited = NAME %in% unique(states_crossed_through$NAME))

Now we can fill using the visited column and make the visited states subtly darker:

Code

ggplot() +
  geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_unique) +
  geom_label_repel(
    data = all_stops_unique,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates", seed = 1234,
    size = 3, segment.color = clrs[3], min.segment.length = 0
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip()

Map of road trip route with crossed-through states highlighted

Summary statistics

Now that we have a list of the states we’ll cross through, let’s play with some quick summary statistics.

Code

n_states <- length(unique(states_crossed_through$NAME))
n_states
## [1] 18

total_distance <- sum(routes_geocoded$route_distance) %>% km_to_miles() %>% fmt_miles()
total_distance
## [1] "5,260 miles"

total_time <- sum(routes_geocoded$route_duration) %>% fmt_duration()
total_time
## [1] "99 hours 15 minutes"

We’re going to drive through 18 states, with a total distance of 5,260 miles over the span of 99 hours 15 minutes (!!)

We can use these distance or duration columns to add some extra detail to the map:

Code

ggplot() +
  geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_unique) +
  geom_label_repel(
    data = routes_geocoded,
    aes(label = distance_text, geometry = route_geometry),
    stat = "sf_coordinates", seed = 12345,
    size = 3, segment.color = clrs[3], min.segment.length = 0,
    fill = colorspace::lighten(clrs[5], 0.8)
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  labs(title = glue("{n_states} states; {total_distance}; and {total_time} in the car. Bring it on.")) +
  theme_roadtrip()

Map of road trip showing the distances between each stop

Faceting

We have some nice tidy data here with columns for direction and day, which makes it easy to subset the data. We can make separate maps for the drive there and the drive back, or even separate maps for each day of the trip. Let’s add facet_wrap() to the plot:

Code

ggplot() +
  geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_unique) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip() +
  facet_wrap(vars(direction), ncol = 1)

There are two problems with this though:

The highlighted states are the same in both facets
The final destination point is missing in each facet (Spanish Fork, Utah in the “There” facet; Atlanta, Georgia in the “Back Again” facet)

Both of these issues are fixable though.

Fixing state highlighting

First we need to change how we identify the states we’ll cross through. Before, we kept the unique state names, but this stripped away the details about the direction of the trip, so we’ll find all the distinct combinations of direction and state and then join that little dataset to lower_48 to make the visited column:

Code

states_crossed_through_better <- st_intersection(
  st_transform(lower_48, st_crs(routes_geocoded)),
  routes_geocoded
) %>% 
  distinct(direction, NAME)
states_crossed_through_better
##       direction         NAME
## 4         There      Georgia
## 10        There    Louisiana
## 32        There  Mississippi
## 41        There      Alabama
## 1         There        Texas
## 36        There   New Mexico
## 47        There      Arizona
## 35        There         Utah
## 12   Back again        Idaho
## 35.2 Back again         Utah
## 15   Back again      Montana
## 25   Back again      Wyoming
## 23   Back again South Dakota
## 16   Back again    Minnesota
## 14   Back again     Illinois
## 18   Back again         Iowa
## 3    Back again     Kentucky
## 4.1  Back again      Georgia
## 7    Back again     Missouri
## 9    Back again    Tennessee

lower_48_highlighted_better <- lower_48 %>% 
  left_join(states_crossed_through_better, by = join_by(NAME)) %>% 
  mutate(visited = !is.na(direction))
as_tibble(lower_48_highlighted_better)
## # A tibble: 51 × 12
##    STATEFP STATENS  AFFGEOID    GEOID STUSPS NAME       LSAD         ALAND      AWATER direction                                                                                 geometry visited
##    <chr>   <chr>    <chr>       <chr> <chr>  <chr>      <chr>        <dbl>       <dbl> <fct>                                                                           <MULTIPOLYGON [°]> <lgl>  
##  1 48      01779801 0400000US48 48    TX     Texas      00    676685555821 18974391187 There      (((-107 31.9, -107 32, -107 32, -107 32, -106 32, -106 32, -106 32, -106 32, -105 32... TRUE   
##  2 06      01779778 0400000US06 06    CA     California 00    403673617862 20291712025 <NA>       (((-119 33.5, -118 33.5, -118 33.4, -118 33.4, -118 33.3, -118 33.3, -118 33.3, -118... FALSE  
##  3 21      01779786 0400000US21 21    KY     Kentucky   00    102266581101  2384240769 Back again (((-89.5 36.6, -89.5 36.6, -89.4 36.6, -89.4 36.6, -89.3 36.6, -89.3 36.6, -89.3 36.... TRUE   
##  4 13      01705317 0400000US13 13    GA     Georgia    00    149486268417  4418716153 There      (((-85.6 35, -85.5 35, -85.4 35, -85.4 35, -85.3 35, -85.3 35, -85 35, -85 35, -85 3... TRUE   
##  5 13      01705317 0400000US13 13    GA     Georgia    00    149486268417  4418716153 Back again (((-85.6 35, -85.5 35, -85.4 35, -85.4 35, -85.3 35, -85.3 35, -85 35, -85 35, -85 3... TRUE   
##  6 55      01779806 0400000US55 55    WI     Wisconsin  00    140292518676 29343193162 <NA>       (((-86.9 45.4, -86.8 45.5, -86.8 45.4, -86.9 45.4, -86.9 45.3, -87 45.4, -86.9 45.4)... FALSE  
##  7 41      01155107 0400000US41 41    OR     Oregon     00    248630319014  6169061220 <NA>       (((-125 42.8, -124 43, -124 43, -124 43.1, -124 43.2, -124 43.2, -124 43.3, -124 43.... FALSE  
##  8 29      01779791 0400000US29 29    MO     Missouri   00    178052253239  2487526202 Back again (((-95.8 40.6, -95.5 40.6, -95.4 40.6, -95.3 40.6, -95.2 40.6, -95.1 40.6, -94.9 40.... TRUE   
##  9 51      01779803 0400000US51 51    VA     Virginia   00    102258178227  8528072639 <NA>       (((-76 37.3, -76 37.4, -76 37.4, -75.9 37.5, -75.9 37.6, -75.9 37.6, -75.9 37.7, -75... FALSE  
## 10 47      01325873 0400000US47 47    TN     Tennessee  00    106792368794  2322190840 Back again (((-90.3 35, -90.3 35, -90.2 35.1, -90.2 35.1, -90.2 35.1, -90.1 35.1, -90.1 35.2, -... TRUE   
## # ℹ 41 more rows

Let’s see how it works by adding a new geom_sf() layer for the highlighted states. We need to double up here with both lower_48 and lower_48_highlighted_better because if we only plot the highlighted states, each facet will only contain those states and not the underlying US map, which we don’t want.

Code

ggplot() +
  geom_sf(data = lower_48) +
  geom_sf(data = na.omit(lower_48_highlighted_better), aes(fill = visited)) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_unique) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip() +
  facet_wrap(vars(direction), ncol = 1)

Each facet now only shows the states corresponding to each direction

There, we fixed issue #1. ✅

Issue #2—the missing endpoint in each facet—is a little trickier to deal with, but still doable. Before messing with the real data, let’s look at a simpler toy example first. Here’s a little dataset with three different “routes” between cities. The coordinates here aren’t actually coordinates—they’re just some arbitrary numbers. But that’s okay—this basically mirrors what we have in routes_geocoded

Fixing missing destination cities

Code

example <- tribble(
  ~day, ~origin_city, ~destination_city, ~origin_coords, ~destination_coords,
  1,    "Atlanta",    "Boston",          1234,           5678,
  2,    "Boston",     "Chicago",         5678,           91011,
  3,    "Chicago",    "Denver",          91011,          131415
) 
example
## # A tibble: 3 × 5
##     day origin_city destination_city origin_coords destination_coords
##   <dbl> <chr>       <chr>                    <dbl>              <dbl>
## 1     1 Atlanta     Boston                    1234               5678
## 2     2 Boston      Chicago                   5678              91011
## 3     3 Chicago     Denver                   91011             131415

The reason we’re not getting the final point in each facet or subset is because there are only three rows here, but four cities. If we plotted the origin_coords column, Atlanta would be missing; if we plotted the destination_coords column, Denver would be missing. We need to manipulate this data so that it has 4 rows (Atlanta, Boston, Chicago, Denver), with the correct coordinates for each city.

There’s an easy way to do this with pivot_longer() from {tidyr}. We can pivot with multiple columns that follow a pattern in their names. Here, all four of the columns we care about are are prefixed destination or origin, followed by a _. If we specify these name settings in pivot_longer(), it’ll automatically create a column named type for destination and origin and it’ll put the rest of the data in corresponding columns:

Code

example %>% 
  pivot_longer(
    cols = c(origin_city, destination_city, origin_coords, destination_coords),
    names_to = c("type", ".value"),
    names_sep = "_"
  )
## # A tibble: 6 × 4
##     day type        city    coords
##   <dbl> <chr>       <chr>    <dbl>
## 1     1 origin      Atlanta   1234
## 2     1 destination Boston    5678
## 3     2 origin      Boston    5678
## 4     2 destination Chicago  91011
## 5     3 origin      Chicago  91011
## 6     3 destination Denver  131415

Now we have one row per city, which is close to what we need. Boston and Chicago are duplicated (since they’re both destination and origin cities), so we need to filter out duplicate cities. There are lots of ways to do this—my preferred way is to group by city and keep the first row in each group using slice():

Code

example %>% 
  pivot_longer(
    cols = c(origin_city, destination_city, origin_coords, destination_coords),
    names_to = c("type", ".value"),
    names_sep = "_"
  ) %>% 
  group_by(city) %>% 
  slice(1)
## # A tibble: 4 × 4
## # Groups:   city [4]
##     day type        city    coords
##   <dbl> <chr>       <chr>    <dbl>
## 1     1 origin      Atlanta   1234
## 2     1 destination Boston    5678
## 3     2 destination Chicago  91011
## 4     3 destination Denver  131415

Woohoo! A dataset with four rows and the correct coordinates for each city. If we could plot this, we’d get both endpoints (Atlanta and Denver).

This same double pivot approach (magically!!) works with the special geometry columns in our actual routes data:

Code

all_stops_endpoints <- routes_geocoded %>% 
  pivot_longer(
    cols = c(origin_city, destination_city, origin_geometry, destination_geometry),
    names_to = c("type", ".value"),
    names_sep = "_"
  ) %>% 
  group_by(direction, city) %>% 
  slice(1) %>% 
  arrange(direction, day, desc(type)) %>% 
  # Use "geometry" as the default geometry column instead of the routes
  st_set_geometry("geometry")

all_stops_endpoints
## Simple feature collection with 15 features and 11 fields
## Active geometry column: geometry
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS:  WGS 84
## # A tibble: 15 × 13
## # Groups:   direction, city [15]
##    direction    day route_src route_dst route_duration route_distance                                                                            route_geometry distance_miles distance_text duration_text       type        city                             geometry
##    <fct>      <dbl> <chr>     <chr>              <dbl>          <dbl>                                                                          <LINESTRING [°]>          <dbl> <chr>         <chr>               <chr>       <chr>                         <POINT [°]>
##  1 There          1 src       dst                 513.           753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...           468. 470 miles     8 hours 30 minutes  origin      Atlanta, Georgia             (-84.4 33.7)
##  2 There          1 src       dst                 513.           753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...           468. 470 miles     8 hours 30 minutes  destination New Orleans, Louisiana         (-90.1 30)
##  3 There          2 src       dst                 596.           870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2...           541. 540 miles     10 hours 0 minutes  destination San Antonio, Texas           (-98.5 29.4)
##  4 There          3 src       dst                 458.           727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101...           452. 450 miles     7 hours 45 minutes  destination Carlsbad, New Mexico          (-104 32.4)
##  5 There          4 src       dst                 749.          1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33....           684. 680 miles     12 hours 30 minutes destination Grand Canyon NP, Arizona      (-112 36.1)
##  6 There          5 src       dst                 510.           628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ...           390. 390 miles     8 hours 30 minutes  destination Grover, Utah                  (-111 38.2)
##  7 There          6 src       dst                 197.           273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38....           169. 170 miles     3 hours 15 minutes  destination Spanish Fork, Utah            (-112 40.1)
##  8 Back again     1 src       dst                 262.           411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...           256. 260 miles     4 hours 15 minutes  origin      Spanish Fork, Utah            (-112 40.1)
##  9 Back again     1 src       dst                 262.           411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...           256. 260 miles     4 hours 15 minutes  destination Shelley, Idaho                (-112 43.4)
## 10 Back again     1 src       dst                 199.           240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -...           149. 150 miles     3 hours 15 minutes  destination Yellowstone NP, Wyoming       (-111 44.5)
## 11 Back again     2 src       dst                 552.           694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44....           431. 430 miles     9 hours 15 minutes  destination Devil's Tower, Wyoming        (-105 44.6)
## 12 Back again     2 src       dst                 161.           215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44....           133. 130 miles     2 hours 45 minutes  destination Mount Rushmore, South Dakota  (-103 43.9)
## 13 Back again     3 src       dst                 548.           866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -...           538. 540 miles     9 hours 15 minutes  destination Albert Lea, Minnesota        (-93.4 43.6)
## 14 Back again     4 src       dst                 359.           487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4...           303. 300 miles     6 hours 0 minutes   destination Nauvoo, Illinois             (-91.4 40.6)
## 15 Back again     5 src       dst                 847.          1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90...           745. 750 miles     14 hours 0 minutes  destination Atlanta, Georgia             (-84.4 33.7)

And plotted:

Code

ggplot() +
  geom_sf(data = lower_48) +
  geom_sf(data = na.omit(lower_48_highlighted_better), aes(fill = visited)) +
  geom_sf(data = routes_geocoded, color = clrs[1]) +
  geom_sf(data = all_stops_endpoints) +
  geom_label_repel(
    data = all_stops_endpoints,
    aes(label = city, geometry = geometry),
    stat = "sf_coordinates", seed = 1234,
    size = 3, segment.color = clrs[3], min.segment.length = 0
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip() +
  facet_wrap(vars(direction), ncol = 1)

Each facet now shows the destination city

Issue #2 fixed. ✅

Zooming in

We can also make separate maps that are zoomed in on each day of the trip in addition these bigger overarching ones. This will allow us to add more details like driving time and distances without getting too crowded on the map.

We’ll create a mini map of the first day by itself, and then scale up the process to make all the mini maps programmatically.

One day of the trip

First we’ll extract the route, cities, and states for the first day of the trip out there:

Code

route_day1 <- routes_geocoded %>% 
  filter(direction == "There", day == 1)

stops_day1 <- all_stops_endpoints %>% 
  filter(direction == "There", day == 1)

states_crossed_day1 <- st_intersection(
  st_transform(lower_48, st_crs(route_day1)),
  route_day1
) %>% 
  distinct(NAME)

lower_48_highlighted_day1 <- lower_48 %>% 
  mutate(visited = NAME %in% states_crossed_day1$NAME)

Let’s plot these to make sure it worked:

Code

ggplot() +
  geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
  geom_sf(data = route_day1, color = clrs[1]) +
  geom_sf(data = stops_day1) +
  geom_sf_label(
    data = stops_day1, 
    aes(label = city),
    nudge_y = miles_to_meters(80)
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(crs = st_crs("ESRI:102003")) +  # Albers
  theme_roadtrip()

Drive for the first day on the full US map

Yep, it worked. But we don’t need to show the whole map—we want to zoom in on just the area around the route for the day. To do this, we can use st_bbox() to extract a rectangle of coordinates around the route, creating a bounding box for the map that we can then use with coord_sf() to zoom in.

st_bbox() returns a named vector of numbers with the x- and y-axis limits:

Code

bbox <- st_bbox(route_day1)
bbox
##  xmin  ymin  xmax  ymax 
## -90.1  30.0 -84.4  33.7

Since the elements are all named, we can use them to specify the different values in coord_sf(). First we’ll temporarily stop using the Albers projection, since the numbers we’ve extracted with st_bbox() are on the WGS 84 (−180 to 180) scale and Albers uses meters, which will make the map not work. (But we’ll fix that in a minute!)

Code

ggplot() +
  geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
  geom_sf(data = route_day1, color = clrs[1]) +
  geom_sf(data = stops_day1) +
  geom_sf_label(
    data = stops_day1, 
    aes(label = city),
    # We're not using Albers, so this isn't measured in meters; it's decimal degrees
    nudge_y = 0.15
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass"
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(
    xlim = c(bbox["xmin"], bbox["xmax"]), 
    ylim = c(bbox["ymin"], bbox["ymax"])
  ) +
  theme_roadtrip()

Trip for the first day, but zoomed in a little too far

We’re zoomed in (yay) but the edges of the bounding box window are too close to the points and the route, and the labels are cropped funny. Also, to shift the labels up we had to think in decimal degrees instead of meters, which I can’t do naturally. And also, we can’t change projections—because we’re specifying the bounding box coordinates in decimal degrees, we’re stuck with WGS 84 instead of Albers or any other projection. These are all fixable issues, fortunately.

To fix all this, we’ll expand the bounding box window by 150 miles on all sides using st_buffer() and then convert the coordinates to Albers before extracting the coordinates of the window for plotting:

Code

bbox_nice <- route_day1 %>% 
  st_bbox() %>%  # Extract the bounding box of the coordinates
  st_as_sfc() %>%  # Convert the bounding box matrix back to an sf object
  st_buffer(miles_to_meters(150)) %>%  # Add 150 miles to all sides
  st_transform("ESRI:102003") %>%  # Switch to Albers
  st_bbox()  # Extract the bounding box of the expanded box
bbox_nice
##     xmin     ymin     xmax     ymax 
##   301851 -1071395  1366321  -105313

These are no longer in decimal degrees, so we can use them with the Albers projection. We’ve also added a 150-mile buffer around the window, so we should have room for all the labels now. Let’s check it:

Code

ggplot() +
  geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
  geom_sf(data = route_day1, color = clrs[1]) +
  geom_sf(data = stops_day1) +
  geom_sf_label(
    data = stops_day1, 
    aes(label = city),
    # We're in Albers again, so we can work with meters (or miles)
    nudge_y = miles_to_meters(30)
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass",
    width_hint = 0.4
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(
    xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]), 
    ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
    crs = st_crs("ESRI:102003")
  ) +
  theme_roadtrip()

Trip for the first day, correctly zoomed and using the Albers projection

Heck yeah.

However, the shapes look a little curved and distorted because we’re zoomed in so much. Albers is great for big countries, but on a small scale like this, WGS 84 is probably fine. That’s what Google Maps does—it only starts getting weird and curvy along horizontal latitude lines when you zoom out really far. Let’s do the first day map one last time without the Albers conversion:

Code

bbox_nice <- route_day1 %>% 
  st_bbox() %>%  # Extract the bounding box of the coordinates
  st_as_sfc() %>%  # Convert the bounding box matrix back to an sf object
  st_buffer(miles_to_meters(150)) %>%  # Add 150 miles to all sides
  st_bbox()  # Extract the bounding box of the expanded box

ggplot() +
  geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
  geom_sf(data = route_day1, color = clrs[1]) +
  geom_sf(data = stops_day1) +
  geom_sf_label(
    data = stops_day1, 
    aes(label = city),
    # We're in WGS 84, so these are decimal degrees
    nudge_y = 0.5
  ) +
  geom_label_repel(
    data = route_day1,
    aes(label = distance_text, geometry = route_geometry),
    stat = "sf_coordinates", seed = 12345,
    family = "Overpass ExtraBold", fontface = "plain",
    size = 3.5, fill = colorspace::lighten(clrs[5], 0.8), color = clrs[1], 
    segment.color = "grey50", min.segment.length = 0
  ) +
  annotation_scale(
    location = "bl", bar_cols = c("grey30", "white"),
    unit_category = "imperial", text_family = "Overpass",
    width_hint = 0.3
  ) +
  scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
  coord_sf(
    xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]), 
    ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
    crs = st_crs("EPSG:4326")
  ) +
  theme_roadtrip()

Trip for the first day, correctly zoomed and using the WGS 84 projection

Neat. The northern borders of Georgia, Alabama, and Mississippi are all flat here, which is great.

All the days of the trip

We successfully plotted one day of the trip. But we’ll be driving for 11 days (!) and need 11 plots. I don’t want to copy/paste this all code 11 times, so we’ll stick each major step into a function:

Code

# Take a set of routes and do some pivoting to create a dataset that includes
# the start and end points
expand_endpoints <- function(routes) {
  routes %>% 
    pivot_longer(
    cols = c(origin_city, destination_city, origin_geometry, destination_geometry),
    names_to = c("type", ".value"),
    names_sep = "_"
  ) %>% 
  group_by(city) %>% 
  slice(1) %>% 
  arrange(desc(type)) %>% 
  st_set_geometry("geometry")
}

# Take a set of routes and return a dataset with a flag marking which states they cross
find_states <- function(routes) {
  # st_intersection() complains about innocuous things so suppress the warnings
  suppressWarnings({
    states_crossed <- st_intersection(
      st_transform(lower_48, st_crs(routes)),
      routes
    ) %>% 
      distinct(NAME)
  })
  
  lower_48 %>% 
    mutate(visited = NAME %in% states_crossed$NAME)
}

# Use routes, stopes, and states to build a plot
build_day_map <- function(routes, stops, states, direction, day) {
  bbox_nice <- routes %>%
    st_bbox() %>%  # Extract the bounding box of the coordinates
    st_as_sfc() %>%  # Convert the bounding box matrix back to an sf object
    st_buffer(miles_to_meters(150)) %>%  # Add 150 miles to all sides
    st_bbox()  # Extract the bounding box of the expanded box

  ggplot() +
    geom_sf(data = states, aes(fill = visited)) +
    geom_sf(data = routes, color = clrs[1]) +
    geom_sf(data = stops) +
    geom_sf_label(
      data = stops,
      aes(label = city),
      # We're in WGS 84, so these are decimal degrees
      nudge_y = 0.4
    ) +
    geom_label_repel(
      data = routes,
      aes(label = distance_text, geometry = route_geometry),
      stat = "sf_coordinates", seed = 12345,
      family = "Overpass ExtraBold", fontface = "plain",
      size = 3.5, fill = colorspace::lighten(clrs[5], 0.8), color = clrs[1], 
      segment.color = "grey50", min.segment.length = 0
    ) +
    annotation_scale(
      location = "bl", bar_cols = c("grey30", "white"),
      unit_category = "imperial", text_family = "Overpass",
      width_hint = 0.4
    ) +
    scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
    coord_sf(
      xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]),
      ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
      crs = st_crs("EPSG:4326")
    ) +
    labs(title = glue("{direction}, day {day}")) +
    theme_roadtrip()
}

With everything functionalized, we can do some super wild {purrr} magic. If we take routes_geocoded and group it by direction and day (so there’s a group per driving day), we’ll get a list column that contains the geocded coordinates for each day:

Code

daily_plots <- routes_geocoded %>% 
  group_by(direction, day) %>% 
  nest(.key = "routes")

daily_plots
## # A tibble: 11 × 3
## # Groups:   direction, day [11]
##    direction    day routes       
##    <fct>      <dbl> <list>       
##  1 There          1 <sf [1 × 12]>
##  2 There          2 <sf [1 × 12]>
##  3 There          3 <sf [1 × 12]>
##  4 There          4 <sf [1 × 12]>
##  5 There          5 <sf [1 × 12]>
##  6 There          6 <sf [1 × 12]>
##  7 Back again     1 <sf [2 × 12]>
##  8 Back again     2 <sf [2 × 12]>
##  9 Back again     3 <sf [1 × 12]>
## 10 Back again     4 <sf [1 × 12]>
## 11 Back again     5 <sf [1 × 12]>

# Check the first day
daily_plots$routes[[1]]
## Simple feature collection with 1 feature and 9 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: -90.1 ymin: 30 xmax: -84.4 ymax: 33.7
## Geodetic CRS:  WGS 84
## # A tibble: 1 × 12
##   origin_city      origin_geometry destination_geometry destination_city       route_src route_dst route_duration route_distance                                                                            route_geometry distance_miles distance_text duration_text     
##   <chr>                <POINT [°]>          <POINT [°]> <chr>                  <chr>     <chr>              <dbl>          <dbl>                                                                          <LINESTRING [°]>          <dbl> <chr>         <chr>             
## 1 Atlanta, Georgia    (-84.4 33.7)           (-90.1 30) New Orleans, Louisiana src       dst                 513.           753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...           468. 470 miles     8 hours 30 minutes

# Check the first day of the return trip
daily_plots$routes[[7]]
## Simple feature collection with 2 features and 9 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: -112 ymin: 40.1 xmax: -111 ymax: 44.7
## Geodetic CRS:  WGS 84
## # A tibble: 2 × 12
##   origin_city        origin_geometry destination_geometry destination_city        route_src route_dst route_duration route_distance                                                                            route_geometry distance_miles distance_text duration_text     
##   <chr>                  <POINT [°]>          <POINT [°]> <chr>                   <chr>     <chr>              <dbl>          <dbl>                                                                          <LINESTRING [°]>          <dbl> <chr>         <chr>             
## 1 Spanish Fork, Utah     (-112 40.1)          (-112 43.4) Shelley, Idaho          src       dst                 262.           411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...           256. 260 miles     4 hours 15 minutes
## 2 Shelley, Idaho         (-112 43.4)          (-111 44.5) Yellowstone NP, Wyoming src       dst                 199.           240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -...           149. 150 miles     3 hours 15 minutes

We can then use purrr::map() and purrr::pmap() to feed that nested list-column data frame through the different functions to expand endpoints, find crossed-through states, and build the daily plot.

Code

daily_plots <- routes_geocoded %>% 
  group_by(direction, day) %>% 
  nest(.key = "routes") %>% 
  mutate(
    stops = map(routes, expand_endpoints),
    states = map(routes, find_states),
    plot = pmap(list(routes, stops, states, direction, day), build_day_map)
  )

Before looking at the plots, check out all these nested datasets! Each day has a dataset for the routes, stops, states, and final plot, and everything is contained here in a data frame. MAGICAL.

Code

daily_plots
## # A tibble: 11 × 6
## # Groups:   direction, day [11]
##    direction    day routes        stops         states         plot  
##    <fct>      <dbl> <list>        <list>        <list>         <list>
##  1 There          1 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  2 There          2 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  3 There          3 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  4 There          4 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  5 There          5 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  6 There          6 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
##  7 Back again     1 <sf [2 × 12]> <sf [3 × 11]> <sf [49 × 11]> <gg>  
##  8 Back again     2 <sf [2 × 12]> <sf [3 × 11]> <sf [49 × 11]> <gg>  
##  9 Back again     3 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
## 10 Back again     4 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>  
## 11 Back again     5 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>

We can extract the plot for any single day with indexing:

Code

daily_plots$plot[[10]]

Or we can filter, pull, and pluck like good denizens of the tidyverse:

Code

daily_plots %>% 
  filter(direction == "Back again", day == 5) %>% 
  pull(plot) %>% pluck(1)

Here’s all of them simultaneously inside a Quarto tabset, just for fun:

The trip there
The trip back again

Code

daily_plots$plot[[1]]

Code

daily_plots$plot[[2]]

Code

daily_plots$plot[[3]]

Code

daily_plots$plot[[4]]

Code

daily_plots$plot[[5]]

Code

daily_plots$plot[[6]]

Code

daily_plots$plot[[7]]

Code

daily_plots$plot[[8]]

Code

daily_plots$plot[[9]]

Code

daily_plots$plot[[10]]

Code

daily_plots$plot[[11]]

Or, instead of extracting each plot individually like this, we can use wrap_plots() from {patchwork} to combine a whole list of plots automatically, though we have a lot less control over the plots this way—some of the maps use a landscape orientation and some use a portrait orientation, so they’re a big mess when combined like this:

Code

daily_plots %>%
  pull(plot) %>% 
  wrap_plots()

Ugly massive plot of all the days of the trip

Time to add some maps to road trip journals and finish packing!

Citation

BibTeX citation:

@online{heiss2023,
  author = {Heiss, Andrew},
  title = {How to Make Fancy Road Trip Maps with {R} and
    {OpenStreetMap}},
  date = {2023-06-01},
  url = {https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/},
  doi = {10.59350/rgwda-0tv16},
  langid = {en}
}

For attribution, please cite this work as:

Heiss, Andrew. 2023. “How to Make Fancy Road Trip Maps with R and OpenStreetMap.” June 1, 2023. https://doi.org/10.59350/rgwda-0tv16.