This post is about making interactive visualizations in R with the leaflet package. As an example, I’ll map the US locations of two of the biggest coffee chains, Starbucks and Dunkin’ Donuts. leaflet lets us put data on a map and interact with it. For instance, we can zoom in for more detail or zoom out for less. We can add markers that show where each data point sits on the map, and hover over or click a marker to get information about it. We can also save the leaflet map as a web page and share it, so that other people can explore it interactively too. These are just some of the things that can be done with leaflet.

Now, let’s work on our data. First, we’ll load the libraries needed for this post.

library(here) # to create a path to the current directory
library(tidyverse) # to load packages related to data cleaning (e.g. dplyr) and data visualization(ggplot2)
library(readxl) # to load excel files
library(leaflet) # to create interactive maps
library(htmlwidgets) # to save leaflet as an html file
library(htmltools) # to escape text shown in labels (htmlEscape)
library(leaflet.extras) # extra leaflet plugins (e.g. heatmaps, search)
library(skimr) # summary statistics
library(mapview) # to save the interactive map as an image

Let me give you just a glimpse of what leaflet can do. We can create a simple interactive map using the “CartoDB.DarkMatter” tile.

Note: The “CartoDB.DarkMatter” tiles are added with addProviderTiles(), which is part of leaflet itself, so the leaflet.extras package is not needed for them.

leaflet(width = "100%", 
        options = leafletOptions(preferCanvas = TRUE)) %>% 
  addProviderTiles("CartoDB.DarkMatter", 
                   options = providerTileOptions(
                     updateWhenZooming = FALSE,
                     updateWhenIdle = TRUE)) %>%
  setView(lng = -100, lat = 35, zoom = 4)

As we can see, we can now use the plus and minus buttons to zoom in and out of the map. In other words, we have an interactive map!
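If you are not sure whether a tile name is available, leaflet ships a named list called providers that can be inspected. A minimal sketch, assuming only that leaflet is installed (the exact list depends on the installed version):

```r
library(leaflet)

# `providers` is a named list that comes with leaflet; each name is a tile provider
"CartoDB.DarkMatter" %in% names(providers)

# list the CartoDB variants available in the installed version
grep("^CartoDB", names(providers), value = TRUE)
```

Any of the returned names can be passed to addProviderTiles() in place of "CartoDB.DarkMatter".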

Before we dive into the leaflet package, we should load the data which we intend to map.

# load files
# starbucks
starbucks <- read_excel(here::here("coffee_chains.xlsx"), 
                        sheet = 1)
glimpse(starbucks)
## Observations: 25,600
## Variables: 13
## $ Brand            <chr> "Starbucks", "Starbucks", "Starbucks", "Starb...
## $ `Store Number`   <chr> "47370-257954", "22331-212325", "47089-256771...
## $ `Store Name`     <chr> "Meritxell, 96", "Ajman Drive Thru", "Dana Ma...
## $ `Ownership Type` <chr> "Licensed", "Licensed", "Licensed", "Licensed...
## $ `Street Address` <chr> "Av. Meritxell, 96", "1 Street 69, Al Jarf", ...
## $ City             <chr> "Andorra la Vella", "Ajman", "Ajman", "Abu Dh...
## $ `State/Province` <chr> "7", "AJ", "AJ", "AZ", "AZ", "AZ", "AZ", "AZ"...
## $ Country          <chr> "AD", "AE", "AE", "AE", "AE", "AE", "AE", "AE...
## $ Postcode         <chr> "AD500", NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ `Phone Number`   <chr> "376818720", NA, NA, NA, NA, NA, NA, NA, "266...
## $ Timezone         <chr> "GMT+1:00 Europe/Andorra", "GMT+04:00 Asia/Du...
## $ Longitude        <dbl> 1.53, 55.47, 55.47, 54.38, 54.54, 54.49, 54.4...
## $ Latitude         <dbl> 42.51, 25.42, 25.39, 24.48, 24.51, 24.40, 24....
# dunkin' donuts
dunkin_donuts <- read_excel(here::here("coffee_chains.xlsx"), 
                            sheet = 3)
glimpse(dunkin_donuts)
## Observations: 4,898
## Variables: 22
## $ id                <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
## $ biz_name          <chr> "Dunkin' Donuts", "Dunkin' Donuts", "Dunkin'...
## $ e_address         <chr> "2178 Eastern Valley Rd", "480 Cahaba Valley...
## $ e_city            <chr> "Bessemer", "Pelham", "Dothan", "Phoenix", "...
## $ e_state           <chr> "AL", "AL", "AL", "AZ", "AZ", "AZ", "AZ", "A...
## $ e_postal          <dbl> 35022, 35124, 36301, 85021, 85015, 85051, 85...
## $ e_zip_full        <chr> "35022-5363", "35124-1364", "36301-5747", "8...
## $ e_country         <chr> "USA", "USA", "USA", "USA", "USA", "USA", "U...
## $ loc_county        <chr> "Jefferson", "Shelby", "Houston", "Maricopa"...
## $ loc_area_code     <dbl> 205, 205, 334, 602, 602, 602, 602, 602, 623,...
## $ loc_FIPS          <dbl> 1073, 1117, 1069, 4013, 4013, 4013, 4013, 40...
## $ loc_MSA           <dbl> 1000, 1000, 2180, 6200, 6200, 6200, 6200, 62...
## $ loc_PMSA          <chr> "null", "null", "null", "null", "null", "nul...
## $ loc_TZ            <chr> "CST", "CST", "CST", "MST", "MST", "MST", "M...
## $ loc_DST           <chr> "Y", "Y", "Y", "N", "N", "N", "N", "N", "N",...
## $ loc_LAT_centroid  <dbl> 33.4335, 33.3024, 31.1461, 33.5597, 33.5095,...
## $ loc_LAT_poly      <dbl> 33.33441, 33.33448, 31.18990, 33.55306, 33.5...
## $ loc_LONG_centroid <dbl> -86.9037, -86.8026, -85.4124, -112.0949, -11...
## $ loc_LONG_poly     <dbl> -86.99597, -86.78430, -85.39924, -112.10104,...
## $ web_url           <chr> "http://www.dunkindonuts.com", "http://www.d...
## $ biz_info          <chr> NA, NA, NA, "(602) 242-4314", NA, NA, NA, NA...
## $ biz_phone         <chr> "(205) 425-1333", "(205) 988-3664", "(334) 6...
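The two sheets above are read by position (sheet = 1 and sheet = 3), which silently breaks if the workbook is reordered. As a rough sketch, assuming the same coffee_chains.xlsx file sits in the project folder, readxl can list the sheet names first so that they can be used instead of the positions:

```r
library(readxl)
library(here)

path <- here::here("coffee_chains.xlsx")

# list the sheet names, if the workbook is present in the project folder
if (file.exists(path)) {
  print(excel_sheets(path))
}
```

Whatever names are printed can then be passed as sheet = "name" to read_excel().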

We now have two data frames, one with the Starbucks data and one with the Dunkin’ Donuts data, and we can start wrangling them. First, the Starbucks data:

# wrangling data
# starbucks
starbucks_tidy <- starbucks %>%
  # keep only Starbucks-branded stores in the USA
  filter(Country == "US", Brand == "Starbucks") %>%
  # keep only the columns of interest; longitude and latitude are needed to map the locations
  select(Brand, 
         lng = Longitude, 
         lat = Latitude, Country) %>%
  # tidy the columns' names
  select_all(tolower)
glimpse(starbucks_tidy)
## Observations: 13,311
## Variables: 4
## $ brand   <chr> "Starbucks", "Starbucks", "Starbucks", "Starbucks", "S...
## $ lng     <dbl> -149.78, -149.84, -149.85, -149.89, -149.86, -149.87, ...
## $ lat     <dbl> 61.21, 61.14, 61.11, 61.13, 61.14, 61.19, 61.22, 61.18...
## $ country <chr> "US", "US", "US", "US", "US", "US", "US", "US", "US", ...

Now, the Dunkin’ Donuts data frame:

# dunkin' donuts
dunkin_donuts_tidy <- dunkin_donuts %>%
  # the name is spelled inconsistently in several rows, so we recode those variants
  mutate(biz_name = recode(biz_name,
                           "Donuts Dunkin" = "Dunkin' Donuts",
                           "Dunkin' Donuts-baskln Robbins" = "Dunkin' Donuts",
                           "Dunkin' Donuts Center" = "Dunkin' Donuts",
                           "Dunkin' Donuts/Baskin Robbins" = "Dunkin' Donuts")) %>%
  # keep only the rows where the coffee chain is Dunkin' Donuts
  filter(biz_name == "Dunkin' Donuts") %>%
  # select the columns of interest and rename them
  select(brand = biz_name, 
         lng = loc_LONG_poly, 
         lat = loc_LAT_poly, 
         country = e_country) %>%
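  # recode "USA" to "US" to match the Starbucks data; any other value is kept as it is
  mutate(country = case_when(country == "USA" ~ "US",
                             TRUE ~ country))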
glimpse(dunkin_donuts_tidy)
## Observations: 4,866
## Variables: 4
## $ brand   <chr> "Dunkin' Donuts", "Dunkin' Donuts", "Dunkin' Donuts", ...
## $ lng     <dbl> -86.99597, -86.78430, -85.39924, -112.10104, -112.1108...
## $ lat     <dbl> 33.33441, 33.33448, 31.18990, 33.55306, 33.52416, 33.5...
## $ country <chr> "US", "US", "US", "US", "US", "US", "US", "US", "US", ...

In the next step, we will put together the Starbucks and Dunkin’ Donuts data in a common data frame called coffee_chains.

# combine cases
coffee_chains <- bind_rows(starbucks_tidy, dunkin_donuts_tidy)
skim(coffee_chains)
## Skim summary statistics
##  n obs: 18177 
##  n variables: 4 
## 
## -- Variable type:character -----------------------------------------------------------
##  variable missing complete     n min max empty n_unique
##     brand       0    18177 18177   9  14     0        2
##   country       0    18177 18177   2   2     0        1
## 
## -- Variable type:numeric -------------------------------------------------------------
##  variable missing complete     n   mean    sd      p0     p25    p50
##       lat       0    18177 18177  38.21  5.2    19.64   34.1   39.34
##       lng       0    18177 18177 -92.75 18.91 -159.46 -112.05 -86.17
##     p75   p100     hist
##   41.77  64.85 ▁▂▅▇▃▁▁▁
##  -76.51 -67.28 ▁▁▁▆▂▂▇▇
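skim() tells us there are two unique brands, and a quick count() shows how many rows each one contributes. The sketch below uses small made-up data frames (sb and dd are hypothetical stand-ins for starbucks_tidy and dunkin_donuts_tidy) so that it runs on its own:

```r
library(dplyr)

# toy stand-ins for starbucks_tidy and dunkin_donuts_tidy (made-up coordinates)
sb <- tibble(brand = "Starbucks",
             lng = c(-149.78, -122.33),
             lat = c(61.21, 47.61),
             country = "US")
dd <- tibble(brand = "Dunkin' Donuts",
             lng = -71.06,
             lat = 42.36,
             country = "US")

# stack the two data frames on top of each other
toy_chains <- bind_rows(sb, dd)

# one row per brand with its number of stores
brand_counts <- count(toy_chains, brand)
brand_counts
```

On the real coffee_chains data, the same count() call should give 13,311 rows for Starbucks and 4,866 for Dunkin’ Donuts, matching the glimpse() outputs above.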

Now that we have our data frame with both coffee chains, let’s focus on our main goal: creating an interactive map of both coffee chains with leaflet.

We can use several functions from the leaflet package to map our data. Before that, I will create a two-color palette for our coffee chains.

# create colors
pal_color <- colorFactor(palette = c("#007042", "#ea4498"),
                         levels = c("Starbucks", "Dunkin' Donuts"))
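It helps to remember that colorFactor() does not return colors but a function that turns brand names into colors. A small sketch to try it out, repeating the palette definition so that the snippet runs on its own:

```r
library(leaflet)

pal_color <- colorFactor(palette = c("#007042", "#ea4498"),
                         levels = c("Starbucks", "Dunkin' Donuts"))

# brand names in, hex color strings out (one color per input value)
brand_colors <- pal_color(c("Starbucks", "Dunkin' Donuts", "Starbucks"))
brand_colors
```

This is why the map code can write color = ~pal_color(brand): the palette function is applied to the brand column row by row.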

Next, we can pass our data frame to the leaflet map. The addCircleMarkers() function draws each store as a circle and lets us control how the markers look.

# add coffee_chains to our map
coffee_chains %>% 
  leaflet(width = "100%", 
          options = leafletOptions(preferCanvas = TRUE)) %>% 
  addProviderTiles("CartoDB.DarkMatter", 
                   options = providerTileOptions(
                     updateWhenZooming = FALSE,
                     updateWhenIdle = TRUE)) %>% 
  addCircleMarkers(radius = 3, # size of the markers
                   label = ~htmlEscape(brand), # label of the marker
                   color = ~pal_color(brand), # brand color associated with the marker
                   popup = ~paste0("<b>", brand, "</b>")) # popup with the brand's name in bold

The map suggests that Dunkin’ Donuts is far more concentrated on the East Coast than on the West Coast, while Starbucks seems to be spread more evenly across the USA. Of course, we could filter the data to look at each brand separately.
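The east/west impression can also be checked with numbers. Here is a rough sketch on made-up data (toy_chains is a hypothetical stand-in for coffee_chains), assuming an arbitrary cutoff at longitude -100, the same value used to center the first map:

```r
library(dplyr)

# toy stand-in for coffee_chains (made-up coordinates, not the real data)
toy_chains <- tibble(
  brand = c("Starbucks", "Starbucks", "Dunkin' Donuts", "Dunkin' Donuts"),
  lng   = c(-122.33, -73.99, -71.06, -75.16),
  lat   = c(47.61, 40.73, 42.36, 39.95)
)

east_share <- toy_chains %>%
  # flag stores east of the cutoff (US longitudes are negative)
  mutate(east = lng > -100) %>%
  group_by(brand) %>%
  # number of stores and proportion of them east of the cutoff, per brand
  summarise(n_stores = n(), share_east = mean(east))

east_share
```

Replacing toy_chains with coffee_chains would give the share of stores east of the cutoff for each brand in the real data.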

We can now filter the data. Let’s start with Dunkin’ Donuts:

# filter dunkin' donuts
# add coffee_chains to our map
coffee_chains %>% 
  leaflet(width = "100%", 
          options = leafletOptions(preferCanvas = TRUE)) %>% 
  addProviderTiles("CartoDB.DarkMatter") %>%
  # add filter
  addCircleMarkers(data = filter(coffee_chains, brand == "Dunkin' Donuts"),
                   radius = 3,
                   label = ~htmlEscape(brand),
                   color = ~pal_color(brand),
                   popup = ~paste0("<b>", brand, "</b>")) 

As expected, Dunkin’ Donuts has a greater presence on the East Coast, while, as the next map shows, Starbucks is more or less evenly distributed across the country.

# filter starbucks
# add coffee_chains to our map
coffee_chains %>% 
  leaflet(width = "100%", 
          options = leafletOptions(preferCanvas = TRUE)) %>% 
  addProviderTiles("CartoDB.DarkMatter", 
                   options = providerTileOptions(
                     updateWhenZooming = FALSE,
                     updateWhenIdle = TRUE)) %>% 
  # add filter
  addCircleMarkers(data = filter(coffee_chains, brand == "Starbucks"),
                   radius = 3,
                   label = ~htmlEscape(brand),
                   color = ~pal_color(brand),
                   popup = ~paste0("<b>", brand, "</b>")) 

That said, there is a better way to do this. Each set of markers can be assigned to a group, and addLayersControl() then adds a checkbox for every group, so we can tick the brands that we want to show on the map. We can also add a legend to our interactive map.

coffee_chains %>%
  leaflet(width = "100%", 
          options = leafletOptions(preferCanvas = TRUE)) %>%
  addProviderTiles("CartoDB.DarkMatter", 
                   options = providerTileOptions(
                     updateWhenZooming = FALSE,
                     updateWhenIdle = TRUE)) %>%
  addCircleMarkers(data = filter(coffee_chains, brand == "Starbucks"), # add brand filter
                   radius = 3,
                   label = ~htmlEscape(brand),
                   color = ~pal_color(brand),
                   popup = ~paste0("<b>", brand, "</b>"),
                   group = "Starbucks") %>%
  addCircleMarkers(data = filter(coffee_chains, brand == "Dunkin' Donuts"),# add brand filter
                   radius = 3,
                   label = ~htmlEscape(brand),
                   color = ~pal_color(brand),
                   popup = ~paste0("<b>", brand, "</b>"),
                   group = "Dunkin' Donuts" ) %>%
  # add legend
  addLegend(pal = pal_color,
            values = c("Starbucks", "Dunkin' Donuts"),
            # opacity of .5, legend title called Brand and its position on the topright corner
            opacity = 0.5, title = "Brand", position = "topright") %>%
  # add layers to select the brands that are mapped
  addLayersControl(overlayGroups = c("Starbucks", "Dunkin' Donuts"))
## Assuming "lng" and "lat" are longitude and latitude, respectively
## Assuming "lng" and "lat" are longitude and latitude, respectively
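With only two brands, writing addCircleMarkers() twice is fine, but the same pattern can be written as a loop that adds one group per brand. This is just a sketch on made-up data (toy_chains and coffee_map are hypothetical names), with lng and lat passed explicitly so that leaflet does not have to guess the coordinate columns:

```r
library(dplyr)
library(leaflet)
library(htmltools)

# toy stand-in for coffee_chains (made-up coordinates, not the real data)
toy_chains <- tibble(
  brand = c("Starbucks", "Starbucks", "Dunkin' Donuts"),
  lng   = c(-122.33, -73.99, -71.06),
  lat   = c(47.61, 40.73, 42.36)
)

brands <- c("Starbucks", "Dunkin' Donuts")
pal_color <- colorFactor(palette = c("#007042", "#ea4498"), levels = brands)

# start with the base map
coffee_map <- leaflet(width = "100%") %>%
  addProviderTiles("CartoDB.DarkMatter")

# add one marker layer per brand, each in its own group
for (b in brands) {
  coffee_map <- coffee_map %>%
    addCircleMarkers(data = filter(toy_chains, brand == b),
                     lng = ~lng, lat = ~lat,
                     radius = 3,
                     label = ~htmlEscape(brand),
                     color = ~pal_color(brand),
                     popup = ~paste0("<b>", htmlEscape(brand), "</b>"),
                     group = b)
}

# legend plus one checkbox per group
coffee_map <- coffee_map %>%
  addLegend(pal = pal_color, values = brands,
            opacity = 0.5, title = "Brand", position = "topright") %>%
  addLayersControl(overlayGroups = brands)

coffee_map
```

The loop pays off mostly when there are more than a couple of groups, since adding a brand then only means extending the brands vector and the palette.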

And with that, our interactive map is finished.
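To share the map as a web page, it can be written to an HTML file with saveWidget() from htmlwidgets. A minimal sketch, assuming the map has been stored in an object (coffee_map is a hypothetical name, and a small map stands in for the full one here):

```r
library(leaflet)
library(htmlwidgets)

# a small map standing in for the finished coffee map
coffee_map <- leaflet() %>%
  addProviderTiles("CartoDB.DarkMatter") %>%
  setView(lng = -100, lat = 35, zoom = 4)

# written to a temporary folder here; point this wherever you want the file
out_file <- file.path(tempdir(), "coffee_chains_map.html")

# selfcontained = FALSE puts the JS/CSS in a folder next to the HTML file;
# selfcontained = TRUE bundles everything into one file but may need pandoc
saveWidget(coffee_map, file = out_file, selfcontained = FALSE)
```

For a static image instead of a web page, mapview offers mapshot(), which, as far as I know, relies on extra tools such as the webshot package, so check its help page before using it.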

I hope you have enjoyed this post about the leaflet package in R. It is a powerful tool for making interactive maps. Feel free to leave your comments. Happy R coding!