
Data Storytelling, Again!

Transforming traditional plots into something intuitive.

R
Visuals
Storytelling
2023
Author

Cliff Weaver

Published

July 26, 2023

Modified

July 28, 2023

Introduction

You can toggle to a light background using the toggle switch on the upper right next to the magnifying glass. I think the light option works best for this post.

Have you ever created a chart that looks like this?

Be honest! Of course you have! Would you create the same thing today? Hopefully not.

You try to make sense of the comparisons across the entire graph but with so many bars of alternating colors, it looks less like a data visualization and more like a test pattern, or an old-school 3-D image that you can’t see without those red-and-blue glasses.

Let’s see how the same information can be presented in a more compelling manner.

Develop Data

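# Install easypackages on first use, then load the remaining packages through it
if (!require(easypackages)) install.packages("easypackages")
library(easypackages)
packages("tidyverse", "randomNames", "ggtext", "scales", "ggrepel", prompt = FALSE)

set.seed(1235)  # fix the seed so the simulated data is reproducible

# Simulate 20 associates, each with a revenue target and closed deals
my_tbl <- tibble(name = randomNames(20, ethnicity = 5, which.names = "first"),
                 target = round(runif(20, min = 59000, max = 70000), 0),
                 deals = round(runif(20, min = 52000, max = 84000)))

# Order associates by the larger of their two values; as_factor() keeps that order.
# pmax() is vectorized, so no rowwise() grouping is left hanging on the table.
my_tbl <- my_tbl %>%
  mutate(max = pmax(target, deals)) %>%
  arrange(max) %>%
  select(-max) %>%
  mutate(name = as_factor(name))

# Flag the associates whose closed deals exceeded their target
my_tbl <- my_tbl %>% mutate(exceed = case_when(deals > target ~ 1,
                                               TRUE ~ 0))
my_tbl_exceed <- my_tbl %>% select(name, exceed) %>% filter(exceed == 1)

# Reshape to long format: one row per associate and measure
my_tbl <- my_tbl %>%
  select(-exceed) %>%
  pivot_longer(cols = c("target", "deals"),
               names_to = "performance",
               values_to = "revenue")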
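
If the reshape at the end of that chunk is hard to picture, here is a minimal sketch on a made-up two-person table. The names and numbers (and the wide_tbl and long_tbl objects) are hypothetical and only illustrate what pivot_longer() does to the layout.

```r
library(tidyr)   # pivot_longer()
library(tibble)  # tibble()

# Hypothetical two-person table in the same "wide" layout as my_tbl
wide_tbl <- tibble(name = c("Ana", "Ben"),
                   target = c(60000, 65000),
                   deals = c(72000, 58000))

# One row per name/measure pair: the "long" layout that maps cleanly
# onto ggplot aesthetics such as fill or color
long_tbl <- pivot_longer(wide_tbl,
                         cols = c("target", "deals"),
                         names_to = "performance",
                         values_to = "revenue")
print(long_tbl)
```

Each person now occupies two rows, one for target and one for deals, which is what lets a single performance column drive the colors in the plots.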

Recreate the Original

# Format a number as dollars with one decimal place, e.g. 63.2 becomes "$63.2"
format.money  <- function(x, ...) {
  paste0("$", formatC(as.numeric(x), format = "f", digits = 1, big.mark = ","))
}
# Bar labels in thousands, e.g. "$63.2K"
my_tbl <- my_tbl %>% mutate(rev_label = paste0(format.money(revenue / 1000), "K"))

my_tbl %>% ggplot(aes(x = name, y = revenue, fill = performance, label = rev_label)) +
  geom_col(position = position_dodge()) +
  theme(axis.text.x = element_text(angle = 90, hjust = 0.99, vjust = 0.5), legend.position = "none",
        plot.title = element_markdown(), panel.background = element_rect(fill = "white"),
        axis.ticks.x = element_blank()) +
  # The bars are drawn with the fill aesthetic, so the colors belong on the fill scale
  scale_fill_manual(values = c(target = "#90C6E9", deals = "#F00D00")) +
  geom_text(angle = 90, size = 2, position = position_dodge(width = .9), hjust = -.1) +
  ggtitle("<b>Sales report for Q2</b><br> <span style = 'color:#90C6E9;'> revenue targets </span> vs. 
          <span style = 'color:#F00D00;'> closed deals </span> by associate") +
  xlab("") + ylab("") + scale_y_continuous(labels = label_dollar(scale = .001, suffix = "K"))
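
To see what the label helper produces, here is a small base R sketch. The money_label name and the sample revenue values are hypothetical. I use a plain name here because a function called format.money can be picked up by R as the format() method for objects of class "money", which is probably not what you want.

```r
# Hypothetical stand-in for the format.money helper, same body
money_label <- function(x) {
  paste0("$", formatC(as.numeric(x), format = "f", digits = 1, big.mark = ","))
}

# Made-up revenue figures in dollars
revenue <- c(59321, 70488, 1234567)

# Scale to thousands, format, then append the "K" suffix
labels <- paste0(money_label(revenue / 1000), "K")
print(labels)
# [1] "$59.3K"    "$70.5K"    "$1,234.6K"
```

The big.mark argument only matters once the scaled value reaches four digits, as in the last label.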

Let’s Make Improvements

By implementing a slightly different graph type, we remove a lot of extra visual clutter, lighten up the view for an audience, and give ourselves the chance to add in thoughtful labels and annotations. The graph type I’m talking about is a connected (Cleveland) dot plot.
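
Stripped of labels and annotations, a connected dot plot is just two layers: a line joining each person's two values and a dot for each value. Here is a minimal sketch on made-up data; dot_tbl and its names and numbers are hypothetical, and this is one way to build the chart, not the only one.

```r
library(ggplot2)

# Hypothetical long-format data: two rows (target and deals) per associate
dot_tbl <- data.frame(
  name = rep(c("Ana", "Ben", "Cy"), each = 2),
  performance = rep(c("target", "deals"), times = 3),
  revenue = c(60000, 72000, 65000, 58000, 62000, 66000)
)

p <- ggplot(dot_tbl, aes(x = revenue, y = name)) +
  # The connector: one grey segment per associate, drawn first so the dots sit on top
  geom_line(aes(group = name), color = "grey") +
  # The two dots per associate, colored by measure
  geom_point(aes(color = performance), size = 3) +
  scale_color_manual(values = c(deals = "blue", target = "red"))
p
```

The length of each grey segment is the gap between target and closed deals, so the comparison the bar chart forces you to work out is read directly off the plot.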

Dot plots are woefully underused in most business communications, and I have my suspicions as to why. Although you can create them in just about any tool, connected dot plots are not available as default graph types in Excel, PowerPoint, Power BI, Tableau, Qlik, Domo, Looker Studio, or MicroStrategy. (They are, however, in basic libraries for R and Python, and are easily created with online tools such as flourish.studio and Datawrapper.) Even so, it is fairly easy to find step-by-step instructions online for creating these charts in your tool of choice.

Of course, being an R evangelist, I created a new version of the plot in R. After all, ggplot2 is, in my view, the most powerful plotting solution and perhaps the most imitated. Even Python developers respect ggplot2!

# Label helper and labels, as in the bar chart chunk
format.money  <- function(x, ...) {
  paste0("$", formatC(as.numeric(x), format = "f", digits = 1, big.mark = ","))
}

my_tbl <- my_tbl %>% mutate(rev_label = paste0(format.money(revenue / 1000), "K"))
my_tbl %>%
  ggplot(aes(revenue, name, color = performance)) +
  # Grey connector between each associate's two dots, drawn first so the dots sit on top
  geom_line(aes(group = name), color = "grey") +
  # Levels sort alphabetically, so deals is blue and target is red
  geom_point(shape = 19, size = 3) + scale_color_manual(values = c("blue", "red")) +
  # Below target: plain label to the right of the target dot (the larger value)
  geom_richtext(data = my_tbl %>% filter(!name %in% my_tbl_exceed$name & performance == "target"),
                fill = NA, label.color = NA, size = 3,
                aes(label = str_glue("{name}: <span style='font-size:6pt'> {rev_label} </span>"),
                    x = revenue + 4000)) +
  # Above target: bold name to the right of the deals dot (the larger value)
  geom_richtext(data = my_tbl %>% filter(name %in% my_tbl_exceed$name & performance == "deals"),
                fill = NA, label.color = NA, size = 3,
                aes(label = str_glue("<b>{name}</b>: <span style='font-size:6pt'> {rev_label} </span>"),
                    x = revenue + 4000)) +
  theme(plot.title = element_markdown(), legend.position = "none",
        panel.background = element_rect(fill = "white"), axis.ticks.x = element_blank(),
        axis.ticks.y = element_blank(), axis.text.y = element_blank(), axis.title.y = element_blank()) +
  ggtitle("<b>Sales report for Q2</b><br> <span style = 'color:red;'> Target </span> vs. 
          <span style = 'color:blue;'> Closed Deals </span> by Sales Manager") + xlab("Revenue") +
  scale_x_continuous(labels = label_dollar(scale = .001, suffix = "K"), limits = c(50000, 90000)) +
  # Call-out for the best and worst gaps; the names and amounts are typed in by hand
  annotate(geom = "richtext", x = 85000, y = 5, label.color = NA,
           label="<span style='color:blue;'>Patrick </span> killed it this month!<br> 
           <b><span style='color:blue;'>14.2K</span> </b>above plan.<br>
           (<span style='color:red;'>Stevie Ray </span>needs some help.<br>
           <b><span style='color:red;'>7.4K</span> </b>under plan.)", size = 3)
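
The names and amounts in the annotation are typed in by hand. A rough sketch of how they could be pulled from the data instead is below. It uses a made-up table (gap_demo, with hypothetical names and values) in the same long layout as my_tbl. With the real my_tbl you would first drop the rev_label column so that pivot_wider() can collapse each person back to one row.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data shaped like my_tbl (names and values are made up)
gap_demo <- tibble::tibble(
  name = rep(c("Ana", "Ben", "Cy"), each = 2),
  performance = rep(c("target", "deals"), times = 3),
  revenue = c(60000, 72000, 65000, 58000, 62000, 66000)
)

# Back to one row per person, then compute deals minus target
gaps <- gap_demo %>%
  pivot_wider(names_from = performance, values_from = revenue) %>%
  mutate(gap = deals - target) %>%
  arrange(desc(gap))

best <- slice_head(gaps, n = 1)   # largest amount above plan
worst <- slice_tail(gaps, n = 1)  # largest amount under plan

# Build the annotation text from the data instead of typing it in
best_label <- sprintf("%s: %.1fK above plan", best$name, best$gap / 1000)
worst_label <- sprintf("%s: %.1fK under plan", worst$name, abs(worst$gap) / 1000)
print(c(best_label, worst_label))
# [1] "Ana: 12.0K above plan" "Ben: 7.0K under plan"
```

Strings built this way can be pasted into the label argument of annotate(), so the call-out stays correct if the seed or the data changes.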

If I were to explain why I believe the above plot is far more intuitive and informative than the original, I would be tooting my own horn. So you decide for yourself.

Blog made with Quarto, by Cliff Weaver. License: CC BY-SA 2.0.