library(tidyverse)
library(skimr)
library(plotly)
library(here)
Exploring the lifespans of historical figures born on a Leap Day
Background
Happy belated Leap Day! This week’s #tidytuesday focuses on significant historical events and on people who were born or died on a Leap Day. The aim of this post is to contribute a couple of data visualizations to this social data project. Specifically, I used plotly and Tableau to create my contributions.
data_births <- read_csv(
  here(
    "blog",
    "posts",
    "2024-02-27-tidytuesday-2024-02-27-leap-day",
    "births.csv"
  )
)
Let’s do a quick glimpse() and skim() of our data, just so we get an idea of what we’re working with here.
glimpse(data_births)
Rows: 121
Columns: 4
$ year_birth <dbl> 1468, 1528, 1528, 1572, 1576, 1640, 1692, 1724, 1736, 1792, 1812, 1828, 1836, …
$ person <chr> "Pope Paul III", "Albert V", "Domingo Báñez", "Edward Cecil", "Antonio Neri", …
$ description <chr> NA, "Duke of Bavaria", "Spanish theologian", "1st Viscount Wimbledon", "Floren…
$ year_death <dbl> 1549, 1579, 1604, 1638, 1614, 1704, 1763, 1822, 1784, 1868, 1880, 1921, 1908, …
skim(data_births)
Name | data_births |
Number of rows | 121 |
Number of columns | 4 |
_______________________ | |
Column type frequency: | |
character | 2 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
person | 0 | 1.00 | 6 | 29 | 0 | 121 | 0 |
description | 1 | 0.99 | 12 | 95 | 0 | 107 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
year_birth | 0 | 1.00 | 1919.90 | 101.01 | 1468 | 1920 | 1944.0 | 1976 | 2004 | ▁▁▁▁▇ |
year_death | 65 | 0.46 | 1933.61 | 126.53 | 1549 | 1920 | 1989.5 | 2013 | 2023 | ▁▁▁▁▇ |
Data description
This week’s data comes from the February 29th Wikipedia page. Three data sets are available: one on significant events, one on births, and one on deaths of historical figures that occurred on a Leap Day. Given what’s available, I was interested in exploring the age and lifespan of the historical figures born on a Leap Day. Here’s the wrangling code I created to explore the data. People without a recorded death year are flagged as alive, and their age is calculated as of 2024.
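data_age <- data_births |>
  mutate(
    # Flag people with no recorded death year as still alive
    is_alive = ifelse(is.na(year_death), 1, 0),
    # Use 2024 as the end year for living people so an age can be computed
    year_death = ifelse(is.na(year_death), 2024, year_death),
    age = year_death - year_birth
  ) |>
  arrange(desc(age)) |>
  relocate(person, description, year_birth, year_death, age)

# Order the person factor by birth year so the chart's y-axis is chronological
data_age$person <- factor(data_age$person, levels = data_age$person[order(data_age$year_birth)])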
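As a quick sanity check of the wrangling logic above, here is a minimal sketch that runs the same steps on a made-up three-row tibble. The rows and the names toy and toy_age are hypothetical and are not part of the #tidytuesday data; the only assumption carried over is the use of 2024 as the end year for people without a recorded death.

```r
library(dplyr)

# Hypothetical toy rows (not from the Wikipedia data)
toy <- tibble(
  person = c("A", "B", "C"),
  year_birth = c(1900, 1980, 1940),
  year_death = c(1950, NA, 2020)
)

toy_age <- toy |>
  mutate(
    # 1 when no death year is recorded, 0 otherwise
    is_alive = ifelse(is.na(year_death), 1, 0),
    # Fill the missing death year with 2024 so age is "age as of 2024"
    year_death = ifelse(is.na(year_death), 2024, year_death),
    age = year_death - year_birth
  ) |>
  arrange(desc(age))

# Factor levels follow birth year, independent of the row order by age
toy_age$person <- factor(
  toy_age$person,
  levels = toy_age$person[order(toy_age$year_birth)]
)

toy_age$age            # 80 50 44
levels(toy_age$person) # "A" "C" "B"
```

The rows end up sorted by age (C, A, B), while the factor levels stay in birth-year order (A, C, B), which is what controls the order of people on the chart's y-axis.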
What are the lifespans of historical figures born on a leap day?
To explore this question, I decided to create a dumbbell chart. In the chart, each blue dot represents a person’s birth year and each black dot represents the year that person died. The absence of a black dot indicates the person is still alive, and the grey line represents the person’s lifespan. If you hover over the dots, a tooltip with information about each person is shown.
not_alive <- data_age |> filter(is_alive == 0)

plot_ly(
  data_age, color = I("gray80"),
  text = ~paste(
    person, "<br>",
    "Age: ", age, "<br>",
    description
  ),
  hoverinfo = "text"
) |>
  add_segments(x = ~year_birth, xend = ~year_death, y = ~person, yend = ~person, showlegend = FALSE) |>
  add_markers(x = ~year_birth, y = ~person, color = I("#0000FF"), name = "Birth year") |>
  add_markers(data = not_alive, x = ~year_death, y = ~person, color = I("black"), name = "Year passed") |>
  layout(
    title = list(
      text = "<b>Lifespans of historical figures born on a Leap Day</b>",
      xanchor = "center",
      yanchor = "top",
      font = list(family = "arial", size = 24)
    ),
    xaxis = list(
      title = "Year born | Year died"
    ),
    yaxis = list(
      title = ""
    )
  )
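For readers who want a static version, the same dumbbell structure (a segment per person, a birth marker, and a death marker only for people who have died) can be sketched with ggplot2. This is not the code behind the chart above, just a rough equivalent under stated assumptions: the toy data frame and the object name p are hypothetical, and the colors simply mirror the plotly version.

```r
library(ggplot2)

# Hypothetical toy rows shaped like data_age (not the real data)
toy <- data.frame(
  person = c("A", "B", "C"),
  year_birth = c(1900, 1980, 1940),
  year_death = c(1950, 2024, 2020),
  is_alive = c(0, 1, 0)
)

p <- ggplot(toy, aes(y = person)) +
  # Grey line spanning each person's lifespan
  geom_segment(
    aes(x = year_birth, xend = year_death, yend = person),
    colour = "gray80"
  ) +
  # Blue dot at the birth year
  geom_point(aes(x = year_birth), colour = "blue") +
  # Black dot at the death year, drawn only for people who have died
  geom_point(
    data = subset(toy, is_alive == 0),
    aes(x = year_death),
    colour = "black"
  ) +
  labs(x = "Year born | Year died", y = NULL)

p
```

Unlike the plotly chart, this version has no hover tooltips, so it suits a printed or static context better than an interactive one.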
An attempt using Tableau
I also created a version of this visualization using Tableau. You can view my attempt here. I had to make a few concessions with this attempt, as I did not have as much fine control over the plot elements as I would have liked. However, I’m happy with how it turned out.
Citation
@misc{berke2024,
author = {Berke, Collin K},
title = {Exploring the Lifespans of Historical Figures Born on a
{Leap} {Day}},
date = {2024-03-05},
langid = {en}
}