Data Animation Using Plotly Express

Guha Ayan
Jan 16, 2021

What is Plotly Express

Plotly is one of the leading charting libraries. Although it grew out of a JavaScript background, Plotly has an excellent Python binding. While it supports complex manipulation of figures and visual effects, it also provides a simple API called Plotly Express, which is very useful for creating excellent visualizations with a single function call.

In this post I would like to discuss a few unconventional visualizations for analyzing or presenting data. First, let's choose a dataset.

Data Preparation

I have been following the Covid data managed by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) almost since they started publishing it. It is widely used as the go-to source of data, and I would like to take this opportunity to thank the good people at JHU who maintain this information.

I will use the Covid time series data in this post. I would also like to normalize the raw Covid figures by each country's population, for which I will use 2018 data.

import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np
import datetime, math

Read Covid data from the JHU GitHub repository

def jhuurl2df(url, column_name):
    ## Function to read a JHU Covid time series

    base_df = pd.read_csv(url)

    ## Unpivot: one row per location and date
    df = base_df.melt(id_vars=['Province/State', 'Country/Region', 'Lat', 'Long'], var_name='date', value_name=column_name)

    ## Column rename and a few basic transforms
    df['dt_name'] = pd.to_datetime(df.date)
    df['dt'] = df.dt_name.dt.strftime('%Y-%m-%d')
    df['country_name'] = df['Country/Region']

    ## Group by country and date, summing over provinces
    df = df.groupby(['country_name', 'dt']).agg({column_name: 'sum'}).reset_index()

    ## Prepare log values (0 stays 0, since log(0) is undefined)
    df['log_' + column_name] = df[column_name].apply(lambda x: 0 if x == 0 else math.log(x))
    return df

confirmed_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv'
death_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
recovered_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv'
confirmed = jhuurl2df(confirmed_url,'confirmed')
deaths = jhuurl2df(death_url,'deaths')
recovered = jhuurl2df(recovered_url,'recovered')
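The unpivot step inside jhuurl2df is easier to follow on a small example. The sketch below uses a toy frame in the same wide layout as the JHU files (one column per date); the country names and numbers are invented, and long_df is just an illustrative name.

```python
import pandas as pd

# Toy frame in the same wide layout as the JHU files; the numbers are invented
wide = pd.DataFrame({
    "Province/State": [None, None],
    "Country/Region": ["A-land", "B-land"],
    "Lat": [0.0, 1.0],
    "Long": [0.0, 1.0],
    "1/22/20": [0, 1],
    "1/23/20": [2, 3],
})

# melt() turns every date column into its own row
long_df = wide.melt(
    id_vars=["Province/State", "Country/Region", "Lat", "Long"],
    var_name="date",
    value_name="confirmed",
)

# Normalize the m/d/yy column labels to ISO date strings
long_df["dt"] = pd.to_datetime(long_df["date"], format="%m/%d/%y").dt.strftime("%Y-%m-%d")
print(long_df[["Country/Region", "dt", "confirmed"]])
```

Two countries with two date columns become four rows, one per country and date, which is the shape the animation functions expect.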

Read country related data

## Read Country Details from github
country_url = 'https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv'
country_all = pd.read_csv(country_url)
## Map Country names with JHU names
country_all['name'] = np.select(
    [
        country_all['name'].eq('United States of America'),
        country_all['name'].eq('Bolivia (Plurinational State of)'),
        country_all['name'].eq('Brunei Darussalam'),
        country_all['name'].eq('Myanmar'),
        country_all['name'].eq('Congo'),
        country_all['name'].eq('Congo, Democratic Republic of the'),
        country_all['name'].eq("Côte d'Ivoire"),
        country_all['name'].eq('Iran (Islamic Republic of)'),
        country_all['name'].eq("Lao People's Democratic Republic"),
        country_all['name'].eq('Moldova, Republic of'),
        country_all['name'].eq('Russian Federation'),
        country_all['name'].eq('Syrian Arab Republic'),
        country_all['name'].eq('Taiwan, Province of China'),
        country_all['name'].eq('Tanzania, United Republic of'),
        country_all['name'].eq('United Kingdom of Great Britain and Northern Ireland'),
        country_all['name'].eq('Venezuela (Bolivarian Republic of)'),
        country_all['name'].eq('Viet Nam'),
        country_all['name'].eq('Korea, Republic of')
    ],
    [
        'US', 'Bolivia', 'Brunei', 'Burma', 'Congo (Brazzaville)', 'Congo (Kinshasa)', "Cote d'Ivoire", 'Iran', 'Laos',
        'Moldova', 'Russia', 'Syria', 'Taiwan*', 'Tanzania', 'United Kingdom', 'Venezuela', 'Vietnam', 'Korea, South'
    ],
    default=country_all['name']
)
## Read population data
country_pop_url = 'https://raw.githubusercontent.com/datasets/population/master/data/population.csv'
country_pop = pd.read_csv(country_pop_url)
## Filter for 2018
country_pop = country_pop[(country_pop.Year == 2018)]
## Merge Country details and Population
country = country_all.merge(country_pop, left_on = 'alpha-3', right_on = 'Country Code')
country['Population'] = country['Value']
country = country[['name', 'alpha-3', 'region', 'Population']]
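A name mapping like the one above is only needed for countries whose names differ between the two sources. One way to spot such names, sketched here on toy frames with invented rows, is a left merge with indicator=True. This is an assumption about how such a list could be built, not necessarily how the list above was put together.

```python
import pandas as pd

# Toy stand-ins for the JHU names and the ISO country table (rows invented)
jhu_names = pd.DataFrame({"country_name": ["US", "Burma", "France"]})
iso = pd.DataFrame({
    "name": ["United States of America", "Myanmar", "France"],
    "alpha-3": ["USA", "MMR", "FRA"],
})

# A left merge with indicator=True flags rows that found no partner
check = jhu_names.merge(iso, how="left", left_on="country_name",
                        right_on="name", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "country_name"].tolist()
print(unmatched)
```

Any name that shows up in unmatched is a candidate for the renaming list.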

Now, merge the country data with the Covid data, and finally create the metrics for reporting. Our primary metrics will be confirmed cases per 1000 people (incidence rate) and deaths per 1000 confirmed cases (case mortality).

gdf = confirmed.merge(deaths,how='inner', on=['country_name', 'dt']).merge(recovered,how='left', on=['country_name', 'dt'])
cdr = gdf.merge(country, how='left', left_on = 'country_name', right_on = 'name')
cdr = cdr[(cdr.country_name.notnull()) & (cdr.confirmed > 0) & (cdr.region.notnull()) & (cdr.confirmed.notnull())]
cdr['confirmed_per_capita_1k'] = 1000*(cdr['confirmed']/cdr['Population'])
cdr['death_per_confirmed_1k'] = 1000*(cdr['deaths']/cdr['confirmed'])
cdr['death_per_capita_1k'] = 1000*(cdr['deaths']/cdr['Population'])
cdr['recovered_per_confirmed_1k'] = 1000*(cdr['recovered']/cdr['confirmed'])
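As a quick sanity check of the two primary metrics, here is the same arithmetic on a single hypothetical country. The figures are invented so that the results are easy to verify by hand.

```python
import pandas as pd

# Invented figures for a single hypothetical country
row = pd.DataFrame({
    "confirmed": [5000],
    "deaths": [100],
    "Population": [1_000_000],
})

# 5000 cases among 1,000,000 people is 5 cases per 1000 people
row["confirmed_per_capita_1k"] = 1000 * row["confirmed"] / row["Population"]
# 100 deaths among 5000 cases is 20 deaths per 1000 confirmed cases
row["death_per_confirmed_1k"] = 1000 * row["deaths"] / row["confirmed"]
print(row)
```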

Let us prepare two data sets: the first for time series visualization, and the second showing the metrics on one specific day.

from itertools import product
from datetime import date   # date, not datetime, so the datetime module imported earlier is not shadowed
a = range(3, 13)   # months March to December
b = [10, 20, 30]   # three sample days per month
ds = []
for p in product(a, b):
    d = date(2020, p[0], p[1])
    ds.append(d)

dss = [x.strftime('%Y-%m-%d') for x in ds]
cdrn = cdr[(cdr.dt.isin(dss))].sort_values(by=['region', 'country_name', 'dt'])
cdrp = cdr[(cdr.dt == '2020-12-31') & (cdr['name'].notnull())]

Now we are ready for visualization.

Animation

Plotly Express provides a super handy animation feature. It supports many chart types and the results look very nice. Let us explore some of my favorite chart types with animation.

Map Animation

Plotly Express has pretty nice built-in support for world maps. It supports both a generic earth view and a choropleth version. Best of all, each takes just a single function call.

fig = px.scatter_geo(cdrn,
    locations="alpha-3",
    color="region",
    size="confirmed_per_capita_1k",
    animation_frame="dt",
    projection="natural earth",
    hover_data=cdrn.columns,
    title="How Did it Spread (Confirmed per 1000 People)")
fig.show()

fig = px.choropleth(cdrn,
    locations="alpha-3",
    color="death_per_confirmed_1k",
    hover_name="country_name",
    animation_frame='dt',
    color_continuous_scale="Reds",
    template="plotly_dark",
    range_color=[0,110],
    title="Case Mortality Time Series")
fig.show()

Bar Animation

Bar charts are very useful comparison tools: extremely simple to understand, yet powerful for communication. With Plotly Express animation, turning a bar chart into a story that changes over time is extremely easy.

fig = px.bar(cdrn[(cdrn.dt >= '2020-05-01')],
    x="confirmed_per_capita_1k",
    y="country_name",
    orientation='h',
    animation_frame="dt",
    color="region",
    hover_data=cdrn.columns,
    range_x=[0,110],
    ## Title matches the plotted metric (confirmed cases per 1000 people)
    title="Confirmed Cases per 1000 People Time Series"
)
fig.show()

Scatter Animation

Scatter charts are often used to combine multiple metrics in a single chart. Typically, two metrics are plotted on the x and y axes, and a third metric is incorporated through the size of the markers.

fig = px.scatter(cdrn[(cdrn.region == 'Americas')].sort_values(by=['dt']),
    x="death_per_confirmed_1k", y="confirmed_per_capita_1k",
    size="Population",
    animation_frame="dt",
    color="country_name",
    range_x=[0,110],
    range_y=[0,70],
    hover_data=cdrn.columns,
    labels={"death_per_confirmed_1k": "Case Mortality", "confirmed_per_capita_1k": "Confirmed/1000"},
    title="Case Mortality Vs Incidence Rate Time Series (Americas)",
)
fig.show()

Facets

While the animated scatter chart is very useful by itself, I really like the facet feature of Plotly Express, which can be used in many different contexts. Here is a simple example:

fig = px.scatter(cdrn,
    x="death_per_confirmed_1k", y="confirmed_per_capita_1k",
    size="Population",
    animation_frame="dt", animation_group="country_name",
    color="region", hover_name="country_name",
    facet_col='region',
    range_x=[0,110],
    range_y=[0,70],
    hover_data=cdrn.columns,
    title="Case Mortality Vs Incidence Rate Time Series, with Region Split",
    labels={"death_per_confirmed_1k": "Case Mortality", "confirmed_per_capita_1k": "Confirmed/1000"}
)
fig.show()

Polar Bar Animation

I love polar bar charts; they are super easy to understand and look nice. At the time of writing, this is a new addition to the Plotly Express animation support and hence not yet present in the gallery. Notice the simple way to enable a log scale (log_r = True) as the numbers increase.

fig = px.bar_polar(cdrn[(cdrn.region == 'Asia')].sort_values(by=['dt']),
    r="confirmed",
    theta="country_name",
    animation_frame="dt",
    animation_group="country_name",
    color="country_name",
    color_discrete_sequence=px.colors.qualitative.G10,
    log_r=True,
    title="Comparison of Confirmed Numbers over time")
fig.show()

A couple of notes:

  1. Themes and Templates

Plotly supports a handful of templates out of the box. Here I chose the dark theme and a few other style elements as defaults. These can be set at the beginning of the program. Each of them can be overridden at the figure level, so I recommend relying heavily on defaults and modifying individual style elements per figure.

px.defaults.template = "plotly_dark"
px.defaults.color_continuous_scale = px.colors.sequential.Blackbody
px.defaults.width = 800
px.defaults.height = 600

  2. Gotchas with Animated Graphs

a. In a few cases, Plotly fixes the categories to those available in the first animation frame, and categories that first appear in later frames are ignored. This is a bit annoying when new categories appear over time. The best way to handle this is to backfill the data so that every category is present in every frame.

b. Plotly fixes the axis ranges based on the first frame and does not update them automatically. This can lead to plots going outside the charting area, so you need to provide the axis ranges upfront (range_x and range_y in the examples above). Finding good values takes a little trial and error; I suggest setting the ranges high at first and tuning them in 2–3 iterations. From the Plotly docs:

Note that you should always fix the x_range and y_range to ensure that your data remains visible throughout the animation.
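Both gotchas can be handled with a few lines of pandas before calling Plotly Express. The following is a minimal sketch on invented data; the names obs, full_index, filled and upper are mine, and the 10% headroom is an arbitrary starting point rather than a recommendation from the Plotly docs.

```python
import pandas as pd

# Invented data: category "c" only appears from the second date onwards
obs = pd.DataFrame({
    "dt": ["2020-03-10", "2020-03-10", "2020-03-20", "2020-03-20", "2020-03-20"],
    "item": ["a", "b", "a", "b", "c"],
    "value": [1, 2, 3, 5, 4],
})

# Gotcha a: build every (date, item) pair and fill the gaps with 0,
# so that every category is already present in the first frame
full_index = pd.MultiIndex.from_product(
    [sorted(obs["dt"].unique()), sorted(obs["item"].unique())],
    names=["dt", "item"],
)
filled = (obs.set_index(["dt", "item"])
             .reindex(full_index, fill_value=0)
             .reset_index())

# Gotcha b: derive the axis upper bound from the whole data set,
# not just the first frame, and leave some headroom
upper = filled["value"].max() * 1.1
print(filled)
```

The filled frame can then be passed to a call such as px.bar(filled, x="item", y="value", animation_frame="dt", range_y=[0, upper]).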

Hope you enjoyed the read. Please share your thoughts and comments.
