Data from #tidytuesday week of 2020-05-12 (source) but plotting in python.

Load modules

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Download and parse data

volcano_raw = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/volcano.csv")

volcano = volcano_raw[['primary_volcano_type', 'elevation']].sort_values(by='elevation', ascending=False)

Visualize dataset

sns.set(style="darkgrid")
plt.figure(figsize=(20,15))

p = sns.boxplot(x=volcano.elevation, y=volcano.primary_volcano_type)
p = sns.swarmplot(x=volcano.elevation, y=volcano.primary_volcano_type, color=".35")

plt.xlabel("Elevation")
plt.ylabel("")
plt.title("What is the average elevation by volcano type?",
          x=0.01, horizontalalignment="left", fontsize=20)
plt.figtext(0.9, 0.08, "by: @eeysirhc", horizontalalignment="right")
plt.figtext(0.9, 0.07, "Source: The Smithsonian Institute", horizontalalignment="right")

plt.show()

Improvements

  • Find and replace similar volcano categories such as stratovolcano which is the same as stratovolcano(es)