The differences between this unsanctioned #tardythursday and the official #tidytuesday:

  1. These will publish on Thursday (obviously)
  2. The dataset will come from a completely different week of TidyTuesday
  3. For a surprise, I’ll code with either #rstats or python (similar to #makeovermonday)

Load modules

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

Download and parse data


df=df_raw[['state_name', 'early_career_pay', 'mid_career_pay']].groupby('state_name').mean().reset_index()

Visualize dataset


g=sns.regplot(x="early_career_pay", y="mid_career_pay", data=df)

for line in range(0,df.shape[0]):
     g.text(df.early_career_pay[line]+0.01, df.mid_career_pay[line], 
     df.state_name[line], horizontalalignment='left', 
     size='medium', color='black')
plt.xlabel("Early Career Pay")
plt.ylabel("Mid Career Pay")
plt.title("Average Salary Potential by State: Early vs Mid Career",
      x=0.01, horizontalalignment="left", fontsize=16)
plt.figtext(0.9, 0.09, "by: @eeysirhc", horizontalalignment="right")
plt.figtext(0.9, 0.08, "Source:", horizontalalignment="right")

Now, all that is left is to find something catchy for the other days of the week - lol.