TidyTuesday: Cocktails pt.2

This is part 2 of TidyTuesday: Cocktails. Below shows how we can use #rstats to write a cocktail recommendation system that takes in a drink and returns a few other cocktails based on similarly mixed ingredients. Load libraries library(tidyverse) library(recommenderlab) Download and parse data Note: please check out part 1 for deatils on processing steps bc_raw <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-26/boston_cocktails.csv') bc <- bc_raw %>% mutate(ingredient = str_to_lower(ingredient)) %>% distinct() %>% select(name, ingredient) bc_tidy <- bc %>% filter(!...

May 28, 2020 · Christopher Yee

TidyTuesday: Cocktails

Data from #tidytuesday week of 2020-05-26 (source) If you are looking for the R script then you can find it here Load packages library(tidyverse) library(ggrepel) library(FactoMineR) Download data bc_raw <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-26/boston_cocktails.csv') Data processing Standardize cases bc_raw %>% count(ingredient, sort = TRUE) %>% filter(str_detect(ingredient, "red pepper sauce")) ## # A tibble: 2 x 2 ## ingredient n ## <chr> <int> ## 1 Hot red pepper sauce 4 ## 2 hot red pepper sauce 1 Let’s fix that by making all ingredient values to lower case:...

May 26, 2020 · Christopher Yee

TidyTuesday: Volcano Eruptions (python)

Data from #tidytuesday week of 2020-05-12 (source) but plotting in python. Load modules import pandas as pd import matplotlib.pyplot as plt import seaborn as sns Download and parse data volcano_raw = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/volcano.csv") volcano = volcano_raw[['primary_volcano_type', 'elevation']].sort_values(by='elevation', ascending=False) Visualize dataset sns.set(style="darkgrid") plt.figure(figsize=(20,15)) p = sns.boxplot(x=volcano.elevation, y=volcano.primary_volcano_type) p = sns.swarmplot(x=volcano.elevation, y=volcano.primary_volcano_type, color=".35") plt.xlabel("Elevation") plt.ylabel("") plt.title("What is the average elevation by volcano type?", x=0.01, horizontalalignment="left", fontsize=20) plt.figtext(0.9, 0.08, "by: @eeysirhc", horizontalalignment="right") plt....

May 12, 2020 · Christopher Yee

TidyTuesday: Animal Crossing

Data from #tidytuesday week of 2020-05-05 (source) Load packages library(tidyverse) library(ggfortify) Download data villagers_raw <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-05/villagers.csv') Process data villagers <- villagers_raw %>% select(gender, species, personality) %>% mutate(species = str_to_title(species)) %>% group_by(gender, species, personality) %>% summarize(n = n()) %>% mutate(pct_total = n / sum(n)) %>% ungroup() Visualize data villagers %>% ggplot(aes(personality, pct_total, fill = gender, color = gender, group = gender)) + geom_polygon(alpha = 0.5) + geom_point() + coord_polar() + facet_wrap(~species) + labs(x = NULL, y = NULL, color = NULL, fill = NULL, title = "Animal Crossing: villager personality traits by species & gender", caption = "by: @eeysirhc\nsource:VillagerDB") + theme_bw() + theme(legend....

May 6, 2020 · Christopher Yee

TidyTuesday: Beer Production

Data from #tidytuesday week of 2020-03-31 (source) Load packages library(tidyverse) library(gganimate) library(gifski) Download data beer_states_raw <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_states.csv") Clean data beer_total <- beer_states_raw %>% # FILL NULL VALUES WITH 0 replace(., is.na(.), 0) %>% # REMOVE LINE ITEM FOR 'TOTAL' filter(state != 'total') %>% # COMPUTE TOTAL BARRELS PER YEAR BY STATE group_by(year, state) %>% summarize(total_barrels = sum(barrels)) %>% ungroup() Create rankings beer_final <- beer_total %>% group_by(year) %>% mutate( # CALCULATE RANKINGS BY TOTAL BARRELS PRODUCED EACH YEAR rank = min_rank(-total_barrels) * 1....

April 14, 2020 · Christopher Yee