Visualizing intraday SEM performance with R

Aside from the base bid, Google SEM campaign performance can be influenced by contextual signals from the customer. These include device, location, gender, parental status, and household income, among others. For this post we’ll focus on ad schedule (or intraday) and visualize how time of day and day of week are performing.
Load data
library(tidyverse)
# ANONYMIZED SAMPLE DATA
df <- read_csv("https://raw.githubusercontent.com/Eeysirhc/random_datasets/master/intraday_performance.csv")
Spot check our data
df %>% sample_n(20)
## # A tibble: 20 x 5
##   account day_of_week hour_of_day roas conv_rate
##   <chr> <chr> <dbl> <dbl> <dbl>
## 1 Account 3 Tuesday 5 0....

December 23, 2019 · Christopher Yee

Code Answers to SQL Murder Mystery

Pretty fun murder mystery from @knightlab - can you find the killer using #SQL? https://t.co/vXcMtY2b1c
- Christopher Yee (@Eeysirhc) December 20, 2019
CLUE #1
There was a murder in SQL City on 2018-01-15.
select * from crime_scene_report where type = 'murder' and city = 'SQL City' and date = '20180115'
CLUE #2
Witness 1 lives in the last house on Northwestern Dr. Witness 2 is named Annabel and lives somewhere on Franklin Ave....

December 21, 2019 · Christopher Yee

TidyTuesday: Adoptable Dogs

Data from #tidytuesday week of 2019-12-17 (source)
Quick post to showcase the amazing {reticulate} package which has made my life so much easier! Who said you had to choose between R and Python?
Load packages
library(tidyverse)
library(reticulate)
R then Python
Grab and parse data
df_rdata <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-12-17/dog_moves.csv")
df_rdata <- df_rdata %>% filter(inUS == 'TRUE') %>% select(location, total)
df_rdata %>% head()
## # A tibble: 6 x 2
##   location total
##   <chr> <dbl>
## 1 Texas 566
## 2 Alabama 1428
## 3 North Carolina 2627
## 4 South Carolina 1618
## 5 Georgia 3479
## 6 California 1664
Plot data
import pandas as pd
import seaborn as sns
import matplotlib....

December 17, 2019 · Christopher Yee

Calculating & estimating annual salaries with R

A couple of weeks ago, a friend asked me about my base annual salary during my time as Square’s SEO Lead. Rather than spitting out a number, I thought it would be more interesting to see if we could answer her question using #rstats.
tl;dr
This is what I posted on Twitter: Ok #bayesian twitter: helping a friend with salary negotiations and this incorporates what she wants, job boards, confirmed salaries, etc……how do I validate if this model is a load of crock or not?...

November 26, 2019 · Christopher Yee

Connect R to Amazon Redshift Database

This is a quick technical post for anyone who needs full CRUD capabilities to retrieve their data from a Redshift table, manipulate it in #rstats, and send it all back up again.
Dependencies
Load libraries
library(tidyverse)
library(RPostgreSQL) # INTERACT WITH REDSHIFT DATABASE
library(glue) # FORMAT AND INTERPOLATE STRINGS
Amazon S3
For this data pipeline to work you’ll also need the AWS command line interface installed.
# RUN THESE COMMANDS INSIDE TERMINAL
brew install awscli
aws configure
# ANSWER QUESTIONS access / secret / zone
Read data
Set connection
You’ll need to substitute your own database credentials below:...

October 24, 2019 · Christopher Yee

R functions for simulation, sampling & visualization

In my previous article about simulating page speed data, I broke one of the cardinal rules in programming: don’t repeat yourself. There was a reason for this: I wanted to show what is going on under the hood and the theoretical concepts associated with them before using other functions in R. For this follow-up, I’ll highlight a few #rstats shortcuts that will make your life easier when generating and exploring simulated data....

October 16, 2019 · Christopher Yee

Simulating data to explore page speed performance

We may be inundated with data but sometimes collecting it can be a challenge in and of itself. A few reasons off the top of my head:
- Sparsity
- Difficult to measure
- Impractical to devote company resources to it
- Lack of technical expertise to actually build or acquire it
- Lazy (yours truly - except for that one time)
Through simulation we can generate our own dataset with the added benefit of fully understanding what features we choose to put in our models (or leave out)....

September 23, 2019 · Christopher Yee

Find your favorite Twitter user with the rtweet package

Do you know who your favorite person on Twitter is? Probably! Did you ever want to quantify that statement? Probably not! Are you curious to find out who someone else’s favorite Twitter user is? Now you can with R! The code below is brought to you by Namita and her hilarious tweet: face some possibly uncomfortable truths about yourself and others with 4 easy lines of code using #rtweet and the #tidyverse pic....

August 25, 2019 · Christopher Yee