Data Visualization

[Updated] Top Industries from Inc.5000 Companies

Changelog

- Originally published on September 10th, 2019
- Built a Shiny app for this
- Full code can be found on GitHub

One of my favorite online marketers, (the) Glen Allsopp, tweeted the following:

Over the past few weeks I've went through every site in the Inc. 5000. My mind has been blown multiple times. Don't click if you're easily distracted. Enjoy! https://t.co/mHVK8rvb9X pic.twitter.com/BoEb3qQ7LZ
Glen Allsopp (@ViperChill), August 27, 2019

The public spreadsheet contains four fields: ...

Visualizing intraday SEM performance with R

Aside from the base bid, Google SEM campaign performance can be influenced by contextual signals from the customer. These include, but are not limited to, device, location, gender, parental status, and household income. For this post we’ll focus on ad schedule (or intraday) and visualize how time of day and day of week are performing.

Load data

    library(tidyverse)

    # ANONYMIZED SAMPLE DATA
    df <- read_csv("https://raw.githubusercontent.com/Eeysirhc/random_datasets/master/intraday_performance.csv")

Spot check our data

    df %>% sample_n(20)

    ## # A tibble: 20 x 5
    ##    account   day_of_week hour_of_day   roas conv_rate
    ##    <chr>     <chr>             <dbl>  <dbl>     <dbl>
    ##  1 Account 3 Tuesday               5 0.509    0.0183
    ##  2 Account 2 Friday                4 1.11     0.0401
    ##  3 Account 2 Sunday               11 1.07     0.0309
    ##  4 Account 3 Saturday             18 1.09     0.0301
    ##  5 Account 1 Thursday             19 0.303    0.0165
    ##  6 Account 1 Tuesday               8 0.362    0.0230
    ##  7 Account 2 Saturday              4 0.722    0.0340
    ##  8 Account 3 Friday               10 0.653    0.00844
    ##  9 Account 2 Wednesday             8 0.448    0.0262
    ## 10 Account 1 Saturday              9 0.858    0.0467
    ## 11 Account 1 Saturday             18 0.266    0.0136
    ## 12 Account 1 Saturday              8 0.871    0.0349
    ## 13 Account 2 Friday               14 0.546    0.0196
    ## 14 Account 1 Sunday                5 0.0444   0.00889
    ## 15 Account 3 Wednesday            21 0.530    0.0248
    ## 16 Account 1 Tuesday              16 0.801    0.0451
    ## 17 Account 2 Monday                2 0.884    0.0230
    ## 18 Account 2 Wednesday            19 0.772    0.0275
    ## 19 Account 3 Monday               21 0.444    0.0367
    ## 20 Account 1 Tuesday               3 0        0

Clean data

Convert to factors

The day_of_week column is a character and hour_of_day is a double data type. We need to transform them to factors so they don’t surprise us later. ...
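The excerpt cuts off before the conversion itself, so here is a minimal sketch of what that step could look like. It uses base R and a three-row stand-in for `df` copied from the spot check; the Monday-first ordering of the levels is my assumption, not necessarily what the full post uses.

```r
# Stand-in for `df`: three rows copied from the spot-check output
df <- data.frame(
  account     = c("Account 3", "Account 2", "Account 1"),
  day_of_week = c("Tuesday", "Friday", "Sunday"),
  hour_of_day = c(5, 4, 5),
  roas        = c(0.509, 1.11, 0.0444),
  conv_rate   = c(0.0183, 0.0401, 0.00889),
  stringsAsFactors = FALSE
)

# Assumed ordering: a Monday-first week (the original post may differ)
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

# Explicit levels stop R from sorting the days alphabetically
df$day_of_week <- factor(df$day_of_week, levels = day_levels)

# Hours run 0-23; listing every level keeps empty hours on the axis
df$hour_of_day <- factor(df$hour_of_day, levels = 0:23)

str(df)
```

Setting the levels explicitly is the point here: without them, a plot would order the days alphabetically (Friday, Monday, Saturday, ...) rather than in calendar order.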

Visualizing Netflix viewing activity

If you are like me then it’s very likely you share your Netflix account with multiple users. If you are also like me then it’s very likely you would be curious about how your Netflix viewing activity compares and contrasts to all the parasites on your account! In this post we’ll leverage #rstats to visualize what that looks like.

Load packages

Let’s fire up our favorite packages.

    library(tidyverse)
    library(lubridate)
    library(igraph)
    library(ggraph)
    library(tidygraph)
    library(influenceR)

Download data

With the exception of my own viewing activity (I’m not ashamed!), I have provided anonymized Netflix viewing data from a few family members and friends for you to follow along. ...

Mining Google Trends data with R

Google Trends is great for understanding relative search popularity for a given keyword or phrase. However, if we want to explore the topics further, it is quite clunky to retrieve that data from the web interface. Enter the gtrendsR package for #rstats. What better way to demonstrate how it works than by pulling search popularity for ramen, pho, and spaghetti (hot on the heels of my last article about ramen ratings)! ...

Text Mining the Redacted Mueller Report

After two politically charged years, Robert Mueller finally concluded his investigation into Russian interference in the 2016 presidential election. The outcome was a 440+ page report on the findings: the perfect candidate for some text mining.

Side note: the idea for this post came when my attempts to extract the PDF text proved unsuccessful because it was locked in an unsearchable version. As a consequence, I did a little tweet mining instead:

too busy to read all 400+ pages of the #muellerreport but apparently not busy enough to do some text/tweet mining with #rstats according to this network graph though I should definitely check out page 290 pic.twitter.com/RPiahsrg9A ...

Hello, can we stop using pie charts?

I came across this tweet and its corresponding graph a few days ago:

Did you know? 🧐 1‐word keywords account for only 2.8% of all the keywords people search for in the United States. pic.twitter.com/GXdfttn3jk
Tim Soulo (@timsoulo), February 21, 2019

I love ahrefs and all but it’s 2019 - WHY ARE WE STILL USING PIE CHARTS?! I’ll spare you my opinion since there is already a ton of literature out there, but here are a few to get started: ...

For the Love of Data, Segment!

Aggregated data is misleading. Let’s read that again: aggregated data is misleading. Why? Because the homogenized set buries the meaningful insights. For example, I recently came across a competitive SEO analysis that examined the relationship between the number of ranking organic keywords and the estimated traffic from organic search for a handful of websites. In my opinion, this is a great start for understanding the opportunity size of a market and how a given business stacks up against its competitors. ...
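To make the point concrete, here is a toy sketch in base R. Every number below is made up for illustration (two hypothetical websites, `site_a` and `site_b`) and none of it comes from the analysis mentioned above. Within each site, more ranking keywords goes hand in hand with more traffic, yet the pooled data suggests the opposite.

```r
# Hypothetical data: two made-up websites, three observations each
seo <- data.frame(
  site     = rep(c("site_a", "site_b"), each = 3),
  keywords = c(1, 2, 3, 6, 7, 8),    # ranking organic keywords (arbitrary units)
  traffic  = c(10, 11, 12, 1, 2, 3)  # estimated organic traffic (arbitrary units)
)

# Aggregated view: one correlation across all rows
pooled_cor <- cor(seo$keywords, seo$traffic)

# Segmented view: one correlation per website
segment_cor <- sapply(split(seo, seo$site),
                      function(d) cor(d$keywords, d$traffic))

print(pooled_cor)   # negative for this toy data
print(segment_cor)  # positive within each site
```

The aggregate hides the fact that the two sites sit at very different baselines, which is exactly the kind of insight segmenting brings back.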

Data Viz: Top Marketing Words in LinkedIn Job Titles

I abhor tabulated data for a number of reasons:

- Quite difficult on the human eye to spot trends
- Puts a burden on the end user to spend extra time digesting the information
- True insights get lost because the devil is in the details

In fact, individuals who join Square’s SEO team (I’m hiring by the way) are required to read this book on how to visualize data before making any presentations - an SEO bible, if you will. ...