Christopher Yee

TidyTuesday: Thanksgiving Dinner

Analyzing data for #tidytuesday week of 11/20/2018 (source)

# LOAD PACKAGES AND PARSE DATA
library(tidyverse)
library(scales)
library(RColorBrewer)
library(forcats)

thanksgiving_raw <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2018/2018-11-20/thanksgiving_meals.csv")

thanksgiving <- thanksgiving_raw %>%
  filter(celebrate != 'No')

What are the most popular pies for Thanksgiving?

thanksgiving %>%
  select(pie1:pie13) %>%
  pivot_longer(pie1:pie13, names_to = "pie_type") %>%
  filter(value != 'None') %>%
  select(value) %>%
  group_by(value) %>%
  count() %>%
  filter(n > 10) %>%
  ungroup() %>%
  ggplot(aes(reorder(value, n), n, label = n)) +
  geom_bar(aes(fill = value), alpha = 0.9, stat='identity') +
  coord_flip() +
  theme_classic() +
  theme(legend.position = 'none') +
  labs(title = "Most Popular Pies for Thanksgiving (n=980)",
       subtitle = "Question: Which type of pie is typically served at your Thanksgiving dinner? \n Please select all that apply",
       x = "",
       y = "")

...

For the Love of Data, Segment!

Aggregated data is misleading. Let’s read that again: aggregated data is misleading. Why? Because the homogenized set buries the meaningful insights. For example, I recently came across a competitive SEO analysis that examined the relationship between the number of ranking organic keywords and the estimated traffic from organic search for a handful of websites. In my opinion, this is a great start for understanding the opportunity size of a market and how a given business stacks up against its competitors. ...

Data Viz: Top Marketing Words in LinkedIn Job Titles

I abhor tabulated data for a number of reasons:

- Quite difficult on the human eye to spot trends
- Puts a burden on the end user to spend extra time digesting the information
- True insights get lost because the devil is in the details

In fact, individuals who join Square’s SEO team (I’m hiring by the way) are required to read this book on how to visualize data before making any presentations - an SEO bible, if you will. ...

On Innovation (Mini-Rant)

TL;DR: implementing a system to “drive” (e.g. creating a team or process) and “celebrate” (e.g. recognition awards, because let’s be real, those are just popularity contests) innovation is counter-productive to progress. I firmly believe innovation can only be fostered via company culture, environment, public forum, mind share, etc.

What is innovation? I define it as the application of an idea in a useful and novel way to ensure an entity remains relevant while fundamentally changing the way it is perceived. ...

TF-IDF Explained: With Help From US Presidents

TF-IDF, or Term Frequency-Inverse Document Frequency, has long been utilized by search engines to score and rank a document’s relevance for any given search query. In spite of this, I think it continues to be a misunderstood or under-the-radar concept in the broader SEO world because 1) “keyword density” is much easier to explain and 2) it reads like word salad the very first time you encounter it. ...

Moz: Put Your Money Where Your [Diversity] Mouth Is

I attended MozCon 2017 last month, where it’s always a blast to reconnect with old colleagues and make new friends. That being said, this isn’t going to be your typical feel-good post or conference recap. Instead, it’s going to be an observation about conference diversity, specifically at MozCon.

CONTEXT

I attended my first MozCon back in 2013 with SEOgadget (now BuiltVisible), which was also Moz’s first time hosting the conference at the Washington State Convention Center in Seattle. I don’t recall the exact numbers but I’d venture a guess it was anywhere between 800 and 1K attendees. It wasn’t so large that you’d get lost in the crowd, and it still felt intimate. ...

Extracting Links from a Page with Ruby and Nokogiri

Scraper is a pretty good Chrome extension I use on a regular basis to quickly extract links from a page. Unfortunately, there can be rare instances where it actually takes more effort to use. For example, if I wanted to retrieve all links from Hewlett-Packard’s HTML sitemap, I would need to create multiple Google spreadsheets to capture that data because of the way the page is structured. In this particular case, I’d have to scrape the page a total of 14 times to account for the different sections. ...

A Year of Webkit2png

When I joined SEOgadget last year, my first blog post was about using webkit2png for site audits, stalking and more. What I didn’t mention was my 2013 New Year’s resolution: to track the home page of three websites for the entire year with webkit2png. The following videos come from the home pages of Macy’s, Yahoo and Amazon, with a year’s worth of images compiled together. It’s nothing too crazy but feel free to turn on your favorite jam, sit back, relax and view them for your pleasure. ...