For the Love of Data, Segment!

by Christopher Yee on August 28, 2018

Aggregated data is misleading.

Let’s read that again: aggregated data is misleading.

Why? Because a homogenized data set buries the meaningful insights.

For example, I recently came across a competitive SEO analysis that examined the relationship between the number of ranking organic keywords and the estimated traffic from organic search for a handful of websites.  In my opinion, this is a great start toward understanding the opportunity size of a market and how a given business stacks up against its competitors.

In that deliverable, the visualized data looked something like the chart below (not the same websites; these are used for illustrative purposes):

The report went on to declare Coursera (again, not the actual industry) the primary SEO competitor, then spent the remainder breaking down what they are doing well and not so well before spitting out a list of “action items.”
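To make the opening claim concrete, here is a minimal Ruby sketch with invented numbers: two hypothetical sites (site_a and site_b) that look identical once their keywords and traffic are totaled, yet behave very differently when split into segments. The site names, the segment labels, and every figure are assumptions for illustration, not data from the report described above.

```ruby
# Invented rows for illustration only: one row per site and segment.
# What defines a "segment" is left open here; it is an assumption.
rows = [
  { site: "site_a", segment: "segment_1", keywords: 9_000, traffic: 9_000 },
  { site: "site_a", segment: "segment_2", keywords: 1_000, traffic: 41_000 },
  { site: "site_b", segment: "segment_1", keywords: 2_000, traffic: 4_000 },
  { site: "site_b", segment: "segment_2", keywords: 8_000, traffic: 46_000 }
]

# Aggregated view: one total per site.
totals = rows.group_by { |r| r[:site] }.transform_values do |site_rows|
  {
    keywords: site_rows.sum { |r| r[:keywords] },
    traffic:  site_rows.sum { |r| r[:traffic] }
  }
end

totals.each do |site, t|
  puts "#{site}: #{t[:keywords]} keywords, #{t[:traffic]} visits"
end

# Segmented view: estimated visits per ranking keyword for each segment.
per_keyword = rows.map do |r|
  [r[:site], r[:segment], r[:traffic].to_f / r[:keywords]]
end

per_keyword.each do |site, segment, ratio|
  puts format("%s / %s: %.1f visits per keyword", site, segment, ratio)
end
```

In aggregate both sites show 10,000 keywords and 50,000 visits; only the per-segment view reveals where the traffic actually comes from.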

Read more…


Data Viz: Top Marketing Words in LinkedIn Job Titles

by Christopher Yee on March 10, 2018

I abhor tabulated data for a number of reasons:

  1. Quite difficult on the human eye to spot trends
  2. Puts a burden on the end user to spend extra time digesting the information
  3. True insights get lost because the devil is in the details

In fact, individuals who join Square’s SEO team (I’m hiring by the way) are required to read this book on how to visualize data before making any presentations – an SEO bible, if you will.

Which is why, when I saw Eli publish this article summarizing top marketing words in LinkedIn job titles, I found it the perfect opportunity to build upon the data he gathered.  So, what does that look like?

In terms of total job postings by city, NYC comes out on top, followed by Los Angeles, with the San Francisco Bay Area a close third.

Read more…

On Innovation (Mini-Rant)

by Christopher Yee on February 19, 2018

TL;DR: implementing a system to “drive” innovation (e.g. creating a team or process) and “celebrate” it (e.g. recognition awards which, let’s be real, are just popularity contests) is counter-productive to progress.  I firmly believe innovation can only be fostered via company culture, environment, public forum, mind share, etc.

What is innovation?  I define it as the application of an idea in a useful and novel way to ensure an entity remains relevant while fundamentally changing the way it is perceived.

I’d go so far as to claim these processes are proposed by innovation/management consultants to attract clientele, which completely flies in the face of the spirit of innovation.  What they create instead is a workforce addicted to the dopamine rush of putting a shiny award on their desk.  This ultimately means we only look at what we’re not currently doing and what others are doing, and then copy them.  That is not innovation – that is incrementality (which is still important for the business).

Based on purely anecdotal evidence, I believe innovation is born from five factors:

Read more…

TF-IDF Explained: With Help From US Presidents

by Christopher Yee on September 25, 2017

TF-IDF, or Term Frequency-Inverse Document Frequency, has long been utilized by search engines to score and rank a document’s relevance for any given search query.  In spite of this, I think it continues to be a misunderstood or under-the-radar concept in the broader SEO world because 1) “keyword density” is much easier to explain and 2) the definition reads like a word salad the very first time you see it.

With help from past US Presidents and their State of the Union addresses, I’ll attempt to explain this numerical statistic in its simplest form.

The Basics

At its core, TF-IDF is used to quantify how important a word is to a document when compared with a larger collection of text.  It gives less weight to a word that is used frequently across a known text corpus, and more weight to a word that is used rarely across it.  The beauty of this calculation is that it efficiently discounts commonly used words like “the”, “but”, “for”, etc. while distilling the document down to its primary lexical components.
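The weighting described above can be sketched in a few lines of Ruby. This is a minimal illustration, assuming the common textbook form (term frequency multiplied by log(N / number of documents containing the term)); the post does not say which variant it uses, and the three toy documents and helper names (tokenize, tf, idf, tf_idf) are made up for the example.

```ruby
# Toy corpus: three invented "documents" (not real speech text).
docs = {
  "doc_a" => "the people of america want jobs and the people want lower tax",
  "doc_b" => "the world is watching and the world expects peace",
  "doc_c" => "the union of the states is strong"
}

# Split a document into lowercase word tokens.
def tokenize(text)
  text.downcase.scan(/[a-z']+/)
end

# Term frequency: the share of a document's tokens that are `term`.
def tf(term, tokens)
  tokens.count(term).to_f / tokens.size
end

# Inverse document frequency: log(N / documents containing the term).
# Only called for terms that occur in the corpus, so the divisor is never 0.
def idf(term, tokenized_docs)
  containing = tokenized_docs.count { |tokens| tokens.include?(term) }
  Math.log(tokenized_docs.size.to_f / containing)
end

def tf_idf(term, tokens, tokenized_docs)
  tf(term, tokens) * idf(term, tokenized_docs)
end

tokenized = docs.transform_values { |text| tokenize(text) }
all_docs = tokenized.values

# Print the three highest-scoring terms for each document.
tokenized.each do |name, tokens|
  scores = tokens.uniq.map { |term| [term, tf_idf(term, tokens, all_docs)] }
  top = scores.max_by(3) { |_term, score| score }
  puts "#{name}: #{top.map { |term, score| format('%s=%.3f', term, score) }.join(', ')}"
end
```

Because “the” appears in every toy document, its inverse document frequency is log(3/3) = 0, so it drops out no matter how often it is used.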

For example, the SEO of yesteryear dictated: “if you want to rank for that keyword then you need to mention it X times on the page.”  This obviously isn’t the case but let’s run with this as a working example.

If we take the State of the Union addresses from George Washington, Abraham Lincoln, Dwight D. Eisenhower, Bill Clinton, George W. Bush and Barack Obama, we can plot out their term frequencies to get something like this:
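Counting raw term frequencies is the simpler half of the calculation. A rough Ruby sketch follows, assuming a short stand-in sentence rather than a real address and a deliberately tiny stop-word list; both are invented for illustration.

```ruby
# Hypothetical stand-in text; a real run would load each address from a file.
speech = "The American people want jobs. Jobs for the people, " \
         "and tax relief for American families."

# A tiny illustrative stop-word list (an assumption; real lists are far longer).
stop_words = %w[the for and a of to in]

tokens = speech.downcase.scan(/[a-z']+/)

# Count every remaining word after dropping the stop words.
counts = tokens.reject { |t| stop_words.include?(t) }
               .each_with_object(Hash.new(0)) { |t, h| h[t] += 1 }

# Sort by count (descending), breaking ties alphabetically for stable output.
top_terms = counts.sort_by { |term, n| [-n, term] }.first(3)

top_terms.each { |term, n| puts "#{term}: #{n}" }
```

Raw counts like these are what a “mention it X times” rule looks at; they say nothing about how common a word is across the other documents.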

Common words among the last three US Presidents? America, American(s), people, tax and jobs.  “World” is a high-frequency term after Eisenhower, presumably because US foreign policy placed more emphasis on the international realm post-WW2 rather than holding to its reclusive pre-war stance.

Read more…

Moz: Put Your Money Where Your [Diversity] Mouth Is

by Christopher Yee on August 13, 2017

I attended MozCon 2017 last month and, as always, it was a blast to reconnect with old colleagues and make new friends.  That being said, this isn’t going to be your typical feel-good post or conference recap.  Instead, it’s going to be an observation about conference diversity, specifically at MozCon.


I attended my first MozCon back in 2013 with SEOgadget (now BuiltVisible), which was also Moz’s first time hosting the conference at the Washington State Convention Center in Seattle.  I don’t recall the exact numbers but I’d venture a guess it was anywhere between 800-1K attendees.  It wasn’t so large that you’d get lost in the crowd; it still felt intimate.

Intimate to the point where you would be acutely aware of the fact that you’re a minority – not something I was used to as a San Francisco native.  In fact, the only other Asian I saw was Stephanie Chang who was working for Distilled at the time.  Why do I bring this up?  Because in 2013, the SEO industry was just starting to go “mainstream” so the demographic makeup naturally skewed toward the White Male, making it easier to remember other Asians.

Read more…

Extracting Links from a Page with Ruby and Nokogiri

by Christopher Yee on February 13, 2014

Scraper is a pretty good Chrome extension I use on a regular basis to quickly extract links from a page. Unfortunately, there can be rare instances where it actually takes more effort to use.

For example, if I wanted to retrieve all links from Hewlett-Packard’s HTML sitemap, I would need to create multiple Google spreadsheets to capture that data because of the way the page is structured. In this particular case, I’d have to scrape the page a total of 14 times to account for the different sections.


Read more…

A Year of Webkit2png

by Christopher Yee on January 1, 2014

When I joined SEOgadget last year, my first blog post was about using webkit2png for site audits, stalking and more.  What I didn’t mention was my 2013 New Year’s resolution – to track the home page of three websites for the entire year with webkit2png.

The following videos come from the home pages of Macy’s, Yahoo and Amazon with a year’s worth of images compiled together.  It’s nothing too crazy but feel free to turn on your favorite jam, sit back, relax and view them for your pleasure.

Enjoy and have an amazing 2014!  =]

Read more…

Crayon Syntax Highlighter Themes

by Christopher Yee on October 21, 2013

If you write a technical blog post about optimizing source code for SEO or programming scripts, I highly recommend the Crayon Syntax Highlighter for WordPress users – it gives your examples a nice, snazzy look.  The plugin includes 25 default themes but I couldn’t find a good preview gallery for them anywhere so I decided to list them all out below.  Enjoy!


Ado

This is the "Ado" theme.

Arduino Ide

This is the "Arduino Ide" theme.

Cg Cookie

This is the "Cg Cookie" theme.

Classic

This is the "Classic" theme.

Eclipse

This is the "Eclipse" theme.

Read more…

Updated aHrefs Link Analysis Script

by Christopher Yee on March 18, 2013

I updated my aHrefs bulk link analysis script to improve its functionality by adding two features.

  1. The script now writes the results to a CSV file called ahrefs_results.csv
  2. It uses Ruby’s Enumerable#map method for a “cleaner” syntax
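The two changes above can be pictured with a short sketch. This is not the script from the repository: the result hashes and column names (domain, backlinks, ref_domains) are hypothetical stand-ins, and the aHrefs API calls are deliberately left out. Only the output file name comes from the post.

```ruby
require "csv"

# Hypothetical per-domain results; the real script would fetch these
# from the API, which is not shown here.
results = [
  { domain: "example.com", backlinks: 120, ref_domains: 35 },
  { domain: "example.org", backlinks: 48,  ref_domains: 12 }
]

# .map turns each result hash into one CSV row without a manual accumulator.
rows = results.map { |r| [r[:domain], r[:backlinks], r[:ref_domains]] }

# Write a header row followed by one line per domain.
CSV.open("ahrefs_results.csv", "w") do |csv|
  csv << %w[domain backlinks ref_domains]
  rows.each { |row| csv << row }
end
```

Compared with pushing into an empty array inside an each loop, map returns the transformed rows directly, which is presumably the “cleaner” syntax the update refers to.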

The source code for this Ruby script can be found at my GitHub repository.

My next task is defining individual functions to eliminate any code redundancy and ultimately speed up the API calls.  Stay tuned!

Joining the SEOgadget Family

by Christopher Yee on February 28, 2013

This post is super late but, if you didn’t know already, I ended my short tenure at Macy’s earlier this month and joined the SEOgadget family!

You can read my first SEOgadget blog post here.

I’m helping Laura Lippay expand the US office so I’ll be getting a taste of both agency and startup life.  Business is already booming and I’ve got so much work ahead that we are looking to hire another Organic Search Strategist.  Yes, that’s right – I need a partner in crime!

Read more…