A few months ago I tweeted a complex (and tedious) Excel formula on how to classify keywords:

I then ended it with:

In hindsight the third comment could be interpreted as “gud luk lulz!”

For this walkthrough we’ll use the {fuzzyjoin} #rstats package to replicate the aforementioned Excel method. In a future post we’ll build a neural net instead to achieve this for 1M+ keywords.

``library(tidyverse)``

``df <- read_csv("https://raw.githubusercontent.com/Eeysirhc/random_datasets/master/keywords_digital_marketing.csv")``

## Inspect the data

``df``
``````## # A tibble: 29 x 1
##    Keyword
##    <chr>
##  1 digital marketing agency
##  2 what is digital marketing
##  3 digital marketing jobs
##  4 digital marketing salary
##  5 digital marketing manager
##  6 digital marketing course
##  7 digital marketing strategy
##  8 digital marketing services
##  9 digital marketing specialist salary
## 10 digital marketing certificate
## # … with 19 more rows``````

## Introducing {fuzzyjoin}

Fuzzyjoin is an amazing package “that allows matching not just on values that match between columns, but on inexact matching.”

This is exactly what we need to identify and match keyword strings together so let’s first load the package.

``````# install.packages("fuzzyjoin")
library(fuzzyjoin)``````

To understand how this works we’ll start small by building out the consideration identifier segment.

``````consideration <- as_tibble(c("what is", "blog", "specialist",
"near me", "agency", "compan",
"service", "example")) %>%
rename(consideration = value)``````

Then we (fuzzy) match the two dataframes with the regex_left_join function to produce the following result:

``````regex_left_join(df, consideration, by = c("Keyword" = "consideration")) %>%
replace(., is.na(.), "-") # REPLACE NA VALUES``````
``````## # A tibble: 29 x 2
##    Keyword                             consideration
##    <chr>                               <chr>
##  1 digital marketing agency            agency
##  2 what is digital marketing           what is
##  3 digital marketing jobs              -
##  4 digital marketing salary            -
##  5 digital marketing manager           -
##  6 digital marketing course            -
##  7 digital marketing strategy          -
##  8 digital marketing services          service
##  9 digital marketing specialist salary specialist
## 10 digital marketing certificate       -
## # … with 19 more rows``````

Not bad - we were able to assign each keyword to the respective consideration segment while ignoring those that do not apply.

Now let’s combine everything for the other categories…

``````# TRANSACTIONAL
transactional <- as_tibble(c("agency", "compan", "consult",
"service")) %>%
rename(transactional = value)

# EVALUATION
evaluation <- as_tibble(c("blog", "consult", "agency",
"compan", "service", "example")) %>%
rename(evaluation = value)

# CONSIDERATION
consideration <- as_tibble(c("what is", "blog", "specialist",
"near me", "agency", "compan",
"service", "example", "strategy")) %>%
rename(consideration = value)

# AWARENESS
awareness <- as_tibble(c("what is", "tool", "definition",
"channel", "near me", "blog",
"course", "new", "trend", "tip",
"strategy")) %>%
rename(awareness = value)

# OUT OF FUNNEL
out_of_funnel <- as_tibble(c("degree", "institute", "course",
"certif", "skill", "for dummi",
"training", "quote", "job",
"salary", "intern", "manager",
"resume", "analyst", "strategist",
"director", "specialist")) %>%
rename(out_of_funnel = value)``````

And finally match them with the keywords dataset…

``````df_parsed <- regex_left_join(df, transactional, by = c("Keyword" = "transactional")) %>%
regex_left_join(evaluation, by = c("Keyword" = "evaluation")) %>%
regex_left_join(consideration, by = c("Keyword" = "consideration")) %>%
regex_left_join(awareness, by = c("Keyword" = "awareness")) %>%
regex_left_join(out_of_funnel, by = c("Keyword" = "out_of_funnel")) %>%
replace(., is.na(.), "-") %>% # REPLACE NA VALUES
head(10) # FILTER ON FIRST 10``````

Step 5) Win.

Keyword transactional evaluation consideration awareness out_of_funnel
digital marketing agency agency agency agency
what is digital marketing
what is what is
digital marketing jobs
job
digital marketing salary
salary
digital marketing manager
manager
digital marketing course
course course
digital marketing strategy
strategy strategy
digital marketing services service service service
digital marketing specialist salary
specialist
salary
digital marketing specialist salary
specialist
specialist