Find Keywords To Add To Your Resume With Python

Enhance your resume by adding keywords based on job descriptions by using Python and nltk

4 min read · Jan 24, 2021

Photo by Glenn Carstens-Peters on Unsplash

Table of Contents

Introduction
Functions
Result
Conclusion

Introduction

I know how time-consuming looking for a job is because I’ve experienced it firsthand. One of the most important steps when applying for a job is tailoring your resume to the job posting to increase its visibility in applicant tracking systems (ATSs). To do this, you need to read the job description very carefully, find the keywords (especially those that appear repeatedly), and add them to your resume.

Since I am too lazy to repeat boring tasks, I wrote a Python function to solve my problem. It takes a resume and a job description as plain text files and returns a list of suggested words to add to your resume, along with how often each one appears in the job posting. In this post I will only show the functions inside the parent function; if you’d like to take a look at the full code, you can find it here on my GitHub. Let’s get into the details.

One detail I should mention: since the relevant sections of a job posting can appear under different titles, such as ‘required skills’, ‘required qualifications’, ‘qualifications’, or ‘responsibilities’, I decided to include the entire job posting, including the company description. You can edit your text file at any time so that it includes only the qualifications section or any other part you want.
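As a minimal sketch of the input step, the snippet below reads two plain text files into strings. The helper name read_text_file and the file names are hypothetical (they are not taken from the full code on GitHub), and the snippet first writes two tiny sample files to a temporary folder so that it can run on its own.

```python
import tempfile
from pathlib import Path


def read_text_file(path):
    """Return the whole content of a plain text file as one string."""
    return Path(path).read_text(encoding="utf-8")


# Write two tiny sample files so the example is self-contained
# (hypothetical file names and contents, for illustration only)
folder = Path(tempfile.mkdtemp())
(folder / "job_posting.txt").write_text("Required skills: Python, SQL", encoding="utf-8")
(folder / "resume.txt").write_text("Experienced in Python and Excel", encoding="utf-8")

job_posting = read_text_file(folder / "job_posting.txt")
resume = read_text_file(folder / "resume.txt")

print(job_posting)
print(resume)
```

With your own files, you would only need the read_text_file calls, pointed at wherever you saved the job posting and the resume.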

Functions

Let’s start by importing the libraries and modules. We will use them to clean the text input before we start working on it.

Import libraries and modules

import pandas as pd
import nltk
from nltk.corpus import stopwords
import re
import string
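# NLTK data must be downloaded once before first use, e.g.
# nltk.download('stopwords') and nltk.download('wordnet')

# Lemmatizer
wn = nltk.WordNetLemmatizer()

# Stopwords in English language
stopword = nltk.corpus.stopwords.words('english')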

clean_the_text()

Now it’s time to define the first function, clean_the_text(). This function takes a body of text and returns a list of lowercase, lemmatized words with non-word characters, punctuation, and stopwords removed.

def clean_the_text(text):

        # Replace non-word characters with a space
        text = re.sub(r'[^A-Za-z0-9\s]', ' ', text)

        # Remove any remaining punctuation
        text = ''.join([char for char in text if char not in
               string.punctuation])

        # Bring text to lower case
        text = text.lower()

        # Tokenize the text, dropping empty strings left at the edges
        tokens = [word for word in re.split(r'\W+', text) if word]

        # Remove stopwords
        text = [word for word in tokens if word not in stopword]

        # Lemmatize the words
        text = [wn.lemmatize(word) for word in text]

        # Return the list of cleaned words
        return text
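To see what the cleaning steps do to a sentence without installing NLTK, here is a simplified sketch. It is not the function above: the name clean_without_nltk and the tiny stopword set are made up for illustration, and lemmatization is skipped entirely.

```python
import re

# Hypothetical, tiny stopword set for illustration only;
# the article itself uses NLTK's full English stopword list.
SAMPLE_STOPWORDS = {"the", "and", "a", "of", "in", "with", "to"}


def clean_without_nltk(text, stopwords=SAMPLE_STOPWORDS):
    # Replace everything that is not a letter, digit, or whitespace with a space
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # Lower-case the text and split it on whitespace
    tokens = text.lower().split()
    # Drop the stopwords (no lemmatization in this simplified version)
    return [token for token in tokens if token not in stopwords]


sample = "Experience with Python, SQL and data-driven reporting."
print(clean_without_nltk(sample))
# ['experience', 'python', 'sql', 'data', 'driven', 'reporting']
```

Note that the hyphen in "data-driven" is replaced by a space, so it ends up as two separate words.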

common_words()

The next function to define is common_words(). It takes two lists (words from the job posting and words from the resume) and returns the set of words that the two lists have in common, so I can later count the matching words and find their ratio to the words in the resume. The lists are converted to Python sets in this function because looking for items in a set is much faster than looking for them in a list. Since we only need the words that appear in both sets, we use set.intersection().

def common_words(l_1, l_2):
        matching_words = set.intersection(set(l_1), set(l_2))
        return matching_words
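Here is a small usage sketch with made-up word lists. It repeats the function so that it runs on its own and uses the & operator, which is equivalent to set.intersection() for two sets.

```python
def common_words(l_1, l_2):
    # Converting to sets removes duplicates; & keeps only the shared words
    return set(l_1) & set(l_2)


# Hypothetical sample data, for illustration only
job_words = ["python", "sql", "python", "tableau"]
resume_words = ["sql", "excel", "python"]

matches = common_words(job_words, resume_words)
print(sorted(matches))  # ['python', 'sql']
```

Because the result is a set, each matching word is counted once, no matter how often it appears in either list.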

Frequency table

Now I want to create a frequency table of the words that appear in the job posting but that I didn’t use in my resume. A word’s frequency may give me an idea of its importance. After creating the frequency table as a Python dictionary, I will sort it by value in descending order.

# Create an empty dictionary
freq_table = {}
    
# Create a frequency table for the words that appear in list_1 (job posting) but not in list_2 (resume)

for word in list_1:
    if word not in list_2:
        if word in freq_table:
            freq_table[word] += 1
        else:
            freq_table[word] = 1
    
# Sort the dictionary by values in descending order
freq_table = dict(sorted(freq_table.items(), key=lambda item: item[1], reverse=True))
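The same counting can also be written with collections.Counter from the standard library. This is an alternative sketch, not the code from the full script: the function name missing_word_counts and the sample lists are hypothetical.

```python
from collections import Counter


def missing_word_counts(list_1, list_2):
    # A set makes the "is this word in the resume?" check fast
    resume_words = set(list_2)
    # Count only the job posting words that are missing from the resume
    counts = Counter(word for word in list_1 if word not in resume_words)
    # most_common() returns (word, count) pairs sorted by count, descending
    return dict(counts.most_common())


# Hypothetical sample data, for illustration only
list_1 = ["python", "sql", "tableau", "sql", "tableau", "tableau"]
list_2 = ["python", "excel"]

print(missing_word_counts(list_1, list_2))  # {'tableau': 3, 'sql': 2}
```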

And for a clearer look, I will convert the dictionary to a pandas DataFrame.

df = pd.DataFrame(list(freq_table.items()), columns=['word', 'count'])

Result

Let’s see the outcome! I ran these functions on my resume and a job ad that I liked.

The number of matching words:

list_1 = clean_the_text(job_posting)
list_2 = clean_the_text(resume)
common_keywords = common_words(list_1, list_2)
print('The number of common words in your resume and the job posting is: {}'.format(len(common_keywords)))

Output:
The number of common words in your resume and the job posting is: 40

Percentage of the matching words:

print('{:.0%} of the words in your resume are in the job description'.format(len(common_keywords)/len(list_2)))

Output:
10% of the words in your resume are in the job description
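One thing to keep in mind: this ratio divides the number of unique matching words by the length of list_2, which still contains repeated words. As a hedged alternative, the sketch below divides by the number of unique resume words instead. The function name match_ratio and the sample lists are made up for illustration.

```python
def match_ratio(job_words, resume_words):
    # Compare unique words on both sides so repeats do not skew the ratio
    unique_resume = set(resume_words)
    if not unique_resume:
        return 0.0  # avoid dividing by zero for an empty resume
    return len(unique_resume & set(job_words)) / len(unique_resume)


# Hypothetical sample data, for illustration only
job_words = ["python", "sql", "tableau", "sql"]
resume_words = ["python", "excel", "python", "sql", "git"]

# 2 shared words out of 4 unique resume words
print('{:.0%}'.format(match_ratio(job_words, resume_words)))  # 50%
```

Which denominator you prefer depends on what you want the percentage to say; both are rough indicators rather than an ATS score.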

Frequency table (list of suggested words):

[Image: Frequency table for words. The first column is the index starting from zero, the second column is the words, and the last column is their counts.]

I created this frequency table for my resume and the job ad I chose with the Python code defined above. These are the words that appear in the job posting but not in my resume, so it could be wise to add a couple of them.

Conclusion

By using these functions, you can tailor your resume based on the job description and enhance the visibility of your resume in applicant tracking systems (ATSs) when you apply for a job online.

If you have any feedback, please don’t hesitate to reach out to me on my LinkedIn!