
Busted! Secrets on Indeed

Leo Huang

Updated: Aug 23, 2020


Recently, I did a data scraping project on Indeed using Python. Indeed is the biggest and most popular job site in Canada, so I am usually overwhelmed by the number of search results it returns, and I used to spend a tremendous amount of time finding suitable, up-to-date postings.


After filtering and sorting the scraped data, I realized that fewer than 30% of the job search results were useful. This project let me apply my data scraping and analysis skills to gain insight into the modern digitized world.

 

Python libraries used:

· Beautiful Soup (bs4): HTML parsing

· Requests: website connections (HTTP requests)

· time: adds a pause between web search requests

· pandas: tabular data management

· seaborn: plotting

· NumPy: array manipulation

 

User Input: (Searched on Aug 23rd, 2020)

· Job title: 'Engineer'

· Location: 'GTA, ON'
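
A minimal sketch of how the search pages for this input might be requested, one page at a time with a pause in between. The endpoint, the `q`, `l` and `start` query parameters, and the page size of 10 are assumptions about Indeed's search URLs, and `build_search_urls` and `fetch_pages` are hypothetical helper names; none of this is taken from the project's actual code.

```python
import time
from urllib.parse import urlencode

BASE_URL = "https://ca.indeed.com/jobs"  # assumed search endpoint
RESULTS_PER_PAGE = 10                    # assumed number of results per page


def build_search_urls(job_title, location, max_results):
    """Return one search URL per results page (hypothetical helper)."""
    urls = []
    for start in range(0, max_results, RESULTS_PER_PAGE):
        # urlencode escapes the comma and space in a location like "GTA, ON"
        query = urlencode({"q": job_title, "l": location, "start": start})
        urls.append(f"{BASE_URL}?{query}")
    return urls


def fetch_pages(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, sleeping between consecutive requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_seconds)  # be polite to the server
        pages.append(fetch(url))
    return pages
```

With Requests, the `fetch` argument could be something like `lambda url: requests.get(url).text`. Passing it in as a parameter keeps the URL and pacing logic testable without touching the network.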


Raw Web Scraping Output:

· Search Results: 885 postings (Impossible for me to sort manually!!!)

· Recorded fields: Job Title, Company, Location, Posting Date, Salary, Posting URL
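
To illustrate how the recorded fields could be pulled out of a results page with Beautiful Soup, here is a rough sketch. The HTML snippet, its class names (`job-card`, `job-title`, and so on), the company names, and the helper `parse_postings` are all made up for illustration; Indeed's real markup is different and changes over time, so the selectors would need adjusting.

```python
from bs4 import BeautifulSoup

# Invented sample markup, NOT Indeed's real page structure.
SAMPLE_HTML = """
<div class="job-card">
  <a class="job-title" href="/viewjob?jk=abc123">Mechanical Engineer</a>
  <span class="company">Acme Corp</span>
  <span class="location">Toronto, ON</span>
  <span class="date">5 days ago</span>
  <span class="salary">$70,000 a year</span>
</div>
<div class="job-card">
  <a class="job-title" href="/viewjob?jk=def456">Process Engineer</a>
  <span class="company">Globex</span>
  <span class="location">Mississauga, ON</span>
  <span class="date">30+ days ago</span>
</div>
"""


def _text(card, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_postings(html, base_url="https://ca.indeed.com"):
    """Turn one results page into a list of dicts, one per posting."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for card in soup.select("div.job-card"):
        link = card.select_one("a.job-title")
        records.append({
            "title": link.get_text(strip=True),
            "company": _text(card, "span.company"),
            "location": _text(card, "span.location"),
            "posted": _text(card, "span.date"),
            "salary": _text(card, "span.salary"),  # often missing
            "url": base_url + link["href"],
        })
    return records


records = parse_postings(SAMPLE_HTML)
```

A list of dicts like this can be handed straight to `pandas.DataFrame(records)` for the filtering step.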

 

Data Filtering and Sorting

· Remove duplicate search results that lead to the same posting URL

· Remove duplicate search results with the same job title under the same company name

· Remove search results that are 30 days old or older
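
The three filtering rules above could be sketched in pandas roughly as follows. The column names (`title`, `company`, `posted`, `url`), the sample rows, and the posting-date strings such as "5 days ago" or "30+ days ago" are assumptions for illustration, and `parse_age_days` and `filter_postings` are hypothetical helpers rather than the project's actual code.

```python
import re

import pandas as pd


def parse_age_days(posted):
    """Convert text like '5 days ago' or '30+ days ago' to a number of days."""
    text = posted.strip().lower()
    if text in ("just posted", "today"):
        return 0
    match = re.search(r"(\d+)", text)
    return int(match.group(1)) if match else None


def filter_postings(df, max_age_days=30):
    """Apply the three rules: unique URL, unique title+company, newer than 30 days."""
    out = df.drop_duplicates(subset="url")
    out = out.drop_duplicates(subset=["title", "company"])
    # Unparseable dates become NaN, which compares False and is dropped.
    age = pd.to_numeric(out["posted"].map(parse_age_days))
    return out[age < max_age_days].reset_index(drop=True)


# Made-up rows covering each rule.
sample = pd.DataFrame({
    "title": ["Mechanical Engineer", "Mechanical Engineer", "Mechanical Engineer",
              "Process Engineer", "Design Engineer"],
    "company": ["Acme", "Acme", "Acme", "Globex", "Initech"],
    "posted": ["5 days ago", "5 days ago", "3 days ago", "30+ days ago", "Just posted"],
    "url": ["/a", "/a", "/b", "/c", "/d"],
})
cleaned = filter_postings(sample)
```

In this sample, the second row repeats a URL, the third repeats a title and company, and the fourth is 30 or more days old, so only two rows survive.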


Post-Processing Result:

· Useful job postings: 239 (only 27% of the 885 raw search results)

· Automatically exported the data frame in CSV format

· Plotted the top ten hiring companies in a bar chart
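
For the last two steps, a small sketch of counting postings per company and drawing the bar chart. It assumes a data frame with a `company` column; `top_hiring_companies` and `plot_top_companies` are hypothetical names, and the chart styling is only one possible choice, not necessarily what the original plot looked like.

```python
import pandas as pd


def top_hiring_companies(df, n=10):
    """Return the n companies with the most postings, largest first."""
    return df["company"].value_counts().head(n)


def plot_top_companies(counts):
    """Draw a horizontal bar chart of posting counts per company."""
    # Imported here so the counting helper works without a plotting stack.
    import seaborn as sns

    ax = sns.barplot(x=counts.values, y=list(counts.index))
    ax.set_xlabel("Number of postings")
    ax.set_ylabel("Company")
    return ax
```

The CSV export itself can be a single call such as `df.to_csv("postings.csv", index=False)`, where the file name is just a placeholder.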



 

Take Away and Future Opportunities

I am amazed by the productivity boost that Python gave me through some simple programming. In the modern digitized world, it's crucial to navigate through excessive information and locate what is useful. As a mechanical engineer, I am always looking for opportunities to incorporate data management and process streamlining techniques into our discipline. Apart from Indeed, I have also looked into LinkedIn, which apparently has a much stricter privacy policy. Please stay tuned for my next project! Feel free to message me on LinkedIn if you have any questions.


©2020 by Leo. Proudly created with Wix.com
