Automated Indeed Scraper - Advanced Tool for Extracting Job Listings with Email Addresses
We designed an automated system that scrapes job listings from Indeed, keeps only the offers that contain email addresses, and integrates with the JobPortal platform. Discover our solution for Mesoworks.

Challenges
- Automating the scraping of Indeed job listings containing email addresses
- Optimizing the efficiency of data extraction and storage
- Integrating with the client's existing JobPortal system
- Migrating data storage from Google Sheets to a PostgreSQL database
Implemented solutions
- Creation of an advanced Indeed scraper using Python, Selenium and BeautifulSoup
- Implementation of intelligent filtering for job postings containing email addresses
- Execution of data migration to PostgreSQL and full integration with JobPortal
- Automation of the daily job listing retrieval and analysis process
Project Overview
We created a system that automatically retrieves job listings from the Polish Indeed portal every day. The tool checks each posting for email addresses and keeps the ones that contain them, giving our client, Mesoworks, direct contact details to use in its recruitment processes.
Initially, data was stored in Google Sheets, but as part of our optimization, we performed a complete migration to a PostgreSQL database. Currently, the system is fully integrated with the JobPortal platform, enabling efficient management of acquired job listings.
Key Features and Technologies
Indeed Scraping Automation
- Daily listing retrieval - we used Python with Selenium and BeautifulSoup libraries to create a reliable Indeed scraper
- Advanced offer filtering - we implemented precise algorithms for detecting email addresses in job posting content
- Block avoidance mechanisms - we applied proxy rotation and session management to increase scraper reliability
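
The email filtering step can be illustrated with a minimal sketch. The regular expression is deliberately simplified, and the listing structure (a dict with a "description" key) and both function names are assumptions made for illustration, not the project's actual code.

```python
import re

# Simplified pattern: local part, "@", one or more domain labels, 2+ letter TLD.
# This is an illustrative assumption, not the project's actual rule set.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def extract_emails(text: str) -> list[str]:
    """Return unique addresses found in text, lowercased, in order of first appearance."""
    seen: dict[str, None] = {}  # dict preserves insertion order
    for match in EMAIL_RE.findall(text):
        seen.setdefault(match.lower(), None)
    return list(seen)


def filter_listings_with_email(listings: list[dict]) -> list[dict]:
    """Keep only listings whose description contains an email; attach the addresses."""
    kept = []
    for listing in listings:
        emails = extract_emails(listing.get("description", ""))
        if emails:
            kept.append({**listing, "emails": emails})
    return kept
```

A pattern this simple will miss obfuscated addresses such as "name (at) domain", so a production filter would likely need additional rules.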
Data Integration and Storage
- Migration from Google Sheets to PostgreSQL - we increased system efficiency and scalability
- Full synchronization with JobPortal - we integrated our solution with the client's existing platform
- Data management API - we created an API using FastAPI for easy access to collected data
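
As a rough sketch of what the Google Sheets to PostgreSQL migration might involve, the code below turns a CSV export of a sheet into parameter tuples for an idempotent insert. The table name, column names, and the CSV export step are all assumptions; the case study does not describe the real schema or migration procedure.

```python
import csv
import io

# Hypothetical table and columns. ON CONFLICT (url) assumes a unique
# constraint on url, so re-running the migration does not duplicate rows.
UPSERT_SQL = (
    "INSERT INTO job_listings (url, title, company, email) "
    "VALUES (%s, %s, %s, %s) "
    "ON CONFLICT (url) DO NOTHING"
)


def rows_from_sheet_csv(csv_text: str) -> list[tuple[str, str, str, str]]:
    """Parse a CSV export of the sheet into parameter tuples for UPSERT_SQL."""
    rows = []
    seen_urls = set()
    for record in csv.DictReader(io.StringIO(csv_text)):
        url = (record.get("url") or "").strip()
        if not url or url in seen_urls:
            continue  # skip blank rows and duplicates within the sheet
        seen_urls.add(url)
        rows.append((
            url,
            (record.get("title") or "").strip(),
            (record.get("company") or "").strip(),
            (record.get("email") or "").strip().lower(),
        ))
    return rows
```

With a PostgreSQL driver such as psycopg, the resulting tuples could then be passed to `cursor.executemany(UPSERT_SQL, rows)` inside a transaction.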
Infrastructure and Performance
- Microservice-based architecture - we ensured independent scaling of individual components
- Asynchronous task handling - we used Celery with Redis for efficient task queue management
- Containerization with Docker - we enabled easy deployment and environment management
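
A daily retrieval job on Celery with Redis could be configured roughly as follows. This is a sketch of a settings module that a Celery app might load with `app.config_from_object(...)`; the Redis URLs, the time zone, the task name, and its argument are all assumptions rather than details from the project.

```python
from datetime import timedelta

# Lowercase names are the setting names Celery reads from a config module.
# Every value below is an assumption for illustration.
broker_url = "redis://localhost:6379/0"      # Redis as the message broker
result_backend = "redis://localhost:6379/1"  # Redis as the result store
timezone = "Europe/Warsaw"                   # assumed, since the portal is Polish

# Periodic tasks picked up by the Celery beat scheduler.
beat_schedule = {
    "scrape-indeed-daily": {
        "task": "scraper.tasks.scrape_indeed",  # hypothetical task name
        "schedule": timedelta(days=1),          # run once every 24 hours
        "args": ("pl",),                        # hypothetical argument: country code
    },
}
```

Keeping the schedule in configuration rather than in task code means the interval can be changed without touching the scraper itself.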
Measurable Project Results
- HR process automation - elimination of over 20 hours of manual work weekly
- Increased recruitment efficiency - 300% increase in candidates acquired from listings with direct email contact
- Solution scalability - the system currently handles over 10,000 job listings daily
- Integration with client ecosystem - seamless cooperation with the existing JobPortal platform
Conclusions
Our advanced Indeed scraper with email address detection capability has significantly streamlined Mesoworks' HR processes. Through automation of job listing scraping, filtering for contact information, and integration with JobPortal, the client can acquire job candidates much more efficiently.
The use of modern technologies such as Python, Selenium, PostgreSQL, FastAPI, and Docker allowed us to create an efficient, scalable, and reliable solution that meets all of the client's business requirements.