Automated PGF and Neuca24 Pharmaceutical Wholesalers Scraper - Drug Pricing and Availability Monitoring System
I created an advanced system that automates the extraction of pricing data and product availability information from Poland's two largest pharmaceutical wholesalers - PGF and Neuca24. My solution enables daily updates, comparative price analysis, and Excel data export, supporting purchase optimization in the pharmaceutical industry.

Challenges
- Automation of daily extraction of pricing data and inventory levels from PGF and Neuca24 systems
- Overcoming login security and navigation challenges in closed pharmaceutical wholesale systems
- Ensuring a reliable data update schedule and exception handling
- Standardization and normalization of data from different wholesale systems into a unified format
- Efficient storage of historical pricing data for trend analysis
- Creating a useful interface for price comparison between wholesalers
Implemented Solutions
- I designed dedicated scrapers with automated login handling for each pharmaceutical wholesaler
- I implemented a security bypass system using sessions and user behavior emulation
- I configured a reliable daily update system through crontab with error notification mechanisms
- I created algorithms for normalization and product matching between different wholesale systems
- I designed an optimized SQLite database schema with indexing for efficient data storage and retrieval
- I built an API that supports filtering, sorting, and data export to Excel
Project Overview
I created an advanced system for automatic monitoring and data extraction from Poland's two largest pharmaceutical wholesalers - PGF (Polish Pharmaceutical Group) and Neuca24. My solution enables systematic collection of information about prices, availability, and purchase conditions of pharmaceutical products, allowing the client to optimize purchasing processes and inventory management.
The system was designed with the specific requirements of the pharmaceutical industry in mind, accounting for differences in data structure, product naming conventions, and pricing systems used by both wholesalers.
Advanced Data Collection Mechanisms
Automatic Login and Navigation in Wholesale Systems
- Secure credential management - I implemented a mechanism for secure storage and use of login data
- User behavior emulation - I created a system simulating natural human interactions with the web interface to bypass bot detection mechanisms
- Adaptive navigation - the system intelligently moves through the structure of both wholesaler websites, responding to interface changes
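One common way to keep login data out of the source code is to read it from environment variables. The sketch below illustrates that idea only; it is not necessarily the mechanism used in this project. The variable names (`PGF_USERNAME`, `PGF_PASSWORD`) and the `WholesalerCredentials` class are assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class WholesalerCredentials:
    """Hypothetical container for one wholesaler's login data."""
    username: str
    password: str

    def __repr__(self) -> str:
        # Mask the password so it never ends up in logs or tracebacks.
        return f"WholesalerCredentials(username={self.username!r}, password='***')"


def load_credentials(prefix: str, env=os.environ) -> WholesalerCredentials:
    """Read <PREFIX>_USERNAME and <PREFIX>_PASSWORD from the environment."""
    try:
        return WholesalerCredentials(
            username=env[f"{prefix}_USERNAME"],
            password=env[f"{prefix}_PASSWORD"],
        )
    except KeyError as exc:
        # Report which variable is missing without echoing any secret value.
        raise RuntimeError(f"Missing environment variable: {exc.args[0]}") from None
```

Calling `load_credentials("PGF")` at the start of a scraping run fails fast when a variable is missing, which is easier to diagnose than a rejected login later on.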
Comprehensive Pharmaceutical Data Acquisition
- Complete product data extraction - trade names, active substances, dosages, packaging, manufacturers, and EAN codes
- Price and discount monitoring - tracking of catalog prices, discounts, promotions, and special offers
- Availability data - gathering information about inventory levels, delivery times, and minimum order quantities
- Commercial terms - collection of data on special purchase conditions and loyalty programs
Advanced Process Automation
Reliable Task Scheduling System
- Crontab configuration - I implemented a precise update schedule taking into account wholesaler server load
- Retry mechanism - the system automatically retries failed runs with exponentially increasing delays
- Error notifications - I created an alert system notifying about data collection problems
- Activity logs - detailed logs enabling diagnostics and problem-solving
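The retry behavior described above can be sketched as a small helper. This is a minimal illustration, assuming a doubling delay and a fixed attempt limit; the function name `run_with_retry`, the default of four attempts, and the 60-second base delay are hypothetical values chosen for the example.

```python
import logging
import time

logger = logging.getLogger("scraper")


def run_with_retry(task, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Call task(); after a failure wait base_delay * 2**(attempt - 1) seconds and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                # Last attempt failed: log it and let the caller trigger an alert.
                logger.error("Giving up after %d attempts: %s", attempt, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Attempt %d failed (%s); retrying in %.0f s", attempt, exc, delay)
            sleep(delay)
```

The `sleep` parameter is injectable so the schedule can be checked without waiting. The re-raised exception on the final attempt is the natural hook for an error notification.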
Pharmaceutical Data Processing and Standardization
- Format unification - normalization of different data formats used by wholesalers
- Product deduplication - advanced algorithms identifying the same products despite naming differences
- Data validation and cleaning - detection and correction of inconsistencies in collected information
- Derived metrics calculation - automatic calculation of unit cost, margin, and other indicators
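A simplified sketch of the normalization and matching steps is shown below. It assumes that the EAN code is the primary matching key and that a normalized product name serves as a fallback; the actual matching logic may be more elaborate. The row layout (dictionaries with `ean` and `name` keys) and all function names are assumptions for illustration, and `parse_price` assumes a comma or dot is used only as the decimal separator.

```python
import re
from decimal import Decimal


def normalize_name(name: str) -> str:
    """Lowercase, collapse whitespace, and join a number with its unit."""
    text = re.sub(r"\s+", " ", name.lower().strip())
    return re.sub(r"(\d)\s+(mg|ml|g)\b", r"\1\2", text)  # "500 mg" -> "500mg"


def parse_price(raw: str) -> Decimal:
    """Convert strings such as '12,50 zł' or '12.50' to a Decimal."""
    cleaned = re.sub(r"[^\d,.]", "", raw).replace(",", ".")
    return Decimal(cleaned)


def match_key(row: dict) -> str:
    """Prefer the EAN code; fall back to the normalized name when it is missing."""
    return row.get("ean") or normalize_name(row["name"])


def match_products(pgf_rows, neuca_rows):
    """Pair rows from both wholesalers that share a match key."""
    neuca_by_key = {match_key(row): row for row in neuca_rows}
    return [
        (row, neuca_by_key[match_key(row)])
        for row in pgf_rows
        if match_key(row) in neuca_by_key
    ]


def unit_cost(price: Decimal, pack_size: int) -> Decimal:
    """Price per single unit (tablet, ampoule, ...), rounded to four decimal places."""
    return (price / pack_size).quantize(Decimal("0.0001"))
```

Using `Decimal` rather than `float` avoids rounding surprises when prices are later compared or summed.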
Efficient Data Storage Architecture
Optimized SQLite Database
- Well-designed database schema - I designed a table structure reflecting relationships between products, prices, and availability
- Efficient indexing - I optimized searches through strategic indexing of key fields
- Price history management - the system stores historical data enabling price trend analysis
- Compact structure - despite storing large amounts of data, the database remains efficient and easy to manage
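To illustrate the kind of schema described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`products`, `price_snapshots`, `price_grosz`, `captured_on`) are assumptions, not the project's real schema. Prices are stored as integers in grosz to avoid floating-point rounding, and one row per product, wholesaler, and day provides the price history.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id INTEGER PRIMARY KEY,
    ean TEXT UNIQUE,
    name TEXT NOT NULL,
    active_substance TEXT,
    manufacturer TEXT
);
CREATE TABLE IF NOT EXISTS price_snapshots (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id),
    wholesaler TEXT NOT NULL CHECK (wholesaler IN ('PGF', 'Neuca24')),
    price_grosz INTEGER NOT NULL,
    stock INTEGER,
    captured_on TEXT NOT NULL,  -- ISO date, e.g. '2024-01-31'
    UNIQUE (product_id, wholesaler, captured_on)
);
CREATE INDEX IF NOT EXISTS idx_snapshots_product_date
    ON price_snapshots (product_id, captured_on);
CREATE INDEX IF NOT EXISTS idx_products_name ON products (name);
"""


def open_database(path=":memory:"):
    """Open the database and make sure the tables and indexes exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def price_history(conn, ean, wholesaler):
    """Return (date, price_grosz) pairs for one product, oldest first."""
    cursor = conn.execute(
        """
        SELECT s.captured_on, s.price_grosz
        FROM price_snapshots AS s
        JOIN products AS p ON p.id = s.product_id
        WHERE p.ean = ? AND s.wholesaler = ?
        ORDER BY s.captured_on
        """,
        (ean, wholesaler),
    )
    return cursor.fetchall()
```

The composite index on `(product_id, captured_on)` matches the most frequent lookup, the price history of a single product, and the `UNIQUE` constraint keeps a rerun of the daily job from inserting duplicate snapshots.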
Versatile Data Access
Intuitive API Interface
- RESTful API - I created a programming interface based on FastAPI for easy data access
- Advanced filtering - ability to search for products by name, active substance, manufacturer, and other criteria
- Sorting and pagination - efficient management of large result sets
- Access security - authentication system protecting data from unauthorized access
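The filtering, sorting, and pagination behind such an endpoint can be sketched independently of the web framework. The example below builds a parameterized SQLite query; the flat `products` table with a `price` column and the allow-lists of column names are assumptions made for illustration. In a FastAPI application, a function like this would typically be called from a route handler with the request's query parameters.

```python
import sqlite3

# Allow-lists: column names are interpolated into SQL, so they must never
# come straight from user input.
FILTERABLE = {"name", "active_substance", "manufacturer"}
SORTABLE = {"name", "manufacturer", "price"}


def build_product_query(filters, sort_by="name", descending=False, page=1, page_size=50):
    """Return (sql, params) for a filtered, sorted, paginated product listing."""
    if sort_by not in SORTABLE:
        raise ValueError(f"Cannot sort by {sort_by!r}")
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")

    clauses, params = [], []
    for column, value in filters.items():
        if column not in FILTERABLE:
            raise ValueError(f"Cannot filter by {column!r}")
        clauses.append(f"{column} LIKE ?")
        params.append(f"%{value}%")  # substring match; the value is bound, not interpolated

    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    direction = "DESC" if descending else "ASC"
    sql = (
        "SELECT name, active_substance, manufacturer, price FROM products"
        f"{where} ORDER BY {sort_by} {direction} LIMIT ? OFFSET ?"
    )
    params += [page_size, (page - 1) * page_size]
    return sql, params
```

User-supplied values only ever reach the database as bound parameters, while column names are checked against fixed sets, which keeps the endpoint safe from SQL injection.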
Data Export Functionalities
- Excel spreadsheet generation - creation of detailed reports in Excel format with formatting and formulas
- Comparative reports - automatic price comparisons of the same products in both wholesalers
- On-demand export - ability to generate custom reports as needed
- Report scheduling - automatic generation and sending of periodic reports
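The core of a comparative report is a per-product comparison of the two wholesalers' prices. The sketch below shows that step under simplifying assumptions: prices are given as dictionaries mapping EAN codes to prices in grosz, and the result is written as CSV with the standard library as a dependency-free stand-in for the Excel output described above. All names are hypothetical.

```python
import csv
import io

FIELDS = ["ean", "pgf", "neuca24", "difference", "cheaper"]


def compare_prices(pgf_prices, neuca_prices):
    """Compare prices (EAN -> grosz) for products offered by both wholesalers."""
    rows = []
    for ean in sorted(pgf_prices.keys() & neuca_prices.keys()):
        pgf, neuca = pgf_prices[ean], neuca_prices[ean]
        if pgf < neuca:
            cheaper = "PGF"
        elif neuca < pgf:
            cheaper = "Neuca24"
        else:
            cheaper = "equal"
        rows.append({
            "ean": ean,
            "pgf": pgf,
            "neuca24": neuca,
            "difference": abs(pgf - neuca),
            "cheaper": cheaper,
        })
    return rows


def to_csv(rows):
    """Serialize comparison rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Products listed by only one wholesaler are left out of this comparison; a fuller report could list them separately as single-source items.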
Practical Business Applications
The pharmaceutical wholesaler scraper system is used for:
- Purchase optimization - selecting the most cost-effective wholesaler for specific products
- Inventory planning - monitoring drug availability and predicting shortages
- Price trend analysis - tracking price changes over time for strategic purchase planning
- Supplier negotiations - having current market data during term negotiations
- Order process automation - integration with inventory management systems
Results and Benefits
The system I created brings measurable benefits to the client:
- Time savings - eliminates several hours of manual price checking each day
- Purchase cost optimization - reduction of drug expenses through selection of the best offers
- Better product availability - minimization of shortages through inventory monitoring
- Business decision support - access to current and historical data for strategic planning
The system is regularly updated to keep pace with changes in wholesaler interfaces and client requirements. Built on Python, SQLite, and FastAPI, it remains flexible, efficient, and easy to maintain.