06 · ETL / Automation · 2023

SecondBite — OC Store ETL

End-to-end ETL and automation for an OpenCart store, operated through Google Sheets.

Type: ETL pipeline / freelance
Period: 2023
Stack: Python · Selenium · Pandas · Google Sheets API
Interface: Google Sheets

Pipeline stages

Scrape → normalize → match → load.

A complete ETL and automation pipeline for an OpenCart store. It extracts product data from multiple source stores, normalizes it to a standard feed format, checks each record against existing inventory, and then inserts new entries or updates existing ones. Google Sheets serves as the client-facing operational view.

  • Drive ChromeDriver via Selenium to scrape listings, handling pagination and dynamic content.
  • Normalize extracted data into the OC feed format with Pandas (field mapping, type coercion, missing values).
  • Match-check scraped records against destination inventory using stable product identifiers.
  • Load new entries or update existing ones based on the comparison result.
  • Sync the final feed state to Google Sheets as the client-facing operational view.
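
The normalize and match-check stages above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names in FIELD_MAP, the helper names normalize and classify, and the choice of sku as the stable product identifier are all assumptions made for the example. It also assumes each sku appears at most once in the destination inventory.

```python
import pandas as pd

# Hypothetical mapping from source-store column names to feed column names.
FIELD_MAP = {"title": "name", "cost": "price", "stock": "quantity", "code": "sku"}


def normalize(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename fields, coerce types, and handle missing values."""
    feed = raw.rename(columns=FIELD_MAP)[list(FIELD_MAP.values())].copy()
    feed["sku"] = feed["sku"].str.strip()
    feed["name"] = feed["name"].str.strip()
    # Strip currency symbols and thousands separators before parsing;
    # anything that still fails to parse becomes NaN.
    price_text = feed["price"].astype(str).str.replace(r"[^\d.]", "", regex=True)
    feed["price"] = pd.to_numeric(price_text, errors="coerce")
    # Treat a missing or unparsable stock level as zero (an assumption).
    quantity = pd.to_numeric(feed["quantity"], errors="coerce")
    feed["quantity"] = quantity.fillna(0).astype(int)
    # Rows without an identifier or a usable price cannot be matched or loaded.
    return feed.dropna(subset=["sku", "price"]).reset_index(drop=True)


def classify(feed: pd.DataFrame, inventory: pd.DataFrame) -> pd.DataFrame:
    """Label each feed row as 'insert', 'update', or 'skip'."""
    merged = feed.merge(
        inventory[["sku", "price", "quantity"]],
        on="sku",
        how="left",
        suffixes=("", "_current"),
        indicator=True,  # adds a _merge column: 'left_only' or 'both'
    )
    is_new = merged["_merge"] == "left_only"
    changed = (merged["price"] != merged["price_current"]) | (
        merged["quantity"] != merged["quantity_current"]
    )
    merged["action"] = "skip"
    merged.loc[~is_new & changed, "action"] = "update"
    merged.loc[is_new, "action"] = "insert"
    return merged[["sku", "name", "price", "quantity", "action"]]
```

A load step could then filter on the action column, sending "insert" rows to product creation and "update" rows to an update call, and the same labelled frame could be written to the sheet as the operational view.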