Remote, but you must be based in the following location:
Join the Team
Serv, a global executive recruitment partner, is hiring on behalf of our client Mercator.ai for a Staff Data Engineer (Data Pipelines).
Mercator.ai is building scalable data infrastructure to power high-quality, data-driven decision-making. As an early-stage company, the team focuses on creating robust, future-ready systems that handle complex data ingestion, transformation, and delivery across a growing national footprint.
The role owns and evolves the company’s data pipeline architecture, ensuring systems are scalable, reliable, and built to support long-term growth.
The Staff Data Engineer is the technical owner of Mercator.ai’s data infrastructure and pipeline systems. This individual will lead the design, development, and scaling of distributed data pipelines while remaining hands-on in implementation.
This is a builder role suited to someone who enjoys working directly in the code while also setting architectural direction. You will be responsible for scaling existing systems, improving performance and reliability, and integrating modern tools, including AI-assisted workflows, to enhance engineering output.
This role requires strong technical depth, ownership, and the ability to operate effectively in a fast-paced, early-stage environment.
Key Responsibilities
Lead the architecture and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale
Design and implement robust data models to support reporting and advanced data applications
Build and maintain distributed web scraping systems using tools such as Playwright, Selenium, and BeautifulSoup
Develop systems capable of handling anti-scraping measures, proxy rotation, and high-volume data extraction
Integrate AI and LLMs into engineering workflows for code generation, automation, and optimization
Apply prompt engineering techniques to improve data processing, documentation, and troubleshooting
Identify and implement system and process improvements to optimize performance and efficiency
Manage and scale cloud-based data infrastructure, including data warehouses, object storage, and search systems
Deploy and maintain containerized workloads using Kubernetes
Implement data quality monitoring and governance processes to ensure accuracy and reliability
Mentor junior engineers through code reviews, documentation, and knowledge sharing
Communicate technical concepts clearly and provide business context for engineering decisions
Required Experience & Skills
5+ years of experience in Data Engineering with a track record of scaling systems
Expert proficiency in Python and advanced SQL, including performance tuning and optimization
Strong experience with workflow orchestration tools such as Airflow or Prefect, and transformation tools such as dbt
Proven experience building resilient web scraping systems using Playwright, Selenium, and BeautifulSoup
Deep understanding of relational and NoSQL databases, including Postgres, MongoDB, and Elasticsearch
Experience working with large-scale data systems such as BigQuery
Strong proficiency with CI/CD pipelines, Git, and Docker
Experience designing and maintaining distributed systems with high availability and fault tolerance
Nice-to-Have Experience
Experience with GCP or AWS and Kubernetes for infrastructure management
Familiarity with LLMs such as ChatGPT, Claude, or Gemini for engineering workflows
Experience with prompt optimization and AI-assisted development
Our ideal teammate:
Is highly proactive and takes full ownership of systems and outcomes
Balances hands-on execution with long-term architectural thinking
Communicates complex ideas clearly and effectively
Thrives in fast-paced, early-stage environments
Is detail-oriented and committed to building high-quality, scalable systems
Actively supports and elevates teammates through knowledge sharing and mentorship
Brings curiosity and continuously looks for ways to improve systems and processes