Company
Constructor

constructor.io
Location
Fully remote

Data Engineer: Data Platform Team

About us

Constructor.io powers product search and discovery for some of the largest retailers in the world. We serve billions of requests every week, and you’ve probably seen our results somewhere and used our product without knowing it. We differentiate ourselves by focusing on metrics over features, and by reinventing search and discovery from the ground up as a machine learning challenge with the specific goal of improving metrics like revenue. We’re approximately doubling year over year despite the market slowdown and have customers in every eCommerce vertical. We’re a passionate team of technologists who love solving problems and want to make our customers’ and coworkers’ lives better. We value empathy, openness, curiosity, and continuous improvement, and we are excited by metrics that matter. We believe that empowering everyone in a company to do what they think is best can lead to great things.


Data Platform Team

The Data Platform team within Data Science and Engineering is an integral unit that serves internal stakeholders. It develops a data platform that provides:

  • Storage and processing of data, producing the artefacts that the backend team runs in production.

  • Convenient tools for engineers to create, schedule, and run their data workloads.

  • Data quality validation.

  • A real-time Analytical API that serves analytics to our customers.

Data Science and Engineering consists of a mix of data engineers and analysts who own and collaborate on multiple projects. As a Data Platform team member, you will use world-class analytical, engineering, and data processing techniques to build the foundational infrastructure, tooling, and analytical capabilities that enable the business to move forward.


Challenges you will tackle

Constructor integrates with our customers by providing them with client-side libraries that interact with our API. These libraries transmit logs/events (we call them behavioral events) that Constructor uses to train ML models, make business decisions, conduct A/B tests, and much more. These logs are currently handled by the Behavioral API, a Python service that collects them and ships them through fluentd to S3. This piece of infrastructure is currently owned by the backend team. The Data Platform team is about to take over ownership of this API so that we control data collection and processing from start to finish.

Your first project and main focus for the first few months as a Data Engineer will be to help our team take over code ownership of this service, separate it from the backend and deploy it as a standalone service, integrate it with the rest of the data platform, and work on the infrastructure that owns the data delivery component.

The DP team has epic objectives such as deploying and switching to a new scheduler, deploying a database that will be used by the recommendation systems, developing the Analytics service and its underlying storage, optimizing Spark pipelines, designing a smarter data model for our data warehouse, and many more. We create and operate the data infrastructure, so you will take part in those projects as a Data Engineer on our team.


Requirements

  • You have high proficiency in any programming language (Python is preferred). You have experience with backend development with any web framework (for Python it can be Django, Flask, FastAPI, …).

  • You are proficient in SQL (any variant).

  • You have experience working with AWS and have knowledge of its services (EC2, IAM, S3, Lambda, ECS, ECR, …) used for data processing.

  • You have an excellent understanding of data storage, database types and architecture, and you can apply this knowledge to build an effective data infrastructure.

  • You enjoy working with large amounts of data.

  • You proactively find opportunities to improve the product and lead those efforts to success.

  • Bonus points: experience working with ClickHouse.

  • Bonus points: advanced knowledge of AWS (CloudFormation, CDK, SNS, SQS).

  • Bonus points: Familiarity with the big data stack (Spark, Presto/Athena, Hive).


Benefits

  • Unlimited vacation time: we strongly encourage all of our employees to take at least 3 weeks per year

  • A competitive compensation package including stock options

  • Company-sponsored US health coverage (100% paid for employee)

  • Fully remote team - choose where you live

  • Work from home stipend! We want you to have the resources you need to set up your home office

  • Apple laptops provided for new employees

  • Training and development budget for every employee, refreshed each year

  • Parental leave for qualified employees

  • Work with smart people who will help you grow and make a meaningful impact