Data Engineer

About DOJO

We’re building a new product category - an AI-powered Operating System that transforms how companies grow. One and a half years in, we’re growing 20% month-over-month, closing a major funding round, and used by 100+ world-class brands. We were recently named one of Wired’s 100 Hottest Startups and featured on Sifted’s Startups To Watch list.

Under the hood, we’re building a next-generation AI and data platform - multi-agent systems, a real-time data fabric synthesizing hundreds of millions of signals, graph-based knowledge representations, and proprietary evaluation infrastructure. All in production, all evolving fast. Our technical surface spans agentic reasoning at scale, data quality across thousands of heterogeneous sources, and real-time intelligence from noisy unstructured data - in a domain where results are immediately measurable. Our engineers come from teams like Feedzai, OutSystems, Talka, and Unbabel, where shipping production AI and data systems at scale is the baseline.

We’re a product company first. We don’t build tools for consultants to configure - we build a product customers love, one that works flawlessly, with great design, supported by engineering excellence that makes it possible. We make the simple easy and the complex possible. And we build our business around this ethos.

About This Role

DOJO’s data layer is our biggest competitive advantage. It’s a strategic platform that powers product features, feeds our AI agents, drives evaluation systems, and gives customers insights no other tool can. It sits at the center of everything we build.

You’ll own the data infrastructure that makes this work: ingestion from thousands of heterogeneous sources, transformation, quality, and delivery at scale. You’ll synthesize noisy, unstructured marketing data into structured intelligence that grounds our AI agents - work that directly shapes the quality of our product and the trust our customers place in us.

Experience Levels

We don’t do hierarchy - everyone builds, everyone ships, everyone has a voice. What changes with experience is the surface area you take on and the complexity of the calls you’re ready to make.

  • A few years in - You’re building pipelines and quality systems, growing into platform ownership, and learning from engineers who’ve built data infrastructure at scale.

  • Deep experience - You own the data platform. You make infrastructure decisions and set the quality standards that keep the product trustworthy.

  • Been doing this a long time - You define the data architecture and technical vision. You shape how data powers every product capability and AI system.

What You’ll Do

  • Own the data platform end-to-end - ingestion from diverse sources, transformation, quality assurance, and delivery at scale, making deliberate infrastructure choices that compound over time

  • Build and maintain the graph-based semantic layer that bridges unstructured marketing signals with structured, queryable data that is continuously enriched and updated

  • Establish and enforce data quality standards - schema validation, anomaly detection, and monitoring that catches problems before customers do

  • Work directly with the founders, AI engineers, and product teams to ensure the data layer enables fast iteration on new features and agent capabilities

  • Close the AI-data loop - keep data fresh and accurate so our agents learn from the best signals, and use AI at the core of the data infrastructure itself

  • Help shape our engineering culture and raise the bar as the team grows - through code review, architectural decisions, and how we work together

You May Be a Good Fit If You Have

  • Deep experience building and operating production data infrastructure - pipelines, storage, quality systems, and the tooling around them

  • Experience with modern data engineering patterns - data mesh, data vault, data quality, and data governance - rather than just classic ETL and BI warehouse architectures

  • Strong proficiency in Python and modern data tools (e.g. Dagster, Polars, DuckDB, Apache Iceberg, Apache Arrow, ClickHouse, Kafka)

  • A view of data quality and scale as first-class engineering problems, not afterthoughts

  • Strong foundations in CS or Engineering, though exceptional candidates with alternative backgrounds are welcome

  • Experience with data infrastructure for AI systems - evaluation pipelines, data flywheels, or MLOps - is a strong plus

  • Experience with streaming architectures and real-time data pipelines is a plus

How to Apply

Send your CV and a few sentences on why this role interests you to careers@dojoai.com. No cover letter needed - just clear, direct communication.

We encourage you to apply even if you don’t meet every qualification listed above. We value diverse perspectives and believe that a wide range of experiences can contribute to our team’s success.