AI/ML engineering begins when a model stops being an experiment and becomes part of a product with data, metrics, and operations attached to it.
The chapter builds a map of the field: where pure ML ends and architecture, evaluation, serving, observability, and total system cost begin.
For interviews and design reviews, it gives you a frame for discussing AI through data pipelines, quality, latency, risk, and team responsibilities rather than through hype.
Practical value of this chapter
Design in practice
Translate the foundational AI/ML engineering map and its model-to-system-design links into architecture decisions for data flow, model serving, and quality control points.
Decision quality
Evaluate system quality through both model and platform metrics: precision/recall, latency, drift, cost, and operational risk.
Interview articulation
Frame answers as data -> model -> serving -> monitoring, showing where constraints appear and how you manage them.
Trade-off framing
Make trade-offs explicit across the foundational AI/ML engineering map and its model-to-system-design links: experiment speed, quality, explainability, resource budget, and maintenance complexity.
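The dual view above (model metrics plus platform metrics) can be computed from a single evaluation run. A minimal sketch, assuming a simple record layout of boolean predictions with per-request latencies (the field names are illustrative, not a fixed schema):

```python
import math

def precision_recall(records):
    """records: dicts with boolean 'predicted' and 'actual' fields."""
    tp = sum(1 for r in records if r["predicted"] and r["actual"])
    fp = sum(1 for r in records if r["predicted"] and not r["actual"])
    fn = sum(1 for r in records if not r["predicted"] and r["actual"])
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def latency_p95(latencies_ms):
    """Nearest-rank 95th percentile of request latencies."""
    ordered = sorted(latencies_ms)
    idx = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[idx]

# One evaluation run: model quality and system latency side by side.
run = [
    {"predicted": True,  "actual": True,  "latency_ms": 120},
    {"predicted": True,  "actual": False, "latency_ms": 340},
    {"predicted": False, "actual": True,  "latency_ms": 95},
    {"predicted": True,  "actual": True,  "latency_ms": 210},
]
p, r = precision_recall(run)
p95 = latency_p95([x["latency_ms"] for x in run])
```

Reporting both numbers from the same run keeps the model and the platform accountable to the same traffic.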
Context
Design principles for scalable systems
AI components live under the same baseline constraints as any other system component: latency, reliability, complexity, and cost.
The "Why should an engineer know ML and AI?" chapter establishes the engineering context for the entire AI/ML track: how to move from "the model works in a notebook" to reliable AI capabilities in real products.
The focus here is not hype but architecture decisions: model strategy, the data layer, quality evaluation, security, inference cost, and production operations under load. This mindset produces decisions that hold in production, not only in demos.
Why this section matters
AI is now part of the core architecture
Search, recommendations, assistants and automation are moving from experimental features into the product core.
Model metrics are not enough without system metrics
Even a strong model is useless without control over latency, inference cost, reliability and production observability.
Data and context have become infrastructure
The quality of pipelines, the retrieval layer, and source governance can determine outcomes as much as the model itself.
Security and compliance are part of AI design
Prompt injection, data leaks, bias and invalid outputs require risk management directly at the architecture level.
AI teams scale only through explicit contracts
Shared evaluation, prompt/version management and ownership boundaries accelerate delivery and reduce regressions.
How to choose an AI architecture for your product
Step 1
Define product scenario and KPI first
Start with user flow, error cost, target response time and expected business impact before selecting tools.
Step 2
Choose a model strategy
Make an explicit choice: hosted API model, open-source stack, or targeted fine-tuning for your domain constraints.
Step 3
Design data and context layer
RAG, knowledge base design, data versioning and freshness policy define quality stability and reproducibility.
Step 4
Build quality loop and guardrails
Offline/online evaluation, red teaming, fallback paths, and security checks must be built into the release flow.
Step 5
Plan operations and scaling from day one
Cost control, caching, rate limiting, observability and graceful degradation are required for reliable growth.
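The data and context layer from step 3 can be illustrated with the retrieval step of a RAG pipeline. This is a deliberately minimal sketch: real systems use learned embeddings and a vector store, while here plain bag-of-words vectors stand in so the shape of the flow stays visible; the documents and query are invented examples.

```python
from collections import Counter
import math

def embed(text):
    # Stand-in for a learned embedding: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    """Return the top-k documents most similar to the query."""
    q = embed(query)
    scored = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return scored[:k]

docs = [
    "refund policy for annual plans",
    "latency targets for the search service",
    "how to rotate API keys safely",
]
top = retrieve("what is the refund policy", docs, k=1)
```

Everything downstream (prompt assembly, generation, citation) inherits the quality of this step, which is why freshness and versioning of the document set matter as much as the model.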
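Steps 4 and 5 meet in the serving path: guardrails and graceful degradation wrap the model call, and caching trims cost. A hedged sketch, where `call_model` is a placeholder for any real model client and the fallback message is an assumption for illustration:

```python
import time

CACHE = {}
FALLBACK = "Sorry, I can't answer right now. A human will follow up."

def call_model(prompt):
    # Placeholder for an API or local-model call that may fail or stall.
    return f"model answer for: {prompt}"

def serve(prompt, timeout_s=2.0):
    if prompt in CACHE:
        return CACHE[prompt]          # cache hit: zero inference cost
    start = time.monotonic()
    try:
        answer = call_model(prompt)
    except Exception:
        return FALLBACK               # hard failure: degrade gracefully
    if time.monotonic() - start > timeout_s:
        return FALLBACK               # too slow: serve the fallback instead
    CACHE[prompt] = answer
    return answer
```

In production the same wrapper is where rate limiting, output validation, and request tracing attach, so the release flow can gate on them rather than on the raw model.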
Key trade-offs
Closed APIs vs open-source models
Hosted APIs speed up delivery and reduce ops burden, while open-source gives control and flexibility but raises MLOps complexity.
RAG vs fine-tuning
RAG is easier to refresh and iterate, while fine-tuning can improve behavior in narrow domains but makes changes more expensive.
Agent autonomy vs predictability
More autonomy can unlock complex workflows, but increases risk of unsafe actions and makes behavior control harder.
Answer quality vs latency and cost
Higher quality often requires larger models and more context, which directly increases response time and budget usage.
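The quality vs latency and cost trade-off can be made concrete with a back-of-envelope model. The prices and generation speeds below are invented placeholders, not vendor quotes; the point is the shape of the comparison, not the numbers:

```python
MODELS = {
    # name: (USD per 1K output tokens, tokens generated per second)
    # Both values are illustrative assumptions.
    "small": (0.0005, 120.0),
    "large": (0.0150, 35.0),
}

def estimate(model, output_tokens):
    """Return (cost_usd, latency_s) for one response of the given length."""
    price_per_1k, tok_per_s = MODELS[model]
    cost = output_tokens / 1000 * price_per_1k
    latency = output_tokens / tok_per_s
    return cost, latency

cost_s, lat_s = estimate("small", 500)
cost_l, lat_l = estimate("large", 500)
```

Running this for your own traffic profile turns "larger models cost more" into a per-request budget you can defend in a design review.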
What this theme covers
AI Engineering and LLM practices
Designing AI capabilities from prototype to production: prompting, RAG, agents, evaluation, reliability and operational cost control.
History, algorithms and system context
From AI evolution and classic algorithms to modern system design patterns, so you understand not only what works, but why it scales in real-world environments.
Section materials
- AI Engineering (short summary)
- Machine Learning System Design (short summary)
- Hands-On Large Language Models (short summary)
- Prompt Engineering for LLMs (short summary)
- Developing Apps with GPT-4 and ChatGPT (short summary)
- An Illustrated Guide to AI Agents (short summary)
- Hunting for Electric Sheep: The Big Book of Artificial Intelligence (short summary)
- Grokking Artificial Intelligence Algorithms (short summary)
- Deep Learning and Data Analysis: A Practical Guide (short summary)
- The Thinking Game: Documentary
Related chapters
- AI Engineering (short summary) - shows how to assemble a production stack for LLM products: orchestration, evaluation, guardrails and operations.
- Machine Learning System Design (short summary) - adds an end-to-end systems view of ML pipelines: data, features, deployment, monitoring and SLO trade-offs.
- Hands-On Large Language Models (short summary) - extends practical LLM depth on embeddings, retrieval, fine-tuning and architecture-level trade-off choices.
- Prompt Engineering for LLMs (short summary) - helps design reliable prompt strategies as part of system architecture, not as ad-hoc tuning.
- Developing Apps with GPT-4 and ChatGPT (short summary) - demonstrates application patterns for integrating models into product workflows with UX, safety and cost constraints.
