Data scientist builds the model. ML engineer ships it.

The ML Engineer Associate exam draws a deliberate line between two roles that are often conflated. A data science question might ask: “Which regularisation technique reduces overfitting in a neural network by randomly dropping neurons during training?” An MLA-C01 question asks: “A SageMaker training job completes successfully but the deployed endpoint latency exceeds the SLA under peak load. Which combination of endpoint configuration changes reduces p99 latency without retraining the model?” The shift — from selecting the right algorithm to operating a reliable, cost-efficient ML system in production — defines every domain of this exam.

AWS launched MLA-C01 as the associate-tier entry point for the ML engineering track, positioned below the existing AWS Certified Machine Learning – Specialty (MLS-C01), which remains the credential for practitioners who need depth in algorithm selection, statistical theory, and advanced feature engineering. MLA-C01 does not test ML theory at that depth. It tests whether you can wire together AWS services to build an end-to-end ML pipeline, monitor it in production, and make informed trade-offs when something breaks or costs spike. The target audience is engineers with one to two years of hands-on SageMaker experience; beyond that, the only baseline AWS suggests is Cloud Practitioner-level knowledge of AWS.

The four-domain structure maps directly to the ML lifecycle: get data in and clean it (Domain 1), train and evaluate a model (Domain 2), ship it and keep it running (Domain 3), and design the right ML solution architecture for a given business constraint (Domain 4). Domain weights are roughly equal, with data preparation and model development carrying slightly more weight, reflecting that a production ML system fails most often at the data layer and the training pipeline — not at the inference endpoint.

The four exam domains

Domain 1 — Data Preparation for Machine Learning (~28%)

The highest-weighted domain tests the full data engineering layer that feeds every training job: ingestion, transformation, labelling, and quality validation. Candidates who treat this domain as “just ETL” underperform — the exam tests ML-specific data concerns that general data engineering certifications do not cover.

  • Data ingestion at scale: Amazon S3 as the primary ML data lake; AWS Glue for serverless ETL (crawlers, job bookmarks, DynamicFrames vs Spark DataFrames); Amazon Kinesis Data Streams and Kinesis Data Firehose for streaming feature pipelines; AWS Lake Formation for governed data lake access control. The exam tests which service to use given throughput, latency, and cost constraints — not the API calls to configure them.
  • SageMaker Data Wrangler: the no-code/low-code data preparation interface inside SageMaker Studio. Data Wrangler supports 300+ built-in transforms, automatic data quality reports, and one-click export to SageMaker Pipelines or SageMaker Feature Store. The exam tests when Data Wrangler is the appropriate choice (rapid prototyping, non-engineering personas) versus a custom Glue job or Spark script (large-scale production pipelines, complex custom logic).
  • SageMaker Feature Store: the managed feature repository that stores, shares, and retrieves features for training and serving. Offline store (S3-backed, for training and batch scoring) versus online store (low-latency lookups of the latest feature values, for real-time inference) is a core exam concept. Point-in-time correct feature retrieval ensures the training dataset uses only feature values that would have been available at prediction time. This prevents future information from leaking into training, and serving the same feature definitions online and offline guards against training-serving skew, a common cause of model degradation in production.
  • Data labelling: Amazon SageMaker Ground Truth for managed human labelling workflows (public workforce via Mechanical Turk, private workforce, vendor-managed workforce). Automated data labelling (active learning loop that pre-labels high-confidence examples and routes uncertain examples to humans). Ground Truth Plus for fully managed, AWS-staffed labelling services. The exam tests when automated labelling is appropriate and what the confidence threshold tradeoffs are.
  • Handling imbalanced datasets: oversampling minority classes (SMOTE — Synthetic Minority Oversampling Technique), undersampling majority classes, adjusting class weights in the loss function, and using stratified sampling for train/validation/test splits. SageMaker Clarify for pre-training bias detection identifies class imbalance before training begins rather than discovering it after model evaluation.
  • Data quality and validation: SageMaker Model Monitor’s data quality monitoring job detects distribution drift between the baseline training data statistics and the live inference request distribution. Knowing when to trigger retraining versus when drift is acceptable noise is an explicitly testable judgment call. The exam also tests feature importance — identifying and dropping low-signal features to reduce dimensionality and training cost without sacrificing model performance.
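
Point-in-time correct retrieval is easier to reason about with a concrete case. The following is a minimal pure-Python sketch of the idea, not the Feature Store API: the feature history, entity IDs, and integer timestamps are all hypothetical, and a real offline store would typically be queried with SQL or Spark instead.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical feature history for one entity: (event_time, value) pairs,
# kept sorted by event_time. Timestamps are plain integers for illustration.
FEATURE_HISTORY = {
    "customer-1": [(10, 100.0), (20, 150.0), (30, 90.0)],
}


def feature_as_of(entity_id: str, prediction_time: int) -> Optional[float]:
    """Return the latest feature value recorded at or before prediction_time."""
    history = FEATURE_HISTORY.get(entity_id, [])
    times = [event_time for event_time, _ in history]
    # bisect_right counts how many records have event_time <= prediction_time.
    idx = bisect_right(times, prediction_time)
    if idx == 0:
        return None  # no value existed yet, so nothing may be used
    return history[idx - 1][1]


def build_training_rows(
    labels: List[Tuple[str, int, int]]
) -> List[Tuple[str, int, Optional[float], int]]:
    """Join each (entity_id, label_time, label) with its as-of feature value."""
    return [
        (entity_id, label_time, feature_as_of(entity_id, label_time), label)
        for entity_id, label_time, label in labels
    ]


# Hypothetical labels observed at times 5, 25, and 30.
LABELS = [("customer-1", 5, 0), ("customer-1", 25, 1), ("customer-1", 30, 0)]
TRAINING_ROWS = build_training_rows(LABELS)
```

The row labelled at time 25 receives the value written at time 20, never the later value written at time 30. A naive "latest value" join would use the time-30 value for every row, which is exactly the leakage that point-in-time correctness rules out.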

Domain 2 — ML Model Development (~26%)

Domain 2 tests the training layer: choosing between built-in algorithms and custom containers, running managed training jobs, optimising hyperparameters, and evaluating model performance. The exam does not test the mathematics of individual algorithms — it tests which algorithm class is appropriate for a given problem type and dataset characteristic.

  • SageMaker built-in algorithms: XGBoost for tabular classification and regression (the most commonly tested algorithm, so know its key hyperparameters: max_depth, eta, num_round, subsample); Linear Learner for classification and regression with sparse high-dimensional inputs; k-means for clustering; k-nearest neighbours (k-NN) for classification and regression; Random Cut Forest for anomaly detection; PCA for dimensionality reduction; BlazingText for word embeddings and text classification; DeepAR for time-series forecasting; Object Detection and Image Classification for computer vision tasks. The exam presents a business scenario and asks which built-in algorithm is the most appropriate starting point.
  • Custom training containers: when built-in algorithms do not cover the required model architecture, SageMaker supports custom Docker containers pushed to Amazon ECR. Know the SageMaker Training Toolkit conventions: the /opt/ml/input/data/ channel structure for training data, /opt/ml/model/ for saving the trained model, and /opt/ml/output/failure for failure reporting. SageMaker Script Mode (using the SageMaker Python SDK’s Estimator class with a custom entry_point script) is preferred over fully custom containers when the framework (TensorFlow, PyTorch, Hugging Face, MXNet) is already supported by a SageMaker pre-built container.
  • SageMaker Training Jobs: managed compute for training (no cluster provisioning or teardown). Spot training instances reduce training costs by up to 90% by using unused EC2 capacity; checkpointing to S3 is required to resume spot-interrupted jobs without losing progress. Distributed training — data parallelism (splitting the dataset across multiple GPUs, each with a full model copy; gradients averaged via AllReduce) and model parallelism (splitting the model across GPUs for models too large to fit in a single GPU’s memory) — is tested at the “when and why” level, not the implementation level.
  • SageMaker Automatic Model Tuning (hyperparameter optimisation): Bayesian optimisation as the default strategy (learns from prior job results to focus on promising hyperparameter regions, more sample-efficient than random search); random search as the baseline; Hyperband as the early-stopping strategy that terminates poorly performing jobs before they complete, reducing cost. Know how to define a hyperparameter search space (continuous, integer, categorical) and how to specify the objective metric.
  • SageMaker Experiments: tracks training runs, hyperparameter configurations, input datasets, and evaluation metrics in a structured format. Enables comparison of model versions and reproducibility. The exam tests how Experiments integrates with training jobs and SageMaker Pipelines to create a complete audit trail — a requirement in regulated industries.
  • Model evaluation: classification metrics are accuracy (misleading on imbalanced datasets), precision (the share of predicted positives that are correct; optimise it when false positives are costly), recall/sensitivity (the share of actual positives that are found; optimise it when false negatives are costly), F1 (harmonic mean of precision and recall), and AUC-ROC (threshold-independent measure of discriminative ability). Regression metrics are RMSE (penalises large errors more heavily), MAE (less sensitive to outliers than RMSE), and R² (proportion of variance explained). The exam tests which metric is appropriate given the business cost of false positives versus false negatives. Fraud detection and cancer screening both prioritise recall, because a missed fraud or a missed cancer is the expensive error, but they differ in how many false positives they can absorb: a falsely flagged transaction costs a manual review, while a false cancer alert can lead to unnecessary procedures, so the screening model has to be more conservative about false positives.
  • Foundation models and generative AI: Amazon Bedrock provides API access to foundation models (Anthropic Claude, Amazon Titan, Meta Llama, Mistral, Cohere) without requiring training infrastructure. SageMaker JumpStart provides one-click fine-tuning and deployment of open-source foundation models. The exam tests when to use Bedrock (managed API, no fine-tuning needed, lowest operational overhead), SageMaker JumpStart (fine-tuning on proprietary data with managed infrastructure), or a fully custom SageMaker training job (maximum control, custom architectures).
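
To see why accuracy misleads on imbalanced data, it helps to compute the classification metrics from a confusion matrix by hand. This is a small sketch using only the standard definitions; the fraud counts are invented for illustration and do not come from any real dataset.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical scenario: 1,000 transactions, 10 of them fraudulent.
# The model catches 2 frauds, misses 8, and raises 3 false alarms.
FRAUD_METRICS = classification_metrics(tp=2, fp=3, fn=8, tn=987)
```

Here accuracy comes out at 0.989 while recall is only 0.2: the model looks excellent by the headline number yet misses eight of the ten frauds. That gap is the reason scenario questions about rare positive classes point toward recall, F1, or AUC-ROC over accuracy.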

Domain 3 — Deployment and Orchestration of ML Workloads (~22%)

Domain 3 is the operational core of the exam. It tests every layer of getting a trained model into production and keeping it running: endpoint types, deployment strategies, pipeline orchestration, model governance, and cost optimisation. Many candidates with strong data science backgrounds underperform here because this domain resembles cloud operations more than ML research.

  • SageMaker Inference endpoint types: real-time endpoints (synchronous, low-latency inference, persistent endpoint, billed per second); batch transform (asynchronous, processes entire datasets without a persistent endpoint, lower cost for offline scoring); asynchronous inference (requests are placed in a managed internal queue, payloads can be up to 1 GB and processing can take up to one hour, and the endpoint can scale in to zero instances when the queue is empty, which suits long-running inference and spiky traffic); serverless inference (pay-per-use, automatic scaling to zero, ideal for infrequent bursty traffic that can tolerate cold starts). The exam presents a scenario with latency, throughput, traffic pattern, and cost constraints and asks which endpoint type best satisfies the requirements.
  • Multi-model endpoints and multi-container endpoints: multi-model endpoints host thousands of models on a single endpoint instance by lazy-loading models from S3, reducing cost for large model catalogues with sparse per-model traffic. Multi-container endpoints host multiple different containers on a single endpoint, enabling sequential inference pipelines (pre-processing → model → post-processing) without client-side orchestration.
  • Deployment strategies: blue/green deployment (shift 100% of traffic to the new model version after validation, instant rollback by reverting traffic); canary deployment (shift a small percentage of traffic to the new version, monitor metrics, gradually increase if healthy); A/B testing with production variants (route traffic proportionally to multiple model versions simultaneously to compare performance under real traffic). SageMaker production variants define each model version, instance type, and traffic weight on a single endpoint.
  • SageMaker Pipelines: the native ML workflow orchestration service. A Pipeline is a DAG of steps (Processing for data transformation and model evaluation, Training, Tuning, Condition for branching on a metric threshold, RegisterModel, Transform, and ClarifyCheck) that executes on managed infrastructure, with optional caching so that unchanged steps are not re-run. Pipelines integrate with SageMaker Experiments for tracking, the Model Registry for governance, and Amazon EventBridge for event-driven triggers (e.g., retrain when new data lands in S3).
  • SageMaker Model Registry: central catalogue for trained model versions with an approval workflow (each version starts as PendingManualApproval and is then marked Approved or Rejected). Approved model versions can be deployed automatically via Pipelines or manually via the Studio UI. The Registry groups versions into model package groups and stores model metadata, the training job ARN, and evaluation metrics for each version. The exam tests how the Registry enables governance requirements: no model reaches production without an approval gate.
  • Auto Scaling for endpoints: Application Auto Scaling for SageMaker endpoints scales instance count based on the InvocationsPerInstance metric (the canonical ML endpoint scaling metric, rather than CPU or memory; target tracking exposes it as the predefined metric type SageMakerVariantInvocationsPerInstance). A target tracking scaling policy maintains a target invocations-per-instance value, while step scaling allows more aggressive scale-out on sudden traffic spikes. The exam tests scaling policy selection given latency SLA and cost constraints.
  • Cost optimisation: spot training for up to 90% savings on training jobs (with checkpointing); right-sizing endpoint instance types using SageMaker Inference Recommender (runs load tests across instance families and reports latency/throughput/cost tradeoffs); SageMaker Savings Plans for committed usage discounts on inference; serverless inference for workloads that can tolerate cold-start latency in exchange for zero idle cost.
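
The target tracking policy described above can be sketched as the two request payloads Application Auto Scaling expects. This is a hedged illustration: the endpoint name, variant name, target value, capacity limits, and cooldowns are all hypothetical, and the code only builds dictionaries without calling AWS.

```python
from typing import Tuple


def target_tracking_policy(
    endpoint_name: str,
    variant_name: str,
    target_invocations: float,
    min_instances: int = 1,
    max_instances: int = 4,
) -> Tuple[dict, dict]:
    """Build the scalable-target and scaling-policy payloads for one variant."""
    # Application Auto Scaling identifies an endpoint variant by this resource ID.
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"

    scalable_target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    scaling_policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_invocations),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            # Illustrative cooldowns: scale out quickly, scale in cautiously.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }
    return scalable_target, scaling_policy


# Hypothetical endpoint and target; a real target comes from load testing.
SCALABLE_TARGET, SCALING_POLICY = target_tracking_policy(
    "demo-endpoint", "AllTraffic", target_invocations=70
)
```

The dictionaries are shaped as keyword arguments for the `register_scalable_target` and `put_scaling_policy` calls on the boto3 `application-autoscaling` client. A lower target value adds instances sooner, which protects latency at higher cost; a higher target does the opposite, which is the trade-off the scaling questions probe.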

Domain 4 — ML Solution Design (~24%)

Domain 4 tests architectural judgment: choosing between AWS AI services, SageMaker, and custom solutions; designing for security, reliability, and responsible AI; and translating business requirements into ML system requirements. Questions in this domain are scenario-heavy and reward practitioners who think in terms of constraints and trade-offs rather than memorised service lists.

  • AWS AI services vs SageMaker vs custom solutions: Amazon Rekognition (image and video analysis, no ML expertise required), Amazon Comprehend (NLP — entity recognition, sentiment, key phrase extraction), Amazon Textract (document data extraction with layout understanding), Amazon Transcribe (speech-to-text), Amazon Polly (text-to-speech), Amazon Forecast (managed time-series forecasting), Amazon Personalize (managed real-time recommendations). Use pre-built AI services when the task fits an existing API, customisation is unnecessary, and time-to-value matters most. Use SageMaker when the task requires a custom model, proprietary data, or domain-specific fine-tuning. The exam tests this decision at the system design level — presenting business requirements and asking which approach satisfies them.
  • Responsible AI and bias detection: SageMaker Clarify provides three capabilities — pre-training bias detection (identifies imbalance in the training dataset before a model is trained), post-training bias detection (measures how the trained model’s predictions differ across demographic groups), and model explainability (SHAP-based feature attribution showing which features drove individual predictions). The exam tests when Clarify is required (regulated use cases, high-stakes decisions) and what each capability can and cannot prove.
  • Security architecture for ML workloads: SageMaker VPC mode runs training and inference compute inside a customer-managed VPC so that data does not traverse the public internet; combined with VPC endpoints (PrivateLink) for S3 and SageMaker APIs, this helps satisfy strict network-isolation and compliance requirements. IAM execution roles for training jobs and endpoints should follow least privilege, and the exam presents over-permissioned role configurations and asks how to remediate them. Encryption at rest (S3 SSE-KMS for training data and model artefacts, EBS volume encryption for training instances) and in transit (TLS) are required for regulated workloads. Network isolation mode prevents training containers from making outbound network calls, including to the internet.
  • MLflow on SageMaker: SageMaker Managed MLflow provides a fully managed MLflow Tracking Server for experiment tracking, model registry, and model serving, eliminating the need to self-host MLflow on EC2. The exam tests MLflow as the open-source standard for cross-framework experiment tracking and when using SageMaker Managed MLflow is preferred over SageMaker Experiments (teams already using MLflow, multi-framework environments, requirement for MLflow-compatible tooling).
  • Model monitoring and drift detection: SageMaker Model Monitor runs scheduled monitoring jobs against data captured from a live endpoint, comparing incoming request features and model predictions against a baseline, typically computed from the training dataset. There are four monitor types: data quality (feature distribution drift), model quality (prediction accuracy drift, requires ground truth labels), bias drift (fairness metric changes over time), and feature attribution drift (SHAP value changes). CloudWatch alarms trigger retraining pipelines via EventBridge when drift exceeds thresholds. The exam tests which monitor type detects which category of production degradation.
  • Designing for reliability: multi-AZ SageMaker endpoint deployments for high availability (SageMaker distributes instances across AZs automatically when multiple instances are specified); cross-region model replication via the Model Registry for disaster recovery; SageMaker Pipelines with idempotent steps and cached outputs for resilient retraining workflows. The exam tests which architectural choices satisfy RTO/RPO requirements for ML systems, mapping ML reliability requirements to the same framework used for non-ML workloads in the AWS Solutions Architect exams.
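
The baseline-versus-live comparison behind data quality monitoring can be illustrated with a deliberately simple check. This sketch is not the algorithm SageMaker Model Monitor uses; it assumes a single numeric feature, measures how far the live mean has moved in units of the baseline standard deviation, and uses a hypothetical threshold of 0.5.

```python
from statistics import mean, pstdev
from typing import Sequence


def mean_shift_in_std(baseline: Sequence[float], live: Sequence[float]) -> float:
    """Absolute shift of the live mean, in baseline standard deviations."""
    base_std = pstdev(baseline)
    shift = abs(mean(live) - mean(baseline))
    if base_std == 0:
        # A constant baseline: any movement at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / base_std


def drift_detected(
    baseline: Sequence[float], live: Sequence[float], threshold: float = 0.5
) -> bool:
    """Flag drift when the standardised mean shift exceeds the threshold."""
    return mean_shift_in_std(baseline, live) > threshold


# Hypothetical feature values: training baseline and two live windows.
BASELINE = [10.0, 12.0, 11.0, 9.0, 8.0]
LIVE_STABLE = [10.0, 11.0, 9.0, 10.0, 10.0]
LIVE_SHIFTED = [14.0, 15.0, 13.0, 16.0, 12.0]

STABLE_ALERT = drift_detected(BASELINE, LIVE_STABLE)
SHIFTED_ALERT = drift_detected(BASELINE, LIVE_SHIFTED)
```

The stable window leaves the mean unchanged and raises no alert, while the shifted window moves the mean by almost three baseline standard deviations and does. Choosing the threshold is the judgment call the text describes: too low and normal noise triggers retraining, too high and real drift goes unnoticed.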

The MLA-C01 question pattern that surprises most candidates: the exam often presents two technically correct AWS service choices and asks you to identify the one that best satisfies an operational constraint such as cost, latency, or time-to-deploy. Knowing what SageMaker Pipelines and SageMaker Model Monitor do is not enough; you need to know when to use them versus a simpler, cheaper alternative. If the scenario says “a small team, infrequent retraining, tight budget,” the answer is almost never the most feature-rich SageMaker service.

How MLA-C01 fits the AWS certification landscape

AWS currently offers three ML-related certifications aimed at different levels and audiences. Understanding where MLA-C01 sits prevents wasted study time.

For most cloud and DevOps engineers pivoting into ML engineering, MLA-C01 is the correct first ML certification: it tests the AWS-native tooling and operational skills that are directly applicable to production ML work without requiring the statistics depth that MLS-C01 demands. Holding an AWS associate credential (Solutions Architect, Developer, or SysOps) before attempting MLA-C01 significantly reduces ramp-up time, because Domain 4 (ML Solution Design) tests the same architectural reasoning about trade-offs between cost, reliability, latency, and security that appears across all AWS associate exams.

What to study: the SageMaker ecosystem is the exam

Unlike the AWS ML Specialty exam (MLS-C01), which tests a wide surface of AWS data services (EMR, Glue, Redshift, Kinesis, Athena) with equal depth, MLA-C01 is heavily SageMaker-centric. Expect roughly 70 to 80 percent of exam questions to involve a SageMaker service, configuration, or trade-off, so the study priority is clear: start with SageMaker and treat the surrounding data services as supporting material.

The AWS-provided sample exam questions in the official MLA-C01 candidate guide skew heavily toward operational and architectural judgment questions rather than factual recall. If your study material is mostly flashcards and service descriptions, supplement it with scenario-based practice that forces you to compare two or three plausible service choices.

Why it matters for cert candidates

MLA-C01 rewards practitioners who think in terms of ML system trade-offs, not ML algorithm expertise. The single most common study mistake is over-investing in algorithm theory (bias-variance, regularisation, gradient descent variants) at the expense of AWS service mechanics (SageMaker Pipelines step types, Feature Store online vs offline, endpoint scaling metrics). The exam is an ML engineering exam — it tests whether you can build and operate ML systems on AWS, not whether you can derive the backpropagation equations. Allocate 70% of your study time to hands-on SageMaker labs and 30% to architectural decision-making scenarios. The official AWS exam guide contains the authoritative domain breakdown and sample questions — it is the single most important preparation document and should be reviewed before any third-party course.

Test your MLA-C01 knowledge across all four domains with practice questions on CertQuests.
