AWS AI & SageMaker

Amazon Web Services is the world's largest cloud provider, with one of the most mature and comprehensive sets of AI/ML services. SageMaker is its flagship ML platform, covering everything from data preparation to model training, deployment, and monitoring.

📖 Covers: SageMaker Studio · Training Jobs · Model Registry · Endpoints · Bedrock · S3 · IAM · Cost Tips

AWS AI Service Landscape

🏋️
SageMaker

End-to-end ML platform: notebooks, training, deployment, monitoring

🤖
Bedrock

Managed LLM API: Claude, Llama, Titan, Stable Diffusion. No GPU management.

👁️
Rekognition

Pre-built computer vision: object detection, face recognition, content moderation

📝
Comprehend

NLP service: sentiment, entities, key phrases, language detection

🗣️
Transcribe / Polly

Speech-to-text and text-to-speech services

🔄
Glue / Athena

Data preparation and SQL analytics on S3 data lakes

SageMaker: The ML Lifecycle

1
SageMaker Studio

Web-based IDE. Jupyter notebooks with managed compute. No local setup.

2
Data Wrangler

Visual data preparation: clean, transform, and visualise S3 data.

3
Training Jobs

Managed container-based training. Auto-provisions GPUs, handles checkpointing.

4
Experiments

Track hyperparameters, metrics, and artefacts across training runs.

5
Model Registry

Version models, track lineage, approve/reject for deployment.

6
Endpoints

Deploy models behind real-time REST APIs or run batch inference jobs. Auto-scaling built in.

7
Model Monitor

Detect data drift and model quality degradation in production.

Launch a Training Job

Python · SageMaker Training Job (PyTorch)
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()
sess = sagemaker.Session()

# Upload training data to S3
s3_data = sess.upload_data('data/', bucket='my-bucket', key_prefix='training')

# Define estimator
estimator = PyTorch(
    entry_point='train.py',          # Your training script
    role=role,
    framework_version='2.1',
    py_version='py310',
    instance_type='ml.p3.2xlarge',  # V100 GPU, $3.82/hr
    instance_count=1,
    hyperparameters={
        'epochs': 50,
        'learning-rate': 0.001,
    }
)

# Start training (provisions GPU, runs, saves model to S3)
estimator.fit({'training': s3_data})

Deploy a Model Endpoint

Python · SageMaker Real-Time Endpoint
# Deploy trained model as REST API
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.c5.xlarge',  # CPU for inference, cheaper
    endpoint_name='my-model-prod'
)

# Make predictions (the payload format depends on your inference
# script's input_fn and the predictor's serializer)
result = predictor.predict({'inputs': [[5.1, 3.5, 1.4, 0.2]]})
print(result)

# Clean up (endpoints cost money even when idle!)
predictor.delete_endpoint()
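The `predictor` object wraps the endpoint for you; applications that don't use the SageMaker SDK can call the same endpoint through the lower-level `sagemaker-runtime` client. A minimal sketch, reusing the endpoint name and payload shape from the example above (the `boto3` call is commented out because it needs a live endpoint and AWS credentials):

```python
import json

# Endpoint name and payload from the deploy example above
ENDPOINT = 'my-model-prod'
payload = json.dumps({'inputs': [[5.1, 3.5, 1.4, 0.2]]})

# From an application without the SageMaker SDK, call the endpoint
# via the low-level runtime client (needs a live endpoint + credentials):
# import boto3
# runtime = boto3.client('sagemaker-runtime')
# response = runtime.invoke_endpoint(
#     EndpointName=ENDPOINT,
#     ContentType='application/json',
#     Body=payload,
# )
# result = json.loads(response['Body'].read())
```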

AWS Bedrock — Managed LLM API

Bedrock gives you API access to foundation models (Claude, Llama, Titan) without managing any infrastructure. Pay per token, scale to zero automatically.

Python · Bedrock — Claude API
import boto3
import json

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.invoke_model(
    modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": "Summarise this document..."}]
    })
)
result = json.loads(response['body'].read())
print(result['content'][0]['text'])

Cost Optimisation Tips

💡
Use Spot Instances

Save up to 90% on training jobs. SageMaker handles interruption and checkpointing automatically.

💡
Choose the right instance

ml.p3.2xlarge (~$3.82/hr) for training; ml.c5.xlarge (~$0.19/hr) for CPU inference.

💡
Delete idle endpoints

Real-time endpoints bill 24/7, even with zero traffic. Schedule deletion during off-hours (e.g. with a Lambda), or switch to Serverless Inference if you need scale-to-zero.

💡
Use Serverless Inference

For intermittent traffic. Charges only while processing requests; zero cost when idle.
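To make these trade-offs concrete, here's a back-of-the-envelope cost calculator using the on-demand prices quoted above. The 70% spot discount is an assumption for illustration; actual spot savings vary by region and demand (AWS advertises up to 90%):

```python
ON_DEMAND_GPU = 3.82   # ml.p3.2xlarge, $/hr (price quoted above)
ON_DEMAND_CPU = 0.19   # ml.c5.xlarge, $/hr

def training_cost(hours, spot_discount=0.0):
    """Cost of a GPU training job; spot_discount is the fraction saved."""
    return round(hours * ON_DEMAND_GPU * (1 - spot_discount), 2)

def idle_endpoint_cost(days):
    """A real-time endpoint bills 24/7 whether or not it serves traffic."""
    return round(days * 24 * ON_DEMAND_CPU, 2)

print(training_cost(10))        # 10 hrs on-demand -> 38.2
print(training_cost(10, 0.7))   # same job on spot at 70% off -> 11.46
print(idle_endpoint_cost(30))   # a forgotten endpoint for a month -> 136.8
```

A month of an idle CPU endpoint costs more than three 10-hour GPU training runs, which is why the "delete idle endpoints" tip matters.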

Frequently Asked Questions

What IAM permissions does SageMaker need?

SageMaker needs an execution role with: AmazonSageMakerFullAccess (for SageMaker APIs), S3 read/write access for your data bucket, ECR access if using custom containers, and CloudWatch Logs access. Never use AdministratorAccess in production — follow least privilege.
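As a sketch of what least privilege looks like in practice, the policy documents below show a trust policy letting SageMaker assume the role, plus an S3 policy scoped to a single bucket (`my-bucket` and the role name are placeholders; the `boto3` calls that would attach them are commented out since they need credentials):

```python
import json

# Trust policy: lets the SageMaker service assume the execution role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# S3 policy scoped to one bucket -- tighter than blanket S3 access
s3_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::my-bucket",
            "arn:aws:s3:::my-bucket/*",
        ],
    }],
}

print(json.dumps(trust_policy, indent=2))

# These documents would be attached with the IAM API:
# import boto3
# iam = boto3.client('iam')
# iam.create_role(RoleName='sagemaker-exec',
#                 AssumeRolePolicyDocument=json.dumps(trust_policy))
```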

How do I version my models in SageMaker?

Use the SageMaker Model Registry. Register models after training with estimator.register(). Models go through an approval workflow (Pending → Approved/Rejected) before being deployed to production endpoints. Model lineage tracks which training job and data version produced each model.
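A sketch of the registration call. The group name and instance types are placeholders, and the call itself is commented out because it requires a trained estimator:

```python
# Arguments you might pass to estimator.register() after training
# ('my-model-group' is a placeholder model package group)
register_kwargs = dict(
    model_package_group_name='my-model-group',
    content_types=['application/json'],
    response_types=['application/json'],
    inference_instances=['ml.c5.xlarge'],
    transform_instances=['ml.c5.xlarge'],
    approval_status='PendingManualApproval',  # gate before production
)

# model_package = estimator.register(**register_kwargs)
```

With `approval_status='PendingManualApproval'`, the model version waits in the registry until someone approves or rejects it, which is the workflow described above.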
