Data Encryption in the Cloud

Encryption is the last line of defense for your data. If an attacker bypasses every other control — IAM, network, application security — properly encrypted data remains unreadable without the keys. Understanding cloud encryption isn't optional for anyone building systems that handle sensitive data.

Two States of Data, Two Types of Encryption

Data exists in two states that require different encryption approaches:

Encryption at Rest

Data stored on disk (S3 objects, EBS volumes, database records, model checkpoints) should be encrypted at rest. If someone physically steals a hard drive or gains access to raw storage, they see ciphertext, not data. Modern cloud storage services (S3, EBS, GCS, Azure Blob) offer server-side encryption (SSE) as a single configuration setting, and several now apply it by default, so there's no excuse for leaving data unencrypted.
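As a rough illustration of how small that configuration is, the sketch below builds the default-encryption settings that S3's PutBucketEncryption API accepts, using SSE-KMS. The function name and the key ARN are hypothetical placeholders, not values from this page.

```python
import json


def default_encryption_config(kms_key_arn: str) -> dict:
    """Build a ServerSideEncryptionConfiguration for S3 PutBucketEncryption."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",      # SSE with a KMS key
                    "KMSMasterKeyID": kms_key_arn,  # which key to use
                },
                # S3 Bucket Keys cut down the number of KMS requests.
                "BucketKeyEnabled": True,
            }
        ]
    }


# Hypothetical key ARN, for illustration only.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"

if __name__ == "__main__":
    print(json.dumps(default_encryption_config(EXAMPLE_KEY_ARN), indent=2))
```

With boto3 installed and credentials configured, the resulting dict could be passed as `boto3.client("s3").put_bucket_encryption(Bucket=..., ServerSideEncryptionConfiguration=config)`.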

Encryption in Transit

Data moving over a network (API calls, database queries, model inference requests) should be encrypted in transit. TLS (Transport Layer Security) encrypts the connection between client and server; HTTPS is HTTP over TLS. Most cloud provider APIs use TLS by default, but don't assume every endpoint rejects plaintext connections. Key concerns: enforce a minimum TLS version (1.2 or later, ideally 1.3), disable deprecated cipher suites, and validate certificates to prevent man-in-the-middle attacks.
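On the client side, these concerns can be expressed in a few lines. The following is a minimal sketch using Python's standard `ssl` module; the function name is a hypothetical helper, and the settings shown are one reasonable baseline rather than a complete hardening guide.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Return a client TLS context that refuses anything below TLS 1.2."""
    # create_default_context() already enables certificate validation
    # and hostname checking against the system trust store.
    ctx = ssl.create_default_context()
    # Reject TLS 1.0 and 1.1; TLS 1.3 is still negotiated when available.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The context can then be handed to a client, for example `urllib.request.urlopen(url, context=strict_client_context())`.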

Both matter: Data encrypted at rest but transmitted in cleartext can be intercepted. Data encrypted in transit but stored unencrypted can be stolen from disk. You need both.

Key Management Services (KMS)

Encryption is only as strong as your key management. A KMS is a managed service for creating, storing, and controlling cryptographic keys. All major cloud providers offer one: AWS KMS, Google Cloud KMS, and Azure Key Vault.

How KMS Works

You never handle the raw key material in your application. Instead, you create a KMS key (called a CMK on this page, after AWS's older name, customer master key), grant your application permission to use it, and call the KMS API to encrypt and decrypt data. The key material never leaves the KMS service in plaintext; it is protected by hardware security modules validated under the FIPS 140 standard (the exact revision and level depend on the provider). On AWS, every KMS API call is logged in CloudTrail, giving you an audit trail of key usage; other providers offer equivalent audit logs.
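To make the call pattern concrete, here is a hedged sketch of encrypting and decrypting a small secret through a KMS client. It assumes a client object shaped like boto3's KMS client (its `encrypt` and `decrypt` methods with these parameter and response names); the helper function names are hypothetical. The client is passed in so the code never sees key material, only ciphertext.

```python
# AWS KMS accepts at most 4096 bytes of plaintext per direct Encrypt call.
MAX_DIRECT_PLAINTEXT_BYTES = 4096


def encrypt_secret(kms_client, key_id: str, plaintext: bytes) -> bytes:
    """Encrypt a small payload under the given KMS key and return ciphertext."""
    if len(plaintext) > MAX_DIRECT_PLAINTEXT_BYTES:
        raise ValueError("payload too large for direct KMS encryption")
    response = kms_client.encrypt(KeyId=key_id, Plaintext=plaintext)
    return response["CiphertextBlob"]


def decrypt_secret(kms_client, ciphertext: bytes) -> bytes:
    """Ask KMS to decrypt; for symmetric keys the key is identified
    from metadata inside the ciphertext blob."""
    response = kms_client.decrypt(CiphertextBlob=ciphertext)
    return response["Plaintext"]
```

In real use the client would come from something like `boto3.client("kms")`, and the caller's IAM identity would need permission to use the key.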

Envelope Encryption

For large data (say, a 100 GB model checkpoint), you don't send the entire file to KMS for encryption; KMS APIs accept only small payloads (AWS KMS, for example, limits a direct Encrypt call to 4 KB). Instead, KMS generates a data encryption key (DEK) and returns it in two forms: plaintext, and encrypted under your CMK. You use the plaintext DEK locally, in memory, to encrypt your data, then discard it and store only the encrypted DEK alongside the encrypted data. To decrypt, you ask KMS to decrypt the DEK with your CMK, then use the decrypted DEK to decrypt your data. This is envelope encryption: the KMS key never directly encrypts large amounts of data, only the small DEK.
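The flow can be sketched end to end with only the Python standard library. Everything here is an assumption made for illustration: `FakeKms` is an in-memory stand-in for a real KMS, and `_keystream_xor` is a toy cipher standing in for AES-GCM. It provides no integrity protection and should not be used to protect real data; a real system would use the provider's data-key API and a vetted AES-GCM implementation.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher (stand-in for AES-GCM): XOR with an HMAC-SHA256 keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


class FakeKms:
    """Hypothetical in-memory KMS; the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> dict:
        dek = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        wrapped = nonce + _keystream_xor(self._master_key, nonce, dek)
        # Return the DEK in both forms: plaintext and wrapped by the master key.
        return {"Plaintext": dek, "CiphertextBlob": wrapped}

    def decrypt(self, wrapped: bytes) -> dict:
        nonce, body = wrapped[:16], wrapped[16:]
        return {"Plaintext": _keystream_xor(self._master_key, nonce, body)}


def envelope_encrypt(kms: FakeKms, data: bytes) -> dict:
    key = kms.generate_data_key()
    nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(key["Plaintext"], nonce, data)
    # Store only the wrapped DEK; the plaintext DEK is not persisted.
    return {"wrapped_dek": key["CiphertextBlob"], "nonce": nonce,
            "ciphertext": ciphertext}


def envelope_decrypt(kms: FakeKms, record: dict) -> bytes:
    # One KMS round trip to unwrap the DEK, then decrypt locally.
    dek = kms.decrypt(record["wrapped_dek"])["Plaintext"]
    return _keystream_xor(dek, record["nonce"], record["ciphertext"])
```

Note that the bulk data passes through the KMS in neither direction; only the 32-byte DEK does, which is why the approach scales to files of any size.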

Types of Encryption Keys

🔑

AWS-Managed Keys

Created and managed by AWS on your behalf. Free. Automatically rotated. You can't customize rotation or control policies granularly.

🗝️

Customer-Managed Keys

You create and control. Fine-grained key policies. Scheduled deletion. Full audit trail. About $1/key/month on AWS, plus per-request charges. Best for sensitive data and compliance.

🏦

Customer-Provided Keys

You provide the key for each operation — the provider uses it but never stores it. Maximum control. Operational burden falls entirely on you.

🔐

HSM-Backed Keys

CloudHSM / Cloud HSM services store keys in dedicated hardware security modules — physical tamper-resistant devices. Highest security, highest cost, for regulated industries.

Encryption for AI Workloads

Training Data

Training datasets containing PII or proprietary information must be encrypted at rest in S3/GCS. Use customer-managed keys so you can audit and revoke access. Server-side encryption with KMS is the baseline; client-side encryption before upload provides defense-in-depth for highly sensitive datasets.

Model Artifacts

Trained model weights are valuable intellectual property. Encrypt model checkpoints and final artifacts with CMKs. Apply S3 bucket policies that deny unencrypted uploads. Use S3 Object Lock (WORM — write once, read many) for compliance-critical model versions.
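One common way to deny unencrypted uploads is a bucket policy that rejects any PutObject request not asking for SSE-KMS. The sketch below builds such a policy as a Python dict; the function name and bucket name are hypothetical. One caveat to check against your setup: a policy like this also rejects uploads that omit the encryption header, even when the bucket has default encryption configured.

```python
import json


def deny_unencrypted_uploads_policy(bucket_name: str) -> dict:
    """Bucket policy that denies PutObject unless SSE-KMS is requested."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsWithoutSseKms",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    # Matches requests whose encryption header is
                    # missing or set to anything other than aws:kms.
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # "example-model-artifacts" is a placeholder bucket name.
    policy = deny_unencrypted_uploads_policy("example-model-artifacts")
    print(json.dumps(policy, indent=2))
```

With boto3, the policy could be applied with `boto3.client("s3").put_bucket_policy(Bucket=name, Policy=json.dumps(policy))`.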

Inference Endpoints

All inference API traffic must use TLS 1.2 or later. For healthcare and financial AI, consider end-to-end encryption, where inference requests are encrypted client-side before reaching your infrastructure, which protects against internal threats as well.

Frequently Asked Questions

Does encryption affect performance?

Modern CPUs have hardware acceleration for AES encryption (the AES-NI instruction set), which keeps encryption overhead small, often cited as under 1% for typical workloads. The bigger concern is key operations (KMS API calls for decryption), which add network round trips. Envelope encryption mitigates this: you call KMS once to decrypt the DEK, then use it for all subsequent decryption locally. For GPU training workloads, storage encryption has minimal impact on GPU utilization.

What is client-side encryption vs. server-side encryption?

Server-side encryption (SSE): the cloud provider's service encrypts your data after receiving it. The data is protected by TLS on the way to the provider's endpoint, where it arrives as plaintext and is then encrypted for storage. The provider has access to the plaintext during processing. Client-side encryption: you encrypt data before sending it to the provider. The provider only ever sees ciphertext, so it can't read your data even if compelled or breached. Client-side encryption is necessary when you need to ensure the provider can never access the plaintext.

What happens to my encrypted data if I lose the KMS key?

You permanently lose access to the encrypted data. This is why key deletion in AWS KMS has a mandatory 7–30 day waiting period — to prevent accidental data loss. Rotating keys (creating new versions) doesn't delete old versions, so historical data remains accessible. For disaster recovery: never delete CMKs for data you still need, maintain key backup policies, and test decryption with your key recovery procedures before you actually need them.
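The waiting period shows up directly in the API. Here is a small sketch, assuming a client shaped like boto3's KMS client (its `schedule_key_deletion` and `cancel_key_deletion` methods); the wrapper function names are hypothetical, and the 7 to 30 day bounds mirror the AWS KMS limits described above.

```python
def schedule_deletion(kms_client, key_id: str, waiting_days: int = 30) -> dict:
    """Schedule a key for deletion after a waiting period of 7 to 30 days."""
    if not 7 <= waiting_days <= 30:
        raise ValueError("waiting period must be between 7 and 30 days")
    return kms_client.schedule_key_deletion(
        KeyId=key_id, PendingWindowInDays=waiting_days
    )


def cancel_deletion(kms_client, key_id: str) -> dict:
    """Cancel a pending deletion while the waiting period is still running."""
    return kms_client.cancel_key_deletion(KeyId=key_id)
```

Defaulting to the longest window gives the most time to notice that something still depends on the key. On AWS, a key whose deletion is cancelled comes back in a disabled state and must be re-enabled before use.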
