Feature Stores

Training-serving skew — where features computed at training time differ subtly from those computed at serving time — is one of the most insidious production ML bugs. Feature stores solve this by centralising feature computation, ensuring the exact same logic runs in both training pipelines and real-time serving, and enabling feature reuse across teams and models.

The Problem: Training-Serving Skew

Consider a fraud model that uses "number of transactions in the last hour" as a feature. During training, this is computed from historical batch data. In production, it's computed in real time from a streaming database. Any difference in how "last hour" is computed — timezone handling, null treatment, windowing boundaries — silently degrades model performance.

Without Feature Store

Training pipeline: Python script reads from S3 → computes features with Pandas. Serving pipeline: Java microservice reads from Redis → computes features differently. Skew is invisible until production accuracy drops.
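A minimal sketch of how skew creeps in: two pipelines that both claim to count "transactions in the last hour", differing only in whether the window boundary is inclusive. The timestamps and values are illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical transaction timestamps for one customer.
now = datetime(2024, 1, 1, 12, 0, 0)
tx_times = [now - timedelta(minutes=m) for m in (5, 30, 60, 90)]

# Training pipeline: window is inclusive of the 60-minute boundary.
train_count = sum(1 for t in tx_times if now - t <= timedelta(hours=1))

# Serving pipeline: window is exclusive of the boundary.
serve_count = sum(1 for t in tx_times if now - t < timedelta(hours=1))

train_count, serve_count  # 3 vs 2: a silent off-by-one skew
```

Neither implementation is "wrong" in isolation; the model simply trains on one distribution and serves on another.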

With Feature Store

One feature definition (Python). Training retrieves historical point-in-time features from the offline store. Serving retrieves the same feature from the online store. Same logic, guaranteed consistency.

Feature Store Architecture

Offline Store

Historical feature values for training. Backed by data warehouses (BigQuery, Snowflake, S3 + Parquet). Supports point-in-time correct joins — retrieves the feature value as it existed at the time of each training example (prevents data leakage).

Online Store

Low-latency feature retrieval for real-time inference. Backed by Redis, DynamoDB, or Bigtable. Features are pre-computed and cached. P99 latency <5ms for most use cases.
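Conceptually, an online store is a key-value lookup by entity key. A toy sketch with a plain dict standing in for Redis or DynamoDB (keys and values are illustrative):

```python
# Minimal sketch of an online store: pre-computed features keyed by entity ID.
online_store = {
    42: {"tx_count_1h": 3, "tx_amount_avg": 57.10},
}

def get_online_features(customer_id):
    # Serving reads pre-computed values; nothing is recomputed per request.
    return online_store.get(customer_id, {})

get_online_features(42)["tx_count_1h"]  # 3
```

The low latency comes entirely from this shape: all expensive computation happens ahead of time, at materialisation.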

Feast: Open-Source Feature Store

Feast is the most widely used open-source feature store. Here's a complete example:

# 1. Define feature views (feature_repo/features.py)
from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64
from datetime import timedelta

customer = Entity(name="customer_id", description="Customer identifier")

customer_stats = FeatureView(
    name="customer_transaction_stats",
    entities=[customer],
    ttl=timedelta(days=7),
    schema=[
        Field(name="tx_count_1h",    dtype=Int64),
        Field(name="tx_amount_avg",  dtype=Float32),
        Field(name="days_since_join", dtype=Int64),
    ],
    source=FileSource(path="data/customer_stats.parquet",
                      timestamp_field="event_timestamp"),
)

# 2. Register the definitions in the feature registry (run from the repo root)
# feast apply

# 3. Training: retrieve historical features
from feast import FeatureStore
import pandas as pd

store = FeatureStore(repo_path=".")
training_df = store.get_historical_features(
    entity_df=pd.DataFrame({
        "customer_id": [1, 2, 3],
        "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-01"])
    }),
    features=["customer_transaction_stats:tx_count_1h",
              "customer_transaction_stats:tx_amount_avg"],
).to_df()

# 4. Serving: retrieve online features (after materialisation)
# feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")
online_features = store.get_online_features(
    features=["customer_transaction_stats:tx_count_1h"],
    entity_rows=[{"customer_id": 42}],
).to_dict()

Feature Store Options Compared

Tool         | Type        | Strengths                                            | Best For
Feast        | Open source | Free, pluggable backends, Python-native              | Teams starting out
Tecton       | SaaS        | Streaming features, enterprise support, monitoring   | Large ML teams
Hopsworks    | Open core   | Great UI, Spark/Flink integration, feature pipelines | Data engineering teams
Vertex AI FS | Managed     | Native GCP integration, BigQuery backend             | GCP-native teams
SageMaker FS | Managed     | Native AWS integration, streaming ingestion          | AWS-native teams

Point-in-Time Correct Joins

Point-in-time correct joins are the most critical feature store capability. During training, a naive join attaches the latest feature value to each row, which leaks future information. If you're predicting whether a customer will churn in March, you shouldn't use their April transaction count as a feature.

⚠️ Data Leakage

A model trained with future feature values will look great in offline evaluation but fail completely in production. Point-in-time joins ensure each training row only uses feature values available before the prediction timestamp. Always use this for time-series and event-based ML problems.
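The mechanics can be sketched with pandas `merge_asof`, which for each label row picks the most recent feature value written at or before that row's timestamp. Column names and values here are illustrative, not Feast internals.

```python
import pandas as pd

# Labels: one row per training example, with its prediction timestamp.
labels = pd.DataFrame({
    "customer_id": [1, 1],
    "event_timestamp": pd.to_datetime(["2024-02-01", "2024-03-01"]),
})

# Feature log: values as they were written over time.
features = pd.DataFrame({
    "customer_id": [1, 1, 1],
    "event_timestamp": pd.to_datetime(["2024-01-15", "2024-02-15", "2024-03-15"]),
    "tx_count_30d": [4, 9, 20],
})

# Point-in-time join: for each label, take the latest feature value
# written at or before the label's timestamp, never after it.
joined = pd.merge_asof(
    labels.sort_values("event_timestamp"),
    features.sort_values("event_timestamp"),
    on="event_timestamp",
    by="customer_id",
)
joined["tx_count_30d"].tolist()  # [4, 9]; the 20 from March 15 never leaks
```

A naive latest-value join would assign 20 to both rows and inflate offline metrics.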

When Do You Need a Feature Store?

You Need One When...

  • Multiple models use the same features
  • You've had training-serving skew bugs
  • Feature computation takes >10 minutes
  • >3 data scientists sharing features
  • Real-time feature latency matters (<10ms)
  • Compliance requires feature auditability

Probably Don't Need One If...

  • Only 1–2 models in production
  • Features are simple (no windowing/aggregation)
  • Batch predictions only (no real-time serving)
  • Small team with tight code review
  • Early-stage project, still iterating
  • Infrastructure cost is a concern

💡 Start Simple

Before a full feature store, a shared feature computation library (a Python package that both training and serving import) eliminates most skew. This is 20% of the complexity for 80% of the benefit. Adopt Feast when feature sharing across teams becomes the bottleneck.
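The shared-library approach can be as small as this: one hypothetical module (here called `features_lib`) imported by both the training pipeline and the serving service, so the logic cannot diverge.

```python
# features_lib.py: a hypothetical shared module imported by BOTH the
# training pipeline and the serving service.
from datetime import datetime, timedelta

def tx_count_last_hour(tx_timestamps, as_of):
    """Number of transactions in the hour before `as_of` (exclusive boundary)."""
    return sum(
        1 for t in tx_timestamps
        if timedelta(0) <= as_of - t < timedelta(hours=1)
    )

# Training and serving both call the same function:
now = datetime(2024, 1, 1, 12, 0)
txs = [now - timedelta(minutes=m) for m in (10, 45, 75)]
tx_count_last_hour(txs, as_of=now)  # 2
```

The boundary convention is decided once, in one place, instead of twice in two codebases.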

Frequently Asked Questions

How does a feature store handle real-time (streaming) features?

Streaming features require a stream processor (Kafka + Flink/Spark Streaming) to continuously compute aggregations and write to the online store. Tecton and Hopsworks have native streaming support. With Feast, you handle stream processing separately (e.g., with Flink) and write results to Feast's online store. Real-time features add significant infrastructure complexity — batch features recomputed hourly cover most use cases.
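The streaming side can be sketched without any streaming framework: maintain a sliding window per entity and write the fresh aggregate to the online store on every event. A dict stands in for Redis, and events are assumed to arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta

online_store = {}
windows = {}  # per-customer deque of recent transaction timestamps

def on_transaction(customer_id, ts):
    w = windows.setdefault(customer_id, deque())
    w.append(ts)
    # Evict events older than the 1-hour window.
    while w and ts - w[0] >= timedelta(hours=1):
        w.popleft()
    # Write the fresh aggregate to the online store.
    online_store[customer_id] = {"tx_count_1h": len(w)}

base = datetime(2024, 1, 1, 12, 0)
for minutes in (0, 30, 70):
    on_transaction(42, base + timedelta(minutes=minutes))
online_store[42]["tx_count_1h"]  # 2: the event at t=0 has aged out
```

A real deployment replaces the loop with a Kafka consumer and the dict with the feature store's online store, but the windowing logic is the same.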

What is feature materialisation?

Materialisation is the process of computing and writing feature values into the online store so they can be retrieved at serving time with low latency. You run feast materialize (batch) or feast materialize-incremental on a schedule (e.g., every hour) to keep the online store fresh. Without materialisation, online feature retrieval would require recomputing from raw data on every request — too slow.
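What materialisation does can be sketched in a few lines: take the latest offline value per entity and copy it into the online key-value store. The data and the dict-backed store are illustrative.

```python
import pandas as pd

# Offline store: full history of feature values per customer.
offline = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-02"]),
    "tx_count_1h": [3, 5, 1],
})

# Materialisation: latest value per entity goes into the online store.
online_store = {}
latest = offline.sort_values("event_timestamp").groupby("customer_id").tail(1)
for row in latest.itertuples():
    online_store[row.customer_id] = {"tx_count_1h": row.tx_count_1h}

online_store  # customer 1 gets the Jan 2 value (5), not the stale Jan 1 value (3)
```

`feast materialize-incremental` does essentially this on a schedule, processing only the time range since the last run.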

Can I use a feature store with LLMs?

Yes, for structured features used alongside LLMs. In a RAG system, you might store pre-computed document embeddings in a feature store / vector database, or use a feature store to provide user context features (subscription tier, language preference, interaction history) that get included in the LLM prompt. The feature store handles the structured data side; the vector database handles semantic retrieval.
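A toy sketch of that division of labour, with hypothetical feature values and retrieved text standing in for the online store and vector database:

```python
# Structured user context, as fetched from a feature store's online store.
user_features = {"subscription_tier": "pro", "language": "en"}

# Semantically retrieved passages, as returned by a vector database.
retrieved_docs = ["Refunds are processed within 5 business days."]

# Both feed into the LLM prompt.
prompt = (
    f"User tier: {user_features['subscription_tier']}. "
    f"Reply in language: {user_features['language']}.\n"
    "Context:\n" + "\n".join(retrieved_docs) + "\n"
    "Question: How long do refunds take?"
)
```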
