Phase 4: LLMs & Generative AI
Large Language Models have transformed what AI can do. ChatGPT, Claude, Gemini, Llama — they all share the same core architecture: a massive Transformer trained on internet-scale text. In this phase, you'll understand how they work and how to build with them.
Build LLM-powered applications from scratch
8–12 weeks
Hugging Face, LangChain, OpenAI API, FAISS
What is a Large Language Model?
An LLM is a neural network — specifically a Transformer — trained to predict the next token given all previous tokens. "Large" means billions of parameters. "Language" means it understands and generates human text. The key insight: a model that predicts the next token well enough is forced to pick up broad knowledge of language, reasoning, and the world.
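The next-token objective itself is simple enough to demonstrate with a toy bigram model — counts of which token follows which — which is a drastic simplification of what a Transformer learns with billions of parameters, but the training signal is the same:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count which token follows each token: a toy stand-in for the
    next-token distribution a Transformer learns from its corpus."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Greedy prediction: return the most frequent continuation."""
    return counts[token].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat the cat ran")
print(predict_next(model, "the"))  # → cat
```

A real LLM replaces the count table with a neural network that generalises to contexts it has never seen, and conditions on the whole preceding sequence rather than one token.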
The LLM Training Pipeline
1. Pretraining: train on a massive text corpus (Common Crawl, books, Wikipedia) to predict the next token. Cost: $1M–$100M+.
2. Supervised fine-tuning (SFT): fine-tune on high-quality instruction-response pairs to teach the model to follow instructions.
3. RLHF: human raters rank model responses, a reward model is trained on those rankings, and PPO optimises the LLM against the reward signal.
Topics in This Phase
How LLMs Work
Tokenisation, embeddings, context windows, temperature, sampling strategies. Deep dive into the mechanics.
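One of those mechanics — temperature — can be shown in a few lines. This is a minimal sketch of temperature-scaled softmax sampling over raw logits; the function name and toy logits are illustrative, not from any library:

```python
import math, random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Divide logits by the temperature, softmax, then sample.
    Low temperature -> near-greedy (argmax); high -> more diverse."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
# Near-zero temperature collapses the distribution onto the argmax token.
print(sample_with_temperature(logits, temperature=0.01))  # → 0
```

At temperature 1.0 the same call would return index 1 or 2 a meaningful fraction of the time — which is why identical prompts can give different completions.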
Prompt Engineering
Zero-shot, few-shot, chain-of-thought, system prompts. Get better results from any LLM without training.
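Few-shot prompting is mostly about message structure. The sketch below assembles a chat-format message list — a system prompt, worked examples, then the real query — in the role/content shape used by OpenAI-style chat APIs; the helper name and example task are illustrative:

```python
def build_few_shot_messages(system, examples, query):
    """Assemble a few-shot chat prompt: system instructions, then
    (user, assistant) example pairs, then the actual query."""
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_few_shot_messages(
    system="Classify sentiment as positive or negative. Answer with one word.",
    examples=[("I loved it", "positive"), ("Terrible service", "negative")],
    query="The food was amazing",
)
print(len(msgs))  # → 6
```

The example pairs show the model the expected format and style, which often improves accuracy more than lengthening the instructions.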
Fine-Tuning & LoRA
Adapt open-source LLMs to specific tasks. LoRA, QLoRA, PEFT — fine-tune 7B models on a single GPU.
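The reason LoRA fits on a single GPU is arithmetic: instead of updating a full weight matrix W, it trains a low-rank update B @ A and uses W + (alpha / r) · B @ A at inference. A NumPy sketch of the maths (dimensions chosen to match a 4096-wide projection in a 7B-class model; not any library's actual API):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Effective weight is W + (alpha / r) * B @ A; W stays frozen
    and only the small matrices A (r x d_in) and B (d_out x r) train."""
    r = A.shape[0]
    return x @ (W + (alpha / r) * B @ A).T

# Parameter count for one 4096x4096 projection:
d, r = 4096, 8
full = d * d          # weights updated by full fine-tuning
lora = r * (d + d)    # weights LoRA trains instead
print(f"{lora:,} vs {full:,} ({100 * lora / full:.2f}%)")  # → 65,536 vs 16,777,216 (0.39%)

# Zero-initialising B makes the adapter start as a no-op:
rng = np.random.default_rng(0)
x, W = rng.normal(size=(2, 16)), rng.normal(size=(16, 16))
A, B = rng.normal(size=(r, 16)), np.zeros((16, r))
assert np.allclose(lora_forward(x, W, A, B, alpha=16), x @ W.T)
```

Libraries like PEFT apply this update inside the attention projections for you; the 0.39% figure is why a 7B model's adapters fit comfortably in consumer GPU memory.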
RAG Systems
Retrieval-Augmented Generation: give LLMs access to your private knowledge base, grounding answers in retrieved documents and sharply reducing hallucinations.
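The generation half of RAG is prompt assembly: retrieved passages are stuffed into the prompt so the model answers from your data rather than from parametric memory. A minimal sketch (function name and wording are illustrative):

```python
def build_rag_prompt(question, retrieved_chunks):
    """RAG, generation step: number the retrieved passages and
    instruct the model to answer only from them."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is our refund window?",
    ["Refunds are accepted within 30 days of purchase.",
     "Shipping takes 3-5 business days."],
)
print(prompt)
```

The "say you don't know" instruction is what curbs hallucination: the model is given an explicit alternative to inventing an answer. Frameworks like LangChain wrap exactly this pattern.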
Vector Databases
FAISS, Pinecone, Weaviate, ChromaDB. Store and search embeddings for semantic similarity at scale.
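Under the hood these systems rank vectors by similarity. The sketch below is a brute-force cosine-similarity search in NumPy — conceptually what a flat FAISS index does over normalised vectors, minus the optimised kernels and approximate-search structures; the 4-dimensional "embeddings" are toys standing in for real sentence-embedding vectors:

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Normalise, take dot products (= cosine similarity),
    and return the indices and scores of the k best matches."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    idx = np.argsort(-scores)[:k]
    return idx, scores[idx]

docs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # doc 0: about refunds
    [0.0, 0.8, 0.2, 0.0],   # doc 1: about shipping
    [0.85, 0.2, 0.1, 0.0],  # doc 2: also about refunds
])
query = np.array([1.0, 0.1, 0.0, 0.0])  # "refund"-flavoured query
idx, scores = top_k(query, docs)
print(idx)  # the two refund-like docs rank first
```

Brute force is fine up to a few hundred thousand vectors; beyond that, the approximate indexes (IVF, HNSW) that FAISS, Pinecone and friends provide trade a little recall for large speedups.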
LLM Capabilities & Limitations
✅ What LLMs Are Good At
- Text generation & creative writing
- Code generation & debugging
- Summarisation & translation
- Question answering with context
- Classification & entity extraction
- Reasoning through structured problems
❌ LLM Limitations
- Hallucination (confident but wrong)
- Knowledge cutoff (no real-time info)
- Inconsistency across runs
- Weak at precise arithmetic
- Long-context degradation
- Expensive to serve at scale
Before fine-tuning, always try prompt engineering first. In practice, most LLM performance improvements come from better prompts, not from training. Only fine-tune when you need consistent domain-specific style or behaviour that prompts can't achieve.
Frequently Asked Questions
What's the difference between GPT, BERT, and Llama?
GPT (decoder-only) is trained to generate text — great for completion and chat. BERT (encoder-only) is trained to understand context — great for classification and named-entity recognition (NER). Llama is Meta's open-weights decoder-only model, similar to GPT but freely downloadable and runnable locally.
Can I run LLMs locally?
Yes! Tools like Ollama and LM Studio let you run Llama, Mistral, and other open models on your laptop. Expect 7B models to run at 10–20 tokens/second on a modern MacBook M-series chip. Quantised (4-bit) models are smaller and faster.
How much does it cost to use LLM APIs?
Claude Sonnet: ~$3 per million input tokens, ~$15 per million output. GPT-4o: ~$5/$15. Prices change frequently and have been falling sharply year over year, so check the providers' pricing pages. For high-volume production apps, self-hosting open-weights models can become cheaper somewhere around a billion tokens per month.
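Since prices are quoted per million tokens, a monthly bill is a two-term sum. A sketch of the arithmetic, using the illustrative $3/$15 figures above (real prices drift, so treat them as placeholders):

```python
def monthly_cost(input_tokens, output_tokens, in_price, out_price):
    """API cost: prices are per million tokens, so divide token
    counts by 1e6 before multiplying by the per-million rate."""
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Example: 100M input + 20M output tokens per month at $3 in / $15 out:
cost = monthly_cost(100e6, 20e6, in_price=3.0, out_price=15.0)
print(f"${cost:,.0f}/month")  # → $600/month
```

Note the asymmetry: output tokens typically cost several times more than input tokens, so verbose completions dominate the bill for chat-heavy workloads.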