Overview
This module covers AI/ML infrastructure for data platforms, including vector databases for similarity search, LLM Operations (RAG architecture, prompt engineering, embeddings at scale), and feature stores for centralized feature management.
Module Contents
- Vector Databases
- LLM Ops
- Feature Stores
RAG Architecture Overview
RAG Patterns:
- Naive RAG: Simple retrieve-and-generate pipeline (sketched below)
- Advanced RAG: Query rewriting, reranking, context compression
- Hybrid Search: Combine vector similarity with BM25 keyword scores to improve recall
- Metadata Filtering: Pre-filter candidates on metadata to shrink the search space and cut latency
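The naive pattern is worth seeing end to end before the advanced variants make sense. Below is a minimal sketch: `embed`, `vector_store.search`, and `llm.complete` are hypothetical placeholders standing in for whichever embedding model, vector database client, and LLM API you select in the comparison that follows.

```python
# Minimal naive-RAG sketch. `embed`, `vector_store`, and `llm` are
# hypothetical placeholders -- swap in the concrete embedding model,
# vector DB client, and LLM API you choose below.

def answer(question: str, embed, vector_store, llm, top_k: int = 5) -> str:
    # 1. Retrieve: embed the query and fetch the nearest chunks.
    query_vector = embed(question)
    chunks = vector_store.search(query_vector, top_k=top_k)

    # 2. Augment: pack the retrieved text into the prompt as context.
    context = "\n---\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the LLM answers grounded in the retrieved context.
    return llm.complete(prompt)
```

Advanced RAG inserts steps between retrieval and augmentation (rewrite the query before embedding, rerank and compress `chunks` before prompting); hybrid search replaces the single vector lookup with a fused vector + BM25 query.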
Vector Database Comparison
Feature Comparison
| Database | Type | Deployment | Index Types | Max Dimensions | Cost | Best For |
|---|---|---|---|---|---|---|
| Pinecone | Managed | Cloud-only | HNSW | 20,000 | $$ | RAG, production, managed |
| Milvus | Open Source | Self-hosted/Cloud | IVF, HNSW, FLAT | 32,768 | $ | On-premises, K8s |
| pgvector | Extension | Self-hosted | IVFFlat, HNSW | 2,000 | $ | PostgreSQL shops |
| Weaviate | Open Source | Self-hosted/Cloud | HNSW | Unlimited | $-$$ | Knowledge graphs |
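For the pgvector row, the workflow fits in a few lines of SQL driven from Python. A minimal sketch, assuming PostgreSQL with the pgvector extension available and the `psycopg` and `pgvector` Python packages installed; the DSN, table, and 384-dimension column are illustrative.

```python
# pgvector sketch -- assumes PostgreSQL with the pgvector extension and
# the `psycopg` + `pgvector` Python packages. DSN and names illustrative.
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("dbname=rag", autocommit=True)  # hypothetical DSN
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)  # teach psycopg to adapt the vector type

# A 384-dim column matches the smaller embedding models costed below.
conn.execute(
    "CREATE TABLE IF NOT EXISTS items "
    "(id bigserial PRIMARY KEY, body text, embedding vector(384))"
)
# HNSW: the memory/accuracy balance recommended in the cost table below.
conn.execute(
    "CREATE INDEX IF NOT EXISTS items_hnsw ON items "
    "USING hnsw (embedding vector_cosine_ops)"
)

query_vec = np.random.rand(384).astype(np.float32)  # stand-in query embedding
rows = conn.execute(
    "SELECT id, body FROM items ORDER BY embedding <=> %s LIMIT 5",
    (query_vec,),
).fetchall()  # <=> is pgvector's cosine-distance operator
```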
Selection Criteria
Feature Store Architecture
Feature Store Benefits:
- Consistency: Same features in training and inference
- Reusability: Share features across models
- Version Control: Track feature changes over time
- Point-in-Time Correctness: Avoid data leakage
- Governance: Feature ownership, documentation, lineage
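Consistency and point-in-time correctness are easiest to see in API form. A minimal Feast sketch under stated assumptions: a feature repo in the current directory already defines a `driver_hourly_stats` feature view keyed by `driver_id` (all names here are illustrative, not from this module's exercises).

```python
# Feast sketch: assumes a feature repo in the current directory defining
# a `driver_hourly_stats` feature view keyed by `driver_id` (names are
# illustrative).
from datetime import datetime, timedelta, timezone
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Training: point-in-time join -- each row gets feature values as they
# were at its `event_timestamp`, which prevents label leakage.
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": [datetime.now(tz=timezone.utc) - timedelta(days=1)] * 2,
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:avg_trips", "driver_hourly_stats:rating"],
).to_df()

# Inference: the same feature references served from the online store --
# this shared definition is what keeps training and serving consistent.
online = store.get_online_features(
    features=["driver_hourly_stats:avg_trips", "driver_hourly_stats:rating"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```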
Cost Considerations
Vector Database Costs
| Factor | Impact | Optimization |
|---|---|---|
| Vector dimension | Storage + compute | Use lower dimensions (384 vs. 1536) |
| Index type | Memory vs. accuracy | HNSW for balance |
| Deployment | OpEx vs. CapEx | Open source for scale |
| Region | Data transfer | Colocate with data |
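The dimension row dominates in practice, and a back-of-the-envelope calculation shows why. The sketch below assumes float32 vectors and ignores index overhead and metadata, so treat the numbers as a lower bound.

```python
# Back-of-the-envelope vector storage: float32 vectors only, ignoring
# index overhead and metadata, so real footprints will be larger.
def storage_gib(num_vectors: int, dimensions: int, bytes_per_float: int = 4) -> float:
    return num_vectors * dimensions * bytes_per_float / 2**30

for dims in (384, 1536):
    print(f"{dims:>4} dims, 10M vectors: {storage_gib(10_000_000, dims):.1f} GiB")
# -> 384 dims ~ 14.3 GiB vs. 1536 dims ~ 57.2 GiB: the 4x dimension cut
#    shows up directly in storage (and similarly in per-query compute).
```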
Embedding Generation Costs
| Model | Cost per 1M tokens | Quality | Speed |
|---|---|---|---|
| OpenAI text-embedding-ada-002 | $0.10 | Excellent | Medium |
| OpenAI text-embedding-3-small | $0.02 | Very good | Fast |
| Cohere embed-v3 | $0.10 | Very good | Fast |
| Sentence Transformers | Compute only (self-hosted) | Good | Medium |
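As a hedged illustration of the price/speed trade-off above, the sketch below batches requests to text-embedding-3-small through the official `openai` Python client; the batch size is an assumption to tune against your rate limits.

```python
# Batch embedding sketch using the official `openai` client (v1 API) and
# the cheaper text-embedding-3-small model from the table above.
# Batch size is an assumption to tune against your rate limits.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_batched(texts: list[str], batch_size: int = 100) -> list[list[float]]:
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        resp = client.embeddings.create(
            model="text-embedding-3-small",
            input=batch,  # one request per batch instead of per text
        )
        vectors.extend(item.embedding for item in resp.data)
    return vectors
```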
Feature Store Costs
| Platform | Cost | Notes |
|---|---|---|
| Feast | Free (self-hosted) | Open source |
| Hopsworks | Usage-based | Managed service |
| Tecton | Custom | Enterprise |
| Vertex AI | Usage-based | GCP integration |
Learning Objectives
After completing this module, you will:
- Select vector databases: Pinecone vs. Milvus vs. pgvector vs. Weaviate
- Implement RAG: Retrieval Augmented Generation patterns
- Scale embeddings: Batch and real-time embedding pipelines
- Use feature stores: Feast, Hopsworks for ML feature management
- Optimize AI/ML costs: Model selection, deployment, serving
Module Dependencies
Quick Start
Vector Databases
- Start with the vector databases overview for a side-by-side comparison
- Choose based on use case:
- Production RAG: Pinecone
- Self-hosted: Milvus
- PostgreSQL shops: pgvector
- Knowledge graphs: Weaviate
LLM Ops
- Learn RAG architecture for LLM systems
- Implement prompt engineering pipelines for production
- Scale embeddings with batch processing and caching
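Caching is the cheapest of those wins when pipeline runs re-embed mostly unchanged chunks. A minimal sketch, assuming an in-process dict keyed by model name plus a SHA-256 of the text; at scale you would back it with Redis or a database table.

```python
# Minimal embedding cache: key on (model, sha256(text)) so re-runs skip
# paid API calls for unchanged chunks. The in-process dict is a stand-in
# for Redis or a database table at scale.
import hashlib

_cache: dict[str, list[float]] = {}

def cached_embed(text: str, embed_fn, model: str = "text-embedding-3-small") -> list[float]:
    key = f"{model}:{hashlib.sha256(text.encode()).hexdigest()}"
    if key not in _cache:
        _cache[key] = embed_fn(text)  # embed_fn: any callable returning a vector
    return _cache[key]
```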
Feature Stores
- Understand feature store patterns
- Choose platform:
- Open-source: Feast
- Enterprise: Hopsworks
Next Steps
- Study Vector Databases
- Learn RAG Architecture
- Implement Feature Stores
- Proceed to Module 6: CI/CD for Data
Estimated Time to Complete Module 5: 8-10 hours
Total Files: 12 markdown files with 60+ diagrams