Overview
This module covers data modeling methodologies (Kimball, Data Vault, Inmon), modern transformation with dbt, data quality tools (Great Expectations, Soda, Data Contracts), and governance patterns (Data Mesh, Data Fabric, lineage tracking).
Module Contents
Dimensional Modeling
Data Quality
Governance
| Document | Description | Key Topics |
|---|---|---|
| Data Mesh | Decentralized architecture | Domain ownership, data products |
| Data Fabric | Unified architecture | Metadata, integration |
| Lineage Tracking | Data lineage | OpenLineage, Marquez |
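
To make the lineage row more concrete, here is a minimal sketch of a run event shaped like the ones OpenLineage describes. It builds a plain dictionary and uses no client library. The field names (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`) follow the OpenLineage run event model as a simplified subset; the function name `build_run_event`, the namespaces, the dataset names, and the producer URL are hypothetical, and the result is not validated against the official schema.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a simplified, OpenLineage-style run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    This is an illustrative subset, not a schema-validated event.
    """
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        # Hypothetical producer identifier for this sketch.
        "producer": "https://example.com/hypothetical-pipeline",
    }


event = build_run_event(
    "COMPLETE",
    job_namespace="demo",
    job_name="load_orders",
    inputs=[("warehouse", "raw.orders")],
    outputs=[("warehouse", "mart.fct_orders")],
    run_id="00000000-0000-0000-0000-000000000001",
)
print(json.dumps(event, indent=2))
```

A lineage backend such as Marquez can assemble a dataset-to-job graph from events of this kind, since each event names the job together with the datasets it read and wrote.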
Modeling Methodology Comparison
Kimball vs. Data Vault vs. Inmon
| Criterion | Kimball | Data Vault 2.0 | Inmon CIF |
|---|---|---|---|
| Approach | Bottom-up | Hybrid | Top-down |
| Start with | Data marts | Hub/Link/Satellite | Enterprise warehouse |
| Time to value | Fast | Medium | Slow |
| Agility | High | Very High | Low |
| Audit trail | Limited | Excellent | Good |
| Complexity | Low | Medium | High |
| Best for | Most orgs | Enterprise integration | Regulated industries |
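
The "Hub/Link/Satellite" entry in the table can be illustrated with a small sketch. It assumes the common Data Vault 2.0 convention of deriving a hash key from the normalized business key (MD5 is used here for illustration only). All class names, field names, and sample values are hypothetical; a real implementation would live in warehouse tables, not Python objects.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys.

    Keys are trimmed and upper-cased first so that formatting
    differences between source systems map to the same hash.
    """
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubCustomer:
    """Hub: one row per unique business key."""
    customer_hk: str
    customer_id: str
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class SatCustomerDetails:
    """Satellite: descriptive attributes, versioned by load timestamp."""
    customer_hk: str
    load_ts: datetime
    name: str
    city: str


@dataclass(frozen=True)
class LinkCustomerOrder:
    """Link: a relationship between two hubs."""
    link_hk: str
    customer_hk: str
    order_hk: str
    load_ts: datetime


now = datetime.now(timezone.utc)
customer_hk = hash_key("c-001")
order_hk = hash_key("o-9001")

hub = HubCustomer(customer_hk, "c-001", now, "crm")
sat = SatCustomerDetails(customer_hk, now, "Ada", "London")
link = LinkCustomerOrder(hash_key("c-001", "o-9001"), customer_hk, order_hk, now)
```

Because attribute changes only ever append new satellite rows keyed by `load_ts`, history is preserved by construction, which is the basis for the audit-trail rating in the table. A Kimball star schema would instead place these attributes directly on a customer dimension joined to a fact table.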
Data Quality Strategy
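
As one illustration of the checks this module covers, a data contract can be thought of as a set of column rules that every incoming row must satisfy. The sketch below is plain Python and does not use the Great Expectations or Soda APIs; `ColumnRule`, `validate_rows`, and the sample columns are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnRule:
    """One column in a hypothetical data contract."""
    name: str
    dtype: type
    nullable: bool = False


def validate_rows(rows: list[dict[str, Any]], contract: list[ColumnRule]) -> list[str]:
    """Return human-readable violations; an empty list means the rows pass."""
    violations = []
    for i, row in enumerate(rows):
        for rule in contract:
            if rule.name not in row:
                violations.append(f"row {i}: missing column '{rule.name}'")
                continue
            value = row[rule.name]
            if value is None:
                if not rule.nullable:
                    violations.append(f"row {i}: '{rule.name}' is null")
            elif not isinstance(value, rule.dtype):
                violations.append(
                    f"row {i}: '{rule.name}' expected {rule.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


contract = [
    ColumnRule("order_id", int),
    ColumnRule("amount", float),
    ColumnRule("coupon", str, nullable=True),
]
rows = [
    {"order_id": 1, "amount": 9.5, "coupon": None},
    {"order_id": None, "amount": "x", "coupon": "A"},
]
violations = validate_rows(rows, contract)
for message in violations:
    print(message)
```

Tools such as Great Expectations and Soda express similar rules declaratively and add reporting and scheduling on top; the sketch only shows the underlying idea of validating data against an agreed schema.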
Governance Evolution
Learning Objectives
After completing this module, you will be able to:
- Apply dimensional modeling: Kimball star schema patterns
- Understand Data Vault: Hub, Link, Satellite design
- Use dbt effectively: Models, tests, documentation, CI/CD
- Implement data quality: Great Expectations, Soda, Data Contracts
- Design governance: Data Mesh vs. Data Fabric
- Track lineage: OpenLineage, end-to-end visibility
Module Dependencies
Next Steps
- Study Kimball Fundamentals
- Learn dbt Best Practices
- Implement Data Quality
- Design Governance Strategy
Estimated Time to Complete Module 4: 10-12 hours