Module 4: Data Modeling & Warehousing


Overview

This module covers data modeling methodologies (Kimball, Data Vault, Inmon), modern transformation with dbt, data quality tools (Great Expectations, Soda, Data Contracts), and governance patterns (Data Mesh, Data Fabric, lineage tracking).


Module Contents

Dimensional Modeling

| Document | Description | Key Topics |
|---|---|---|
| Kimball Fundamentals | Dimensional modeling | Star schema, snowflake, conformed dimensions |
| Data Vault 2.0 | Enterprise modeling | Hub, Link, Satellite patterns |
| Inmon Corporate Factory | Enterprise warehouse | Corporate Information Factory (CIF) methodology |

Transformation

| Document | Description | Key Topics |
|---|---|---|
| dbt Best Practices | Modern transformation | Models, tests, documentation |
| dbt Testing | Testing in dbt | Data tests, schema tests |
| dbt Version Control | CI/CD for dbt | Branching, deployment |

Data Quality

| Document | Description | Key Topics |
|---|---|---|
| Great Expectations | GE framework | Expectations, validation |
| Soda Data | Soda CLI/Cloud | Checks, monitors |
| Data Contracts | Contract-based quality | Schema + SLA agreements |
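
As a rough illustration of the "Schema + SLA agreements" idea, the sketch below checks a batch of rows against a contract in plain Python. It does not use the Great Expectations or Soda APIs; the `DataContract` class, the `check_contract` function, and the column names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: a schema part and a freshness SLA part."""
    required_columns: dict       # column name -> expected Python type
    max_staleness: timedelta     # how old the latest load may be


def check_contract(contract, rows, last_loaded_at, now):
    """Return a list of human-readable violations (empty list means the batch passes)."""
    violations = []
    # Schema check: every row must carry every required column with the right type.
    for i, row in enumerate(rows):
        for col, expected_type in contract.required_columns.items():
            if col not in row:
                violations.append(f"row {i}: missing column '{col}'")
            elif not isinstance(row[col], expected_type):
                violations.append(f"row {i}: '{col}' is not {expected_type.__name__}")
    # SLA check: the data must have been loaded recently enough.
    if now - last_loaded_at > contract.max_staleness:
        violations.append("freshness SLA breached")
    return violations
```

A producer and a consumer could agree on one such contract object and run the check in both pipelines; a real deployment would more likely express the same rules in one of the tools listed above.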

Governance

| Document | Description | Key Topics |
|---|---|---|
| Data Mesh | Decentralized architecture | Domain ownership, data products |
| Data Fabric | Unified architecture | Metadata, integration |
| Lineage Tracking | Data lineage | OpenLineage, Marquez |
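
To give a feel for what lineage tracking answers, the following sketch walks a small dependency graph to find every upstream source of a dataset. It is a plain-Python illustration, not the OpenLineage or Marquez API; the `UPSTREAM` mapping, the dataset names, and `all_upstream` are assumptions made up for this example.

```python
from collections import deque

# Hypothetical lineage edges: dataset -> datasets it is derived from.
UPSTREAM = {
    "mart.revenue": ["staging.orders", "staging.customers"],
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
}


def all_upstream(dataset, edges):
    """Breadth-first walk that collects every direct and indirect upstream dataset."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

Calling `all_upstream("mart.revenue", UPSTREAM)` returns the four staging and raw datasets that feed the mart, which is the kind of question a lineage service answers at scale.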

Modeling Methodology Comparison


Kimball vs. Data Vault vs. Inmon

| Dimension | Kimball | Data Vault 2.0 | Inmon CIF |
|---|---|---|---|
| Approach | Bottom-up | Hybrid | Top-down |
| Start with | Data marts | Hub/Link/Satellite | Enterprise warehouse |
| Time to value | Fast | Medium | Slow |
| Agility | High | Very High | Low |
| Audit trail | Limited | Excellent | Good |
| Complexity | Low | Medium | High |
| Best for | Most orgs | Enterprise integration | Regulated industries |
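
To make the "Start with" and "Audit trail" rows more concrete, here is a minimal sketch that models the same customer in a Kimball-style star schema and in a Data Vault hub plus satellite, using an in-memory SQLite database. All table names, column names, and sample values are hypothetical, and real implementations of either methodology involve far more than is shown here.

```python
import sqlite3


def build_star(conn):
    """Kimball style: one fact table joined to a dimension by a surrogate key."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY, customer_id TEXT, name TEXT);
        CREATE TABLE fact_order (
            order_id TEXT,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL);
        INSERT INTO dim_customer VALUES (1, 'C-100', 'Ada');
        INSERT INTO fact_order VALUES ('O-1', 1, 40.0), ('O-2', 1, 60.0);
    """)


def build_vault(conn):
    """Data Vault style: the hub holds the business key, the satellite holds dated attributes."""
    conn.executescript("""
        CREATE TABLE hub_customer (
            customer_hk TEXT PRIMARY KEY, customer_id TEXT,
            load_ts TEXT, record_source TEXT);
        CREATE TABLE sat_customer (
            customer_hk TEXT, load_ts TEXT, name TEXT,
            PRIMARY KEY (customer_hk, load_ts));
        INSERT INTO hub_customer VALUES ('hk1', 'C-100', '2024-01-01', 'crm');
        INSERT INTO sat_customer VALUES
            ('hk1', '2024-01-01', 'Ada'), ('hk1', '2024-02-01', 'Ada L.');
    """)


def revenue_by_customer(conn):
    """Typical star-schema query: aggregate the fact, label it with the dimension."""
    return conn.execute("""
        SELECT d.name, SUM(f.amount)
        FROM fact_order f JOIN dim_customer d USING (customer_key)
        GROUP BY d.name
    """).fetchall()


def customer_history(conn, customer_id):
    """Typical vault query: every recorded version of a customer's attributes."""
    return conn.execute("""
        SELECT s.load_ts, s.name
        FROM hub_customer h JOIN sat_customer s USING (customer_hk)
        WHERE h.customer_id = ?
        ORDER BY s.load_ts
    """, (customer_id,)).fetchall()
```

The star schema answers the reporting question with a single join, while the satellite keeps each attribute change as its own dated row, which is where the stronger audit trail in the comparison comes from.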

Data Quality Strategy


Governance Evolution


Learning Objectives

After completing this module, you will be able to:

  1. Apply dimensional modeling: Kimball star schema patterns
  2. Understand Data Vault: Hub, Link, Satellite design
  3. Use dbt effectively: Models, tests, documentation, CI/CD
  4. Implement data quality: Great Expectations, Soda, Data Contracts
  5. Design governance: Data Mesh vs. Data Fabric
  6. Track lineage: OpenLineage, end-to-end visibility

Module Dependencies


Next Steps

  1. Study Kimball Fundamentals
  2. Learn dbt Best Practices
  3. Implement Data Quality
  4. Design Governance Strategy

Estimated Time to Complete Module 4: 10-12 hours