Skip to content

YAML Contracts

Define contracts in YAML for version control and easy editing.

Basic YAML Contract

# contracts/orders.yaml
name: orders
description: E-commerce order data contract

columns:
  order_id:
    type: string
    not_null: true

  amount:
    type: float
    min: 0
    max: 100000

  currency:
    type: string
    allowed: ["USD", "EUR", "GBP"]

rules:
  - row_count > 0
  - "null_rate(amount) < 0.01"

Loading YAML Contracts

from dqflow import Contract

contract = Contract.from_yaml("contracts/orders.yaml")
result = contract.validate(df)

Column Types in YAML

YAML Type Python Equivalent
string str
integer int
float float
boolean bool
timestamp "timestamp"

Full Example

name: orders
description: E-commerce order data quality contract

columns:
  order_id:
    type: string
    not_null: true

  customer_id:
    type: string
    not_null: true

  amount:
    type: float
    min: 0
    max: 100000

  currency:
    type: string
    allowed: ["USD", "EUR", "GBP"]

  status:
    type: string
    allowed: ["pending", "processing", "shipped", "delivered", "cancelled"]

  created_at:
    type: timestamp
    freshness_minutes: 1440

rules:
  - row_count > 0
  - "null_rate(amount) < 0.01"
  - "null_rate(customer_id) < 0.001"

Saving Contracts to YAML

contract = Contract(
    name="orders",
    columns={...},
    rules=[...],
)

contract.to_yaml("contracts/orders.yaml")

Organizing Contracts

Recommended folder structure:

project/
├── contracts/
│   ├── orders.yaml
│   ├── customers.yaml
│   └── products.yaml
├── data/
└── pipelines/

Using with CLI

dq validate contracts/orders.yaml data/orders.parquet
dq show contracts/orders.yaml