CLI Usage
dqflow provides a command-line interface, dq, for validating data against contracts, displaying contract details, and inferring contracts from existing data.
Commands
dq validate
Validate a data file against a contract.
Arguments:
- CONTRACT - Path to YAML contract file
- DATA - Path to data file (parquet, csv, json)
Options:
- --output, -o - Output format: text (default) or json
- --fail-fast - Exit with error code 1 on validation failure
Examples:
# Basic validation
dq validate contracts/orders.yaml data/orders.parquet
# JSON output
dq validate contracts/orders.yaml data/orders.csv --output json
# Fail fast (for CI/CD)
dq validate contracts/orders.yaml data/orders.parquet --fail-fast
dq show
Display contract details.
Example output:
Contract: orders
Description: E-commerce order data contract
Columns:
order_id: string (NOT NULL)
amount: float (min=0, max=100000)
currency: string (allowed=['USD', 'EUR', 'GBP'])
created_at: timestamp (freshness=1440m)
Rules:
- row_count > 0
- null_rate(amount) < 0.01
dq infer
Infer a contract from existing data.
Arguments:
- DATA - Path to data file
- OUTPUT - Path to write contract YAML
Example:
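A minimal sketch of how inference might be combined with validation: infer a starter contract from a data file, then check the same file against it. It assumes DATA and OUTPUT are passed positionally in the order listed under Arguments; bootstrap_contract is a hypothetical helper name, not part of dqflow.

```shell
# Sketch only: infer a contract, then validate the source file against it.
# Assumption: `dq infer` takes DATA and OUTPUT positionally, in that order.
# bootstrap_contract is a hypothetical helper, not a dqflow command.
bootstrap_contract() {
  data="$1"       # path to the data file, e.g. data/orders.parquet
  contract="$2"   # path where the inferred contract YAML is written

  # Stop early if inference itself fails.
  dq infer "$data" "$contract" || return 1

  # Validate the same data against the freshly inferred contract.
  dq validate "$contract" "$data"
}
```

It could be called as bootstrap_contract data/orders.parquet contracts/orders.yaml. An inferred contract is best treated as a starting point to review and tighten by hand.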
Supported File Formats
| Format | Extension |
|---|---|
| Parquet | .parquet |
| CSV | .csv |
| JSON | .json |
CI/CD Integration
Use --fail-fast in CI pipelines:
# GitHub Actions example
- name: Validate data quality
  run: dq validate contracts/orders.yaml data/orders.parquet --fail-fast
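When a pipeline has more than one contract, the same flag can drive a small wrapper script. The sketch below assumes a hypothetical layout in which contracts/NAME.yaml pairs with data/NAME.parquet (mirroring the orders example); validate_all is an illustrative helper name, not part of dqflow.

```shell
# Sketch only: validate every contract in a directory against its data file.
# Assumption: contracts/NAME.yaml pairs with data/NAME.parquet (hypothetical layout).
# validate_all is a hypothetical helper, not a dqflow command.
validate_all() {
  contracts_dir="$1"
  data_dir="$2"
  failures=0

  for contract in "$contracts_dir"/*.yaml; do
    [ -e "$contract" ] || continue            # glob matched nothing
    name=$(basename "$contract" .yaml)
    data="$data_dir/$name.parquet"

    if [ ! -f "$data" ]; then
      echo "SKIP: no data file for $contract" >&2
      continue
    fi

    # With --fail-fast, dq exits with code 1 when validation fails.
    if ! dq validate "$contract" "$data" --fail-fast; then
      failures=$((failures + 1))
    fi
  done

  echo "$failures contract(s) failed validation"
  [ "$failures" -eq 0 ]                       # non-zero status if any failed
}
```

A CI step could call validate_all contracts data and let the non-zero return status fail the job. Counting failures instead of stopping at the first one means a single run reports every broken contract.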
JSON Output
Use -o json (short for --output json) for machine-readable output.
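One way to use this in a pipeline is to save the report to a file for a later step. The sketch below assumes the JSON is written to standard output and makes no assumption about the report's fields; save_report is a hypothetical helper name, not part of dqflow.

```shell
# Sketch only: capture the JSON report in a file and pass on dq's exit status.
# Assumption: `dq validate -o json` writes the report to standard output.
# save_report is a hypothetical helper, not a dqflow command.
save_report() {
  contract="$1"
  data="$2"
  report="$3"     # destination file for the JSON report

  status=0
  dq validate "$contract" "$data" -o json > "$report" || status=$?

  echo "wrote $report (dq exit code $status)" >&2
  return "$status"
}
```

For example, save_report contracts/orders.yaml data/orders.csv report.json would leave the report in report.json, where a JSON-aware tool can read it.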