Field notes on autonomous data.

Guides, product news, and perspective on moving data teams from maintenance to ownership.

Latest

Guide

What is a data product?

A dataset built like software — owned, tested, documented, and governed — so the org can trust it and build on it.

Read →

Guide

Change Data Capture (CDC), explained

What CDC is, why it beats batch, and the methods — log-based, query-based, trigger-based — with their trade-offs.

Read →

Guide

ETL vs ELT: what changed, which to use

Why ELT overtook ETL in the cloud era, when ETL still fits, and a practical way to decide.

Read →

Guide

Replicate Postgres to BigQuery in real time

The approaches, the hard parts — schema drift, type mapping, deletes, backfill — and a reference setup.

Read →

Engineering

Log-based vs query-based CDC

Log-based, query-based, and trigger-based CDC compared on source impact, completeness, and latency, and when each wins.

Read →

Engineering

Exactly-once delivery in data pipelines

At-most-once, at-least-once, exactly-once: what each guarantee means, why exactly-once is hard, and how systems get there.

Read →

Guide

Schema drift: what it is, how to handle it

Why source schema changes break pipelines — and the strategies that keep data flowing safely.

Read →

Guide

What is reverse ETL?

Syncing modeled warehouse data back into the operational tools business teams live in — data activation, explained.

Read →

Guide

Data pipeline pricing explained

Monthly active rows (MAR), rows moved, connectors, credits: how the pricing models work and how to estimate your real bill.

Read →

Guide

Migrating off Informatica: a practical path

A structured, lower-risk alternative to a multi-quarter global systems integrator (GSI) engagement: inventory, convert, validate, cut over.

Read →

AI agents for data engineering: what they actually do

See it for yourself.