The Database Is Your Event Bus: CDC‑First Architectures with Postgres, Outbox/Inboxes, and Debezium in 2025
If you are building a real‑time, service‑oriented system in 2025, defaulting to a heavyweight streaming platform is no longer the only sensible move. There is a simpler path for a large class of use cases: treat your operational database as the source of truth and your event bus. With change data capture (CDC), transactional outbox/inbox tables, and idempotent consumers, you can propagate state across services and read models safely, backfill without tears, and preserve consistency — all without adopting Kafka‑by‑default.
This article presents a detailed, opinionated blueprint for CDC‑first architectures using Postgres, Debezium, and time‑tested patterns. It covers design choices, operational guardrails, and code you can paste into production.
TL;DR
- Use Postgres logical decoding and Debezium to capture changes from your service database.
- Publish events by writing to an outbox table inside the same transaction as your business data.
- Deliver to consumers that persist into an inbox table and process idempotently.
- Favor at‑least‑once delivery with deduplication; avoid dual writes and exotic exactly‑once setups.
- Monitor replication slots, WAL retention, and end‑to‑end latency; design for replays and backfills using incremental snapshots.
- Adopt this pattern until you truly need distributed stream processing, extreme fan‑out, or multi‑tenant scale that stretches a single database.
Why CDC‑First, Not Kafka‑By‑Default
Operational simplicity is oxygen in small and medium platforms. A CDC‑first design:
- Eliminates dual writes: data changes and events are atomically recorded in one transaction.
- Reduces moving parts: there is one authoritative store, and the replication pipeline is one connector.
- Improves correctness: the transaction log is the ground truth of what committed, ordered by commit time.
- Preserves portability: you can add Kafka/Pulsar later without redesigning producers.
When should you still reach for Kafka early?
- You need cross‑stream joins, windowed aggregations, or high‑throughput stream processing.
- Fan‑out is large (hundreds of consumers) or you require long‑term retention and replay for many teams.
- You need multi‑region, active‑active event mesh with sophisticated routing or governance.
Most product backends, event‑driven integrations, CQRS read models, cache fills, and search indexers do not need a streaming backbone on day one. CDC‑first gets you there faster and safer.
The Core Idea: Treat Postgres as Your Event Bus
Postgres already orders, persists, and replicates your changes. Logical decoding lets you read an ordered stream of row‑level changes from the write‑ahead log (WAL). Debezium provides robust connectors, schema change handling, and convenient transforms that turn WAL entries into application‑level events.
The minimal CDC‑first loop:
- Your service writes domain state and a corresponding outbox event within the same Postgres transaction.
- Debezium reads committed changes from the WAL, filters/unwraps outbox rows, and delivers events to your chosen sink (Kafka is optional; Pulsar, Kinesis, Pub/Sub, Event Hubs, Redis Streams, or HTTP are viable with Debezium Server).
- Consumers read events, persist to an inbox with idempotence keys, and perform side effects or update read models.
Because the outbox write and the state change are atomic, there is no observable gap or reordering between them.
Architecture Diagram (in words)
- Service DB (Postgres, wal_level=logical)
- Business tables (orders, payments, etc.)
- Outbox table (append‑only)
- Debezium Postgres connector
- Uses logical replication slot and publication
- Applies Outbox Event Router SMT
- Emits events to sink (Kafka, Pulsar, Pub/Sub, Kinesis, Event Hubs, Redis Streams, or HTTP)
- Consumer services
- Persist events in inbox table with unique constraint on event_id
- Process with at‑least‑once semantics; deduplicate by event_id
- Emit their own outbox events when needed
Postgres Setup in 2025: Capabilities and Constraints
Key Postgres features that make CDC‑first practical today:
- Logical decoding via `pgoutput` (built‑in), compatible with managed Postgres. Avoid external plugins like wal2json on RDS/Cloud SQL where installation is restricted.
- Publications on filtered tables; per‑table replica identities to control old‑value capture.
- Postgres 16: logical decoding on standbys, reducing primary load in read‑heavy clusters.
- Robust catalogs for monitoring lag: `pg_stat_replication`, `pg_replication_slots`, `pg_lsn` arithmetic.
Operational checklist:
- Set `wal_level=logical` (a parameter sketch follows this list).
- Ensure `max_replication_slots` and `max_wal_senders` accommodate your connectors.
- Set `max_slot_wal_keep_size` to cap WAL retention for stalled slots.
- Use primary keys on all replicated tables; otherwise set `REPLICA IDENTITY` appropriately (often FULL for tables without a PK).
- On managed Postgres (RDS, Aurora, Cloud SQL), enable logical replication per provider guidance and use `pgoutput`.
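As a sketch, the checklist maps to parameters like the following on a self‑managed instance; the values are illustrative, and on managed services you set the equivalents through parameter groups or database flags:

```sql
-- Illustrative, not prescriptive: size these for your own workload.
alter system set wal_level = 'logical';            -- requires a restart
alter system set max_replication_slots = 10;       -- one per connector, plus headroom
alter system set max_wal_senders = 10;
alter system set max_slot_wal_keep_size = '10GB';  -- cap WAL retained by stalled slots

-- Only for replicated tables that lack a primary key:
-- alter table some_table replica identity full;
```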
References:
- Postgres logical replication: https://www.postgresql.org/docs/current/logical-replication.html
- RDS Postgres logical replication: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Logical.Replication.html
The Transactional Outbox Pattern
Never publish to an external system directly from your request handler. Publish by inserting an event row into an outbox table in the same transaction as your domain write. Debezium then emits that row reliably.
Schema example:
```sql
create table outbox_events (
  id uuid primary key,
  aggregate_type text not null,        -- e.g., order, payment
  aggregate_id text not null,          -- domain id
  event_type text not null,            -- e.g., OrderPlaced
  event_payload jsonb not null,        -- event data
  headers jsonb default '{}'::jsonb,   -- optional metadata
  occurred_at timestamptz not null default now(),
  published_at timestamptz             -- set by Debezium or downstream if desired
);

create index on outbox_events (aggregate_type, aggregate_id);
create index on outbox_events (occurred_at);
```
Writing domain state and event atomically:
```sql
begin;

insert into orders (id, customer_id, status, total_amount)
values ($1, $2, 'PLACED', $3);

insert into outbox_events (id, aggregate_type, aggregate_id, event_type, event_payload)
values (
  gen_random_uuid(),
  'order',
  $1,
  'OrderPlaced',
  jsonb_build_object(
    'order_id', $1,
    'customer_id', $2,
    'total_amount', $3
  )
);

commit;
```
Outbox best practices:
- Append‑only: never update payloads; if you must mark delivery, add a separate projection table for operational status.
- Small, self‑contained payloads: include everything consumers need, avoiding read‑your‑writes issues.
- Partitioning: for very high volumes, consider partitioning by month and pruning old partitions (see the sketch after this list).
- Retention: keep at least weeks of history to enable replays; archive to S3/object storage if needed.
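A minimal sketch of a monthly‑partitioned outbox using declarative partitioning; table names and ranges are illustrative, and the partition key must be part of the primary key:

```sql
create table outbox_events (
  id uuid not null,
  aggregate_type text not null,
  aggregate_id text not null,
  event_type text not null,
  event_payload jsonb not null,
  occurred_at timestamptz not null default now(),
  primary key (id, occurred_at)            -- includes the partition key
) partition by range (occurred_at);

create table outbox_events_2025_01 partition of outbox_events
  for values from ('2025-01-01') to ('2025-02-01');

-- Retiring a month is a cheap metadata operation instead of a bulk delete:
-- alter table outbox_events detach partition outbox_events_2025_01;
-- (archive the detached table to object storage, then drop it)
```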
Debezium: Your CDC Engine
Debezium is the de facto standard for CDC in the JVM ecosystem. In 2025 you have two main ways to deploy it:
- Kafka Connect with Debezium connector plugins (classic; powerful when Kafka is present).
- Debezium Server (standalone Quarkus app) that reads from databases and writes to various sinks without requiring Kafka.
Key advantages:
- Robust Postgres connector using `pgoutput` with snapshotting, schema change events, and backoff.
- Single Message Transforms (SMTs), including the Outbox Event Router that maps outbox rows to topic/subject and unwraps payloads.
- Incremental snapshots to backfill existing data without downtime.
- Incremental snapshots to backfill existing data without downtime.
Debezium Server example configuration (no Kafka required)
Below is a minimal Debezium Server configuration that reads Postgres outbox events and writes to Apache Pulsar. Swap the sink for Kinesis, Google Pub/Sub, Redis Streams, or Azure Event Hubs similarly.
```properties
# application.properties for Debezium Server

# Sink
debezium.sink.type=pulsar
debezium.sink.pulsar.service.url=pulsar://pulsar-broker:6650

# Source connector
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector

# Postgres connection
debezium.source.database.hostname=postgres
debezium.source.database.port=5432
debezium.source.database.user=debezium
debezium.source.database.password=secret
debezium.source.database.dbname=orders

# Logical decoding
debezium.source.plugin.name=pgoutput

# Publication and slot management
debezium.source.publication.autocreate.mode=filtered
debezium.source.publication.name=dbz_publication

# Create a logical slot name unique to this connector
debezium.source.slot.name=dbz_orders_outbox

# Table filter: only capture outbox
debezium.source.table.include.list=public.outbox_events

# Snapshot behavior
# initial: snapshot on first start; subsequent starts resume from offsets
debezium.source.snapshot.mode=initial

# Outbox Event Router SMT
debezium.transforms=outbox
# Route events based on outbox columns
# Subject/topic determined by aggregate_type and event_type
debezium.transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# Specify field names
debezium.transforms.outbox.table.field.event.id=id
debezium.transforms.outbox.table.field.event.key=aggregate_id
debezium.transforms.outbox.table.field.event.type=event_type
debezium.transforms.outbox.table.field.event.payload=event_payload
# Route destination topic name template (Pulsar uses persistent://tenant/namespace/topic)
debezium.transforms.outbox.route.by.field=aggregate_type

# Emit CloudEvents metadata (optional)
debezium.format.value=cloudevents

# Offsets
debezium.offset.storage.file.filename=/data/offsets.dat

# Heartbeats help detect liveness and measure lag
debezium.source.heartbeat.interval.ms=10000
```
This uses `pgoutput`, so it runs on managed Postgres. The Outbox Event Router SMT maps outbox rows into clean event messages keyed by `aggregate_id`. When you later add new services, they can subscribe to the relevant subjects without touching the producer.
References:
- Debezium documentation: https://debezium.io/documentation/
- Outbox Event Router SMT: https://debezium.io/documentation/reference/stable/transformations/outbox-event-router.html
The Inbox Pattern and Idempotent Consumers
CDC is at‑least‑once by nature. Consumers must be idempotent. An inbox table with a unique constraint on event id is the simplest, most reliable strategy. Persist each event before side effects, deduplicate on the database boundary, and commit the processing outcome atomically.
Inbox schema:
```sql
create table inbox_events (
  event_id uuid primary key,
  source text not null,        -- e.g., 'orders'
  subject text not null,       -- topic or aggregate_type
  event_type text not null,
  aggregate_id text not null,
  payload jsonb not null,
  received_at timestamptz not null default now(),
  processed_at timestamptz,
  status text not null default 'PENDING',  -- PENDING, DONE, DEAD
  error text
);

create index on inbox_events (status, received_at);
```
Consumer loop pattern:
- Poll the event stream.
- For each message, attempt an upsert into `inbox_events` keyed by `event_id`.
- If a new row was inserted, handle the business logic; on conflict, skip it as a duplicate.
- Perform side effects and update `status` to DONE within the same transaction if those effects live in the same database; otherwise at least mark the row and persist outgoing side effects via your own outbox.
Example idempotent handler in SQL plus application logic:
```sql
-- Upsert event into inbox to deduplicate
insert into inbox_events (event_id, source, subject, event_type, aggregate_id, payload)
values ($event_id, $source, $subject, $event_type, $aggregate_id, $payload)
on conflict (event_id) do nothing;
```
Pseudo‑code (Go‑like) using a transactional unit of work:
```go
for msg := range stream {
	err := withTx(db, func(tx *Tx) error {
		// Deduplicate
		r := tx.Exec(
			"insert into inbox_events(event_id, source, subject, event_type, aggregate_id, payload) values ($1,$2,$3,$4,$5,$6) on conflict do nothing",
			msg.ID, msg.Source, msg.Subject, msg.Type, msg.AggregateID, msg.Payload,
		)
		if r.RowsAffected == 0 {
			// duplicate; already processed or in-flight
			return nil
		}

		// Apply business logic using the payload; e.g., update a read model
		if err := apply(tx, msg); err != nil {
			tx.Exec("update inbox_events set status = $1, error = $2 where event_id = $3", "DEAD", err.Error(), msg.ID)
			return err
		}

		// Mark done
		tx.Exec("update inbox_events set status = $1, processed_at = now() where event_id = $2", "DONE", msg.ID)
		return nil
	})
	if err != nil {
		// optional: backoff, send to a DLQ if using a broker, or rely on retry policy
	}
}
```
Scaling consumers:
- Use `select ... for update skip locked` to fetch batches of PENDING rows when consuming via polling from a durable queue, or after sink connectors deliver via e.g. HTTP (see the sketch after this list).
- Partition the inbox by day or by hash of `event_id` if volumes are high.
- Keep processing idempotent by making downstream updates idempotent (e.g., `insert ... on conflict do update ... where target.version < incoming.version`).
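A polling sketch with `skip locked`, assuming workers claim PENDING rows from the inbox schema shown above; batch size and ordering are illustrative:

```sql
begin;

-- Claim a batch; concurrent workers skip rows another worker has already locked.
select event_id, event_type, aggregate_id, payload
from inbox_events
where status = 'PENDING'
order by received_at
limit 100
for update skip locked;

-- ... process the claimed rows in application code, then mark them:
-- update inbox_events set status = 'DONE', processed_at = now()
-- where event_id = any($1);

commit;
```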
Ordering and Causality: What You Can and Cannot Assume
- WAL ordering follows transaction commit order, not statement time. Within a single aggregate (order id), you want consumers keyed by `aggregate_id` to preserve per‑aggregate order; Debezium keys messages accordingly via the outbox SMT.
- Cross‑aggregate ordering is neither guaranteed nor usually required.
- If a consumer depends on multiple event types from different aggregates with strict causality, consider consolidating those concerns into a single aggregate or using a saga with explicit state machines.
Snapshots, Backfills, and Replays
CDC alone only captures changes after a point in time. You often need an initial backfill and occasional reprocessing:
- Initial load: Debezium supports snapshotting on first run (`snapshot.mode=initial`). For outbox‑only capture, you may instead perform an application‑level historical publish.
- Incremental snapshots: Debezium can run chunked snapshots of existing tables while streaming new changes, coordinated via the Debezium signal table.
- Replays: retain outbox rows and use connector offsets to rewind; consumers remain safe due to inbox deduplication.
Signal table and incremental snapshot example:
```sql
-- Create Debezium signal table (if not auto-created)
create table if not exists debezium_signal (
  id varchar(64) primary key,
  type varchar(32) not null,
  data jsonb
);

-- Trigger an incremental snapshot of a table
insert into debezium_signal (id, type, data)
values (
  'snapshot-2025-01',
  'execute-snapshot',
  jsonb_build_object(
    'data-collections', array['public.orders'],
    'type', 'INCREMENTAL'
  )
);
```
Enable incremental snapshots in Debezium Server:
```properties
debezium.source.incremental.snapshot.allow=true
debezium.source.signal.data.collection=public.debezium_signal
```
References:
- Incremental snapshots: https://debezium.io/documentation/reference/stable/incremental-snapshots.html
Exactly‑Once vs At‑Least‑Once: Be Practical
Exactly‑once end‑to‑end across distributed systems is expensive and brittle. Kafka offers idempotent producers and transactions, but once you cross into databases and external APIs, those guarantees degrade.
Instead, commit to:
- At‑least‑once delivery in the pipeline (Debezium and sinks).
- Idempotent processing with inbox dedupe and conditional updates.
- Idempotent side effects via natural keys and version checks, or compensating actions when that is not possible.
Data modeling for idempotence:
- Use a stable `event_id` (UUID) as the primary key in the inbox.
- If events represent state transitions, include a monotonic `version` or `sequence` per aggregate and apply only if newer.
- For read models, use `insert ... on conflict do update` guarded by version:
```sql
insert into order_read_model (order_id, status, version)
values ($1, $status, $version)
on conflict (order_id) do update
  set status = excluded.status,
      version = excluded.version
  where order_read_model.version < excluded.version;
```
Example End‑to‑End: Orders and Payments
Scenario: the Orders service emits `OrderPlaced`; the Payments service reserves funds and emits `PaymentAuthorized`.
Orders service database:
```sql
create table orders (
  id text primary key,
  customer_id text not null,
  status text not null,
  total_amount numeric not null,
  version int not null default 0
);

create table outbox_events (
  id uuid primary key,
  aggregate_type text not null,
  aggregate_id text not null,
  event_type text not null,
  event_payload jsonb not null,
  occurred_at timestamptz not null default now()
);
```
Orders service places an order:
```sql
begin;

insert into orders (id, customer_id, status, total_amount, version)
values ($order_id, $customer_id, 'PLACED', $total, 1);

insert into outbox_events (id, aggregate_type, aggregate_id, event_type, event_payload)
values (
  gen_random_uuid(),
  'order',
  $order_id,
  'OrderPlaced',
  jsonb_build_object('order_id', $order_id, 'total', $total, 'customer_id', $customer_id, 'version', 1)
);

commit;
```
Debezium reads `outbox_events` and emits messages keyed by `aggregate_id`, with the payload taken from `event_payload`.
Payments service inbox and processing:
```sql
create table inbox_events (
  event_id uuid primary key,
  subject text not null,
  event_type text not null,
  aggregate_id text not null,
  payload jsonb not null,
  status text not null default 'PENDING',
  received_at timestamptz not null default now(),
  processed_at timestamptz
);

create table payments (
  order_id text primary key,
  status text not null,
  reserved_amount numeric not null,
  version int not null default 0
);
```
Handler pseudo‑code:
```go
handleOrderPlaced := func(msg Event) error {
	return withTx(db, func(tx *Tx) error {
		// Deduplicate
		r := tx.Exec(
			"insert into inbox_events(event_id, subject, event_type, aggregate_id, payload) values ($1,$2,$3,$4,$5) on conflict do nothing",
			msg.ID, msg.Subject, msg.Type, msg.AggregateID, msg.Payload,
		)
		if r.RowsAffected == 0 {
			return nil
		}

		total := msg.Payload.GetNumber("total")

		// Idempotent upsert with version guard
		tx.Exec(
			"insert into payments(order_id, status, reserved_amount, version) values ($1,$2,$3,$4) "+
				"on conflict(order_id) do update set status = excluded.status, reserved_amount = excluded.reserved_amount, version = excluded.version "+
				"where payments.version < excluded.version",
			msg.AggregateID, "AUTHORIZED", total, msg.Payload.GetInt("version"),
		)

		// Emit outbox for payment authorized; the payload is built in SQL
		tx.Exec(
			"insert into outbox_events(id, aggregate_type, aggregate_id, event_type, event_payload) "+
				"values (gen_random_uuid(), $1, $2, $3, jsonb_build_object('order_id', $2, 'amount', $4))",
			"payment", msg.AggregateID, "PaymentAuthorized", total,
		)

		// Mark done
		tx.Exec("update inbox_events set status = $1, processed_at = now() where event_id = $2", "DONE", msg.ID)
		return nil
	})
}
```
The Payments service emits its own outbox event, which Debezium picks up and forwards to interested consumers (e.g., Shipping service) with the same guarantees and idempotency.
Schema Evolution and Contracts
CDC exposes your database schema to the world. Be deliberate:
- Use versioned event types and payloads that are backward‑compatible: additive changes, optional fields with defaults.
- If you use Kafka, consider Avro or Protobuf with a schema registry. If you use Debezium Server without Kafka, you can still wrap events with CloudEvents metadata and embed a simple version field.
- Avoid leaking internal column names directly; with outbox payloads you control the shape of events.
Debezium emits schema change events too, which you can monitor to alert on risky migrations.
Operational Pitfalls and How to Avoid Them
- Replication slots can retain WAL if a connector stalls. Monitor `pg_replication_slots` for `restart_lsn` lag and set `max_slot_wal_keep_size` to cap retention.
- Vacuum and bloat: heavy update/delete tables with `REPLICA IDENTITY FULL` can increase WAL volume. Prefer primary keys and avoid FULL when possible.
- Publication scope: do not replicate your entire database by default; filter to `outbox_events` (and any other required tables) to cap throughput.
- Managed Postgres constraints: use `pgoutput`. You cannot install `wal2json` on RDS/Cloud SQL. Adjust parameter groups to enable logical replication.
- Connector restarts: persist offsets to durable storage (file, S3, or Kafka offsets). Back up offset state as part of disaster recovery.
- Poison events: implement a DLQ or mark inbox rows as DEAD after bounded retries, and alert.
Monitoring KPIs:
- Source lag in LSN and time: Debezium exposes `source-lag-ms` metrics.
- Replication slot lag: `pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)` (see the query after this list).
- End‑to‑end latency: from `occurred_at` in the outbox to `processed_at` in the inbox; track p95/p99.
- Throughput: Debezium `records-per-second`, sink publish rate, consumer ack rate.
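A slot lag query you might wire into monitoring; this is a sketch against the standard `pg_replication_slots` catalog, restricted to logical slots:

```sql
-- WAL the primary must retain for each logical slot; alert if it grows without bound.
select slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) as retained_wal
from pg_replication_slots
where slot_type = 'logical';
```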
Performance and Capacity Planning
What can you expect from Postgres + Debezium?
- Throughput: tens of thousands of events per second per database are feasible on modern hardware when capturing a narrow outbox table with compact payloads. Your mileage depends on WAL volume, I/O, and sink throughput.
- Latency: sub‑second end‑to‑end at p50 with modest load; low‑single‑digit seconds p95 are typical for steady traffic. Spiky workloads benefit from batching at the consumer side.
- Hot rows: the outbox is append‑only, so contention is low. Consumers scale horizontally by partitioning work on `event_id` or `aggregate_id`.
Postgres tuning tips:
- `wal_compression=on` for update‑heavy workloads.
- `checkpoint_timeout` and `max_wal_size` tuned to your disk subsystem; avoid frequent checkpoints.
- Move logical decoding to a read replica (Postgres 16 logical decoding on standbys) to reduce primary load.
- Use SSDs with generous IOPS; logical replication is I/O sensitive.
Multi‑Region and Disaster Recovery
- Single‑writer per aggregate: design so that only one region writes to a given aggregate to avoid conflict resolution.
- Regional CDC: run Debezium next to the writer region or on a physical or logical standby in that region.
- DR for connectors: keep offset storage replicated; be able to recreate replication slots and publications from infra code.
- Replays: rely on outbox retention and consumer inbox idempotency to rehydrate after failover.
Global ordering across regions is not a realistic goal. Embrace per‑aggregate ordering with eventual consistency.
Local Development and Testing
- Use Testcontainers to spin up Postgres and Debezium Server in integration tests. Seed an outbox row, wait for the sink message, assert your consumer writes to the inbox and updates the read model.
- Verify idempotency by delivering the same message twice; assert only one side effect.
- Test schema migrations under CDC load: add a column to the outbox payload and roll forward consumers incrementally.
Reference:
- Testcontainers: https://www.testcontainers.org/
Step‑By‑Step Blueprint
- Model your domain events explicitly: name, payload fields, and versioning strategy.
- Add an outbox table to each service database with the schema shown here.
- In application code, write to business tables and outbox in the same transaction.
- Deploy Debezium Server configured for your database using `pgoutput`. Filter to the outbox table and apply the Outbox Event Router SMT.
- Choose a sink appropriate to your stack: Kafka if you already have it, otherwise Pulsar, Kinesis, Pub/Sub, Event Hubs, or Redis Streams.
- Implement consumers that persist to an inbox with a unique `event_id` and process idempotently.
- Instrument and alert on replication lag, end‑to‑end latency, and inbox dead letters.
- Set up incremental snapshots for backfills and keep outbox history long enough for replays.
- Document contracts for event payloads; adopt additive, backward‑compatible changes.
- Reassess when to introduce a streaming backbone if you grow into cross‑stream joins, heavy fan‑out, or long‑term multi‑team replay needs.
Frequently Asked Questions
- Can I skip the outbox and let Debezium capture business tables directly? You can, but you will leak schema and produce overly chatty change events. The outbox gives you stable, purposeful event contracts and smaller payloads.
- Do I need Kafka? No. Debezium Server supports multiple sinks. If you already have Kafka and a schema registry, it integrates nicely; if not, you do not need to introduce it only for CDC.
- How do I handle deletes? Emit a domain event in the outbox when an aggregate is deleted; consumers can tombstone their read models accordingly (see the sketch after these questions). If you must capture physical deletes, ensure replica identity includes key fields.
- What about PII? Do not leak sensitive columns into outbox payloads. Consider encrypting payloads at rest in Postgres and in transit; apply field‑level redaction where appropriate.
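A sketch of the delete case under the outbox pattern described above: the physical delete and the domain event commit together, and the event name (`OrderDeleted`) is illustrative.

```sql
begin;

delete from orders where id = $order_id;

insert into outbox_events (id, aggregate_type, aggregate_id, event_type, event_payload)
values (
  gen_random_uuid(),
  'order',
  $order_id,
  'OrderDeleted',
  jsonb_build_object('order_id', $order_id)
);

commit;
```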
Conclusion
In 2025, CDC‑first is a pragmatic default for many real‑time systems. With Postgres logical decoding, Debezium, and the outbox/inbox patterns, you can propagate state with correctness and minimal operational overhead. You get atomic publication, controlled schemas, safe replays, and a clear path to scale. When you truly need heavyweight streaming features, you can add them later without rewriting producers — because your database has been the event bus all along.
References and Further Reading
- Postgres logical replication: https://www.postgresql.org/docs/current/logical-replication.html
- Debezium documentation: https://debezium.io/documentation/
- Outbox Event Router SMT: https://debezium.io/documentation/reference/stable/transformations/outbox-event-router.html
- Incremental snapshots: https://debezium.io/documentation/reference/stable/incremental-snapshots.html
- Transactional outbox pattern: https://microservices.io/patterns/data/transactional-outbox.html
- Idempotent consumer pattern: https://microservices.io/patterns/communication-style/idempotent-consumer.html
- AWS RDS Postgres logical replication: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Logical.Replication.html
- CloudEvents: https://cloudevents.io/