
ETL

Build real-time Postgres replication applications in Rust
Documentation · Examples · Issues

ETL is a Rust framework by Supabase for building high-performance, real-time data replication apps on Postgres.

It sits on top of Postgres logical replication and gives you Rust-native building blocks for copying existing data, streaming ongoing changes, and writing them to your own destination. Run it as a standalone replicator binary or embed it as a library in your own Rust service.

ETL is intentionally cheap to operate: it is one lightweight Rust process on top of Postgres logical replication. You do not need Kafka, Flink, Debezium, or another coordination service to run a pipeline.

What ETL Does

```mermaid
flowchart LR
    Postgres["Postgres publication"] --> ETL["ETL<br/>copy + stream"]
    ETL --> Destination["Destination"]
```

ETL runs as one process that coordinates an initial copy, a continuous replication stream, and a state/schema store for recovery:

  1. Initial copy backfills the existing rows covered by a Postgres publication.
  2. Streaming replication forwards ongoing inserts, updates, deletes, truncates, and schema events.
  3. State recovery persists table state, schema versions, and destination metadata in a durable store so the pipeline can resume after restarts.

Why ETL?

| Capability | What it gives you |
| --- | --- |
| Real-time replication | Stream Postgres changes as they happen. |
| Initial copy | Backfill existing table data before CDC begins. |
| Schema changes | Track simple DDL changes today; destination-specific DDL behavior is documented in Schema Changes. |
| Cheap operations | Run one lightweight Rust process without Kafka, Flink, Debezium, or extra control-plane infrastructure. |
| Library or binary | Use ETL as a standalone replicator or embed it in your own Rust application. |
| Configurable throughput | Tune batching, parallel table sync, retries, and memory backpressure. |
| Extensible runtime | Implement custom destinations and state/schema stores. |
| Typed Rust API | Work with structured events, rows, schemas, and errors. |

Requirements

ETL officially supports and tests against PostgreSQL 14, 15, 16, 17, and 18.

  • PostgreSQL 15+ is recommended for advanced publication features:
    • Column-level filtering
    • Row-level filtering with WHERE clauses
    • FOR TABLES IN SCHEMA syntax
  • PostgreSQL 14 is supported with table-level publication filtering.

For detailed configuration instructions, see the Configure Postgres documentation.
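
The pipeline reads from a publication, so one must exist (and the server must run with `wal_level = logical`) before replication can start. You can create it with `psql` or any Postgres client. Below is a minimal sketch using the tokio-postgres crate (not part of ETL); the connection string, publication name, and table names are placeholders for your own setup:

```rust
// Sketch: create a publication for ETL to read from. Assumes the
// `tokio-postgres` crate; all names here are placeholders.
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    let (client, connection) = tokio_postgres::connect(
        "host=localhost port=5432 user=postgres password=password dbname=mydb",
        NoTls,
    )
    .await?;

    // The connection object drives the socket; run it in the background.
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("connection error: {e}");
        }
    });

    // Table-level publications work on PostgreSQL 14+.
    client
        .batch_execute("CREATE PUBLICATION my_publication FOR TABLE orders, customers;")
        .await?;

    Ok(())
}
```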

Get Started

ETL is currently installed from Git while we prepare for a crates.io release. Choose the destination features you need.

For a first production deployment, start with the stable BigQuery module:

```toml
[dependencies]
etl = { git = "https://github.com/supabase/etl" }
etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery"] }
tokio = { version = "1", features = ["full"] }
```

Then create a pipeline that reads from a Postgres publication and writes to BigQuery.

```rust
use etl::{
    config::{
        BatchConfig, InvalidatedSlotBehavior, MemoryBackpressureConfig, PgConnectionConfig,
        PipelineConfig, TableSyncCopyConfig, TcpKeepaliveConfig, TlsConfig,
    },
    pipeline::Pipeline,
    store::MemoryStore,
};
use etl_destinations::bigquery::BigQueryDestination;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connection settings for the source Postgres database.
    let pg = PgConnectionConfig {
        host: "localhost".into(),
        port: 5432,
        name: "mydb".into(),
        username: "postgres".into(),
        password: Some("password".to_string().into()),
        tls: TlsConfig { enabled: false, trusted_root_certs: String::new() },
        keepalive: TcpKeepaliveConfig::default(),
    };

    // In-memory state store: fine for experiments, but state is lost on
    // restart. Use a durable store for production recovery.
    let store = MemoryStore::new();
    let pipeline_id = 1;

    // BigQuery destination authenticated with a service account key file.
    let destination = BigQueryDestination::new_with_key_path(
        "my-gcp-project".into(),
        "my_dataset".into(),
        "/path/to/service-account-key.json",
        None,
        1,
        pipeline_id,
        store.clone(),
    )
    .await?;

    // Pipeline configuration: which publication to read, plus batching,
    // retry, and memory-backpressure behavior.
    let config = PipelineConfig {
        id: pipeline_id,
        publication_name: "my_publication".into(),
        pg_connection: pg,
        batch: BatchConfig {
            max_fill_ms: 5000,
            memory_budget_ratio: 0.2,
        },
        table_error_retry_delay_ms: 10_000,
        table_error_retry_max_attempts: 5,
        max_table_sync_workers: 4,
        max_copy_connections_per_table: PipelineConfig::DEFAULT_MAX_COPY_CONNECTIONS_PER_TABLE,
        memory_refresh_interval_ms: 100,
        memory_backpressure: Some(MemoryBackpressureConfig::default()),
        table_sync_copy: TableSyncCopyConfig::default(),
        invalidated_slot_behavior: InvalidatedSlotBehavior::default(),
    };

    // Start the pipeline.
    let mut pipeline = Pipeline::new(config, store, destination);
    pipeline.start().await?;

    // Wait for the pipeline indefinitely.
    pipeline.wait().await?;

    Ok(())
}
```

For a guided walkthrough, start with Your First Pipeline. For runnable destination examples, see etl-examples.

Destinations

ETL is designed to be extensible: you can implement your own destination, or use one of the modules shipped in etl-destinations.

| Feature | Destination | Status | Notes |
| --- | --- | --- | --- |
| `bigquery` | Google BigQuery | Stable | Full CRUD-capable replication for analytics workloads. |
| `ducklake` | DuckLake | In progress | Open data lake replication with local or S3-compatible storage. |
| `iceberg` | Apache Iceberg | Deprecated for now | The module remains available, but new deployments should prefer BigQuery or DuckLake. |

Enable one or more destination modules with crate features:

```toml
[dependencies]
etl = { git = "https://github.com/supabase/etl" }
etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery", "ducklake"] }
```
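
If none of the shipped modules fits, you can write your own destination against ETL's destination trait; see the crate docs for the real signatures. The sketch below only illustrates the shape of the job: a destination receives backfilled rows during the initial copy and typed change events during streaming. The trait, method names, and event type here are hypothetical stand-ins, not ETL's actual API:

```rust
// Illustrative sketch only: the trait shape, method names, and event type
// are hypothetical stand-ins for ETL's real destination API.
use std::error::Error;

/// Hypothetical typed change event (ETL ships its own richer event types).
enum SketchEvent {
    Insert { table: String, row: String },
    Update { table: String, row: String },
    Delete { table: String, key: String },
    Truncate { table: String },
}

/// A destination receives both the initial copy and the ongoing stream.
trait SketchDestination {
    /// Called during the initial copy with backfilled rows.
    fn write_rows(&mut self, table: &str, rows: Vec<String>) -> Result<(), Box<dyn Error>>;
    /// Called during streaming replication with batches of change events.
    fn write_events(&mut self, events: Vec<SketchEvent>) -> Result<(), Box<dyn Error>>;
}

/// A toy destination that logs everything to stdout.
struct StdoutDestination;

impl SketchDestination for StdoutDestination {
    fn write_rows(&mut self, table: &str, rows: Vec<String>) -> Result<(), Box<dyn Error>> {
        for row in rows {
            println!("copy {table}: {row}");
        }
        Ok(())
    }

    fn write_events(&mut self, events: Vec<SketchEvent>) -> Result<(), Box<dyn Error>> {
        for event in events {
            match event {
                SketchEvent::Insert { table, row } => println!("insert {table}: {row}"),
                SketchEvent::Update { table, row } => println!("update {table}: {row}"),
                SketchEvent::Delete { table, key } => println!("delete {table}: {key}"),
                SketchEvent::Truncate { table } => println!("truncate {table}"),
            }
        }
        Ok(())
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut dest = StdoutDestination;
    dest.write_rows("orders", vec!["(1, 'first')".into()])?;
    dest.write_events(vec![SketchEvent::Insert {
        table: "orders".into(),
        row: "(2, 'second')".into(),
    }])
}
```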

Development

See DEVELOPMENT.md for setup instructions, migration workflows, and development guidelines.

Contributing

We welcome pull requests and GitHub issues. We currently cannot accept new custom destinations unless there is significant community demand, as each destination carries a long-term maintenance cost. We are prioritizing core stability, observability, and ergonomics. If you need a destination that is not yet supported, please start a discussion or issue so we can gauge demand before proposing an implementation.

License

Apache-2.0. See LICENSE for details.


Made with ❤️ by the Supabase team
