From 51743243e5008ca0f23845ba6adf91aaac2d414e Mon Sep 17 00:00:00 2001 From: Andy Stark Date: Fri, 13 Mar 2026 15:37:47 +0000 Subject: [PATCH 1/7] DOC-6370 started Snowflake source prep docs --- .../data-pipelines/prepare-dbs/snowflake.md | 302 ++++++++++++++++++ 1 file changed, 302 insertions(+) create mode 100644 content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md new file mode 100644 index 0000000000..f0daa18abd --- /dev/null +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -0,0 +1,302 @@ +--- +Title: Prepare Snowflake for RDI +alwaysopen: false +categories: +- docs +- integrate +- rs +- rdi +description: Prepare Snowflake databases to work with RDI +group: di +linkTitle: Prepare Snowflake +summary: Redis Data Integration keeps Redis in sync with the primary database in near + real time. +type: integration +weight: 20 +--- + +This guide describes the steps required to prepare a Snowflake database as a source for Redis Data Integration (RDI) pipelines. + +RDI uses the [RIOTX](https://redis.github.io/riotx/) collector to stream data from Snowflake to Redis. +During the [snapshot]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) phase, RDI reads the current state of the database using the JDBC driver. In the +[streaming]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) +phase, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to +capture changes related to the monitored tables. Note that RIOTX will automatically create and manage +the required streams. + +## Setup + +The following checklist shows the steps to prepare a Snowflake database for RDI, +with links to the sections that explain the steps in full detail. 
You may find it helpful to track your progress with the checklist as you
complete each step.

{{< note >}}
Snowflake is only supported with RDI deployed on Kubernetes/Helm. RDI VM mode does not support Snowflake as a source database.
{{< /note >}}

```checklist {id="snowflakelist"}
- [ ] [Set up Snowflake permissions](#1-set-up-snowflake-permissions)
- [ ] [Configure authentication](#2-configure-authentication)
- [ ] [Set up secrets for Kubernetes deployment](#3-set-up-secrets-for-kubernetes-deployment)
- [ ] [Configure RDI for Snowflake](#4-configure-rdi-for-snowflake)
```

## 1. Set up Snowflake permissions

The RDI user requires the following permissions to connect and capture data from Snowflake:

- `SELECT` on source tables
- `CREATE STREAM` permission (RIOTX automatically creates and manages Snowflake Streams for CDC)
- `USAGE` permission on the warehouse for query execution

Grant the required permissions to your RDI user:

```sql
-- Grant usage on the warehouse
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE rdi_role;

-- Grant usage on the database and schema
GRANT USAGE ON DATABASE MYDB TO ROLE rdi_role;
GRANT USAGE ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role;

-- Grant SELECT on tables to capture
GRANT SELECT ON TABLE MYDB.PUBLIC.customers TO ROLE rdi_role;
GRANT SELECT ON TABLE MYDB.PUBLIC.orders TO ROLE rdi_role;

-- Grant CREATE STREAM permission for CDC
GRANT CREATE STREAM ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role;

-- Assign the role to your RDI user
GRANT ROLE rdi_role TO USER rdi_user;
```

## 2. Configure authentication

RDI supports two authentication methods for Snowflake. You must configure one of these methods.

### Password authentication

Use standard username and password credentials. Store these securely using Kubernetes secrets (see step 3).

### Private key authentication

For enhanced security, use key-pair authentication:

1. Generate a private key:

    ```bash
    openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
    ```

1. Generate the public key:

    ```bash
    openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
    ```

1. Register the public key with your Snowflake user:

    ```sql
    ALTER USER rdi_user SET RSA_PUBLIC_KEY='<public-key>';
    ```

## 3. Set up secrets for Kubernetes deployment

Before deploying the RDI pipeline, configure the necessary secrets.

### Password authentication

```bash
kubectl create secret generic source-db \
  --namespace=rdi \
  --from-literal=SOURCE_DB_USERNAME=your_username \
  --from-literal=SOURCE_DB_PASSWORD=your_password
```

### Private key authentication

Create a secret with the private key file:

```bash
kubectl create secret generic source-db-ssl \
  --namespace=rdi \
  --from-file=client.key=/path/to/rsa_key.p8
```

Also create the source-db secret with the username:

```bash
kubectl create secret generic source-db \
  --namespace=rdi \
  --from-literal=SOURCE_DB_USERNAME=your_username
```

## 4. Configure RDI for Snowflake

Use the following example configuration in your `config.yaml` file:

```yaml
sources:
  snowflake:
    type: riotx
    connection:
      type: snowflake
      url: "jdbc:snowflake://myaccount.snowflakecomputing.com/"
      username: "${SOURCE_DB_USERNAME}"
      password: "${SOURCE_DB_PASSWORD}" # Omit for key-pair auth
      database: "MYDB"
      schema: "PUBLIC"
      warehouse: "COMPUTE_WH"
      # role: "RDI_ROLE" # Optional: Snowflake role
      # cdcDatabase: "CDC_DB" # Optional: Separate database for CDC streams
      # cdcSchema: "CDC_SCHEMA" # Optional: Separate schema for CDC streams
    tables:
      customers: {}
      orders: {}
    advanced:
      riotx:
        poll: "30s"
        snapshot: "INITIAL" # Or "NEVER" to skip initial snapshot
        # streamLimit: 100000 # Optional: Max stream length
        # clearOffset: false # Optional: Clear offset on start

targets:
  target:
    connection:
      type: redis
      host: ${TARGET_DB_HOST}
      port: ${TARGET_DB_PORT}
      user: ${TARGET_DB_USERNAME}
      password: ${TARGET_DB_PASSWORD}

processors:
  target_data_type: json
```

{{< note >}}
The Snowflake connector supports connecting to exactly one database and schema. All table names in the `tables` section are assumed to be in the configured database and schema.
{{< /note >}}

### Snowflake connection properties

| Property      | Type   | Required | Description                                                    |
|---------------|--------|----------|----------------------------------------------------------------|
| `type`        | string | Yes      | Must be `"snowflake"`                                          |
| `url`         | string | Yes      | JDBC URL: `jdbc:snowflake://<account>.snowflakecomputing.com/` |
| `username`    | string | Yes      | Snowflake username                                             |
| `password`    | string | No*      | Snowflake password                                             |
| `database`    | string | Yes      | Snowflake database name                                        |
| `schema`      | string | Yes      | Snowflake schema name                                          |
| `warehouse`   | string | Yes      | Snowflake warehouse name                                       |
| `role`        | string | No       | Snowflake role name                                            |
| `cdcDatabase` | string | No       | Database for CDC streams (if different from source)            |
| `cdcSchema`   | string | No       | Schema for CDC streams (if different from source)              |

* Either `password` or private key authentication is required. See [Configure authentication](#2-configure-authentication) for details.

### Advanced RIOTX options

Configure under `sources.<source-name>.advanced.riotx`:

| Property      | Type    | Default     | Description                            |
|---------------|---------|-------------|----------------------------------------|
| `poll`        | string  | `"30s"`     | Polling interval for stream changes    |
| `snapshot`    | string  | `"INITIAL"` | Snapshot mode: `INITIAL` or `NEVER`    |
| `streamLimit` | integer | -           | Maximum stream length (XTRIM MAXLEN)   |
| `keyColumns`  | array   | -           | Columns to use as message keys         |
| `clearOffset` | boolean | `false`     | Clear existing offset on start         |
| `count`       | integer | `0`         | Limit records per poll (0 = unlimited) |

## Troubleshooting

### Connection issues

**Error: "Failed to connect to Snowflake"**

- Verify the account URL is correct (format: `<account>.snowflakecomputing.com`)
- Check network connectivity to Snowflake
- Verify the warehouse is running and accessible
- Check firewall rules allow outbound HTTPS (port 443)

**Error: "Authentication failed"**

- For password auth: verify username and password are correct
- For key-pair auth: verify the private key matches the public key registered in Snowflake
- Ensure the user has appropriate permissions

**Error: "Warehouse not found"**

- Verify the warehouse name is correct
- Ensure the user has USAGE permission on the warehouse

### CDC issues

**No data appearing in Redis**

1. Verify Snowflake Streams exist for target tables:

    ```sql
    SHOW STREAMS IN SCHEMA my_database.my_schema;
    ```

1. Check the polling interval configuration
1. Verify Redis connection is working
1. Check RIOTX collector logs:

    ```bash
    kubectl logs -n rdi -l app=riotx-collector-source
    ```

**Stale or missing changes**

- Snowflake Streams have a retention period (default 14 days)
- If the collector was offline longer than retention, changes may be lost
- Consider using `clearOffset: true` to restart from current state

### Performance tuning

**High Snowflake API usage**

- Increase `poll` interval (e.g., `"60s"` or `"120s"`)
- Use a dedicated warehouse for CDC operations

**Redis memory concerns**

- Set `streamLimit` to cap stream length
- Use `count` to limit records per poll batch

**Initial snapshot too slow**

- Use `snapshot: "NEVER"` to skip initial snapshot
- Pre-load data using other methods if needed

### Enable debug logging

Enable debug logging in the source configuration:

```yaml
sources:
  snowflake:
    type: riotx
    logging:
      level: debug
    # ... rest of configuration
```

View collector logs:

```bash
kubectl logs -n rdi -l app=riotx-collector-source -f
```

## 5. Configuration is complete

Once you have followed the steps above, your Snowflake database is ready for RDI to use.
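Before starting the pipeline, you can sanity-check the setup from within Snowflake. The statements below are a minimal sketch that assumes the example names used in this guide (`rdi_role` and `rdi_user`); substitute your own role and user names:

```sql
-- List the privileges currently granted to the RDI role
SHOW GRANTS TO ROLE rdi_role;

-- Confirm that the role is assigned to the RDI user
SHOW GRANTS TO USER rdi_user;
```

The first statement should list the `USAGE`, `SELECT`, and `CREATE STREAM` grants from step 1, and the second should include `rdi_role` among the roles granted to the user.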
## See also

- [Snowflake Streams Documentation](https://docs.snowflake.com/en/user-guide/streams)
- [Snowflake Key Pair Authentication](https://docs.snowflake.com/en/user-guide/key-pair-auth)
- [RDI Deployment Guide]({{< relref "/integrate/redis-data-integration/data-pipelines/deploy" >}})

From 0070eeffc2e790f8eabd4ba7caddb65936290d9a Mon Sep 17 00:00:00 2001
From: Andy Stark
Date: Fri, 13 Mar 2026 16:45:10 +0000
Subject: [PATCH 2/7] DOC-6370 replaced 'streaming' with 'CDC'

---
 .../data-pipelines/prepare-dbs/snowflake.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md
index f0daa18abd..c0061cae7a 100644
--- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md
+++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md
@@ -19,7 +19,7 @@ This guide describes the steps required to prepare a Snowflake database as a sou
 RDI uses the [RIOTX](https://redis.github.io/riotx/) collector to stream data from Snowflake to Redis.
 During the [snapshot]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) phase, RDI reads the current state of the database using the JDBC driver. In the
-[streaming]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}})
+[Change data capture (CDC)]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}})
 phase, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to
 capture changes related to the monitored tables. Note that RIOTX will automatically create and manage
 the required streams.
From ab7a4770d66150859bc61d1f5781bef6bece2211 Mon Sep 17 00:00:00 2001 From: Andy Stark Date: Fri, 10 Apr 2026 09:17:06 +0100 Subject: [PATCH 3/7] DOC-6370 remove RIOT-X mentions in text --- .../data-pipelines/prepare-dbs/snowflake.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md index c0061cae7a..1b3eed6e34 100644 --- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -16,12 +16,11 @@ weight: 20 --- This guide describes the steps required to prepare a Snowflake database as a source for Redis Data Integration (RDI) pipelines. - -RDI uses the [RIOTX](https://redis.github.io/riotx/) collector to stream data from Snowflake to Redis. + During the [snapshot]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) phase, RDI reads the current state of the database using the JDBC driver. In the [Change data capture (CDC)]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) phase, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to -capture changes related to the monitored tables. Note that RIOTX will automatically create and manage +capture changes related to the monitored tables. Note that RDI will automatically create and manage the required streams. ## Setup @@ -47,7 +46,7 @@ Snowflake is only supported with RDI deployed on Kubernetes/Helm. 
RDI VM mode do The RDI user requires the following permissions to connect and capture data from Snowflake: - `SELECT` on source tables -- `CREATE STREAM` permission (RIOTX automatically creates and manages Snowflake Streams for CDC) +- `CREATE STREAM` permission (RDI automatically creates and manages Snowflake Streams for CDC) - `USAGE` permission on the warehouse for query execution Grant the required permissions to your RDI user: @@ -195,7 +194,7 @@ The Snowflake connector supports connecting to exactly one database and schema. * Either `password` or private key authentication is required. See [Configure authentication](#2-configure-authentication) for details. -### Advanced RIOTX options +### Advanced configuration options Configure under `sources..advanced.riotx`: @@ -242,7 +241,7 @@ Configure under `sources..advanced.riotx`: 1. Check the polling interval configuration 1. Verify Redis connection is working -1. Check RIOTX collector logs: +1. Check the collector logs: ```bash kubectl logs -n rdi -l app=riotx-collector-source From 9c8d6504ec66e3643f45572fdd271fdf4d931d6e Mon Sep 17 00:00:00 2001 From: Jeremy Plichta Date: Sun, 19 Apr 2026 09:23:15 -0600 Subject: [PATCH 4/7] Align Snowflake docs with updated RDI config --- .../data-pipelines/prepare-dbs/snowflake.md | 105 ++++++++++++------ 1 file changed, 69 insertions(+), 36 deletions(-) diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md index 1b3eed6e34..c55bb8d980 100644 --- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -16,12 +16,12 @@ weight: 20 --- This guide describes the steps required to prepare a Snowflake database as a source for Redis Data Integration (RDI) pipelines. 
- -During the [snapshot]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) phase, RDI reads the current state of the database using the JDBC driver. In the + +During both the [snapshot]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) and [Change data capture (CDC)]({{< relref "/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle" >}}) -phase, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to -capture changes related to the monitored tables. Note that RDI will automatically create and manage -the required streams. +phases, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to read data from the monitored +tables. For the initial snapshot, RDI creates the stream with `SHOW_INITIAL_ROWS = TRUE` so it can read the current +table contents before continuing with ongoing CDC. RDI automatically creates and manages the required streams. ## Setup @@ -43,19 +43,25 @@ Snowflake is only supported with RDI deployed on Kubernetes/Helm. RDI VM mode do ## 1. Set up Snowflake permissions -The RDI user requires the following permissions to connect and capture data from Snowflake: +The RDI user requires permissions to read the source tables and to create the Snowflake objects RDI uses for CDC: + +- `USAGE`, `OPERATE` on the warehouse used for RDI reads +- `USAGE` on the source database and source schema +- `SELECT` on the source tables +- `USAGE` on the CDC schema used by RDI +- `CREATE STREAM`, `CREATE TABLE` on the CDC schema used by RDI -- `SELECT` on source tables -- `CREATE STREAM` permission (RDI automatically creates and manages Snowflake Streams for CDC) -- `USAGE` permission on the warehouse for query execution +If you configure `cdcDatabase` and `cdcSchema`, grant the CDC permissions there. Otherwise, grant them in the source +schema. 
If your Snowflake setup requires it, also grant any additional cross-database privileges needed for the CDC +schema to reference the source tables. Grant the required permissions to your RDI user: ```sql -- Grant usage on the warehouse -GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE rdi_role; +GRANT USAGE, OPERATE ON WAREHOUSE COMPUTE_WH TO ROLE rdi_role; --- Grant usage on the database and schema +-- Grant usage on the source database and schema GRANT USAGE ON DATABASE MYDB TO ROLE rdi_role; GRANT USAGE ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role; @@ -63,8 +69,9 @@ GRANT USAGE ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role; GRANT SELECT ON TABLE MYDB.PUBLIC.customers TO ROLE rdi_role; GRANT SELECT ON TABLE MYDB.PUBLIC.orders TO ROLE rdi_role; --- Grant CREATE STREAM permission for CDC -GRANT CREATE STREAM ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role; +-- Grant permissions on the schema RDI uses for CDC objects +GRANT USAGE ON SCHEMA MYDB.RDI_CDC TO ROLE rdi_role; +GRANT CREATE STREAM, CREATE TABLE ON SCHEMA MYDB.RDI_CDC TO ROLE rdi_role; -- Assign the role to your RDI user GRANT ROLE rdi_role TO USER rdi_user; @@ -78,6 +85,13 @@ RDI supports two authentication methods for Snowflake. You must configure one of Use standard username and password credentials. Store these securely using Kubernetes secrets (see step 3). +{{< note >}} +Many Snowflake accounts require MFA for password-based sign-ins. If you want to use password authentication for RDI, +configure the Snowflake user as a service user that is allowed to authenticate non-interactively. Otherwise, use +private key authentication instead. For more information, see the Snowflake +[MFA rollout documentation](https://docs.snowflake.com/en/user-guide/security-mfa-rollout). 
+{{< /note >}} + ### Private key authentication For enhanced security, use key-pair authentication: @@ -142,22 +156,26 @@ sources: connection: type: snowflake url: "jdbc:snowflake://myaccount.snowflakecomputing.com/" - username: "${SOURCE_DB_USERNAME}" + user: "${SOURCE_DB_USERNAME}" password: "${SOURCE_DB_PASSWORD}" # Omit for key-pair auth database: "MYDB" - schema: "PUBLIC" warehouse: "COMPUTE_WH" # role: "RDI_ROLE" # Optional: Snowflake role # cdcDatabase: "CDC_DB" # Optional: Separate database for CDC streams # cdcSchema: "CDC_SCHEMA" # Optional: Separate schema for CDC streams + schemas: + - PUBLIC tables: - customers: {} - orders: {} + PUBLIC.customers: {} + PUBLIC.orders: {} advanced: riotx: poll: "30s" snapshot: "INITIAL" # Or "NEVER" to skip initial snapshot + # streamPrefix: "data:" # Optional: Redis stream prefix # streamLimit: 100000 # Optional: Max stream length + # keyColumns: # Recommended: stable key columns + # - "id" # clearOffset: false # Optional: Clear offset on start targets: @@ -174,7 +192,9 @@ processors: ``` {{< note >}} -The Snowflake connector supports connecting to exactly one database and schema. All table names in the `tables` section are assumed to be in the configured database and schema. +Snowflake uses one configured `database` and one or more source-level `schemas`. In the `tables` section, specify each +table as `SCHEMA.table`. Even when you configure only one schema, explicit `SCHEMA.table` names are recommended for +clarity. {{< /note >}} ### Snowflake connection properties @@ -183,10 +203,9 @@ The Snowflake connector supports connecting to exactly one database and schema. 
|---------------|--------|----------|----------------------------------------------------------------| | `type` | string | Yes | Must be `"snowflake"` | | `url` | string | Yes | JDBC URL: `jdbc:snowflake://.snowflakecomputing.com/` | -| `username` | string | Yes | Snowflake username | +| `user` | string | Yes | Snowflake user | | `password` | string | No* | Snowflake password | | `database` | string | Yes | Snowflake database name | -| `schema` | string | Yes | Snowflake schema name | | `warehouse` | string | Yes | Snowflake warehouse name | | `role` | string | No | Snowflake role name | | `cdcDatabase` | string | No | Database for CDC streams (if different from source) | @@ -194,18 +213,28 @@ The Snowflake connector supports connecting to exactly one database and schema. * Either `password` or private key authentication is required. See [Configure authentication](#2-configure-authentication) for details. +### Snowflake source properties + +| Property | Type | Required | Description | +|------------|--------|----------|------------------------------------------------------------------| +| `schemas` | array | Yes | Schema names to capture from | +| `tables` | object | Yes | Tables to capture, keyed as `SCHEMA.table` | + ### Advanced configuration options Configure under `sources..advanced.riotx`: -| Property | Type | Default | Description | -|---------------|---------|-------------|----------------------------------------| -| `poll` | string | `"30s"` | Polling interval for stream changes | -| `snapshot` | string | `"INITIAL"` | Snapshot mode: `INITIAL` or `NEVER` | -| `streamLimit` | integer | - | Maximum stream length (XTRIM MAXLEN) | -| `keyColumns` | array | - | Columns to use as message keys | -| `clearOffset` | boolean | `false` | Clear existing offset on start | -| `count` | integer | `0` | Limit records per poll (0 = unlimited) | +| Property | Type | Default | Description | 
+|----------------|---------|-------------|----------------------------------------------| +| `poll` | string | `"30s"` | Polling interval for stream changes | +| `snapshot` | string | `"INITIAL"` | Snapshot mode: `INITIAL` or `NEVER` | +| `streamPrefix` | string | `"data:"` | Prefix for the Redis stream written by RDI | +| `streamLimit` | integer | - | Maximum stream length (XTRIM MAXLEN) | +| `keyColumns` | array | - | Stable source columns to use as message keys | +| `clearOffset` | boolean | `false` | Clear existing offset on start | +| `count` | integer | `0` | Limit records per poll (0 = unlimited) | + +For reliable update and delete handling, define `keyColumns` with a stable business key or surrogate key when possible. ## Troubleshooting @@ -233,10 +262,10 @@ Configure under `sources..advanced.riotx`: **No data appearing in Redis** -1. Verify Snowflake Streams exist for target tables: +1. Verify Snowflake Streams exist in the CDC schema: ```sql - SHOW STREAMS IN SCHEMA my_database.my_schema; + SHOW STREAMS IN SCHEMA my_cdc_database.my_cdc_schema; ``` 1. Check the polling interval configuration @@ -244,21 +273,24 @@ Configure under `sources..advanced.riotx`: 1. 
Check the collector logs:

```bash
- kubectl logs -n rdi -l app=riotx-collector-source
+ kubectl get deployments -n rdi | grep riotx-collector
+ kubectl logs -n rdi deployment/<deployment-name>
```

**Stale or missing changes**

-- Snowflake Streams have a retention period (default 14 days)
-- If the collector was offline longer than retention, changes may be lost
+- Snowflake Streams depend on Snowflake change tracking and retention settings
+- If the collector was offline longer than the available retention window, changes may be lost
- Consider using `clearOffset: true` to restart from current state

### Performance tuning

-**High Snowflake API usage**
+**High Snowflake warehouse usage**

- Increase `poll` interval (e.g., `"60s"` or `"120s"`)
- Use a dedicated warehouse for CDC operations
+- Each poll first calls Snowflake's `SYSTEM$STREAM_HAS_DATA` function to check whether the stream has new data. This
+  check does not start the warehouse; warehouse compute starts only when RDI reads rows from the stream.

**Redis memory concerns**

- Set `streamLimit` to cap stream length
- Use `count` to limit records per poll batch

**Initial snapshot too slow**

- Use `snapshot: "NEVER"` to skip initial snapshot
- Pre-load data using other methods if needed

### Enable debug logging

Enable debug logging in the source configuration:

```yaml
sources:
  snowflake:
    type: riotx
    logging:
      level: debug
    # ... rest of configuration
```

View collector logs:

```bash
-kubectl logs -n rdi -l app=riotx-collector-source -f
+kubectl get deployments -n rdi | grep riotx-collector
+kubectl logs -n rdi deployment/<deployment-name> -f
```

## 5. 
Configuration is complete @@ -297,5 +330,5 @@ Once you have followed the steps above, your Snowflake database is ready for RDI - [Snowflake Streams Documentation](https://docs.snowflake.com/en/user-guide/streams) - [Snowflake Key Pair Authentication](https://docs.snowflake.com/en/user-guide/key-pair-auth) +- [Snowflake MFA rollout documentation](https://docs.snowflake.com/en/user-guide/security-mfa-rollout) - [RDI Deployment Guide]({{< relref "/integrate/redis-data-integration/data-pipelines/deploy" >}}) - From 324169c5386a95139991612657066e991985e034 Mon Sep 17 00:00:00 2001 From: Jeremy Plichta Date: Sun, 19 Apr 2026 09:35:06 -0600 Subject: [PATCH 5/7] Clarify Snowflake stream permission prerequisites --- .../data-pipelines/prepare-dbs/snowflake.md | 23 ++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md index c55bb8d980..eb35bad7f1 100644 --- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -43,7 +43,8 @@ Snowflake is only supported with RDI deployed on Kubernetes/Helm. RDI VM mode do ## 1. Set up Snowflake permissions -The RDI user requires permissions to read the source tables and to create the Snowflake objects RDI uses for CDC: +The following are the minimum runtime permissions for the RDI role to read the source tables and create the Snowflake +objects RDI uses for CDC: - `USAGE`, `OPERATE` on the warehouse used for RDI reads - `USAGE` on the source database and source schema @@ -55,6 +56,17 @@ If you configure `cdcDatabase` and `cdcSchema`, grant the CDC permissions there. schema. If your Snowflake setup requires it, also grant any additional cross-database privileges needed for the CDC schema to reference the source tables. 
+{{< note >}} +Before RDI can create the initial stream for a source table, Snowflake change tracking must already be enabled on that +table, or the role creating the initial stream must own the table. If the source tables are not owned by the RDI role, +ask a Snowflake administrator or table owner to enable change tracking first: + +```sql +ALTER TABLE MYDB.PUBLIC.customers SET CHANGE_TRACKING = TRUE; +ALTER TABLE MYDB.PUBLIC.orders SET CHANGE_TRACKING = TRUE; +``` +{{< /note >}} + Grant the required permissions to your RDI user: ```sql @@ -77,6 +89,15 @@ GRANT CREATE STREAM, CREATE TABLE ON SCHEMA MYDB.RDI_CDC TO ROLE rdi_role; GRANT ROLE rdi_role TO USER rdi_user; ``` +If you use centralized grant management, you can also add future grants in the CDC schema so newly created tables and +streams automatically receive the desired privileges. These grants are optional and are not part of the minimum runtime +permissions: + +```sql +GRANT SELECT ON FUTURE TABLES IN SCHEMA MYDB.RDI_CDC TO ROLE rdi_role; +GRANT SELECT ON FUTURE STREAMS IN SCHEMA MYDB.RDI_CDC TO ROLE rdi_role; +``` + ## 2. Configure authentication RDI supports two authentication methods for Snowflake. You must configure one of these methods. From 9feb9a72b253143bfa9ce9d1656daffb4bf088ad Mon Sep 17 00:00:00 2001 From: Jeremy Plichta Date: Sun, 19 Apr 2026 10:06:36 -0600 Subject: [PATCH 6/7] Refine Snowflake stream ownership guidance --- .../data-pipelines/prepare-dbs/snowflake.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md index eb35bad7f1..2961300b0f 100644 --- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -57,9 +57,13 @@ schema. 
If your Snowflake setup requires it, also grant any additional cross-dat schema to reference the source tables. {{< note >}} -Before RDI can create the initial stream for a source table, Snowflake change tracking must already be enabled on that -table, or the role creating the initial stream must own the table. If the source tables are not owned by the RDI role, -ask a Snowflake administrator or table owner to enable change tracking first: +RDI manages the Snowflake streams it uses for snapshot and CDC. The collector creates the stream in the configured CDC +schema and later issues `CREATE OR REPLACE STREAM` statements to keep the stream aligned with the expected offset, so +the RDI role must be able to create and own those stream objects in the CDC schema. + +There is one stricter bootstrap requirement for the first stream created on a source table: if Snowflake change +tracking is not already enabled on that table, only the table owner can create that initial stream. If the source +tables are not owned by the RDI role, ask a Snowflake administrator or table owner to enable change tracking first: ```sql ALTER TABLE MYDB.PUBLIC.customers SET CHANGE_TRACKING = TRUE; From f7eb9f49d0657c19bb8e040a7a2335810cf1e461 Mon Sep 17 00:00:00 2001 From: Jeremy Plichta Date: Sun, 19 Apr 2026 10:37:48 -0600 Subject: [PATCH 7/7] Mark Snowflake source docs as private preview --- .../data-pipelines/prepare-dbs/snowflake.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md index 2961300b0f..c047971901 100644 --- a/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md +++ b/content/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake.md @@ -11,6 +11,7 @@ group: di linkTitle: Prepare Snowflake summary: Redis Data Integration keeps Redis in sync with the primary database in near 
 real time.
+bannerText: Snowflake source support for Redis Data Integration is currently in private preview. Features and behavior are subject to change. General private preview terms apply.
 type: integration
 weight: 20
 ---