Postgres
Calabi Connect's certified Postgres connector offers the following features:
- Replicate data from tables, views and materialized views. Other data objects won't be replicated to the destination like indexes, permissions.
- Multiple methods of keeping your data fresh, including Change Data Capture (CDC) and replication using the xmin system column.
- All available sync modes, providing flexibility in how data is delivered to your destination.
- Reliable replication at any table size with checkpointing and chunking of database reads.
The contents below include a 'Quick Start' guide, advanced setup steps, and reference information (data type mapping, and changelogs). See here to troubleshoot issues with the Postgres connector.
Please note the required minimum platform version is v0.58.0 for this connector.
Quick Start
Here is an outline of the minimum required steps to configure a Postgres connector:
- Create a dedicated read-only Postgres user with permissions for replicating data
- Create a new Postgres source in the Calabi Connect UI using
xminsystem column - (Calabi Connect Only) Allow inbound traffic from Calabi Connect IPs
Once this is complete, you will be able to select Postgres as a source for replicating data.
Step 2: Create a new Postgres source in Calabi Connect UI
From your Calabi Connect or Calabi Connect account, select Sources from the left navigation bar, search for Postgres, then create a new Postgres source.
To fill out the required information:
- Enter the username and password you created in Step 1.
- Select an SSL mode. You will most frequently choose
requireorverify-ca. Both of these always require encryption.verify-caalso requires certificates from your Postgres database. - Select
Standard (xmin)from available replication methods. This uses the xmin system column to reliably replicate data from your database.- If your database is particularly large (> 500 GB), you will benefit from configuring your Postgres source using logical replication (CDC).
Step 3: (Calabi Connect Only) Allow inbound traffic from Calabi Connect IPs
If you are on Calabi Connect, you will always need to modify your database configuration to allow inbound traffic from Calabi Connect IPs. You can find a list of all IPs that need to be allowlisted in our Calabi Connect Security docs.
Now, click Set up source in the Calabi Connect UI. Calabi Connect will now test connecting to your database. Once this succeeds, you've configured an Calabi Connect Postgres source!
Advanced Configuration using CDC
Calabi Connect uses logical replication of the Postgres write-ahead log (WAL) to incrementally capture deletes using a replication plugin:
- See here to learn more on how Calabi Connect implements CDC.
- See here to learn more about Postgres CDC requirements and limitations.
We recommend configuring your Postgres source with CDC when:
- You need a record of deletions.
- You have a very large database (500 GB or more).
- Your table has a primary key but doesn't have a reasonable cursor field for incremental syncing (
updated_at).
These are the additional steps required (after following the quick start) to configure your Postgres source using CDC:
- Provide additional
REPLICATIONpermissions to read-only user - Enable logical replication on your Postgres database
- Create a replication slot on your Postgres database
- Create publication and replication identities for each Postgres table
- Enable CDC replication in the Calabi Connect UI
Step 1: Prepopulate your Postgres source configuration
We recommend following the steps in the quick start section to confirm that Calabi Connect can connect to your Postgres database prior to configuring CDC settings.
For CDC, you may connect to primary/master databases or replicas. To use a replica as a source, Postgres must be at version 16.1 or later and this connector must be at version 3.6.21 or later. You must also enable additional configurations on the database instance (for help, see the Postgres official documentation).
Step 2: Provide additional permissions to read-only user
To configure CDC for the Postgres source connector, grant REPLICATION permissions to the user created in step 1 of the quick start:
ALTER USER <user_name> REPLICATION;
Step 3: Enable logical replication on your Postgres database
To enable logical replication, follow these steps based on your deployment environment.
Bare Metal, VMs, and Docker
To enable logical replication on bare metal, VMs (EC2/GCE/etc), or Docker, configure the following parameters in the postgresql.conf file for your Postgres database.
| Parameter | Description | Set value to |
|---|---|---|
| wal_level | Type of coding used within the Postgres write-ahead log | logical |
| max_wal_senders | The maximum number of processes used for handling WAL changes | min: 1 |
| max_replication_slots | The maximum number of replication slots that are allowed to stream WAL changes | 1 (if Calabi Connect is the only service reading subscribing to WAL changes. More than 1 if other services are also reading from the WAL) |
AWS Postgres RDS or Aurora
- Go to the Configuration tab for your DB cluster.
- Find your cluster parameter group. Either edit the parameters for this group or create a copy of this parameter group to edit. If you create a copy, change your cluster's parameter group before restarting.
- Within the parameter group page, search for
rds.logical_replication. Select this row and click Edit parameters. Set this value to1. - Wait for a maintenance window to automatically restart the instance or restart it manually.
AWS Aurora implements a CDC caching layer that is incompatible with Calabi Connect's CDC implementation. To use Calabi Connect with AWS Aurora, disable the CDC caching layer. Disable CDC caching by setting the rds.logical_wal_cache parameter to 0 in the AWS Aurora parameter group.
Azure Database for Postgres
Change the replication mode of your Postgres DB on Azure to logical using the replication menu of your PostgreSQL instance in the Azure Portal. Alternatively, use the Azure CLI to run the following command:
az postgres server configuration set --resource-group group --server-name server --name azure.replication_support --value logical
az postgres server restart --resource-group group --name server
Step 4: Create a replication slot on your Postgres database
Step 5: Create publication and replication identities for each Postgres table
The Calabi Connect UI currently allows selecting any tables for CDC. If a table is selected that is not part of the publication, it will not be replicated even though it is selected. If a table is part of the publication but does not have a replication identity, that replication identity will be created automatically on the first run if the Calabi Connect user has the necessary permissions.
Step 6: Enable CDC replication in Calabi Connect UI
In your Postgres source, change the update method to Read Changes using Change Data Capture (CDC), and enter the replication slot and publication you just created.
Postgres Replication Methods
The Postgres source currently offers 3 methods of replicating updates to your destination: CDC, xmin and standard (with a user defined cursor). Both CDC and xmin are the most reliable methods of updating your data.
Connecting with SSL or SSH Tunneling
SSL Modes
SSH Tunneling
If you are using SSH tunneling, as Calabi Connect requires encrypted communication, select SSH Key Authentication or Password Authentication if you selected disable, allow, or prefer as the SSL Mode; otherwise, the connection will fail.
For SSH Tunnel Method, select:
No Tunnelfor a direct connection to the databaseSSH Key Authenticationto use an RSA private key as your secret for establishing the SSH tunnelPassword Authenticationto use a password as your secret for establishing the SSH tunnel
Connect via SSH Tunnel
You can connect to a Postgres instance via an SSH tunnel.
When using an SSH tunnel, you are configuring Calabi Connect to connect to an intermediate server (also called a bastion or a jump server) that has direct access to the database. Calabi Connect connects to the bastion and then asks the bastion to connect directly to the server.
To connect to a Postgres instance via an SSH tunnel:
- While setting up the Postgres source connector, from the SSH tunnel dropdown, select:
- SSH Key Authentication to use a private key as your secret for establishing the SSH tunnel
- Password Authentication to use a password as your secret for establishing the SSH Tunnel
- For SSH Tunnel Jump Server Host, enter the hostname or IP address for the intermediate (bastion) server that Calabi Connect will connect to.
- For SSH Connection Port, enter the port on the bastion server. The default port for SSH connections is 22.
- For SSH Login Username, enter the username to use when connecting to the bastion server. Note: This is the operating system username and not the Postgres username.
- For authentication:
- If you selected SSH Key Authentication, set the SSH Private Key to the private Key that you are using to create the SSH connection.
- If you selected Password Authentication, enter the password for the operating system user to connect to the bastion server. Note: This is the operating system password and not the Postgres password.
Generating a private key for SSH Tunneling
The connector supports any SSH compatible key format such as RSA or Ed25519. To generate an RSA key, for example, run:
ssh-keygen -t rsa -m PEM -f myuser_rsa
The command produces the private key in PEM format and the public key remains in the standard format used by the authorized_keys file on your bastion server. Add the public key to your bastion host to the user you want to use with Calabi Connect. The private key is provided via copy-and-paste to the Calabi Connect connector configuration screen to allow it to log into the bastion server.
Configuring Entra authentication
The Calabi Connect source can be configured to authenticate as a Microsoft Entra service principal. This allows Calabi Connect to use short-lived identity tokens to authenticate to an Azure Postgres server. Consult the Microsoft documentation on this topic for more detail on configuring the server and other Entra resources.
To configure the Calabi Connect Postgres source with Entra authentication:
- Set the
Usernameto the Entra ID, as discussed in Microsoft's documentation. - Set the password to a client secret for your Entra service principal.
- Enable the Entra service principal authentication toggle.
- Provide the Entra tenant ID and Entra client (or app) ID of the service principal.
Limitations & Troubleshooting
To see connector limitations, or troubleshoot your Postgres connector, see more in our Postgres troubleshooting guide.
Data type mapping
According to Postgres documentation, Postgres data types are mapped to the following data types when synchronizing data. You can check the test values examples here. If you can't find the data type you are looking for or have any problems feel free to add a new test!
| Postgres Type | Resulting Type | Notes |
|---|---|---|
bigint | number | |
bigserial, serial8 | number | |
bit | string | Fixed-length bit string (e.g. "0100"). |
bit varying, varbit | string | Variable-length bit string (e.g. "0100"). |
boolean, bool | boolean | |
box | string | |
bytea | string | Variable length binary string with hex output format prefixed with "\x" (e.g. "\x6b707a"). |
character, char | string | |
character varying, varchar | string | |
cidr | string | |
circle | string | |
date | string | Parsed as ISO8601 date time at midnight. CDC mode doesn't support era indicators. Issue: #14590 |
double precision, float, float8 | number | Infinity, -Infinity, and NaN are not supported and converted to null. Issue: #8902. |
hstore | string | |
inet | string | |
integer, int, int4 | number | |
interval | string | |
json | string | |
jsonb | string | |
line | string | |
lseg | string | |
macaddr | string | |
macaddr8 | string | |
money | number | |
numeric, decimal | number | Infinity, -Infinity, and NaN are not supported and converted to null. Issue: #8902. |
path | string | |
pg_lsn | string | |
point | string | |
polygon | string | |
real, float4 | number | |
smallint, int2 | number | |
smallserial, serial2 | number | |
serial, serial4 | number | |
text | string | |
time | string | Parsed as a time string without a time-zone in the ISO-8601 calendar system. |
timetz | string | Parsed as a time string with time-zone in the ISO-8601 calendar system. |
timestamp | string | Parsed as a date-time string without a time-zone in the ISO-8601 calendar system. |
timestamptz | string | Parsed as a date-time string with time-zone in the ISO-8601 calendar system. |
tsquery | string | |
tsvector | string | |
uuid | string | |
xml | string | |
enum | string | |
tsrange | string | |
array | array | E.g. "["10001","10002","10003","10004"]". |
| composite type | string |