Key Features
- Automatic table management: tables are created automatically inside your dataset; schema evolution is handled as the source schema changes.
- History + latest state: every change event is preserved in a history table, with an upsert view exposing the current state per primary key.
- CDC replication: primary keys configured in the subscription mapping are used to apply updates and deletes correctly.
Prerequisites
- A GCP project with BigQuery enabled and an existing dataset — Popsink creates tables, not datasets.
- A service account with the following roles on the dataset/project:
  - BigQuery Data Editor (`roles/bigquery.dataEditor`): create tables and insert data
  - BigQuery Job User (`roles/bigquery.jobUser`): run load jobs
Configuration
| Field | Required | Description |
|---|---|---|
| Service Account | Yes | GCP service account key in JSON format — paste the full content of the key file |
| Project | No | GCP project ID; auto-extracted from the service account’s project_id when empty |
| Dataset | Yes | Existing BigQuery dataset where tables will be created |
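
The fallback for the Project field can be pictured with a short sketch. It is an illustration only, not Popsink's actual code: the function name `resolve_project` and the sample key are hypothetical, and the only detail taken from the table above is that an empty Project is filled from the key's `project_id`.

```python
import json


def resolve_project(service_account_json: str, project: str = "") -> str:
    """Return the explicit project ID, or fall back to the key's project_id.

    Hypothetical helper: mirrors the documented behaviour, not Popsink internals.
    """
    if project.strip():
        return project.strip()
    key = json.loads(service_account_json)
    project_id = key.get("project_id", "")
    if not project_id:
        raise ValueError("No project given and the key has no project_id")
    return project_id


# Hypothetical, heavily truncated key used for illustration only.
sample_key = json.dumps({"type": "service_account", "project_id": "my-gcp-project"})

print(resolve_project(sample_key))                   # Project left empty
print(resolve_project(sample_key, "other-project"))  # Project set explicitly
```

Setting Project explicitly is mainly useful when the dataset lives in a different project from the one that issued the service account key.
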
How It Works
For each subscription, Popsink creates:
- A history table (`{table}_history`) containing all CDC events with metadata columns.
- An upsert view (`{table}`) exposing the latest state per primary key.
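
The relationship between the history table and the upsert view can be sketched in plain Python. This is a minimal model under stated assumptions, not the view's real definition: the metadata column names `_op` and `_ts`, the sample rows, and the choice to hide keys whose latest event is a delete are all hypothetical.

```python
from typing import Any

# Hypothetical history rows. The `_op` and `_ts` metadata columns are
# assumed names, chosen only for this illustration.
history: list[dict[str, Any]] = [
    {"id": 1, "name": "Ada", "_op": "insert", "_ts": 1},
    {"id": 2, "name": "Bob", "_op": "insert", "_ts": 2},
    {"id": 1, "name": "Ada L.", "_op": "update", "_ts": 3},
    {"id": 2, "name": "Bob", "_op": "delete", "_ts": 4},
]


def latest_state(
    events: list[dict[str, Any]], primary_key: tuple[str, ...] = ("id",)
) -> list[dict[str, Any]]:
    """Keep the most recent event per primary key, dropping deleted keys."""
    latest: dict[tuple, dict[str, Any]] = {}
    # Replay events in time order so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e["_ts"]):
        key = tuple(event[col] for col in primary_key)
        latest[key] = event
    # Assumption: a key whose latest event is a delete is not exposed.
    return [
        {col: val for col, val in event.items() if not col.startswith("_")}
        for event in latest.values()
        if event["_op"] != "delete"
    ]


print(latest_state(history))  # [{'id': 1, 'name': 'Ada L.'}]
```

The history list keeps all four events, while the derived state holds a single row: key 1 reflects its update, and key 2 disappears because its latest event is a delete. This is why the primary keys in the subscription mapping matter: they define which events belong to the same row.
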