Migrate from Karapace to Kora
This guide walks you through migrating all schemas, subjects, versions, and compatibility configurations from a Karapace schema registry into Kora. The migration preserves original Karapace schema IDs exactly, so your producers and consumers do not need to be reconfigured. The process runs in four sequential phases: audit, dry-run, migrate, verify.
| Phase | Command | What it does |
|---|---|---|
| 1. Audit | just migrate-audit | Snapshots all schemas, subjects, and configs from Karapace into a local JSON file |
| 2. Dry run | just migrate-dry-run | Simulates the migration — prints every action without writing to the database |
| 3. Migrate | just migrate-run | Writes schemas directly into Kora’s PostgreSQL, preserving all original IDs |
| 4. Verify | just migrate-verify | Validates every schema ID, subject, and version in Kora against the audit snapshot |
Prerequisites
Tools
uv manages the Python virtual environment and dependencies (fastavro, psycopg2-binary) automatically on first run. You do not need to run pip install manually.
Access requirements
Before starting, make sure you have:
- HTTP/HTTPS access to your Karapace instance (for the audit and to resolve any collisions)
- Direct PostgreSQL access to your Kora database (TCP, port 5432 by default) — required for the migration step
- HTTP/HTTPS access to your Kora instance (for the verification step)
Kora database must be empty
Warning: migrate-run requires a completely empty Kora database (zero rows in schema_contents). It will refuse to run and exit with an error if any schemas already exist. Run the migration on a fresh Kora instance only.
Configuration
All scripts read connection details from environment variables. The justfile loads a .env file from the project root automatically (set dotenv-load), so the recommended approach is to create a .env file once and reuse it across all phases.
Setting up .env
Create a .env file in the root of this repository and define the variables listed under Environment variables reference that apply to your setup.
Note: If your Karapace or Kora instance does not require authentication, omit the *_USER and *_PASSWORD variables entirely.
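Because each phase reads a different subset of these variables, it can help to check the environment before starting a phase. The following is a minimal sketch and not part of the migration tooling: REQUIRED and missing_vars are hypothetical names, and the required variables per phase are inferred from the Environment variables reference in this guide.

```python
import os

# Required variables per phase, taken from the reference table in this guide.
# The *_USER / *_PASSWORD variables and AUDIT_FILE are optional, so they are omitted.
REQUIRED = {
    "migrate-audit": ["KARAPACE_URL"],
    "migrate-dry-run": ["KORA_DB_URL"],
    "migrate-run": ["KORA_DB_URL"],
    "migrate-verify": ["KORA_URL"],
}


def missing_vars(phase, env=None):
    """Return the required variables that are unset or empty for a phase."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED[phase] if not env.get(name)]


if __name__ == "__main__":
    for phase in REQUIRED:
        missing = missing_vars(phase)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{phase}: {status}")
```

The sketch reads the process environment only, so the values from .env must already be exported when you run it.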
Step 1 — Audit Karapace
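Run just migrate-audit to snapshot all schemas, subjects, and configs from Karapace. The snapshot is written as a JSON file to migration/audits/.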
Fetching is parallelised across subjects (20 concurrent workers by default), so auditing a large registry typically completes in seconds.
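To show why parallel fetching keeps the audit fast, here is a minimal sketch of fanning out per-subject requests over a thread pool. It is not the audit script itself: fetch_all is a hypothetical helper, and fetch_subject stands in for whatever function performs the HTTP call for one subject.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(subjects, fetch_subject, max_workers=20):
    """Fetch every subject concurrently and return {subject: result}."""
    subjects = list(subjects)  # materialise once; the list is iterated twice below
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = pool.map(fetch_subject, subjects)
        return dict(zip(subjects, results))


if __name__ == "__main__":
    # Stand-in for a real HTTP request against the registry.
    def fake_fetch(subject):
        return {"subject": subject, "versions": [1]}

    print(fetch_all(["orders-value", "users-value"], fake_fetch))
```

Threads suit this workload because each request spends most of its time waiting on the network. An exception raised by any single fetch propagates out of fetch_all.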
Step 2 — Check for dedup collisions
Before moving to the dry run, look at dedup_collision_count in the audit summary.
What is a dedup collision?
Karapace assigns a new schema ID every time a schema is registered, even if the content is byte-for-byte identical to an existing schema. Kora uses content-based deduplication (via schema fingerprinting), so two subjects sharing identical schema content would be collapsed into a single ID, which would break the ID-preservation guarantee. The audit script detects this automatically and reports it as a dedup collision.
If dedup_collision_count is 0
No action needed. Proceed to Step 3.
If dedup_collision_count is greater than 0
The migration will refuse to run until collisions are resolved. The audit JSON includes a dedup_collisions array identifying the conflicting schemas. To resolve them:
- Identify which subjects reference each conflicting ID (check the schemas_by_id[id].subject_versions array in the audit file).
- In Karapace, consolidate the affected subjects so they all reference the same schema ID, typically by re-registering one subject under the other’s schema version and then deleting the duplicate.
- Re-run just migrate-audit to produce a fresh snapshot.
- Confirm dedup_collision_count is now 0.
Note: If consolidating the subjects is not straightforward, contact Popsink support — they can advise on the safest resolution strategy for your topology.
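To illustrate what a dedup collision means in practice, here is a minimal sketch of the check: group schema IDs by a content fingerprint and report every group with more than one ID. This is an assumption-laden illustration, not the audit script. Kora's actual fingerprinting is not described in this guide, so SHA-256 over the raw schema text is used as a stand-in, and the schema_texts mapping (schema ID to schema text) is a hypothetical input shape.

```python
import hashlib
from collections import defaultdict


def fingerprint(schema_text):
    """Hash the schema text verbatim.

    Assumption: a real fingerprint may canonicalise the schema first;
    hashing the raw text only catches byte-for-byte duplicates.
    """
    return hashlib.sha256(schema_text.encode("utf-8")).hexdigest()


def find_dedup_collisions(schema_texts):
    """Return sorted lists of schema IDs that share identical content."""
    ids_by_fingerprint = defaultdict(list)
    for schema_id, text in schema_texts.items():
        ids_by_fingerprint[fingerprint(text)].append(schema_id)
    # Only fingerprints shared by two or more IDs are collisions.
    return [sorted(ids) for ids in ids_by_fingerprint.values() if len(ids) > 1]


if __name__ == "__main__":
    example = {
        1: '{"type":"string"}',
        2: '{"type":"int"}',
        7: '{"type":"string"}',  # same content as ID 1
    }
    for ids in find_dedup_collisions(example):
        print("collision between schema IDs:", ids)
```

Each reported group corresponds to IDs that a content-deduplicating registry would collapse into one, which is why they must be consolidated in Karapace before migrating.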
Step 3 — Dry run
Run just migrate-dry-run to simulate the migration. It prints every action it would take without writing to the database.
Step 4 — Run the migration
Run just migrate-run to write the audited schemas directly into Kora’s PostgreSQL database, preserving all original IDs. The script works through five steps:
| Step | What it writes |
|---|---|
| 1/5 | schema_contents — all schema texts with their original Karapace IDs (explicit INSERT with ID, bypassing the auto-increment sequence) |
| 2/5 | subjects — all subject names, including soft-deleted subjects |
| 3/5 | schema_versions — all version → schema ID mappings, including soft-deleted versions |
| 4/5 | config — global compatibility level and per-subject overrides |
| 5/5 | Sequences — resets the PostgreSQL BIGSERIAL sequences to the current MAX(id) so future registrations continue from the right value |
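Steps 1/5 and 5/5 in the table above rely on two PostgreSQL techniques: inserting rows with an explicit ID, and moving the sequence forward afterwards. The sketch below shows the idea against any DB-API cursor (for example one from psycopg2). It is not the migration script: the function names are hypothetical, and the column names id and schema are assumptions, since this guide does not document Kora's table layout.

```python
def insert_with_original_ids(cur, schemas):
    """Insert schema texts under their original IDs.

    cur is a DB-API cursor; schemas is an iterable of (id, schema_text) pairs.
    The column names id and schema are assumed for illustration.
    """
    for schema_id, text in schemas:
        # Naming the id column explicitly bypasses the BIGSERIAL default.
        cur.execute(
            "INSERT INTO schema_contents (id, schema) VALUES (%s, %s)",
            (schema_id, text),
        )


def reset_sequence(cur, table="schema_contents", column="id"):
    """Set the column's sequence to MAX(column) so new rows continue after it."""
    # Table and column names are interpolated directly: pass trusted values only.
    cur.execute(
        f"SELECT setval(pg_get_serial_sequence('{table}', '{column}'), "
        f"(SELECT MAX({column}) FROM {table}))"
    )
```

Without the sequence reset, the next auto-generated ID would start from the sequence's old position and could clash with a migrated row.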
Step 5 — Verify
Run just migrate-verify to validate Kora against the audit snapshot. It performs the following checks:
| Check | What it validates |
|---|---|
| Schema IDs | Every schema ID from Karapace resolves in Kora to the correct schema content |
| Subject versions | Every subject/version pair maps to the expected schema ID and content |
| Subject list | The set of active subjects in Kora matches the active subjects in the audit |
The migration is complete when migrate-verify passes with zero failures.
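The schema ID check in the table above can be sketched as a comparison between the audit and whatever Kora returns for each ID. This is an illustration under assumptions, not the verification script: the expected mapping (schema ID to schema text) and the fetch_schema callable are hypothetical, and the comparison treats JSON schemas as equal when they differ only in whitespace or key order.

```python
import json


def same_schema(a, b):
    """Compare two schema texts, ignoring JSON whitespace and key order."""
    try:
        return json.loads(a) == json.loads(b)
    except ValueError:
        return a == b  # fall back to exact comparison for non-JSON text


def verify_schema_ids(expected, fetch_schema):
    """Return the schema IDs whose content in Kora differs from the audit.

    fetch_schema(schema_id) returns the schema text stored in Kora,
    or None when the ID does not resolve.
    """
    failures = []
    for schema_id, text in sorted(expected.items()):
        actual = fetch_schema(schema_id)
        if actual is None or not same_schema(text, actual):
            failures.append(schema_id)
    return failures


if __name__ == "__main__":
    audit = {1: '{"type":"string"}', 2: '{"type":"int"}'}
    kora = {1: '{"type": "string"}'}  # ID 2 is missing
    print("failed IDs:", verify_schema_ids(audit, kora.get))
```

An empty result corresponds to zero failures for this check. The subject version and subject list checks would follow the same pattern with different lookups.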
Environment variables reference
| Variable | Used by | Description |
|---|---|---|
| KARAPACE_URL | migrate-audit | Base URL of the source Karapace instance, e.g. https://karapace.example.com |
| KARAPACE_USER | migrate-audit | BasicAuth username for Karapace (optional) |
| KARAPACE_PASSWORD | migrate-audit | BasicAuth password for Karapace (optional) |
| KORA_DB_URL | migrate-dry-run, migrate-run | PostgreSQL connection URL, e.g. postgresql://user:pass@host:5432/dbname |
| KORA_URL | migrate-verify | Base URL of the target Kora instance, e.g. https://kora.example.com |
| KORA_USER | migrate-verify | BasicAuth username for Kora (optional) |
| KORA_PASSWORD | migrate-verify | BasicAuth password for Kora (optional) |
| AUDIT_FILE | migrate-dry-run, migrate-run, migrate-verify | Path to the audit JSON file. Defaults to the most recently modified file in migration/audits/ |
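The AUDIT_FILE default described in the table can be sketched as follows. This is a hypothetical helper, not the tooling's own code: resolve_audit_file is an invented name, and restricting the search to *.json files is an assumption.

```python
import os
from pathlib import Path


def resolve_audit_file(env=None, audits_dir="migration/audits"):
    """Return AUDIT_FILE if set, else the newest *.json file in audits_dir."""
    env = os.environ if env is None else env
    explicit = env.get("AUDIT_FILE")
    if explicit:
        return Path(explicit)
    candidates = list(Path(audits_dir).glob("*.json"))
    if not candidates:
        raise FileNotFoundError(f"No audit file found in {audits_dir}/")
    # The most recently modified file wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Setting AUDIT_FILE explicitly is the safer choice when several snapshots exist, because the default depends on file modification times rather than on which audit you intend to migrate.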
Troubleshooting
ERROR: N dedup collision(s) in audit
The migration script detected identical schema content assigned to different IDs in Karapace. See Step 2 — Check for dedup collisions for the resolution process.
ERROR: schema_contents is not empty (N row(s) exist)
The target Kora database already has schema data. This migration tool is designed for initial population only. If you need to re-run the migration, restore the database to a clean state first (e.g. drop and recreate the schema, then re-run Kora’s database migrations).
No audit file found in audits/
Either just migrate-audit has not been run yet, or AUDIT_FILE points to a path that does not exist. Run just migrate-audit first, or set AUDIT_FILE explicitly in your .env.
Connection refused / timeout on Karapace or Kora
Verify that:
- The URL in KARAPACE_URL / KORA_URL is reachable from the machine running the migration
- Any firewall or VPN rules allow outbound HTTP/HTTPS to those hosts
- Credentials in *_USER / *_PASSWORD are correct (test with curl -u user:pass <url>/subjects)
Connection refused on Kora PostgreSQL
Verify that:
- KORA_DB_URL uses the correct host, port, database name, and credentials
- The PostgreSQL instance allows connections from your IP (check pg_hba.conf)
- The database user has INSERT and UPDATE privileges on the Kora schema tables