diff --git a/src/current/_data/redirects.yml b/src/current/_data/redirects.yml index 9d1204e0acd..b98d2e398da 100644 --- a/src/current/_data/redirects.yml +++ b/src/current/_data/redirects.yml @@ -290,10 +290,13 @@ - 'migrate-from-serverless-to-dedicated.md' versions: ['cockroachcloud'] -- destination: molt/migrate-to-cockroachdb.md?filters=mysql +- destination: molt/migrate-data-load-and-replication.md?filters=oracle + sources: [':version/migrate-from-oracle.md'] + +- destination: molt/migrate-data-load-and-replication.md?filters=mysql sources: [':version/migrate-from-mysql.md'] -- destination: molt/migrate-to-cockroachdb.md +- destination: molt/migrate-data-load-and-replication.md sources: [':version/migrate-from-postgres.md'] - destination: molt/migration-overview.md diff --git a/src/current/_includes/molt/fetch-data-load-and-replication.md b/src/current/_includes/molt/fetch-data-load-and-replication.md deleted file mode 100644 index 928a8ac3746..00000000000 --- a/src/current/_includes/molt/fetch-data-load-and-replication.md +++ /dev/null @@ -1,38 +0,0 @@ -{% include_cached copy-clipboard.html %} -~~~ shell -molt fetch \ ---source "postgres://postgres:postgres@localhost:5432/molt?sslmode=disable" \ ---target "postgres://root@localhost:26257/molt?sslmode=disable" \ ---table-filter 'employees' \ ---bucket-path 's3://molt-test' \ ---table-handling truncate-if-exists \ ---non-interactive \ ---mode data-load-and-replication \ ---pglogical-replication-slot-name cdc_slot \ ---allow-tls-mode-disable -~~~ - -- `--table-filter` filters for tables with the `employees` string in the name. -- `--bucket-path` specifies a directory on an [Amazon S3 bucket](#data-path) where intermediate files will be written. -- `--table-handling` specifies that existing tables on CockroachDB should be truncated before the source data is loaded. -- `--mode data-load-and-replication` starts continuous [replication](#load-data-and-replicate-changes) of data from the source database to CockroachDB after the fetch task succeeds. -- `--pglogical-replication-slot-name` specifies a replication slot name to be created on the source PostgreSQL database. This is used in continuous [replication](#load-data-and-replicate-changes). - - -
-{% include_cached copy-clipboard.html %} -~~~ shell -molt fetch \ ---source "mysql://user:password@localhost/molt" \ ---target "postgres://root@localhost:26257/molt?sslmode=disable" \ ---table-filter 'employees' \ ---bucket-path 's3://molt-test' \ ---table-handling truncate-if-exists \ ---non-interactive \ ---mode data-load-and-replication \ ---allow-tls-mode-disable -~~~ - -- tk -- tk -
\ No newline at end of file diff --git a/src/current/_includes/molt/fetch-data-load-modes.md b/src/current/_includes/molt/fetch-data-load-modes.md index 0ea7d273307..a26e9a5f2be 100644 --- a/src/current/_includes/molt/fetch-data-load-modes.md +++ b/src/current/_includes/molt/fetch-data-load-modes.md @@ -1,5 +1,10 @@ -The following example migrates a single `employees` table. The table is exported to an Amazon S3 bucket and imported to CockroachDB using the [`IMPORT INTO`]({% link {{ site.current_cloud_version }}/import-into.md %}) statement, which is the [default MOLT Fetch mode]({% link molt/molt-fetch.md %}#data-movement). +MOLT Fetch can use either [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to load data into CockroachDB:

-- `IMPORT INTO` [takes the target CockroachDB tables offline]({% link {{ site.current_cloud_version }}/import-into.md %}#considerations) to maximize throughput. The tables come back online once the [import job]({% link {{site.current_cloud_version}}/import-into.md %}#view-and-control-import-jobs) completes successfully. If you need to keep the target tables online, add the `--use-copy` flag to export data with [`COPY FROM`]({% link {{ site.current_cloud_version }}/copy.md %}) instead. For more details, refer to [Data movement]({% link molt/molt-fetch.md %}#data-movement).

+| Statement | MOLT Fetch flag | Description |
+|---------------|---------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
+| `IMPORT INTO` | Default mode | Takes the target CockroachDB tables offline during the load to maximize throughput; tables come back online once the import job completes. |
+| `COPY FROM` | `--use-copy` or `--direct-copy` | Keeps the target tables online during the load; `--direct-copy` moves the source data directly to CockroachDB without intermediate storage. |

-- If you cannot move data to a public cloud, specify `--direct-copy` instead of `--bucket-path` in the `molt fetch` command. This flag instructs MOLT Fetch to use `COPY FROM` to move the source data directly to CockroachDB without an intermediate store. For more information, refer to [Direct copy]({% link molt/molt-fetch.md %}#direct-copy). \ No newline at end of file +- Use `IMPORT INTO` (the default mode) for large datasets, wide rows, or partitioned tables. +- Use `--use-copy` when tables must remain online during data load. +- Use `--direct-copy` only when you cannot move data to a public cloud or want to perform local testing; in this case, no [intermediate file storage](#intermediate-file-storage) is used. \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-data-load-output.md b/src/current/_includes/molt/fetch-data-load-output.md index cdee75ed2b1..de9085a7dc8 100644 --- a/src/current/_includes/molt/fetch-data-load-output.md +++ b/src/current/_includes/molt/fetch-data-load-output.md @@ -4,41 +4,53 @@
~~~ json - {"level":"info","type":"summary","num_tables":1,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"} + {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"} ~~~
~~~ json - {"level":"info","type":"summary","num_tables":1,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"} + {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"} ~~~
+
+ ~~~ json + {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"2358840","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"} + ~~~ +
+ `data extraction` messages are written for each table that is exported to the location in `--bucket-path`: ~~~ json - {"level":"info","table":"public.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"} + {"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"} ~~~ ~~~ json - {"level":"info","table":"public.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"} + {"level":"info","table":"migration_schema.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"} ~~~ `data import` messages are written for each table that is loaded into CockroachDB: ~~~ json - {"level":"info","table":"public.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"} + {"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"} ~~~
~~~ json - {"level":"info","table":"public.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"} + {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"} ~~~
~~~ json - {"level":"info","table":"public.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"} + {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"} + ~~~ +
+ +
+ ~~~ json + {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"2358840","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"} ~~~
@@ -46,12 +58,38 @@
~~~ json
-    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.orders"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
    ~~~
~~~ json
-    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.orders"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+    ~~~
+
+    {% if page.name != "migrate-bulk-load.md" %}
+    This message includes a `cdc_cursor` value. You must set the `--defaultGTIDSet` replication flag to this value when starting [`replication-only` mode](#replicate-changes-to-cockroachdb):
+
+    {% include_cached copy-clipboard.html %}
+    ~~~
+    --defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29
+    ~~~
+    {% endif %}
+ +
+
+    ~~~ json
+    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.orders"],"cdc_cursor":"2358840","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
+    ~~~
-
\ No newline at end of file + + + {% if page.name == "migrate-data-load-replicate-only.md" %} +
+
+    The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting `replication-only` mode](#replicate-changes-to-cockroachdb):
+
+    {% include_cached copy-clipboard.html %}
+    ~~~
+    replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786
+    ~~~
+ {% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-intermediate-file-storage.md b/src/current/_includes/molt/fetch-intermediate-file-storage.md new file mode 100644 index 00000000000..488aadc1af2 --- /dev/null +++ b/src/current/_includes/molt/fetch-intermediate-file-storage.md @@ -0,0 +1,14 @@ +MOLT Fetch can write intermediate files to either a cloud storage bucket or a local file server: + +| Destination | MOLT Fetch flag(s) | Address and authentication | +|-------------------|---------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Cloud storage | `--bucket-path` | Specify a `s3://bucket/path`, `gs://bucket/path`, or `azure-blob://bucket/path` URL. | +| Local file server | `--local-path`
`--local-path-listen-addr`
`--local-path-crdb-access-addr` | Write to `--local-path` on a local file server at `--local-path-listen-addr`; if the target CockroachDB cluster cannot reach this address, specify a publicly accessible address with `--local-path-crdb-access-addr`. No additional authentication is required. | + +{{site.data.alerts.callout_success}} +Cloud storage is often preferred over a local file server, which may require significant disk space. +{{site.data.alerts.end}} + +#### Cloud storage authentication + +{% include molt/fetch-secure-cloud-storage.md %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-metrics.md b/src/current/_includes/molt/fetch-metrics.md new file mode 100644 index 00000000000..a432d7f5fa9 --- /dev/null +++ b/src/current/_includes/molt/fetch-metrics.md @@ -0,0 +1,21 @@ +By default, MOLT Fetch exports [Prometheus](https://prometheus.io/) metrics at `http://127.0.0.1:3030/metrics`. You can override the address with `--metrics-listen-addr '{host}:{port}'`, where the endpoint will be `http://{host}:{port}/metrics`. + +Cockroach Labs recommends monitoring the following metrics during data load: + +| Metric Name | Description | +|---------------------------------------|-----------------------------------------------------------------------------------------------------------------------------| +| `molt_fetch_num_tables` | Number of tables that will be moved from the source. | +| `molt_fetch_num_task_errors` | Number of errors encountered by the fetch task. | +| `molt_fetch_overall_duration` | Duration (in seconds) of the fetch task. | +| `molt_fetch_rows_exported` | Number of rows that have been exported from a table. For example:
`molt_fetch_rows_exported{table="public.users"}` | +| `molt_fetch_rows_imported` | Number of rows that have been imported from a table. For example:
`molt_fetch_rows_imported{table="public.users"}` | +| `molt_fetch_table_export_duration_ms` | Duration (in milliseconds) of a table's export. For example:
`molt_fetch_table_export_duration_ms{table="public.users"}` | +| `molt_fetch_table_import_duration_ms` | Duration (in milliseconds) of a table's import. For example:
`molt_fetch_table_import_duration_ms{table="public.users"}` | + +You can also use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view the preceding metrics. + +{% if page.name != "migrate-bulk-load.md" %} +{{site.data.alerts.callout_info}} +Metrics from the `replicator` process are enabled by setting the `--metricsAddr` [replication flag](#replication-flags), and are served at `http://{host}:{port}/_/varz`.
To view Oracle-specific metrics from `replicator`, import [this Grafana dashboard](https://replicator.cockroachdb.com/replicator_oracle_grafana_dashboard.json).
+{{site.data.alerts.end}} +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-replication-output.md b/src/current/_includes/molt/fetch-replication-output.md index 8be689becec..28b8248c586 100644 --- a/src/current/_includes/molt/fetch-replication-output.md +++ b/src/current/_includes/molt/fetch-replication-output.md @@ -6,16 +6,21 @@ {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"starting replicator"} ~~~ - The `staging database name` message contains the name of the staging schema: + The `staging database name` message contains the name of the staging schema. The schema name contains a replication marker for streaming changes, which is used for [resuming replication]({% link molt/molt-fetch.md %}#resume-replication), or performing [failback to the source database]({% link molt/migrate-failback.md %}). + ~~~ json {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"staging database name: _replicator_1739215693817700000"} ~~~ - The staging schema provides a replication marker for streaming changes. You will need the staging schema name in case replication fails and must be [resumed]({% link molt/molt-fetch.md %}#resume-replication), or [failback to the source database]({% link molt/migrate-failback.md %}) is performed. - `upserted rows` log messages indicate that changes were replicated to CockroachDB: ~~~ shell - DEBUG [Jan 22 13:52:40] upserted rows conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"public\".\"employees\"" upserted=1 - ~~~ \ No newline at end of file + DEBUG [Jan 22 13:52:40] upserted rows conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"migration_schema\".\"employees\"" upserted=1 + ~~~ + + {% if page.name != "migrate-replicate-only.md" %} + {{site.data.alerts.callout_success}} + If replication is interrupted, you can [resume replication]({% link molt/migrate-replicate-only.md %}). + {{site.data.alerts.end}} + {% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-replicator-flags.md b/src/current/_includes/molt/fetch-replicator-flags.md new file mode 100644 index 00000000000..2551ebc6bbc --- /dev/null +++ b/src/current/_includes/molt/fetch-replicator-flags.md @@ -0,0 +1,92 @@ +In the `molt fetch` command, use `--replicator-flags` to pass options to the included `replicator` process that handles continuous replication. For details on all available flags, refer to the [MOLT Fetch documentation]({% link molt/molt-fetch.md %}#replication-flags). + +{% if page.name == "migrate-data-load-replicate-only.md" %} +
+| Flag | Description | +|-----------------|----------------------------------------------------------------------------------------------------------------| +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | +
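
For example, a sketch that enables `replicator` metrics during replication (the port is illustrative):

~~~
--replicator-flags '--metricsAddr :30005'
~~~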
+ +
+| Flag | Description | +|--------------------|-------------------------------------------------------------------------------------------------------------------------------------| +| `--defaultGTIDSet` | **Required.** Default GTID set for changefeed. | +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | +| `--userscript` | Path to a userscript that enables table filtering from MySQL sources. Refer to [Table filter userscript](#table-filter-userscript). | + +Replication from MySQL requires `--defaultGTIDSet`, which sets the starting GTID for replication. You can find this value in the `cdc_cursor` field of the `fetch complete` message after the [initial data load](#load-data-into-cockroachdb) completes. +
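
For example, a sketch that passes the required GTID set along with a table filter userscript (the values are taken from the example output above and are illustrative):

~~~
--replicator-flags '--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 --userscript table_filter.ts'
~~~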
+ +
+| Flag | Description | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------| +| `--scn` | **Required.** Snapshot System Change Number (SCN) for the initial changefeed starting point. | +| `--backfillFromSCN` | **Required.** SCN of the earliest active transaction at the time of the snapshot. Ensures no transactions are skipped. | +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | +| `--userscript` | Path to a userscript that enables table filtering from Oracle sources. Refer to [Table filter userscript](#table-filter-userscript). | + +Replication from Oracle requires `--scn` and `--backfillFromSCN`, which specify the snapshot SCN and the earliest active transaction SCN, respectively. You can find these values in the message `replication-only mode should include the following replicator flags` after the [initial data load](#load-data-into-cockroachdb) completes. +
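
For example, a sketch using the SCN values from the logged message above (the values are illustrative):

~~~
--replicator-flags '--scn 26685786 --backfillFromSCN 26685444'
~~~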
+ +{% elsif page.name == "migrate-replicate-only.md" %} +| Flag | Description | +|-------------------|----------------------------------------------------------------------------------------------------------------| +| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. | +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | + +Resuming replication requires `--stagingSchema`, which specifies the staging schema name used as a checkpoint. MOLT Fetch [logs the staging schema name]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb) as the `staging database name` when it starts replication. For example: + +~~~ json + {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"staging database name: _replicator_1749699789613149000"} +~~~ + +
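
For example, a sketch that resumes replication from the logged staging schema (the name is illustrative):

~~~
--replicator-flags '--stagingSchema _replicator_1749699789613149000'
~~~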
+{{site.data.alerts.callout_info}} +When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript]({% link molt/migrate-data-load-replicate-only.md %}?filters=mysql#table-filter-userscript). +{{site.data.alerts.end}} +
+ +
+{{site.data.alerts.callout_info}} +When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript]({% link molt/migrate-data-load-replicate-only.md %}?filters=oracle#table-filter-userscript). +{{site.data.alerts.end}} +
+ +{% elsif page.name == "migrate-data-load-and-replication.md" %} +| Flag | Description | +|-----------------|----------------------------------------------------------------------------------------------------------------| +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | + +
+{{site.data.alerts.callout_info}} +When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript](#table-filter-userscript). +{{site.data.alerts.end}} +
+ +
+{{site.data.alerts.callout_info}} +When using `--table-filter`, you must also include `--userscript`. Refer to [Table filter userscript](#table-filter-userscript). +{{site.data.alerts.end}} +
+ +{% elsif page.name == "migrate-failback.md" %} +| Flag | Description | +|--------------------|--------------------------------------------------------------------------------------------------------------------------------------| +| `--stagingSchema` | **Required.** Staging schema name for the changefeed checkpoint table. | +| `--tlsCertificate` | Path to the server TLS certificate for the webhook sink. Refer to [Secure failback for changefeed](#secure-changefeed-for-failback). | +| `--tlsPrivateKey` | Path to the server TLS private key for the webhook sink. Refer to [Secure failback for changefeed](#secure-changefeed-for-failback). | +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | + +- Failback requires `--stagingSchema`, which specifies the staging schema name used as a checkpoint. MOLT Fetch [logs the staging schema name]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb) when it starts replication: + + ~~~ shell + staging database name: _replicator_1749699789613149000 + ~~~ + +- When configuring a [secure changefeed](#secure-changefeed-for-failback) for failback, you **must** include `--tlsCertificate` and `--tlsPrivateKey`, which specify the paths to the server certificate and private key for the webhook sink connection. + +{% else %} +| Flag | Description | +|-----------------|----------------------------------------------------------------------------------------------------------------| +| `--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. | +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-schema-table-filtering.md b/src/current/_includes/molt/fetch-schema-table-filtering.md new file mode 100644 index 00000000000..1f44ad31248 --- /dev/null +++ b/src/current/_includes/molt/fetch-schema-table-filtering.md @@ -0,0 +1,29 @@ +MOLT Fetch can restrict which schemas (or users) and tables are migrated by using the following filter flags: + +| Filter type | Flag | Description | +|------------------------|----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------| +| Schema filter | `--schema-filter` | [POSIX regex](https://wikipedia.org/wiki/Regular_expression) matching schema names to include; all matching schemas and their tables are moved. | +| Table filter | `--table-filter` | POSIX regex matching table names to include across all selected schemas. | +| Table exclusion filter | `--table-exclusion-filter` | POSIX regex matching table names to exclude across all selected schemas. | + +{{site.data.alerts.callout_success}} +Use `--schema-filter` to migrate only the specified schemas, and refine which tables are moved using `--table-filter` or `--table-exclusion-filter`. +{{site.data.alerts.end}} + +
+When migrating from Oracle, you **must** include `--schema-filter` to name an Oracle schema to migrate. This prevents Fetch from attempting to load tables owned by other users. For example: + +~~~ +--schema-filter 'migration_schema' +~~~ +
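
A sketch that combines the filter flags to migrate one Oracle schema while excluding a table (the schema and table names are illustrative):

~~~
--schema-filter 'migration_schema'
--table-exclusion-filter 'payments'
~~~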
+ +{% if page.name != "migrate-bulk-load.md" %} +
+{% include molt/fetch-table-filter-userscript.md %} +
+ +
+{% include molt/fetch-table-filter-userscript.md %} +
+{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-secure-cloud-storage.md b/src/current/_includes/molt/fetch-secure-cloud-storage.md index 487a600d5e8..96c3f314863 100644 --- a/src/current/_includes/molt/fetch-secure-cloud-storage.md +++ b/src/current/_includes/molt/fetch-secure-cloud-storage.md @@ -1,39 +1,131 @@ -- When exporting data to [cloud storage]({% link molt/molt-fetch.md %}#cloud-storage), ensure that access control is properly configured: +Ensure that access control is properly configured for [Amazon S3](#amazon-s3), [Google Cloud Storage](#google-cloud-storage), or [Azure Blob Storage](#azure-blob-storage). - - If you are using [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-iam.html) for cloud storage: +
+ + + +
- - Ensure that the following environment variables are set appropriately in the terminal running `molt fetch`: +
+##### Amazon S3 - {% include_cached copy-clipboard.html %} - ~~~ shell - export AWS_REGION='us-east-1' - export AWS_SECRET_ACCESS_KEY='key' - export AWS_ACCESS_KEY_ID='id' - ~~~ +- Set the following environment variables in the terminal running `molt fetch`: - - Alternatively, set the `--use-implicit-auth` flag to use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}). + {% include_cached copy-clipboard.html %} + ~~~ shell + export AWS_REGION='us-east-1' + export AWS_SECRET_ACCESS_KEY='key' + export AWS_ACCESS_KEY_ID='id' + ~~~ - - Ensure the S3 bucket is created and accessible by authorized roles and users only. + - To run `molt fetch` in a containerized environment (e.g., Docker), pass the required environment variables using `-e`. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using `-v`. For example: - - If you are using [Google Cloud Storage](https://cloud.google.com/storage/docs/access-control) for cloud storage: + ~~~ shell + docker run \ + -e AWS_ACCESS_KEY_ID='your-access-key' \ + -e AWS_SECRET_ACCESS_KEY='your-secret-key' \ + -v ~/.aws:/root/.aws \ + -it \ + cockroachdb/molt fetch \ + --bucket-path 's3://migration/data/cockroach' ... + ~~~ - - Ensure that your local environment is authenticated using [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials): +- Alternatively, set `--use-implicit-auth` to use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}). When using assume role authentication, specify the service account with `--assume-role`. For example: - Using `gcloud`: + ~~~ + --bucket-path 's3://migration/data/cockroach' + --assume-role 'arn:aws:iam::123456789012:role/MyMigrationRole' + --use-implicit-auth + ~~~ - {% include_cached copy-clipboard.html %} - ~~~ shell - gcloud init - gcloud auth application-default login - ~~~ +- Set `--import-region` to specify an `AWS_REGION` (e.g., `--import-region 'ap-south-1'`). - Using the environment variable: +- Ensure the S3 bucket is created and accessible by authorized roles and users only. +
- {% include_cached copy-clipboard.html %} - ~~~ shell - export GOOGLE_APPLICATION_CREDENTIALS={path_to_cred_json} - ~~~ +
+##### Google Cloud Storage

- Authenticate your local environment with [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials):

    Using `gcloud`:

    {% include_cached copy-clipboard.html %}
    ~~~ shell
    gcloud init
    gcloud auth application-default login
    ~~~

    Using the environment variable:

    {% include_cached copy-clipboard.html %}
    ~~~ shell
    export GOOGLE_APPLICATION_CREDENTIALS={path_to_cred_json}
    ~~~

    - To run `molt fetch` in a containerized environment (e.g., Docker), pass the required environment variables using `-e`. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using `-v`. For example:

        ~~~ shell
        docker run \
          -e GOOGLE_APPLICATION_CREDENTIALS='/root/.config/gcloud/application_default_credentials.json' \
          -v ~/.config/gcloud:/root/.config/gcloud \
          -it \
          cockroachdb/molt fetch \
          --bucket-path 'gs://migration/data/cockroach' ...
        ~~~

- Alternatively, set `--use-implicit-auth` to use [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}?filters=gcs). When using assume role authentication, specify the service account with `--assume-role`. For example:

    ~~~
    --bucket-path 'gs://migration/data/cockroach'
    --use-implicit-auth
    --assume-role 'user-test@cluster-ephemeral.iam.gserviceaccount.com'
    ~~~

- Ensure the Google Cloud Storage bucket is created and accessible by authorized roles and users only.
+ +
+##### Azure Blob Storage

- Set the following environment variables in the terminal running `molt fetch`:

    {% include_cached copy-clipboard.html %}
    ~~~ shell
    export AZURE_ACCOUNT_NAME='account'
    export AZURE_ACCOUNT_KEY='key'
    ~~~

    You can also specify client and tenant credentials as environment variables:

    {% include_cached copy-clipboard.html %}
    ~~~ shell
    export AZURE_CLIENT_SECRET='secret'
    export AZURE_TENANT_ID='id'
    ~~~

    - To run `molt fetch` in a containerized environment (e.g., Docker), pass the required environment variables using `-e`. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using `-v`. For example:

        ~~~ shell
        docker run \
          -e AZURE_ACCOUNT_NAME='account' \
          -e AZURE_ACCOUNT_KEY='key' \
          -e AZURE_CLIENT_SECRET='secret' \
          -e AZURE_TENANT_ID='id' \
          -v ~/.azure:/root/.azure \
          -it \
          cockroachdb/molt fetch \
          --bucket-path 'azure-blob://migration/data/cockroach' ...
        ~~~

- Alternatively, set `--use-implicit-auth` to use implicit authentication. For example:

    ~~~
    --bucket-path 'azure-blob://migration/data/cockroach'
    --use-implicit-auth
    ~~~

    This mode supports Azure [managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview) and [workload identities](https://learn.microsoft.com/en-us/entra/workload-id/workload-identities-overview).

- Ensure the Azure Blob Storage container is created and accessible by authorized roles and users only.
\ No newline at end of file diff --git a/src/current/_includes/molt/fetch-secure-connection-strings.md b/src/current/_includes/molt/fetch-secure-connection-strings.md new file mode 100644 index 00000000000..1e83e057676 --- /dev/null +++ b/src/current/_includes/molt/fetch-secure-connection-strings.md @@ -0,0 +1,34 @@ +To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings: + +- Avoid plaintext connection strings. + +- URL-encode connection strings for the source database and [CockroachDB]({% link {{site.current_cloud_version}}/connect-to-the-database.md %}) so special characters in passwords are handled correctly. + + - Given a password `a$52&`, pass it to the `molt escape-password` command with single quotes: + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt escape-password 'a$52&' + ~~~ + + Use the encoded password in your `--source` connection string. For example: + + ~~~ + --source 'postgres://migration_user:a%2452%26@localhost:5432/replicationload' + ~~~ + +- Provide your connection strings as environment variables. For example: + + ~~~ shell + export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full" + export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full" + ~~~ + + Afterward, reference the environment variables as follows: + + ~~~ + --source $SOURCE + --target $TARGET + ~~~ + +- If possible, use an external secrets manager to load the environment variables from stored secrets. \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-table-filter-userscript.md b/src/current/_includes/molt/fetch-table-filter-userscript.md new file mode 100644 index 00000000000..c64c5adf595 --- /dev/null +++ b/src/current/_includes/molt/fetch-table-filter-userscript.md @@ -0,0 +1,41 @@ +#### Table filter userscript + +When loading a subset of tables using `--table-filter`, you **must** provide a TypeScript userscript to specify which tables to replicate. 
+ +For example, the following `table_filter.ts` userscript filters change events to the specified source tables: + +~~~ ts +import * as api from "replicator@v1"; + +// List the source tables (matching source names and casing) to include in replication +const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"]; + +// Update this to your target CockroachDB database and schema name +api.configureSource("defaultdb.migration_schema", { + dispatch: (doc: Document, meta: Document): Record | null => { + // Replicate only if the table matches one of the allowed tables + if (allowedTables.includes(meta.table)) { + let ret: Record = {}; + ret[meta.table] = [doc]; + return ret; + } + // Ignore all other tables + return null; + }, + deletesTo: (doc: Document, meta: Document): Record | null => { + // Optionally filter deletes the same way + if (allowedTables.includes(meta.table)) { + let ret: Record = {}; + ret[meta.table] = [doc]; + return ret; + } + return null; + }, +}); +~~~ + +Pass the userscript to MOLT Fetch with the `--userscript` [replication flag](#replication-flags): + +~~~ +--replicator-flags "--userscript table_filter.ts" +~~~ \ No newline at end of file diff --git a/src/current/_includes/molt/fetch-table-handling.md b/src/current/_includes/molt/fetch-table-handling.md new file mode 100644 index 00000000000..402ebd1c086 --- /dev/null +++ b/src/current/_includes/molt/fetch-table-handling.md @@ -0,0 +1,17 @@ +MOLT Fetch can initialize target tables on the CockroachDB database in one of three modes using `--table-handling`: + +| Mode | MOLT Fetch flag | Description | +|-------------------------------|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `none` | Default mode |
  • Loads data into existing tables without altering schema or data.
  • Exits early if schemas mismatch in some cases.
| +| `truncate-if-exists` | `--table-handling truncate-if-exists` |
  • Truncates target tables before loading data.
  • Exits early if schemas mismatch in some cases.
| +| `drop-on-target-and-recreate` | `--table-handling drop-on-target-and-recreate` |
  • Drops and recreates target tables before loading data.
  • Automatically creates missing tables with [`PRIMARY KEY`]({% link {{site.current_cloud_version}}/primary-key.md %}) and [`NOT NULL`]({% link {{site.current_cloud_version}}/not-null.md %}) constraints.
| + +- Use `none` when you need to retain existing data and schema. +- Use `--table-handling truncate-if-exists` to clear existing data while preserving schema definitions. +- Use `--table-handling drop-on-target-and-recreate` for initial imports or when source and target schemas differ, letting MOLT Fetch generate compatible tables automatically. + +{{site.data.alerts.callout_info}} +When using the `drop-on-target-and-recreate` option, only [`PRIMARY KEY`]({% link {{site.current_cloud_version}}/primary-key.md %}) and [`NOT NULL`]({% link {{site.current_cloud_version}}/not-null.md %}) constraints are preserved on the target tables. Other constraints, such as [`FOREIGN KEY`]({% link {{site.current_cloud_version}}/foreign-key.md %}) references, [`UNIQUE`]({% link {{site.current_cloud_version}}/unique.md %}), or [`DEFAULT`]({% link {{site.current_cloud_version}}/default-value.md %}) value expressions, are **not** retained. +{{site.data.alerts.end}} + +To guide schema creation with `drop-on-target-and-recreate`, you can explicitly map source types to CockroachDB types. Refer to [Type mapping]({% link molt/molt-fetch.md %}#type-mapping). \ No newline at end of file diff --git a/src/current/_includes/molt/migration-create-sql-user.md b/src/current/_includes/molt/migration-create-sql-user.md new file mode 100644 index 00000000000..5062113e8ed --- /dev/null +++ b/src/current/_includes/molt/migration-create-sql-user.md @@ -0,0 +1,82 @@ +Create a SQL user in the CockroachDB cluster that has the necessary privileges. + +To create a user `crdb_user` in the default database (you will pass this username in the [target connection string](#target-connection-string)): + +{% include_cached copy-clipboard.html %} +~~~ sql +CREATE USER crdb_user WITH PASSWORD 'password'; +~~~ + +Grant database-level privileges for schema creation within the target database: + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT ALL ON DATABASE defaultdb TO crdb_user; +~~~ + +Grant user privileges to create internal MOLT tables like `_molt_fetch_exceptions` in the public schema: + +{{site.data.alerts.callout_info}} +Ensure that you are connected to the target database. 
+{{site.data.alerts.end}} + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT CREATE ON SCHEMA public TO crdb_user; +~~~ + +If you manually created the target schema (i.e., [`drop-on-target-and-recreate`](#table-handling-mode) will not be used), grant the following privileges on the schema: + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA migration_schema TO crdb_user; +ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema +GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user; +~~~ + +Grant the same privileges for internal MOLT tables: + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO crdb_user; +ALTER DEFAULT PRIVILEGES IN SCHEMA public +GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user; +~~~ + +Depending on the MOLT Fetch [data load mode](#data-load-mode) you will use, grant the necessary privileges to run either [`IMPORT INTO`](#import-into-privileges) or [`COPY FROM`](#copy-from-privileges) on the target tables: + +#### `IMPORT INTO` privileges + +Grant `SELECT`, `INSERT`, and `DROP` (required because the table is taken offline during the `IMPORT INTO`) privileges on all tables in the [target schema](#create-the-target-schema): + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT SELECT, INSERT, DROP ON ALL TABLES IN SCHEMA migration_schema TO crdb_user; +~~~ + +If you plan to use [cloud storage with implicit authentication](#cloud-storage-authentication) for data load, grant the `EXTERNALIOIMPLICITACCESS` [system-level privilege]({% link {{site.current_cloud_version}}/security-reference/authorization.md %}#supported-privileges): + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT EXTERNALIOIMPLICITACCESS TO crdb_user; +~~~ + +#### `COPY FROM` privileges + +Grant [`admin`]({% link {{site.current_cloud_version}}/security-reference/authorization.md %}#admin-role) privileges to the user: + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT admin TO crdb_user; +~~~ + +{% if page.name != "migrate-bulk-load.md" %} +#### Replication privileges + +Grant permissions to create the staging schema for replication: + +{% include_cached copy-clipboard.html %} +~~~ sql +ALTER USER crdb_user CREATEDB; +~~~ +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/migration-modify-target-schema.md b/src/current/_includes/molt/migration-modify-target-schema.md index 001422e018c..62cb83758f1 100644 --- a/src/current/_includes/molt/migration-modify-target-schema.md +++ b/src/current/_includes/molt/migration-modify-target-schema.md @@ -1,9 +1,7 @@ -{% if page.name == "migrate-in-phases.md" %} +Add any constraints or indexes that you previously [removed from the CockroachDB schema](#drop-constraints-and-indexes) to facilitate data load. + {{site.data.alerts.callout_info}} -If you need the best possible [replication](#step-6-replicate-changes-to-cockroachdb) performance, you can perform this step right before [cutover](#step-8-cutover). +If you used the `--table-handling drop-on-target-and-recreate` option for data load, only [`PRIMARY KEY`]({% link {{ site.current_cloud_version }}/primary-key.md %}) and [`NOT NULL`]({% link {{ site.current_cloud_version }}/not-null.md %}) constraints are preserved. You **must** manually recreate all other constraints and indexes. 
{{site.data.alerts.end}} -{% endif %} - -You can now add any constraints or indexes that you previously [removed from the CockroachDB schema](#step-3-load-data-into-cockroachdb) to facilitate data load. If you used the `--table-handling drop-on-target-and-recreate` option for data load, you **must** manually recreate all indexes and constraints other than [`PRIMARY KEY`]({% link {{ site.current_cloud_version }}/primary-key.md %}) and [`NOT NULL`]({% link {{ site.current_cloud_version }}/not-null.md %}). For the appropriate SQL syntax, refer to [`ALTER TABLE ... ADD CONSTRAINT`]({% link {{ site.current_cloud_version }}/alter-table.md %}#add-constraint) and [`CREATE INDEX`]({% link {{ site.current_cloud_version }}/create-index.md %}). Review the [best practices for creating secondary indexes]({% link {{ site.current_cloud_version }}/schema-design-indexes.md %}#best-practices) on CockroachDB. \ No newline at end of file diff --git a/src/current/_includes/molt/migration-prepare-database.md b/src/current/_includes/molt/migration-prepare-database.md index 92a423b92be..70d2901cc3e 100644 --- a/src/current/_includes/molt/migration-prepare-database.md +++ b/src/current/_includes/molt/migration-prepare-database.md @@ -1,15 +1,139 @@ +#### Create migration user on source database + +Create a dedicated migration user (e.g., `MIGRATION_USER`) on the source database. This user is responsible for reading data from source tables during the migration. You will pass this username in the [source connection string](#source-connection-string). +
-Ensure that the PostgreSQL database is configured for replication. Enable logical replication by setting `wal_level` to `logical` in `postgresql.conf` or in the SQL shell. For example: +{% include_cached copy-clipboard.html %} +~~~ sql +CREATE USER migration_user WITH PASSWORD 'password'; +~~~ + +Grant the user privileges to connect, view schema objects, and select the tables you migrate. {% include_cached copy-clipboard.html %} ~~~ sql -ALTER SYSTEM SET wal_level = 'logical'; +GRANT CONNECT ON DATABASE source_database TO MIGRATION_USER; +GRANT USAGE ON SCHEMA migration_schema TO MIGRATION_USER; +GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO MIGRATION_USER; +ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT SELECT ON TABLES TO MIGRATION_USER; ~~~
-Ensure that the MySQL database is configured for replication.
+{% include_cached copy-clipboard.html %}
+~~~ sql
+CREATE USER 'migration_user'@'%' IDENTIFIED BY 'password';
+~~~
+
+Grant the user privileges to select the tables you migrate:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+GRANT SELECT ON source_database.* TO 'migration_user'@'%';
+FLUSH PRIVILEGES;
+~~~
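+
+To confirm the grants, a quick sketch:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SHOW GRANTS FOR 'migration_user'@'%';
+~~~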
+ +
+{% include_cached copy-clipboard.html %} +~~~ sql +CREATE USER MIGRATION_USER IDENTIFIED BY 'password'; +~~~ + +{{site.data.alerts.callout_info}} +When migrating from Oracle Multitenant (PDB/CDB), this should be a [common user](https://docs.oracle.com/database/121/ADMQS/GUID-DA54EBE5-43EF-4B09-B8CC-FAABA335FBB8.htm). Prefix the username with `C##` (e.g., `C##MIGRATION_USER`). +{{site.data.alerts.end}} + +Grant the user privileges to connect, read metadata, and `SELECT` and `FLASHBACK` the tables you plan to migrate. The tables should all reside in a single schema (e.g., `migration_schema`). For details, refer to [Schema and table filtering](#schema-and-table-filtering). + +##### Oracle Multitenant (PDB/CDB) user privileges + +Connect to the Oracle CDB as a DBA and grant the following: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- Basic access +GRANT CONNECT TO C##MIGRATION_USER; +GRANT CREATE SESSION TO C##MIGRATION_USER; + +-- General metadata access +GRANT EXECUTE_CATALOG_ROLE TO C##MIGRATION_USER; +GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER; +-- Access to necessary V$ views +GRANT SELECT ON V_$DATABASE TO C##MIGRATION_USER; + +-- Direct grants to specific DBA views +GRANT SELECT ON ALL_USERS TO C##MIGRATION_USER; +GRANT SELECT ON DBA_USERS TO C##MIGRATION_USER; +GRANT SELECT ON DBA_OBJECTS TO C##MIGRATION_USER; +GRANT SELECT ON DBA_SYNONYMS TO C##MIGRATION_USER; +GRANT SELECT ON DBA_TABLES TO C##MIGRATION_USER; +~~~ + +Connect to the Oracle PDB (not the CDB) as a DBA and grant the following: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- Allow C##MIGRATION_USER to connect to the PDB and see active transaction metadata +GRANT CONNECT TO C##MIGRATION_USER; +GRANT CREATE SESSION TO C##MIGRATION_USER; + +-- General metadata access +GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER; + +-- Access to necessary V$ views +GRANT SELECT ON V_$SESSION TO C##MIGRATION_USER; +GRANT SELECT ON V_$TRANSACTION TO C##MIGRATION_USER; + +-- Grant these two for every table to migrate in the migration_schema +GRANT SELECT, FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER; +~~~ + +##### Single-tenant Oracle user privileges + +Connect to the Oracle database as a DBA and grant the following: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- Basic access +GRANT CONNECT TO MIGRATION_USER; +GRANT CREATE SESSION TO MIGRATION_USER; + +-- General metadata access +GRANT SELECT_CATALOG_ROLE TO MIGRATION_USER; +GRANT EXECUTE_CATALOG_ROLE TO MIGRATION_USER; + +-- Access to necessary V$ views +GRANT SELECT ON V_$DATABASE TO MIGRATION_USER; +GRANT SELECT ON V_$SESSION TO MIGRATION_USER; +GRANT SELECT ON V_$TRANSACTION TO MIGRATION_USER; + +-- Direct grants to specific DBA views +GRANT SELECT ON ALL_USERS TO MIGRATION_USER; +GRANT SELECT ON DBA_USERS TO MIGRATION_USER; +GRANT SELECT ON DBA_OBJECTS TO MIGRATION_USER; +GRANT SELECT ON DBA_SYNONYMS TO MIGRATION_USER; +GRANT SELECT ON DBA_TABLES TO MIGRATION_USER; + +-- Grant these two for every table to migrate in the migration_schema +GRANT SELECT, FLASHBACK ON migration_schema.tbl TO MIGRATION_USER; +~~~ +
+ +{% if page.name != "migrate-bulk-load.md" %} +#### Configure source database for replication + +
+Enable logical replication by setting `wal_level` to `logical` in `postgresql.conf` or in the SQL shell. For example: + +{% include_cached copy-clipboard.html %} +~~~ sql +ALTER SYSTEM SET wal_level = 'logical'; +~~~ +
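+
+Restart PostgreSQL for the new `wal_level` to take effect, then confirm the setting with a quick check:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+SHOW wal_level;
+~~~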
+ +
For MySQL **8.0 and later** sources, enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) consistency. Set the following values in `mysql.cnf`, in the SQL shell, or as flags in the `mysql` start command: - `--enforce-gtid-consistency=ON` @@ -23,4 +147,119 @@ For MySQL **5.7** sources, set the following values. Note that `binlog-row-image - `--binlog-row-image=full` - `--server-id={ID}` - `--log-bin=log-bin` -
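
As an illustrative sketch, the MySQL 8.0 settings above can be consolidated in `mysql.cnf`:

~~~
[mysqld]
gtid-mode = ON
enforce-gtid-consistency = ON
binlog-row-metadata = full
~~~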
\ No newline at end of file + + +
+##### Create source sentinel table + +Create a checkpoint table called `_replicator_sentinel` in the Oracle schema you will migrate: + +{% include_cached copy-clipboard.html %} +~~~ sql +CREATE TABLE migration_schema."_replicator_sentinel" ( + keycol NUMBER PRIMARY KEY, + lastSCN NUMBER +); +~~~ + +Grant privileges to modify the checkpoint table. In Oracle Multitenant, grant the privileges on the PDB: + +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT SELECT, INSERT, UPDATE ON migration_schema."_replicator_sentinel" TO C##MIGRATION_USER; +~~~ + +##### Grant LogMiner privileges + +Grant LogMiner privileges. In Oracle Multitenant, grant the permissions on the CDB: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- Access to necessary V$ views +GRANT SELECT ON V_$LOG TO C##MIGRATION_USER; +GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER; +GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER; +GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER; +GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER; + +-- SYS-prefixed views (for full dictionary access) +GRANT SELECT ON SYS.V_$LOGMNR_DICTIONARY TO C##MIGRATION_USER; +GRANT SELECT ON SYS.V_$LOGMNR_LOGS TO C##MIGRATION_USER; +GRANT SELECT ON SYS.V_$LOGMNR_PARAMETERS TO C##MIGRATION_USER; +GRANT SELECT ON SYS.V_$LOGMNR_SESSION TO C##MIGRATION_USER; + +-- Access to LogMiner views and controls +GRANT LOGMINING TO C##MIGRATION_USER; +GRANT EXECUTE ON DBMS_LOGMNR TO C##MIGRATION_USER; +~~~ + +The user must: + +- Query [redo logs from LogMiner](#verify-logminer-privileges). +- Retrieve active transaction information to determine the starting point for ongoing replication. +- Update the internal [`_replicator_sentinel` table](#create-source-sentinel-table) created on the Oracle source schema by the DBA. + +##### Verify LogMiner privileges + +Query the locations of redo files in the Oracle database: + +{% include_cached copy-clipboard.html %} +~~~ sql +SELECT + l.GROUP#, + lf.MEMBER, + l.FIRST_CHANGE# AS START_SCN, + l.NEXT_CHANGE# AS END_SCN +FROM + V$LOG l +JOIN + V$LOGFILE lf +ON + l.GROUP# = lf.GROUP#; +~~~ + +~~~ + GROUP# MEMBER START_SCN END_SCN +_________ _________________________________________ ____________ ______________________ + 3 /opt/oracle/oradata/ORCLCDB/redo03.log 1232896 9295429630892703743 + 2 /opt/oracle/oradata/ORCLCDB/redo02.log 1155042 1232896 + 1 /opt/oracle/oradata/ORCLCDB/redo01.log 1141934 1155042 + +3 rows selected. +~~~ + +Get the current snapshot System Change Number (SCN): + +{% include_cached copy-clipboard.html %} +~~~ sql +SELECT CURRENT_SCN FROM V$DATABASE; +~~~ + +~~~ +CURRENT_SCN +----------- +2358840 + +1 row selected. +~~~ + +Load the redo logs into LogMiner, replacing `{current-scn}` with the SCN you queried: + +{% include_cached copy-clipboard.html %} +~~~ sql +EXEC DBMS_LOGMNR.START_LOGMNR( + STARTSCN => {current-scn}, + ENDSCN => 2358840, + OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG +); +~~~ + +~~~ +PL/SQL procedure successfully completed. +~~~ + +{{site.data.alerts.callout_success}} +If you receive `ORA-01435: user does not exist`, the Oracle user lacks sufficient LogMiner privileges. Refer to [Grant LogMiner privileges](#grant-logminer-privileges). +{{site.data.alerts.end}} +
+{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/migration-prepare-schema.md b/src/current/_includes/molt/migration-prepare-schema.md index 1b7e7f083c1..bf84bfb4955 100644 --- a/src/current/_includes/molt/migration-prepare-schema.md +++ b/src/current/_includes/molt/migration-prepare-schema.md @@ -1,42 +1,29 @@ -
-{{site.data.alerts.callout_info}} -CockroachDB supports the [PostgreSQL wire protocol](https://www.postgresql.org/docs/current/protocol.html) and is largely compatible with PostgreSQL syntax. For syntax differences, refer to [Features that differ from PostgreSQL]({% link {{ site.current_cloud_version }}/postgresql-compatibility.md %}#features-that-differ-from-postgresql). -{{site.data.alerts.end}} -
- {% include molt/migration-schema-design-practices.md %} -1. Convert your database schema to an equivalent CockroachDB schema. +#### Schema Conversion Tool - The simplest method is to use the [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) to convert your schema line-by-line. The tool accepts `.sql` files and will convert the syntax, identify [unimplemented features and syntax incompatibilities]({% link molt/migration-strategy.md %}#unimplemented-features-and-syntax-incompatibilities) in the schema, and suggest edits according to [CockroachDB best practices]({% link molt/migration-strategy.md %}#schema-design-best-practices). +The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) (SCT) automates target schema creation. It requires a free [CockroachDB {{ site.data.products.cloud }} account]({% link cockroachcloud/create-an-account.md %}). - The Schema Conversion Tool requires a free [CockroachDB {{ site.data.products.cloud }} account]({% link cockroachcloud/create-an-account.md %}). If this is not an option for you, do one of the following: - - Enable automatic schema creation when [loading data](#step-3-load-data-into-cockroachdb) with MOLT Fetch. The [`--table-handling drop-on-target-and-recreate`]({% link molt/molt-fetch.md %}#target-table-handling) option creates one-to-one [type mappings]({% link molt/molt-fetch.md %}#type-mapping) between the source database and CockroachDB and works well when the source schema is well-defined. - - Manually convert the schema according to the [schema design best practices]({% link molt/migration-strategy.md %}#schema-design-best-practices){% comment %}and data type mappings{% endcomment %}. You can also [export a partially converted schema]({% link cockroachcloud/migrations-page.md %}#export-the-schema) from the Schema Conversion Tool to finish the conversion manually. +1. Upload a source `.sql` file to convert the syntax and identify [unimplemented features and syntax incompatibilities]({% link molt/migration-strategy.md %}#unimplemented-features-and-syntax-incompatibilities) in the schema. - For additional help, contact your account team. - -1. Import the converted schema to a CockroachDB cluster. - - When migrating to CockroachDB {{ site.data.products.cloud }}, use the Schema Conversion Tool to [migrate the converted schema to a new {{ site.data.products.cloud }} database]({% link cockroachcloud/migrations-page.md %}#migrate-the-schema). - - When migrating to a {{ site.data.products.core }} CockroachDB cluster, pipe the [data definition language (DDL)]({% link {{ site.current_cloud_version }}/sql-statements.md %}#data-definition-statements) directly into [`cockroach sql`]({% link {{ site.current_cloud_version }}/cockroach-sql.md %}). You can [export a converted schema file]({% link cockroachcloud/migrations-page.md %}#export-the-schema) from the Schema Conversion Tool. - {{site.data.alerts.callout_success}} - For the fastest performance, you can use a [local, single-node CockroachDB cluster]({% link {{ site.current_cloud_version }}/cockroach-start-single-node.md %}#start-a-single-node-cluster) to convert your schema. - {{site.data.alerts.end}} +1. Import the converted schema to a CockroachDB cluster: + - When migrating to CockroachDB {{ site.data.products.cloud }}, the Schema Conversion Tool automatically [applies the converted schema to a new {{ site.data.products.cloud }} database]({% link cockroachcloud/migrations-page.md %}#migrate-the-schema). 
+ - When migrating to a {{ site.data.products.core }} CockroachDB cluster, [export a converted schema file]({% link cockroachcloud/migrations-page.md %}#export-the-schema) and pipe the [data definition language (DDL)]({% link {{ site.current_cloud_version }}/sql-statements.md %}#data-definition-statements) directly into [`cockroach sql`]({% link {{ site.current_cloud_version }}/cockroach-sql.md %}).
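+
+For example, a minimal sketch that applies an exported `schema.sql` file to a local cluster with [`cockroach sql`]; the connection URL and filename are illustrative:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+cockroach sql --url 'postgres://root@localhost:26257/defaultdb?sslmode=disable' < schema.sql
+~~~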
-When [using the Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql#convert-a-schema), syntax that cannot automatically be converted will be displayed in the [**Summary Report**]({% link cockroachcloud/migrations-page.md %}?filters=mysql#summary-report). These may include the following: +Syntax that cannot automatically be converted will be displayed in the [**Summary Report**]({% link cockroachcloud/migrations-page.md %}?filters=mysql#summary-report). These may include the following: -#### String case sensitivity +##### String case sensitivity Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB. For more information about the case sensitivity of strings in MySQL, refer to [Case Sensitivity in String Searches](https://dev.mysql.com/doc/refman/8.0/en/case-sensitivity.html) from the MySQL documentation. For more information about CockroachDB strings, refer to [`STRING`]({% link {{ site.current_cloud_version }}/string.md %}). -#### Identifier case sensitivity +##### Identifier case sensitivity Identifiers are case-sensitive in MySQL and [case-insensitive in CockroachDB]({% link {{ site.current_cloud_version }}/keywords-and-identifiers.md %}#identifiers). When [using the Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql#convert-a-schema), you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase. -#### `AUTO_INCREMENT` attribute +##### `AUTO_INCREMENT` attribute The MySQL [`AUTO_INCREMENT`](https://dev.mysql.com/doc/refman/8.0/en/example-auto-increment.html) attribute, which creates sequential column values, is not supported in CockroachDB. When [using the Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql#convert-a-schema), columns with `AUTO_INCREMENT` can be converted to use [sequences]({% link {{ site.current_cloud_version }}/create-sequence.md %}), `UUID` values with [`gen_random_uuid()`]({% link {{ site.current_cloud_version }}/functions-and-operators.md %}#id-generation-functions), or unique `INT8` values using [`unique_rowid()`]({% link {{ site.current_cloud_version }}/functions-and-operators.md %}#id-generation-functions). Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, refer to [Unique ID best practices]({% link {{ site.current_cloud_version }}/performance-best-practices-overview.md %}#unique-id-best-practices). @@ -44,19 +31,19 @@ The MySQL [`AUTO_INCREMENT`](https://dev.mysql.com/doc/refman/8.0/en/example-aut Changing a column type during schema conversion will cause [MOLT Verify]({% link molt/molt-verify.md %}) to identify a type mismatch during data validation. This is expected behavior. {{site.data.alerts.end}} -#### `ENUM` type +##### `ENUM` type MySQL `ENUM` types are defined in table columns. On CockroachDB, [`ENUM`]({% link {{ site.current_cloud_version }}/enum.md %}) is a standalone type. When [using the Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql#convert-a-schema), you can either deduplicate the `ENUM` definitions or create a separate type for each column. -#### `TINYINT` type +##### `TINYINT` type `TINYINT` data types are not supported in CockroachDB. 
The [Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql) automatically converts `TINYINT` columns to [`INT2`]({% link {{ site.current_cloud_version }}/int.md %}) (`SMALLINT`). -#### Geospatial types +##### Geospatial types MySQL geometry types are not converted to CockroachDB [geospatial types]({% link {{ site.current_cloud_version }}/spatial-data-overview.md %}#spatial-objects) by the [Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}?filters=mysql). They should be manually converted to the corresponding types in CockroachDB. -#### `FIELD` function +##### `FIELD` function The MySQL `FIELD` function is not supported in CockroachDB. Instead, you can use the [`array_position`]({% link {{ site.current_cloud_version }}/functions-and-operators.md %}#array-functions) function, which returns the index of the first occurrence of the element in the array. @@ -80,4 +67,8 @@ While MySQL returns 0 when the element is not found, CockroachDB returns `NULL`. ~~~ sql SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999); ~~~ -
\ No newline at end of file + + +#### Drop constraints and indexes + +{% include molt/molt-drop-constraints-indexes.md %} \ No newline at end of file diff --git a/src/current/_includes/molt/migration-schema-design-practices.md b/src/current/_includes/molt/migration-schema-design-practices.md index 53ca67fd37e..5cc0828b1bf 100644 --- a/src/current/_includes/molt/migration-schema-design-practices.md +++ b/src/current/_includes/molt/migration-schema-design-practices.md @@ -1,5 +1,37 @@ -Follow these recommendations when converting your schema for compatibility with CockroachDB. +Convert the source schema into a CockroachDB-compatible schema. CockroachDB supports the PostgreSQL wire protocol and is largely [compatible with PostgreSQL syntax]({% link {{ site.current_cloud_version }}/postgresql-compatibility.md %}#features-that-differ-from-postgresql). -- Define an explicit primary key on every table. For more information, refer to [Primary key best practices]({% link {{ site.current_cloud_version }}/schema-design-table.md %}#primary-key-best-practices). +- The source and target schemas must **match**. Review [Type mapping]({% link molt/molt-fetch.md %}#type-mapping) to understand which source types can be mapped to CockroachDB types. + + For example, a source table defined as `CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY);` must have a corresponding schema and table in CockroachDB: + + {% include_cached copy-clipboard.html %} + ~~~ sql + CREATE SCHEMA migration_schema; + CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY); + ~~~ + + - MOLT Fetch can automatically create a matching CockroachDB schema using the {% if page.name != "migration-strategy.md" %}[`drop-on-target-and-recreate`](#table-handling-mode){% else %}[`drop-on-target-and-recreate`]({% link molt/molt-fetch.md %}#target-table-handling){% endif %} option. + + - If you create the target schema manually, review how MOLT Fetch handles [type mismatches]({% link molt/molt-fetch.md %}#mismatch-handling). You can use the {% if page.name != "migration-strategy.md" %}[MOLT Schema Conversion Tool](#schema-conversion-tool){% else %}[MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}){% endif %} to create a matching schema. + +
+ - By default, table and column names are case-insensitive in MOLT Fetch. If using the [`--case-sensitive`]({% link molt/molt-fetch.md %}#global-flags) flag, schema, table, and column names must match Oracle's default uppercase identifiers. Use quoted names on the target to preserve case. For example, the following target table definition uses unquoted identifiers, which CockroachDB folds to lowercase, and will cause errors during migration: + + ~~~ sql + CREATE TABLE co.stores (... store_id ...); + ~~~ + + It should be written as: + + ~~~ sql + CREATE TABLE "CO"."STORES" (... "STORE_ID" ...); + ~~~ + + When using `--case-sensitive`, quote all identifiers and match the case exactly (for example, use `"CO"."STORES"` and `"STORE_ID"`). +
+ +- Every table **must** have an explicit primary key. For more information, refer to [Primary key best practices]({% link {{ site.current_cloud_version }}/schema-design-table.md %}#primary-key-best-practices). + +- Review [Transformations]({% link molt/molt-fetch.md %}#transformations) to understand how computed columns and partitioned tables can be mapped to the target, and how target tables can be renamed. - By default on CockroachDB, `INT` is an alias for `INT8`, which creates 64-bit signed integers. PostgreSQL and MySQL default to 32-bit integers. Depending on your source database or application requirements, you may need to change the integer size to `4`. For more information, refer to [Considerations for 64-bit signed integers]({% link {{ site.current_cloud_version }}/int.md %}#considerations-for-64-bit-signed-integers). \ No newline at end of file diff --git a/src/current/_includes/molt/molt-connection-strings.md b/src/current/_includes/molt/molt-connection-strings.md new file mode 100644 index 00000000000..f33426755e3 --- /dev/null +++ b/src/current/_includes/molt/molt-connection-strings.md @@ -0,0 +1,69 @@ +Define the connection strings for the [source](#source-connection-string) and [target](#target-connection-string) databases, and keep them [secure](#secure-connections). + +#### Source connection string + +The `--source` flag specifies the connection string for the source database: + +
+~~~ +--source 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full' +~~~ + +For example: + +~~~ +--source 'postgres://migration_user:password@localhost:5432/molt?sslmode=verify-full' +~~~ +
+ +
+~~~ +--source 'mysql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full&sslcert={path_to_client_crt}&sslkey={path_to_client_key}&sslrootcert={path_to_ca_crt}' +~~~ + +For example: + +~~~ +--source 'mysql://migration_user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' +~~~ +
+ +
+~~~ +--source 'oracle://{username}:{password}@{host}:{port}/{service_name}' +~~~ + +In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username specified in both `--source` and `--source-cdb` must be a common user with the privileges described in [Create migration user on source database](#create-migration-user-on-source-database). + +~~~ +--source 'oracle://{username}:{password}@{host}:{port}/{PDB_service_name}' +--source-cdb 'oracle://{username}:{password}@{host}:{port}/{CDB_service_name}' +~~~ + +URL-encode the `C##` prefix in the Oracle Multitenant username. For example, write `C##MIGRATION_USER` as `C%23%23MIGRATION_USER`: + +~~~ +--source 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1' +--source-cdb 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLCDB' +~~~ +
+ +#### Target connection string + +The `--target` flag specifies the connection string for the target CockroachDB database: + +~~~ +--target 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full' +~~~ + +For example: + +~~~ +--target 'postgres://crdb_user:password@localhost:26257/defaultdb?sslmode=verify-full' +~~~ + +For details, refer to [Connect using a URL]({% link {{site.current_cloud_version}}/connection-parameters.md %}#connect-using-a-url). + +#### Secure connections + +{% include molt/fetch-secure-connection-strings.md %} \ No newline at end of file diff --git a/src/current/_includes/molt/molt-drop-constraints-indexes.md b/src/current/_includes/molt/molt-drop-constraints-indexes.md new file mode 100644 index 00000000000..a6456d779fa --- /dev/null +++ b/src/current/_includes/molt/molt-drop-constraints-indexes.md @@ -0,0 +1,28 @@ +To optimize data load performance, drop all non-`PRIMARY KEY` [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#drop-constraint) and [indexes]({% link {{site.current_cloud_version}}/drop-index.md %}) on the target CockroachDB database before migrating: +{% if page.name == "molt-fetch.md" %} + - [`FOREIGN KEY`]({% link {{ site.current_cloud_version }}/foreign-key.md %}) + - [`UNIQUE`]({% link {{ site.current_cloud_version }}/unique.md %}) + - [Secondary indexes]({% link {{ site.current_cloud_version }}/schema-design-indexes.md %}) + - [`CHECK`]({% link {{ site.current_cloud_version }}/check.md %}) + - [`DEFAULT`]({% link {{ site.current_cloud_version }}/default-value.md %}) + - [`NOT NULL`]({% link {{ site.current_cloud_version }}/not-null.md %}) (you do not need to drop this constraint when using `drop-on-target-and-recreate` for [table handling](#target-table-handling)) + + {{site.data.alerts.callout_danger}} + Do **not** drop [`PRIMARY KEY`]({% link {{ site.current_cloud_version }}/primary-key.md %}) constraints. + {{site.data.alerts.end}} + + You can recreate [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#add-constraint) and [indexes]({% link {{site.current_cloud_version}}/create-index.md %}) after loading the data. +{% else %} +- [`FOREIGN KEY`]({% link {{ site.current_cloud_version }}/foreign-key.md %}) +- [`UNIQUE`]({% link {{ site.current_cloud_version }}/unique.md %}) +- [Secondary indexes]({% link {{ site.current_cloud_version }}/schema-design-indexes.md %}) +- [`CHECK`]({% link {{ site.current_cloud_version }}/check.md %}) +- [`DEFAULT`]({% link {{ site.current_cloud_version }}/default-value.md %}) +- [`NOT NULL`]({% link {{ site.current_cloud_version }}/not-null.md %}) (you do not need to drop this constraint when using `drop-on-target-and-recreate` for [table handling](#table-handling-mode)) + +{{site.data.alerts.callout_danger}} +Do **not** drop [`PRIMARY KEY`]({% link {{ site.current_cloud_version }}/primary-key.md %}) constraints. +{{site.data.alerts.end}} + +You can [recreate the constraints and indexes after loading the data](#modify-the-cockroachdb-schema). +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/molt-limitations.md b/src/current/_includes/molt/molt-limitations.md new file mode 100644 index 00000000000..00fc35f014c --- /dev/null +++ b/src/current/_includes/molt/molt-limitations.md @@ -0,0 +1,41 @@ +### Limitations + +
+- PostgreSQL large object (`OID` LOB) types are not supported. Similar types, such as `BYTEA`, are supported. +
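+
+<section class="filter-content" markdown="1" data-scope="postgres">
+To check whether the source database uses large objects at all, you can count the entries in the large-object catalog (a quick sanity check, not a definitive audit):
+
+~~~ sql
+SELECT count(*) FROM pg_largeobject_metadata;
+~~~
+</section>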
+ +
+- Migrations must be performed from a single Oracle schema. You **must** include [`--schema-filter`](#schema-and-table-filtering) so that MOLT Fetch only loads data from the specified schema. Refer to [Schema and table filtering](#schema-and-table-filtering). + - Specifying [`--table-filter`](#schema-and-table-filtering) is also strongly recommended to ensure that only necessary tables are migrated from the Oracle schema. +- Oracle advises against `LONG RAW` columns and [recommends converting them to `BLOB`](https://www.orafaq.com/wiki/LONG_RAW#History). `LONG RAW` can only store binary values up to 2GB, and only one `LONG RAW` column per table is supported. +
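+
+<section class="filter-content" markdown="1" data-scope="oracle">
+To find `LONG RAW` columns before migrating, you can query the Oracle data dictionary (a sketch; `MIGRATION_SCHEMA` stands in for the schema owner used in these examples):
+
+~~~ sql
+SELECT table_name, column_name FROM all_tab_columns WHERE owner = 'MIGRATION_SCHEMA' AND data_type = 'LONG RAW';
+~~~
+</section>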
+ +- Only tables with [primary key]({% link {{ site.current_cloud_version }}/primary-key.md %}) types of [`INT`]({% link {{ site.current_cloud_version }}/int.md %}), [`FLOAT`]({% link {{ site.current_cloud_version }}/float.md %}), or [`UUID`]({% link {{ site.current_cloud_version }}/uuid.md %}) can be sharded with [`--export-concurrency`]({% link molt/molt-fetch.md %}#best-practices). + +{% if page.name != "migrate-bulk-load.md" %} +#### Replication limitations + +
+- Replication modes require write access to the PostgreSQL primary instance. MOLT cannot create replication slots or run replication against a read replica. +
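+
+<section class="filter-content" markdown="1" data-scope="postgres">
+To confirm that a connection points at the primary rather than a read replica, you can check the recovery status; `pg_is_in_recovery()` returns `false` on the primary:
+
+~~~ sql
+SELECT pg_is_in_recovery();
+~~~
+</section>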
+ +
+- MySQL replication is supported only with GTID-based configurations. Binlog-based features that do not use GTID are not supported. +
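+
+<section class="filter-content" markdown="1" data-scope="mysql">
+To confirm that the source uses GTID-based replication, you can inspect the relevant server variables (a quick check; the exact required configuration may vary by MySQL version and topology):
+
+~~~ sql
+SHOW VARIABLES LIKE 'gtid_mode';
+SHOW VARIABLES LIKE 'enforce_gtid_consistency';
+~~~
+</section>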
+ +
+- Replication will not work for table or column names exceeding 30 characters. This is a [limitation of Oracle LogMiner](https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle-logminer-utility.html#GUID-7594F0D7-0ACD-46E6-BD61-2751136ECDB4). +- The following data types are not supported for replication: + - User-defined types (UDTs) + - Nested tables + - `VARRAY` + - `LONGBLOB`/`CLOB` columns (over 4000 characters) +- If your Oracle workload executes `UPDATE` statements that modify only LOB columns, these `UPDATE` statements are not supported by Oracle LogMiner and will not be replicated. +- If you are using Oracle 11 and execute `UPDATE` statements on `XMLTYPE` or LOB columns, those changes are not supported by Oracle LogMiner and will be excluded from ongoing replication. +- If you are migrating LOB columns from Oracle 12c, use [AWS DMS Binary Reader](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html#CHAP_Source.Oracle.CDC) instead of LogMiner. Oracle LogMiner does not support LOB replication in 12c. +
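+
+<section class="filter-content" markdown="1" data-scope="oracle">
+To flag table names that exceed the 30-character LogMiner limit noted above, you can query the data dictionary (a sketch; `MIGRATION_SCHEMA` stands in for the schema owner used in these examples):
+
+~~~ sql
+SELECT table_name FROM all_tables WHERE owner = 'MIGRATION_SCHEMA' AND LENGTH(table_name) > 30;
+~~~
+</section>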
+ +- Running DDL on the source or target while replication is in progress can cause replication failures. +- `TRUNCATE` operations on the source are not captured. Only `INSERT`, `UPDATE`, `UPSERT`, and `DELETE` events are replicated. +- Changes to virtual columns are not replicated automatically. To migrate these columns, you must define them explicitly with [transformation rules]({% link molt/molt-fetch.md %}#transformations). +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/molt-setup.md b/src/current/_includes/molt/molt-setup.md new file mode 100644 index 00000000000..6a04a2c6615 --- /dev/null +++ b/src/current/_includes/molt/molt-setup.md @@ -0,0 +1,82 @@ +
+ + + +
+ +
+{{site.data.alerts.callout_info}} +{% include feature-phases/preview.md %} +{{site.data.alerts.end}} +
+ +## Before you begin + +- Create a CockroachDB [{{ site.data.products.cloud }}]({% link cockroachcloud/create-your-cluster.md %}) or [{{ site.data.products.core }}]({% link {{ site.current_cloud_version }}/install-cockroachdb-mac.md %}) cluster. +- Install the [MOLT (Migrate Off Legacy Technology)]({% link releases/molt.md %}#installation) tools. +- Review the MOLT Fetch [best practices]({% link molt/molt-fetch.md %}#best-practices). +- Review [Migration Strategy]({% link molt/migration-strategy.md %}). + +
+{% include molt/oracle-migration-prerequisites.md %} +
+ +{% include molt/molt-limitations.md %} + +## Prepare the source database + +{% include molt/migration-prepare-database.md %} + +## Prepare the target database + +### Create the target schema + +{% include molt/migration-prepare-schema.md %} + +### Create the SQL user + +{% include molt/migration-create-sql-user.md %} + +## Configure data load + +When you run `molt fetch`, you can configure the following options for data load: + +- [Connection strings](#connection-strings): Specify URL‑encoded source and target connections. +- [Intermediate file storage](#intermediate-file-storage): Export data to cloud storage or a local file server. +- [Table handling mode](#table-handling-mode): Determine how existing target tables are initialized before load. +- [Schema and table filtering](#schema-and-table-filtering): Specify schema and table names to migrate. +- [Data load mode](#data-load-mode): Choose between `IMPORT INTO` and `COPY FROM`. +- [Metrics](#metrics): Configure metrics collection during the load. +{% if page.name != "migrate-bulk-load.md" %} +- [Replication flags](#replication-flags): Configure the `replicator` process. +{% endif %} + +### Connection strings + +{% include molt/molt-connection-strings.md %} + +### Intermediate file storage + +{% include molt/fetch-intermediate-file-storage.md %} + +### Table handling mode + +{% include molt/fetch-table-handling.md %} + +### Schema and table filtering + +{% include molt/fetch-schema-table-filtering.md %} + +### Data load mode + +{% include molt/fetch-data-load-modes.md %} + +### Metrics + +{% include molt/fetch-metrics.md %} + +{% if page.name == "migrate-data-load-and-replication.md" %} +### Replication flags + +{% include molt/fetch-replicator-flags.md %} +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/molt-troubleshooting.md b/src/current/_includes/molt/molt-troubleshooting.md new file mode 100644 index 00000000000..e885bf5edb7 --- /dev/null +++ b/src/current/_includes/molt/molt-troubleshooting.md @@ -0,0 +1,80 @@ +## Troubleshooting + +##### Fetch exits early due to mismatches + +`molt fetch` exits early in the following cases, and will output a log with a corresponding `mismatch_tag` and `failable_mismatch` set to `true`: + +- A source table is missing a primary key. +- A source primary key and target primary key have mismatching types. +- A [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) primary key has a different [collation]({% link {{site.current_cloud_version}}/collate.md %}) on the source and target. +- A source and target column have mismatching types that are not [allowable mappings]({% link molt/molt-fetch.md %}#type-mapping). +- A target table is missing a column that is in the corresponding source table. +- A source column is nullable, but the corresponding target column is not nullable (i.e., the constraint is more strict on the target). + +`molt fetch` can continue in the following cases, and will output a log with a corresponding `mismatch_tag` and `failable_mismatch` set to `false`: + +- A target table has a column that is not in the corresponding source table. +- A source column has a `NOT NULL` constraint, and the corresponding target column is nullable (i.e., the constraint is less strict on the target). 
+- A [`DEFAULT`]({% link {{site.current_cloud_version}}/default-value.md %}), [`CHECK`]({% link {{site.current_cloud_version}}/check.md %}), [`FOREIGN KEY`]({% link {{site.current_cloud_version}}/foreign-key.md %}), or [`UNIQUE`]({% link {{site.current_cloud_version}}/unique.md %}) constraint is specified on a target column and not on the source column. + +
+##### ORA-01950: no privileges on tablespace + +If you receive `ORA-01950: no privileges on tablespace 'USERS'`, it means the Oracle table owner (`migration_schema` in the preceding examples) does not have sufficient quota on the tablespace used to store its data. By default, this tablespace is `USERS`, but it can vary. To resolve this issue, grant a quota to the table owner. For example: + +~~~ sql +-- change UNLIMITED to a suitable limit for the table owner +ALTER USER migration_schema QUOTA UNLIMITED ON USERS; +~~~ + +##### No tables to drop and recreate on target + +When expecting a bulk load but seeing `no tables to drop and recreate on the target`, ensure the migration user has `SELECT` and `FLASHBACK` privileges on each table to be migrated. For example: + +~~~ sql +GRANT SELECT, FLASHBACK ON migration_schema.employees TO C##MIGRATION_USER; +GRANT SELECT, FLASHBACK ON migration_schema.payments TO C##MIGRATION_USER; +GRANT SELECT, FLASHBACK ON migration_schema.orders TO C##MIGRATION_USER; +~~~ + +##### Table or view does not exist + +If the Oracle migration user lacks privileges on certain tables, you may receive errors stating that the table or view does not exist. Either use `--table-filter` to [limit the tables to be migrated](#schema-and-table-filtering), or grant the migration user `SELECT` privileges on all objects in the schema. Refer to [Create migration user on source database](#create-migration-user-on-source-database). + +{% if page.name != "migrate-bulk-load.md" %} +##### Missing redo logs or unavailable SCN + +If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable. + +Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment. + +##### Missing replicator flags + +If required `--replicator-flags` are missing, ensure that the necessary flags for your mode are included. For details, refer to [Replication flags](#replication-flags). + +##### Replicator lag + +If the `replicator` process is lagging significantly behind the current Oracle SCN, you may see log messages like: `replicator is catching up to the current SCN at 5000 from 1000…`. This indicates that replication is progressing but is still behind the most recent changes on the source database. +{% endif %} + +##### Oracle sessions remain open after forcefully stopping `molt` or `replicator` + +If you shut down `molt` or `replicator` unexpectedly (e.g., with `kill -9` or a system crash), Oracle sessions opened by these tools may remain active. + +- Check your operating system for any running `molt` or `replicator` processes and terminate them manually. +- After confirming that both processes have stopped, ask a DBA to check for active Oracle sessions using: + + ~~~ sql + SELECT sid, serial#, username, status, osuser, machine, program + FROM v$session + WHERE username = 'C##MIGRATION_USER'; + ~~~ + + Wait until any remaining sessions display an `INACTIVE` status, then terminate them using: + + ~~~ sql + ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE; + ~~~ + + Replace `sid` and `serial#` in the preceding statement with the values returned by the `SELECT` query. +
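+
+    For example, if the query returns a session with `sid` `123` and `serial#` `45678` (illustrative values):
+
+    ~~~ sql
+    ALTER SYSTEM KILL SESSION '123,45678' IMMEDIATE;
+    ~~~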
\ No newline at end of file diff --git a/src/current/_includes/molt/oracle-migration-prerequisites.md b/src/current/_includes/molt/oracle-migration-prerequisites.md new file mode 100644 index 00000000000..8fb1422a5da --- /dev/null +++ b/src/current/_includes/molt/oracle-migration-prerequisites.md @@ -0,0 +1,71 @@ +### Prerequisites + +#### Oracle Instant Client + +Install Oracle Instant Client on the machine that will run `molt` and `replicator`: + +- On macOS ARM machines, download the [Oracle Instant Client](https://www.oracle.com/database/technologies/instant-client/macos-arm64-downloads.html#ic_osx_inst). After installation, you should have a new directory at `/Users/$USER/Downloads/instantclient_23_3` containing `.dylib` files. Set the `LD_LIBRARY_PATH` environment variable to this directory: + + {% include_cached copy-clipboard.html %} + ~~~ shell + export LD_LIBRARY_PATH=/Users/$USER/Downloads/instantclient_23_3 + ~~~ + +- On Linux machines, install the Oracle Instant Client dependencies and set the `LD_LIBRARY_PATH` to the client library path: + + {% include_cached copy-clipboard.html %} + ~~~ shell + sudo apt-get install -yqq --no-install-recommends libaio1t64 + sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1 + curl -o /tmp/ora-libs.zip https://replicator.cockroachdb.com/third_party/instantclient-basiclite-linux-amd64.zip + unzip -d /tmp /tmp/ora-libs.zip + sudo mv /tmp/instantclient_21_13/* /usr/lib + export LD_LIBRARY_PATH=/usr/lib + ~~~ + +{% if page.name != "migrate-bulk-load.md" %} +#### Enable `ARCHIVELOG` + +Enable `ARCHIVELOG` mode on the Oracle database. This is required for Oracle LogMiner, Oracle's built-in changefeed tool that captures DML events for replication. + +{% include_cached copy-clipboard.html %} +~~~ sql +SELECT log_mode FROM v$database; +SHUTDOWN IMMEDIATE; +STARTUP MOUNT; +ALTER DATABASE ARCHIVELOG; +ALTER DATABASE OPEN; +SELECT log_mode FROM v$database; +~~~ + +~~~ +LOG_MODE +-------- +ARCHIVELOG + +1 row selected. +~~~ + +Enable supplemental primary key logging for logical replication: + +{% include_cached copy-clipboard.html %} +~~~ sql +ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS; +SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database; +~~~ + +~~~ +SUPPLEMENTAL_LOG_DATA_MIN SUPPLEMENTAL_LOG_DATA_PK +------------------------- ------------------------ +IMPLICIT YES + +1 row selected. +~~~ + +Enable `FORCE_LOGGING` to ensure that all data changes are captured for the tables to migrate: + +{% include_cached copy-clipboard.html %} +~~~ sql +ALTER DATABASE FORCE LOGGING; +~~~ +{% endif %} \ No newline at end of file diff --git a/src/current/_includes/molt/replicator-flags.md b/src/current/_includes/molt/replicator-flags.md index 791870ba669..db2f4a19c30 100644 --- a/src/current/_includes/molt/replicator-flags.md +++ b/src/current/_includes/molt/replicator-flags.md @@ -2,45 +2,45 @@ The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication. 
-| Flag | Type | Description | -|----------------------------------------|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `--applyTimeout` | `DURATION` | The maximum amount of time to wait for an update to be applied.

**Default:** `30s` | -| `--dlqTableName` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.

**Default:** `replicator_dlq` | -| `--flushPeriod` | `DURATION` | Flush queued mutations after this duration.

**Default:** `1s` | -| `--flushSize` | `INT` | Ideal batch size to determine when to flush mutations.

**Default:** `1000` | -| `--gracePeriod` | `DURATION` | Allow background processes to exit.

**Default:** `30s` | -| `--logDestination` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. | -| `--logFormat` | `STRING` | Choose log output format: `"fluent"`, `"text"`.

**Default:** `"text"` | -| `--metricsAddr` | `STRING` | A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | -| `--parallelism` | `INT` | The number of concurrent database transactions to use.

**Default:** `16` | -| `--quiescentPeriod` | `DURATION` | How often to retry deferred mutations.

**Default:** `10s` | -| `--retireOffset` | `DURATION` | How long to delay removal of applied mutations.

**Default:** `24h0m0s` | -| `--scanSize` | `INT` | The number of rows to retrieve from the staging database used to store metadata for [replication modes](#fetch-mode).

**Default:** `10000` | -| `--schemaRefresh` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.

**Default:** `1m0s` | -| `--sourceConn` | `STRING` | The source database's connection string. | -| `--stageDisableCreateTableReaderIndex` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.

**Default:** `false` | -| `--stageMarkAppliedLimit` | `INT` | Limit the number of mutations to be marked applied in a single statement.

**Default:** `100000` | -| `--stageSanityCheckPeriod` | `DURATION` | How often to validate staging table apply order (`-1` to disable).

**Default:** `10m0s` | -| `--stageSanityCheckWindow` | `DURATION` | How far back to look when validating staging table apply order.

**Default:** `1h0m0s` | -| `--stageUnappliedPeriod` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).

**Default:** `1m0s` | -| `--stagingConn` | `STRING` | The staging database's connection string. | -| `--stagingCreateSchema` | | Automatically create the staging schema if it does not exist. | -| `--stagingIdleTime` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | -| `--stagingJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | -| `--stagingMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | -| `--stagingMaxPoolSize` | `INT` | The maximum number of staging database connections.

**Default:** `128` | -| `--stagingSchema` | `STRING` | Name of the SQL database schema that stores replication metadata. **Required** each time [`--mode replication-only`](#replicate-changes) is rerun after being interrupted, as the schema provides a replication marker for streaming changes. For details, refer to [Replicate changes](#replicate-changes).

**Default:** `_replicator.public` | -| `--targetConn` | `STRING` | The target database's connection string. | -| `--targetIdleTime` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | -| `--targetJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | -| `--targetMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | -| `--targetMaxPoolSize` | `INT` | The maximum number of target database connections.

**Default:** `128` | -| `--targetSchema` | `STRING` | The SQL database schema in the target cluster to update. | -| `--targetStatementCacheSize` | `INT` | The maximum number of prepared statements to retain.

**Default:** `128` | -| `--taskGracePeriod` | `DURATION` | How long to allow for task cleanup when recovering from errors.

**Default:** `1m0s` | -| `--timestampLimit` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.

**Default:** `1000` | -| `--userscript` | `STRING` | The path to a configuration script, see `userscript` subcommand. | -| `-v`, `--verbose` | `COUNT` | Increase logging verbosity to `debug`; repeat for `trace`. | +| Flag | Type | Description | +|----------------------------------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `--applyTimeout` | `DURATION` | The maximum amount of time to wait for an update to be applied.

**Default:** `30s` | +| `--dlqTableName` | `IDENT` | The name of a table in the target schema for storing dead-letter entries.

**Default:** `replicator_dlq` | +| `--flushPeriod` | `DURATION` | Flush queued mutations after this duration.

**Default:** `1s` | +| `--flushSize` | `INT` | Ideal batch size to determine when to flush mutations.

**Default:** `1000` | +| `--gracePeriod` | `DURATION` | Allow background processes to exit.

**Default:** `30s` | +| `--logDestination` | `STRING` | Write logs to a file. If not specified, write logs to `stdout`. | +| `--logFormat` | `STRING` | Choose log output format: `"fluent"`, `"text"`.

**Default:** `"text"` | +| `--metricsAddr` | `STRING` | A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`. | +| `--parallelism` | `INT` | The number of concurrent database transactions to use.

**Default:** `16` | +| `--quiescentPeriod` | `DURATION` | How often to retry deferred mutations.

**Default:** `10s` | +| `--retireOffset` | `DURATION` | How long to delay removal of applied mutations.

**Default:** `24h0m0s` | +| `--scanSize` | `INT` | The number of rows to retrieve from the staging database used to store metadata for [replication modes](#fetch-mode).

**Default:** `10000` | +| `--schemaRefresh` | `DURATION` | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.

**Default:** `1m0s` | +| `--sourceConn` | `STRING` | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to [Oracle replication flags](#oracle-replication-flags). | +| `--stageDisableCreateTableReaderIndex` | `BOOL` | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact.

**Default:** `false` | +| `--stageMarkAppliedLimit` | `INT` | Limit the number of mutations to be marked applied in a single statement.

**Default:** `100000` | +| `--stageSanityCheckPeriod` | `DURATION` | How often to validate staging table apply order (`-1` to disable).

**Default:** `10m0s` | +| `--stageSanityCheckWindow` | `DURATION` | How far back to look when validating staging table apply order.

**Default:** `1h0m0s` | +| `--stageUnappliedPeriod` | `DURATION` | How often to report the number of unapplied mutations in staging tables (`-1` to disable).

**Default:** `1m0s` | +| `--stagingConn` | `STRING` | The staging database's connection string. | +| `--stagingCreateSchema` | | Automatically create the staging schema if it does not exist. | +| `--stagingIdleTime` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | +| `--stagingJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | +| `--stagingMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | +| `--stagingMaxPoolSize` | `INT` | The maximum number of staging database connections.

**Default:** `128` | +| `--stagingSchema` | `STRING` | Name of the CockroachDB schema that stores replication metadata. **Required** each time [`--mode replication-only`](#replicate-changes) is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction. For details, refer to [Replicate changes](#replicate-changes).

**Default:** `_replicator.public` | +| `--targetConn` | `STRING` | The target database's connection string. | +| `--targetIdleTime` | `DURATION` | Maximum lifetime of an idle connection.

**Default:** `1m0s` | +| `--targetJitterTime` | `DURATION` | The time over which to jitter database pool disconnections.

**Default:** `15s` | +| `--targetMaxLifetime` | `DURATION` | The maximum lifetime of a database connection.

**Default:** `5m0s` | +| `--targetMaxPoolSize` | `INT` | The maximum number of target database connections.

**Default:** `128` | +| `--targetSchema` | `STRING` | The SQL database schema in the target cluster to update. | +| `--targetStatementCacheSize` | `INT` | The maximum number of prepared statements to retain.

**Default:** `128` | +| `--taskGracePeriod` | `DURATION` | How long to allow for task cleanup when recovering from errors.

**Default:** `1m0s` | +| `--timestampLimit` | `INT` | The maximum number of source timestamps to coalesce into a target transaction.

**Default:** `1000` | +| `--userscript` | `STRING` | The path to a configuration script, see `userscript` subcommand. | +| `-v`, `--verbose` | `COUNT` | Increase logging verbosity to `debug`; repeat for `trace`. | ##### PostgreSQL replication flags @@ -62,6 +62,18 @@ The following flags are set with [`--replicator-flags`](#global-flags) and can b | `--fetchMetadata` | | Fetch column metadata explicitly, for older versions of MySQL that don't support `binlog_row_metadata`. | | `--replicationProcessID` | `UINT32` | The replication process ID to report to the source database.

**Default:** `10` | + +##### Oracle replication flags + +The following flags are set with [`--replicator-flags`](#global-flags) and can be used in any [Fetch mode](#fetch-mode) that involves replication from an [Oracle source database](#source-and-target-databases). + +| Flag | Type | Description | +|------------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `--scn` | `INT` | The snapshot System Change Number (SCN) queried by MOLT Fetch for the initial data load. | +| `--backfillFromSCN` | `INT` | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. | +| `--sourcePDBConn` | `STRING` | Connection string for the Oracle pluggable database (PDB). Only required when using an [Oracle multitenant configuration](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html). [`--sourceConn`](#replication-flags) **must** be included. | +| `--schema-filter` | `STRING` | Restricts replication to the specified Oracle PDB schema (user). Set to the PDB user that owns the tables you want to replicate. Without this flag, replication will be attempted on tables owned by other users. | +| `--oracle-application-users` | `STRING` | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`). | + +##### Failback replication flags + +The following flags are set with [`--replicator-flags`](#global-flags) and can be used in [`failback` mode](#fail-back-to-source-database). diff --git a/src/current/_includes/molt/verify-output.md index 60653fbd05b..c1c72c17a18 100644 --- a/src/current/_includes/molt/verify-output.md +++ b/src/current/_includes/molt/verify-output.md @@ -1,23 +1,37 @@ -1. Use [MOLT Verify]({% link molt/molt-verify.md %}) to validate the consistency of the data between the source database and CockroachDB. +1. Run the [MOLT Verify]({% link molt/molt-verify.md %}) command, specifying the source and target [connection strings](#connection-strings) and the tables to validate.
{% include_cached copy-clipboard.html %} ~~~ shell molt verify \ - --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' \ + --source $SOURCE \ + --target $TARGET \ + --table-filter 'employees|payments|orders' ~~~
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt verify \ + --source $SOURCE \ + --target $TARGET \ + --table-filter 'employees|payments|orders' + ~~~ +
+ +
{% include_cached copy-clipboard.html %} ~~~ shell molt verify \ - --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' + --source $SOURCE \ + --target $TARGET \ + --table-filter 'employees|payments|orders' ~~~ + + {{site.data.alerts.callout_info}} + With Oracle Multitenant deployments, while `--source-cdb` is required for `fetch`, it is **not** necessary for `verify`. + {{site.data.alerts.end}}
1. Check the output to observe `verify` progress. diff --git a/src/current/_includes/v23.1/sidebar-data/migrate.json b/src/current/_includes/v23.1/sidebar-data/migrate.json index c3a9397fefe..351d9254e37 100644 --- a/src/current/_includes/v23.1/sidebar-data/migrate.json +++ b/src/current/_includes/v23.1/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -41,6 +58,12 @@ "/cockroachcloud/migrations-page.html" ] }, + { + "title": "Fetch", + "urls": [ + "/molt/molt-fetch.html" + ] + }, { "title": "Verify", "urls": [ @@ -130,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v23.2/sidebar-data/migrate.json b/src/current/_includes/v23.2/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v23.2/sidebar-data/migrate.json +++ b/src/current/_includes/v23.2/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v24.1/sidebar-data/migrate.json b/src/current/_includes/v24.1/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v24.1/sidebar-data/migrate.json +++ b/src/current/_includes/v24.1/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" 
+ ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v24.2/sidebar-data/migrate.json b/src/current/_includes/v24.2/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v24.2/sidebar-data/migrate.json +++ b/src/current/_includes/v24.2/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v24.3/sidebar-data/migrate.json b/src/current/_includes/v24.3/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v24.3/sidebar-data/migrate.json +++ b/src/current/_includes/v24.3/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v25.1/sidebar-data/migrate.json b/src/current/_includes/v25.1/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v25.1/sidebar-data/migrate.json +++ b/src/current/_includes/v25.1/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - 
"urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/_includes/v25.2/sidebar-data/migrate.json b/src/current/_includes/v25.2/sidebar-data/migrate.json index 5448baf143e..7762f069cbc 100644 --- a/src/current/_includes/v25.2/sidebar-data/migrate.json +++ b/src/current/_includes/v25.2/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] } diff --git a/src/current/_includes/v25.3/sidebar-data/migrate.json b/src/current/_includes/v25.3/sidebar-data/migrate.json index 5448baf143e..351d9254e37 100644 --- a/src/current/_includes/v25.3/sidebar-data/migrate.json +++ b/src/current/_includes/v25.3/sidebar-data/migrate.json @@ -15,21 +15,38 @@ ] }, { - "title": "Migrate to CockroachDB", - "urls": [ - "/molt/migrate-to-cockroachdb.html" - ] - }, - { - "title": "Migrate in Phases", - "urls": [ - "/molt/migrate-in-phases.html" - ] - }, - { - "title": "Migration Failback", - "urls": [ - "/molt/migrate-failback.html" + "title": "Migration Flows", + "items": [ + { + "title": "Bulk Load", + "urls": [ + "/molt/migrate-bulk-load.html" + ] + }, + { + "title": "Load and Replicate", + "urls": [ + "/molt/migrate-data-load-and-replication.html" + ] + }, + { + "title": "Load and Replicate Separately", + "urls": [ + "/molt/migrate-data-load-replicate-only.html" + ] + }, + { + "title": "Resume Replication", + "urls": [ + "/molt/migrate-replicate-only.html" + ] + }, + { + "title": "Failback", + "urls": [ + "/molt/migrate-failback.html" + ] + } ] }, { @@ -136,12 +153,6 @@ ] } ] - }, - { - "title": "Migrate from Oracle", - "urls": [ - "/${VERSION}/migrate-from-oracle.html" - ] } ] -} +} \ No newline at end of file diff --git a/src/current/advisories/a144650.md b/src/current/advisories/a144650.md index d77e0cbe635..cf13b97f1db 100644 --- a/src/current/advisories/a144650.md +++ b/src/current/advisories/a144650.md @@ -106,7 +106,7 @@ Follow these steps after [`detect_144650.sh` finds a corrupted job or problemati #### 
MOLT Fetch -By default, MOLT Fetch uses [`IMPORT INTO`]({% link v25.1/import-into.md %}) to load data into CockroachDB, and can therefore be affected by this issue. [As recommended in the migration documentation]({% link molt/migrate-to-cockroachdb.md %}#step-5-stop-replication-and-verify-data), a run of [MOLT Fetch]({% link molt/molt-fetch.md %}) should be followed by a run of [MOLT Verify]({% link molt/molt-verify.md %}) to ensure that all data on the target side matches the data on the source side. +By default, MOLT Fetch uses [`IMPORT INTO`]({% link v25.1/import-into.md %}) to load data into CockroachDB, and can therefore be affected by this issue. [As recommended in the migration documentation]({% link molt/migrate-data-load-replicate-only.md %}#stop-replication-and-verify-data), a run of [MOLT Fetch]({% link molt/molt-fetch.md %}) should be followed by a run of [MOLT Verify]({% link molt/molt-verify.md %}) to ensure that all data on the target side matches the data on the source side. - If you ran MOLT Verify after completing your MOLT Fetch run, and Verify did not find mismatches, then MOLT Fetch was unaffected by this issue. diff --git a/src/current/cockroachcloud/migrations-page.md b/src/current/cockroachcloud/migrations-page.md index 68f89cb68cb..9e9462285bd 100644 --- a/src/current/cockroachcloud/migrations-page.md +++ b/src/current/cockroachcloud/migrations-page.md @@ -360,6 +360,5 @@ To delete or verify a set of credentials, select the appropriate option in the * ## See also - [Migration Overview]({% link molt/migration-strategy.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) - [MOLT Fetch]({% link molt/molt-fetch.md %}) - [MOLT Verify]({% link molt/molt-verify.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-bulk-load.md b/src/current/molt/migrate-bulk-load.md new file mode 100644 index 00000000000..41c6295d20c --- /dev/null +++ b/src/current/molt/migrate-bulk-load.md @@ -0,0 +1,86 @@ +--- +title: Bulk Load Migration +summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster. +toc: true +docs_area: migrate +--- + +Use `data-load` mode to perform a one-time bulk load of source data into CockroachDB. + +{% include molt/molt-setup.md %} + +## Load data into CockroachDB + +Perform the bulk load of the source data. + +1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data into CockroachDB, specifying [`--mode data-load`]({% link molt/molt-fetch.md %}#fetch-mode) to perform a one-time data load. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It limits the migration to a single schema and filters for three specific tables. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`. + +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --mode data-load + ~~~ +
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --mode data-load + ~~~ +
+ +
+ The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string. + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --source-cdb $SOURCE_CDB \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --mode data-load + ~~~ +
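+ The Oracle `$SOURCE` and `$SOURCE_CDB` values are likewise assumed to have been exported beforehand; for example, with hypothetical PDB and CDB service names, and the `#` characters in the username URL-encoded as `%23`: + + {% include_cached copy-clipboard.html %} + ~~~ shell + export SOURCE='oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1' + export SOURCE_CDB='oracle://C%23%23MIGRATION_USER:password@host:1521/ORCL' + ~~~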
+ +{% include molt/fetch-data-load-output.md %} + +## Verify the data load + +{% include molt/verify-output.md %} + +## Modify the CockroachDB schema + +{% include molt/migration-modify-target-schema.md %} + +## Cutover + +Perform a cutover by resuming application traffic, now to CockroachDB. + +{% include molt/molt-troubleshooting.md %} + +## See also + +- [Migration Overview]({% link molt/migration-overview.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) +- [MOLT Fetch]({% link molt/molt-fetch.md %}) +- [MOLT Verify]({% link molt/molt-verify.md %}) +- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-data-load-and-replication.md b/src/current/molt/migrate-data-load-and-replication.md new file mode 100644 index 00000000000..726b916a22a --- /dev/null +++ b/src/current/molt/migrate-data-load-and-replication.md @@ -0,0 +1,111 @@ +--- +title: Load and Replicate +summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster. +toc: true +docs_area: migrate +--- + +{% assign tab_names_html = "Load and replicate;Replicate separately" %} +{% assign html_page_filenames = "migrate-data-load-and-replication.html;migrate-data-load-replicate-only.html" %} + +{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %} + +Use `data-load-and-replication` mode to perform a one-time bulk load of source data and start continuous replication in a single command. + +{{site.data.alerts.callout_success}} +You can also [load and replicate separately]({% link molt/migrate-data-load-replicate-only.md %}) using `data-load` and `replicate-only`. +{{site.data.alerts.end}} + +{% include molt/molt-setup.md %} + +## Load data into CockroachDB + +Start the initial load of data into the target database. Continuous replication of changes will start once the data load is complete. + +1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying `--mode data-load-and-replication` to perform an initial load followed by continuous replication. In this example, the `--metricsAddr :30005` [replication flag](#replication-flags) enables a Prometheus endpoint at `http://localhost:30005/_/varz` where replication metrics will be served. You can use these metrics to [verify that replication has drained](#stop-replication-and-verify-data) in a later step. + +
+ Specify a replication slot name with `--pglogical-replication-slot-name`. This is required for [replication after data load](#replicate-changes-to-cockroachdb). + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --pglogical-replication-slot-name cdc_slot \ + --replicator-flags '--metricsAddr :30005' \ + --mode data-load-and-replication + ~~~ +
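+ Once replication is running, you can spot-check the Prometheus endpoint; for example, assuming the default local host: + + {% include_cached copy-clipboard.html %} + ~~~ shell + curl -s http://localhost:30005/_/varz | head + ~~~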
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \ + --mode data-load-and-replication + ~~~ +
+ +
+ The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string. + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --source-cdb $SOURCE_CDB \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \ + --mode data-load-and-replication + ~~~ +
+ +{% include molt/fetch-data-load-output.md %} + +## Replicate changes to CockroachDB + +1. Continuous replication begins immediately after `fetch complete`. + +{% include molt/fetch-replication-output.md %} + +## Stop replication and verify data + +Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data is consistent. This ensures that the data load and replication were successful. + +{% include molt/migration-stop-replication.md %} + +{% include molt/verify-output.md %} + +## Modify the CockroachDB schema + +{% include molt/migration-modify-target-schema.md %} + +## Cutover + +Perform a cutover by resuming application traffic, now to CockroachDB. + +{% include molt/molt-troubleshooting.md %} + +## See also + +- [Migration Overview]({% link molt/migration-overview.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) +- [MOLT Fetch]({% link molt/molt-fetch.md %}) +- [MOLT Verify]({% link molt/molt-verify.md %}) +- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-data-load-replicate-only.md b/src/current/molt/migrate-data-load-replicate-only.md new file mode 100644 index 00000000000..1b8075fb877 --- /dev/null +++ b/src/current/molt/migrate-data-load-replicate-only.md @@ -0,0 +1,159 @@ +--- +title: Load and Replicate Separately +summary: Learn how to migrate data from a source database (such as PostgreSQL, MySQL, or Oracle) into a CockroachDB cluster. +toc: true +docs_area: migrate +--- + +{% assign tab_names_html = "Load and replicate;Replicate separately" %} +{% assign html_page_filenames = "migrate-data-load-and-replication.html;migrate-data-load-replicate-only.html" %} + +{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %} + +Perform an initial bulk load of the source data using `data-load` mode, then use `replication-only` mode to replicate ongoing changes to the target. + +{{site.data.alerts.callout_success}} +You can also [load and replicate in a single command]({% link molt/migrate-data-load-and-replication.md %}) using `data-load-and-replication`. +{{site.data.alerts.end}} + +{% include molt/molt-setup.md %} + +## Load data into CockroachDB + +Perform the initial load of the source data. + +1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying [`--mode data-load`]({% link molt/molt-fetch.md %}#fetch-mode) to perform a one-time data load. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It also limits the migration to a single schema and filters for three specific tables. The [data load mode](#data-load-mode) defaults to `IMPORT INTO`. +
+ Specify a replication slot name with `--pglogical-replication-slot-name`. This is required for [replication in a subsequent step](#replicate-changes-to-cockroachdb). + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --pglogical-replication-slot-name cdc_slot \ + --mode data-load + ~~~ +
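+ To confirm that the replication slot was created on the source, you can query `pg_replication_slots`; for example, assuming `psql` is available: + + {% include_cached copy-clipboard.html %} + ~~~ shell + psql "$SOURCE" -c 'SELECT slot_name, active FROM pg_replication_slots;' + ~~~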
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --mode data-load + ~~~ +
+ +
+ The command assumes an Oracle Multitenant (CDB/PDB) source. `--source-cdb` specifies the container database (CDB) connection string. + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --source-cdb $SOURCE_CDB \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --bucket-path 's3://migration/data/cockroach' \ + --table-handling truncate-if-exists \ + --mode data-load + ~~~ +
+ +{% include molt/fetch-data-load-output.md %} + +## Verify the data load + +Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data is consistent. This ensures that the data load was successful. + +{% include molt/verify-output.md %} + +## Replicate changes to CockroachDB + +With initial load complete, start replication of ongoing changes on the source to CockroachDB. + + +{% include molt/fetch-replicator-flags.md %} + +1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to start replication on CockroachDB, specifying [`--mode replication-only`]({% link molt/molt-fetch.md %}#fetch-mode). In this example, the `--metricsAddr :30005` replication flag enables a Prometheus endpoint at `http://localhost:30005/_/varz` where replication metrics will be served. You can use these metrics to [verify that replication has drained](#stop-replication-and-verify-data) in a later step. + +
+ Be sure to specify the same `--pglogical-replication-slot-name` value that you provided in [Load data into CockroachDB](#load-data-into-cockroachdb). + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --table-filter 'employees|payments|orders' \ + --pglogical-replication-slot-name cdc_slot \ + --replicator-flags '--metricsAddr :30005' \ + --mode replication-only + ~~~ +
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --table-filter 'employees|payments|orders' \ + --non-interactive \ + --replicator-flags '--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 --metricsAddr :30005 --userscript table_filter.ts' \ + --mode replication-only + ~~~ +
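+ The `--defaultGTIDSet` value shown is illustrative. To find the GTID record on the source, run the following query on MySQL (the client invocation will vary with your setup): + + {% include_cached copy-clipboard.html %} + ~~~ shell + mysql -u user -p molt -e 'SELECT source_uuid, min(interval_start), max(interval_end) FROM mysql.gtid_executed GROUP BY source_uuid;' + ~~~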
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --source-cdb $SOURCE_CDB \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --replicator-flags '--backfillFromSCN 26685444 --scn 26685786 --metricsAddr :30005 --userscript table_filter.ts' \ + --mode replication-only + ~~~ +
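+ The `--backfillFromSCN` and `--scn` values shown are illustrative. One way to read the current system change number (SCN) on the Oracle source as a reference point, assuming `sqlplus` access and hypothetical credentials: + + {% include_cached copy-clipboard.html %} + ~~~ shell + echo 'SELECT CURRENT_SCN FROM V$DATABASE;' | sqlplus -s C##MIGRATION_USER/password@host:1521/ORCLPDB1 + ~~~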
+ +{% include molt/fetch-replication-output.md %} + +## Stop replication and verify data + +{% include molt/migration-stop-replication.md %} + +1. Repeat [Verify the data load](#verify-the-data-load) to verify the updated data. + +## Modify the CockroachDB schema + +{% include molt/migration-modify-target-schema.md %} + +## Cutover + +Perform a cutover by resuming application traffic, now to CockroachDB. + +{% include molt/molt-troubleshooting.md %} + +## See also + +- [Migration Overview]({% link molt/migration-overview.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) +- [MOLT Fetch]({% link molt/molt-fetch.md %}) +- [MOLT Verify]({% link molt/molt-verify.md %}) +- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-failback.md b/src/current/molt/migrate-failback.md index 6359aadf1c1..fceaa2cdab1 100644 --- a/src/current/molt/migrate-failback.md +++ b/src/current/molt/migrate-failback.md @@ -5,54 +5,131 @@ toc: true docs_area: migrate --- -If issues arise during the migration, you can start MOLT Fetch in `failback` mode after stopping replication and before sending new writes to CockroachDB. Failing back to the source database ensures that data remains consistent on the source, in case you need to roll back the migration. +If issues arise during migration, run MOLT Fetch in `failback` mode after stopping replication and before writing to CockroachDB. This ensures that data remains consistent on the source in case you need to roll back the migration. -{% assign tab_names_html = "Load and replicate;Phased migration;Failback" %} -{% assign html_page_filenames = "migrate-to-cockroachdb.html;migrate-in-phases.html;migrate-failback.html" %} - -{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %} +
+ + + +
-## Before you begin +## Prepare the CockroachDB cluster -[Enable rangefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}#enable-rangefeeds) in the CockroachDB SQL shell: +[Enable rangefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}#enable-rangefeeds) on the CockroachDB cluster: {% include_cached copy-clipboard.html %} ~~~ sql SET CLUSTER SETTING kv.rangefeed.enabled = true; ~~~ -Select the source dialect you migrated to CockroachDB: +
+## Grant Oracle user permissions -
- - -
+You should have already created a migration user on the source database with the necessary privileges. Refer to [Create migration user on source database]({% link molt/migrate-data-load-replicate-only.md %}?filters=oracle#create-migration-user-on-source-database). -## Step 1. Stop replication to CockroachDB +Grant the Oracle user additional `INSERT` and `UPDATE` privileges on the tables to fail back: -Cancel replication to CockroachDB by entering `ctrl-c` to issue a `SIGTERM` signal to the `fetch` process. This returns an exit code `0`. +{% include_cached copy-clipboard.html %} +~~~ sql +GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.employees TO MIGRATION_USER; +GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.payments TO MIGRATION_USER; +GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.orders TO MIGRATION_USER; +~~~ +
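+To confirm that the grants took effect, you can query the data dictionary as the migration user; for example, with hypothetical connection details: + +{% include_cached copy-clipboard.html %} +~~~ shell +echo 'SELECT table_name, privilege FROM user_tab_privs_recd;' | sqlplus -s C##MIGRATION_USER/password@host:1521/ORCLPDB1 +~~~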
-## Step 2. Fail back from CockroachDB +## Configure failback -The following example watches the `employees` table for change events. +Configure the MOLT Fetch connection strings and filters for `failback` mode, ensuring that the CockroachDB changefeed is correctly targeting your original source. -1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to fail back to the source database, specifying `--mode failback`. For details on this mode, refer to the [MOLT Fetch]({% link molt/molt-fetch.md %}#fail-back-to-source-database) page. +### Connection strings - {{site.data.alerts.callout_success}} - Be mindful when specifying the connection strings: `--source` is the CockroachDB connection string and `--target` is the connection string of the database you migrated from. - {{site.data.alerts.end}} +In `failback` mode, the `--source` and `--target` connection strings are reversed from other migration modes: + +`--source` is the CockroachDB connection string. For example: + +~~~ +--source 'postgres://crdb_user@localhost:26257/defaultdb?sslmode=verify-full' +~~~ + +`--target` is the connection string of the database you migrated from. + +
+For example: + +~~~ +--target 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' +~~~ +
+ +
+For example: + +~~~ +--target 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' +~~~ +
+ +
+For example: + +~~~ +--target 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1' +~~~ + +{{site.data.alerts.callout_info}} +With Oracle Multitenant deployments, `--source-cdb` is **not** necessary for `failback`. +{{site.data.alerts.end}} +
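+One way to produce the URL-encoded username (`C##MIGRATION_USER` becomes `C%23%23MIGRATION_USER`), assuming `jq` is available: + +{% include_cached copy-clipboard.html %} +~~~ shell +jq -rn --arg u 'C##MIGRATION_USER' '$u|@uri' +~~~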
+ +### Secure changefeed for failback + +`failback` mode creates a [CockroachDB changefeed]({% link {{ site.current_cloud_version }}/change-data-capture-overview.md %}) and sets up a [webhook sink]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-sink) to pass change events from CockroachDB to the failback target. In production, you should override the [default insecure changefeed]({% link molt/molt-fetch.md %}#default-insecure-changefeed) with secure settings. + +Provide these overrides in a JSON file. At minimum, the JSON should include the base64-encoded client certificate ([`client_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-cert)), key ([`client_key`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#client-key)), and CA ([`ca_cert`]({% link {{ site.current_cloud_version }}/create-changefeed.md %}#ca-cert)) for the webhook sink. + +{% include_cached copy-clipboard.html %} +~~~ json +{ + "sink_query_parameters": "client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}" +} +~~~ + +{{site.data.alerts.callout_success}} +In the `molt fetch` command, use `--replicator-flags` to specify the paths to the server certificate and key for the webhook sink. Refer to [Replication flags](#replication-flags). +{{site.data.alerts.end}} + +Pass the JSON file path to `molt` via `--changefeeds-path`. For example: + +{% include_cached copy-clipboard.html %} +~~~ +--changefeeds-path 'changefeed-secure.json' +~~~ + +Because the changefeed runs inside the CockroachDB cluster, the `--changefeeds-path` file must reference a webhook endpoint address reachable by the cluster, not necessarily your local workstation. + +For details, refer to [Changefeed override settings]({% link molt/molt-fetch.md %}#changefeed-override-settings). - Use the `--stagingSchema` replication flag to provide the name of the staging schema. This is found in the `staging database name` message that is written at the beginning of the [replication task]({% link molt/migrate-in-phases.md %}#step-6-replicate-changes-to-cockroachdb). +### Replication flags + +{% include molt/fetch-replicator-flags.md %} + +## Fail back from CockroachDB + +Start failback to the source database. + +1. Cancel replication to CockroachDB by entering `ctrl-c` to issue a `SIGTERM` signal to the `fetch` process. This returns an exit code `0`. + +1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to fail back to the source database, specifying `--mode failback`. In this example, we filter the `migration_schema` schema and the `employees`, `payments`, and `orders` tables, configure the staging schema with `--replicator-flags`, and use `--changefeeds-path` to provide the secure changefeed override.
{% include_cached copy-clipboard.html %} ~~~ shell molt fetch \ - --source 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --target 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \ - --table-filter 'employees' \ - --non-interactive \ - --replicator-flags "--stagingSchema _replicator_1739996035106984000" \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key' \ --mode failback \ --changefeeds-path 'changefeed-secure.json' ~~~ @@ -62,30 +139,33 @@ The following example watches the `employees` table for change events. {% include_cached copy-clipboard.html %} ~~~ shell molt fetch \ - --source 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --target 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \ - --table-filter 'employees' \ - --non-interactive \ - --replicator-flags "--stagingSchema _replicator_1739996035106984000" \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key' \ --mode failback \ --changefeeds-path 'changefeed-secure.json' ~~~
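+ The certificate values referenced in `changefeed-secure.json` must be base64- and URL-encoded; for example, to encode the client certificate: + + {% include_cached copy-clipboard.html %} + ~~~ shell + base64 -i ./client.crt | jq -R -r '@uri' + ~~~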
- `--changefeeds-path` specifies a path to `changefeed-secure.json`, which should contain the following setting override: - +
{% include_cached copy-clipboard.html %} - ~~~ json - { - "sink_query_parameters": "client_cert={base64 cert}&client_key={base64 key}&ca_cert={base64 CA cert}" - } + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --replicator-flags '--stagingSchema _replicator_1739996035106984000 --tlsCertificate ./certs/server.crt --tlsPrivateKey ./certs/server.key --userscript table_filter.ts' \ + --mode failback \ + --changefeeds-path 'changefeed-secure.json' ~~~ - `client_cert`, `client_key`, and `ca_cert` are [webhook sink parameters]({% link {{ site.current_cloud_version }}/changefeed-sinks.md %}#webhook-parameters) that must be base64- and URL-encoded (for example, use the command `base64 -i ./client.crt | jq -R -r '@uri'`). - - {{site.data.alerts.callout_success}} - For details on the default changefeed settings and how to override them, refer to [Changefeed override settings]({% link molt/molt-fetch.md %}#changefeed-override-settings). + {{site.data.alerts.callout_info}} + With Oracle Multitenant deployments, while `--source-cdb` is required for other `fetch` modes, it is **not** necessary for `failback`. {{site.data.alerts.end}} +
1. Check the output to observe `fetch progress`. @@ -110,9 +190,7 @@ The following example watches the `employees` table for change events. ## See also - [Migration Overview]({% link molt/migration-overview.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) - [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) - [MOLT Fetch]({% link molt/molt-fetch.md %}) -- [MOLT Verify]({% link molt/molt-verify.md %}) -- [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) -- [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}) \ No newline at end of file +- [MOLT Verify]({% link molt/molt-verify.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-in-phases.md b/src/current/molt/migrate-in-phases.md deleted file mode 100644 index 77e098a0ce0..00000000000 --- a/src/current/molt/migrate-in-phases.md +++ /dev/null @@ -1,163 +0,0 @@ ---- -title: Migrate to CockroachDB in Phases -summary: Learn how to migrate data in phases from a PostgreSQL or MySQL database into a CockroachDB cluster. -toc: true -docs_area: migrate ---- - -A phased migration to CockroachDB uses the [MOLT tools]({% link molt/migration-overview.md %}) to [convert your source schema](#step-2-prepare-the-source-schema), incrementally [load source data](#step-3-load-data-into-cockroachdb) and [verify the results](#step-4-verify-the-data-load), and finally [replicate ongoing changes](#step-6-replicate-changes-to-cockroachdb) before performing cutover. - -{% assign tab_names_html = "Load and replicate;Phased migration;Failback" %} -{% assign html_page_filenames = "migrate-to-cockroachdb.html;migrate-in-phases.html;migrate-failback.html" %} - -{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %} - -## Before you begin - -- Review the [Migration Overview]({% link molt/migration-overview.md %}). -- Install the [MOLT (Migrate Off Legacy Technology)]({% link releases/molt.md %}#installation) tools. -- Review the MOLT Fetch [setup]({% link molt/molt-fetch.md %}#setup) and [best practices]({% link molt/molt-fetch.md %}#best-practices). -{% include molt/fetch-secure-cloud-storage.md %} - -Select the source dialect you will migrate to CockroachDB: - -
- - -
- -## Step 1. Prepare the source database - -{% include molt/migration-prepare-database.md %} - -## Step 2. Prepare the source schema - -{% include molt/migration-prepare-schema.md %} - -## Step 3. Load data into CockroachDB - -{{site.data.alerts.callout_success}} -To optimize performance of [data load](#step-3-load-data-into-cockroachdb), Cockroach Labs recommends dropping any [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#drop-constraint) and [indexes]({% link {{site.current_cloud_version}}/drop-index.md %}) on the target CockroachDB database. You can [recreate them after the data is loaded](#step-5-modify-the-cockroachdb-schema). -{{site.data.alerts.end}} - -Perform an initial load of data into the target database. This can be a subset of the source data that you wish to verify, or it can be the entire dataset. - -{% include molt/fetch-data-load-modes.md %} - -1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying `--mode data-load` to perform a one-time data load. For details on this mode, refer to the [MOLT Fetch]({% link molt/molt-fetch.md %}#load-data) page. - - {{site.data.alerts.callout_info}} - Ensure that the `--source` and `--target` [connection strings]({% link molt/molt-fetch.md %}#connection-strings) are URL-encoded. - {{site.data.alerts.end}} - -
- {% include_cached copy-clipboard.html %} - Be sure to specify `--pglogical-replication-slot-name`, which is required for replication in [Step 6](#step-6-replicate-changes-to-cockroachdb). - - ~~~ shell - molt fetch \ - --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' \ - --bucket-path 's3://molt-test' \ - --table-handling truncate-if-exists \ - --non-interactive \ - --pglogical-replication-slot-name cdc_slot \ - --mode data-load - ~~~ -
- -
- {% include_cached copy-clipboard.html %} - ~~~ shell - molt fetch \ - --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' \ - --bucket-path 's3://molt-test' \ - --table-handling truncate-if-exists \ - --non-interactive \ - --mode data-load - ~~~ -
- -{% include molt/fetch-data-load-output.md %} - -## Step 4. Verify the data load - -{% include molt/verify-output.md %} - -Repeat [Step 3](#step-3-load-data-into-cockroachdb) and [Step 4](#step-4-verify-the-data-load) to migrate any remaining tables. - -## Step 5. Modify the CockroachDB schema - -{% include molt/migration-modify-target-schema.md %} - -## Step 6. Replicate changes to CockroachDB - -With initial load complete, start replication of ongoing changes on the source to CockroachDB. - -The following example specifies that the `employees` table should be watched for change events. - -1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to start replication on CockroachDB, specifying `--mode replication-only` to replicate ongoing changes on the source to CockroachDB. For details on this mode, refer to the [MOLT Fetch]({% link molt/molt-fetch.md %}#replicate-changes) page. - -
- Be sure to specify the same `--pglogical-replication-slot-name` value that you provided in [Step 3](#step-3-load-data-into-cockroachdb). - - {% include_cached copy-clipboard.html %} - ~~~ shell - molt fetch \ - --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' \ - --non-interactive \ - --mode replication-only \ - --pglogical-replication-slot-name cdc_slot \ - --replicator-flags '--metricsAddr :30005' - ~~~ -
- -
- Use the `--defaultGTIDSet` replication flag to specify the GTID set. To find your GTID record, run `SELECT source_uuid, min(interval_start), max(interval_end) FROM mysql.gtid_executed GROUP BY source_uuid;` on MySQL. - - {% include_cached copy-clipboard.html %} - ~~~ shell - molt fetch \ - --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \ - --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \ - --table-filter 'employees' \ - --non-interactive \ - --mode replication-only \ - --replicator-flags '--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 --metricsAddr :30005' - ~~~ -
- - {{site.data.alerts.callout_info}} - `--metricsAddr` enables a Prometheus-compatible metrics endpoint at `http://{host}:{port}/_/varz` where replication metrics will be served. In this example, the endpoint is `http://localhost:30005/_/varz`. - {{site.data.alerts.end}} - -{% include molt/fetch-replication-output.md %} - -## Step 7. Stop replication and verify data - -{% include molt/migration-stop-replication.md %} - -1. Repeat [Step 4](#step-4-verify-the-data-load) to verify the updated data. - -{{site.data.alerts.callout_success}} -If you encountered issues with replication, you can now use [`failback`]({% link molt/migrate-failback.md %}) mode to replicate changes on CockroachDB back to the initial source database. In case you need to roll back the migration, this ensures that data is consistent on the initial source database. -{{site.data.alerts.end}} - -## Step 8. Cutover - -Perform a cutover by resuming application traffic, now to CockroachDB. - -## See also - -- [Migration Overview]({% link molt/migration-overview.md %}) -- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) -- [MOLT Fetch]({% link molt/molt-fetch.md %}) -- [MOLT Verify]({% link molt/molt-verify.md %}) -- [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) -- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-replicate-only.md b/src/current/molt/migrate-replicate-only.md new file mode 100644 index 00000000000..01ddf4b7f5d --- /dev/null +++ b/src/current/molt/migrate-replicate-only.md @@ -0,0 +1,81 @@ +--- +title: Resume Replication +summary: Restart ongoing replication using an existing staging schema checkpoint. +toc: true +docs_area: migrate +--- + +Use `replication-only` mode to resume replication to CockroachDB after an interruption, without reloading data. + +{{site.data.alerts.callout_info}} +These steps assume that you previously started replication. Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}#replicate-changes-to-cockroachdb) or [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}#replicate-changes-to-cockroachdb). +{{site.data.alerts.end}} + +
+ + + +
+ +## Resume replication after interruption + +{% include molt/fetch-replicator-flags.md %} + +1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to resume replication on CockroachDB, specifying [`--mode replication-only`]({% link molt/molt-fetch.md %}#fetch-mode). + +
+ Be sure to specify the same `--pglogical-replication-slot-name` value that you provided on [data load]({% link molt/migrate-data-load-replicate-only.md %}#load-data-into-cockroachdb). + + {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --pglogical-replication-slot-name cdc_slot \ + --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005' \ + --mode replication-only + ~~~ +
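+ The `--stagingSchema` value shown is illustrative; supply the schema name from the `staging database name` message logged when replication first started. If you captured that output to a file (hypothetical `fetch.log`), you can recover it with: + + {% include_cached copy-clipboard.html %} + ~~~ shell + grep 'staging database name' fetch.log + ~~~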
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --non-interactive \ + --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005 --userscript table_filter.ts' \ + --mode replication-only + ~~~ +
+ +
+ {% include_cached copy-clipboard.html %} + ~~~ shell + molt fetch \ + --source $SOURCE \ + --source-cdb $SOURCE_CDB \ + --target $TARGET \ + --schema-filter 'migration_schema' \ + --table-filter 'employees|payments|orders' \ + --replicator-flags '--stagingSchema _replicator_1749699789613149000 --metricsAddr :30005 --userscript table_filter.ts' \ + --mode replication-only + ~~~ +
+ + Replication resumes from the last checkpoint without performing a fresh load. + +{% include molt/fetch-replication-output.md %} + +## See also + +- [Migration Overview]({% link molt/migration-overview.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) +- [MOLT Fetch]({% link molt/molt-fetch.md %}) +- [MOLT Verify]({% link molt/molt-verify.md %}) +- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file diff --git a/src/current/molt/migrate-to-cockroachdb.md b/src/current/molt/migrate-to-cockroachdb.md index 17a43549c6a..60aabe29810 100644 --- a/src/current/molt/migrate-to-cockroachdb.md +++ b/src/current/molt/migrate-to-cockroachdb.md @@ -5,120 +5,35 @@ toc: true docs_area: migrate --- -A migration to CockroachDB uses the [MOLT tools]({% link molt/migration-overview.md %}) to [convert your source schema](#step-2-prepare-the-source-schema), [load source data](#step-3-load-data-into-cockroachdb) into CockroachDB and immediately [replicate ongoing changes](#step-4-replicate-changes-to-cockroachdb), and [verify consistency](#step-5-stop-replication-and-verify-data) on the CockroachDB cluster before performing cutover. +MOLT Fetch supports various migration flows using [MOLT Fetch modes]({% link molt/molt-fetch.md %}#fetch-mode). -{% assign tab_names_html = "Load and replicate;Phased migration;Failback" %} -{% assign html_page_filenames = "migrate-to-cockroachdb.html;migrate-in-phases.html;migrate-failback.html" %} +| Migration flow | Mode | Description | Best for | +|----------------------------------------------------------------------------------------|----------------------------------------------------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------| +| [Bulk load]({% link molt/migrate-bulk-load.md %}) | `--mode data-load` | Perform a one-time bulk load of source data into CockroachDB. | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) | +| [Data load and replication]({% link molt/migrate-data-load-and-replication.md %}) | `--mode data-load-and-replication` | Load source data, then replicate subsequent changes continuously. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations | +| [Data load then replication-only]({% link molt/migrate-data-load-replicate-only.md %}) | `--mode data-load`, then `--mode replication-only` | Load source data first, then start replication in a separate task. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations | +| [Resume replication]({% link molt/migrate-replicate-only.md %}) | `--mode replication-only` | Resume replication from a checkpoint after interruption. | Resuming interrupted migrations, post-load sync | +| [Failback]({% link molt/migrate-failback.md %}) | `--mode failback` | Replicate changes from CockroachDB back to the source database. | [Rollback]({% link molt/migrate-failback.md %}) scenarios | -{% include filter-tabs.md tab_names=tab_names_html page_filenames=html_page_filenames page_folder="molt" %} +### Bulk load -## Before you begin +For migrations that tolerate downtime, use `data-load` mode to perform a one-time bulk load of source data into CockroachDB. Refer to [Bulk Load]({% link molt/migrate-bulk-load.md %}). 
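+A minimal sketch of the command shape (refer to [Bulk Load]({% link molt/migrate-bulk-load.md %}) for the full prerequisites and steps): + +~~~ shell +molt fetch \ +--source $SOURCE \ +--target $TARGET \ +--table-filter 'employees|payments|orders' \ +--bucket-path 's3://migration/data/cockroach' \ +--table-handling truncate-if-exists \ +--mode data-load +~~~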
-- Review the [Migration Overview]({% link molt/migration-overview.md %}). -- Install the [MOLT (Migrate Off Legacy Technology)]({% link releases/molt.md %}#installation) tools. -- Review the MOLT Fetch [setup]({% link molt/molt-fetch.md %}#setup) and [best practices]({% link molt/molt-fetch.md %}#best-practices). -{% include molt/fetch-secure-cloud-storage.md %} +### Migrations with minimal downtime -Select the source dialect you will migrate to CockroachDB: +To minimize downtime during migration, MOLT Fetch supports replication streams that sync ongoing changes from the source database to CockroachDB. Instead of performing the entire data load during a planned downtime window, you can perform an initial load followed by continuous replication. Writes are only briefly paused to allow replication to drain before final cutover. The length of the pause depends on the volume of write traffic and the amount of replication lag between the source and CockroachDB. -
- - -
+- Use `data-load-and-replication` mode to perform both steps in one task. Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}). +- Use `data-load` followed by `replication-only` to perform the steps separately. Refer to [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}). -## Step 1. Prepare the source database +### Recovery and rollback strategies -{% include molt/migration-prepare-database.md %} +If the migration is interrupted or you need to abort cutover, MOLT Fetch supports safe recovery flows: -## Step 2. Prepare the source schema - -{% include molt/migration-prepare-schema.md %} - -## Step 3. Load data into CockroachDB - -{{site.data.alerts.callout_success}} -To optimize performance of [data load](#step-3-load-data-into-cockroachdb), Cockroach Labs recommends dropping any [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#drop-constraint) and [indexes]({% link {{site.current_cloud_version}}/drop-index.md %}) on the target CockroachDB database. You can [recreate them after the data is loaded](#step-6-modify-the-cockroachdb-schema). -{{site.data.alerts.end}} - -Start the initial load of data into the target database. Continuous replication of changes will start once the data load is complete. - -{% include molt/fetch-data-load-modes.md %} - -1. Issue the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data to CockroachDB, specifying `--mode data-load-and-replication` to perform an initial load followed by continuous replication. For details on this mode, refer to the [MOLT Fetch]({% link molt/molt-fetch.md %}#load-data-and-replicate-changes) page. - - {{site.data.alerts.callout_info}} - Ensure that the `--source` and `--target` [connection strings]({% link molt/molt-fetch.md %}#connection-strings) are URL-encoded. - {{site.data.alerts.end}} - -
- Be sure to specify `--pglogical-replication-slot-name`, which is required for replication. - - {% include_cached copy-clipboard.html %} - ~~~ shell - molt fetch \ - --source "postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full" \ - --target "postgres://root@localhost:26257/defaultdb?sslmode=verify-full" \ - --table-filter 'employees' \ - --bucket-path 's3://molt-test' \ - --table-handling truncate-if-exists \ - --non-interactive \ - --mode data-load-and-replication \ - --pglogical-replication-slot-name cdc_slot \ - --replicator-flags '--metricsAddr :30005' - ~~~ -
- -
- {% include_cached copy-clipboard.html %} - ~~~ shell - molt fetch \ - --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \ - --target "postgres://root@localhost:26257/defaultdb?sslmode=verify-full" \ - --table-filter 'employees' \ - --bucket-path 's3://molt-test' \ - --table-handling truncate-if-exists \ - --non-interactive \ - --mode data-load-and-replication \ - --replicator-flags '--metricsAddr :30005' - ~~~ -
- - {{site.data.alerts.callout_info}} - `--metricsAddr` enables a Prometheus-compatible metrics endpoint at `http://{host}:{port}/_/varz` where replication metrics will be served. In this example, the endpoint is `http://localhost:30005/_/varz`. - {{site.data.alerts.end}} - -{% include molt/fetch-data-load-output.md %} - -## Step 4. Replicate changes to CockroachDB - -1. Continuous replication begins immediately after `fetch complete`. - -{% include molt/fetch-replication-output.md %} - -## Step 5. Stop replication and verify data - -{% include molt/migration-stop-replication.md %} - -{% include molt/verify-output.md %} - -{{site.data.alerts.callout_success}} -If you encountered issues with replication, you can now use [`failback`]({% link molt/migrate-failback.md %}) mode to replicate changes on CockroachDB back to the initial source database. In case you need to roll back the migration, this ensures that data is consistent on the initial source database. -{{site.data.alerts.end}} - -## Step 6. Modify the CockroachDB schema - -{% include molt/migration-modify-target-schema.md %} - -## Step 7. Cutover - -Perform a cutover by resuming application traffic, now to CockroachDB. +- Use `replication-only` to resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-replicate-only.md %}). +- Use `failback` to reverse the migration, syncing changes from CockroachDB back to the original source. This ensures data consistency on the source so that you can retry later. Refer to [Migration Failback]({% link molt/migrate-failback.md %}). ## See also -- [Migration Overview]({% link molt/migration-overview.md %}) -- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) -- [MOLT Fetch]({% link molt/molt-fetch.md %}) -- [MOLT Verify]({% link molt/molt-verify.md %}) -- [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}) -- [Migration Failback]({% link molt/migrate-failback.md %}) \ No newline at end of file +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Releases]({% link releases/molt.md %}) \ No newline at end of file diff --git a/src/current/molt/migration-overview.md b/src/current/molt/migration-overview.md index fe35137931b..1f08e9607ec 100644 --- a/src/current/molt/migration-overview.md +++ b/src/current/molt/migration-overview.md @@ -9,11 +9,11 @@ The MOLT (Migrate Off Legacy Technology) toolkit enables safe, minimal-downtime This page provides an overview of the following: -- Overall [migration flow](#migration-flow) +- Overall [migration sequence](#migration-sequence) - [MOLT tools](#molt-tools) -- Supported [migration and failback modes](#migration-modes) +- Supported [migration flows](#migration-flows) -## Migration flow +## Migration sequence {{site.data.alerts.callout_success}} Before you begin the migration, review [Migration Strategy]({% link molt/migration-strategy.md %}). @@ -34,7 +34,7 @@ A migration to CockroachDB generally follows this sequence: 1. Verify consistency before cutover: Use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the CockroachDB data is consistent with the source. 1. Cut over to CockroachDB: Redirect application traffic to the CockroachDB cluster. -For a practical example of the preceding steps, refer to [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}). +For more details, refer to [Migration flows](#migration-flows). 
## MOLT tools @@ -58,13 +58,13 @@ MOLT [Fetch](#fetch) and [Verify](#verify) are CLI-based to maximize control, au Fetch Initial data load; optional continuous replication - PostgreSQL 11-16, MySQL 5.7-8.0+, CockroachDB + PostgreSQL 11-16, MySQL 5.7-8.0+, Oracle Database 19c (Enterprise Edition) and 21c (Express Edition), CockroachDB GA Verify Schema and data validation - PostgreSQL 12-16, MySQL 5.7-8.0+, CockroachDB + PostgreSQL 12-16, MySQL 5.7-8.0+, Oracle Database 19c (Enterprise Edition) and 21c (Express Edition), CockroachDB Preview @@ -81,7 +81,7 @@ The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) [MOLT Fetch]({% link molt/molt-fetch.md %}) performs the core data migration to CockroachDB. It supports: -- [Multiple migration modes](#migration-modes) via `IMPORT INTO` or `COPY FROM`. +- [Multiple migration flows](#migration-flows) via `IMPORT INTO` or `COPY FROM`. - Data movement via [cloud storage, local file servers, or direct copy]({% link molt/molt-fetch.md %}#data-path). - [Concurrent data export]({% link molt/molt-fetch.md %}#best-practices) from multiple source tables and shards. - [Continuous replication]({% link molt/molt-fetch.md %}#replicate-changes), enabling you to minimize downtime before cutover. @@ -97,33 +97,37 @@ The [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) - Column definition. - Row-level data. -## Migration modes +## Migration flows -MOLT Fetch supports [multiple data migration modes]({% link molt/molt-fetch.md %}#fetch-mode). These can be combined based on your testing and cutover strategy. +MOLT Fetch supports various migration flows using [MOLT Fetch modes]({% link molt/molt-fetch.md %}#fetch-mode). -| Mode | Description | Best For | -|---------------------------------------------|------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `--mode data-load` | Performs one-time load of source data into CockroachDB | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime), migrations with [minimal downtime]({% link molt/migrate-to-cockroachdb.md %}) | -| `--mode data-load-and-replication` | Loads source data and starts continuous replication from the source database | [Migrations with minimal downtime]({% link molt/migrate-to-cockroachdb.md %}) | -| `--mode replication-only` | Starts replication from a previously loaded source | [Migrations with minimal downtime]({% link molt/migrate-to-cockroachdb.md %}), post-load sync | -| `--mode failback` | Replicates changes on CockroachDB back to the original source | [Rollback scenarios]({% link molt/migrate-failback.md %}) | -| `--mode export-only` / `--mode import-only` | Separates data export and import phases | Local performance testing | -| `--direct-copy` | Loads data directly using `COPY FROM`, without intermediate storage | Local testing, limited-infrastructure environments | +| Migration flow | Mode | Description | Best for | +|----------------------------------------------------------------------------------------|----------------------------------------------------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------| +| [Bulk load]({% link molt/migrate-bulk-load.md %}) | 
`--mode data-load` | Perform a one-time bulk load of source data into CockroachDB. | Testing, migrations with [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) | +| [Data load and replication]({% link molt/migrate-data-load-and-replication.md %}) | `--mode data-load-and-replication` | Load source data, then replicate subsequent changes continuously. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations | +| [Data load then replication-only]({% link molt/migrate-data-load-replicate-only.md %}) | `--mode data-load`, then `--mode replication-only` | Load source data first, then start replication in a separate task. | [Minimal downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) migrations | +| [Resume replication]({% link molt/migrate-replicate-only.md %}) | `--mode replication-only` | Resume replication from a checkpoint after interruption. | Resuming interrupted migrations, post-load sync | +| [Failback]({% link molt/migrate-failback.md %}) | `--mode failback` | Replicate changes from CockroachDB back to the source database. | [Rollback]({% link molt/migrate-failback.md %}) scenarios | -## Migrations with minimal downtime +### Bulk load -MOLT simplifies and streamlines migrations by using a replication stream to minimize downtime. Rather than load all data into CockroachDB during a [planned downtime]({% link molt/migration-strategy.md %}#approach-to-downtime) window, you perform an initial data load and continuously replicate any subsequent changes to CockroachDB. Writes are only briefly paused to allow replication to drain before final cutover. The length of the pause depends on the volume of write traffic and the amount of replication lag between the source and CockroachDB. +For migrations that tolerate downtime, use `data-load` mode to perform a one-time bulk load of source data into CockroachDB. Refer to [Bulk Load]({% link molt/migrate-bulk-load.md %}). -Run MOLT Fetch in either `data-load-and-replication` mode, or `data-load` mode followed by `replication-only`, to load the initial source data and continuously replicate subsequent changes to CockroachDB. When ready, pause application traffic to allow replication to drain, validate data consistency with MOLT Verify, then cut over to CockroachDB. For example steps, refer to [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}). +### Migrations with minimal downtime -## Migration failback +To minimize downtime during migration, MOLT Fetch supports replication streams that continuously synchronize changes from the source database to CockroachDB. Instead of loading all data during a planned downtime window, you can run an initial load followed by continuous replication. Writes are paused only briefly to allow replication to drain before the final cutover. The duration of this pause depends on the volume of write traffic and the replication lag between the source and CockroachDB. -If issues arise during the migration, start MOLT Fetch in `failback` mode after stopping replication and before sending new writes to CockroachDB. Failback mode replicates changes from CockroachDB back to the original source database, ensuring that data is consistent on the original source so that you can retry the migration later. For example steps, refer to [Migration Failback]({% link molt/migrate-failback.md %}). +- Use `data-load-and-replication` mode to perform both steps in one task. 
Refer to [Load and Replicate]({% link molt/migrate-data-load-and-replication.md %}). +- Use `data-load` followed by `replication-only` to perform the steps separately. Refer to [Load and Replicate Separately]({% link molt/migrate-data-load-replicate-only.md %}). + +### Recovery and rollback strategies + +If the migration is interrupted or cutover must be aborted, MOLT Fetch provides safe recovery options: + +- Use `replication-only` to resume a previously interrupted replication stream. Refer to [Resume Replication]({% link molt/migrate-replicate-only.md %}). +- Use `failback` to reverse the migration, synchronizing changes from CockroachDB back to the original source. This ensures data consistency on the source so that you can retry the migration later. Refer to [Migration Failback]({% link molt/migrate-failback.md %}). ## See also -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) -- [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}) -- [Migration Failback]({% link molt/migrate-failback.md %}) - [Migration Strategy]({% link molt/migration-strategy.md %}) - [MOLT Releases]({% link releases/molt.md %}) \ No newline at end of file diff --git a/src/current/molt/migration-strategy.md b/src/current/molt/migration-strategy.md index 8b379974d17..ff462126db3 100644 --- a/src/current/molt/migration-strategy.md +++ b/src/current/molt/migration-strategy.md @@ -37,7 +37,7 @@ It's important to fully [prepare the migration](#prepare-for-migration) in order - *Planned downtime* is made known to your users in advance. Once you have [prepared for the migration](#prepare-for-migration), you take the application offline, [conduct the migration]({% link molt/migration-overview.md %}), and bring the application back online on CockroachDB. To succeed, you should estimate the amount of downtime required to migrate your data, and ideally schedule the downtime outside of peak hours. Scheduling downtime is easiest if your application traffic is "periodic", meaning that it varies by the time of day, day of week, or day of month. - Migrations with planned downtime are only recommended if you can complete the bulk data load (e.g., using the MOLT Fetch [`data-load` mode]({% link molt/migration-overview.md %}#migration-modes)) within the downtime window. Otherwise, you can [minimize downtime using continuous replication]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime). + Migrations with planned downtime are only recommended if you can complete the bulk data load (e.g., using the MOLT Fetch [`data-load` mode]({% link molt/molt-fetch.md %}#fetch-mode)) within the downtime window. Otherwise, you can [minimize downtime using continuous replication]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime). - *Minimal downtime* impacts as few customers as possible, ideally without impacting their regular usage. If your application is intentionally offline at certain times (e.g., outside business hours), you can migrate the data without users noticing. Alternatively, if your application's functionality is not time-sensitive (e.g., it sends batched messages or emails), you can queue requests while the system is offline and process them after completing the migration to CockroachDB. @@ -110,7 +110,7 @@ Based on the error budget you [defined in your migration plan](#develop-a-migrat ### Load test data -It's useful to load test data into CockroachDB so that you can [test your application queries](#validate-queries). 
Refer to the steps in [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}). +It's useful to load test data into CockroachDB so that you can [test your application queries](#validate-queries). Refer to [Migration flows]({% link molt/migration-overview.md %}#migration-flows). MOLT Fetch [supports both `IMPORT INTO` and `COPY FROM`]({% link molt/molt-fetch.md %}#data-movement) for loading data into CockroachDB: @@ -160,13 +160,11 @@ To safely cut over when using replication: 1. When your [monitoring](#set-up-monitoring-and-alerting) indicates that replication is idle, use [MOLT Verify]({% link molt/molt-verify.md %}) to validate the CockroachDB data. 1. Start application traffic on CockroachDB. -When you are ready to migrate, refer to [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) or [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}) for practical examples of the migration steps. +When you are ready to migrate, refer to [Migration flows]({% link molt/migration-overview.md %}#migration-flows) for a summary of migration types. ## See also - [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) -- [Migrate to CockroachDB in Phases]({% link molt/migrate-in-phases.md %}) - [Migration Failback]({% link molt/migrate-failback.md %}) - [Schema Design Overview]({% link {{ site.current_cloud_version }}/schema-design-overview.md %}) - [Primary key best practices]({% link {{ site.current_cloud_version }}/schema-design-table.md %}#primary-key-best-practices) diff --git a/src/current/molt/molt-fetch.md b/src/current/molt/molt-fetch.md index 731a016f64d..0a37f136774 100644 --- a/src/current/molt/molt-fetch.md +++ b/src/current/molt/molt-fetch.md @@ -7,7 +7,7 @@ docs_area: migrate MOLT Fetch moves data from a source database into CockroachDB as part of a [database migration]({% link molt/migration-overview.md %}). -MOLT Fetch uses [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to move the source data to cloud storage (Google Cloud Storage or Amazon S3), a local file server, or local memory. Once the data is exported, MOLT Fetch can load the data into a target CockroachDB database and replicate changes from the source database. For details, see [Usage](#usage). +MOLT Fetch uses [`IMPORT INTO`]({% link {{site.current_cloud_version}}/import-into.md %}) or [`COPY FROM`]({% link {{site.current_cloud_version}}/copy.md %}) to move the source data to cloud storage (Google Cloud Storage, Amazon S3, or Azure Blob Storage), a local file server, or local memory. Once the data is exported, MOLT Fetch can load the data into a target CockroachDB database and replicate changes from the source database. For details, see [Usage](#usage). ## Supported databases @@ -15,6 +15,7 @@ The following source databases are currently supported: - PostgreSQL 11-16 - MySQL 5.7, 8.0 and later +- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition) ## Installation @@ -73,6 +74,10 @@ Complete the following items before using MOLT Fetch: ## Best practices +{{site.data.alerts.callout_success}} +To verify that your connections and configuration work properly, run MOLT Fetch in a staging environment before migrating any data in production. Use a test or development environment that closely resembles production. 
+{{site.data.alerts.end}} + - To prevent connections from terminating prematurely during data export, set the following to high values on the source database: - **Maximum allowed number of connections.** MOLT Fetch can export data across multiple connections. The number of connections it will create is the number of shards ([`--export-concurrency`](#global-flags)) multiplied by the number of tables ([`--table-concurrency`](#global-flags)) being exported concurrently. @@ -83,7 +88,7 @@ Complete the following items before using MOLT Fetch: - **Maximum lifetime of a connection.** -- If a PostgreSQL database is set as a [source](#source-and-target-databases), ensure that [`idle_in_transaction_session_timeout`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT) on PostgreSQL is either disabled or set to a value longer than the duration of data export. Otherwise, the connection will be prematurely terminated. To estimate the time needed to export the PostgreSQL tables, you can [perform a dry run](#perform-a-dry-run) and sum the value of [`molt_fetch_table_export_duration_ms`](#metrics) for all exported tables. +- If a PostgreSQL database is set as a [source](#source-and-target-databases), ensure that [`idle_in_transaction_session_timeout`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT) on PostgreSQL is either disabled or set to a value longer than the duration of data export. Otherwise, the connection will be prematurely terminated. To estimate the time needed to export the PostgreSQL tables, you can perform a dry run and sum the value of [`molt_fetch_table_export_duration_ms`](#metrics) for all exported tables. - To prevent memory outages during `READ COMMITTED` data export of tables with large rows, estimate the amount of memory used to export a table: @@ -99,7 +104,7 @@ Complete the following items before using MOLT Fetch: - Ensure that the machine running MOLT Fetch is large enough to handle the amount of data being migrated. Fetch performance can sometimes be limited by available resources, but should always be making progress. To identify possible resource constraints, observe the `molt_fetch_rows_exported` [metric](#metrics) for decreases in the number of rows being processed. You can use the [sample Grafana dashboard](https://molt.cockroachdb.com/molt/cli/grafana_dashboard.json) to view metrics. -- Before moving data, Cockroach Labs recommends dropping any [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#drop-constraint) and [indexes]({% link {{site.current_cloud_version}}/drop-index.md %}) on the target CockroachDB database. Doing so will optimize performance. You can recreate [constraints]({% link {{ site.current_cloud_version }}/alter-table.md %}#add-constraint) and [indexes]({% link {{site.current_cloud_version}}/create-index.md %}) after the data is loaded. +- {% include molt/molt-drop-constraints-indexes.md %} ## Security recommendations @@ -113,36 +118,12 @@ Cockroach Labs **strongly** recommends the following: ### Connection strings -- Avoid plaintext connection strings. -- Provide your connection strings as environment variables. -- If possible within your security infrastructure, use an external secrets manager to load the environment variables from stored secrets. 
-  - For example, to export connection strings as environment variables:
-
-    ~~~ shell
-    export SOURCE="postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full"
-    export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
-    ~~~
-
-    Afterward, to pass the environment variables in `molt fetch` commands:
-
-    ~~~ shell
-    molt fetch \
-    --source $SOURCE \
-    --target $TARGET \
-    --table-filter 'employees' \
-    --bucket-path 's3://molt-test' \
-    --table-handling truncate-if-exists
-    ~~~
+{% include molt/fetch-secure-connection-strings.md %}

### Secure cloud storage

{% include molt/fetch-secure-cloud-storage.md %}

-### Perform a dry run
-
-To verify that your connections and configuration work properly, run MOLT Fetch in a staging environment before moving any data in production. Use a test or development environment that is as similar as possible to production.
-
## Commands

| Command | Usage |

@@ -161,12 +142,13 @@ To verify that your connections and configuration work properly, run MOLT Fetch

| Flag | Description |
|------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `--source` | (Required) Connection string for the source database. For details, see [Source and target databases](#source-and-target-databases). |
-| `--target` | (Required) Connection string for the target database. For details, see [Source and target databases](#source-and-target-databases). |
+| `--source` | (Required) Connection string for the source database. For Oracle sources, this specifies the PDB (in a CDB/PDB architecture) or a standalone (non-CDB) database. For details, refer to [Source and target databases](#source-and-target-databases). |
+| `--source-cdb` | Connection string for the Oracle container database (CDB) when using a multitenant (CDB/PDB) architecture. Omit this flag on a non-multitenant Oracle database. For details, refer to [Source and target databases](#source-and-target-databases). |
+| `--target` | (Required) Connection string for the target database. For details, refer to [Source and target databases](#source-and-target-databases). |
| `--allow-tls-mode-disable` | Allow insecure connections to databases. Secure SSL/TLS connections should be used by default. This should be enabled **only** if secure SSL/TLS connections to the source or target database are not possible. |
| `--assume-role` | Service account to use for assume role authentication. `--use-implicit-auth` must be included. For example, `--assume-role='user-test@cluster-ephemeral.iam.gserviceaccount.com' --use-implicit-auth`. For details, refer to [Cloud Storage Authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}). |
| `--bucket-path` | The path within the [cloud storage](#cloud-storage) bucket where intermediate files are written (e.g., `'s3://bucket/path'` or `'gs://bucket/path'`). Only the URL path is used; query parameters (e.g., credentials) are ignored.
To pass in query parameters, use the appropriate flags: `--assume-role`, `--import-region`, `--use-implicit-auth`. | -| `--case-sensitive` | Toggle case sensitivity when comparing table and column names on the source and target. To disable case sensitivity, set `--case-sensitive=false`. If `=` is **not** included (e.g., `--case-sensitive false`), the flag is interpreted as `--case-sensitive` (i.e., `--case-sensitive=true`).

**Default:** `false` | +| `--case-sensitive` | Toggle case sensitivity when comparing table and column names on the source and target. To disable case sensitivity, set `--case-sensitive=false`. If `=` is **not** included (e.g., `--case-sensitive false`), the flag is interpreted as `--case-sensitive` (i.e., `--case-sensitive=true`).

**Default:** `false` | | `--changefeeds-path` | Path to a JSON file that contains changefeed override settings for [failback](#fail-back-to-source-database), when enabled with `--mode failback`. If not specified, an insecure default configuration is used, and `--allow-tls-mode-disable` must be included. For details, see [Fail back to source database](#fail-back-to-source-database). | | `--cleanup` | Whether to delete intermediate files after moving data using [cloud or local storage](#data-path). **Note:** Cleanup does not occur on [continuation](#fetch-continuation). | | `--compression` | Compression method for data when using [`IMPORT INTO`](#data-movement) (`gzip`/`none`).

**Default:** `gzip` | @@ -187,7 +169,7 @@ To verify that your connections and configuration work properly, run MOLT Fetch | `--log-file` | Write messages to the specified log filename. If no filename is provided, messages write to `fetch-{datetime}.log`. If `"stdout"` is provided, messages write to `stdout`. | | `--logging` | Level at which to log messages (`trace`/`debug`/`info`/`warn`/`error`/`fatal`/`panic`).

**Default:** `info` | | `--metrics-listen-addr` | Address of the Prometheus metrics endpoint, which has the path `{address}/metrics`. For details on important metrics to monitor, see [Metrics](#metrics).

**Default:** `'127.0.0.1:3030'` | -| `--mode` | Configure the MOLT Fetch behavior: `data-load`, `data-load-and-replication`, `replication-only`, `export-only`, `import-only`, or `failback`. For details, refer to [Fetch mode](#fetch-mode).

**Default:** `data-load` | +| `--mode` | Configure the MOLT Fetch behavior: `data-load`, `data-load-and-replication`, `replication-only`, `export-only`, `import-only`, or `failback`. For details, refer to [Fetch mode](#fetch-mode).

**Default:** `data-load` | | `--non-interactive` | Run the fetch task without interactive prompts. This is recommended **only** when running `molt fetch` in an automated process (i.e., a job or continuous integration). | | `--pglogical-publication-name` | If set, the name of the [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html) that will be created or used for replication. Used in [`replication-only`](#replicate-changes) mode.

**Default:** `molt_fetch` |
| `--pglogical-publication-and-slot-drop-and-recreate` | If set, drops the [publication](https://www.postgresql.org/docs/current/logical-replication-publication.html) and slots if they exist and then recreates them. Used in [`replication-only`](#replicate-changes) mode. |

@@ -227,6 +209,8 @@ The following sections describe how to use the `molt fetch` [flags](#flags).

Follow the recommendations in [Connection strings](#connection-strings).
{{site.data.alerts.end}}

+#### `--source`
+
`--source` specifies the connection string of the source database.

PostgreSQL or CockroachDB:

@@ -243,6 +227,27 @@ MySQL:

--source 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
~~~

+Oracle:
+
+{% include_cached copy-clipboard.html %}
+~~~
+--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'
+~~~
+
+In Oracle migrations, the `--source` connection string specifies a PDB (in [Oracle Multitenant databases](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html)) or a standalone (non-CDB) database. The `{username}` corresponds to the owner of the tables you will migrate.
+
+#### `--source-cdb`
+
+The `--source-cdb` flag specifies the connection string for the Oracle container database (CDB) in an Oracle Multitenant deployment. Omit this flag on a non-multitenant Oracle database.
+
+{% include_cached copy-clipboard.html %}
+~~~
+--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'
+--source-cdb 'oracle://{username}:{password}@{host}:{port}/{container_service}'
+~~~
+
+#### `--target`
+
`--target` specifies the [CockroachDB connection string]({% link {{site.current_cloud_version}}/connection-parameters.md %}#connect-using-a-url):

{% include_cached copy-clipboard.html %}
~~~

@@ -284,7 +289,7 @@ In case you need to rename your [publication](https://www.postgresql.org/docs/cu

#### Load data and replicate changes

{{site.data.alerts.callout_info}}
-Before using this option, the source PostgreSQL or MySQL database **must** be configured for continuous replication, as described in [Setup](#replication-setup). MySQL 5.7 and later are supported.
+Before using this option, the source database **must** be configured for continuous replication, as described in [Setup](#replication-setup).
{{site.data.alerts.end}}

`data-load-and-replication` instructs MOLT Fetch to load the source data into CockroachDB, and replicate any subsequent changes on the source. This enables [migrations with minimal downtime]({% link molt/migration-overview.md %}#migrations-with-minimal-downtime).

@@ -323,7 +328,7 @@ To customize the replication behavior (an advanced use case), use `--replicator-

{{site.data.alerts.callout_info}}
Before using this option:

-- The source PostgreSQL or MySQL database **must** be configured for continuous replication, as described in [Setup](#replication-setup). MySQL 5.7 and later are supported.
+- The source database **must** be configured for continuous replication, as described in [Setup](#replication-setup).
- The `replicator` binary **must** be located either in the same directory as `molt` or in a directory beneath `molt`.
{{site.data.alerts.end}}

@@ -341,20 +346,12 @@ Before using this option:

In case you want to run `replication-only` without already having loaded data (e.g., for testing), also include `--pglogical-publication-and-slot-drop-and-recreate` to ensure that the publication and replication slot are created in the correct order. For details on this flag, refer to [Global flags](#global-flags).
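+
+For example, a test-only `replication-only` invocation against a PostgreSQL source might look like the following sketch, where the connection strings are placeholders:
+
+{% include_cached copy-clipboard.html %}
+~~~ shell
+# Hypothetical test run: starts replication without a prior data load,
+# recreating the publication and replication slot in the correct order.
+molt fetch \
+--source $SOURCE \
+--target $TARGET \
+--table-filter 'employees' \
+--mode replication-only \
+--pglogical-publication-and-slot-drop-and-recreate
+~~~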
{{site.data.alerts.end}} -- For a MySQL source, first get your GTID record: - {% include_cached copy-clipboard.html %} - ~~~ sql - SELECT source_uuid, min(interval_start), max(interval_end) - FROM mysql.gtid_executed - GROUP BY source_uuid; - ~~~ - - In the `molt fetch` command, specify a GTID set using the [`--defaultGTIDSet` replication flag](#mysql-replication-flags) and the format `source_uuid:min(interval_start)-max(interval_end)`. For example: +- For a MySQL source, replication requires specifying a starting GTID set with the `--defaultGTIDSet` replication flag. After the initial data load completes, locate the [`cdc_cursor`](#cdc-cursor) value in the `fetch complete` log output and use it as the GTID set. For example: {% include_cached copy-clipboard.html %} - ~~~ - --mode replication-only + ~~~ shell + --mode replication-only \ --replicator-flags "--defaultGTIDSet b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21" ~~~ @@ -522,25 +519,37 @@ MOLT Fetch can move the source data to CockroachDB via [cloud storage](#cloud-st Only the path specified in `--bucket-path` is used. Query parameters, such as credentials, are ignored. To authenticate cloud storage, follow the steps in [Secure cloud storage](#secure-cloud-storage). {{site.data.alerts.end}} -`--bucket-path` instructs MOLT Fetch to write intermediate files to a path within a [Google Cloud Storage](https://cloud.google.com/storage/docs/buckets) or [Amazon S3](https://aws.amazon.com/s3/) bucket to which you have the necessary permissions. Use additional [flags](#global-flags), shown in the following examples, to specify authentication or region parameters as required for bucket access. +`--bucket-path` instructs MOLT Fetch to write intermediate files to a path within [Google Cloud Storage](https://cloud.google.com/storage/docs/buckets), [Amazon S3](https://aws.amazon.com/s3/), or [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs) to which you have the necessary permissions. Use additional [flags](#global-flags), shown in the following examples, to specify authentication or region parameters as required for bucket access. -The following example connects to a Google Cloud Storage bucket with [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}#google-cloud-storage-implicit) and [assume role]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}#set-up-google-cloud-storage-assume-role). +Connect to a Google Cloud Storage bucket with [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}#google-cloud-storage-implicit) and [assume role]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}#set-up-google-cloud-storage-assume-role): {% include_cached copy-clipboard.html %} ~~~ ---bucket-path 'gs://migration/data/cockroach +--bucket-path 'gs://migration/data/cockroach' --assume-role 'user-test@cluster-ephemeral.iam.gserviceaccount.com' --use-implicit-auth ~~~ -The following example connects to an Amazon S3 bucket and explicitly specifies the `ap_south-1` region. The `--import-region` flag enables the use of the `AWS_REGION` query parameter in the `s3` URL. When this flag is set, `IMPORT INTO` must be used for [data movement](#data-movement). 
+Connect to an Amazon S3 bucket and explicitly specify the `ap-south-1` region:

{% include_cached copy-clipboard.html %}
~~~
---bucket-path 's3://migration/data/cockroach
+--bucket-path 's3://migration/data/cockroach'
--import-region 'ap-south-1'
~~~

+{{site.data.alerts.callout_info}}
+When `--import-region` is set, `IMPORT INTO` must be used for [data movement](#data-movement).
+{{site.data.alerts.end}}
+
+Connect to an Azure Blob Storage container with [implicit authentication]({% link {{ site.current_cloud_version }}/cloud-storage-authentication.md %}?filters=azure#azure-blob-storage-implicit-authentication):
+
+{% include_cached copy-clipboard.html %}
+~~~
+--bucket-path 'azure-blob://migration/data/cockroach'
+--use-implicit-auth
+~~~
+
#### Local file server

`--local-path` instructs MOLT Fetch to write intermediate files to a path within a [local file server]({% link {{site.current_cloud_version}}/use-a-local-file-server.md %}). `local-path-listen-addr` specifies the address of the local file server. For example:

@@ -654,36 +663,58 @@ If [`drop-on-target-and-recreate`](#target-table-handling) is set, MOLT Fetch au

- PostgreSQL types are mapped to existing CockroachDB [types]({% link {{site.current_cloud_version}}/data-types.md %}) that have the same [`OID`]({% link {{site.current_cloud_version}}/oid.md %}).
- The following MySQL types are mapped to corresponding CockroachDB types:

-  | MySQL type | CockroachDB type |
-  |-----------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|
-  | `CHAR`, `CHARACTER`, `VARCHAR`, `NCHAR`, `NVARCHAR` | [`VARCHAR`]({% link {{site.current_cloud_version}}/string.md %}) |
-  | `TINYTEXT`, `TEXT`, `MEDIUMTEXT`, `LONGTEXT` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) |
-  | `GEOMETRY` | [`GEOMETRY`]({% link {{site.current_cloud_version}}/architecture/glossary.md %}#geometry) |
-  | `LINESTRING` | [`LINESTRING`]({% link {{site.current_cloud_version}}/linestring.md %}) |
-  | `POINT` | [`POINT`]({% link {{site.current_cloud_version}}/point.md %}) |
-  | `POLYGON` | [`POLYGON`]({% link {{site.current_cloud_version}}/polygon.md %}) |
-  | `MULTIPOINT` | [`MULTIPOINT`]({% link {{site.current_cloud_version}}/multipoint.md %}) |
-  | `MULTILINESTRING` | [`MULTILINESTRING`]({% link {{site.current_cloud_version}}/multilinestring.md %}) |
-  | `MULTIPOLYGON` | [`MULTIPOLYGON`]({% link {{site.current_cloud_version}}/multipolygon.md %}) |
-  | `GEOMETRYCOLLECTION`, `GEOMCOLLECTION` | [`GEOMETRYCOLLECTION`]({% link {{site.current_cloud_version}}/geometrycollection.md %}) |
-  | `JSON` | [`JSONB`]({% link {{site.current_cloud_version}}/jsonb.md %}) |
-  | `TINYINT`, `INT1` | [`INT2`]({% link {{site.current_cloud_version}}/int.md %}) |
-  | `BLOB` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) |
-  | `SMALLINT`, `INT2` | [`INT2`]({% link {{site.current_cloud_version}}/int.md %}) |
-  | `MEDIUMINT`, `INT`, `INTEGER`, `INT4` | [`INT4`]({% link {{site.current_cloud_version}}/int.md %}) |
-  | `BIGINT`, `INT8` | [`INT`]({% link {{site.current_cloud_version}}/int.md %}) |
-  | `FLOAT` | [`FLOAT4`]({% link {{site.current_cloud_version}}/float.md %}) |
-  | `DOUBLE` | [`FLOAT`]({% link {{site.current_cloud_version}}/float.md %}) |
-  | `DECIMAL`, `NUMERIC`, `REAL` | [`DECIMAL`]({% link {{site.current_cloud_version}}/decimal.md %}) (Negative scale values are autocorrected to `0`) |
-  | `BINARY`, `VARBINARY` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) |
-  | `DATETIME` | [`TIMESTAMP`]({% link {{site.current_cloud_version}}/timestamp.md %}) |
-  | `TIMESTAMP` | [`TIMESTAMPTZ`]({% link {{site.current_cloud_version}}/timestamp.md %}) |
-  | `TIME` | [`TIME`]({% link {{site.current_cloud_version}}/time.md %}) |
-  | `BIT` | [`VARBIT`]({% link {{site.current_cloud_version}}/bit.md %}) |
-  | `DATE` | [`DATE`]({% link {{site.current_cloud_version}}/date.md %}) |
-  | `TINYBLOB`, `MEDIUMBLOB`, `LONGBLOB` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) |
-  | `BOOL`, `BOOLEAN` | [`BOOL`]({% link {{site.current_cloud_version}}/bool.md %}) |
-  | `ENUM` | [`ANY_ENUM`]({% link {{site.current_cloud_version}}/enum.md %}) |
+  | MySQL type | CockroachDB type | Notes |
+  |-----------------------------------------------------|-------------------------------------------------------------------------------------------|--------------------------------------------------------------|
+  | `CHAR`, `CHARACTER`, `VARCHAR`, `NCHAR`, `NVARCHAR` | [`VARCHAR`]({% link {{site.current_cloud_version}}/string.md %}) | Varying-length string; raises warning if BYTE semantics used |
+  | `TINYTEXT`, `TEXT`, `MEDIUMTEXT`, `LONGTEXT` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) | Unlimited-length string |
+  | `GEOMETRY` | [`GEOMETRY`]({% link {{site.current_cloud_version}}/architecture/glossary.md %}#geometry) | Spatial type (PostGIS-style) |
+  | `LINESTRING` | [`LINESTRING`]({% link {{site.current_cloud_version}}/linestring.md %}) | Spatial type (PostGIS-style) |
+  | `POINT` | [`POINT`]({% link {{site.current_cloud_version}}/point.md %}) | Spatial type (PostGIS-style) |
+  | `POLYGON` | [`POLYGON`]({% link {{site.current_cloud_version}}/polygon.md %}) | Spatial type (PostGIS-style) |
+  | `MULTIPOINT` | [`MULTIPOINT`]({% link {{site.current_cloud_version}}/multipoint.md %}) | Spatial type (PostGIS-style) |
+  | `MULTILINESTRING` | [`MULTILINESTRING`]({% link {{site.current_cloud_version}}/multilinestring.md %}) | Spatial type (PostGIS-style) |
+  | `MULTIPOLYGON` | [`MULTIPOLYGON`]({% link {{site.current_cloud_version}}/multipolygon.md %}) | Spatial type (PostGIS-style) |
+  | `GEOMETRYCOLLECTION`, `GEOMCOLLECTION` | [`GEOMETRYCOLLECTION`]({% link {{site.current_cloud_version}}/geometrycollection.md %}) | Spatial type (PostGIS-style) |
+  | `JSON` | [`JSONB`]({% link {{site.current_cloud_version}}/jsonb.md %}) | CockroachDB's native JSON format |
+  | `TINYINT`, `INT1` | [`INT2`]({% link {{site.current_cloud_version}}/int.md %}) | 2-byte integer |
+  | `BLOB` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) | Binary data |
+  | `SMALLINT`, `INT2` | [`INT2`]({% link {{site.current_cloud_version}}/int.md %}) | 2-byte integer |
+  | `MEDIUMINT`, `INT`, `INTEGER`, `INT4` | [`INT4`]({% link {{site.current_cloud_version}}/int.md %}) | 4-byte integer |
+  | `BIGINT`, `INT8` | [`INT`]({% link {{site.current_cloud_version}}/int.md %}) | 8-byte integer |
+  | `FLOAT` | [`FLOAT4`]({% link {{site.current_cloud_version}}/float.md %}) | 32-bit float |
+  | `DOUBLE` | [`FLOAT`]({% link {{site.current_cloud_version}}/float.md %}) | 64-bit float |
+  | `DECIMAL`, `NUMERIC`, `REAL` | [`DECIMAL`]({% link {{site.current_cloud_version}}/decimal.md %}) | Validates scale ≤ precision; warns if precision > 19 |
+  | `BINARY`, `VARBINARY` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) | Binary data |
+  | `DATETIME` | [`TIMESTAMP`]({% link {{site.current_cloud_version}}/timestamp.md %}) | Date and time (no time zone) |
+  | `TIMESTAMP` | [`TIMESTAMPTZ`]({% link {{site.current_cloud_version}}/timestamp.md %}) | Date and time with time zone |
+  | `TIME` | [`TIME`]({% link {{site.current_cloud_version}}/time.md %}) | Time of day (no date) |
+  | `BIT` | [`VARBIT`]({% link {{site.current_cloud_version}}/bit.md %}) | Variable-length bit array |
+  | `DATE` | [`DATE`]({% link {{site.current_cloud_version}}/date.md %}) | Date only (no time) |
+  | `TINYBLOB`, `MEDIUMBLOB`, `LONGBLOB` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) | Binary data |
+  | `BOOL`, `BOOLEAN` | [`BOOL`]({% link {{site.current_cloud_version}}/bool.md %}) | Boolean |
+  | `ENUM` | [`ANY_ENUM`]({% link {{site.current_cloud_version}}/enum.md %}) | Enumerated type |
+
+- The following Oracle types are mapped to CockroachDB types:
+
+  | Oracle type(s) | CockroachDB type | Notes |
+  |---------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------|
+  | `NCHAR`, `CHAR`, `CHARACTER` | [`CHAR`]({% link {{site.current_cloud_version}}/string.md %})(n) or [`CHAR`]({% link {{site.current_cloud_version}}/string.md %}) | Fixed-length character; falls back to unbounded if length not specified |
+  | `VARCHAR`, `VARCHAR2`, `NVARCHAR2` | [`VARCHAR`]({% link {{site.current_cloud_version}}/string.md %})(n) or [`VARCHAR`]({% link {{site.current_cloud_version}}/string.md %}) | Varying-length string; raises warning if BYTE semantics used |
+  | `STRING` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) | Unlimited-length string |
+  | `SMALLINT` | [`INT2`]({% link {{site.current_cloud_version}}/int.md %}) | 2-byte integer |
+  | `INTEGER`, `INT`, `SIMPLE_INTEGER` | [`INT4`]({% link {{site.current_cloud_version}}/int.md %}) | 4-byte integer |
+  | `LONG` | [`INT8`]({% link {{site.current_cloud_version}}/int.md %}) | 8-byte integer |
+  | `FLOAT`, `BINARY_FLOAT`, `REAL` | [`FLOAT4`]({% link {{site.current_cloud_version}}/float.md %}) | 32-bit float |
+  | `DOUBLE`, `BINARY_DOUBLE` | [`FLOAT8`]({% link {{site.current_cloud_version}}/float.md %}) | 64-bit float |
+  | `DEC`, `NUMBER`, `DECIMAL`, `NUMERIC` | [`DECIMAL`]({% link {{site.current_cloud_version}}/decimal.md %})(p, s) or [`DECIMAL`]({% link {{site.current_cloud_version}}/decimal.md %}) | Validates scale ≤ precision; warns if precision > 19 |
+  | `DATE` | [`DATE`]({% link {{site.current_cloud_version}}/date.md %}) | Date only (no time) |
+  | `BLOB`, `RAW`, `LONG RAW` | [`BYTES`]({% link {{site.current_cloud_version}}/bytes.md %}) | Binary data |
+  | `JSON` | [`JSONB`]({% link {{site.current_cloud_version}}/jsonb.md %}) | CockroachDB's native JSON format |
+  | `CLOB`, `NCLOB` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) | Treated as large text |
+  | `BOOLEAN` | [`BOOL`]({% link {{site.current_cloud_version}}/bool.md %}) | Boolean |
+  | `TIMESTAMP` | [`TIMESTAMP`]({% link {{site.current_cloud_version}}/timestamp.md %}) or [`TIMESTAMPTZ`]({% link {{site.current_cloud_version}}/timestamp.md %}) | If `WITH TIME ZONE` → `TIMESTAMPTZ`, else `TIMESTAMP` |
+  | `ROWID`, `UROWID` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) | Treated as opaque identifier |
+  | `SDO_GEOMETRY` | [`GEOMETRY`]({% link {{site.current_cloud_version}}/architecture/glossary.md %}#geometry) | Spatial type (PostGIS-style) |
+  | `XMLTYPE` | [`STRING`]({% link {{site.current_cloud_version}}/string.md %}) | Stored as text |

- To override the default mappings for automatic schema creation, you
can map source to target CockroachDB types explicitly. These are defined in the JSON file indicated by the `--type-map-file` flag. The allowable custom mappings are valid CockroachDB aliases, casts, and the following mappings specific to MOLT Fetch and [Verify]({% link molt/molt-verify.md %}): @@ -890,10 +921,12 @@ Continuation Tokens. A change data capture (CDC) cursor is written to the output as `cdc_cursor` at the beginning and end of the fetch task. For example: ~~~ json -{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"} +{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"} ~~~ -You can use the `cdc_cursor` value with an external change data capture (CDC) tool to continuously replicate subsequent changes on the source data to CockroachDB. +Use the `cdc_cursor` value as the starting GTID set for MySQL replication by passing it to the `--defaultGTIDSet` replication flag (refer to [Replication flags](#replication-flags)). + +You can also use the `cdc_cursor` value with an external change data capture (CDC) tool to continuously replicate subsequent changes from the source database to CockroachDB. ### Metrics @@ -1130,5 +1163,5 @@ DEBUG [Sep 11 11:04:01] httpReque ## See also - [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) -- [MOLT Verify]({% link molt/molt-verify.md %}) \ No newline at end of file +- [Migration Strategy]({% link molt/migration-strategy.md %}) +- [MOLT Verify]({% link molt/molt-verify.md %}) diff --git a/src/current/molt/molt-verify.md b/src/current/molt/molt-verify.md index 2449663f93f..255978d7340 100644 --- a/src/current/molt/molt-verify.md +++ b/src/current/molt/molt-verify.md @@ -27,6 +27,7 @@ The following source databases are currently supported: - PostgreSQL 12-16 - MySQL 5.7, 8.0 and later +- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition) ## Installation @@ -130,5 +131,5 @@ The following limitation is specific to MySQL: ## See also - [Migration Overview]({% link molt/migration-overview.md %}) -- [Migrate to CockroachDB]({% link molt/migrate-to-cockroachdb.md %}) +- [Migration Strategy]({% link molt/migration-strategy.md %}) - [MOLT Fetch]({% link molt/molt-fetch.md %}) \ No newline at end of file