Add docs for MN migration to TimescaleDB instance #2827

Merged 1 commit on Nov 21, 2023
166 changes: 166 additions & 0 deletions _partials/_migrate_post_schema_caggs_etal.md
@@ -0,0 +1,166 @@
## Migrate schema post-data

When you have migrated your table and hypertable data, migrate your PostgreSQL
schema post-data. This includes information about constraints.

<Procedure>

### Migrating schema post-data

1. At the command prompt, dump the schema post-data from your source database
into a `dump_post_data.dump` file, using your source database connection details. Exclude
Timescale-specific schemas. If you are prompted for a password, use your
source database credentials:

```bash
pg_dump -U <SOURCE_DB_USERNAME> -W \
-h <SOURCE_DB_HOST> -p <SOURCE_DB_PORT> -Fc -v \
--section=post-data --exclude-schema="_timescaledb*" \
-f dump_post_data.dump <DATABASE_NAME>
```
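
   Optionally, you can verify what the dump contains before restoring it by listing the archive's table of contents:

   ```bash
   # Print the table of contents of the custom-format archive without restoring anything
   pg_restore --list dump_post_data.dump
   ```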

1. Restore the dumped schema post-data from the `dump_post_data.dump` file into
your Timescale database, using your connection details. To avoid permissions
errors, include the `--no-owner` flag:

```bash
pg_restore -U tsdbadmin -W \
-h <HOST> -p <PORT> --no-owner -Fc \
-v -d tsdb dump_post_data.dump
```

> **Reviewer comment (Contributor):** Do you need the `-Fc` here? I think `pg_restore` figures this out itself, and the documentation for `pg_dump` and `pg_restore` does not use it.

</Procedure>

### Troubleshooting

If you see the following errors during the migration process, you can safely
ignore them. The migration still completes successfully.

```
pg_restore: error: could not execute query: ERROR: relation "<relation_name>" already exists
```

```
pg_restore: error: could not execute query: ERROR: trigger "ts_insert_blocker" for relation "<relation_name>" already exists
```

## Recreate continuous aggregates

Continuous aggregates aren't migrated by default when you transfer your schema
and data separately. You can restore them by recreating the continuous aggregate
definitions and recomputing the results on your Timescale database. The recomputed
continuous aggregates only aggregate existing data in your Timescale database. They
don't include deleted raw data.

<Procedure>

### Recreating continuous aggregates

1. Connect to your source database:

```bash
psql "postgres://<SOURCE_DB_USERNAME>:<SOURCE_DB_PASSWORD>@<SOURCE_DB_HOST>:<SOURCE_DB_PORT>/<SOURCE_DB_NAME>?sslmode=require"
```

1. Get a list of your existing continuous aggregate definitions:

```sql
SELECT view_name, view_definition FROM timescaledb_information.continuous_aggregates;
```

This query returns the names and definitions for all your continuous
aggregates. For example:

```sql
view_name | view_definition
----------------+--------------------------------------------------------------------------------------------------------
avg_fill_levels | SELECT round(avg(fill_measurements.fill_level), 2) AS avg_fill_level, +
| time_bucket('01:00:00'::interval, fill_measurements."time") AS bucket, +
| fill_measurements.sensor_id +
| FROM fill_measurements +
| GROUP BY (time_bucket('01:00:00'::interval, fill_measurements."time")), fill_measurements.sensor_id;
(1 row)
```

1. Connect to your Timescale database:

```bash
psql "postgres://tsdbadmin:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require"
```

1. Recreate each continuous aggregate definition:

```sql
CREATE MATERIALIZED VIEW <VIEW_NAME>
WITH (timescaledb.continuous) AS
<VIEW_DEFINITION>
```
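
   For example, using the `avg_fill_levels` definition returned by the query above, the recreated continuous aggregate looks like this sketch; substitute your own view names and definitions:

   ```sql
   -- Recreate the continuous aggregate from the definition dumped from the source database
   CREATE MATERIALIZED VIEW avg_fill_levels
   WITH (timescaledb.continuous) AS
   SELECT round(avg(fill_measurements.fill_level), 2) AS avg_fill_level,
       time_bucket('01:00:00'::interval, fill_measurements."time") AS bucket,
       fill_measurements.sensor_id
   FROM fill_measurements
   GROUP BY (time_bucket('01:00:00'::interval, fill_measurements."time")), fill_measurements.sensor_id;
   ```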

</Procedure>

## Recreate policies

By default, policies aren't migrated when you transfer your schema and data
separately. Recreate them on your Timescale database.

<Procedure>

### Recreating policies

1. Connect to your source database:

```bash
psql "postgres://<SOURCE_DB_USERNAME>:<SOURCE_DB_PASSWORD>@<SOURCE_DB_HOST>:<SOURCE_DB_PORT>/<SOURCE_DB_NAME>?sslmode=require"
```

1. Get a list of your existing policies. This query returns a list of all your
policies, including continuous aggregate refresh policies, retention
policies, compression policies, and reorder policies:

```sql
SELECT application_name, schedule_interval, retry_period,
config, hypertable_name
FROM timescaledb_information.jobs WHERE owner = '<SOURCE_DB_USERNAME>';
```

1. Connect to your Timescale database:

```bash
psql "postgres://tsdbadmin:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require"
```

1. Recreate each policy. For more information about recreating policies, see
the sections on [continuous-aggregate refresh policies][cagg-policy],
[retention policies][retention-policy], [compression
policies][compression-policy], and [reorder policies][reorder-policy].
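
   As a minimal sketch, assuming the `fill_measurements` hypertable and the `avg_fill_levels` continuous aggregate from the earlier example, with illustrative intervals and an assumed default index name, the policies can be recreated with the corresponding `add_*` functions:

   ```sql
   -- Refresh policy for the recreated continuous aggregate
   SELECT add_continuous_aggregate_policy('avg_fill_levels',
       start_offset => INTERVAL '1 month',
       end_offset => INTERVAL '1 hour',
       schedule_interval => INTERVAL '1 hour');

   -- Retention policy: drop chunks older than six months
   SELECT add_retention_policy('fill_measurements', INTERVAL '6 months');

   -- Compression policy: compress chunks older than seven days
   -- (compression must already be enabled on the hypertable)
   SELECT add_compression_policy('fill_measurements', INTERVAL '7 days');

   -- Reorder policy, using an existing index on the hypertable (name assumed here)
   SELECT add_reorder_policy('fill_measurements', 'fill_measurements_time_idx');
   ```

   Match the intervals and other settings to the `config` values returned by the query in the previous step, so the recreated policies behave as they did on your source database.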

</Procedure>

## Update table statistics

Update your table statistics by running [`ANALYZE`][analyze] on your entire
dataset. Note that this might take some time depending on the size of your
database:

```sql
ANALYZE;
```

### Troubleshooting

If you see errors of the following form when you run `ANALYZE`, you can safely
ignore them:

```
WARNING: skipping "<TABLE OR INDEX>" --- only superuser can analyze it
```

The skipped tables and indexes correspond to system catalogs that can't be
accessed. Skipping them does not affect statistics on your data.

[analyze]: https://www.postgresql.org/docs/10/sql-analyze.html
[cagg-policy]: /use-timescale/:currentVersion:/continuous-aggregates/refresh-policies/
[compression-policy]: /use-timescale/:currentVersion:/compression/
[retention-policy]: /use-timescale/:currentVersion:/data-retention/create-a-retention-policy/
[reorder-policy]: /api/:currentVersion:/hypertable/add_reorder_policy/

Check failure on line 165 in _partials/_migrate_post_schema_caggs_etal.md

View workflow job for this annotation

GitHub Actions / prose

[vale] reported by reviewdog 🐶 [Vale.Terms] Use 'API' instead of 'api'. Raw Output: {"message": "[Vale.Terms] Use 'API' instead of 'api'.", "location": {"path": "_partials/_migrate_post_schema_caggs_etal.md", "range": {"start": {"line": 165, "column": 20}}}, "severity": "ERROR"}
[timescaledb-parallel-copy]: https://github.com/timescale/timescaledb-parallel-copy

30 changes: 30 additions & 0 deletions _partials/_migrate_using_parallel_copy.md
@@ -0,0 +1,30 @@
<Procedure>

### Restoring data into Timescale with timescaledb-parallel-copy

1. At the command prompt, install `timescaledb-parallel-copy`:

```bash
go install github.com/timescale/timescaledb-parallel-copy/cmd/timescaledb-parallel-copy@latest
```

1. Use `timescaledb-parallel-copy` to import data into
your Timescale database. Set `<NUM_WORKERS>` to twice the number of CPUs in your
database. For example, if you have 4 CPUs, `<NUM_WORKERS>` should be `8`.

```bash
timescaledb-parallel-copy \
--connection "host=<HOST> \
user=tsdbadmin password=<PASSWORD> \
port=<PORT> \
sslmode=require" \
--db-name tsdb \
--table <TABLE_NAME> \
--file <FILE_NAME>.csv \
--workers <NUM_WORKERS> \
--reporting-period 30s
```

Repeat for each table and hypertable you want to migrate.

</Procedure>
26 changes: 26 additions & 0 deletions _partials/_migrate_using_postgres_copy.md
@@ -0,0 +1,26 @@
<Procedure>

### Restoring data into Timescale with COPY

1. Connect to your Timescale database:

```bash
psql "postgres://tsdbadmin:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require"
```

1. Restore the data to your Timescale database:

```sql
\copy <TABLE_NAME> FROM '<TABLE_NAME>.csv' WITH (FORMAT CSV);
```

Repeat for each table and hypertable you want to migrate.

</Procedure>

## Migrate schema post-data

When you have migrated your table and hypertable data, migrate your PostgreSQL
schema post-data. This includes information about constraints.

<Procedure>