chore: refactor `dbListTables()` et al. #413

dpprdan · 2022-11-29T18:59:20Z

I would like to take another stab at #251, and this is some preparatory work for that.

Background

Materialized views are currently not returned by dbListTables() et al. (#251). The reason is that materialized views are not included in INFORMATION_SCHEMA.tables (see this thread for why; tl;dr: "They are not defined by the SQL standard.").

The information schema consists of a set of views that contain information about the objects defined in the current database.
The information schema is defined in the SQL standard and can therefore be expected to be portable and remain stable — unlike the system catalogs, which are specific to PostgreSQL and are modeled after implementation concerns.
The information schema views do not, however, contain information about PostgreSQL-specific features; to inquire about those you need to query the system catalogs or other PostgreSQL-specific views.
https://www.postgresql.org/docs/current/information-schema.html (emphasis mine).

Side note: most, if not all, tables in information_schema are just views on the system catalogs; see, e.g.:

SELECT definition FROM pg_views 
WHERE schemaname = 'information_schema' AND viewname = 'tables';

Since materialized views are a PostgreSQL-specific feature, we need to query the system catalogs. In particular, we need to use pg_class/pg_namespace instead of INFORMATION_SCHEMA.tables.
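For illustration, a minimal sketch of what such a catalog query could look like. This is not the query from this PR or from #261; the selected columns and the relkind filter are assumptions made for the example.

```sql
-- Sketch only: list table-like relations, including materialized views,
-- straight from the system catalogs.
-- relkind: 'r' = ordinary table, 'v' = view, 'm' = materialized view,
--          'f' = foreign table, 'p' = partitioned table
SELECT n.nspname AS table_schema,
       c.relname AS table_name,
       c.relkind
FROM pg_catalog.pg_class AS c
INNER JOIN pg_catalog.pg_namespace AS n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'v', 'm', 'f', 'p')
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name;
```

Unlike INFORMATION_SCHEMA.tables, this returns rows with relkind = 'm' for materialized views.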

However, in order to retain support for Redshift, we need to keep the INFORMATION_SCHEMA queries as well.

What happens in this PR?

The above affects the queries that underlie the dbListTables(), dbExistsTable(), dbListObjects(), and dbListFields() functions.

dbListTables(), dbExistsTable() and dbListObjects() do essentially the same thing, namely call INFORMATION_SCHEMA.tables.
To make the above-mentioned adjustments easier and keep duplicated code to a minimum, I refactored those functions to use a common core function, list_tables(), which returns the SQL code to query INFORMATION_SCHEMA.tables. The (alternative) queries to pg_class (to be added in another PR) will then live in that function as well.
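As a rough sketch, the SQL that a shared helper like list_tables() generates might have the following shape. The exact text produced by the PR is not shown in this thread, so the WHERE clauses and the literal 'my_table' are assumptions; the ordering follows the commit message.

```sql
-- Sketch only: one base query against the information schema that
-- dbListTables(), dbExistsTable() and dbListObjects() could all build on.
SELECT table_schema, table_name, table_type
FROM information_schema.tables
WHERE table_schema = ANY(current_schemas(true))
  AND table_schema <> 'pg_catalog'
  -- an existence check would add a table filter, for example:
  -- AND table_name = 'my_table'   -- hypothetical table name
ORDER BY table_type, table_name;
```

Keeping the schema and table filters as interchangeable WHERE fragments is what lets the three functions share one query.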

I refactored the find_table() and list_fields() functions that underlie dbListFields() in a similar fashion.

Details

dbListFields(): find the first table on the search path

SELECT column_name FROM (
  SELECT *, rank() OVER (ORDER BY nr) AS rnr FROM (
    SELECT nr, schemas[nr] AS table_schema FROM (
      SELECT *, generate_subscripts(schemas, 1) AS nr FROM (
        SELECT current_schemas(true) AS schemas) t
      ) tt WHERE schemas[nr] <> 'pg_catalog'
    ) ttt 
  INNER JOIN INFORMATION_SCHEMA.columns USING (table_schema) WHERE table_name = [table_name]
) tttt 
WHERE rnr = 1 ORDER BY ordinal_position

can be rewritten with two instead of four nested queries:

SELECT column_name FROM (
  SELECT *, rank() OVER (ORDER BY schema_nr) AS schema_rank FROM (
    SELECT * FROM unnest(current_schemas(true)) WITH ORDINALITY AS "tbl"("table_schema","schema_nr") 
    WHERE "table_schema" != 'pg_catalog') t 
  INNER JOIN INFORMATION_SCHEMA.columns USING (table_schema) 
  WHERE table_name = [table_name]) tt 
WHERE schema_rank = 1 
ORDER BY ordinal_position
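To see what the inner subquery contributes, here is a standalone sketch that uses a literal array in place of current_schemas(true). The schema names are made up for the example, and WITH ORDINALITY requires PostgreSQL 9.4 or later.

```sql
-- Sketch only: number the search-path entries, then drop pg_catalog.
SELECT table_schema, schema_nr
FROM unnest(ARRAY['pg_temp_3', 'pg_catalog', 'public'])
  WITH ORDINALITY AS tbl(table_schema, schema_nr)
WHERE table_schema <> 'pg_catalog'
ORDER BY schema_nr;
-- table_schema | schema_nr
-- pg_temp_3    |         1
-- public       |         3
```

Because filtering and joining can leave gaps in schema_nr, the outer query ranks the remaining rows and keeps schema_rank = 1, that is, the first schema on the search path that actually contains the table.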

Essentially, I applied query code similar to that in #261, but querying information_schema instead of the system catalogs.
I will open another PR to incorporate the queries to the system catalogs.

It is probably easiest to review this by going through the individual commits.

I do not have access to Redshift, so I was not able to test this PR there. There might be small adjustments necessary (see below). Is it somehow possible for me to test on Redshift (e.g. via GHA)? Or how does this work in general?

Related issues

#251
#261

This does not address #388. I think it is best to address this after #251 is resolved.

#390 is also related, but relatively trivial to apply before or after this.

dpprdan · 2022-11-29T19:05:02Z

R/tables.R

      " SELECT nr, current_schemas[nr] AS table_schema FROM tt WHERE current_schemas[nr] <> 'pg_catalog'",
      ") ttt"
    )
    only_first <- FALSE


This override was already present in find_table() (but find_table() was generally called with only_first = TRUE by list_fields()).

I think this is a remnant of the first two commits in #326, where only_first indeed isn't necessary, because that used current_schema() (not current_schemas()).

It should be only_first = TRUE now, though, because there can be multiple schemas on the search path on Redshift as well (I think).

krlmlr · 2023-03-16T05:22:30Z

Conflicts here, too.

aviator-app · 2023-11-09T11:26:13Z

Current Aviator status


This PR was merged using Aviator.


krlmlr · 2024-04-01T13:58:22Z

Thanks!

Original: 7730f13
refactor `dbListTables()` with `list_tables()`, now orders result by `table_type` and `table_name`
refactor `dbExistsTable()` with `list_tables()`
refactor `dbListObjects()` with `list_tables()`
merge `find_table()` code into `list_fields()`
`find_table()` isn't used anywhere else anymore (e.g. `exists_table()`)
simplify the "get current_schemas() as table" code
pass full `id` to `list_fields()`
align `dbExistsTable()` with `dbListFields()`
simplify `where_schema` in `list_tables()`
align `where_table` with `where_schema` in `list_tables()`


dpprdan force-pushed the feat/refactor_list branch from 63490cf to a60fc4e Compare December 14, 2022 16:36

dpprdan mentioned this pull request Dec 15, 2022

feat: dbListTables() lists materialized views #414

Open

dpprdan changed the title ~~refactor dbListTables() et al.~~ chore: refactor dbListTables() et al. Mar 16, 2023

dpprdan mentioned this pull request Jan 23, 2024

unnamed Id() breaks dbQuoteIdentifier() #453

Closed

krlmlr force-pushed the feat/refactor_list branch 3 times, most recently from 607d2df to 356b790 Compare April 1, 2024 13:57

krlmlr added the mergequeue label Apr 1, 2024

aviator-app bot force-pushed the feat/refactor_list branch from 356b790 to f789767 Compare April 1, 2024 14:10

aviator-app bot merged commit ff4b0e8 into r-dbi:main Apr 1, 2024

dpprdan deleted the feat/refactor_list branch April 3, 2024 16:43

github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2025
