@zation99
Contributor

Summary:
If a materialized view is part of a logical view, the logical view's WHERE predicate is not pushed down to the materialized view, so the overlap check is incorrect: the MV's data is compared against ALL base table data instead of only the data specified in the query.

This diff fixes it by storing the WHERE predicate when processing a logical view, so that the MV can also combine the logical view's WHERE predicate when computing MV status. It also fixes the same issue when the logical view or MV is used in a CTE.

Differential Revision: D87928199
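
To make the symptom concrete, here is a minimal sketch of the overlap check described in the summary. It is not Presto code: the class `MvOverlapSketch`, the method `missingFromMv`, and the string-based partition names are all hypothetical stand-ins chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model: partitions are plain strings such as "ds=2025-01-01".
class MvOverlapSketch
{
    // Returns the base-table partitions that match the query filter but are absent from the MV.
    static Set<String> missingFromMv(List<String> basePartitions, Set<String> mvPartitions, Predicate<String> queryFilter)
    {
        Set<String> missing = new LinkedHashSet<>();
        for (String partition : basePartitions) {
            if (queryFilter.test(partition) && !mvPartitions.contains(partition)) {
                missing.add(partition);
            }
        }
        return missing;
    }

    public static void main(String[] args)
    {
        List<String> base = List.of("ds=2025-01-01", "ds=2025-01-02", "ds=2025-01-03");
        Set<String> mv = Set.of("ds=2025-01-01");

        // Predicate not visible to the check: every base partition is compared with the MV.
        System.out.println(missingFromMv(base, mv, partition -> true));
        // Predicate visible to the check: only the partition named in the query is compared.
        System.out.println(missingFromMv(base, mv, partition -> partition.equals("ds=2025-01-01")));
    }
}
```

In this model the unfiltered check reports two missing partitions even though the query only touches a partition the MV already covers, which is the behavior the summary describes.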

@sourcery-ai
Contributor

sourcery-ai bot commented Nov 27, 2025

Reviewer's Guide

Extends the analyzer to track WHERE predicates from logical view accessors and combine them with the current subquery predicate when computing materialized view status, and adds regression tests covering partition-based MV selection through logical views and CTEs.

Sequence diagram for logical view WHERE predicate propagation into materialized view status

sequenceDiagram
    actor User
    participant Planner
    participant StatementAnalyzer
    participant Analysis
    participant MetadataResolver
    participant MaterializedViewStatus

    User->>Planner: submit query referencing logical_view with WHERE
    Planner->>StatementAnalyzer: analyze query

    Note over StatementAnalyzer,Analysis: Processing logical view access
    StatementAnalyzer->>Analysis: getCurrentQuerySpecification()
    Analysis-->>StatementAnalyzer: Optional currentQuerySpecification
    alt currentQuerySpecification present and has WHERE
        StatementAnalyzer->>Analysis: setViewAccessorWhereClause(Optional whereClause)
    end

    StatementAnalyzer->>StatementAnalyzer: parseView and analyzeView(logical_view)

    Note over StatementAnalyzer,Analysis: Inside analysis of view, resolve materialized view
    StatementAnalyzer->>Analysis: getCurrentQuerySpecification()
    Analysis-->>StatementAnalyzer: Optional currentSubquery

    StatementAnalyzer->>Analysis: getViewAccessorWhereClause()
    Analysis-->>StatementAnalyzer: Optional viewAccessorWhereClause

    StatementAnalyzer->>StatementAnalyzer: build wherePredicates list
    StatementAnalyzer->>StatementAnalyzer: combinedWhereClause = ExpressionUtils.combineConjuncts(wherePredicates)
    StatementAnalyzer->>StatementAnalyzer: conjuncts = ExpressionUtils.extractConjuncts(combinedWhereClause)
    StatementAnalyzer->>MetadataResolver: getMaterializedViewDefinition(materializedViewName)
    MetadataResolver-->>StatementAnalyzer: Optional materializedViewDefinition

    StatementAnalyzer->>StatementAnalyzer: filter conjuncts on MV columns
    StatementAnalyzer-->>MaterializedViewStatus: create MaterializedViewStatus

    Note over StatementAnalyzer,Analysis: After view analysis completes
    StatementAnalyzer->>Analysis: clearViewAccessorWhereClause()
    StatementAnalyzer-->>Planner: analysis result for query
    Planner-->>User: planned query using materialized view when valid
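
The sequence above can be approximated in a few lines of plain Java. This is a simplified sketch, not the actual `StatementAnalyzer`: predicates are strings rather than `Expression` nodes, the class and method names are hypothetical, and returning `"TRUE"` for an empty predicate list is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified stand-in for the analyzer flow; predicates are plain strings here.
class ViewPredicateFlowSketch
{
    // Plays the role of the viewAccessorWhereClause field on Analysis in the diagram.
    private Optional<String> viewAccessorWhereClause = Optional.empty();

    // Stores the accessor's WHERE (if any), analyzes the view body, then clears the stored value.
    String analyzeLogicalView(Optional<String> accessorWhere, Optional<String> innerWhere)
    {
        if (accessorWhere.isPresent()) {
            viewAccessorWhereClause = accessorWhere;
        }
        try {
            return resolveMaterializedView(innerWhere);
        }
        finally {
            viewAccessorWhereClause = Optional.empty();
        }
    }

    // Combines the current subquery predicate with the stored accessor predicate.
    private String resolveMaterializedView(Optional<String> currentSubqueryWhere)
    {
        List<String> wherePredicates = new ArrayList<>();
        currentSubqueryWhere.ifPresent(wherePredicates::add);
        viewAccessorWhereClause.ifPresent(wherePredicates::add);
        return wherePredicates.isEmpty() ? "TRUE" : String.join(" AND ", wherePredicates);
    }

    public static void main(String[] args)
    {
        ViewPredicateFlowSketch analyzer = new ViewPredicateFlowSketch();
        System.out.println(analyzer.analyzeLogicalView(Optional.of("ds = '2025-01-01'"), Optional.of("price > 10")));
    }
}
```

The point of the sketch is the ordering: the accessor predicate has to be stored before the view body is analyzed, because the materialized view is only resolved inside that nested analysis.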

Class diagram for updated Analysis and StatementAnalyzer materialized view handling

classDiagram
    class Analysis {
        - Optional~QuerySpecification~ currentQuerySpecification
        - Optional~Expression~ viewAccessorWhereClause
        + void setCurrentSubquery(QuerySpecification currentSubQuery)
        + Optional~QuerySpecification~ getCurrentQuerySpecification()
        + void setViewAccessorWhereClause(Optional~Expression~ whereClause)
        + void clearViewAccessorWhereClause()
        + Optional~Expression~ getViewAccessorWhereClause()
    }

    class StatementAnalyzer {
        - Analysis analysis
        + Scope processView(Table table, Optional~Scope~ scope, QualifiedObjectName name)
        - MaterializedViewStatus getMaterializedViewStatus(QualifiedObjectName materializedViewName, Table table, Optional~Scope~ scope, Session session, MetadataResolver metadataResolver)
    }

    class Expression {
    }

    class QuerySpecification {
        + Optional~Expression~ getWhere()
    }

    class MaterializedViewStatus {
    }

    class ExpressionUtils {
        + static Expression combineConjuncts(List~Expression~ predicates)
        + static List~Expression~ extractConjuncts(Expression predicate)
    }

    class VariablesExtractor {
        + static Set~QualifiedName~ extractNames(Expression expression, Set~Expression~ columnReferences)
    }

    Analysis --> QuerySpecification : uses
    Analysis --> Expression : tracks
    StatementAnalyzer --> Analysis : collaborates
    StatementAnalyzer --> MaterializedViewStatus : computes
    StatementAnalyzer --> ExpressionUtils : uses
    StatementAnalyzer --> VariablesExtractor : uses
    QuerySpecification --> Expression : where

File-Level Changes

Change | Details | Files
Track and expose the WHERE clause of a query that accesses a view so it can be used during materialized view analysis.
  • Introduce a new Optional field to store the accessor query's WHERE clause.
  • Add a setter, a clear method, and a getter to manage this WHERE clause state.
  • Use this stored predicate when analyzing related subqueries such as materialized views.
presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/Analysis.java
Capture the accessor query's WHERE predicate when processing a logical view and clear it afterwards, ensuring it is available during nested analysis like materialized view status computation.
  • When analyzing a view table reference, read the current query specification's WHERE clause (if present) and store it into Analysis before parsing/analyzing the view definition.
  • After finishing view analysis, clear the stored accessor WHERE predicate to avoid leaking state across analyses.
  • Wire this behavior into the existing view-processing path without altering stale-view detection logic.
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Use the combination of accessor WHERE predicate and current subquery predicate for materialized view partition filtering, rather than only the innermost subquery predicate.
  • Aggregate predicates from the current query specification and any stored view-accessor WHERE clause into a list.
  • Combine these predicates into a single conjunctive expression and feed it into the materialized view partition-filtering logic.
  • Preserve existing behavior for extracting conjuncts and checking which ones reference materialized view columns, but now operating on the combined predicate.
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Add regression tests to verify partition-based materialized view usage and fallback through logical views and CTEs.
  • Add a test that queries a logical view over a partitioned materialized view with a partition predicate, asserting that the MV is used for refreshed partitions and base table is used when the partition is missing.
  • Add a test that queries a logical view through a CTE with a partition predicate, ensuring the predicate is pushed down and the MV is used appropriately.
  • Add a test that queries a materialized view directly through a CTE, validating correct partition selection and fallback to the base table when partitions are missing, and verifying query plans via constrainedTableScan expectations.
presto-hive/src/test/java/com/facebook/presto/hive/TestHiveMaterializedViewLogicalPlanner.java
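
The predicate-combination step in the table above can be sketched as follows. This is an illustration under stated assumptions, not the code from `StatementAnalyzer.java`: `Conjunct` is a hypothetical stand-in for an `Expression` that already carries the column names it references, and the rule "keep a conjunct only if all of its columns are MV columns" is an assumption about how the filtering works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class MvConjunctFilterSketch
{
    // Hypothetical stand-in for an Expression: SQL text plus the columns it references.
    static final class Conjunct
    {
        final String sql;
        final Set<String> columns;

        Conjunct(String sql, Set<String> columns)
        {
            this.sql = sql;
            this.columns = columns;
        }
    }

    // Gathers conjuncts from the inner subquery and from the view accessor, then keeps
    // only those whose columns all belong to the materialized view (assumed rule).
    static List<String> mvFilterConjuncts(
            Optional<List<Conjunct>> currentSubqueryWhere,
            Optional<List<Conjunct>> viewAccessorWhere,
            Set<String> mvColumns)
    {
        List<Conjunct> combined = new ArrayList<>();
        currentSubqueryWhere.ifPresent(combined::addAll);
        viewAccessorWhere.ifPresent(combined::addAll);

        List<String> kept = new ArrayList<>();
        for (Conjunct conjunct : combined) {
            if (mvColumns.containsAll(conjunct.columns)) {
                kept.add(conjunct.sql);
            }
        }
        return kept;
    }

    public static void main(String[] args)
    {
        List<Conjunct> inner = List.of(new Conjunct("price > 10", Set.of("price")));
        List<Conjunct> accessor = List.of(new Conjunct("ds = '2025-01-01'", Set.of("ds")));
        // Only the partition predicate references an MV column in this example.
        System.out.println(mvFilterConjuncts(Optional.of(inner), Optional.of(accessor), Set.of("ds")));
    }
}
```

Before the change only the inner subquery's conjuncts would reach the filter, so in this example the `ds` predicate supplied by the accessor would never be considered.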

@zation99 zation99 changed the title from "fix(analyzer): materialized view with logical view and cte" to "fix(analyzer): Materialized view with logical view and cte" Nov 27, 2025
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • When saving and restoring viewAccessorWhereClause in StatementAnalyzer.processView, you currently only clear it after analyzing the view; if view analysis is nested (view within view), this will lose any outer accessor predicate—consider saving the previous value and restoring it rather than unconditionally clearing.
  • The new MV/CTE logical view tests have a lot of duplicated setup (creating similar partitioned tables, MVs, sessions, and assertions); consider extracting common helpers to reduce duplication and make future changes to these scenarios less error-prone.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java:2375-2379` </location>
<code_context>

             analysis.getAccessControlReferences().addViewDefinitionReference(name, view);

+            Optional<Expression> savedViewAccessorWhereClause = Optional.empty();
+            if (analysis.getCurrentQuerySpecification().isPresent()) {
+                QuerySpecification currentQuerySpec = analysis.getCurrentQuerySpecification().get();
+                savedViewAccessorWhereClause = currentQuerySpec.getWhere();
+                if (savedViewAccessorWhereClause.isPresent()) {
+                    analysis.setViewAccessorWhereClause(savedViewAccessorWhereClause);
+                }
</code_context>

<issue_to_address>
**issue (bug_risk):** viewAccessorWhereClause is not restored to its previous value, which can break nested view/materialized-view analysis

Here, `viewAccessorWhereClause` is updated and then cleared based only on the current query’s `WHERE` clause, which breaks proper nesting:

1. If an outer context already set `viewAccessorWhereClause` and this view also has a `WHERE`, the outer value is overwritten and then lost, so outer predicates are dropped for later MV partition filtering.
2. If this view has no `WHERE`, you leave any existing `viewAccessorWhereClause` in place, so unrelated outer predicates are incorrectly applied to this view.

To keep nesting semantics correct, capture the previous value, set the current value (including explicitly clearing it when there is no `WHERE`), and restore the previous value in a `try/finally`:

```java
Optional<Expression> previousAccessorWhere = analysis.getViewAccessorWhereClause();
Optional<Expression> currentAccessorWhere = currentQuerySpec.getWhere();
analysis.setViewAccessorWhereClause(currentAccessorWhere);
try {
    // analyze view
}
finally {
    analysis.setViewAccessorWhereClause(previousAccessorWhere);
}
```
</issue_to_address>
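
The save-and-restore pattern suggested in the comment above can be exercised in isolation. The sketch below is a self-contained model, not Presto code: `NestedAccessorScopeSketch`, `withAccessorWhere`, and `record` are hypothetical names, and predicates are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class NestedAccessorScopeSketch
{
    private Optional<String> viewAccessorWhereClause = Optional.empty();
    final List<String> observed = new ArrayList<>();

    // Sets the accessor predicate for one view level and restores the previous value on exit.
    void withAccessorWhere(Optional<String> currentWhere, Runnable analyzeView)
    {
        Optional<String> previous = viewAccessorWhereClause;
        viewAccessorWhereClause = currentWhere;
        try {
            analyzeView.run();
        }
        finally {
            viewAccessorWhereClause = previous;
        }
    }

    // Records which accessor predicate is visible at this point of the analysis.
    void record(String label)
    {
        observed.add(label + "=" + viewAccessorWhereClause.orElse("<none>"));
    }

    public static void main(String[] args)
    {
        NestedAccessorScopeSketch analysis = new NestedAccessorScopeSketch();
        analysis.withAccessorWhere(Optional.of("ds = '2025-01-01'"), () -> {
            // Inner view accessed without a WHERE: it must not see the outer predicate.
            analysis.withAccessorWhere(Optional.empty(), () -> analysis.record("inner"));
            // Back in the outer view: the outer predicate must still be visible.
            analysis.record("outer-after-inner");
        });
        analysis.record("after");
        System.out.println(analysis.observed);
    }
}
```

With an unconditional clear in place of the restore, the second `record` call would no longer see the outer predicate, which is the first failure mode listed in the comment.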


Member

@hantangwangd hantangwangd left a comment

@zation99 thanks for this fix, overall looks good to me. Just a couple of little nits.

zation99 added a commit to zation99/presto that referenced this pull request Nov 29, 2025
…26713)

Summary:

If a materialized view is part of a logical view, the logical view's WHERE predicate is not pushed down to the materialized view, so the overlap check is incorrect: the MV's data is compared against ALL base table data instead of only the data specified in the query.

This diff fixes it by storing the WHERE predicate when processing a logical view, so that the MV can also combine the logical view's WHERE predicate when computing MV status. It also fixes the same issue when the logical view or MV is used in a CTE.

Differential Revision: D87928199
Member

@hantangwangd hantangwangd left a comment

Thanks for the fix, lgtm!

@zation99 zation99 merged commit 5e86f6c into prestodb:master Nov 30, 2025
82 checks passed