Commit 42cab5f

ZetaSQL Team authored and matthewcbrown committed
Export of internal ZetaSQL changes.
Add support for read-modify-write of java ResolvedAST (see RewritingVisitorTest.java for examples)
Add support for TO_JSON_STRING.

-- Change by ZetaSQL Team <[email protected]>: Pass correct language options to Validator in PreparedQuery api when validating a resolved tree passed to the PreparedQuery constructor.
-- Change by ZetaSQL Team <[email protected]>: Remove in_development tag from FEATURE_BIGNUMERIC_TYPE.
-- Change by ZetaSQL Team <[email protected]>: Increase the number of operations per benchmark loop in value_test
-- Change by ZetaSQL Team <[email protected]>: Update zetasql documentation to include BIGNUMERIC support.
-- Change by ZetaSQL Team <[email protected]>: Remove unused lexer rules.
-- Change by ZetaSQL Team <[email protected]>: Analyzer changes for QUALIFY clause
-- Change by ZetaSQL Team <[email protected]>: Fix DEFAULT column syntax error messages.
-- Change by ZetaSQL Team <[email protected]>: Make a clarification about the promotion of NULLs in GREATEST/LEAST
-- Change by ZetaSQL Team <[email protected]>: Fix bug with MODIFY_MAP rewriter - NULL first arg should be NULL output
-- Change by ZetaSQL Team <[email protected]>: Fix bug in handling array offsets with FLATTEN.
-- Change by ZetaSQL Team <[email protected]>: execute_query: Add JSON and Textproto output modes
-- Change by Matthew Brown <[email protected]>: Fix name and contract for DescriptorPool.getAllFileDescriptors to be ordered
-- Change by ZetaSQL Team <[email protected]>: remove expected error for contains_key
-- Change by ZetaSQL Team <[email protected]>: Undeprecate presence tests in proto3.
-- Change by ZetaSQL Team <[email protected]>: INTERNAL
-- Change by ZetaSQL Team <[email protected]>: execute_query: Move "ExecuteQuery" overload into test code
-- Change by ZetaSQL Team <[email protected]>: execute_query: Retrieve examination callback only once
-- Change by ZetaSQL Team <[email protected]>: execute_query: Add writers for Textproto and JSON formats
-- Change by Matthew Brown <[email protected]>: local_service: Use a more explicit for serializing DescriptorPools
-- Change by ZetaSQL Team <[email protected]>: Support parameterized types in MERGE INSERT
-- Change by ZetaSQL Team <[email protected]>: Add utilities to generate protobuf messages from iterator
-- Change by ZetaSQL Team <[email protected]>: Fix resolver logic for PIVOT to handle the case where columns in the pivot input have multiple aliases, but only some of them are referenced.
-- Change by ZetaSQL Team <[email protected]>: Remove deprecated AddCastOrConvertLiteral function
-- Change by Matthew Brown <[email protected]>: Remove Serializable from DescriptorPool.sub interfaces
-- Change by ZetaSQL Team <[email protected]>: Add explicit CAST node to struct fields with type parameters
-- Change by ZetaSQL Team <[email protected]>: Change the visibility of cast_date_time.
-- Change by ZetaSQL Team <[email protected]>: Use separate control flow nodes for starting a FOR loop and advancing a FOR loop.
-- Change by ZetaSQL Team <[email protected]>: Fix the AST locations for TVF arguments. If it refers to an expression, just pass the location of the expression to the downstream. Otherwise it would prevent literal casting for struct fields.
-- Change by ZetaSQL Team <[email protected]>: Use null handling behavior enum in reference impl instead of hard-coded list of functions that respect nulls.
-- Change by ZetaSQL Team <[email protected]>: Modify TypeParameters:MatchType()
-- Change by ZetaSQL Team <[email protected]>: Automatically FLATTEN for IN UNNEST.
-- Change by ZetaSQL Team <[email protected]>: Native PIVOT implementation in reference impl, which doesn't have limitations on types of aggregate functions supported imposed by the rewriter.
-- Change by ZetaSQL Team <[email protected]>: Split long logging statements to multiple lines.
-- Change by ZetaSQL Team <[email protected]>: Add flag to analyze_query to control rewrites.
-- Change by ZetaSQL Team <[email protected]>: RQG for ARRAY_FILTER and ARRAY_TRANSFORM.
-- Change by ZetaSQL Team <[email protected]>: execute_query: Add option to gracefully handle statement prompt errors
-- Change by ZetaSQL Team <[email protected]>: Add string escaping in test_function.cc.

(And 84 more changes)

GitOrigin-RevId: de685b7ab6095676abb23b49ff0dfc3e51eb09d5
Change-Id: Ib21e0f71bdb8ee07c8d65d122f8663b321e60078
1 parent 697da73 commit 42cab5f

File tree

540 files changed: +121788 -15999 lines changed

bazel/bison.bzl

+3 -1
@@ -39,9 +39,10 @@ def _genyacc_impl(ctx):
         executable = ctx.executable._bison,
         env = {
             "M4": ctx.executable._m4.path,
+            "BISON_PKGDATADIR": ctx.files._bison_data[0].dirname,
         },
         arguments = [args],
-        inputs = ctx.files.src,
+        inputs = ctx.files._bison_data + ctx.files.src,
         tools = [ctx.executable._m4],
         outputs = outputs,
         mnemonic = "Yacc",
@@ -78,6 +79,7 @@ genyacc = rule(
             doc = "A list of extra options to pass to Bison. These are " +
                   "subject to $(location ...) expansion.",
         ),
+        "_bison_data": attr.label(default = "@bison//:bison_runtime_data"),
         "_bison": attr.label(
             default = Label("//bazel:bison_bin"),
             executable = True,
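
The `BISON_PKGDATADIR` value in this change is the directory of the first file in `_bison_data`. The following is a minimal Python sketch of that computation, not the rule's implementation. It assumes the file list arrives sorted, and the file paths and the `pkgdatadir` helper are hypothetical, since the contents of the Bison archive are not shown in this diff.

```python
import posixpath


def pkgdatadir(data_files):
    """Mimic `ctx.files._bison_data[0].dirname`: directory of the first file."""
    if not data_files:
        raise ValueError("expected at least one Bison data file")
    return posixpath.dirname(data_files[0])


# Hypothetical paths; only the shape (one top-level file plus nested files) matters.
files = sorted([
    "external/bison/data/skeletons/yacc.c",
    "external/bison/data/README",
    "external/bison/data/m4sugar/m4sugar.m4",
])

print(pkgdatadir(files))  # external/bison/data

# If no file sits directly under data/, the first entry is nested and the
# result is a subdirectory rather than data/ itself.
nested_only = sorted(f for f in files if not f.endswith("README"))
print(pkgdatadir(nested_only))  # external/bison/data/m4sugar
```

The second call shows why taking the first entry is order-sensitive: the result is only the `data` directory when a file directly inside it sorts first.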

bazel/zetasql_deps_step_2.bzl

+23 -11
@@ -106,10 +106,10 @@ cc_proto_library(
         #
         http_archive(
             name = "com_google_absl",
-            # Commit from 2020-07-01
-            url = "https://github.com/abseil/abseil-cpp/archive/81f34df8347a73c617f244f49cb916238857dc34.tar.gz",
-            sha256 = "89b1c570dd59cebf5127ff96b9b46ae8a7fe352cab4f9198e20dc7749ab8aa16",
-            strip_prefix = "abseil-cpp-81f34df8347a73c617f244f49cb916238857dc34",
+            # Commit from 2021-02-23
+            url = "https://github.com/abseil/abseil-cpp/archive/a50ae369a30f99f79d7559002aba3413dac1bd48.tar.gz",
+            sha256 = "be2a9d7ea7ee15f9317b57beff37e8ffb67418fb0df64592366b04c8618c2584",
+            strip_prefix = "abseil-cpp-a50ae369a30f99f79d7559002aba3413dac1bd48",
         )

     # Abseil (Python)
@@ -259,9 +259,9 @@ cc_proto_library(
         http_archive(
             name = "com_google_file_based_test_driver",
             # Commit from 2020-11-24
-            url = "https://github.com/google/file-based-test-driver/archive/5074f48f03c6a892edafab55410addc43f4a0546.tar.gz",
-            sha256 = "955cdee45433dd608bfde47d4d1dd6f47decf739a4c54cf4eecc11896dcbb374",
-            strip_prefix = "file-based-test-driver-5074f48f03c6a892edafab55410addc43f4a0546",
+            url = "https://github.com/google/file-based-test-driver/archive/77e24638ad40ec67dcbf6e37fd57e20c5d98976e.tar.gz",
+            sha256 = "fdb5d0138cc013b8b8d21b0d1827a1296621f1bfa599ef889a69eeed73a6f24b",
+            strip_prefix = "file-based-test-driver-77e24638ad40ec67dcbf6e37fd57e20c5d98976e",
         )

     # gRPC
@@ -467,10 +467,10 @@ cc_proto_library(
     if not native.existing_rule("junit_junit"):
         jvm_maven_import_external(
             name = "junit_junit",
-            artifact = "junit:junit:4.12",
-            tags = ["maven_coordinates=junit:junit:4.12"],
+            artifact = "junit:junit:4.13",
+            tags = ["maven_coordinates=junit:junit:4.13"],
             server_urls = ["https://repo1.maven.org/maven2"],
-            artifact_sha256 = "59721f0805e223d84b90677887d9ff567dc534d7c502ca903c0c2b17f05c116a",
+            artifact_sha256 = "4b8532f63bdc0e0661507f947eb324a954d1dbac631ad19c8aa9a00feed1d863",
             licenses = ["notice"], # EPL 1.0
         )

@@ -857,10 +857,22 @@ alias(
     ##########################################################################

     all_content = """filegroup(name = "all", srcs = glob(["**"]), visibility = ["//visibility:public"])"""
+    bison_build_file_content = all_content + """
+filegroup(
+  name = "bison_runtime_data",
+  srcs = glob(["data/**/*"]),
+  output_licenses = ["unencumbered"],
+  path = "data",
+  visibility = ["//visibility:public"],
+
+)
+exports_files(["data"])
+
+"""

     http_archive(
         name = "bison",
-        build_file_content = bison_build_file_content_placeholder_removed,
+        build_file_content = bison_build_file_content,
         strip_prefix = "bison-3.6.2",
         sha256 = "e28ed3aad934de2d1df68be209ac0b454f7b6d3c3d6d01126e5cd2cbadba089a",
         urls = [
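
Each updated `http_archive` entry changes `url` and `strip_prefix` together, and the two must name the same commit. The following is a small Python sketch of a consistency check. The helper `expected_strip_prefix` is hypothetical, and the `<repo>-<commit>` naming rule is an assumption inferred from the entries in this diff. It does not download anything or verify `sha256`.

```python
from urllib.parse import urlparse


def expected_strip_prefix(url):
    """Derive `<repo>-<commit>` from a GitHub commit-archive URL."""
    # Path is expected to look like /<owner>/<repo>/archive/<commit>.tar.gz
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 4 or parts[2] != "archive" or not parts[3].endswith(".tar.gz"):
        raise ValueError("not a GitHub commit-archive URL: " + url)
    repo = parts[1]
    commit = parts[3][: -len(".tar.gz")]
    return repo + "-" + commit


# Values copied from the updated Abseil entry in this diff.
url = "https://github.com/abseil/abseil-cpp/archive/a50ae369a30f99f79d7559002aba3413dac1bd48.tar.gz"
strip_prefix = "abseil-cpp-a50ae369a30f99f79d7559002aba3413dac1bd48"

assert expected_strip_prefix(url) == strip_prefix
print("url and strip_prefix agree")
```

A check like this catches the common mistake of bumping the commit in `url` while leaving the old hash in `strip_prefix`.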

docs/aggregate_functions.md

+4 -4
@@ -391,11 +391,11 @@ The clauses are applied *in the following order*:

 <thead>
 <tr>
-<th>INPUT</th><th>INT32</th><th>INT64</th><th>UINT32</th><th>UINT64</th><th>NUMERIC</th><th>FLOAT</th><th>DOUBLE</th>
+<th>INPUT</th><th>INT32</th><th>INT64</th><th>UINT32</th><th>UINT64</th><th>NUMERIC</th><th>BIGNUMERIC</th><th>FLOAT</th><th>DOUBLE</th>
 </tr>
 </thead>
 <tbody>
-<tr><th>OUTPUT</th><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">NUMERIC</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td></tr>
+<tr><th>OUTPUT</th><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">NUMERIC</td><td style="vertical-align:middle">BIGNUMERIC</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td></tr>
 </tbody>

 </table>
@@ -1156,11 +1156,11 @@ The clauses are applied *in the following order*:

 <thead>
 <tr>
-<th>INPUT</th><th>INT32</th><th>INT64</th><th>UINT32</th><th>UINT64</th><th>NUMERIC</th><th>FLOAT</th><th>DOUBLE</th>
+<th>INPUT</th><th>INT32</th><th>INT64</th><th>UINT32</th><th>UINT64</th><th>NUMERIC</th><th>BIGNUMERIC</th><th>FLOAT</th><th>DOUBLE</th>
 </tr>
 </thead>
 <tbody>
-<tr><th>OUTPUT</th><td style="vertical-align:middle">INT64</td><td style="vertical-align:middle">INT64</td><td style="vertical-align:middle">UINT64</td><td style="vertical-align:middle">UINT64</td><td style="vertical-align:middle">NUMERIC</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td></tr>
+<tr><th>OUTPUT</th><td style="vertical-align:middle">INT64</td><td style="vertical-align:middle">INT64</td><td style="vertical-align:middle">UINT64</td><td style="vertical-align:middle">UINT64</td><td style="vertical-align:middle">NUMERIC</td><td style="vertical-align:middle">BIGNUMERIC</td><td style="vertical-align:middle">DOUBLE</td><td style="vertical-align:middle">DOUBLE</td></tr>
 </tbody>

 </table>

docs/approximate_aggregate_functions.md

+2 -0
@@ -258,6 +258,8 @@ If the `weight` input is negative or `NaN`, this function returns an error.

 <li>NUMERIC</li>

+<li>BIGNUMERIC</li>
+
 <li>DOUBLE</li>
 </ul>

docs/array_functions.md

+124 -19
@@ -213,28 +213,51 @@ FROM items;

 ```sql
 FLATTEN(flatten_path)
+
+flatten_path:
+  {
+    array_expression
+    | flatten_path.field
+    | flatten_path.array_field
+    | flatten_path.array_field[{offset_clause | safe_offset_clause}]
+  }
 ```

 **Description**

-Extracts a collection of values that have the same semantic meaning from a
-tree-shaped value and returns an array. `flatten_path` is a path that can select
-many values out of the tree-shaped value and return them as an array. For
-example, `FLATTEN(table.column.array_field.target)` will return an array of all
-`targets` inside `table.column`. Tree-shaped data is represented in the
-ZetaSQL type system by composing these typed values:
-
-+ STRUCT
-+ ARRAY
-+ PROTO (Protobuf message types and enum types)
-
-The resulting array may contain `NULL` array elements, but only if
-evaluating the path along some array elements returns `NULL`. Returns `NULL`
-if `flatten_path` is `NULL`.
-
-You can learn more about flattening tree structured data into arrays and
+Nested data can be flattened into a single, flat array with the `FLATTEN`
+operator. The `FLATTEN` operator accepts a unique type of path called the
+_flatten path_. The flatten path lets you traverse through the levels of a
+nested array from left to right. For example,
+`FLATTEN(column.array_field.target)` will return an array of all
+`targets` inside `column`. The flatten path can include:
+
++ `array_expression`: Expression that evaluates to a single, flat array.
++ `flatten_path.field`: A concatenation of `element.field` for all elements of
+  `FLATTEN(flatten_path)`. `field` represents a non-array field.
++ `flatten_path.array_field`: A concatenation of elements of
+  `element.array_field` for all elements of `FLATTEN(flatten_path)`.
+  `array_field` represents an array field.
++ `[{offset_clause | safe_offset_clause}]`: If the optional
+  [`OFFSET`][offset-clause] or [`SAFE_OFFSET`][safe-offset-clause] is present,
+  for each array_field value, `FLATTEN` includes only the array element at
+  the selected offset, rather than all elements.
+
+`FLATTEN` can return `NULL` if following the flatten path encounters a
+`NULL` before it encounters an array. Once a non-null array is encountered,
+`FLATTEN` can never return `NULL` and will always return an array.
+
+`NULL`s in arrays are added to the resulting array.
+
+Tip: Nested data is common in protocol buffers that have data within repeated
+messages.
+
+Tip: The `FLATTEN` operator is implicit inside the `UNNEST` operator and
+`UNNEST(flatten_path)` is equivalent to `UNNEST(FLATTEN(flatten_path))`.
+
+You can learn more about flattening nested data into arrays and
 the flatten path in
-[Flattening tree-structured data into arrays][flatten-tree-to-array].
+[Flattening nested data into an array][flatten-tree-to-array].

 **Return type**
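
The flatten-path rules described in this hunk can be modeled in a few lines of Python. This is an illustrative sketch, not ZetaSQL's implementation: STRUCTs are modeled as dicts, ARRAYs as lists, and `NULL` as `None`. The helpers `flatten` and `table` are hypothetical names, and how a `NULL` array field in the middle of a path behaves is an assumption, since the text does not specify it. The input data is copied from the SQL examples this commit adds to the same file.

```python
def flatten(root, path):
    """Model FLATTEN(root.step1.step2...).

    Each step is a field name, or a tuple (field, offset, safe) that keeps
    only the element at a zero-based offset of an array field.
    """
    if root is None:
        return None  # a NULL met before any array yields NULL
    elements = list(root)  # array_expression: a single, flat array
    for step in path:
        field, offset, safe = step if isinstance(step, tuple) else (step, None, False)
        result = []
        for element in elements:
            value = None if element is None else element[field]
            if not isinstance(value, list):
                # Non-array field (or NULL, by assumption): one value per element.
                result.append(value)
            elif offset is None:
                result.extend(value)  # array field: concatenate all elements
            elif 0 <= offset < len(value):
                result.append(value[offset])
            elif safe:
                result.append(None)  # SAFE_OFFSET: missing value becomes NULL
            else:
                raise IndexError("Array index is out of bounds.")
        elements = result
    return elements


def table(*groups):
    """Build `v` as in the examples: an array of structs with a `sales` array."""
    return [{"sales": [{"quantity": q} for q in group]} for group in groups]


v = table([[1, 2, 3], [4, 5, 6]], [[7, 8], []])
print(flatten(v, ["sales", "quantity"]))  # [1, 2, 3, 4, 5, 6, 7, 8]

v = table([[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]])
print(flatten(v, ["sales", ("quantity", 1, False)]))  # [2, 5, 8, 11]

v = table([[1, 2, 3], [4, 5, 6]], [[7], [10, 11, 12]])
print(flatten(v, ["sales", ("quantity", 2, True)]))  # [3, 6, None, 12]
```

Because offsets are zero-based, the `[3, 6, NULL, 12]` result shown for the `SAFE_OFFSET` example corresponds to an offset of 2 (the third element). An offset of 1 selects the second element and would give `[2, 5, NULL, 11]` for the same data.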

@@ -313,6 +336,86 @@ SELECT FLATTEN(
 +------------------+
 ```

+In this example, all of the arrays for `v.sales.quantity` are concatenated in
+a flattened array.
+
+```sql
+WITH t AS (
+  SELECT
+  [
+    STRUCT([STRUCT([1,2,3] AS quantity), STRUCT([4,5,6] AS quantity)] AS sales),
+    STRUCT([STRUCT([7,8] AS quantity), STRUCT([] AS quantity)] AS sales)
+  ] AS v
+)
+SELECT FLATTEN(v.sales.quantity) AS all_values
+FROM t;
+
++--------------------------+
+| all_values               |
++--------------------------+
+| [1, 2, 3, 4, 5, 6, 7, 8] |
++--------------------------+
+```
+
+In this example, `OFFSET` gets the second value in each array and
+concatenates them.
+
+```sql
+WITH t AS (
+  SELECT
+  [
+    STRUCT([STRUCT([1,2,3] AS quantity), STRUCT([4,5,6] AS quantity)] AS sales),
+    STRUCT([STRUCT([7,8,9] AS quantity), STRUCT([10,11,12] AS quantity)] AS sales)
+  ] AS v
+)
+SELECT FLATTEN(v.sales.quantity[OFFSET(1)]) AS second_values
+FROM t;
+
++---------------+
+| second_values |
++---------------+
+| [2, 5, 8, 11] |
++---------------+
+```
+
+If you use `OFFSET` with `FLATTEN` and a value is missing from an array,
+an error is returned.
+
+```sql
+WITH t AS (
+  SELECT
+  [
+    STRUCT([STRUCT([1,2,3] AS quantity), STRUCT([4,5,6] AS quantity)] AS sales),
+    STRUCT([STRUCT([7,8,9] AS quantity), STRUCT([10] AS quantity)] AS sales)
+  ] AS v
+)
+SELECT FLATTEN(v.sales.quantity[OFFSET(1)]) AS second_values
+FROM t;
+
+-- ERROR: Array index is out of bounds.
+```
+
+In this example, `SAFE_OFFSET` gets the third value in each array and
+concatenates them. If a value is missing, a `NULL` is returned for the value.
+
+```sql
+WITH t AS (
+  SELECT
+  [
+    STRUCT([STRUCT([1,2,3] AS quantity), STRUCT([4,5,6] AS quantity)] AS sales),
+    STRUCT([STRUCT([7] AS quantity), STRUCT([10,11,12] AS quantity)] AS sales)
+  ] AS v
+)
+SELECT FLATTEN(v.sales.quantity[SAFE_OFFSET(1)]) AS third_values
+FROM t;
+
++------------------+
+| third_values     |
++------------------+
+| [3, 6, NULL, 12] |
++------------------+
+```
+
 The resulting array may contain `NULL` `ARRAY` elements, but only if
 evaluating the flatten path along some array elements returns `NULL`.
 For example:
@@ -358,7 +461,7 @@ parameters determine the inclusive start and end of the array.
 The `GENERATE_ARRAY` function accepts the following data types as inputs:

 <ul>
-<li>INT64</li><li>UINT64</li><li>NUMERIC</li><li>DOUBLE</li>
+<li>INT64</li><li>UINT64</li><li>NUMERIC</li><li>BIGNUMERIC</li><li>DOUBLE</li>
 </ul>

 The `step_expression` parameter determines the increment used to
@@ -918,11 +1021,13 @@ FROM items;
 +----------------------------------+---------------+----------------+
 ```

+[offset-clause]: #offset_and_ordinal
+[safe-offset-clause]: #safe_offset_and_safe_ordinal
 [subqueries]: https://github.com/google/zetasql/blob/master/docs/query-syntax.md#subqueries
 [datamodel-sql-tables]: https://github.com/google/zetasql/blob/master/docs/data-model.md#standard-sql-tables
 [datamodel-value-tables]: https://github.com/google/zetasql/blob/master/docs/data-model.md#value-tables
 [array-data-type]: https://github.com/google/zetasql/blob/master/docs/data-types.md#array_type
-[flatten-tree-to-array]: https://github.com/google/zetasql/blob/master/docs/arrays.md#flattening_trees_into_arrays
+[flatten-tree-to-array]:https://github.com/google/zetasql/blob/master/docs/arrays.md#flattening_nested_data_into_arrays

 [array-link-to-operators]: https://github.com/google/zetasql/blob/master/docs/operators.md
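
The `GENERATE_ARRAY` context lines in this file's diff say that the start and end parameters are inclusive and that `step_expression` sets the increment. The following is a minimal Python sketch of that behavior for integer inputs. The helper name `generate_array`, the default step of 1, the rejection of a zero step, and the handling of negative steps are assumptions not stated in the lines shown.

```python
def generate_array(start, end, step=1):
    """Return start, start + step, ... without passing `end` (inclusive)."""
    if step == 0:
        # Assumption: a zero step would never terminate, so reject it.
        raise ValueError("step must not be zero")
    values = []
    current = start
    # Walk toward `end` in the direction of `step`, keeping `end` if reached.
    while (step > 0 and current <= end) or (step < 0 and current >= end):
        values.append(current)
        current += step
    return values


print(generate_array(1, 5))      # [1, 2, 3, 4, 5]
print(generate_array(0, 10, 3))  # [0, 3, 6, 9]
print(generate_array(5, 1, -2))  # [5, 3, 1]
```

When the step points away from `end`, as in `generate_array(5, 1)`, the loop never runs and the sketch returns an empty list.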
