Releases: apollographql/router
v1.61.0
🚀 Features
Query planner dry-run option (PR #6656)
This PR adds a new dry-run option to the Apollo-Expose-Query-Plan header value that emits the query plans back to Studio for visualizations. This new value will only emit the query plan, and abort execution. This can be helpful for tools like rover, where query plan generation is needed but not full runtime, or for potentially prewarming query plan caches out of band.
curl --request POST --include \
--header 'Accept: application/json' \
--header 'Apollo-Expose-Query-Plan: dry-run' \
--url 'http://127.0.0.1:4000/' \
--data '{"query": "{ topProducts { upc name } }"}'
By @aaronArinder and @lennyburdette in #6656.
Enable Remote Proxy Downloads
This enables users without direct download access to specify a remote proxy mirror location for the GitHub download of the Apollo Router releases.
By @LongLiveCHIEF in #6667
🐛 Fixes
Header propagation rules passthrough (PR #6690)
Header propagation contains logic to prevent headers from being propagated more than once. This was broken in #6281, which always considered a header propagated regardless of whether a rule actually matched.
This PR alters the logic so that a header is marked as fixed only when it's populated.
The following will now work again:
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
      - propagate:
          named: b
Note that defaulting a header WILL populate it, so make sure to include your defaults last in your propagation rules.
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
          default: defaulted # This will prevent any further rule evaluation for header `b`
      - propagate:
          named: b
Instead, make sure that your headers are defaulted last:
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
      - propagate:
          named: b
          default: defaulted # OK
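To make the ordering rule concrete, here is a minimal Python sketch of the behavior described above. It is a simplified model, not the router's implementation: the function name `propagate_headers` and the dict-based rule shape are hypothetical, and only `named`, `rename`, and `default` are modeled.

```python
def propagate_headers(incoming, rules):
    """Evaluate propagate rules in order against the incoming request headers.

    Simplified model (an assumption, not router code): a target header is
    "fixed" as soon as a rule populates it, and later rules for the same
    target are skipped.
    """
    outgoing = {}
    for rule in rules:
        target = rule.get("rename", rule["named"])
        if target in outgoing:
            continue  # already populated, so no further rules apply
        value = incoming.get(rule["named"], rule.get("default"))
        if value is not None:
            outgoing[target] = value  # populated from the request or a default
    return outgoing


incoming = {"b": "from-client"}

# Default on the first rule: `b` is populated immediately by the default.
default_first = [
    {"named": "a", "rename": "b", "default": "defaulted"},
    {"named": "b"},
]
# Default on the last rule: the client's `b` still gets a chance to match.
default_last = [
    {"named": "a", "rename": "b"},
    {"named": "b", "default": "defaulted"},
]

print(propagate_headers(incoming, default_first))  # {'b': 'defaulted'}
print(propagate_headers(incoming, default_last))   # {'b': 'from-client'}
```

In this model, a rule that does not match leaves the target untouched, which is the behavior the fix restores.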
By @BrynCooke in #6690
Entity cache: fix directive conflicts in cache-control header (Issue #6441)
Previously, unnecessary directives were created in the cache-control header. The router will now filter out unnecessary values from the cache-control header when the request resolves. So if there's max-age=10, no-cache, must-revalidate, no-store, the expected value for the cache-control header would simply be no-store. Please see the MDN docs for justification of this reasoning: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#preventing_storing
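The example above can be sketched in a few lines of Python. This only models the single case described here (no-store overriding the other directives); the function name `simplify_cache_control` is hypothetical and the router's actual filtering may cover more cases.

```python
def simplify_cache_control(header_value):
    """Collapse a cache-control value to `no-store` when that directive is present.

    Assumption: only the no-store case from the release note is modeled;
    any other value is returned with its directives unchanged.
    """
    directives = [d.strip() for d in header_value.split(",") if d.strip()]
    # Directive names are case-insensitive; strip any `=value` part.
    names = {d.split("=", 1)[0].strip().lower() for d in directives}
    if "no-store" in names:
        return "no-store"
    return ", ".join(directives)


print(simplify_cache_control("max-age=10, no-cache, must-revalidate, no-store"))
# no-store
```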
Query Planning: fix __typename selections in sibling typename optimization
The query planner uses an optimization technique called "sibling typename", which attaches __typename selections to their sibling selections so the planner won't need to plan them separately.
Previously, when there were multiple identical selections and one of them had a __typename attached, the query planner could pick the one without the attachment, effectively losing a __typename selection.
Now, the query planner favors the one with a __typename attached, so the __typename selection is no longer lost.
📃 Configuration
Promote experimental_otlp_tracing_sampler config to stable (PR #6070)
The router's OTLP tracing sampler feature that was previously experimental is now generally available.
If you used its experimental configuration, you should migrate to the new configuration option:
- telemetry.apollo.experimental_otlp_tracing_sampler is now telemetry.apollo.otlp_tracing_sampler
The experimental configuration option is now deprecated. It remains functional but will log warnings.
Promote experimental_local_manifests config for persisted queries to stable
The experimental_local_manifests PQ configuration option is being promoted to stable. This change updates the configuration option name and any references to it, as well as the related documentation. The experimental_-prefixed name remains valid as an alias for existing usages.
By @trevor-scheer in #6564
🛠 Maintenance
Reduce demand control allocations on start/reload (PR #6754)
When demand control is enabled, the router now preallocates capacity for demand control's processed schema and shrinks to fit after processing. When it's disabled, the router skips the type processing entirely to minimize startup impact.
By @tninesling in #6754
v1.61.0-rc.0
1.61.0-rc.0
v2.0.0
This is a major release of the router containing significant new functionality and improvements to behaviour, resulting in more predictable resource utilisation and decreased latency.
Router 2.0.0 introduces general availability of Apollo Connectors, helping integrate REST services in router deployments.
This entry summarizes the overall changes in 2.0.0. To learn more details, go to the What's New in router v2.x page.
To upgrade to this version, follow the upgrading from router 1.x to 2.x guide.
❗ BREAKING CHANGES ❗
In order to make structural improvements in the router and upgrade some of our key dependencies, some breaking changes were introduced in this major release. Most of the breaking changes are in the areas of configuration and observability. All details on what's been removed and changed can be found in the upgrade guide.
🚀 Features
Router 2.0.0 comes with many new features and improvements. While all the details can be found in the What's New guide, the following features are the ones we are most excited about.
Simplified integration of REST services using Apollo Connectors. Apollo Connectors are a declarative programming model for GraphQL, allowing you to plug your existing REST services directly into your graph. Once integrated, client developers gain all the benefits of GraphQL, and API owners gain all the benefits of GraphOS, including incorporation into a supergraph for a comprehensive, unified view of your organization's data and services. This detailed guide outlines how to configure connectors with the router. Moving from Connectors Preview can be accomplished by following the steps in the Connectors GA upgrade guide.
Predictable resource utilization and availability with back pressure. Back pressure was not maintained in router 1.x, which meant all requests were being accepted by the router. This resulted in issues for routers which are accepting high levels of traffic. Router 2.0.0 improves the handling of back pressure so that traffic shaping measures are more effective while also improving integration with telemetry. Improvements to back pressure then allow for significant improvements in traffic shaping, which improves the router's ability to observe timeout and traffic shaping restrictions correctly. You can read about traffic shaping changes in this section of the upgrade guide.
Metrics now all follow OpenTelemetry naming conventions. Some of router's earlier metrics were created before the introduction of OpenTelemetry, resulting in naming inconsistencies. Along with standardising metrics to OpenTelemetry, Apollo operation usage reporting now also defaults to using OpenTelemetry in router 2.0.0. Quite a few existing metrics had to be changed in order to do this properly and correctly, and we encourage you to carefully read through the upgrade guide for all the metrics changes.
Improved validation of CORS configurations, preventing silent failures. While CORS configuration did not change in router 2.0.0, we did improve CORS value validation. This results in things like invalid regex or unknown allow_methods returning errors early and preventing the router from starting.
Documentation for context keys, improving usability for advanced customers. Router 2.0.0 creates consistent naming semantics for request context keys, which are used to share data across internal router pipeline stages. If you are relying on context entries in Rust plugins, Rhai scripts, coprocessors, or telemetry selectors, please refer to this section to see what keys changed.
📃 Configuration
Some changes to router configuration options were necessary in this release. Descriptions for both breaking changes to previous configuration and configuration for new features can be found in the upgrade guide.
🛠 Maintenance
Many external Rust dependencies (crates) have been updated to modern versions where possible. As the Rust ecosystem evolves, so does the router. Keeping these crates up to date helps keep the router secure and stable.
Major upgrades in this version include:
axum
http
hyper
opentelemetry
redis
v2.0.0-rc.0
2.0.0-rc.0
v1.60.1
🐛 Fixes
Header propagation rules passthrough (PR #6690)
Header propagation contains logic to prevent headers from being propagated more than once. This was broken in #6281, which always considered a header propagated regardless of whether a rule actually matched.
This PR alters the logic so that a header is marked as fixed only when it's populated.
The following will now work again:
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
      - propagate:
          named: b
Note that defaulting a header WILL populate it, so make sure to include your defaults last in your propagation rules.
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
          default: defaulted # This will prevent any further rule evaluation for header `b`
      - propagate:
          named: b
Instead, make sure that your headers are defaulted last:
headers:
  all:
    request:
      - propagate:
          named: a
          rename: b
      - propagate:
          named: b
          default: defaulted # OK
By @BrynCooke in #6690
Entity cache: fix directive conflicts in cache-control header (Issue #6441)
Previously, unnecessary directives were created in the cache-control header. The router will now filter out unnecessary values from the cache-control header when the request resolves. So if there's max-age=10, no-cache, must-revalidate, no-store, the expected value for the cache-control header would simply be no-store. Please see the MDN docs for justification of this reasoning: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#preventing_storing
Resolve regressions in fragment compression for certain operations (PR #6651)
In v1.58.0 we introduced a new compression strategy for subgraph GraphQL operations to replace an older, more complicated algorithm.
While we were able to validate improvements for a majority of cases, some regressions still surfaced. To address this, we are extending it to compress more operations with the following outcomes:
- The P99 overhead of running the new compression algorithm on the largest operations in our corpus is now just 10ms
- In case of better compression, at P99 it shrinks the operations by 50 KB when compared to the old algorithm
- In case of worse compression, at P99 it only adds an additional 108 bytes compared to the old algorithm, which was an acceptable trade-off versus added complexity
By @dariuszkuc in #6651
v1.60.1-rc.1
1.60.1-rc.1
v2.0.0-preview.6
2.0.0-preview.6
v1.60.1-rc.0
1.60.1-rc.0
v1.60.0
🚀 Features
Improve BatchProcessor observability (Issue #6558)
A new metric has been introduced to allow observation of how many spans are being dropped by a telemetry batch processor.
- apollo.router.telemetry.batch_processor.errors: The number of errors encountered by exporter batch processors.
  - name: One of apollo-tracing, datadog-tracing, jaeger-collector, otlp-tracing, zipkin-tracing.
  - error: One of channel closed, channel full.
By observing the number of spans dropped it is possible to estimate what batch processor settings will work for you.
In addition, the log message for dropped spans will now indicate which batch processor is affected.
By @BrynCooke in #6558
🐛 Fixes
Improve performance of query hashing by using a precomputed schema hash (PR #6622)
The router now uses a simpler and faster query hashing algorithm with more predictable CPU and memory usage. This improvement is enabled by using a precomputed hash of the entire schema, rather than computing and hashing the subset of types and fields used by each query.
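The idea of reusing a precomputed schema hash can be illustrated with a short Python sketch. It rests on assumptions: SHA-256, hashing the raw query text, and the function names are all illustrative choices, and the router's actual algorithm is described in the PR.

```python
import hashlib


def compute_schema_hash(schema_sdl: str) -> bytes:
    """Hash the entire schema once, for example at startup or on schema reload."""
    return hashlib.sha256(schema_sdl.encode("utf-8")).digest()


def hash_query(schema_hash: bytes, query: str) -> str:
    """Combine the precomputed schema hash with the query text.

    No per-query traversal of the schema is needed: the cost depends only on
    the size of the query, not on which types and fields it touches.
    """
    hasher = hashlib.sha256()
    hasher.update(schema_hash)
    hasher.update(query.encode("utf-8"))
    return hasher.hexdigest()


schema_hash = compute_schema_hash("type Query { topProducts: [Product] }")
query_hash = hash_query(schema_hash, "{ topProducts { upc name } }")
```

One consequence of this design, at least in the sketch, is that any schema change alters every query hash, even for queries that do not use the changed types.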
For more details on why these design decisions were made, please see the PR description.
By @IvanGoncharov in #6622
Truncate invalid error paths (PR #6359)
This fix addresses an issue where the router was silently dropping subgraph errors that included invalid paths.
According to the GraphQL Specification an error path must point to a response field:
If an error can be associated to a particular field in the GraphQL result, it must contain an entry with the key path that details the path of the response field which experienced the error.
The router now truncates the path to the nearest valid field path if a subgraph error includes a path that can't be matched to a response field.
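As a rough illustration of truncating to the nearest valid field path, the following Python sketch walks an error path against response data and keeps the longest prefix that resolves. The function name is hypothetical, and matching against the response data (rather than whatever the router matches against internally) is an assumption made for this example.

```python
def truncate_error_path(path, data):
    """Return the longest prefix of `path` that resolves inside `data`.

    Assumption: string segments index objects (dicts) and integer segments
    index lists, as in GraphQL error paths.
    """
    valid = []
    current = data
    for segment in path:
        if isinstance(segment, str) and isinstance(current, dict) and segment in current:
            current = current[segment]
        elif (
            isinstance(segment, int)
            and not isinstance(segment, bool)
            and isinstance(current, list)
            and 0 <= segment < len(current)
        ):
            current = current[segment]
        else:
            break  # first segment that cannot be matched: stop here
        valid.append(segment)
    return valid


data = {"user": {"name": "Ada", "orders": [{"id": 1}]}}
print(truncate_error_path(["user", "orders", 0, "total"], data))
# ['user', 'orders', 0]
```

Under this model, an error whose path cannot be matched at all ends up with an empty path instead of being dropped.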
By @IvanGoncharov in #6359
Eagerly init subgraph operation for subscription primary nodes (PR #6509)
When subgraph operations are deserialized, typically from a query plan cache, they are not automatically parsed into a full document. Instead, each node needs to initialize its operation(s) prior to execution. With this change, the primary node inside SubscriptionNode is initialized in the same way as other nodes in the plan.
By @tninesling in #6509
Fix increased memory usage in sysinfo since Router 1.59.0 (PR #6634)
In version 1.59.0, Apollo Router started using the sysinfo crate to gather metrics about available CPUs and RAM. By default, that crate uses rayon internally to parallelize its handling of system processes. In turn, rayon creates a pool of long-lived threads.
In a particular benchmark on a 32-core Linux server, this caused resident memory use to increase by about 150 MB. This is likely a combination of stack space (which only gets freed when the thread terminates) and per-thread space reserved by the heap allocator to reduce cross-thread synchronization cost.
This regression is now fixed by:
- Disabling sysinfo’s use of rayon, so the thread pool is not created and system processes information is gathered in a sequential loop.
- Making sysinfo not gather that information in the first place since Router does not use it.
By @SimonSapin in #6634
Optimize demand control lookup (PR #6450)
The performance of demand control in the router has been optimized.
Previously, demand control could reduce router throughput due to its extra processing required for scoring.
This fix improves performance by shifting more data to be computed at plugin initialization and consolidating lookup queries:
- Cost directives for arguments are now stored in a map alongside those for field definitions
- All precomputed directives are bundled into a struct for each field, along with that field's extended schema type. This reduces 5 individual lookups to a single lookup.
- Response scoring was looking up each field's definition twice. This is now reduced to a single lookup.
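The consolidation described in the list above can be pictured with a small Python sketch. Everything named here is hypothetical: the `FieldCostInfo` members, the `(type, field)` key, and the scoring arithmetic are placeholders for illustration and do not reflect the router's actual structures or cost formula.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FieldCostInfo:
    """Everything scoring needs for one field, bundled at initialization."""
    field_type: str                      # the field's schema type
    cost: Optional[int] = None           # hypothetical cost directive value
    list_size: Optional[int] = None      # hypothetical list size directive value
    argument_costs: dict = field(default_factory=dict)  # per-argument costs


def precompute(schema_fields):
    """Build one table at plugin initialization, keyed by (type, field)."""
    return {key: FieldCostInfo(**info) for key, info in schema_fields.items()}


def score_field(table, type_name, field_name):
    info = table[(type_name, field_name)]  # single lookup at request time
    base = info.cost if info.cost is not None else 1
    # Placeholder arithmetic, not the router's cost formula.
    return base * (info.list_size or 1) + sum(info.argument_costs.values())


table = precompute({
    ("Query", "topProducts"): {
        "field_type": "[Product]",
        "cost": 2,
        "list_size": 5,
        "argument_costs": {"first": 1},
    },
})
print(score_field(table, "Query", "topProducts"))  # 11
```

The point of the sketch is the shape of the data: work done once at initialization replaces several map lookups per field on the request path.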
By @tninesling in #6450
Fix missing Content-Length header in subgraph requests (Issue #6503)
A change in 1.59.0 caused the Router to send requests to subgraphs without a Content-Length header, which would cause issues with some GraphQL servers that depend on that header.
This solves the underlying bug and reintroduces the Content-Length header.
By @nmoutschen in #6538
🛠 Maintenance
Remove the legacy query planner (PR #6418)
The legacy query planner has been removed in this release. In the previous release, router v1.58, it was no longer used by default but was still available through the experimental_query_planner_mode configuration key. That key is now removed.
Also removed are configuration keys which were only relevant to the legacy planner:
- supergraph.query_planning.experimental_parallelism: the new planner can always use available parallelism.
- supergraph.experimental_reuse_query_fragments: this experimental algorithm that attempted to reuse fragments from the original operation while forming subgraph requests is no longer present. Instead, by default new fragment definitions are generated based on the shape of the subgraph operation.
By @SimonSapin in #6418
Migrate various metrics to OTel instruments (PR #6476, PR #6356, PR #6539)
Various metrics using our legacy mechanism based on the tracing crate are migrated to OTel instruments.
By @goto-bus-stop in #6476, #6356, #6539
📚 Documentation
Add instrumentation configuration examples (PR #6487)
The docs for router telemetry have new example configurations for common use cases for selectors and conditions.
🧪 Experimental
Remove experimental_retry option (PR #6338)
The experimental_retry option has been removed due to its limited use and functionality during its experimental phase.
v1.60.0-rc.1
1.60.0-rc.1