-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible to cache graphql.parse() results? #5377
Comments
I think it should be possible to do this transparently. I.e. any call to 🤔 I wonder if this should be a feature to opt-in to, or if we need to worry about the cache growing too large... @philipaconrad What do you think? |
@srenatus I agree that we could probably borrow the inter-query cache system for this, but it easily could cause a memory usage spike for folks with many unique GraphQL schemas that get cached. |
Mentioned OOB -- maybe some sort of LRU cache could help here. If we cache 10 schemas, we'd probably solve the problem for most users, while not blowing up the memory usage for those special cases where many unique schemas are parsed. For those special cases, the situation would get neither better nor worse from having an LRU. |
This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. |
This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. |
I am hitting the same issue. A schema with about 2000 lines takes about 2-3 seconds to parse - for every request. Without some kind of cache this is not feasible :-( |
This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue. |
In case the workaround isn't obvious, you can pass the schema AST into OPA so it doesn't have to be parsed each time. Rego Playground Example. |
That'll still parse the schema every time though, won't it? Which won't be a problem in your case, but apparently so when the schema is thousands of lines. If anyone would want to have a go at fixing this, I did something very similar some time ago for another built-in function. Could likely be mostly copy-pasted #7081 |
Yep, but it helps a ton with the |
I've got some initial results and wanted to see if I am on the right track before reworking the other functions in a similar fashion.
|
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Rough Draft Complete: - graphql.schema_is_valid TODO: - graphql.is_valid - graphql.parse - graphql.parse_and_verify - graphql.parse_query - graphql.parse_schema goarch: amd64 pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 16966 68579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 1000000 1062 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 12900 92484 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 92516 12064 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_-_string-16 60 19294987 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_with_cache_-_string-16 342 3443940 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 12.613s Fixes: open-policy-agent#5377
Very nice @robmyersrobmyers 👍 No docs or best practices I'm aware of, but hashing the input seems like a good idea. The cache limit can be configured, but I'm not sure if it accounts for the key size (I'd have to check).. so keeping that small at the cost of hashing seems like a reasonable trade-off to me. |
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Rough Draft Complete: - graphql.schema_is_valid TODO: - graphql.is_valid - graphql.parse - graphql.parse_and_verify - graphql.parse_query - graphql.parse_schema goarch: amd64 pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 16966 68579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 1000000 1062 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 12900 92484 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 92516 12064 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_-_string-16 60 19294987 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_with_cache_-_string-16 342 3443940 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 12.613s Fixes: open-policy-agent#5377
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Rough Draft Complete: - graphql.schema_is_valid TODO: - graphql.is_valid - graphql.parse - graphql.parse_and_verify - graphql.parse_query - graphql.parse_schema goarch: amd64 pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 16966 68579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 1000000 1062 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 12900 92484 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 92516 12064 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_-_string-16 60 19294987 ns/op BenchmarkGraphQLSchemaIsValid/Complex_Schema_with_cache_-_string-16 342 3443940 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 12.613s Fixes: open-policy-agent#5377
StatusI am still working on this. I've got tests for all the graphql builtinss and benches for most of them, but I've still only added caching on one of the builtins. Questions
Edit: Updated branch of reproducer to keep it out of the MR. |
Great work! Hmm, EDIT: looks like it might be the default condition of that switch. I've gotta look into what data structure it's trying to convert there. As for the tests, I think the ideal soultion would be to generate the schema to use for benchmarking in the benchmark. I don't know much about GraphQL schemas though ... perhaps that's not feasible? |
I agree it would be best to autogenerate the gql schemas as part of the benchmark, but that's a bit more work than I have time for at the moment. :) Instead, I've removed the complex schema stuff to a different branch and I'll add the results to this comment for posterity. All tests and benches are working, so I will push up an MR in a bit.
|
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Queries are not cached. pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 15519 100178 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 371311 3383 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 2133 542355 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 3471 528579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-16 7105 193325 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16 66594 18093 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16 6429 173773 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16 6523 170819 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16 16352 72777 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16 16083 73548 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16 14320 83589 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16 71486 15463 ns/op BenchmarkGraphQLParse/Trivial_Schema_-_string-16 3380 321490 ns/op BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16 13909 87633 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16 3435 327646 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16 13844 85213 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 112.465s Resolves: open-policy-agent#5377
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Queries are not cached. pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 15519 100178 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 371311 3383 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 2133 542355 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 3471 528579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-16 7105 193325 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16 66594 18093 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16 6429 173773 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16 6523 170819 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16 16352 72777 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16 16083 73548 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16 14320 83589 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16 71486 15463 ns/op BenchmarkGraphQLParse/Trivial_Schema_-_string-16 3380 321490 ns/op BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16 13909 87633 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16 3435 327646 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16 13844 85213 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 112.465s Resolves: open-policy-agent#5377 Signed-off-by: Rob Myers <[email protected]>
Excellent! Yeah if I wasn't clear before, I did in no way mean to imply that you should fix that self-referential infinite loop issue as part of your work :) That's a bug we'll need to look into separately. Looking forward to your PR — the numbers look incredible! |
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Queries are not cached. pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 15519 100178 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 371311 3383 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16 2133 542355 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16 3471 528579 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-16 7105 193325 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16 66594 18093 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16 6429 173773 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16 6523 170819 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16 16352 72777 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16 16083 73548 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16 14320 83589 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16 71486 15463 ns/op BenchmarkGraphQLParse/Trivial_Schema_-_string-16 3380 321490 ns/op BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16 13909 87633 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16 3435 327646 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16 13844 85213 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 112.465s Resolves: open-policy-agent#5377 Signed-off-by: Rob Myers <[email protected]>
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Queries are not cached. pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 15519 100178 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 371311 3383 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16 66594 18093 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16 6429 173773 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16 6523 170819 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16 16352 72777 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16 16083 73548 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16 14320 83589 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16 71486 15463 ns/op BenchmarkGraphQLParse/Trivial_Schema_-_string-16 3380 321490 ns/op BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16 13909 87633 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16 3435 327646 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16 13844 85213 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 112.465s Resolves: open-policy-agent#5377 Signed-off-by: Rob Myers <[email protected]>
Short description
I am wondering if there is an easy way to cache the graphql schema AST that is generated by
graphql.parse
function call. I have been working on integrating OPA for our graphql layer and, unfortunately, am running into performance issues due how long it takes to generate the AST from a raw query.Our graphQL schema does not change between each authorization decisions and therefore it makes sense to generate thes chema AST once and re-use it for each authorization query.
I am running OPA embedded into Go mode. This is running locally from my laptop.
Steps To Reproduce
Nothing really interesting here. Rego looks like:
The only difference is having farily large
input.query
andinput.schema
.Expected behavior
I would like to find a way to cache my graphql schema AST between authorization decisions.
Additional context
The text was updated successfully, but these errors were encountered: