Speed up grpc fetch and query response parsing #537
Closed
yorickvP wants to merge 1 commit into pinecone-io:main from
Conversation
Running a profiler on my Pinecone program showed it was CPU-bottlenecked on `json_format.MessageToDict` (it could only handle about 100 vectors per second in query and fetch responses). It turns out that converting the embeddings to a dict this way is very slow. It's much faster to convert them to a list without going through `MessageToDict`.
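A minimal sketch of the faster pattern, using `types.SimpleNamespace` stand-ins for the generated gRPC messages so the example is self-contained (the real SDK code operates on protobuf objects; the field names here mirror a fetch response but are assumptions for illustration):

```python
from types import SimpleNamespace

def parse_fetch_response_sketch(response):
    # Read fields straight off the message instead of converting the
    # whole response with json_format.MessageToDict.
    vectors = {}
    for vec_id, vec in response.vectors.items():
        vectors[vec_id] = {
            "id": vec.id,
            "values": list(vec.values),  # bulk-convert the repeated field
        }
    return {"namespace": response.namespace, "vectors": vectors}

# Stand-in for a FetchResponse message (illustration only).
resp = SimpleNamespace(
    namespace="example-ns",
    vectors={"v1": SimpleNamespace(id="v1", values=[0.1, 0.2, 0.3])},
)
parsed = parse_fetch_response_sketch(resp)
```

The key point is that `list(vec.values)` converts the repeated field in one call, avoiding the per-element JSON-compatible conversion `MessageToDict` performs.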
jhamon added a commit that referenced this pull request on Nov 18, 2025
## Problem

The current implementation uses `json_format.MessageToDict` to convert entire protobuf messages to dictionaries when parsing gRPC responses. This is a significant CPU bottleneck when processing large numbers of vectors, as reported in PR #537, where users experienced ~100 vectors/second throughput.

The `MessageToDict` conversion is expensive because it:

1. Serializes the entire protobuf message to JSON
2. Deserializes it back into a Python dictionary
3. Does this for every field, even when we only need specific fields

Additionally, several other performance issues were identified:

- Metadata conversion using `MessageToDict` on `Struct` messages
- Inefficient list construction (append vs pre-allocation)
- Unnecessary dict creation for `SparseValues` parsing
- Response header processing overhead

## Solution

Optimized all gRPC response parsing functions in `pinecone/grpc/utils.py` to directly access protobuf fields instead of converting entire messages to dictionaries. This approach:

1. **Directly accesses protobuf fields**: Uses `response.vectors`, `response.matches`, `response.namespace`, etc. directly
2. **Optimized metadata conversion**: Created a `_struct_to_dict()` helper that directly accesses `Struct` fields (~1.5-2x faster than `MessageToDict`)
3. **Pre-allocates lists**: Uses `[None] * len()` for known-size lists (~6.5% improvement)
4. **Direct SparseValues creation**: Creates `SparseValues` objects directly instead of going through dict conversion (~410x faster)
5. **Caches protobuf attributes**: Stores repeated attribute accesses in local variables
6. **Optimized response info extraction**: Improved `extract_response_info()` performance with module-level constants and early returns
7. **Maintains backward compatibility**: Output format remains identical to the previous implementation

## Performance Impact

Performance testing of the response parsing functions shows significant improvements across all optimized functions.
## Changes

### Modified Files

- `pinecone/grpc/utils.py`: Optimized 9 response parsing functions with direct protobuf field access
  - Added `_struct_to_dict()` helper for optimized metadata conversion (~1.5-2x faster)
  - Pre-allocated lists where size is known (~6.5% improvement)
  - Direct `SparseValues` creation (removed dict conversion overhead)
  - Cached protobuf message attributes
  - Removed dead code paths (dict fallback in `parse_usage`)
- `pinecone/grpc/index_grpc.py`: Updated to pass protobuf messages directly to parse functions
- `pinecone/grpc/resources/vector_grpc.py`: Updated to pass protobuf messages directly to parse functions
- `pinecone/utils/response_info.py`: Optimized `extract_response_info()` with module-level constants and early returns
- `tests/perf/test_fetch_response_optimization.py`: New performance tests for fetch response parsing
- `tests/perf/test_query_response_optimization.py`: New performance tests for query response parsing
- `tests/perf/test_other_parse_methods.py`: New performance tests for all other parse methods
- `tests/perf/test_grpc_parsing_perf.py`: Extended with additional benchmarks

### Technical Details

**Core Optimizations**:

1. **`_struct_to_dict()` Helper Function**:
   - Directly accesses protobuf `Struct` and `Value` fields
   - Handles all value types (null, number, string, bool, struct, list)
   - Recursively processes nested structures
   - ~1.5-2x faster than `json_format.MessageToDict` for metadata conversion
2. **List Pre-allocation**:
   - `parse_query_response`: Pre-allocates the `matches` list with `[None] * len(matches_proto)`
   - `parse_list_namespaces_response`: Pre-allocates the `namespaces` list
   - ~6.5% performance improvement over append-based construction
3. **Direct SparseValues Creation**:
   - Replaced `parse_sparse_values(dict)` with direct `SparseValues(indices=..., values=...)` creation
   - ~410x faster (avoids dict creation and conversion overhead)

## Testing

- All existing unit tests pass (224 tests in `tests/unit_grpc`)
- Comprehensive pytest benchmark tests added for all optimized functions:
  - `test_fetch_response_optimization.py`: Tests for fetch response with varying metadata sizes
  - `test_query_response_optimization.py`: Tests for query response with varying match counts, dimensions, metadata sizes, and sparse vectors
  - `test_other_parse_methods.py`: Tests for all other parse methods (fetch_by_metadata, list_namespaces, stats, upsert, update, namespace_description)
- Mypy type checking passes with and without grpc extras (with types extras)
- No breaking changes: output format remains identical

## Related

This addresses the performance issue reported in PR #537, implementing a similar optimization approach but adapted for the current codebase structure. All parse methods have been optimized with comprehensive performance testing to verify improvements.
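The direct-creation pattern for sparse values can be illustrated like this; `SparseValues` here is a simplified dataclass stand-in for the SDK model, and the message object is simulated with `SimpleNamespace`:

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import List

@dataclass
class SparseValues:
    """Simplified stand-in for the SDK's SparseValues model."""
    indices: List[int]
    values: List[float]

def parse_sparse_values_direct(sparse_proto):
    # Build the model straight from the repeated fields; the old path
    # first materialized an intermediate dict and then parsed that.
    return SparseValues(
        indices=list(sparse_proto.indices),
        values=list(sparse_proto.values),
    )

# Stand-in for a SparseValues protobuf message (illustration only).
proto = SimpleNamespace(indices=[1, 5, 9], values=[0.5, 0.25, 0.125])
sv = parse_sparse_values_direct(proto)
```

Constructing the object in one step avoids allocating and then re-walking a throwaway dict for every vector, which is where the reported overhead came from.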
## Problem

Running a profiler on my pinecone-using application, I found it was CPU-bottlenecked on `json_format.MessageToDict` (handling only about 100 vectors per second in query and fetch responses). It turns out that converting the embeddings to a dict this way is very slow. It's much faster to convert them to a list without going through `MessageToDict`.

## Solution

Changed `parse_fetch_response` and `parse_query_response` to directly read the protobuf structure instead of going through `MessageToDict`.

## Type of Change
## Test Plan

`make test-grpc-unit`
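As an illustration of the kind of check this test plan implies (the output format must stay identical after the optimization), a hypothetical pytest-style equivalence test might compare pre-allocated and append-based parsing over stand-in messages; all names here are invented for the sketch:

```python
from types import SimpleNamespace

def parse_matches_preallocated(response):
    # Optimized pattern: pre-allocate when the output size is known.
    matches_proto = response.matches
    out = [None] * len(matches_proto)
    for i, m in enumerate(matches_proto):
        out[i] = {"id": m.id, "score": m.score}
    return out

def parse_matches_append(response):
    # Baseline append-based construction for the equivalence check.
    return [{"id": m.id, "score": m.score} for m in response.matches]

def test_parsers_agree():
    resp = SimpleNamespace(matches=[
        SimpleNamespace(id="a", score=0.9),
        SimpleNamespace(id="b", score=0.7),
    ])
    assert parse_matches_preallocated(resp) == parse_matches_append(resp)

test_parsers_agree()  # also runs as a plain script, outside pytest
```

A test like this guards the "no breaking changes" claim: if the optimized path ever diverged from the old output shape, the equality assertion would fail.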