support/datastore: add read-only http datastore.#5809
overcat wants to merge 3 commits into stellar:main
Pull Request Overview
This PR adds HTTP as a read-only data source to the datastore package, allowing access to files hosted on HTTP(S) endpoints. The implementation supports authentication via custom headers and provides all standard read operations while explicitly rejecting write operations.
- Adds HTTPDataStore implementation with configurable base URL, timeout, and custom headers
- Implements all read operations (GetFile, Exists, Size, GetFileMetadata, etc.) using HTTP HEAD/GET requests
- Provides comprehensive test coverage with mock HTTP server for various scenarios
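As a rough illustration of the behaviour summarized above, here is a minimal sketch of a read-only store checked against a mock HTTP server. The names `readOnlyStore`, `errReadOnly`, and `demo`, the file paths, and the `Authorization` header value are all assumptions made for this example; they are not taken from the PR's `HTTPDataStore`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// errReadOnly is a hypothetical sentinel; the PR may report write attempts differently.
var errReadOnly = errors.New("http datastore is read-only")

// readOnlyStore is an illustrative stand-in, not the PR's HTTPDataStore.
type readOnlyStore struct {
	client  *http.Client
	baseURL string
	headers map[string]string
}

// do builds a request for path and attaches the configured headers.
func (s *readOnlyStore) do(method, path string) (*http.Response, error) {
	req, err := http.NewRequest(method, s.baseURL+"/"+path, nil)
	if err != nil {
		return nil, err
	}
	for k, v := range s.headers {
		req.Header.Set(k, v)
	}
	return s.client.Do(req)
}

// Exists sends HEAD and maps 200 to true and 404 to false.
func (s *readOnlyStore) Exists(path string) (bool, error) {
	resp, err := s.do(http.MethodHead, path)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	}
	return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
}

// GetFile sends GET and returns the whole body.
func (s *readOnlyStore) GetFile(path string) ([]byte, error) {
	resp, err := s.do(http.MethodGet, path)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

// PutFile always fails because the store is read-only.
func (s *readOnlyStore) PutFile(path string, data []byte) error {
	return errReadOnly
}

// demo runs the store against a mock server that requires an auth header.
func demo() (found, foundMissing bool, body string, putErr error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch {
		case r.Header.Get("Authorization") != "Bearer test-token":
			w.WriteHeader(http.StatusUnauthorized)
		case r.URL.Path != "/ledgers/1.xdr":
			http.NotFound(w, r)
		default:
			io.WriteString(w, "ledger-bytes")
		}
	}))
	defer srv.Close()

	store := &readOnlyStore{
		client:  &http.Client{Timeout: 5 * time.Second},
		baseURL: srv.URL,
		headers: map[string]string{"Authorization": "Bearer test-token"},
	}
	found, _ = store.Exists("ledgers/1.xdr")
	foundMissing, _ = store.Exists("ledgers/2.xdr")
	data, _ := store.GetFile("ledgers/1.xdr")
	return found, foundMissing, string(data), store.PutFile("ledgers/3.xdr", nil)
}

func main() {
	fmt.Println(demo())
}
```

Using HEAD for existence checks avoids downloading the file body, and returning a fixed error from every write method keeps the read-only contract explicit to callers.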
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| support/datastore/http.go | Core HTTP datastore implementation with read-only operations |
| support/datastore/http_test.go | Comprehensive test suite with mock HTTP server |
| support/datastore/datastore.go | Integration point adding HTTP type to datastore factory |
    client := &http.Client{
        Timeout: timeout,
The HTTP client lacks important security configurations. Consider adding MaxIdleConns, MaxIdleConnsPerHost, and IdleConnTimeout to prevent connection pool exhaustion, and disable automatic redirect following for better security control.
    - client := &http.Client{
    -     Timeout: timeout,
    + transport := &http.Transport{
    +     MaxIdleConns:        100,
    +     MaxIdleConnsPerHost: 10,
    +     IdleConnTimeout:     90 * time.Second,
    + }
    + client := &http.Client{
    +     Timeout:   timeout,
    +     Transport: transport,
    +     CheckRedirect: func(req *http.Request, via []*http.Request) error {
    +         return http.ErrUseLastResponse
    +     },
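To make the effect of the suggested `CheckRedirect` concrete, the following sketch builds a client with the same settings and points it at a test server that redirects. The helper names `newClient` and `probeRedirect` and the `/old` and `/new` paths are made up for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newClient mirrors the settings from the suggestion above.
func newClient(timeout time.Duration) *http.Client {
	transport := &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 10,
		IdleConnTimeout:     90 * time.Second,
	}
	return &http.Client{
		Timeout:   timeout,
		Transport: transport,
		// Returning http.ErrUseLastResponse makes the client hand back the
		// 3xx response itself instead of following the Location header.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// probeRedirect requests a path that redirects and reports what the client saw.
func probeRedirect() (status int, location string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/old" {
			http.Redirect(w, r, "/new", http.StatusFound)
			return
		}
		fmt.Fprint(w, "file contents")
	}))
	defer srv.Close()

	resp, err := newClient(5 * time.Second).Get(srv.URL + "/old")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

func main() {
	fmt.Println(probeRedirect())
}
```

One trade-off worth weighing: with redirects disabled, files served from behind a redirecting host would surface as 3xx responses, so the datastore would need to treat those statuses explicitly.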
    resp, err := h.client.Do(req)
    if err != nil {
        log.Debugf("Error retrieving file '%s': %v", filePath, err)
Logging the full error message may expose sensitive information like internal URLs or authentication details. Consider sanitizing the error message before logging or reducing the log level.
    - log.Debugf("Error retrieving file '%s': %v", filePath, err)
    + log.Debugf("Error retrieving file '%s'", filePath)
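A possible middle ground between logging the full error and dropping it entirely is to strip the request URL first. Errors returned by `http.Client.Do` are of type `*url.Error`, whose message embeds the full URL. The sketch below unwraps that type and keeps only the operation and the underlying cause. `sanitizeHTTPError` and the example URL are hypothetical, and the underlying error can still mention a host or port, so this reduces exposure rather than eliminating it.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// sanitizeHTTPError drops the URL that *url.Error embeds in its message.
// Hypothetical helper, not part of the PR.
func sanitizeHTTPError(err error) error {
	var urlErr *url.Error
	if errors.As(err, &urlErr) {
		return fmt.Errorf("%s failed: %w", urlErr.Op, urlErr.Err)
	}
	return err
}

// demo builds an error shaped like the ones http.Client.Do returns and
// shows the message before and after sanitizing.
func demo() (raw, clean string) {
	err := &url.Error{
		Op:  "Get",
		URL: "https://internal.example/path?token=secret", // made-up URL
		Err: errors.New("connection refused"),
	}
	return err.Error(), sanitizeHTTPError(err).Error()
}

func main() {
	raw, clean := demo()
	fmt.Println("raw:  ", raw)
	fmt.Println("clean:", clean)
}
```

Because the sanitized error wraps the original cause with `%w`, callers can still use `errors.Is` and `errors.As` on it.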
        headers map[string]string
    }

    func NewHTTPDataStore(datastoreConfig DataStoreConfig) (DataStore, error) {
It would be convenient to add doc comments on this function describing the expected TOML datastore config structure, for example the expected keys such as 'timeout' and 'headers'.
There are no similar comments in s3 and gcp either. If you would like me to add comments for them in this PR as well, please let me know, thank you.
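For reference, a doc comment of the kind requested might look like the sketch below. The key names (`base_url`, `timeout`, and the `header_` prefix), the 30s default, and the `httpParams` and `parseHTTPParams` names are purely illustrative assumptions; the PR's actual parameter names and parsing are not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// httpParams is a hypothetical parsed form of the datastore params.
type httpParams struct {
	BaseURL string
	Timeout time.Duration
	Headers map[string]string
}

// parseHTTPParams reads HTTP datastore settings from a flat string map.
//
// Assumed keys (illustrative only, not confirmed by the PR):
//
//	base_url  required, for example "https://example.com/ledgers"
//	timeout   optional, a Go duration string such as "10s"; defaults to 30s
//	header_X  optional, sent as request header X (header_Authorization)
func parseHTTPParams(params map[string]string) (httpParams, error) {
	out := httpParams{Timeout: 30 * time.Second, Headers: map[string]string{}}

	out.BaseURL = strings.TrimRight(params["base_url"], "/")
	if out.BaseURL == "" {
		return out, fmt.Errorf("base_url is required")
	}
	if raw, ok := params["timeout"]; ok {
		d, err := time.ParseDuration(raw)
		if err != nil {
			return out, fmt.Errorf("invalid timeout %q: %w", raw, err)
		}
		out.Timeout = d
	}
	// Collect every key with the assumed header_ prefix as a request header.
	for k, v := range params {
		if strings.HasPrefix(k, "header_") && len(k) > len("header_") {
			out.Headers[strings.TrimPrefix(k, "header_")] = v
		}
	}
	return out, nil
}

func main() {
	p, err := parseHTTPParams(map[string]string{
		"base_url":             "https://example.com/ledgers/",
		"timeout":              "10s",
		"header_Authorization": "Bearer abc",
	})
	fmt.Println(p, err)
}
```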
6f231b0 to ad73020
f1b9561 to 83cad6d
Hi there, I'm building a custom indexer and this PR would be useful for my use case. Any updates on getting this merged?
What
Support HTTP as a read-only data source
Why
See #5808
Known limitations
N/A