-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOLR-17043: Remove SolrClient path pattern matching #3238
base: main
Are you sure you want to change the base?
Conversation
solr/solrj/src/java/org/apache/solr/client/solrj/request/SolrPing.java
Outdated
Show resolved
Hide resolved
solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
Outdated
Show resolved
Hide resolved
* SolrRequest#getRequestType}. Note that changing request type can break/impact request routing | ||
* within various clients (i.e. {@link CloudSolrClient}). | ||
*/ | ||
protected SolrRequestType getBaseRequestType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking out loud: I wonder if it is possible to make this a member of SolrRequest
itself, i.e. private SolrRequestType type
The advantage of this is you don't have to expand the interface with this arguably low-level detail. You could imagine a set of overloaded constructors of SolrRequest
, i.e.:
private SolrRequestType type;
public SolrRequest(METHOD m, String path) {
this(m, path, UNSPECIFIED);
}
public SolrRequest(METHOD m, String path, SolrRequestType defaultType) {
this.method = m;
this.path = path;
this.type = resolveType(path, defaultType);
}
public void setPath(String path) {
this.path = path;
this.type = resolveType(path, type);
}
public SolrRequestType getRequestType() {
return type;
}
The catch is that calling setPath
has to update this additional property as well. But because getRequestType
already depends on path
, both solutions are equally mutable. It would be nice if SolrRequest
had a builder which could unburden SolrRequest
from this mutability but that would be a much bigger change I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getBaseRequestType
seems sad to me; and I sympathize with Luke's musings. I'd like to see this method go away. I like's Luke's suggestion on a field. But going towards immutable is a bit much IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it does seem to complicate the interface a bit.
I'd be fine with the "field" approach that Luke suggested. Or alternately, could this be solved without even a "field" if getRequestType
below was non-final and subclasses could implement it as needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolving the type in the constructor breaks QueryRequest
, since it can only handle calls to SolrRequest.getPath()
after construction. I removed getBaseRequestType
, added a field defaultType
, and put the request type pattern matching logic in getRequestType
. Does this compromise work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff here!
solr/solrj/src/java/org/apache/solr/client/solrj/request/SolrPing.java
Outdated
Show resolved
Hide resolved
* SolrRequest#getRequestType}. Note that changing request type can break/impact request routing | ||
* within various clients (i.e. {@link CloudSolrClient}). | ||
*/ | ||
protected SolrRequestType getBaseRequestType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getBaseRequestType
seems sad to me; and I sympathize with Luke's musings. I'd like to see this method go away. I like's Luke's suggestion on a field. But going towards immutable is a bit much IMO.
* | ||
* @see CloudSolrClient#isUpdatesToLeaders | ||
*/ | ||
public boolean shouldSendToLeaders() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I question that we want this. Maybe this pre-dated org.apache.solr.common.params.ShardParams#SHARDS_PREFERENCE
where you can accomplish the same thing. Ideally this method goes away (and goes away from AbstractUpdateRequest).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this PR, as I understand it, is to clean up a lot of the String pattern matching, casting and instanceof checks from our client implementations. And the whole "is this an update request and should I route it to leaders" logic is a big source of those issues.
If you don't like this approach @dsmiley , can you be more explicit about what alternative path you'd like Jude to take? Are you suggesting this PR not touch that logic at all? Are you suggesting we remove it altogether in favor of UpdateRequest setting a shards.preference
value, or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have clarified that my comment wasn't a critique of this PR. The presence of this boolean seemed to somewhat get in the way of this PR (?) so I chose this moment to share my thoughts. Feel free to ignore!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the author has made this method more prominent by elevating to SolrRequest, so my feedback is relevant & timely. I don't like this method here at all (or anywhere).
Are you suggesting we remove it altogether in favor of UpdateRequest setting a shards.preference value, or something similar?
Exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at this closer just now. shards.preference
routes the entire request whereas IsUpdateRequest.isSendToLeaders
is about each document in the request, individually routing it. So not the same. Any way, I don't think it makes sense for this to be on SolrRequest generally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way, I don't think it makes sense for this to be on SolrRequest generally.
@dsmiley is this more "sharing your thoughts", or do you want Jude to address this in some way? Just want to double-check so there's no ambiguity.
I agree with you that UpdateRequest should probably re-use shards.preference
instead of offering additional methods to impact routing. (Or perhaps - maybe UpdateRequest offers a few syntax-sugar methods to set common preferences, but those operate by modifying shards.preference
rather than maintaining a separate boolean?).......but all of that feels out of scope in a PR that was proposed initially as a refactor?
If you feel strongly about the method being out of place on SolrRequest, maybe the best approach for now is to file a follow-up ticket, and to hold off touching the UpdateRequest stuff at all in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you feel strongly about the method being out of place on SolrRequest
Very much so.
[-1] to ShardRequest.shouldSendToLeaders
since it's usage is actually only for updates; isn't generally controlling all requests (nor should it). We could just leave Solr as it is with regards to this boolean in UpdateRequest
& IsUpdateRequest
. Maybe there are other possible solutions. Maybe it doesn't need to exist if users will configure CloudSolrClient with the setting and not change it; not sure.
solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is looking great overall, though I agree with some of the other commenters that we should come up with an alternative to the SolrRequest.getBaseRequestType
method.
Other than that, only a few minor comments inline.
* SolrRequest#getRequestType}. Note that changing request type can break/impact request routing | ||
* within various clients (i.e. {@link CloudSolrClient}). | ||
*/ | ||
protected SolrRequestType getBaseRequestType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it does seem to complicate the interface a bit.
I'd be fine with the "field" approach that Luke suggested. Or alternately, could this be solved without even a "field" if getRequestType
below was non-final and subclasses could implement it as needed?
solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
Outdated
Show resolved
Hide resolved
* | ||
* @see CloudSolrClient#isUpdatesToLeaders | ||
*/ | ||
public boolean shouldSendToLeaders() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this PR, as I understand it, is to clean up a lot of the String pattern matching, casting and instanceof checks from our client implementations. And the whole "is this an update request and should I route it to leaders" logic is a big source of those issues.
If you don't like this approach @dsmiley , can you be more explicit about what alternative path you'd like Jude to take? Are you suggesting this PR not touch that logic at all? Are you suggesting we remove it altogether in favor of UpdateRequest setting a shards.preference
value, or something similar?
solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java
Show resolved
Hide resolved
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBHttp2SolrClient.java
Show resolved
Hide resolved
@@ -84,8 +84,8 @@ public DelegationTokenResponse.Get createResponse(SolrClient client) { | |||
} | |||
|
|||
@Override | |||
public String getRequestType() { | |||
return SolrRequestType.ADMIN.toString(); | |||
protected SolrRequestType getBaseRequestType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Q] Unrelated to this PR, but does DelegationTokenRequest serve a purpose anymore? SOLR-9200 added it originally in order to support 'hadoop-auth', and now that the "hadoop-auth" module is gone this is only referenced in tests afaict.
Perhaps it could've been removed when @epugh removed hadoop-auth earlier this year?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@risdenk I also at-mentioned you on that point on the related Hadoop removal PR since you are the SME on this (not Pugh).
@@ -55,8 +55,13 @@ public SolrParams getParams() { | |||
} | |||
|
|||
@Override | |||
public String getRequestType() { | |||
return SolrRequestType.UPDATE.toString(); | |||
protected SolrRequestType getBaseRequestType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0] Unrelated to this PR, but it's a shame that DirectXmlRequest isn't a child of either AbstractUpdateRequest
or UpdateRequest
. Anyone remember why this is by chance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should deprecate DirectXmlRequest
. It's unused, untested, and not needed. I'll do so in my deprecation branch/PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already did that:
solr/solr/solrj/src/java/org/apache/solr/client/solrj/request/DirectXmlRequest.java
Line 33 in cbc2321
public class DirectXmlRequest extends CollectionRequiringSolrRequest<UpdateResponse> |
I should merge that PR so we can more easily communicate to even each other what things are going away so we shouldn't spend attention on
solr/solrj/src/java/org/apache/solr/client/solrj/request/SolrPing.java
Outdated
Show resolved
Hide resolved
solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
Outdated
Show resolved
Hide resolved
solr/solrj/src/java/org/apache/solr/client/solrj/request/CollectionRequiringSolrRequest.java
Outdated
Show resolved
Hide resolved
*/ | ||
public SolrRequestType getRequestType() { | ||
String path = getPath(); | ||
if (path != null && CommonParams.ADMIN_PATHS.contains(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part is troubling IMO; I think we should just return the field of the requestType. Therefore each subclass needs to provide it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason this is necessary at the moment is because of this pattern, which is present in several unit tests across the codebase (example from TestReplicationHandler):
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "reload");
params.set("core", core);
params.set("qt", "/admin/cores");
QueryRequest req = new QueryRequest(params);
Doing pattern matching on getPath
allows this type of path manipulation to "just work" with CloudSolrClient
, at the cost of introducing more complexity into SolrRequest
. I considered going through and changing each instance of this pattern across all unit tests, but I felt like it might expand the scope of the PR a bit too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those unit tests are then "wrong". QueryRequest is only for a query, not requests generally. It's parameterized response type is QueryResponse holding all the interesting information from /select
. Note that, unfortunately, some classes in Solr include the word "Query" when it shouldn't but QueryRequest isn't one of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see "qt" makes this harder. I hate it; I almost removed it completely as one of my first big actions as a committer, forever ago, but I went half-way by removing its processing server-side. It's still present in SolrJ as you see here; sigh. I'd love to see this gone. I know it's a convenience for tests that want a SolrParams to be sort of all-encompassing about the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any "real" risks of mis-classification of the request type in SolrJ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some pattern matching is necessary here to make the APIs work as intended. Moving that logic to SolrRequest
seemed like a good idea when I first made this change, but I am kind of second-guessing that now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could sympathize if my suggestion increases the scope of the PR but I still think we shouldn't play games in this method.
What could help is having request constructors that don't take the request type but insist the path is obviously admin, so we default to admin or otherwise throw IllegalArgumentException. That would cut down on many changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[re: 'qt' param] I almost removed it completely as one of my first big actions as a committer
It's been on my list for awhile, as it's caused problems with the v2 API as well. We really should nuke it across the board...
The reason this is necessary at the moment is because of this pattern, which is present in several unit tests
I think this pattern is mostly a result of gaps in SolrJ. We didn't offer a "reload core" SolrRequest implementation at some point, so a test-author hacked around it by forcing the request into a QueryRequest. But I wonder how many of those gaps have closed up over time.
We have a reload-core SolrRequest now, on the v2 side at least: org.apache.solr.client.solrj.request.CoresApi.ReloadCore
. That should be safe to use in tests. And even where there's still a SolrJ gap we should still be able to replace could these QueryRequest mis-uses with GenericSolrRequest I think?
Is there's not that many instances of the pattern, maybe that's a path forward as well? (Though if it's too much scope creep, I'd also understand that...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can go ahead and make the changes to the tests, I agree that it seems like the best path forwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of "qt": I filed SOLR-17715 to remove it for SolrJ 10.
solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
Outdated
Show resolved
Hide resolved
@@ -38,7 +38,8 @@ public class SolrPing extends CollectionRequiringSolrRequest<SolrPingResponse> { | |||
|
|||
/** Create a new SolrPing object. */ | |||
public SolrPing() { | |||
super(METHOD.GET, CommonParams.PING_HANDLER); | |||
// this request is not processed as an admin request | |||
super(METHOD.GET, CommonParams.PING_HANDLER, SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ADMIN
solr/solrj/src/java/org/apache/solr/client/solrj/request/schema/AbstractSchemaRequest.java
Show resolved
Hide resolved
* | ||
* @see CloudSolrClient#isUpdatesToLeaders | ||
*/ | ||
public boolean shouldSendToLeaders() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the author has made this method more prominent by elevating to SolrRequest, so my feedback is relevant & timely. I don't like this method here at all (or anywhere).
Are you suggesting we remove it altogether in favor of UpdateRequest setting a shards.preference value, or something similar?
Exactly.
@@ -35,11 +35,12 @@ public class LukeRequest extends CollectionRequiringSolrRequest<LukeResponse> { | |||
private Boolean includeIndexFieldFlags = null; | |||
|
|||
public LukeRequest() { | |||
super(METHOD.GET, "/admin/luke"); | |||
// this request is not processed as an ADMIN request | |||
super(METHOD.GET, "/admin/luke", SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we agreed to change the classification separately (another PR)
if (request instanceof IsUpdateRequest) { | ||
sendToLeaders = ((IsUpdateRequest) request).isSendToLeaders() && this.isUpdatesToLeaders(); | ||
if (request.getRequestType() == SolrRequestType.UPDATE) { | ||
sendToLeaders = request.shouldSendToLeaders() && this.isUpdatesToLeaders(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know casting is never elegant but shouldSendToLeaders() doesn't belong on SolrRequest base class
@@ -30,17 +30,12 @@ public AbstractSchemaRequest(METHOD m, String path) { | |||
} | |||
|
|||
public AbstractSchemaRequest(METHOD m, String path, SolrParams params) { | |||
super(m, path); | |||
super(m, path, SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again; lets not change that in this PR
@@ -39,7 +39,7 @@ public class FieldAnalysisRequest extends CollectionRequiringSolrRequest<FieldAn | |||
|
|||
/** Constructs a new FieldAnalysisRequest with a default uri of "/fieldanalysis". */ | |||
public FieldAnalysisRequest() { | |||
super(METHOD.GET, "/analysis/field"); | |||
super(METHOD.GET, "/analysis/field", SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep as QUERY
@@ -50,7 +50,7 @@ public GenericSolrRequest(METHOD m, String path) { | |||
* @param params query parameter names and values for making this request. | |||
*/ | |||
public GenericSolrRequest(METHOD m, String path, SolrParams params) { | |||
super(m, path); | |||
super(m, path, SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should overload the constructor and encourage specifying this param. II could see defaulting to ADMIN only if the path starts with "/admin" and otherwise throwing an IllegalArgumentException, thus compelling users to be explicit while also benefiting from the implied ADMIN from the path when that's commonly the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see defaulting to ADMIN only if the path starts with "/admin" and otherwise throwing an IllegalArgumentException
+1 to the additional ctor so that callers can specify their own request-type as desired, but I dislike the suggestion around conditionally defaulting to "ADMIN", for two reasons.
- Some built-in paths containing "admin" are emphatically NOT "ADMIN" APIs;
/admin/security
being a particularly thorny example. And the assumption gets even rockier when you start considering users who might want GenericSolrRequest to hit their custom endpoints. The whole purpose of the class is to be generic - making assumptions about the type of request (beyond "UNSPECIFIED") seems shaky IMO. - One of the main purposes of this JIRA/PR is to get rid of path-inspection and string comparison, which tends to be brittle. Let's not just shift it to a different place!
@@ -50,7 +50,7 @@ public class V2Request extends SolrRequest<V2Response> implements MapWriter { | |||
private ResponseParser parser; | |||
|
|||
private V2Request(METHOD m, String resource, boolean useBinary) { | |||
super(m, resource); | |||
super(m, resource, SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep ADMIN
*/ | ||
public SolrRequestType getRequestType() { | ||
String path = getPath(); | ||
if (path != null && CommonParams.ADMIN_PATHS.contains(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could sympathize if my suggestion increases the scope of the PR but I still think we shouldn't play games in this method.
What could help is having request constructors that don't take the request type but insist the path is obviously admin, so we default to admin or otherwise throw IllegalArgumentException. That would cut down on many changes.
@@ -36,7 +36,7 @@ public class ConfigRequest extends CollectionRequiringSolrRequest { | |||
protected final String message; | |||
|
|||
public ConfigRequest(String message) { | |||
super(SolrRequest.METHOD.POST, "/config"); | |||
super(SolrRequest.METHOD.POST, "/config", SolrRequestType.UNSPECIFIED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep ADMIN
4082aae
to
8928cfb
Compare
https://issues.apache.org/jira/browse/SOLR-17043
Description
Currently, some SolrClient implementations (especially our "load-balancing" and "cloud" clients) do pattern-matching on the path to guess the "type" (admin, update, etc. ) of each request. This seems unnecessary though, as SolrRequest already has a "getRequestType" method exposing this. We should use this method where possible instead of the ad-hoc pattern matching we currently do, which is brittle and doesn't map well to the varied paths used by our v2 APIs.
Solution
Use
SolrRequest.getRequestType
to identify request types in SolrJ, and remove other forms of request type identification from the SolrJ codebase. This includes deprecating theIsUpdateRequest
interface.Change
getRequestType
to return the actualSolrRequestType
enum instead of a String.Some unit tests use
SolrRequest.setPath()
to send aQueryRequest
to a/admin
path. This PR adds basic pattern matching ingetRequestType
and adds methodSolrRequest.getBaseRequestType
to promote an inheritance pattern which works with mutable request paths.CloudSolrClient
relies heavily on request type to route requests. After modifying the routing inCloudSolrClient
to usegetRequestType
, several requests started failing because they returned request typeADMIN
when their handler path was not inCommonParams.ADMIN_PATHS
. This PR changes the type of those misclassified requests toUNSPECIFIED
. Some of these changes may seem counterintuitive (i.e./admin/ping
and/admin/luke
are notADMIN
requests), but they preserve the existing data flow.Tests
Solr unit tests pass with
./gradlew test
.Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.