Experimental parse error from #93 #99
* Adds type alias `crate::resolve::ResolveError` for `crate::resolve::Error`
* Renames `crate::assign::AssignError` to `crate::assign::Error`
* Adds type alias `crate::assign::AssignError` for `crate::assign::Error`
* Adds `position` (token index) to variants of `assign::Error` & `resolve::Error`
Co-authored-by: André Mello <[email protected]>
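A minimal sketch of what the rename plus alias in the commit message could look like. The variant names and the `position()` helper are illustrative assumptions, not the crate's actual definitions; only the `Error` / `AssignError` naming and the `position` field come from the commit message.

```rust
pub mod assign {
    use std::fmt;

    /// Hypothetical stand-in for `crate::assign::Error`; the real variants may differ.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum Error {
        /// A token could not be parsed as an array index.
        FailedToParseIndex { position: usize },
        /// An array index was past the end of the array.
        OutOfBounds { position: usize },
    }

    /// Alias so code written against the old name keeps compiling.
    pub type AssignError = Error;

    impl Error {
        /// Index of the offending token within the pointer (assumed helper).
        pub fn position(&self) -> usize {
            match self {
                Self::FailedToParseIndex { position } | Self::OutOfBounds { position } => *position,
            }
        }
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Self::FailedToParseIndex { position } => {
                    write!(f, "token {position} is not a valid array index")
                }
                Self::OutOfBounds { position } => write!(f, "token {position} is out of bounds"),
            }
        }
    }

    impl std::error::Error for Error {}
}
```

Because `AssignError` is only an alias, both names refer to the same type and can be mixed freely.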
I don't think related errors is a bad idea intrinsically; one of the things I wanted to explore in follow-up PRs is this. This seems to be the right approach with miette for displaying multiple errors. Most of the noise in the output is due to that bug I flagged on #93, which we can just fix, and maybe we can tune the labels a bit for the multi-errors case, but it doesn't look terrible.
That said, I think we should leave it for a later step, after we've figured out the basic interface for rich errors.
Assuming we do this, my take is we shouldn't store the full offsets in the original error, and instead should compute them on-demand when the rich error is constructed (or even when
let err: Report<ParseError, Cow<'_, str>> = PointerBuf::parse("hello-world/~3/##~~/~3/~")
    .unwrap_err();
let err: FullReport<ParseError, Cow<'_, str>> = err.diagnose_fully();
// needed for conversion to `miette::Report`
let err: FullReport<ParseError, Cow<'static, str>> = err.into_owned();
println!("{:?}", miette::Report::from(err));
The need to convert to
If we let go of the
let err: Report<ParseError, String> = PointerBuf::parse("hello-world/~3/##~~/~3/~")
    .unwrap_err();
let err: FullReport<ParseError, String> = err.diagnose_fully();
println!("{:?}", miette::Report::from(err));
The subject (
I'll give some more thought into the stuff in
I don't have a complete alternative to offer yet, other than doing the straightforward thing and defining
|
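To illustrate the idea of computing offsets on demand rather than storing them in the lean error, here is a rough sketch. It assumes the lean error keeps only a token index and that the pointer starts with `/`; `token_offset` is a hypothetical helper, not part of the crate.

```rust
/// Byte offset at which the token with index `position` starts, i.e. the
/// index just past that token's leading '/'. Returns `None` when the pointer
/// has no such token. Assumes `pointer` is either empty or starts with '/'.
pub fn token_offset(pointer: &str, position: usize) -> Option<usize> {
    pointer
        .match_indices('/')
        .nth(position)
        .map(|(slash, _)| slash + 1)
}
```

The rich error could call something like this once, when it is built from the lean error and the subject string, so the lean error stays small.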
@asmello I'm sorry, GitHub is back to marking messages I haven't seen as read. I'm not sure what that's about.
Yep, this was due to me delegating again. I created an enum multi-error display (slightly different from above as I omitted the leading slash: There's still a bug that trailing
Heh, I should have kept reading before investigating and writing code.
That random function attached to I ended up just using the offsets already on the errors and it made things a good bit cleaner. I'm not opposed to removing them from When it comes to
There is a different set of methods based upon the associated types.
impl<S> ParseError<S> where S: Structure<Cause = Causes> {
    pub fn causes(&self) -> &[Cause]
}
impl<S> ParseError<S> where S: Structure<Cause = Cause> {
    pub fn cause(&self) -> &Cause
}
impl<S> ParseError<S> where S: Structure<Subject = String> {
    pub fn subject(&self) -> &str
}
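For readers following along, a self-contained sketch of how methods can be gated on associated types. The marker names `Bare` and `WithInput`, the use of `Vec<Cause>` in place of `Causes`, and the `new` constructor are assumptions made for illustration.

```rust
/// Marker trait describing the shape of an error (assumed definition).
pub trait Structure {
    type Cause;
    type Subject;
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Cause {
    NoLeadingBackslash,
    InvalidEncoding { offset: usize },
}

pub struct Bare; // one cause, no input
pub struct WithInput; // one cause plus the input string
pub struct Complete; // every cause plus the input string

impl Structure for Bare {
    type Cause = Cause;
    type Subject = ();
}
impl Structure for WithInput {
    type Cause = Cause;
    type Subject = String;
}
impl Structure for Complete {
    type Cause = Vec<Cause>;
    type Subject = String;
}

pub struct ParseError<S: Structure> {
    cause: S::Cause,
    subject: S::Subject,
}

impl<S: Structure> ParseError<S> {
    pub fn new(cause: S::Cause, subject: S::Subject) -> Self {
        Self { cause, subject }
    }
}

// Only available when the marker stores a list of causes.
impl<S> ParseError<S> where S: Structure<Cause = Vec<Cause>> {
    pub fn causes(&self) -> &[Cause] {
        &self.cause
    }
}

// Only available when the marker stores a single cause.
impl<S> ParseError<S> where S: Structure<Cause = Cause> {
    pub fn cause(&self) -> &Cause {
        &self.cause
    }
}

// Only available when the marker stores the input.
impl<S> ParseError<S> where S: Structure<Subject = String> {
    pub fn subject(&self) -> &str {
        &self.subject
    }
}
```

With this layout, `ParseError<Bare>` has `cause()` but no `subject()`, while `ParseError<Complete>` has `causes()` and `subject()` but no `cause()`.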
|
I had really hoped this would be possible:
let _: Result<_, ParseError<Complete>> = Pointer::parse("/ptr");
where
impl Pointer {
    pub fn parse<S: AsRef<str> + ?Sized, E: Structure>(s: &S) -> Result<&Self, ParseError<E>>
}
but not being able to set defaults on generics of functions put an end to that. The remaining advantage to using this structure is that the API of the error remains consistent between
|
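A small sketch of the limitation mentioned above: a function that is generic over its error type cannot declare a default for that parameter, so every caller has to name the error type through an annotation or a turbofish. `Lean`, `Rich`, and `FromFailure` are hypothetical names, and the validation is deliberately simplistic.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Lean;

#[derive(Debug, PartialEq, Eq)]
pub struct Rich {
    pub input: String,
}

/// Hypothetical hook that lets `parse` build whichever error the caller asked for.
pub trait FromFailure {
    fn from_failure(input: &str) -> Self;
}

impl FromFailure for Lean {
    fn from_failure(_input: &str) -> Self {
        Lean
    }
}

impl FromFailure for Rich {
    fn from_failure(input: &str) -> Self {
        Rich { input: input.to_owned() }
    }
}

// Writing `E: FromFailure = Lean` here is rejected by the compiler:
// type parameter defaults are not allowed on functions.
pub fn parse<E: FromFailure>(s: &str) -> Result<&str, E> {
    if s.is_empty() || s.starts_with('/') {
        Ok(s)
    } else {
        Err(E::from_failure(s))
    }
}
```

A call such as `parse("/ptr").unwrap()` with nothing else constraining `E` fails to compile because the error type cannot be inferred; `parse::<Lean>("/ptr")` or an annotated `let` is required.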
There are 3 shapes that
where
pub enum Cause {
NoLeadingBackslash,
InvalidEncoding {
offset: usize,
source: EncodingError,
},
}
|
If we had specialization, the API could be identical across variations of
impl<S> ParseError<S> where S: Structure<Cause = Causes> {
    pub fn cause(&self) -> &Cause { &self.cause[0] }
}
|
Never mind, I was able to get
pub trait Causative:
    PartialEq + Sized + std::error::Error + fmt::Display + fmt::Debug + miette::Diagnostic
{
    fn first(&self) -> &Cause;
}
impl<S: Structure> ParseError<S> {
    pub fn cause(&self) -> &Cause {
        self.cause.first()
    }
}
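A compilable sketch of the `first` approach, with the supertrait bounds dropped so it has no dependencies. The `Causes` layout (a head plus a tail, so it can never be empty) and the choice to parameterize `ParseError` directly over the cause type are assumptions for illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Cause {
    NoLeadingBackslash,
    InvalidEncoding { offset: usize },
}

/// Simplified `Causative`: anything that can hand out its first cause.
pub trait Causative {
    fn first(&self) -> &Cause;
}

impl Causative for Cause {
    fn first(&self) -> &Cause {
        self
    }
}

/// Non-empty list of causes (assumed layout), so `first` cannot fail.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Causes {
    head: Cause,
    tail: Vec<Cause>,
}

impl Causes {
    pub fn new(head: Cause, tail: Vec<Cause>) -> Self {
        Self { head, tail }
    }

    /// Total number of recorded causes.
    pub fn len(&self) -> usize {
        1 + self.tail.len()
    }
}

impl Causative for Causes {
    fn first(&self) -> &Cause {
        &self.head
    }
}

pub struct ParseError<C: Causative> {
    cause: C,
}

impl<C: Causative> ParseError<C> {
    pub fn new(cause: C) -> Self {
        Self { cause }
    }

    /// One `cause` method that works for both the single and multi-error shapes.
    pub fn cause(&self) -> &Cause {
        self.cause.first()
    }
}
```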
|
An alternative approach would be something like
struct SansInput { cause: Cause }
struct WithInput { cause: Cause, input: String }
struct Complete { causes: Vec<Cause>, input: String }
enum ParseError {
    SansInput(SansInput),
    WithInput(WithInput),
    Complete(Complete),
}
but methods which operate on the input would need to return
|
I'm still skeptical about this marker trait system, because in my experience this pattern has some serious ergonomics implications. One you kinda noticed already: lack of specialisation means we often have to implement the same function multiple times, once for each concrete marker value. This has a viral effect, because each function that is implemented concretely can't be called in generic contexts. That is, this doesn't work:
use std::marker::PhantomData;
trait State {}
struct S1;
impl State for S1 {}
struct S2;
impl State for S2 {}
struct Foo<S: State> {
    _phantom: PhantomData<S>
}
impl Foo<S1> {
    fn do_it(&self) {}
}
impl Foo<S2> {
    fn do_it(&self) {}
}
fn frobnicate<S: State>(foo: Foo<S>) {
    // error[E0599]: no method named `do_it` found for struct `Foo<S>`
    foo.do_it();
}
I'm not actually sure how your
I'm also worried that all this work is just to support
|
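For contrast, one common way around the E0599 problem in the example above is to move the per-state behaviour into the marker trait itself, so a single generic `impl` can delegate to it. This is a sketch of that general technique, not something proposed in the thread; `describe` is a made-up method.

```rust
use std::marker::PhantomData;

pub trait State {
    /// Per-state behaviour lives on the trait, not on `Foo<S1>` / `Foo<S2>`.
    fn describe() -> &'static str;
}

pub struct S1;
impl State for S1 {
    fn describe() -> &'static str {
        "s1"
    }
}

pub struct S2;
impl State for S2 {
    fn describe() -> &'static str {
        "s2"
    }
}

pub struct Foo<S: State> {
    _phantom: PhantomData<S>,
}

impl<S: State> Foo<S> {
    pub fn new() -> Self {
        Self { _phantom: PhantomData }
    }

    /// Defined once for every state, so it is callable in generic code.
    pub fn do_it(&self) -> &'static str {
        S::describe()
    }
}

pub fn frobnicate<S: State>(foo: &Foo<S>) -> &'static str {
    foo.do_it()
}
```

The cost is that every state has to implement every such method, which is part of the ergonomics concern raised in the comment.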
To be clear though, you are not crazy, there is some credit to this system. The state machine is a nice model for error enrichment. My concerns are just over implications for ergonomics, as well as complexity. I'm also scared of running into edge cases with generics and associated types. Fought those hard last Sunday getting |
I wonder if the answer isn't to just have 3 totally separate types, but have them implement a trait where common functionality makes sense? This would give callers a common API to operate with where it matters, while still providing a way to have different methods on each stage of enrichment. What's telling to me here is that the three flavours of |
Ah, I hadn't really considered too much the implications of infecting generics on people. Hmm.
yea, I had no idea you couldn't shadow
Oh, in this case I wanted
I realize this is a contrived example but we wouldn't need last. Most of the distinctions between these types come down to whether or not the input string exists. I think I have most of that covered, and what remains to finish it shouldn't depend too much on the markers. I'm not dismissing the concern. It's valid; I'm just giving context on the current state.
This is true. It is confusing to have two separate mechanisms for enrichment when we have so few errors.
Understood and thanks. I wasn't sure if you outright hated it or not :). I've been going around and around trying to find the right combination of complexity, ergonomics, and utility. The dual types, while awesome, definitely spike the complexity around errors. Gotta run, I'll reply to the rest when I get back. |
Oh, I don't hate it. I'm just scared of it! 😄
I think this is probably ok for users, because they have to specify which concrete type they want out of |
That may work. If we go that route, I'm thinking we eliminate the third option so it'll just be 2, with and without input.
Yea, we have 2 or 3 distinct structures depending on whether we keep the list. The generics keep it to one descriptive type. That's about it. Unfortunately the luster of this approach is severely diminished without being able to pick and choose the reporting style from a single set of methods (
Granted, we can still allow for picking of reporting style on
impl PointerBuf {
    pub fn parse<E: Structure>(s: impl Into<String>) -> Result<Self, ParseError<E>> {
        let s = s.into();
        match Validator::validate::<E::Cause>(&s) {
            Ok(()) => Ok(Self(s)),
            Err(cause) => Err(ParseError::<E>::new(cause, s)),
        }
    }
}
fn example() {
    let ptr = PointerBuf::parse("/").unwrap();
}
Hah, I understand that. Having said all of that, I don't think this particular arrangement is horrible, especially if we seal up |
I think I'd favor a distinct type for the edit: Hmm, maybe not. I'm going to go back to your PR and play with it. Perhaps just one enriched generic error is sufficient. It may help a good bit if we eliminate the possibility of multiple errors too. |
I'm not opposed to this, but as long as we have two versions for every type (one encoding just the cause and the other also encoding a position), I think the generic scales better. That's about it, really. My initial knee-jerk reaction to the generic wrapper was because in my experience generic errors tend to be painful to work with. Especially for users, as they often end up doing things we didn't foresee. But what I ended up realising is that the public API can expose only concrete types, so this isn't a problem (
I'll try to set apart some time this weekend to try and make it work with multiple errors too. But like I said, I wouldn't oppose just sticking with concrete types either. And I haven't fully ruled out the associated types approach, I'm just dubious of the net benefit. If you think of another motivation for it I may change my stance. |
I'll be AWOL this weekend but I think I'm leaning toward the wrapper as well. The alternative is going to be basically the equivalent of it anyway. Basically the two options are:
enum ParseError {}
struct EnrichedParseError { input: String, error: ParseError }
or something like this to get the same API across types:
enum Cause {}
struct ParseError { cause: Cause }
struct EnrichedParseError { cause: Cause, input: String }
Both of which don't make a whole lot of sense if we have a wrapper type.
|
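As a rough illustration of the wrapper idea discussed above, here is a generic `Report<E>` that pairs any lean error with the subject string, with the public API exposing only a concrete alias. The field names, the `original` accessor, the `RichParseError` alias, and the display format are all assumptions; the thread's actual `Report` takes a second type parameter for the subject.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseError {
    NoLeadingBackslash,
    InvalidEncoding { offset: usize },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NoLeadingBackslash => write!(f, "json pointer must start with '/'"),
            Self::InvalidEncoding { offset } => write!(f, "invalid encoding at offset {offset}"),
        }
    }
}

impl std::error::Error for ParseError {}

/// Generic enrichment wrapper: the lean error plus the input it came from.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Report<E> {
    source: E,
    subject: String,
}

impl<E> Report<E> {
    pub fn new(source: E, subject: impl Into<String>) -> Self {
        Self { source, subject: subject.into() }
    }

    /// The lean error that was enriched.
    pub fn original(&self) -> &E {
        &self.source
    }

    pub fn subject(&self) -> &str {
        &self.subject
    }
}

impl<E: fmt::Display> fmt::Display for Report<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} in {:?}", self.source, self.subject)
    }
}

/// Callers only ever see this concrete name, not the generic.
pub type RichParseError = Report<ParseError>;
```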
Taking a step back to re-evaluate the overall approach with a fresher perspective. For rich errors (including multi-errors) in the parsing API, I think the right approach would be to introduce separate parsers for each level of error richness. This way we keep the core API simple and avoid doing extra work without giving up the option to produce the enhanced errors when it matters.
impl Pointer {
    pub fn parse(s: &str) -> Result<&Self, ParseError> {...}
}
impl PointerBuf {
    pub fn parse(s: impl Into<Cow<'_, str>>) -> Result<Self, ParseError> {...}
}
trait Parser {
    type Error;
    fn parse(s: impl Into<Cow<'_, str>>) -> Result<Cow<'_, Pointer>, Self::Error>;
}
let err: ParseError = SimpleParser::parse("foo").unwrap_err();
let err: RichParseError = RichParser::parse("foo").unwrap_err();
let err: FullParseError = FullParser::parse("foo").unwrap_err();
I'm keeping the
For the
For resolve, it makes even less sense to have a helper struct, since there's nothing really to configure, but may still be worth it for the rich error interface. Multi-errors also don't make sense here.
Another option for assign and resolve is to just condition the error types on the
Multi-errors could make sense in
You mentioned being away this weekend, so no pressure to reply. I'll probably come back to all of this next week(end?) after I've given it all some more thought.
|
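A simplified, compilable take on the separate-parsers idea. To keep it free of lifetimes it takes `&str` and returns an owned `String` as a stand-in for the pointer type; the shared `validate` function and the error layouts are assumptions.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseError {
    NoLeadingBackslash,
    InvalidEncoding { offset: usize },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RichParseError {
    pub cause: ParseError,
    pub input: String,
}

/// One parser type per level of error richness.
pub trait Parser {
    type Error;
    fn parse(s: &str) -> Result<String, Self::Error>;
}

/// Shared validation: a pointer is empty or starts with '/', and every '~'
/// must be followed by '0' or '1'.
fn validate(s: &str) -> Result<(), ParseError> {
    if !s.is_empty() && !s.starts_with('/') {
        return Err(ParseError::NoLeadingBackslash);
    }
    let bytes = s.as_bytes();
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'~' && !matches!(bytes.get(i + 1).copied(), Some(b'0' | b'1')) {
            return Err(ParseError::InvalidEncoding { offset: i });
        }
    }
    Ok(())
}

pub struct SimpleParser;

impl Parser for SimpleParser {
    type Error = ParseError;

    fn parse(s: &str) -> Result<String, Self::Error> {
        validate(s).map(|()| s.to_owned())
    }
}

pub struct RichParser;

impl Parser for RichParser {
    type Error = RichParseError;

    fn parse(s: &str) -> Result<String, Self::Error> {
        validate(s)
            .map(|()| s.to_owned())
            // The input is only cloned on the error path.
            .map_err(|cause| RichParseError { cause, input: s.to_owned() })
    }
}
```

The caller picks the error richness by picking the parser, so nothing has to be inferred or defaulted.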
Also worth mentioning, the one issue with the wrapper type, even if we make it work with multi-errors, is the potential of mismatch between the subject and the error. It's even weirder with multi-errors because, to generate them, we'd need to essentially run another full scan and may find entirely different errors if given a different source. This is what made me feel like it's the wrong direction to take. Ideally we don't need to revisit the subject other than for display purposes, once an error has been generated. I said before the mismatch is a low risk given typically you'd use the wrapper immediately like |
Ohh! Now that's a clever way to handle it! I really like the approach of having different traits. It solves what this design failed to do. That is, allow for structuring the error by specifying a call without cluttering up the API with numerous methods.
impl PointerBuf {
    pub fn parse(s: impl Into<Cow<'_, str>>) -> Result<Self, ParseError> {...}
}
What if we always allocate for rich and full parser? It seems like if you want an error with the input, you are on an error path and allocating a string is fine, like you were saying earlier.
Digesting the rest.
edit:
|
Right,
That sounds like an awesome idea. I'm game for trying it in the future. I would love something like it.
Edit: Missed a reply.
Yea, I kept going back and forth on this. It's especially weird with the first
This is an incredibly small nit, but we don't have to do a full scan - we can start from the last encoding error's offset + 1. Basically nothing, but yea. I get your point.
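A sketch of the "start from the last encoding error's offset + 1" remark, assuming the only errors being collected are invalid `~` escapes (a `~` not followed by `0` or `1`). Both function names are hypothetical.

```rust
/// Offset of the first invalid '~' escape at or after `start`, if any.
pub fn next_encoding_error(s: &str, start: usize) -> Option<usize> {
    let bytes = s.as_bytes();
    (start..bytes.len()).find(|&i| {
        bytes[i] == b'~' && !matches!(bytes.get(i + 1).copied(), Some(b'0' | b'1'))
    })
}

/// Collects the encoding errors that follow an already-known one, resuming
/// the scan just past `last_offset` instead of rescanning from the start.
pub fn remaining_encoding_errors(s: &str, last_offset: usize) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut start = last_offset + 1;
    while let Some(offset) = next_encoding_error(s, start) {
        offsets.push(offset);
        start = offset + 1;
    }
    offsets
}
```

This still has to trust that the subject passed in is the same string that produced the first error, which is the mismatch risk raised earlier in the thread.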
Oh, I know. I really don't like it either. We are just sort of bound by the language. |
@asmello experimental parse error from #93 Sorry this took me so long to get back around to doing.
It's not as ergonomic as I had hoped; I forgot you can't provide defaults to generics of functions. Honestly, it may be more hassle than it's worth. I don't know if multi-error reporting is even remotely useful to folks for something as simple as a JSON pointer.
The miette output of the multi-error is a mess at the moment. I'm guessing it's due to the related errors but I don't know how to get a list to show otherwise.