Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build-your-own-parser #22

Open
dougwilson opened this issue May 27, 2014 · 19 comments
Open

build-your-own-parser #22

dougwilson opened this issue May 27, 2014 · 19 comments
Assignees

Comments

@dougwilson
Copy link
Contributor

If you look at the middleware, after I had refactored it, all they are is a call to typeis and a call to read. We should add something like bodyParser.generic() to let people roll their owner simple body parsers.

@dougwilson dougwilson self-assigned this May 27, 2014
@jonathanong
Copy link
Member

I think this lib is pretty easy to understand. IMO they should just fork this lib if they wanna do more stuff. Most of the logic should be in separate libs anyways

@Fishrock123
Copy link
Contributor

I think it would be better to have the interface and people have parsing libs than lots of body-parser forks.

@dougwilson
Copy link
Contributor Author

I think it would be better to have the interface

This module should be the interface. json and urlencoded should be the separable libs that use this module ;)

@jonathanong
Copy link
Member

You guys are just adding more responsibility to your plate hah

@Fishrock123
Copy link
Contributor

Yeah, that's the issue, may be too much to be worthwhile to support.

@dougwilson
Copy link
Contributor Author

i.am.machine :DDD

@mikermcneil
Copy link
Member

This module should be the interface. json and urlencoded should be the separable libs that use this module ;)

I can tackle this if it's helpful

@dougwilson
Copy link
Contributor Author

This is actually nearly done, as it was intended for the goal of bringing body-parser back into express core.

Out of curiosity, what parser were you planning to build that we don't have already :)?

@mikermcneil
Copy link
Member

Out of curiosity, what parser were you planning to build that we don't have already :)?

good question

@mikermcneil
Copy link
Member

xml

@mikermcneil
Copy link
Member

lol

@dougwilson
Copy link
Contributor Author

xml

lol. Somehow I knew that was the answer ;) Currently the best you can do is to use bodyParser.text and then feed the text into a XML parser.

@dougwilson
Copy link
Contributor Author

The reason why I had a feeling is because I use this module to parse XML all the time, but of course using text + parser requires the request body to buffer up instead of feeding it to an incremental parser (CSV is another common one, which I also use!).

This new stuff will actually be out sooner rather than later since I should not be distracted with express core.

@rlidwka
Copy link
Member

rlidwka commented Aug 1, 2014

Out of curiosity, what parser were you planning to build that we don't have already :)?

JSON5? I use that in a few APIs, because it's easier to debug such api with a curl (and b/w compatibility with json is perfect). Even created express-json5 based on bodyParser.

So this use-case is more real than it seems.

@dougwilson
Copy link
Contributor Author

Even created express-json5 based on bodyParser.

Though it looks like as body-parser is currently, it can be significantly simplified by wrapping bodyParser.text ;)

@dougwilson
Copy link
Contributor Author

@rlidwka or, looking at your code now, you could technically wrap bodyParser.json and if it errors, check err.body and parse that with the json5 parser :) I'm not actually saying this invalids the issue at had, here, because it's still valid and I'm working on it, haha.

@lopugit
Copy link

lopugit commented Jun 12, 2019

How do you build may I ask?

@jonchurch
Copy link
Member

jonchurch commented Feb 14, 2025

Reviving this issue as a more central place to discuss all the Generic parser work.

tldr;

Im in favor of creating a generic parser middleware interface folks can extend to create custom middlewares for their advanced usecases. However, Im against exposing an opts.parser function on existing parsers which users can use as an escape hatch

The Current PRs

We have several PRs open right now to land a Generic implementation. I don't want to leave feedback on all of these, or the additional closed ones, so Im writing this comment instead. cc @wesleytodd @UlisesGascon @ctcpip @Phillip9587

The existing approach here exposes an opts.parser option, which I don't want to accept.

Flexibility vs Simplicity

I'm open to offering a Generic interface for custom parsers, but I don’t believe we should expose an opts.parser option in our existing middleware parsers. I prefer we export well known and well tested parsers which are hard to muck up and simpler to reason about for us and users. (We do that already with urlencoded extended, and I've read enough issues to know that confuses people.)

The goal of body-parser (IMO) is to provide simple, reliable parsers for common HTTP body formats like JSON and urlencoded data, covering common usecases therein. By sticking to common use cases and avoiding overcomplication, the core middleware is easier for users to adopt and maintain long term.

Extending a generic interface to create a more custom parser is a viable way to provide users flexibility while keeping the simplicity of the core middlewares.

Customization needs are rare compared to the more common issues reported, which suggests that the current parsers cover most users' needs.

We’ve seen a small but recurring set of issues where users want to extend or change the behavior of existing parsers. Examples include:

Most of these requests represent edge cases or specialized needs, which is why I think a Generic parser interface is the best way to address them without overcomplicating the core parsers.

@Phillip9587
Copy link
Contributor

I agree with @jonchurch that we should focus on maintaining and improving our core parsers while allowing users to create their own custom parsers as needed. Keeping the core parsers simple and reliable should remain our priority.

Additionally, I think we should start over rather than trying to rebase the existing PRs. A fresh approach would give us a cleaner foundation to work from and ensure we implement this in the best possible way.

That said, there is #551, which introduces the normalizeOptions function. This PR standardizes and validates common parser options like inflate, limit, type and verify, ensuring consistent defaults and reducing duplication across the parsers. I think we should merge it as foundational work, which would help keep the scope of this effort smaller.

Looking forward to hearing everyone's thoughts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants