Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Render to regex syntax #36

Open
chrisdone opened this issue Sep 13, 2019 · 8 comments
Open

Render to regex syntax #36

chrisdone opened this issue Sep 13, 2019 · 8 comments

Comments

@chrisdone
Copy link

chrisdone commented Sep 13, 2019

I'm making a new form library like formlets, and considering the ability to embed a regex as a validator/parser which would run on the server side with regex-applicative upon submission, but also, as a UX optimization, compile the same regex to a [a-z]{2}+-style regex to be used as the pattern=".." attribute for html5 (avoiding a roundtrip just to do a simple text format check). This would enable a developer to build forms with client and server-side validation composed using the same regex-applicative recipe.

So the question is: is there already a library or module that would convert an RE s a to Text? E.g.

render (many (sym 'a')) => "a*"

I assume I could write my own, but perhaps it already exists.

Thanks!

@UnkindPartition
Copy link
Owner

Unfortunately, this is not possible because sym 'a' is translated to psym (== 'a'), and you cannot extract 'a' out of (== 'a').

@chrisdone
Copy link
Author

chrisdone commented Sep 14, 2019

@feuerbach Ah, ok. So it was worth asking. Thanks.

I suppose then I would need to wrap up the API with an interface which doesn't expose psym. E.g.

data Regex a

with a Regex a -> RE Char a conversion function. Whipping up a simple GADT that is the same as the RE one but without this capability should be trivial, and I can re-use the rest of the library.

If you're interested, I could send it as a PR against this package as e.g. Text.Regex.Applicative.Renderable or w/e, otherwise package it as regex-applicative-renderable or something. I'm sure I'll find other uses than just this form package, and the work is probably trivial.

@UnkindPartition
Copy link
Owner

Yeah, or maybe have a class containing sym and perhaps some other functions, and make both types instances of that class.

The other constructors it should probably support are:

  1. Recognizing a set (as in Data.Set.Set) of characters — can be simulated by summing individual syms, but could be implemented more efficiently.
  2. Recognizing a class (as in a POSIX regex class — alpha, alnum etc.) of characters — can in theory be implemented via individual syms, but in practice would be prohibitive.

This new module/representation might also support more efficient compilation, i.e. it's potential utility won't be restricted just to rendering. Let's call it Text.Regex.Applicative.Char?

@chrisdone
Copy link
Author

OK, great then. When I get to this part of my lib I'll circle back to this issue. Thanks, Roman.

@treeowl
Copy link

treeowl commented Sep 15, 2019 via email

@chrisdone
Copy link
Author

@treeowl do you have benchmarks for that?

@ceedubs
Copy link

ceedubs commented Jun 5, 2020

I stumbled upon this discussion today. I had written a Scala regex library that a while ago I rewrote to more or less be a port of regex-applicative. The way I handled this in my library is that the the equivalent of the Symbol constructor (I call it Elem) has a separate metadata field of an abstract type m (which ends up needing to be another type parameter on RE).

For character (and any other discrete type) regexes, the m type ends up being a type that looks something like:

data Match a = Literal a | Wildcard | Allow (Diet a) | Forbid (Diet a)

where Diet is a discrete interval tree. This represents the notion of character classes in traditional regular expressions, supporting negation, unions and intersections. You can then write helper methods that implement the a -> maybe b function based on the Match.

In addition to being able to get a string representation of the regex, a nice benefit of having this metadata around is that you can then convert a regular expression to a Quickcheck generator that generates values that match that regex. I have a demo page here that I think is kind of fun.

Now is probably a good time to say thanks for creating this library, @feuerbach! As soon as I saw it I knew that I wanted to rewrite my Scala library :)

Edit: P.S. the polymorphic metadata field also makes a convenient place to throw a ThreadId into a tuple when compiling the regex so you don't have to have ThreadId baked into the user construction of the regex.

@UnkindPartition
Copy link
Owner

Thanks @ceedubs, those are some nice ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants