Skip to content

Add comments for the platform dependency of the regex module #747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

Add comments for the platform dependency of the regex module #747

wants to merge 9 commits into from

Conversation

jnooree
Copy link
Contributor

@jnooree jnooree commented Jul 1, 2020

In BSD-based platform such as macOS, the metacharacters with leading "\" are not supported.
This adds comments for users with those OS, to avoid confusing results.

Also changes the wrong associative array name in the example(s).

In BSD-based platform such as macOS, the metacharacters with leading "\" are not supported.
This adds comments for users with those OS, to avoid confusing results.

Also changes the wrong associative array name in the example(s).
@danielshahaf
Copy link
Member

By the way, please add the GitHub magic syntax Fixes #746 to the log message.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 1, 2020

I’ve missed some points.
It would be better if I make another fork, or it is just ok to keep working on this fork and make another PR?
I’m not that familiar with open source communities, sorry about that.

@danielshahaf
Copy link
Member

danielshahaf commented Jul 1, 2020 via email

jnooree added 3 commits July 2, 2020 00:49
This changes document structure and adds comments for the platform-dependency
of regular expressions.

Fixes #746.
@jnooree
Copy link
Contributor Author

jnooree commented Jul 1, 2020

Just added some commits. Hope this will help.

It would be better if I make another fork, or keep working on this fork and make another PR?
Neither. Keep the same fork, and keep the same PR (branch), and push additional commits to it. I'll squash/split as needed before merging.
I’m not that familiar with open source communities, sorry about that.
Not to worry; we were all beginners once. Thanks for the patch!

@danielshahaf
Copy link
Member

Thanks for the revisions.

Regarding the method of making the amendments — by pushing new commits to the existing PR's branch — it was exactly right.

Regarding the content, I like the current example because it serves two purposes — demonstrating how to do something the pattern highlighter can't, and demonstrating platform-dependent aspects — but I still wonder whether we could have two examples: first, a simple example that works everywhere, and second, the current, word boundaries example. I think this change would make the documentation more accessible to first-time readers; i.e., it would flatten the learning curve. WDYT?

Another question. The regexp semantics aren't part of z-sy-h; they're a zsh feature that z-sy-h uses. Therefore, should a note about the OS-dependence of regexp syntax and a reference to the relevant OS-dependent manual page (re_format(7)?) be added to zsh's manual? The note could go in the documentation of the zsh/regex module, for example. The zsh manual sources are in zsh's repository: https://github.com/zsh-users/zsh/tree/master/Doc/Zsh

No pun intended.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 3, 2020

Regarding the content, I like the current example because it serves two purposes — demonstrating how to do something the pattern highlighter can't, and demonstrating platform-dependent aspects — but I still wonder whether we could have two examples: first, a simple example that works everywhere, and second, the current, word boundaries example. I think this change would make the documentation more accessible to first-time readers; i.e., it would flatten the learning curve. WDYT?

That's a good idea. I would add one in front of the current examples.

Another question. The regexp semantics aren't part of z-sy-h; they're a zsh feature that z-sy-h uses. Therefore, should a note about the OS-dependence of regexp syntax and a reference to the relevant OS-dependent manual page (re_format(7)?) be added to zsh's manual? The note could go in the documentation of the zsh/regex module, for example. The zsh manual sources are in zsh's repository: https://github.com/zsh-users/zsh/tree/master/Doc/Zsh

Absolutely, I will try to move information to the Zsh manuals.

jnooree added 2 commits July 3, 2020 15:10
This adds a general, platform independent example before the more detailed
example.

Also removes details about the reason of the platform dependency of some
non-standard regular expressions, and replaces it with a hyperlink to the related
Zsh document.
@danielshahaf
Copy link
Member

Thanks, LGTM! I'll wait a bit for further feedback from others, and squash + merge as-is otherwise.

Corrects the root cause of the platform dependency.
@jnooree jnooree changed the title Add comments for BSD-based platform users Add comments for the platform dependency of the regex module Jul 3, 2020
@danielshahaf
Copy link
Member

@jnooree Thanks for the revisions.

@phy1729 Are we clear to merge, or?

The only issue I see is the phrase "many GNU/Linux". "GNU/Linux" is an adjective or a singular noun and "many" should be followed by a plural noun, such as "distributions".

@danielshahaf danielshahaf added this to the 0.8.0 milestone Jul 7, 2020
@jnooree
Copy link
Contributor Author

jnooree commented Jul 8, 2020

The only issue I see is the phrase "many GNU/Linux". "GNU/Linux" is an adjective or a singular noun and "many" should be followed by a plural noun, such as "distributions".

What a shame! I changed it.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 9, 2020

I have found something more about this. After adding this to ~/.zshrc:

autoload zsh/pcre
setopt RE_MATCH_PCRE

then

$ [[ "sudo" =~ "\bsudo\b" ]] && echo true || echo false
> true # before, it was false

I think it would be better adding the two lines to the docs rather than explaining everything,
as @danielshahaf mentioned earlier.

@danielshahaf
Copy link
Member

The emulate call will have reset that option to its default value by the time the regexp is compiled/matched, won't it?

Also, PCRE is optional at compile time.

Adding support for PCRE patterns sounds like a fair feature request, though.

@danielshahaf
Copy link
Member

My first question refers to what happens when the regexp highlighter compiles a string for matching.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 9, 2020

The emulate call will have reset that option to its default value by the time the regexp is compiled/matched, won't it?
My first question refers to what happens when the regexp highlighter compiles a string for matching.

Actually, when I test the "word boundary" syntax, the highlighter has evaluated it correctly.

Also, PCRE is optional at compile time.

But I didn't know about this. Then it would be better to leave the docs.

@danielshahaf
Copy link
Member

Actually, when I test the "word boundary" syntax, the highlighter has evaluated it correctly.

I stand corrected.

But I didn't know about this. Then it would be better to leave the docs.

Well, that's one option. Another option is to mention in the docs that if your build has the pcre module available, then you can use it to get PCRE features (and a platform-independent syntax to boot); I guess most distro builds do ship that module. However, see #748 which I just opened for a potential can of worms here.

I have found something more about this. After adding this to ~/.zshrc:

autoload zsh/pcre
setopt RE_MATCH_PCRE

Did this test happen on the same machine you saw the zmodload error on? Asking this in context of whether most distro packages ship zsh with PCRE support.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 13, 2020

Did this test happen on the same machine you saw the zmodload error on? Asking this in context of whether most distro packages ship zsh with PCRE support.

Yes, it did. All I need was the first line, autoload zsh/pcre. I think the error comes out if the module is not loaded (even if the module exists).

@danielshahaf
Copy link
Member

On Linux, I don't get dlopen() errors when the module is available but not loaded:

% zsh -fc '[[ foo -pcre-match bar ]]' 
zsh:1: unknown condition: -pcre-match
zsh: exit 2     zsh -fc '[[ foo -pcre-match bar ]]'
% zsh -fc 'zmodload zsh/pcre; [[ foo -pcre-match bar ]]' 
zsh: exit 1     zsh -fc 'zmodload zsh/pcre; [[ foo -pcre-match bar ]]'

Nor in a static build where pcre was disabled at build time:

% [[ foo -pcre-match bar ]] 
zsh: unknown condition: -pcre-match

So I'm not sure what to make of the error you saw.

By the way, you keep writing autoload. I assume you mean zmodload. autoload zsh/pcre simply marks the function called zsh/pcre as autoloadable; it doesn't have anything to do with the module.

@jnooree
Copy link
Contributor Author

jnooree commented Jul 13, 2020

On Linux, I don't get dlopen() errors when the module is available but not loaded:

% zsh -fc '[[ foo -pcre-match bar ]]' 
zsh:1: unknown condition: -pcre-match
zsh: exit 2     zsh -fc '[[ foo -pcre-match bar ]]'
% zsh -fc 'zmodload zsh/pcre; [[ foo -pcre-match bar ]]' 
zsh: exit 1     zsh -fc 'zmodload zsh/pcre; [[ foo -pcre-match bar ]]'

Nor in a static build where pcre was disabled at build time:

% [[ foo -pcre-match bar ]] 
zsh: unknown condition: -pcre-match

So I'm not sure what to make of the error you saw.

By the way, you keep writing autoload. I assume you mean zmodload. autoload zsh/pcre simply marks the function called zsh/pcre as autoloadable; it doesn't have anything to do with the module.

Now I've got the point. I changed the shell from the macOS system-wide zsh (/bin/zsh) to the homebrew version of zsh (during this conversation). Same error still exists with the system's zsh, but no problem with the homebrew version. Apple might have done something to the shell.

@danielshahaf
Copy link
Member

Summary: The patch is correct for the case that the RE_MATCH_PCRE option is unset, but doesn't cover the case that that option is set.

We're going to have to document the dependency on that option anyway, regardless of how we resolve #748.

So, what do you think of the following? It's @jnooree's text, extended to cover PCRE and copyedited.

This will highlight lines that start with a call to the `rm` command.

The regular expressions flavour used is [PCRE][pcresyntax] when the
`RE_MATCH_PCRE` option is set and POSIX Extended Regular Expressions (ERE),
as implemented by the platform's C library, otherwise.  For details on the
latter, see [the `zsh/regex` module's documentation][MAN_ZSH_REGEX] and the
`regcomp(3)` and `re_format(7)` manual pages on your system.

For instance, to highlight `sudo` only as a complete word, i.e., `sudo cmd`,
but not `sudoedit`, one might use:

* When the `RE_MATCH_PCRE` is set:

    ```zsh
    typeset -A ZSH_HIGHLIGHT_REGEXP
    ZSH_HIGHLIGHT_REGEXP+=('\bsudo\b' fg=123,bold)
    ```

* When the `RE_MATCH_PCRE` is unset, on platforms with GNU `libc` (e.g., many GNU/Linux distributions):

    ```zsh
    typeset -A ZSH_HIGHLIGHT_REGEXP
    ZSH_HIGHLIGHT_REGEXP+=('\<sudo\>' fg=123,bold)
    ```

* When the `RE_MATCH_PCRE` is unset, on BSD-based platforms (e.g., macOS):

    ```zsh
    typeset -A ZSH_HIGHLIGHT_REGEXP
    ZSH_HIGHLIGHT_REGEXP+=('[[:<:]]sudo[[:>:]]' fg=123,bold)
    ```

Note, however, that PCRE and POSIX ERE have a large common subset:
for instance, the regular expressions `[abc]`, `a*`, and `(a|b)` have the same
meaning in both flavours.

[pcresyntax]: http://www.pcre.org/original/doc/html/pcresyntax.html

@danielshahaf
Copy link
Member

Merged with @phy1729's review on #722 (comment). Thank you very much, @jnooree, and sorry we hadn't gotten to this sooner!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants