Available combinators and how to use them

chomp provides a large number of combinators for parsing text. The following is a glossary containing all available combinators and their intended use.

Every chomp combinator returns a tuple of (remaining text [rem], extracted text [ext], error). For brevity, the error is omitted from every example in this glossary.
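To make the (rem, ext, error) shape concrete, here is a minimal standalone sketch. The `Combinator` type and the `tag` function are hypothetical stand-ins written for this illustration using only the standard library; they are not chomp's own code and only assume the return shape described above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Combinator is a hypothetical stand-in for the shape described above:
// it takes input text and returns (remaining text, extracted text, error).
type Combinator func(input string) (rem string, ext string, err error)

// tag is an illustrative helper (not chomp's implementation) that
// succeeds only when the input starts with the expected text.
func tag(want string) Combinator {
	return func(input string) (string, string, error) {
		if !strings.HasPrefix(input, want) {
			// On failure the input is handed back untouched.
			return input, "", errors.New("input does not start with " + want)
		}
		return input[len(want):], want, nil
	}
}

func main() {
	rem, ext, err := tag("Hello")("Hello, World!")
	fmt.Printf("rem: %q ext: %q err: %v\n", rem, ext, err)
	// prints: rem: ", World!" ext: "Hello" err: <nil>
}
```

Because a combinator is just a function from text to this tuple, larger parsers can be assembled by passing combinators to other combinators.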

Basic combinators

Combinator Usage Output

Tag

Must match a series of characters at the beginning of the input text in the exact order and case provided

chomp.Tag("Hello")("Hello, World!")
rem: ", World!"
ext: "Hello"

Any

Must match at least one character from the provided sequence at the beginning of the input text. Parsing stops upon the first unmatched character

chomp.Any("eH")("Hello, World!")
rem: "llo, World!"
ext: "He"

Not

Must match at least one character at the beginning of the input text that is not in the provided sequence. Parsing stops upon the first character that is in the sequence

chomp.Not("ol")("Hello, World!")
rem: "llo, World!"
ext: "He"
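The mirrored behavior of Any and Not can be sketched with the standard library alone. The names `span`, `anyOf` and `notOf` are hypothetical and exist only for this illustration; the sketch assumes that failing to consume at least one character is an error, as the descriptions above state.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// span splits input at the first rune for which stop returns true.
// At least one rune must be consumed, otherwise an error is returned.
func span(input string, stop func(rune) bool) (rem, ext string, err error) {
	idx := strings.IndexFunc(input, stop)
	if idx == -1 {
		idx = len(input) // no rune stopped the scan, so consume everything
	}
	if idx == 0 {
		return input, "", errors.New("no leading characters matched")
	}
	return input[idx:], input[:idx], nil
}

// anyOf sketches Any: consume leading runes that are in chars.
func anyOf(chars string) func(string) (string, string, error) {
	return func(input string) (string, string, error) {
		return span(input, func(r rune) bool { return !strings.ContainsRune(chars, r) })
	}
}

// notOf sketches Not: consume leading runes that are not in chars.
func notOf(chars string) func(string) (string, string, error) {
	return func(input string) (string, string, error) {
		return span(input, func(r rune) bool { return strings.ContainsRune(chars, r) })
	}
}

func main() {
	rem, ext, _ := anyOf("eH")("Hello, World!")
	fmt.Printf("any rem: %q ext: %q\n", rem, ext)

	rem, ext, _ = notOf("ol")("Hello, World!")
	fmt.Printf("not rem: %q ext: %q\n", rem, ext)
}
```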

OneOf

Must match a single character at the beginning of the text from the provided sequence

chomp.OneOf("!,eH")("Hello, World!")
rem: "ello, World!"
ext: "H"

NoneOf

Must match a single character at the beginning of the text that is not in the provided sequence

chomp.NoneOf("loWrd!e")("Hello, World!")
rem: "ello, World!"
ext: "H"

Until

Will scan the input text for the first occurrence of the provided series of characters. Everything before that point in the text will be matched

chomp.Until("World")("Hello, World!")
rem: "World!"
ext: "Hello, "

Predicate combinators

Combinator Usage Output

While

Will scan the input text, testing each character against the provided predicate. The predicate must match at least one character

chomp.While(chomp.IsLetter)("Hello, World!")
rem: ", World!"
ext: "Hello"

WhileN

Will scan the input text, testing each character against the provided predicate. The predicate must match at least n characters. If n is zero, this becomes an optional combinator

chomp.WhileN(chomp.IsLetter, 1)("Hello, World!")
rem: ", World!"
ext: "Hello"

WhileNM

Will scan the input text, testing each character against the provided predicate. The predicate must match a minimum of n and up to a maximum of m characters. If n is zero, this becomes an optional combinator

chomp.WhileNM(
    chomp.IsLetter, 1, 8)("Hello, World!")
rem: ", World!"
ext: "Hello"

WhileNot

Will scan the input text, testing each character against the provided predicate. At least one character must fail the predicate. It has the inverse behavior of While

chomp.WhileNot(chomp.IsDigit)("Hello, World!")
rem: ""
ext: "Hello, World!"

WhileNotN

Will scan the input text, testing each character against the provided predicate. At least n characters must fail the predicate. If n is zero, this becomes an optional combinator. It has the inverse behavior of WhileN

chomp.WhileNotN(
    chomp.IsDigit, 1)("Hello, World!")
rem: ""
ext: "Hello, World!"

WhileNotNM

Will scan the input text, testing each character against the provided predicate. A minimum of n and up to a maximum of m characters must fail the predicate. If n is zero, this becomes an optional combinator. It has the inverse behavior of WhileNM

chomp.WhileNotNM(
    chomp.IsLetter, 1, 8,
)("20240709 was a great day")
rem: " was a great day"
ext: "20240709"

Available predicates

  • IsDigit: Determines whether a rune is a decimal digit. A rune is classed as a digit if it is within the ASCII range '0' to '9', or if it belongs to the Unicode Nd category.

  • IsLetter: Determines whether a rune is a letter. A rune is classed as a letter if it is within the ASCII range 'a' to 'z' (including the uppercase equivalents), or if it belongs to any of the Unicode letter categories: Lu, Ll, Lt, Lm, Lo.

  • IsAlphanumeric: Determines whether a rune is a decimal digit or a letter. This convenience method wraps the existing IsDigit and IsLetter predicates.

  • IsLineEnding: Determines whether a rune is one of the following ASCII line ending characters '\r' or '\n'.
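The predicate list above maps closely onto Go's standard `unicode` package. The following is a rough sketch with hypothetical lowercase names; it is not chomp's implementation. `unicode.IsDigit` covers the Nd category (which includes ASCII '0' to '9') and `unicode.IsLetter` covers the Lu, Ll, Lt, Lm and Lo categories.

```go
package main

import (
	"fmt"
	"unicode"
)

// isDigit reports whether r is a decimal digit (Unicode category Nd).
func isDigit(r rune) bool { return unicode.IsDigit(r) }

// isLetter reports whether r is in any Unicode letter category.
func isLetter(r rune) bool { return unicode.IsLetter(r) }

// isAlphanumeric simply combines the two predicates above.
func isAlphanumeric(r rune) bool { return isDigit(r) || isLetter(r) }

// isLineEnding reports whether r is an ASCII carriage return or line feed.
func isLineEnding(r rune) bool { return r == '\r' || r == '\n' }

func main() {
	for _, r := range "a9_\n" {
		fmt.Printf("%q digit=%t letter=%t alnum=%t eol=%t\n",
			r, isDigit(r), isLetter(r), isAlphanumeric(r), isLineEnding(r))
	}
}
```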

Sequence combinators

Combinator Usage Output

Pair

Will scan the input text and match each combinator in turn. Both combinators must match

chomp.Pair(
    chomp.Tag("Hello,"),
    chomp.Tag(" World"))("Hello, World!")
rem: "!"
ext: ["Hello,", " World"]

SepPair

Will scan the input text and match each combinator in turn, discarding the separator’s output. All combinators must match

chomp.SepPair(
    chomp.Tag("Hello"),
    chomp.Tag(", "),
    chomp.Tag("World"))("Hello, World!")
rem: "!"
ext: ["Hello", "World"]
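Sequencing with a discarded separator can be sketched as three calls that thread the remaining text from one combinator to the next. The `parser` type and the `tag` and `sepPair` functions are hypothetical names for this illustration, and returning the original input on failure is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parser mirrors the (rem, ext, error) shape used throughout this glossary.
type parser func(input string) (rem, ext string, err error)

// tag is a hypothetical helper that matches an exact prefix.
func tag(want string) parser {
	return func(input string) (string, string, error) {
		if !strings.HasPrefix(input, want) {
			return input, "", errors.New("expected prefix " + want)
		}
		return input[len(want):], want, nil
	}
}

// sepPair runs left, sep and right in order and drops the separator's
// output. On any failure the original input is handed back untouched.
func sepPair(left, sep, right parser) func(string) (string, []string, error) {
	return func(input string) (string, []string, error) {
		rem, l, err := left(input)
		if err != nil {
			return input, nil, err
		}
		if rem, _, err = sep(rem); err != nil {
			return input, nil, err
		}
		rem, r, err := right(rem)
		if err != nil {
			return input, nil, err
		}
		return rem, []string{l, r}, nil
	}
}

func main() {
	rem, ext, err := sepPair(tag("Hello"), tag(", "), tag("World"))("Hello, World!")
	fmt.Println(rem, ext, err)
}
```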

Repeat

Will scan the input text and match the combinator the defined number of times. Every execution must match

chomp.Repeat(
    chomp.Parentheses(), 2,
)("(Hello)(World)(!)")
rem: "(!)"
ext: ["Hello", "World"]

RepeatRange

Will scan the input text and match the combinator between a minimum and maximum number of times. It must match the expected minimum number of times

chomp.RepeatRange(
    chomp.OneOf("Hleo"), 1, 8,
)("Hello, World!")
rem: ", World!"
ext: ["H", "e", "l", "l", "o"]

Delimited

Will match a series of combinators against the input text. All must match, with the delimiters being discarded

chomp.Delimited(
    chomp.Tag("'"),
    chomp.Tag("Hello, World!"),
    chomp.Tag("'"))("'Hello, World!'")
rem: ""
ext: "Hello, World!"

DoubleQuote

Will match any text delimited (or surrounded) by a pair of "double quotes"

chomp.DoubleQuote()(`"Hello, World!"`)
rem: ""
ext: "Hello, World!"

QuoteSingle

Will match any text delimited (or surrounded) by a pair of 'single quotes'

chomp.QuoteSingle()("'Hello, World!'")
rem: ""
ext: "Hello, World!"

BracketSquare

Will match any text delimited (or surrounded) by a pair of [square brackets]

chomp.BracketSquare()("[Hello, World!]")
rem: ""
ext: "Hello, World!"

Parentheses

Will match any text delimited (or surrounded) by a pair of (parentheses)

chomp.Parentheses()("(Hello, World!)")
rem: ""
ext: "Hello, World!"

BracketAngled

Will match any text delimited (or surrounded) by a pair of <angled brackets>

chomp.BracketAngled()("<Hello, World!>")
rem: ""
ext: "Hello, World!"

First

Will match the input text against a series of combinators. Matching stops as soon as the first combinator succeeds. At least one combinator must match. For better performance, try to order the combinators from most to least likely to match

chomp.First(
    chomp.Tag("Good Morning"),
    chomp.Tag("Hello"),
)("Good Morning, World!")
rem: ", World!"
ext: "Good Morning"
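Ordered choice can be sketched as a loop that returns on the first success, which is also why ordering affects both performance and the result. The `parser` type and the `tag` and `first` functions are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parser mirrors the (rem, ext, error) shape used throughout this glossary.
type parser func(input string) (rem, ext string, err error)

// tag is a hypothetical helper that matches an exact prefix.
func tag(want string) parser {
	return func(input string) (string, string, error) {
		if !strings.HasPrefix(input, want) {
			return input, "", errors.New("expected prefix " + want)
		}
		return input[len(want):], want, nil
	}
}

// first tries each parser against the same input, in the order given,
// and returns the result of the first one that succeeds.
func first(parsers ...parser) parser {
	return func(input string) (string, string, error) {
		for _, p := range parsers {
			if rem, ext, err := p(input); err == nil {
				return rem, ext, nil
			}
		}
		return input, "", errors.New("no combinator matched")
	}
}

func main() {
	greeting := first(tag("Good Morning"), tag("Hello"))
	rem, ext, err := greeting("Good Morning, World!")
	fmt.Printf("rem: %q ext: %q err: %v\n", rem, ext, err)
}
```

In this sketch a shorter alternative listed before a longer one with the same prefix wins, so `first(tag("Good"), tag("Good Morning"))` would extract only "Good".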

All

Will match the input text against a series of combinators. All combinators must match in the order provided

chomp.All(
    chomp.Tag("Hello"),
    chomp.Until("W"),
    chomp.Tag("World!"))("Hello, World!")
rem: ""
ext: ["Hello", ", ", "World!"]

Many

Will scan the input text and must match the combinator at least once. This combinator is greedy and will continuously execute until the first failed match. It is the equivalent of calling ManyN with an argument of 1

chomp.Many(chomp.OneOf("Ho"))("Hello, World!")
rem: "ello, World!"
ext: ["H"]

ManyN

Will scan the input text and match the combinator a minimum number of times. This combinator is greedy and will continuously execute until the first failed match

chomp.ManyN(
    chomp.OneOf("W"), 0)("Hello, World!")
rem: "Hello, World!"
ext: []
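Greedy repetition with a minimum count can be sketched as a loop that keeps applying the inner combinator until it fails. The names `parser`, `oneOf` and `manyN` are hypothetical; the guard that stops when no input is consumed is a design choice of this sketch to avoid looping forever, not something the text above specifies.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// parser mirrors the (rem, ext, error) shape used throughout this glossary.
type parser func(input string) (rem, ext string, err error)

// oneOf is a hypothetical helper that matches one leading rune from chars.
func oneOf(chars string) parser {
	return func(input string) (string, string, error) {
		r, size := utf8.DecodeRuneInString(input)
		if size == 0 || !strings.ContainsRune(chars, r) {
			return input, "", errors.New("no match")
		}
		return input[size:], input[:size], nil
	}
}

// manyN applies p repeatedly until it fails, then checks the minimum.
func manyN(p parser, n int) func(string) (string, []string, error) {
	return func(input string) (string, []string, error) {
		out := []string{}
		rem := input
		for {
			next, ext, err := p(rem)
			if err != nil || next == rem {
				break // first failure, or no progress was made
			}
			out = append(out, ext)
			rem = next
		}
		if len(out) < n {
			return input, nil, errors.New("too few matches")
		}
		return rem, out, nil
	}
}

func main() {
	rem, ext, err := manyN(oneOf("Ho"), 1)("Hello, World!")
	fmt.Println(rem, ext, err)
}
```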

Prefixed

Will scan the input text for a defined prefix and discard it before matching the remaining text against the combinator. Both combinators must match

chomp.Prefixed(
    chomp.Tag("Hello"),
    chomp.Tag(`"`))(`"Hello, World!"`)
rem: `, World!"`
ext: "Hello"

Suffixed

Will match the input text against the combinator before matching a suffix and discarding it. Both combinators must match

chomp.Suffixed(
    chomp.Tag("Hello"),
    chomp.Tag(", "))("Hello, World!")
rem: "World!"
ext: "Hello"

Modifier combinators

Combinator Usage Output

Map

Map the result of a combinator to any other type

chomp.Map(
    chomp.While(chomp.IsDigit),
    func (in string) int {
        return len(in)
    },
)("123456")
rem: ""
ext: 6
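Mapping a result to another type can be sketched with Go generics (Go 1.18 or later). The names `digits` and `mapResult` are hypothetical and exist only for this illustration; the sketch assumes the mapping function runs only when the inner combinator succeeds.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// digits is a hypothetical helper that consumes one or more leading digits.
func digits(input string) (string, string, error) {
	end := strings.IndexFunc(input, func(r rune) bool { return !unicode.IsDigit(r) })
	if end == -1 {
		end = len(input) // the whole input is digits
	}
	if end == 0 {
		return input, "", errors.New("no leading digit")
	}
	return input[end:], input[:end], nil
}

// mapResult converts the output of c from type A to type B using fn.
// On failure it returns the zero value of B along with the error.
func mapResult[A, B any](c func(string) (string, A, error), fn func(A) B) func(string) (string, B, error) {
	return func(input string) (string, B, error) {
		rem, ext, err := c(input)
		if err != nil {
			var zero B
			return input, zero, err
		}
		return rem, fn(ext), nil
	}
}

func main() {
	length := mapResult(digits, func(in string) int { return len(in) })
	rem, ext, err := length("123456")
	fmt.Printf("rem: %q ext: %d err: %v\n", rem, ext, err)
	// prints: rem: "" ext: 6 err: <nil>
}
```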

S

Wraps the result of the inner combinator within a string slice. Combinators of differing return types can be successfully chained together while using this conversion combinator

chomp.S(chomp.Until(","))("Hello, World!")
rem: ", World!"
ext: ["Hello"]

I

Extracts and returns a single string from the result of the inner combinator. Combinators of differing return types can be successfully chained together while using this conversion combinator

chomp.I(chomp.SepPair(
    chomp.Tag("Hello"),
    chomp.Tag(", "),
    chomp.Tag("World")), 1)("Hello, World!")
rem: "!"
ext: "World"

Peek

Will scan the text and apply the combinator without consuming any input. Useful if you need to look ahead

chomp.Peek(chomp.Tag("Hello"))("Hello, World!")
rem: "Hello, World!"
ext: "Hello"

Opt

Allows a combinator to be optional by discarding its returned error and not modifying the input text upon failure

chomp.Opt(chomp.Tag("Hey"))("Hello, World!")
rem: "Hello, World!"
ext: ""
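Looking ahead and optional matching both wrap an inner combinator and adjust what is handed back. The `parser` type and the `tag`, `peek` and `opt` functions below are hypothetical names for this illustration rather than chomp's own code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parser mirrors the (rem, ext, error) shape used throughout this glossary.
type parser func(input string) (rem, ext string, err error)

// tag is a hypothetical helper that matches an exact prefix.
func tag(want string) parser {
	return func(input string) (string, string, error) {
		if !strings.HasPrefix(input, want) {
			return input, "", errors.New("expected prefix " + want)
		}
		return input[len(want):], want, nil
	}
}

// peek runs p but always returns the original input as the remainder.
func peek(p parser) parser {
	return func(input string) (string, string, error) {
		_, ext, err := p(input)
		if err != nil {
			return input, "", err
		}
		return input, ext, nil
	}
}

// opt runs p and swallows any error, leaving the input unmodified.
func opt(p parser) parser {
	return func(input string) (string, string, error) {
		rem, ext, err := p(input)
		if err != nil {
			return input, "", nil
		}
		return rem, ext, nil
	}
}

func main() {
	rem, ext, err := peek(tag("Hello"))("Hello, World!")
	fmt.Printf("peek rem: %q ext: %q err: %v\n", rem, ext, err)

	rem, ext, err = opt(tag("Hey"))("Hello, World!")
	fmt.Printf("opt  rem: %q ext: %q err: %v\n", rem, ext, err)
}
```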

Flatten

Flattens the output from a combinator by joining all extracted values into a single string

chomp.Flatten(
    chomp.Many(chomp.Parentheses()),
)("(H)(el)(lo), World!")
rem: ", World!"
ext: "Hello"

Ready-made parsers

Combinator Usage Output

Crlf

Must match either a CR (\r) or CRLF (\r\n) line ending

chomp.Crlf()("\r\nHello")
rem: "Hello"
ext: "\r\n"

Eol

Will scan and return any text before any ASCII line ending characters. Line endings are discarded

chomp.Eol()("Hello, World!\nIt's a great day!")
rem: "It's a great day!"
ext: "Hello, World!"
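Reading up to a line ending can be sketched with `strings.IndexAny`. The `eol` function is a hypothetical stand-in, and two behaviors are assumptions of this sketch rather than documented facts: only one line ending (LF, CR or CRLF) is discarded, and input without any line ending is returned whole.

```go
package main

import (
	"fmt"
	"strings"
)

// eol returns the text before the first ASCII line ending and discards
// that single line ending (\n, \r or \r\n).
func eol(input string) (rem, ext string, err error) {
	idx := strings.IndexAny(input, "\r\n")
	if idx == -1 {
		// Assumption: with no line ending, the whole input is the line.
		return "", input, nil
	}
	ext = input[:idx]
	rem = input[idx+1:]
	if input[idx] == '\r' && strings.HasPrefix(rem, "\n") {
		rem = rem[1:] // treat \r\n as one line ending
	}
	return rem, ext, nil
}

func main() {
	rem, ext, _ := eol("Hello, World!\nIt's a great day!")
	fmt.Printf("rem: %q ext: %q\n", rem, ext)
	// prints: rem: "It's a great day!" ext: "Hello, World!"
}
```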