Bug 10496 - lazy matches in regular expressions
Summary: lazy matches in regular expressions
Status: NEW
Alias: None
Product: Busybox
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: All Linux
Importance: P5 normal
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-17 10:54 UTC by Shawn Landden
Modified: 2017-11-17 10:54 UTC
CC List: 1 user

See Also:
Host:
Target:
Build:


Description Shawn Landden 2017-11-17 10:54:43 UTC
The lack of lazy matching in regular expressions makes certain regular languages impossible to parse. There is a workaround at https://stackoverflow.com/a/39752929/8890015, but it only matches one occurrence.

TRE is a minimal library (64KB on Debian amd64) that has the O(n) matching-time guarantee (linear in the length of the input) that all regular expression engines SHOULD have, and it supports lazy matching. I was told once that libc++ used TRE, but I can't find proof right now.

I am talking about:

.*?

https://github.com/laurikari/tre/
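
For anyone unfamiliar with the notation, here is a minimal sketch of what `.*?` changes compared to `.*`. It uses Python's `re` module only because it supports lazy quantifiers out of the box; it is not TRE and not BusyBox's regex code, and it does not have TRE's linear-time guarantee. The sample string is made up for illustration.

```python
import re

# Hypothetical sample input with two bracketed tokens.
text = "<a><b>"

# Greedy: ".*" grabs as much as possible, so the match runs to the LAST ">".
greedy = re.search(r"<.*>", text).group(0)

# Lazy: ".*?" grabs as little as possible, so the match stops at the FIRST ">".
lazy = re.search(r"<.*?>", text).group(0)

print(greedy)  # <a><b>
print(lazy)    # <a>
```

The only difference between the two patterns is the trailing `?` on the quantifier, which is the feature this report asks for.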

An example of something impossible to parse:

Foo: bar
Diz: harf

Two: newlines
Requires: lazymatches
When: number of keys
Is: variable

You: can't parse this
Without: lazy matches
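
To make the request concrete, here is a rough sketch of how the input above could be split into records once lazy matching is available. It assumes records are separated by exactly one blank line and that every line has the form `Key: value`; the pattern, the `RECORD` name and the `parse_records` helper are illustrative only, and Python's `re` stands in for an engine with `.*?` support (it is not TRE and not what BusyBox ships).

```python
import re

SAMPLE = """Foo: bar
Diz: harf

Two: newlines
Requires: lazymatches
When: number of keys
Is: variable

You: can't parse this
Without: lazy matches
"""

# ".+?" is lazy: each record stops at the FIRST blank line (or at the end of
# the input), no matter how many "Key: value" lines it contains.
RECORD = re.compile(r"(.+?)(?:\n\n|\n?\Z)", re.DOTALL)


def parse_records(text):
    """Return one dict per blank-line-separated record (hypothetical helper)."""
    records = []
    for match in RECORD.finditer(text):
        fields = {}
        for line in match.group(1).splitlines():
            key, _, value = line.partition(": ")
            fields[key] = value
        records.append(fields)
    return records


records = parse_records(SAMPLE)
for record in records:
    print(record)
```

With a greedy `.+` in the same position, the first match would swallow everything up to the end of the input, so all three records would collapse into one.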