rust-lang / regex
100%
master: 93%

LAST BUILD BRANCH: ag/misc-fixes
DEFAULT BRANCH: master
Repo Added 13 Mar 2016 05:50PM CUT
Files 20
LAST BUILD ON BRANCH word
pending completion
507

push

travis-ci

BurntSushi
Add ASCII word boundaries to the lazy DFA.

In other words, `\b` in a `bytes::Regex` can now be used in the DFA.
This leads to a big performance boost (roughly 38x on the benchmark below):

```
sherlock::word_ending_n                  115,465,261 (5 MB/s)  3,038,621 (195 MB/s)    -112,426,640  -97.37%
```
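
For intuition about why the ASCII case fits a byte-at-a-time automaton: an ASCII word boundary can be decided from exactly one byte on each side of a position. The following is a minimal sketch of that predicate using only the standard library. It is not the crate's implementation, and the names `is_word_byte` and `is_ascii_word_boundary` are hypothetical, introduced here for illustration.

```rust
/// Returns true if `b` is an ASCII word byte: [0-9A-Za-z_].
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns true if position `at` is an ASCII word boundary in `haystack`,
/// i.e. exactly one of the bytes directly before and after `at` is a word
/// byte. Positions outside the haystack count as non-word.
fn is_ascii_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at
        .checked_sub(1)
        .and_then(|i| haystack.get(i))
        .map_or(false, |&b| is_word_byte(b));
    let after = haystack.get(at).map_or(false, |&b| is_word_byte(b));
    before != after
}

fn main() {
    let hay = b"hi there";
    // Collect every offset (including the end) that is a word boundary.
    let boundaries: Vec<usize> = (0..=hay.len())
        .filter(|&i| is_ascii_word_boundary(hay, i))
        .collect();
    println!("{:?}", boundaries); // prints [0, 2, 3, 8]
}
```

Because the check never looks further than one byte in either direction, it can plausibly be folded into per-byte state transitions, which is the property the Unicode case lacks.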

Unfortunately, Unicode word boundaries continue to elude the DFA. This
state of affairs is lamentable, but after a lot of thought, I've
concluded there are only two ways to speed up Unicode word boundaries:

1. Come up with a harebrained scheme to add multi-byte look-behind/ahead
   to the lazy DFA. (The theory says it's possible. Figuring out how to
   do this without combinatorial state explosion is not within my grasp
   at the moment.)
2. Build a second lazy DFA with transitions on Unicode codepoints
   instead of bytes. (The looming inevitability of this makes me queasy
   for a number of reasons.)
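
To make "multi-byte look-behind" concrete: in UTF-8 the codepoint before a position occupies one to four bytes, so a byte-based automaton cannot classify it from the previous byte alone. Below is a minimal standard-library sketch. The function `lookbehind` is hypothetical, and `char::is_alphanumeric` is only a rough stand-in for the Unicode word class, which it does not match exactly.

```rust
/// Returns how many bytes before byte offset `at` make up the preceding
/// codepoint, plus a rough "is word character" flag for it.
/// Returns None at the start of the text.
/// Assumption: `at` lies on a char boundary (slicing panics otherwise).
fn lookbehind(text: &str, at: usize) -> Option<(usize, bool)> {
    let prev = text[..at].chars().next_back()?;
    let is_word = prev.is_alphanumeric() || prev == '_';
    Some((prev.len_utf8(), is_word))
}

fn main() {
    // One-, two-, three-, and four-byte codepoints.
    for word in ["a", "\u{e9}", "\u{8a9e}", "\u{1d49c}"].iter() {
        println!("{:?} -> {:?}", word, lookbehind(word, word.len()));
    }
}
```

A DFA with byte transitions would need states that remember up to three earlier bytes to answer the same question, which is where the state-explosion concern in option 1 comes from.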

To ameliorate this state of affairs, it is now possible to disable
Unicode support in `Regex::new` with `(?-u)`. In other words, one can
now use an ASCII word boundary with `(?-u:\b)`.

Disabling Unicode support does not violate any invariants around UTF-8.
In particular, if the regular expression could lead to a match of
invalid UTF-8, then the parser will return an error. (This only happens
for `Regex::new`. `bytes::Regex::new`, of course, still allows matching
arbitrary bytes.)
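
The UTF-8 invariant can be illustrated without the regex crate. A match span made only of ASCII bytes is always valid UTF-8, while a span that covers an arbitrary byte, or only part of a multi-byte codepoint, is not. This is a small sketch with `std::str::from_utf8`; the helper `span_is_utf8` and the sample inputs are hypothetical.

```rust
use std::str;

/// Returns true if the bytes in `haystack[start..end]` form valid UTF-8,
/// which is what a match on `&str` input must always satisfy.
fn span_is_utf8(haystack: &[u8], start: usize, end: usize) -> bool {
    str::from_utf8(&haystack[start..end]).is_ok()
}

fn main() {
    // An ASCII-only span is always valid UTF-8.
    assert!(span_is_utf8(b"a word here", 2, 6));

    // A lone 0xFF byte never occurs in valid UTF-8.
    assert!(!span_is_utf8(&[b'a', 0xFF, b'b'], 1, 2));

    // U+00E9 is encoded as two bytes (0xC3 0xA9); matching only the
    // first byte would split the codepoint.
    let e_acute = "\u{e9}".as_bytes();
    assert!(!span_is_utf8(e_acute, 0, 1));
    assert!(span_is_utf8(e_acute, 0, 2));
}
```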

Finally, a new `PERFORMANCE.md` guide was written.

1048 of 1048 relevant lines covered (100.0%)

1.0 hits per line


Recent builds

Builds · Branch · Commit · Type · Ran · Committer · Via · Coverage
507 · word · Add ASCII word boundaries to the lazy DFA. · push · 01 May 2018 11:09AM CUT · BurntSushi · travis-ci · pending completion