rust-lang / regex
100%
master: 93%

DEFAULT BRANCH: master
LAST BUILD BRANCH: ag/misc-fixes
Repo Added 13 Mar 2016 05:50PM UTC
Files 20
LAST BUILD ON BRANCH word

9 Apr 2016 - 2:24 coverage: 100.0%. First build
507

push

travis-ci

BurntSushi
Add ASCII word boundaries to the lazy DFA.

In other words, `\b` in a `bytes::Regex` can now be used in the DFA.
This leads to a big performance boost:

```
sherlock::word_ending_n                  115,465,261 (5 MB/s)  3,038,621 (195 MB/s)    -112,426,640  -97.37%
```

Unfortunately, Unicode word boundaries continue to elude the DFA. This
state of affairs is lamentable, but after a lot of thought, I've
concluded there are only two ways to speed up Unicode word boundaries:

1. Come up with a harebrained scheme to add multi-byte look-behind/ahead
   to the lazy DFA. (The theory says it's possible. Figuring out how to
   do this without combinatorial state explosion is not within my grasp
   at the moment.)
2. Build a second lazy DFA with transitions on Unicode codepoints
   instead of bytes. (The looming inevitability of this makes me queasy
   for a number of reasons.)

To ameliorate this state of affairs, it is now possible to disable
Unicode support in `Regex::new` with `(?-u)`. In other words, one can
now use an ASCII word boundary with `(?-u:\b)`.
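
As a rough illustration of why the ASCII case is so much easier for a byte-based DFA, the sketch below checks for an ASCII word boundary by looking at exactly one byte on each side of a position. The helper names are hypothetical and this is not the regex crate's implementation; it only shows the single-byte look-around that `(?-u:\b)` needs. A Unicode-aware check would instead have to decode a whole codepoint (up to four bytes) on each side, which is the multi-byte look-behind/ahead problem described above.

```rust
/// True for the ASCII word bytes `[0-9A-Za-z_]`.
fn is_ascii_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Hypothetical helper (not part of the regex crate): reports whether an
/// ASCII word boundary sits at byte offset `at` in `haystack`.
/// Only the single byte before `at` and the single byte at `at` are read.
/// Positions past the end of the haystack are treated as non-word.
fn is_ascii_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at
        .checked_sub(1)
        .and_then(|i| haystack.get(i))
        .map_or(false, |&b| is_ascii_word_byte(b));
    let after = haystack.get(at).map_or(false, |&b| is_ascii_word_byte(b));
    // A boundary exists when exactly one side is a word byte.
    before != after
}

fn main() {
    // "é" is encoded as two non-ASCII bytes, so under ASCII rules there is
    // a boundary right after "caf". Under Unicode rules "é" is a word
    // character and that position would not be a boundary.
    let haystack = "caf\u{e9} au lait".as_bytes();
    for at in 0..=haystack.len() {
        if is_ascii_word_boundary(haystack, at) {
            println!("ASCII word boundary at byte offset {}", at);
        }
    }
}
```

Because the decision depends only on two adjacent bytes, it can be folded into byte-at-a-time state transitions without any wider context.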

Disabling Unicode support does not violate any invariants around UTF-8.
In particular, if the regular expression could lead to a match of
invalid UTF-8, then the parser will return an error. (This only happens
for `Regex::new`. `bytes::Regex::new` still of course allows matching
arbitrary bytes.)
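
To make the UTF-8 invariant concrete, here is a minimal standard-library sketch of what goes wrong when a match span is not valid UTF-8. The `span_as_str` helper is hypothetical and is not the regex crate's code; it only shows that a span covering a lone `\xFF` byte, or half of a multi-byte codepoint, cannot be handed back as a `&str`, which is the situation the parser error guards against.

```rust
use std::str;

/// Hypothetical helper: view the byte span `start..end` of `haystack` as a
/// `&str`. Returns `None` if the range is out of bounds or if the bytes in
/// the span are not valid UTF-8.
fn span_as_str(haystack: &[u8], start: usize, end: usize) -> Option<&str> {
    let span = haystack.get(start..end)?;
    str::from_utf8(span).ok()
}

fn main() {
    let haystack = b"a\xFFb";
    // A span covering only ASCII is fine.
    println!("{:?}", span_as_str(haystack, 0, 1));
    // A span covering the lone 0xFF byte is not valid UTF-8.
    println!("{:?}", span_as_str(haystack, 1, 2));

    // Splitting a two-byte codepoint in half is also invalid.
    let e_acute = "\u{e9}".as_bytes();
    println!("{:?}", span_as_str(e_acute, 0, 1));
    println!("{:?}", span_as_str(e_acute, 0, 2));
}
```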

Finally, a new `PERFORMANCE.md` guide was written.

1048 of 1048 relevant lines covered (100.0%)

1.0 hits per line

Source Files on word
  • List 16
  • Changed 0
  • Source Changed 0
  • Coverage Changed 0

Builds

Build | Branch | Commit | Type | Ran | Committer | Via | Coverage
507 | word | Add ASCII word boundaries to the lazy DFA. ... | push | 01 May 2018 11:09AM UTC | BurntSushi | travis-ci | 100.0
See All Builds (912)