grobidOrg / grobid / 27977370222

22 Jun 2026 07:01PM UTC coverage: 38.579% (-0.009%) from 38.588%

Pull #1461

lfoppiano
fix: stop folding the letters æ/œ to "ae"/"oe" in clean() #728

TextUtilities.clean() expands typographic ligatures (the fi/fl/ff family,
genuine PDF rendering artifacts) but also folded the distinct Unicode
letters æ/Æ and œ/Œ to ASCII. Those are real letters used deliberately
in Danish, Norwegian, Icelandic, French, etc., so folding them was data
loss in the extracted text. They now pass through unchanged; the ligature
expansion is unaffected.

Add regression tests covering both the preserved letters and the
still-expanded typographic ligatures.
Pull Request #1461: Fix 728 - Avoid converting Danish letters when not necessary
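
The behaviour described in the commit message can be illustrated with a minimal sketch. This is not GROBID's TextUtilities.clean() code: the class name LigatureCleanSketch, the method expandLigatures, and the exact set of ligatures handled are assumptions made for illustration. It expands only the typographic f-ligatures from the Unicode Alphabetic Presentation Forms block and leaves æ/Æ and œ/Œ untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; NOT the actual GROBID TextUtilities.clean() implementation.
final class LigatureCleanSketch {

    // Typographic ligatures that are PDF rendering artifacts and safe to expand.
    // The set chosen here is an assumption, limited to U+FB00..U+FB04.
    private static final Map<Character, String> LIGATURES = new LinkedHashMap<>();
    static {
        LIGATURES.put('\uFB00', "ff");
        LIGATURES.put('\uFB01', "fi");
        LIGATURES.put('\uFB02', "fl");
        LIGATURES.put('\uFB03', "ffi");
        LIGATURES.put('\uFB04', "ffl");
    }

    // Expands the ligatures listed above; every other character, including
    // the letters U+00E6/U+00C6 (ae) and U+0153/U+0152 (oe), is copied unchanged.
    static String expandLigatures(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String expansion = LIGATURES.get(c);
            if (expansion != null) {
                out.append(expansion);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The fi ligature is expanded to two ASCII letters.
        System.out.println(expandLigatures("\uFB01nal")); // prints "final"
        // The letters ae and oe pass through as the same code points.
        String kept = "\u00E6 \u0153";
        System.out.println(expandLigatures(kept).equals(kept)); // prints "true"
    }
}
```

Using an explicit allow-list of ligature code points, rather than a general compatibility normalization, keeps the expansion from touching letters that merely look like ligatures.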

8636 of 24856 branches covered (34.74%)

Branch coverage included in aggregate %.

18385 of 45184 relevant lines covered (40.69%)

4.94 hits per line

Source File

22.96
/org/grobid/core/visualization/CitationsVisualizer.java
