grobidOrg / grobid / 28109899234
39%

Build:
DEFAULT BRANCH: master
Ran 24 Jun 2026 03:41PM UTC
Jobs 3
Files 325
Run time 1min
24 Jun 2026 03:29PM UTC coverage: 38.58% (-0.06%) from 38.641%
28109899234 · push · github · web-flow
Fix invalid XML output tags in the generated training data (#1470)

In testClosingTag(), the guard deciding whether to close a paragraph after a
reference marker used '||' between mutually exclusive equality checks
(citation/figure/table/equation marker), which is always true. As a result a
'</p>' was appended after a marker even when the following token was another
marker inside the same paragraph, producing redundant and unbalanced tags
(unclosed/extra <p>) in the generated *.training.fulltext.tei.xml. Use '&&'
so the paragraph is closed only when the next tag is neither the paragraph
continuation nor another embedded marker.
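
The tautology described above can be reproduced in isolation. The following is a minimal sketch, not the code from testClosingTag(): the class name, method names, label strings and the exact shape of the condition are all assumptions made for illustration. It only shows why a chain of "not equal" checks joined by '||' holds for every input, while the same chain joined by '&&' holds only when the next label is none of the listed ones.

```java
final class ClosingGuardSketch {
    // Hypothetical label strings; the labels used by GROBID may differ.
    static final String PARAGRAPH = "<paragraph>";
    static final String CITATION = "<citation_marker>";
    static final String FIGURE = "<figure_marker>";
    static final String TABLE = "<table_marker>";
    static final String EQUATION = "<equation_marker>";

    // Buggy form: one value cannot equal several distinct labels at once,
    // so at least one of the inequalities is always true.
    static boolean closeParagraphBuggy(String next) {
        return !next.equals(PARAGRAPH)
            || !next.equals(CITATION)
            || !next.equals(FIGURE)
            || !next.equals(TABLE)
            || !next.equals(EQUATION);
    }

    // Fixed form: true only when the next label is neither the paragraph
    // continuation nor another embedded marker.
    static boolean closeParagraphFixed(String next) {
        return !next.equals(PARAGRAPH)
            && !next.equals(CITATION)
            && !next.equals(FIGURE)
            && !next.equals(TABLE)
            && !next.equals(EQUATION);
    }

    public static void main(String[] args) {
        String[] samples = {PARAGRAPH, CITATION, FIGURE, "<section>"};
        for (String next : samples) {
            System.out.println(next
                + " buggy=" + closeParagraphBuggy(next)
                + " fixed=" + closeParagraphFixed(next));
        }
    }
}
```

With these assumed labels, the buggy guard returns true for every sample, including another marker, whereas the fixed guard returns true only for "<section>".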

This PR solves #465 and expands the scope to fix training data generation for all the available models:

    fulltext, ||→&& (#465, "createTraining produces fulltext TEI with unclosed tags"): the original marker-</p> tautology.
    table, stray opening tag: testClosingTag and writeField emitted an opening tag where bare text belonged.
    figure </content>: closed an element that was never opened (the gold corpus has no <content>).
    consecutive same-label blocks: adjacent figures, tables, <note> or <head> blocks never closed the prior element; the existing I-<paragraph>/I-<item> close-and-reopen pattern was extended to cover them.
    markers inside non-paragraph blocks: a marker ending a figure, equation or label always emitted </p>; a lastContainerTag0 tracker and closingTagForContainer() were added to close the actual enclosing block.
    adjacent different-type markers: <ref type="biblio">… nested; the prior marker is now closed first.
    affiliation multi-token marker: <affiliation> was re-opened on every marker token, crossing an open <marker>; it is now re-opened only on I-<marker>.
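
Several of the items above share one idea: remember which block is currently open, and close that block (not always a paragraph) before opening the next one, starting a new block whenever a label carries the "I-" prefix. The sketch below illustrates that idea only. It is not GROBID code: the class, the helpers openingTagFor() and closingTagFor(), the label set and the tag mapping are hypothetical, and token text is not XML-escaped.

```java
import java.util.Arrays;
import java.util.List;

final class BlockCloserSketch {
    // Hypothetical label-to-tag mapping, for illustration only.
    static String openingTagFor(String label) {
        switch (label) {
            case "<paragraph>": return "<p>";
            case "<figure>": return "<figure>";
            default: throw new IllegalArgumentException("unknown label " + label);
        }
    }

    static String closingTagFor(String label) {
        switch (label) {
            case "<paragraph>": return "</p>";
            case "<figure>": return "</figure>";
            default: throw new IllegalArgumentException("unknown label " + label);
        }
    }

    // Each token is {label, text}; an "I-" prefix marks the first token of a block.
    static String serialize(List<String[]> tokens) {
        StringBuilder out = new StringBuilder();
        String open = null; // label of the block that is currently open
        for (String[] token : tokens) {
            boolean begins = token[0].startsWith("I-");
            String label = begins ? token[0].substring(2) : token[0];
            // Close the block that is actually open when a new block starts,
            // even if the new block has the same label as the previous one.
            if (open != null && (begins || !label.equals(open))) {
                out.append(closingTagFor(open));
                open = null;
            }
            if (open == null) {
                out.append(openingTagFor(label));
                open = label;
            } else {
                out.append(' ');
            }
            out.append(token[1]);
        }
        if (open != null) {
            out.append(closingTagFor(open));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String[]> tokens = Arrays.asList(
            new String[] {"I-<figure>", "Fig."},
            new String[] {"<figure>", "1"},
            new String[] {"I-<figure>", "Fig."},
            new String[] {"<figure>", "2"},
            new String[] {"I-<paragraph>", "Text"});
        System.out.println(serialize(tokens));
    }
}
```

Under these assumptions, two adjacent figure blocks are emitted as two balanced elements, and the paragraph that follows opens only after the second figure has been closed.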

8656 of 24954 branches covered (34.69%)

Branch coverage included in aggregate %.

18431 of 45256 relevant lines covered (40.73%)

4.94 hits per line
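
The percentages reported above can be re-derived from the raw counts. The sketch below assumes that the aggregate figure pools covered branches and covered lines over their combined totals; that assumption is consistent with the reported 38.58%, but the formula is inferred from the numbers on this page rather than taken from Coveralls documentation. The class and method names are illustrative.

```java
import java.util.Locale;

final class CoverageSketch {
    // Percentage of covered items, as a double.
    static double percent(long covered, long total) {
        return 100.0 * covered / total;
    }

    // Two-decimal formatting with a fixed locale so the decimal point is stable.
    static String format(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        long branchesCovered = 8656, branchesTotal = 24954;
        long linesCovered = 18431, linesTotal = 45256;

        System.out.println("branch:    " + format(percent(branchesCovered, branchesTotal)) + "%");
        System.out.println("line:      " + format(percent(linesCovered, linesTotal)) + "%");
        // Assumed aggregate: branches and lines pooled together.
        System.out.println("aggregate: " + format(percent(
            branchesCovered + linesCovered, branchesTotal + linesTotal)) + "%");
    }
}
```

With the counts from this build, the three values come out as 34.69%, 40.73% and 38.58%.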

Coverage Regressions

Lines  Coverage  ∆       File
319    46.6      -1.55%  org/grobid/core/engines/FullTextParser.java
76     26.02     -0.1%   org/grobid/core/engines/AffiliationAddressParser.java
28     60.49     0.37%   org/grobid/core/engines/FigureParser.java
27     64.46     0.02%   org/grobid/core/engines/TableParser.java
5      20.65     -3.23%  org/grobid/core/process/StreamGobbler.java
1      74.51     -0.78%  org/grobid/core/layout/BoundingBox.java

Jobs
ID  Job ID         Ran                      Files  Coverage  Link
1   28109899234.1  24 Jun 2026 03:41PM UTC  325    38.57     GitHub Action Run
2   28109899234.2  24 Jun 2026 03:42PM UTC  325    38.58     GitHub Action Run
3   28109899234.3  24 Jun 2026 03:51PM UTC  325    38.57     GitHub Action Run

Source Files on build 28109899234
  • Tree
  • List 325
  • Changed 5
  • Source Changed 0
  • Coverage Changed 5
  • Github Actions Build #28109899234
  • 21ee9ffb on github
  • Prev Build on master (#28080400274)
  • Next Build on master (#28113626887)

© 2026 Coveralls, Inc