
OWASP / java-html-sanitizer / #133

08 Jun 2022 04:43PM UTC coverage: 92.78%. First build

Decode attribute content differently from text node content (#255)

As described in issue #254, `&para` is a complete character
reference when decoding text node content, but not when
decoding attribute content. Treating it as complete in attributes
causes problems for URL attribute values like

    /test?param1=foo&param2=bar

As shown via JS test code in that issue, when decoding attribute
content a small set of following characters prevents a character
reference name match from being considered complete.

This commit:
- modifies the decode functions to take an extra parameter
  `boolean inAttribute`, and modifies the Trie traversal
  loops to not store a longest match so far based on that
  parameter and some next character tests
- modifies the HTML lexer to pass that flag appropriately
- for backwards compatibility, leaves the old APIs in place but marks
  them `@deprecated`
- adds unit tests for the decode functions
- adds a unit test for the specific input from the issue
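
The decoding rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: `AttrDecodeSketch`, `NAMED`, and `blocksMatch` are hypothetical names, a three-entry map stands in for the real Trie, and the blocking set (`=` and ASCII alphanumerics) is taken from the HTML Standard's attribute rule, which is assumed to be the "small set" meant here.

```java
import java.util.Map;

final class AttrDecodeSketch {
  // Hypothetical mini-table; the real decoder walks a Trie of all named references.
  private static final Map<String, String> NAMED = Map.of(
      "amp", "&",
      "lt", "<",
      "para", "\u00B6");

  /** Assumed rule: these next characters block a semicolon-less match in attributes. */
  static boolean blocksMatch(char next) {
    return next == '='
        || (next >= '0' && next <= '9')
        || (next >= 'a' && next <= 'z')
        || (next >= 'A' && next <= 'Z');
  }

  static String decode(String html, boolean inAttribute) {
    StringBuilder out = new StringBuilder();
    int n = html.length();
    int i = 0;
    while (i < n) {
      char c = html.charAt(i);
      if (c != '&') {
        out.append(c);
        i++;
        continue;
      }
      String best = null;  // replacement text for the longest accepted match
      int bestEnd = -1;    // index just past that match
      for (int end = i + 2; end <= n; end++) {
        if (!Character.isLetterOrDigit(html.charAt(end - 1))) {
          break;  // names consist of letters and digits only
        }
        String name = html.substring(i + 1, end);
        if (!NAMED.containsKey(name)) {
          continue;
        }
        boolean hasSemi = end < n && html.charAt(end) == ';';
        if (inAttribute && !hasSemi && end < n && blocksMatch(html.charAt(end))) {
          continue;  // do not store this as the longest match so far
        }
        best = NAMED.get(name);
        bestEnd = hasSemi ? end + 1 : end;
      }
      if (best != null) {
        out.append(best);
        i = bestEnd;
      } else {
        out.append('&');  // no accepted match: keep the ampersand literally
        i++;
      }
    }
    return out.toString();
  }
}
```

With `inAttribute` set to `true`, the URL from the issue passes through unchanged because `&para` is followed by `m`. With `false`, the same input decodes `&para` to a pilcrow, which matches the text node behaviour described above.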

This change should make us more conformant with observed
browser behaviour, so it is not expected to cause compatibility
problems for existing users.

Fixes #254

14 of 16 new or added lines in 3 files covered. (87.5%)

3945 of 4252 relevant lines covered (92.78%)

0.93 hits per line


90.09
/src/main/java/org/owasp/html/HtmlEntities.java


