grpc / grpc-java / #20119

16 Dec 2025 06:38PM UTC coverage: 88.724% (+0.008%) from 88.716%

api: Fix encoding of IPv6 scopes. (#12564)

Both `java.net.URI` and Guava's `InetAddresses` predate the standardization
of IPv6 scopes in URIs. Both emit and accept a naked % inside the
[square-bracketed] address, which causes the current implementation to
crash in io.grpc.Uri#getHost while percent-decoding. RFC 6874 says that a
% in an IP-literal must be percent-encoded, just as it is everywhere
else.

RFC 3986 and RFC 9844 say not to support scopes at all, but I think we should, for feature parity with `java.net.URI`. (Other contemporary libraries take the same approach, e.g. https://pkg.go.dev/net/url.) A future PR will provide a first-class method to convert from `java.net.URI` that handles all the edge cases.
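
As a rough illustration of the RFC 6874 rule described above, here is a minimal sketch that rewrites a host with a naked % (the syntax `java.net.URI` emits) into the percent-encoded form. It is not the code from this PR: the class `Ipv6ScopeSketch` and the method `toRfc6874Host` are hypothetical names chosen for the example. The sketch assumes its input is a square-bracketed literal using the naked-% syntax, so an already-encoded host would be encoded a second time, and it does not validate the address itself.

```java
import java.nio.charset.StandardCharsets;

public final class Ipv6ScopeSketch {
  // RFC 3986 "unreserved" characters, which may appear literally in a zone ID.
  private static final String UNRESERVED =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

  /**
   * Hypothetical helper: rewrites "[addr%zone]" (naked '%') as "[addr%25zone]",
   * percent-encoding any zone bytes outside "unreserved" as UTF-8.
   */
  public static String toRfc6874Host(String bracketedHost) {
    if (!bracketedHost.startsWith("[") || !bracketedHost.endsWith("]")) {
      throw new IllegalArgumentException("Not an IP-literal: " + bracketedHost);
    }
    int percent = bracketedHost.indexOf('%');
    if (percent < 0) {
      return bracketedHost; // No scope, so there is nothing to encode.
    }
    String address = bracketedHost.substring(1, percent);
    String zone = bracketedHost.substring(percent + 1, bracketedHost.length() - 1);

    // The '%' delimiter itself becomes "%25".
    StringBuilder sb = new StringBuilder("[").append(address).append("%25");
    for (byte b : zone.getBytes(StandardCharsets.UTF_8)) {
      char c = (char) (b & 0xFF);
      if (UNRESERVED.indexOf(c) >= 0) {
        sb.append(c);
      } else {
        sb.append('%').append(String.format("%02X", b & 0xFF));
      }
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    System.out.println(toRfc6874Host("[fe80::1%eth0]")); // prints [fe80::1%25eth0]
    System.out.println(toRfc6874Host("[2001:db8::7]")); // prints [2001:db8::7]
  }
}
```

With this encoding a generic percent-decoder sees `%25` and yields a literal %, rather than failing on `%et`, which is not a valid escape sequence.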

35472 of 39980 relevant lines covered (88.72%)

0.89 hits per line

Source File

98.67
/../api/src/main/java/io/grpc/Uri.java
/*
 * Copyright 2025 The gRPC Authors
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package io.grpc;

import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

import com.google.common.base.VerifyException;
import com.google.common.collect.ImmutableList;
import com.google.common.net.InetAddresses;
import com.google.errorprone.annotations.CanIgnoreReturnValue;
import java.net.InetAddress;
import java.net.URISyntaxException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import javax.annotation.Nullable;

/**
 * A not-quite-general-purpose representation of a Uniform Resource Identifier (URI), as defined by
 * <a href="https://datatracker.ietf.org/doc/html/rfc3986">RFC 3986</a>.
 *
 * <h1>The URI</h1>
 *
 * <p>A URI identifies a resource by its name or location or both. The resource could be a file,
 * service, or some other abstract entity.
 *
 * <h2>Examples</h2>
 *
 * <ul>
 *   <li><code>http://admin@example.com:8080/controlpanel?filter=users#settings</code>
 *   <li><code>ftp://[2001:db8::7]/docs/report.pdf</code>
 *   <li><code>file:///My%20Computer/Documents/letter.doc</code>
 *   <li><code>dns://8.8.8.8/storage.googleapis.com</code>
 *   <li><code>mailto:John.Doe@example.com</code>
 *   <li><code>tel:+1-206-555-1212</code>
 *   <li><code>urn:isbn:978-1492082798</code>
 * </ul>
 *
 * <h2>Limitations</h2>
 *
 * <p>This class aims to meet the needs of grpc-java itself and RPC related code that depend on it.
 * It isn't quite general-purpose. It definitely would not be suitable for building an HTTP user
 * agent or proxy server. In particular, it:
 *
 * <ul>
 *   <li>Can only represent a URI, not a "URI-reference" or "relative reference". In other words, a
 *       "scheme" is always required.
 *   <li>Has no knowledge of the particulars of any scheme, with respect to normalization and
 *       comparison. We don't know <code>https://google.com</code> is the same as <code>
 *       https://google.com:443</code>, that <code>file:///</code> is the same as <code>
 *       file://localhost</code>, or that <code>joe@example.com</code> is the same as <code>
 *       joe@EXAMPLE.COM</code>. No one class can or should know everything about every scheme so
 *       all this is better handled at a higher layer.
 *   <li>Implements {@link #equals(Object)} as a char-by-char comparison. Expect false negatives.
 *   <li>Does not support "IPvFuture" literal addresses.
 *   <li>Does not reflect how web browsers parse user input or the <a
 *       href="https://url.spec.whatwg.org/">URL Living Standard</a>.
 *   <li>Does not support different character encodings. Assumes UTF-8 in several places.
 * </ul>
 *
 * <h2>Migrating from RFC 2396 and {@link java.net.URI}</h2>
 *
 * <p>Those migrating from {@link java.net.URI} and/or its primary specification in RFC 2396 should
 * note some differences.
 *
 * <h3>Uniform Hierarchical Syntax</h3>
 *
 * <p>RFC 3986 unifies the older ideas of "hierarchical" and "opaque" URIs into a single generic
 * syntax. What RFC 2396 called an opaque "scheme-specific part" is always broken out by RFC 3986
 * into an authority and path hierarchy, followed by query and fragment components. Accordingly,
 * this class has only getters for those components but no {@link
 * java.net.URI#getSchemeSpecificPart()} analog.
 *
 * <p>The RFC 3986 definition of path is now more liberal to accommodate this:
 *
 * <ul>
 *   <li>Path doesn't have to start with a slash. For example, the path of <code>
 *       urn:isbn:978-1492082798</code> is <code>isbn:978-1492082798</code> even though it doesn't
 *       look much like a file system path.
 *   <li>The path can now be empty. So Android's <code>
 *       intent:#Intent;action=MAIN;category=LAUNCHER;end</code> is now a valid {@link Uri}. Even
 *       the scheme-only <code>about:</code> is now valid.
 * </ul>
 *
 * <p>The uniform syntax always understands what follows a '?' to be a query string. For example,
 * <code>mailto:me@example.com?subject=foo</code> now has a query component whereas RFC 2396
 * considered everything after the <code>mailto:</code> scheme to be opaque.
 *
 * <p>Same goes for fragment. <code>data:image/png;...#xywh=0,0,10,10</code> now has a fragment
 * whereas RFC 2396 considered everything after the scheme to be opaque.
 *
 * <h3>Uniform Authority Syntax</h3>
 *
 * <p>RFC 2396 tried to guess if an authority was a "server" (host:port) or "registry-based"
 * (arbitrary string) based on its contents. RFC 3986 expects every authority to look like
 * [userinfo@]host[:port] and loosens the definition of a "host" to accommodate. Accordingly, this
 * class has no equivalent to {@link java.net.URI#parseServerAuthority()} -- authority was parsed
 * into its components and checked for validity when the {@link Uri} was created.
 *
 * <h3>Other Specific Differences</h3>
 *
 * <p>RFC 2396 does not allow underscores in a host name, meaning {@link java.net.URI} switches to
 * opaque mode when it sees one. {@link Uri} does allow underscores in host, to accommodate
 * registries other than DNS. So <code>http://my_site.com:8080/index.html</code> now parses as a
 * host, port and path rather than a single opaque scheme-specific part.
 *
 * <p>{@link Uri} strictly *requires* square brackets in the query string and fragment to be
 * percent-encoded whereas RFC 2396 merely recommended doing so.
 *
 * <p>Other URx classes are "liberal in what they accept and strict in what they produce." {@link
 * Uri#parse(String)} and {@link Uri#create(String)}, however, are strict in what they accept and
 * transparent when asked to reproduce it via {@link Uri#toString()}. The former policy may be
 * appropriate for parsing user input or web content, but this class is meant for gRPC clients,
 * servers and plugins like name resolvers where human error at runtime is less likely and best
 * detected early. {@link java.net.URI#create(String)} is similarly strict, which makes migration
 * easy, except for the server/registry-based ambiguity addressed by {@link
 * java.net.URI#parseServerAuthority()}.
 *
 * <p>{@link java.net.URI} and {@link Uri} both support IPv6 literals in square brackets as defined
 * by RFC 2732.
 *
 * <p>{@link java.net.URI} supports IPv6 scope IDs but accepts and emits a non-standard syntax.
 * {@link Uri} implements the newer RFC 6874, which percent encodes scope IDs and the % delimiter
 * itself. RFC 9844 claims to obsolete RFC 6874 because web browsers would not support it. This
 * class implements RFC 6874 anyway, mostly to avoid creating a barrier to migration away from
 * {@link java.net.URI}.
 */
@Internal
public final class Uri {
  // Components are stored percent-encoded, just as originally parsed for transparent parse/toString
  // round-tripping.
  private final String scheme; // != null since we don't support relative references.
  @Nullable private final String userInfo;
  @Nullable private final String host;
  @Nullable private final String port;
  private final String path; // In RFC 3986, path is always defined (but can be empty).
  @Nullable private final String query;
  @Nullable private final String fragment;

  private Uri(Builder builder) {
    this.scheme = checkNotNull(builder.scheme, "scheme");
    this.userInfo = builder.userInfo;
    this.host = builder.host;
    this.port = builder.port;
    this.path = builder.path;
    this.query = builder.query;
    this.fragment = builder.fragment;

    // Checks common to the parse() and Builder code paths.
    if (hasAuthority()) {
      if (!path.isEmpty() && !path.startsWith("/")) {
        throw new IllegalArgumentException("Has authority -- Non-empty path must start with '/'");
      }
    } else {
      if (path.startsWith("//")) {
        throw new IllegalArgumentException("No authority -- Path cannot start with '//'");
      }
    }
  }

  /**
   * Parses a URI from its string form.
   *
   * @throws URISyntaxException if 's' is not a valid RFC 3986 URI.
   */
  public static Uri parse(String s) throws URISyntaxException {
    try {
      return create(s);
    } catch (IllegalArgumentException e) {
      throw new URISyntaxException(s, e.getMessage());
    }
  }

  /**
   * Creates a URI from a string assumed to be valid.
   *
   * <p>Useful for defining URI constants in code. Not for user input.
   *
   * @throws IllegalArgumentException if 's' is not a valid RFC 3986 URI.
   */
  public static Uri create(String s) {
    Builder builder = new Builder();
    int i = 0;
    final int n = s.length();

    // 3.1. Scheme: Look for a ':' before '/', '?', or '#'.
    int schemeColon = -1;
    for (; i < n; ++i) {
      char c = s.charAt(i);
      if (c == ':') {
        schemeColon = i;
        break;
      } else if (c == '/' || c == '?' || c == '#') {
        break;
      }
    }
    if (schemeColon < 0) {
      throw new IllegalArgumentException("Missing required scheme.");
    }
    builder.setRawScheme(s.substring(0, schemeColon));

    // 3.2. Authority. Look for '//' then keep scanning until '/', '?', or '#'.
    i = schemeColon + 1;
    if (i + 1 < n && s.charAt(i) == '/' && s.charAt(i + 1) == '/') {
      // "//" just means we have an authority. Skip over it.
      i += 2;

      int authorityStart = i;
      for (; i < n; ++i) {
        char c = s.charAt(i);
        if (c == '/' || c == '?' || c == '#') {
          break;
        }
      }
      String authority = s.substring(authorityStart, i);

      // 3.2.1. UserInfo. Easy, because '@' cannot appear unencoded inside userinfo or host.
      int userInfoEnd = authority.indexOf('@');
      if (userInfoEnd >= 0) {
        builder.setRawUserInfo(authority.substring(0, userInfoEnd));
      }

      // 3.2.2/3. Host/Port.
      int hostStart = userInfoEnd >= 0 ? userInfoEnd + 1 : 0;
      int portStartColon = findPortStartColon(authority, hostStart);
      if (portStartColon < 0) {
        builder.setRawHost(authority.substring(hostStart, authority.length()));
      } else {
        builder.setRawHost(authority.substring(hostStart, portStartColon));
        builder.setRawPort(authority.substring(portStartColon + 1));
      }
    }

    // 3.3. Path: Whatever is left before '?' or '#'.
    int pathStart = i;
    for (; i < n; ++i) {
      char c = s.charAt(i);
      if (c == '?' || c == '#') {
        break;
      }
    }
    builder.setRawPath(s.substring(pathStart, i));

    // 3.4. Query, if we stopped at '?'.
    if (i < n && s.charAt(i) == '?') {
      i++; // Skip '?'
      int queryStart = i;
      for (; i < n; ++i) {
        char c = s.charAt(i);
        if (c == '#') {
          break;
        }
      }
      builder.setRawQuery(s.substring(queryStart, i));
    }

    // 3.5. Fragment, if we stopped at '#'.
    if (i < n && s.charAt(i) == '#') {
      ++i; // Skip '#'
      builder.setRawFragment(s.substring(i));
    }

    return builder.build();
  }

  private static int findPortStartColon(String authority, int hostStart) {
    for (int i = authority.length() - 1; i >= hostStart; --i) {
      char c = authority.charAt(i);
      if (c == ':') {
        return i;
      }
      if (c == ']') {
        // Hit the end of IP-literal. Any further colon is inside it and couldn't indicate a port.
        break;
      }
      if (!digitChars.get(c)) {
        // Found a non-digit, non-colon, non-bracket.
        // This means there is no valid port (e.g. host is "example.com")
        break;
      }
    }
    return -1;
  }

  // Checks a raw path for validity and parses it into segments. Let 'out' be null to just validate.
  private static void parseAssumedUtf8PathIntoSegments(
      String path, ImmutableList.Builder<String> out) {
    // Skip the first slash so it doesn't count as an empty segment at the start.
    // (e.g., "/a" -> ["a"], not ["", "a"])
    int start = path.startsWith("/") ? 1 : 0;

    for (int i = start; i < path.length(); ) {
      int nextSlash = path.indexOf('/', i);
      String segment;
      if (nextSlash >= 0) {
        // Typical segment case (e.g., "foo" in "/foo/bar").
        segment = path.substring(i, nextSlash);
        i = nextSlash + 1;
      } else {
        // Final segment case (e.g., "bar" in "/foo/bar").
        segment = path.substring(i);
        i = path.length();
      }
      if (out != null) {
        out.add(percentDecodeAssumedUtf8(segment));
      } else {
        checkPercentEncodedArg(segment, "path segment", pChars);
      }
    }

    // RFC 3986 says a trailing slash creates a final empty segment.
    // (e.g., "/foo/" -> ["foo", ""])
    if (path.endsWith("/") && out != null) {
      out.add("");
    }
  }

  /** Returns the scheme of this URI. */
  public String getScheme() {
    return scheme;
  }

  /**
   * Returns the percent-decoded "Authority" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the "host" component was encoded as UTF-8, as mandated by RFC 3986.
   * This method also assumes the "user information" part of authority was encoded as UTF-8,
   * although RFC 3986 doesn't specify an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawAuthority()}, {@link #percentDecode(CharSequence)}, then decode the bytes for
   * themselves.
   */
  @Nullable
  public String getAuthority() {
    return percentDecodeAssumedUtf8(getRawAuthority());
  }

  private boolean hasAuthority() {
    return host != null;
  }

  /**
   * Returns the "authority" component of this URI in its originally parsed, possibly
   * percent-encoded form.
   */
  @Nullable
  public String getRawAuthority() {
    if (hasAuthority()) {
      StringBuilder sb = new StringBuilder();
      appendAuthority(sb);
      return sb.toString();
    }
    return null;
  }

  private void appendAuthority(StringBuilder sb) {
    if (userInfo != null) {
      sb.append(userInfo).append('@');
    }
    if (host != null) {
      sb.append(host);
    }
    if (port != null) {
      sb.append(':').append(port);
    }
  }

  /**
   * Returns the percent-decoded "User Information" component of this URI, or null if not present.
   *
   * <p>NB: This method *assumes* this component was encoded as UTF-8, although RFC 3986 doesn't
   * specify an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawUserInfo()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getUserInfo() {
    return percentDecodeAssumedUtf8(userInfo);
  }

  /**
   * Returns the "User Information" component of this URI in its originally parsed, possibly
   * percent-encoded form.
   */
  @Nullable
  public String getRawUserInfo() {
    return userInfo;
  }

  /**
   * Returns the percent-decoded "host" component of this URI, or null if not present.
   *
   * <p>This method assumes the host was encoded as UTF-8, as mandated by RFC 3986.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawHost()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getHost() {
    return percentDecodeAssumedUtf8(host);
  }

  /**
   * Returns the host component of this URI in its originally parsed, possibly percent-encoded form.
   */
  @Nullable
  public String getRawHost() {
    return host;
  }

  /** Returns the "port" component of this URI, or -1 if not present. */
  public int getPort() {
    return port != null ? Integer.parseInt(port) : -1;
  }

  /** Returns the raw port component of this URI in its originally parsed form. */
  @Nullable
  public String getRawPort() {
    return port;
  }

  /**
   * Returns the (possibly empty) percent-decoded "path" component of this URI.
   *
   * <p>NB: This method *assumes* the path was encoded as UTF-8, although RFC 3986 doesn't specify
   * an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawPath()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   *
   * <p>NB: Prefer {@link #getPathSegments()} because this method's decoding is lossy. For example,
   * consider these (different) URIs:
   *
   * <ul>
   *   <li>file:///home%2Ffolder/my%20file
   *   <li>file:///home/folder/my%20file
   * </ul>
   *
   * <p>Calling getPath() on each returns the same string: <code>/home/folder/my file</code>. You
   * can't tell whether the second '/' character is part of the first path segment or separates the
   * first and second path segments. This method only exists to ease migration from {@link
   * java.net.URI}.
   */
  public String getPath() {
    return percentDecodeAssumedUtf8(path);
  }

  /**
   * Returns this URI's path as a list of path segments not including the '/' segment delimiters.
   *
   * <p>Prefer this method over {@link #getPath()} because it preserves the distinction between
   * segment separators and literal '/'s within a path segment.
   *
   * <p>The returned list is immutable.
   */
  public List<String> getPathSegments() {
    // Returned list must be immutable but we intentionally keep guava out of the public API.
    ImmutableList.Builder<String> segmentsBuilder = ImmutableList.builder();
    parseAssumedUtf8PathIntoSegments(path, segmentsBuilder);
    return segmentsBuilder.build();
  }

  /**
   * Returns the path component of this URI in its originally parsed, possibly percent-encoded form.
   */
  public String getRawPath() {
    return path;
  }

  /**
   * Returns the percent-decoded "query" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the query was encoded as UTF-8, although RFC 3986 doesn't specify an
   * encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawQuery()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getQuery() {
    return percentDecodeAssumedUtf8(query);
  }

  /**
   * Returns the query component of this URI in its originally parsed, possibly percent-encoded
   * form, without any leading '?' character.
   */
  @Nullable
  public String getRawQuery() {
    return query;
  }

  /**
   * Returns the percent-decoded "fragment" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the fragment was encoded as UTF-8, although RFC 3986 doesn't specify
   * an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawFragment()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getFragment() {
    return percentDecodeAssumedUtf8(fragment);
  }

  /**
   * Returns the fragment component of this URI in its original, possibly percent-encoded form, and
   * without any leading '#' character.
   */
  @Nullable
  public String getRawFragment() {
    return fragment;
  }

  /**
   * {@inheritDoc}
   *
   * <p>If this URI was created by {@link #parse(String)} or {@link #create(String)}, then the
   * returned string will match that original input exactly.
   */
  @Override
  public String toString() {
    // https://datatracker.ietf.org/doc/html/rfc3986#section-5.3
    StringBuilder sb = new StringBuilder();
    sb.append(scheme).append(':');
    if (hasAuthority()) {
      sb.append("//");
      appendAuthority(sb);
    }
    sb.append(path);
    if (query != null) {
      sb.append('?').append(query);
    }
    if (fragment != null) {
      sb.append('#').append(fragment);
    }
    return sb.toString();
  }

  /**
   * Returns true iff this URI has a scheme and an authority/path hierarchy, but no fragment.
   *
   * <p>All instances of {@link Uri} are RFC 3986 URIs, not "relative references", so this method is
   * equivalent to {@code getFragment() == null}. It mostly exists for compatibility with {@link
   * java.net.URI}.
   */
  public boolean isAbsolute() {
    return scheme != null && fragment == null;
  }

  /**
   * {@inheritDoc}
   *
   * <p>Two instances of {@link Uri} are equal if and only if they have the same string
   * representation, which RFC 3986 calls "Simple String Comparison" (6.2.1). Callers with a higher
   * layer expectation of equality (e.g. <code>http://some%2Dhost:80/foo/./bar.txt</code> ~= <code>
   * http://some-host/foo/bar.txt</code>) will experience false negatives.
   */
  @Override
  public boolean equals(Object otherObj) {
    if (!(otherObj instanceof Uri)) {
      return false;
    }
    Uri other = (Uri) otherObj;
    return Objects.equals(scheme, other.scheme)
        && Objects.equals(userInfo, other.userInfo)
        && Objects.equals(host, other.host)
        && Objects.equals(port, other.port)
        && Objects.equals(path, other.path)
        && Objects.equals(query, other.query)
        && Objects.equals(fragment, other.fragment);
  }

  @Override
  public int hashCode() {
    return Objects.hash(scheme, userInfo, host, port, path, query, fragment);
  }

  /** Returns a new Builder initialized with the fields of this URI. */
  public Builder toBuilder() {
    return new Builder(this);
  }

  /** Creates a new {@link Builder} with all fields uninitialized or set to their default values. */
  public static Builder newBuilder() {
    return new Builder();
  }

  /** Builder for {@link Uri}. */
  public static final class Builder {
    private String scheme;
    private String path = "";
    private String query;
    private String fragment;
    private String userInfo;
    private String host;
    private String port;

    private Builder() {}

    Builder(Uri prototype) {
      this.scheme = prototype.scheme;
      this.userInfo = prototype.userInfo;
      this.host = prototype.host;
      this.port = prototype.port;
      this.path = prototype.path;
      this.query = prototype.query;
      this.fragment = prototype.fragment;
    }

    /**
     * Sets the scheme, e.g. "https", "dns" or "xds".
     *
     * <p>This field is required.
     *
     * @return this, for fluent building
     * @throws IllegalArgumentException if the scheme is invalid.
     */
    @CanIgnoreReturnValue
    public Builder setScheme(String scheme) {
      return setRawScheme(scheme.toLowerCase(Locale.ROOT));
    }

    @CanIgnoreReturnValue
    Builder setRawScheme(String scheme) {
      if (scheme.isEmpty() || !alphaChars.get(scheme.charAt(0))) {
        throw new IllegalArgumentException("Scheme must start with an alphabetic char");
      }
      for (int i = 0; i < scheme.length(); i++) {
        char c = scheme.charAt(i);
        if (!schemeChars.get(c)) {
          throw new IllegalArgumentException("Invalid character in scheme at index " + i);
        }
      }
      this.scheme = scheme;
      return this;
    }

    /**
     * Specifies the new URI's path component as a string of zero or more '/' delimited segments.
     *
     * <p>Path segments can consist of any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you.
     *
     * <p>If a URI contains an authority component, then the path component must either be empty or
     * begin with a slash ("/") character. If a URI does not contain an authority component, then
     * the path cannot begin with two slash characters ("//").
     *
     * <p>This method interprets all '/' characters in 'path' as segment delimiters. If any of your
     * segments contain literal '/' characters, call {@link #setRawPath(String)} instead.
     *
     * <p>See <a href="https://datatracker.ietf.org/doc/html/rfc3986#section-3.3">RFC 3986 3.3</a>
     * for more.
     *
     * <p>This field is required but can be empty (its default value).
     *
     * @param path the new path
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setPath(String path) {
      checkArgument(path != null, "Path can be empty but not null");
      this.path = percentEncode(path, pCharsAndSlash);
      return this;
    }

    /**
     * Specifies the new URI's path component as a string of zero or more '/' delimited segments.
     *
     * <p>Path segments can consist of any string of codepoints but the caller must first percent-
     * encode anything other than RFC 3986's "pchar" character class using UTF-8.
     *
     * <p>If a URI contains an authority component, then the path component must either be empty or
     * begin with a slash ("/") character. If a URI does not contain an authority component, then
     * the path cannot begin with two slash characters ("//").
     *
     * <p>This method interprets all '/' characters in 'path' as segment delimiters. If any of your
     * segments contain literal '/' characters, you must percent-encode them.
     *
     * <p>See <a href="https://datatracker.ietf.org/doc/html/rfc3986#section-3.3">RFC 3986 3.3</a>
     * for more.
     *
     * <p>This field is required but can be empty (its default value).
     *
     * @param path the new path, a string consisting of characters from "pchar"
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setRawPath(String path) {
      checkArgument(path != null, "Path can be empty but not null");
      parseAssumedUtf8PathIntoSegments(path, null);
      this.path = path;
      return this;
    }

    /**
     * Specifies the query component of the new URI (not including the leading '?').
     *
     * <p>Query can contain any string of codepoints. Codepoints that can't be encoded literally
     * will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param query the new query component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setQuery(@Nullable String query) {
      this.query = percentEncode(query, queryChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawQuery(String query) {
      checkPercentEncodedArg(query, "query", queryChars);
      this.query = query;
      return this;
    }

    /**
     * Specifies the fragment component of the new URI (not including the leading '#').
     *
     * <p>The fragment can contain any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param fragment the new fragment component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setFragment(@Nullable String fragment) {
      this.fragment = percentEncode(fragment, fragmentChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawFragment(String fragment) {
      checkPercentEncodedArg(fragment, "fragment", fragmentChars);
      this.fragment = fragment;
      return this;
    }

    /**
     * Specifies the "user info" component of the new URI, e.g. "username:password", not including
     * the trailing '@' character.
     *
     * <p>User info can contain any string of codepoints. Codepoints that can't be encoded literally
     * will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param userInfo the new "user info" component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setUserInfo(@Nullable String userInfo) {
      this.userInfo = percentEncode(userInfo, userInfoChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawUserInfo(String userInfo) {
      checkPercentEncodedArg(userInfo, "userInfo", userInfoChars);
      this.userInfo = userInfo;
      return this;
    }

    /**
     * Specifies the "host" component of the new URI in its "registered name" form (usually DNS),
     * e.g. "server.com".
     *
     * <p>The registered name can contain any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param regName the new host component in "registered name" form, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setHost(@Nullable String regName) {
      if (regName != null) {
        regName = regName.toLowerCase(Locale.ROOT);
        regName = percentEncode(regName, regNameChars);
      }
      this.host = regName;
      return this;
    }

    /**
     * Specifies the "host" component of the new URI as an IP address.
     *
     * <p>This field is optional.
     *
     * @param addr the new "host" component in InetAddress form, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setHost(@Nullable InetAddress addr) {
      this.host = addr != null ? toUriString(addr) : null;
      return this;
    }

    private static String toUriString(InetAddress addr) {
      // InetAddresses.toUriString(addr) is almost enough but neglects RFC 6874 percent encoding.
      String inetAddrStr = InetAddresses.toUriString(addr);
      int percentIndex = inetAddrStr.indexOf('%');
      if (percentIndex < 0) {
        return inetAddrStr;
      }

      String scope = inetAddrStr.substring(percentIndex, inetAddrStr.length() - 1);
      return inetAddrStr.substring(0, percentIndex) + percentEncode(scope, unreservedChars) + "]";
    }

    @CanIgnoreReturnValue
    Builder setRawHost(String host) {
      if (host.startsWith("[") && host.endsWith("]")) {
        // IP-literal: Guava's isUriInetAddress() is almost enough but it doesn't check the scope.
        int percentIndex = host.indexOf('%');
        if (percentIndex > 0) {
          String scope = host.substring(percentIndex, host.length() - 1);
          checkPercentEncodedArg(scope, "scope", unreservedChars);
        }
      }
      // IP-literal validation is complicated so we delegate it to Guava. We use this particular
      // method of InetAddresses because it doesn't try to match interfaces on the local machine.
      // (The validity of a URI should be the same no matter which machine does the parsing.)
      // TODO(jdcormie): IPFuture
      if (!InetAddresses.isUriInetAddress(host)) {
        // Must be a "registered name".
        checkPercentEncodedArg(host, "host", regNameChars);
      }
      this.host = host;
      return this;
    }

    /**
     * Specifies the "port" component of the new URI, e.g. "8080".
     *
     * <p>The port can be any non-negative integer. A negative value represents "no port".
     *
     * <p>This field is optional.
     *
     * @param port the new "port" component, or -1 to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setPort(int port) {
      this.port = port < 0 ? null : Integer.toString(port);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawPort(String port) {
      try {
        Integer.parseInt(port); // Result unused.
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException("Invalid port", e);
      }
      this.port = port;
      return this;
    }

    /** Builds a new instance of {@link Uri} as specified by the setters. */
    public Uri build() {
      checkState(scheme != null, "Missing required scheme.");
      if (host == null) {
        checkState(port == null, "Cannot set port without host.");
        checkState(userInfo == null, "Cannot set userInfo without host.");
      }
      return new Uri(this);
    }
  }

  /**
   * Decodes a string of characters in the range [U+0000, U+007F] to bytes.
   *
   * <p>Each percent-encoded sequence (e.g. "%F0" or "%2a", as defined by RFC 3986 2.1) is decoded
   * to the octet it encodes. Other characters are decoded to their code point's single byte value.
   * A literal % character must be encoded as %25.
   *
   * @throws IllegalArgumentException if 's' contains characters out of range or invalid percent
   *     encoding sequences.
   */
  public static ByteBuffer percentDecode(CharSequence s) {
    // This is large enough because each input character needs *at most* one byte of output.
    ByteBuffer outBuf = ByteBuffer.allocate(s.length());
    percentDecode(s, "input", null, outBuf);
    outBuf.flip();
    return outBuf;
  }

  private static void percentDecode(
      CharSequence s, String what, BitSet allowedChars, ByteBuffer outBuf) {
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '%') {
        if (i + 2 >= s.length()) {
          throw new IllegalArgumentException(
              "Invalid percent-encoding at index " + i + " of " + what + ": " + s);
        }
        int h1 = Character.digit(s.charAt(i + 1), 16);
        int h2 = Character.digit(s.charAt(i + 2), 16);
        if (h1 == -1 || h2 == -1) {
          throw new IllegalArgumentException(
              "Invalid hex digit in " + what + " at index " + i + " of: " + s);
        }
        if (outBuf != null) {
          outBuf.put((byte) (h1 << 4 | h2));
        }
        i += 2;
      } else if (allowedChars == null || allowedChars.get(c)) {
        if (outBuf != null) {
          outBuf.put((byte) c);
        }
      } else {
        throw new IllegalArgumentException("Invalid character in " + what + " at index " + i);
      }
    }
  }

  @Nullable
  private static String percentDecodeAssumedUtf8(@Nullable String s) {
    if (s == null || s.indexOf('%') == -1) {
      return s;
    }

    ByteBuffer utf8Bytes = percentDecode(s);
    try {
      return StandardCharsets.UTF_8
          .newDecoder()
          .onMalformedInput(CodingErrorAction.REPLACE)
          .onUnmappableCharacter(CodingErrorAction.REPLACE)
          .decode(utf8Bytes)
          .toString();
    } catch (CharacterCodingException e) {
      throw new VerifyException(e); // Should not happen in REPLACE mode.
    }
  }

  @Nullable
  private static String percentEncode(String s, BitSet allowedCodePoints) {
    if (s == null) {
      return null;
    }
    CharsetEncoder encoder =
        StandardCharsets.UTF_8
            .newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
    ByteBuffer utf8Bytes;
    try {
      utf8Bytes = encoder.encode(CharBuffer.wrap(s));
    } catch (MalformedInputException e) {
      throw new IllegalArgumentException("Malformed input", e); // Must be a broken surrogate pair.
    } catch (CharacterCodingException e) {
      throw new VerifyException(e); // Should not happen when encoding to UTF-8.
    }

    StringBuilder sb = new StringBuilder();
    while (utf8Bytes.hasRemaining()) {
      int b = 0xff & utf8Bytes.get();
      if (allowedCodePoints.get(b)) {
        sb.append((char) b);
      } else {
        sb.append('%');
        sb.append(hexDigitsByVal[(b & 0xF0) >> 4]);
        sb.append(hexDigitsByVal[b & 0x0F]);
      }
    }
    return sb.toString();
  }

  private static void checkPercentEncodedArg(String s, String what, BitSet allowedChars) {
    percentDecode(s, what, allowedChars, null);
  }

  // See UriTest for how these were computed from the ABNF constants in RFC 3986.
  static final BitSet digitChars = BitSet.valueOf(new long[] {0x3ff000000000000L});
  static final BitSet alphaChars = BitSet.valueOf(new long[] {0L, 0x7fffffe07fffffeL});
  // scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
  static final BitSet schemeChars =
      BitSet.valueOf(new long[] {0x3ff680000000000L, 0x7fffffe07fffffeL});
  // unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
  static final BitSet unreservedChars =
      BitSet.valueOf(new long[] {0x3ff600000000000L, 0x47fffffe87fffffeL});
  // gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
  static final BitSet genDelimsChars =
      BitSet.valueOf(new long[] {0x8400800800000000L, 0x28000001L});
  // sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  static final BitSet subDelimsChars = BitSet.valueOf(new long[] {0x28001fd200000000L});
  // reserved      = gen-delims / sub-delims
  static final BitSet reservedChars = BitSet.valueOf(new long[] {0xac009fda00000000L, 0x28000001L});
  // reg-name      = *( unreserved / pct-encoded / sub-delims )
  static final BitSet regNameChars =
      BitSet.valueOf(new long[] {0x2bff7fd200000000L, 0x47fffffe87fffffeL});
  // userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
  static final BitSet userInfoChars =
      BitSet.valueOf(new long[] {0x2fff7fd200000000L, 0x47fffffe87fffffeL});
  // pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
  static final BitSet pChars =
      BitSet.valueOf(new long[] {0x2fff7fd200000000L, 0x47fffffe87ffffffL});
  static final BitSet pCharsAndSlash =
      BitSet.valueOf(new long[] {0x2fffffd200000000L, 0x47fffffe87ffffffL});
  // query         = *( pchar / "/" / "?" )
  static final BitSet queryChars =
      BitSet.valueOf(new long[] {0xafffffd200000000L, 0x47fffffe87ffffffL});
  // fragment      = *( pchar / "/" / "?" )
  static final BitSet fragmentChars = queryChars;

  private static final char[] hexDigitsByVal = "0123456789ABCDEF".toCharArray();
}
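
The percent-encoding helpers above are private to `Uri`, so the following is only a minimal standalone sketch of the same technique, not the class's actual API. The class name `PercentCodecSketch` and its `encode`/`decode` methods are hypothetical; the bit pattern for the allowed set is copied from `unreservedChars`. It illustrates why an IPv6 scope such as "%eth0" comes out as "%25eth0" once the naked '%' is encoded.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical standalone sketch; not part of io.grpc.Uri. */
public final class PercentCodecSketch {
  // Same bit pattern as unreservedChars above: ALPHA / DIGIT / "-" / "." / "_" / "~"
  static final BitSet UNRESERVED =
      BitSet.valueOf(new long[] {0x3ff600000000000L, 0x47fffffe87fffffeL});
  private static final char[] HEX = "0123456789ABCDEF".toCharArray();

  /**
   * Encodes s as UTF-8 and percent-encodes every byte that is not in 'allowed'.
   * Simplification: String.getBytes() silently replaces broken surrogate pairs,
   * whereas the original code reports them as an IllegalArgumentException.
   */
  static String encode(String s, BitSet allowed) {
    StringBuilder sb = new StringBuilder();
    for (byte raw : s.getBytes(StandardCharsets.UTF_8)) {
      int b = raw & 0xff;
      if (allowed.get(b)) {
        sb.append((char) b);
      } else {
        sb.append('%').append(HEX[b >> 4]).append(HEX[b & 0x0f]);
      }
    }
    return sb.toString();
  }

  /** Decodes percent-encoded octets, then interprets the resulting bytes as UTF-8. */
  static String decode(String s) {
    // Large enough: each input character yields at most one output byte.
    ByteBuffer out = ByteBuffer.allocate(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c != '%') {
        out.put((byte) c); // Assumes ASCII input; wider chars would be truncated.
        continue;
      }
      if (i + 2 >= s.length()) {
        throw new IllegalArgumentException("Truncated percent-encoding at index " + i);
      }
      int h1 = Character.digit(s.charAt(i + 1), 16);
      int h2 = Character.digit(s.charAt(i + 2), 16);
      if (h1 == -1 || h2 == -1) {
        throw new IllegalArgumentException("Invalid hex digit at index " + i);
      }
      out.put((byte) (h1 << 4 | h2));
      i += 2;
    }
    out.flip();
    return StandardCharsets.UTF_8.decode(out).toString();
  }

  public static void main(String[] args) {
    String encoded = encode("%eth0", UNRESERVED);
    System.out.println(encoded);         // prints "%25eth0"
    System.out.println(decode(encoded)); // prints "%eth0"
  }
}
```

Passing a wider allowed set (for example the `queryChars` pattern) would leave more characters literal; the unreserved set is used here because it is what the scope handling in `toUriString` relies on.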