grpc / grpc-java / #20142

09 Jan 2026 01:58PM UTC coverage: 88.656% (-0.009%) from 88.665%
api: Add RFC 3986 support to DnsNameResolverProvider (#12602)

Accept both absolute (e.g. `dns:///hostname`) and rootless (e.g.
`dns:hostname`) paths as specified by
https://github.com/grpc/grpc/blob/master/doc/naming.md and matching the
behavior of grpc core and grpc-go.
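
The two target forms named above differ only in whether the path starts with a slash. Here is a minimal sketch of how both can reduce to the same hostname. The class `DnsTargetSketch` and its method `hostnameOf` are hypothetical names used only for illustration. They are not part of grpc-java and do not show how `DnsNameResolverProvider` is actually implemented. The sketch ignores ports, percent-encoding and character validation.

```java
/** Hypothetical helper (not grpc-java API): extracts the hostname from a "dns" target string. */
public final class DnsTargetSketch {
  private DnsTargetSketch() {}

  /** Returns the name to resolve for both "dns:///hostname" and "dns:hostname" style targets. */
  public static String hostnameOf(String target) {
    int colon = target.indexOf(':');
    if (colon < 0 || !target.substring(0, colon).equalsIgnoreCase("dns")) {
      throw new IllegalArgumentException("Not a dns target: " + target);
    }
    String rest = target.substring(colon + 1);

    // Drop any query or fragment; this sketch does not interpret them.
    for (int i = 0; i < rest.length(); i++) {
      char c = rest.charAt(i);
      if (c == '?' || c == '#') {
        rest = rest.substring(0, i);
        break;
      }
    }

    // A leading "//" introduces an authority, which runs until the next '/'.
    if (rest.startsWith("//")) {
      int pathStart = rest.indexOf('/', 2);
      rest = pathStart < 0 ? "" : rest.substring(pathStart);
    }

    // Absolute path: strip the single leading '/'. Rootless path: use it as is.
    String name = rest.startsWith("/") ? rest.substring(1) : rest;
    if (name.isEmpty()) {
      throw new IllegalArgumentException("Missing hostname in: " + target);
    }
    return name;
  }

  public static void main(String[] args) {
    System.out.println(hostnameOf("dns:///hostname"));
    System.out.println(hostnameOf("dns:hostname"));
    System.out.println(hostnameOf("dns://8.8.8.8/storage.googleapis.com"));
  }
}
```

Treating the absolute and rootless forms as equivalent only after the authority has been split off keeps `dns://8.8.8.8/name` from being mistaken for a path that begins with `//`.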

35331 of 39852 relevant lines covered (88.66%)

0.89 hits per line

Source File

98.68
/../api/src/main/java/io/grpc/Uri.java
/*
 * Copyright 2025 The gRPC Authors
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package io.grpc;

import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

import com.google.common.base.VerifyException;
import com.google.common.collect.ImmutableList;
import com.google.common.net.InetAddresses;
import com.google.errorprone.annotations.CanIgnoreReturnValue;
import java.net.InetAddress;
import java.net.URISyntaxException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import javax.annotation.Nullable;

/**
 * A not-quite-general-purpose representation of a Uniform Resource Identifier (URI), as defined by
 * <a href="https://datatracker.ietf.org/doc/html/rfc3986">RFC 3986</a>.
 *
 * <h1>The URI</h1>
 *
 * <p>A URI identifies a resource by its name or location or both. The resource could be a file,
 * service, or some other abstract entity.
 *
 * <h2>Examples</h2>
 *
 * <ul>
 *   <li><code>http://admin@example.com:8080/controlpanel?filter=users#settings</code>
 *   <li><code>ftp://[2001:db8::7]/docs/report.pdf</code>
 *   <li><code>file:///My%20Computer/Documents/letter.doc</code>
 *   <li><code>dns://8.8.8.8/storage.googleapis.com</code>
 *   <li><code>mailto:John.Doe@example.com</code>
 *   <li><code>tel:+1-206-555-1212</code>
 *   <li><code>urn:isbn:978-1492082798</code>
 * </ul>
 *
 * <h2>Limitations</h2>
 *
 * <p>This class aims to meet the needs of grpc-java itself and RPC-related code that depends on
 * it. It isn't quite general-purpose. It definitely would not be suitable for building an HTTP
 * user agent or proxy server. In particular, it:
 *
 * <ul>
 *   <li>Can only represent a URI, not a "URI-reference" or "relative reference". In other words, a
 *       "scheme" is always required.
 *   <li>Has no knowledge of the particulars of any scheme, with respect to normalization and
 *       comparison. We don't know <code>https://google.com</code> is the same as <code>
 *       https://google.com:443</code>, that <code>file:///</code> is the same as <code>
 *       file://localhost</code>, or that <code>joe@example.com</code> is the same as <code>
 *       joe@EXAMPLE.COM</code>. No one class can or should know everything about every scheme so
 *       all this is better handled at a higher layer.
 *   <li>Implements {@link #equals(Object)} as a char-by-char comparison. Expect false negatives.
 *   <li>Does not support "IPvFuture" literal addresses.
 *   <li>Does not reflect how web browsers parse user input or the <a
 *       href="https://url.spec.whatwg.org/">URL Living Standard</a>.
 *   <li>Does not support different character encodings. Assumes UTF-8 in several places.
 * </ul>
 *
 * <h2>Migrating from RFC 2396 and {@link java.net.URI}</h2>
 *
 * <p>Those migrating from {@link java.net.URI} and/or its primary specification in RFC 2396 should
 * note some differences.
 *
 * <h3>Uniform Hierarchical Syntax</h3>
 *
 * <p>RFC 3986 unifies the older ideas of "hierarchical" and "opaque" URIs into a single generic
 * syntax. What RFC 2396 called an opaque "scheme-specific part" is always broken out by RFC 3986
 * into an authority and path hierarchy, followed by query and fragment components. Accordingly,
 * this class has only getters for those components but no {@link
 * java.net.URI#getSchemeSpecificPart()} analog.
 *
 * <p>The RFC 3986 definition of path is now more liberal to accommodate this:
 *
 * <ul>
 *   <li>Path doesn't have to start with a slash. For example, the path of <code>
 *       urn:isbn:978-1492082798</code> is <code>isbn:978-1492082798</code> even though it doesn't
 *       look much like a file system path.
 *   <li>The path can now be empty. So Android's <code>
 *       intent:#Intent;action=MAIN;category=LAUNCHER;end</code> is now a valid {@link Uri}. Even
 *       the scheme-only <code>about:</code> is now valid.
 * </ul>
 *
 * <p>The uniform syntax always understands what follows a '?' to be a query string. For example,
 * <code>mailto:me@example.com?subject=foo</code> now has a query component whereas RFC 2396
 * considered everything after the <code>mailto:</code> scheme to be opaque.
 *
 * <p>Same goes for fragment. <code>data:image/png;...#xywh=0,0,10,10</code> now has a fragment
 * whereas RFC 2396 considered everything after the scheme to be opaque.
 *
 * <h3>Uniform Authority Syntax</h3>
 *
 * <p>RFC 2396 tried to guess if an authority was a "server" (host:port) or "registry-based"
 * (arbitrary string) based on its contents. RFC 3986 expects every authority to look like
 * [userinfo@]host[:port] and loosens the definition of a "host" to accommodate. Accordingly, this
 * class has no equivalent to {@link java.net.URI#parseServerAuthority()} -- authority was parsed
 * into its components and checked for validity when the {@link Uri} was created.
 *
 * <h3>Other Specific Differences</h3>
 *
 * <p>RFC 2396 does not allow underscores in a host name, meaning {@link java.net.URI} switches to
 * opaque mode when it sees one. {@link Uri} does allow underscores in host, to accommodate
 * registries other than DNS. So <code>http://my_site.com:8080/index.html</code> now parses as a
 * host, port and path rather than a single opaque scheme-specific part.
 *
 * <p>{@link Uri} strictly *requires* square brackets in the query string and fragment to be
 * percent-encoded whereas RFC 2396 merely recommended doing so.
 *
 * <p>Other URx classes are "liberal in what they accept and strict in what they produce." {@link
 * Uri#parse(String)} and {@link Uri#create(String)}, however, are strict in what they accept and
 * transparent when asked to reproduce it via {@link Uri#toString()}. The former policy may be
 * appropriate for parsing user input or web content, but this class is meant for gRPC clients,
 * servers and plugins like name resolvers where human error at runtime is less likely and best
 * detected early. {@link java.net.URI#create(String)} is similarly strict, which makes migration
 * easy, except for the server/registry-based ambiguity addressed by {@link
 * java.net.URI#parseServerAuthority()}.
 *
 * <p>{@link java.net.URI} and {@link Uri} both support IPv6 literals in square brackets as defined
 * by RFC 2732.
 *
 * <p>{@link java.net.URI} supports IPv6 scope IDs but accepts and emits a non-standard syntax.
 * {@link Uri} implements the newer RFC 6874, which percent encodes scope IDs and the % delimiter
 * itself. RFC 9844 claims to obsolete RFC 6874 because web browsers would not support it. This
 * class implements RFC 6874 anyway, mostly to avoid creating a barrier to migration away from
 * {@link java.net.URI}.
 */
@Internal
public final class Uri {
  // Components are stored percent-encoded, just as originally parsed for transparent parse/toString
  // round-tripping.
  private final String scheme; // != null since we don't support relative references.
  @Nullable private final String userInfo;
  @Nullable private final String host;
  @Nullable private final String port;
  private final String path; // In RFC 3986, path is always defined (but can be empty).
  @Nullable private final String query;
  @Nullable private final String fragment;

  private Uri(Builder builder) {
    this.scheme = checkNotNull(builder.scheme, "scheme");
    this.userInfo = builder.userInfo;
    this.host = builder.host;
    this.port = builder.port;
    this.path = builder.path;
    this.query = builder.query;
    this.fragment = builder.fragment;

    // Checks common to the parse() and Builder code paths.
    if (hasAuthority()) {
      if (!path.isEmpty() && !path.startsWith("/")) {
        throw new IllegalArgumentException("Has authority -- Non-empty path must start with '/'");
      }
    } else {
      if (path.startsWith("//")) {
        throw new IllegalArgumentException("No authority -- Path cannot start with '//'");
      }
    }
  }

  /**
   * Parses a URI from its string form.
   *
   * @throws URISyntaxException if 's' is not a valid RFC 3986 URI.
   */
  public static Uri parse(String s) throws URISyntaxException {
    try {
      return create(s);
    } catch (IllegalArgumentException e) {
      throw new URISyntaxException(s, e.getMessage());
    }
  }

  /**
   * Creates a URI from a string assumed to be valid.
   *
   * <p>Useful for defining URI constants in code. Not for user input.
   *
   * @throws IllegalArgumentException if 's' is not a valid RFC 3986 URI.
   */
  public static Uri create(String s) {
    Builder builder = new Builder();
    int i = 0;
    final int n = s.length();

    // 3.1. Scheme: Look for a ':' before '/', '?', or '#'.
    int schemeColon = -1;
    for (; i < n; ++i) {
      char c = s.charAt(i);
      if (c == ':') {
        schemeColon = i;
        break;
      } else if (c == '/' || c == '?' || c == '#') {
        break;
      }
    }
    if (schemeColon < 0) {
      throw new IllegalArgumentException("Missing required scheme.");
    }
    builder.setRawScheme(s.substring(0, schemeColon));

    // 3.2. Authority. Look for '//' then keep scanning until '/', '?', or '#'.
    i = schemeColon + 1;
    if (i + 1 < n && s.charAt(i) == '/' && s.charAt(i + 1) == '/') {
      // "//" just means we have an authority. Skip over it.
      i += 2;

      int authorityStart = i;
      for (; i < n; ++i) {
        char c = s.charAt(i);
        if (c == '/' || c == '?' || c == '#') {
          break;
        }
      }
      String authority = s.substring(authorityStart, i);

      // 3.2.1. UserInfo. Easy, because '@' cannot appear unencoded inside userinfo or host.
      int userInfoEnd = authority.indexOf('@');
      if (userInfoEnd >= 0) {
        builder.setRawUserInfo(authority.substring(0, userInfoEnd));
      }

      // 3.2.2/3. Host/Port.
      int hostStart = userInfoEnd >= 0 ? userInfoEnd + 1 : 0;
      int portStartColon = findPortStartColon(authority, hostStart);
      if (portStartColon < 0) {
        builder.setRawHost(authority.substring(hostStart, authority.length()));
      } else {
        builder.setRawHost(authority.substring(hostStart, portStartColon));
        builder.setRawPort(authority.substring(portStartColon + 1));
      }
    }

    // 3.3. Path: Whatever is left before '?' or '#'.
    int pathStart = i;
    for (; i < n; ++i) {
      char c = s.charAt(i);
      if (c == '?' || c == '#') {
        break;
      }
    }
    builder.setRawPath(s.substring(pathStart, i));

    // 3.4. Query, if we stopped at '?'.
    if (i < n && s.charAt(i) == '?') {
      i++; // Skip '?'
      int queryStart = i;
      for (; i < n; ++i) {
        char c = s.charAt(i);
        if (c == '#') {
          break;
        }
      }
      builder.setRawQuery(s.substring(queryStart, i));
    }

    // 3.5. Fragment, if we stopped at '#'.
    if (i < n && s.charAt(i) == '#') {
      ++i; // Skip '#'
      builder.setRawFragment(s.substring(i));
    }

    return builder.build();
  }

  private static int findPortStartColon(String authority, int hostStart) {
    for (int i = authority.length() - 1; i >= hostStart; --i) {
      char c = authority.charAt(i);
      if (c == ':') {
        return i;
      }
      if (c == ']') {
        // Hit the end of IP-literal. Any further colon is inside it and couldn't indicate a port.
        break;
      }
      if (!digitChars.get(c)) {
        // Found a non-digit, non-colon, non-bracket.
        // This means there is no valid port (e.g. host is "example.com")
        break;
      }
    }
    return -1;
  }

  // Checks a raw path for validity and parses it into segments. Let 'out' be null to just validate.
  private static void parseAssumedUtf8PathIntoSegments(
      String path, ImmutableList.Builder<String> out) {
    // Skip the first slash so it doesn't count as an empty segment at the start.
    // (e.g., "/a" -> ["a"], not ["", "a"])
    int start = path.startsWith("/") ? 1 : 0;

    for (int i = start; i < path.length(); ) {
      int nextSlash = path.indexOf('/', i);
      String segment;
      if (nextSlash >= 0) {
        // Typical segment case (e.g., "foo" in "/foo/bar").
        segment = path.substring(i, nextSlash);
        i = nextSlash + 1;
      } else {
        // Final segment case (e.g., "bar" in "/foo/bar").
        segment = path.substring(i);
        i = path.length();
      }
      if (out != null) {
        out.add(percentDecodeAssumedUtf8(segment));
      } else {
        checkPercentEncodedArg(segment, "path segment", pChars);
      }
    }

    // RFC 3986 says a trailing slash creates a final empty segment.
    // (e.g., "/foo/" -> ["foo", ""])
    if (path.endsWith("/") && out != null) {
      out.add("");
    }
  }

  /** Returns the scheme of this URI. */
  public String getScheme() {
    return scheme;
  }

  /**
   * Returns the percent-decoded "Authority" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the "host" component was encoded as UTF-8, as mandated by RFC 3986.
   * This method also assumes the "user information" part of authority was encoded as UTF-8,
   * although RFC 3986 doesn't specify an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawAuthority()}, {@link #percentDecode(CharSequence)}, then decode the bytes for
   * themselves.
   */
  @Nullable
  public String getAuthority() {
    return percentDecodeAssumedUtf8(getRawAuthority());
  }

  private boolean hasAuthority() {
    return host != null;
  }

  /**
   * Returns the "authority" component of this URI in its originally parsed, possibly
   * percent-encoded form.
   */
  @Nullable
  public String getRawAuthority() {
    if (hasAuthority()) {
      StringBuilder sb = new StringBuilder();
      appendAuthority(sb);
      return sb.toString();
    }
    return null;
  }

  private void appendAuthority(StringBuilder sb) {
    if (userInfo != null) {
      sb.append(userInfo).append('@');
    }
    if (host != null) {
      sb.append(host);
    }
    if (port != null) {
      sb.append(':').append(port);
    }
  }

  /**
   * Returns the percent-decoded "User Information" component of this URI, or null if not present.
   *
   * <p>NB: This method *assumes* this component was encoded as UTF-8, although RFC 3986 doesn't
   * specify an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawUserInfo()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getUserInfo() {
    return percentDecodeAssumedUtf8(userInfo);
  }

  /**
   * Returns the "User Information" component of this URI in its originally parsed, possibly
   * percent-encoded form.
   */
  @Nullable
  public String getRawUserInfo() {
    return userInfo;
  }

  /**
   * Returns the percent-decoded "host" component of this URI, or null if not present.
   *
   * <p>This method assumes the host was encoded as UTF-8, as mandated by RFC 3986.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawHost()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getHost() {
    return percentDecodeAssumedUtf8(host);
  }

  /**
   * Returns the host component of this URI in its originally parsed, possibly percent-encoded form.
   */
  @Nullable
  public String getRawHost() {
    return host;
  }

  /** Returns the "port" component of this URI, or -1 if not present. */
  public int getPort() {
    return port != null ? Integer.parseInt(port) : -1;
  }

  /** Returns the raw port component of this URI in its originally parsed form. */
  @Nullable
  public String getRawPort() {
    return port;
  }

  /**
   * Returns the (possibly empty) percent-decoded "path" component of this URI.
   *
   * <p>NB: This method *assumes* the path was encoded as UTF-8, although RFC 3986 doesn't specify
   * an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawPath()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   *
   * <p>NB: Prefer {@link #getPathSegments()} because this method's decoding is lossy. For example,
   * consider these (different) URIs:
   *
   * <ul>
   *   <li>file:///home%2Ffolder/my%20file
   *   <li>file:///home/folder/my%20file
   * </ul>
   *
   * <p>Calling getPath() on each returns the same string: <code>/home/folder/my file</code>. You
   * can't tell whether the second '/' character is part of the first path segment or separates the
   * first and second path segments. This method only exists to ease migration from {@link
   * java.net.URI}.
   */
  public String getPath() {
    return percentDecodeAssumedUtf8(path);
  }

  /**
   * Returns this URI's path as a list of path segments not including the '/' segment delimiters.
   *
   * <p>Prefer this method over {@link #getPath()} because it preserves the distinction between
   * segment separators and literal '/'s within a path segment.
   *
   * <p>A trailing '/' delimiter in the path results in the empty string as the last element in the
   * returned list. For example, <code>file://localhost/foo/bar/</code> has path segments <code>
   * ["foo", "bar", ""]</code>
   *
   * <p>A leading '/' delimiter cannot be detected using this method. For example, both <code>
   * dns:example.com</code> and <code>dns:///example.com</code> have the same list of path segments:
   * <code>["example.com"]</code>. Use {@link #isPathAbsolute()} or {@link #isPathRootless()} to
   * distinguish these cases.
   *
   * <p>The returned list is immutable.
   */
  public List<String> getPathSegments() {
    // Returned list must be immutable but we intentionally keep guava out of the public API.
    ImmutableList.Builder<String> segmentsBuilder = ImmutableList.builder();
    parseAssumedUtf8PathIntoSegments(path, segmentsBuilder);
    return segmentsBuilder.build();
  }

  /**
   * Returns true iff this URI's path component starts with a path segment (rather than the '/'
   * segment delimiter).
   *
   * <p>The path of an RFC 3986 URI is either empty, absolute (starts with the '/' segment
   * delimiter) or rootless (starts with a path segment). For example, <code>tel:+1-206-555-1212
   * </code>, <code>mailto:me@example.com</code> and <code>urn:isbn:978-1492082798</code> all have
   * rootless paths. <code>mailto:%2Fdev%2Fnull@example.com</code> is also rootless because its
   * percent-encoded slashes are not segment delimiters but rather part of the first and only path
   * segment.
   *
   * <p>Contrast rootless paths with absolute ones (see {@link #isPathAbsolute()}).
   */
  public boolean isPathRootless() {
    return !path.isEmpty() && !path.startsWith("/");
  }

  /**
   * Returns true iff this URI's path component starts with the '/' segment delimiter (rather than a
   * path segment).
   *
   * <p>The path of an RFC 3986 URI is either empty, absolute (starts with the '/' segment
   * delimiter) or rootless (starts with a path segment). For example, <code>file:///resume.txt
   * </code>, <code>file:/resume.txt</code> and <code>file://localhost/</code> all have absolute
   * paths while <code>tel:+1-206-555-1212</code>'s path is not absolute. <code>
   * mailto:%2Fdev%2Fnull@example.com</code> is also not absolute because its percent-encoded
   * slashes are not segment delimiters but rather part of the first and only path segment.
   *
   * <p>Contrast absolute paths with rootless ones (see {@link #isPathRootless()}).
   *
   * <p>NB: The term "absolute" has two different meanings in RFC 3986 which are easily confused.
   * This method tests for a property of this URI's path component. Contrast with {@link
   * #isAbsolute()} which tests the URI itself for a different property.
   */
  public boolean isPathAbsolute() {
    return path.startsWith("/");
  }

  /**
   * Returns the path component of this URI in its originally parsed, possibly percent-encoded form.
   */
  public String getRawPath() {
    return path;
  }

  /**
   * Returns the percent-decoded "query" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the query was encoded as UTF-8, although RFC 3986 doesn't specify an
   * encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawQuery()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getQuery() {
    return percentDecodeAssumedUtf8(query);
  }

  /**
   * Returns the query component of this URI in its originally parsed, possibly percent-encoded
   * form, without any leading '?' character.
   */
  @Nullable
  public String getRawQuery() {
    return query;
  }

  /**
   * Returns the percent-decoded "fragment" component of this URI, or null if not present.
   *
   * <p>NB: This method assumes the fragment was encoded as UTF-8, although RFC 3986 doesn't specify
   * an encoding.
   *
   * <p>Decoding errors are indicated by a {@code '\u005CuFFFD'} unicode replacement character in
   * the output. Callers who want to detect and handle errors in some other way should call {@link
   * #getRawFragment()}, {@link #percentDecode(CharSequence)}, then decode the bytes for themselves.
   */
  @Nullable
  public String getFragment() {
    return percentDecodeAssumedUtf8(fragment);
  }

  /**
   * Returns the fragment component of this URI in its original, possibly percent-encoded form, and
   * without any leading '#' character.
   */
  @Nullable
  public String getRawFragment() {
    return fragment;
  }

  /**
   * {@inheritDoc}
   *
   * <p>If this URI was created by {@link #parse(String)} or {@link #create(String)}, then the
   * returned string will match that original input exactly.
   */
  @Override
  public String toString() {
    // https://datatracker.ietf.org/doc/html/rfc3986#section-5.3
    StringBuilder sb = new StringBuilder();
    sb.append(scheme).append(':');
    if (hasAuthority()) {
      sb.append("//");
      appendAuthority(sb);
    }
    sb.append(path);
    if (query != null) {
      sb.append('?').append(query);
    }
    if (fragment != null) {
      sb.append('#').append(fragment);
    }
    return sb.toString();
  }

  /**
   * Returns true iff this URI has a scheme and an authority/path hierarchy, but no fragment.
   *
   * <p>All instances of {@link Uri} are RFC 3986 URIs, not "relative references", so this method is
   * equivalent to {@code getFragment() == null}. It mostly exists for compatibility with {@link
   * java.net.URI}.
   */
  public boolean isAbsolute() {
    return scheme != null && fragment == null;
  }

  /**
   * {@inheritDoc}
   *
   * <p>Two instances of {@link Uri} are equal if and only if they have the same string
   * representation, which RFC 3986 calls "Simple String Comparison" (6.2.1). Callers with a higher
   * layer expectation of equality (e.g. <code>http://some%2Dhost:80/foo/./bar.txt</code> ~= <code>
   * http://some-host/foo/bar.txt</code>) will experience false negatives.
   */
  @Override
  public boolean equals(Object otherObj) {
    if (!(otherObj instanceof Uri)) {
      return false;
    }
    Uri other = (Uri) otherObj;
    return Objects.equals(scheme, other.scheme)
        && Objects.equals(userInfo, other.userInfo)
        && Objects.equals(host, other.host)
        && Objects.equals(port, other.port)
        && Objects.equals(path, other.path)
        && Objects.equals(query, other.query)
        && Objects.equals(fragment, other.fragment);
  }

  @Override
  public int hashCode() {
    return Objects.hash(scheme, userInfo, host, port, path, query, fragment);
  }

  /** Returns a new Builder initialized with the fields of this URI. */
  public Builder toBuilder() {
    return new Builder(this);
  }

  /** Creates a new {@link Builder} with all fields uninitialized or set to their default values. */
  public static Builder newBuilder() {
    return new Builder();
  }

669
  /** Builder for {@link Uri}. */
670
  public static final class Builder {
671
    private String scheme;
672
    private String path = "";
1✔
673
    private String query;
674
    private String fragment;
675
    private String userInfo;
676
    private String host;
677
    private String port;
678

679
    private Builder() {}
1✔
680

681
    Builder(Uri prototype) {
1✔
682
      this.scheme = prototype.scheme;
1✔
683
      this.userInfo = prototype.userInfo;
1✔
684
      this.host = prototype.host;
1✔
685
      this.port = prototype.port;
1✔
686
      this.path = prototype.path;
1✔
687
      this.query = prototype.query;
1✔
688
      this.fragment = prototype.fragment;
1✔
689
    }
1✔
690

691
    /**
692
     * Sets the scheme, e.g. "https", "dns" or "xds".
693
     *
694
     * <p>This field is required.
695
     *
696
     * @return this, for fluent building
697
     * @throws IllegalArgumentException if the scheme is invalid.
698
     */
699
    @CanIgnoreReturnValue
700
    public Builder setScheme(String scheme) {
701
      return setRawScheme(scheme.toLowerCase(Locale.ROOT));
1✔
702
    }
703

704
    @CanIgnoreReturnValue
705
    Builder setRawScheme(String scheme) {
706
      if (scheme.isEmpty() || !alphaChars.get(scheme.charAt(0))) {
1✔
707
        throw new IllegalArgumentException("Scheme must start with an alphabetic char");
1✔
708
      }
709
      for (int i = 0; i < scheme.length(); i++) {
1✔
710
        char c = scheme.charAt(i);
1✔
711
        if (!schemeChars.get(c)) {
1✔
712
          throw new IllegalArgumentException("Invalid character in scheme at index " + i);
1✔
713
        }
714
      }
715
      this.scheme = scheme;
1✔
716
      return this;
1✔
717
    }
718

    /**
     * Specifies the new URI's path component as a string of zero or more '/' delimited segments.
     *
     * <p>Path segments can consist of any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you.
     *
     * <p>If a URI contains an authority component, then the path component must either be empty or
     * begin with a slash ("/") character. If a URI does not contain an authority component, then
     * the path cannot begin with two slash characters ("//").
     *
     * <p>This method interprets all '/' characters in 'path' as segment delimiters. If any of your
     * segments contain literal '/' characters, call {@link #setRawPath(String)} instead.
     *
     * <p>See <a href="https://datatracker.ietf.org/doc/html/rfc3986#section-3.3">RFC 3986 3.3</a>
     * for more.
     *
     * <p>This field is required but can be empty (its default value).
     *
     * @param path the new path
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setPath(String path) {
      checkArgument(path != null, "Path can be empty but not null");
      this.path = percentEncode(path, pCharsAndSlash);
      return this;
    }

    /**
     * Specifies the new URI's path component as a string of zero or more '/' delimited segments.
     *
     * <p>Path segments can consist of any string of codepoints but the caller must first percent-
     * encode anything other than RFC 3986's "pchar" character class using UTF-8.
     *
     * <p>If a URI contains an authority component, then the path component must either be empty or
     * begin with a slash ("/") character. If a URI does not contain an authority component, then
     * the path cannot begin with two slash characters ("//").
     *
     * <p>This method interprets all '/' characters in 'path' as segment delimiters. If any of your
     * segments contain literal '/' characters, you must percent-encode them.
     *
     * <p>See <a href="https://datatracker.ietf.org/doc/html/rfc3986#section-3.3">RFC 3986 3.3</a>
     * for more.
     *
     * <p>This field is required but can be empty (its default value).
     *
     * @param path the new path, a string consisting of characters from "pchar"
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setRawPath(String path) {
      checkArgument(path != null, "Path can be empty but not null");
      parseAssumedUtf8PathIntoSegments(path, null);
      this.path = path;
      return this;
    }

    /**
     * Specifies the query component of the new URI (not including the leading '?').
     *
     * <p>Query can contain any string of codepoints. Codepoints that can't be encoded literally
     * will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param query the new query component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setQuery(@Nullable String query) {
      this.query = percentEncode(query, queryChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawQuery(String query) {
      checkPercentEncodedArg(query, "query", queryChars);
      this.query = query;
      return this;
    }

    /**
     * Specifies the fragment component of the new URI (not including the leading '#').
     *
     * <p>The fragment can contain any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param fragment the new fragment component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setFragment(@Nullable String fragment) {
      this.fragment = percentEncode(fragment, fragmentChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawFragment(String fragment) {
      checkPercentEncodedArg(fragment, "fragment", fragmentChars);
      this.fragment = fragment;
      return this;
    }

    /**
     * Set the "user info" component of the new URI, e.g. "username:password", not including the
     * trailing '@' character.
     *
     * <p>User info can contain any string of codepoints. Codepoints that can't be encoded literally
     * will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param userInfo the new "user info" component, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setUserInfo(@Nullable String userInfo) {
      this.userInfo = percentEncode(userInfo, userInfoChars);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawUserInfo(String userInfo) {
      checkPercentEncodedArg(userInfo, "userInfo", userInfoChars);
      this.userInfo = userInfo;
      return this;
    }

    /**
     * Specifies the "host" component of the new URI in its "registered name" form (usually DNS),
     * e.g. "server.com".
     *
     * <p>The registered name can contain any string of codepoints. Codepoints that can't be encoded
     * literally will be percent-encoded for you as UTF-8.
     *
     * <p>This field is optional.
     *
     * @param regName the new host component in "registered name" form, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setHost(@Nullable String regName) {
      if (regName != null) {
        regName = regName.toLowerCase(Locale.ROOT);
        regName = percentEncode(regName, regNameChars);
      }
      this.host = regName;
      return this;
    }

    /**
     * Specifies the "host" component of the new URI as an IP address.
     *
     * <p>This field is optional.
     *
     * @param addr the new "host" component in InetAddress form, or null to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setHost(@Nullable InetAddress addr) {
      this.host = addr != null ? toUriString(addr) : null;
      return this;
    }

    private static String toUriString(InetAddress addr) {
      // InetAddresses.toUriString(addr) is almost enough but neglects RFC 6874 percent encoding.
      String inetAddrStr = InetAddresses.toUriString(addr);
      int percentIndex = inetAddrStr.indexOf('%');
      if (percentIndex < 0) {
        return inetAddrStr;
      }

      String scope = inetAddrStr.substring(percentIndex, inetAddrStr.length() - 1);
      return inetAddrStr.substring(0, percentIndex) + percentEncode(scope, unreservedChars) + "]";
    }

    @CanIgnoreReturnValue
    Builder setRawHost(String host) {
      if (host.startsWith("[") && host.endsWith("]")) {
        // IP-literal: Guava's isUriInetAddress() is almost enough but it doesn't check the scope.
        int percentIndex = host.indexOf('%');
        if (percentIndex > 0) {
          String scope = host.substring(percentIndex, host.length() - 1);
          checkPercentEncodedArg(scope, "scope", unreservedChars);
        }
      }
      // IP-literal validation is complicated so we delegate it to Guava. We use this particular
      // method of InetAddresses because it doesn't try to match interfaces on the local machine.
      // (The validity of a URI should be the same no matter which machine does the parsing.)
      // TODO(jdcormie): IPFuture
      if (!InetAddresses.isUriInetAddress(host)) {
        // Must be a "registered name".
        checkPercentEncodedArg(host, "host", regNameChars);
      }
      this.host = host;
      return this;
    }

    /**
     * Specifies the "port" component of the new URI, e.g. "8080".
     *
     * <p>The port can be any non-negative integer. A negative value represents "no port".
     *
     * <p>This field is optional.
     *
     * @param port the new "port" component, or -1 to clear this field
     * @return this, for fluent building
     */
    @CanIgnoreReturnValue
    public Builder setPort(int port) {
      this.port = port < 0 ? null : Integer.toString(port);
      return this;
    }

    @CanIgnoreReturnValue
    Builder setRawPort(String port) {
      try {
        Integer.parseInt(port); // Result unused.
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException("Invalid port", e);
      }
      this.port = port;
      return this;
    }

    /** Builds a new instance of {@link Uri} as specified by the setters. */
    public Uri build() {
      checkState(scheme != null, "Missing required scheme.");
      if (host == null) {
        checkState(port == null, "Cannot set port without host.");
        checkState(userInfo == null, "Cannot set userInfo without host.");
      }
      return new Uri(this);
    }
  }

  /**
   * Decodes a string of characters in the range [U+0000, U+007F] to bytes.
   *
   * <p>Each percent-encoded sequence (e.g. "%F0" or "%2a", as defined by RFC 3986 2.1) is decoded
   * to the octet it encodes. Other characters are decoded to their code point's single byte value.
   * A literal % character must be encoded as %25.
   *
   * @throws IllegalArgumentException if 's' contains characters out of range or invalid percent
   *     encoding sequences.
   */
  public static ByteBuffer percentDecode(CharSequence s) {
    // This is large enough because each input character needs *at most* one byte of output.
    ByteBuffer outBuf = ByteBuffer.allocate(s.length());
    percentDecode(s, "input", null, outBuf);
    outBuf.flip();
    return outBuf;
  }

  private static void percentDecode(
      CharSequence s, String what, BitSet allowedChars, ByteBuffer outBuf) {
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '%') {
        if (i + 2 >= s.length()) {
          throw new IllegalArgumentException(
              "Invalid percent-encoding at index " + i + " of " + what + ": " + s);
        }
        int h1 = Character.digit(s.charAt(i + 1), 16);
        int h2 = Character.digit(s.charAt(i + 2), 16);
        if (h1 == -1 || h2 == -1) {
          throw new IllegalArgumentException(
              "Invalid hex digit in " + what + " at index " + i + " of: " + s);
        }
        if (outBuf != null) {
          outBuf.put((byte) (h1 << 4 | h2));
        }
        i += 2;
      } else if (c <= 0x7F && (allowedChars == null || allowedChars.get(c))) {
        // Only ASCII may appear literally, even when no allowedChars set is given, so that the
        // documented [U+0000, U+007F] input range is enforced before narrowing to a byte.
        if (outBuf != null) {
          outBuf.put((byte) c);
        }
      } else {
        throw new IllegalArgumentException("Invalid character in " + what + " at index " + i);
      }
    }
  }

  @Nullable
  private static String percentDecodeAssumedUtf8(@Nullable String s) {
    if (s == null || s.indexOf('%') == -1) {
      return s;
    }

    ByteBuffer utf8Bytes = percentDecode(s);
    try {
      return StandardCharsets.UTF_8
          .newDecoder()
          .onMalformedInput(CodingErrorAction.REPLACE)
          .onUnmappableCharacter(CodingErrorAction.REPLACE)
          .decode(utf8Bytes)
          .toString();
    } catch (CharacterCodingException e) {
      throw new VerifyException(e); // Should not happen in REPLACE mode.
    }
  }

  @Nullable
  private static String percentEncode(String s, BitSet allowedCodePoints) {
    if (s == null) {
      return null;
    }
    CharsetEncoder encoder =
        StandardCharsets.UTF_8
            .newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
    ByteBuffer utf8Bytes;
    try {
      utf8Bytes = encoder.encode(CharBuffer.wrap(s));
    } catch (MalformedInputException e) {
      throw new IllegalArgumentException("Malformed input", e); // Must be a broken surrogate pair.
    } catch (CharacterCodingException e) {
      throw new VerifyException(e); // Should not happen when encoding to UTF-8.
    }

    StringBuilder sb = new StringBuilder();
    while (utf8Bytes.hasRemaining()) {
      int b = 0xff & utf8Bytes.get();
      if (allowedCodePoints.get(b)) {
        sb.append((char) b);
      } else {
        sb.append('%');
        sb.append(hexDigitsByVal[(b & 0xF0) >> 4]);
        sb.append(hexDigitsByVal[b & 0x0F]);
      }
    }
    return sb.toString();
  }

  private static void checkPercentEncodedArg(String s, String what, BitSet allowedChars) {
    percentDecode(s, what, allowedChars, null);
  }

  // See UriTest for how these were computed from the ABNF constants in RFC 3986.
  static final BitSet digitChars = BitSet.valueOf(new long[] {0x3ff000000000000L});
  static final BitSet alphaChars = BitSet.valueOf(new long[] {0L, 0x7fffffe07fffffeL});
  // scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
  static final BitSet schemeChars =
      BitSet.valueOf(new long[] {0x3ff680000000000L, 0x7fffffe07fffffeL});
  // unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
  static final BitSet unreservedChars =
      BitSet.valueOf(new long[] {0x3ff600000000000L, 0x47fffffe87fffffeL});
  // gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
  static final BitSet genDelimsChars =
      BitSet.valueOf(new long[] {0x8400800800000000L, 0x28000001L});
  // sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  static final BitSet subDelimsChars = BitSet.valueOf(new long[] {0x28001fd200000000L});
  // reserved      = gen-delims / sub-delims
  static final BitSet reservedChars = BitSet.valueOf(new long[] {0xac009fda00000000L, 0x28000001L});
  // reg-name      = *( unreserved / pct-encoded / sub-delims )
  static final BitSet regNameChars =
      BitSet.valueOf(new long[] {0x2bff7fd200000000L, 0x47fffffe87fffffeL});
  // userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
  static final BitSet userInfoChars =
      BitSet.valueOf(new long[] {0x2fff7fd200000000L, 0x47fffffe87fffffeL});
  // pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
  static final BitSet pChars =
      BitSet.valueOf(new long[] {0x2fff7fd200000000L, 0x47fffffe87ffffffL});
  static final BitSet pCharsAndSlash =
      BitSet.valueOf(new long[] {0x2fffffd200000000L, 0x47fffffe87ffffffL});
  // query         = *( pchar / "/" / "?" )
  static final BitSet queryChars =
      BitSet.valueOf(new long[] {0xafffffd200000000L, 0x47fffffe87ffffffL});
  // fragment      = *( pchar / "/" / "?" )
  static final BitSet fragmentChars = queryChars;

  private static final char[] hexDigitsByVal = "0123456789ABCDEF".toCharArray();
}
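
The character-class constants in the listing are ASCII masks stored as two `long` words, and `percentEncode` and `percentDecode` consult them one byte at a time. The standalone sketch below illustrates that idea outside of `io.grpc.Uri`. The class `PercentEncodingSketch` and its members `UNRESERVED`, `bitSetOf`, `encode`, and `decode` are hypothetical names invented for this example, not part of the gRPC API. As a simplification, it encodes with `String.getBytes`, which substitutes a replacement byte for a broken surrogate pair instead of rejecting it the way the listing does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical illustration only; not part of io.grpc.Uri. */
final class PercentEncodingSketch {
  private static final char[] HEX = "0123456789ABCDEF".toCharArray();

  // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" (RFC 3986 2.3)
  static final BitSet UNRESERVED =
      bitSetOf("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~");

  /** Returns a BitSet with one bit set for each ASCII character in 'chars'. */
  static BitSet bitSetOf(String chars) {
    BitSet bits = new BitSet(128);
    for (int i = 0; i < chars.length(); i++) {
      bits.set(chars.charAt(i));
    }
    return bits;
  }

  /** Percent-encodes every UTF-8 byte of 's' whose value is not set in 'allowed'. */
  static String encode(String s, BitSet allowed) {
    StringBuilder sb = new StringBuilder();
    for (byte raw : s.getBytes(StandardCharsets.UTF_8)) {
      int b = raw & 0xff;
      if (allowed.get(b)) {
        sb.append((char) b);
      } else {
        sb.append('%').append(HEX[b >> 4]).append(HEX[b & 0x0f]);
      }
    }
    return sb.toString();
  }

  /** Decodes "%XX" sequences to octets and interprets the result as UTF-8. */
  static String decode(String s) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '%') {
        if (i + 2 >= s.length()) {
          throw new IllegalArgumentException("Truncated escape at index " + i);
        }
        int h1 = Character.digit(s.charAt(i + 1), 16);
        int h2 = Character.digit(s.charAt(i + 2), 16);
        if (h1 < 0 || h2 < 0) {
          throw new IllegalArgumentException("Invalid hex digit at index " + i);
        }
        out.write(h1 << 4 | h2);
        i += 2; // Skip the two hex digits just consumed.
      } else if (c <= 0x7f) {
        out.write(c);
      } else {
        throw new IllegalArgumentException("Non-ASCII character at index " + i);
      }
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String encoded = encode("a b/\u00fc", UNRESERVED);
    System.out.println(encoded); // a%20b%2F%C3%BC
    System.out.println(decode(encoded).equals("a b/\u00fc")); // true
  }
}
```

Calling `UNRESERVED.toLongArray()` in this sketch yields the same two words that the listing passes to `BitSet.valueOf` for `unreservedChars`, which is one way to cross-check a hand-written mask against its ABNF rule.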