
JuliaLang / julia / #37497

pending completion
#37497

push

local

web-flow
[def2ddacc](https://github.com/JuliaLang/julia/commit/def2ddacc9a9b064d06c24d12885427fb0502465) make default worker pool an AbstractWorkerPool (#49101)

Changes [Distributed._default_worker_pool](https://github.com/JuliaLang/julia/blob/5f5d2040511b42ba74bd7529a0eac9cf817ad496/stdlib/Distributed/src/workerpool.jl#L242) to hold an `AbstractWorkerPool` instead of `WorkerPool`. With this, alternate implementations can be plugged in as the default pool. Helps in cases where a cluster is always meant to use a certain custom pool. Lower level calls can then work without having to pass a custom pool reference with every call.

4 of 4 new or added lines in 2 files covered. (100.0%)

71044 of 82770 relevant lines covered (85.83%)

33857692.69 hits per line

Source File

98.29
/base/strings/basic.jl
1
# This file is a part of Julia. License is MIT: https://julialang.org/license
2

3
"""
4
The `AbstractString` type is the supertype of all string implementations in
5
Julia. Strings are encodings of sequences of [Unicode](https://unicode.org/)
6
code points as represented by the `AbstractChar` type. Julia makes a few assumptions
7
about strings:
8

9
* Strings are encoded in terms of fixed-size "code units"
10
  * Code units can be extracted with `codeunit(s, i)`
11
  * The first code unit has index `1`
12
  * The last code unit has index `ncodeunits(s)`
13
  * Any index `i` such that `1 ≤ i ≤ ncodeunits(s)` is in bounds
14
* String indexing is done in terms of these code units:
15
  * Characters are extracted by `s[i]` with a valid string index `i`
16
  * Each `AbstractChar` in a string is encoded by one or more code units
17
  * Only the index of the first code unit of an `AbstractChar` is a valid index
18
  * The encoding of an `AbstractChar` is independent of what precedes or follows it
19
  * String encodings are [self-synchronizing] – i.e. `isvalid(s, i)` is O(1)
20

21
[self-synchronizing]: https://en.wikipedia.org/wiki/Self-synchronizing_code
22

23
Some string functions that extract code units, characters or substrings from
24
strings error if you pass them out-of-bounds or invalid string indices. This
25
includes `codeunit(s, i)` and `s[i]`. Functions that do string
26
index arithmetic take a more relaxed approach to indexing and give you the
27
closest valid string index when in-bounds, or when out-of-bounds, behave as if
28
there were an infinite number of characters padding each side of the string.
29
Usually these imaginary padding characters have code unit length `1` but string
30
types may choose different "imaginary" character sizes as makes sense for their
31
implementations (e.g. substrings may pass index arithmetic through to the
32
underlying string they provide a view into). Relaxed indexing functions include
33
those intended for index arithmetic: `thisind`, `nextind` and `prevind`. This
34
model allows index arithmetic to work with out-of-bounds indices as
35
intermediate values so long as one never uses them to retrieve a character,
36
which often helps avoid needing to code around edge cases.
37

38
See also [`codeunit`](@ref), [`ncodeunits`](@ref), [`thisind`](@ref),
39
[`nextind`](@ref), [`prevind`](@ref).
40
"""
41
AbstractString
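
The indexing model described in the docstring above can be illustrated with a short sketch. It assumes the built-in UTF-8 `String` type; the sample string is arbitrary and other `AbstractString` implementations may use different code unit sizes.

```julia
s = "αβγ"                       # each of these letters takes 2 UTF-8 code units

ncodeunits(s)                   # 6 code units
length(s)                       # 3 characters
collect(eachindex(s))           # valid indices: [1, 3, 5]

# Index arithmetic snaps to character starts instead of throwing.
thisind(s, 2)                   # 1: code unit 2 belongs to the character at index 1
nextind(s, 1)                   # 3
prevind(s, 3)                   # 1

# An out-of-bounds index is fine as an intermediate value...
i = nextind(s, lastindex(s))    # 7 == ncodeunits(s) + 1
# ...as long as it is checked before being used to retrieve a character.
c = i <= ncodeunits(s) ? s[i] : nothing   # nothing: we walked off the end
```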
42

43
## required string functions ##
44

45
"""
46
    ncodeunits(s::AbstractString) -> Int
47

48
Return the number of code units in a string. Indices that are in bounds to
49
access this string must satisfy `1 ≤ i ≤ ncodeunits(s)`. Not all such indices
50
are valid – they may not be the start of a character, but they will return a
51
code unit value when calling `codeunit(s,i)`.
52

53
# Examples
54
```jldoctest
55
julia> ncodeunits("The Julia Language")
56
18
57

58
julia> ncodeunits("∫eˣ")
59
6
60

61
julia> ncodeunits('∫'), ncodeunits('e'), ncodeunits('ˣ')
62
(3, 1, 2)
63
```
64

65
See also [`codeunit`](@ref), [`checkbounds`](@ref), [`sizeof`](@ref),
66
[`length`](@ref), [`lastindex`](@ref).
67
"""
68
ncodeunits(s::AbstractString)
69

70
"""
71
    codeunit(s::AbstractString) -> Type{<:Union{UInt8, UInt16, UInt32}}
72

73
Return the code unit type of the given string object. For ASCII, Latin-1, or
74
UTF-8 encoded strings, this would be `UInt8`; for UCS-2 and UTF-16 it would be
75
`UInt16`; for UTF-32 it would be `UInt32`. The code unit type need not be
76
limited to these three types, but it's hard to think of widely used string
77
encodings that don't use one of these units. `codeunit(s)` is the same as
78
`typeof(codeunit(s,1))` when `s` is a non-empty string.
79

80
See also [`ncodeunits`](@ref).
81
"""
82
codeunit(s::AbstractString)
83

84
const CodeunitType = Union{Type{UInt8},Type{UInt16},Type{UInt32}}
85

86
"""
87
    codeunit(s::AbstractString, i::Integer) -> Union{UInt8, UInt16, UInt32}
88

89
Return the code unit value in the string `s` at index `i`. Note that
90

91
    codeunit(s, i) :: codeunit(s)
92

93
I.e. the value returned by `codeunit(s, i)` is of the type returned by
94
`codeunit(s)`.
95

96
# Examples
97
```jldoctest
98
julia> a = codeunit("Hello", 2)
99
0x65
100

101
julia> typeof(a)
102
UInt8
103
```
104

105
See also [`ncodeunits`](@ref), [`checkbounds`](@ref).
106
"""
107
@propagate_inbounds codeunit(s::AbstractString, i::Integer) = i isa Int ?
3✔
108
    throw(MethodError(codeunit, (s, i))) : codeunit(s, Int(i))
109

110
"""
111
    isvalid(s::AbstractString, i::Integer) -> Bool
112

113
Predicate indicating whether the given index is the start of the encoding of a
114
character in `s` or not. If `isvalid(s, i)` is true then `s[i]` will return the
115
character whose encoding starts at that index; if it's false, then `s[i]` will
116
raise an invalid index error or a bounds error depending on whether `i` is in bounds.
117
In order for `isvalid(s, i)` to be an O(1) function, the encoding of `s` must be
118
[self-synchronizing](https://en.wikipedia.org/wiki/Self-synchronizing_code). This
119
is a basic assumption of Julia's generic string support.
120

121
See also [`getindex`](@ref), [`iterate`](@ref), [`thisind`](@ref),
122
[`nextind`](@ref), [`prevind`](@ref), [`length`](@ref).
123

124
# Examples
125
```jldoctest
126
julia> str = "αβγdef";
127

128
julia> isvalid(str, 1)
129
true
130

131
julia> str[1]
132
'α': Unicode U+03B1 (category Ll: Letter, lowercase)
133

134
julia> isvalid(str, 2)
135
false
136

137
julia> str[2]
138
ERROR: StringIndexError: invalid index [2], valid nearby indices [1]=>'α', [3]=>'β'
139
Stacktrace:
140
[...]
141
```
142
"""
143
@propagate_inbounds isvalid(s::AbstractString, i::Integer) = i isa Int ?
205,037✔
144
    throw(MethodError(isvalid, (s, i))) : isvalid(s, Int(i))
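
As a rough illustration of how `isvalid` relates to the other index functions, the sketch below lists the valid indices of a UTF-8 `String` and uses a hypothetical helper, `char_at`, that falls back to `thisind` for in-bounds indices that land mid-character. The helper is not part of Base and is only meant for in-bounds `i`.

```julia
str = "αβγdef"   # α, β, γ take 2 code units each; d, e, f take 1

# Every in-bounds index that starts a character.
valid = [i for i in 1:ncodeunits(str) if isvalid(str, i)]   # [1, 3, 5, 7, 8, 9]

# Hypothetical helper: return the character that code unit `i` belongs to.
# Assumes 1 ≤ i ≤ ncodeunits(s); out-of-bounds `i` still raises a BoundsError.
char_at(s::AbstractString, i::Integer) = isvalid(s, i) ? s[i] : s[thisind(s, i)]

char_at(str, 1)   # 'α'
char_at(str, 2)   # 'α' again, since index 2 is the second code unit of 'α'
char_at(str, 7)   # 'd'
```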
145

146
"""
147
    iterate(s::AbstractString, i::Integer) -> Union{Tuple{<:AbstractChar, Int}, Nothing}
148

149
Return a tuple of the character in `s` at index `i` with the index of the start
150
of the following character in `s`. This is the key method that allows strings to
151
be iterated, yielding a sequence of characters. If `i` is out of bounds in `s`
152
then a bounds error is raised. The `iterate` function, as part of the iteration
153
protocol, may assume that `i` is the start of a character in `s`.
154

155
See also [`getindex`](@ref), [`checkbounds`](@ref).
156
"""
157
@propagate_inbounds iterate(s::AbstractString, i::Integer) = i isa Int ?
410,042✔
158
    throw(MethodError(iterate, (s, i))) : iterate(s, Int(i))
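
A `for c in s` loop is driven by this method. The sketch below spells the loop out by hand for a plain `String`, assuming nothing beyond the standard iteration protocol; the variable names are illustrative.

```julia
s = "jμΛIα"
chars = Char[]

next = iterate(s)              # starts at firstindex(s)
while next !== nothing
    (c, i) = next              # the character, and the index where the next one starts
    push!(chars, c)
    next = iterate(s, i)       # returns nothing once i passes the last code unit
end
```

After the loop, `chars` holds the same characters that `collect(s)` would return.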
159

160
## basic generic definitions ##
161

162
eltype(::Type{<:AbstractString}) = Char # some string types may use another AbstractChar
100✔
163

164
"""
165
    sizeof(str::AbstractString)
166

167
Size, in bytes, of the string `str`. Equal to the number of code units in `str` multiplied by
168
the size, in bytes, of one code unit in `str`.
169

170
# Examples
171
```jldoctest
172
julia> sizeof("")
173
0
174

175
julia> sizeof("∀")
176
3
177
```
178
"""
179
sizeof(s::AbstractString) = ncodeunits(s)::Int * sizeof(codeunit(s)::CodeunitType)
12,300,456✔
180
firstindex(s::AbstractString) = 1
850✔
181
lastindex(s::AbstractString) = thisind(s, ncodeunits(s)::Int)
7,263,208✔
182
isempty(s::AbstractString) = iszero(ncodeunits(s)::Int)
6,626,115✔
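
A small sketch of how these four definitions relate for a UTF-8 `String`; the sample strings are arbitrary.

```julia
s = "∀x"                                   # '∀' is 3 code units, 'x' is 1

n_units  = ncodeunits(s)                   # 4
n_bytes  = sizeof(s)                       # 4, since each code unit is a 1-byte UInt8
first_i  = firstindex(s)                   # always 1
last_i   = lastindex(s)                    # 4: 'x' starts at the last code unit
last_i2  = lastindex("x∀")                 # 2: the start of '∀', not ncodeunits == 4
```

Note that `lastindex` is the index of the last character's first code unit, so it can be smaller than `ncodeunits`.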
183

184
function getindex(s::AbstractString, i::Integer)
207,652✔
185
    @boundscheck checkbounds(s, i)
207,653✔
186
    @inbounds return isvalid(s, i) ? (iterate(s, i)::NTuple{2,Any})[1] : string_index_err(s, i)
207,651✔
187
end
188

189
getindex(s::AbstractString, i::Colon) = s
1✔
190
# TODO: handle other ranges with stride ±1 specially?
191
# TODO: add more @propagate_inbounds annotations?
192
getindex(s::AbstractString, v::AbstractVector{<:Integer}) =
4✔
193
    sprint(io->(for i in v; write(io, s[i]) end), sizehint=length(v))
27✔
194
getindex(s::AbstractString, v::AbstractVector{Bool}) =
2✔
195
    throw(ArgumentError("logical indexing not supported for strings"))
196

197
function get(s::AbstractString, i::Integer, default)
5✔
198
# TODO: use ternary once @inbounds is expression-like
199
    if checkbounds(Bool, s, i)
6✔
200
        @inbounds return s[i]
3✔
201
    else
202
        return default
2✔
203
    end
204
end
205

206
## bounds checking ##
207

208
checkbounds(::Type{Bool}, s::AbstractString, i::Integer) =
147,450,142✔
209
    1 ≤ i ≤ ncodeunits(s)::Int
210
checkbounds(::Type{Bool}, s::AbstractString, r::AbstractRange{<:Integer}) =
13,767,081✔
211
    isempty(r) || (1 ≤ minimum(r) && maximum(r) ≤ ncodeunits(s)::Int)
212
checkbounds(::Type{Bool}, s::AbstractString, I::AbstractArray{<:Real}) =
1✔
213
    all(i -> checkbounds(Bool, s, i), I)
1✔
214
checkbounds(::Type{Bool}, s::AbstractString, I::AbstractArray{<:Integer}) =
9✔
215
    all(i -> checkbounds(Bool, s, i), I)
24✔
216
checkbounds(s::AbstractString, I::Union{Integer,AbstractArray}) =
70,755,158✔
217
    checkbounds(Bool, s, I) ? nothing : throw(BoundsError(s, I))
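
Being in bounds and being a valid character index are separate checks. The following sketch, for a UTF-8 `String`, shows the difference and how `get` interacts with it.

```julia
s = "αβ"                                   # 4 code units, characters start at 1 and 3

in_bounds_2  = checkbounds(Bool, s, 2)     # true: 1 ≤ 2 ≤ ncodeunits(s)
valid_2      = isvalid(s, 2)               # false: index 2 is mid-character
in_bounds_5  = checkbounds(Bool, s, 5)     # false
range_ok     = checkbounds(Bool, s, 1:4)   # true
empty_ok     = checkbounds(Bool, s, 3:2)   # true: empty ranges are always in bounds

# `get` only substitutes the default for out-of-bounds indices.
a = get(s, 3, '?')                         # 'β'
b = get(s, 9, '?')                         # '?'
# get(s, 2, '?') throws a StringIndexError: index 2 is in bounds but invalid.
```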
218

219
## construction, conversion, promotion ##
220

221
string() = ""
19✔
222
string(s::AbstractString) = s
4✔
223

224
Vector{UInt8}(s::AbstractString) = unsafe_wrap(Vector{UInt8}, String(s))
×
225
Array{UInt8}(s::AbstractString) = unsafe_wrap(Vector{UInt8}, String(s))
1✔
226
Vector{T}(s::AbstractString) where {T<:AbstractChar} = collect(T, s)
235✔
227

228
Symbol(s::AbstractString) = Symbol(String(s))
1✔
229
Symbol(x...) = Symbol(string(x...))
86,553✔
230

231
convert(::Type{T}, s::T) where {T<:AbstractString} = s
100,309✔
232
convert(::Type{T}, s::AbstractString) where {T<:AbstractString} = T(s)::T
102,198✔
233

234
## summary ##
235

236
function summary(io::IO, s::AbstractString)
3✔
237
    prefix = isempty(s) ? "empty" : string(ncodeunits(s), "-codeunit")
5✔
238
    print(io, prefix, " ", typeof(s))
3✔
239
end
240

241
## string & character concatenation ##
242

243
"""
244
    *(s::Union{AbstractString, AbstractChar}, t::Union{AbstractString, AbstractChar}...) -> AbstractString
245

246
Concatenate strings and/or characters, producing a [`String`](@ref). This is equivalent
247
to calling the [`string`](@ref) function on the arguments. Concatenation of built-in
248
string types always produces a value of type `String` but other string types may choose
249
to return a string of a different type as appropriate.
250

251
# Examples
252
```jldoctest
253
julia> "Hello " * "world"
254
"Hello world"
255

256
julia> 'j' * "ulia"
257
"julia"
258
```
259
"""
260
(*)(s1::Union{AbstractChar, AbstractString}, ss::Union{AbstractChar, AbstractString}...) = string(s1, ss...)
4,600,846✔
261

262
one(::Union{T,Type{T}}) where {T<:AbstractString} = convert(T, "")
4✔
263

264
## generic string comparison ##
265

266
"""
267
    cmp(a::AbstractString, b::AbstractString) -> Int
268

269
Compare two strings. Return `0` if both strings have the same length and the character
270
at each index is the same in both strings. Return `-1` if `a` is a prefix of `b`, or if
271
`a` comes before `b` in alphabetical order. Return `1` if `b` is a prefix of `a`, or if
272
`b` comes before `a` in alphabetical order (technically, lexicographical order by Unicode
273
code points).
274

275
# Examples
276
```jldoctest
277
julia> cmp("abc", "abc")
278
0
279

280
julia> cmp("ab", "abc")
281
-1
282

283
julia> cmp("abc", "ab")
284
1
285

286
julia> cmp("ab", "ac")
287
-1
288

289
julia> cmp("ac", "ab")
290
1
291

292
julia> cmp("α", "a")
293
1
294

295
julia> cmp("b", "β")
296
-1
297
```
298
"""
299
function cmp(a::AbstractString, b::AbstractString)
342✔
300
    a === b && return 0
342✔
301
    (iv1, iv2) = (iterate(a), iterate(b))
489✔
302
    while iv1 !== nothing && iv2 !== nothing
919✔
303
        (c, d) = (first(iv1)::AbstractChar, first(iv2)::AbstractChar)
637✔
304
        c ≠ d && return ifelse(c < d, -1, 1)
637✔
305
        (iv1, iv2) = (iterate(a, last(iv1)), iterate(b, last(iv2)))
867✔
306
    end
594✔
307
    return iv1 === nothing ? (iv2 === nothing ? 0 : -1) : 1
282✔
308
end
309

310
"""
311
    ==(a::AbstractString, b::AbstractString) -> Bool
312

313
Test whether two strings are equal character by character (technically, Unicode
314
code point by code point).
315

316
# Examples
317
```jldoctest
318
julia> "abc" == "abc"
319
true
320

321
julia> "abc" == "αβγ"
322
false
323
```
324
"""
325
==(a::AbstractString, b::AbstractString) = cmp(a, b) == 0
283✔
326

327
"""
328
    isless(a::AbstractString, b::AbstractString) -> Bool
329

330
Test whether string `a` comes before string `b` in alphabetical order
331
(technically, in lexicographical order by Unicode code points).
332

333
# Examples
334
```jldoctest
335
julia> isless("a", "b")
336
true
337

338
julia> isless("β", "α")
339
false
340

341
julia> isless("a", "a")
342
false
343
```
344
"""
345
isless(a::AbstractString, b::AbstractString) = cmp(a, b) < 0
585,535✔
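
Because `isless` orders strings by Unicode code point, sorting is not alphabetical in the dictionary sense. A brief sketch with arbitrary sample data:

```julia
words = ["b", "β", "a", "α", "B"]

# sort uses isless: uppercase ASCII < lowercase ASCII < Greek letters.
sorted = sort(words)                 # ["B", "a", "b", "α", "β"]

# 'Z' (U+005A) precedes 'a' (U+0061), so "Zebra" sorts before "apple".
order = cmp("Zebra", "apple")        # -1
```

Locale-aware collation is outside the scope of these generic methods.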
346

347
# faster comparisons for symbols
348

349
@assume_effects :total function cmp(a::Symbol, b::Symbol)
282✔
350
    Int(sign(ccall(:strcmp, Int32, (Cstring, Cstring), a, b)))
14,716,619✔
351
end
352

353
isless(a::Symbol, b::Symbol) = cmp(a, b) < 0
14,701,253✔
354

355
# hashing
356

357
hash(s::AbstractString, h::UInt) = hash(String(s), h)
1✔
358

359
## character index arithmetic ##
360

361
"""
362
    length(s::AbstractString) -> Int
363
    length(s::AbstractString, i::Integer, j::Integer) -> Int
364

365
Return the number of characters in string `s` from indices `i` through `j`.
366

367
This is computed as the number of code unit indices from `i` to `j` which are
368
valid character indices. With only a single string argument, this computes
369
the number of characters in the entire string. With `i` and `j` arguments it
370
computes the number of indices between `i` and `j` inclusive that are valid
371
indices in the string `s`. In addition to in-bounds values, `i` may take the
372
out-of-bounds value `ncodeunits(s) + 1` and `j` may take the out-of-bounds
373
value `0`.
374

375
!!! note
376
    The time complexity of this operation is linear in general. That is, it
377
    will take time proportional to the number of bytes or characters in
378
    the string because it counts them on the fly. This is in contrast to
379
    the method for arrays, which is a constant-time operation.
380

381
See also [`isvalid`](@ref), [`ncodeunits`](@ref), [`lastindex`](@ref),
382
[`thisind`](@ref), [`nextind`](@ref), [`prevind`](@ref).
383

384
# Examples
385
```jldoctest
386
julia> length("jμΛIα")
387
5
388
```
389
"""
390
length(s::AbstractString) = @inbounds return length(s, 1, ncodeunits(s)::Int)
3,362✔
391

392
function length(s::AbstractString, i::Int, j::Int)
59,571✔
393
    @boundscheck begin
59,571✔
394
        0 < i ≤ ncodeunits(s)::Int+1 || throw(BoundsError(s, i))
59,571✔
395
        0 ≤ j < ncodeunits(s)::Int+1 || throw(BoundsError(s, j))
59,574✔
396
    end
397
    n = 0
59,568✔
398
    for k = i:j
90,622✔
399
        @inbounds n += isvalid(s, k)
1,703,198✔
400
    end
3,375,342✔
401
    return n
59,568✔
402
end
403

404
@propagate_inbounds length(s::AbstractString, i::Integer, j::Integer) =
1✔
405
    length(s, Int(i), Int(j))
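
The three-argument form counts the valid character indices in a code unit range. A sketch for a UTF-8 `String`, with the byte layout noted in the comment:

```julia
s = "jμΛIα"   # j(1) μ(2-3) Λ(4-5) I(6) α(7-8)

total    = length(s)          # 5 characters in 8 code units
mid      = length(s, 2, 5)    # 2: 'μ' starts at 2 and 'Λ' at 4
partial  = length(s, 3, 5)    # 1: index 3 is mid-character, only index 4 counts
past_end = length(s, 9, 8)    # 0: i may be ncodeunits(s) + 1
before   = length(s, 1, 0)    # 0: j may be 0
```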
406

407
"""
408
    thisind(s::AbstractString, i::Integer) -> Int
409

410
If `i` is in bounds in `s` return the index of the start of the character whose
411
encoding code unit `i` is part of. In other words, if `i` is the start of a
412
character, return `i`; if `i` is not the start of a character, rewind until the
413
start of a character and return that index. If `i` is equal to 0 or `ncodeunits(s)+1`
414
return `i`. In all other cases throw `BoundsError`.
415

416
# Examples
417
```jldoctest
418
julia> thisind("α", 0)
419
0
420

421
julia> thisind("α", 1)
422
1
423

424
julia> thisind("α", 2)
425
1
426

427
julia> thisind("α", 3)
428
3
429

430
julia> thisind("α", 4)
431
ERROR: BoundsError: attempt to access 2-codeunit String at index [4]
432
[...]
433

434
julia> thisind("α", -1)
435
ERROR: BoundsError: attempt to access 2-codeunit String at index [-1]
436
[...]
437
```
438
"""
439
thisind(s::AbstractString, i::Integer) = thisind(s, Int(i))
4✔
440

441
function thisind(s::AbstractString, i::Int)
1,210✔
442
    z = ncodeunits(s)::Int + 1
1,210✔
443
    i == z && return i
1,210✔
444
    @boundscheck 0 ≤ i ≤ z || throw(BoundsError(s, i))
1,213✔
445
    @inbounds while 1 < i && !(isvalid(s, i)::Bool)
1,189✔
446
        i -= 1
1,154✔
447
    end
1,154✔
448
    return i
1,189✔
449
end
450

451
"""
452
    prevind(str::AbstractString, i::Integer, n::Integer=1) -> Int
453

454
* Case `n == 1`
455

456
  If `i` is in bounds in `str` return the index of the start of the character whose
457
  encoding starts before index `i`. In other words, if `i` is the start of a
458
  character, return the start of the previous character; if `i` is not the start
459
  of a character, rewind until the start of a character and return that index.
460
  If `i` is equal to `1` return `0`.
461
  If `i` is equal to `ncodeunits(str)+1` return `lastindex(str)`.
462
  Otherwise throw `BoundsError`.
463

464
* Case `n > 1`
465

466
  Behaves like applying `n` times `prevind` for `n==1`. The only difference
467
  is that if `n` is so large that applying `prevind` would reach `0` then each remaining
468
  iteration decreases the returned value by `1`.
469
  This means that in this case `prevind` can return a negative value.
470

471
* Case `n == 0`
472

473
  Return `i` only if `i` is a valid index in `str` or is equal to `ncodeunits(str)+1`.
474
  Otherwise `StringIndexError` or `BoundsError` is thrown.
475

476
# Examples
477
```jldoctest
478
julia> prevind("α", 3)
479
1
480

481
julia> prevind("α", 1)
482
0
483

484
julia> prevind("α", 0)
485
ERROR: BoundsError: attempt to access 2-codeunit String at index [0]
486
[...]
487

488
julia> prevind("α", 2, 2)
489
0
490

491
julia> prevind("α", 2, 3)
492
-1
493
```
494
"""
495
prevind(s::AbstractString, i::Integer, n::Integer) = prevind(s, Int(i), Int(n))
2✔
496
prevind(s::AbstractString, i::Integer)             = prevind(s, Int(i))
2,221,319✔
497
prevind(s::AbstractString, i::Int)                 = prevind(s, i, 1)
8,858,330✔
498

499
function prevind(s::AbstractString, i::Int, n::Int)
11,082,706✔
500
    n < 0 && throw(ArgumentError("n cannot be negative: $n"))
11,082,706✔
501
    z = ncodeunits(s) + 1
11,082,702✔
502
    @boundscheck 0 < i ≤ z || throw(BoundsError(s, i))
11,082,736✔
503
    n == 0 && return thisind(s, i) == i ? i : string_index_err(s, i)
11,082,668✔
504
    while n > 0 && 1 < i
45,141,105✔
505
        @inbounds n -= isvalid(s, i -= 1)
34,124,369✔
506
    end
34,124,369✔
507
    return i - n
11,016,736✔
508
end
509

510
"""
511
    nextind(str::AbstractString, i::Integer, n::Integer=1) -> Int
512

513
* Case `n == 1`
514

515
  If `i` is in bounds in `str` return the index of the start of the character whose
516
  encoding starts after index `i`. In other words, if `i` is the start of a
517
  character, return the start of the next character; if `i` is not the start
518
  of a character, move forward until the start of a character and return that index.
519
  If `i` is equal to `0` return `1`.
520
  If `i` is in bounds but greater than or equal to `lastindex(str)` return `ncodeunits(str)+1`.
521
  Otherwise throw `BoundsError`.
522

523
* Case `n > 1`
524

525
  Behaves like applying `n` times `nextind` for `n==1`. The only difference
526
  is that if `n` is so large that applying `nextind` would reach `ncodeunits(str)+1` then
527
  each remaining iteration increases the returned value by `1`. This means that in this
528
  case `nextind` can return a value greater than `ncodeunits(str)+1`.
529

530
* Case `n == 0`
531

532
  Return `i` only if `i` is a valid index in `str` or is equal to `0`.
533
  Otherwise `StringIndexError` or `BoundsError` is thrown.
534

535
# Examples
536
```jldoctest
537
julia> nextind("α", 0)
538
1
539

540
julia> nextind("α", 1)
541
3
542

543
julia> nextind("α", 3)
544
ERROR: BoundsError: attempt to access 2-codeunit String at index [3]
545
[...]
546

547
julia> nextind("α", 0, 2)
548
3
549

550
julia> nextind("α", 1, 2)
551
4
552
```
553
"""
554
nextind(s::AbstractString, i::Integer, n::Integer) = nextind(s, Int(i), Int(n))
2✔
555
nextind(s::AbstractString, i::Integer)             = nextind(s, Int(i))
2✔
556
nextind(s::AbstractString, i::Int)                 = nextind(s, i, 1)
2,146✔
557

558
function nextind(s::AbstractString, i::Int, n::Int)
2,277,653✔
559
    n < 0 && throw(ArgumentError("n cannot be negative: $n"))
2,277,653✔
560
    z = ncodeunits(s)
2,277,646✔
561
    @boundscheck 0 ≤ i ≤ z || throw(BoundsError(s, i))
2,277,666✔
562
    n == 0 && return thisind(s, i) == i ? i : string_index_err(s, i)
2,277,626✔
563
    while n > 0 && i < z
34,343,797✔
564
        @inbounds n -= isvalid(s, i += 1)
32,122,875✔
565
    end
32,122,875✔
566
    return i + n
2,220,922✔
567
end
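
One common use of the `n` argument is locating the `n`-th character without collecting the string. The sketch below assumes a UTF-8 `String`; `nth_index` is a hypothetical helper name, not a Base function.

```julia
s = "∀ϵ≠0"   # ∀(1-3) ϵ(4-5) ≠(6-8) 0(9)

# Hypothetical helper: step n characters forward from the imaginary index 0.
nth_index(s::AbstractString, n::Integer) = nextind(s, 0, n)

i3 = nth_index(s, 3)                          # 6
c3 = s[i3]                                    # '≠'

# Past the end, each extra step adds 1 to ncodeunits(s) + 1.
over1 = nextind(s, 0, 5)                      # 10
over2 = nextind(s, 0, 6)                      # 11

# prevind mirrors this from the other side and can go below 0.
back4 = prevind(s, ncodeunits(s) + 1, 4)      # 1
back5 = prevind(s, ncodeunits(s) + 1, 5)      # 0
back6 = prevind(s, ncodeunits(s) + 1, 6)      # -1
```

The out-of-range results are only useful as intermediates; indexing with them raises a `BoundsError`.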
568

569
## string index iteration type ##
570

571
struct EachStringIndex{T<:AbstractString}
572
    s::T
2,774✔
573
end
574
keys(s::AbstractString) = EachStringIndex(s)
34,899✔
575

576
length(e::EachStringIndex) = length(e.s)
881✔
577
first(::EachStringIndex) = 1
405✔
578
last(e::EachStringIndex) = lastindex(e.s)
324✔
579
iterate(e::EachStringIndex, state=firstindex(e.s)) = state > ncodeunits(e.s) ? nothing : (state, nextind(e.s, state))
379,976✔
580
eltype(::Type{<:EachStringIndex}) = Int
27✔
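
`keys(s)` (and therefore `eachindex(s)`) yields only valid character indices, stepping with `nextind`. A short sketch for a UTF-8 `String`:

```julia
s = "aλb"                                    # a(1) λ(2-3) b(4)

idx  = collect(keys(s))                      # [1, 2, 4]
same = collect(eachindex(s))                 # [1, 2, 4]

# Index/character pairs, without ever touching index 3.
pairs_list = [i => c for (i, c) in pairs(s)] # [1 => 'a', 2 => 'λ', 4 => 'b']
```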
581

582
"""
583
    isascii(c::Union{AbstractChar,AbstractString}) -> Bool
584

585
Test whether a character belongs to the ASCII character set, or whether this is true for
586
all elements of a string.
587

588
# Examples
589
```jldoctest
590
julia> isascii('a')
591
true
592

593
julia> isascii('α')
594
false
595

596
julia> isascii("abc")
597
true
598

599
julia> isascii("αβγ")
600
false
601
```
602
For example, `isascii` can be used as a predicate function for [`filter`](@ref) or [`replace`](@ref)
603
to remove or replace non-ASCII characters, respectively:
604
```jldoctest
605
julia> filter(isascii, "abcdeγfgh") # discard non-ASCII chars
606
"abcdefgh"
607

608
julia> replace("abcdeγfgh", !isascii=>' ') # replace non-ASCII chars with spaces
609
"abcde fgh"
610
```
611
"""
612
isascii(c::Char) = bswap(reinterpret(UInt32, c)) < 0x80
8,122,307✔
613
isascii(s::AbstractString) = all(isascii, s)
1✔
614
isascii(c::AbstractChar) = UInt32(c) < 0x80
1✔
615

616
@inline function _isascii(code_units::AbstractVector{CU}, first, last) where {CU}
26,514✔
617
    r = zero(CU)
26,514✔
618
    for n = first:last
53,020✔
619
        @inbounds r |= code_units[n]
4,985,871✔
620
    end
9,945,236✔
621
    return 0 ≤ r < 0x80
26,514✔
622
end
623

624
# The chunking algorithm makes the last two chunks overlap in order to keep the size fixed
625
@inline function  _isascii_chunks(chunk_size,cu::AbstractVector{CU}, first,last) where {CU}
54✔
626
    n=first
54✔
627
    while n <= last - chunk_size
786✔
628
        _isascii(cu,n,n+chunk_size-1) || return false
780✔
629
        n += chunk_size
732✔
630
    end
732✔
631
    return  _isascii(cu,last-chunk_size+1,last)
30✔
632
end
633
"""
634
    isascii(cu::AbstractVector{CU}) where {CU <: Integer} -> Bool
635

636
Test whether all values in the vector belong to the ASCII character set (0x00 to 0x7f).
637
This function is intended to be used by other string implementations that need a fast ASCII check.
638
"""
639
function isascii(cu::AbstractVector{CU}) where {CU <: Integer}
25,782✔
640
    chunk_size = 1024
25,782✔
641
    chunk_threshold =  chunk_size + (chunk_size ÷ 2)
25,782✔
642
    first = firstindex(cu);   last = lastindex(cu)
51,564✔
643
    l = last - first + 1
25,782✔
644
    l < chunk_threshold && return _isascii(cu,first,last)
25,782✔
645
    return _isascii_chunks(chunk_size,cu,first,last)
54✔
646
end
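
The overlapping-chunk idea can be sketched independently of Base. The functions below are hypothetical re-implementations restricted to `UInt8` code units, with a deliberately tiny default chunk size so the overlap is easy to follow; they are not the definitions used above.

```julia
# OR all code units in cu[lo:hi]; any byte ≥ 0x80 leaves the high bit set.
function ascii_range(cu::AbstractVector{UInt8}, lo::Int, hi::Int)
    r = 0x00
    for n in lo:hi
        @inbounds r |= cu[n]
    end
    return r < 0x80
end

# Check fixed-size chunks with early exit. The final chunk is aligned to the
# end of the vector, so it may overlap the previous one.
function ascii_chunked(cu::AbstractVector{UInt8}; chunk_size::Int = 4)
    lo, hi = firstindex(cu), lastindex(cu)
    hi - lo + 1 < chunk_size && return ascii_range(cu, lo, hi)
    n = lo
    while n <= hi - chunk_size
        ascii_range(cu, n, n + chunk_size - 1) || return false
        n += chunk_size
    end
    return ascii_range(cu, hi - chunk_size + 1, hi)
end

ok  = ascii_chunked(codeunits("hello world"))   # chunks 1:4, 5:8, then 8:11
bad = ascii_chunked(codeunits("héllo wörld"))
```

Keeping every chunk the same length gives the inner loop a fixed trip count, at the cost of re-checking a few code units in the overlap.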
647

648
## string map, filter ##
649

650
function map(f, s::AbstractString)
19,688✔
651
    out = StringVector(max(4, sizeof(s)::Int÷sizeof(codeunit(s)::CodeunitType)))
19,697✔
652
    index = UInt(1)
17✔
653
    for c::AbstractChar in s
39,315✔
654
        c′ = f(c)
217,083✔
655
        isa(c′, AbstractChar) || throw(ArgumentError(
101✔
656
            "map(f, s::AbstractString) requires f to return AbstractChar; " *
657
            "try map(f, collect(s)) or a comprehension instead"))
658
        index + 3 > length(out) && resize!(out, unsigned(2 * length(out)))
217,082✔
659
        index += __unsafe_string!(out, convert(Char, c′), index)
217,156✔
660
    end
414,449✔
661
    resize!(out, index-1)
39,374✔
662
    sizehint!(out, index-1)
19,687✔
663
    return String(out)
19,687✔
664
end
665

666
function filter(f, s::AbstractString)
2✔
667
    out = IOBuffer(sizehint=sizeof(s))
4✔
668
    for c in s
2✔
669
        f(c) && write(out, c)
57✔
670
    end
57✔
671
    String(_unsafe_take!(out))
2✔
672
end
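
A brief usage sketch for these two methods. The key constraint, visible in the `map` definition above, is that `f` must return an `AbstractChar`.

```julia
upper   = map(uppercase, "julia")        # "JULIA"
shifted = map(c -> c + 1, "abc")         # "bcd": Char + Int is a Char
letters = filter(isletter, "a1b2c3")     # "abc"

# map(Int, "abc") throws an ArgumentError because Int('a') is not a character.
# Collect first when the result is not meant to be a string:
codes = map(Int, collect("abc"))         # [97, 98, 99]
```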
673

674
## string first and last ##
675

676
"""
677
    first(s::AbstractString, n::Integer)
678

679
Get a string consisting of the first `n` characters of `s`.
680

681
# Examples
682
```jldoctest
683
julia> first("∀ϵ≠0: ϵ²>0", 0)
684
""
685

686
julia> first("∀ϵ≠0: ϵ²>0", 1)
687
"∀"
688

689
julia> first("∀ϵ≠0: ϵ²>0", 3)
690
"∀ϵ≠"
691
```
692
"""
693
first(s::AbstractString, n::Integer) = @inbounds s[1:min(end, nextind(s, 0, n))]
26✔
694

695
"""
696
    last(s::AbstractString, n::Integer)
697

698
Get a string consisting of the last `n` characters of `s`.
699

700
# Examples
701
```jldoctest
702
julia> last("∀ϵ≠0: ϵ²>0", 0)
703
""
704

705
julia> last("∀ϵ≠0: ϵ²>0", 1)
706
"0"
707

708
julia> last("∀ϵ≠0: ϵ²>0", 3)
709
"²>0"
710
```
711
"""
712
last(s::AbstractString, n::Integer) = @inbounds s[max(1, prevind(s, ncodeunits(s)+1, n)):end]
8✔
713

714
"""
715
    reverseind(v, i)
716

717
Given an index `i` in [`reverse(v)`](@ref), return the corresponding index in
718
`v` so that `v[reverseind(v,i)] == reverse(v)[i]`. (This can be nontrivial in
719
cases where `v` contains non-ASCII characters.)
720

721
# Examples
722
```jldoctest
723
julia> s = "Julia🚀"
724
"Julia🚀"
725

726
julia> r = reverse(s)
727
"🚀ailuJ"
728

729
julia> for i in eachindex(s)
730
           print(r[reverseind(r, i)])
731
       end
732
Julia🚀
733
```
734
"""
735
reverseind(s::AbstractString, i::Integer) = thisind(s, ncodeunits(s)-i+1)
773✔
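
The identity stated in the docstring, `v[reverseind(v,i)] == reverse(v)[i]`, can be checked directly. This sketch uses a UTF-8 `String` with one 4-byte character:

```julia
s = "Julia🚀"            # 5 ASCII code units followed by a 4-code-unit emoji
r = reverse(s)           # "🚀ailuJ"

first_back = reverseind(s, 1)    # 6: the emoji starts at index 6 in s

# Every valid index of the reversed string maps back to the matching character.
holds = all(s[reverseind(s, i)] == r[i] for i in eachindex(r))
```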
736

737
"""
738
    repeat(s::AbstractString, r::Integer)
739

740
Repeat a string `r` times. This can be written as `s^r`.
741

742
See also [`^`](@ref :^(::Union{AbstractString, AbstractChar}, ::Integer)).
743

744
# Examples
745
```jldoctest
746
julia> repeat("ha", 3)
747
"hahaha"
748
```
749
"""
750
repeat(s::AbstractString, r::Integer) = repeat(String(s), r)
5✔
751

752
"""
753
    ^(s::Union{AbstractString,AbstractChar}, n::Integer) -> AbstractString
754

755
Repeat a string or character `n` times. This can also be written as `repeat(s, n)`.
756

757
See also [`repeat`](@ref).
758

759
# Examples
760
```jldoctest
761
julia> "Test "^3
762
"Test Test Test "
763
```
764
"""
765
(^)(s::Union{AbstractString,AbstractChar}, r::Integer) = repeat(s, r)
757,672✔
766

767
# reverse-order iteration for strings and indices thereof
768
iterate(r::Iterators.Reverse{<:AbstractString}, i=lastindex(r.itr)) = i < firstindex(r.itr) ? nothing : (r.itr[i], prevind(r.itr, i))
508,237✔
769
iterate(r::Iterators.Reverse{<:EachStringIndex}, i=lastindex(r.itr.s)) = i < firstindex(r.itr.s) ? nothing : (i, prevind(r.itr.s, i))
522,712✔
770

771
## code unit access ##
772

773
"""
774
    CodeUnits(s::AbstractString)
775

776
Wrap a string (without copying) in an immutable vector-like object that accesses the code units
777
of the string's representation.
778
"""
779
struct CodeUnits{T,S<:AbstractString} <: DenseVector{T}
780
    s::S
781
    CodeUnits(s::S) where {S<:AbstractString} = new{codeunit(s),S}(s)
462,510✔
782
end
783

784
length(s::CodeUnits) = ncodeunits(s.s)
34,453,490✔
785
sizeof(s::CodeUnits{T}) where {T} = ncodeunits(s.s) * sizeof(T)
73✔
786
size(s::CodeUnits) = (length(s),)
45,406✔
787
elsize(s::Type{<:CodeUnits{T}}) where {T} = sizeof(T)
3✔
788
@propagate_inbounds getindex(s::CodeUnits, i::Int) = codeunit(s.s, i)
43,241,637✔
789
IndexStyle(::Type{<:CodeUnits}) = IndexLinear()
1✔
790
@inline iterate(s::CodeUnits, i=1) = (i % UInt) - 1 < length(s) ? (@inbounds s[i], i + 1) : nothing
34,642,446✔
791

792

793
write(io::IO, s::CodeUnits) = write(io, s.s)
×
794

795
unsafe_convert(::Type{Ptr{T}},    s::CodeUnits{T}) where {T} = unsafe_convert(Ptr{T}, s.s)
48✔
796
unsafe_convert(::Type{Ptr{Int8}}, s::CodeUnits{UInt8}) = unsafe_convert(Ptr{Int8}, s.s)
1✔
797

798
"""
799
    codeunits(s::AbstractString)
800

801
Obtain a vector-like object containing the code units of a string.
802
Returns a `CodeUnits` wrapper by default, but `codeunits` may optionally be defined
803
for new string types if necessary.
804

805
# Examples
806
```jldoctest
807
julia> codeunits("Juλia")
808
6-element Base.CodeUnits{UInt8, String}:
809
 0x4a
810
 0x75
811
 0xce
812
 0xbb
813
 0x69
814
 0x61
815
```
816
"""
817
codeunits(s::AbstractString) = CodeUnits(s)
463,144✔
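
A sketch of working with the `CodeUnits` wrapper for a UTF-8 `String`. The continuation-byte mask used here is a property of UTF-8, not something this file defines.

```julia
s  = "Juλia"
cu = codeunits(s)                       # read-only view, no copy

n_units = length(cu)                    # 6, same as ncodeunits(s)
lead    = cu[3]                         # 0xce, first byte of 'λ'

# In UTF-8, continuation bytes look like 0b10xxxxxx; counting the rest
# gives the number of characters.
n_chars = sum(b -> (b & 0xc0) != 0x80, cu)   # 5

# Copy into a Vector when a mutable buffer is needed.
bytes = Vector{UInt8}(cu)
bytes[1] = UInt8('j')
lowered = String(copy(bytes))           # "juλia"; s itself is unchanged
```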
818

819
function _split_rest(s::AbstractString, n::Int)
1✔
820
    lastind = lastindex(s)
1✔
821
    i = try
1✔
822
        prevind(s, lastind, n)
1✔
823
    catch e
824
        e isa BoundsError || rethrow()
×
825
        _check_length_split_rest(length(s), n)
1✔
826
    end
827
    last_n = SubString(s, nextind(s, i), lastind)
1✔
828
    front = s[begin:i]
1✔
829
    return front, last_n
1✔
830
end