JuliaLang / julia / #37822

28 Jun 2024 01:08AM UTC coverage: 86.381% (-1.1%) from 87.488%
push

local

web-flow
inference: implement an opt-in interface to cache generated sources (#54916)

In Cassette-like systems, where inference has to infer many calls to
`@generated` functions whose generators perform complex code
transformations, the overhead of code generation itself can become
significant. This is because the results of code generation are not
cached, so the same code is generated repeatedly in the following contexts:
- `method_for_inference_heuristics` for regular inference on cached
`@generated` function calls (since
`method_for_inference_limit_heuristics` isn't stored in cached optimized
sources, but is attached to generated unoptimized sources).
- `retrieve_code_info` for constant propagation on cached `@generated`
function calls.

Having said that, caching unoptimized sources generated by `@generated`
functions is not a good tradeoff in general cases, considering the
memory space consumed (and the image bloat). The code generation for
generators like `GeneratedFunctionStub` produced by the front end is
generally very simple, and the first duplicated code generation
mentioned above does not occur for `GeneratedFunctionStub`.

So this unoptimized source caching should be enabled in an opt-in
manner.

Based on this idea, this commit defines the trait `abstract type
Core.CachedGenerator` as an interface for external systems to
opt in. If the generator is a subtype of this trait, inference caches
the generated unoptimized code, sacrificing memory space to improve the
performance of subsequent inferences. Specifically, the mechanism for
caching the unoptimized source uses the infrastructure already
implemented in JuliaLang/julia#54362. Thanks to JuliaLang/julia#54362,
the cache for generated functions is now partitioned by world age, so
even if the unoptimized source is cached, the existing invalidation
system will invalidate it as expected.
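
The opt-in idea described above can be illustrated with a small, self-contained sketch. It is not the code from this commit and does not touch `Core.CachedGenerator` or the compiler: `CachedGeneratorLike`, `PlainGen`, `ExpensiveGen`, `generate`, `get_source`, `CALLS`, and `SOURCE_CACHE` are all hypothetical names, and keying the cache on a single world number is a simplification of the world-age partitioning mentioned in the text.

```julia
# Hypothetical stand-in for the opt-in trait; NOT Core.CachedGenerator.
abstract type CachedGeneratorLike end

struct PlainGen end                              # does not opt in
struct ExpensiveGen <: CachedGeneratorLike end   # opts in by subtyping

const CALLS = Ref(0)   # counts how often "code generation" actually runs
const SOURCE_CACHE = Dict{Tuple{Any,UInt,Any},Expr}()

# Stand-in for an expensive code transformation.
function generate(gen, sig)
    CALLS[] += 1
    return :(identity($sig))
end

# Default: generators that did not opt in regenerate on every request.
get_source(gen, world::UInt, sig) = generate(gen, sig)

# Opted-in generators trade memory for speed: the generated source is
# cached per (generator, world, signature), so a new world misses the cache.
function get_source(gen::CachedGeneratorLike, world::UInt, sig)
    return get!(() -> generate(gen, sig), SOURCE_CACHE, (gen, world, sig))
end
```

Under these assumptions, calling `get_source(ExpensiveGen(), UInt(1), Tuple{Int})` twice runs `generate` once, while the same two calls with `PlainGen()` run it twice. Dispatching on the abstract type keeps the default path free of any caching cost.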

In JuliaDebug/CassetteOverlay.jl#56, the following benchmark result... (continued)

7 of 8 new or added lines in 1 file covered. (87.5%)

1385 existing lines in 56 files now uncovered.

76243 of 88264 relevant lines covered (86.38%)

15480525.49 hits per line

Source File

95.85
/stdlib/Dates/src/parse.jl
1
# This file is a part of Julia. License is MIT: https://julialang.org/license
2

3
### Parsing utilities
4

5
_directives(::Type{DateFormat{S,T}}) where {S,T} = T.parameters
244✔
6

7
character_codes(df::Type{DateFormat{S,T}}) where {S,T} = character_codes(_directives(df))
166✔
8
function character_codes(directives::Core.SimpleVector)
244✔
9
    letters = sizehint!(Char[], length(directives))
244✔
10
    for (i, directive) in enumerate(directives)
488✔
11
        if directive <: DatePart
1,512✔
12
            letter = first(directive.parameters)
913✔
13
            push!(letters, letter)
913✔
14
        end
15
    end
2,780✔
16
    return letters
244✔
17
end
18

19
genvar(t::DataType) = Symbol(lowercase(string(nameof(t))))
1,938✔
20

21
"""
22
    tryparsenext_core(str::AbstractString, pos::Int, len::Int, df::DateFormat, raise=false)
23

24
Parse the string according to the directives within the `DateFormat`. Parsing will start at
25
character index `pos` and will stop when all directives are used or we have parsed up to
26
the end of the string, `len`. When a directive cannot be parsed the returned value
27
will be `nothing` if `raise` is false otherwise an exception will be thrown.
28

29
If successful, return a 3-element tuple `(values, pos, num_parsed)`:
30
* `values::Tuple`: A tuple which contains a value
31
  for each `DatePart` within the `DateFormat` in the order
32
  in which they occur. If the string ends before we finish parsing all the directives
33
  the missing values will be filled in with default values.
34
* `pos::Int`: The character index at which parsing stopped.
35
* `num_parsed::Int`: The number of values which were parsed and stored within `values`.
36
  Useful for distinguishing parsed values from default values.
37
"""
38
@generated function tryparsenext_core(str::AbstractString, pos::Int, len::Int,
237✔
39
                                      df::DateFormat, raise::Bool=false)
40
    directives = _directives(df)
78✔
41
    letters = character_codes(directives)
78✔
42

43
    tokens = Type[CONVERSION_SPECIFIERS[letter] for letter in letters]
78✔
44
    value_names = Symbol[genvar(t) for t in tokens]
78✔
45
    value_defaults = Tuple(CONVERSION_DEFAULTS[t] for t in tokens)
78✔
46

47
    # Pre-assign variables to defaults. Allows us to use `@goto done` without worrying about
48
    # unassigned variables.
49
    assign_defaults = Expr[]
78✔
50
    for (name, default) in zip(value_names, value_defaults)
156✔
51
        push!(assign_defaults, quote
295✔
52
            $name = $default
237✔
53
        end)
54
    end
512✔
55

56
    vi = 1
78✔
57
    parsers = Expr[]
78✔
58
    for i = 1:length(directives)
78✔
59
        if directives[i] <: DatePart
491✔
60
            name = value_names[vi]
295✔
61
            vi += 1
295✔
62
            push!(parsers, quote
295✔
63
                pos > len && @goto done
746✔
64
                let val = tryparsenext(directives[$i], str, pos, len, locale)
1,407✔
65
                    val === nothing && @goto error
743✔
66
                    $name, pos = val
732✔
67
                end
68
                num_parsed += 1
732✔
69
                directive_index += 1
732✔
70
            end)
71
        else
72
            push!(parsers, quote
196✔
73
                pos > len && @goto done
498✔
74
                let val = tryparsenext(directives[$i], str, pos, len, locale)
970✔
75
                    val === nothing && @goto error
478✔
76
                    delim, pos = val
463✔
77
                end
78
                directive_index += 1
463✔
79
            end)
80
        end
81
    end
491✔
82

83
    return quote
78✔
84
        directives = df.tokens
237✔
85
        locale::DateLocale = df.locale
237✔
86

87
        num_parsed = 0
237✔
88
        directive_index = 1
237✔
89

90
        $(assign_defaults...)
91
        $(parsers...)
92

93
        pos > len || @goto error
188✔
94

95
        @label done
96
        return $(Expr(:tuple, value_names...)), pos, num_parsed
205✔
97

98
        @label error
99
        if raise
32✔
100
            if directive_index > length(directives)
31✔
101
                throw(ArgumentError("Found extra characters at the end of date time string"))
6✔
102
            else
103
                d = directives[directive_index]
25✔
104
                throw(ArgumentError("Unable to parse date time. Expected directive $d at char $pos"))
25✔
105
            end
106
        end
107
        return nothing
1✔
108
    end
109
end
110

111
"""
112
    tryparsenext_internal(::Type{<:TimeType}, str, pos, len, df::DateFormat, raise=false)
113

114
Parse the string according to the directives within the `DateFormat`. The specified `TimeType`
115
type determines the type of and order of tokens returned. If the given `DateFormat` or string
116
does not provide a required token a default value will be used. When the string cannot be
117
parsed the returned value will be `nothing` if `raise` is false otherwise an exception will
118
be thrown.
119

120
If successful, returns a 2-element tuple `(values, pos)`:
121
* `values::Tuple`: A tuple which contains a value
122
  for each token as specified by the passed in type.
123
* `pos::Int`: The character index at which parsing stopped.
124
"""
125
@generated function tryparsenext_internal(::Type{T}, str::AbstractString, pos::Int, len::Int,
417✔
126
                                          df::DateFormat, raise::Bool=false) where T<:TimeType
127
    letters = character_codes(df)
322✔
128

129
    tokens = Type[CONVERSION_SPECIFIERS[letter] for letter in letters]
161✔
130
    value_names = Symbol[genvar(t) for t in tokens]
161✔
131

132
    output_tokens = CONVERSION_TRANSLATIONS[T]
161✔
133
    output_names = Symbol[genvar(t) for t in output_tokens]
1,214✔
134
    output_defaults = Tuple(CONVERSION_DEFAULTS[t] for t in output_tokens)
161✔
135

136
    # Pre-assign output variables to defaults. Ensures that all output variables are
137
    # assigned as the value tuple returned from `tryparsenext_core` may not include all
138
    # of the required variables.
139
    assign_defaults = Expr[
1,214✔
140
        quote
141
            $name = $default
193✔
142
        end
143
        for (name, default) in zip(output_names, output_defaults)
144
    ]
145

146
    # Unpacks the value tuple returned by `tryparsenext_core` into separate variables.
147
    value_tuple = Expr(:tuple, value_names...)
161✔
148

149
    return quote
161✔
150
        val = tryparsenext_core(str, pos, len, df, raise)
224✔
151
        val === nothing && return nothing
194✔
152
        values, pos, num_parsed = val
193✔
153
        $(assign_defaults...)
154
        $value_tuple = values
193✔
155
        return $(Expr(:tuple, output_names...)), pos
193✔
156
    end
157
end
158

159
@inline function tryparsenext_sign(str::AbstractString, i::Int, len::Int)
160
    i > len && return nothing
198✔
161
    c, ii = iterate(str, i)::Tuple{Char, Int}
396✔
162
    if c == '+'
198✔
163
        return 1, ii
4✔
164
    elseif c == '-'
194✔
165
        return -1, ii
6✔
166
    else
167
        return nothing
188✔
168
    end
169
end
170

171
@inline function tryparsenext_base10(str::AbstractString, i::Int, len::Int, min_width::Int=1, max_width::Int=0)
172
    i > len && return nothing
823✔
173
    min_pos = min_width <= 0 ? i : i + min_width - 1
1,453✔
174
    max_pos = max_width <= 0 ? len : min(i + max_width - 1, len)
845✔
175
    d::Int64 = 0
775✔
176
    @inbounds while i <= max_pos
775✔
177
        c, ii = iterate(str, i)::Tuple{Char, Int}
4,635✔
178
        if '0' <= c <= '9'
2,319✔
179
            d = d * 10 + (c - '0')
1,854✔
180
        else
181
            break
465✔
182
        end
183
        i = ii
1,854✔
184
    end
1,854✔
185
    if i <= min_pos
775✔
186
        return nothing
3✔
187
    else
188
        return d, i
772✔
189
    end
190
end
191

192
@inline function tryparsenext_word(str::AbstractString, i, len, locale, maxchars=0)
193
    word_start, word_end = i, 0
35✔
194
    max_pos = maxchars <= 0 ? len : min(len, nextind(str, i, maxchars-1))
41✔
195
    @inbounds while i <= max_pos
35✔
196
        c, ii = iterate(str, i)::Tuple{Char, Int}
275✔
197
        if isletter(c)
142✔
198
            word_end = i
111✔
199
        else
200
            break
28✔
201
        end
202
        i = ii
111✔
203
    end
111✔
204
    if word_end == 0
35✔
205
        return nothing
2✔
206
    else
207
        return SubString(str, word_start, word_end), i
33✔
208
    end
209
end
210

211
function Base.parse(::Type{DateTime}, s::AbstractString, df::typeof(ISODateTimeFormat))
28✔
212
    i, end_pos = firstindex(s), lastindex(s)
52✔
213
    i > end_pos && throw(ArgumentError("Cannot parse an empty string as a DateTime"))
28✔
214

215
    coefficient = 1
24✔
216
    local dy
217
    dm = dd = Int64(1)
24✔
218
    th = tm = ts = tms = Int64(0)
24✔
219

220
    # Optional sign
221
    let val = tryparsenext_sign(s, i, end_pos)
48✔
222
        if val !== nothing
24✔
223
            coefficient, i = val
4✔
224
        end
225
    end
226

227
    let val = tryparsenext_base10(s, i, end_pos, 1)
46✔
228
        val === nothing && @goto error
24✔
229
        dy, i = val
22✔
230
        i > end_pos && @goto done
22✔
231
    end
232

233
    c, i = iterate(s, i)::Tuple{Char, Int}
42✔
234
    c != '-' && @goto error
21✔
235
    i > end_pos && @goto done
21✔
236

237
    let val = tryparsenext_base10(s, i, end_pos, 1, 2)
42✔
238
        val === nothing && @goto error
21✔
239
        dm, i = val
21✔
240
        i > end_pos && @goto done
21✔
241
    end
242

243
    c, i = iterate(s, i)::Tuple{Char, Int}
42✔
244
    c != '-' && @goto error
21✔
245
    i > end_pos && @goto done
21✔
246

247
    let val = tryparsenext_base10(s, i, end_pos, 1, 2)
42✔
248
        val === nothing && @goto error
21✔
249
        dd, i = val
21✔
250
        i > end_pos && @goto done
21✔
251
    end
252

253
    c, i = iterate(s, i)::Tuple{Char, Int}
22✔
254
    c != 'T' && @goto error
11✔
255
    i > end_pos && @goto done
11✔
256

257
    let val = tryparsenext_base10(s, i, end_pos, 1, 2)
22✔
258
        val === nothing && @goto error
11✔
259
        th, i = val
11✔
260
        i > end_pos && @goto done
11✔
261
    end
262

263
    c, i = iterate(s, i)::Tuple{Char, Int}
22✔
264
    c != ':' && @goto error
11✔
265
    i > end_pos && @goto done
11✔
266

267
    let val = tryparsenext_base10(s, i, end_pos, 1, 2)
22✔
268
        val === nothing && @goto error
11✔
269
        tm, i = val
11✔
270
        i > end_pos && @goto done
11✔
271
    end
272

273
    c, i = iterate(s, i)::Tuple{Char, Int}
22✔
274
    c != ':' && @goto error
11✔
275
    i > end_pos && @goto done
11✔
276

277
    let val = tryparsenext_base10(s, i, end_pos, 1, 2)
22✔
278
        val === nothing && @goto error
11✔
279
        ts, i = val
11✔
280
        i > end_pos && @goto done
11✔
281
    end
282

UNCOV
283
    c, i = iterate(s, i)::Tuple{Char, Int}
×
UNCOV
284
    c != '.' && @goto error
×
UNCOV
285
    i > end_pos && @goto done
×
286

UNCOV
287
    let val = tryparsenext_base10(s, i, end_pos, 1, 3)
×
UNCOV
288
        val === nothing && @goto error
×
UNCOV
289
        tms, j = val
×
UNCOV
290
        tms *= 10 ^ (3 - (j - i))
×
UNCOV
291
        j > end_pos || @goto error
×
292
    end
293

294
    @label done
295
    return DateTime(dy * coefficient, dm, dd, th, tm, ts, tms)
22✔
296

297
    @label error
298
    throw(ArgumentError("Invalid DateTime string"))
2✔
299
end
300

301
function Base.parse(::Type{T}, str::AbstractString, df::DateFormat=default_format(T)) where T<:TimeType
244✔
302
    pos, len = firstindex(str), lastindex(str)
465✔
303
    pos > len && throw(ArgumentError("Cannot parse an empty string as a Date or Time"))
238✔
304
    val = tryparsenext_internal(T, str, pos, len, df, true)
412✔
305
    @assert val !== nothing
191✔
306
    values, endpos = val
191✔
307
    return T(values...)::T
191✔
308
end
309

310
function Base.tryparse(::Type{T}, str::AbstractString, df::DateFormat=default_format(T)) where T<:TimeType
13✔
311
    pos, len = firstindex(str), lastindex(str)
16✔
312
    pos > len && return nothing
9✔
313
    res = tryparsenext_internal(T, str, pos, len, df, false)
5✔
314
    res === nothing && return nothing
3✔
315
    values, endpos = res
2✔
316
    if validargs(T, values...) === nothing
2✔
317
        # TODO: validargs gets called twice, since it's called again in the T constructor
318
        return T(values...)::T
1✔
319
    end
320
    return nothing
1✔
321
end
322

323
"""
324
    parse_components(str::AbstractString, df::DateFormat) -> Array{Any}
325

326
Parse the string into its components according to the directives in the `DateFormat`.
327
Each component will be a distinct type, typically a subtype of Period. The order of the
328
components will match the order of the `DatePart` directives within the `DateFormat`. The
329
number of components may be less than the total number of `DatePart`.
330
"""
331
@generated function parse_components(str::AbstractString, df::DateFormat)
13✔
332
    letters = character_codes(df)
10✔
333
    tokens = Type[CONVERSION_SPECIFIERS[letter] for letter in letters]
5✔
334

335
    return quote
5✔
336
        pos, len = firstindex(str), lastindex(str)
26✔
337
        val = tryparsenext_core(str, pos, len, df, #=raise=#true)
13✔
338
        @assert val !== nothing
12✔
339
        values, pos, num_parsed = val
12✔
340
        types = $(Expr(:tuple, tokens...))
12✔
341
        result = Vector{Any}(undef, num_parsed)
24✔
342
        for (i, typ) in enumerate(types)
12✔
343
            i > num_parsed && break
53✔
344
            result[i] = typ(values[i])  # Constructing types takes most of the time
53✔
345
        end
53✔
346
        return result
12✔
347
    end
348
end
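
For orientation, the public entry points defined in this file can be exercised as in the short sketch below. The input strings and formats are illustrative choices, not taken from the Julia test suite. The last call uses a fractional-second timestamp, which should reach the branch at source lines 283-291 that this report marks as uncovered.

```julia
using Dates

# Generic path: `parse` with an explicit DateFormat goes through
# `tryparsenext_internal` and throws an ArgumentError on failure.
d = parse(Date, "28/06/2024", dateformat"dd/mm/yyyy")

# `tryparse` returns `nothing` instead of throwing, both when the string
# does not match the format and when the parsed values are not a valid date.
unparsable = tryparse(Date, "not a date", dateformat"dd/mm/yyyy")
invalid_day = tryparse(Date, "31/02/2024", dateformat"dd/mm/yyyy")

# Specialized ISO path: the hand-written method for `ISODateTimeFormat`.
# A single fractional digit is scaled to milliseconds (".5" becomes 500 ms).
dt = parse(DateTime, "2024-06-28T01:08:00.5", Dates.ISODateTimeFormat)
```

Here `d` is `Date(2024, 6, 28)`, both `tryparse` calls yield `nothing`, and `dt` carries 500 milliseconds.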