
nci / scores · build 21655129859

04 Feb 2026 01:46AM UTC coverage: 99.902% (-0.1%) from 100.0%
Pull Request #971: Replace interp1d — merge 6fb5759c8 into 26d8907f0 (via GitHub, web-flow)

425 of 425 branches covered (100.0%). Branch coverage is included in the aggregate %.
10 of 13 new or added lines in 1 file covered (76.92%).
3 existing lines in 1 file now uncovered.
2647 of 2650 relevant lines covered (99.89%), at 4.0 hits per line.

Source file: /src/scores/processing/isoreg_impl.py (98.7% covered)
"""
An implementation of isotonic regression using the pool adjacent violators (PAV)
algorithm. In the context of forecast verification, the regression explanatory variable
is the forecast and the response variable is the observation. Forecasts and observations
are assumed to be real-valued quantities.

Isotonic regression exposes conditional forecast biases. Confidence bands for the
regression fit can also be returned. This implementation includes an option to specify
which functional the forecast is targeting, so that an appropriate regression can be
performed on different types of real-valued forecasts, such as those which are a
quantile of the predictive distribution.
"""

from functools import partial
from typing import Callable, Literal, Optional, Tuple, Union

import numpy as np
import xarray as xr
from scipy.optimize import isotonic_regression


def isotonic_fit(  # pylint: disable=too-many-locals, too-many-arguments
    fcst: Union[np.ndarray, xr.DataArray],
    obs: Union[np.ndarray, xr.DataArray],
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[Union[np.ndarray, xr.DataArray]] = None,
    functional: Optional[Literal["mean", "quantile"]] = "mean",
    bootstraps: Optional[int] = None,
    quantile_level: Optional[float] = None,
    solver: Optional[Union[Callable[[np.ndarray, np.ndarray], float], Callable[[np.ndarray], float]]] = None,
    confidence_level: Optional[float] = 0.9,
    min_non_nan: Optional[int] = 1,
    report_bootstrap_results: Optional[bool] = False,
) -> dict:
    """
    Calculates an isotonic regression fit to observations given forecasts as the
    explanatory values. Optionally returns confidence bands on the fit via bootstrapping.
    Forecasts and observations are scalar-valued (i.e., integers, floats or NaN).

    If forecasts target the mean or a quantile functional, then the user can select that
    functional. Otherwise the user needs to supply a `solver` function

        - of one variable (if no weights are supplied), or
        - of two variables (if weights are supplied).

    The solver function is used to find an optimal regression fit in each block, subject
    to a non-decreasing monotonicity constraint, using the pool adjacent violators (PAV)
    algorithm.

    An example of a solver function of one variable is obs -> np.mean(obs).
    An example of a solver function of two variables is (obs, weight) -> np.average(obs, weights=weight).

    Ties (where one forecast value has more than one observation value) are handled as follows.
    Forecasts are sorted in ascending order, and observations sorted to match. Where ties occur,
    observations are sorted in descending order within each subset of ties. Isotonic regression
    is then performed on this sorted observation sequence.

    This implementation uses scipy.optimize.isotonic_regression when `functional="mean"`.

    Args:
        fcst: np.array or xr.DataArray of forecast (or explanatory) values. Values must
            be float, integer or NaN.
        obs: np.array or xr.DataArray of corresponding observation (or response) values.
            Must be the same shape as `fcst`. Values must be float, integer or NaN.
        weight: positive weights to apply to each forecast-observation pair. Must be the
            same shape as `fcst`, or `None` (which is equivalent to applying equal weights).
        functional: Functional that the forecasts are targeting. Either "mean" or
            "quantile" or `None`. If `None` then `solver` must be supplied. The current
            implementation for "quantile" does not accept weights. If weighted quantile
            regression is desired then the user should supply an appropriate solver.
        bootstraps: the number of bootstrap samples to perform for calculating the
            regression confidence band. Set to `None` if a confidence band is not required.
        quantile_level: the level of the quantile functional if `functional="quantile"`.
            Must be strictly between 0 and 1.
        solver: function that accepts a 1D numpy array of observations and returns
            a float. Function values give the regression fit for each block as determined
            by the PAV algorithm. If weights are supplied, the function must have a second
            argument that accepts a 1D numpy array of weights of the same length as the
            observations.
        confidence_level: Confidence level of the confidence band, strictly between 0 and 1.
            For example, a confidence level of 0.5 will calculate the confidence band by
            computing the 25th and 75th percentiles of the bootstrapped samples.
        min_non_nan: The minimum number of non-NaN bootstrapped values required to
            calculate confidence bands at a particular forecast value.
        report_bootstrap_results: Specifies whether to keep the bootstrapped regression
            values in the returned dictionary. Default is False, to keep the output
            result small.

    Returns:
        Dictionary with the following keys:

        - "fcst_sorted": 1D numpy array of remaining forecast values sorted in
            ascending order, after any NaNs from `fcst`, `obs` and `weight` are removed,
            with only unique values kept to keep the output size reasonably small.
        - "fcst_counts": 1D numpy array of counts for each unique sorted forecast value.
        - "regression_values": 1D numpy array of regression values corresponding to
            "fcst_sorted" values.
        - "regression_func": function that returns the regression fit based on linear
            interpolation of ("fcst_sorted", "regression_values"), for any supplied
            argument (1D numpy array) of potential forecast values.
        - "bootstrap_results": in the case of `report_bootstrap_results=True`, a 2D numpy
            array of bootstrapped regression values is included in the returned dictionary.
            Each row gives the interpolated regression values from a particular bootstrapped
            sample, evaluated at the tidied (non-unique) sorted forecast values. If `m` is
            the number of bootstrap samples and `n` is the number of tidied forecast values
            then it has shape `(m, n)`. We emphasise that this array corresponds to the
            tidied sorted forecasts, not the unique values in "fcst_sorted".
        - "confidence_band_lower_values": values of the lower confidence band threshold,
            evaluated at "fcst_sorted" values.
        - "confidence_band_upper_values": values of the upper confidence band threshold,
            evaluated at "fcst_sorted" values.
        - "confidence_band_lower_func": function that returns the regression fit based on
            linear interpolation of ("fcst_sorted", "confidence_band_lower_values"), given
            any argument (1D numpy array) of potential forecast values.
        - "confidence_band_upper_func": function that returns the regression fit based on
            linear interpolation of ("fcst_sorted", "confidence_band_upper_values"), given
            any argument (1D numpy array) of potential forecast values.
        - "confidence_band_levels": tuple giving the quantile levels used to calculate the
            confidence band.

    Raises:
        ValueError: if `fcst` and `obs` are np.arrays and don't have the same shape.
        ValueError: if `fcst` and `weight` are np.arrays and don't have the same shape.
        ValueError: if `fcst` and `obs` are xarray.DataArrays and don't have the same dimensions.
        ValueError: if `fcst` and `weight` are xarray.DataArrays and don't have the same dimensions.
        ValueError: if any entries of `fcst`, `obs` or `weight` are not an integer, float or NaN.
        ValueError: if there are no `fcst` and `obs` pairs after NaNs are removed.
        ValueError: if `functional` is not one of "mean", "quantile" or `None`.
        ValueError: if `quantile_level` is not strictly between 0 and 1 when `functional="quantile"`.
        ValueError: if `weight` is not `None` when `functional="quantile"`.
        ValueError: if `functional` and `solver` are both `None`.
        ValueError: if neither `functional` nor `solver` is `None`.
        ValueError: if `bootstraps` is not a positive integer.
        ValueError: if `confidence_level` is not strictly between 0 and 1.

    Note: This function only keeps the unique values of the tidied sorted forecasts, to
        keep the volume of the return dictionary small. The forecast counts are also
        included, so users can use them to create a forecast histogram (usually displayed
        in reliability diagrams).

    References:
        - de Leeuw, Hornik and Mair. "Isotone Optimization in R: Pool-Adjacent-Violators
          Algorithm (PAVA) and Active Set Methods", Journal of Statistical Software, 2009.
        - Dimitriadis, Gneiting and Jordan. "Stable reliability diagrams for probabilistic
          classifiers", PNAS, Vol. 118 No. 8, 2020. Available at
          https://www.pnas.org/doi/10.1073/pnas.2016191118
        - Jordan, Mühlemann and Ziegel. "Optimal solutions to the isotonic regression
          problem", 2020 (version 2), available on arxiv at https://arxiv.org/abs/1904.04761
    """
    if isinstance(fcst, xr.DataArray):
        fcst, obs, weight = _xr_to_np(fcst, obs, weight)  # type: ignore
    # now fcst, obs and weight (unless None) are np.arrays

    _iso_arg_checks(
        fcst,  # type: ignore
        obs,  # type: ignore
        weight=weight,  # type: ignore
        functional=functional,
        quantile_level=quantile_level,
        solver=solver,
        bootstraps=bootstraps,
        confidence_level=confidence_level,
    )
    fcst_tidied, obs_tidied, weight_tidied = _tidy_ir_inputs(fcst, obs, weight=weight)  # type: ignore
    y_out = _do_ir(
        obs_tidied,
        weight=weight_tidied,
        functional=functional,
        quantile_level=quantile_level,
        solver=solver,
    )

    # calculate the fitting function
    ir_func = _get_interp1d_func(fcst_tidied, y_out)

    if bootstraps is not None:
        boot_results = _bootstrap_ir(
            fcst=fcst_tidied,
            obs=obs_tidied,
            weight=weight_tidied,
            functional=functional,
            quantile_level=quantile_level,
            solver=solver,
            bootstraps=bootstraps,
        )

        lower_pts, upper_pts = _confidence_band(boot_results, confidence_level, min_non_nan)  # type: ignore

        lower_func = _get_interp1d_func(fcst_tidied, lower_pts)
        upper_func = _get_interp1d_func(fcst_tidied, upper_pts)

        confband_levels = ((1 - confidence_level) / 2, 1 - (1 - confidence_level) / 2)  # type: ignore

    else:
        boot_results = lower_pts = upper_pts = None  # type: ignore
        lower_func = upper_func = partial(np.full_like, fill_value=np.nan)
        confband_levels = (None, None)  # type: ignore
    # To reduce the size of the output dictionary, we only keep the unique forecast
    # values, with regression and confidence band values calculated at those unique
    # values. We also calculate forecast counts that can be used to create the
    # forecast histogram.
    unique_fcst_sorted, fcst_counts = np.unique(fcst_tidied, return_counts=True)
    regression_values = ir_func(unique_fcst_sorted)
    if bootstraps:
        lower_pts = lower_func(unique_fcst_sorted)
        upper_pts = upper_func(unique_fcst_sorted)
    results = {
        "fcst_sorted": unique_fcst_sorted,
        "fcst_counts": fcst_counts,
        "regression_values": regression_values,
        "regression_func": ir_func,
        "confidence_band_lower_values": lower_pts,
        "confidence_band_upper_values": upper_pts,
        "confidence_band_lower_func": lower_func,
        "confidence_band_upper_func": upper_func,
        "confidence_band_levels": confband_levels,
    }
    if report_bootstrap_results:
        results["bootstrap_results"] = boot_results

    return results
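
# Illustrative sketch (not part of the module): the unique-value reduction at the end
# of `isotonic_fit` uses `np.unique` with `return_counts=True`. The `_demo_*` names
# below are hypothetical example data.
import numpy as np

_demo_fcst_tidied = np.array([1.0, 1.0, 2.0, 3.0, 3.0, 3.0])
_demo_unique, _demo_counts = np.unique(_demo_fcst_tidied, return_counts=True)
# _demo_unique is [1., 2., 3.] and _demo_counts is [2, 1, 3]: the counts give the
# bar heights for a forecast histogram in a reliability diagram.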


def _xr_to_np(
    fcst: xr.DataArray, obs: xr.DataArray, weight: Optional[xr.DataArray]
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Conducts basic dimension checks on `isotonic_fit` inputs if they are xr.DataArray,
    then converts the xr.DataArray objects into corresponding numpy arrays with
    aligned datapoints.

    Args:
        fcst: DataArray of forecast values.
        obs: DataArray of observed values.
        weight: weights to apply to each forecast-observation pair. Must be the same
            shape as `fcst`, or `None`.

    Returns:
        numpy arrays of `fcst`, `obs` and `weight`.

    Raises:
        ValueError: if `fcst` and `obs` do not have the same dimensions.
        ValueError: if `fcst` and `weight` do not have the same dimensions.
    """
    fcst_dims = fcst.dims

    if set(fcst_dims) != set(obs.dims):
        raise ValueError("`fcst` and `obs` must have same dimensions.")

    if weight is not None:
        if set(fcst_dims) != set(weight.dims):
            raise ValueError("`fcst` and `weight` must have same dimensions.")
        merged_ds = xr.merge([fcst.rename("fcst"), obs.rename("obs"), weight.rename("weight")], join="outer")
        weight = merged_ds["weight"].transpose(*fcst_dims).values  # type: ignore
    else:
        merged_ds = xr.merge([fcst.rename("fcst"), obs.rename("obs")], join="outer")

    fcst = merged_ds["fcst"].transpose(*fcst_dims).values  # type: ignore
    obs = merged_ds["obs"].transpose(*fcst_dims).values  # type: ignore

    return fcst, obs, weight  # type: ignore


def _iso_arg_checks(  # pylint: disable=too-many-arguments, too-many-branches
    fcst: np.ndarray,
    obs: np.ndarray,
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[Union[np.ndarray, xr.DataArray]] = None,
    functional: Optional[str] = None,
    quantile_level: Optional[float] = None,
    solver: Optional[Union[Callable[[np.ndarray, np.ndarray], float], Callable[[np.ndarray], float]]] = None,
    bootstraps: Optional[int] = None,
    confidence_level: Optional[float] = None,
) -> None:
    """Raises a ValueError if `isotonic_fit` arguments are invalid."""
    if fcst.shape != obs.shape:
        raise ValueError("`fcst` and `obs` must have same shape.")

    if not (np.issubdtype(fcst.dtype, np.integer) or np.issubdtype(fcst.dtype, np.floating)):
        raise ValueError("`fcst` must be an array of floats or integers.")

    if not (np.issubdtype(obs.dtype, np.integer) or np.issubdtype(obs.dtype, np.floating)):
        raise ValueError("`obs` must be an array of floats or integers.")

    if weight is not None:
        if fcst.shape != weight.shape:
            raise ValueError("`fcst` and `weight` must have same shape.")

        if not (np.issubdtype(weight.dtype, np.integer) or np.issubdtype(weight.dtype, np.floating)):
            raise ValueError("`weight` must be an array of floats or integers, or else `None`.")

        if np.any(weight <= 0):
            raise ValueError("`weight` values must be either positive or NaN.")

    if functional not in ["mean", "quantile", None]:
        raise ValueError("`functional` must be one of 'mean', 'quantile' or `None`.")

    if functional == "quantile" and not 0 < quantile_level < 1:  # type: ignore
        raise ValueError("`quantile_level` must be strictly between 0 and 1.")

    if functional == "quantile" and weight is not None:
        msg = "Weighted quantile isotonic regression has not been implemented. "
        msg += "Either (i) set `weight=None` or (ii) set `functional=None` and supply an appropriate "
        msg += "quantile `solver` function with two arguments."
        raise ValueError(msg)

    if functional is None and solver is None:
        raise ValueError("`functional` and `solver` cannot both be `None`.")

    if None not in (functional, solver):
        raise ValueError("One of `functional` or `solver` must be `None`.")

    if bootstraps is not None:
        if not isinstance(bootstraps, int) or bootstraps < 1:
            raise ValueError("`bootstraps` must be a positive integer.")
        if not 0 < confidence_level < 1:  # type: ignore
            raise ValueError("`confidence_level` must be strictly between 0 and 1.")


def _tidy_ir_inputs(
    fcst: np.ndarray,
    obs: np.ndarray,
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[np.ndarray] = None,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Tidies array inputs to `isotonic_fit` in preparation for the regression routine.

    Arrays are flattened to 1D arrays. NaNs are jointly removed from `fcst`, `obs` and
    `weight`. The array data is then sorted by fcst (ascending) then by obs (descending).

    Args:
        fcst: numpy array of forecast values.
        obs: numpy array of corresponding observed values, with the same shape as `fcst`.
        weight: Either `None`, or numpy array of corresponding weights with the same
            shape as `fcst`.

    Returns:
        Tidied fcst, obs and weight arrays. If the input weight is `None` then the
        returned weight is `None`.

    Raises:
        ValueError: if no forecast and observation pairs remain after removing NaNs.
    """
    fcst = fcst.flatten()
    obs = obs.flatten()

    if weight is not None:
        weight = weight.flatten()

    nan_locs = np.isnan(fcst) | np.isnan(obs)

    if weight is not None:
        nan_locs = nan_locs | np.isnan(weight)

    new_fcst = fcst[~nan_locs]
    new_obs = obs[~nan_locs]

    if len(new_fcst) == 0:
        raise ValueError("No (fcst, obs) pairs remaining after NaNs removed.")

    # deal with ties in fcst: sort by fcst (ascending) then by obs (descending).
    # this ensures that ties in fcst get the same regression value in the PAV algorithm
    sorter = np.lexsort((-new_obs, new_fcst))
    new_fcst = new_fcst[sorter]
    new_obs = new_obs[sorter]

    if weight is None:
        new_weight = None
    else:
        new_weight = weight[~nan_locs]
        new_weight = new_weight[sorter]

    return new_fcst, new_obs, new_weight  # type: ignore
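
# Illustrative sketch (not part of the module): the tie-handling in `_tidy_ir_inputs`
# uses `np.lexsort`, whose *last* key is the primary sort key. The `_demo_*` names
# below are hypothetical example data.
import numpy as np

_demo_fcst = np.array([2.0, 1.0, 2.0])
_demo_obs = np.array([5.0, 4.0, 7.0])
# sort by fcst ascending; within fcst ties, by obs descending
_demo_sorter = np.lexsort((-_demo_obs, _demo_fcst))
# _demo_fcst[_demo_sorter] is [1., 2., 2.] and _demo_obs[_demo_sorter] is [4., 7., 5.]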


def _do_ir(  # pylint: disable=too-many-arguments
    obs: np.ndarray,
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[np.ndarray] = None,
    functional: Optional[Literal["mean", "quantile"]] = None,
    quantile_level: Optional[float] = None,
    solver: Optional[Union[Callable[[np.ndarray, np.ndarray], float], Callable[[np.ndarray], float]]] = None,
) -> np.ndarray:
    """
    Returns the isotonic regression (IR) fit for the specified functional or solver,
    by passing the inputs to the appropriate IR algorithm.

    The inputs `obs` and `weight` are 1D numpy arrays assumed to have been
    mutually tidied via `_tidy_ir_inputs`.

    Args:
        obs: tidied array of observed values.
        weight: tidied array of weights.
        functional: either "mean", "quantile" or `None`.
        quantile_level: float strictly between 0 and 1 if `functional="quantile"`,
            else `None`.
        solver: solver function as described in the docstring of `isotonic_fit`, or
            `None`. Cannot be `None` if `functional` is `None`.

    Returns:
        1D numpy array of values fitted using isotonic regression.
    """
    if functional == "mean":
        y_out = _contiguous_mean_ir(obs, weight=weight)
    elif functional == "quantile":
        y_out = _contiguous_quantile_ir(obs, quantile_level)  # type: ignore
    else:
        y_out = _contiguous_ir(obs, solver, weight=weight)  # type: ignore

    return y_out


def _contiguous_ir(
    y: np.ndarray,
    solver: Union[Callable[[np.ndarray, np.ndarray], float], Callable[[np.ndarray], float]],
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[np.ndarray] = None,
) -> np.ndarray:
    """
    Given a 1D numpy array `y` of response values, returns the fitted isotonic
    regression, with the value of each valid regression block determined by `solver`.

    This implementation uses the pool adjacent violators (PAV) algorithm and is based
    on the squared-loss implementation at
    sklearn._isotonic._inplace_contiguous_isotonic_regression
    https://github.com/scikit-learn/scikit-learn/blob/a6574655478a24247189236ce5f824e65ad0d369/sklearn/_isotonic.pyx
    For an animated visualisation of the algorithm, see
    https://www.r-bloggers.com/2020/05/what-is-isotonic-regression/

    Args:
        y: 1D numpy array of response (i.e., observed) values.
        solver: function which determines the value of each valid regression block
            in the PAV algorithm, given the response values in the block.
        weight: 1D numpy array of weights, or else `None`.
            If `weight` is not `None` then `solver` takes in two arrays as arguments
            (response values and weights). If `weight` is `None` then `solver` takes
            in a single array as argument (response values).

    Returns:
        1D numpy array of fitted regression values.

    Raises:
        ValueError: if `weight` is not `None` and `y` and `weight` have different lengths.
    """
    len_y = len(y)

    if weight is not None and len(weight) != len_y:
        raise ValueError("`y` and `weight` must have same length.")

    # y_out will be the final regression output
    y_out = y.copy()

    # target describes a list of blocks. At any time, if [i..j] (inclusive) is
    # an active block, then target[i] := j and target[j] := i.
    target = np.arange(len_y, dtype=np.intp)

    index = 0
    while index < len_y:
        # idx_next_block is the index of the beginning of the next block
        idx_next_block = target[index] + 1

        if idx_next_block == len_y:  # there are no more blocks
            break
        if (
            y_out[index] < y_out[idx_next_block]
        ):  # the regression value of the next block is greater than the active one
            index = idx_next_block  # that block becomes the active block
            continue

        while True:
            # We are within a decreasing subsequence.
            prev_y = y_out[idx_next_block]  # this is the smaller y value after the active block
            idx_next_block = target[idx_next_block] + 1  # advance index of next block
            if idx_next_block == len_y or prev_y < y_out[idx_next_block]:
                # update block indices
                target[index] = idx_next_block - 1
                target[idx_next_block - 1] = index

                # regression value for current block
                if weight is None:
                    y_out[index] = solver(y[index:idx_next_block])  # type: ignore
                else:
                    y_out[index] = solver(y[index:idx_next_block], weight[index:idx_next_block])  # type: ignore

                if index > 0:
                    # Backtrack if we can. This makes the algorithm single-pass
                    # and ensures O(n) complexity (subject to solver complexity).
                    index = target[index - 1]
                # Otherwise, restart from the same point.
                break
    # Reconstruct the solution.
    index = 0
    while index < len_y:
        idx_next_block = target[index] + 1
        y_out[index + 1 : idx_next_block] = y_out[index]
        index = idx_next_block

    return y_out
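
# Illustrative sketch (not part of the module): a minimal pure-Python PAV for the mean
# functional, showing the block-pooling idea that `_contiguous_ir` implements in a
# single backtracking pass. `_pav_mean_sketch` is a hypothetical helper, not library API.
def _pav_mean_sketch(y):
    # each entry of `blocks` is [block value, block size]
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        # pool adjacent blocks while they violate the non-decreasing constraint
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            value_right, size_right = blocks.pop()
            value_left, size_left = blocks.pop()
            pooled_size = size_left + size_right
            pooled_value = (value_left * size_left + value_right * size_right) / pooled_size
            blocks.append([pooled_value, pooled_size])
    # expand blocks back out to one fitted value per input point
    fit = []
    for value, size in blocks:
        fit.extend([value] * size)
    return fit

# e.g. _pav_mean_sketch([3.0, 1.0, 2.0]) pools all three points into one block of mean 2.0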


def _contiguous_quantile_ir(y: np.ndarray, alpha: float) -> np.ndarray:
    """Performs contiguous quantile IR on tidied data y, for quantile level alpha, with no weights."""
    return _contiguous_ir(y, partial(np.quantile, q=alpha))  # type: ignore


def _contiguous_mean_ir(
    y: np.ndarray,
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[np.ndarray] = None,
) -> np.ndarray:
    """
    Performs classical (i.e. for the mean functional) contiguous IR on the tidied data array y.

    Uses a SciPy implementation rather than supplying the mean solver function to
    `_contiguous_ir`, as the SciPy implementation is much faster.

    Important note: the y array must be appropriately sorted (ties in the forecast
    sorted by descending observation) before being used. This sorting of the y array
    is handled by the `_tidy_ir_inputs` function prior to any calls of this function.
    """
    return isotonic_regression(y, weights=weight, increasing=True).x  # type: ignore


def _bootstrap_ir(  # pylint: disable=too-many-arguments, too-many-locals
    fcst: np.ndarray,
    obs: np.ndarray,
    bootstraps: int,
    *,  # Force keyword arguments to be keyword-only
    weight: Optional[np.ndarray] = None,
    functional: Optional[Literal["mean", "quantile"]] = None,
    quantile_level: Optional[float] = None,
    solver: Optional[Union[Callable[[np.ndarray, np.ndarray], float], Callable[[np.ndarray], float]]] = None,
):
    """
    Gives the isotonic fits of bootstrapped samples.

    Args:
        fcst: 1D numpy array of forecast values.
        obs: 1D numpy array of observed values corresponding to `fcst`.
        bootstraps: the number of samples to bootstrap.
        weight: either `None`, or 1D numpy array of weights corresponding to `fcst`.
        functional: either `None` or the target functional (one of "mean" or "quantile").
        quantile_level: the quantile level if `functional="quantile"`.
        solver: the regression solver function if `functional=None`.

    Returns:
        2D numpy array with one row for each bootstrapped sample and one column for each
        `fcst` value. The [i, j] entry is the isotonic fit of the ith bootstrapped sample
        evaluated at the point `fcst[j]`. Evaluations are determined using interpolation
        when necessary. If the [i, j] entry is NaN then `fcst[j]` lies outside the range
        of the ith bootstrapped sample.
    """
    # initialise
    fc_values_count = len(fcst)
    result = np.full((bootstraps, fc_values_count), np.nan)

    for bootstrap_sample_num in range(bootstraps):
        selection = np.random.randint(0, fc_values_count, fc_values_count)
        fcst_sample = fcst[selection]
        obs_sample = obs[selection]
        if weight is None:
            weight_sample = None
        else:
            weight_sample = weight[selection]

        fcst_sample, obs_sample, weight_sample = _tidy_ir_inputs(fcst_sample, obs_sample, weight=weight_sample)

        ir_results = _do_ir(
            obs_sample,
            weight=weight_sample,
            functional=functional,
            quantile_level=quantile_level,
            solver=solver,
        )
        approximation_func = _get_interp1d_func(fcst_sample, ir_results)

        result[bootstrap_sample_num] = approximation_func(fcst)

    return result


def _confidence_band(
    bootstrap_results: np.ndarray, confidence_level: float, min_non_nan: int
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Given bootstrapped results, returns the lower and upper values of the corresponding
    confidence band with the specified `confidence_level`.

    Args:
        bootstrap_results: matrix output from `_bootstrap_ir`.
        confidence_level: float strictly between 0 and 1.
        min_non_nan: the minimum number of non-NaN values in a column of
            `bootstrap_results` required for the confidence band to be calculated at
            that point. If the minimum is not met, NaN is returned instead.

    Returns:
        Tuple of two 1D numpy arrays. The first array is the
        `(1-confidence_level)/2`-quantile of each column of `bootstrap_results`.
        The second array is the `1-(1-confidence_level)/2`-quantile of each column.
    """
    sig_level = 1 - confidence_level
    upper_level = 1 - sig_level / 2
    lower_level = sig_level / 2

    # confidence band values ignoring NaNs
    upper = _nanquantile(bootstrap_results, upper_level)
    lower = _nanquantile(bootstrap_results, lower_level)

    # masking values with too many NaNs in bootstrap results
    non_nan_count = np.count_nonzero(~np.isnan(bootstrap_results), axis=0)
    upper = np.where(non_nan_count >= min_non_nan, upper, np.nan)
    lower = np.where(non_nan_count >= min_non_nan, lower, np.nan)

    return lower, upper


def _nanquantile(arr: np.ndarray, quant: float) -> np.ndarray:
    """Returns the same* output as np.nanquantile, but faster.

    *almost equal
    See https://github.com/numpy/numpy/issues/16575; this implementation is about 100
    times faster compared to numpy 1.24.3.

    Args:
        arr: 2D numpy array with one row for each bootstrapped sample and one column
            for each `fcst` value.
        quant: value between 0 and 1.

    Returns:
        1D numpy array of length `arr.shape[1]`, giving the value at the requested
        quantile for each column.
    """
    arr = np.copy(arr)
    valid_obs_count = np.sum(np.isfinite(arr), axis=0)
    # Replace NaN with the maximum (these values will be ignored)
    if not np.isnan(arr).all():
        max_val = np.nanmax(arr)
        arr[np.isnan(arr)] = max_val
    # Sort each column - former NaNs will move to the end
    arr = np.sort(arr, axis=0)
    result = np.zeros(shape=[arr.shape[1]])

    # Get the index of the values at the requested quantile
    desired_position = (valid_obs_count - 1) * quant
    index_floor = np.floor(desired_position).astype(np.int32)
    index_ceil = np.ceil(desired_position).astype(np.int32)
    same_index = index_floor == index_ceil

    # Linear interpolation - take the fractional part of the desired position
    floor_val = arr[index_floor, np.arange(arr.shape[1])]
    floor_frac = index_ceil - desired_position
    floor_frac_val = arr[index_floor, np.arange(arr.shape[1])] * floor_frac
    ceil_frac = desired_position - index_floor
    ceil_frac_val = arr[index_ceil, np.arange(arr.shape[1])] * ceil_frac
    result = np.where(same_index, floor_val, floor_frac_val + ceil_frac_val)
    return result
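
# Illustrative sketch (not part of the module): `_nanquantile` reproduces column-wise
# `np.nanquantile` with linear interpolation, ignoring NaNs per column. The `_demo_*`
# names below are hypothetical example data.
import numpy as np

_demo_boot = np.array([[1.0, 10.0], [3.0, np.nan], [2.0, 30.0]])
_demo_median = np.nanquantile(_demo_boot, 0.5, axis=0)
# _demo_median is [2., 20.]: the NaN in the second column is ignored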


def _get_interp1d_func(x_data: np.ndarray, y_data: np.ndarray) -> Callable[[np.ndarray], np.ndarray]:
    """
    Wraps numpy interpolation in a function similar to scipy.interpolate.interp1d.
    `bounds_error` is not included since np.interp handles NaNs differently.
    Only accepts 1D arrays.

    Parameters
    ----------
    x_data : np.ndarray
        1-D array of independent variable values.
    y_data : np.ndarray
        1-D array of dependent variable values.

    Returns
    -------
    Callable[[float | np.ndarray], np.ndarray]
        Interpolation function.
    """

    if not np.all(x_data[:-1] <= x_data[1:]):
        # np.interp requires its x-coordinate data to be sorted
        order = np.argsort(x_data)
        x_data = x_data[order]
        y_data = y_data[order]

    def func(x_new):
        # return NaN if interpolating outside known coords
        result = np.interp(x_new, x_data, y_data, left=np.nan, right=np.nan)
        return result

    return func
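
# Illustrative sketch (not part of the module): `np.interp` clamps to the endpoint
# values outside the data range by default; passing `left=np.nan, right=np.nan` makes
# out-of-range evaluations NaN instead. The `_demo_*` names are hypothetical.
import numpy as np

_demo_vals = np.interp(
    np.array([-1.0, 0.5, 3.0]),
    np.array([0.0, 1.0, 2.0]),
    np.array([0.0, 10.0, 20.0]),
    left=np.nan,
    right=np.nan,
)
# _demo_vals is [nan, 5., nan]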