• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

mmschlk / shapiq / 18499875864

14 Oct 2025 02:26PM UTC coverage: 92.799% (-0.7%) from 93.522%
18499875864

push

github

web-flow
Enhance type safety and fix bugs across the codebase (#430)

* First Pyright cleanup

* TypeChecked game

* fixed introduced bugs in game and interaction_values

* Pyright Save Sampling

* TypeSafe Approximator

* Typechecked Datasets

* Explainer folder typechecked

* GameTheory Typechecked

* Imputer Typechecked

* Plot Typechecked

* Added static typechecking to pre-commit

* Refactoring

* Add pyright change to CHANGELOG

* Activate code quality show diff

* changed uv sync in pre-commit hook

* made fixtures local import

* Introduced Generic TypeVar in Approximator, reducing ignores

* Introduced Generic Types for Explainer. Approximator, Imputer and ExactComputer can either exist or not, depending on dynamic Type

* Bug fix caused through refactoring

* updated overrides

* tightened CoalitionMatrix to accept only bool arrays

* Remove Python reinstallation step in CI workflow

Removed the step to reinstall Python on Windows due to issues with tkinter. The linked GitHub issue was solved. Doing this as a first try.

* Add Python reinstallation and Tkinter installation steps

Reinstall Python and install Tkinter for Windows tests. prior commit did not help

* Fix command for installing Tkinter in workflow

* Update Windows workflow to install Tkinter via Chocolatey

* Remove Tkinter installation step from Windows workflow and adjust matplotlib usage for headless environments

* adapted some pyright types

* removed generics from explainer again

* tightened index type check

* made n_players None at assignment again

* moved comments

---------

Co-authored-by: Maximilian <maximilian.muschalik@gmail.com>

304 of 360 new or added lines in 51 files covered. (84.44%)

12 existing lines in 9 files now uncovered.

4987 of 5374 relevant lines covered (92.8%)

0.93 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

97.44
/src/shapiq/utils/sets.py
1
"""This module contains utility functions for dealing with sets, coalitions and game theory."""
2

3
from __future__ import annotations
1✔
4

5
import copy
1✔
6
from itertools import chain, combinations
1✔
7
from typing import TYPE_CHECKING, TypeVar
1✔
8

9
import numpy as np
1✔
10
from scipy.special import binom
1✔
11

12
if TYPE_CHECKING:
1✔
NEW
13
    from collections.abc import Collection, Iterable, Iterator
×
14

NEW
15
    from shapiq.typing import CoalitionMatrix, CoalitionTuple
×
16

17
__all__ = [
1✔
18
    "count_interactions",
19
    "generate_interaction_lookup",
20
    "get_explicit_subsets",
21
    "pair_subset_sizes",
22
    "powerset",
23
    "split_subsets_budget",
24
    "transform_array_to_coalitions",
25
    "transform_coalitions_to_array",
26
]
27

28
T = TypeVar("T", int, str)
1✔
29

30

31
def powerset(
1✔
32
    iterable: Iterable[T],
33
    min_size: int = 0,
34
    max_size: int | None = None,
35
) -> Iterator[tuple[T, ...]]:
36
    """Return a powerset of an iterable as tuples with optional size limits.
37

38
    Args:
39
        iterable: An iterable (e.g., list, set, etc.) from which to generate the powerset.
40
        min_size: Minimum size of the subsets. Defaults to 0 (start with the empty set).
41
        max_size: Maximum size of the subsets. Defaults to None (all possible sizes).
42

43
    Returns:
44
        iterable: Powerset of the iterable.
45

46
    Example:
47
        >>> list(powerset([1, 2, 3]))
48
        [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
49

50
        >>> list(powerset([1, 2, 3], min_size=1))
51
        [(1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
52

53
        >>> list(powerset([1, 2, 3], max_size=2))
54
        [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3)]
55

56
        >>> list(powerset(["A", "B", "C"], min_size=1, max_size=2))
57
        [('A',), ('B',), ('C',), ('A', 'B'), ('A', 'C'), ('B', 'C')]
58

59
    """
60
    s = sorted(iterable)
1✔
61
    max_size = len(s) if max_size is None else min(max_size, len(s))
1✔
62
    return chain.from_iterable(combinations(s, r) for r in range(max(min_size, 0), max_size + 1))
1✔
63

64

65
def pair_subset_sizes(order: int, n: int) -> tuple[list[tuple[int, int]], int | None]:
1✔
66
    """Determines what subset sizes are paired together.
67

68
    Given an interaction order and the number of players, determines the paired subsets. Paired
69
    subsets are subsets of the same size that are paired together moving from the smallest subset
70
    paired with the largest subset to the center.
71

72
    Args:
73
        order: interaction order.
74
        n: number of players.
75

76
    Returns:
77
        paired and unpaired subsets. If there is no unpaired subset `unpaired_subset` is None.
78

79
    Examples:
80
        >>> pair_subset_sizes(order=1, n=5)
81
        ([(1, 4), (2, 3)], None)
82

83
        >>> pair_subset_sizes(order=1, n=6)
84
        ([(1, 5), (2, 4)], 3)
85

86
        >>> pair_subset_sizes(order=2, n=5)
87
        ([(2, 3)], None)
88

89
    """
90
    subset_sizes = list(range(order, n - order + 1))
1✔
91
    n_paired_subsets = len(subset_sizes) // 2
1✔
92
    paired_subsets = [
1✔
93
        (subset_sizes[size - 1], subset_sizes[-size]) for size in range(1, n_paired_subsets + 1)
94
    ]
95
    unpaired_subset = None if len(subset_sizes) % 2 == 0 else subset_sizes[n_paired_subsets]
1✔
96
    return paired_subsets, unpaired_subset
1✔
97

98

99
def split_subsets_budget(
1✔
100
    order: int,
101
    n: int,
102
    budget: int,
103
    sampling_weights: np.ndarray,
104
) -> tuple[list, list, int]:
105
    """Determines which subset sizes can be computed explicitly and which sizes need to be sampled.
106

107
    Given a computational budget, determines the complete subsets that can be computed explicitly
108
    and the corresponding incomplete subsets that need to be estimated via sampling.
109

110
    Args:
111
        order: interaction order.
112
        n: number of players.
113
        budget: total allowed budget for the computation.
114
        sampling_weights: weight vector of the sampling distribution in shape (n + 1,). The first
115
            and last element constituting the empty and full subsets are not used.
116

117
    Returns:
118
        complete subsets, incomplete subsets, remaining budget
119

120
    Examples:
121
        >>> split_subsets_budget(order=1, n=6, budget=100, sampling_weights=np.ones(shape=(6,)))
122
        ([1, 5, 2, 4, 3], [], 38)
123

124
        >>> split_subsets_budget(order=1, n=6, budget=60, sampling_weights=np.ones(shape=(6,)))
125
        ([1, 5, 2, 4], [3], 18)
126

127
        >>> split_subsets_budget(order=1, n=6, budget=100, sampling_weights=np.zeros(shape=(6,)))
128
        ([], [1, 2, 3, 4, 5], 100)
129

130
    """
131
    # determine paired and unpaired subsets
132
    complete_subsets: list[int] = []
1✔
133
    paired_subsets, unpaired_subset = pair_subset_sizes(order, n)
1✔
134
    incomplete_subsets = list(range(order, n - order + 1))
1✔
135

136
    # turn weight vector into probability vector
137
    weight_vector = copy.copy(sampling_weights)
1✔
138
    weight_vector[0], weight_vector[-1] = 0, 0  # zero out the empty and full subsets
1✔
139
    sum_weight_vector = np.sum(weight_vector)
1✔
140
    weight_vector = np.divide(
1✔
141
        weight_vector,
142
        sum_weight_vector,
143
        out=weight_vector,
144
        where=sum_weight_vector != 0,
145
    )
146

147
    # check if the budget is sufficient to compute all paired subset sizes explicitly
148
    allowed_budget = weight_vector * budget  # allowed budget for each subset size
1✔
149
    for subset_size_1, subset_size_2 in paired_subsets:
1✔
150
        subset_budget = int(binom(n, subset_size_1))  # required budget for full computation
1✔
151
        # check if the budget is sufficient to compute the paired subset sizes explicitly
152
        if allowed_budget[subset_size_1] >= subset_budget and allowed_budget[subset_size_1] > 0:
1✔
153
            complete_subsets.extend((subset_size_1, subset_size_2))
1✔
154
            incomplete_subsets.remove(subset_size_1)
1✔
155
            incomplete_subsets.remove(subset_size_2)
1✔
156
            weight_vector[subset_size_1], weight_vector[subset_size_2] = 0, 0  # zero used sizes
1✔
157
            if np.sum(weight_vector) != 0:
1✔
158
                weight_vector /= np.sum(weight_vector)  # re-normalize into probability vector
1✔
159
            budget -= subset_budget * 2
1✔
160
        else:  # if the budget is not sufficient, return the current state
161
            return complete_subsets, incomplete_subsets, budget
1✔
162
        allowed_budget = weight_vector * budget  # update allowed budget for each subset size
1✔
163

164
    # check if the budget is sufficient to compute the unpaired subset size explicitly
165
    if unpaired_subset is not None:
1✔
166
        subset_budget = int(binom(n, unpaired_subset))
1✔
167
        if budget - subset_budget >= 0:
1✔
168
            complete_subsets.append(unpaired_subset)
1✔
169
            incomplete_subsets.remove(unpaired_subset)
1✔
170
            budget -= subset_budget
1✔
171
    return complete_subsets, incomplete_subsets, budget
1✔
172

173

174
def get_explicit_subsets(n: int, subset_sizes: list[int]) -> np.ndarray:
1✔
175
    """Enumerates all subsets of the given sizes and returns a one-hot matrix.
176

177
    Args:
178
        n: number of players.
179
        subset_sizes: list of subset sizes.
180

181
    Returns:
182
        one-hot matrix of all subsets of certain sizes.
183

184
    Examples:
185
        >>> get_explicit_subsets(n=4, subset_sizes=[1, 2]).astype(int)
186
        array([[1, 0, 0, 0],
187
               [0, 1, 0, 0],
188
               [0, 0, 1, 0],
189
               [0, 0, 0, 1],
190
               [1, 1, 0, 0],
191
               [1, 0, 1, 0],
192
               [1, 0, 0, 1],
193
               [0, 1, 1, 0],
194
               [0, 1, 0, 1],
195
               [0, 0, 1, 1]])
196

197
    """
198
    total_subsets = int(sum(binom(n, size) for size in subset_sizes))
1✔
199
    subset_matrix = np.zeros(shape=(total_subsets, n), dtype=bool)
1✔
200
    subset_index = 0
1✔
201
    for subset_size in subset_sizes:
1✔
202
        for subset in combinations(range(n), subset_size):
1✔
203
            subset_matrix[subset_index, subset] = True
1✔
204
            subset_index += 1
1✔
205
    return subset_matrix
1✔
206

207

208
def generate_interaction_lookup(
1✔
209
    players: Iterable[T] | int,
210
    min_order: int,
211
    max_order: int | None = None,
212
) -> dict[tuple[T, ...], int] | dict[tuple[int, ...], int]:
213
    """Generates a lookup dictionary for interactions.
214

215
    Args:
216
        players: A unique set of players or an Integer denoting the number of players.
217
        min_order: The minimum order of the approximation.
218
        max_order: The maximum order of the approximation.
219

220
    Returns:
221
        A dictionary that maps interactions to their index in the values vector.
222

223
    Example:
224
        >>> generate_interaction_lookup(3, 1, 3)
225
        {(0,): 0, (1,): 1, (2,): 2, (0, 1): 3, (0, 2): 4, (1, 2): 5, (0, 1, 2): 6}
226
        >>> generate_interaction_lookup(3, 2, 2)
227
        {(0, 1): 0, (0, 2): 1, (1, 2): 2}
228
        >>> generate_interaction_lookup(["A", "B", "C"], 1, 2)
229
        {('A',): 0, ('B',): 1, ('C',): 2, ('A', 'B'): 3, ('A', 'C'): 4, ('B', 'C'): 5}
230

231
    """
232
    if isinstance(players, int):
1✔
233
        return {
1✔
234
            interaction: i
235
            for i, interaction in enumerate(
236
                powerset(range(players), min_size=min_order, max_size=max_order)
237
            )
238
        }
239
    return {
1✔
240
        interaction: i
241
        for i, interaction in enumerate(powerset(players, min_size=min_order, max_size=max_order))
242
    }
243

244

245
def generate_interaction_lookup_from_coalitions(
1✔
246
    coalitions: CoalitionMatrix,
247
) -> dict[tuple[int, ...], int]:
248
    """Generates a lookup dictionary for interactions based on an array of coalitions.
249

250
    Args:
251
        coalitions: An array of player coalitions.
252

253
    Returns:
254
        A dictionary that maps interactions to their index in the values vector
255

256
    Example:
257
        >>> coalitions = np.array([
258
        ...     [1, 0, 1],
259
        ...     [0, 1, 1],
260
        ...     [1, 1, 0],
261
        ...     [0, 0, 1]
262
        ... ])
263
        >>> generate_interaction_lookup_from_coalitions(coalitions)
264
        {(0, 2): 0, (1, 2): 1, (0, 1): 2, (2,): 3}
265

266
    """
267
    return {tuple(np.where(coalition)[0]): idx for idx, coalition in enumerate(coalitions)}
1✔
268

269

270
def transform_coalitions_to_array(
1✔
271
    coalitions: Collection[CoalitionTuple],
272
    n_players: int | None = None,
273
) -> CoalitionMatrix:
274
    """Transforms a collection of coalitions to a binary array (one-hot encodings).
275

276
    Args:
277
        coalitions: Collection of coalitions.
278
        n_players: Number of players. Defaults to None (determined from the coalitions). If
279
            provided, n_players must be greater than the maximum player index in the coalitions.
280

281
    Returns:
282
        Binary array of coalitions.
283

284
    Example:
285
        >>> coalitions = [(0, 1), (1, 2), (0, 2)]
286
        >>> transform_coalitions_to_array(coalitions)
287
        array([[ True,  True, False],
288
               [False,  True,  True],
289
               [ True, False,  True]])
290

291
        >>> transform_coalitions_to_array(coalitions, n_players=4)
292
        array([[ True,  True, False, False],
293
               [False,  True,  True, False],
294
               [ True, False,  True, False]])
295

296
    """
297
    n_coalitions = len(coalitions)
1✔
298
    if n_players is None:
1✔
299
        n_players = max(max(coalition) for coalition in coalitions) + 1
1✔
300

301
    coalition_array = np.zeros((n_coalitions, n_players), dtype=bool)
1✔
302
    for i, coalition in enumerate(coalitions):
1✔
303
        coalition_array[i, coalition] = True
1✔
304
    return coalition_array
1✔
305

306

307
def transform_array_to_coalitions(coalitions: CoalitionMatrix) -> Collection[CoalitionTuple]:
1✔
308
    """Transforms a 2d one-hot matrix of coalitions into a list of tuples.
309

310
    Args:
311
        coalitions: A binary array of coalitions.
312

313
    Returns:
314
        List of coalitions as tuples.
315

316
    Examples:
317
        >>> coalitions = np.array([[True, True, False], [False, True, True], [True, False, True]])
318
        >>> transform_array_to_coalitions(coalitions)
319
        [(0, 1), (1, 2), (0, 2)]
320

321
        >>> coalitions = np.array([[False, False, False], [True, True, True]])
322
        >>> transform_array_to_coalitions(coalitions)
323
        [(), (0, 1, 2)]
324

325
    """
326
    return [tuple(np.where(coalition)[0]) for coalition in coalitions]
1✔
327

328

329
def count_interactions(n: int, max_order: int | None = None, min_order: int = 0) -> int:
1✔
330
    """Counts the number of interactions for a given number of players and maximum order.
331

332
    Args:
333
        n: Number of players.
334
        max_order: Maximum order of the interactions. If `None`, it is set to the number of players.
335
            Defaults to `None`.
336
        min_order: Minimum order of the interactions. Defaults to 0.
337

338
    Returns:
339
        The number of interactions.
340

341
    Examples:
342
        >>> count_interactions(3)
343
        8
344
        >>> count_interactions(3, 2)
345
        7
346
        >>> count_interactions(3, 2, 1)
347
        6
348

349
    """
350
    if max_order is None:
1✔
351
        max_order = n
1✔
352
    return int(sum(binom(n, size) for size in range(min_order, max_order + 1)))
1✔
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc