
idanmoradarthas / DataScienceUtils / 21175371642 / 4
100%
master: 100%

Build:
DEFAULT BRANCH: master
Ran 20 Jan 2026 02:41PM UTC
Files 7
Run time 2s

20 Jan 2026 02:35PM UTC coverage: 99.558% (-0.1%) from 99.7%
21175371642.4

push

github

web-flow
Fix: Handle Fully Missing Features in compute_mutual_information (#90) (#91)

* fix: Handle fully missing features in compute_mutual_information

Previously, the `compute_mutual_information` function would raise an `IndexError` if a feature column contained only missing values. This was because the `SimpleImputer` would drop the column, leading to a dimension mismatch with the `discrete_features_mask`.
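The failure mode described above can be reproduced in miniature. The sketch below is a toy stand-in in plain Python, not scikit-learn's `SimpleImputer` and not the project's code; `toy_mean_impute` and `select_discrete` are hypothetical names. It only illustrates how a positional mask built for the original columns goes out of range once a fully missing column has been dropped.

```python
import math


def toy_mean_impute(columns):
    """Toy imputer (hypothetical): fill gaps with the column mean and
    drop any column that has no observed values at all."""
    kept = []
    for col in columns:
        observed = [v for v in col if not math.isnan(v)]
        if not observed:
            continue  # nothing to learn from, so the column is dropped
        mean = sum(observed) / len(observed)
        kept.append([mean if math.isnan(v) else v for v in col])
    return kept


def select_discrete(columns, discrete_mask):
    """Pick the columns flagged as discrete by a positional mask."""
    return [columns[i] for i, flag in enumerate(discrete_mask) if flag]


nan = float("nan")
features = [[1.0, nan, 3.0], [nan, nan, nan], [0.0, 1.0, 1.0]]
discrete_mask = [False, False, True]  # built for three columns

imputed = toy_mean_impute(features)  # only two columns survive
try:
    select_discrete(imputed, discrete_mask)
    mismatch = False
except IndexError:
    mismatch = True  # the mask still points at column index 2
```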

This commit fixes the bug by pre-emptively identifying fully missing numerical, categorical, and boolean features and imputing them with default values (`0`, `"_MISSING_"`, and `False`, respectively). This ensures that the imputer no longer drops the columns, and the mutual information calculation can proceed without error.

A `UserWarning` is now issued to inform the user when this imputation occurs.
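As a rough illustration of this first approach, the sketch below fills fully missing columns with a per-type default and emits a `UserWarning`. The default values come from the commit message; everything else is an assumption made for the example: the function name `fill_fully_missing`, the dict-of-lists data layout, the use of `None` for missing values, and the warning text. The project itself works on its own data structures, so this is not its implementation.

```python
import warnings

# Defaults named in the commit message, keyed by a hypothetical "kind" label.
DEFAULTS = {"numeric": 0, "categorical": "_MISSING_", "boolean": False}


def fill_fully_missing(columns, kinds):
    """Return a copy of `columns` in which every fully missing column is
    replaced by its kind's default value; warn if any column was filled.

    columns: dict of feature name -> list of values (None = missing)
    kinds:   dict of feature name -> "numeric" | "categorical" | "boolean"
    """
    filled = {}
    empty = []
    for name, values in columns.items():
        if values and all(v is None for v in values):
            empty.append(name)
            filled[name] = [DEFAULTS[kinds[name]]] * len(values)
        else:
            filled[name] = list(values)  # partially observed: left untouched
    if empty:
        warnings.warn(
            f"Fully missing features imputed with defaults: {empty}",
            UserWarning,
        )
    return filled


data = {"age": [None, None], "city": [None, None], "score": [1.5, None]}
kinds = {"age": "numeric", "city": "categorical", "score": "numeric"}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = fill_fully_missing(data, kinds)
```

Columns with at least one observed value pass through unchanged, so a regular imputer can still handle them afterwards.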

The fix is covered by new unit tests for both fully missing numerical and categorical features.

* fix(preprocess): Handle fully null columns in mutual information

The compute_mutual_information function previously failed when a feature column contained only null (NaN) values, as the imputer had nothing to learn.

This commit modifies the function to correctly handle these cases by:
- Identifying and filtering out any columns that are entirely null.
- Calculating mutual information scores on the remaining valid columns.
- Assigning a score of 0 to the fully null columns and adding them back to the final result.

This ensures the function runs without errors and provides a sensible default score for features that have no data to contribute to the analysis.
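The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `score_with_null_columns` is a hypothetical name, features are modelled as a dict of lists with `None` for missing values, and the actual mutual information step is replaced by a pluggable `scorer` callable. The `agreement` scorer in the usage part is a placeholder and does not compute mutual information.

```python
def score_with_null_columns(columns, target, scorer):
    """Score non-null columns with `scorer`; give fully null columns 0.

    columns: dict of feature name -> list of values (None = missing)
    scorer:  callable(valid_columns, target) -> dict of name -> score
             (hypothetical stand-in for the real mutual information step)
    """
    # Step 1: identify and filter out columns that are entirely null.
    null_names = [
        name for name, vals in columns.items()
        if all(v is None for v in vals)
    ]
    valid = {n: vals for n, vals in columns.items() if n not in null_names}
    # Step 2: score only the remaining valid columns.
    scores = dict(scorer(valid, target)) if valid else {}
    # Step 3: add the fully null columns back with a score of 0.
    for name in null_names:
        scores[name] = 0.0
    # Return the scores in the original column order.
    return {name: scores[name] for name in columns}


def agreement(valid_columns, target):
    # Placeholder score, NOT mutual information: share of rows equal to target.
    return {
        name: sum(a == b for a, b in zip(vals, target)) / len(target)
        for name, vals in valid_columns.items()
    }


cols = {"a": [1, 0, 1, 1], "empty": [None] * 4, "b": [0, 0, 1, 0]}
y = [1, 0, 1, 0]
scores = score_with_null_columns(cols, y, agreement)
```

Keeping the scorer separate from the filtering means the null-column handling never has to touch the imputer or the discrete-feature mask.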

Additionally, this commit resolves a `ValueError` from scikit-learn's `SimpleImputer`, which does not support boolean dtypes. Boolean columns are now cast to the `object` type before imputation.

* fix(preprocess): Handle fully null columns in mutual information

The compute_mutual_information function previously failed when a feature column contained only null (NaN) values, as the impu... (continued)

675 of 678 relevant lines covered (99.56%)

1.0 hits per line

Source Files on job windows-latest-python-3.10 - 21175371642.4
  • c27b5ad1 on github

© 2026 Coveralls, Inc