Jobs: 1 | Files: 404 | Run time: 18s
github
feat: make probabilistic optimizations optional and tunable in the YAML config

Probabilistic optimization sacrifices accuracy in order to reduce memory consumption. In certain parts of the pipeline, a Bloom Filter is used ([set_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/product/set/set_processor.rs#L20)), while in other parts, hash tables that store only the hash of the keys instead of the full keys are used ([aggregation_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/aggregation/processor.rs#L59) and [join_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/product/join/operator.rs#L57-L58)).

This commit makes these optimizations disabled by default and offers user-configurable flags to enable each of these optimizations separately.

This is an example of how to turn on probabilistic optimizations for each processor in the Dozer configuration.

```
flags:
  enable_probabilistic_optimizations:
    in_sets: true # enable probabilistic optimizations in set operations (UNION, EXCEPT, INTERSECT); Default: false
    in_joins: true # enable probabilistic optimizations in JOIN operations; Default: false
    in_aggregations: true # enable probabilistic optimizations in aggregations (SUM, COUNT, MIN, etc.); Default: false
```
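The trade-off described in the commit message (storing only the hash of each key instead of the full key) can be illustrated with a small standalone sketch. This is not Dozer's code: `KeySet`, `hash_key`, and the boolean constructor argument are hypothetical names chosen for illustration, and the standard library's `DefaultHasher` stands in for whatever hash function the real processors use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical key set, not Dozer's actual type.
/// `Exact` keeps full keys; `HashOnly` keeps only a 64-bit hash per key,
/// which saves memory but can report false positives on hash collisions.
enum KeySet {
    Exact(HashSet<Vec<u8>>),
    HashOnly(HashSet<u64>),
}

/// Hash a key with the standard library's default hasher (an assumption;
/// the real pipeline may use a different hash function).
fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

impl KeySet {
    /// Mirrors the idea of a per-processor flag: the probabilistic variant
    /// is chosen only when the flag is set, and is off otherwise.
    fn new(enable_probabilistic_optimizations: bool) -> Self {
        if enable_probabilistic_optimizations {
            KeySet::HashOnly(HashSet::new())
        } else {
            KeySet::Exact(HashSet::new())
        }
    }

    /// Returns true if the key (or, in `HashOnly` mode, its hash) was new.
    fn insert(&mut self, key: &[u8]) -> bool {
        match self {
            KeySet::Exact(set) => set.insert(key.to_vec()),
            KeySet::HashOnly(set) => set.insert(hash_key(key)),
        }
    }

    /// In `HashOnly` mode this may return true for a key that was never
    /// inserted if its hash collides with that of an inserted key.
    fn contains(&self, key: &[u8]) -> bool {
        match self {
            KeySet::Exact(set) => set.contains(key),
            KeySet::HashOnly(set) => set.contains(&hash_key(key)),
        }
    }
}
```

With `KeySet::new(false)` lookups are exact at the cost of storing every key in full; with `KeySet::new(true)` each entry costs a fixed 8 bytes regardless of key size, which is the memory-for-accuracy exchange the flags expose.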
539 of 539 new or added lines in 26 files covered. (100.0%)
47313 of 63074 relevant lines covered (75.01%)
47372.26 hits per line
Lines | Coverage (%) | ∆ | File |
---|---|---|---|
1 | 29.69 | | dozer-cli/src/lib.rs |
1 | 58.23 | | dozer-cli/src/simple/orchestrator.rs |
1 | 62.07 | | dozer-sql/src/pipeline/product/set/operator.rs |
2 | 97.59 | | dozer-sql/src/pipeline/product/set/record_map.rs |
2 | 98.63 | | dozer-sql/src/pipeline/utils/hashtable.rs |
2 | 91.07 | | dozer-types/src/tests/flags_config_yaml_deserialize.rs |
3 | 30.77 | | dozer-sql/src/pipeline/product/set/set_processor.rs |
5 | 61.54 | | dozer-types/src/models/flags.rs |
7 | 0.0 | | dozer-cli/src/live/state.rs |
8 | 40.26 | | dozer-cli/src/ui_helper.rs |
11 | 82.14 | | dozer-sql/src/pipeline/aggregation/processor.rs |
12 | 78.88 | | dozer-core/src/app.rs |
21 | 68.0 | | dozer-sql/src/pipeline/product/join/operator.rs |
ID | Job ID | Ran | Files | Coverage (%) | |
---|---|---|---|---|---|
1 | 5966846708.1 | | 404 | 75.01 | GitHub Action Run |