
llnl / dftracer-utils / 26043728131

18 May 2026 03:37PM UTC coverage: 51.706% (-0.4%) from 52.076%

push

github

hariharan-devarajan
feat(perf): performance improvements for parallel reading, indexing, and aggregation

Indexer
- Streaming parse-and-emit worker pipeline with bounded memory usage
- Concurrent SST artifact ingestion with staging support
- Gzip member slicing for parallel indexing
- Lazy decoding for compressed value counts
- Bypass DOM wrapper for indexer hot path (simdjson on_demand)
- Decoupled write workers from parse workers
- --rebuild-summaries flag and optimized root summary rebuild

Aggregator / MPI
- Task-based DAG execution for aggregator pipeline
- Shared staging for multi-node artifact relocation
- Per-node thread scaling to avoid oversubscription
- Unified distributed aggregation tracking, removed manifest consolidation
- Deterministic aggregation and intra-file parallelism
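The aggregator bullets above mention task-based DAG execution, but the commit message does not show how tasks are scheduled. As a rough illustration only, here is a minimal serial sketch of dependency-ordered execution (Kahn's algorithm). `TaskGraph`, `add`, `depends_on`, and `run` are hypothetical names, not the dftracer-utils API, and a real pipeline would presumably hand ready tasks to worker threads instead of running them inline.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical task graph: each task runs only after all of its
// prerequisites have finished.
struct TaskGraph {
    std::vector<std::function<void()>> tasks;
    std::vector<std::vector<std::size_t>> dependents;  // task -> tasks waiting on it
    std::vector<std::size_t> pending;                   // unmet prerequisite counts

    // Registers a task and returns its id.
    std::size_t add(std::function<void()> fn) {
        tasks.push_back(std::move(fn));
        dependents.emplace_back();
        pending.push_back(0);
        return tasks.size() - 1;
    }

    // Declares that `task` must wait for `prerequisite`.
    void depends_on(std::size_t task, std::size_t prerequisite) {
        dependents[prerequisite].push_back(task);
        ++pending[task];
    }

    // Runs tasks in dependency order and returns how many executed.
    // A result smaller than tasks.size() means the graph has a cycle.
    std::size_t run() {
        std::vector<std::size_t> remaining = pending;
        std::queue<std::size_t> ready;
        for (std::size_t i = 0; i < tasks.size(); ++i) {
            if (remaining[i] == 0) ready.push(i);
        }
        std::size_t executed = 0;
        while (!ready.empty()) {
            const std::size_t id = ready.front();
            ready.pop();
            tasks[id]();
            ++executed;
            // Finishing a task may unblock the tasks that depend on it.
            for (std::size_t d : dependents[id]) {
                if (--remaining[d] == 0) ready.push(d);
            }
        }
        return executed;
    }
};
```

Because ready tasks are drained from a FIFO queue in insertion order, the serial version is deterministic for a fixed graph, which is one simple way to get reproducible results.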

Trace reader / query
- Compiled predicate evaluation for AND-of-EQ queries
- Uniform-match shortcut for AND-of-EQ queries
- Line-range support for work items and checkpoint processing
- Optimized chunk pruning and checkpoint handling
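The commit message does not show how predicates are compiled, so the following is only a sketch of the general idea behind compiling an AND-of-EQ query: resolve field names to column indices once, then evaluate each row with plain index comparisons and stop at the first mismatch. `Row`, `AndEqQuery`, `CompiledAndEq`, and `compile_and_eq` are hypothetical names, and the flat string-per-column row layout is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row layout: one string value per column, in schema order.
using Row = std::vector<std::string>;

// A query of the form  f1 == v1 AND f2 == v2 AND ...
using AndEqQuery = std::vector<std::pair<std::string, std::string>>;

// "Compiled" form: field names are already resolved to column indices,
// so evaluating a row performs no name lookups.
struct CompiledAndEq {
    std::vector<std::pair<std::size_t, std::string>> terms;

    bool matches(const Row& row) const {
        for (const auto& term : terms) {
            // Short-circuit on the first failing equality.
            if (term.first >= row.size() || row[term.first] != term.second) {
                return false;
            }
        }
        return true;
    }
};

// Resolves each field name against the schema once.
// Returns false if the query names a field the schema does not contain.
inline bool compile_and_eq(const std::vector<std::string>& schema,
                           const AndEqQuery& query, CompiledAndEq& out) {
    std::unordered_map<std::string, std::size_t> index;
    for (std::size_t i = 0; i < schema.size(); ++i) index[schema[i]] = i;

    out.terms.clear();
    for (const auto& term : query) {
        const auto it = index.find(term.first);
        if (it == index.end()) return false;
        out.terms.emplace_back(it->second, term.second);
    }
    return true;
}
```

The compile step is paid once per query, while `matches` runs once per row, so moving the name lookups out of the per-row path is where the saving would come from.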

Replay
- Pipelined replay with coroutines and channels
- JsonParser-based trace processing
- Optimized string handling and I/O buffering

Organize / writer / dft
- Parallel slice creation and merging in organize visitor
- Inline indexer in organize
- Gzip member tracking in writer
- Coroutine-based event dispatcher with extracted parse logic
- Batch flushing in organize visitor

Arrow / call_tree
- Optimized arrow conversion
- Arrow IPC support and improved save/load in call_tree

Build / infrastructure
- zlib-ng option, system simdjson fallback
- cgroup v1/v2 memory limit detection
- Auto-computed per-file memory estimates and batch sizes
- CI: perf branch trigger, formatting
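For the cgroup v1/v2 memory limit detection bullet above, the source does not include the implementation. A minimal sketch, assuming the conventional mount points: cgroup v2 exposes `memory.max` (either a byte count or the literal `max`), while cgroup v1 exposes `memory/memory.limit_in_bytes` (a byte count, with "unlimited" shown as a very large number). `kNoLimit`, `parse_cgroup_limit`, and `detect_cgroup_memory_limit` are hypothetical names, and the 2^62 cutoff for "unlimited" is a heuristic chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <limits>
#include <string>

// Sentinel meaning "no limit configured or none could be read".
constexpr std::uint64_t kNoLimit = std::numeric_limits<std::uint64_t>::max();

// Parses the content of a cgroup memory limit file.
inline std::uint64_t parse_cgroup_limit(const std::string& text) {
    const char* ws = " \t\r\n";
    const std::size_t begin = text.find_first_not_of(ws);
    if (begin == std::string::npos) return kNoLimit;
    const std::size_t end = text.find_last_not_of(ws);
    const std::string s = text.substr(begin, end - begin + 1);

    if (s == "max") return kNoLimit;  // cgroup v2 spelling of "unlimited"

    std::uint64_t value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return kNoLimit;  // not a plain byte count
        const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
        if (value > (kNoLimit - digit) / 10) return kNoLimit;  // would overflow
        value = value * 10 + digit;
    }
    // Heuristic (assumption): treat huge values as cgroup v1 "unlimited".
    if (value >= (std::uint64_t{1} << 62)) return kNoLimit;
    return value;
}

// Tries the cgroup v2 file first, then the cgroup v1 file.
// Simplification: only the top of the visible hierarchy is checked.
inline std::uint64_t detect_cgroup_memory_limit() {
    const char* candidates[] = {
        "/sys/fs/cgroup/memory.max",                    // cgroup v2
        "/sys/fs/cgroup/memory/memory.limit_in_bytes",  // cgroup v1
    };
    for (const char* path : candidates) {
        std::ifstream in(path);
        std::string line;
        if (in && std::getline(in, line)) return parse_cgroup_limit(line);
    }
    return kNoLimit;
}
```

A caller would typically take the smaller of this value and the physical memory size before deriving batch sizes, since a container limit can be far below what the host reports.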

Docs
- Rewritten indexer and trace reader API references

35907 of 90345 branches covered (39.74%)

Branch coverage included in aggregate %.

16869 of 21880 new or added lines in 137 files covered. (77.1%)

273 existing lines in 39 files now uncovered.

32021 of 41028 relevant lines covered (78.05%)

13164.29 hits per line

Source File

/src/dftracer/utils/python/trace_reader_iterator.h (36.0% covered)
#ifndef DFTRACER_UTILS_PYTHON_TRACE_READER_ITERATOR_H
#define DFTRACER_UTILS_PYTHON_TRACE_READER_ITERATOR_H

#include <Python.h>
#include <dftracer/utils/core/common/config.h>
#include <dftracer/utils/core/task_handle.h>
#include <dftracer/utils/python/memoryview_batch.h>
#include <dftracer/utils/utilities/composites/dft/args_map.h>

#include <atomic>
#include <cstddef>
#include <exception>
#include <future>
#include <memory>
#include <mutex>
#include <string>
#include <vector>
#ifdef DFTRACER_UTILS_ENABLE_ARROW
#include <dftracer/utils/utilities/common/arrow/arrow_export.h>

// Python object that owns a pointer to one exported Arrow batch.
typedef struct {
    PyObject_HEAD
    dftracer::utils::utilities::common::arrow::ArrowExportResult *result;
} ArrowBatchCapsuleObject;

extern PyTypeObject ArrowBatchCapsuleType;
#endif

// Output format produced by the iterator.
enum class IteratorMode {
    MEMORYVIEW,
    JSON_DICT,
#ifdef DFTRACER_UTILS_ENABLE_ARROW
    ARROW,
#endif
};

#ifdef DFTRACER_UTILS_ENABLE_ARROW
struct ArrowIteratorState {  // 54 hits
    using BatchType =
        dftracer::utils::utilities::common::arrow::ArrowExportResult;
    std::shared_ptr<dftracer::utils::coro::Channel<BatchType>> channel;
    std::mutex error_mtx;
    std::exception_ptr error;
    std::atomic<bool> cancelled{false};          // 54 hits
    std::size_t memory_budget_bytes = 0;         // 54 hits
    std::atomic<std::size_t> bytes_in_queue{0};  // 54 hits
    std::shared_future<void> task_future;

    // NEW in this build, uncovered. Keeps the first error; later ones are dropped.
    void set_error(std::exception_ptr e) {
        std::lock_guard<std::mutex> lock(error_mtx);
        if (!error) error = e;
    }
};
#endif

using ArgsValue = dftracer::utils::utilities::composites::dft::ArgsValue;
using ArgsMap = dftracer::utils::utilities::composites::dft::ArgsMap;

struct JsonDictEvent {  // 240 hits
    ArgsMap top;
    ArgsMap args;
};

struct JsonDictBatch {
    std::vector<JsonDictEvent> events;
};

struct JsonDictIteratorState {  // 7 hits
    std::shared_ptr<dftracer::utils::coro::Channel<JsonDictBatch>> channel;
    std::mutex error_mtx;
    std::exception_ptr error;
    std::atomic<bool> cancelled{false};          // 7 hits
    std::size_t memory_budget_bytes = 0;         // 7 hits
    std::atomic<std::size_t> bytes_in_queue{0};  // 7 hits
    std::shared_future<void> task_future;

    // NEW in this build, uncovered. Keeps the first error; later ones are dropped.
    void set_error(std::exception_ptr e) {
        std::lock_guard<std::mutex> lock(error_mtx);
        if (!error) error = e;
    }
};

using dftracer::utils::python::MemoryViewBatchIteratorState;
using dftracer::utils::python::MemoryViewBatchObject;
using dftracer::utils::python::MemoryViewBatchType;

// Python iterator object; `mode` selects which state members are in use.
typedef struct {
    PyObject_HEAD
    std::shared_ptr<MemoryViewBatchIteratorState> batch_state;
    PyObject *current_batch;
    Py_ssize_t batch_index;
    std::shared_ptr<JsonDictIteratorState> json_dict_state;
    std::shared_ptr<JsonDictBatch> json_dict_current_batch;
    Py_ssize_t json_dict_index;
#ifdef DFTRACER_UTILS_ENABLE_ARROW
    std::shared_ptr<ArrowIteratorState> arrow_state;
#endif
    IteratorMode mode;
} TraceReaderIteratorObject;

extern PyTypeObject TraceReaderIteratorType;
int init_trace_reader_iterator(PyObject *m);

#endif  // DFTRACER_UTILS_PYTHON_TRACE_READER_ITERATOR_H
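
Both `set_error` members in the header are new and uncovered in this build. The sketch below isolates their first-error-wins behavior so it can be exercised without Python or the channel types. `ErrorSlot` is a stand-in that copies only the `error_mtx` and `error` fields, and `rethrow_if_error` and `first_error_message` are hypothetical consumer-side helpers that do not appear in the header.

```cpp
#include <cassert>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>

// Stand-in for the error fields of the iterator state structs.
struct ErrorSlot {
    std::mutex error_mtx;
    std::exception_ptr error;

    // Same logic as set_error in the header: the first recorded error is
    // kept and any later one is ignored.
    void set_error(std::exception_ptr e) {
        std::lock_guard<std::mutex> lock(error_mtx);
        if (!error) error = e;
    }

    // Hypothetical helper: surfaces the stored error on the consumer side.
    void rethrow_if_error() {
        std::exception_ptr e;
        {
            std::lock_guard<std::mutex> lock(error_mtx);
            e = error;
        }
        if (e) std::rethrow_exception(e);
    }
};

// Hypothetical helper: returns the stored error's message, or an empty
// string when no error has been recorded.
inline std::string first_error_message(ErrorSlot& slot) {
    try {
        slot.rethrow_if_error();
    } catch (const std::exception& ex) {
        return ex.what();
    }
    return "";
}
```

A unit test along these lines, recording two errors and checking that the first one is the one reported, would be one way to cover the `set_error` lines marked as uncovered.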