• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

vigna / sux-rs / 14053318487

25 Mar 2025 06:42AM UTC coverage: 78.674% (+2.5%) from 76.128%
14053318487

push

github

vigna
Links to relevant papers

4117 of 5233 relevant lines covered (78.67%)

128232554.7 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

63.64
/src/func/vfunc.rs
1
/*
2
 * SPDX-FileCopyrightText: 2025 Sebastiano Vigna
3
 *
4
 * SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
5
 */
6

7
use std::borrow::Borrow;
8

9
use super::shard_edge::FuseLge3Shards;
10
use crate::func::shard_edge::ShardEdge;
11
use crate::traits::bit_field_slice::*;
12
use crate::utils::*;
13
use epserde::prelude::*;
14
use mem_dbg::*;
15

16
/// Static functions with low space overhead, fast parallel construction, and
17
/// fast queries.
18
///
19
/// *Static functions* map keys to values, but they do not store the keys:
20
/// querying a static function with a key outside of the original set will lead
21
/// to an arbitrary result. Another name for static functions *retrieval data
22
/// structure*.
23
///
24
/// In exchange, static functions have a very low space overhead, and make it
25
/// possible to store the association between keys and values just in the space
26
/// required by the values (with a small overhead).
27
///
28
/// This structure is based on “[ε-Cost Sharding: Scaling Hypergraph-Based
29
/// Static Functions and Filters to Trillions of
30
/// Keys](https://arxiv.org/abs/2503.18397)”. Space overhead with respect to the
31
/// optimum depends on the [`ShardEdge`] type. The default is
32
/// [`FuseLge3Shards`], which provides 10.5% space overhead for large key sets
33
/// (above a few million keys), which grow up to 12% going towards smaller key
34
/// sets. Details on other possible [`ShardEdge`] implementations can be found
35
/// in the [`shard_edge`](crate::func::shard_edge) module documentation.
36
///
37
/// Instances of this structure are immutable; they are built using a
38
/// [`VBuilder`](crate::func::VBuilder) and can be serialized using
39
/// [ε-serde](`epserde`). Please see the documentation of
40
/// [`VBuilder`](crate::func::VBuilder) for examples.
41
///
42
///
43
/// # Generics
44
///
45
/// * `T`: The type of the keys.
46
/// * `W`: The word used to store the data, which is also the output type. It
47
///        can be any unsigned type.
48
/// * `D`: The backend storing the function data. It can be a
49
///        [`BitFieldVec<W>`](crate::bits::BitFieldVec) or a `Box<[W]>`. In the
50
///        first case, the data is stored using exactly the number of bits
51
///        needed, but access is slightly slower, while in the second case the
52
///        data is stored in a boxed slice of `W`, thus forcing the number of
53
///        bits to the number of bits of `W`, but access will be faster.
54
/// * `S`: The signature type. The default is `[u64; 2]`. You can switch to
55
///        `[u64; 1]` (and possibly
56
///        [`FuseLge3NoShards`](crate::func::shard_edge::FuseLge3NoShards)) for
57
///        slightly faster construction and queries, but the construction will
58
///        not scale beyond two billion keys or so.
59
/// * `E`: The sharding and edge logic type. The default is [`FuseLge3Shards`].
60
///        For small sets of keys you might try
61
///        [`FuseLge3NoShards`](crate::func::shard_edge::FuseLge3NoShards),
62
///        possibly coupled with `[u64; 1]` signatures. For functions with more
63
///        than a few dozen billion keys, you might try
64
///       [`FuseLge3FullSigs`](crate::func::shard_edge::FuseLge3FullSigs).
65
#[derive(Epserde, Debug, MemDbg, MemSize)]
66
pub struct VFunc<
67
    T: ?Sized + ToSig<S>,
68
    W: ZeroCopy + Word = usize,
69
    D: BitFieldSlice<W> = Box<[W]>,
70
    S: Sig = [u64; 2],
71
    E: ShardEdge<S, 3> = FuseLge3Shards,
72
> {
73
    pub(crate) shard_edge: E,
74
    pub(crate) seed: u64,
75
    pub(crate) num_keys: usize,
76
    pub(crate) data: D,
77
    pub(crate) _marker_t: std::marker::PhantomData<T>,
78
    pub(crate) _marker_w: std::marker::PhantomData<W>,
79
    pub(crate) _marker_s: std::marker::PhantomData<S>,
80
}
81

82
impl<T: ?Sized + ToSig<S>, W: ZeroCopy + Word, D: BitFieldSlice<W>, S: Sig, E: ShardEdge<S, 3>>
83
    VFunc<T, W, D, S, E>
84
{
85
    /// Returns the value associated with the given signature, or a random value
86
    /// if the signature is not the signature of a key .
87
    ///
88
    /// This method is mainly useful in the construction of compound functions.
89
    #[inline(always)]
90
    pub fn get_by_sig(&self, sig: S) -> W {
23,010,100✔
91
        let edge = self.shard_edge.edge(sig);
23,010,100✔
92
        unsafe {
93
            self.data.get_unchecked(edge[0])
23,010,100✔
94
                ^ self.data.get_unchecked(edge[1])
23,010,100✔
95
                ^ self.data.get_unchecked(edge[2])
23,010,100✔
96
        }
97
    }
98

99
    /// Returns the value associated with the given key, or a random value if the
100
    /// key is not present.
101
    #[inline]
102
    pub fn get(&self, key: impl Borrow<T>) -> W {
12,606,060✔
103
        self.get_by_sig(T::to_sig(key.borrow(), self.seed))
12,606,060✔
104
    }
105

106
    /// Returns the number of keys in the function.
107
    pub fn len(&self) -> usize {
×
108
        self.num_keys
×
109
    }
110

111
    /// Returns whether the function has no keys.
112
    pub fn is_empty(&self) -> bool {
×
113
        self.len() == 0
×
114
    }
115
}
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc