• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

stillwater-sc / universal / 24944226215

26 Apr 2026 12:27AM UTC coverage: 84.328% (+0.01%) from 84.317%
24944226215

push

github

web-flow
feat(math): add sw::math::constexpr_math facility with constexpr log2 (#762)

* feat(math): add sw::math::constexpr_math facility with constexpr log2

Introduces a new self-contained library of constexpr math functions for
IEEE-754 float and double, parallel to the existing sw::math::function
facility but specifically for compile-time evaluation. Per the issue
brief: "constexpr is a slightly different use case than templated math
functions ... it would be nice to have this as a separate and
specialized include set ... independently useful outside of Universal's
number systems".

Layout:
  include/sw/math/constexpr_math.hpp       -- aggregator
  include/sw/math/constexpr_math/log2.hpp  -- base-2 logarithm
  internal/constexpr_math/api/log2.cpp     -- regression test (cm_log2)

Namespace: sw::math::constexpr_math

Algorithm (log2):
  1. Decompose x = 2^E * M with M in [1, 2) via std::bit_cast<uint64_t>.
  2. Range-reduce to M in [1/sqrt(2), sqrt(2)] by halving M and
     incrementing E when M > sqrt(2).
  3. With u = (M-1)/(M+1), |u| <= 0.1716, evaluate
       ln(M) = 2 * artanh(u) = 2 * (u + u^3/3 + u^5/5 + ...)
     via Horner. Coefficients are odd reciprocals (1/3, 1/5, ...) --
     derivable from first principles, no external tooling
     (no Sollya, no Remez).
  4. log2(x) = E + ln(M) / ln(2)

Polynomial degree:
  - double: degree 21 (terms up to u^21/21) -> ~5e-17 truncation error
  - float : degree 11 (terms up to u^11/11) -> ~1e-9 truncation error

Special-value handling follows IEEE-754:
  log2(NaN)  -> NaN
  log2(x<0)  -> NaN
  log2(0)    -> -infinity
  log2(+inf) -> +infinity

Subnormals are normalized via a software shift loop (constexpr-safe).

Verification:
  - 22 static_assert checks for exact powers of 2, special values,
    log2(3.14) and log2(0.75) acceptance forms
  - Runtime sweep over 23 hand-picked points + subnormal sample
  - Out-of-band stress sweep (100K random samples in [1e-100, 1e100]):
    max relative error 1.74e-16 = 0.78... (continued)

90 of 104 new or added lines in 2 files covered. (86.54%)

4 existing lines in 3 files now uncovered.

45103 of 53485 relevant lines covered (84.33%)

6403422.69 hits per line

Source File
Press 'n' to go to next uncovered line, 'b' for previous

91.55
/include/sw/math/constexpr_math/log2.hpp
1
#pragma once
2
// log2.hpp: constexpr base-2 logarithm for IEEE-754 float and double
3
//
4
// Copyright (C) 2017 Stillwater Supercomputing, Inc.
5
// SPDX-License-Identifier: MIT
6
//
7
// This file is part of the universal numbers project, which is released under an MIT Open Source license.
8
//
9
// -----------------------------------------------------------------------------
10
// Algorithm
11
// -----------------------------------------------------------------------------
12
// Given x > 0, IEEE-754 representation gives x = 2^E * M with M in [1, 2).
13
// We further range-reduce so that M is in [1/sqrt(2), sqrt(2)] by halving M
14
// and incrementing E when M > sqrt(2). Then with
15
//
16
//     u = (M - 1) / (M + 1),   |u| <= (sqrt(2) - 1)/(sqrt(2) + 1) ~ 0.1716
17
//
18
// the identity
19
//
20
//     ln(M) = 2 * artanh(u) = 2 * (u + u^3/3 + u^5/5 + u^7/7 + ...)
21
//
22
// converges very quickly because |u^2| <= 0.0294. The series uses only odd
23
// reciprocals (1, 1/3, 1/5, ...) so the coefficients are derivable from first
24
// principles without any external tooling (no Sollya, no Remez, no minimax).
25
//
26
// Finally  log2(x) = E + ln(M) / ln(2).
27
//
28
// Polynomial degree is chosen so the truncation error is below 1 ulp:
29
//   - float : degree 11 (terms up to u^11 / 11) gives ~1e-9 error
30
//   - double: degree 21 (terms up to u^21 / 21) gives ~5e-17 error
31
//
32
// -----------------------------------------------------------------------------
33
// Why is_constant_evaluated dispatch is NOT needed
34
// -----------------------------------------------------------------------------
35
// std::bit_cast (C++20) is constexpr for trivially-copyable types. Floating-
36
// point arithmetic in constexpr is allowed since C++14 on gcc/clang. So the
37
// same code path works at compile time and at runtime.
38

39
#include <bit>
40
#include <cstdint>
41
#include <limits>
42
#include <type_traits>
43

44
namespace sw { namespace math { namespace constexpr_math {
45

46
namespace detail {
47

48
// IEEE-754 layout guards: this implementation decodes fixed exponent/mantissa
49
// bit positions via std::bit_cast. On platforms whose floating-point format is
50
// not IEC 559 (e.g., IBM hex float on z/Architecture, VAX), the bit pattern is
51
// different and the algorithm would silently produce wrong results. Fail at
52
// compile time rather than at runtime.
53
static_assert(std::numeric_limits<float>::is_iec559,
54
              "sw::math::constexpr_math requires IEEE-754 single-precision (IEC 559) float");
55
static_assert(std::numeric_limits<float>::digits == 24,
56
              "sw::math::constexpr_math requires float with 24-bit significand (1 implicit + 23 stored)");
57
static_assert(std::numeric_limits<double>::is_iec559,
58
              "sw::math::constexpr_math requires IEEE-754 double-precision (IEC 559) double");
59
static_assert(std::numeric_limits<double>::digits == 53,
60
              "sw::math::constexpr_math requires double with 53-bit significand (1 implicit + 52 stored)");
61

62
// 1 / ln(2), to full double precision. Hardcoded as the well-known value
63
// log2(e) = 1.4426950408889634073599246810018921... (cf. C99 M_LOG2E).
64
// Verifiable independently by computing ln(2) via the same atanh series and
65
// taking the reciprocal.
66
inline constexpr double LOG2E = 1.4426950408889634;
67
inline constexpr double SQRT2 = 1.4142135623730951;  // sqrt(2)
68

69
}  // namespace detail
70

71
// constexpr log2(double): returns the base-2 logarithm of x.
72
// Special values follow IEEE-754 conventions:
73
//   log2(NaN)  -> NaN
74
//   log2(x<0)  -> NaN
75
//   log2(0)    -> -infinity
76
//   log2(+inf) -> +infinity
77
constexpr double log2(double x) {
24✔
78
        if (x != x) return x;                                                // NaN propagation
24✔
79
        if (x < 0.0) return std::numeric_limits<double>::quiet_NaN();
24✔
80
        if (x == 0.0) return -std::numeric_limits<double>::infinity();
24✔
81
        if (x == std::numeric_limits<double>::infinity()) return x;
24✔
82

83
        // Decompose x into 2^E * M with M in [1, 2).
84
        std::uint64_t bits = std::bit_cast<std::uint64_t>(x);
24✔
85
        int exp_field = static_cast<int>((bits >> 52) & 0x7FFu);
24✔
86

87
        // Subnormal: normalize by counting how many bits we need to shift the
88
        // mantissa left so its MSB sits at bit 52 (the implicit-1 position).
89
        int subnormal_shift = 0;
24✔
90
        if (exp_field == 0) {
24✔
91
                std::uint64_t mant = bits & ((std::uint64_t{1} << 52) - 1);
1✔
92
                // mant != 0 because x != 0 (handled above)
93
                while ((mant & (std::uint64_t{1} << 52)) == 0) {
3✔
94
                        mant <<= 1;
2✔
95
                        ++subnormal_shift;
2✔
96
                }
97
                bits = mant;
1✔
98
                exp_field = 1;  // base exponent for the renormalized mantissa
1✔
99
        }
100

101
        int E = exp_field - 1023 - subnormal_shift;
24✔
102

103
        // Reconstruct M in [1, 2) by clearing the exponent field and setting it
104
        // to bias (1023). Sign bit is necessarily zero (we already returned for x<0).
105
        std::uint64_t m_bits = (bits & ((std::uint64_t{1} << 52) - 1))
24✔
106
                               | (std::uint64_t{1023} << 52);
24✔
107
        double M = std::bit_cast<double>(m_bits);
24✔
108

109
        // Range reduction: shift M into [1/sqrt(2), sqrt(2)] for fast convergence.
110
        if (M > detail::SQRT2) {
24✔
111
                M *= 0.5;
11✔
112
                E += 1;
11✔
113
        }
114

115
        // u = (M - 1) / (M + 1), |u| <= 0.1716
116
        double u = (M - 1.0) / (M + 1.0);
24✔
117
        double u2 = u * u;
24✔
118

119
        // Horner evaluation of the artanh series
120
        //   artanh(u) = u * (1 + u^2/3 + u^4/5 + ... + u^20/21)
121
        // from the smallest term down. Degree 21 -> ~5e-17 error.
122
        double s = 1.0 / 21.0;
24✔
123
        s = u2 * s + 1.0 / 19.0;
24✔
124
        s = u2 * s + 1.0 / 17.0;
24✔
125
        s = u2 * s + 1.0 / 15.0;
24✔
126
        s = u2 * s + 1.0 / 13.0;
24✔
127
        s = u2 * s + 1.0 / 11.0;
24✔
128
        s = u2 * s + 1.0 / 9.0;
24✔
129
        s = u2 * s + 1.0 / 7.0;
24✔
130
        s = u2 * s + 1.0 / 5.0;
24✔
131
        s = u2 * s + 1.0 / 3.0;
24✔
132
        s = u2 * s + 1.0;
24✔
133
        double artanh = u * s;
24✔
134

135
        // log2(M) = (2 * artanh(u)) * (1 / ln(2))
136
        double log2M = 2.0 * artanh * detail::LOG2E;
24✔
137

138
        return static_cast<double>(E) + log2M;
24✔
139
}
140

141
// constexpr log2(float): same algorithm at lower precision (degree 11 series).
142
constexpr float log2(float x) {
9✔
143
        if (x != x) return x;
9✔
144
        if (x < 0.0f) return std::numeric_limits<float>::quiet_NaN();
9✔
145
        if (x == 0.0f) return -std::numeric_limits<float>::infinity();
9✔
146
        if (x == std::numeric_limits<float>::infinity()) return x;
9✔
147

148
        std::uint32_t bits = std::bit_cast<std::uint32_t>(x);
9✔
149
        int exp_field = static_cast<int>((bits >> 23) & 0xFFu);
9✔
150

151
        int subnormal_shift = 0;
9✔
152
        if (exp_field == 0) {
9✔
NEW
153
                std::uint32_t mant = bits & ((std::uint32_t{1} << 23) - 1);
×
NEW
154
                while ((mant & (std::uint32_t{1} << 23)) == 0) {
×
NEW
155
                        mant <<= 1;
×
NEW
156
                        ++subnormal_shift;
×
157
                }
NEW
158
                bits = mant;
×
NEW
159
                exp_field = 1;
×
160
        }
161

162
        int E = exp_field - 127 - subnormal_shift;
9✔
163

164
        std::uint32_t m_bits = (bits & ((std::uint32_t{1} << 23) - 1))
9✔
165
                               | (std::uint32_t{127} << 23);
9✔
166
        float M = std::bit_cast<float>(m_bits);
9✔
167

168
        if (M > static_cast<float>(detail::SQRT2)) {
9✔
169
                M *= 0.5f;
4✔
170
                E += 1;
4✔
171
        }
172

173
        float u = (M - 1.0f) / (M + 1.0f);
9✔
174
        float u2 = u * u;
9✔
175

176
        // Degree 11: ~1e-9 error, well below float epsilon (~1.2e-7).
177
        float s = 1.0f / 11.0f;
9✔
178
        s = u2 * s + 1.0f / 9.0f;
9✔
179
        s = u2 * s + 1.0f / 7.0f;
9✔
180
        s = u2 * s + 1.0f / 5.0f;
9✔
181
        s = u2 * s + 1.0f / 3.0f;
9✔
182
        s = u2 * s + 1.0f;
9✔
183
        float artanh = u * s;
9✔
184

185
        float log2M = 2.0f * artanh * static_cast<float>(detail::LOG2E);
9✔
186

187
        return static_cast<float>(E) + log2M;
9✔
188
}
189

190
}}}  // namespace sw::math::constexpr_math
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc