The evolution of constexpr: compile-time lookup tables in C++

Lookup tables (LUTs) are an important resource for systems programming. They are the embodiment of the time-space tradeoff: precomputing results allow faster computation, up to O(1)!

My background: I’m a modern C++ enthusiast and a embedded systems developer. Recently, I had to implement two CRC tables, with different parameters. Instead of hard coding the tables, I went for my favorite kind of approach: computed at compile time. As the codebase was in C++11, it reminded me how far we’ve come.

So, in this article, we’ll review how to implement compile-time lookup tables in C++, for every Standard C++ published so far: C++98, C++03, C++11, C++14, C++17 and C++20. We’ll focus on the evolution of constexpr, but we’ll also see important additions to the language, such as auto and library functionality.

Note: for this article, we’ll only consider features available on each Standard, even if some of the library features were possible to implement in previous versions of C++.

🔗C++98/03: The Dark Ages

Before modern C++, we did not have ways of compute constant expressions with the own language.

All we had for this task was exactly what C has:

1
2
3
4
5
6
// my_table.hpp
extern const int lut[16];

// my_table.cpp
#include "my_table.h"
const int lut[16] = {1,2,3, /* ... */, 16};

We declare a lookup table in our header. Due to how C++ linkage works, we need to use extern linkage and define the lookup table in a single translation unit (my_table.cpp). To use the table, all we need is to include the my_table.hpp file and invoke lut[index].

Alternatively, we can introduce a function for lookup and make the LUT an implementation detail:

1
2
3
4
5
6
7
8
9
10
11
// my_table.hpp
int lookup(size_t index);

// my_table.cpp
#include "my_table.h"
static const int lut[16] = {1,2,3, /* ... */, 16};

int lookup(size_t index){
assert(index < 16); // optional safety measure
return lut[index];
}

🔗How do we compute our lookup tables in C++98?

One could adventure with the preprocessor, but the most common way is generating the my_table.cpp file with a scripting language.

We can definitely do better than that!

🔗C++11: Dawn of constexpr

C++11 was such a huge improvement over the previous versions, it changed many paradigms in the language. Modern C++ was born.

🔗A limited constexpr

The focus of our article is this beautiful inclusion to the language, so let’s start with it!

Though its name isn’t the most elegant-sounding, the keyword perfectly describes what it’s supposed to do: allow definition of constant expressions.

Now we can create type-safe constants, replacing macros:

1
2
constexpr size_t lut_size = 99;
const int lut[lut_size] = {...};

We can also create functions that can be evaluated at compile time:

1
2
3
4
5
constexpr size_t lut_size(){
return 99;
}

const int lut[lut_size()] = {...};

With functions generating constant expressions, we have now a better way of computing anything at compile time. constexpr is Turing-complete!

The main limitation with C++11’s constexpr, however, is that it only allows one return statement. No if‘s and else‘s, no variables. Only one return. So, we had a lot of recursive functions and ternary operators:

1
2
3
constexpr unsigned factorial(unsigned n){
return (n < 2) ? 1 : (n * factorial(n-1));
}

🔗auto

C++ is a statically-type language. The compiler already knows the type of an expression. So why should we?

1
2
3
4
5
6
template<typename T>
void do_something(const T& t){
// We don't know. We don't care
auto tmp = compute(t);
do_something_else(tmp);
}

We can also have auto as our return type. The problem? It’s not deduced, and we still need a trailing return type:

1
2
3
auto do_something() -> int {
return 0;
}

🔗decltype and std::declval

How do we deduce a return type of a generic function, when we cannot just declare it as auto? We can use decltype so the compiler can tell us:

1
2
3
4
template<typename T1, typename T2>
auto add(const T1& a, const T1& b) -> decltype(a + b) {
return a + b;
}

If we need a variable in decltype that we don’t have declared, we can use the std::declval utility, to pretend we have one:

1
2
3
4
5
6
7
using other_type = /* a mystery */;

template<typename F>
auto call(F&& f) -> decltype(f(std::declval<other_type>())){
other_type t { /*...*/ };
return f(t);
}

🔗std::array

One issue with C-style arrays is that we cannot return them from functions. They decay to pointers, which then dangle.

C++11 introduced std::array, a statically-sized container, which solves this issue:

1
2
3
4
5
6
// Returning an array of 10 items
std::array<int, 10> create_arrray(){
std::array<int, 10> arr;
// do something with arr
return arr;
}

🔗Variadic templates

The missing piece of our puzzle: we can now have templates with any amount of template parameters, using parameter packs!

1
2
3
4
template<typename... Args>
void do_something(Args&&... args){
// Do something with args. To individually address arguments, use recursion.
}

(If you’re confused by the Args&& nomenclature, check my previous article: Forwarding References and Perfect Forwarding in C++)

🔗Compile-time LUT in C++11

We all know all the features we needed to create this, so let’s do it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
// Implementation of the LUT, using a recursive function
// We use SFINAE to end recursion

// The final iteration: we construct the array
template<std::size_t Length, typename T, typename Generator, typename... Values>
constexpr auto lut_impl(Generator&& f, Values... values)
-> typename std::enable_if<sizeof...(Values) == (Length - 1), std::array<T, Length>>::type
{
return {values..., std::forward<Generator>(f)(sizeof...(Values))};
}

// All other iterations: we append the values to
template<std::size_t Length, typename T, typename Generator, typename... Values>
constexpr auto lut_impl(Generator&& f, Values... values)
-> typename std::enable_if<sizeof...(Values) < (Length - 1), std::array<T, Length>>::type
{
return lut_impl<Length, T, Generator, Values..., decltype(f(std::size_t{0}))>
(std::forward<Generator>(f),
values...,
f(sizeof...(Values)));
}

// Our lookup table function
// - Length: the size of our LUT
// - Generator: the functor that we'll call to generate the LUT on each index
// - As we're using indexes, we don't need to call std::declval and can just declare a value of size_t
template<std::size_t Length, typename Generator>
constexpr auto lut(Generator&& f) -> std::array<decltype(f(std::size_t{0})), Length> {
// We call the implementation
return lut_impl<Length, // The size
decltype(f(std::size_t{0})) // The return type
>(std::forward<Generator>(f));
}

If you found this unreadable and extremely confusing, remember to thank your library implementers. They do this kind of work so you don’t have to! All you have to do is use the library, which is definitely easier.

Explanation:

  • As we can only have a return type on C++11, we cannot just create the array in our function. So, we created a recursive helper function, to hold all our values until we construct the array.
  • We need to deduce the return type in each function. As we know it’s an array and its length, we can return std::array<decltype(f(std::size_t{0})), Length>.

Usage:

1
2
3
4
5
// We already implemented factorial above, so let's use it!
unsigned precomputed_factorial(unsigned n){
constexpr auto lut = lut<10>(factorial);
return lut[n];
}

Try it out on Compiler Explorer!

Limitations:

  • With std::array, we cannot legally access the data at compile time. So, lookup is only done at runtime.

🔗C++14: constexpr rises

After a decade working on C++11, the committee decided taking a different approach: standards will be released in a 3-year cycle. This way, we don’t wait for the completion of some features, while many are complete and ready to be used.

C++14 was the first standard released this way, and mostly consisted of C++11 bug fixes.

🔗Down with the annoying constexpr limitation!

constexpr was a success. What the committee didn’t expect was how successful it would become.

The initial thought was “let’s replace macros with one-liners”. What actually happened? Compile-time programming became a staple of the language, a symbol of the zero-overhead principle.

Now, we can have multiple lines in a single constexpr function!

1
2
3
4
5
6
7
constexpr unsigned factorial(unsigned n){
unsigned result = 1;
for(unsigned i = 2; i <= n; i++){
result *= i;
}
return result;
}

Now, we can just build an std::array and… Oh, no! They forgot to make std::array‘s operators constexpr!

We could implement our own constexpr notstd::array, but, as the initial disclaimer, we’ll only use standard features. So, let’s see another addition to C++14.

🔗std::integer_sequence and std::index_sequence

std::integer_sequence and std::index_sequence are template utilities for creating sequences and accessing multiple indexes as pack expansions.

🔗A better auto

Now, the compiler can deduce the return type of our functions. No more trailing decltype!

Using our C++11 example, we now can remove our ugly decltype expression:

1
2
3
4
5
6
7
using other_type = /* a mystery */;

template<typename F>
auto call(F&& f) {
other_type t { /*...*/ };
return f(t);
}

Note: auto implies returning a value. In order to return whichever type f returns, including references, we must define our return type decltype(auto).

🔗Compile-time LUT in C++14

At the end, we have slightly more readable code:

1
2
3
4
5
6
7
8
9
10
11
12
// Implementation of the LUT, in one shot!
template<std::size_t Length, typename Generator, std::size_t... Indexes>
constexpr auto lut_impl(Generator&& f, std::index_sequence<Indexes...>) {
using content_type = decltype(f(std::size_t{0}));
return std::array<content_type, Length> {{ f(Indexes)... }};
}

// Our lookup table function
template<std::size_t Length, typename Generator>
constexpr auto lut(Generator&& f){
return lut_impl<Length>(std::forward<Generator>(f), std::make_index_sequence<Length>{});
}

Try it out on Compiler Explorer!

Details:

🔗Variable templates

C++14 introduced variable templates. We now can declare our LUTs as any size, and use them as normal variables (See in Compiler Explorer):

1
2
3
4
5
6
7
template<std::size_t Length>
constexpr auto factorial_lut = lut<Length>(factorial);

// Function is now unnecessary!
unsigned precomputed_factorial(unsigned n){
return factorial_lut<10>[n];
}

Note: Only after C++17, we can declare our variable as inline. We may have some problems with linkage, as we need to take the address of a global variable inside our LUT. Multiple static variables may increase binary size, and not using static linkage may result in duplicate symbols.

🔗C++17: constexpr shines

🔗A fully constexpr std::array

The committee finally added constexpr to std::array‘s iterators and operator[]. We can now iteratively construct an std::array:

1
2
3
4
5
6
7
constexpr auto random_array(){
std::array<int, 10> arr;
for(auto& itm : arr){
itm = random_func();
}
return arr;
}

🔗Compile-time LUT in C++17

We can now go down to a single function!

1
2
3
4
5
6
7
8
9
10
11
template<std::size_t Length, typename Generator>
constexpr auto lut(Generator&& f){
using content_type = decltype(f(std::size_t{0}));
std::array<content_type, Length> arr {};

for(std::size_t i = 0; i < Length; i++){
arr[i] = f(i);
}

return arr;
}

Try it out on Compiler Explorer!

Note: This considers the result of f is default and copy constructible. Otherwise, we need to use the index_sequence version.

🔗constexpr lambdas

Lambda expressions in C++17 now can be declared constexpr, to indicate its operator() can be called at compile time. It can also just infer constexpr from it.

With lambdas and inline variable templates, we can now declare our LUT in a single statement:

1
2
3
4
5
6
7
8
template<std::size_t Length>
inline constexpr auto factorial_lut = lut<Length>([](std::size_t n){
unsigned result = 1;
for(unsigned i = 2; i <= n; i++){
result *= i;
}
return result;
});

🔗C++20: constexpr unleashed

C++20 brings the biggest upgrades in C++ since C++11. However, we’ll just use a couple of features.

🔗constexpr algorithms and ranges

In his classic C++ Seasoning talk, Sean Parent introduced us to the concept of “no raw loops”.

So, as good listeners, let’s use the fact that C++20 introduces constexpr to the functions on the <algorithm> header!

C++20 also introduced the ranges library. It’s enormous, and changes the way we think about algorithms in C++. We’ll use them, rather than the usual algorithm functions.

🔗Compile-time LUT in C++20, feat. ranges

1
2
3
4
5
6
7
8
9
10
11
12
template<std::size_t Length, typename Generator>
constexpr auto lut(Generator&& f){
using content_type = decltype(f(std::size_t{0}));
std::array<content_type, Length> arr;

using namespace std::ranges;
auto content = views::iota(std::size_t{0}, Length) // Generate a sequence
| views::transform(std::forward<Generator>(f)); // Transform using our generator
copy(content, arr.begin());

return arr;
}

Try it out on Compiler Explorer!

Okay, that isn’t easier to read than C++17’s. (No, using std::ranges::transform won’t make it more readable.)

But that’s the beauty: C++ is a backwards-compatible language, that one still works! And the language is highly complex, so this minimal example won’t be fair to the feature. Listen to Sean Parent, use algorithms.

🔗What about the future?

We saw C++20 added ranges. Sadly, they didn’t introduce adapters to generate our data types. Hopefully, in the future, we can do something like this:

1
2
3
4
5
6
template<std::size_t Length, typename Generator>
constexpr auto lut(Generator&& f){
using namespace std::ranges;
return views::iota(0uz, Length) |> views::transform(f) |> to_array();
// to_array doesn't exist. It may not work with our current language features
}

Notes:

  • P0330: uz is the newly approved suffix for size_t literals!
  • P2011: The “pizza operator” |> is a solution proposed for an unwanted overhead of ranges
  • P1206: “ranges::to: A function to convert any range to a container”. It says “Future work is needed to allow constructing std::array

🔗Conclusion

Only by looking back, we can see how far we’ve come. With constexpr, it’s no different.

For our application, we found a balance. We can now generate lookup tables beautifully in our language, without depending on other languages, and at no runtime cost!

At the end, I still don’t like that we cannot use the C++17/C++20 versions for types that are not default constructible. We still need to use the C++14 version for those types, maybe only adding CTAD.

Perhaps a to_array adapter will work someday. I tried implementing it with my small knowledge of ranges, but I believe it may need P1045, so we can deduce the array length from a function input. Until then, we’ll keep using what we can.

If you came so far, thanks for reading! I have DMs open on twitter, and on Reddit. Any questions or comments, just let me know!

The larger code snippets from this post are released under the Boost License, on my snippets repository.