Interrupt handling is an important part of embedded systems development: it allows separating application logic from peripheral interfacing, while removing the need for polling and allowing real-time operations of peripherals.
With interrupt handling in mind, a library may offer functionality that needs a different implementation inside an interrupt context. Conversely, a library function may require checks and calls that only make sense outside an interrupt.
The rule of thumb for interrupt handling is the faster, the better. This usually rules out automatic context detection at runtime. So, library writers generate two versions of the same function, optimizing for the context where they’re executed.
One example of an API that makes extensive use of this technique is FreeRTOS’s. We can take its Timer API as an example, which provides many pairs of these functions:
- xTimerStart / xTimerStartFromISR
- xTimerStop / xTimerStopFromISR
- xTimerChangePeriod / xTimerChangePeriodFromISR
- xTimerReset / xTimerResetFromISR
- xTimerPendFunctionCall / xTimerPendFunctionCallFromISR
But this approach has some issues:
- A user depends on an IDE or documentation to know if a FromISR function even exists
- If a free function does not have the FromISR suffix, does it mean it’s interrupt-safe or not?
- Do we really expect a programmer to access documentation to verify if a call is valid in a specific context?
It’s 2021, we can do better than that. In this article, we’ll explore 4 ways of improving an interface like that with C++.
But wait, there’s more: for those who won’t/can’t use C++, we’ll also see 2 ways of doing this with C!
Our example: Wrapping a queue interface written in C
Instead of creating a new API for every technique, we’ll wrap this fictional C API:
// Our handle type.
It has:
- 2 functions that can be used anywhere
- 2 functions that need the user to specify context
- 2 functions that should not be called inside an ISR
We have here all the issues described in the introduction. So let’s also have a toy example, to try and replicate for each technique:
queue_handle_t* global_queue;
A reader with a keen eye may have noticed we already forgot to call the correct push function inside an ISR. Other readers may have missed it, as I did, and as a code reviewer might. And that’s the problem with this type of interface.
“Make interfaces easy to use correctly and hard to use incorrectly”
— Scott Meyers
The master has spoken, so let’s see what we can do!
C++: RAII guards
The idea for this entire article started from this technique, which I had been experimenting with earlier, with promising results. Even though it ended up not being great, it’s still cool to see what a compiler can do for us. So it’s a perfect first solution to build upon.
Let’s start with what it looks like:
// A shared instance of the queue
We presented a few constructs here:
- context::is_isr() defines whether we’re inside an ISR
- context::isr and context::regular RAII guards, that are not referenced anywhere else
We can conclude there must be something hidden. And there is: a global boolean variable, which stores the magic behind the context detection logic.
The context logic is fairly simple, and this would be the entire library component we’d need:
namespace context {
“But Joel, your title said compile time, that’s runtime!”
Indeed, it is. But if your compiler can guarantee all functions called within that context do not change is_isr, it can infer that constructing and destroying both isr_context and regular_context have no side-effects.
And then, they’re optimized away¹. For instance, this simplified example does everything we expected from it, producing assembly with zero overhead.
¹ That’s the same reason why we need volatile or atomic for variables being used in both contexts. But here, we’re using it to our advantage!
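To make the moving parts concrete, here is a minimal sketch of what such a context component could look like. It is not the exact listing from above: the detail::in_isr flag, the guard template, the layout of queue_handle_t and the stub functions Queue_push and Queue_push_FromISR are all assumptions made up for illustration.

```cpp
namespace context {
namespace detail {
// The "hidden" global flag. A plain bool (neither volatile nor atomic), so
// the optimizer is free to reason about it. In a header this would need to
// be `inline` (C++17) or declared extern.
bool in_isr = false;
}  // namespace detail

// Reports whether the innermost live guard is an ISR guard.
inline bool is_isr() { return detail::in_isr; }

// RAII guard: sets the flag on construction and restores the previous value
// on destruction, so guards can nest.
template <bool InIsr>
class guard {
public:
    guard() : previous_(detail::in_isr) { detail::in_isr = InIsr; }
    ~guard() { detail::in_isr = previous_; }
    guard(const guard&) = delete;
    guard& operator=(const guard&) = delete;

private:
    bool previous_;
};

using isr = guard<true>;       // create at the top of an interrupt handler
using regular = guard<false>;  // create at the top of a task or main()
}  // namespace context

// Hypothetical stand-in for the wrapped C API (names and layout are assumed).
struct queue_handle_t {
    int last;
    bool pushed_from_isr;
};
inline void Queue_push(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = false; }
inline void Queue_push_FromISR(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = true; }

// One function name for both contexts. This branch is what the optimizer
// has to fold away to reach zero overhead.
inline void push(queue_handle_t& q, int value) {
    if (context::is_isr()) {
        Queue_push_FromISR(&q, value);
    } else {
        Queue_push(&q, value);
    }
}
```

Note that nothing stops a caller from using push without any guard alive: the flag simply keeps its last value, which is the “easy to use incorrectly” problem in a nutshell.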
Defining the rest of the Queue wrapper class
Our wrapper class’ interface is transparent to context:
class MyQueue {
Implementation of all other functions is left as an exercise to the reader.
Pros
- Easy to use
- Only define the context logic once, use everywhere
- Only instantiate once on the top-level function of the context, automatically detect inside the functions
- Zero overhead is achievable
Cons
- Works without specifying any context (easy to use incorrectly!)
- No way to block, at compile time, calls to Queue_create and Queue_delete in the wrong context
  - At runtime, it would need exception throwing or std::optional factories. Not great, either way.
- Very hard to reach zero overhead, needing LTO or unity builds (ew), or jumping through many hoops to make the compiler understand there are no side-effects.
Verdict: It’s a fun showing of the power of the compiler, but it’s too error-prone, and depends on very strong optimization techniques. We can do better, and we will.
C++: Context wrapper classes
This strategy is based on getting a handler from a specific context type. Our sample code:
// A shared instance of the queue
What determines our context are the member functions isr_context and regular_context.
We cannot call a function on global_queue without a context wrapper/handler, forcing the user to, at least, think of which context they want to use.
The interface of this implementation is given by:
struct MyQueue {
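As a rough illustration of the idea (a sketch under assumptions, not the full interface): the nested view types IsrView and RegularView, the peek helper, the by-value handle, and the stub functions Queue_push and Queue_push_FromISR are all hypothetical names. The stubs are constexpr only so that the behaviour can be checked at compile time.

```cpp
// Hypothetical stand-in for the wrapped C API (names and layout are assumed).
struct queue_handle_t {
    int last = 0;
    bool pushed_from_isr = false;
};
constexpr void Queue_push(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = false; }
constexpr void Queue_push_FromISR(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = true; }

class MyQueue {
public:
    // View for interrupt context: only ISR-safe operations exist on it.
    class IsrView {
    public:
        constexpr void push(int v) const { Queue_push_FromISR(handle_, v); }

    private:
        friend class MyQueue;  // only MyQueue can hand out views
        constexpr explicit IsrView(queue_handle_t* h) : handle_(h) {}
        queue_handle_t* handle_;
    };

    // View for regular (task) context.
    class RegularView {
    public:
        constexpr void push(int v) const { Queue_push(handle_, v); }

    private:
        friend class MyQueue;
        constexpr explicit RegularView(queue_handle_t* h) : handle_(h) {}
        queue_handle_t* handle_;
    };

    // The queue itself has no push(): a context must be chosen first.
    constexpr IsrView isr_context() { return IsrView{&handle_}; }
    constexpr RegularView regular_context() { return RegularView{&handle_}; }

    // Read-only peek, present only so the demo below can inspect the result.
    constexpr const queue_handle_t& peek() const { return handle_; }

private:
    queue_handle_t handle_{};
};

// Compile-time demos: each view routes to the matching C function.
constexpr bool demo_isr_view() {
    MyQueue q{};
    q.isr_context().push(42);
    return q.peek().last == 42 && q.peek().pushed_from_isr;
}
constexpr bool demo_regular_view() {
    MyQueue q{};
    q.regular_context().push(7);
    return q.peek().last == 7 && !q.peek().pushed_from_isr;
}
static_assert(demo_isr_view(), "ISR view must call the FromISR function");
static_assert(demo_regular_view(), "regular view must call the plain function");
```

Since MyQueue has no push of its own, writing global_queue.push(1) simply does not compile in this sketch.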
Pros
- Easy to use correctly
- Impossible to use without choosing a context, i.e. harder to use incorrectly
- Zero overhead (when inlined)
- We can define functions that only work in certain contexts
- We can share implementation between contexts
Cons
- Verbose declaration
- We can’t invalidate the constructor’s call to Queue_create inside an ISR
- Strategy needs an object to operate over, i.e. does not allow declaring static functionality.
Verdict: It’s a viable way of implementing interfaces, but it needs some boilerplate for the definition. Declaration is not very clean, though we can improve readability by separating definition from declaration.
C++: Tagged functions
For this technique, we need to call our functions with objects of different types, similar to how we’d use them for tag dispatching. We then have two different approaches:
1. Overload Set, which uses empty classes and passes an instance of them as arguments.
namespace context{
Notice we pass the tags by value, not by reference. This slightly helps code generation when the function is not inlined, as we don’t pass an address.
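A small sketch of how such an overload set might be put together. The member layout of MyQueue, the peek helper and the stub functions Queue_push and Queue_push_FromISR are assumptions for illustration; the stubs are constexpr only so the result can be checked at compile time.

```cpp
#include <type_traits>

// Hypothetical stand-in for the wrapped C API (names and layout are assumed).
struct queue_handle_t {
    int last = 0;
    bool pushed_from_isr = false;
};
constexpr void Queue_push(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = false; }
constexpr void Queue_push_FromISR(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = true; }

namespace context {
// Empty tag types: they carry no data, only a type to overload on.
struct isr {};
struct regular {};
}  // namespace context

class MyQueue {
public:
    // Construction is only offered for the regular context.
    constexpr explicit MyQueue(context::regular) {}

    // Same name, different tag type: overload resolution picks the call.
    constexpr void push(context::regular, int v) { Queue_push(&handle_, v); }
    constexpr void push(context::isr, int v) { Queue_push_FromISR(&handle_, v); }

    // Read-only peek, present only so the demo below can inspect the result.
    constexpr const queue_handle_t& peek() const { return handle_; }

private:
    queue_handle_t handle_{};
};

// There is no constructor taking context::isr, so this is rejected statically.
static_assert(std::is_constructible<MyQueue, context::regular>::value,
              "constructible in regular context");
static_assert(!std::is_constructible<MyQueue, context::isr>::value,
              "not constructible in ISR context");

// Compile-time demo: the isr tag selects the FromISR call.
constexpr bool demo_tagged_push() {
    MyQueue q{context::regular{}};
    q.push(context::isr{}, 7);
    return q.peek().last == 7 && q.peek().pushed_from_isr;
}
static_assert(demo_tagged_push(), "isr tag must call the FromISR function");
```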
2. Template specializations, using a template parameter as the tag. The parameter can be a value, e.g. an enum:
enum class context {
Although more restrictive and verbose, this technique does not use a register for the tag argument, which may be relevant when not inlining these functions. In all other aspects, it’s either the same or worse than the previous solution.
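For comparison, a sketch of the template flavour, written here as free functions over the handle. The function name push and the stub functions Queue_push and Queue_push_FromISR are assumed for illustration; as before, constexpr is there only to allow a compile-time check.

```cpp
// Hypothetical stand-in for the wrapped C API (names and layout are assumed).
struct queue_handle_t {
    int last = 0;
    bool pushed_from_isr = false;
};
constexpr void Queue_push(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = false; }
constexpr void Queue_push_FromISR(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = true; }

enum class context { isr, regular };

// Primary template is declared but never defined: only the explicit
// specializations below can actually be called.
template <context C>
constexpr void push(queue_handle_t& q, int value);

template <>
constexpr void push<context::regular>(queue_handle_t& q, int value) {
    Queue_push(&q, value);
}

template <>
constexpr void push<context::isr>(queue_handle_t& q, int value) {
    Queue_push_FromISR(&q, value);
}

// Compile-time demo: the tag is a template argument, not a function argument.
constexpr bool demo_template_push() {
    queue_handle_t q{};
    push<context::isr>(q, 3);
    return q.last == 3 && q.pushed_from_isr;
}
static_assert(demo_template_push(), "isr specialization must call the FromISR function");
```

Calling push(q, 3) without a template argument fails to compile, since the tag cannot be deduced.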
Pros
- A context-specific handle is always required to call functions
- We can constrain the construction to certain contexts
- Zero overhead (when the overload set is inlined, or when using the template alternative)
- No need for a release member function, as the object can only be constructed in a regular context
Cons
- All calls must be individually tagged
- Including the functions that do not depend on context, for interface consistency
- Template version is verbose and does not allow calling the constructor with tags
Verdict: Both are practical ways of solving the problem, though the overload set solution is definitely cleaner. Having to tag each call makes the interface less clean than it needs to be, which may or may not be a good thing.
(Thanks to Austin Morton for introducing this technique to the discussion on the CppLang Slack)
C++: Namespaces
This technique is fairly straightforward, as it’s very similar to our C code:
MyQueue global_queue;
And the declaration:
// A dummy wrapper, for this application
As you may see, we’re basically doing what the original C interface does, but the names of the functions are determined by their namespaces instead of suffixes.
The advantage this presents over the original C implementation is that we can just write using namespace context::<context>; inside our function, and never worry about which function is interrupt-safe or not, or which one requires a FromISR suffix.
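A short sketch of how this could be laid out, assuming an aggregate MyQueue and hypothetical stub functions Queue_push and Queue_push_FromISR standing in for the C API (constexpr only so the demo can be checked at compile time):

```cpp
// Hypothetical stand-in for the wrapped C API (names and layout are assumed).
struct queue_handle_t {
    int last = 0;
    bool pushed_from_isr = false;
};
constexpr void Queue_push(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = false; }
constexpr void Queue_push_FromISR(queue_handle_t* q, int v) { q->last = v; q->pushed_from_isr = true; }

// A dummy aggregate wrapper around the handle.
struct MyQueue {
    queue_handle_t handle{};
};

namespace context {
namespace regular {
constexpr void push(MyQueue& q, int v) { Queue_push(&q.handle, v); }
}  // namespace regular
namespace isr {
constexpr void push(MyQueue& q, int v) { Queue_push_FromISR(&q.handle, v); }
}  // namespace isr
}  // namespace context

// Stands in for an interrupt handler: the context is chosen once, at the top.
constexpr bool demo_isr_handler() {
    using namespace context::isr;
    MyQueue q{};
    push(q, 5);  // resolves to context::isr::push
    return q.handle.last == 5 && q.handle.pushed_from_isr;
}

// Stands in for regular task code.
constexpr bool demo_regular_task() {
    using namespace context::regular;
    MyQueue q{};
    push(q, 6);  // resolves to context::regular::push
    return q.handle.last == 6 && !q.handle.pushed_from_isr;
}

static_assert(demo_isr_handler(), "ISR namespace must call the FromISR function");
static_assert(demo_regular_task(), "regular namespace must call the plain function");
```

If a function pulled in both namespaces, the unqualified push call would be ambiguous and fail to compile, and a call with no using directive at all would not find push.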
Pros
- Easy to use: define the context once per function, every call is done correctly
- Zero overhead (requires inlining only if a wrapper)
Cons
- Can only use free functions (we have no UFCS, or extension methods)
- Our class is either an aggregate, or we need to jump through hoops to make it constructible in a specific context. If we make the class an aggregate, we have the same complications for destruction.
- These solutions are not idiomatic C++, and still resemble C APIs
Verdict: Usable technique, though the complications of making it look like a C interface may be a turn-off for C++ programmers that prefer encapsulated interfaces.
C: Tagged Macros
Well, I promised we can also do this in C. So, let’s see how! Starting from the usage:
typedef struct isr_context{} isr_context;
Pretty neat for C, huh? An overload set with the same name as the original function, just like the C++ solution!
How to implement this solution
The first thing we need to know to implement this is the _Generic selector from C11. It allows implementing overload sets using macros, e.g. the math functions in <tgmath.h>:
1 |
|
In this example, we dispatch, at compile time, the function that will be called, given the argument for our cbrt macro. (Example from the Generic Selection page on cppreference.com)
One thing that we don’t usually consider with generic selection, though, is that argument passing is part of the macro, not the _Generic syntax! So we can just implement our functions like this:
// Context-dependent dispatching
(As before, implementing the remaining functions should be trivial, and is left as an exercise)
Pros
- Single function name, so it’s easy to use correctly and hard to use incorrectly
- Macros hide the original interface (Note: they need to be declared after the functions, and cannot exist before the definition)
- Trying to call without a tag yields a compilation error, which is what we wanted!
- Zero overhead, even without optimizations!
Cons
- Available only on C11 or later
- Errors being behind preprocessor magic can make it hard to understand/debug for some users
- All calls must be individually tagged
Verdict: This technique is very similar to C++’s tagged functions, and has similar pros and cons. But, since it’s in C, it’s one of the better interface improvements we can get, with a superior interface, when compared to the initial one. If C11 is available in your compiler, it may be a good choice to try it out.
C: Tagged macros with wrappers
The previous solution was a vast improvement on the interface for C. The question is: can we do better? I can’t promise we can, but we can definitely do it differently:
typedef struct isr_context{ queue_handle_t* handle; } isr_context;
We now removed the need for a separate tag! We only need to specify it once, and use it normally. The definition is basically the same as the previous one, just changing the parameter passing:
1 |
|
Pros
- Easy to use: instantiate the wrapper once, use it in every function call
- Cannot use without a wrapper, i.e. hard to use incorrectly
- Zero overhead, in optimized builds. On unoptimized builds, it constructs a small struct holding a pointer.
Cons
- The same macro magic and C11 availability issues from the previous solution
- Using the tag type both for wrapping a handle and for passing an empty context tag may be confusing
  - e.g. A user may ask “Is Queue_create supposed to return a handle, or to put the handle inside my context variable?”
Verdict: This technique seems very similar to the C++ Namespaces solution, with a touch of the wrapper class technique. This technique is fairly clean, and the cons sound like nitpicking, when compared to the original interface. If I still wrote pure C interfaces, I’d choose this implementation.
Conclusion
We saw a few techniques for solving an embedded systems development problem: how to make a context-dependent API more fool-proof?
We cannot make any of these interfaces impossible to use incorrectly, but instantiating the wrong context makes it much easier to detect a mistake.
As for which solution is the best: there is none. Most of the techniques presented are viable choices for their respective languages. (RAII guards can burn in hell)
In the end, it’s a tradeoff, mostly between the complexity of the implementation and how clean the interface is. Sometimes, it’s better to just follow the KISS principle and choose the simplest one. Other times, it’s better to try a more robust approach to give the user a more readable interface.
And, if there is a tradeoff, there is space for a comparison table!
Feature | RAII guard | Wrapper class | Tagged overload set | Tagged template | Namespaces | _Generic macro | _Generic wrapper |
---|---|---|---|---|---|---|---|
Forces a choice in each context | ❌ | ✔️ | ✔️* | ✔️* | ✔️ | ✔️* | ✔️ |
Prevents circumvention | ❔ | ✔️ | ✔️* | ✔️* | ✔️* | ✔️* | ✔️ |
Usage with free functions | ✔️ | ✔️¹ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
Usage with member functions | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | DNA | DNA |
Removes repetition from calls | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ✔️ |
No additional symbols generated | ❌ | ➖ | ➖ | ➖ | ➖/✔️² | ✔️ | ➖ |
Can limit construction scope | ✔️³ | ❌ | ✔️ | ⁉ | ⁉ | DNA | DNA |
Runtime overhead removal | complex | trivial | trivial | trivial | trivial/nonexistent** | nonexistent | trivial |
Legend:
- ✔️ means the feature is present
- ❌ means the feature is not present
- ➖ means the compiler needs to completely inline the functions to remove the overhead
  - Which shouldn’t be a problem in most cases. Always compile with optimizations on, even on debug mode! (GCC’s -Og is a thing)
- * means that handles need to be encapsulated, in order to prevent the user from calling a function directly
- ❔ means… it’s complicated. We can force the use of our syntax, but it’s too error-prone
- Runtime overhead removal, i.e. how much does the compiler need to know to
- complex: the compiler needs a lot of information to entirely remove overhead, and may require link-time optimization (LTO)
- trivial: simply inlining the function removes all runtime overhead that would occur
- nonexistent: when the decision is made by a macro, the compiled code is no different than manually putting the correct code
- ¹ With hidden friends
- ² There’s only overhead (without optimizations) if we’re wrapping other functions.
- ³ At runtime, with exceptions (not recommended)
- ⁉ means we need to use non-idiomatic ways of constructing (and/or destructing) the classes
- DNA (Does Not Apply): C does not have member functions, constructors and destructors
In order to not overcomplicate this article, I’ve intentionally left out some other issues, like qualifiers and noexcept correctness, combination of some of these techniques, or problems we’d get by taking function addresses or interfacing with a function-like macro.
If you reached this far, thank you for reading! If I’m wrong, I’ll be glad to learn something new from it! You can find me on twitter, on the CppLang Slack, and you can also open an issue on this blog’s repository!