June 19, 2021

Another take at this Unified Call Syntax Thing


Author	Joel P. C. Filho
Audience	EWGI

C++, as a multi-paradigm programming language, allows its users to define two kinds of functions: member functions and free functions. The beauty of the language is that, no matter which side one prefers, they can use it. Except, of course, when an opinionated library does not use their preferred way of doing that.

The idea of a Unified Call Syntax (UCS, sometimes called Uniform) is to allow users to call a function with either syntax, regardless of how it was declared.

Without a UCS, we’ve been relying on workarounds to emulate the desired behavior, e.g. std::begin:

auto std::begin(auto& range) -> decltype(range.begin()) {
	return range.begin();
}

Without UCS

auto my_begin(auto& x) {
	// We need to import this to guarantee we find a begin function
	using std::begin;
	// Then, ADL is performed and, if no begin is found that way, we call std::begin, which calls x.begin()
	return begin(x);
}

With UCS

auto my_begin(auto& x) {
	// Automatically call x.begin(), because this call is equivalent
	// if begin(x) also exists, we need to define a rule to resolve the ambiguity
	return begin(x);
}

We can also observe this behavior in other Standard functions, such as std::swap’s specializations, which invoke the a.swap(b) member functions of the Standard containers. This creates inconsistent interfaces between libraries, making the life of the user harder than it needs to be.

The introduction of UCS into C++ is something that has been sought with no success by many, including Bjarne Stroustrup, the creator of the language. Meanwhile, other systems programming languages, without 40 years of legacy, have successfully implemented unified function call syntaxes.

Judging by previous committee responses, it’s clear that ambiguous code, or a syntax that breaks any current code, is undesirable. So, at this point, we should accept that x.begin() being equivalent to begin(x) may never happen in C++. We should also understand that the semantics must be clear, and that possible ambiguities should be eliminated.

Therefore, this paper proposes a new unified call operator, in order to maintain full backwards compatibility with existing code. Additionally, the proposal includes an attempt to define the behavior of each possible use case of the operator.

As this paper is presented to incubation, the focus is on defining behavior and syntax, while specific wording is omitted.

🔗On the goal of this proposal: Who benefits from UCS?

Generic code benefits the most from a UCS, because it no longer needs to special-case each calling convention. Therefore, library writers are the main audience for a UCS.

However, by introducing a new syntax for unified calls, we not only provide a new tool for generic programming, but also an opt-in mechanism for any C++ developer who wishes to take advantage of the functionality.

🔗The proposed syntax

The proposed syntax utilizes an operator !(), called the Unified Call Operator (UCO), e.g.:

f!(x)
x.f!()
x->f!()

The UCO is proposed as a non-overloadable operator, which performs the unified call operation on the function named f. The syntax was chosen with the intent of not creating confusion between unified calls and the current model. Other languages utilize this syntax for other function-like calls (Rust’s macro invocations, for example, are written name!(args)), so the notation is already familiar and practical.

Each of the specified UCO uses has rules, depending on the kind of the call. The unified calls can be categorized as:

f!(x, args...) - Unified Free Function Call
x.f!(args...) - Unified Referencing Member Function Call
x->f!(args...) - Unified Dereferencing Member Function Call

The following subsections present the intended behavior for each of these categories:

🔗Unified Free Function Call

The compiler translates f!(x, args...) as follows:

If f(x, args...) is well-formed, it’s equivalent to it, with the usual argument-dependent lookup (ADL) rules
Otherwise, it becomes a call to x.f(args...), if well-formed
Otherwise, the program is ill-formed

The feature is implementable as syntactic sugar with an immediately-invoked lambda expression. For instance, this ruleset may be implemented by the compiler as:

// f!(x, args...):
[](auto&& __x, auto&&... __args) constexpr
  noexcept(
	[]() constexpr {
		if constexpr(requires{f(std::forward<decltype(__x)>(__x), std::forward<decltype(__args)>(__args)...);}){
			return noexcept(f(std::forward<decltype(__x)>(__x), std::forward<decltype(__args)>(__args)...));
		} else if constexpr(requires{std::forward<decltype(__x)>(__x).f(std::forward<decltype(__args)>(__args)...);}){
			return noexcept(std::forward<decltype(__x)>(__x).f(std::forward<decltype(__args)>(__args)...));
		} else {
			return true; // A bool must still be returned here, or noexcept(...) itself fails to compile; the real error is reported by the body below
		}
	}()
  )
{
	if constexpr(requires{f(std::forward<decltype(__x)>(__x), std::forward<decltype(__args)>(__args)...);}){
		return f(std::forward<decltype(__x)>(__x), std::forward<decltype(__args)>(__args)...);
	} else if constexpr(requires{std::forward<decltype(__x)>(__x).f(std::forward<decltype(__args)>(__args)...);}){
		return std::forward<decltype(__x)>(__x).f(std::forward<decltype(__args)>(__args)...);
	} else {
		// Give us std::static_error with constexpr std::format!
		__implementation_defined_compile_time_error("function_name", __x, __args...);
	}
} (x, args...)

For flexibility, and to leave compilers free to optimize compilation speed, the feature can be specified to behave as if it were implemented by this syntactic sugar. The unified member function calls can also be specified in a similar manner.

Note: As an optional addition to this proposal, the operator could also try to instantiate a call to x->f(args...). However, that may introduce unnecessary complexity, so it is not proposed in this paper. For example, when using a smart pointer, a user may be inclined to think all unified calls are made on the contained object, which is not true if the function also exists on the pointer class.

🔗Unified Referencing Member Function Call

The process for x.f!(args...) is simply the inverse of the free function call:

If x.f(args...) is well-formed, it’s equivalent to it
Otherwise, it becomes a call to f(x, args...), with the usual argument-dependent lookup rules, if well-formed
Otherwise, the program is ill-formed

🔗Unified Dereferencing Member Function Call

x->f!(args...) requires dereferencing x. The basic interpretation is:

If x->f(args...) is well-formed, it’s equivalent to it
Otherwise, it becomes a call to f(*x, args...), with the usual argument-dependent lookup rules, if well-formed
Otherwise, the program is ill-formed

🔗Function calls with template arguments

Utilizing the UCO in a function call with explicit template arguments should be valid:

f<T...>!(x, args...)
x.f<T...>!(args...)
x->f<T...>!(args...)

These cases follow the same rules as the non-template ones, with the function name and its template arguments captured together.

🔗Chaining

Chaining is supported, and works like its non-unified counterpart.

x.f!().g!() is equivalent to (x.f!()).g!()
f!(g!(x)) is allowed

🔗Special cases

There are some special cases that should be explicitly forbidden. A program should be ill-formed if:

The function calls a destructor, e.g. x->~T!()
- Reason: Destructors are special member functions, and cannot be free functions. Therefore, there is nothing to unify, and supporting this case is unnecessary.
Pointer-to-member function call is attempted, e.g. x.*f!()
- Reason: Like the destructor, f cannot be a free function. Therefore, it is also unnecessary.
Function name is qualified, e.g. ns::f!(x)
- Reason: A qualified call is not generic code, and it would require the compiler to derive an unqualified name for the member function alternative. Writing using ns::f; is more explicit, and less ambiguous about the possibility of x.f being declared outside ns.
- Note: A unified member function call on a qualified object name is acceptable, e.g. std::cout.f!() may call f(std::cout)
There are no arguments on the free function call, e.g. f!()
- Reason: There is no unification or ADL to be done. We should maintain the default call instead.

There are other special cases, with suggested solutions, which may require further discussion:

🔗Constructor calls

Using the current syntactic sugar model, this code compiles, calling the move constructor in both cases:

struct S{};
S!(S{});
S{}.S!();

Similarly, this is well-formed under the current model, calling unique_ptr<int>(x):

int *x {};
using ptr = std::unique_ptr<int>;
x.ptr!();
ptr!(x);

There are at least three ways to handle this issue:

1. It’s a feature, not a bug!
2. The program is ill-formed if calling f! would only be valid if it resulted in a call to a constructor of a class type f
3. If f is a type name, the program is ill-formed

Author’s opinion: #2. Even though it breaks the current syntactic sugar model, it’s clearer about what is allowed and what is not. It’s probably better than #3, because member functions may share a name with a type.

🔗Dereferencing Member Function Call

Previously, we defined the rules for the x->f!(args...) case.

However, if x is of a class type that overloads operator-> but not the unary operator*, the fallback f(*x, args...) fails. We may either:

1. Force dereferencing semantics, where *x and x.operator->() are always equivalent
2. Change the behavior to try to instantiate f(*(x.operator->()), args...) first, then f(*x, args...)

Author’s opinion: Forcing language semantics is not bad in this case, but, as the fix is trivial, it’s perfectly acceptable to choose #2.

🔗Literals

Up to this point, we’ve implicitly assumed that the proposed UCS accepts member function call syntax on fundamental (non-class) types. This would mean that someone may try to do this:

1.f!(); // Trying to call f on a double
1.f.f!(); // Trying to call f on a float

The second case is already settled: as written it is ill-formed, and it requires a space between the first f and the second dot. The first one, however, may be wrongly lexed as the float literal 1.f followed by !(), i.e. an attempt to apply the unified call operator to 1.0f, which is ill-formed.

Options for solving this issue include:

1. Introduce a lookahead requirement, so f is considered a unified call, before being consumed as a float literal (or any user-defined literal)
2. Always require a space between any literal, with or without suffix, and the function call
3. Forbid using non-class types with member function call operator, at the cost of disabling various generic code options

Author’s opinion: since the alternative interpretation is ill-formed, and #1 is implementable with constant look-ahead, it may be the best, most flexible choice.
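The lexing behavior behind this issue can be observed in today’s C++, independently of the proposal. A small sketch (the variable name one_dot_f_is_float is hypothetical):

```cpp
#include <type_traits>

// `1.f` is lexed as a single token: a floating-point literal with the `f` suffix.
inline constexpr bool one_dot_f_is_float = std::is_same_v<decltype(1.f), float>;
static_assert(one_dot_f_is_float);

// Without the suffix, the same digits form a double literal.
static_assert(std::is_same_v<decltype(1.), double>);
```

This is why 1.f!() would, without an extra rule, be read as a literal followed by !() rather than as a call to f on a double.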

🔗Operators

The UCO could also be used in an operator call, e.g. x.operator>>!(y). This functionality is already built into the language: an operator expression such as x >> y considers both member and non-member candidates, so it already behaves much like a unified call.

However, there are corner cases where it may be relevant, e.g. if the programmer is utilizing an operator call with explicit template arguments. For example:

x.template operator>><T2>!(y);

If utilized, the transformed name is template operator>><T2> for the member function alternatives being checked, and operator>><T2> for the free function one. Since it utilizes two different names for the function, it also changes our syntactic sugar model.

Author’s opinion: Cursed code should burn in hell. We should not support this.

🔗Conclusion

This paper proposes the introduction of the operator !(), the Unified Call Operator, in order to enable an opt-in and backwards-compatible unified call syntax.

Without Unified Call Syntax

auto my_begin(auto& x) {
	// We need to import this to guarantee we find a begin function
	using std::begin;
	// Then, ADL is performed and, if no begin is found that way, we call std::begin, which calls x.begin()
	return begin(x);
}

With Unified Call Operator

auto my_begin(auto& x) {
	// Automatically call begin(x). If not defined, call x.begin();
	return begin!(x);
}

x.f!() and f!(x) have different, but clear, semantics:

Prioritize call as written (as if by removing the !)
Only call the alternative if not able to call as written

Or, when using Revzin’s classification method:

Candidate Set: Any finds Any
- (But only if using the Unified Call Operator)
Overload Resolution: Two Rounds, Prefer As Written
- (Without breaking existing code, as we have new syntax for explicit unified function call)

When compared to previous proposals:

✔️ Behavior is opt-in by the caller
✔️ Syntax is unambiguous
✔️ Code being added should not break legacy code
- ❌ Behavior can change between compilations, when adding a second call option with different behavior
❌ Not as simple as just calling as we’ve always done
❌ Better for library writers than general users
- ✔️ Eliminates the need to write std::begin-like functions for every name needed
✔️ Unlike P0301R1, which also introduces an opt-in mechanism, gives the control of resolution priority to the caller
