C Recipe - Error Callbacks

Context

Error handling is a topic with a wide range of opinions and approaches and, like most things, one I certainly have opinions on; those will be covered in a more general page.

Problem

The standard error handling in C is likely primitive to a fault; the return value of a function that may fail indicates its status. The calling code is left to check for that status and then perform some type of error handling after potentially retrieving additional information that has been stashed in some global variable. This introduces several possible issues at least some of which seem to be commonly referenced.
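As a concrete picture of the pattern described above, here is a minimal sketch built on the standard strtol and errno. The helper names parse_int and parse_or_default are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 on success, or -1 on failure with the
   cause stashed in the global errno. */
int parse_int(const char *text, int *out) {
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        errno = EINVAL; /* no digits at all, or trailing junk */
        return -1;
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        errno = ERANGE; /* does not fit in an int */
        return -1;
    }
    *out = (int)value;
    return 0;
}

/* Caller side: nothing forces this check to be written, and errno only
   says what kind of failure occurred, not which input caused it. */
int parse_or_default(const char *text, int fallback) {
    int value;
    if (parse_int(text, &value) != 0) {
        return fallback;
    }
    return value;
}
```

Dropping the if in parse_or_default would still compile, which is the opt-in weakness discussed in the next section.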

Easily Ignored

The standard pattern does very little to enforce that errors are dealt with; it effectively represents an opt-in model where responsibility is left to the calling code rather than an opt-out model where the calling code must explicitly ignore the possibility of errors. While compiler settings can help enforce that return values aren’t ignored, such settings must be enabled and therefore amount to a variation which still fits within the opt-in model. The status return can easily be missed depending on the utility of the function, and if the implementation later changes such that the return type shifts from void to int (either in response to a previously missed failure mode or extended behavior), existing callers will keep compiling and could easily leave errors unhandled.
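One such opt-in mechanism, sketched here under the assumption of a GCC or Clang toolchain, is the warn_unused_result function attribute. The MUST_CHECK macro and the save_settings function are hypothetical names used only for illustration.

```c
/* GCC and Clang both define __GNUC__; other compilers get an empty macro,
   so the annotation silently disappears there. */
#if defined(__GNUC__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/* Hypothetical function: returns 0 on success, -1 on failure. */
MUST_CHECK int save_settings(int value);

int save_settings(int value) {
    return value < 0 ? -1 : 0;
}

/* A caller that consumes the status, so no warning is produced.
   Writing a bare `save_settings(value);` here would draw a warning. */
int save_checked(int value) {
    if (save_settings(value) != 0) {
        return 0; /* failed */
    }
    return 1; /* saved */
}
```

The annotation has to be added function by function, and an unannotated function is still silently ignorable, so this remains a variation of the opt-in model.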

Anemic Feedback

The relaying of the actual error through mechanisms such as errno is likely to limit the amount of feedback that can be communicated. This can include loss of instance-specific information about the errors and the inability to build a chain of errors which could collect valuable context. Reliance on global state also invites challenges around concurrency, which is certainly outside the scope of my current C usage but is a consideration in terms of the long-term applicability of practices.

Noise

Errors should be handled and the use of the status return leads to nicely readable patterns where each call that may fail can be wrapped in an if statement. In practice, however, many such calls are unlikely to fail or the handling of such errors (or lack thereof) may for one reason or another pollute the calling code in ways that sacrifice the readability of the core logic for the sake of technical defense.

Goals

Based on the above, the desire is to pursue a solution which forces the calling code to more obviously be aware of the risk of errors while also minimizing the disruption of primary logic for the sake of error handling. When errors are raised they should be able to communicate significant feedback, and there should be a recognition of the different types of errors (exceptional vs. expected, can handle vs. panic).

Considerations and Approach

It is compelling to consider errors as simply alternative return values of functions. This makes use of standard language constructs while also addressing most of the concerns outlined above, and it lends itself to an Either type approach (“Data.either” 2021). For C the most natural comparison is likely Go, which uses its support for returning multiple values together with a provided error type. C, however, supports neither multiple return values nor any type of tuple which could act as a natural container for such a compound. A containing struct could certainly be created, likely making use of a union, but the lack of type parameterization means that either there would need to be a separate definition for each return type or a lossy solution such as void pointers would be used. Some combination of macros and code generation could make this approach tenable, but it is likely to introduce a fair amount of complexity and indirection into the code and risks a stance that amounts to fighting against the C language. This is particularly troublesome given that in many cases the errors may be very unlikely and yet the need to deal with atypical constructs would still be prominent. An additional concern is that the composition of such calls may still introduce a fair amount of noise or invite yet further levels of obfuscation. A similar noise issue arguably exists in (v1) Go, but there it is a result of idiomatic simplicity, whereas layering it on top of C code could easily trip over a terrible combination of idiosyncratic noise and complexity.
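To make the cost of the struct-and-union variant concrete, here is a minimal sketch for a single return type. All names (IntResult, int_ok, int_err, checked_div, scaled_rate) are hypothetical, and a second return type would need its own copy of the struct and constructors.

```c
#include <limits.h>
#include <stdbool.h>

/* An Either-like container specialised to int; C offers no way to
   parameterise the value type. */
typedef struct {
    bool is_error;
    union {
        int value;         /* valid when is_error is false */
        const char *error; /* valid when is_error is true  */
    } as;
} IntResult;

IntResult int_ok(int value) {
    IntResult r;
    r.is_error = false;
    r.as.value = value;
    return r;
}

IntResult int_err(const char *message) {
    IntResult r;
    r.is_error = true;
    r.as.error = message;
    return r;
}

IntResult checked_div(int numerator, int denominator) {
    if (denominator == 0) return int_err("division by zero");
    if (numerator == INT_MIN && denominator == -1) return int_err("overflow");
    return int_ok(numerator / denominator);
}

/* Composing two fallible calls: every step needs an explicit unwrap. */
IntResult scaled_rate(int total, int count, int scale) {
    IntResult per_item = checked_div(total, count);
    if (per_item.is_error) return per_item;
    return checked_div(per_item.as.value, scale);
}
```

Even in this small case the unwrapping in scaled_rate shows the composition noise mentioned above.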

While C does not support multiple return values, it does support multiple output parameters. This could provide a similar, albeit slightly more clunky, variation of Go-style error handling where pointers to a result and an error variable are passed in. This would leave the return value unused, which doesn’t lend itself to particularly expressive function calls, and the notion of always passing a pointer for the result is likely to feel particularly clumsy in situations where the function is producing some type of object. It therefore seems natural to use the return value for the primary output of the function while an output parameter is used for the error response. This allows the behavior around return values to vary based on need while providing a more focused interface for error handlers. An approach along these lines would allow for the risk of error to be clearly communicated in the function signature and for the calling code to be able to provide a home for any produced error.
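A minimal sketch of that shape follows: the primary result is the return value and the error travels through an output parameter. The Error struct, its fields, and both function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error record; code 0 means "no error". */
typedef struct {
    int code;
    const char *message;
} Error;

/* Returns the length of name. On failure returns 0 and fills in *err,
   so the possibility of failure is visible in the signature. */
size_t checked_length(const char *name, size_t max, Error *err) {
    err->code = 0;
    err->message = NULL;
    if (name == NULL) {
        err->code = 1;
        err->message = "name is NULL";
        return 0;
    }
    size_t length = strlen(name);
    if (length > max) {
        err->code = 2;
        err->message = "name too long";
        return 0;
    }
    return length;
}

/* Caller: has to provide a home for the error just to make the call. */
int name_fits(const char *name) {
    Error err;
    size_t length = checked_length(name, 8, &err);
    if (err.code != 0) return 0;
    return length > 0;
}
```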

Very often the handling of errors is going to be uninteresting and may be handled by one of a small number of strategies. While a catalog of macros could layer such strategies on top of the above, there are times when it is likely to be a poor fit. If the error can safely be ignored then there’s no reason to provide an address for it; similarly, if the error should invoke a panic then not only does it not need an address but there’s also no reason for the called function to continue past that point even if it may be able to. This provokes a slight reordering in perspective: rather than returning an error value which may be acted upon in some way, the behavior becomes acting on an error value in some way that may include returning/capturing it. The approach therefore revolves around providing a callback which is to be invoked when errors are encountered.

To retain context in light of C’s lack of closures (a function pointer cannot capture variables from the scope that created it) and to enable more advanced behavior, the callback will be a member of a struct which passes itself as an argument. This aligns with the usual pattern for emulating closures in C. A typical use would resemble something such as:

ToReturn call_that_may_fail(Args a, ErrorHandler *eh) {
  if (risky_call(a)) eh->on_error(eh, "Ooopsie");
  ...
}

...
  call_that_may_fail(my_args, &capture_err);
  if (capture_err.err) { /* deal with it */ }

Much of the usage can then be complemented by a range of macros to better express the flow.
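The following is a minimal, self-contained sketch of how the pieces might fit together. The struct layout, the capturing handler, the FAIL macro, and the digit-parsing functions are all assumptions for illustration. The handler is passed by pointer here so that state recorded by the callback is visible to the caller afterwards.

```c
#include <stddef.h>

typedef struct ErrorHandler ErrorHandler;

/* The callback plus its context; the struct passes itself to the callback. */
struct ErrorHandler {
    void (*on_error)(ErrorHandler *self, const char *message);
    const char *err; /* last captured message, NULL if none */
};

/* Capturing strategy: remember the most recent error for the caller. */
static void capture_on_error(ErrorHandler *self, const char *message) {
    self->err = message;
}

ErrorHandler capturing_handler(void) {
    ErrorHandler eh = { capture_on_error, NULL };
    return eh;
}

/* Hypothetical macro to make the reporting site read more like intent. */
#define FAIL(eh, msg) ((eh)->on_error((eh), (msg)))

/* The return value stays free for the primary result. */
int parse_digit(char c, ErrorHandler *eh) {
    if (c < '0' || c > '9') {
        FAIL(eh, "not a digit");
        return 0; /* placeholder result after reporting */
    }
    return c - '0';
}

/* The happy path reads as plain C; the handler is simply threaded through. */
int sum_digits(const char *text, ErrorHandler *eh) {
    int total = 0;
    for (; *text != '\0'; ++text) {
        total += parse_digit(*text, eh);
    }
    return total;
}
```

A caller would create a handler with capturing_handler(), pass its address to sum_digits, and then inspect the err member once at the end rather than after every call.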

How Is This Simpler?

After outlining the goals above it is fair to question why this approach seems superior to others, and in all honesty it’s relatively untried so it remains speculative. The perceived advantage is that it enforces that errors are consistently handled while also allowing for terser handling of simple reactions. In cases such as ignoring or panicking on errors an appropriate ignoring or panicking handler can be passed as an argument. The cost incurred for error handling can therefore hopefully remain proportional to the complexity of the handling of the errors. This provides the pursued opt-out rather than opt-in model. It also seems poised to handle more complex needs such as collecting a chain of errors or enabling richer patterns around adapting flows to errors; while these may not be trivial they seem likely to satisfy the aforementioned proportionality. Finally, it does not compromise the happy path and therefore seems an addition to rather than a mutation of idiomatic C.
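As a rough illustration of that proportionality, the sketch below shows ignoring, panicking, and chain-collecting handlers behind one interface. It assumes a variant of the handler struct that carries a generic context pointer; every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct ErrorHandler ErrorHandler;
struct ErrorHandler {
    void (*on_error)(ErrorHandler *self, const char *message);
    void *context; /* handler-specific state, may be NULL */
};

/* Ignoring: the caller opts out explicitly by choosing this handler. */
static void ignore_on_error(ErrorHandler *self, const char *message) {
    (void)self;
    (void)message;
}

/* Panicking: never returns, so the failing function stops right there. */
static void panic_on_error(ErrorHandler *self, const char *message) {
    (void)self;
    fprintf(stderr, "fatal: %s\n", message);
    abort();
}

/* Collecting: append each message to a fixed-size chain. */
#define CHAIN_CAPACITY 8
typedef struct {
    const char *messages[CHAIN_CAPACITY];
    size_t count;
} ErrorChain;

static void chain_on_error(ErrorHandler *self, const char *message) {
    ErrorChain *chain = self->context;
    if (chain->count < CHAIN_CAPACITY) {
        chain->messages[chain->count++] = message;
    } /* errors beyond the capacity are dropped in this sketch */
}

ErrorHandler ignoring_handler(void) {
    ErrorHandler eh = { ignore_on_error, NULL };
    return eh;
}

ErrorHandler panicking_handler(void) {
    ErrorHandler eh = { panic_on_error, NULL };
    return eh;
}

ErrorHandler collecting_handler(ErrorChain *chain) {
    ErrorHandler eh = { chain_on_error, chain };
    return eh;
}

/* A fallible function; it does not know which strategy is in use. */
int percent_per_item(int items, ErrorHandler *eh) {
    if (items == 0) {
        eh->on_error(eh, "division by zero");
        return 0;
    }
    return 100 / items;
}
```

The simple strategies cost the caller one argument, while the collecting strategy costs one extra declaration for the chain it fills.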

“Data.either.” 2021. https://hackage.haskell.org/package/base-4.15.0.0/docs/Data-Either.html.