Author:: Fred Hebert <mononcqc(at)ferd(dot)ca>
Status:: Final/25.0 Implemented in OTP release 25
Type:: Standards Track
Created:: 31-Aug-2018
Erlang-Version:: OTP-25.0
Post-History:: 05-Dec-2020, 02-Nov-2021, 17-Nov-2021

EEP 49: Value-Based Error Handling Mechanisms #

Abstract #

This EEP introduces the maybe ... end expression as a construct usable for control flow and value-based error handling based on pattern matching. By using this construct, deeply nested case ... end expressions can be avoided or simplified, and the use of exceptions for flow control can be avoided.

Copyright #

This document has been placed in the public domain.

Specification #

We propose the maybe ... end construct, which is similar to begin ... end in that it is used to group multiple distinct expressions as a single block. There is one important difference, however: the maybe block does not export its variables, while begin does.

We propose a new type of expressions (denoted MatchOrReturnExprs), which are only valid within a maybe ... end expression:

maybe
    Exprs | MatchOrReturnExprs
end

MatchOrReturnExprs are defined as having the following form:

Pattern ?= Expr

This definition means that MatchOrReturnExprs are only allowed at the top-level of maybe ... end expressions.

The ?= operator takes the value returned by Expr and pattern matches on it against Pattern.

If the pattern matches, all variables from Pattern are bound in the local environment, and the expression is equivalent to a successful Pattern = Expr call. If the value does not match, the maybe ... end expression returns the value of the failed expression directly.
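As a minimal sketch of these semantics, the module below chains two ?= matches. The module and function names are hypothetical and only serve as illustration; the -feature directive is an assumption about the toolchain, since releases where maybe_expr is not enabled by default need it.

```erlang
-module(maybe_demo).
%% Assumption: only needed on releases where maybe_expr is not on by default.
-feature(maybe_expr, enable).
-export([first_two/1]).

%% Returns {ok, {A, B}} when List has at least two elements.
%% Otherwise the value that failed to match is returned unchanged.
first_two(List) ->
    maybe
        [A | Rest] ?= List,   % binds A and Rest on success
        [B | _] ?= Rest,      % on failure, Rest itself is the result
        {ok, {A, B}}
    end.
```

With this sketch, first_two([1, 2, 3]) evaluates to {ok, {1, 2}}, while first_two([1]) evaluates to [] because the second match fails on the empty tail.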

A special case exists in which we extend maybe ... end into the following form:

maybe
    Exprs | MatchOrReturnExprs
else
    Pattern -> Exprs;
    ...
    Pattern -> Exprs
end

This form exists to capture non-matching expressions in a MatchOrReturnExprs to handle failed matches rather than returning their value. In such a case, an unhandled failed match will raise an else_clause error, otherwise identical to a case_clause error.

This extended form is useful to properly identify and handle successful and unsuccessful matches within the same construct without risking confusion between happy and unhappy paths.
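The following sketch illustrates the extended form under stated assumptions: the module name, the function double/1, and the error atom not_an_integer are all hypothetical. It shows failed matches being routed to the else clauses, and an unhandled value raising an else_clause error.

```erlang
-module(maybe_else_demo).
%% Assumption: only needed on releases where maybe_expr is not on by default.
-feature(maybe_expr, enable).
-export([double/1]).

%% Doubles the integer wrapped in {ok, N}.
%% Failed matches are handed to the else clauses instead of being returned.
double(Input) ->
    maybe
        {ok, N} ?= Input,
        true ?= is_integer(N),
        {ok, N * 2}
    else
        {error, Reason} -> {error, Reason};
        false -> {error, not_an_integer}
    end.
```

Here double({ok, 2}) gives {ok, 4} and double({ok, a}) gives {error, not_an_integer}. A call such as double(oops) matches no else clause and fails with an error whose reason is {else_clause, oops}.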

Given the structure described here, the final expression may look like:

maybe
    Foo = bar(),            % normal exprs still allowed
    {ok, X} ?= f(Foo),
    [H|T] ?= g([1,2,3]),
    ...
else
    {error, Y} ->
        {ok, "default"};
    {ok, _Term} ->
        {error, "unexpected wrapper"}
end

Do note that to allow easier pattern matching and more intuitive usage, the ?= operator should have lower precedence than =, such that:

maybe
    X = [H|T] ?= exp()
end

is a valid MatchOrReturnExprs equivalent to the non-infix form '?='('='(X, [H|T]), exp()), since reversing the priorities would give '='('?='(X, [H|T]), exp()), which would create a MatchOrReturnExprs out of context and be invalid.
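A small sketch of this rule in use, with a hypothetical module and function name: the left-hand side of ?= is an alias pattern built with =, so the whole list and its head and tail are bound in a single step.

```erlang
-module(maybe_alias_demo).
%% Assumption: only needed on releases where maybe_expr is not on by default.
-feature(maybe_expr, enable).
-export([split/1]).

%% Whole = [H | T] is parsed as one pattern on the left of ?=.
%% On a non-matching value, that value is returned as-is.
split(List) ->
    maybe
        Whole = [H | T] ?= List,
        {Whole, H, T}
    end.
```

Under these assumptions, split([1, 2, 3]) gives {[1, 2, 3], 1, [2, 3]} and split([]) gives [].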

Motivation #

Erlang has some of the most flexible error handling available across a large number of programming languages. The language supports:

three types of exceptions (throw, error, exit)
- handled by catch Exp
- handled by try ... [of ...] catch ... [after ...] end
links, exit/2, and trap_exit
monitors
return values such as {ok, Val} | {error, Term}, {ok, Val} | false, or ok | {error, Val}
a combination of one or more of the above

So why should we look to add more? There are various reasons for this, including trying to reduce deeply nested conditional expressions, cleaning up some messy patterns found in the wild, and providing a better separation of concerns when implementing functions.

Reducing Nesting #

One common pattern that can be seen in Erlang is deep nesting of case ... end expressions, to check complex conditionals.

Take the following code from Mnesia, for example:

commit_write(OpaqueData) ->
    B = OpaqueData,
    case disk_log:sync(B#backup.file_desc) of
        ok ->
            case disk_log:close(B#backup.file_desc) of
                ok ->
                    case file:rename(B#backup.tmp_file, B#backup.file) of
                       ok ->
                            {ok, B#backup.file};
                       {error, Reason} ->
                            {error, Reason}
                    end;
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.

The code is nested to the extent that shorter aliases must be introduced for variables (OpaqueData renamed to B), and half of the code just transparently returns the exact values each function was given.

By comparison, the same code could be written as follows with the new construct:

commit_write(OpaqueData) ->
    maybe
        ok ?= disk_log:sync(OpaqueData#backup.file_desc),
        ok ?= disk_log:close(OpaqueData#backup.file_desc),
        ok ?= file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
        {ok, OpaqueData#backup.file}
    end.

Or, to protect against disk_log calls returning something other than ok | {error, Reason}, the following form could be used:

commit_write(OpaqueData) ->
    maybe
        ok ?= disk_log:sync(OpaqueData#backup.file_desc),
        ok ?= disk_log:close(OpaqueData#backup.file_desc),
        ok ?= file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
        {ok, OpaqueData#backup.file}
    else
        {error, Reason} -> {error, Reason}
    end.

The semantics of these calls are identical, except that it is now much easier to focus on the flow of individual operations and either success or error paths.

Obsoleting Messy Patterns #

Frequent ways in which people work with sequences of failable operations include folds over lists of functions, and abusing list comprehensions. Both patterns have heavy weaknesses that make them less than ideal.

Folds over lists of functions use patterns such as those defined in posts from the mailing list:

pre_check(Action, User, Context, ExternalThingy) ->
    Checks =
        [fun check_request/1,
         fun check_permission/1,
         fun check_dispatch_target/1,
         fun check_condition/1],
    Args = {Action, User, Context, ExternalThingy},
    Harness =
        fun
            (Check, ok)    -> Check(Args);
            (_,     Error) -> Error
        end,
    case lists:foldl(Harness, ok, Checks) of
        ok    -> dispatch(Action, User, Context);
        Error -> Error
    end.

This code requires declaring the functions one by one, ensuring the entire context is carried from function to function. Since there is no shared scope between functions, all functions must operate on all arguments.

By comparison, the same code could be implemented with the new construct as:

pre_check(Action, User, Context, ExternalThingy) ->
    maybe
        ok ?= check_request(Context, User),
        ok ?= check_permissions(Action, User),
        ok ?= check_dispatch_target(ExternalThingy),
        ok ?= check_condition(Action, Context),
        dispatch(Action, User, Context)
    end.

And if there was a need for derived state between any two steps, it would be easy to weave it in:

pre_check(Action, User, Context, ExternalThingy) ->
    maybe
        ok ?= check_request(Context, User),
        ok ?= check_permissions(Action, User),
        ok ?= check_dispatch_target(ExternalThingy),
        DispatchData = dispatch_target(ExternalThingy),
        ok ?= check_condition(Action, Context),
        dispatch(Action, User, Context, DispatchData)
    end.

The list comprehension hack, by comparison, is a bit more rare. In fact, it is mostly theoretical. Some things that hint at how it could work can be found in Diameter test cases or the PropEr plugin for Rebar3.

Its overall form uses generators in list comprehensions to tunnel a happy path:

[Res] =
    [f(Z) || {ok, W} <- [b()],
             {ok, X} <- [c(W)],
             {ok, Y} <- [d(X)],
             Z <- [e(Y)]],
Res.

This form doesn’t see too much usage since it is fairly obtuse and I suspect most people have either been reasonable enough not to use it, or did not think about it. Obviously the new form would be cleaner:

maybe
    {ok, W} ?= b(),
    {ok, X} ?= c(W),
    {ok, Y} ?= d(X),
    Z = e(Y),
    f(Z)
end

which, on top of that, has the benefit of returning an error value if one is found.
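To make the difference concrete, here is a sketch that runs both forms over the same failing step. The module name and the step/1 helper are hypothetical and not taken from any OTP code: the list comprehension form loses the error and crashes on the outer match, while the maybe form hands the error value back.

```erlang
-module(maybe_vs_lc).
%% Assumption: only needed on releases where maybe_expr is not on by default.
-feature(maybe_expr, enable).
-export([with_lc/1, with_maybe/1]).

%% Hypothetical step: succeeds only for positive integers.
step(N) when is_integer(N), N > 0 -> {ok, N + 1};
step(_) -> {error, not_positive}.

%% List comprehension hack: a failed generator pattern yields [],
%% so the [Res] match fails with badmatch and the error value is lost.
with_lc(N) ->
    [Res] = [Z || {ok, X} <- [step(N)], {ok, Z} <- [step(X)]],
    Res.

%% maybe form: the first non-matching value is returned as the result.
with_maybe(N) ->
    maybe
        {ok, X} ?= step(N),
        {ok, Z} ?= step(X),
        Z
    end.
```

Both functions return 3 for an input of 1. For an input of 0, with_lc/1 fails with a badmatch on [], whereas with_maybe/1 returns {error, not_positive}.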

Better Separation of Concerns #

This form is not necessarily obvious at first glance. To better expose it, let’s take a look at some functions defined in the release_handler module in OTP:

write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    ensure_RELEASES_exists(Masters, RelFile),
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            case at_all_masters(Masters, ?MODULE, do_write_release,
                                [Dir, "RELEASES.change", NewReleases]) of
                ok ->
                    case at_all_masters(Masters, file, rename,
                                        [Change, RelFile]) of
                        ok ->
                            remove_files(all, [Backup, Change], Masters),
                            ok;
                        {error, {Master, R}} ->
                            takewhile(Master, Masters, file, rename,
                                      [Backup, RelFile]),
                            remove_files(all, [Backup, Change], Masters),
                            throw({error, {Master, R, move_releases}})
                    end;
                {error, {Master, R}} ->
                    remove_files(all, [Backup, Change], Masters),
                    throw({error, {Master, R, update_releases}})
            end;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters),
            throw({error, {Master, R, backup_releases}})
    end.

At a glance, it is very difficult to clean up this code: there are 3 multi-node operations (backing up, updating, and moving release data), each of which relies on the previous one to succeed.

You’ll also notice that each error requires special handling, reverting or removing specific operations on success or on failure. This is not a simple question of tunnelling values in and out of a narrow scope.

Another thing to note is that this module as a whole (and not just the snippet presented here) uses throw expressions to perform non-local returns. The actual points of return handling these are spread through various locations in the file: create_RELEASES/4 and write_releases_1/3, for example.

The case catch Exp of form is used throughout the file because value-based error flow is painful in nested structures.

So let’s take a look at how we could refactor this with the new construct:

write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    maybe
        ok ?= backup_releases(Dir, NewReleases, Masters, Backup, Change,
                              RelFile),
        ok ?= update_releases(Dir, NewReleases, Masters, Backup, Change),
        ok ?= move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
    end.

backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters),
            {error, {Master, R, backup_releases}}
    end.

update_releases(Dir, NewReleases, Masters, Backup, Change) ->
    case at_all_masters(Masters, ?MODULE, do_write_release,
                        [Dir, "RELEASES.change", NewReleases]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(all, [Backup, Change], Masters),
            {error, {Master, R, update_releases}}
    end.

move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, file, rename, [Change, RelFile]) of
        ok ->
            remove_files(all, [Backup, Change], Masters),
            ok;
        {error, {Master, R}} ->
            takewhile(Master, Masters, file, rename, [Backup, RelFile]),
            remove_files(all, [Backup, Change], Masters),
            {error, {Master, R, move_releases}}
    end.

The only reasonable way to rewrite the code was to extract all three major multi-node operations into distinct functions.

The improvements are:

The consequence of failing an operation is located near where the operation takes place
The functions have return values that Dialyzer can more easily typecheck
The functions are inherently more testable independently
Context can still be added and carried on the generalized workflow at the parent level
The chain of successful operations is very obvious and readable
Exceptions are no longer required to make the code work, but if we needed them, only one throw() would be needed in write_releases_m, therefore separating the flow control details from specific function implementations.

As a control experiment, let’s try reusing our shorter functions with the previous flow:

%% Here is the same done through exceptions:
write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    try
        ok = backup_releases(Dir, NewReleases, Masters, Backup, Change,
                             RelFile),
        ok = update_releases(Dir, NewReleases, Masters, Backup, Change),
        ok = move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
    catch
        {error, Reason} -> {error, Reason}
    end.

backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters),
            throw({error, {Master, R, backup_releases}})
    end.

update_releases(Dir, NewReleases, Masters, Backup, Change) ->
    case at_all_masters(Masters, ?MODULE, do_write_release,
                        [Dir, "RELEASES.change", NewReleases]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(all, [Backup, Change], Masters),
            throw({error, {Master, R, update_releases}})
    end.

move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, file, rename, [Change, RelFile]) of
        ok ->
            remove_files(all, [Backup, Change], Masters),
            ok;
        {error, {Master, R}} ->
            takewhile(Master, Masters, file, rename, [Backup, RelFile]),
            remove_files(all, [Backup, Change], Masters),
            throw({error, {Master, R, move_releases}})
    end.

Very little changes in the three distributed functions. However, the weakness of this approach is that we have intimately tied implementation details of the small functions to their parent’s context. This makes it hard to reason about these functions in isolation or to reuse them in a different context. Furthermore, the parent function may capture throws not intended for it.

It is my opinion that using value-based flow control, through similar refactorings, yields safer and cleaner code, which also happens to have far fewer levels of nesting. It should therefore be possible to express more complex sequences of operations without making them any harder to read or to reason about in isolation.

That is in part due to the nesting, but also because we take a more compositional approach, where there is no need to tie local functions’ implementation details to the complexity of their overall pipeline and execution context.

It is also the best way to structure code in order to handle all exceptions and to provide the context they need as close as possible to their source, and as far as possible from the integrated flow.

Rationale #

This section will detail the decision-making behind this EEP, including:

Prior Art in Other Languages
Whether to Normalize on Wrappers
Adding the else Block
The choice of maybe ... end as a construct and its scope
The choice of the matching operator ?=
Other disregarded approaches
The choice of exceptions raised

There’s a lot of content to cover here.

Prior Art in Other Languages #

Multiple languages have value-based exception handling, many of which have a strong functional slant.

Haskell #

The most famous case is possibly Haskell with the Maybe monad, which uses either Nothing (meaning the computation returned nothing) or Just x (their type-based equivalent of {ok, X}). The union of both types is denoted Maybe x. The following examples are taken from Haskell/Understanding monads/Maybe.

Values for such errors are tagged in functions as follows:

safeLog :: (Floating a, Ord a) => a -> Maybe a
safeLog x
    | x > 0     = Just (log x)
    | otherwise = Nothing

Using the type annotations directly, it is possible to extract values (if any) through pattern matching:

zeroAsDefault :: Maybe Int -> Int
zeroAsDefault mx = case mx of
    Nothing -> 0
    Just x -> x

One thing to note here is that as long as you cannot find a value to substitute for Nothing, or cannot take a different branch, you are forced to carry that uncertainty with you through all the types in the system.

This is usually where Erlang stops. You have the same possibilities (albeit dynamically checked), along with the possibility of transforming invalid values into exceptions.

Haskell, by comparison, offers monadic operations and its do notation to abstract over things:

getTaxOwed name = do
  number       <- lookup name phonebook
  registration <- lookup number governmentDatabase
  lookup registration taxDatabase

In this snippet, even though the lookup function returns a Maybe x type, the do notation abstracts away the Nothing values, letting the programmer focus on the x part of Just x. Even though the code is written as if we can operate on discrete values, the function automatically re-wraps its result into Just x and any Nothing value just bypasses operations.

As such, the developer is forced to acknowledge that the whole function’s flow is conditional on values being in place, but they can nevertheless write it mostly as if everything were discrete.

OCaml #

OCaml supports exceptions, with constructs such as raise (Type "value") to raise an exception, and try ... with ... to handle them. However, since exceptions wouldn’t be tracked by the type system, maintainers introduced a Result type.

The type is defined as

type ('a, 'b) result =
  | Ok of 'a
  | Error of 'b

which is reminiscent of Erlang’s {ok, A} and {error, B}. OCaml users appear to mostly use pattern matching, combinator libraries, and monadic binding to deal with value-based error handling, something similar to Haskell’s usage.

Rust #

Rust defines two types of errors: unrecoverable ones (using panic!) and recoverable ones, using the Result<T, E> values. The latter is of interest to us, and defined as:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

Which would intuitively translate to Erlang terms {ok, T} and {error, E}. The simple way to handle these in Rust is through pattern matching:

let f = File::open("eep.txt");
match f {
    Ok(file) => do_something(file),
    Err(error) => {
        panic!("Error in file: {:?}", error)
    },
};

Specific error values have to be well-typed, and it seems that the Rust community is still debating implementation details about how to best get composability and annotations within a generic type.

However, their workflow for handling these is well-defined already. This pattern matching form has been judged too cumbersome. To automatically panic on error values, the .unwrap() method is added:

let f = File::open("eep.txt").unwrap();

In Erlang, we could approximate this with:

unwrap({ok, X}) -> X;
unwrap({error, T}) -> exit(T).

F = unwrap(file:open("eep.txt", Opts)).

Another construct exists to return errors to caller code more directly, without panics, with the ? operator:

fn read_eep() -> Result<String, io::Error> {
    let mut h = File::open("eep.txt")?;
    let mut s = String::new();
    h.read_to_string(&mut s)?;
    Ok(s)
}

Any value Ok(T) encountering ? is unwrapped. Any value Err(E) encountering ? is returned to the caller as-is, as if a match with return had been used. This operator however requires that the function’s type signature use the Result<T, E> type as a return value.

Prior to version 1.13, Rust used the try!(Exp) macro to the same effect, but found it too cumbersome. Compare:

try!(try!(try!(foo()).bar()).baz())
foo()?.bar()?.baz()?

Swift #

Swift supports exceptions, along with type annotations declaring that a function may raise exceptions, and do ... catch blocks.

There is a special operator try? which catches any thrown exception and turns it into nil:

func someThrowingFunction() throws -> Int {
    // ...
}
let x = try? someThrowingFunction()

Here x can either have a value of Int or nil. The data flow is often simplified by using let assignments in a conditional expression:

func fetchEep() -> Eep? {
    if let x = try? fetchEepFromDisk() { return x }
    if let x = try? fetchEepFromServer() { return x }
    return nil
}

Go #

Go has some fairly anemic error handling. It has panics, and error values. Error values must be assigned (or explicitly ignored) but they can be left unchecked and cause all kinds of issues.

Nevertheless, Go exposed plans for new error handling in future versions, which can be interesting.

Rather than changing semantics of their error handling, Go designers are mostly considering syntactic changes to reduce the cumbersome nature of their errors.

Go programs typically handled errors as follows:

func main() {
        hex, err := ioutil.ReadAll(os.Stdin)
        if err != nil {
                log.Fatal(err)
        }

        data, err := parseHexdump(string(hex))
        if err != nil {
                log.Fatal(err)
        }

        os.Stdout.Write(data)
}

The new proposed mechanism looks as follows:

func main() {
    handle err {
        log.Fatal(err)
    }

    hex := check ioutil.ReadAll(os.Stdin)
    data := check parseHexdump(string(hex))
    os.Stdout.Write(data)
}

The check keyword asks to implicitly check whether the second return value err is equal to nil or not. If it is not equal to nil, the latest defined handle block is called. It can return the result out to exit the function, repair some values, or simply panic, to name a few options.

Elixir #

Elixir has a slightly different semantic approach to error handling compared to Erlang. Exceptions are discouraged for control flow (while Erlang specifically uses throw for it), and the with macro is introduced:

with {:ok, var} <- some_call(),
     {:error, _} <- fail(),
     {:ok, x, y} <- parse_name(var)
do
    success(x, y, var)
else
    {:error, err} -> handle(err)
    nil -> {:error, nil}
end

The macro allows a sequence of pattern matches, after which the do ... block is called. If any of the pattern matches fails, the failing value gets re-matched in the optional else ... end section.

This is the most general control flow in this document, being fully flexible with regards to which values it can handle. This was done in part because there is not a strong norm regarding error or valid values in either the Erlang or Elixir APIs, at least compared to other languages here.

This high level of flexibility has been criticized in some instances as being a bit confusing: it is possible for users to make error-only flows, success-only flows, mixed flows, and consequently the else clause can become convoluted.

The OK library was released to explicitly narrow the workflow to well-defined errors. It supports three forms, the first of which is the for block:

OK.for do
  user <- fetch_user(1)
  cart <- fetch_cart(1)
  order = checkout(cart, user)
  saved_order <- save_order(order)
after
  saved_order
end

It works by only matching on {:ok, val} to keep moving forwards when using the <- operator: the fetch_user/1 function above must return {:ok, user} in order for the code to proceed. The = operator is allowed for pattern matches and works the same way it usually does within Elixir.

Any return value that matches {:error, t} ends up returning directly out of the expression. The after ... end section takes the last value returned, and if it isn’t already in a tuple of the form {:ok, val}, it wraps it as such.

The second variant is the try block:

OK.try do
  user <- fetch_user(1)
  cart <- fetch_cart(1)
  order = checkout(cart, user)
  saved_order <- save_order(order)
after
  saved_order
rescue
  :user_not_found -> {:error, :missing_user}
end

This variant will capture exceptions as well (in the rescue block), and will not re-wrap the final return value in the after section.

The last variant for the library is the pipe:

def get_employee_data(file, name) do
  {:ok, file}
  ~>> File.read
  ~> String.upcase
end

The goal of this variant is to simply thread together operations that could result in either a success or error. The ~>> operator matches and returns an {:ok, term} tuple, and the ~> operator wraps a value into an {:ok, term} tuple.

Whether to Normalize on Wrappers #

In Erlang, true and false are regular atoms that only gained special status through usage in boolean expressions. It would be easy to think that more functions would return yes and no were it not for control flow constructs.

Similarly, undefined has over years of use become a kind of default “not found” value. Values such as nil, null, unknown, undef, false and so on have seen some use, but a strong consistency in format has ended up aligning the community on one value.

When it comes to return values for various functions, {ok, Term} is the most common one for positive results that need to communicate a value, ok for positive results with no other value than their own success, and {error, Term} is most often used for errors. Pattern matching and assertions have ensured that it is easy to know whether a call worked or not by its own structure.

However, many success values are still larger tuples: {ok, Val, Warnings}, {ok, Code, Status, Headers, Body}, and so on. Such variations are not problematic on their own, but it would likely not hurt too much either to use {ok, {Val, Warnings}} or {ok, {Code, Status, Headers, Body}}.

Using more standard forms could lead to easier generalizations and abstractions that can be applied to community-wide code. By choosing specific formats for control flow on value-based error handling, we would explicitly encourage this form of standardization.

That being said, the variety of formats existing and the low amount of strict values being used would mean that forcing normalization calls for a potential loss of flexibility in future language decisions. For example, EEP-54 (completed before final revisions of this EEP) tries to add new forms of context to error reports, and various libraries already rely on these richer patterns.

It is therefore the opinion of the OTP technical board that we should not normalize error-return values. As such, an approach closer to Elixir’s with has been proposed, although this EEP’s approach is more general in terms of sequences of acceptable expressions and their composition.

Adding the else Block #

Avoiding normalization on error and good values introduces the need for the else ... end sub-block to prevent edge cases.

Let’s look at the following type of expression to explain why:

maybe
    {ok, {X,Y}} ?= id({ok, {X,Y}}),
    ...
end

While this mechanism is fine for skipping patterns, it has some problematic weaknesses in the context of error handling.

One example of this could be taken from the OTP pull request that adds a new return value to packet reading based on inet options: #1950.

This PR adds a possible value for packet reception to the prior form:

{ok, {PeerIP, PeerPort, Data}}

To make it possible to alternatively get:

{ok, {PeerIP, PeerPort, AncData, Data}}

Based on socket options set earlier. So let’s put it in context for the current proposal:

maybe
    {ok, {X,Y}} ?= id({ok, {X,Y}}),
    {ok, {PeerIP, PeerPort, Data}} ?= gen_udp:recv(...),
    ...
end

Since we force a return on any non-matching value, the whole expression, if the socket is misconfigured to return AncData, would return {ok, {PeerIP, PeerPort, AncData, Data}} on a failure to match.

Basically, an unexpected but good result could be returned from a function using the maybe ... end construct, which would look like a success while it was actually a complete failure to match and handle the information given. This is made even more ambiguous when data has the right shape and type, but a set of bound variables ultimately define whether the match succeeds or fails (in the case of a UDP socket, returning values that come from the wrong peer, for example).

In the worst cases, it could let raw unformatted data exit a conditional pipeline with no way to detect it after the fact, particularly if later functions in maybe ... end apply transformations to text, such as anonymizing or sanitizing data. This could be pretty unsafe and near impossible to debug well.

Think for example of:

-spec fetch() -> {ok, iodata()} | {error, _}.
fetch() ->
    maybe
        {ok, B = <<_/binary>>} ?= f(),
        true ?= validate(B),
        {ok, sanitize(B)}
    end.

If the value returned from f() turns out to be a list (say it’s a misconfigured socket using list instead of binary as an option), the expression will return early, the fetch() function will still return {ok, iodata()} but you couldn’t know as a caller whether it is the transformed data or non-matching content. It would not be obvious to most developers either that this could represent a major security risk by allowing unexpected data to be seen as clean data.

This specific type of error is in fact possible in Elixir, but no such warning appears to have been circulating within its community so far. The issue is to be handled with an else block which this proposal reuses to clamp down on unexpected values:

-spec fetch() -> {ok, iodata()} | {error, _}.
fetch() ->
    maybe
        {ok, B = <<_/binary>>} ?= f(),
        true ?= validate(B),
        {ok, sanitize(B)}
    else
        false -> {error, invalid_data};
        {error, R} -> {error, R}
    end.

Here misconfigured sockets won’t result in unchecked data passing through your app; any invalid use case is captured, and if the value for B turns out to be a list, an else_clause error is raised with the bad value.

Unless the clause is mandatory (it is not in Elixir), this level of additional matching is purely optional; the developer has no obvious incentive to go and handle these errors, and if they do, the exception raised will be through a missing clause in the else section, which will obscure its origin and line number.

We will therefore have to rely on education and documentation (along with type analysis) to prevent such issues from arising in the future.

These problems would not exist with normalized error and return values as those used in statically-typed languages, but since we do not intend to normalize values, the else block is a necessary workaround.

Choosing `maybe ... end` Expressions #

Abstracting over error flow requires defining a scope that limits the way flow is controlled. Before choosing the maybe ... end expression, the following items needed consideration:

what is the scope we need to cover
what is the format of the structure to use
why we ended up with maybe ... end
why we chose the else keyword

Scoping Limits #

In the languages mentioned earlier, two big error handling categories seem to emerge.

The first group of languages seems to track error handling at the function level. For example, Go uses return to return early from the current function. Swift and Rust also scope their error handling abstractions to the current function, but they also make use of their type signatures to keep information about the control flow transformations taking place. Rust uses the Result<T, E> type signature to define what operations are valid, and Swift asks developers either to handle the error locally or to annotate the function with throws to make things explicit.

On the other hand, Haskell’s do notation is restricted to specific expressions, and so are all of Elixir’s mechanisms.

Erlang, Haskell, and Elixir all primarily use recursion as an iteration mechanism, and (outside of Haskell’s monadic constructs) do not support return control flow; it is conceptually more difficult for a return (or break) to be useful when iteration requires recursion: “returning” by exiting the current flow may not bail you out of what the programmer might consider a loop, for example.

Instead, Erlang would use throw() exceptions as a control flow mechanism for non-local return, along with a catch or a try ... catch. Picking a value-based error handling construct that acts at the function level would not necessarily be very interesting since almost any recursive procedure would still require using exceptions.

As such, it feels simpler to use a self-contained construct built to specifically focus on sequences of operations that contain value-based errors.
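
As a reminder of what the exception-based style mentioned above looks like, here is a minimal sketch of a non-local return out of an iteration. The module and function names are hypothetical and chosen only for this illustration.

```erlang
-module(throw_demo).
-export([first_negative/1]).

%% lists:foreach/2 offers no way to stop early by returning a value,
%% so throw/1 is used to bail out and try ... catch turns the
%% exception back into a regular return value.
first_negative(List) ->
    try
        lists:foreach(fun(X) when X < 0 -> throw({found, X});
                         (_) -> ok
                      end, List),
        none
    catch
        throw:{found, X} -> {ok, X}
    end.
```

A function-level early return would not help here, since the code that wants to stop is running inside the fun passed to lists:foreach/2 rather than in first_negative/1 itself.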

Format of Structure #

Prior attempts at abstracting value-based error handling in Erlang overloaded special constructs with parse transforms in order to provide specific workflows.

For example, the fancyflow library tried to abstract the following code:

sans_maybe() ->
    case file:get_cwd() of
        {ok, Dir} ->
            case
                file:read_file(
                  filename:join([Dir, "demo", "data.txt"]))
            of
                {ok, Bin} ->
                    {ok, {byte_size(Bin), Bin}};
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.

as:

-spec maybe() -> {ok, {non_neg_integer(), binary()}} | {error, term()}.
maybe() ->
    [maybe](undefined,
            file:get_cwd(),
            file:read_file(filename:join([_, "demo", "data.txt"])),
            {ok, {byte_size(_), _}}).

And Erlando would replace:

write_file(Path, Data, Modes) ->
    Modes1 = [binary, write | (Modes -- [binary, write])],
    case make_binary(Data) of
        Bin when is_binary(Bin) ->
            case file:open(Path, Modes1) of
                {ok, Hdl} ->
                    case file:write(Hdl, Bin) of
                        ok ->
                            case file:sync(Hdl) of
                                ok ->
                                    file:close(Hdl);
                                {error, _} = E ->
                                    file:close(Hdl),
                                    E
                            end;
                        {error, _} = E ->
                            file:close(Hdl),
                            E
                    end;
                {error, _} = E -> E
            end;
        {error, _} = E -> E
    end.

With monadic constructs in list comprehensions:

write_file(Path, Data, Modes) ->
    Modes1 = [binary, write | (Modes -- [binary, write])],
    do([error_m ||
        Bin <- make_binary(Data),
        Hdl <- file:open(Path, Modes1),
        Result <- return(do([error_m ||
                             file:write(Hdl, Bin),
                             file:sync(Hdl)])),
        file:close(Hdl),
        Result]).

Those cases specifically aimed for a way to write sequences of operations where pre-defined semantics are bound by a special context, but are limited to overloading constructs rather than introducing new ones.
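
For comparison, the sans_maybe() example above could be sketched with the construct proposed in this EEP roughly as follows. The module name and the RelPath parameter (a list of path components, added so the example does not depend on a particular file being present) are assumptions for illustration.

```erlang
-module(with_maybe_demo).
-feature(maybe_expr, enable).   % needed on OTP 25/26
-export([with_maybe/1]).

%% RelPath is a list of path components relative to the current
%% working directory, e.g. ["demo", "data.txt"].
with_maybe(RelPath) ->
    maybe
        {ok, Dir} ?= file:get_cwd(),
        {ok, Bin} ?= file:read_file(filename:join([Dir | RelPath])),
        {ok, {byte_size(Bin), Bin}}
    end.
```

Since there is no else section, any value that fails to match, such as {error, enoent}, is returned as-is, which is what the nested case expressions did by hand.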

By comparison, most of Erlang’s control flow expressions follow similar structures. See the following most common ones:

case ... of
    Pattern [when Guard] -> Expressions
end

if
   Guard -> Expressions
end

begin
    Expressions
end

receive
    Pattern [when Guard] -> Expressions
after                                               % optional
    IntegerExp -> Expressions
end

try
    Expressions
of                                                  % optional
    Pattern [when Guard] -> Expressions
catch                                               % optional
    ExceptionPattern [when Guard] -> Expressions
after                                               % optional
    Expressions
end

It therefore logically follows that if we were to add a new construct, it should be of the form

<keyword>
    ...
end

The questions remaining are: which keyword to choose, and which clauses to support.

Choosing `maybe ... end` #

Initially, a format similar to Elixir’s with expression was being considered:

<keyword>
    Expressions | UnwrapExpressions
of                                              % optional
    Pattern [when Guard] -> Expressions
end

With this construct, the basic <keyword> ... end form would follow the currently proposed semantics, but the of ... section would allow pattern matching on any return value from the expression, whether {error, Reason} or any non-exception value returned by the last expression in the main section.

This form would be in line with what try ... of ... catch ... end allows: once the main section is covered, more work can be done within the same construct.

However, try ... of ... catch ... end has a specific reason for introducing the patterns and guards: protected code impacting tail recursion.

In a loop such as:

map_nocrash(_, []) -> [];
map_nocrash(F, [H|T]) ->
    try
        F(H)
    of
        Val -> [Val | map_nocrash(F, T)]
    catch
        _:_ -> map_nocrash(F, T)
    end.

The of section allows work to continue when no exception has happened, without having to protect more than the current scope of the function, and without preventing tail recursion by forcing each iteration to remain on the stack.

No such concerns exist for value-based error handling, and while the of ... end section might be convenient at times, it is not strictly necessary for the construct to be useful.

What was left was to choose a name. Initially, the <keyword> value chosen was maybe, based on the Maybe monad. The problem is that introducing any new keyword carries severe risks to backwards compatibility.

Since the OTP team is now planning to introduce a new mechanism for activating new language features per module, the risk of incompatibility from introducing a new keyword is reduced. Only modules explicitly using the new language feature will be impacted.

For example, all of the following words were considered:

======= ================= =========================================
Keyword Times used in OTP Rationale
        as a function
======= ================= =========================================
maybe   0                 can clash with existing used words,
                          otherwise respects the spirit
option  88                definitely clashes with existing code
opt     68                definitely clashes with existing code
check   49                definitely clashes with existing code
let     0                 word is already reserved and free, but
                          makes no sense in context
cond    0                 word is already reserved and free, may
                          make sense, but would prevent the
                          addition of a conditional expression
given   0                 could work, kind of respects the context
when    0                 reserved for guards, could hijack in new
                          context but may be confusing
begin   0                 carries no conditional meaning, mostly
                          free for overrides
======= ================= =========================================

Initially, this proposal expected to use the maybe keyword:

maybe
    Pattern <op> Exp,
    ...
of
    Pattern -> Exp  % optional
end

but for the reasons mentioned in the previous section, the of ... section became non-essential.

Why Choose the `else` keyword #

The first step here was looking at all the existing alternative reserved keywords: of, when, cond, catch, after.

None of these actually conveys the sense of requiring an alternative clause to the construct, and so we require adding a new one. The else keyword is tempting if only because it opens the door to introducing it as a reserved word in if expressions at a later date.

A quick look at the OTP code base turns up no else() function, so the keyword should be relatively safe to use in general.

Choosing An Infix Operator #

In order to form MatchOrReturnExprs, there is a need for a mechanism to introduce pattern matching with distinct semantics from regular pattern matching.

A naive parse transform approach with fake function calls would be the most basic way to go:

begin
    match_or_return(Pattern, Exp),
    %% variables bound in Pattern are available in scope
    ...
end

However, this would introduce pattern matches in non-left-hand-side positions and make nesting really weird to deal with without exposing parse transform details and knowing how the code is translated.

A prefix keyword such as let <Pattern> = <Exp> could also be used. However, that use of let would differ from how it is used in other languages and would be confusing.

An infix operator seems like a good fit since pattern matching already uses them in multiple forms:

= is used for pattern matches. Overloading it in error flow would prevent regular matching from being used
:= is used for maps; using it could work, but would certainly be confusing when handling nested maps in a pattern
<- could make sense. It is already restricted in scope to list and binary comprehensions and would therefore not clash nor be confused. The existing semantics of the operator imply a literal pattern match working like a filter, which is what we are looking for.
<= same as <- but for binary generators
?= could make sense. It is a new operator that does not clash with anything existing and it could be thought of as conditional matching. (The preprocessor uses ? to signal the start of a macro invocation. Although = is a valid macro name, that does not cause any ambiguity, because macro names that are not atoms or variables must be single-quoted. Therefore, ?= is not a valid invocation of the = macro; the invocation of the = macro must be written like ?'='.)

The <- and ?= operators make the most sense. We choose ?=, since <- in its current use means "subset of" or "member of", which is not really what it is used for here.

For completeness’s sake, I also checked for alternative operators in a prior version of this EEP that introduced prescriptive values for {ok, T} | {error, R}, which had distinct semantics:

======== ===========================================================
Operator Description
======== ===========================================================
#=       No clash with other syntax (maps, records, integers), no
         clash with abstract patterns EEP either.
!=       No clash with message passing, but is sure to confuse
         anyone used to C-style inequality checks.
<~       Works with no known conflict; shouldn't clash with ROK's
         frame proposals (uses infix ~ and < > as delimiters).
         Has the disadvantage of being visually similar to `<-`.
<|       Reverse pipe operator. If Erlang were to implement
         Elixir's pipe operator, it would probably make sense to
         implement both `<|` and `|>` since the "interesting"
         argument often comes last.
=~       Regular expression operator in Elixir and Perl.
======== ===========================================================

Operator Priority #

Within the expected usage of MatchOrReturnExprs, the ?= operator needs a precedence rule such that:

X = {Y,X} ?= <Exp>

Is considered a valid pattern match operation with X = {Y,X} being the whole left-hand-side pattern, such that operation priorities are:

lhs ?= rhs

Instead of

lhs = rhs ?= <...>

In all other regards, the precedence rules should be the same as = in order to provide the most unsurprising experience possible.
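
As a small illustration of this rule, the sketch below uses an alias pattern on the left-hand side of ?=. The module and function names are hypothetical, and the example assumes the grouping described above, that is (Whole = {ok, N}) ?= Result.

```erlang
-module(prio_demo).
-feature(maybe_expr, enable).   % needed on OTP 25/26
-export([unpack/1]).

unpack(Result) ->
    maybe
        %% Everything left of ?= is a single pattern: Whole is an
        %% alias for the entire {ok, N} tuple.
        Whole = {ok, N} ?= Result,
        {N, Whole}
    end.
```

With this reading, unpack({ok, 5}) binds both N and Whole, while a non-matching value such as {error, oops} is returned unchanged by the maybe block.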

Other Disregarded Approaches and Variations #

Other approaches were considered in making this proposal, and ultimately disregarded.

`begin ... end` With Prescriptive Error Values #

An earlier version of this document simply used:

begin
    Foo = bar(),
    X ?= id({ok, 5}),
    [H|T] ?= id({ok, [1,2,3]}),
    ...
end

Which implicitly unpacked {ok, T} = f() by calling T ?= f(), and forced all acceptable non-matching values to be of the form {error, T}.

To make the form useful to most existing code, it also required some magic that nobody (myself included) liked very much, whereby _ ?= f() would implicitly succeed if the return value of f() was ok.

This was judged to be too magical, and not necessarily a ton of existing Erlang code would have benefited from the form since ok is often returned for successful functions without an extra value. A stronger prescriptiveness of the form {ok, undefined} (to replicate Rust’s Ok(())) would have been required to avoid the magic, and would have felt very unidiomatic.

Elixir-Like Patterns in `with` #

The Elixir approach is fairly comprehensive, and rather powerful. Rather than handling success or errors, it generalizes over pattern matching as a whole, as we do here.

The one difference is that Elixir’s with expression forces all conditionals to happen first, with a do block for the free-form expressions that follow:

with dob <- parse_dob(params["dob"]),
    name <- parse_name(params["name"])
do
  %User{dob: dob, name: name}
else
  err -> err
end

The Erlang form introduced in this document is more general since it allows mixing MatchOrReturnExprs and ordinary expressions throughout, without the need for a general do block.
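
A rough Erlang counterpart of the with example above might look like the following sketch. The module name, the use of a map with atom keys for Params, and the {Year, Month, Day} shape of the dob value are all assumptions made for illustration.

```erlang
-module(mix_demo).
-feature(maybe_expr, enable).   % needed on OTP 25/26
-export([parse_user/1]).

%% Params is assumed to be a map such as
%% #{dob => {1990, 1, 1}, name => <<"Ada">>}.
parse_user(Params) ->
    maybe
        {ok, Dob} ?= maps:find(dob, Params),
        %% A plain expression sitting between two conditional matches.
        Year = element(1, Dob),
        {ok, Name} ?= maps:find(name, Params),
        {ok, #{name => Name, birth_year => Year}}
    else
        %% maps:find/2 returns the atom error for a missing key.
        error -> {error, missing_field}
    end.
```

The plain match on Year appears in the middle of the sequence, which Elixir's with would only allow inside the do block.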

The Erlang form does imply a likely more complex set of rewriting rules when translating from the AST form to Core Erlang. It should be possible to rewrite it purely in existing Core Erlang terms, although the end result may not look like the original code at all.

`cond` and `cond let` #

Anthony Ramine recommended looking into reusing the already reserved cond and let keywords. He mentioned Rust planning something based on these and how it could be ported to Erlang based on his prior work on supporting the cond construct within the language.

The proposed mechanism would look like:

cond
    X > 5 -> % regular guard
        Exp;
    f() < 18 -> % function used in guard, as originally planned
        Exp;
    let {ok, Y} = exp(), Y < 5 ->
        Exp
end

The last clause would allow Y to be used in its own branch only if it matches and all guards succeed; if the binding fails, a switch is automatically made to the next branch.

As such, more complex sequences of operations could be covered as:

cond
    let {ok, _} = call1(),
    let {ok, _} = call2(),
    let Res = call3() ->
        Res;
    true ->
        AlternativeBranch
end

This mechanism is, in my opinion, worth exploring and maybe adding to the language, but on its own does not adequately solve error handling flow issues since errors cannot be extracted easily from failing operations.

Auto-Wrapping Return Values #

Auto-wrapping return values is something that Elixir’s OK library and Haskell’s do notation both do, but that neither Rust nor Swift does.

It seems that there is no very clear consensus on what could be done. Thus, for the simplicity of the implementation, just returning the value as-is without auto-wrapping seems sensible, particularly since we do not prescribe tuple formats for handled values.

It would therefore be up to the developer to return whatever value best matches their function’s type signature, making it easier to integrate return values with the system they have.

It also lets sequences of operations potentially return ok on success, even if their individual functions returned values such as true, for example, rather than {ok, true}.
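
A minimal sketch of that last point, with hypothetical module and function names: each step yields true or false, yet the block as a whole returns a bare ok on success.

```erlang
-module(wrap_demo).
-feature(maybe_expr, enable).   % needed on OTP 25/26
-export([check_positive_int/1]).

check_positive_int(X) ->
    maybe
        true ?= is_integer(X),
        true ?= X > 0,
        %% The developer picks the success value; nothing is wrapped.
        ok
    else
        false -> {error, invalid}
    end.
```

Nothing in the construct dictates the ok or {error, invalid} shapes here; they are simply whatever the function's author decided to return.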

Choosing Exceptions Raised #

The exception format proposed here is {else_clause, Value}. This format is chosen following Erlang/OTP standards:

if_clause
{case_clause, Val}
function_clause (the value is provided in the stacktrace)
{badmatch, Val}
Unmatching values in a catch block and receive expressions do not raise anything explicitly

Since case_clause is functionally the closest exception and carries a value, we choose to replicate the same form here.

else_clause is chosen over maybe_clause because the else block could arguably be used in other constructs in the future, and naming the exception after the block itself is likely more future-proof.

Backwards Compatibility #

The new keyword maybe is introduced and might clash with existing uses of unquoted maybe as an atom, which in practice also includes its use as a module or function name.

The plan is to introduce this new feature together with a mechanism that lets the user activate it, and other new features, individually per module. The new keyword maybe will therefore only be a potential incompatibility in modules where the user activates this feature.
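
As released in OTP 25, this per-module activation is done with the -feature(maybe_expr, enable) directive (or the -enable-feature option of erlc). The sketch below, whose module and function names are hypothetical, shows the practical consequence for existing code: once the feature is active in a module, the atom has to be quoted.

```erlang
-module(compat_demo).
-feature(maybe_expr, enable).   % maybe becomes a keyword in this module
-export([answer/1]).

answer(yes) -> yes;
answer(no) -> no;
%% An unquoted maybe would no longer parse as an atom here.
answer(_) -> 'maybe'.
```

Callers in other modules are unaffected by the quoting, since 'maybe' and maybe denote the same atom wherever the latter is still accepted as one.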

Reference Implementation #

There are several reference implementations: