A Semantics of Core Erlang with Handling of Signals

We introduce a small-step semantics for a subset of Core Erlang that models its monitoring and signal systems. The goal of our semantics is to enable the construction of causal explanations for property violations, which will be the object of future work. As a first line of investigation, we chose to study the impact of the order of messages on a faulty behavior. We present our semantics and discuss some of our design choices. This work is part of a broader project on causal debugging of concurrent programs in Erlang.


Introduction
Debugging concurrent systems is a difficult task, for which causal explanations of observed faults are a precious help. More precisely, our work is motivated by the goal of explaining errors caused by scheduler choices that lead to incorrect interleavings, also known as concurrency bugs. In this paper we formalize a semantics of Core Erlang that preserves the information pertinent to such a causal analysis.
Most importantly, we want to investigate the impact of the order of messages on a faulty behavior. We thus need a semantics that precisely models how message handling, signals, and monitoring affect the outcome. None of the existing semantics presents all of these characteristics at the same time. However, when possible, we adapt some of their ideas, and use observations made on the implementation.
Our small-step semantics focuses on the order of messages. It models signal handling and some monitoring behaviors, but makes simplifications: its spawn never fails; it cannot represent continuous time, hence we do not support timers; and nodes, try-catch expressions, and the ability to dynamically update code are not modeled.
This work is a part of a broader project called DCore, whose objective is to develop a semantically well-founded form of concurrent debugging, which we call causal debugging, that aims to alleviate the deficiencies of current debugging techniques for large concurrent software systems.
DCore encompasses two main work directions. The first is the design of a reversible execution engine that allows programmers to backtrack and replay a concurrent or distributed program execution, in a way that is both precise and efficient (only the threads involved in a return to an earlier or later target program state are impacted) [10,11].
The second axis of DCore is the development of a causal analysis engine that allows programmers to analyze concurrent executions by asking questions of the form "what caused the violation of this program property?", and that allows for the precise and efficient investigation of past and potential program executions. Our causal analysis will be based on the semantics presented in this paper.
After a short overview of existing semantics, we present the result of our work and the choices we made. We conclude with a discussion of how our semantics differs from previous ones.

Related Work
Our semantics is by no means the first one for Core Erlang. [6] is the most recent official specification of Core Erlang we found, but whilst useful it is informal and partly obsolete. [14] proposed a small-step semantics of a subset of Core Erlang, making some choices, such as the order of evaluation of function arguments, based on its implementation rather than the official documentation. [4] formalized a subset of Core Erlang with a big-step semantics, following [14] in some aspects, and validated it with Coq. [5] introduced the concept of medium-step semantics, which considers as a step not only assignments, but also some events of interactions between processes. However, all these semantics lack the signal handling and monitoring part. The only semantics we found that finely models signals [15] treats them as side effects of operations that are instantaneously treated, which implies a strong hypothesis on their order of reception.

Design Choices
Erlang offers numerous functionalities and shortcuts in order to help its users, but can ultimately be translated, via an official tool, to a language with less syntactic sugar: Core Erlang. We thus chose, for our future analysis, to use a subset of Core Erlang. Both languages are still evolving, hence we tried to base our choices on concrete elements. As far as possible, we use the initial, though somewhat outdated, specification of Core Erlang [6], and the official documentation available on the Erlang website [1]. When a choice we made directly refers to this documentation, we will provide the clickable reference in the form [ERM, page, section] (Erlang Reference Manual [2]) or [ERTS, page, section] (Erlang Run-Time System Application reference manual [3]).
When these sources lack vital information, we look at the official implementation. If the two conflict, as is the case for the evaluation order of the arguments of a function call, we follow the documentation. For now, we put the concept of node aside. In our model, an Erlang system consists of a set of processes which interact through signal exchanges.

Execution Model
Our first goal is to formalize the different behaviors emerging from the non-deterministic order in which messages are treated. Thus, our model represents how processes send and handle signals, how they evaluate instructions, and how they manage some parts of the monitoring aspect of Erlang.
Each process has a unique identifier pid. The state of a process is defined by the state of its execution task, its outbox, and the set of its links with other processes. Processes evolve asynchronously, but are synchronized when they interact.

Execution Task
A task e executes instructions. It is represented as a tuple $S_e = (I_e, \theta, \tau, m, st, r)$ where $I_e$ is the next expression to be evaluated, θ maps free variables to their values (as bound by lets or pattern matching), while τ does the same for variables that are bound to recursive functions by letrec. In this model, modules are named sets of functions used during the initialization of τ. We will not focus on the rules around letrec, which can be found in the appendix. We also store information dedicated to generating signals: m is the name of the current module, st stores the execution stack, and r stores the reason for the end of the process, if known. Notice that the only role of st is to store the information needed to build error messages.
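As an illustration only, the task tuple can be pictured as a record. The sketch below is a hypothetical Python encoding, not part of the semantics: the field names, the use of None for an unknown end reason, and the encoding of expressions as nested tuples are all assumptions made for readability.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class TaskState:
    """Hypothetical encoding of S_e = (I_e, theta, tau, m, st, r)."""
    expr: Any                                                       # I_e: next expression to evaluate
    theta: Dict[str, Any] = field(default_factory=dict)             # theta: variable -> value
    tau: Dict[Tuple[str, int], Any] = field(default_factory=dict)   # tau: (name, arity) -> function
    module: str = ""                                                # m: name of the current module
    stack: List[str] = field(default_factory=list)                  # st: only used to build error messages
    reason: Optional[str] = None                                    # r: None stands for "not known yet"


# Illustrative task: the expression encoding is an assumption of this sketch.
task = TaskState(expr=("let", "X", 1, ("var", "X")), module="demo")
task.theta["X"] = 1   # effect of a successful binding of X
```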

Sent Signals
We followed [5] in their choice of an outbox per process, and no inbox. It is a good way to keep the order guarantees given by the language: it lets us represent the local knowledge of a process when it sends signals without making presuppositions about when the signal is actually received [ERM, processes, delivery-of-signals].
There are two kinds of signals.
Messages. The content of a message can be any constant in Core Erlang. We tag messages as msg, and store the identifier of the process they are sent to. They are read and consumed only when their recipient evaluates a receive expression.
Other signals. Other signals are automatically processed: as stated in [ERM, processes, signals], "there is nothing a process must do to handle the reception of signals, or can do to prevent it". We tag them with not_msg and the identifier of their recipient. Their content depends on the nature of the signal, associated with specific flags ranging over link, unlink, unlinkAck, monitored_by, demonitor, and down. Links between processes are created and removed by means of signals whose content is tagged with the flags link, unlink, and unlinkAck. More precisely, a signal with content (link, $pid_{source}$, $pid_{target}$) is generated by a process $pid_{source}$ to create a link with the process $pid_{target}$. A signal with content (unlink, $pid_{source}$, $pid_{target}$) is generated to remove this link, while a signal with content (unlinkAck, $pid_{source}$, $pid_{target}$) is used if the link did not already exist.
Core Erlang has another kind of link with its own signals, namely monitor links. They are created and removed by means of signals whose content is tagged with the flags link and demonitor. These signals also contain information about the pid of the process that has created the monitor link. Furthermore, each monitor link has its own identifier. Link identifiers take the form of terms built with the constructor #Ref on a quadruple of integers typically standing for the identifier of the node, the identifier of the source process, the identifier of the target process, and a unique integer. This way, there may be two monitor links between the same pair of processes. The set of monitor link identifiers is denoted as $Ref_L$. A signal with the content (link, $pid_{source}$, lid) is used by the source process $pid_{source}$ to initiate a monitor link with id lid, whereas signals with the content (SigDemonitor, $pid_{source}$, lid) are used to remove them.
When a process terminates, it sends signals describing the reason for its termination to the processes with which it shares links. The content of these signals is tagged with the flag down. Such a signal has content (down, fl, $pid_{ended}$, lid, r), where fl is either the flag monitored_by or the flag link according to the kind of the link to be removed, $pid_{ended}$ is the pid of the process that has terminated its execution, lid is either the pid of the target process or the monitor link id according to the kind of link, and r is the reason for the process termination.
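The shapes of the signals described above can be sketched as follows. This is a minimal illustration under stated assumptions: the names Signal, message, link_signal and down_signal are hypothetical, pids are plain integers, and the reason "normal" is only an example value.

```python
from typing import Any, NamedTuple


class Signal(NamedTuple):
    """A sent signal: a tag, the pid of its recipient, and its content."""
    tag: str        # "msg" or "not_msg"
    recipient: int  # pids are plain integers in this sketch (assumption)
    content: Any


def message(recipient, value):
    # Messages carry an arbitrary constant and are tagged msg.
    return Signal("msg", recipient, value)


def link_signal(source, target):
    # (link, pid_source, pid_target), addressed to the target process.
    return Signal("not_msg", target, ("link", source, target))


def down_signal(recipient, flag, ended, lid, reason):
    # (down, fl, pid_ended, lid, r) with fl either link or monitored_by.
    if flag not in ("link", "monitored_by"):
        raise ValueError("flag must be 'link' or 'monitored_by'")
    return Signal("not_msg", recipient, ("down", flag, ended, lid, reason))


# Signals kept in their order of sending, as in an outbox.
outbox = [message(2, "hello"), link_signal(1, 2), down_signal(2, "link", 1, 2, "normal")]
```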
The set of signals that are not messages is denoted by SigOther. In our model, the process outbox stores its sent signals in their order of sending. The set of its possible states is

Links
One of the main characteristics of Erlang is the ability to easily implement a whole monitoring system between processes. This is done by linking them so that they react suitably when one of them dies. The language offers two types of linking: link, associated with a tag link, and monitor, associated with tags monitored_by and monitoring.
When two processes are bound with a link, if one of them dies, so does the other one. Only one link can exist for a pair of processes. We model the fact that a process of identifier $pid_1$ has a link with the process of identifier $pid_2$ with the tuple (link, $pid_2$, $pid_1$). On the other hand, monitor links are asymmetric: one process is monitoring the other one. When the monitored process dies, it sends to the monitoring one a signal that is transformed into a message. This message can then be handled with a receive expression. A pair of processes can have multiple monitor links, hence we identify them through a unique reference in the set $Ref_L$. When a process of identifier $pid_2$ is monitored by a process of identifier $pid_1$ through a monitor link of identifier lid, it stores the tuple (monitored_by, $pid_1$, lid), and the monitoring process stores (monitoring, $pid_2$, lid).
The state of the links of a process with other processes is a set of such tuples, written $S_L$. The state of a process $\pi_1$, of identifier $pid_1$, is thus described as the tuple $S_p = (pid_1, S_e, S_{sigsent}, S_L)$. The state of a system is the set of the states of the processes constituting it.
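The link tuples and the process state can be pictured with the following sketch. It only shows the entries stored once a link is fully established on both sides; in the semantics this happens asynchronously through signals. The names ProcessState, record_link and record_monitor are hypothetical, and monitor link identifiers are plain strings here rather than #Ref terms.

```python
from dataclasses import dataclass, field
from typing import Any, List, Set, Tuple


@dataclass
class ProcessState:
    """Hypothetical encoding of S_p = (pid, S_e, S_sigsent, S_L)."""
    pid: int
    task: Any = None
    outbox: List[Any] = field(default_factory=list)
    links: Set[Tuple] = field(default_factory=set)


def record_link(p1, p2):
    # Each side stores (link, other pid, own pid); a set keeps at most one link per pair.
    p1.links.add(("link", p2.pid, p1.pid))
    p2.links.add(("link", p1.pid, p2.pid))


def record_monitor(monitoring, monitored, lid):
    # Asymmetric: the monitored side stores monitored_by, the other side monitoring.
    monitored.links.add(("monitored_by", monitoring.pid, lid))
    monitoring.links.add(("monitoring", monitored.pid, lid))


a, b = ProcessState(pid=1), ProcessState(pid=2)
record_link(a, b)
record_link(a, b)               # linking twice changes nothing
record_monitor(a, b, "ref1")
record_monitor(a, b, "ref2")    # several monitor links may coexist
```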

Syntax
We are using a subset of Core Erlang described in Figure 1. It has one major difference with the Core Erlang code obtained from Erlang code with the official compiler: the receive instruction no longer exists since the OTP 23 update and is replaced with the code presented in the appendix. Replacing this code with the old receive during parsing allows us to keep receive as a shorter and equivalent expression.
We added two elements to Val: EoP signals that the process is terminating, and ended signals a terminated one.

Semantics

Transition Relations
A transition relation $\xrightarrow{term}$ models small steps of execution. This relation is defined by means of inference rules. The evaluation of expressions is performed step-wise and bottom-up while respecting the policy about evaluation order (see the section on evaluation order). Yet, the evaluation of sub-expressions offers fewer capabilities than the evaluation of expressions at the top level. For instance, when a process terminates, an expression EoP is generated (see for instance Fig. 2), propagated to the top level of the expression (the rule is omitted), and only then are signals sent to warn the other processes (see Fig. 3). No signal can be emitted as long as the EoP expression has not reached the top level. This ensures that these signals are sent only once.
For this purpose, an auxiliary transition relation $\xrightarrow{aux}$ is used. Auxiliary transitions are not proper transitions, but they may occur in the inference proofs of the proper transitions. They describe only what can be done when evaluating sub-expressions. They can then be propagated by means of an evaluation context (see the section on evaluation order) and promoted as proper transitions $\xrightarrow{term}$ (see Fig. 4).

Pattern Matching
Numerous functionalities are based on pattern matching, represented here as the function match. Upon success, pattern matching binds new variables by unification. This is the purpose of the disjoint union operator $\uplus$, which takes two arguments $\theta_1$ and $\theta_2$ that are each either a map or the symbol no_match. $\theta_1 \uplus \theta_2$ is defined as and as no_match otherwise.
The function match takes four arguments, an expression e, a pattern p, a map θ and a functions table τ, and returns either a map or the symbol no_match. The function match(e, p, θ, τ) is defined as:
• if $p \in Fun$ with arity $n_p$ and name $Fname_p$: if p is bound, match(e, p, θ, τ) = match(e, θ(p), θ, τ)
• if p and v are respectively of the form $\{p_0, \ldots, p_n\}$ and $\{v_0, \ldots, v_n\}$, or $[p_0, \ldots, p_n]$ and $[v_0, \ldots, v_n]$, or $<p_0, \ldots, p_n>$ and $<v_0, \ldots, v_n>$, with $v_i \in Val$ and
• if p is of the form $v_{alias} = p_0$ with $v_{alias} \in var$ and $p_0 \in Pat$, $match(v, <v_{alias} = p_0>, \theta, \tau) = match(v, p_0, \theta, \tau) \uplus match(v_{alias}, v, \theta, \tau)$
• for all other cases, match(v, p, θ, τ) = no_match.
Notice that when a function is defined, Erlang generates a name for it if needed. Furthermore, in the case where e is an unbound variable, match should return a special value treated as ∅ and generating a warning, but as we are not modeling this behavior, we simply return ∅.
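To give an intuition of how matching and the union of bindings interact, here is a simplified Python sketch. It is not the definition used in the semantics: the functions table τ and function patterns are left out, the encoding of patterns as tagged tuples is hypothetical, and the behavior of union on variables bound on both sides (they must agree) is an assumption of this sketch.

```python
NO_MATCH = "no_match"


def union(t1, t2):
    """Combine two binding maps; fail if either failed or if they disagree (assumption)."""
    if t1 == NO_MATCH or t2 == NO_MATCH:
        return NO_MATCH
    for name in t1.keys() & t2.keys():
        if t1[name] != t2[name]:
            return NO_MATCH
    return {**t1, **t2}


def match(v, p, theta):
    """Match value v against pattern p under bindings theta.

    Hypothetical encoding: ("var", name), ("alias", name, pattern),
    ("tuple", [items]); anything else is a constant.
    """
    if isinstance(p, tuple) and p and p[0] == "var":
        name = p[1]
        if name in theta:                      # bound variable: match against its value
            return match(v, theta[name], theta)
        return {name: v}                       # unbound variable: new binding
    if isinstance(p, tuple) and p and p[0] == "alias":
        _, name, sub = p
        return union(match(v, sub, theta), match(v, ("var", name), theta))
    if isinstance(p, tuple) and p and p[0] == "tuple":
        if not (isinstance(v, tuple) and v and v[0] == "tuple" and len(v[1]) == len(p[1])):
            return NO_MATCH
        result = {}
        for vi, pi in zip(v[1], p[1]):         # component-wise, combined with union
            result = union(result, match(vi, pi, theta))
        return result
    return {} if v == p else NO_MATCH          # constants must be equal
```

For instance, matching the tuple {1, 2} against {X, X} fails in this sketch because the two components would bind X to different values, which is exactly the case the union has to reject.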

Variables and End Reason
Variable assignments are done through unification by means of the function match. When a correspondence can be made, the context θ is updated accordingly. Variables are local and they cannot be reassigned before exiting their scope [ERM, expressions, variables]. When the unification fails, the process ends while explaining the reason for the failure, unless the process was already terminated for a previous reason. This is the purpose of the function $end\_reason(r, r_{new})$, which outputs the reason r whenever it is not equal to ⊥, and the reason $r_{new}$ otherwise. Such behavior occurs in local bindings and switch cases. For example, see Fig. 2, where match_cls is a function looking for the first clause, in a list of clauses, which matches a given pattern. It then returns its expression and the unification context. Otherwise, it returns no_match.
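The function end_reason described above is small enough to be sketched directly. In this Python illustration, None plays the role of ⊥, and the reason strings are example values only.

```python
BOTTOM = None  # stands for the undefined reason (⊥) in this sketch


def end_reason(r, r_new):
    """Keep an already recorded reason r; otherwise adopt the new one."""
    return r if r is not BOTTOM else r_new


first = end_reason(BOTTOM, "badmatch")   # no previous reason: the new one is kept
second = end_reason(first, "other")      # a previous reason exists: it wins
```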
A variable x already associated with a value is simply read from θ(x). This rule is omitted for the sake of brevity.

Calling a Function
Calls to external and built-in functions are idealized. Their code is replaced by black boxes which abstract their behavior. An example of the call of a built-in function is given in the appendix.
The instructions apply and call are used to call the functions that are defined in the program, respectively in the current module or in an arbitrary module. For instance, see the rule Call in Fig. 5, where bodyFunc(Mname, Fname, n) takes three arguments, a value Mname, a value Fname and an integer n, and returns an expression, which is the body of the function named Fname, of arity n, coming from the module named Mname.
The rules for external functions and for the instruction apply work similarly; they are omitted for the sake of brevity.

Evaluation Order
The evaluation order of expressions is specified as follows.
For sequential composition do $e_1$ $e_2$, the expression $e_1$ is evaluated before the expression $e_2$. As for the evaluation of subexpressions, we follow the documentation rather than the current implementation, unlike other semantics such as [14]. Indeed, the documentation insists that the evaluation order of the arguments of a function is not deterministic [ERM, expressions, expressionevaluation]; historically, it has already changed once [12]; and further changes are possible.
The evaluation order is formalized by means of evaluation contexts. An evaluation context is an expression with a hole '•'. Intuitively, the hole contains a subexpression that can be evaluated, whereas the rest of the context remains as it is until the sub-expression is fully evaluated. Evaluation contexts are defined by the following grammar: where $es_{1\bullet}$ denotes a list of expressions, excluding EoP, separated with commas, where exactly one expression is replaced with a hole '•' at an arbitrary position. This models the cases where the evaluation order is non-deterministic, as in the apply expression, or imposed, as in the do expression.
Applying an evaluation context c[•] to a subexpression e consists in replacing the unique occurrence of the hole '•' in the evaluation context c[•] with the expression e.This is denoted as c[e].
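The idea of a context with a hole can be sketched as a function that plugs an expression back into its surroundings. The Python illustration below is only a sketch under assumptions: expressions that still need evaluation are encoded as tuples, everything else counts as a value, and the hole is only placed on expressions that still need evaluation, which is a simplification of this sketch.

```python
def is_value(e):
    # Assumption of this sketch: unevaluated expressions are tuples.
    return not isinstance(e, tuple)


def arg_contexts(args):
    """All (context, subexpression) choices for an argument list.

    Each context is a function plugging a new expression into the hole.
    Several choices model a non-deterministic evaluation order.
    """
    choices = []
    for i, e in enumerate(args):
        if not is_value(e):
            def plug(new, i=i):
                return args[:i] + [new] + args[i + 1:]
            choices.append((plug, e))
    return choices


def do_context(e1, e2):
    """Single choice for a sequence: the hole is always on the first expression."""
    def plug(new):
        return ("do", new, e2)
    return (plug, e1)


args = [1, ("call", "f"), ("call", "g")]
choices = arg_contexts(args)          # two possible holes: f or g may step first
plug_g, sub_g = choices[1]
after_g = plug_g(7)                   # g evaluated to 7 before f
```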
The rule for the evaluation of subexpressions is then as shown in Fig. 6: $Subexpr_{ok}$ if the evaluation goes on without terminating the process, and $Subexpr_{ko}$ otherwise, which follows the same principle as $case_{ko}$. The rule for sequential composition is omitted for the sake of brevity.

Signals
As said in the documentation, "signals are received asynchronously and automatically" [ERM, processes, receiving-signals].The only guarantee about their reception order is "if an entity sends multiple signals to the same destination entity, the order is preserved" [ERM, processes, signaldelivery], and the documentation emphasizes the importance of not using any other kind of order determinism based on the current implementation, as it might change in the future [ERTS, Communication, implementation].
For the sake of brevity, we will designate the first received signal from a process j, given an outbox $sig_{sent}$, as $first\_sig(sig_{sent}, j)$.
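The ordering guarantee quoted above leaves many reception orders possible: signals of one sender keep their relative order, but signals of different senders may interleave freely. The following Python sketch enumerates these orders for two senders; the function name interleavings and the signal labels are purely illustrative.

```python
def interleavings(a, b):
    """All merges of lists a and b that preserve the internal order of each list."""
    if not a:
        return [list(b)]
    if not b:
        return [list(a)]
    first_from_a = [[a[0]] + rest for rest in interleavings(a[1:], b)]
    first_from_b = [[b[0]] + rest for rest in interleavings(a, b[1:])]
    return first_from_a + first_from_b


# Two signals sent by one process and one signal sent by another, to the same recipient.
orders = interleavings(["a1", "a2"], ["b1"])
```

Every resulting order keeps a1 before a2, which is the only guarantee a causal analysis of message order can rely on.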

Sending and Receiving Messages
In our model, processes send messages asynchronously, and never fail in doing so, as described in rule Send in Fig. 7.
Our receive instruction is simplified in that, if no received message matches its clauses, it waits instead of starting a timer. As for the signals, the first received message from a process j is written $first\_msg(sig_{sent}, j)$. A receive instruction, when the message comes from another process, is then evaluated as in the rule $rcv\_ij_{ok}$ in Fig. 7.
Here, the first message matching with a clause is consumed and the corresponding expression is evaluated in the updated context.
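The consumption of a message can be sketched as a scan of the sender's outbox. This Python illustration rests on several assumptions: signals are triples (tag, recipient, content), clauses are pairs of a hypothetical acceptance test and a body rather than Core Erlang patterns, and returning None stands for the receiver waiting.

```python
def receive(outbox_j, i, clauses):
    """Process i tries to consume a message from the outbox of process j.

    Returns (result of the selected clause body, remaining outbox),
    or None when no message addressed to i matches any clause.
    """
    for idx, (tag, recipient, content) in enumerate(outbox_j):
        if tag != "msg" or recipient != i:
            continue                      # other signals and other recipients are skipped
        for accepts, body in clauses:
            if accepts(content):
                remaining = outbox_j[:idx] + outbox_j[idx + 1:]
                return body(content), remaining
    return None                           # nothing matches: the receiver keeps waiting


outbox_2 = [
    ("not_msg", 1, ("link", 2, 1)),
    ("msg", 1, "ping"),
    ("msg", 3, "stop"),
    ("msg", 1, "stop"),
]
clauses = [(lambda m: m == "stop", lambda m: "halting")]
result = receive(outbox_2, 1, clauses)
```

In this example the message "ping" does not match the only clause, so it stays in the outbox while the later "stop" addressed to process 1 is consumed.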

Spawning a Process
Our spawn function, described in Fig. 8, is greatly simplified: we only model the case where it always succeeds, and we omit its signal exchange protocol. However, since most of the mechanisms needed to describe it properly are already present here, this might be the object of future work. It is based on fresh_pid, a function returning an identifier not already used by one of the processes of the system.
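A minimal sketch of this simplified spawn is given below. It assumes that a system is a dictionary from integer pids to process states and that a fresh pid is obtained by taking the successor of the largest pid in use; both choices, like the field names, are assumptions of the illustration and not part of the semantics.

```python
def fresh_pid(system):
    """Return a pid not used by any process of the system (integer pids assumed)."""
    return max(system, default=-1) + 1


def spawn(system, body):
    """Always succeeds: returns the new pid and the extended system."""
    child = fresh_pid(system)
    new_system = dict(system)                       # the original system is left untouched
    new_system[child] = {"expr": body, "outbox": [], "links": set()}
    return child, new_system


system = {0: {"expr": ("apply", "main"), "outbox": [], "links": set()}}
child, system_after = spawn(system, ("apply", "loop"))
```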

Process Monitoring
The creation of a link between processes is asynchronous. The initiating process creates it locally and sends a signal to the other process. Once this signal has been processed, the other side of the link is created. In the case of a link, when such a link already exists, nothing is done and the evaluation continues. In the case of a monitor link, a new identifier is allocated, as shown in Fig. 9. The sent signal is then handled. In the case of a link, if the linked process does not exist and when it is "cheap" to know it, instead of executing the operation asynchronously, the calling process directly ends [ERTS, erlang, link-1], which changes the emitted signals. As the documentation suggests that the definition of "cheap" can evolve, we left the function is_cheap parametric (cf. Fig. 10). When the verification is costly, the link is created, and an exit signal with noproc as a reason is sent. In the current implementation of Erlang, a hidden managing process sends the message, but this is an implementation choice. We thus model this behavior by storing the signal in the outbox of the calling process.
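The cases described above for the creation of a link can be summarized by the following Python sketch. It is an illustration under assumptions, not the rule of Fig. 10: a system is a dictionary from pids to states, a process that does not exist is simply absent from it, and the exact content of the noproc exit signal (in particular the order of its fields) is a guess based on the shape of down signals.

```python
def link(system, caller, target, is_cheap):
    """Caller asks for a link with target; returns "continue" or "end"."""
    state = system[caller]
    entry = ("link", target, caller)
    if entry in state["links"]:
        return "continue"                           # link already exists: nothing to do
    if target not in system and is_cheap(target):
        state["reason"] = "noproc"                  # cheap check: the caller ends directly
        return "end"
    state["links"].add(entry)                       # local side of the link
    if target in system:
        state["outbox"].append(("not_msg", target, ("link", caller, target)))
    else:
        # Costly check: the exit signal is stored in the caller's own outbox
        # (field layout of the content is an assumption of this sketch).
        state["outbox"].append(("not_msg", caller, ("down", "link", target, caller, "noproc")))
    return "continue"


def new_state():
    return {"links": set(), "outbox": [], "reason": None}


system = {1: new_state(), 2: new_state()}
outcome_ok = link(system, 1, 2, is_cheap=lambda pid: True)
outcome_costly = link(system, 1, 9, is_cheap=lambda pid: False)
outcome_cheap = link(system, 2, 9, is_cheap=lambda pid: True)
```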
The destruction of such links is asynchronous, with the guarantee that, once their destruction is completed, they no longer have any effect [ERTS, erlang, unlink-1]. The call fails, ending the calling process, when the argument is not of the expected type: Ref for demonitor, Pid for unlink. As these mechanisms are similar, we omit their definitions in this paper.

Exit Signals
In our model, exit signals are flagged with the type of link which caused their sending. If the link does not exist in the receiving process, the signal is simply ignored. Otherwise, when it is a link, the process ends too, as shown in the rule sig_exit_link in Fig. 11, and when it is a monitor link, the signal is deleted and a message is created, as described in the rule sig_exit_monitored in Fig. 11, in which process is a flag indicating that the exit signal comes from a monitor link established between two processes.
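The three outcomes described above can be sketched as a small decision function. This Python illustration assumes that the content of an exit signal is (down, fl, pid_ended, lid, r), that the receiver's links are stored as the tuples introduced earlier, and that the generated message follows the shape of Erlang's 'DOWN' messages; the returned action names are hypothetical.

```python
def handle_exit(receiver_links, content):
    """Decide what a process does with a received exit signal."""
    tag, flag, ended, lid, reason = content
    if tag != "down":
        raise ValueError("not an exit signal")
    if flag == "link" and ("link", ended, lid) in receiver_links:
        return ("end", reason)                                  # linked process ends too
    if flag == "monitored_by" and ("monitoring", ended, lid) in receiver_links:
        # The signal is dropped and turned into a message for a later receive.
        return ("message", ("DOWN", lid, "process", ended, reason))
    return ("ignore", None)                                     # unknown link: signal ignored


links_of_1 = {("link", 2, 1), ("monitoring", 3, "ref1")}
on_link = handle_exit(links_of_1, ("down", "link", 2, 1, "crash"))
on_monitor = handle_exit(links_of_1, ("down", "monitored_by", 3, "ref1", "normal"))
on_unknown = handle_exit(links_of_1, ("down", "link", 9, 1, "crash"))
```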

Discussion
The semantics we proposed here contrasts with previous small-step semantics in two ways. First, it describes the Core Erlang monitoring and signal system more precisely. Second, it is based on the official documentation, rather than the current implementation. We tried to develop a semantics that is close enough to previous formalizations to make it easier to establish links between them, which we hope will come in handy for the next part of our work.
We are currently working on an implementation of this semantics in Maude [8], which will be available, with an updated and further documented semantics, at [7].Our next step will be to leverage abstract interpretation in order to generate causal explanations of an observed faulty behavior.Although not based on exactly the same language as our work, [9] and [13] seem to be a good starting point.

C Failing subexpression
If the evaluation of a subexpression fails, it stops the execution of its process and updates r, as seen in Fig. 13. The evaluation c[e] of an expression e under an evaluation context c is defined inductively over the syntax of the evaluation context c as follows:

D Call of a built-in function
The call of a built-in function is described by the rule CallBIM in Fig. 12, where evalResFunc returns the result of such a black-box evaluation in the present context.

F Recursive call and other rules
As we focused on presenting the signal handling and monitoring aspects of our semantics, for the sake of brevity, we did not include our whole semantics. This explains why, in our model, some terms are never modified by the rules of the paper. This is for example the case of the functions table τ, which is only modified by the letrec expression (see Fig. 14). We are currently working on an implementation of this semantics in Maude, which we will make public, along with the complete semantics [7].

Figure 1: Our subset of Core Erlang syntax (based on [14], expanded with [6]). When a category is followed by a string in parentheses, this string denotes an element of this category in the rest of the paper.

Figure 3: End of a process. The function permutations returns the set of permutations of its argument converted into a list.

Figure 9: Birth of a monitor link.