3 Pattern matching in join definitions
3.1 Principle
Consider the following join definition:
let A(n) | B() = P(n)
and A(n) | C() = Q(n)
;;
This defines three names A, B and C. Name A has arity one,
whereas names B and C have arity zero. Names may be synchronous
or asynchronous, depending on whether there are
reply… to…
constructs applying to them inside the guarded processes P(n)
and Q(n) or not.
According to the general join-calculus semantics, the guarded process
P(n) may be fired whenever there are some messages pending on A and
B. Similarly, Q(n) may be fired whenever there are some messages
pending on A and C. In both cases, the formal parameter n is
replaced by (or bound to in the implementation) one of the messages
pending on A.
Reactivity information is to be considered at the definition level,
since matching is indeed performed at this level. Moreover, in order
to use finite-state automata, we want this information to range on a
finite set of possible values. As far as matching is concerned and by
the linearity of patterns, only the presence or absence of messages
matters. Thus, let us call 0 the status of a name without
any message pending, and N the status of a name with at least
one message pending. Then, the status of a definition is a tuple
consisting of the statuses of the names it defines, once some
arbitrary order of these names has been adopted.
For instance, if some messages are pending on B and C, whereas
none is pending on A, then the status of the A, B, C
definition is a three-tuple written 0NN.
A matching status is defined as a status that holds enough
N, so that at least one guarded process can be fired.
Definition status evolves towards matching status as messages arrive.
This yields a first kind of “increasing” transitions. More
specifically, when a message arrives on some name, then this name
status either evolves from 0 to N or remains
N. Definition status evolves accordingly. In the A, B,
C case we get the following transitions. (In this diagram,
transitions are labeled by the name that gets a new message and
matching statuses are filled in gray.)
Definition status also evolves when matching occurs. This yields
new, “decreasing”, transitions that we call matching transitions.
According to the join-calculus semantics,
matching may occur at any moment (provided of course that
matching is possible).
As a consequence, matching transitions start from matching states and
they are unlabeled.
In the A, B, C case, they are as follows:
Observe that there may be several matching transitions starting from a
given status. This is not always a consequence of the
non-deterministic semantics of the join-calculus.
Often, ambiguity is only apparent. For instance, matching transitions
starting from NN0 lead to NN0, N00,
0N0 and 000. When such a matching occurs, two
messages are consumed (one pending on A and one pending on B)
then, depending on whether there are some messages left pending on A
and B or not, status evolves to NN0, N00,
0N0 or 000. From the implementation point of view,
this means that a little runtime testing is required once matching has
been performed. Here, we pay a price for using finite-state automata.
However, some true non-determinism is still present. Consider
status NNN for instance. Then, both guarded processes of the
A, B, C definition can now be fired. The choice of firing
either P(n) or Q(n) will result in either consuming one message
pending on A and one on B, or consuming one message pending on A
and one on C.
Finally, a view of join-matching compilation can be given by taking
together both kinds of transitions. This yields a non-deterministic
automaton.
Note that matching of non-linear patterns can also be compiled using automata.
For instance, if a name appears at most twice in one or more pattern,
then it status will range on 0, 1 and N.
We do not present this extension in greater detail, for simplicity, and
because we do not implement non-linear patterns.
3.2 Towards deterministic automata
For efficiency and simplicity reasons we choose to implement matching
using deterministic automata that react to message reception.
Fortunately, it is quite possible to do so. It suffices to perform
matching as soon as possible. More precisely, when a message arrives
and carries definition status into matching status, matching is
performed immediately, while definition status is adjusted to reflect
message consumption. This results in pruning the status space just
below matching statuses.
In practise, in the A, B, C case, we get the
automaton of figure 1.
Figure 1:
Automaton in the
A,
B,
C case
Observe that all transitions are now labeled and that a name labels a
transition when message reception on this name triggers that
transition. Furthermore, matching transitions that correspond to
firing P(n) or firing Q(n) are now represented differently (the
former by a dotted arrow, the latter by a dashed arrow). This
highlights the difference between false and true non-deterministic
transitions: real non-determinism is present when there are both
dotted and dashed edges with the same label starting from the same
node.
For instance, there are two B-labeled dotted transitions starting
from N00. Non-determinism is only apparent here, since
P(n) is fired in both cases and that the selected transition depends
only on whether there is at least one message left pending on A
or not after firing P(n).
By contrast, from status 0NN, the automaton may react to the
arrival of a message on A in two truly different manners, by firing
either P(n) or Q(n). This is clearly shown in figure 1 by
the A-labeled edges that start from status 0NN, some of
them being dashed and the others being dotted. A simple way to avoid
such a non-deterministic choice at run-time is to perform it at
compile-time. That is, here, we suppress either dotted or dashed
A-labeled transitions that start from 0NN.
In the rest of the paper, we take automata such as the one of
figure 1 as suitable abstractions
of join-pattern compilation output.
3.3 Automata and semantics
Both the “match as soon as possible” behavior and the deletion of
some matching transitions have a price in terms of semantics.
More precisely, some CHAM behaviors now just cannot be observed
anymore.
However,
the CHAM semantics is a non-deterministic one: an initial
configuration of the CHAM may evolve into a variety of configurations.
Furthermore, there is no fairness constraint of any kind and
no particular event is required to occur.
As an example of the consequence of the “match as soon as possible”
behavior, consider this definition:
let A() = print(1);
and B() = print(2);
and A() | B() = print(3);
;;
Then, we get the following automaton:
Status NN that is preceded by the two matching statuses
0N and N0 cannot be reached. As a consequence, the
above program will never print a 3, no matter how many messages
are sent on A and B.
Next, to illustrate the effect of deleting ambiguous matching
transitions, consider the following definition:
let A() = print(1);
and A() | B() = print(2);
Such a definition will get compiled into one of the following
deterministic automata:
In the case of the left automaton, only 1 will ever get printed.
In the case of the right
automaton, 2 will be printed when some messages arrives on B
and then on A.
Both automata lead to correct implementations of the semantics. However
the second automata looks a better choice than the first one, since it
yields more program behaviors.