ISSN 0249­6399 ISRN INRIA/RR­­4394­­FR+ENG apport de recherche THÈME 1 INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 James J. Leifer N° 4394 March 2002 Unité de recherche INRIA Rocquencourt Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France) Téléphone : +33 1 39 63 55 11 --- Télécopie : +33 1 39 63 53 30 Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 James J. Leifer Th‘eme 1 --- R’eseaux et syst‘emes Projet Moscova Rapport de recherche n° 4394 --- March 2002 --- 54 pages Abstract: The dynamics of process calculi, e.g. CCS, have often been defined using a labelled transition system (LTS). More recently it has become common when defining dy­ namics to use reaction rules ---i.e. unlabelled transition rules--- together with a structural congruence. This form, which I call a reactive system, is highly expressive but is limited in an important way: LTSs lead more naturally to operational equivalences and preorders. This paper shows how to synthesise an LTS for a wide range of reactive systems. A label for an agent (process) `a' is defined to be any context `F' which intuitively is just large enough so that the agent `Fa' (`a' in context `F') is able to perform a reaction step. The key contribution of my work is the precise definition of ``just large enough'' in terms of the categorical notion of relative pushout (RPO). I then prove that several operational equivalences and preorders (strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder) are congruences when su#cient RPOs exist. Key­words: labelled transition, structural congruence, bisimulation, action calculi, pushout Synth‘ese de transitions ’etiquet’ees et de congruences op’erationnelles pour les syst‘emes r’eactifs, premi‘ere partie R’esum’e : La tradition veut que l'on d’efinisse la dynamique des calculs de processus tels que CCS ‘a l'aide d'un syst‘eme de transitions ’etiquet’ees (LTS). R’ecemment, il est devenu courant d'utiliser plut“ot des r‘egles de r’eaction (c'est­‘a­dire des r‘egles de transitions non ’etiquet’ees) associ’ees ‘a une congruence structurelle. Cette pr’esentation, que j'appelle syst‘eme r’eactif, est tr‘es expressive, mais sou#re d'une limitation importante : les LTS permettent de d’efinir de mani‘ere plus naturelle des ’equivalences et pr’eordres op’erationnels. Cet article montre comment synth’etiser des LTS pour de larges classes de syst‘emes r’eactifs. On prend comme ’etiquette d'un agent (processus) `a' n'importe quel contexte `F' qui, intuitivement, est tout juste assez large pour que l'agent `Fa' (`a' sous le contexte `F') puisse e#ectuer une ’etape de r’eaction. Ma principale contribution est une d’efinition pr’ecise de ``juste assez large'' ‘a l'aide de la notion cat’egorique de somme amalgam’ee relative (``relative pushout'', RPO). Je prouve ensuite que plusieurs ’equivalences et pr’eordres op’erationnels (bisimulation forte, bisimulation faible, pr’eordre des traces, pr’eordre des ’echecs) sont des congruences lorsqu'il existe assez de RPO. Mots­cl’es : transition ’etiquet’ee, congruence structurelle, bisimulation, action calculi, somme amalgam’ee Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 1 Contents 1 Introduction 2 1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Historical background and motivation . . . . . . . . . . . . . . . . . . . . . 3 1.3 Contexts as labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.4 Relative pushouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.5 Other work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1.6 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 1.7 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2 Operational congruences for reactive systems 18 2.1 Formalisation of reactive systems . . . . . . . . . . . . . . . . . . . . . . . . 18 2.2 Categorical basis for contextual labels . . . . . . . . . . . . . . . . . . . . . 21 2.3 Labelled transitions and congruence for strong bisimulation . . . . . . . . . 25 3 Further congruence results 29 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.2 Functorial reactive systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 3.3 Cutting and pasting portable IPO squares . . . . . . . . . . . . . . . . . . . 32 3.4 Strong bisimulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.5 Weak bisimulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 3.6 Traces preorder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 3.7 Failures preorder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.8 Multi­hole contexts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 4 Conclusions 48 Bibliography 50 RR n° 4394 2 J.J. Leifer 1 Introduction 1.1 Overview This paper is concerned with process calculi, which are mathematical models of computa­ tion. Process calculi come in many forms, but share several common features. Much as the #­calculus isolated the canonical features of sequential computation, process calculi do the same for concurrent phenomena, such as non­determinism, synchronisation, and com­ munication. By concentrating on a few core syntactic primitives, process calculi provide a setting in which to reason about these phenomena without the added baggage associated with a full programming language. There are three critical ingredients beyond syntax that a process calculus may possess. I summarise these now and discuss each in greater detail later. The first consists of unlabelled transitions, or reactions as I call them in this paper, which characterise only the internal state changes of an agent (a process). Reactions do not require interaction with the surrounding environment. The second consists of labelled transitions, which characterise the state changes that an agent may undergo in concert with the environment. Each transition is associated with a label, which records the interaction with the environment (for example an output on a channel) that enables the transition. A label is thus an observation about the behaviour of an agent. Normally, the reactions of an agent are special labelled transitions for which the interaction with the environment is vacuous (so­called #­transitions). The third consists of operational preorders and equivalences, which characterise when one computation is respectively behaviourally replaceable by, and behaviourally equal to, another. An equivalence is most useful when it is a congruence, i.e. is preserved by the syntactic constructions of the calculus; in this case, proving the equivalence of two large agents can be reduced to proving the equivalence of components. Thus congruence is one of the most desirable properties an equivalence may possess. These remarks apply as well to operational preorders. As I will describe later, much research has concentrated on finding general definitions of operational preorders and equivalences in terms of labelled transitions. For any specific process calculus, it is then a challenge to prove that the preorder or equivalence in question is a congruence. But what happens when there are no labelled transitions, only reactions? The latter situation has become common since reactions provide a clean way of specifying allowable state changes without commitment to particular observables. Reactions are highly expressive but limited in an important way: they do not lead as naturally as labelled transitions do to congruential operational equivalences and preorders. The central problem addressed by this paper is as follows: From a process calculus with only a reaction relation ---called a reactive system--- can we synthesise a tractable INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 3 labelled transition relation? By tractable, I mean two things: the labels come from some small set; and, the labelled transitions readily yield operational preorders and equivalences that are congruences. My approach is to take a label to be a context (an agent with a hole in it). For any agent a, a label is any context F which intuitively is just large enough so that the agent Fa (``a in context F '') is able to perform a reaction. The key contribution of this work is to make precise the definition of ``just large enough'' in terms of the categorical notion of relative pushout (RPO) and to show that several operational equivalences and preorders, namely strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder, are congruences when su#cient RPOs exist. By showing that the RPO property is a su#cient condition for congruence, this paper brings together two of the main themes of concurrency semantics: the failures preorders of Hoare and Roscoe (developed originally for CSP) and the bisimulation equivalences of Milner and Park (developed originally for CCS). The former are model­theoretic, com­ paring programs as points in a complete partial order; the latter are operationally­based, comparing programs coinductively. It is striking, therefore, that the same RPO hypothesis serves for my congruence proofs of both. This paper is the first part of a series of two. The companion paper [Lei02], referred to as Part 2 throughout, proves that a general class of functors does indeed satisfy the axioms of a functorial reactive system (an elaboration of the notion of reactive system referred to above). Moreover, it shows that a significant example of such, namely a simplified version of Milner's action calculi, has su#cient RPOs, as required by all the congruence proofs. In the rest of this section, I discuss some of the history of operational congruences and o#er some background as to why reactions have become prominent. I then sketch the solution strategy presented in this paper. I relate my approach to work by other researchers and to material I have jointly published. I conclude with an overview of the structure of the sections. 1.2 Historical background and motivation It may seem strange at first to investigate the ``derivation of operational congruences'' since congruence is usually a postulated notion in mathematics: to define a model, one chooses which elements are equivalent. But choosing has become di#cult for many models of computation, creating a gap between the properties such as reaction that are easy to postulate and those such as equivalence that are useful to have. To illustrate how this gap opened, let us look at some strands in the history of process calculi. Starting with the seminal work by E.F. Moore [Moo56] which examined ``finite automata from the experimental point of view'', theoretical computer scientists have pur­ sued the notion that the observable behaviour of a computation is more fundamental than RR n° 4394 4 J.J. Leifer the form in which the computation is written; in this view, two programs are equivalent when they are indistinguishable by an outside observer. In the case of automata, an ob­ servation consists of a labelled transition of the form a x a # : automaton a can input the label x and become a # . (This is a slight simplification of Moore's original notion, which distinguished input from output labels.) One automaton refines another if the former has a subset of the labelled transitions of the latter, whether or not the two have di#erent syntactic forms. For example, a “ = x.(y+z) and b “ = x.y + x.z both can input the same strings of labels, ##, #x#, #xy#, #xz#, so are equal (refine each other) despite their syntactic di#erences. This notion of comparing computations based on their labelled transitions is now known as the traces preorder ; the interpretation of computations in terms of their traces is the traces model (see [Ros98]). The traces preorder is attractive for its simplicity, but is limited. Some of its inade­ quacies with respect to particular applications were overcome in the late 1970s by Hoare and Milner in ways that I describe below. The primary weakness of the traces preorder is its inability to handle the subtleties of non­determinism. This is evident if we consider the pair a, b defined earlier. After inputting x, a is in a state that can input either y or z; but b takes a ``silent'' choice (an internal choice not controllable by external influence) when inputting x and enters one of two states: either a state in which only y can be input or a state in which only z can be input. Even though a and b have identical traces, their non­deterministic behaviour is di#erent. The problem of handling non­determinism together with causality (the dependence of one transition upon an earlier one) was first addressed by Petri nets (see, for example, [Rei85]). Researchers on Petri nets did not originally consider equivalences or preorders for them. However, event structures, which are similar to traces but account for the causality and non­determinism of events, are a setting for modelling Petri nets and other causal systems and yield notions of equivalence [WN95]. Because Petri nets do not have a compositional syntax, it is di#cult to understand what it means for an equivalence to be a congruence. For this reason, I do not discuss them further and confine my attention to process calculi with compositional syntax. Di#erent applications for process calculi have di#erent informal requirements for when the patterns of labelled transitions for two agents are the same (in the case of equiva­ lences) or refine one another (in the case of preorders). No one definition can be faithful to all possible requirements. Given that, we can consider schemas parameterised by a labelled transition relation for defining di#erent equivalences and preorders --- and thus take labelled transitions to be the postulated part of a process calculus from which equiv­ alences and preorders are derived. This di#erence lends labelled transitions their power: a labelled transition relation captures the important interactions between a computation and its environment without making a commitment to a specific equivalence or preorder. INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 5 As I will discuss later, many new process calculi for modelling distributed computation do not include axioms for labelled transitions but instead rest on an even simpler base. Hoare's work on Communicating Sequential Processes (CSP) [Hoa78, Hoa85] and Mil­ ner's on Communicating Concurrent Systems (CCS) [Mil80, Mil88] addressed the problem of comparing computations in a way that is sensitive to non­determinism, and hence dis­ tinguishes a from b (the examples I gave earlier). Hoare's approach in CSP was to present every agent as an explicit set­theoretic con­ struction including, amongst other data, failures; the latter are traces, each of which is enriched with a set of refusals indicating which labels may be refused by the agent after inputting the trace. For example, after inputing x, b can refuse y and can refuse z but a cannot refuse either. Each constructor in CSP (prefixing, parallel composition, choice, etc.) is defined in terms of the manipulation of failures (as a continuous function on the complete partial order of agents). The explicit set­theoretic representation of agents in CSP supports the design of model checkers [Ros94, Ros98] which are, for example, e#ec­ tive in detecting bugs in protocols [Low96]. I concentrate in this paper on the strand of research originating with CCS, but return to the failures preorder in Subsection 3.7 when proving that it is a congruence. In Milner's CCS, agents have a free syntax and the labelled transition relation is generated by inference rules in the style of Plotkin's structured operational semantics [Plo81], e.g. x.a x a a # a # a + b # a # Remark: Throughout this paper a, b, c, . . . are used to denote agents even when this con­ tradicts historical conventions. The labels (ranged over by # . . .) comprise input channels x, y, z, . . . and output channels • x, • y, • z, . . . and a special distinguished element # . This ter­ minology is somewhat misleading because no data is actually input or output; they are just complementary flavours. In the #­calculus, which I describe later, the di#erences are significant. The key idea in CCS is that of synchronisation. If one agent can output x and the other can input x then their parallel composition can synchronise on x. The result is a #­transition which records the synchronisation but leaves the name on which it occurs anonymous: a • x a # b x b # a | b # a # | b # Milner considered several equivalences for CCS; the main ones were strong and weak bisim­ ulation. Both kinds employ a coinductive form of definition which gives a powerful proof RR n° 4394 6 J.J. Leifer technique for comparing agents (an idea that originates with Park [Par81]) and provide a general way of defining an equivalence parameterised by a labelled transition system. Strong bisimulation relies on no assumptions about the underlying structure of the labels; weak bisimulation requires a distinguished #­transition. I will discuss both in detail later (Subsection 2.3 and Subsection 3.5) but summarise the two now. Both strong and weak bisimulation are sensitive to non­determinism and are able to distinguish the pair a, b described earlier in the discussion of the traces preorder. A strong bisimulation S is a relation on agents satisfying the following coinductive property: for all (a, b) # S and all labelled transitions a # a # , there exists b # such that b # b # and (a # , b # ) # S (and vice versa). Informally, ``if a and b are S­related then whatever a can do, b can do too, and they both end up in S­related states; likewise for whatever b can do''. The largest strong bisimulation relation (which is the union of all strong bisimulations) is denoted by #. The largest weak bisimulation, denoted by #, is coarser because it is flexible in allowing #­transitions to be collapsed and expanded when comparing two agents. For example, a agent that inputs x and then immediately outputs y would be related by weak bisimulation to one that inputs x, then has several #­steps (internal computations), and finally outputs y. Milner proved that both kinds of bisimulation are congruences (though not with respect to sum for weak bisimulation --- see Subsection 3.5 for further discussion). Such proofs require care in CCS and are more di#cult in later calculi. This is the price of avoiding CSP­ style explicit representations of agents and agent constructors. It is a central theme of this paper to provide mathematical tools for easing the burden of proving that operationally defined equivalences (such as bisimulation) are congruences. The idea of using a # ­like transition to record a primitive computational step plays a central role in later calculi. These transitions are variously called reaction rules, reduction rules, firing rules, etc.; I shall use reaction rules throughout. In CCS, the # transition relation requires the entire collection of labelled transition rules to generate it. A dramatic simplification was proposed in the Chemical Abstract Machine (CHAM) of Berry and Boudol [BB90, BB92] and used in work [Mil92] on the #­calculus of Milner, Parrow, and Walker [MPW89, MPW92]. These calculi were the first to employ a lightweight quotient of the agent terms, called a structural congruence, in order to make their reaction rules easy to define. I shall confine the discussion to the #­calculus. (CHAM treats the quotient more explicitly with ``heating and cooling'' rules, though the idea is similar.) Structural congruence is an equivalence relation # on agents that allows the parts of an agent to rearrange themselves, e.g. a | b # b | a a | (b | c) # (a | b) | c · · · The reaction relation , which characterises the primitive computational step (namely communication of a name over a channel), is simple to define; it is the smallest relation INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 7 satisfying the rule • x#y#.a | x(z).b a | {y/z}b that respects structural congruence a # a # a # b # b # # b a b and is closed under all non­guarding contexts C (e.g. c | -; see Subsection 2.1 for further discussion) a b C[a] C[b] . Thus the agent • x#y# | (c | x(z).b) has a reaction even though the parts needed to enable the reaction --- namely • x#y# and x(z).b --- are not adjacent: • x#y# | (c | x(z).b) # (•x#y# | x(z).b) | c {y/z}b | c . The ease with which reaction rules are defined in this style facilitated an outpouring of new process calculi for modelling encrypted communication [AG97], secure encapsulation [SV99], agent migration [CG98, Sew98, FGL + 96] and so on. Each isolates a computational phenomenon and presents it via a reaction rule together with a structural congruence over some syntax. Here are two examples of reaction rules: ˆ In Cardelli and Gordon's ambient calculus, one ambient may move inside another (like a packet through a firewall): y[in x.a | b] | x[c] x[y[a | b] | c] . ˆ In Sewell and Vitek's Box­# calculus, a message may move from inside to outside a wrapper and is decorated with the wrapper's name while doing so: y[•x # v | b] • x • y v | y[b] . In the #­calculus the communication of names along channels presented subtleties in the design of a labelled transitions system and of equivalences (of which several are now studied [SW01]); the proofs of congruence for these equivalences require care. For many of the newer calculi, such as those listed above, the problem of choosing appropriate labelled transitions and proving that bisimulation is a congruence is di#cult, and, in many cases, not attempted. In all these cases, experimentation is costly: a slight modification of the syntax or reaction rules often causes the labelled transition relation to change and breaks congruence proofs, forcing them to be reconstructed. RR n° 4394 8 J.J. Leifer So the gap first described at the beginning of this subsection has widened as the data defining a typical process calculus have changed. Labelled transitions are no longer postulated primitives of a calculus but instead are teased out from the four fundamental components: ˆ syntax (agents and agent contexts); ˆ structural congruence; ˆ set of reactive (non­guarding) contexts; ˆ reaction rules. I call a process calculus containing these components a reactive system. In this way, a reactive system closely resembles instances of the #­calculus [Bar84]. The latter consist of a simple syntax, a structural congruence based on #­conversion, a set of reactive contexts (known as ``evaluation contexts'' [FF86]) chosen to force strategies such as call­by­name or call­by­value, and a reaction rule based on #­reduction. There is however an important di#erence which renders the problem of finding useful equivalences for process calculi more di#cult: their reaction relations are usually neither confluent nor normalising. As a result, equivalences for #­calculi, such as those based on normal forms or on termination properties [Bar84, L’ev78], do not provide viable routes to follow. Therefore we need to consider equivalences based on bisimulation or other techniques that make no assumptions about confluence or normalisation properties. Yet bisimulation requires labelled transitions, which are not provided in a reactive system, and proofs of congruence, which can be di#cult and fragile, as already discussed. The next subsection outlines the strategy for synthesising labelled transitions which is the basis for the work in this paper. 1.3 Contexts as labels We wish to answer two questions about an arbitrary reactive system consisting of agents (whose syntax may be quotiented by a structural congruence) and a reaction relation (generated by reaction rules): 1. Can we synthesise a labelled transition relation # where # comes from a small set of labels that intuitively reflect how an agent interacts with its environment? 2. Under what general conditions is strong bisimulation (and, more generally, other preorders and equivalences) over # a congruence? We can begin to address question 1 by considering CCS. Let a, b range over agents, C, D,F range over contexts (agents with a hole), and x range over names. The usual labelled INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 9 transitions # for # ::= • x # # x # # # reflect an agent's capability to engage in some behaviour, e.g. • x.a | b has the labelled transition • x because • x.a | b can perform an output on x. However, if we shift our emphasis from characterising the capabilities of an agent to the contexts that enable the agent to react, then we gain an approximate answer to question 1 by choosing the contexts themselves as labels; namely we define a F a # i# Fa a # (1) for all contexts F . (We denote context composition and application by juxtaposition throughout. We regard a context with a trivial hole as an agent, so application is a special case of composition.) Instead of observing that • x.a | b ``can do an • x'' we might instead see that it ``interacts with an environment that o#ers to input on x, i.e. reacts when placed in the context - | x''. Thus, • x.a | b -|x a | b. The definition of labelled transition in (1) is attractive when applied to an arbitrary reactive system because it in no way depends upon the presence or absence of structural congruence. Furthermore, it is generated entirely from the reaction relation (ques­ tion 1); and, strong bisimulation over the synthesised labelled transition relation · is a congruence, (question 2). The proof of the latter is straightforward: let C be an arbitrary context and suppose a # b; we show that Ca # Cb. Suppose Ca F a # ; by definition, FCa a # , hence a FC a # . Since a # b, there exists b # such that b FC b # and a # # b # . Hence Cb F b # , as desired. The other direction follows by symmetry. Nonetheless, the definition in (1) is unsatisfactory: the label F comes from the set of all contexts --- not the ``small set'' asked for in question 1 --- thus making strong bisimulation proofs intolerably heavy. Also, the definition fails to capture its intended meaning, namely that a F a # holds when a requires the context F in order that a reaction is enabled in Fa: there is nothing about the reaction Fa a # that forces all of F --- or indeed any of F --- to be used. In particular, if a a # then for all contexts F that preserve reaction, Fa Fa # , hence a F Fa # ; thus a has many labelled transitions that reflect nothing about the interactions of a itself. Let us unpack (1) to understand in detail where it goes wrong. Consider an arbitrary reactive system equipped with a set Reacts of reaction rules; the reaction relation contains Reacts and is preserved by all contexts, as formalised by the following axiom and inference rule: l r if (l, r) # Reacts a a # Ca Ca # . Expanding (1) according to this definition of we have: a F a # i# Fa a # i# #(l, r) # Reacts, D. Fa = Dl & a # = Dr . (2) RR n° 4394 10 J.J. Leifer # # # # a l F D 1 The requirement Fa = Dl in (2) is rendered by a commuting square (as shown in Figure 1) in some category whose arrows are the agents and contexts of the reactive system. This requirement reveals the flaw described earlier: nothing in (2) forces F and D to be a ``small upper bound'' on a and l. The next subsection explores the challenges involved in making precise what ``small upper bound'' means. 1.4 Relative pushouts For several years I tried to make precise what ``small'' means. I was mired in detailed arguments about specific examples of contexts for which I would search for a ``dissection lemma'', having roughly the following form: given Fa = Dl, there exists a ``maximum'' C, such that for some F # and D # we have F # a = D # l, F = CF # and D = CD # . In other words, dissection peels o# as much as possible from the outside of F and D, decomposing Fa = Dl as CF # a = CD # l. If the dissection of Fa = Dl peels o# nothing, i.e. produces the decomposition id Fa = id Dl, where id is the identity context, then F and D are ``a small upper bound''. But the problem then remains: when is C the ``maximum'' possible context that can be peeled o#? In this subsection I look at some examples of dissection in order to illustrate just how subtle the problem is. These examples motivate a solution that I introduce here and present in detail later in this paper. For some particular classes of contexts, it is easy to understand what to do. Consider this example of two compositions Fa = Dl in a free term algebra: # ### # ###-### # # # = # ### # #-## # # # #### # = ### # ####### . (For the sake of clarity, I use # for composition in this example.) The maximum context C that can be peeled o# is ### # #-##. Nothing more can be peeled o# because C = D. The only other possibilities are - and ##-#, neither of which is as big as C. So C it is! Of course I am arguing informally here, but it is straightforward to make this line of thinking precise. Free term algebras are the simplest setting in which to consider dissection exactly because they are free. When passing to a syntax quotiented by a non­trivial struc­ tural congruence, the question becomes di#cult, especially when compounded with nam­ ing structure such as in the #­calculus. Consider a typical #­calculus agent such as a “ = (#u)(•x#u# | x(z).•y#z#). Notice that a contains a parallel constructor |, which is associative and commutative and has identity 0; also, u and z are bound, so are sub­ ject to #­conversion. Furthermore, the name x is used twice (which requires attention when substituting another name for it) and is discarded after the reaction a a # , where a # “ = (#u)(•y#u#). INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 11 out out out nu nu in x y x y Figure 2: A graphical representation of a #­calculus reaction (#u)(•x#u# | x(z).•y#z#) (#u)(•y#u#) This structure demands careful treatment and makes dissection for the #­calculus di#cult: for term algebra, one can incrementally peel o# a top­level function symbol, but for the #­calculus, there is no notion of a ``top­level'' constructor. The structural congruence of the #­calculus equates so many di#erent syntactic representations of the same agent that it is di#cult to understand where to begin. Without any naming structure, parallel composition becomes easier to handle, as shown by Sewell [Sew]. As I explain in Subsection 1.5, it is the treatment of names that distinguishes the dissection results in this paper from his. A possible approach is to abandon tree­like syntax and to think in terms of graph­like syntax that automatically quotients out many ignorable di#erences. Even if a dissection result could be proved for some graph­theoretic representation of the #­calculus, it would not necessarily generalise smoothly to other calculi. As a result, I studied dissection for Milner's action calculi [Mil96], which are a family of reactive systems. The syntax of action calculi is su#ciently rich to embrace process calculi such as #­calculus, the #­calculus, and the ambient calculus. Action calculi are introduced in Subsection 4.2 in Part 2; here I confine my attention to a few salient features of their graphical form. Consider the example shown in Figure 2; it illustrates a pair of action graphs ---agents in an action calculus--- that represent the #­calculus reaction a a # given earlier. An action graph consists of nodes (rectangles with two blunt corners) and arcs: ˆ Nodes are labelled with controls from a control signature of primitives. Each action calculus may have a di#erent control signature, such as {nu, in, out} for a simple #­ calculus without replication or choice, or {ap, lam} (application and #­abstraction) for the #­calculus. RR n° 4394 12 J.J. Leifer M N L K (0, 0) L K K L N (0, 0) (0, 0) (0, 0) (1, 1) (0, 2) (2, 0) M M a l D # F # C N M D F L N K Figure 3: A surprising dissection involving reflexion (0, 2) (0, 2) (0, 2) (0, 0) F F # D # (0, 0) (0, 2) a t 0 t 1 l t 2 t 3 D K K K K (t 1 ,t 3 ) (t 1 ,t 2 ) (t 0 ,t 3 ) (t 0 ,t 2 ) C (0, 4) Figure 4: A surprising dissection involving forking ˆ Arcs represent names. They connect source ports to target ports. The source ports are arrayed on the left interface of an action graph (such as is shown for the action graph nested inside in) and on the right side of nodes (e.g. nu); they also include free names (e.g. y). The target ports are arrayed on the right interface of action graphs and the left side of nodes. Arcs may be forked from a source port (such as the arcs from x in the LHS) and may be discarded (such as the arc from x in the RHS). How can we perform dissection on action graphs? It is the wiring structure (the arcs) of an action graph that makes dissection so di#cult. Consider the example of a composition INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 13 m a l F D F # D # C “ C “ F # “ D # JJ Figure 5: A relative pushout Fa = Dl shown in Figure 3. (This example is based on one first suggested by Philippa Gardner.) The contexts F and D are just action graphs with a hole (shown as a shaded rectangle). The composition Fa is formed by placing a inside the hole of F and joining up the corresponding arcs. By reasoning about the controls it is possible to see which ones can be peeled o#: F has N,M and D has N,L, so C might have N (the intersection of the controls in F and D). Indeed, the C shown does have just one control. But what arcs should C have? Does the loop shown in C make C ``maximal'' in some sense? Should C have other forked and discarded arcs? How can we choose? The example in Figure 4 is even more curious: F and D have no nodes at all --- they are pure wiring contexts. Since F = D and a = l, one might expect that the ``maximal'' common part that can be peeled o# of Fa = Dl is F = D itself. This is not true! The triple F # , D # , C provides a better dissection in a way that I will make precise below. The point of these examples is to demonstrate that informal reasoning about dissection is di#cult when dealing with contexts containing wiring: There is almost no way to decide just by looking at the arcs themselves whether one dissection is better than another. Out of this murkiness finally emerged a clearer way. I came upon an abstract charac­ terisation, called a relative pushout, of which dissections are ``best''. By abstract, I mean that the characterisation makes no use of any specific properties of contexts except their composability, thus can be instantly cast as a category theoretic property. Consider the outside commuting square in Figure 5 which shows Fa = Dl. A relative pushout (RPO) for this square is a triple F # , D # , C satisfying two properties: first, the triple is a candidate, i.e. F # a = D # l and CF # = F and CD # = D; second, for any other candidate “ F # , “ D # , “ C, there exists a unique mediating J making all the triangles commute (as shown). RPOs are the key contribution of this paper. It is by working with them, rather than trying to come up with ad hoc ways of dissecting contexts in specific examples, that we gain two important advantages: RR n° 4394 14 J.J. Leifer ˆ RPOs are abstract: as I said before they do not rely on any specific properties of contexts except composability. By defining labelled transitions in terms of RPO con­ structions, it is possible carry out proofs of congruence (see Section 2 and Section 3) for several operational equivalences and preorders (strong bisimulation, weak bisim­ ulation, the traces preorder, and the failures preorder), for which the proofs do not depend on any specific properties of contexts except the existence of RPOs. Thus these results are applicable to any reactive system which possesses su#cient RPOs. ˆ RPOs provide a unifying discipline for analysing contexts in specific examples. The fuzziness about which dissection is ``best'' in the examples of action graphs shown earlier disappears: we want the candidate triple F # , D # , C # from which there is a unique mediator to any other candidate. By this requirement, the candidates shown in Figure 3 and Figure 4 are ``best''. (There may be many best candidates, but all of them are isomorphic to each other by standard categorical reasoning.) The problem of finding RPOs for graphs is non­trivial, as shown by the lengthy proofs in Chapter 6 of [Lei01], but one is sustained by the unambiguity of the task: there is no vagueness about the properties required of an RPO. Thus, the notion of an RPO does not solve the problem of finding a dissection, but it makes the problem well­defined and provides a reward for the e#ort by virtue of the congruence proofs that rely on the existence of su#cient RPOs. 1.5 Other work This subsection brings together some of the important related work. There is a large collection of literature about process calculi, some of which I referred to in previous sub­ sections. I will concentrate on those pieces of research that most closely impinge on the specific problems that I address in this paper. The idea of finding conditions under which a labelled transition relation yields an operational congruence has been thoroughly studied in work on structural operational semantics (SOS) [GV92, TP97]. The principle is to postulate rule formats, conditions on an inductive presentation of a labelled transition relation that ensure that operational equivalences (e.g. weak bisimulation [Blo93]) are congruences. There is a fundamental di#erence between this problem and the one I am looking at. The work on SOS presumes that a labelled transition relation is already given: the problem is to show that if it satisfies a particular format then the congruences follow. My work takes reaction rules as primitive, not labelled transitions: I aim to synthesise a labelled transition relation from the reaction rules for which operational equivalences (and preorders) are congruences. In my case, the synthesised labelled transition relation is not inductively presented. It is an open question whether it can be inductively presented and, further, whether this presentation satisfies INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 15 some well­known rule format from SOS theory: this possibility is discussed in greater detail in Section 5 of Part 2, which covers future work. The problem of deriving operational congruences for process calculi from a reaction relation and not from a labelled transition relation is studied in the work on barbed bisimu­ lation [MS92], insensitivity observation [HY95], and testing equivalence [DH84]. The first two construct equivalences by augmenting bisimulation over the reaction relation with observations about related states. For example, in the former, the observations are barbs which detect the ability of an agent to perform an output on a channel. The last (testing) compares two agents by their ability to satisfy the same ``may'' and ``must'' tests. Thus all three depend on primitive observations (not just reactions) though these can be simpler than labelled transitions. The main obstacle to their use is the fact that they do not yield congruences but instead need to be ``closed up'' by all contexts. For example, it is straightforward to prove barbed equivalence in particular cases but di#cult to show barbed congruence because of the heavy quantification over all contexts. Work by Fournet [Fou98] and by Fournet and Gonthier [FG98] ease this burden with techniques that allow a proof of barbed congruence to be broken into pieces, each of which may be carried out using other congruence relations. Je#rey and Rathke [JR99] used contexts as the basis for the labels of an LTS in the case of the #­calculus (a variant of the #­calculus with fresh name creation). They did not synthesise uniformly these labels from a reaction relation but they were guided by the intuition that the labels are small contexts that enable a reaction. These labels give them the right observational power to obtain useful congruences based on bisimulation. Sewell's work [Sew] is the closest to mine of all the material I have cited here. He studied the problem of deriving contextual labels for a family of reactive systems (parameterised by arbitrary reaction rules) whose syntax consists of free terms quotiented by a structural congruence for parallel composition. He defined labelled transitions by explicitly reasoning about how a label overlaps with a redex. From this definition, he proved that bisimulation is a congruence by appealing to his specific dissection results (not motivated by RPOs). There are three important di#erences in approach: ˆ He dealt with multi­hole redexes which capture the full uniformity of metavariables, thus leading to lighter labelled transitions than I can synthesise with my present RPO technology. I discuss possible remedies for this in Section 5 in Part 2. ˆ He invented explicitly his definition of labelled transition and his statement of dis­ section, both of which are complex, without employing RPOs (which I worked with later). As as result, his proof of congruence is not easily generalisable to other syn­ tactic families and other operational equivalences. It seems likely, however, that his dissection results imply the existence of RPOs for the class of reactive systems he considered, and it would be worth trying to recast them so as to make this precise. RR n° 4394 16 J.J. Leifer ˆ He confined his attention to free term algebras with parallel composition and did not handle wiring structure such as is shown in Figure 4. I discuss wiring in Section 4 in Part 2 --- and show detailed proofs involving it in Chapter 6 of [Lei01] where the discipline of seeking RPOs guides the dissection involved. Given the complexity of Sewell's definition of labelled transition and of his statement of dissection, it is di#cult to see how these could be generalised to embrace wiring without the benefit of the notion of RPOs or other universal constructions. The idea of RPOs first appears in [LM00], published jointly with Milner. It contains congruence results for strong bisimulation and weak bisimulation for reactive system (thus partly overlapping with Section 2 in the present paper) not the functorial reactive system considered here in Section 3. The presentation of action calculi contexts in Section 4 of Part 2 intersects with that of [CLM00], though the latter work does not contain any use of functorial reactive systems, a substantial innovation of the present paper and Part 2. Milner's recent study of bigraphical reactive systems [Mil01] attacks some of the open problems brought out in the present paper and in Part 2. I have inserted pointers to his work in the discussion about the role of precategories in functorial reactive systems (Subsection 3.7 in Part 2) and in the description of future work related to nested graphs, multi­hole contexts and metavariables in reactions, free names and binding, and summa­ tion (all in Section 5 in Part 2). In this paper, I have made no use of the double pushout techniques developed in graph rewriting [Ehr79]. These are a way to describe the occurrence of a subgraph ---especially a redex--- in a graph. To avoid confusion, I should emphasise that relative pushouts play quite a di#erent role. In my work, subgraph occurrences are handled by embeddings and contexts; the nature of the graphs (with forked wiring) seems to require a specific approach. But it would be useful to examine in the future how the embeddings relate to the double pushout construction, and how graph­theoretic representations of the syntax of the #­calculus and other calculi formed by quotienting out structural congruence compare to similar work in graph rewriting [CM91, K˜on99]. The proofs contained in this paper and in Part 2 are sometimes omitted for the sake of brevity. In all cases I refer to the full proof in my Ph.D. [Lei01]. 1.6 Outline The subsequent sections of this paper are organised in the following way: Section 2: I make precise the notion of a reactive system and give a definition of labelled transition in terms of idem pushouts (IPO), a sister notion of RPOs. I then present a series of results using simple categorical reasoning that shows how to manipulate INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 17 IPOs and RPOs. The main theorem then follows by direct use of the categorical results from the section: if su#cient RPOs exist then strong bisimulation is a con­ gruence. I conclude by reproving the same theorem in a cleaner way by isolating two lemmas which give derived inference rules for labelled transitions. Section 3: I generalise the definition of reactive system by enriching it so as to comprise two categories with a functor F between them. The idea is that the downstairs category is the one in which one wishes to consider agents and contexts, but for which enough RPOs might not exist. The upstairs category does have RPOs but at the cost of extra intensional information in the arrows and objects. By refining the definition of labelled transition so that it relates arrows downstairs in terms of IPO properties of their preimages upstairs, I obtain congruence results for strong bisim­ ulation, weak bisimulation, the traces preorder, and the failures preorder. Finally, I propose added structure needed in a functorial reactive system to cater explicitly for RPOs that yield multi­hole contexts. I conclude by proving that strong bisimulation is a congruence here as well. Section 4: This section reviews some of the accomplishments of this paper. Part 2 continues this thread of work. It shows that a general class of functors satisfies the axioms of functorial reactive systems, from which I argue that a subclass of Milner's action calculi with su#cient RPOs is indeed an instance. Part 2 concludes with a list of open questions, whose solutions are critical to the application of this work to a wide variety of process calculi. 1.7 Acknowledgements I thank Robin Milner (my Ph.D. supervisor) for his inspiration and warmth. I learned more about the ways of doing research and the style of presenting it from him than anyone else. I had extensive and valuable discussions with Luca Cattani and Peter Sewell. Many colleagues provided suggestions, for which I am grateful: Tony Hoare, Martin Hyland, Philippa Gardner, Michael Norrish, Andy Pitts, John Power, Edmund Robinson, and Lucian Wischik I was supported by the following funding sources: EPSRC Research Grant GR/L62290 and an NSF Graduate Research Fellowship. RR n° 4394 18 J.J. Leifer 2 Operational congruences for reactive systems 2.1 Formalisation of reactive systems In this subsection I investigate how to give reactive systems, which were introduced in­ formally in Subsection 1.2, a precise category­theoretic definition. The goal is to include appropriate categorical structure so that it is possible to synthesise labelled transitions and prove that several operational equivalences and preorders are congruences. To that end, I first study relative pushouts (RPOs) and idem pushouts (IPOs), which are universal constructions related to pushouts. I then show how to synthesise labelled transitions from a set of reaction rules, with IPOs playing the central role. Finally I prove by categorical reasoning that if su#ciently many RPOs exist then strong bisimulation is a congruence. The next section considers richer notions of equivalences and reactive systems, proving that the former are congruences. This paper employs only basic category theory, such as simple universal constructions, slices (and coslices), monoidal structures, and functors. Full explanations appear in the classic work by Mac Lane [Mac71], or in [BW01], which is online. Throughout this section, I use lowercase roman letters a, b, . . . for agents (processes) and uppercase roman letters C, D, . . . for agent contexts (process contexts). Both are arrows in a category of contexts (as explained below). Juxtaposition is used for categorical composition. Other notation is explained as it comes up. How do we get at the essence of reactive systems, i.e. how do we find mathematical structure that is simple enough to get general results about congruences and rich enough to encompass significant examples? The key ingredients of a reactive system were shown in Subsection 1.2 and are recalled here: ˆ syntax (agents and agent contexts); ˆ structural congruence; ˆ set of reactive (non­guarding) contexts; ˆ reaction rules. For each, I outline the mathematical design space and explain the decisions I have taken. syntax: Since contexts are composable, I take them to be the arrows of some category C. This presents an immediate question. Are the objects of C agents or sorts? Following the former route leads to a problem: If we think of a context C as an embedding of an agent a into an agent a # , i.e. `` C : a a # '', then there is no easy way to apply C to a di#erent agent. For example, we cannot state the congruence INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 19 property of an equivalence #: if a # b for agents a and b then C ``applied to'' a and C ``applied to'' b are #­equivalent for all contexts C. (In Chapter 6 of [Lei01] I discuss embeddings, which are critical in proving the existence of universal constructions for categories of graphs.) Alternatively, if the objects are the sorts (or ``types'' or ``arities'') of the contexts, then agents are a special subclass of contexts. In particular, agents are contexts with a null hole, i.e. contexts whose domain is some distinguished object 0. Now the congruence property of # is neatly rendered: if a # b for all arrows a, b : 0 m, then Ca # Cb for all arrows C with domain m. For concreteness, consider as an example a category of contexts for some Algol­like programming language. The objects of the category could comprise the usual types: bool , int , cmd . Then we have the following examples of arrows: C 0 “ = if - then x := 0 else skip : bool cmd C 1 “ = 14 < - : int bool C 0 C 1 = if 14 < - then x := 0 else skip : int cmd . Another example is of a category of linear multi­hole term algebra contexts over some signature #. These are considered in more detail in Subsection 3.8. The objects are natural numbers. The arrows m n are n­tuples of terms over # # {- 1 , . . . , -m }, where each symbol - i is used exactly once. For example, if # = {#, # # , #, #}, where # and # # are constants, # is a 1­place function symbol, and # is a 2­place function symbol, then: C 0 “ = ###- 2 , # # #, #, ##- 1 ## : 2 3 C 1 “ = ##, ### # ## : 0 2 C 0 C 1 = ###### # #, # # #, #, ##### : 0 3 . structural congruence: The main decision here is whether to make structural congru­ ence explicit or implicit. The simplest solution (which is the one taken in this paper) is leave it implicit in the definition of arrow equality --- thus the arrows are structural equivalence classes of contexts. Consequently, certain categories (such as those of graph contexts, see Section 4 in Part 2) do not have enough universal constructions to give the desired congruence results. In these cases, we are forced to look for the universal constructions in less quotiented categories and then exhibit functors with special properties back to the fully quotiented categories (see Subsection 3.2). set of reactive contexts: This is modelled by a set D of arrows. Since reactive contexts are composable and identity contexts are reactive, I take D to be a subcategory of C. Furthermore, decomposing reactive contexts yields reactive contexts, so D 1 D 0 # D implies D 1 , D 0 # D. RR n° 4394 20 J.J. Leifer For example, in the call­by­value #­calculus [Plo75], the reactive contexts consist of all compositions of the following contexts: - ap(v, -) ap(-, a) where v is any value (closed abstraction) and a is any term. In the #­calculus considered in [Mil92], the reactive contexts consist of all compositions of following contexts (closed under structural congruence): - (#x)(-) - | a where x is any name and a is any process. Reactive contexts correspond to evaluation contexts of Felleisen and Friedman [FF86]. reaction rules: These are given by a set Reacts of redex­contractum pairs of agents (l, r) with common codomain, i.e. (l, r) # Reacts implies that there is an object m of C such that l, r : 0 m. For simplicity, I consider redexes and contractums that are pure agents, not agents with meta­variables (i.e. contexts). Thus, to define the reactions of CCS, we let Reacts “ = ## • x.a | x.b , a | b # / x is a name and a, b are agents # rather than: Reacts “ = ## • x.- 1 | x.- 2 , - 1 | - 2 # / x is a name # . I use / throughout this paper for set comprehensions. The latter approach main­ tains the maximum uniformity present in rules and is considered in detail by Sewell in [Sew]; however, that approach is complex and would require future work to adapt it to the categorical setting of this paper (see Section 5 in Part 2). Distilling the structures described in the past paragraphs yields the following definition of a reactive system and a reaction relation: Definition 2.1 (reactive system) A reactive system consists of a category C with added structure. We let m,n range over objects. C has the following extra components: ˆ a distinguished object 0 (not necessarily initial); ˆ a set of reaction rules called Reacts # # m#objC C(0, m) 2 , a relation containing pairs of agents with common codomain; ˆ a subcategory D of C, whose arrows are the reactive contexts, with the property that D 1 D 0 # D implies D 1 , D 0 # D. # INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 21 Definition 2.2 (reaction relation) Given a reactive system C, the reaction relation # # m#objC C(0, m) 2 contains pairs of agents with common codomain and is defined by lifting the reaction rules through all reactive contexts: a a # i# #(l, r) # Reacts, D # D. a = Dl & a # = Dr . # We now have enough definitions to make precise the first approximation for labelled transitions given in (1). The only change is that we think of composing arrows rather than ``applying contexts'' and we are careful about which contexts are reactive (by writing D # D below): Definition 2.3 (labelled transition --- first approximation) a F a # i# Fa a # i# #(l, r) # Reacts, D # D. Fa = Dl & a # = Dr . # # # # # 0 a l F D 6 The commuting square to the right renders the equality Fa = Dl. As I argued in Subsection 1.2, there may be ``junk'' in F and D, i.e. parts of F and D that do not contribute to the reaction. For example, in a category of CCS contexts, the outside square in Figure 7 commutes. So, by the naive definition of labelled transitions given above, • x.a -|x.b|y a | b | y (3) But the y in the label is superfluous. Is there a general condition on Figure 6 that prevents this labelled transition, but still allows the following: • x.a -|x.b a | b ? # # # # 0 • x.a • x.a|x.b -|x.b|y -|y -|x.b - -|y 7 Informally, the condition would state that there is no lesser upper bound in Figure 6 for a, l than F, D. In Figure 7 there clearly is a lesser upper bound, as illustrated by the triple of arrows inside the square. In the following subsections I render this condition in terms of categorical constructions and incorporate it in a new definition of labelled transitions. I then show that strong bisimulation is a congruence with respect to this labelled transition relation. A variety of preorders and other operational equivalences are discussed in the next section. 2.2 Categorical basis for contextual labels The goal of this subsection is to find a tractable definition of a labelled transition relation, one which readily leads to congruential equivalences and preorders and moreover facilitates proofs concerning these relations. Intuitively, the labels represent just the information exchanged between an agent and its environment in order to make a reaction. RR n° 4394 22 J.J. Leifer 0 1 1 1 # # # # # 9(1) 0 1 1 1 1 # # # ##-,# # # ###,-# ### # ,-# ##-,## # # 9(2) Figure 9: Non­existence of pushouts # # # # a l F F # D D # GG 8 Following the intuitions of the previous subsections concerning Figure 6, the natural question is as follows. How can F, D be forced to contain no ``junk''? A possible solution is to require that F and D are a ``least upper bound'' for a and l. The typical way to formulate this is to state that the square in Figure 8 is a pushout, i.e. has the property: Fa = Dl, and for every F # and D # satisfying F # a = D # l there exists a unique G such that GF = F # and GD = D # , as shown here. Unfortunately, pushouts rarely exist in the categories that interest us. Consider, for example, a category of term contexts over a signature #; its objects consist of 0 and 1; its arrows 0 1 are terms over #; its arrows 1 1 are one­hole contexts over #; there are no arrows 1 0 and exactly one arrow id 0 : 0 0. Now, if # contains only constant symbols, say # = {#, # # }, then there is no pushout completing Figure 9(1) because there are no contexts other than the identity. If we introduce a 2­place function symbol # into #, we can construct an upper bound for # and # # but still no pushout (Figure 9(2)). A more refined approach is to assert that F and D are a ``minimal upper bound'' --- informally, an upper bound for which there are no lesser upper bounds. Before defining this notion in terms of idem pushouts (IPOs), I give a more basic construction, namely that of relative pushouts (RPOs). The latter, unlike pushouts, exist in many categories of agent contexts. The plan for the rest of this subsection is to develop a sequence of propositions that will serve as a basis for the proofs of congruence by categorical reasoning given in this paper. Because RPOs and IPOs are categorical constructions independent of reactive systems, I shall work in this subsection with an arbitrary category C whose arrows and objects I denote by f, g, h, k, x, y, z and m,n; in pictures I omit labels on the objects when possible. Definition 2.4 (RPO) In any category C, consider a commuting square (Figure 10(1)) consisting of g 0 f 0 = g 1 f 1 . An RPO is a triple h 0 , h 1 , h satisfying two properties: commutation: h 0 f 0 = h 1 f 1 and hh i = g i for i = 0, 1 (Figure 10(2)); INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 23 m f 0 f 1 g 0 g 1 10(1) m f 0 f 1 g 0 g 1 h 0 h 1 h 10(2) m f 0 f 1 g 0 g 1 h 0 h 1 h h # h # 0 h # 1 kk 10(3) Figure 10: Construction of an RPO universality: for any h # 0 , h # 1 , h # satisfying h # 0 f 0 = h # 1 f 1 and h # h # i = g i for i = 0, 1, there exists a unique mediating arrow k such that h # k = h and kh i = h # i (Figure 10(3)). # A triple, such as h # 0 , h # 1 , h # given above, that satisfies the commutation property, i.e. h # 0 f 0 = h # 1 f 1 and h # h # i = g i for i = 0, 1, is often called a candidate. Thus an RPO triple is a candidate for which there is a unique mediating arrow from it to any other candidate. An RPO for Figure 10(1) is just a pushout in the slice category of C over m. Thus an RPO is a standard combination of categorical constructions --- though it is not commonly used in category theory and its application to reactive systems is novel. In Part 2, I illustrate the existence of RPOs for categories of graphs. For concreteness, though, it is worth examining now the example of an RPO and another candidate shown in Figure 11. The arrows are in a category of term algebra contexts over the signature {#, #, #, #}, where # is a constant and #, #, # are 1­place function symbols. The RPO triple -, ##-#, ####-## adds just the minimal extra bit of context ##-# to # in order to get an upper bound for #### and #; the arrow ####-## then provides the extra junk necessary to recover the upper bound provided by the surrounding square. The reader may enjoy checking that the candidate triple ##-#, ####-##, ##-# is the only other non­trivial one possible and that the mediating dotted arrow ##-# is unique. A square is called an IPO if it has an RPO of a special kind: Definition 2.5 (IPO) The commuting square in Figure 10(1) is an IPO if the triple g 0 , g 1 , id m is an RPO. # The di#erence between a pushout and an IPO is clearest in a partial order category: a pushout is a least upper bound (i.e. less than any other upper bound) and an IPO is a minimal upper bound (i.e. not greater than any other upper bound). IPOs form the basis of our abstract definition of labelled transition and their existence follows from that of RPOs as shown by the following lemma: RR n° 4394 24 J.J. Leifer #### # ####-## - ######-### ##-# ####-## ##-# ##-# ####-## ##-# Figure 11: An example of an RPO and another candidate # # # # n m f 0 f 1 g 0 g 1 h 0 h 1 h 12 n f 0 f 1 h 0 h 1 13 Proposition 2.6 (IPOs from RPOs) If Figure 12 is an RPO diagram then the square in Figure 13 is an IPO. Proof Arrow chasing. See Proposition 2.7 in [Lei01]. # The next result provides a partial converse to the previous proposition. It serves as a key part of the proof of IPO pasting which comes afterwards: # # # # m f 0 f 1 h 0 h 1 14 f 0 f 1 hh 0 hh 1 15 m f 0 f 1 hh 0 hh 1 h 0 h 1 h 16 Proposition 2.7 (RPOs from IPOs) If Figure 14 is an IPO and Figure 15 has an RPO then Figure 16 is an RPO. Proof Arrow chasing. See Proposi­ tion 2.8 in [Lei01]. # IPOs can be pasted together as shown by the following proposition, which is analogous to the standard pasting result for pushouts. # # # # n m f 0 x y g 0 z f 1 g 1 17 n m f 0 x g 0 f 1 z g 1 18 Proposition 2.8 (IPO pasting) Suppose that both squares in Figure 17 commute and that Figure 18 has an RPO. Then the following properties hold of Figure 17: 1. If the two squares are IPOs then so is the big rectangle. 2. If the big rectangle and the left square are IPOs then so is the right square. Proof Proposition 2.7 and arrow chasing. See Proposition 2.9 in [Lei01]. # INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 25 Finally, I conclude this collection of categorical results with two concerning IPOs; they are not immediately relevant to this section, but play an important role, respectively, in Proposition 3.15 and in Proposition 3.6 of Part 2. The first shows how IPOs can arise from epis. # # # # f 0 f 1 id g 1 19 Proposition 2.9 (IPOs from epis) Suppose f 1 is an epi. Then the outer square in Figure 19 is an IPO. Proof Arrow chasing. See Proposition 2.11 in [Lei01]. # The second asserts that only a subcategory of C plays any role in characterising that a particular square is an IPO. # # # # m m # f 0 f 1 g 0 g 1 20 Proposition 2.10 (IPOs in a full subcategory) Let m,m # be objects of C and let C # be a full subcategory of C satisfying the following property: obj C # # {n # obj C / #h # C(m,n) & #h # # C(n, m # )} . Suppose the square in Figure 20 commutes, where f i , g i # C # for i = 0, 1. Then the square is an IPO in C i# it is an IPO in C # . Proof The only arrows relevant to the square being an IPO in C are contained in C # . # 2.3 Labelled transitions and congruence for strong bisimulation The category theory developed in the previous subsection provides the machinery needed in this subsection to accomplish two aims. The first is to improve the unsatisfactory definition of labelled transition given earlier (Definition 2.3). The second is to prove that strong bisimulation over the new labelled transitions is a congruence. I return to the notations of Subsection 2.1, using C for a reactive system, with a, b # C ranging over arrows with domain 0 (agents) and C, D,F # C ranging over arbitrary arrows (contexts). # # # # 0 a l F D 21 The new version of labelled transitions is a modification of the approxi­ mation given by Definition 2.3, where the condition Fa = Dl is strengthened to require that the square in Figure 21 is an IPO: Definition 2.11 (labelled transition) a F a # i# there exists (l, r) # Reacts and D # D such that Figure 21 is an IPO and a # = Dr. # This definition assures that F, D provides a minimal upper bound on a and l, as required in Subsection 2.1. For suppose there is another upper bound F # , D # , i.e. F # a = D # l, and also F = RF # and D = RD # for some R. Then the IPO property for Figure 21 ensures that for some R # (with RR # = id) we have F # = R # F and D # = R # D --- so F, D provides a ``lesser'' upper bound than F # , D # after all. Proposition 2.12 For all contexts F we have that a F a # implies Fa a # . # RR n° 4394 26 J.J. Leifer The converse fails in general (which is good, given the remarks made after Definition 2.3 about the first approximation for labelled transitions). I return to the converse property later in Subsection 3.4 in the special case that F is an iso. Strong bisimulation over · follows its usual scheme [Par81]: Definition 2.13 (strong bisimulation over · ) Let S # # m#objC C(0, m) 2 be a relation that contains pairs of agents with common codomain. S is a simulation over · i# S satisfies the following property for all (a, b) # S: if a F a # then there exists b # such that b F b # and (a # , b # ) # S. S is a strong bisimulation i# S and S -1 are strong simulations. Let # be the largest strong bisimulation over · . # I now state and prove the congruence result for strong bisimulation, one of the central results of this paper: if C has a su#ciently rich collection of RPOs then # is a congruence. Definition 2.14 (C has all redex­RPOs) Say that C has all redex­RPOs if for all (l, r) # Reacts and arrows a, F, D such that D # D and Fa = Dl, the square in Figure 21 has an RPO. # Theorem 2.15 (congruence for #) Let C be a reactive system which has all redex­ RPOs. Then # is a congruence, i.e. a # b implies Ca # Cb for all C # C with required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # b and C # C} . The proof falls into three parts, each of which is an implication as illustrated in Figure 22(1). Dashed lines connect pairs of points contained within the relation anno­ tating the line. Each arrow `` # '' is tagged by the part of the proof below that justifies the implication. Suppose that a # b and C # C, and thus (Ca, Cb) # S. (i): If Ca F a # then, by definition, there exists (l, r) # Reacts and D # D such that the big rectangle in Figure 22(2) is an IPO and a # = Dr. Because C has all redex­RPOs, there exists F # , D # , C # forming an RPO as in Figure 22(2); moreover, D # , C # # D since C # D # = D # D. By Proposition 2.6, Figure 22(3) is an IPO. Because C has all redex­RPOs, Proposition 2.8 implies that Figure 22(4) is an IPO too. By definition, a F # D # r and a # = C # D # r. (ii): Since a # b, there exists b ## such that b F # b ## and D # r # b ## . By definition there exists (l # , r # ) # Reacts and E # # D such that Figure 22(5) is an IPO and b ## = E # r # . (iii): Because C has all redex­RPOs, Proposition 2.8 implies that we can paste Figure 22(5) with Figure 22(4) (both IPOs) along F # and conclude that Figure 22(6) is an IPO. Hence Cb F C # E # r # and (C # D # r, C # E # r # ) # S because D # r # E # r # , as desired. # INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 27 Ca a # =C # D # r #(i) a D # r #(ii) b b ## =E # r # #(iii) Cb C # b ## =C # E # r # F S S F F # # F # # 22(1) 0 a l C F # F D D # C # 22(2) 0 a l F # D # 22(3) C F # F C # 22(4) 0 b l # F # E # 22(5) 0 b l # C F E # C # 22(6) Figure 22: Congruence proof for strong bisimulation The crux of the above proof is that Figure 22(4), which mediates between an F # ­ labelled transition of a and an F ­labelled transition of Ca, is ``portable'', i.e. can be pasted onto a new diagram, serving the same function for b and Cb. This essential idea appears to be robust under variation both of the definition of labelled transition and of the congruence being established. Many examples are shown in Section 3. We can isolate precisely in two lemmas how such portable IPO squares are cut and then pasted. These lemmas are just pieces of the congruence proof above, but their factorisation from the main proof greatly simplifies the latter and lays the ground for tractable presentations of more di#cult congruences results in the next section. The first lemma shows that portable IPO squares arise when a composite agent has a labelled transition: Lemma 2.16 (portable IPO cutting) If C has all redex­RPOs then the following inference rule holds: Ca F a # # a ## and an IPO C F # F C # . a F # a ## a # = C # a ## C # # D . # The second shows how to ``paste'' a portable square to gain a labelled transition of a composite agent: RR n° 4394 28 J.J. Leifer Lemma 2.17 (portable IPO pasting) If C has all redex­RPOs then the following inference rule holds: C F # F C # is an IPO a F # a ## C # # D Ca F C # a ## . # We can now replay the proof of Theorem 2.15 in a more concise form by employing these lemmas: # # # # C F # F C # 23 (i): If Ca F a # then by Lemma 2.16, there exists a ## and an IPO square shown in Figure 23 such that a F # a ## a # = C # a ## C # # D . (ii): Since a # b, there exists b ## such that b F # b ## and a ## # b ## . (iii): Since Figure 23 is an IPO and C # # D, Lemma 2.17 implies that Cb F C # b ## . Also, a ## # b ## implies (C # a ## , C # b ## ) # S, as desired. Let us return to Lemma 2.16 in order to expose an odd property. The Lemma contra­ dicts the situation in many process calculi: normally, Ca F does not necessarily imply that a has any labelled transitions. In CCS, for example, 0 | x x (using the tradi­ tional CCS non­contextual labels) but 0 has no transitions. As a result, in typical proofs of congruence for strong bisimulation, two cases are distinguished when considering the transition Ca F (using the notation of this section): ˆ C and F together conspire to create the transition without reference to a. In this case Cb F holds without using the assumption that a # b. (For example, see ``Case 2'' on p. 98 of [Mil88].) ˆ Or, a, C, and F together conspire to create the transition, as in part (i) above. Recasting the CCS example in terms of contextual labels, we have that 0 | x -|•x . But then there is a contextual transition for 0, namely 0 -|x|•x , though not a satisfactory one: the label provides the entire redex, without any contribution from 0. This is attributable to a defect in the definition of contextual labelled transitions: if a F , the IPO property requires that F contain parts of the relevant redex and no extra junk, but does not prevent F from containing all of the redex. It is by enriching the categorical structure to express multi­hole contexts (see Subsection 3.8) that we eliminate this defect of transitions. When we do, exactly the same case analysis (shown above) is carried out when proving congruence. INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 29 3 Further congruence results 3.1 Introduction This section generalises the definition of labelled transition given in the previous section and provides congruence proofs for additional equivalences and preorders. These include weak bisimulation, the traces preorder, and the failures preorder. I will describe them in turn as each congruence proof is presented. The more important step, though, is the generalisation of reactive systems and la­ belled transitions. In the previous section, the central hypothesis required in the proof of congruence for strong bisimulation is that the reactive system C has all redex­RPOs. This section addresses the problem of what to do if C does not possess su#cient RPOs. Such a situation arises when considering, for example, a category of graph contexts (see Subsection 4.3 in Part 2). Roughly, the lack of RPOs is attributable to the absence of enough intensional information about the occurrence of nodes: it is ambiguous which node in a context corresponds to a node in the composition of the context with another. Thus if C 0 B 0 = C 1 B 1 , it is ambiguous which nodes are common to both C 0 and C 1 and thus impossible to choose the context to be factored o# when constructing an RPO. What can be done when there are not enough RPOs in a reactive system? In general, it is not a good solution simply to enrich the reactive system to force it to have enough RPOs. The enrichment could yield a category with too much intensional information. For example, the enrichment considered for graph contexts (Subsection 4.6 in Part 2) forces arrows with the same domain and codomain to have the same number of nodes. Since the definition of strong bisimulation requires that a # b implies that a, b : 0 m for some object m, the strong bisimulation relation could only compare arrows of the same number of nodes. Such a restriction is unacceptable because strong bisimulation should include pairs of agents with widely di#ering static structure. The solution presented in this section is to accommodate two categories “ C and C related by a functor: “ C C F . The idea is that C is a reactive system, whose arrows correspond to the agents and agent contexts; C does not necessarily have enough RPOs. Sitting ``above'' C is “ C, a category with su#cient RPOs. The definition of the labelled transition relates arrows in C, just as in the previous section: i.e. a F a # is defined for a, a # agents in C and F an agent context of C. But, by contrast with the previous section, the definition is given in terms of the existence of an IPO ``upstairs'' in “ C. (If “ C = C and F is the identity functor, then the new definition of this section collapses into the old one given in the previous.) RR n° 4394 30 J.J. Leifer Thus we get the best of both worlds: the agents whose labelled transition behaviour we consider need not contain any superfluous intensional data; as long as we can construct a more intensional category above containing su#cient RPOs and a suitable functor, then we can get congruence results downstairs. # # # # C F # F C # 24 C 0 F # 0 F 0 C # 0 25 These congruence results require the functor F : “ C C to satisfy certain properties. Most are trivial but one is interesting, namely that F allows IPO sliding. Recall from the previous sec­ tion that the crux of the congruence proof for strong bisimulation was the portability of the IPO square that related F # ­transitions of an agent to F ­transitions of C applied to the agent. This square was cut o# when passing from Ca F to a F # and then pasted back on when passing from b F # to Cb F . In the new definition of labelled transition considered in this section, the pasting operation is more complex. The portable square, e.g. Figure 24, now lives in “ C (the upstairs category) and its left leg is F # , some arrow for which F(F # ) = F # . (Teletype font is used for arrows in “ C.) However, the transition b F # is justified by an IPO square upstairs whose right­leg is F # 0 , an arrow in the preimage of F # not necessarily equal to F # . Thus Figure 24 cannot be pasted without first sliding it to a new IPO square, e.g. Figure 25, whose left­leg is F # 0 and whose F­image is kept invariant. The present section assumes that F allows IPO sliding; Section 3 in Part 2 proves that this is the case when F is of a certain general form. The outline of this section is as follows. In the next subsection, I define the notion of a functorial reactive system, giving precise requirements for F . Then I define the reaction and labelled transition relations. In the following subsection, I prove some results about portable IPO squares that are direct analogies to those (Lemma 2.16 and Lemma 2.17) at the end of the previous section. The main subsections are concerned with a series of congruence proofs for strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder. The final subsection treats a richer notion of functorial reactive system with arrows corresponding to multi­hole contexts and shows that strong bisimulation is indeed a congruence here as well. 3.2 Functorial reactive systems The first part of the setup is to define precisely the notion of functorial reactive system, which was introduced informally above. Its definition rests on that of a reactive system, given in the previous section, which we recall here first for ease of reference: Definition (reactive system recalled --- see Definition 2.1) A reactive system consists of a category C with added structure. We let m,n range over objects. C has the following extra components: ˆ a distinguished object 0 (not necessarily initial); INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 31 ˆ a set of reaction rules called Reacts # # m#objC C(0, m) 2 , a relation containing pairs of agents with common codomain; ˆ a subcategory D of C, whose arrows are the reactive contexts, with the property that D 1 D 0 # D implies D 1 , D 0 # D. # Definition 3.1 (functorial reactive system) Let C be a reactive system. A functorial reactive system over C consists of a functor F : “ C C which maps a distinguished object # # obj “ C to 0 (the distinguished object of C) and which satisfies the following properties. F lifts agents: for any a : 0 m there exists a : # m such that F(a) = a. F creates isos: if F(C) is an iso then C is an iso. F creates compositions: if F(C) = C 1 C 0 , there exist C 0 , C 1 # “ C such that C = C 1 C 0 and F(C i ) = C i for i = 0, 1. # # # # C F # F C # 26 C 0 F # 0 F 0 C # 0 27 F allows IPO sliding: for any IPO square as in Figure 26 and any arrow F # 0 with F(F # 0 ) = F(F # ) there exist C 0 , C # 0 , F 0 form­ ing an IPO square as in Figure 27 with F(C 0 ) = F(C) F(C # 0 ) = F(C # ) F(F 0 ) = F(F) . # Throughout this section, I use uppercase teletype characters to denote arrows in “ C and lowercase teletype characters (a, l, . . .) to denote arrows with domain # in “ C. The F images of these are agents in C. The special domain requirement of a, l, . . . is left tacit throughout this section: thus (#l # “ C. . . .) means (#l # “ C. Dom l = # & . . .). The definition of the reaction relation is identical to the one given earlier: Definition (reaction relation ( ) recalled; cf. Definition 2.2) Given a functorial reactive system F : “ C C, the reaction relation # # m#objC C(0, m) 2 contains pairs of agents with common codomain and is defined by lifting the reaction rules through all reactive contexts: a a # i# there exists D # D and (l, r) # Reacts such that a = Dl and a # = Dr. # This definition has an alternative characterisation given by the following result: Proposition 3.2 (characterisation of ) a a # i# there exist a, l, D # “ C and r # C such that a = Dl and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a . Proof Follows immediately because F lifts agents and creates compositions. # RR n° 4394 32 J.J. Leifer We now turn to the definition of labelled transition. As stated earlier, · is a ternary relation whose arguments are all arrows in C. The original requirement that a particular square be an IPO in C (see Definition 2.11) is replaced here by requiring that there exist a preimage of this square that is an IPO in “ C: # # # # a l F D 28 Definition 3.3 (labelled transition ( · ); cf. Definition 2.11) a F a # i# there exist a, l, F, D # “ C and r # C such that Figure 28 is an IPO in “ C and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(F) = F . # Notice that a # , the RHS of the transition, is required to be F(D)r, and not F(Dr) for some r with F(r) = r. This is important since it allows the reaction rules Reacts to contain pairs (l, r) for which F­preimages of l and r might not have common codomains. By analogy with Definition 2.14, we can define when a functorial reactive system F has all redex­RPOs; the primary di#erence is that here the RPOs exist upstairs in “ C, not downstairs in C: # # # # a l F D 29 Definition 3.4 (F has all redex­RPOs; cf. Definition 2.14) A functorial reactive system F : “ C C has all redex­RPOs if any square, such as in Figure 29, has an RPO, provided that F(D) # D and that there exists r # C such that (F(l), r) # Reacts. # Note that we do not demand that all RPOs exist, just ones for which the left­leg of the enclosing square is a preimage of a redex and the bottom leg is a preimage of an arrow in D. (This narrowing of the definition is exactly analogous to what happens in Definition 2.14.) 3.3 Cutting and pasting portable IPO squares This subsection replays the results at the end of the previous section which show how to cut and paste portable IPO squares. The main di#erence lies in the proof of pasting: here we make explicit use of the assumption that F allows IPO sliding. The first result shows how the transitions of composite agents yield IPO squares: Lemma 3.5 (portable IPO cutting; cf. Lemma 2.16) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: Ca F a # # a ## # C and an IPO square C F # F C # . # a F(F # ) a ## a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F # . INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 33 # # # # a l C F # F D D # C # 30 Proof By the definition of F and the hypothesis that F creates com­ positions, there exists a, C, l, F, D # “ C and r # C such that the big rectangle in Figure 30 is an IPO and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(C) = C F(F) = F . Because F has all redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure 30. Then F(C # ) # D since F(C # )F(D # ) = F(D) # D. By Proposition 2.6, the small left­hand square of Figure 30 is an IPO. Because F has all redex­RPOs, Proposition 2.8 implies that the small right­hand square is an IPO too. By definition, a F(F # ) a ## and a # = F(C # )a ## where a ## “ = F(D # )r. # The next result shows how the reactions of composite agents can be decomposed. There is no analogue of this result the previous section since it is not needed in the congruence proof for strong bisimulation. Lemma 3.6 (portable IPO cutting for reactions) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: Ca a # # a ## # C and C # , F # # “ C. # a F(F # ) a ## a # = F(C # )a ## F(C # ) # D F(C # F # ) = C # . Moreover, if F # is an iso in the conclusion then it is equal to id. Proof See Lemma 3.6 in [Lei01]. # The final result shows how to paste a portable IPO square in order to gain a transition for a composite agent. As stated above, this is where IPO sliding is used: Lemma 3.7 (portable IPO pasting; cf. Lemma 2.17) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: C F # F C # is an IPO a F(F # ) a # F(C # ) # D F(C)a F(F) F(C # )a # . Proof See Lemma 3.7 in [Lei01]. # That concludes the setup for functorial reactive systems. The rest of the section is devoted to a sequence of congruence proofs. RR n° 4394 34 J.J. Leifer Ca a # =F(C # )a ## #(i) a a ## #(ii) b b ## #(iii) Cb F(C # )b ## F S S F F(F # ) # F(F # ) # Figure 31: Schema of the congruence proof for # 3.4 Strong bisimulation This subsection proves that strong bisimulation is a congruence for a functorial reactive system. The definition is straightforward: Definition 3.8 (strong bisimulation over · ; cf. Definition 2.13) Let # be the largest strong bisimulation over · . # The proof of congruence is almost identical to the one presented at the end of the previous section: the only di#erence is that the updated IPO cutting and pasting results for functorial reactive systems are substituted for the old ones. Theorem 3.9 (congruence for #) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # is a congruence, i.e. a # b implies Ca # Cb for all C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # b and C # C} . The proof falls into three parts, each of which is an implication as illustrated in Figure 31. Dashed lines connect pairs of points contained within the relation annotating the line. Each arrow ``#'' is tagged by the part of the proof below that justifies the implication. Suppose that a # b and C # C, and thus (Ca, Cb) # S. INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 35 # # # # C F # F C # 32 (i): If Ca F a # then by Lemma 3.5, there exist a ## # C and an IPO square shown in Figure 32 such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F . (ii): Since a # b, there exists b ## such that b F(F # ) b ## and a ## # b ## . (iii): Since Figure 32 is an IPO and F(C # ) # D, Lemma 3.7 implies that Cb F F(C # )b ## . Also, a ## # b ## implies (F(C # )a ## , F(C # )b ## ) # S, as desired. # In most process calculi the reaction relation and the #­labelled transition relation coincide. See, for example Proposition 5.2 in [Mil92] and Theorem 2 in [Sew00]. A # transition is a ``silent move'': a transition that takes place without interacting with the external environment. Intuitively, a #­labelled transition corresponds to an id­labelled transition when using contexts as labels, i.e. a id a # i# the environment need only supply a vacuous identity context in order to enable a to react. However, if we look carefully at the definition of labelled transition given in Definition 3.3 and the characterisation of reaction in Proposition 3.2, we see that id and are not necessarily identical. There is an implication in one direction, namely id # , since every IPO square is also a commuting square, but not necessarily the converse. Indeed, Figure 3 contains a non­IPO commuting square whose outermost legs on the right are both identity arrows. In the special situation when all preimages of redexes are epis, id and do coincide, thanks to Proposition 2.9. This situation is explored at the end of this subsection in Proposition 3.15. Before taking the epi hypothesis on board let us a consider an alternate definition of labelled transition for which we do recover the reaction relation: Definition 3.10 (labelled transition by cases ( · c )) a F c a # i# # Fa a # if F is an iso a F a # if F is not an iso . # It follows immediately from the definition that a id c a # i# a a # . Furthermore, the induced strong bisimulation (defined next) is a congruence, as shown below. It is worth considering whether there are other definitions that recover the reaction relation but do not involve case analysis. This point is taken up in Appendix B in [Lei01] where I present a definition of labelled transition involving retractions that satisfies this requirement; I show furthermore that this definition induces a congruence which includes # c . Let us return to the main flow of the argument now and consider # c in detail. RR n° 4394 36 J.J. Leifer Definition 3.11 (strong bisimulation over · c ) Let # c be the largest strong bisim­ ulation over · c . # Theorem 3.12 (congruence for # c ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # c is a congruence, i.e. a # c b implies Ca # c Cb for all C # C of the required domain. Proof See Theorem 3.12 in [Lei01]. # Because · and · c are so closely related, it is not surprising that the induced congru­ ences are also related. Indeed the following result shows that # c is a coarser equivalence than #. It is an open question whether there exists a functorial reactive system for which the inequality is strict. Proposition 3.13 (# # # c ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # # # c . Proof See Proposition 3.13 in [Lei01]. # I now return to the epi hypothesis discussed earlier and show that if it is satisfied then · = . A corollary is that · = · c and thus # = # c . (In Chapter 7 of [Lei01], I show exact conditions on redexes of a functorial reactive system of graph contexts which hold i# the epi hypothesis is satisfied.) Definition 3.14 (redexes have epi preimages) Let F : “ C C be a functorial reactive system. Say that redexes have epi preimages i# for all l # “ C, if there exists r # C such that (F(l), r) # Reacts, then l is an epi. (Recall that lowercase teletype letters stand for arrows in “ C with domain #.) # # # # # a l id D 33 Proposition 3.15 Let F : “ C C be a functorial reactive system. Suppose that redexes have epi preimages. Then id = . Proof Follows from Proposition 2.9. See Proposition 3.15 in [Lei01]. # Corollary 3.16 Let F : “ C C be a functorial reactive system which has all redex­ RPOs. Suppose the epi hypothesis of Proposition 3.15 is satisfied, namely, the preimage of every redex is an epi. Then · = · c and thus # = # c . Proof See Corollary 3.16 in [Lei01]. # 3.5 Weak bisimulation Weak bisimulation [Mil88] is a coarser equivalence than strong bisimulation and is less sensitive to the number of silent steps made by the agents it compares. A single labelled transition by one agent may be matched by a weak labelled transition of another, namely a sequence of reactions, followed by a like transition, followed by further reactions. INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 37 Definition 3.17 (weak labelled transition ( · )) The weak labelled transition relation is defined as follows: a F a # i# a # F # a # . # The definition of weak bisimulation follows exactly the form set out by Milner: Definition 3.18 (weak bisimulation over · ; cf. Definition 5 on p. 108 in [Mil88]) Let S # # m#objC C(0, m) 2 be a relation that contains pairs of agents with common codomain. S is a weak simulation over · i# it satisfies the following properties for all (a, b) # S: 1. If a a # , then there exists b # such that b # b # and (a # , b # ) # S. 2. If a F a # where F is not an iso, then there exists b # such that b F b # and (a # , b # ) # S. S is a weak bisimulation i# S and S -1 are weak simulations. Let # be the largest weak bisimulation over · . # In the second clause of the definition of weak simulation, the label F is required to be not an iso. Without this requirement, the definition would force a F to be matched by b F , for F an iso. By the definition of IPOs, iso­labelled transitions are essentially like id labelled transitions. As argued in the previous subsection id # , and moreover the converse holds in certain cases. So allowing F to be an iso in clause 2 would override clause 1. But clause 1 embodies a basic principle, namely that a silent transitions is matched by zero or more silent transitions and not by one or more. The congruence property of weak bisimulation is more limited than that of strong bisimulation: # is a congruence with respect to arrows only in D, a subcategory of C. Recall that D consists of the ``reactive contexts'', i.e. the contexts that allow reaction under them: a a # implies Da Da # for D # D. (See Definition 2.) This limitation is not surprising. In CCS, for example, weak bisimulation is not a congruence with respect to summation contexts, which are not reactive, i.e. we do not have that a a # implies a + b a # + b. (I am using a, b for agents and not for names in order to maintain consistency with the notation of the rest of this paper.) Use of the hypothesis C # D occurs twice in the proof. In CCS, weak bisimulation is a congruence with respect to some non­reactive contexts, namely the prefixing contexts x.- and • x.-. We would require richer structure than is contained in Part 2 in order to have a category of CCS contexts, namely the nesting of graphs to represent prefixing, some added data (not yet well understood) to represent summation, and the inclusion of free names to represent the naming structure of CCS. (See Section 5 in Part 2 for a discussion of these open problems.) If we could construct RR n° 4394 38 J.J. Leifer such a category then it is likely that proving explicitly that weak bisimulation is preserved by prefixing would be easy since the only initial labelled transition of a prefixed agent is based on the prefix itself. Nonetheless, it is worth considering whether we could get a better general congruence result for weak bisimulation by dividing the set of reactive contexts in two, with one set containing prefixing­like contexts and another containing sum­like contexts. I am not sure how this would work, but something similar is done in work on rule formats for structural operational semantics by Bloom in [Blo93] who distinguishes ``tame'' from ``wild'' contexts. Theorem 3.19 (congruence for # w.r.t. D) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # is a congruence with respect to all contexts in D, i.e. a # b implies Ca # Cb for all C # D of the required domain. Proof See Theorem 3.19 in [Lei01]. # How can we get a congruence with respect to C, not just D? One possibility is to replace the F of clause 2 in Definition 3.18 with F # . The largest symmetric relation satisfying this new definition is a congruence with respect to all of C. A second possibility is to make this change for the first step only of the weak bisimulation relation and then revert to the normal definition in Definition 3.18: Definition 3.20 (greedy weak bisimulation (# gr )) Let # gr # # m#objC C(0, m) 2 be the largest symmetric relation that contains pairs of agents with common codomain and which satisfies the following properties: 1. If a a # , then there exists b # such that b # b # and a # # b # . 2. If a F a # where F is not an iso, then there exists b # such that b F # b # and a # # b # . We call # gr greedy weak bisimulation. # Notice that this definition requires that a # # b # , not the stronger condition that a # # gr b # . The relation # gr is also a congruence with respect to C. The proof is almost identical to that of Theorem 3.19: Theorem 3.21 (congruence for # gr w.r.t. C) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # gr is a congruence with respect to all contexts in C, i.e. a # b implies Ca # Cb for all C # C of the required domain. # The idea of having a special requirement for the first step of a weak bisimulation followed by the use of standard weak bisimulation to compare the continuations is well­ established. See, for example, the definition of observational congruence (Definition 2 on p. 153 in [Mil88]), also known as rooted weak bisimulation in the literature. Experience with CCS suggests that a congruence in ensured by changing clause 1 (to require b INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 39 # b # ) and not clause 2 in Definition 3.18 --- exactly the opposite of what is done in Definition 3.20. This anomaly requires further research, both to find categorical theory to model di#erent kinds of non­reactive contexts and to show that RPOs exist for categories of graph­like contexts that contain summation. 3.6 Traces preorder This subsection addresses the traces preorder, a simple preorder that compares agents based on their finite traces. A trace (see p. 41 in [Hoa85]) is a sequence of labelled transitions. The traces preorder is insensitive to non­determinism and deadlock so is of limited use. Nonetheless, traces are good for specifying security properties since it is easy to formulate that an agent does not possess certain ``bad'' traces (see, for example, [Pau98, SV00]). The main motivation for this subsection is to provide a warmup for the next one, which looks at the failures preorder. As a result, the traces considered here are all strong (not interspersed with reaction steps) since the way to handle the weak case is subsumed by the results presented in the next subsection. Definition 3.22 (traces preorder (# tr ); cf. p. 45 in [Ros98]) A pair of agents b and a with common codomain are related by the traces preorder, written a # tr b, i# all the traces of a are traces of b: for every trace #F 1 , . . . , F n #, a F 1 · · · Fn implies b F 1 · · · Fn . # (In this subsection and the next, n, m are natural numbers and not objects of a cate­ gory.) The proof of congruence is more complicated that that of strong bisimulation (Theorem 3.9) because we need to consider traces, not just individual labelled transi­ tions. The heart of the argument is an inductive construction of a trace of a from a trace of Ca. Each inductive step cuts o# a portable IPO square (see Lemma 3.5) which is sub­ sequently pasted back on (Lemma 3.7) when constructing a trace of Cb from a trace of b. Theorem 3.23 (congruence for # tr ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # tr is a congruence, i.e. a # tr b implies Ca # tr Cb for all C # C of the required domain. Proof Suppose that a # tr b. Let C be any context of appropriate domain. We wish to prove that Ca # tr Cb. The proof is divided into three parts, which are shown schematically in Figure 34. RR n° 4394 40 J.J. Leifer C 0 a 0 ··· F 1 Cnan Fn #(i) a 0 ··· F(F # 1 ) an F(F # n ) #(ii) b 0 ··· F(F # 1 ) bn F(F # n ) #(iii) C 0 b 0 ··· F 1 Cn bn Fn # tr # tr Figure 34: Schema of the congruence proof for # tr # # # # C i F # i F i C # i 35 (i): Let • a 0 “ = Ca and consider any trace • a 0 F 1 · · · Fn • a n , where n # 0. We construct (a 0 , C 0 ) . . . (a n , C n ) and the square shown in Figure 35 for 1 < i # n such that the following conditions hold for 0 # i # n: • a i = C i a i Tr­a a i-1 F(F # i ) a i for i #= 0 Tr­lab C i # D for i #= 0 Tr­D F(C i ) = C i-1 for i #= 0 Tr­C F(C # i ) = C i for i #= 0 Tr­Cprime F(F i ) = F i for i #= 0 Tr­F Figure 35 is an IPO for i #= 0 . Tr­IPO base: Let a 0 “ = a and C 0 “ = C. Then Tr­a holds and the other conditions are vacuous. # # # # C i+1 F # i+1 F i+1 C # i+1 36 step: We construct a i+1 , C i+1 and the square in Figure 36 from a i , C i as follows, assuming 0 # i < n. By the inductive hy­ pothesis Tr­a holds for i, thus • a i = C i a i . Since • a i F i • a i+1 , Lemma 3.5 implies that there exist a i+1 # C and an IPO square shown in Figure 36 such that a i F(F # i+1 ) a i+1 and • a i+1 = C i+1 a i+1 C i+1 # D F(C i+1 ) = C i F(F i+1 ) = F i+1 . where we let C i+1 “ = F(C # i+1 ). Then all of the inductive properties are satisfied for i + 1. INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 41 Thus by Tr­lab a = a 0 F(F # 1 ) · · · F(F # n ) a n . (ii): Since a # tr b there exist b 0 . . . b n such that b = b 0 F(F # 1 ) · · · F(F # n ) b n . (iii): We now claim that C i b i F i+1 C i+1 b i+1 for 0 # i < n. Since b i F(F # i+1 ) b i+1 and Figure 36 is an IPO (by Tr­IPO), with F(C # i+1 ) = Tr­Cprime C i+1 # D by Tr­D, then Lemma 3.7 implies C i b i = Tr­C F(C i+1 )b i F(F i+1 ) F(C # i+1 )b i+1 = Tr­Cprime C i+1 b i+1 , and thus by Tr­F, C i b i F i+1 C i+1 b i+1 . So Cb = C 0 b 0 F 1 · · · Fn as desired. # 3.7 Failures preorder This subsection looks at the failures preorder, which is a fundamental part of the failures and divergences model of CSP [Hoa85]. I do not consider divergences here, so the definition I use only employs failures. The failures preorder is sensitive to non­determinism and deadlock (see Subsection 3.3 in [Ros98]). The failures of an agent provide a domain­ theoretic interpretation, assigning a meaning to each agent independently of the others (unlike for bisimulation). This makes failures properties well­suited to model checking [Ros94, Low96]. In order to define a failure of an agent, I first extend the notion of a weak labelled transition to allow for sequences of labels (not just single labels): Definition 3.24 (weak labelled transition extended; cf. Definition 3.17) a # # a # i# a # a # a #F #“t a # i# a F t a # , where t is a sequence of arrows of appropriate domain and “ is the concatenation operator. # A failure of a consists of a sequence of weak labelled transitions t and a set of labels X such that a evolves by t to a stable state (one for which no reactions are possible) which refuses X, i.e. cannot engage in a transition labelled by any of the arrows in X. To prevent reactions from masquerading as labelled transitions, every arrow in t and X is not an iso (cf. the discussion immediately following Definition 3.18). RR n° 4394 42 J.J. Leifer Definition 3.25 (failure; cf. p. 171 in [Ros98]) A failure of a is a pair (t, X) where t is a finite sequence of arrows (each not an iso) and X is a set of arrows (each not an iso) for which there exists a # such that the following conditions hold: a t a # a has a weak trace t; a # / a # is stable; #F # X. a # F / a # refuses X. # Definition 3.26 (failures preorder (# f ); cf. p. 193 in [Ros98]) A pair of agents b and a with common codomain are related by the failures preorder, written a # f b, i# all the failures of a are possessed by b. # The relation # f is only a congruence with respect D, the subcategory of C consisting of reactive contexts (cf. Theorem 3.19). The only use of the hypothesis C # D occurs in the base case of the induction. The proof, which we omit, is similar to that of the traces preorder however there are two aspects that require care: the cutting and pasting of portable IPO squares for weak labelled transitions and the propagation of refusal sets. Theorem 3.27 (congruence for # f w.r.t. D) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # f is a congruence with respect to all contexts in D, i.e. a # f b implies Ca # Cb for all C # D of the required domain. Proof See Theorem 3.26 in [Lei01]. # 3.8 Multi­hole contexts As anticipated at the end of the previous section, this subsection enriches the definition of functorial reactive systems to model explicitly multi­hole contexts. At first glance, there is no obstacle in accommodating multi­hole contexts directly within the existing theory of reactive systems. (For simplicity, let us ignore functorial reactive systems until later.) However this does not work well as I illustrate in the next few paragraphs. If we consider multi­hole term algebra contexts (as used in term rewriting), say, then we can choose the objects of C to be natural numbers and the arrows to be tuples of contexts (that use each hole exactly once) constructed from function symbols taken from some signature #. Concretely, if # = {#, # # , #, #}, where # and # # are constants, # is a 1­place function symbol, and # is a 2­place function symbol, then, we have the following examples of arrows: C 0 “ = #- 1 , #(# # )# : 1 2 C 1 “ = ##(- 2 , # # ), #, #(- 1 )# : 2 3 C 1 C 0 = ##(#(# # ), # # ), #, #(- 1 )# : 1 3 INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 43 # # # # 0 m u m # n a l F D F # D # C 37 But what is an agent? A natural choice is to take agents to be pure terms, i.e. arrows 0 1 (a 1­tuple containing a term context with 0 holes). But this is not supported by the definition of reactive system, where agents consist of all arrows 0 m for m an arbitrary object of C. This discrepancy is non trivial: if we try to confine the definition of labelled transition and bisimulation to use only a limited set of agents, say those arrows 0 m for m # A, where A is some subset of the objects of C, then the proof of congruence goes wrong. The problem is that even if m,m # , n # A in Figure 37, it is not necessarily the case that an RPO F # , D # , C yields an object u # A. Thus even though F # , D # forms an IPO with respect to a, l, it is not the case that a F # since the codomain of F # is not in A. However, it is exactly the fact that RPOs sometimes do not produce an object in A that gives multi­hole contexts their power and that makes them worth considering. To see why, suppose that we take C to be a category of exclusively 0­ and 1­hole contexts. Then RPOs exist, as a corollary of Sewell's dissection result for terms (Lemma 1 in [Sew]). Consequently, bisimulation is a congruence for term rewriting systems. The resulting labels are unnecessarily heavy, though. For consider the reaction rule (#(#), # # ); we have # #(-) # # which corresponds to our intuition that # needs #(-) to react. (When there is no confusion I use - for - 1 .) Unfortunately, we also have a labelled transition where the label contains a complete copy of the redex : # # #(-,#(#)) #(# # , # # ) . This was discussed at the end of the previous section. This heavy labelled transition is absent when we look at multi­hole contexts, as illustrated with the help of the diagram below. (Tuple brackets # # are omitted from singletons.) 0 1 2 1 1 # # #(#) #(-,#(#)) #(# # ,-) #-,#(#)# ## # ,-# #(- 1 ,- 2 ) If we work in the category of 0­ and 1­hole contexts then the outer square is an IPO, which gives rise to the transition #(-,#(a)) mentioned earlier. By admitting multi­hole contexts we have given the outer­square a simpler RPO. It is now possible to make precise the motivation for the results developed in this subsection. The goal is to reconcile two things: (i) we want to restrict of the collection of agents of C to arrows 0 m for m # A, where A can be a proper subset of the objects RR n° 4394 44 J.J. Leifer of C (for example A = {1} in the case of multi­hole term contexts); (ii) we want to admit RPOs that yield objects which are not contained in A. The key idea is to consider the notion of a strict monoidal category (C, #, 0), a category C equipped with a functor # : C×C C and an object 0 such that # is strictly associative and has unit 0. The role of the tensor # is to ``tuple'' arrows and objects, e.g. recalling the term contexts C 0 , C 1 from above, we have that: C 0 “ = #- 1 , #(# # )# : 1 2 C 1 “ = ##(- 2 , # # ), #, #(- 1 )# : 2 3 C 0 # C 1 = #- 1 , #(# # ), #(- 3 , # # ), #, #(- 2 )# : 3 5 C 1 # C 0 = ##(- 2 , # # ), #, #(- 1 ), - 3 , #(# # )# : 3 5 The following definition extends that of a functorial reactive system by postulating that both the upstairs and downstairs categories are monoidal and requiring that the functor between them respects the monoidal structure. The same enrichment could be performed on a reactive system. There is, however, no reason not to take more general approach shown here. Definition 3.28 (functorial monoidal reactive system; cf. Definition 3.1) A functorial monoidal reactive system consists of a functorial reactive system F : “ C C with the following added structure: ˆ Both “ C and C are strict monoidal categories with unit objects # and 0, respectively, and tensor # (the same symbol being used for both categories). ˆ F preserves tensors, i.e. F(C 1 # C 0 ) = F(C 1 ) # F(C 0 ). ˆ F preserves and creates units, i.e. for all u # obj “ C, F(u) = 0 i# u = #. ˆ There is a subset A of C­objects and a subset A of “ C­objects, where A is the preimage under F of A. We use m,m # , . . . to range over A and m, m # , . . . to range over A. ˆ Reacts # # m#A C(0, m) 2 . ˆ Pairing with an agent yields a reactive context: a # id m # # D for a : 0 m. # The agents are arrows 0 m where m # A and the agent contexts are arrows m m # , for m,m # # A. Thus the reaction rules of Reacts are only between agents. We overload the terminology for “ C in a straightforward way: arrows # m of “ C are also agents for m # A; arrows m m # of “ C are also agent contexts for m, m # # A. The definition of labelled transition confines the arguments to be agents: Definition 3.29 (labelled transition for functorial monoidal reactive systems ( · m ); cf. Definition 3.3) a F m a # i# a F a # and a, a # are agents (thus forcing F to be an agent context). # INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 45 Two properties now replace the hypothesis that F has all redex­RPOs. The first asserts that the RPO consists either of agent contexts or of pairing operations that place disjoint instances of a and l into the composite Fa = Dl: # # # # # m u m # n a l F D F # D # C 38 Definition 3.30 (F has all monoidal­redex­RPOs; cf. Definition 3.4) Suppose F : “ C C is a functorial monoidal re­ active system and that a, l are agents, F, D are agent contexts, F(D) # D, and there exists r # C such that (F(l), r) # Reacts. Then F has all monoidal­redex­RPOs if any square, such as in Figure 38, has an RPO (as shown) such that if u / # A then u = m # m # , F # = id m # l, and D # = a # id m # . # The second property asserts that pairing agent contexts yields an IPO: # # # # # m m # m#m # a l id m #l a#id m # 39 Definition 3.31 (F has all monoidal­redex­IPOs) A functorial monoidal reactive system F : “ C C has all monoidal­redex­IPOs if any square, such as in Figure 39, is an IPO, provided a, l are agents and there exists r # C such that (F(l), r) # Reacts. # Just before giving a proof of congruence for strong bisimulation we need to consider a corollary of Lemma 3.7 for functorial monoidal reactive systems: Corollary 3.32 (portable IPO pasting for functorial monoidal reactive systems; cf. Lemma 3.7) Suppose F : “ C C is a functorial monoidal reactive system and has all monoidal­redex­RPOs. The following inference rule holds: C F # F C # is an IPO consisting of agent contexts a F(F # ) m a # F(C # ) # D F(C)a F(F) m F(C # )a # . # Strong bisimulation compares pairs of agents: Definition 3.33 (strong bisimulation over · m ; cf. Definition 3.8) Let # m # # m#A C(0, m) 2 be the largest strong bisimulation over · m such that # m contains only pairs of agents with common codomain. # Finally we prove congruence. As promised at the end of the previous section, the argu­ ment mirrors closely congruence proofs in typical process calculi. In particular, two cases are distinguished when analysing the transition Ca F m : (i) a, C, and F all contribute, in which case a itself has a labelled transition; (ii) only C and F contribute, in which case Cb F m without needing to consider the behaviour of a and b. Theorem 3.34 (congruence for # m w.r.t. agent contexts) Let F : “ C C be a functorial monoidal reactive system which has all monoidal­redex­RPOs and has all RR n° 4394 46 J.J. Leifer monoidal­redex­IPOs. Then # m is a congruence with respect to agent contexts, i.e. a # m b implies Ca # m Cb for any agent context C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # m b and C # C is an agent context} . # # # # # m n u m # a l C F # F D D # C # 40 Suppose that a # m b and C # C, and thus (Ca, Cb) # S, where a, b : 0 m and C : m n. Suppose Ca F m a # . By the definition of F m and the hypothesis that F creates compositions, there exist “ C­ arrows a : # m, C : m n, l : # m # , F, D and a C­arrow r : 0 F(m # ) such that the big rectangle in Figure 40 is an IPO and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(C) = C F(F) = F . Because F has all monoidal­redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure 40. Note that F(C # ) # D since F(C # )F(D # ) = F(D) # D. By Proposition 2.6, the small left­hand square of Figure 40 is an IPO; Proposition 2.8 implies that the small right­hand square is an IPO too. Since F has all monoidal­redex­RPOs, we have additional information depending on whether u # A. We consider both cases. Case u # A: By definition, a F(F # ) m a ## and a # = F(C # )a ## , where a ## “ = F(D # )r. Since a # m b, there exists b ## such that b F(F # ) m b ## and a ## # m b ## . Since the small right­hand square in Figure 40 is an IPO and F(C # ) # D, Corollary 3.32 implies that Cb F m F(C # )b ## . Also, a ## # m b ## implies (F(C # )a ## , F(C # )b ## ) # S, as desired. # # # # # m 0 m # m 0 #m # b l id m 0 #l b#id m # C 0 F 0 C # 0 41 Case u / # A: We have that F # = id m # l and D # = a # id m # . Since F lifts agents there exists b : # m 0 such that F(b) = b, and thus F(m 0 ) = m = F(m). Since F preserves tensors, F(F # ) = id F(m) # F(l) = id F(m 0 ) # F(l) = F(id m 0 # l). Since F allows IPO sliding, the small right­hand IPO square of Figure 40 can be slid to form the small right­hand IPO square in Figure 41, where F(C 0 ) = F(C) F(F 0 ) = F(F) F(C # 0 ) = F(C # ) . Since F has all monoidal­redex­IPOs, the small left­hand square of Figure 41 is an IPO. Since F has all monoidal­redex­RPOs, Proposition 2.8 implies that the big rectangle in Figure 41 is an IPO. Since F preserves tensors and since pairing with an agent yields a reactive context, F(C # 0 (b # id m # )) = F(C # )(b # id F(m # ) ) # D so: Cb F m F(C # 0 (b # id m # ))r = F(C # )(b # r) = F(C # )(id m # r)b . INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 47 The last equality is a standard property of strict monoidal categories. Furthermore, a # = F(D)r = F(C # )F(D # )r = F(C # )F(a # id m # )r = F(C # )(a # r) = F(C # )(id m # r)a . Thus a # m b implies (F(C # )(id m # r)a, F(C # )(id m # r)b) # S as desired. # To round out this subsection, let us look again at the example of multi­hole term contexts over a signature #. Definition 3.35 (multi­hole term contexts (T # (#))) Given a signature # of function symbols then the category of multi­hole term contexts T # (#) over # is constructed as follows: the objects are the natural numbers; an arrow j k is a k­tuple of terms over the signature # # {- 1 , . . . , - j } containing exactly one use of each hole - i (1 # i # j). (When j = 1, I often write - 1 as -.) The identities are: id j “ = #- 1 , . . . , - j # : j j. For C = #t 1 , . . . , t k # : j k and D : k m, their composition is the substitution DC “ = {t 1 /- 1 , · · · , t k /- k }D . # In order to apply Theorem 3.34 to T # (#), we let C = “ C = T # (#) and F be the identity functor. If we let A = {1} then ˆ an agent of T # (#) is a term a : 0 1; ˆ agent context of T # (#) is a term context C : 1 1, i.e. a term containing a single hole. We may choose any subcategory of T # (#) to be the reactive contexts, subject to the conditions in Definition 3.28. The labelled transitions of T # (#) depend, of course, on the reaction rules. Once these are specified, the labelled transition relation · m is determined by Definition 3.29 and the induced strong bisimulation # m by Definition 3.33. As a corollary of Sewell's dissection result for terms (Lemma 1 in [Sew]), F has all monoidal­redex­RPOs and has all monoidal­redex­IPOs. Hence from Theorem 3.34 the induced strong bisimulation # m is preserved by all term contexts. Let us now revisit the reactive system whose only reaction rule is (#(#), # # ). It contains exactly the following labelled transitions: D(#(#)) - m D(# # ) for all reactive term contexts D # D # #(-) m # # These agree with the transitions found by Sewell in the case of ground­term rewriting. I believe for any reaction rules specified by Reacts the synthesised labelled transitions for T # (#) coincide exactly with Sewell's. RR n° 4394 48 J.J. Leifer 4 Conclusions This final section summarises some of the accomplishments of the paper. I delay the discussion of future work to the conclusion of Part 2, since the relevant open problems involve the combination of both parts. My main motivation in this work is to provide mathematical tools that ease the con­ struction of new process calculi. Every computational phenomenon (distributed hosts, failures, cryptography, etc.) that gives rise to a new calculus carries with it rules for de­ scribing how it works, how a computation evolves. These are often reaction rules, which capture the allowable state changes. The redexes are typically composites formed from atoms, so the following questions naturally arise. When an agent has parts of a redex, which atoms does the agent require to form a complete redex? Can we get a definition of labelled transition if we choose the labels to be those small contexts that complement an agent to complete a redex? These questions are not original with me: they are often used by the process calculus community as motivation and intuition for designing specific labelled transition relations. I believe, though, that the original contribution of my work is to give a general way of transforming reaction rules into tractable labelled transitions. By tractable, I mean two things: (i) the labels are small, i.e. contain only those parts of a redex necessary to complete a whole redex in an agent; (ii) the labelled transitions yield operational equivalences and preorders that are congruences. The key ideas in trying to achieve both desiderata are that of a relative pushout (RPO) and an idem pushout (IPO). With respect to (ii) the results are good: I prove congruence for strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder. For (i), the results are encouraging: IPOs ensure that the labels of transitions contain no junk, i.e. that no lesser label would do. There is, however, more work required in making the labels even simpler by exploiting the uniformity present in reaction rules that contain metavariables. The conclusion of Part 2 discusses this problem in the paragraphs concerned with multi­hole redexes. One of the most attractive qualities of the theory of RPOs is that it is abstract : it is applicable to all categories of contexts and therefore provides a unifying discipline. Before I understood RPOs, I spent several years thinking about dissection results for specific graph­like contexts: ``If C 0 a 0 = C 1 a 1 , then what part of a i is present in C 1-i for i = 0, 1?'' I could only get results about nodes (for example, which nodes of a 1 are supplied by C 0 ) but not about arcs. Trying to work in an ad hoc way with graphs was unsatisfying until the RPO theory arrived: it was only the discipline of a universal property in the category theory sense that sustained me in carrying out the constructions contained in Chapter 6, ``RPOs for action graph contexts'', in [Lei01]. Sewell's ad hoc reasoning about dissection for term algebras with parallel composition [Sew] was successful without employing RPOs, but his statement of dissection was complicated. To be fair, his dissections were for multi­ INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 49 hole redexes that RPOs do not neatly handle. He is now recasting some of his results in terms of RPOs and other universal constructions. RPOs thus support the goal of developing a shared theory that can produce labelled transitions and operational congruences for many process calculi. The task that we usually face for each new process calculus of defining labelled transitions and proving that equiv­ alences and preorders are congruences is replaced by the task of choosing reactions and proving that RPOs exist. This is attractive leverage from RPOs. This leverage gains us a greater advantage if we can find families of functorial reactive systems with rich graph­ like structure for which the existence of RPOs is proved once, and for which practical examples of process calculi are instances. Part 2 works towards this goal: it provides a series of theorems that verify the functorial reactive system axioms (such as IPO sliding) for a general class of functors, thus showing that that subclass of Milner's action calculi contexts satisfy these axioms. It then states the RPO result (omitting the proof for the sake of brevity). Finally, it concludes by detailing the future work required to enrich the structure of these graphs to the point that they embrace practical process calculi. RR n° 4394 50 J.J. Leifer References Curly braces enclose pointers back to the pages in this paper that cite the work. [AG97] M. Abadi and A. D. Gordon. A calculus for cryptographic protocols: the spi calculus. In Proc. 4th ACM Conf. on Computer and Communications Security, Z˜urich, pages 36--47. ACM Press, 1997. {7} [Bar84] H. P. Barendregt. The Lambda Calculus: Its Syntax and Semantics. North Holland, revised edition, 1984. {8} [BB90] G. Berry and G. Boudol. The chemical abstract machine. In Proc. 17th Annual Symposium of Principles of Programming Languages, pages 81--94. ACM Press, 1990. {6} [BB92] G. Berry and G. Boudol. The chemical abstract machine. Theoretical Computer Science, 96(1):217--248, 1992. {6} [Blo93] B. Bloom. Structural operational semantics for weak bisimulations. Technical Re­ port TR­93­1373, Department of Computer Science, Cornell University, August 1993. {14, 38} [BW01] M. Barr and C. F. Wells. Toposes, triples and theories. Version 1.1. Available from: http://www.cwru.edu/artsci/math/wells/pub/ttt.html, 2001. {18} [CG98] L. Cardelli and A. D. Gordon. Mobile ambients. In Foundations of Software Science and Computation Structure, First International Conference, FoSSaCS '98, Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS '98, Lisbon, Portugal, March 28 -- April 4, 1998, Proceedings, volume 1378 of Lecture Notes in Computer Science. Springer­Verlag, 1998. {7} [CLM00] G. L. Cattani, J. J. Leifer, and R. Milner. Contexts and embeddings for closed shallow action graphs. Technical Report 496, Computer Laboratory, University of Cambridge, July 2000. Available from http://pauillac.inria.fr/#leifer/. {16} [CM91] A. Corradini and U. Montanari. An algebra of graphs and graph rewriting. In 4th Biennial Conference on Category Theory and Computer Science, Proceedings, vol­ ume 530 of Lecture Notes in Computer Science, pages 236--260. Springer­Verlag, 1991. {16} [DH84] R. De Nicola and M. C. B. Hennessy. Testing equivalences for processes. Theo­ retical Computer Science, 34:83--133, 1984. {15} INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 51 [Ehr79] H. Ehrig. Introduction to the algebraic theory of graph grammar. In Proc. first international Workshop on Graph Grammars and their application to Computer Science and Biology, volume 73 of Lecture Notes in Computer Science, pages 1--69. Springer­Verlag, 1979. {16} [FF86] M. Felleisen and D. P. Friedman. Control operators, the SECD­machine and the #­calculus. In M. Wirsing, editor, Formal Description of Programming Concepts III, pages 193--217. North Holland, 1986. {8, 20} [FG98] C. Fournet and G. Gonthier. A hierarchy of equivalences for asynchronous cal­ culi. In Automata, Languages and Programming, 25th International Colloquium, ICALP '98, Aalborg, Denmark, July 13--17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science, pages 844--855. Springer­Verlag, 1998. {15} [FGL + 96] C. Fournet, G. Gonthier, J.­J. L’evy, L. Maranget, and D. R’emy. A calculus of mobile agents. In CONCUR '96, Concurrency Theory, 7th International Confer­ ence, Pisa, Italy, August 26--29, 1996, Proceedings, volume 1119 of Lecture Notes in Computer Science, pages 406--421. Springer­Verlag, 1996. {7} [Fou98] C. Fournet. The Join­Calculus: a Calculus for Distributed Mobile Programming. PhD thesis, ’ Ecole Polytechnique, November 1998. {15} [GV92] J. F. Groote and F. W. Vaandrager. Structural operational semantics and bisim­ ulation as a congruence. Information and Computation, 100(2):202--260, 1992. {14} [Hoa78] C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666--677, August 1978. {5} [Hoa85] C. A. R. Hoare. Communicating Sequential Processes. Prentice­Hall, 1985. {5, 39, 41} [HY95] K. Honda and N. Yoshida. On reduction­based process semantics. Theoretical Computer Science, 152(2):437--486, November 1995. {15} [JR99] A. Je#rey and J. Rathke. Towards a theory of bisimulation for local names. In 14th Annual Symposium on Logic in Computer Science, 2--5 July, 1999, Trento, Italy, pages 56--66. IEEE Press, 1999. {15} [K˜on99] B. K˜onig. Generating type systems for process graphs. In CONCUR '99: Con­ currency Theory, 10th International Conference, Eindhoven, The Netherlands, August 24--27, 1999, Proceedings, volume 1664, pages 352--367. Springer­Verlag, 1999. {16} RR n° 4394 52 J.J. Leifer [Lei01] J. J. Leifer. Operational congruences for reactive systems. PhD thesis, Com­ puter Laboratory, University of Cambridge, 2001. Available in revised form as Technical Report 521,Computer Laboratory, University of Cambridge, 2001, from http://pauillac.inria.fr/#leifer/. {14, 16, 19, 24, 25, 33, 35, 36, 38, 42, 48} [Lei02] J. J. Leifer. Synthesising labelled transitions and operational congruences in reactive systems, part 2. 2002. Submitted for publication. Available from http://pauillac.inria.fr/#leifer/. {3} [L’ev78] J.­J. L’evy. R’eductions correctes et optimales dans le lambda calcul. PhD thesis, Universit’e Paris VII, 1978. {8} [LM00] J. J. Leifer and R. Milner. Deriving bisimulation congruences for reactive sys­ tems. In C. Palamidessi, editor, CONCUR 2000 ­ Concurrency Theory, 11th International Conference, University Park, PA, USA, August 22--25, 2000, Pro­ ceedings, volume 1877 of Lecture Notes in Computer Science, pages 243--258. Springer­Verlag, 2000. Available from http://pauillac.inria.fr/#leifer/. {16} [Low96] G. Lowe. Breaking and fixing the Needham­Schroeder public­key protocol using FDR. In T. Margaria and B. Ste#en, editors, Tools and Algorithms for Con­ struction and Analysis of Systems, Second International Workshop, TACAS '96, Passau, Germany, March 27--29, 1996, Proceedings, volume 1055 of Lecture Notes in Computer Science, pages 147--166. Springer­Verlag, 1996. {5, 41} [Mac71] S. Mac Lane. Categories for the Working Mathematician. Springer­Verlag, 1971. {18} [Mil80] R. Milner. A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer­Verlag, 1980. {5} [Mil88] R. Milner. Communication and Concurrency. Prentice­Hall, 1988. {5, 28, 36, 37, 38} [Mil92] R. Milner. Functions as processes. Mathematical Structures in Computer Science, 2(2):119--141, 1992. {6, 20, 35} [Mil96] R. Milner. Calculi for interaction. Acta Informatica, 33(8):707--737, 1996. {11} [Mil01] R. Milner. Bigraphical reactive systems: basic theory. Technical Report 523, Computer Laboratory, University of Cambridge, 2001. Available from http://www.cl.cam.ac.uk/#rm135/. {16} [Moo56] E. F. Moore. Gedanken­experiments on sequential machines. In C. E. Shannon and J. McCarthy, editors, Automata Studies, pages 129--153. Princeton University Press, 1956. {3} INRIA Synthesising Labelled Transitions and Operational Congruences in Reactive Systems, Part 1 53 [MPW89] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, parts I and II. Technical Report ECS­LFCS­89­85 and ECS­LFCS­89­86, Laboratory for the Foundations of Computer Science, University of Edinburgh, 1989. {6} [MPW92] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, parts I and II. Information and Computation, 100(1):1--77, September 1992. {6} [MS92] R. Milner and D. Sangiorgi. Barbed bisimulation. In Automata, Languages and Programming, 19th International Colloquium, ICALP '92, Vienna, Austria, July 13--17, 1992, Proceedings, volume 623 of Lecture Notes in Computer Science. Springer­Verlag, 1992. {15} [Par81] D. Park. Concurrency and automata on infinite sequences. In P. Duessen, editor, Proc. 5th GI Conference, volume 104 of Lecture Notes in Computer Science, pages 167--183. Springer­Verlag, 1981. {6, 26} [Pau98] L. C. Paulson. The inductive approach to verifying cryptographic protocols. J. Computer Security, 6:85--128, 1998. {39} [Plo75] G. D. Plotkin. Call­by­name, call­by­value, and the #­calculus. Theoretical Com­ puter Science, 1(2):125--159, December 1975. {20} [Plo81] G. D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI­FN­19, Department of Computer Science, University of Aarhus, 1981. {5} [Rei85] W. Reisig. Petri Nets: An Introduction. EATCS Monographs on Theoretical Computer Science. Springer­Verlag, 1985. {4} [Ros94] A. W. Roscoe. Model­checking CSP. In A. W. Roscoe, editor, A Classical Mind: Essays in Honour of C. A. R. Hoare, pages 353--378. Prentice­Hall, 1994. {5, 41} [Ros98] A. W. Roscoe. The Theory and Practice of Concurrency. Prentice­Hall, 1998. {4, 5, 39, 41, 42} [Sew] P. Sewell. From rewrite rules to bisimulation congruences. Theoretical Computer Science. To appear. {11, 15, 20, 43, 47, 48} [Sew98] P. Sewell. Global/local subtyping and capability inference for a distributed pi­ calculus. In Automata, Languages and Programming, 25th International Collo­ quium, ICALP '98, Aalborg, Denmark, July 13--17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science. Springer­Verlag, 1998. {7} RR n° 4394 54 J.J. Leifer [Sew00] P. Sewell. Applied # --- A brief tutorial. Technical Report 498, Computer Labo­ ratory, University of Cambridge, August 2000. {35} [SV99] P. Sewell and J. Vitek. Secure compositions of insecure components. In Proc. 12th Computer Security Foundations Workshop. IEEE Press, June 1999. {7} [SV00] P. Sewell and J. Vitek. Secure composition of untrusted code: wrappers and causality types. In Proc. 13th Computer Security Foundations Workshop. IEEE Press, July 2000. {39} [SW01] D. Sangiorgi and D. Walker. The #­calculus: a Theory of Mobile Processes. Cambridge University Press, 2001. {7} [TP97] D. Turi and G. Plotkin. Towards a mathematical operational semantics. In 12th Annual IEEE Symposium on Logic in Computer Science, Warsaw, Poland, June 29 -- July 2, 1997, pages 280--291. IEEE Press, 1997. {14} [WN95] G. Winskel and M. Nielsen. Models for concurrency. In D. M. Gabbay, S. Abram­ sky, and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, volume 4, pages 1--148. Oxford University Press, 1995. {4} INRIA Unité de recherche INRIA Rocquencourt Domaine de Voluceau ­ Rocquencourt ­ BP 105 ­ 78153 Le Chesnay Cedex (France) Unité de recherche INRIA Lorraine : LORIA, Technopôle de Nancy­Brabois ­ Campus scienti£que 615, rue du Jardin Botanique ­ BP 101 ­ 54602 Villers­lès­Nancy Cedex (France) Unité de recherche INRIA Rennes : IRISA, Campus universitaire de Beaulieu ­ 35042 Rennes Cedex (France) Unité de recherche INRIA Rhône­Alpes : 655, avenue de l'Europe ­ 38330 Montbonnot­St­Martin (France) Unité de recherche INRIA Sophia Antipolis : 2004, route des Lucioles ­ BP 93 ­ 06902 Sophia Antipolis Cedex (France) Éditeur INRIA ­ Domaine de Voluceau ­ Rocquencourt, BP 105 ­ 78153 Le Chesnay Cedex (France) http://www.inria.fr ISSN 0249­6399