Operational congruences for reactive systems

James Judi Leifer

University of Cambridge Computer Laboratory and Trinity College

5 September 2001

This document consists of a slightly revised and corrected version of a dissertation submitted for the degree of Doctor of Philosophy.

Declaration

Except where otherwise stated in the text, this dissertation is the result of my own work and is not the outcome of work done in collaboration. This dissertation is not substantially the same as any work that I have submitted for a degree, diploma, or other qualification at any other university. No part of this dissertation has already been or is being concurrently submitted for any such degree, diploma, or other qualification. This dissertation does not exceed sixty thousand words, including tables, footnotes, and bibliography.

Abstract

The dynamics of process calculi, e.g. CCS, have often been defined using a labelled transition system (LTS). More recently it has become common when defining dynamics to use reaction rules ---i.e. unlabelled transition rules--- together with a structural congruence. This form, which I call a reactive system, is highly expressive but is limited in an important way: LTSs lead more naturally to operational equivalences and preorders. So one would like to derive from reaction rules a suitable LTS. This dissertation shows how to derive an LTS for a wide range of reactive systems. A label for an agent (process) a is defined to be any context F which intuitively is just large enough so that the agent Fa (``a in context F '') is able to perform a reaction. The key contribution of my work is the precise definition of ``just large enough'', in terms of the categorical notion of relative pushout (RPO), which ensures that several operational equivalences and preorders (strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder) are congruences when sufficient RPOs exist. I present a substantial example of a family of reactive systems based on closed, shallow action calculi (those with no free names and no nesting). I prove that sufficient RPOs exist for a category of such contexts. The proof is carried out indirectly in terms of a category of action graphs and embeddings and gives precise (necessary and sufficient) conditions for the existence of RPOs. I conclude by arguing that these conditions are satisfied for a wide class of reaction rules. The thrust of this dissertation is, therefore, towards easing the burden of exploring new models of computation by providing a general method for achieving useful operational congruences.

Acknowledgements

I thank my supervisor Robin Milner for his inspiration, his support, and his warmth. I learned more about the ways of doing research and the style of presenting it from him than anyone else. I discussed almost every idea in this dissertation with my friends Luca Cattani and Peter Sewell. Peter was unwavering in his clear-sightedness, forcing me to account for my tacitly held assumptions at every turn, dissecting every idea, and setting me straight. Luca opened the world of category theory for me, revealing its beauty and simplicity and its use as a discipline for abstraction. I talked with many colleagues and learned much from their ideas.
Here is a partial list: Marcelo Fiore, Philippa Gardner, Georges Gonthier, Andy Gordon, Tony Hoare, Martin Hyland, Alan Jeffrey, Ole Jensen, Jean-Jacques Lévy, Michael Norrish, Andy Pitts, John Power, Edmund Robinson, Glynn Winskel, and Lucian Wischik. My office-mate David Richerby kept me sane in the final act of writing up with his wonderful humour. C.T. Morley (my Tutor at Trinity College) was always welcoming and ready to advise. Hazel Felton (Side F Secretary at Trinity College) watched over me with great warmth and helped me with great efficiency. I am indebted to John Roberts for his lucid and consistent thinking. Natalie Tchernetska's influence touches all parts of this dissertation. While carrying out this research I was supported by the following funding sources: EPSRC Research Grant GR/L62290; Trinity College Senior Rouse Ball Studentship; NSF Graduate Research Fellowship; Trinity College External Research Studentship; British Overseas Research Studentship; and Cambridge Overseas Trust Studentship.

Contents

Declaration
Abstract
Acknowledgements
1 Introduction
  1.1 Overview
  1.2 Historical background and motivation
  1.3 Contexts as labels
  1.4 Relative pushouts
  1.5 Other work
  1.6 Outline
2 Operational congruences for reactive systems
  2.1 Formalisation of reactive systems
  2.2 Categorical basis for contextual labels
  2.3 Labelled transitions and congruence for strong bisimulation
3 Further congruence results
  3.1 Introduction
  3.2 Functorial reactive systems
  3.3 Cutting and pasting portable IPO squares
  3.4 Strong bisimulation
  3.5 Weak bisimulation
  3.6 Traces preorder
  3.7 Failures preorder
  3.8 Multi-hole contexts
4 Sliding IPO squares
  4.1 Introduction and motivation
  4.2 Properties of A
  4.3 Construction of Ĉ
  4.4 Operations on Ĉ
  4.5 Construction of C
  4.6 Construction of F
5 Action graph contexts
  5.1 Introduction
  5.2 Action calculi reviewed
  5.3 Examples and a problem
  5.4 Closed shallow action graphs
  5.5 The well-supported precategory A-Ixt of arities and raw contexts
  5.6 Constructing a functorial reactive system
6 RPOs for action graph contexts
  6.1 Introduction
  6.2 The category G-Inc of action graphs and inclusion embeddings
  6.3 Relative coproducts
  6.4 Existence of relative coproducts
  6.5 Construction of a candidate from a scaffold
  6.6 Construction of a mediating inclusion embedding
  6.7 Construction of a relative coproduct
  6.8 Existence of RPOs
7 Expected properties of redexes
8 Conclusions and future work
A Review of category theory
B Labelled transitions via retractions
Bibliography

Chapter 1

Introduction

1.1 Overview

This dissertation is concerned with process calculi, which are mathematical models of computation. Process calculi come in many forms, but share several common features. Much as the λ-calculus isolated the canonical features of sequential computation, process calculi do the same for concurrent phenomena, such as non-determinism, synchronisation, and communication. By concentrating on a few core syntactic primitives, process calculi provide a setting in which to reason about these phenomena without the added baggage associated with a full programming language.

There are three critical ingredients beyond syntax that a process calculus may possess. I summarise these now and discuss each in greater detail later.

The first consists of unlabelled transitions, or reactions as I call them in this dissertation, which characterise only the internal state changes of an agent (a process). Reactions do not require interaction with the surrounding environment.

The second consists of labelled transitions, which characterise the state changes that an agent may undergo in concert with the environment. Each transition is associated with a label, which records the interaction with the environment (for example an output on a channel) that enables the transition. A label is thus an observation about the behaviour of an agent. Normally, the reactions of an agent are special labelled transitions for which the interaction with the environment is vacuous (so-called τ-transitions).

The third consists of operational preorders and equivalences, which characterise when one computation is respectively behaviourally replaceable by, and behaviourally equal to, another. An equivalence is most useful when it is a congruence, i.e. is preserved by the syntactic constructions of the calculus; in this case, proving the equivalence of two large agents can be reduced to proving the equivalence of components. Thus congruence is one of the most desirable properties an equivalence may possess. These remarks apply as well to operational preorders.

As I will describe later, much research has concentrated on finding general definitions of operational preorders and equivalences in terms of labelled transitions.
For any specific process calculus, it is then a challenge to prove that the preorder or equivalence in question is a congruence. But what happens when there are no labelled transitions, only reactions? The latter situation has become common since reactions provide a clean way of specifying allowable state changes without commitment to particular observables. Reactions are highly expressive but limited in an important way: they do not lead as naturally as labelled transitions do to congruential operational equivalences and preorders. The central problem addressed by this dissertation is as follows: From a process cal­ culus with only a reaction relation ---called a reactive system--- can we derive a tractable labelled transition relation? By tractable, I mean two things: the labels come from some small set; the labelled transitions readily yield operational preorders and equivalences that are congruences. My approach is to take a label to be a context (an agent with a hole in it). For any agent a, a label is any context F which intuitively is just large enough so that the agent Fa (``a in context F '') is able to perform a reaction. The key contribution of my work is to make precise the definition of ``just large enough'' in terms of the categorical notion of relative pushout (RPO) and to show that several operational equivalences and preorders (strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder) are congruences when su#cient RPOs exist. I also present a substantial example of a family of reactive systems based on closed, shallow action calculi (those with no free names and no nesting). I prove that su#cient RPOs exist for a category of such contexts. The proof is carried out indirectly in terms of a category of action graphs and embeddings and gives necessary and su#cient conditions for the existence of RPOs. I conclude by arguing that these conditions are satisfied for a wide class of reaction rules. The thrust of this dissertation is, therefore, towards easing the burden of exploring new models of computation by providing a general method for achieving useful operational congruences. In the rest of this chapter, I discuss some of the history of operational congruences and o#er some background as to why reactions have become prominent. I then sketch the solution strategy presented in this dissertation. I relate my approach to work by other researchers and to material I have jointly published. I conclude with an overview of structure of the chapters. 1.2 Historical background and motivation It may seem strange at first to investigate the ``derivation of operational congruences'' since congruence is usually a postulated notion in mathematics: to define a model, one chooses which elements are equivalent. But choosing has become di#cult for many models 1.2. Historical background and motivation 3 of computation, creating a gap between the properties such as reaction that are easy to postulate and those such as equivalence that are useful to have. To illustrate how this gap opened, let us look at some strands in the history of process calculi. Starting with the seminal work by E.F. Moore [Moo56] which examined ``finite automata from the experimental point of view'', theoretical computer scientists have pur­ sued the notion that the observable behaviour of a computation is more fundamental than the form in which the computation is written; in this view, two programs are equivalent when they are indistinguishable by an outside observer. 
In the case of automata, an observation consists of a labelled transition of the form a —x→ a′: automaton a can input the label x and become a′. (This is a slight simplification of Moore's original notion, which distinguished input from output labels.) One automaton refines another if the former has a subset of the labelled transitions of the latter, whether or not the two have different syntactic forms. For example, a ≜ x.(y+z) and b ≜ x.y + x.z both can input the same strings of labels, ⟨⟩, ⟨x⟩, ⟨xy⟩, ⟨xz⟩, so are equal (refine each other) despite their syntactic differences. This notion of comparing computations based on their labelled transitions is now known as the traces preorder; the interpretation of computations in terms of their traces is the traces model (see [Ros98]).

The traces preorder is attractive for its simplicity but is not faithful in comparing exactly when one automaton's labelled transitions are ``no worse'' than another's. Some of its inadequacies with respect to particular applications were overcome in the late 1970s by Hoare and Milner in ways that I describe below. Different applications for process calculi have different informal requirements for when the patterns of labelled transitions for two agents are the same (in the case of equivalences) or refine one another (in the case of preorders). No one definition can be faithful to all possible requirements. Given that, we can consider schemas parameterised by a labelled transition relation for defining different equivalences and preorders --- and thus take labelled transitions to be the postulated part of a process calculus from which equivalences and preorders are derived. This difference lends labelled transitions their power: a labelled transition relation captures the important interactions between a computation and its environment without making a commitment to a specific equivalence or preorder. As I will discuss later, many new process calculi for modelling distributed computation do not include axioms for labelled transitions but instead rest on an even simpler base.

The primary weakness of the traces preorder is its inability to handle the subtleties of non-determinism. This is evident if we consider the pair a, b defined earlier. After inputting x, a is in a state that can input either y or z; but b takes a ``silent'' choice (an internal choice not controllable by external influence) when inputting x and enters one of two states: either a state in which only y can be input or a state in which only z can be input. Even though a and b have identical traces, their non-deterministic behaviour is different.

The problem of handling non-determinism together with causality (the dependence of one transition upon an earlier one) was first addressed by Petri nets (see, for example, [Rei85]). In its simplest form, a Petri net consists of a directed hypergraph whose nodes are called conditions and whose hyperedges are called events. A token may reside at a condition. When the preconditions of an event hold (i.e. there is a token on every condition that is a source of the event hyperedge), then the event fires: the tokens are removed from the source conditions and placed on the target conditions. Non-determinism comes from the possibility that a token may enable several events to fire; causality of events is determined by the graph structure, which constrains the allowable flow of tokens and thus enables certain event firings only after others have occurred.
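The token game just described is simple enough to make executable. The sketch below is my own illustration, not material from the dissertation: conditions are represented as strings, a marking as the set of conditions carrying a token, and the two-event net is invented for the example.

(* Sketch of the firing rule for condition/event nets (my illustration).
   An event is enabled when every precondition carries a token; firing
   removes those tokens and places tokens on the postconditions. *)
module S = Set.Make (String)

type event = { pre : string list; post : string list }

let enabled marking e = List.for_all (fun c -> S.mem c marking) e.pre

let fire marking e =
  List.fold_left (fun m c -> S.add c m)
    (List.fold_left (fun m c -> S.remove c m) marking e.pre)
    e.post

(* Two events competing for the same token. *)
let e1 = { pre = [ "p" ]; post = [ "q" ] }
let e2 = { pre = [ "p" ]; post = [ "r" ] }
let marking = S.of_list [ "p" ]
let choices = List.filter (enabled marking) [ e1; e2 ]  (* both are enabled *)
let after_e1 = fire marking e1                          (* the token moves from p to q *)

Because e1 and e2 share the precondition p, the single token on p enables both, and firing either one disables the other; this is exactly the kind of non-deterministic branching that traces alone do not record, while the net's arcs record the causal ordering of firings.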
Researchers on Petri nets did not originally consider equivalences or preorders for them. However, event structures, which are similar to traces but account for the causality and non-determinism of events, are a setting for modelling Petri nets and other causal systems and yield notions of equivalence [WN95]. Because Petri nets do not have a compositional syntax, it is difficult to understand what it means for an equivalence to be a congruence. For this reason, I do not discuss them further and confine my attention to process calculi with compositional syntax.

Hoare's work on Communicating Sequential Processes (CSP) [Hoa78, Hoa85] and Milner's on the Calculus of Communicating Systems (CCS) [Mil80, Mil88] addressed the problem of comparing computations in a way that is sensitive to non-determinism, and hence distinguishes a from b (the examples I gave earlier).

Hoare's approach in CSP was to present every agent as an explicit set-theoretic construction including, amongst other data, failures; the latter are traces, each of which is enriched with a set of refusals indicating which labels may be refused by the agent after inputting the trace. For example, after inputting x, b can refuse y and can refuse z but a cannot refuse either. Each constructor in CSP (prefixing, parallel composition, choice, etc.) is defined in terms of the manipulation of failures (as a continuous function on the complete partial order of agents). The explicit set-theoretic representation of agents in CSP supports the design of model checkers [Ros94, Ros98] which are, for example, effective in detecting bugs in protocols [Low96]. This explicitness is also an obligation: experimentation with new constructors for combining agents requires the creation of a continuous function that manipulates failures; new preorders other than quotients of the failures preorder are not easily catered for. I concentrate in this dissertation, therefore, on the strand of research originating with CCS, but return to the failures preorder in Section 3.7 when proving that it is a congruence. To be fair, though, the heaviness involved in adding new constructors in CSP is not eliminated in CCS-like languages, but instead replaced by different obligations, as described below.

In Milner's CCS, agents have a free syntax and the labelled transition relation is generated by inference rules in the style of Plotkin's structured operational semantics [Plo81], e.g. the axiom for prefixing and the rule for sum:

  x.a —x→ a        if a —α→ a′ then a + b —α→ a′ .

Remark: Throughout this dissertation a, b, c, ... are used to denote agents even when this contradicts historical conventions.

The labels (ranged over by α, ...) comprise input channels x, y, z, ..., output channels x̄, ȳ, z̄, ..., and a special distinguished element τ. This terminology is somewhat misleading because no data is actually input or output; they are just complementary flavours. In the π-calculus, which I describe later, the differences are significant. The key idea in CCS is that of synchronisation. If one agent can output x and the other can input x then their parallel composition can synchronise on x. The result is a τ-transition which records the synchronisation but leaves the name on which it occurs anonymous:

  if a —x̄→ a′ and b —x→ b′ then a | b —τ→ a′ | b′ .

Milner considered several equivalences for CCS; the main ones were strong and weak bisimulation.
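Before turning to those equivalences, the rules above are easy to animate. The sketch below is my own illustration rather than anything from the dissertation: the constructor and function names are invented, and restriction, relabelling and recursion are omitted. It lists all labelled transitions of a finite CCS agent, deriving τ-transitions from complementary pairs exactly as in the communication rule just shown.

(* A toy CCS fragment and its labelled transition relation (my sketch). *)
type label = In of string | Out of string | Tau

type agent =
  | Nil
  | Prefix of label * agent   (* x.a  or  x̄.a *)
  | Sum of agent * agent      (* a + b *)
  | Par of agent * agent      (* a | b *)

(* All pairs (label, successor) such that the agent has that transition. *)
let rec trans = function
  | Nil -> []
  | Prefix (l, a) -> [ (l, a) ]
  | Sum (a, b) -> trans a @ trans b
  | Par (a, b) ->
      let left = List.map (fun (l, a') -> (l, Par (a', b))) (trans a) in
      let right = List.map (fun (l, b') -> (l, Par (a, b'))) (trans b) in
      let syncs =
        List.concat_map
          (fun (l, a') ->
            List.filter_map
              (fun (m, b') ->
                match (l, m) with
                | (Out x, In y) | (In x, Out y) when x = y ->
                    Some (Tau, Par (a', b'))
                | _ -> None)
              (trans b))
          (trans a)
      in
      left @ right @ syncs

let example = trans (Par (Prefix (Out "x", Nil), Prefix (In "x", Nil)))

Evaluating example gives three transitions for the agent x̄.0 | x.0: the output on x, the input on x, and the τ-transition to Nil | Nil produced by synchronisation.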
Both strong and weak bisimulation employ a coinductive form of definition which gives a powerful proof technique for comparing agents (an idea that originates with Park [Par81]) and provide a general way of defining an equivalence parameterised by a labelled transition system. Strong bisimulation relies on no assumptions about the underlying structure of the labels; weak bisimulation requires a distinguished τ-transition. I will discuss both in detail later (Section 2.3 and Section 3.5) but summarise the two now. Both strong and weak bisimulation are sensitive to non-determinism and are able to distinguish the pair a, b described earlier in the discussion of the traces preorder.

A strong bisimulation S is a relation on agents satisfying the following coinductive property: for all (a, b) ∈ S and all labelled transitions a —α→ a′, there exists b′ such that b —α→ b′ and (a′, b′) ∈ S (and vice versa). Informally, ``if a and b are S-related then whatever a can do, b can do too, and they both end up in S-related states; likewise for whatever b can do''. The largest strong bisimulation relation (which is the union of all strong bisimulations) is denoted by ∼. The largest weak bisimulation, denoted by ≈, is coarser because it is flexible in allowing τ-transitions to be collapsed and expanded when comparing two agents. For example, an agent that inputs x and then immediately outputs y would be related by weak bisimulation to one that inputs x, then has several τ-steps (internal computations), and finally outputs y.

Milner proved that both kinds of bisimulation are congruences (though not with respect to sum for weak bisimulation --- see Section 3.5 for further discussion). Such proofs require care in CCS and are more difficult in later calculi. This is the price of avoiding CSP-style explicit representations of agents and agent constructors. It is a central theme of this dissertation to provide mathematical tools for easing the burden of proving that operationally defined equivalences (such as bisimulation) are congruences.

The idea of using a τ-like transition to record a primitive computational step plays a central role in later calculi. These transitions are variously called reaction rules, reduction rules, firing rules, etc.; I shall use reaction rules throughout. In CCS, the τ transition relation requires the entire collection of labelled transition rules to generate it. A dramatic simplification was proposed in the Chemical Abstract Machine (CHAM) of Berry and Boudol [BB90, BB92] and used in work [Mil90] on the π-calculus of Milner, Parrow, and Walker [MPW89, MPW92]. These calculi were the first to employ a lightweight quotient of the agent terms, called a structural congruence, in order to make their reaction rules easy to define. I shall confine the discussion to the π-calculus. (CHAM treats the quotient more explicitly with ``heating and cooling'' rules, though the idea is similar.)

Structural congruence is an equivalence relation ≡ on agents that allows the parts of an agent to rearrange themselves, e.g.

  a | b ≡ b | a        a | (b | c) ≡ (a | b) | c        ···

The reaction relation ⟶, which characterises the primitive computational step (namely communication of a name over a channel), is simple to define; it is the smallest relation satisfying the rule

  x̄⟨y⟩.a | x(z).b ⟶ a | {y/z}b

that respects structural congruence (if a ≡ a′, a′ ⟶ b′, and b′ ≡ b, then a ⟶ b) and is closed under all non-guarding contexts C (e.g. c | −; see Section 2.1 for further discussion): if a ⟶ b then C[a] ⟶ C[b].
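The role that structural congruence plays in such a definition can be seen in a small executable sketch; what follows is my own illustration, not material from the dissertation. A fragment of a π-like calculus is flattened, up to associativity and commutativity of |, into a list of threads, so that one reaction step is simply a search for a complementary output/input pair wherever the two happen to sit. Input continuations are represented as functions of the received name, so substitution comes for free; all names and helper functions are invented for the example.

(* Agents as ``soups'' of threads, quotiented by the structural congruence
   for | simply by forgetting the nesting (my sketch). *)
type thread =
  | Out of string * string                  (* an output  x̄⟨y⟩   *)
  | In of string * (string -> thread list)  (* an input   x(z).b  *)

let remove_at i soup = List.filteri (fun j _ -> j <> i) soup

(* One reaction step, or None if no complementary pair exists.  The result
   is determined only up to structural congruence, so the contractum is
   simply placed at the front of the remaining soup. *)
let react soup =
  let indexed = List.mapi (fun i t -> (i, t)) soup in
  let candidates =
    List.concat_map
      (fun (i, t) ->
        match t with
        | Out (x, y) ->
            List.filter_map
              (fun (j, u) ->
                match u with
                | In (x', body) when x' = x -> Some (i, j, body y)
                | _ -> None)
              indexed
        | In _ -> [])
      indexed
  in
  match candidates with
  | [] -> None
  | (i, j, contractum) :: _ ->
      Some (contractum @ remove_at (min i j) (remove_at (max i j) soup))

The flattening does exactly the work described above: the complementary pair need not be adjacent in the written form of the agent for react to find it.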
Thus the agent x̄⟨y⟩ | (c | x(z).b) has a reaction even though the parts needed to enable the reaction --- namely x̄⟨y⟩ and x(z).b --- are not adjacent:

  x̄⟨y⟩ | (c | x(z).b) ≡ (x̄⟨y⟩ | x(z).b) | c ⟶ {y/z}b | c .

The ease with which reaction rules are defined in this style facilitated an outpouring of new process calculi for modelling encrypted communication [AG97], secure encapsulation [SV99], agent migration [CG98, Sew98, FGL+96] and so on. Each isolates a computational phenomenon and presents it via a reaction rule together with a structural congruence over some syntax. Here are two examples of reaction rules:

- In Cardelli and Gordon's ambient calculus, one ambient may move inside another (like a packet through a firewall):

    y[in x.a | b] | x[c] ⟶ x[y[a | b] | c] .

- In Sewell and Vitek's Box-π calculus, a message may move from inside to outside a wrapper and is decorated with the wrapper's name while doing so:

    y[x̄↑v | b] ⟶ x̄ȳv | y[b] .

In the π-calculus the communication of names along channels presented subtleties in the design of a labelled transition system and of equivalences (of which several are now studied [SW01]); the proofs of congruence for these equivalences require care. For many of the newer calculi, such as those listed above, the problem of choosing appropriate labelled transitions and proving that bisimulation is a congruence is difficult, and, in many cases, not attempted. In all these cases, experimentation is costly: a slight modification of the syntax or reaction rules often causes the labelled transition relation to change and breaks congruence proofs, forcing them to be reconstructed.

So the gap first described at the beginning of this section has widened as the data defining a typical process calculus have changed. Labelled transitions are no longer postulated primitives of a calculus but instead are teased out from the four fundamental components:

- syntax (agents and agent contexts);
- structural congruence;
- set of reactive (non-guarding) contexts;
- reaction rules.

I call a process calculus containing these components a reactive system. In this way, a reactive system closely resembles instances of the λ-calculus [Bar84]. The latter consist of a simple syntax, a structural congruence based on α-conversion, a set of reactive contexts (known as ``evaluation contexts'' [FF86]) chosen to force strategies such as call-by-name or call-by-value, and a reaction rule based on β-reduction. There is however an important difference which renders the problem of finding useful equivalences for process calculi more difficult: their reaction relations are usually neither confluent nor normalising. As a result, equivalences for λ-calculi, such as those based on normal forms or on termination properties [Bar84, Lév78], do not provide viable routes to follow. Therefore we need to consider equivalences based on bisimulation or other techniques that make no assumptions about confluence or normalisation properties. Yet bisimulation requires labelled transitions, which are not provided in a reactive system, and proofs of congruence, which can be difficult and fragile, as already discussed. The next section outlines the strategy for deriving labelled transitions which is the basis for the work in this dissertation.
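Before moving on, it is worth noting that for a finite LTS the coinductive definitions recalled in this section are directly computable. The sketch below is my own illustration, not material from the dissertation: it computes the largest strong bisimulation on a finite state space by starting from the full relation and repeatedly deleting pairs that violate the bisimulation clauses.

(* Greatest-fixed-point computation of strong bisimilarity (my sketch).
   [states] lists all states; [steps s] lists the transitions of s as
   (label, successor) pairs. *)
let bisimilarity states steps =
  let simulates rel p q =
    List.for_all
      (fun (l, p') ->
        List.exists (fun (m, q') -> m = l && List.mem (p', q') rel) (steps q))
      (steps p)
  in
  let ok rel (p, q) = simulates rel p q && simulates rel q p in
  let rec refine rel =
    let rel' = List.filter (ok rel) rel in
    if List.length rel' = List.length rel then rel else refine rel'
  in
  refine (List.concat_map (fun p -> List.map (fun q -> (p, q)) states) states)

Run on the two automata a ≜ x.(y+z) and b ≜ x.y + x.z of Section 1.2, the refinement first deletes the pairs relating y+z to y and to z, and then deletes (a, b) itself, even though the two have identical traces. A check of this kind applies only to a single finite LTS; congruence, by contrast, quantifies over all contexts of a calculus, which is the difficulty that the next section begins to address.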
1.3 Contexts as labels

We wish to answer two questions about an arbitrary reactive system consisting of agents (whose syntax may be quotiented by a structural congruence) and a reaction relation ⟶ (generated by reaction rules):

1. Can we derive a labelled transition relation whose labels come from a small set that intuitively reflects how an agent interacts with its environment?

2. Under what general conditions is strong bisimulation over this derived labelled transition relation a congruence?

We can begin to address question 1 by considering CCS. Let a, b range over agents, C, D, F range over contexts (agents with a hole), and x range over names. The usual labelled transitions —α→ for α ::= x̄ ∣ x ∣ τ reflect an agent's capability to engage in some behaviour, e.g. x̄.a | b has the labelled transition x̄ because x̄.a | b can perform an output on x. However, if we shift our emphasis from characterising the capabilities of an agent to the contexts that enable the agent to react, then we gain an approximate answer to question 1 by choosing the contexts themselves as labels; namely we define

  a —F→ a′  iff  Fa ⟶ a′        (1.1)

for all contexts F. (We denote context composition and application by juxtaposition throughout. We regard a context with a trivial hole as an agent, so application is a special case of composition.) Instead of observing that x̄.a | b ``can do an x̄'' we might instead see that it ``interacts with an environment that offers to input on x, i.e. reacts when placed in the context − | x''. Thus, x̄.a | b —(− | x)→ a | b.

The definition of labelled transition in (1.1) is attractive when applied to an arbitrary reactive system because it in no way depends upon the presence or absence of structural congruence. Furthermore, it is generated entirely from the reaction relation ⟶ (question 1); and strong bisimulation over the derived labelled transition relation is a congruence (question 2). The proof of the latter is straightforward: let C be an arbitrary context and suppose a ∼ b; we show that Ca ∼ Cb. Suppose Ca —F→ a′; by definition, FCa ⟶ a′, hence a —FC→ a′. Since a ∼ b, there exists b′ such that b —FC→ b′ and a′ ∼ b′. Hence Cb —F→ b′, as desired. The other direction follows by symmetry.

Nonetheless, the definition in (1.1) is unsatisfactory: the label F comes from the set of all contexts --- not the ``small set'' asked for in question 1 --- thus making strong bisimulation proofs intolerably heavy. Also, the definition fails to capture its intended meaning, namely that a —F→ a′ holds when a requires the context F in order that a reaction is enabled in Fa: there is nothing about the reaction Fa ⟶ a′ that forces all of F --- or indeed any of F --- to be used. In particular, if a ⟶ a′ then for all contexts F that preserve reaction, Fa ⟶ Fa′, hence a —F→ Fa′; thus a has many labelled transitions that reflect nothing about the interactions of a itself.

Let us unpack (1.1) to understand in detail where it goes wrong. Consider an arbitrary reactive system equipped with a set Reacts of reaction rules; the reaction relation ⟶ contains Reacts and is preserved by all contexts, as formalised by the following axiom and inference rule:

  l ⟶ r  if (l, r) ∈ Reacts        if a ⟶ a′ then Ca ⟶ Ca′ .

Expanding (1.1) according to this definition of ⟶ we have:

  a —F→ a′  iff  Fa ⟶ a′  iff  ∃(l, r) ∈ Reacts, D. Fa = Dl & a′ = Dr .        (1.2)

[Figure 1.1: a commuting square expressing Fa = Dl, with agents a, l as the lower sides and contexts F, D as the upper sides.]

The requirement Fa = Dl in (1.2) is rendered by a commuting square (as shown in Figure 1.1) in some category whose arrows are the agents and contexts of the reactive system.
This requirement reveals the flaw described earlier: nothing in (1.2) forces F and D to be a ``small upper bound'' on a and l. The next section explores the challenges involved in making precise what ``small upper bound'' means. 1.4 Relative pushouts For several years I tried to make precise what ``small'' means. I was mired in detailed arguments about specific examples of contexts for which I would search for a ``dissection lemma'', having roughly the following form: given Fa = Dl, there exists a ``maximum'' C, such that for some F # and D # we have F # a = D # l, F = CF # and D = CD # . In other words, dissection peels o# as much as possible from the outside of F and D, decomposing Fa = Dl as CF # a = CD # l. If the dissection of Fa = Dl peels o# nothing, i.e. produces the decomposition id Fa = id Dl, where id is the identity context, then F and D are ``a Chapter 1. Introduction 10 small upper bound''. But the problem then remains: when is C the ``maximum'' possible context that can be peeled o#? In this section I look at some examples of dissection in order to illustrate just how subtle the problem is. These examples motivate a solution that I introduce here and present in detail later in this dissertation. For some particular classes of contexts, it is easy to understand what to do. Consider this example of two compositions Fa = Dl in a free term algebra: # ### # ###-### # # # = # ### # #-## # # # #### # = ### # ####### . (For the sake of clarity, I use # for composition in this example.) The maximum context C that can be peeled o# is ### # #-##. Nothing more can be peeled o# because C = D. The only other possibilities are - and ##-#, neither of which is as big as C. So C it is! Of course I am arguing informally here, but it is straightforward to make this line of thinking precise. Free term algebras are the simplest setting in which to consider dissection exactly because they are free. When passing to a syntax quotiented by a non­trivial struc­ tural congruence, the question becomes di#cult, especially when compounded with nam­ ing structure such as in the #­calculus. Consider a typical #­calculus agent such as a “ = (#u)(•x#u# | x(z).•y#z#). Notice that a contains a parallel constructor |, which is associative and commutative and has identity 0; also, u and z are bound, so are sub­ ject to #­conversion. Furthermore, the name x is used twice (which requires attention when substituting another name for it) and is discarded after the reaction a a # , where a # “ = (#u)(•y#u#). This structure demands careful treatment and makes dissection for the #­calculus di#cult: for term algebra, one can incrementally peel o# a top­level function symbol, but for the #­calculus, there is no notion of a ``top­level'' constructor. The structural congruence of the #­calculus equates so many di#erent syntactic representations of the same agent that it is di#cult to understand where to begin. Without any naming structure, parallel composition becomes easier to handle, as shown by Sewell [Sew01]. As I explain in Section 1.5, it is the treatment of names that distinguishes the dissection results in this dissertation from his. A possible approach is to abandon tree­like syntax and to think in terms of graph­like syntax that automatically quotients out many ignorable di#erences. Even if a dissection result could be proved for some graph­theoretic representation of the #­calculus, it would not necessarily generalise smoothly to other calculi. 
As a result, I studied dissection for Milner's action calculi [Mil96], which are a family of reactive systems. The syntax of action calculi is su#ciently rich to embrace process calculi such as #­calculus, the #­calculus, and the ambient calculus. Action calculi are introduced in Section 5.2; here I confine my attention to a few salient features of their graphical form. 1.4. Relative pushouts 11 out out out nu nu in x y x y Figure 1.2: A graphical representation of a #­calculus reaction (#u)(•x#u# | x(z).•y#z#) (#u)(•y#u#) Consider the example shown in Figure 1.2; it illustrates a pair of action graphs --- agents in an action calculus--- that represent the #­calculus reaction a a # given earlier. An action graph consists of nodes (rectangles with two blunt corners) and arcs: . Nodes are labelled with controls from a control signature of primitives. Each action calculus may have a di#erent control signature, such as {nu, in, out} for a simple #­ calculus without replication or choice, or {ap, lam} (application and #­abstraction) for the #­calculus. . Arcs represent names. They connect source ports to target ports. The source ports are arrayed on the left interface of an action graph (such as is shown for the action graph nested inside in) and on the right side of nodes (e.g. nu); they also include free names (e.g. y). The target ports are arrayed on the right interface of action graphs and the left side of nodes. Arcs may be forked from a source port (such as the arcs from x in the LHS) and may be discarded (such as the arc from x in the RHS). How can we perform dissection on action graphs? It is the wiring structure (the arcs) of an action graph that makes dissection so di#cult. Consider the example of a composition Fa = Dl shown in Figure 1.3. (This example is based on one first suggested by Philippa Gardner.) The contexts F and D are just action graphs with a hole (shown as a shaded rectangle). The composition Fa is formed by placing a inside the hole of F and joining up the corresponding arcs. By reasoning about the controls it is possible to see which ones can be peeled o#: F has N,M and D has N,L, so C might have N (the intersection of the controls in F and D). Indeed, the C shown does have just one control. But what arcs should C have? Does the loop shown in C make C ``maximal'' in some sense? Should C have other forked and discarded arcs? How can we choose? Chapter 1. Introduction 12 M N L K (0, 0) L K K L N (0, 0) (0, 0) (0, 0) (1, 1) (0, 2) (2, 0) M M a l D # F # C N M D F L N K Figure 1.3: A surprising dissection involving reflexion The example in Figure 1.4 is even more curious: F and D have no nodes at all --- they are pure wiring contexts. Since F = D and a = l, one might expect that the ``maximal'' common part that can be peeled o# of Fa = Dl is F = D itself. This is not true! The triple F # , D # , C provides a better dissection in a way that I will make precise below. The point of these examples is to demonstrate that informal reasoning about dissection is di#cult when dealing with contexts containing wiring: There is almost no way to decide just by looking at the arcs themselves whether one dissection is better than another. Out of this murkiness finally emerged a clearer way. I came upon an abstract charac­ terisation, called a relative pushout, of which dissections are ``best''. By abstract, I mean that the characterisation makes no use of any specific properties of contexts except their composability, thus can be instantly cast as a category theoretic property. 
Consider the outside commuting square in Figure 1.5 which shows Fa = Dl. A relative pushout (RPO) for this square is a triple F′, D′, C satisfying two properties: first, the triple is a candidate, i.e. F′a = D′l and CF′ = F and CD′ = D; second, for any other candidate F̂′, D̂′, Ĉ, there exists a unique mediating J making all the triangles commute (as shown).

[Figure 1.4: A surprising dissection involving forking; the pure wiring contexts F = D over the agents a = l are dissected by F′, D′ and C.]

[Figure 1.5: A relative pushout; the outer square Fa = Dl with a candidate F′, D′, C, a second candidate F̂′, D̂′, Ĉ, and the unique mediating arrow J.]

RPOs are the key contribution of this dissertation. It is by working with them, rather than trying to come up with ad hoc ways of dissecting contexts in specific examples, that we gain two important advantages:

- RPOs are abstract: as I said before they do not rely on any specific properties of contexts except composability. By defining labelled transitions in terms of RPO constructions, it is possible to carry out proofs of congruence (see Chapter 2 and Chapter 3) for several operational equivalences and preorders (strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder), for which the proofs do not depend on any specific properties of contexts except the existence of RPOs. Thus these results are applicable to any reactive system which possesses sufficient RPOs.

- RPOs provide a unifying discipline for analysing contexts in specific examples. The fuzziness about which dissection is ``best'' in the examples of action graphs shown earlier disappears: we want the candidate triple F′, D′, C from which there is a unique mediator to any other candidate. By this requirement, the candidates shown in Figure 1.3 and Figure 1.4 are ``best''. (I put quotation marks around best because there may be many best candidates, but all of them are isomorphic to each other by standard categorical reasoning.) The problem of finding RPOs for graphs is non-trivial, as shown in Chapter 6, but one is sustained by the unambiguity of the task: there is no vagueness about the properties required of an RPO.

Thus, the notion of an RPO does not solve the problem of finding a dissection, but it makes the problem well-defined and provides a reward for the effort by virtue of the congruence proofs that rely on the existence of sufficient RPOs.

1.5 Other work

This section brings together some of the important related work. There is a large collection of literature about process calculi, some of which I referred to in previous sections. I will concentrate on those pieces of research that most closely impinge on the specific problems that I address in this dissertation.

The idea of finding conditions under which a labelled transition relation yields an operational congruence has been thoroughly studied in work on structural operational semantics (SOS) [GV92, TP97]. The principle is to postulate rule formats, conditions on an inductive presentation of a labelled transition relation that ensure that operational equivalences (e.g. weak bisimulation [Blo93]) are congruences. There is a fundamental difference between this problem and the one I am looking at. The work on SOS presumes that a labelled transition relation is already given: the problem is to show that if it satisfies a particular format then the congruences follow.
My work takes reaction rules as primitive, not labelled transitions: I aim to derive a labelled transition relation from the reaction rules for which operational equivalences (and preorders) are congruences. In my case, the derived labelled transition relation is not inductively presented. It is an open question whether it can be inductively presented and, further, whether this presentation satisfies some well­known rule format from SOS theory: this possibility is not explored in the dissertation but is discussed in greater detail in Chapter 8, which covers future work. The problem of deriving operational congruences for process calculi from a reaction relation and not from a labelled transition relation is studied in the work on barbed bisimu­ lation [MS92], insensitivity observation [HY95], and testing equivalence [DH84]. The first two construct equivalences by augmenting bisimulation over the reaction relation with observations about related states. For example, in the former, the observations are barbs which detect the ability of an agent to perform an output on a channel. The last (testing) compares two agents by their ability to satisfy the same ``may'' and ``must'' tests. Thus all three depend on primitive observations (not just reactions) though these can be simpler than labelled transitions. The main obstacle to their use is the fact that they do not yield congruences but instead need to be ``closed up'' by all contexts. For example, it is straightforward to prove barbed equivalence in particular cases but di#cult to show barbed congruence because of the heavy quantification over all contexts. Work by Fournet [Fou98] and by Fournet and Gonthier [FG98] ease this burden with techniques that allow a proof of barbed congruence to be broken into pieces, each of which may be carried out using other congruence relations. 1.5. Other work 15 Je#rey and Rathke [JR99] used contexts as the basis for the labels of an LTS in the case of the #­calculus (a variant of the #­calculus with fresh name creation). They did not derive uniformly these labels from a reaction relation but they were guided by the intuition that the labels are small contexts that enable a reaction. These labels give them the right observational power to obtain useful congruences based on bisimulation. Sewell's work [Sew01] is the closest to mine of all the material I have cited here. He studied the problem of deriving contextual labels for a family of reactive systems (parameterised by arbitrary reaction rules) whose syntax consists of free terms quotiented by a structural congruence for parallel composition. He defined labelled transitions by explicitly reasoning about how a label overlaps with a redex. From this definition, he proved that bisimulation is a congruence by appealing to his specific dissection results (not motivated by RPOs). There are three important di#erences in approach: . He dealt with multi­hole redexes which capture the full uniformity of metavariables, thus leading to lighter labelled transitions than I can derive with my present RPO technology. I discuss possible remedies for this in Chapter 8. . He invented explicitly his definition of labelled transition and his statement of dis­ section, both of which are complex, without employing RPOs (which I worked with later). As as result, his proof of congruence is not easily generalisable to other syn­ tactic families and other operational equivalences. 
It seems likely, however, that his dissection results imply the existence of RPOs for the class of reactive systems he considered, and it would be worth trying to recast them so as to make this precise. . He confined his attention to free term algebras with parallel composition and did not handle wiring structure such as is shown in Figure 1.4. I do handle wiring in Chapter 6 where the discipline of seeking RPOs guides the dissection involved. Given the complexity of Sewell's definition of labelled transition and of his statement of dissection, it is di#cult to see how these could be generalised to embrace wiring without the benefit of the notion of RPOs or other universal constructions. In this dissertation, I have made no use of the double pushout techniques developed in graph rewriting [Ehr79]. These are a way to describe the occurrence of a subgraph ---especially a redex--- in a graph. To avoid confusion, I should emphasise that relative pushouts play quite a di#erent role. In my work, subgraph occurrences are handled by embeddings and contexts; the nature of the graphs (with forked wiring) seems to require a specific approach. But it would be useful to examine in the future how the embeddings relate to the double pushout construction, and how graph­theoretic representations of the syntax of the #­calculus and other calculi formed by quotienting out structural congruence compare to similar work in graph rewriting [CM91, K˜on99]. Chapter 1. Introduction 16 Some of my work has been done in collaboration with Milner and all of it under his supervision. I wish to draw attention to two relevant pieces of published work for which I am a joint author. All other work presented in this dissertation that is not mentioned here is my own. . Leifer and Milner: ``Deriving bisimulation congruences for reactive systems'' [LM00a]. This paper introduces RPOs and gives a proof of congruence for strong bisimulation, thus overlapping with some of the material in Chapter 2. Milner collaborated with me on the categorical manipulations, which are critical to the proof of congruence. . Cattani, Leifer, and Milner: ``Contexts and embeddings for closed shallow action graphs'' [CLM00]. This technical report demonstrates the existence of RPOs for a category of action graph contexts by relating RPOs to relative coproducts in a category of action graph embeddings, thus overlapping with some of the material in Chapter 5 and Chapter 6. Cattani's contribution was to recast the opfibration (originally investigated by Milner and me) between the embeddings category and the contexts category in terms of the coslice of the latter. Milner and I worked jointly on the definition of context composition and of embedding. I am responsible for the proof of the existence of relative coproducts (given in full detail in Chapter 6). 1.6 Outline The subsequent chapters of this dissertation are organised in the following way: Chapter 2: I make precise the notion of a reactive system and give a definition of labelled transition in terms of idem pushouts (IPO), a sister notion of RPOs. I then prove a series of results using simple categorical reasoning that shows how to manipulate IPOs and RPOs. The main theorem then follows by direct use of the categorical results from the chapter: if su#cient RPOs exist then strong bisimulation is a con­ gruence. I conclude by reproving the same theorem in a cleaner way by isolating two lemmas which give derived inference rules for labelled transitions. 
Chapter 3: I generalise the definition of reactive system by enriching it so as to com­ prise two categories with a functor F between them. The idea is that the downstairs category is the one in which one wishes to consider agents and contexts, but for which enough RPOs might not exist. The upstairs category does have RPOs but at the cost of extra intensional information in the arrows and objects. By refining the definition of labelled transition so that it relates arrows downstairs in terms of IPO properties of their preimages upstairs, I obtain congruence results for strong bisim­ ulation, weak bisimulation, the traces preorder, and the failures preorder. Finally, I 1.6. Outline 17 propose added structure needed in a functorial reactive system to cater explicitly for RPOs that yield multi­hole contexts. I conclude by proving that strong bisimulation is a congruence here as well. Chapter 4: The proofs of congruence for the preceding chapter rest on some hypotheses about the functor F . The most important one of these is that F allows an IPO square upstairs to ``slide'' so as to leave the F­image of the square invariant. This chapter eases the burden of showing that these hypotheses are satisfied by giving an abstract way of generating a functor satisfying them from simpler data. The chapter started by assuming the existence of a well­supported precategory A, which is like a category but lacks some arrow compositions and has extra structure for extracting and renaming the support of an arrow. The motivation for this comes from the raw contexts in Chapter 5 for which composition of two contexts is not defined unless their node sets (supports) intersect trivially. I derive from A two categories and a functor between them. The upstairs category is formed from A by adding extra information to the objects, so as to make composition total. The downstairs category is formed by quotienting away the supports. The functor F maps arrows to their quotient equivalence classes. By reasoning abstractly, I show that F allows IPO sliding and has all of the other properties required of functorial reactive systems in Chapter 3. By instantiating these results, as in Chapter 5 for example, one gets ``for free'' a functorial reactive system provided that one starts with a well­supported precategory, a light burden. Chapter 5: This chapter gives a significant example of a family of functorial reactive systems. The contexts are derived from closed, shallow action graphs, those with no nesting and no free names. Their graph structure includes forked and discarded wiring connecting interfaces on each node and on the inside and outside of each context. By instantiating the results of the preceding chapter, we construct a func­ torial reactive system with the following properties: the downstairs category does not distinguishes the occurrences of controls, as desired when modelling agents of a process calculus; the upstairs one does distinguish them, thus providing su#cient RPOs as shown in Chapter 6. Chapter 6: I take up the task of proving that su#cient RPOs exist in a category of action graph contexts. My strategy is to follow an indirect route: rather than try to construct RPOs directly in a category of contexts, I shift the problem to that of looking for a related construction called a relative coproduct in a category of action graphs and embeddings. 
The motivation for this choice is to avoid some of the heaviness inherent in manipulating contexts: the equality of two decompositions, such as in C 0 a 0 = C 1 a 1 , contains much redundant information about the nodes and arcs of the component contexts. The important data is how a 0 and a 1 overlap in the Chapter 1. Introduction 18 common action graph C 0 a 0 = C 1 a 1 . The embeddings category is the right setting for analysing this: each embedding is a tuple of functions mapping the nodes and ports of one action graph into another. Thus, given two embeddings # i : a i b into the same action graph b “ = C 0 a 0 = C 1 a 1 , it is clear how the nodes and arcs of a 0 and a 1 overlap. The chapter defines embeddings and gives necessary and su#cient conditions for the existence of relative coproducts. A simple corollary then shows that su#cient RPOs exist in the contexts category provided that the redexes satisfy a constraint on their wiring. As a result, all of the congruence proofs in Chapter 3 are applicable to the family of functorial reactive systems formed from action graphs. Chapter 7: The preceding chapter showed that su#cient RPOs exist provided the wiring present in redexes is constrained. This chapter argues that this constraint is rea­ sonable and shows an example of undesirable non­determinism in the pattern of reactions when the constraint is not satisfied. It transpires that by confining redexes so as to satisfy two further ``reasonable'' wiring requirements on redexes, a beautiful categorical property holds, namely that the preimages of redexes are epis. The epi property may be exploited abstractly (Chapter 3) to prove that id­labelled transi­ tions and reactions correspond exactly. The chapter concludes by speculating about other possible concrete and abstract constraints on redexes and the benefits gained when they are satisfied. Chapter 8: This chapter first reviews some of the accomplishments of this dissertation and then discusses future research. The latter includes two main lines. The first is the extension of the action graph contexts handled in Chapter 5 to include richer features such as nesting, free names, binding, and full reflexion. The second is the extension of graphs and of functorial reactive systems to deal with multi­hole redexes in order to capture the full uniformity of reaction rules with metavariables. The goal of both of these is to create a shared theory that is su#ciently powerful to provide lightweight labelled transitions and useful operational congruences for current and future process calculi Appendix A: This appendix reviews the category theory required in this dissertation. Appendix B: The second appendix considers a variation based on retractions of the labelled transitions defined in Chapter 3; the result is that id­labelled transitions match the reaction relation exactly. Chapter 2 Operational congruences for reactive systems 2.1 Formalisation of reactive systems In this section I investigate how to give reactive systems, which were introduced informally in Section 1.2, a precise category­theoretic definition. The goal is to include appropriate categorical structure so that it is possible to derive labelled transitions and prove that several operational equivalences and preorders are congruences. To that end, I first study relative pushouts (RPOs) and idem pushouts (IPOs), which are universal constructions related to pushouts. I then show how to derive labelled transitions from a set of reaction rules, with IPOs playing the central role. 
Finally I prove by categorical reasoning that if su#ciently many RPOs exist then strong bisimulation is a congruence. The next chapter considers richer notions of equivalences and reactive systems, proving that the former are congruences. This dissertation employs only basic category theory, such as simple universal con­ structions, slices (and coslices), monoidal structures, and functors. A summary of all the required theory is given in Appendix A. Throughout this chapter, I use lowercase roman letters a, b, . . . for agents (processes) and uppercase roman letters C, D, . . . for agent contexts (process contexts). Both are arrows in a category of contexts (as explained below). Juxtaposition is used for categorical composition. Other notation is explained as it comes up. How do we get at the essence of reactive systems, i.e. how do we find mathematical structure that is simple enough to get general results about congruences and rich enough to encompass significant examples? The key ingredients of a reactive system were shown in Section 1.2 and are recalled here: . syntax (agents and agent contexts); 19 Chapter 2. Operational congruences for reactive systems 20 . structural congruence; . set of reactive (non­guarding) contexts; . reaction rules. For each, I outline the mathematical design space and explain the decisions I have taken. syntax: Since contexts are composable, I take them to be the arrows of some category C. This presents an immediate question. Are the objects of C agents or sorts? Following the former route leads to a problem: If we think of a context C as an embedding of an agent a into an agent a # , i.e. `` C : a a # '', then there is no easy way to apply C to a di#erent agent. For example, we cannot state the congruence property of an equivalence #: if a # b for agents a and b then C ``applied to'' a and C ``applied to'' b are #­equivalent for all contexts C. In Chapter 6 I return to embeddings (which are critical in proving the existence of universal constructions for categories of graphs) but concentrate now on a category C of contexts and sorts. If the objects are the sorts (or ``types'' or ``arities'') of the contexts, then agents are a special subclass of contexts. In particular, agents are contexts with a null hole, i.e. contexts whose domain is some distinguished object 0. Now the congruence property of # is neatly rendered: if a # b for all arrows a, b : 0 m, then Ca # Cb for all arrows C with domain m. For concreteness, consider as an example a category of contexts for some Algol­like programming language. The objects of the category could comprise the usual types: bool , int , cmd . Then we have the following examples of arrows: C 0 “ = if - then x := 0 else skip : bool cmd C 1 “ = 14 < - : int bool C 0 C 1 = if 14 < - then x := 0 else skip : int cmd . Another example is of a category of linear multi­hole term algebra contexts over some signature #. These are considered in more detail in Section 3.8. The objects are natural numbers. The arrows m n are n­tuples of terms over # # {- 1 , . . . , -m }, where each symbol - i is used exactly once. For example, if # = {#, # # , #, #}, where # and # # are constants, # is a 1­place function symbol, and # is a 2­place function symbol, then: C 0 “ = ###- 2 , # # #, #, ##- 1 ## : 2 3 C 1 “ = ##, ### # ## : 0 2 C 0 C 1 = ###### # #, # # #, #, ##### : 0 3 . 2.1. Formalisation of reactive systems 21 structural congruence: The main decision here is whether to make structural congru­ ence explicit or implicit. 
The simplest solution (which is the one taken in this dissertation) is to leave it implicit in the definition of arrow equality --- thus the arrows are structural equivalence classes of contexts. Consequently, certain categories (such as those of graph contexts, see Chapter 5) do not have enough universal constructions to give the desired congruence results. In these cases, we are forced to look for the universal constructions in less quotiented categories and then to exhibit functors with special properties back to the fully quotiented categories (see Section 3.2).

set of reactive contexts: This is modelled by a set D of arrows. Since reactive contexts are composable and identity contexts are reactive, I take D to be a subcategory of C. Furthermore, decomposing reactive contexts yields reactive contexts, so D1 D0 ∈ D implies D1, D0 ∈ D. For example, in the call-by-value λ-calculus [Plo75], the reactive contexts consist of all compositions of the following contexts: −, ap(v, −), and ap(−, a), where v is any value (closed abstraction) and a is any term. In the π-calculus (see [Mil90]), the reactive contexts consist of all compositions of the following contexts (closed under structural congruence): −, (νx)(−), and − | a, where x is any name and a is any process. Reactive contexts correspond to the evaluation contexts of Felleisen and Friedman [FF86].

reaction rules: These are given by a set Reacts of redex-contractum pairs of agents (l, r) with common codomain, i.e. (l, r) ∈ Reacts implies that there is an object m of C such that l, r : 0 → m. For simplicity, I consider redexes and contractums that are pure agents, not agents with meta-variables (i.e. contexts). Thus, to define the reactions of CCS, we let

    Reacts ≜ {(x̄.a | x.b, a | b) / x is a name and a, b are agents}

rather than:

    Reacts ≜ {(x̄.−1 | x.−2, −1 | −2) / x is a name} .

I use / throughout this dissertation for set comprehensions. The latter approach maintains the maximum uniformity present in rules and is considered in detail by Sewell in [Sew01]; however, that approach is complex and would require future work to adapt it to the categorical setting of this dissertation (see Chapter 8).

Distilling the structures described in the preceding paragraphs yields the following definition of a reactive system and a reaction relation:

Definition 2.1 (reactive system) A reactive system consists of a category C with added structure. We let m, n range over objects. C has the following extra components:
• a distinguished object 0 (not necessarily initial);
• a set of reaction rules called Reacts ⊆ ⋃_{m ∈ obj C} C(0, m)², a relation containing pairs of agents with common codomain;
• a subcategory D of C, whose arrows are the reactive contexts, with the property that D1 D0 ∈ D implies D1, D0 ∈ D. □

Definition 2.2 (reaction relation) Given a reactive system C, the reaction relation ⟶ ⊆ ⋃_{m ∈ obj C} C(0, m)² contains pairs of agents with common codomain and is defined by lifting the reaction rules through all reactive contexts: a ⟶ a′ iff ∃(l, r) ∈ Reacts, D ∈ D. a = Dl & a′ = Dr. □

We now have enough definitions to make precise the first approximation for labelled transitions given in (1.1). The only change is that we think of composing arrows rather than ``applying contexts'' and we are careful about which contexts are reactive (by writing D ∈ D below):

Definition 2.3 (labelled transition --- first approximation) a −F→ a′ iff Fa ⟶ a′ iff ∃(l, r) ∈ Reacts, D ∈ D. Fa = Dl & a′ = Dr.
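To fix intuitions, here is a minimal executable sketch (not from the dissertation) of Definitions 2.1-2.3 for a toy instance, written in OCaml. The signature {gamma, delta, beta}, the single reaction rule delta(gamma) ⟶ gamma, and the assumption that every one-hole context is reactive are all invented for the example; the point is only how the reaction relation arises by closing a single redex-contractum pair under contexts.

(* Terms over a tiny signature: a constant gamma, a unary delta, a binary beta. *)
type term =
  | Gamma
  | Delta of term
  | Beta of term * term

(* A single illustrative reaction rule (l, r): delta(gamma) reacts to gamma. *)
let redex = Delta Gamma
let contractum = Gamma

(* One-step reaction: a --> a' iff a = D[l] and a' = D[r] for some reactive D.
   Taking every one-hole context to be reactive, this amounts to rewriting the
   redex to the contractum at any single position; we return all results. *)
let rec react (a : term) : term list =
  let here = if a = redex then [contractum] else [] in
  let inside =
    match a with
    | Gamma -> []
    | Delta t -> List.map (fun t' -> Delta t') (react t)
    | Beta (s, t) ->
        List.map (fun s' -> Beta (s', t)) (react s)
        @ List.map (fun t' -> Beta (s, t')) (react t)
  in
  here @ inside

let () =
  (* beta(delta(gamma), gamma) has exactly one reaction, to beta(gamma, gamma). *)
  assert (react (Beta (Delta Gamma, Gamma)) = [Beta (Gamma, Gamma)]);
  (* Under Definition 2.3, gamma would have a labelled transition with label
     delta(-), since delta(-) applied to gamma is exactly the redex. *)
  assert (react (Delta Gamma) = [Gamma])

A real instance would of course take the contexts themselves as the arrows of a category; the sketch flattens that structure for brevity.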
# # # # # 0 a l F D 2.1 The commuting square to the right renders the equality Fa = Dl. As I argued in Section 1.2, there may be ``junk'' in F and D, i.e. parts of F and D that do not contribute to the reaction. For example, in a category of CCS contexts, the outside square in Figure 2.2 commutes. So, by the naive definition of labelled transitions given above, • x.a -|x.b|y a | b | y (2.1) But the y in the label is superfluous. Is there a general condition on Figure 2.1 that prevents this labelled transition, but still allows the following: • x.a -|x.b a | b ? 2.2. Categorical basis for contextual labels 23 # # # # 0 • x.a • x.a|x.b -|x.b|y -|y -|x.b - -|y 2.2 Informally, the condition would state that there is no lesser upper bound in Figure 2.1 for a, l than F, D. In Figure 2.2 there clearly is a lesser upper bound, as illustrated by the triple of arrows inside the square. In the following sections I render this condition in terms of categorical constructions and incorporate it in a new definition of labelled transitions. I then show that strong bisimulation is a congruence with respect to this labelled transition relation. A variety of preorders and other operational equivalences are discussed in the next chapter. 2.2 Categorical basis for contextual labels The goal of this section is to find a tractable definition of a labelled transition relation, one which readily leads to congruential equivalences and preorders and moreover facilitates proofs concerning these relations. Intuitively, the labels represent just the information exchanged between an agent and its environment in order to make a reaction. # # # # a l F F # D D # GG 2.3 Following the intuitions of the previous sections concerning Figure 2.1, the natural question is as follows. How can F, D be forced to contain no ``junk''? A possible solution is to require that F and D are a ``least upper bound'' for a and l. The typical way to formulate this is to state that the square in Figure 2.3 is a pushout, i.e. has the property: Fa = Dl, and for every F # and D # satisfying F # a = D # l there exists a unique G such that GF = F # and GD = D # , as shown here. Unfortunately, pushouts rarely exist in the categories that interest us. Consider, for example, a category of term contexts over a signature #; its objects consist of 0 and 1; its arrows 0 1 are terms over #; its arrows 1 1 are one­hole contexts over #; there are no arrows 1 0 and exactly one arrow id 0 : 0 0. Now, if # contains only constant symbols, say # = {#, # # }, then there is no pushout completing Figure 2.4(1) because there are no contexts other than the identity. If we introduce a 2­place function symbol # into #, we can construct an upper bound for # and # # but still no pushout (Figure 2.4(2)). A more refined approach is to assert that F and D are a ``minimal upper bound'' --- informally, an upper bound for which there are no lesser upper bounds. Before defining this notion in terms of idem pushouts (IPOs), I give a more basic construction, namely that of relative pushouts (RPOs). The latter, unlike pushouts, exist in many categories of agent contexts. The plan for the rest of this section is to develop a sequence of propositions that will serve as a basis for the proofs of congruence by categorical reasoning given in a later section of this chapter and in subsequent chapters of the dissertation. Chapter 2. 
Operational congruences for reactive systems 24 0 1 1 1 # # # # # 2.4(1) 0 1 1 1 1 # # # ##-,# # # ###,-# ### # ,-# ##-,## # # 2.4(2) Figure 2.4: Non­existence of pushouts m f 0 f 1 g 0 g 1 2.5(1) m f 0 f 1 g 0 g 1 h 0 h 1 h 2.5(2) m f 0 f 1 g 0 g 1 h 0 h 1 h h # h # 0 h # 1 kk 2.5(3) Figure 2.5: Construction of an RPO Because RPOs and IPOs are categorical constructions independent of reactive systems, I shall work in this section with an arbitrary category C whose arrows and objects I denote by f, g, h, k, x, y, z and m,n; in pictures I omit labels on the objects when possible. Definition 2.4 (RPO) In any category C, consider a commuting square (Figure 2.5(1)) consisting of g 0 f 0 = g 1 f 1 . An RPO is a triple h 0 , h 1 , h satisfying two properties: commutation: h 0 f 0 = h 1 f 1 and hh i = g i for i = 0, 1 (Figure 2.5(2)); universality: for any h # 0 , h # 1 , h # satisfying h # 0 f 0 = h # 1 f 1 and h # h # i = g i for i = 0, 1, there exists a unique mediating arrow k such that h # k = h and kh i = h # i (Figure 2.5(3)). # A triple, such as h # 0 , h # 1 , h # given above, that satisfies the commutation property, i.e. h # 0 f 0 = h # 1 f 1 and h # h # i = g i for i = 0, 1, is often called a candidate. Thus an RPO triple is a candidate for which there is a unique mediating arrow from it to any other candidate. An RPO for Figure 2.5(1) is just a pushout in the slice category of C over m. Thus an RPO is a standard combination of categorical constructions --- though it is not commonly used in category theory and its application to reactive systems is novel. In later chapters, I illustrate the existence of RPOs for categories of graphs. For concreteness, though, it is worth examining now the example of an RPO and another 2.2. Categorical basis for contextual labels 25 #### # ####-## - ######-### ##-# ####-## ##-# ##-# ####-## ##-# Figure 2.6: An example of an RPO and another candidate candidate shown in Figure 2.6. The arrows are in a category of term algebra contexts over the signature {#, #, #, #}, where # is a constant and #, #, # are 1­place function symbols. The RPO triple -, ##-#, ####-## adds just the minimal extra bit of context ##-# to # in order to get an upper bound for #### and #; the arrow ####-## then provides the extra junk necessary to recover the upper bound provided by the surrounding square. The reader may enjoy checking that the candidate triple ##-#, ####-##, ##-# is the only other non­trivial one possible and that the mediating dotted arrow ##-# is unique. A square is called an IPO if it has an RPO of a special kind: Definition 2.5 (IPO) The commuting square in Figure 2.5(1) is an IPO if the triple g 0 , g 1 , id m is an RPO. # The di#erence between a pushout and an IPO is clearest in a partial order category: a pushout is a least upper bound (i.e. less than any other upper bound) and an IPO is a minimal upper bound (i.e. not greater than any other upper bound). IPOs form the basis of our abstract definition of labelled transition and their existence follows from that of RPOs as shown by the proposition immediately after the following lemma: # # # # n m f 0 f 1 g 0 g 1 h 0 h 1 h 2.7 Lemma 2.6 If Figure 2.7 is an RPO diagram and j : n n satisfies hj = h jh i = h i (i = 0, 1) then j = id n . Proof If we think of the triple h 0 , h 1 , h as a candidate itself, then by the RPO property Chapter 2. Operational congruences for reactive systems 26 there exists a unique j # : n n such that: hj # = h j # h i = h i (i = 0, 1) Therefore j = j # = id n . 
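Returning to the partial-order reading of IPOs mentioned above, here is a concrete (if degenerate) illustration of Definition 2.5, not drawn from the dissertation: the positive integers ordered by divisibility, viewed as a category with at most one arrow between any two objects. For a commuting square m | n0 | p, m | n1 | p, the RPO object is lcm(n0, n1) (a pushout in the slice over p, as noted above), and the square is an IPO exactly when lcm(n0, n1) = p. Since divisibility is a lattice, minimal and least upper bounds coincide here, so the example only illustrates the IPO condition, not the contrast with pushouts; the specific numbers and the OCaml rendering below are purely illustrative.

(* Divisibility poset as a category: an arrow m -> n exists iff m divides n. *)
let divides m n = n mod m = 0

let rec gcd a b = if b = 0 then a else gcd b (a mod b)
let lcm a b = a / gcd a b * b

(* A commuting square m | n0 | p and m | n1 | p. Its RPO object is lcm(n0, n1),
   and the square is an IPO iff its two upper legs already form a minimal upper
   bound, i.e. iff lcm(n0, n1) = p. *)
let is_square m n0 n1 p =
  divides m n0 && divides m n1 && divides n0 p && divides n1 p

let is_ipo m n0 n1 p =
  is_square m n0 n1 p && lcm n0 n1 = p

let () =
  (* 2 | 4 and 2 | 6, both dividing 12: an IPO, since lcm 4 6 = 12. *)
  assert (is_ipo 2 4 6 12);
  (* The same legs into 24 commute but are not an IPO: 24 contains "junk". *)
  assert (is_square 2 4 6 24 && not (is_ipo 2 4 6 24))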
# # # # # n m f 0 f 1 g 0 g 1 h 0 h 1 h 2.8 n f 0 f 1 h 0 h 1 2.9 Proposition 2.7 (IPOs from RPOs) If Figure 2.8 is an RPO diagram then the square in Figure 2.9 is an IPO. Proof Consider any candidate triple h # 0 , h # 1 , h # in­ side Figure 2.9; the components satisfy the following equations: h # 0 f 0 = h # 1 f 1 h # h # i = h i (i = 0, 1) (2.2) We can therefore shift this candidate over to Figure 2.8 by considering the triple h # 0 , h # 1 , hh # . Since Figure 2.8 is an RPO, there exists a unique k such that: hh # k = h (2.3) kh i = h # i (i = 0, 1) (2.4) Therefore, h # kh i = (2.4) h # h # i = (2.2) h i . By Lemma 2.6, h # k = id m . Uniqueness: suppose k # satisfies: h # k # = id m and k # h i = h # i . Then k # has the same property as k in (2.3) and (2.4), therefore k # = k as desired. # The next result provides a partial converse to the previous proposition. It serves as a key part of the proof of IPO pasting which comes afterwards: # # # # m f 0 f 1 h 0 h 1 2.10 f 0 f 1 hh 0 hh 1 2.11 m f 0 f 1 hh 0 hh 1 h 0 h 1 h 2.12 Proposition 2.8 (RPOs from IPOs) If Figure 2.10 is an IPO and Figure 2.11 has an RPO then Figure 2.12 is an RPO. Proof Let g 0 , g 1 , g be an RPO for Figure 2.11 and suppose Cod g 0 = Cod g 1 = Dom g = m # . Then h 0 , h 1 , h is a candidate, hence there exists a unique k : m # m such that: hk = g (2.5) kg i = h i (i = 0, 1) (2.6) 2.2. Categorical basis for contextual labels 27 m # m f 0 f 1 hh 0 g 0 h 0 k g h j hh 1 g 1 h 1 Therefore, g 0 , g 1 , k is a candidate for Figure 2.10, hence there exists a unique j : m m # such that: kj = id m (2.7) jh i = g i (i = 0, 1) (2.8) Now gjk = (2.5) hkjk = (2.7) hk = (2.5) g jkg i = (2.6) jh i = (2.8) g i Therefore by Lemma 2.6, jk = id m # , which in conjunction with (2.7), implies that j and k are isos. Thus Figure 2.12 is an RPO, as desired. # IPOs can be pasted together as shown by the following proposition, which is analogous to the standard pasting result for pushouts. # # # # n m f 0 x y g 0 z f 1 g 1 2.13 n m f 0 x g 0 f 1 z g 1 2.14 Proposition 2.9 (IPO pasting) Suppose that both squares in Figure 2.13 commute and that Figure 2.14 has an RPO. Then the following properties hold of Figure 2.13: 1. If the two squares are IPOs then so is the big rectangle. 2. If the big rectangle and the left square are IPOs then so is the right square. Proof Chapter 2. Operational congruences for reactive systems 28 # # # # n # n m f 0 x g 0 h 0 z h h 1 f 1 g 1 2.15 1. Let h 0 , h 1 , h be a candidate for the big rectangle of Figure 2.13, i.e. h 0 g 0 f 0 = h 1 x, hh 0 = z, and hh 1 = g 1 f 1 , as shown in Figure 2.15. By hypothesis, the left square of Figure 2.13 is an IPO and Figure 2.14 has RPOs; therefore by Proposition 2.8, y, f 1 , g 1 is an RPO for Figure 2.14. By construction, h 0 g 0 , h 1 , h is a candidate, hence there exists a unique k : n n # such that: hk = g 1 ky = h 0 g 0 kf 1 = h 1 (2.9) Therefore h 0 , k, h is a candidate for the right square of Figure 2.13. Hence there exists a unique j : m n # such that hj = id m (2.10) jg 1 = k (2.11) jz = h 0 (2.12) n n # n m f 0 x y g 0 g 0 k g 1 h h 0 z f 1 h 1 f 1 g 1 j Now jg 1 f 1 = (2.11) kf 1 = (2.9) h 1 (2.13) By (2.10), (2.12), and (2.13) j is a mediating arrow. Uniqueness: suppose j # : m n # satisfies the same specification: hj # = id m (2.14) j # g 1 f 1 = h 1 (2.15) j # z = h 0 (2.16) 2.2. 
Categorical basis for contextual labels 29 Then j # g 1 satisfies k 's universal property because of (2.15) and hj # g 1 = (2.14) g 1 j # g 1 y = Figure 2.13 j # zg 0 = (2.16) h 0 g 0 Hence j # g 1 = k by the uniqueness of k. Therefore j # satisfies the universal property of j, and hence j # = j as desired. # # # # n # n m g 0 y z g 1 h 0 h 1 h 2.16 2. Let h 0 , h 1 , h be a candidate for the right square of Figure 2.13, i.e. h 0 g 0 = h 1 y, hh 0 = z, and hh 1 = g 1 , as shown in Figure 2.16. Then h 0 , h 1 f 1 , h is a candidate for the big rect­ angle of Figure 2.13 therefore there exists a unique j : m n # such that hj = id m (2.17) jz = h 0 (2.18) jg 1 f 1 = h 1 f 1 (2.19) By hypothesis, the left square of Figure 2.13 is an IPO and Figure 2.14 has an RPO; therefore by Proposition 2.8, y, f 1 , g 1 is an RPO for Figure 2.14. By construction, h 0 g 0 , h 1 f 1 , h is a candidate. Therefore there exists a unique k : n n # such that: hk = g 1 ky = h 0 g 0 kf 1 = h 1 f 1 Then h 1 satisfies the universal property of k, therefore k = h 1 . Now, jg 1 satisfies the same universal property because of (2.19) and hjg 1 = (2.17) g 1 jg 1 y = Figure 2.13 jzg 0 = (2.18) h 0 g 0 Therefore jg 1 = k = h 1 and hence by (2.17) and (2.18), j is the required mediating arrow. Uniqueness: suppose j # is also a mediating arrow. Then hj # = id m , j # z = h 0 , and j # g 1 = h 1 ; the first two are analogues of (2.17) and (2.18); postmultiplying the last equation by f 1 gives the an analogue of (2.19). Therefore j # = j as desired. # Finally, I conclude this collection of categorical results with three concerning IPOs; they are not immediately relevant to this chapter, but play an important role later in the dissertation. The first asserts that if the left­leg of an IPO is an iso then so is the right­leg. Chapter 2. Operational congruences for reactive systems 30 # # # # m m n g 0 f 0 f 1 id m g 1 g 0 f -1 0 f 1 2.17 Proposition 2.10 If the outside square in Figure 2.17 is an IPO and f 0 is an iso then f 1 is an iso. Proof Let f -1 0 be the inverse of f 0 . Then all the triangles in Figure 2.17 commute. Therefore, there exists (a unique) k : n m such that f 1 k = id n , kf 1 = id m , and kg 1 = g 0 f -1 0 . The first two conditions imply that f 1 is an iso. # The second asserts that IPOs can arise from epis. # # # # f 0 f 1 id g 1 x 0 x 1 x x 0 x 0 2.18 Proposition 2.11 Suppose f 1 is an epi. Then the outer square in Figure 2.18 is an IPO. Proof Consider any candidate x 0 , x 1 , x for Figure 2.18, i.e. x 0 f 0 = x 1 f 1 (2.20) xx 0 = id (2.21) xx 1 = g 1 (2.22) Then x 0 g 1 f 1 = Figure 2.18 x 0 f 0 = (2.20) x 1 f 1 . Since f 1 is an epi, x 0 g 1 = x 1 . This last equation plus (2.21) imply that x 0 is a mediating arrow. It is unique because any mediating arrow k must satisfy k id = x 0 , i.e. k = x 0 . # The final result asserts that only a subcategory of C plays any role in characterising that a particular square is an IPO. # # # # m m # f 0 f 1 g 0 g 1 2.19 Proposition 2.12 Let m,m # be objects of C and let C # be a full sub­ category of C satisfying the following property: obj C # # {n # obj C / #h # C(m,n) & #h # # C(n, m # )} . Suppose the square in Figure 2.19 commutes, where f i , g i # C # for i = 0, 1. Then the square is an IPO in C i# it is an IPO in C # . Proof The only arrows relevant to the square being an IPO in C are contained in C # . 
□

2.3 Labelled transitions and congruence for strong bisimulation

The category theory developed in the previous section provides the machinery needed in this section to accomplish two aims. The first is to improve the unsatisfactory definition of labelled transition given earlier (Definition 2.3). The second is to prove that strong bisimulation over the new labelled transitions is a congruence. I return to the notation of Section 2.1, using C for a reactive system, with a, b ∈ C ranging over arrows with domain 0 (agents) and C, D, F ∈ C ranging over arbitrary arrows (contexts).

[Figure 2.20: the commuting square Fa = Dl, with a, l : 0 → · and contexts F, D.]

The new version of labelled transitions is a modification of the approximation given by Definition 2.3, where the condition Fa = Dl is strengthened to require that the square in Figure 2.20 is an IPO:

Definition 2.13 (labelled transition) a −F→ a′ iff there exists (l, r) ∈ Reacts and D ∈ D such that Figure 2.20 is an IPO and a′ = Dr. □

This definition assures that F, D provides a minimal upper bound on a and l, as required in Section 2.1. For suppose there is another upper bound F′, D′, i.e. F′a = D′l, and also F = RF′ and D = RD′ for some R. Then the IPO property for Figure 2.20 ensures that for some R′ (with RR′ = id) we have F′ = R′F and D′ = R′D --- so F, D provides a ``lesser'' upper bound than F′, D′ after all.

Proposition 2.14 For all contexts F we have that a −F→ a′ implies Fa ⟶ a′. □

The converse fails in general (which is good, given the remarks made after Definition 2.3 about the first approximation for labelled transitions). I return to the converse property later in Section 3.4 in the special case that F is an iso.

Strong bisimulation over −·→ follows its usual scheme [Par81]:

Definition 2.15 (strong bisimulation over −·→) Let S ⊆ ⋃_{m ∈ obj C} C(0, m)² be a relation that contains pairs of agents with common codomain. S is a strong simulation over −·→ iff S satisfies the following property for all (a, b) ∈ S: if a −F→ a′ then there exists b′ such that b −F→ b′ and (a′, b′) ∈ S. S is a strong bisimulation iff S and S⁻¹ are strong simulations. Let ∼ be the largest strong bisimulation over −·→. □

I now state and prove the congruence result for strong bisimulation, one of the central results of this dissertation: if C has a sufficiently rich collection of RPOs then ∼ is a congruence.

Definition 2.16 (C has all redex-RPOs) Say that C has all redex-RPOs if for all (l, r) ∈ Reacts and arrows a, F, D such that D ∈ D and Fa = Dl, the square in Figure 2.20 has an RPO. □

Theorem 2.17 (congruence for ∼) Let C be a reactive system which has all redex-RPOs. Then ∼ is a congruence, i.e. a ∼ b implies Ca ∼ Cb for all C ∈ C with the required domain.

Proof By symmetry, it is sufficient to show that the following relation is a strong simulation:

    S ≜ {(Ca, Cb) / a ∼ b and C ∈ C} .

The proof falls into three parts, each of which is an implication as illustrated in Figure 2.21(1). Dashed lines connect pairs of points contained within the relation annotating the line. Each arrow ``⇒'' is tagged by the part of the proof below that justifies the implication. Suppose that a ∼ b and C ∈ C, and thus (Ca, Cb) ∈ S.
Operational congruences for reactive systems 32 Ca a # =C # D # r #(i) a D # r #(ii) b b ## =E # r # #(iii) Cb C # b ## =C # E # r # F S S F F # # F # # 2.21(1) 0 a l C F # F D D # C # 2.21(2) 0 a l F # D # 2.21(3) C F # F C # 2.21(4) 0 b l # F # E # 2.21(5) 0 b l # C F E # C # 2.21(6) Figure 2.21: Congruence proof for strong bisimulation (i): If Ca F a # then, by definition, there exists (l, r) # Reacts and D # D such that the big rectangle in Figure 2.21(2) is an IPO and a # = Dr. Because C has all redex­RPOs, there exists F # , D # , C # forming an RPO as in Figure 2.21(2); moreover, D # , C # # D since C # D # = D # D. By Proposition 2.7, Figure 2.21(3) is an IPO. Because C has all redex­RPOs, Proposition 2.9 implies that Figure 2.21(4) is an IPO too. By definition, a F # D # r and a # = C # D # r. (ii): Since a # b, there exists b ## such that b F # b ## and D # r # b ## . By definition there exists (l # , r # ) # Reacts and E # # D such that Figure 2.21(5) is an IPO and b ## = E # r # . (iii): Because C has all redex­RPOs, Proposition 2.9 implies that we can paste Figure 2.21(5) with Figure 2.21(4) (both IPOs) along F # and conclude that Figure 2.21(6) is an IPO. Hence Cb F C # E # r # and (C # D # r, C # E # r # ) # S because D # r # E # r # , as desired. # The crux of the above proof is that Figure 2.21(4), which mediates between an F # ­ labelled transition of a and an F ­labelled transition of Ca, is ``portable'', i.e. can be pasted onto a new diagram, serving the same function for b and Cb. This essential idea appears to be robust under variation both of the definition of labelled transition and of the congruence being established. Many examples are shown in Chapter 3. We can isolate precisely in two lemmas how such portable IPO squares are cut and then pasted. These lemmas are just pieces of the congruence proof above, but their 2.3. Labelled transitions and congruence for strong bisimulation 33 factorisation from the main proof greatly simplifies the latter and lays the ground for tractable presentations of more di#cult congruences results in the next chapter. The first lemma shows that portable IPO squares arise when a composite agent has a labelled transition: Lemma 2.18 (portable IPO cutting) If C has all redex­RPOs then the following inference rule holds: Ca F a # # a ## and an IPO C F # F C # . a F # a ## a # = C # a ## C # # D . # The second shows how to ``paste'' a portable square to gain a labelled transition of a composite agent: Lemma 2.19 (portable IPO pasting) If C has all redex­RPOs then the following inference rule holds: C F # F C # is an IPO a F # a ## C # # D Ca F C # a ## . # We can now replay the proof of Theorem 2.17 in a more concise form by employing these lemmas: # # # # C F # F C # 2.22 (i): If Ca F a # then by Lemma 2.18, there exists a ## and an IPO square shown in Figure 2.22 such that a F # a ## a # = C # a ## C # # D . (ii): Since a # b, there exists b ## such that b F # b ## and a ## # b ## . (iii): Since Figure 2.22 is an IPO and C # # D, Lemma 2.19 implies that Cb F C # b ## . Also, a ## # b ## implies (C # a ## , C # b ## ) # S, as desired. Let us return to Lemma 2.18 in order to expose an odd property. The Lemma contra­ dicts the situation in many process calculi: normally, Ca F does not necessarily imply that a has any labelled transitions. In CCS, for example, 0 | x x (using the tradi­ tional CCS non­contextual labels) but 0 has no transitions. 
As a result, in typical proofs of congruence for strong bisimulation, two cases are distinguished when considering the transition Ca F (using the notation of this chapter): Chapter 2. Operational congruences for reactive systems 34 . C and F together conspire to create the transition without reference to a. In this case Cb F holds without using the assumption that a # b. (For example, see ``Case 2'' on p. 98 of [Mil88].) . Or, a, C, and F together conspire to create the transition, as in part (i) above. Recasting the CCS example in terms of contextual labels, we have that 0 | x -|•x . But then there is a contextual transition for 0, namely 0 -|x|•x , though not a satisfactory one: the label provides the entire redex, without any contribution from 0. This is attributable to a defect in the definition of contextual labelled transitions: if a F , the IPO property requires that F contain parts of the relevant redex and no extra junk, but does not prevent F from containing all of the redex. It is by enriching the categorical structure to express multi­hole contexts (see Section 3.8) that we eliminate this defect of transitions. When we do, exactly the same case analysis (shown above) is carried out when proving congruence. Chapter 3 Further congruence results 3.1 Introduction This chapter generalises the definition of labelled transition given in the previous chapter and provides congruence proofs for additional equivalences and preorders. These include weak bisimulation, the traces preorder, and the failures preorder. I will describe them in turn as each congruence proof is presented. The more important step, though, is the generalisation of reactive systems and la­ belled transitions. In the previous chapter, the central hypothesis required in the proof of congruence for strong bisimulation is that the reactive system C has all redex­RPOs. This chapter addresses the problem of what to do if C does not possess su#cient RPOs. Such a situation arises when considering, for example, a category of graph contexts (see Section 5.3). Roughly, the lack of RPOs is attributable to the absence of enough in­ tensional information about the occurrence of nodes: it is ambiguous which node in a context corresponds to a node in the composition of the context with another. Thus if C 0 B 0 = C 1 B 1 , it is ambiguous which nodes are common to both C 0 and C 1 and thus impossible to choose the context to be factored o# when constructing an RPO. What can be done when there are not enough RPOs in a reactive system? In general, it is not a good solution simply to enrich the reactive system to force it to have enough RPOs. The enrichment could yield a category with too much intensional information. For example, the enrichment considered for graph contexts (Section 5.6) forces arrows with the same domain and codomain to have the same number of nodes. Since the definition of strong bisimulation requires that a # b implies that a, b : 0 m for some object m, the strong bisimulation relation could only compare arrows of the same number of nodes. Such a restriction is unacceptable because strong bisimulation should include pairs of agents with widely di#ering static structure. The solution presented in this chapter is to accommodate two categories “ C and C 35 Chapter 3. Further congruence results 36 related by a functor: “ C C F . The idea is that C is a reactive system, whose arrows correspond to the agents and agent contexts; C does not necessarily have enough RPOs. 
Sitting ``above'' C is “ C, a category with su#cient RPOs. The definition of the labelled transition relates arrows in C, just as in the previous chapter: i.e. a F a # is defined for a, a # agents in C and F an agent context of C. But, by contrast with the previous chapter, the definition is given in terms of the existence of an IPO ``upstairs'' in “ C. (If “ C = C and F is the identity functor, then the new definition of this chapter collapses into the old one given in the previous.) Thus we get the best of both worlds: the agents whose labelled transition behaviour we consider need not contain any superfluous intensional data; as long as we can construct a more intensional category above containing su#cient RPOs and a suitable functor, then we can get congruence results downstairs. # # # # C F # F C # 3.1 C 0 F # 0 F 0 C # 0 3.2 These congruence results require the functor F : “ C C to satisfy certain properties. Most are trivial but one is interesting, namely that F allows IPO sliding. Recall from the previous chapter that the crux of the congruence proof for strong bisimulation was the portability of the IPO square that related F # ­transitions of an agent to F ­transitions of C applied to the agent. This square was cut o# when passing from Ca F to a F # and then pasted back on when passing from b F # to Cb F . In the new definition of labelled transition considered in this chapter, the pasting operation is more complex. The portable square, e.g. Figure 3.1, now lives in “ C (the upstairs category) and its left leg is F # , some arrow for which F(F # ) = F # . (Teletype font is used for arrows in “ C.) However, the transition b F # is justified by an IPO square upstairs whose right­leg is F # 0 , an arrow in the preimage of F # not necessarily equal to F # . Thus Figure 3.1 cannot be pasted without first sliding it to a new IPO square, e.g. Figure 3.2, whose left­leg is F # 0 and whose F­image is kept invariant. The present chapter assumes that F allows IPO sliding; the next chapter proves that this is the case when F is of a certain general form. The outline of this chapter is as follows. In the next section, I define the notion of a functorial reactive system, giving precise requirements for F . Then I define the reaction and labelled transition relations. In the following section, I prove some results about portable IPO squares that are direct analogies to those (Lemma 2.18 and Lemma 2.19) at the end of the previous chapter. The main sections are concerned with a series of congruence proofs for strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder. The final section treats a richer notion of functorial reactive system with arrows corresponding to multi­hole contexts and shows that strong bisimulation is indeed a congruence here as well. 3.2. Functorial reactive systems 37 3.2 Functorial reactive systems The first part of the setup is to define precisely the notion of functorial reactive system, which was introduced informally above. Its definition rests on that of a reactive system, given in the previous chapter, which we recall here first for ease of reference: Definition (reactive system recalled; see Definition 2.1) A reactive system consists of a category C with added structure. We let m,n range over objects. C has the following extra components: . a distinguished object 0 (not necessarily initial); . a set of reaction rules called Reacts # # m#objC C(0, m) 2 , a relation containing pairs of agents with common codomain; . 
a subcategory D of C, whose arrows are the reactive contexts, with the property that D 1 D 0 # D implies D 1 , D 0 # D. # Definition 3.1 (functorial reactive system) Let C be a reactive system. A functorial reactive system over C consists of a functor F : “ C C which maps a distinguished object # # obj “ C to 0 (the distinguished object of C) and which satisfies the following properties. F lifts agents: for any a : 0 m there exists a : # m such that F(a) = a. F creates isos: if F(C) is an iso then C is an iso. F creates compositions: if F(C) = C 1 C 0 , there exist C 0 , C 1 # “ C such that C = C 1 C 0 and F(C i ) = C i for i = 0, 1. # # # # C F # F C # 3.3 C 0 F # 0 F 0 C # 0 3.4 F allows IPO sliding: for any IPO square as in Figure 3.3 and any arrow F # 0 with F(F # 0 ) = F(F # ) there exist C 0 , C # 0 , F 0 form­ ing an IPO square as in Figure 3.4 with F(C 0 ) = F(C) F(C # 0 ) = F(C # ) F(F 0 ) = F(F) . # Throughout this chapter, I use uppercase teletype characters to denote arrows in “ C and lowercase teletype characters (a, l, . . .) to denote arrows with domain # in “ C. The F images of these are agents in C. The special domain requirement of a, l, . . . is left tacit throughout this chapter: thus (#l # “ C. . . .) means (#l # “ C. Dom l = # & . . .). The definition of the reaction relation is identical to the one given earlier: Definition (reaction relation ( ) recalled; cf. Definition 2.2) Given a functorial reactive system F : “ C C, the reaction relation # # m#objC C(0, m) 2 contains pairs of agents with common codomain and is defined by lifting the reaction rules through all reactive contexts: a a # i# there exists D # D and (l, r) # Reacts such that a = Dl and a # = Dr. # Chapter 3. Further congruence results 38 This definition has an alternative characterisation given by the following result: Proposition 3.2 (characterisation of ) a a # i# there exist a, l, D # “ C and r # C such that a = Dl and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a . Proof Follows immediately because F lifts agents and creates compositions. # We now turn to the definition of labelled transition. As stated earlier, · is a ternary relation whose arguments are all arrows in C. The original requirement that a particular square be an IPO in C (see Definition 2.13) is replaced here by requiring that there exist a preimage of this square that is an IPO in “ C: # # # # a l F D 3.5 Definition 3.3 (labelled transition ( · ); cf. Definition 2.13) a F a # i# there exist a, l, F, D # “ C and r # C such that Figure 3.5 is an IPO in “ C and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(F) = F . # Notice that a # , the RHS of the transition, is required to be F(D)r, and not F(Dr) for some r with F(r) = r. This is important since it allows the reaction rules Reacts to contain pairs (l, r) for which F­preimages of l and r might not have common codomains. By analogy with Definition 2.16, we can define when a functorial reactive system F has all redex­RPOs; the primary di#erence is that here the RPOs exist upstairs in “ C, not downstairs in C: # # # # a l F D 3.6 Definition 3.4 (F has all redex­RPOs; cf. Definition 2.16) A functorial reactive system F : “ C C has all redex­RPOs if any square, such as in Figure 3.6, has an RPO, provided that F(D) # D and that there exists r # C such that (F(l), r) # Reacts. # Note that we do not demand that all RPOs exist, just ones for which the left­leg of the enclosing square is a preimage of a redex and the bottom leg is a preimage of an arrow in D. 
(This narrowing of the definition is exactly analogous to what happens in the previous chapter.) 3.3 Cutting and pasting portable IPO squares This section replays the results at the end of the previous chapter which show how to cut and paste portable IPO squares. The main di#erence lies in the proof of pasting: here we make explicit use of the assumption that F allows IPO sliding. 3.3. Cutting and pasting portable IPO squares 39 The first result shows how the transitions of composite agents yield IPO squares: Lemma 3.5 (portable IPO cutting; cf. Lemma 2.18) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: Ca F a # # a ## # C and an IPO square C F # F C # . # a F(F # ) a ## a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F # . # # # # a l C F # F D D # C # 3.7 Proof By the definition of F and the hypothesis that F creates compositions, there exists a, C, l, F, D # “ C and r # C such that the big rectangle in Figure 3.7 is an IPO and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(C) = C F(F) = F . Because F has all redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure 3.7. Then F(C # ) # D since F(C # )F(D # ) = F(D) # D. By Proposition 2.7, the small left­hand square of Figure 3.7 is an IPO. Because F has all redex­RPOs, Proposition 2.9 implies that the small right­hand square is an IPO too. By definition, a F(F # ) a ## and a # = F(C # )a ## where a ## “ = F(D # )r. # The next result shows how the reactions of composite agents can be decomposed. There is no analogue of this result the previous chapter since it is not needed in the congruence proof for strong bisimulation. Lemma 3.6 (portable IPO cutting for reactions) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: Ca a # # a ## # C and C # , F # # “ C. # a F(F # ) a ## a # = F(C # )a ## F(C # ) # D F(C # F # ) = C # . Moreover, if F # is an iso in the conclusion then it is equal to id. # # # # a l C F # id D D # C # 3.8 Proof By Proposition 3.2 and the hypothesis that F creates compo­ sitions, there exist a, l, C, D # “ C and r # C such that the big rectangle in Figure 3.8 commutes and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(C) = C . Because F has all redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure 3.8. Because RPOs are a universal construction, we can assume without loss of generality that Chapter 3. Further congruence results 40 if F # is an iso, it is equal to id. Then F(C # ) # D since since F(C # )F(D # ) = F(D) # D. By Proposition 2.7, the small left­hand square of Figure 3.8 is an IPO. By definition, a F(F # ) a ## and a # = F(C # )a ## where a ## “ = F(D # )r. Finally, since the small right­hand square commutes, C # F # = C as desired. # The final result shows how to paste a portable IPO square in order to gain a transition for a composite agent. As stated above, this is where IPO sliding is used: Lemma 3.7 (portable IPO pasting; cf. Lemma 2.19) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. The following inference rule holds: C F # F C # is an IPO a F(F # ) a # F(C # ) # D F(C)a F(F) F(C # )a # . # # # # a l F # 0 D 3.9 C 0 F # 0 F 0 C # 0 3.10 Proof Since a F(F # ) a # there exist a, l, F # 0 , D # “ C and r # C such that Figure 3.9 is an IPO and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(F # 0 ) = F(F # ) . 
Since F allows IPO sliding, there exist C 0 , F 0 , C # 0 # “ C such that Figure 3.10 is an IPO and F(C 0 ) = F(C) F(F 0 ) = F(F) F(C # 0 ) = F(C # ) . # # # # a l D C 0 F 0 C # 0 3.11 Because F has all redex­RPOs, Proposition 2.9 implies that we can paste Figure 3.9 with Figure 3.10 (both IPOs) along F # 0 and conclude that Figure 3.11 is an IPO. Thus F(C)a F(F) F(C # )F(D)r = F(C # )a # , as desired. # That concludes the setup for functorial reactive systems. The rest of the chapter is devoted to a sequence of congruence proofs. 3.4 Strong bisimulation This section proves that strong bisimulation is a congruence for a functorial reactive sys­ tem. The definition is straightforward: Definition 3.8 (strong bisimulation over · ; cf. Definition 2.15) Let # be the largest strong bisimulation over · . # The proof of congruence is almost identical to the one presented at the end of the previous chapter: the only di#erence is that the updated IPO cutting and pasting results for functorial reactive systems are substituted for the old ones. 3.4. Strong bisimulation 41 Ca a # =F(C # )a ## #(i) a a ## #(ii) b b ## #(iii) Cb F(C # )b ## F S S F F(F # ) # F(F # ) # Figure 3.12: Schema of the congruence proof for # Theorem 3.9 (congruence for #) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # is a congruence, i.e. a # b implies Ca # Cb for all C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # b and C # C} . The proof falls into three parts, each of which is an implication as illustrated in Figure 3.12. Dashed lines connect pairs of points contained within the relation annotating the line. Each arrow ``#'' is tagged by the part of the proof below that justifies the implication. Suppose that a # b and C # C, and thus (Ca, Cb) # S. # # # # C F # F C # 3.13 (i): If Ca F a # then by Lemma 3.5, there exist a ## # C and an IPO square shown in Figure 3.13 such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F . (ii): Since a # b, there exists b ## such that b F(F # ) b ## and a ## # b ## . (iii): Since Figure 3.13 is an IPO and F(C # ) # D, Lemma 3.7 implies that Cb F F(C # )b ## . Also, a ## # b ## implies (F(C # )a ## , F(C # )b ## ) # S, as desired. # In most process calculi the reaction relation and the #­labelled transition relation coincide. See, for example Proposition 5.2 in [Mil92] and Theorem 2 in [Sew00]. A # Chapter 3. Further congruence results 42 transition is a ``silent move'': a transition that takes place without interacting with the external environment. Intuitively, a #­labelled transition corresponds to an id­labelled transition when using contexts as labels, i.e. a id a # i# the environment need only supply a vacuous identity context in order to enable a to react. However, if we look carefully at the definition of labelled transition given in Definition 3.3 and the characterisation of reaction in Proposition 3.2, we see that id and are not necessarily identical. There is an implication in one direction, namely id # , since every IPO square is also a commuting square, but not necessarily the converse. Indeed, Example 5.2 (p. 77) contains a non­IPO commuting square whose bottom­ and right­legs are both identity arrows. In the special situation when all preimages of redexes are epis, id and do coin­ cide, thanks to Proposition 2.11. This situation is explored at the end of this section in Proposition 3.15. 
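Since Definition 3.8, like Definition 2.15 before it, is the standard coinductive definition, on a finite labelled transition system the largest strong bisimulation can be computed by the usual greatest-fixpoint refinement. The following OCaml sketch (not from the dissertation, and not specific to contextual labels) illustrates that scheme; the three-state example system at the end is invented.

(* A finite LTS: states, and transitions (source, label, target). *)
type lts = { states : string list; trans : (string * string * string) list }

let succs l src lab =
  List.filter_map
    (fun (s, a, t) -> if s = src && a = lab then Some t else None)
    l.trans

let labels l = List.sort_uniq compare (List.map (fun (_, a, _) -> a) l.trans)

(* One refinement step: keep (p, q) only if every move of p is matched by q
   and vice versa, with residuals already related by rel. *)
let step l rel =
  let ok p q =
    List.for_all
      (fun a ->
        List.for_all
          (fun p' -> List.exists (fun q' -> List.mem (p', q') rel) (succs l q a))
          (succs l p a))
      (labels l)
  in
  List.filter (fun (p, q) -> ok p q && ok q p) rel

(* Largest strong bisimulation: greatest fixpoint of the refinement step,
   starting from the full relation on states. *)
let bisim l =
  let rec fix rel = let rel' = step l rel in if rel' = rel then rel else fix rel' in
  fix (List.concat_map (fun p -> List.map (fun q -> (p, q)) l.states) l.states)

let () =
  let l = { states = ["a"; "b"; "c"];
            trans = [("a", "f", "a"); ("b", "f", "c"); ("c", "f", "b")] } in
  (* a, b and c all just perform f forever, so all three are bisimilar. *)
  assert (List.mem ("a", "b") (bisim l) && List.mem ("b", "c") (bisim l))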
Before taking the epi hypothesis on board let us a consider an alternate definition of labelled transition for which we do recover the reaction relation: Definition 3.10 (labelled transition by cases ( · c )) a F c a # i# # Fa a # if F is an iso a F a # if F is not an iso . # It follows immediately from the definition that a id c a # i# a a # . Furthermore, the induced strong bisimulation (defined next) is a congruence, as shown below. It is worth considering whether there are other definitions that recover the reaction relation but do not involve case analysis. This point is taken up in Appendix B where I present a definition of labelled transition that satisfies this requirement; I show furthermore that this definition induces a congruence which includes # c . Let us return to the main flow of the argument now and consider # c in detail. Definition 3.11 (strong bisimulation over · c ) Let # c be the largest strong bisim­ ulation over · c . # Theorem 3.12 (congruence for # c ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # c is a congruence, i.e. a # c b implies Ca # c Cb for all C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong simulation: S “ = {(Ca, Cb) / a # c b and C # C} . Suppose that a # c b and C # C, and thus (Ca, Cb) # S. Suppose Ca F c a # . Case F is an iso: Then Ca F -1 a # . By Lemma 3.6, there exist C # , F # # “ C and a ## # C such that a F(F # ) a ## , thus a F(F # ) c a ## , and a # = FF(C # )a ## F(C # ) # D F(C # F # ) = C . 3.4. Strong bisimulation 43 Since a # c b, there exists b ## such that b F(F # ) c b ## and a ## # c b ## . Since F(C # ) # D, we have that FCb = FF(C # )F(F # )b FF(C # )b ## . Thus Cb F c FF(C # )b ## . Since a ## # c b ## , we have (FF(C # )a ## , FF(C # )b ## ) # S, as desired. # # # # C F # F C # 3.14 Case F is not an iso: By definition Ca F a # . By Lemma 3.5, there exist a ## # C and an IPO square shown in Figure 3.14 such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F . Thus a F(F # ) c a ## . Since a # c b, there exists b ## such that b F(F # ) c b ## and a ## # c b ## . Since F is not an iso, F is not an iso; Proposition 2.10 implies that F # is also not an iso; since F creates isos, F(F # ) is not an iso, thus b F(F # ) b ## . Since Figure 3.14 is an IPO and F(C # ) # D, Lemma 3.7 implies that Cb F F(C # )b ## , so Cb F c F(C # )b ## . Since a ## # c b ## , we have (F(C # )a ## , F(C # )b ## ) # S, as desired. # Because · and · c are so closely related, it is not surprising that the induced congru­ ences are also related. Indeed the following result shows that # c is a coarser equivalence than #. It is an open question whether there exists a functorial reactive system for which the inequality is strict. Proposition 3.13 (# # # c ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # # # c . Proof We show that # is a strong bisimulation with respect to the labelled transition relation · c . By symmetry, it is enough to shown that # is a strong simulation over · c . Consider any a, b for which a # b. Suppose a F c a # . Case F is an iso: Then Fa a # . By Lemma 3.6, there exist C # , F # # “ C and a ## # C such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C # F # ) = F . Since a # b, there exists b ## such that b F(F # ) b ## and a ## # b ## . 
Since F(C # ) # D, we have that F b = F(C # F # )b F(C # )b ## , so, b F c F(C # )b ## ; also a # = F(C # )a ## # F(C # )b ## since # is a congruence. Case F is not an iso: Then a F a # . Since a # b, there exists b # such that b F b # and a # # b # . Also b F c b # , as desired. # I now return to the epi hypothesis discussed earlier and show that if it is satisfied then · = . A corollary is that · = · c and thus # = # c . In Chapter 7, I show exact conditions on redexes of a functorial reactive system of graph contexts which hold i# the epi hypothesis is satisfied. Definition 3.14 (redexes have epi preimages) Let F : “ C C be a functorial reactive system. Say that redexes have epi preimages i# for all l # “ C, if there exists r # C such that (F(l), r) # Reacts, then l is an epi. (Recall that lowercase teletype letters stand for arrows in “ C with domain #.) # Chapter 3. Further congruence results 44 # # # # a l id D 3.15 Proposition 3.15 Let F : “ C C be a functorial reactive system. Suppose that redexes have epi preimages. Then id = . Proof As discussed earlier, we know that id # . So we consider the converse. Let a a # . By Proposition 3.2 there exist a, l, D # “ C and r # C such that a = Dl, i.e. Figure 3.15 commutes, and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a . By the epi hypothesis, l is an epi. By Proposition 2.11, Figure 3.15 is an IPO. By defini­ tion, a id a # , as desired. # Corollary 3.16 Let F : “ C C be a functorial reactive system which has all redex­ RPOs. Suppose the epi hypothesis of Proposition 3.15 is satisfied, namely, the preimage of every redex is an epi. Then · = · c and thus # = # c . # # # # Fa l H D 3.16 a l HF D 3.17 Proof The only interesting part is to show that F c # F for F an iso. Suppose a F c a # . By definition, Fa a # . By Proposition 3.15, Fa id a # . By definition, and since F creates compositions, there exist a, l, F, D, H # “ C and r # C such that Figure 3.16 is an IPO in “ C and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(F) = F F(H) = id . Since F creates isos and F is an iso, then F is an iso. Thus Figure 3.17 is also an IPO, whence a F a # as desired. Finally, since · = · c the strong bisimulation relations induced by both are equal, thus # = # c . # 3.5 Weak bisimulation Weak bisimulation [Mil88] is a coarser equivalence than strong bisimulation and is less sensitive to the number of silent steps made by the agents it compares. A single labelled transition by one agent may be matched by a weak labelled transition of another, namely a sequence of reactions, followed by a like transition, followed by further reactions. Definition 3.17 (weak labelled transition ( · )) The weak labelled transition relation is defined as follows: a F a # i# a # F # a # . # The definition of weak bisimulation follows exactly the form set out by Milner: 3.5. Weak bisimulation 45 Definition 3.18 (weak bisimulation over · ; cf. Definition 5 on p. 108 in [Mil88]) Let S # # m#objC C(0, m) 2 be a relation that contains pairs of agents with common codomain. S is a weak simulation over · i# it satisfies the following properties for all (a, b) # S: 1. If a a # , then there exists b # such that b # b # and (a # , b # ) # S. 2. If a F a # where F is not an iso, then there exists b # such that b F b # and (a # , b # ) # S. S is a weak bisimulation i# S and S -1 are weak simulations. Let # be the largest weak bisimulation over · . # In the second clause of the definition of weak simulation, the label F is required to be not an iso. 
Without this requirement, the definition would force a F to be matched by b F , for F an iso. By the definition of IPOs, iso­labelled transitions are essentially like id labelled transitions. As argued in the previous section id # , and moreover the converse holds in certain cases. So allowing F to be an iso in clause 2 would override clause 1. But clause 1 embodies a basic principle, namely that a silent transitions is matched by zero or more silent transitions and not by one or more. The congruence property of weak bisimulation is more limited than that of strong bisimulation: # is a congruence with respect to arrows only in D, a subcategory of C. Recall that D consists of the ``reactive contexts'', i.e. the contexts that allow reaction under them: a a # implies Da Da # for D # D. (See Definition 2.) This limitation is not surprising. In CCS, for example, weak bisimulation is not a congruence with respect to summation contexts, which are not reactive, i.e. we do not have that a a # implies a + b a # + b. (I am using a, b for agents and not for names in order to maintain consistency with the notation of the rest of this dissertation.) Use of the hypothesis C # D occurs twice in the proof: see (3.1) and (3.2) below. In CCS, weak bisimulation is a congruence with respect to some non­reactive contexts, namely the prefixing contexts x.- and • x.-. We would require richer structure than is contained in Chapter 5 in order to have a category of CCS contexts, namely the nesting of graphs to represent prefixing, some added data (not yet well understood) to represent summation, and the inclusion of free names to represent the naming structure of CCS. If we could construct such a category then it is likely that proving explicitly that weak bisimulation is preserved by prefixing would be easy since the only initial labelled transition of a prefixed agent is based on the prefix itself. Nonetheless, it is worth considering whether we could get a better general congruence result for weak bisimulation by dividing the set of reactive contexts in two, with one set containing prefixing­like contexts and another containing sum­like contexts. I am not sure how this would work, but something similar Chapter 3. Further congruence results 46 is done in work on rule formats for structural operational semantics by Bloom in [Blo93] who distinguishes ``tame'' from ``wild'' contexts. Theorem 3.19 (congruence for # w.r.t. D) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # is a congruence with respect to all contexts in D, i.e. a # b implies Ca # Cb for all C # D of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a weak simu­ lation: S “ = {(Ca, Cb) / a # b and C # D} . The proof falls into two cases corresponding to those of Definition 3.18. Suppose that a # b and C # D, and thus (Ca, Cb) # S, Case Ca a # : By Lemma 3.6, there exist a ## # C and C # , F # # “ C such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C # F # ) = C . Moreover, if F # is an iso then it is equal to id. We distinguish between two cases. Case F # is an iso: By definition a a ## . Since a # b, there exists b ## such that b # b ## and a ## # b ## . Since F(C # ) # D, we have that Cb = F(C # )b # F(C # )b ## . Case F # is not an iso: Since F creates isos, F(F # ) is not an iso. Since a # b, there exist b 0 , b 1 , b ## such that b # b 0 F(F # ) b 1 # b ## with a ## # b ## . By hypothesis, C # D, therefore: Cb # Cb 0 . 
(3.1) Since F(C # ) # D we have that Cb 0 = F(C # )F(F # )b 0 F(C # )b 1 # F(C # )b ## . In both cases, a ## # b ## , so (F(C # )a ## , F(C # )b ## ) # S as desired. # # # # C F # F C # 3.18 Case Ca F a # for F not an iso: By Lemma 3.5, there exist a ## # C and an IPO square shown in Figure 3.18 such that a F(F # ) a ## and a # = F(C # )a ## F(C # ) # D F(C) = C F(F) = F . Since F is not an iso, F is not an iso; Proposition 2.10 implies that F # is also not an iso; since F creates isos, F(F # ) is not an iso, thus a # b implies that there exist b 0 , b 1 , b ## such that b # b 0 F(F # ) b 1 # b ## with a ## # b ## . By hypothesis, C # D, therefore: Cb # Cb 0 . (3.2) Since Figure 3.18 is an IPO and F(C # ) # D, Lemma 3.7 implies that Cb 0 F F(C # )b 1 . Since F(C # ) # D, F(C # )b 1 # F(C # )b ## . Moreover, (F(C # )a ## , F(C # )b ## ) # S, as desired. # 3.6. Traces preorder 47 How can we get a congruence with respect to C, not just D? One possibility is to replace the F of clause 2 in Definition 3.18 with F # . The largest symmetric relation satisfying this new definition is a congruence with respect to all of C. A second possibility is to make this change for the first step only of the weak bisimulation relation and then revert to the normal definition in Definition 3.18: Definition 3.20 (greedy weak bisimulation (# gr )) Let # gr # # m#objC C(0, m) 2 be the largest symmetric relation that contains pairs of agents with common codomain and which satisfies the following properties: 1. If a a # , then there exists b # such that b # b # and a # # b # . 2. If a F a # where F is not an iso, then there exists b # such that b F # b # and a # # b # . We call # gr greedy weak bisimulation. # Notice that this definition requires that a # # b # , not the stronger condition that a # # gr b # . The relation # gr is also a congruence with respect to C. The proof is almost identical to that of Theorem 3.19. The idea of having a special requirement for the first step of a weak bisimulation followed by the use of standard weak bisimulation to compare the continuations is well­ established. See, for example, the definition of observational congruence (Definition 2 on p. 153 in [Mil88]), also known as rooted weak bisimulation in the literature. Experience with CCS suggests that a congruence in ensured by changing clause 1 (to require b # b # ) and not clause 2 in Definition 3.18 --- exactly the opposite of what is done in Definition 3.20. This anomaly requires further research, both to find categorical theory to model di#erent kinds of non­reactive contexts and to show that RPOs exist for categories of graph­like contexts that contain summation. 3.6 Traces preorder This section addresses the traces preorder, a simple preorder that compares agents based on their finite traces. A trace (see p. 41 in [Hoa85]) is a sequence of labelled transitions. The traces preorder is insensitive to non­determinism and deadlock so is of limited use. Nonetheless, traces are good for specifying security properties since it is easy to formulate that an agent does not possess certain ``bad'' traces (see, for example, [Pau98, SV00]). The main motivation for this section is to provide a warmup for the next one, which looks at the failures preorder. As a result, the traces considered here are all strong (not interspersed with reaction steps) since the way to handle the weak case is subsumed by the results presented in the next section. Chapter 3. 
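For intuition about the traces preorder before its formal definition below: on a finite-state transition system it is just inclusion of (prefix-closed) trace languages, decidable by the subset construction. The following OCaml sketch is illustrative only; it works on an explicit finite LTS with uninterpreted labels rather than with contextual labels, and the example system is invented.

(* Traces preorder, checked on a finite LTS by the subset construction:
   traces_leq trans p q holds iff every finite (strong) trace of p is a trace of q. *)
let set xs = List.sort_uniq compare xs

let labels trans = set (List.map (fun (_, a, _) -> a) trans)

let succ_set trans srcs lab =
  set (List.filter_map
         (fun (s, a, t) -> if List.mem s srcs && a = lab then Some t else None)
         trans)

let traces_leq trans p q =
  let rec go visited pending =
    match pending with
    | [] -> true
    | (ps, qs) :: rest ->
        if List.mem (ps, qs) visited then go visited rest
        else
          let moves =
            List.map (fun a -> (succ_set trans ps a, succ_set trans qs a)) (labels trans) in
          (* A trace of p that q cannot match shows up as a reachable pair of
             macro-states where the p-side can move on some label but the q-side cannot. *)
          if List.exists (fun (ps', qs') -> ps' <> [] && qs' = []) moves then false
          else go ((ps, qs) :: visited)
                  (rest @ List.filter (fun (ps', _) -> ps' <> []) moves)
  in
  go [] [([p], [q])]

let () =
  let trans = [("p", "f", "p"); ("q", "f", "q"); ("q", "g", "q")] in
  (* p does f forever; q does f's and g's forever, so p's traces are included
     in q's but not conversely. *)
  assert (traces_leq trans "p" "q");
  assert (not (traces_leq trans "q" "p"))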
Further congruence results 48 C 0 a 0 ··· F 1 Cnan Fn #(i) a 0 ··· F(F # 1 ) an F(F # n ) #(ii) b 0 ··· F(F # 1 ) bn F(F # n ) #(iii) C 0 b 0 ··· F 1 Cn bn Fn # tr # tr Figure 3.19: Schema of the congruence proof for # tr Definition 3.21 (traces preorder (# tr ); cf. p. 45 in [Ros98]) A pair of agents b and a with common codomain are related by the traces preorder, written a # tr b, i# all the traces of a are traces of b: for every trace #F 1 , . . . , F n #, a F 1 · · · Fn implies b F 1 · · · Fn . # (In this section and the next, n, m are natural numbers and not objects of a category.) The proof of congruence is more complicated that that of strong bisimulation (Theorem 3.9) because we need to consider traces, not just individual labelled transi­ tions. The heart of the argument is an inductive construction of a trace of a from a trace of Ca. Each inductive step cuts o# a portable IPO square (see Lemma 3.5) which is sub­ sequently pasted back on (Lemma 3.7) when constructing a trace of Cb from a trace of b. Theorem 3.22 (congruence for # tr ) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # tr is a congruence, i.e. a # tr b implies Ca # tr Cb for all C # C of the required domain. Proof Suppose that a # tr b. Let C be any context of appropriate domain. We wish to prove that Ca # tr Cb. The proof is divided into three parts, which are shown schematically in Figure 3.19. 3.6. Traces preorder 49 # # # # C i F # i F i C # i 3.20 (i): Let • a 0 “ = Ca and consider any trace • a 0 F 1 · · · Fn • a n , where n # 0. We construct (a 0 , C 0 ) . . . (a n , C n ) and the square shown in Figure 3.20 for 1 < i # n such that the following conditions hold for 0 # i # n: • a i = C i a i Tr­a a i-1 F(F # i ) a i for i #= 0 Tr­lab C i # D for i #= 0 Tr­D F(C i ) = C i-1 for i #= 0 Tr­C F(C # i ) = C i for i #= 0 Tr­Cprime F(F i ) = F i for i #= 0 Tr­F Figure 3.20 is an IPO for i #= 0 . Tr­IPO base: Let a 0 “ = a and C 0 “ = C. Then Tr­a holds and the other conditions are vacuous. # # # # C i+1 F # i+1 F i+1 C # i+1 3.21 step: We construct a i+1 , C i+1 and the square in Figure 3.21 from a i , C i as follows, assuming 0 # i < n. By the inductive hypothesis Tr­a holds for i, thus • a i = C i a i . Since • a i F i • a i+1 , Lemma 3.5 implies that there exist a i+1 # C and an IPO square shown in Figure 3.21 such that a i F(F # i+1 ) a i+1 and • a i+1 = C i+1 a i+1 C i+1 # D F(C i+1 ) = C i F(F i+1 ) = F i+1 . where we let C i+1 “ = F(C # i+1 ). Then all of the inductive properties are satisfied for i + 1. Thus by Tr­lab a = a 0 F(F # 1 ) · · · F(F # n ) a n . (ii): Since a # tr b there exist b 0 . . . b n such that b = b 0 F(F # 1 ) · · · F(F # n ) b n . (iii): We now claim that C i b i F i+1 C i+1 b i+1 for 0 # i < n. Since b i F(F # i+1 ) b i+1 and Figure 3.21 is an IPO (by Tr­IPO), with F(C # i+1 ) = Tr­Cprime C i+1 # D by Tr­D, then Lemma 3.7 implies C i b i = Tr­C F(C i+1 )b i F(F i+1 ) F(C # i+1 )b i+1 = Tr­Cprime C i+1 b i+1 , and thus by Tr­F, C i b i F i+1 C i+1 b i+1 . So Cb = C 0 b 0 F 1 · · · Fn as desired. # Chapter 3. Further congruence results 50 3.7 Failures preorder This section looks at the failures preorder, which is a fundamental part of the failures and divergences model of CSP [Hoa85]. I do not consider divergences here, so the definition I use only employs failures. The failures preorder is sensitive to non­determinism and deadlock (see Section 3.3 in [Ros98]). 
The failures of an agent provide a domain­theoretic interpretation, assigning a meaning to each agent independently of the others (unlike for bisimulation). This makes failures properties well­suited to model checking [Ros94, Low96]. In order to define a failure of an agent, I first extend the notion of a weak labelled transition to allow for sequences of labels (not just single labels): Definition 3.23 (weak labelled transition extended; cf. Definition 3.17) a # # a # i# a # a # a #F #“t a # i# a F t a # , where t is a sequence of arrows of appropriate domain and “ is the concatenation operator. # A failure of a consists of a sequence of weak labelled transitions t and a set of labels X such that a evolves by t to a stable state (one for which no reactions are possible) which refuses X, i.e. cannot engage in a transition labelled by any of the arrows in X. To prevent reactions from masquerading as labelled transitions, every arrow in t and X is not an iso (cf. the discussion immediately following Definition 3.18). Definition 3.24 (failure; cf. p. 171 in [Ros98]) A failure of a is a pair (t, X) where t is a finite sequence of arrows (each not an iso) and X is a set of arrows (each not an iso) for which there exists a # such that the following conditions hold: a t a # a has a weak trace t; a # / a # is stable; #F # X. a # F / a # refuses X. # Definition 3.25 (failures preorder (# f ); cf. p. 193 in [Ros98]) A pair of agents b and a with common codomain are related by the failures preorder, written a # f b, i# all the failures of a are possessed by b. # The relation # f is only a congruence with respect D, the subcategory of C consisting of reactive contexts (cf. Theorem 3.19). The only use of the hypothesis C # D occurs in the base case of the induction. The proof is similar to that of the traces preorder however there are two aspects that require care: the cutting and pasting of portable IPO squares for weak labelled transitions and the propagation of refusal sets. 3.7. Failures preorder 51 C 0 a 0 • a 0 = • am t | F | (F#X) #(i) a 0 am s | F # | (F # #Y #Z) #(ii) b 0 b m+1 s | F # | (F # #Y #Z) #(iii) C 0 b 0 Cm b m+1 t | F | (F#X) # f # f Figure 3.22: Schema of the congruence proof for # f Theorem 3.26 (congruence for # f w.r.t. D) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # f is a congruence with respect to all contexts in D, i.e. a # f b implies Ca # Cb for all C # D of the required domain. Proof Suppose that a # f b. Let C # D be any (reactive) context of appropriate domain. We wish to prove that Ca # f Cb. The proof is divided into three parts, which are shown schematically in Figure 3.22. (i): Let • a 0 “ = Ca and consider any failure (t, X) of • a 0 , where t = #F 1 , . . . , F n #, for n # 0. By Definition 3.24 and Definition 3.23, there exist natural numbers 0 # m and 0 < p 1 < · · · < p n # m and a sequence • a 0 , . . . , • am such that for 0 < i # m: • a i-1 • a i if i / # {p 1 , . . . , p n }; • a i-1 F j • a i if i = p j for some j, where 1 # j # n. Moreover, • am is stable and refuses X. # # # # C i F # i F j C # i 3.23 We construct inductively (a i , C i ) for 0 # i # m, and (F # i , C # i , C i ) for 1 # i # m, and F j , for 1 # j # n such that the following conditions Chapter 3. 
Further congruence results 52 hold for 0 # i # m: • a i = C i a i Fail­a C i # D Fail­d a i-1 F(F # i ) a i for i #= 0 and F # i #= id Fail­lab F # i is not an iso for i #= 0 and F # i #= id Fail­non­iso a i-1 a i for i #= 0 and F # i = id Fail­react F(C i ) = C i-1 for i #= 0 Fail­C F(C # i ) = C i for i #= 0 Fail­Cprime F(F j ) = F j for i = p j , where 1 # j # n Fail­F Figure 3.23 is an IPO for i = p j , where 1 # j # n Fail­ipo C i = C # i F # i for i #= 0 and i / # {p 1 , . . . , p n } Fail­commute base: Let a 0 “ = a and C 0 “ = C. Then Fail­a and Fail­d hold and the other condi­ tions are vacuously true. step: Given a i , C i for 0 # i < m, we construct all the required data for i + 1. # # # # C i+1 F # i+1 F j C # i+1 3.24 case i + 1 = p j for some j, where 1 # j # n: In this case, Fail­commute for i + 1 is vacuously true. By the in­ ductive hypothesis, Fail­a holds for i, thus • a i = C i a i . Since • a i F j • a i+1 , Lemma 3.5 implies that there exist a i+1 # C and an IPO square shown in Figure 3.24 such that a i F(F # i+1 ) a i+1 and • a i+1 = C i+1 a i+1 C i+1 # D F(C i+1 ) = C i F(F j ) = F j . where we let C i+1 “ = F(C # i+1 ). Since F j is not an iso, F j is not an iso; Proposition 2.10 implies that F # i+1 is also not an iso. Then all of the induc­ tive properties are satisfied for i + 1. case i + 1 / # {p 1 , . . . , p n }: In this case, Fail­ipo and Fail­F for i + 1 are vacu­ ously true. By the inductive hypothesis, Fail­a holds for i, thus • a i = C i a i . Since • a i • a i+1 , Lemma 3.6 implies that there exist a i+1 # C and C i+1 , C # i+1 , F # i+1 such that a i F(F # i+1 ) a i+1 C # i+1 F # i+1 = C i+1 • a i+1 = C i+1 a i+1 C i+1 # D F(C i+1 ) = C i , where we let C i+1 “ = F(C # i+1 ); furthermore, F # i+1 = id if F # i+1 is an iso. All the inductive properties are satisfied for i + 1. If we let s be the sequence #F(F # 1 ), . . . , F(F # m )# with all the id arrows removed, then Fail­react, Fail­lab, and Fail­non­iso imply that a = a 0 s am and that each label in s is not an iso. 3.7. Failures preorder 53 Claim: am is stable. Suppose for contradiction am . Then • am = Cmam because by Fail­d, Cm # D. But, by hypothesis, • am is stable, a contradiction. Now let Y “ = # # # # # # # F(F # ) / F # is not an iso & # #C, D, F. F(C) = Cm & F(D) # D & F(F) # X & Figure 3.25 is an IPO # # # # # # # # Z “ = # F(F # ) / F # is not an iso & (#D. F(D) # D & Cm = F(DF # )) # # # # # C F # F D 3.25 Claim: am refuses Y #Z. Suppose F # is not an iso. Since F creates isos, F(F # ) is not an iso. Consider F(F # ) # Y , as witnessed by F(C) = Cm , F(D) # D, and F(F) # X such that Figure 3.25 is an IPO. Suppose for contradiction that am F(F # ) . By Lemma 3.7, • am = Cmam F(F) , a contradiction since • am refuses X # F(F). Consider F(F # ) # Z, as witnessed by F(D) # D such that Cm = F(DF # ). Suppose for contradiction that am F(F # ) . Then • am = Cmam = F(D)F(F # )a m since F(D) # D. This contradicts the assumption that • am is stable. Thus a has failure (s, Y # Z). (ii): Since a # f b, it follows that b has failure (s, Y # Z). By Fail­lab, Fail­non­iso, and Fail­react, there exist b = b 0 , . . . , b m+1 such that: b i-1 F(F # i ) b i if F # i #= id and 0 < i # m; b i-1 # b i if F # i = id and 0 < i # m; b m # b m+1 . Furthermore b m+1 is stable and refuses Y # Z. (iii): In the final part, we argue that Cb = C 0 b 0 has failure (t, X). First we claim that for 0 < i # m C i-1 b i-1 F j C i b i if i = p j for some j, where 1 # j # n. C i-1 b i-1 # C i b i if i / # {p 1 , . . . 
, p n }; We consider two cases: # # # # C i F # i F j C # i 3.26 case i = p j for some j, where 1 # j # n: By Fail­ipo, Figure 3.26 is an IPO. By Fail­F, F j is not an iso, so Proposition 2.10 implies that F # i is also not an iso. By construc­ tion, b i-1 F(F # i ) b i , so there exist b # , b ## such that: b i-1 # b # F(F # i ) b ## # b i . Chapter 3. Further congruence results 54 By Fail­d, C i-1 b i-1 # C i-1 b # . Since Figure 3.26 is an IPO, Lemma 3.7 and Fail­F imply that C i-1 b # = Fail­C F(C i )b # F j F(C # i )b ## = Fail­Cprime C i b ## . By Fail­d, C i b ## # C i b i . Thus C i-1 b i-1 F j C i b i , as desired. case i / # {p 1 , . . . , pm }: This case falls into two subcases: case F # i #= id: By construction b i-1 F(F # i ) b i , so there exist b # , b ## such that: b i-1 # b # F(F # i ) b ## # b i . By Fail­d, C i-1 b i-1 # C i-1 b # . By definition, F(F # i )b # b ## . Hence by Fail­d, C i-1 b # = Fail­C F(C i )b # = Fail­commute F(C # i )F(F # i )b # = Fail­Cprime C i F(F # i )b # C i b ## # C i b i . Thus, C i-1 b i-1 # C i b i , as desired. case F # i = id: By construction b i-1 # b i ; by Fail­d, C i # D, so C i-1 b i-1 # C i-1 b i = Fail­C F(C i )b i = Fail­commute F(C # i )b i = Fail­Cprime C i b i . Thus Cb = C 0 b 0 t Cm b m . By Fail­d, Cm # D, so Cb t Cm b m+1 . Claim: Cm b m+1 refuses X. Suppose for contradiction Cm b m+1 F for some F # X. By Lemma 3.5, there exists an IPO square, as in Figure 3.25, such that b m+1 F(F # ) and F(D) # D F(C) = Cm F(F) = F . By hypothesis, F is not an iso. Hence F is not an iso; Proposition 2.10 implies that F # is also not an iso. Furthermore F(F # ) # Y , which contradicts the hypothesis that b m+1 refuses Y . Claim: Cm b m+1 is stable. Suppose for contradiction that Cm b m+1 . By Lemma 3.6, there exists D, F # such that b m+1 F(F # ) and F(D) # D F(DF # ) = Cm . Moreover, if F # is an iso then it is equal to id. We consider two cases: case F # #= id: Then F(F # ) # Z, which contradicts the hypothesis that b m+1 refuses Z. case F # = id: Then b m+1 , which contradicts the hypothesis that b m+1 is stable. Thus Cb has failure (t, X), as desired. # 3.8 Multi­hole contexts As anticipated at the end of the previous chapter, this section enriches the definition of functorial reactive systems to model explicitly multi­hole contexts. At first glance, there 3.8. Multi­hole contexts 55 is no obstacle in accommodating multi­hole contexts directly within the existing theory of reactive systems. (For simplicity, let us ignore functorial reactive systems until later.) However this does not work well as I illustrate in the next few paragraphs. If we consider multi­hole term algebra contexts (as used in term rewriting), say, then we can choose the objects of C to be natural numbers and the arrows to be tuples of contexts (that use each hole exactly once) constructed from function symbols taken from some signature #. Concretely, if # = {#, # # , #, #}, where # and # # are constants, # is a 1­place function symbol, and # is a 2­place function symbol, then, we have the following examples of arrows: C 0 “ = #- 1 , #(# # )# : 1 2 C 1 “ = ##(- 2 , # # ), #, #(- 1 )# : 2 3 C 1 C 0 = ##(#(# # ), # # ), #, #(- 1 )# : 1 3 # # # # 0 m u m # n a l F D F # D # C 3.27 But what is an agent? A natural choice is to take agents to be pure terms, i.e. arrows 0 1 (a 1­tuple containing a term context with 0 holes). But this is not supported by the definition of reactive system, where agents consist of all arrows 0 m for m an arbitrary object of C. 
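(The discrepancy just raised is taken up in the next paragraph; first, the composition of these tuple contexts can itself be made concrete.) Composition is simultaneous substitution of the components of the inner tuple for the holes of the outer one. The sketch below is illustrative only: it fixes an encoding of terms as nested tuples and writes the two constants, the 1-place symbol and the 2-place symbol of the example signature as eps, eps1, delta and omega (these names are my own choice for the sketch, not notation from the text).

    def subst(term, args):
        """Replace hole i by args[i-1]; a term is ("hole", i) or (symbol, arg, ...)."""
        if term[0] == "hole":
            return args[term[1] - 1]
        return (term[0],) + tuple(subst(t, args) for t in term[1:])

    def compose(d, c):
        """Composition D C of context tuples: plug the components of C into the holes of D."""
        return tuple(subst(t, c) for t in d)

    # The example arrows C0 : 1 -> 2 and C1 : 2 -> 3 given above:
    C0 = (("hole", 1), ("delta", ("eps1",)))
    C1 = (("omega", ("hole", 2), ("eps1",)), ("eps",), ("delta", ("hole", 1)))
    assert compose(C1, C0) == (("omega", ("delta", ("eps1",)), ("eps1",)),
                               ("eps",),
                               ("delta", ("hole", 1)))

Each hole of the inner tuple is used exactly once, so the composite is again a tuple of the same kind, matching the arrow C1 C0 : 1 -> 3 displayed above.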
This discrepancy is non trivial: if we try to confine the definition of labelled transition and bisimulation to use only a limited set of agents, say those arrows 0 m for m # A, where A is some subset of the objects of C, then the proof of congruence goes wrong. The problem is that even if m,m # , n # A in Figure 3.27, it is not necessarily the case that an RPO F # , D # , C yields an object u # A. Thus even though F # , D # forms an IPO with respect to a, l, it is not the case that a F # since the codomain of F # is not in A. However, it is exactly the fact that RPOs sometimes do not produce an object in A that gives multi­hole contexts their power and that makes them worth considering. To see why, suppose that we take C to be a category of exclusively 0­ and 1­hole contexts. Then RPOs exist, as a corollary of Sewell's dissection result for terms (Lemma 1 in [Sew01]). Consequently, bisimulation is a congruence for term rewriting systems. The resulting labels are unnecessarily heavy, though. For consider the reaction rule (#(#), # # ); we have # #(-) # # which corresponds to our intuition that # needs #(-) to react. (When there is no confusion I use - for - 1 .) Unfortunately, we also have a labelled transition where the label contains a complete copy of the redex : # # #(-,#(#)) #(# # , # # ) . This was discussed at the end of the previous chapter. This heavy labelled transition is absent when we look at multi­hole contexts, as illustrated with the help of the diagram Chapter 3. Further congruence results 56 below. (Tuple brackets # # are omitted from singletons.) 0 1 2 1 1 # # #(#) #(-,#(#)) #(# # ,-) #-,#(#)# ## # ,-# #(- 1 ,- 2 ) If we work in the category of 0­ and 1­hole contexts then the outer square is an IPO, which gives rise to the transition #(-,#(a)) mentioned earlier. By admitting multi­hole contexts we have given the outer­square a simpler RPO. It is now possible to make precise the motivation for the results developed in this section. The goal is to reconcile two things: (i) we want to restrict of the collection of agents of C to arrows 0 m for m # A, where A can be a proper subset of the objects of C (for example A = {1} in the case of multi­hole term contexts); (ii) we want to admit RPOs that yield objects which are not contained in A. The key idea is to consider the notion of a strict monoidal category (C, #, 0), a category C equipped with a functor # : C×C C and an object 0 such that # is strictly associative and has unit 0. The role of the tensor # is to ``tuple'' arrows and objects, e.g. recalling the term contexts C 0 , C 1 from above, we have that: C 0 “ = #- 1 , #(# # )# : 1 2 C 1 “ = ##(- 2 , # # ), #, #(- 1 )# : 2 3 C 0 # C 1 = #- 1 , #(# # ), #(- 3 , # # ), #, #(- 2 )# : 3 5 C 1 # C 0 = ##(- 2 , # # ), #, #(- 1 ), - 3 , #(# # )# : 3 5 The following definition extends that of a functorial reactive system by postulating that both the upstairs and downstairs categories are monoidal and requiring that the functor between them respects the monoidal structure. The same enrichment could be performed on a reactive system. There is, however, no reason not to take more general approach shown here. Definition 3.27 (functorial monoidal reactive system; cf. Definition 3.1) A functorial monoidal reactive system consists of a functorial reactive system F : “ C C with the following added structure: . Both “ C and C are strict monoidal categories with unit objects # and 0, respectively, and tensor # (the same symbol being used for both categories). . F preserves tensors, i.e. 
F(C 1 # C 0 ) = F(C 1 ) # F(C 0 ). 3.8. Multi­hole contexts 57 . F preserves and creates units, i.e. for all u # obj “ C, F(u) = 0 i# u = #. . There is a subset A of C­objects and a subset A of “ C­objects, where A is the preimage under F of A. We use m,m # , . . . to range over A and m, m # , . . . to range over A. . Reacts # # m#A C(0, m) 2 . . Pairing with an agent yields a reactive context: a # id m # # D for a : 0 m. # The agents are arrows 0 m where m # A and the agent contexts are arrows m m # , for m,m # # A. Thus the reaction rules of Reacts are only between agents. We overload the terminology for “ C in a straightforward way: arrows # m of “ C are also agents for m # A; arrows m m # of “ C are also agent contexts for m, m # # A. The definition of labelled transition confines the arguments to be agents: Definition 3.28 (labelled transition for functorial monoidal reactive systems ( · m ); cf. Definition 3.3) a F m a # i# a F a # and a, a # are agents (thus forcing F to be an agent context). # Two properties now replace the hypothesis that F has all redex­RPOs. The first asserts that the RPO consists either of agent contexts or of pairing operations that place disjoint instances of a and l into the composite Fa = Dl: # # # # # m u m # n a l F D F # D # C 3.28 Definition 3.29 (F has all monoidal­redex­RPOs; cf. Definition 3.4) Suppose F : “ C C is a functorial monoidal reactive system and that a, l are agents, F, D are agent contexts, F(D) # D, and there exists r # C such that (F(l), r) # Reacts. Then F has all monoidal­redex­RPOs if any square, such as in Figure 3.28, has an RPO (as shown) such that if u / # A then u = m # m # , F # = id m # l, and D # = a # id m # . # The second property asserts that pairing agent contexts yields an IPO: # # # # # m m # m#m # a l id m #l a#id m # 3.29 Definition 3.30 (F has all monoidal­redex­IPOs) A functo­ rial monoidal reactive system F : “ C C has all monoidal­redex­ IPOs if any square, such as in Figure 3.29, is an IPO, provided a, l are agents and there exists r # C such that (F(l), r) # Reacts. # Just before giving a proof of congruence for strong bisimulation we need to consider a corollary of Lemma 3.7 for functorial monoidal reactive systems: Corollary 3.31 (portable IPO pasting for functorial monoidal reactive systems; cf. Lemma 3.7) Suppose F : “ C C is a functorial monoidal reactive system and has Chapter 3. Further congruence results 58 all monoidal­redex­RPOs. The following inference rule holds: C F # F C # is an IPO consisting of agent contexts a F(F # ) m a # F(C # ) # D F(C)a F(F) m F(C # )a # . # Strong bisimulation compares pairs of agents: Definition 3.32 (strong bisimulation over · m ; cf. Definition 3.8) Let # m # # m#A C(0, m) 2 be the largest strong bisimulation over · m such that # m contains only pairs of agents with common codomain. # Finally we prove congruence. As promised at the end of the previous chapter, the argument mirrors closely congruence proofs in typical process calculi. In particular, two cases are distinguished when analysing the transition Ca F m : (i) a, C, and F all con­ tribute, in which case a itself has a labelled transition; (ii) only C and F contribute, in which case Cb F m without needing to consider the behaviour of a and b. Theorem 3.33 (congruence for # m w.r.t. agent contexts) Let F : “ C C be a functorial monoidal reactive system which has all monoidal­redex­RPOs and has all monoidal­redex­IPOs. Then # m is a congruence with respect to agent contexts, i.e. 
a # m b implies Ca # m Cb for any agent context C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # m b and C # C is an agent context} . # # # # # m n u m # a l C F # F D D # C # 3.30 Suppose that a # m b and C # C, and thus (Ca, Cb) # S, where a, b : 0 m and C : m n. Suppose Ca F m a # . By the definition of F m and the hypothesis that F creates compositions, there exist “ C­ arrows a : # m, C : m n, l : # m # , F, D and a C­arrow r : 0 F(m # ) such that the big rectangle in Figure 3.30 is an IPO and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a F(C) = C F(F) = F . Because F has all monoidal­redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure 3.30. Note that F(C # ) # D since F(C # )F(D # ) = F(D) # D. By Proposition 2.7, the small left­hand square of Figure 3.30 is an IPO; Proposition 2.9 implies that the small right­hand square is an IPO too. Since F has all monoidal­redex­RPOs, we have additional information depending on whether u # A. We consider both cases. 3.8. Multi­hole contexts 59 Case u # A: By definition, a F(F # ) m a ## and a # = F(C # )a ## , where a ## “ = F(D # )r. Since a # m b, there exists b ## such that b F(F # ) m b ## and a ## # m b ## . Since the small right­hand square in Figure 3.30 is an IPO and F(C # ) # D, Corollary 3.31 implies that Cb F m F(C # )b ## . Also, a ## # m b ## implies (F(C # )a ## , F(C # )b ## ) # S, as desired. # # # # # m 0 m # m 0 #m # b l id m 0 #l b#id m # C 0 F 0 C # 0 3.31 Case u / # A: We have that F # = id m # l and D # = a # id m # . Since F lifts agents there exists b : # m 0 such that F(b) = b, and thus F(m 0 ) = m = F(m). Since F preserves tensors, F(F # ) = id F(m) #F(l) = id F(m 0 ) # F(l) = F(id m 0 # l). Since F allows IPO sliding, the small right­hand IPO square of Figure 3.30 can be slid to form the small right­hand IPO square in Figure 3.31, where F(C 0 ) = F(C) F(F 0 ) = F(F) F(C # 0 ) = F(C # ) . Since F has all monoidal­redex­IPOs, the small left­hand square of Figure 3.31 is an IPO. Since F has all monoidal­redex­RPOs, Proposition 2.9 implies that the big rectangle in Figure 3.31 is an IPO. Since F preserves tensors and since pairing with an agent yields a reactive context, F(C # 0 (b # id m # )) = F(C # )(b # id F(m # ) ) # D so: Cb F m F(C # 0 (b # id m # ))r = F(C # )(b # r) = F(C # )(id m # r)b . The last equality is a standard property of strict monoidal categories. Furthermore, a # = F(D)r = F(C # )F(D # )r = F(C # )F(a # id m # )r = F(C # )(a # r) = F(C # )(id m # r)a . Thus a # m b implies (F(C # )(id m # r)a, F(C # )(id m # r)b) # S as desired. # To round out this section, let us look again at the example of multi­hole term contexts over a signature #. Definition 3.34 (multi­hole term contexts (T # (#))) Given a signature # of function symbols then the category of multi­hole term contexts T # (#) over # is constructed as follows: the objects are the natural numbers; an arrow j k is a k­tuple of terms over the signature # # {- 1 , . . . , - j } containing exactly one use of each hole - i (1 # i # j). (When j = 1, I often write - 1 as -.) The identities are: id j “ = #- 1 , . . . , - j # : j j. For C = #t 1 , . . . , t k # : j k and D : k m, their composition is the substitution DC “ = {t 1 /- 1 , · · · , t k /- k }D . # In order to apply Theorem 3.33 to T # (#), we let C = “ C = T # (#) and F be the identity functor. If we let A = {1} then . an agent of T # (#) is a term a : 0 1; Chapter 3. 
Further congruence results 60 . agent context of T # (#) is a term context C : 1 1, i.e. a term containing a single hole. We may choose any subcategory of T # (#) to be the reactive contexts, subject to the conditions in Definition 3.27. The labelled transitions of T # (#) depend, of course, on the reaction rules. Once these are specified, the labelled transition relation · m is determined by Definition 3.28 and the induced strong bisimulation # m by Definition 3.32. As a corollary of Sewell's dissection result for terms (Lemma 1 in [Sew01]), F has all monoidal­redex­RPOs and has all monoidal­redex­IPOs. Hence from Theorem 3.33 the induced strong bisimulation # m is preserved by all term contexts. Let us now revisit the reactive system whose only reaction rule is (#(#), # # ). It contains exactly the following labelled transitions: D(#(#)) - m D(# # ) for all reactive term contexts D # D # #(-) m # # These agree with the transitions found by Sewell in the case of ground­term rewriting. I believe for any reaction rules specified by Reacts the derived labelled transitions for T # (#) coincide exactly with Sewell's. Chapter 4 Sliding IPO squares 4.1 Introduction and motivation The previous chapter shows a series of congruence proofs, each one for a di#erent opera­ tional equivalence. The theme throughout is the separation of ``reactive system'' into two categories with a functor F between them: the domain of F , i.e. “ C, is a category in which RPOs and IPOs exist; the codomain of F , i.e. C, is a category containing (i) the agents that perform reactions and labelled transitions and (ii) the agent contexts that serve as the labels and specify the closure condition for congruence. This separation is useful because the category for which we prove a congruence result is typically not one in which RPOs exist, as I show in the next chapter when considering categories of graph contexts. Thus functorial reactive systems are a powerful generalisation of reactive systems. This separation imposes a burden, though, in the form of the list of requirements given in Definition 3.1, i.e.: F lifts agents, creates isos, creates compositions, and allows IPO sliding. The motivation for the current chapter is to alleviate this burden by showing that all of these properties follow directly if we construct “ C, C, and F from a precategory A in a particular way. The assumption is that A is easier to construct than the others. (The next chapter gives a concrete instance of A.) Precategories are defined formally below. By way of motivation, though, let us look informally at an example we have in mind when deriving F : “ C C from A. A precate­ gory is just like a category but has a composition operation that is partial, i.e. not always defined. For example, consider a precategory A of ``raw contexts''. For the purpose of this example, take the objects to be natural numbers (representing the number of ports in an interface). Take the arrows m n to be just like normal graphs but with some added structure, namely an inner and outer interface. An arrow is thus a doughnut­shaped graph consisting of a set of nodes (which we call the support) and an arc relation; the arcs link nodes to one another and to the interface ports; m (ordered) interface ports sit on the inside ``hole'' and n (ordered ones) on the outside. (These are simpler than the graph con­ 61 Chapter 4. Sliding IPO squares 62 texts that appear in the next chapter.) 
Composition consists of placing one raw context inside the hole of another and joining together common ports. To see how this works, consider arrows A 0 : 1 2 and A 1 : 2 3: A 1 = v 1 v 2 A 0 = v 0 Then their composition, which we denote A 1 # A 0 : 1 3, is as follows: A 1 # A 0 = v 2 v 0 v 1 This example reveals why composition is partial in A. If we form A # 0 from A 0 by renaming the node v 0 by v 2 then the supports of A # 0 and A 1 are not disjoint. Thus the composition A 1 # A # 0 is undefined since there is no clear choice for the support of the composite. There are several possible ways of building a true category (i.e. with a total composition relation) from the data provided by A. Two possible ways are as follows: . We can construct a category “ C whose objects are pairs (m, U) where m is an object of A (in this case a natural number) and U is a set. An arrow (m 0 , U 0 ) (m 1 , U 1 ) consists of an A­arrow A : m 0 m 1 for which U 0 # U 1 and U 1 \ U 0 is equal to the support of A. Thus we can incorporate A 1 (given above) into many possible arrows in “ C, e.g. (2, #) A 1 (3, {v 1 , v 2 }) and (2, {w}) A 1 (3, {w, v 1 , v 2 }). As a result composition is always well­defined: if (m i , U i ) A i (m i+1 , U i+1 ) are “ C­arrows for i = 0, 1 then the supports of A 0 and A 1 are disjoint. . We can construct a category C, whose objects are the objects of A and whose arrows are #­equivalence classes of A­arrows. Two A­arrows are #­equivalent i# they are graph­isomorphic, i.e. have (potentially) di#erent supports but look like the same graphs. Composition for this category is also well­defined since it is always possible when composing arrows to find representatives of each equivalence class with disjoint supports. One might consider a third way, i.e. to use the arrows of A but rename the supports so as to make them disjoint when composing arrows; but this yields a bicategory, since composition is not associative. This approach is being investigated by Peter Sewell; I do not consider this line of research in this dissertation. What is the relationship between “ C and C? There is a simple functor F which discards the set component of each “ C­object (i.e. F : (m, U) ## m) and maps each arrow to its 4.2. Properties of A 63 #­equivalence class. As we see later in this chapter, this functor has all of the desired properties listed earlier, e.g. F allows IPO sliding. The rest of this chapter repeats the above constructions in full formality. 4.2 Properties of A First we formally define a precategory (see [MSS00]): Definition 4.1 (precategory) A precategory A consists of similar data to that of a category: a collection of objects m, n, . . .; a collection of arrows A(m,m # ) between objects m and m # ; an identity arrow id m # A(m,m) for all m; and a partial composition operation, which we write here as # : A(m 1 , m 2 ) × A(m 0 , m 1 ) # A(m 0 , m 2 ) on arrows. Identity: composition with an identity arrow is always well­defined, i.e. for all A : m 0 m 1 , we have that id m 1 # A = A = A # id m 0 and both compositions are defined. Associativity: if A 2 # A 1 and A 1 # A 0 are defined then either both A 2 # (A 1 # A 0 ) and (A 2 # A 1 ) # A 0 are undefined or both are defined and equal. # Next we define some special properties of a precategory. These properties form a specification (used in Chapter 5) which any precategory is required to satisfy in order to make use of the constructions and propositions of this chapter. 
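Before stating those properties, the raw-context example just described can be made concrete. The sketch below is illustrative only (the class name, node names and encoding are invented): an arrow records nothing but its interfaces and its support, composition is partial exactly when the supports clash or the interfaces fail to meet, and support translation (one of the operations required of a well-supported precategory in Definition 4.2 below) renames the support along an injective map.

    class Raw:
        """A raw context, represented only by its interfaces and its support."""
        def __init__(self, dom, cod, support):
            self.dom, self.cod, self.support = dom, cod, frozenset(support)

    def compose(a1, a0):
        """Composition of a1 after a0: defined only when Dom a1 = Cod a0 and the
        supports are disjoint; the support of the composite is the union."""
        if a1.dom != a0.cod or (a1.support & a0.support):
            return None                      # composition undefined
        return Raw(a0.dom, a1.cod, a1.support | a0.support)

    def translate(sigma, a):
        """Support translation of a along an injective map sigma covering |a|."""
        assert a.support <= set(sigma) and len(set(sigma.values())) == len(sigma)
        return Raw(a.dom, a.cod, {sigma[v] for v in a.support})

    A0, A1 = Raw(1, 2, {"v0"}), Raw(2, 3, {"v1", "v2"})
    assert compose(A1, A0).support == {"v0", "v1", "v2"}
    assert compose(A1, translate({"v0": "v2"}, A0)) is None   # supports clash

This partiality is exactly what the track and the support quotient, constructed below, repair in two different ways.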
Definition 4.2 (well­supported precategory) A is a well­supported precategory i# it is a precategory and it satisfies the following properties: . A has a support function |·| that maps an arrow to a set such that A 1 # A 0 is defined i# |A 1 | # |A 0 | = # and DomA 1 = CodA 0 . The support function satisfies additionally two axioms: |A 1 # A 0 | = |A 1 | # |A 0 | (provided A 1 # A 0 is defined) Supp­comp |id m | = # . Supp­id . For any arrow A # A(m 0 , m 1 ) and any injective map # for which Dom # # |A|, there exists an arrow # # A # A(m 0 , m 1 ), which is called the support translation by # of A, where: # # id m = id m Trans­id­r # # (A 1 # A 0 ) = # # A 1 # # # A 0 Trans­comp­r Id |A| # A = A Trans­id­l (# 1 # # 0 ) # A = # 1 # (# 0 # A) Trans­comp­l # 0 # |A| = # 1 # |A| implies # 0 # A = # 1 # A Trans­res |# # A| = #|A| . Trans­supp Chapter 4. Sliding IPO squares 64 A note about Trans­comp­r: Since # is injective, Trans­supp implies that the LHS is defined i# the RHS is defined. (These axioms are similar to those of Honda's Rooted P­Sets [Hon00], though his application concerns free names and renaming.) # 4.3 Construction of “ C We now turn to problem of building a genuine category from a well­supported precategory A. The idea is to enrich the object structure with enough data so that composition is always defined. This construction is captured in the following definition: Definition 4.3 (track) Given a well­supported precategory A, the track of A is a category “ C. An object of “ C is a profile: a pair (m, U) where m is an object of A and U is a set. We let p range over profiles and adopt the firm convention that the components of a profile p are always written (m, U) with suitable decoration, e.g. p # = (m # , U # ) and p i = (m i , U i ). An arrow p 0 A p 1 consists of an arrow A # A(m 0 , m 1 ) such that U 0 # U 1 and |A| = U 1 \ U 0 . We always include the profiles when referring to an arrow of “ C since di#erent “ C­arrows can be constructed from the same A­arrow, each with di#erent profiles. The identities of “ C are defined by id p “ = p id m p. Composition is defined in terms of the underlying composition in A: p 0 A 0 p 1 A 1 p 2 “ = p 0 A 1 #A 0 p 2 . # Proposition 4.4 If “ C is the track of A then “ C is a category. Proof Composition is well­defined: if p i A i p i+1 are arrows for i = 0, 1, then |A 1 | # |A 0 | = (U 2 \U 1 ) # (U 1 \ U 0 ) = #, so A 1 # A 0 is defined. Furthermore, U 0 # U 1 # U 2 and U 2 \ U 0 = (U 2 \ U 1 ) # (U 1 \ U 0 ) = |A 1 | # |A 0 | = Supp­comp |A 1 # A 0 | , so p 0 A1 #A0 p 2 is an arrow of “ C as desired. The identity arrows are well­defined: By Supp­id, |id m | = #, so for any p = (m, U ), p id m p is an arrow of “ C. Composition is associative and respects identities: Immediate because these same proper­ ties hold in A. # 4.4 Operations on “ C In order to motivate the following results, let us recall the intuitions about the functor F (defined formally later): F maps p 0 A p 1 to an isomorphism equivalence class of A and throws away the set data contained in the profiles p 0 , p 1 . To prove that F allows IPO sliding, we require two things. (i) We have to understand how two “ C­arrows are related 4.4. Operations on “ C 65 when they have the same F image, i.e. F(p 0 A p 1 ) = F(p # 0 A # p # 1 ). (ii) From the first piece of information, we have to slide similarly an IPO square whose left leg is p 0 A p 1 to one whose left leg is p # 0 A # p # 1 . 
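(The discussion of (i) and (ii) continues immediately below.) First, a small self-contained sketch may make the role of profiles in Definition 4.3 concrete; as before, the encoding and names are invented, and an underlying arrow is represented only by its interfaces and its support.

    def track_arrow(p0, A, p1):
        """A track arrow p0 -> p1 over A = (dom, cod, supp): requires
        U0 included in U1 and |A| = U1 - U0, as in Definition 4.3."""
        (m0, U0), (dom, cod, supp), (m1, U1) = p0, A, p1
        assert dom == m0 and cod == m1 and U0 <= U1 and U1 - U0 == supp
        return (p0, A, p1)

    def track_compose(g, f):
        """Composition in the track: always defined, because the profiles force
        the supports of the two underlying arrows to be disjoint."""
        (p0, (d0, c0, s0), p1), (q1, (d1, c1, s1), p2) = f, g
        assert p1 == q1 and not (s0 & s1)
        return track_arrow(p0, (d0, c1, s0 | s1), p2)

    A0, A1 = (1, 2, frozenset({"v0"})), (2, 3, frozenset({"v1", "v2"}))
    f = track_arrow((1, frozenset()), A0, (2, frozenset({"v0"})))
    g = track_arrow((2, frozenset({"v0"})), A1, (3, frozenset({"v0", "v1", "v2"})))
    h = track_arrow((2, frozenset({"w"})), A1, (3, frozenset({"w", "v1", "v2"})))
    # g and h are distinct track arrows built from the same underlying arrow A1
    assert track_compose(g, f)[2] == (3, frozenset({"v0", "v1", "v2"}))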
For (i), it is clear that we can perform some profile translation (defined precisely later) on p 0 A p 1 to replace the set components of p 0 and p 1 and then perform a support transla­ tion on the resulting arrow to arrive at p # 0 A # p # 1 . If these two operations (profile translation and support translation) are iso functors (and thus preserve categorical constructions) then we can accomplish (ii). These two operations are not in fact iso functors on the whole of “ C but they are on certain convex subcategories, as defined next. Fortunately, the subcategories in question are rich enough to be proxies for “ C with respect to IPO squares, as shown in the result immediately following the definition: Definition 4.5 (convex subcategories of “ C) For any sets U 0 # U 1 , we write “ CU 0 ,U 1 for the convex subcategory of “ C w.r.t. U 0 , U 1 , namely the full subcategory of “ C formed by taking only those profiles (m, U) for which U 0 # U # U 1 . # # # # # p p 0 p 1 p 2 A 0 A 1 B 0 B 1 4.1 Proposition 4.6 Suppose that Figure 4.1 commutes in “ C. Then it is an IPO in “ C i# it is an IPO in “ CU,U 2 . Proof For any pair of arrows p C p # C # p 2 , we have that U # U # # U 2 , so p # # obj “ CU,U2 (Definition 4.5) hence “ CU,U2 satisfies the hypothesis of Proposition 2.12, whence the result follows. # Now we now define precisely profile translation and establish that it is an iso functor: Proposition 4.7 (profile translation is an iso functor) If W 1 = W 0 #W and W # 1 = W # 0 # W then the following operation, called profile translation, induces an isomorphism of categories H : “ CW 0 ,W 1 “ CW # 0 ,W # 1 , H : (m, W 0 # V ) ## (m, W # 0 # V ) for V # W H : (p 0 A p 1 ) ## H(p 0 ) A H(p 1 ) . Proof H is well­defined on arrows: Suppose that (p 0 A p 1 ) # “ CW0,W1 where U i = W 0 # V i , i = 0, 1 for some V 0 # V 1 # W . Now, W # 0 # V 0 # W # 0 # V 1 and (W # 0 # V 1 ) \ (W # 0 # V 0 ) = V 1 \ V 0 = |A| so (H(p 0 ) A H(p 1 )) # “ CW # 0 ,W # 1 as desired. H is a functor: Consider the action of H on identities: H(id p ) = H(p id m p) = H(p) id m H(p) = id H(p) Chapter 4. Sliding IPO squares 66 and on compositions: H(p 0 A0 p 1 A1 p 2 ) = H(p 0 A1 #A0 p 2 ) = H(p 0 ) A1 #A0 H(p 2 ) = H(p 0 ) A1 H(p 1 ) A0 H(p 2 ) H is an isomorphism of categories: By symmetry, the map H # : “ CW # 0 ,W # 1 “ CW0,W1 defined by H # : (m, W # 0 # V ) ## (m, W 0 # V ) for V # W H # : (p 0 A p 1 ) ## H # (p 0 ) A H # (p 1 ) . is also a functor and clearly inverts H, as desired. # Finally we lift the support translation operation from A to “ C in a straightforward way. This definition induces a functor on convex subcategories of “ C Proposition 4.8 (support translation is an iso functor) Given W 0 # W 1 and an injection # with Dom # # W 1 , the following operation, called support translation, induces an isomorphism of categories # # (·) : “ CW 0 ,W 1 “ C#W 0 ,#W 1 , # # (m, U) “ = (m, #U) # # (p 0 A p 1 ) “ = # # p 0 # # A # # p 1 , where # # A is the support translation by # of A in A. Proof # # (·) is well­defined on arrows: Suppose that (p 0 A p 1 ) # “ CW0,W1 . Then W 0 # U 0 # U 1 # W 1 # Dom #. Thus #U 0 # #U 1 and #U 1 \ #U 0 = # injective #(U 1 \ U 0 ) = #|A| = Trans­supp |# # A| , so # # (·) maps an arrow to an arrow. # # (·) is a functor: By Trans­id­r and Trans­comp­r, # # (·) preserves identities and composi­ tions, so is a functor. # # (·) is an iso functor: Note that # has an inverse # -1 : Im # # Dom #. 
Furthermore, # -1 # (·) : “ C#W0,#W1 “ CW0,W1 is a functor for the same reasons (given above) that # # (·) is. By Trans­comp­l and Trans­ id­l, the functors # # (·) and # -1 # (·) invert each other, so # # (·) is an isomorphism of categories, as required. # 4.5 Construction of C We now turn to the construction of C, which was described informally in Section 4.1. Recall that the arrows of C are equivalence classes of arrows of A. In this section we make precise the underlying equivalence relation and the construction of C and verify that C is a well­defined category. 4.6. Construction of F 67 Definition 4.9 (#­equivalence for A) Given two arrows A, A # : m 0 m 1 in A, we say that they are #­equivalent, written A # A # , i# there exists a bijection # : |A| ## |A # | such that # # A = A # . By Trans­id­l and Trans­comp­l, # is an equivalence relation. # Now the construction of C is straightforward: Definition 4.10 (support quotient) Given a well­supported precategory A, the sup­ port quotient of A is a category C. The objects of C are the objects of A. The arrows m 0 m 1 of C are #­equivalence classes of arrows in A: C(m 0 , m 1 ) “ = {[A] # / A # A(m 0 , m 1 )} . Identities: id m # C(m,m) “ = [id m ] # . Composition: if A 1 # A 0 is defined in A then [A 1 ] # [A 0 ] # “ = [A 1 # A 0 ] # . # This definition yields a well­defined category: Proposition 4.11 If C is the support quotient of A then C is a category. Proof Composition is total: Consider any two arrows in C such as [A i ] # : m i m i+1 for i = 0, 1. Let W be a fresh set in bijection with |A 1 |, as witnessed by # : |A 1 | ## W . Then # # A 1 # A 0 is defined since |# # A 1 | # |A 0 | = Trans­supp W # |A 0 | = #; thus [A 1 ] # [A 0 ] # = [# # A 1 ] # [A 0 ] # = [# # A 1 # A 0 ] # , as desired. Composition is well­defined: Let [A i ] # = [A # i ] # for i = 0, 1 with both A 1 # A 0 and A # 1 # A # 0 defined. Claim: [A 1 ] # [A 0 ] # = [A # 1 ] # [A # 0 ] # . By hypothesis, there exist bijections # i : |A i | ## |A # i | such that A # i = # i # A i for i = 0, 1. Since |A 1 | # |A 0 | = # and |A # 1 | # |A # 0 | = #, we can define # “ = # 0 # # 1 , a union of bijections with disjoint domains and disjoint codomains. Now, # # (A 1 # A 0 ) = Trans­comp­r # # A 1 # # # A 0 = Trans­res # 1 # A 1 # # 0 # A 0 = A # 1 # A # 0 so [A 1 ] # [A 0 ] # = [A 1 # A 0 ] # = [A # 1 # A # 0 ] # = [A # 1 ] # [A # 0 ] # , as desired. Composition is associative: Follows from the associativity of the underlying composition in A. Composition respects identities: Follows from the fact that composition respects identities in A. # 4.6 Construction of F In this final section we define a functor F from “ C (the track of A) to C (the support quotient of A). We then verify that F has all the required properties, i.e.: F lifts agents, creates isos, creates compositions, creates left inverses, and allows IPO sliding. Two Chapter 4. Sliding IPO squares 68 comments are in order. (i) For the first we verify a stronger property defined below, namely F lifts arrows by their domain. The reason for this choice is that there is no postulated distinguished object in A or C corresponding to the 0 of a functorial reactive system (see Definition 3.1), which is required when defining the property ``F lifts agents''. However, the stronger property ``F lifts arrows by their domain'' is well­defined. (ii) The penultimate property, ``F creates left inverses'' does not appear in the definition of a functorial reactive system but is used in Appendix B. 
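Before defining F itself, the composition of support-quotient arrows (Definition 4.10) can also be sketched. The fragment below is illustrative only: it models just the freshness bookkeeping used in the proof of Proposition 4.11 (choosing a representative whose support avoids the other's), not the graph structure that the bijection underlying the equivalence of Definition 4.9 must respect.

    import itertools

    def fresh_rename(a, avoid):
        """A support translation of `a` whose new support is disjoint from `avoid`."""
        dom, cod, supp = a
        fresh = (n for n in map(str, itertools.count()) if n not in avoid)
        return (dom, cod, frozenset(next(fresh) for _ in supp))

    def quotient_compose(a1, a0):
        """Compose the classes [a1][a0] from arbitrary representatives, as in
        Proposition 4.11: first move the support of a1 away from that of a0."""
        d0, c0, s0 = a0
        d1, c1, s1 = fresh_rename(a1, s0)
        assert d1 == c0 and not (s0 & s1)
        return (d0, c1, s0 | s1)

    A0 = (1, 2, frozenset({"v2"}))
    A1 = (2, 3, frozenset({"v1", "v2"}))   # representatives whose supports clash
    dom, cod, supp = quotient_compose(A1, A0)
    assert (dom, cod) == (1, 3) and len(supp) == 3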
Definition 4.12 (support­quotienting functor) Let A be a well­supported precat­ egory and “ C and C be as in Definition 4.3 and Definition 4.10. Then we define a map F : “ C C called the support­quotienting functor : F : (m, U) ## m F : (p 0 A p 1 ) ## F(p 0 ) [A]# F(p 1 ) . # Lemma 4.13 F is a functor. Proof Observe the action on identities: F(id p ) = F(p id m p) = [id m ] # = id m and on compositions: F(p 0 A0 p 1 A1 p 2 ) = F(p 0 A1 #A0 p 2 ) = [A 1 # A 0 ] # = [A 1 ] # [A 0 ] # = F(p 1 A1 p 2 ) F(p 0 A0 p 1 ) # Now we prove all the easy properties of F : Theorem 4.14 Let F : “ C C be the support­quotienting functor constructed from A. Then: . F lifts arrows by their domain: if F(p 0 ) = Dom [A] # then there exists p 0 B p 1 such that F(p 0 B p 1 ) = [A] # . . F creates isos: if F(p 0 A p 1 ) is an iso then p 0 A p 1 is an iso. . F creates compositions: if F(p # 0 B p # 2 ) = [A 1 ] # [A 0 ] # then there exist “ C­arrows p # i B i p # i+1 with F(p # i B i p # i+1 ) = [A i ] # for i = 0, 1 (4.1) p # 0 B p # 2 = p # 0 B 0 p # 1 B 1 p # 2 (4.2) 4.6. Construction of F 69 . F creates left inverses: if id m 0 = [A 1 ] #F(p 0 A 0 p 1 ) then there exists p 1 A # 1 p 0 such that id p 0 = p 0 A 0 p 1 A # 1 p 0 and F(p 1 A # 1 p 0 ) = [A 1 ] # . Proof F lifts arrows by their domain: Suppose [A] # : (m 0 , n 0 ) (m 1 , n 1 ), thus A : (m 0 , n 0 ) (m 1 , n 1 ) in A. Let W be a fresh set in bijection with |A|, as witnessed by # : |A| ## W . Then p 0 # # A p 1 is an arrow in “ C, where U 1 “ = U 0 #W . Furthermore F(p 0 # # A p 1 ) = [A] # as desired. F creates isos: Suppose F(p 0 A p 1 ) is an iso, i.e. there exists an A­arrow A # : m 1 m 0 such that: [A # ] # [A] # = id m0 = [id m0 ] # [A] # [A # ] # = id m1 = [id m1 ] # Without loss of generality, assume that |A| # |A # | = #. Then A # # A # id m0 and A # A # # id m1 . By Supp­id and Trans­id­r, A # # A = id m0 and A # A # = id m1 . By Supp­comp, |A| = |A # | = #. Thus U 1 = U 0 and p 1 A # p 0 is a “ C­arrow. Moreover, p 0 A p 1 A # p 0 = p 0 A # #A p 0 = p 0 id m 0 p 0 = id p0 and symmetrically. Thus p 0 A p 1 is an iso in “ C as desired. F creates compositions: Without loss of generality, assume that |A 1 | # |A 0 | = #. Then there exist p 0 , p 1 , p 2 such that p i A i p i+1 are arrows in “ C for i = 0, 1. By the definition of F , there exists a bijection # : |A 1 | # |A 0 | ## |B| such that # # (A 1 # A 0 ) = B; moreover m 0 = m # 0 and m 2 = m # 2 . Let B i “ = # # A i for i = 0, 1. Let (m # 1 , U # 1 ) “ = (m 1 , U # 0 # |B 0 |), thus defining p # 1 . We claim that p # i B i p # i+1 are arrows in “ C for i = 0, 1; there are three things that we need to check: U # 0 # U # 0 # |B 0 | # U # 0 # |B| = U # 2 U # 1 \ U # 0 = |B 0 | U # 2 \ U # 1 = (U # 0 # |B|) \ (U # 0 # |B 0 |) = |B| \ |B 0 | = |B 1 | . Now, B 1 # B 0 = B by Trans­comp­r, from which (4.2) follows. Also, by Trans­res, B i # A i for i = 0, 1, from which (4.1) follows. F creates left inverses: By hypothesis, id m0 = [A 1 ] # [A 0 ] # = [A # 1 # A 0 ] # for some A # 1 : m 1 m 0 with A 1 # A # 1 , since composition in C is total. Thus id m0 # A # 1 #A 0 . By Supp­id, Supp­ comp, and Trans­id­l, id m0 = A # 1 #A 0 and |A # 1 | = |A 0 | = #. Thus U 0 = U 1 . Thus p 1 A # 1 p 0 is an arrow in “ C and moreover F(p 1 A # 1 p 0 ) = [A # 1 ] # = [A 1 ] # and p 0 A0 p 1 A # 1 p 0 = id p0 , as desired. 
# An aside on the condition ``F creates compositions'': This looks tantalisingly close to the Conduch’e fibration property [Con72, Joh99, BF00], especially if one says that two decompositions in “ C are equivalent if one is the result of a support translation of the other. Perhaps there is some 2­category version of the Conduch’e property which works exactly if one thinks of “ C as a 2­category with support translations as 2­cells. Let us now return to the main flow of the argument. Chapter 4. Sliding IPO squares 70 Finally, we prove the key property, namely that F allows IPO sliding. The proof follows the outline given in Section 4.4. We start with two “ C­arrows with the same image under F , namely F(p A 1 p 1 ) = F(p # A # 1 p # 1 ). The first arrow is the left leg of an IPO square in “ C. This square is also an IPO in “ CU,U 2 , the convex subcategory w.r.t. U, U 2 (Definition 4.5), where U # U 2 are the sets in, respectively, the upper­left and lower­right profiles of the square. We isomorphically transform this subcategory by profile translation and then support translation, gaining a new square that has three properties: it has the same image under F as the original; its left leg is p # A # 1 p # 1 ; it is an IPO in a convex subcategory, so is an IPO in “ C. Before looking at the technical details, it is useful to consider a concrete case of sliding. Because we have not formally defined the graph contexts referred to at the beginning of this chapter, it is impossible to be precise about which commuting squares are IPOs and which are not. Nonetheless, the activity of ``sliding'' is relevant for all commuting squares, whether or not they are IPOs. Let us consider a category of graph contexts formed as the track (Definition 4.3) of the precategory or raw contexts informally defined in Section 4.1. The arrows of this category are just like the raw contexts (doughnut­shaped graphs with an inner and outer interface) but with profiles (Definition 4.3) rather than just natural numbers as objects. Consider the square in Figure 4.2(1). Its left leg has the same F image as the left leg of the square in Figure 4.2(3): the two graph contexts look essentially the same, the only di#erence being the supports. With a profile translation, we can replace the singleton set {u} in the top­left corner of Figure 4.2(1) with a fresh 2­element set {u ## 0 , u ## 1 }, as shown in Figure 4.2(2). The freshness is essential to prevent clashes with the other nodes present in the square, namely v 0 , v 1 . Now, if # is defined as follows: # : v i ## v # i i = 1, 2 # : u ## i ## u # i i = 1, 2 then the support translation by # of Figure 4.2(2) yields Figure 4.2(3), as desired. Since the passage from Figure 4.2(1) to Figure 4.2(2) and then to Figure 4.2(3) was e#ected by iso functors, all the universal properties of Figure 4.2(1) (e.g. being an IPO) hold of Figure 4.2(3) too. # # # # p p 0 p 1 p 2 A 0 A 1 B 0 B 1 4.3 Theorem 4.15 (F allows IPO sliding) Let F : “ C C be the support­quotienting functor constructed from A. Then F allows IPO sliding (Definition 3.1). Proof Consider any IPO square in “ C, as in Figure 4.3, and any arrow p # A # 1 p # 1 with F(p # A # 1 p # 1 ) = F(p A 1 p 1 ); thus A # 1 = # # A 1 for some bijection # : |A 1 | |A # 1 | and U # 1 = U # # #|A 1 |. 4.6. 
Construction of F 71 v 1 v 0 v 1 (2, {u}) (2, {v0 , v1 , u}) (0, {v0 , v1 , u}) (4, {v0 , u}) v 0 4.2(1) A commuting square before sliding v1 v0 v1 (2, {u ## 0 , u ## 1 }) (2, {v0 , v1 , u ## 0 , u ## 1 }) (0, {v0 , v1 , u ## 0 , u ## 1 }) (4, {v0 , u ## 0 , u ## 1 }) v0 4.2(2) First we apply profile translation . . . v # 0 v # 1 v # 0 v # 1 (4, {v # 0 , u # 0 , u # 1 }) (2, {u # 0 , u # 1 }) (2, {v # 0 , v # 1 , u # 0 , u # 1 }) (0, {v # 0 , v # 1 , u # 0 , u # 1 }) 4.2(3) . . . and then support translation by # Figure 4.2: Sliding Chapter 4. Sliding IPO squares 72 # # # # H(p) H(p 0 ) H(p 1 ) H(p 2 ) A 0 A 1 B 0 B 1 4.4 By Proposition 4.6, Figure 4.3 is an IPO in “ CU,U 2 . Let U ## be a fresh set in bijection with U # , as witnessed by µ : U ## ## U # . Let U ## 2 “ = U ## # (U 2 \ U ). Then U 2 \ U = U ## 2 \ U ## so by Proposition 4.7, there is a profile translation H : “ CU,U 2 “ C U ## ,U ## 2 which is an iso functor and whose action on profiles is: H : (m, U # V ) ## (m, U ## # V ) for V # U 2 \ U and whose action on arrows leaves the underlying A­arrow component unchanged. Since isomorphisms of categories preserve universal constructions, Figure 4.4 is an IPO in “ C U ## ,U ## 2 and has the same image under F as Figure 4.3 does. # # # # # # H(p) # # H(p 0 ) # # H(p 1 ) # # H(p 2 ) # # A 0 # # A 1 # # B 0 # # B 1 4.5 Let W be a fresh set in bijection with |B 1 |, as witnessed by # : |B 1 | ## W . Let # “ = µ # # # #, a union of bijections with mutually disjoint domains and codomains. Also Dom # = U ## # |A 1 | # |B 1 | = U ## 2 . Because # is bijective, Proposition 4.8 implies that there is a support translation # # (·) : “ C U ## ,U ## 2 “ C #U ## ,#U ## 2 which is an iso functor. Iso functors preserve universal constructions, so Figure 4.5 is a IPO in “ C #U ## ,#U ## 2 and has the same image under F . By Proposition 4.6, this square is an IPO in “ C. Moreover, # # (H(p) A 1 H(p 1 )) = µ # H(p) # # A 1 (µ # #) # H(p 1 ) = p # A # 1 p # 1 as desired. # Chapter 5 Action graph contexts 5.1 Introduction As promised in the previous chapter, the present one gives a substantial example of a precategory A­Ixt of raw contexts. (The ``Ixt'' in C­Ixt is for ``insertion context'', a ter­ minology explained in Section 5.6.) The need to handle graphs was the original motivation for functorial reactive systems: as we will see in Section 5.3, RPOs do not always exist for C­Ixt, the support quotient (Definition 4.10) of A­Ixt, so it is necessary to consider an upstairs category which does possess su#cient RPOs and a functor down to C­Ixt. The precategory A­Ixt has all the extra structure required of A, namely a sup­ port function and a support translation operation, so is a well­supported precategory (Definition 4.2). By direct instantiation of the results of the previous chapter, we can construct three things: the track of A­Ixt, which we call “ C­Ixt; the support quotient of A­Ixt, which we call C­Ixt; and a functor F : “ C­Ixt C­Ixt. These are related in the following table to their counterparts from the Chapter 4: well­supported precategory: A A­Ixt track: “ C “ C­Ixt support­quotienting functor: # F # F support quotient: C C­Ixt Thus by Theorem 4.14 and Theorem 4.15, F lifts arrows by their domain, creates isos, creates compositions, and allows IPO sliding. Furthermore A­Ixt has a distinguished ob­ ject 0, hence there are distinguished objects 0 and # of C­Ixt and “ C­Ixt, with F(#) = 0, whence F lifts agents. 
Thus, any choice of reaction rules Reacts for C­Ixt and reac­ tive context subcategory D of C­Ixt (Definition 2.1) yields a functorial reactive system (Definition 3.1). 73 Chapter 5. Action graph contexts 74 The main hurdle then in using the congruence results of Chapter 3 is proving that F has all redex­RPOs (Definition 3.4). It turns out that this property fails for some choices of reaction rules, but that it does hold for a rich class of them. As promised in Section 3.1, the category C­Ixt of graph contexts (the codomain of F) does not admit enough RPOs for subtle reasons. Examples are shown in Section 5.3. The cause is the lack of su#cient intensional information as to which node in contexts C 0 or C 1 , say, corresponds to a node in the composite C 0 C 1 . It is exactly “ C­Ixt, the domain of F , that contains just enough structure to track how the nodes of two contexts are related to the nodes of their composition. By the definition of ``F has all redex­RPOs'', the proof obligation is to show that su#cient RPOs exist in “ C­Ixt. It transpires that a frontal attack on this is di#cult (and not attempted in this dissertation): the combinatorial complexity of context composition, for example, is daunting, and the manipulation of the compositions required for verifying the existence of RPOs is too much to handle. However, an indirect approach based on a category G­Inc of graphs and inclusion embeddings (a kind of graph embedding) works smoothly. This is the subject of the next chapter, which shows that RPOs in “ C­Ixt correspond to relative coproducts in G­Inc; gives a direct construction of the latter subject to some conditions; and, finally, shows that these conditions are necessary for the existence of relative coproducts. Thus F has all redex­RPOs for a wide variety of reaction rules. The development of the theory presented in this chapter and the next is adapted from [CLM00] but di#ers from it in the following ways: I omit extraneous material (not relevant to labelled transitions) and include full proofs of the existence of relative coproducts (Theorem 6.27 and Theorem 6.26). I also have renamed some of the structures: in this dissertation I write “ C­Ixt and C­Ixt; in [CLM00], these are denoted PIns 0 and ACxt 0 respectively. 5.2 Action calculi reviewed I review a restricted class of the action calculi which were presented in [Mil96]. The rest of this chapter and the next make no specific use of the algebra of action calculi shown here since all of the work is done directly on graphs. Nonetheless, the design decisions taken when defining graphs are guided by the algebraic presentation of action calculi, so the latter provide valuable motivation. A closed, shallow action calculus is a strict monoidal category whose objects are natural numbers k, m, n, o . . ., and whose arrows are called action graphs and written a : (k, o), b : (m, n). (I avoid the usual arrow notation a : k o, reserving it for the context arrows of reactive systems.) The tensor product of these two action graphs is a# b : (k +m, o+n); the composition of a : (k, o) and b : (o, m) is a·b : (k, m); the identity action graph of arity (m, m) is i m . The order of composition is not conventional in category 5.2. Action calculi reviewed 75 theory: it connotes the orientation of the action graphs we work with. (Note that this composition is the horizontal juxtaposition of action graphs and has nothing to do with contextual composition which we consider later.) A pair of natural numbers (k, o) is an arity ; let #, #, . . . 
range over arities. I deal only with closed, shallow action calculi and so usually omit these adjectives from now on. An action calculus has a control signature K = {K, L, . . .} consisting of controls, each of which has an arity. There are constants p : (2, 2), c : (1, 2) and # : (1, 0) for permutation, copy and discard. These constants represent only the swapping, sharing and elimination of arcs, not of nodes. They satisfy simple equations, e.g. c·p = c representing the com­ mutativity of copying. There is also an operator called reflexion [Mil94] (similar to the ``trace'' of Joyal et al. [JSV96]) which we need not detail here. Finally, each action calculus has a binary reaction relation , relating action graphs of equal arity. This relation is preserved by all constructions, i.e. by composition, tensor product and reflexion. For examples of action graphs, let K : (0, 1), M : (1, 1) and L : (1, 0) be controls. Then the following are action graphs, with their arities: K #M : (1, 2) K·c·(M # L) : (0, 1) (K·M) # (M ·L) : (1, 1) . Composition · binds tighter than tensor #, so the last can be written K·M #M ·L. A context C is an action graph containing a single hole with arity #, written - # . I omit the arity, writing -, if it is determined by the rest of the context or the surrounding discussion. Thus a context C : # # is an action graph of arity # with a hole of arity #. Here are two contexts along with their domain and codomain arities (the arity of the hole being fully determined in the second case): - 1,1 ·c·(M #M) : (1, 1) (1, 2) K·-·L : (1, 1) (0, 0) . Figure 5.1 shows an action graph and a context using a graphical notation. It uses nodes (rectangles with two blunt corners) to represent occurrences of controls, and arcs to represent composition. An action graph with arity (m, n) has m source ports on its left side and n target ports on its right side. A control node or hole of arity (m, n) has m target ports at its left side n source ports at its right side. At a source node, branching of arcs represents c and absence of arcs represents #. Two contexts are equal if the algebraic theory equates them, treating the hole as a control distinct from all others. The composition of two contexts C : # # and Chapter 5. Action graph contexts 76 L M M M K Figure 5.1: The action graph K·c·(M # L) and the context - 1,1 ·c·(M #M) D : # #, written here DC (note the conventional order of composition), is formed by replacing the hole in D by C and joining the arcs according to the common ports. (This is context composition, not horizontal action graph composition described earlier.) Composition is clearly associative, and there is an identity context id # = - # for each arity. An action graph a : # can be considered as a context a : (0, 0) # whose hole has minimum arity. We shall use lower case letters a, . . . for action graphs. We have thus associated with an action calculus a reactive system C­Ixt, whose objects are arities, with distinguished null arity (0, 0), and whose arrows are contexts, including action graphs. 5.3 Examples and a problem In this section I give examples of specific RPOs in C­Ixt, illustrating several phenomena. I end with an example showing cases in which RPOs fail to exist; this motivates the strategy of defining a category ``upstairs'' (the domain of a functor with codomain C­Ixt) for which enough RPOs do exist. Remember that C­Ixt is really a family of reactive systems arising from action calculi; each is determined by a control signature and a set of reaction rules. 
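Before turning to these examples, the arity discipline of Section 5.2 can be summarised in a few lines. The sketch below is illustrative only: controls and action graphs are represented solely by their arities, so it captures the bookkeeping of composition and tensor but none of the graph structure.

    def seq(a, b):
        """Arity of the composition a . b: defined when a : (k, o) and b : (o, m)."""
        (k, o), (o2, m) = a, b
        assert o == o2, "arities do not meet"
        return (k, m)

    def tensor(a, b):
        """Arity of the tensor product: componentwise sum."""
        (k, o), (m, n) = a, b
        return (k + m, o + n)

    K, M, L, c = (0, 1), (1, 1), (1, 0), (1, 2)    # controls K, M, L and the copy constant c
    assert tensor(K, M) == (1, 2)                  # K tensor M : (1, 2)
    assert seq(seq(K, c), tensor(M, L)) == (0, 1)  # K . c . (M tensor L) : (0, 1)
    assert tensor(seq(K, M), seq(M, L)) == (1, 1)  # (K . M) tensor (M . L) : (1, 1)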
Example 5.1 (arithmetic) I first illustrate how an RPO can determine a labelled transition, using an action calculus for elementary arithmetic having controls 0 : (0, 1), S : (1, 1), and + : (2, 1). The reactive system is shown in Figure 5.2; it is an example of the sharing graphs of Hasegawa [Has99], which add sharing to the interaction nets of Lafont [Laf90]. Nodes represent subexpressions, and the forking of arcs allows these to be shared. The reaction rules are in the top diagram; the garbage collection rules allow unattached expressions to be incrementally destroyed. The middle diagram shows an action graph a occurring in a larger one b # , which also contains an occurrence of the redex l 1 of the rule for S. The contexts C # and D # correspond to the two occurrences, which overlap. Now what is the ``just large enough'' context C which extends a to contain l 1 ? It is not quite C # , because C # has destroyed the possibility of sharing S which is o#ered by l 1 . In fact it is C as shown in the lower diagram; it may not seem ``smaller'' than C # , but it is indeed a factor of C # , as witnessed by the context 5.3. Examples and a problem 77 E. (C # cannot be a factor of C; no context F surrounding C # can cause its S­node to be shared.) So our derived labelled transition system will admit the transition a C Dr 1 . We would expect to add further controls, e.g. for subtraction, before obtaining an interesting behavioural congruence. Example 5.2 (wiring) The preceding example used the forking and deletion of arcs to represent the sharing of components. This non­linearity is a pervasive feature in process calculi. CCS and the #­calculus depend heavily on it; witness the double occurrence of x in its reaction rule for CCS: • x.a | x.b a | b (and similarly for the #­calculus). The redex has no direct representation in the family of action calculi introduced in this section because we confine our attention to shallow, closed action graphs, i.e. ones without nesting of graphs (needed for prefixing) and free names. Without such limitations, the redex is exactly modelled by an action graph with a forked arc (representing the sharing of x) connected to two controls representing the output and input prefixes, containing inside a and b respectively. See [Mil96] for many examples. Non­linearity can give rise to RPOs which are more complex than one might expect. Figure 5.3 shows two identical action graphs a = b = K·c, where K : (0, 1); using the identity contexts C # = D # = - 0,2 they are embedded in K·c. But the RPO C, D,E does not consist of identity contexts! A candidate might choose to identify t 0 in a with either t 2 or t 3 in b, and similarly for t 1 . To be the ``best'' candidate, the C, D,E must handle all these pairings; to indicate this we have indexed its targets by pairs in the diagram. In fact we have Ca = Db = K·c·(c # c) . Example 5.3 (reflexion) A surprising phenomenon is how the presence of reflexion can a#ect the RPO. Let K,N : (1, 1), L : (0, 2) and M : (2, 0), and recall that i 1 is the identity of arity (1, 1) for action graph composition. Figure 5.4 shows a = L·(i 1 #K) and b = (i 1 #K)·M embedded in C # a = D # b = L·(N #K)·M . The contexts C # and D # do not involve reflexion. In the RPO C, D,E shown we have Ca = Db = (i 1 #L)·(p#K)·(i 1 #M ); this extends a by only one control (M) in order to create an occurrence of b. The contexts C and D do not use reflexion, but E does use it. 
If reflexion is forbidden then the RPO C + , D + , E + is such that C + a = D + b contains N ; this would yield a more complex derived labelled transition relation. These examples do not exhaust the phenomena which arise in finding RPOs in C­Ixt, but they indicate that the general construction will not be trivial. The reader may feel that, having coped informally with a number of phenomena, we are well on the way to finding RPOs in every case. However, they do not always exist in C­Ixt! Here is a counter­example. Chapter 5. Action graph contexts 78 + Arithmetic rules + + S + 0 S S S 0 0 Garbage collection rules r 1 l 0 r 0 l 1 + 0 + S + 0 S b # (0, 0) (1, 1) (1, 1) (2, 2) a l 1 D # C # C # a = D # l 1 = b # S 0 An action graph a overlapping a redex l 1 a l 1 D # 0 S (0, 0) (1, 2) (1, 1) (2, 2) (1, 1) C # D C E Ca = Dl 1 = b S + 0 b An RPO for a and l 1 w.r.t. C # and D # Figure 5.2: A reactive system for arithmetic (Example 5.1) 5.3. Examples and a problem 79 (0, 2) (0, 2) (0, 2) (0, 0) C # C D (0, 0) (0, 2) a t 0 t 1 b t 2 t 3 D # K K K K (t 1 ,t 3 ) (t 1 ,t 2 ) (t 0 ,t 3 ) (t 0 ,t 2 ) E (0, 4) Figure 5.3: An RPO for copied wiring (Example 5.2) M N L K (0, 0) L K K L N (0, 0) (0, 0) (0, 0) (1, 1) (0, 2) (2, 0) M M a b D C E N M D # C # L N K Figure 5.4: An RPO using reflexion (Example 5.3) Chapter 5. Action graph contexts 80 K K K L L K K K L L L L (0, 0) (0, 0) (0, 0) (0, 1) (0, 1) (0, 0) b a D # C # A context equation C # a = D # b -·(L # L) K·L # -·L (0, 0) - K K K #- K·L # -·L K·L # -·L D -#K F 0 - F 1 (0, 0) (0, 0) (0, 1) (0, 1) impossible! (1, 1) C Figure 5.5: A missing RPO in C­Ixt (Example 5.4) 5.4. Closed shallow action graphs 81 Example 5.4 (missing RPO) Let K : (0, 1) and L : (1, 0). Let a = b = K and let C # = D # = K·L #- 0,1 ·L with arity (0, 1) (0, 0). Then C # a = D # b = K·L #K·L; this is shown in the upper diagram of Figure 5.5. The lower diagram shows two candidate RPOs, for which it is easy to verify commutation: C 0 , D 0 , E 0 = -, -, K·L # -·L C 1 , D 1 , E 1 = -#K, K #-, -·(L # L) . But if there were an RPO, then contexts C, D,F 0 , F 1 would exist as shown making the diagram commute, yielding a contradiction as follows: Neither C nor D can contain any control, since F 0 C = F 0 D = -. Hence from CK = DK we deduce C = D, using the criterion for context equality stated above (since the control K appears in neither C nor D). Hence -#K = F 1 C = F 1 D = K #-, a contradiction. This counter­example involves neither copying nor discard of arcs, so is applicable to linear action calculi too [LM00b] --- those in which all source ports bear exactly one arc. Thus we cannot attribute the lack of RPOs to c and #, even though they demand careful treatment as shown in Example 5.1 and Example 5.2. The counter­example also illustrates why RPOs do not always exist in C­Ixt. The equations C 0 K = D 0 K = K and C 1 K = D 1 K = K # K hold respectively for the two candidates; but if we ``track'' the two occurrences of a = K and b = K through these equations, we find that they correspond to the same occurrence of K in the first case, and to two di#erent occurrences in the second case. This is a key to solving our problem; we seek a suitable refinement of C­Ixt that possesses the RPOs we need. We expect its contexts to track the occurrences of nodes. This is a similar idea to that of ``colouring'', which has been suggested by Sewell to replace his dissection based definitions [Sew01]. The strategy is as follows. 
The next two sections are devoted to the construction of A­Ixt, a well­supported precategory of arities and raw contexts. Replaying the construc­ tions of Chapter 4, we derive two categories from A­Ixt and a functor between them F : “ C­Ixt C­Ixt. The next chapter shows that su#cient RPOs exist in “ C­Ixt, as desired. 5.4 Closed shallow action graphs In this section, after introducing some notation, I define a class of action graphs. These graphs are enriched in the next section to form raw contexts (graphs with a hole in them), the arrows of the precategory A­Ixt. No attempt is made to verify formally that the action graphs presented here correspond to action calculi terms quotiented by the action calculi axioms (which are not presented here). Such an undertaking represents future work and will be of more value when the tractable graph­theoretic features include free names and nesting. Chapter 5. Action graph contexts 82 Notation I write [m] for the ordinal number {0, 1, . . . , m-1}. The disjoint sum # i#I X i of a family of sets is taken to be the set # i#I ({i} ×X i ). A particular case is when I = [n]; then the disjoint sum may be written, without parentheses, as X 0 + X 1 + · · · + X n-1 . Examples in this chapter take the form S = # v#V S v + [m] + [n], a ternary disjoint sum, the first summand of which is itself a disjoint sum; S has elements of the form (0, (v, s)) for each v # V and s # S v , (1, i) for each i # [m], and (2, j) for each j # [n]. If the sets X and Y are disjoint, I often write their union as X #Y . This notation is to be distinguished from a disjoint sum. In particular, (X 1 #X 2 )#X 3 = X 1 #(X 2 #X 3 ) and will often be written without parentheses; on the other hand, the disjoint sums X 1 +X 2 +X 3 , (X 1 + X 2 ) + X 3 and X 1 + (X 2 + X 3 ) are all distinct but in bijective correspondence. If f : X Z and g : Y Z are two functions with X disjoint from Y , then f#g : X#Y Z is their union. I use f : X # Y , f : X # Y and f : X ## Y for respectively injective, surjective and bijective functions, and f : X ## Y for an injection which is an inclusion. Finally, # denotes function composition, Id X the identity function on the set X, and #X the empty function from # to X. Definition 5.5 (control signature) We fix a control signature K, a set of controls, equipped with an arity function called arity : K N 2 and let K, L, . . . range over K. For arity(K) = (m, n) we write K : (m, n); in this case, two functions extract the components of the pair: arin(K) “ = m and arout(K) “ = n. # Definition 5.6 (action graphs) A (closed, shallow) action graph G = (m, n, V, contr , src) comprises an arity (m, n), a set V of nodes, called a support, a control map contr : V K assigning a control in K to each node in V , and a source map src : T S assigning a source (port) in S to each target (port) in T , where . the source set S “ = # v#V [arout(contr(v))] + [m] comprises the binding sources for each v # V and the input sources indexed by [m]; . the target set T “ = # v#V [arin(contr(v))] + [n] comprises the argument targets for each v # V and the output targets indexed by [n]. # Nomenclature We may write a graph as G = (V, contr , src) : (m, n), or just G = (V, contr , src) when the arity is understood. We denote the empty graph (#, #K , ## ) : (0, 0) by 0. We shall abbreviate arin(contr(v)) to arin(v) etc., when there is no ambiguity. We denote the injections induced by the disjoint sums S and T as 5.4. 
Closed shallow action graphs 83 Figure 5.6: A closed shallow action graph follows: bind(v) : [arout(v)] # S for the binding sources of each v # V ; in : [m] # S for the input sources; arg(v) : [arin(v)] # T for the argument targets of each v # V ; out : [n] # T for the output targets. # We shall write bind(v, i) and arg(v, j) for the ports bind(v)(i) and arg(v)(j). For any injection f into a set A we write A f for the range of f ; thus for example S in is the set of all input sources and T arg(v) the set of argument targets of v. We shall also write for example S bind for # v#V S bind(v) . With this notation we can represent our partitions as S = S in # S bind T = T out # T arg . An example of an action graph with arity (1, 3) is shown in Figure 5.6, with node names and the control map omitted. The whole graph is in a rectangle, with input sources at the left and output targets at the right. Nodes are drawn as rectangles with two corners blunted to give orientation; we may want to tilt some of them, as here, or even turn them upside down. The three nodes have arities (1, 1), (2, 1) and (1, 2). The arcs represent the source map, with arrows pointing from source to target. Ports could be drawn as blobs, but this is somewhat redundant; a target is always indicated by exactly one incoming arc, and we indicate a source with no outgoing arcs by a little aborted arc. Cycles are allowed. The graphs studied here are for action calculi which are closed, meaning that free names such as x, y, . . . are not used as sources, and shallow, meaning Chapter 5. Action graph contexts 84 that nodes do not contain action graphs nested inside. I study the closed, shallow action graphs in this dissertation as a preliminary to future work on a full graphical presentation of action calculi, since notions such as graph embedding and graph context are more clearly handled in this simpler setting. Convention Action graphs are almost always denoted by G suitably subscripted. Rather than always explicitly list all their primary components V , contr , and src, or their derived components S, T , bind , in etc., we shall follow a firm convention that the names of these components are standard, decorated as necessary to connote the graph's name. 5.5 The well­supported precategory A­Ixt of arities and raw contexts I proceed now to construct the well­supported precategory A­Ixt of raw contexts with a support function and support translation operation (Definition 4.2). The intuition of ``context'' is well supplied by Figure 5.7; the graph G occurs inside the dotted rectangle in G # , and we may think of the context in which G is placed as that part of G # lying outside the dotted rectangle. A context is therefore an action graph, but with a little more structure since it has an internal as well as an external interface. The internal interface is often called a hole. (I do not consider here contexts with more than one hole.) The lower diagram shows the context C which results when G is excised from G # ; in the notation of what follows, G # = CG (C composed with G). Note the new targets and sources on respectively the left and right sides of C's internal interface; in particular, the middle internal source lacks any targets and therefore represents the discard of the corresponding output target of G --- or of any other graph --- when placed in the hole. 
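The graph data of Definition 5.6 translates directly into code. The following Python sketch is one possible encoding, not the dissertation's formal presentation: ports are tagged tuples standing in for the elements of the disjoint sums S and T, so the induced injections bind, in, arg and out are simply the corresponding tag constructors. The example at the end encodes the action graph K·c·(M ⊗ L) of Figure 5.1, where the branching at K's binding source represents c.

from dataclasses import dataclass

@dataclass
class ActionGraph:
    m: int        # number of input sources
    n: int        # number of output targets
    nodes: dict   # node v -> control name
    arity: dict   # control name -> (arin, arout)
    src: dict     # total source map: target port -> source port

    def sources(self):
        S = {('in', i) for i in range(self.m)}
        for v, K in self.nodes.items():
            S |= {('bind', v, i) for i in range(self.arity[K][1])}
        return S

    def targets(self):
        T = {('out', j) for j in range(self.n)}
        for v, K in self.nodes.items():
            T |= {('arg', v, j) for j in range(self.arity[K][0])}
        return T

    def well_formed(self):
        return set(self.src) == self.targets() and \
               set(self.src.values()) <= self.sources()

arity = {'K': (0, 1), 'M': (1, 1), 'L': (1, 0)}
g = ActionGraph(
    m=0, n=1,
    nodes={'vK': 'K', 'vM': 'M', 'vL': 'L'},
    arity=arity,
    src={('arg', 'vM', 0): ('bind', 'vK', 0),   # K's output is shared (c) ...
         ('arg', 'vL', 0): ('bind', 'vK', 0),   # ... between M and L
         ('out', 0): ('bind', 'vM', 0)})        # M's output is the sole output
assert g.well_formed()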
Definition 5.7 (raw context) A (closed, shallow, loose) raw context A = (V, contr , src) of arity (m # , n # ) to (m, n), written A : (m # , n # ) (m, n), comprises a support V , which is a set of nodes, a control map contr : V K assigning a control in K to each node in V , and a source map src : T S assigning a source (port) in S to each target (port) in T , where . the source set S “ = # v#V [arout(v)] + [m] + [n # ] comprises the binding sources for each v # V , the input sources indexed by [m], and the upput sources indexed by [n # ]; . the target set T “ = # v#V [arin(v)] + [n] + [m # ] comprises the argument targets for each v # V , the output targets indexed by [n], and the downput targets indexed by [m # ]. 5.5. The well­supported precategory A­Ixt of arities and raw contexts 85 G # G The action graph G is a subgraph of G # C The corresponding context Figure 5.7: The bottom context composed with the middle action graph yields the top action graph Chapter 5. Action graph contexts 86 A 0 A 1 T down 0 S in 1 S up 0 T down 1 S in 0 S up 1 T out 1 T out 0 Figure 5.8: The composition of two raw contexts Furthermore, we require that the looseness condition is satisfied (see nomenclature defined below): src(T down ) # S up = # . Loose The looseness condition precludes a ``tight'' loop from the back of the hole to the front, such as is formed by reflexion in action calculi. It does not preclude such a loop proceeding via a control, which indeed occurs in Figure 5.7; thus the contexts permit only limited reflexion, which simplifies the definition of composition. (See Chapter 8 for further discussion.) Finally, |A| is the support of A, i.e. V in this case. # Note that an action graph G is just a raw context A as above in which m # = n # = 0. Nomenclature As for action graphs, there are induced injections for raw contexts as follows: bind(v) : [arout(v)] # S for the binding sources of each v # V ; in : [m] # S for the input sources; up : [n # ] # S for the upput sources; arg(v) : [arin(v)] # T for the argument targets of each v # V ; out : [n] # T for the output targets; down : [m # ] # T for the downput targets. With naming similar to that in action graphs for the partitions of S and T , we have S = S bind # S in # S up T = T arg # T out # T down . Raw contexts cannot be composed when the supports of each operand in a composi­ tion intersect non­trivially. Composition is well­defined when the operands have disjoint supports--- one of the required properties of a well­supported precategory (Definition 4.2). 5.5. The well­supported precategory A­Ixt of arities and raw contexts 87 Definition 5.8 (identity and composition for raw contexts) The identity raw context id (m,n) = (#, #K , src) : (m, n) (m, n) has S “ = #+ [m] + [n] = S in # S up T “ = #+ [n] + [m] = T out # T down src : # down(i) ## in(i) i # [m] out(j) ## up(j) j # [n] . Let A i = (V i , contr i , src i ) : (m i , n i ) (m i+1 , n i+1 ) be two raw contexts for i = 0, 1, with V 0 # V 1 = #. Their composition, illustrated in Figure 5.8 by nesting A 0 inside A 1 , is A 1 # A 0 “ = A 2 = (V 2 , contr 2 , src 2 ) : (m 0 , n 0 ) (m 2 , n 2 ) , where V 2 “ = V 1 # V 0 and contr 2 “ = contr 1 # contr 0 (thus determining arin 2 = arin 1 # arin 0 and likewise arout 2 , bind 2 , arg 2 ), and S 2 “ = # v#V 2 [arout 2 (v)] + [m 2 ] + [n 0 ] = (S bind 1 # S bind 0 ) # S in 1 # S up 0 T 2 “ = # v#V 2 [arin 2 (v)] + [n 2 ] + [m 0 ] = (T arg 1 # T arg 0 ) # T out 1 # T down 0 . 
Note (see Figure 5.8) that T down 1 is in bijection with S in 0 and [m 1 ], while S up 1 is in bijection with T out 0 and [n 1 ]. It remains to define the source function src 2 ; this is done in terms of two auxiliary functions # i : S i S 2 (i = 0, 1) which describe how sources of A 0 and A 1 ``become'' sources of A 2 : # 0 (s) “ = # s if s # S bind 0 # S up 0 src 1 down 1 (i) if s = in 0 (i) # S in 0 # 1 (s) “ = # s if s # S bind 1 # S in 1 # 0 src 0 out 0 (j) if s = up 1 (j) # S up 1 src 2 (t) “ = # # 1 src 1 (t) if t # T arg 1 # T out 1 # 0 src 0 (t) if t # T arg 0 # T down 0 . # We have adopted the symbol `` # '' for composition of raw contexts as a reminder that it is partial: the supports of the operands are required to be disjoint in order for a composition to be defined. Proposition 5.9 (raw contexts form a precategory) If A 0 and A 1 are raw contexts with |A 1 | # |A 0 | = # then A 1 # A 0 is a raw context too. (In particular, # preserves the Loose condition.) Composition is associative and has identity id (in the way required of a precategory, see Definition 4.1). Proof Follows immediately from Propositions 21 and 22 of [CLM00]. # Chapter 5. Action graph contexts 88 By definition composition satisfies Supp­comp and Supp­id (Definition 4.2), i.e. |A 1 # A 0 | = |A 1 | # |A 0 | Supp­comp |id m | = # . Supp­id The only task remaining in order to show that A­Ixt is a well­supported precategory is the definition of a support translation operation and the verification of its properties. Definition 5.10 (support translation for raw contexts) Given a raw context A : (m 0 , n 0 ) (m 1 , n 1 ) and an injection # whose domain contains the support of A, i.e. Dom # # |A|, we define the support translation by # of A, written # # A, as follows: # # A “ = A # , where: V # “ = #V contr # (v) “ = contr(# -1 (v)) thus determining arin, arout , bind , arg S # “ = # v#V # [arout # (v)] + [m 1 ] + [n 0 ] T # “ = # v#V # [arin # (v)] + [n 1 ] + [m 0 ] thus determining bijections # S : S ## S # and # T : T # ## T whence we define: src # “ = # S # src # # T . # This definition satisfies the appropriate healthiness conditions: Proposition 5.11 (support translation is well­defined) If A satisfies Loose then so does # # A, thus # # (·) is a well­defined operation on raw contexts. Proof Immediate from src # “ = # S # src # # T , since # S and # T are the identity on upput sources and downput targets respectively. (Proposition 24 in [CLM00].) # Furthermore, support translation satisfies the remaining axioms in the definition of a well­supported precategory (Definition 4.2): Proposition 5.12 (support translation properties) All of the following properties hold: # # id m = id m Trans­id­r # # (A 1 # A 0 ) = # # A 1 # # # A 0 Trans­comp­r Id |A| # A = A Trans­id­l (# 1 # # 0 ) # A = # 1 # (# 0 # A) Trans­comp­l # 0 # |A| = # 1 # |A| implies # 0 # A = # 1 # A Trans­res |# # A| = #|A| . Trans­supp 5.6. Constructing a functorial reactive system 89 Proof All of these except the last follow from Proposition 25 in [CLM00]. Trans­supp follows immediately from Definition 5.10. # Thus: Theorem 5.13 A­Ixt is a well­supported precategory. # 5.6 Constructing a functorial reactive system Having established that A­Ixt is a well­supported precategory, the rest of the re­ sults of Chapter 4 are immediately applicable. We let “ C­Ixt be the track of A­Ixt (Definition 4.3), a category of profiles and insertion contexts. 
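Before moving on to the track and support-quotient constructions, the support machinery behind Theorem 5.13 can be illustrated concretely. The sketch below reuses the tagged-port encoding introduced earlier and represents a raw context simply as a dictionary with 'nodes' and 'src' entries; this layout, like the function names, is an assumption made for illustration. Support translation pushes an injection on nodes through the node component of every port, and composition is guarded by the disjoint-support side-condition, so renaming the support of one operand is what makes an otherwise undefined composition possible.

def rename_port(rho, port):
    tag = port[0]
    if tag in ('bind', 'arg'):    # ports attached to a node carry its name
        return (tag, rho[port[1]], port[2])
    return port                   # 'in', 'out', 'up', 'down' ports are fixed

def support_translate(rho, ctx):
    """rho . ctx: rename the support of a raw context along the injection rho."""
    assert set(ctx['nodes']) <= set(rho), "Dom rho must contain the support"
    return {'nodes': {rho[v]: K for v, K in ctx['nodes'].items()},
            'src': {rename_port(rho, t): rename_port(rho, s)
                    for t, s in ctx['src'].items()}}

def can_compose(a1, a0):
    """Composition a1 . a0 is attempted only for disjoint supports."""
    return set(a1['nodes']).isdisjoint(a0['nodes'])

In this rendering the property Trans-supp, that the support of the translated context is the image of the original support, holds by construction, and the remaining Trans- axioms can be checked in the same style.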
The arrows are called ``in­ sertion contexts'' to remind us that they ``insert'' the nodes of the domain profile into the codomain profile. Thus, if (m 0 , n 0 , U 0 ) A (m 1 , n 1 , U 1 ) is an arrow of “ C­Ixt then U 0 # U 1 . A more complex category is considered in [CLM00] which allows the domain set to be injected (rather than just included) in the codomain set. This richness seems to be super­ fluous, so I ignore it here. Now let C­Ixt be the support quotient (Definition 4.10) of A­Ixt. Finally let F : “ C­Ixt C­Ixt be the support­quotienting functor (Definition 4.12). Then by Theorem 4.14, F lifts arrows by their domain, creates isos, creates compositions, and creates left inverses. (The last is used only in Appendix B.) By Theorem 4.15, F allows IPO sliding. A­Ixt has a distinguished object (0, 0): as stated immediately after Definition 5.7, a raw context with domain (0, 0) is identical to an action graph. Since C­Ixt has the same objects as A­Ixt does, the distinguished object 0 of C­Ixt as a reactive system is just (0, 0). Likewise, let # “ = (0, 0, #), the distinguished object for “ C­Ixt. Since F(#) = 0, we have that F lifts agents, and thus: Theorem 5.14 (F is functorial reactive system) The support­quotienting functor F : “ C­Ixt C­Ixt is a functorial reactive system for any choice of D and Reacts. # Thus all the congruence results of Chapter 3 are applicable to F (except Theorem 3.33 since we are not dealing with multi­hole contexts) with one proviso: we wish to determine choices of D and Reacts such that F has all redex­RPOs. That is the subject of the next chapter which shows that an arbitrary choice of D and a wide variety of choices of Reacts (in particular of redexes) leads to su#cient RPOs. Chapter 5. Action graph contexts 90 Chapter 6 RPOs for action graph contexts 6.1 Introduction # # # # # p 0 p 1 p G 0 G 1 A 0 A 1 6.1 The goal of this chapter is to show that F : “ C­Ixt C­Ixt has all redex­RPOs for a wide variety of reaction rules Reacts and reactive contexts D. With regard to D, the strongest result obtainable is with D = C­Ixt, i.e. with all contexts in C­Ixt reactive, so let us fix on this throughout this chapter. Recall from Definition 3.4, that F has all redex­RPOs if any square in “ C­Ixt, such as in Figure 6.1, has an RPO provided that there exists r # C­Ixt such that (F(# G 1 p 1 ), r) # Reacts. So the key question is: for which choice of redexes l (first components of the Reacts relation) do there exist RPOs in “ C­Ixt for squares whose upper­left profile is # = (0, 0, #) and whose left leg is a preimage of l? Answering this question is the task of the present chapter. As indicated in the previous chapter, constructing RPOs by the direct manipulation of contexts in “ C­Ixt is di#cult and not attempted here. An alternate strategy is to carry out a related construction (relative coproducts, to be explained later) in a related category G­Inc of action graphs and inclusion embeddings. That is the plan followed here. This chapter is organised as follows. First I define G­Inc and relate it to “ C­Ixt. Second, I define relative coproducts and relate them to RPOs. Third, I show precise (necessary and su#cient) conditions for the existence of relative coproducts in G­Inc. Fourth, I give conditions on the redexes in Reacts which guarantee that F : “ C­Ixt C­Ixt has all redex­RPOs (Theorem 6.29) --- the main applied result in this dissertation. 91 Chapter 6. 
RPOs for action graph contexts 92 G # t 0 s 2 t 2 t # 0 t # 1 t # 2 s # 0 t ## 2 s 1 t 1 s # 1 G s 0 v v s # 2 Figure 6.2: An inclusion embedding 6.2 The category G­Inc of action graphs and inclusion em­ beddings I now define the category G­Inc in which the objects are action graphs and the arrows are inclusion embeddings. Our main task is to define what we mean by inclusion embeddings and to capture the meaning in a tractable list of axioms. Intuition Two graphs are shown in Figure 6.2. G # is the graph of Figure 5.6, and G is a smaller graph which intuitively ``occurs'' in G # . The way it occurs is represented informally by the dotted rectangle. The input sources s 0 and s 1 of G are represented respectively by s # 0 and s # 1 of G # , and the binding source s 2 by s # 2 . The argument targets t 0 and t 1 are represented respectively by t # 0 and t # 1 , and the output target t 2 by both t # 2 and t ## 2 . The other output target of G is not represented in G # ; the aborted arc (not formally part of G # ) indicates that the lower arc from s 2 in G is discarded by the inclusion embedding into G # . Note that the action graph G here is simpler (by one less arc) than the one considered in Figure 5.7. There is no reason why the node v in G has to be sent to v in G # and not to some v # , 6.2. The category G­Inc of action graphs and inclusion embeddings 93 however for simplicity, we confine our attention to inclusion embeddings --- and usually omit the adjective ``inclusion''. The more general situation (G­Emb) for which embed­ dings inject (rather than include) the nodes of one graph in another is considered in detail in [CLM00]. I make no use of G­Emb here. Of course several parts of G # do not represent G. Also our definition of inclusion embedding must allow that a target of G may have any number (including 0) of represen­ tatives in G # ; but we shall insist that every source in G is represented exactly once in G # . Two sources may have the same representative, though this is not the case in our example. We are therefore led to the following definition: Definition 6.1 (inclusion embedding) Given graphs G i = (V i , contr i , src i ), i = 0, 1, a (loose) inclusion embedding # : G 0 G 1 of G 0 in G 1 consists of a pair of functions (# S , # T ) where # S : S 0 S 1 is a map of sources; # T : T 1 # T 0 is a partial map of targets satisfying the following axioms: contr 1 # V 0 = contr 0 Inc­Contr # S # src 0 # # T # src 1 Inc­Src # S -1 {bind 1 (v, i)} = {bind 0 (v, i)} Inc­Bind # T-1 {arg 0 (v, j)} = {arg 1 (v, j)} Inc­Arg src -1 1 # S (S bind 0 ) # Dom (# T ) . Inc­Targs # These axioms deserve a little explanation. Inc­Contr says that # respects the control function; Inc­Src says that it also respects the source function where # T is defined. Inc­ Bind says not only that # respects the binding source functions, but also that # S never identifies (in G 1 ) a binding source of G 0 with any other source. This enforces the looseness constraint (see below). Inc­Arg is similar. Inc­Targs says that if a source s in G 1 represents a bound source of G 0 , then all its targets represent targets of G 0 . The looseness constraint prevents an inclusion embedding for which G 1 creates a tight loop (such as is formed by reflexion in action calculi) from the back of the G 0 to the front. It does not preclude such a loop proceeding via a control, which indeed occurs in Figure 6.2; thus our inclusion embeddings will permit some but not all of the power of reflexion. 
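For finite graphs the axioms of Definition 6.1 can be tested literally. The sketch below assumes the ActionGraph encoding from the earlier sketch and takes a proposed pair of component maps as Python dictionaries: a total map eta_S on the sources of G0 and a partial map eta_T from the targets of G1 to those of G0 (its keys being its domain of definition). It is a reading aid for the axioms rather than part of the formal development.

def is_inclusion_embedding(G0, G1, eta_S, eta_T):
    S0, T0 = G0.sources(), G0.targets()
    S1, T1 = G1.sources(), G1.targets()
    V0, V1 = set(G0.nodes), set(G1.nodes)
    ok = V0 <= V1
    # Inc-Contr: the controls of G1 and G0 agree on the shared nodes.
    ok &= all(G1.nodes[v] == G0.nodes[v] for v in V0)
    # Inc-Src: where eta_T is defined, the source maps are respected.
    ok &= all(eta_S[G0.src[eta_T[t]]] == G1.src[t] for t in eta_T)
    # Inc-Bind: the preimage under eta_S of a binding source of a shared node
    # is exactly the corresponding binding source of G0 (no identifications).
    for s1 in S1:
        if s1[0] == 'bind' and s1[1] in V0:
            ok &= ({s0 for s0 in S0 if eta_S[s0] == s1} == {s1})
    # Inc-Arg: dually, each argument target of G0 has exactly one preimage.
    for t0 in T0:
        if t0[0] == 'arg':
            ok &= ({t1 for t1 in eta_T if eta_T[t1] == t0} == {t0})
    # Inc-Targs: a target of G1 whose source represents a bound source of G0
    # must itself represent a target of G0.
    bound_images = {eta_S[s] for s in S0 if s[0] == 'bind'}
    ok &= all(t in eta_T for t in T1 if G1.src[t] in bound_images)
    return ok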
To match full reflexion is more di#cult and is not attempted here. This limitation on inclusion embeddings corresponds to the limitation on contexts (Loose) described in Definition 5.7. I shall usually omit the adjective ``loose'' as it applies to inclusion embeddings throughout. Chapter 6. RPOs for action graph contexts 94 Definition 6.2 (the category G­Inc of action graphs and inclusion embeddings) The objects of the category G­Inc are action graphs (Definition 5.6) and the arrows are (loose) inclusion embeddings (Definition 6.1). If # : G G # and # : G # G ## are two inclusion embeddings, then their composition ## : G G ## is defined by ## “ = (# S # # S , # T # # T ) . The identity inclusion embedding id G : G G is defined by id G “ = (Id S , Id T ) . # Proposition 6.3 G­Inc is a category with an initial object 0 (the empty graph). Proof Definition 16 and Proposition 7 in [CLM00]. # For any graph G we shall write 0G for the unique inclusion embedding 0 G whose components are both empty (partial) functions. Immediately following Definition 5.7, we noticed that action graphs (V, contr , src) : (m, n) correspond to raw contexts with domain (0, 0) and support V , and thus to contexts in “ C­Ixt of the form # (m, n, V ). The relationship between G­Inc and “ C­Ixt goes further, as witnessed by the following functor: A : G­Inc #/ “ C­Ixt where #/ “ C­Ixt is a coslice category of “ C­Ixt; this is a simple general notion that is defined formally in Appendix A. In brief, the objects of #/ “ C­Ixt are contexts in “ C­Ixt with domain # and the arrows G G # are contexts p A p # in “ C­Ixt for which # G p A p # = # G # p # , i.e. A # G = G # . The functor is formally defined as follows: Definition 6.4 (A) The functor A : G­Inc #/ “ C­Ixt is defined on objects by A(G) “ = # G (m, n, V ), where G = (V, contr , src) : (m, n). Now let # : G 0 G 1 be an arrow (i.e. an embedding). Suppose G i = (V i , contr i , src i ) : (m i , n i ) for i = 0, 1 with V 0 # V 1 . Then A(#) “ = (m 0 , n 0 , V 0 ) A (m 1 , n 1 , V 1 ) A “ = (V A , contr A , src A ) , where VA “ = V 1 \V 0 and contr A “ = contr 1 #VA (thus determining arity A = arity 1 #VA , etc.), and SA “ = # v#VA [arout A (v)] + [m 1 ] + [n 0 ] = S bind A # S in 1 # S up 1 TA “ = # v#VA [arin A (v)] + [n 1 ] + [m 0 ] = T arg A # T out 1 # T down 1 . 6.2. The category G­Inc of action graphs and inclusion embeddings 95 Note that S up 1 and T down 1 are in bijection with T out 0 and S in 0 respectively. Finally we define src A : src A (t) “ = # # # # # # # # # # # # up 1 (j) if # T (t) is defined, i.e. # T (t) = out 0 (j) for some j # [n 0 ] src 1 (t) if # T (t) is undefined if t # T arg A # T out 1 # S in 0 (i) if t = down 1 (i) # T down 1 for some i # [m 0 ] A is a functor by Corollary 32 in [CLM00]. # This functor is full and faithful, as witnessed by its inverse on every homset DG 0 ,G 1 : #/ “ C­Ixt(A(G 0 ), A(G 1 )) G­Inc(G 0 , G 1 ) , which is formally defined as follows: Definition 6.5 (D) Let G 0 and G 1 be action graphs. We define a function DG 0 ,G 1 : #/ “ C­Ixt(A(G 0 ), A(G 1 )) G­Inc(G 0 , G 1 ) from contexts to embeddings. Let (p 0 A p 1 ) : A(G 0 ) A(G 1 ) be a context, where G i = (V i , contr i , src i ) : (m i , n i ) for i = 0, 1. Recall that by definition G 1 = A # G 0 . 
Thus if A = (V, contr , src) then the components of G 1 = A # G 0 = (V 1 , contr 1 , src 1 ) are as follows (a special case of raw context composition) V 1 = V # V 0 contr 1 = contr # contr 0 S 1 = (S bind # S bind 0 ) # S in T 1 = (T arg # T arg 0 ) # T out src 1 (t) = # # src(t) if t # T arg # T out # 0 src 0 (t) if t # T arg 0 where # 0 (s) = # src down(i) if s = in 0 (i) # S in 0 s if s # S bind 0 #(s) = # # 0 src 0 out(j) if s = up(j) # S up s if s # S bind # S in The components of the inclusion embedding DG 0 ,G 1 (p 0 A p 1 ) “ = # : G 0 G 1 are then Chapter 6. RPOs for action graph contexts 96 m f 0 f 1 g 0 g 1 6.3(1) m f 0 f 1 g 0 g 1 h 0 h 1 h 6.3(2) m f 0 f 1 g 0 g 1 h 0 h 1 h h # h # 0 h # 1 kk 6.3(3) Figure 6.3: Construction of an RPO defined by # V : V 0 ## V 1 “ = v ## v (v # V 0 ) # S : S 0 S 1 “ = # 0 # T : T 1 # T 0 “ = t ## # # # # # t if t # T arg 0 # undefined if src(t) # S bind # S in out 0 (j) if src(t) = up(j) # S up if t # T arg # T out # This function and is bijective on objects, so: Proposition 6.6 A : G­Inc #/ “ C­Ixt is an isomorphism of categories. (See Proposi­ tion 42 of [CLM00].) # Because of the notational overload of action graphs and contexts out of #, we think of A as the identity on objects, i.e. A(G) = G. 6.3 Relative coproducts A consequence of A : G­Inc #/ “ C­Ixt being an isomorphism of categories is that relative coproducts (defined below) in G­Inc correspond naturally to RPOs in “ C­Ixt. This is a crucial fact in deriving labelled transition systems. I first recall RPOs, then define relative coproducts and prove a result relating them. For this pure category theory arrows are denoted by f, g, h, . . .. When we come to use the results I shall revert to the applied notation. Definition (RPO recalled; see Definition 2.4) In any category C, consider a com­ muting square (Figure 6.3(1)) consisting of g 0 f 0 = g 1 f 1 . An RPO is a triple h 0 , h 1 , h satisfying two properties: commutation: h 0 f 0 = h 1 f 1 and hh i = g i for i = 0, 1 (Figure 6.3(2)); 6.3. Relative coproducts 97 h 1 h 0 g 1 z g # 0 g # 1 g g 0 g # h 1 h 0 Figure 6.4: A relative coproduct g 0 , g 1 , g universality: for any h # 0 , h # 1 , h # satisfying h # 0 f 0 = h # 1 f 1 and h # h # i = g i for i = 0, 1, there exists a unique k such that h # k = h and kh i = h # i (Figure 6.3(3)). # Definition 6.7 (relative coproduct) In any category C, let h 0 , h 1 be a pair of arrows with a common codomain (see Figure 6.4). A triple g 0 , g 1 , g for which gg i = h i (i = 0, 1) is a relative coproduct of h 0 , h 1 if, for any other triple g # 0 , g # 1 , g # such that g # g # i = h # i , there is an unique mediating arrow z such that zg i = g # i and g # z = g. # Remark RPOs and relative coproducts are pushouts and coproducts in slice categories (the obvious dual notion to that of coslice, see Appendix A). I make no specific use of this observation. Nomenclature If h 0 , h 1 is a pair of arrows as in Figure 6.4, a triple g 0 , g 1 , g for which gg i = h i is a candidate (relative coproduct) for h 0 , h 1 . If X is an object of C, relative coproducts in the coslice category X/C correspond to RPOs in C for pairs of arrows with domain X: Proposition 6.8 Let C be a category. Let X,X 0 , X 1 , X 2 be four objects of C, and f i : X X i , h i : X i X 2 (i = 0, 1) four arrows of C. If we regard f 0 and f 1 as objects of X/C and h i : f i h i f i as arrows of X/C, respectively, the triple g 0 , g 1 , g is a relative coproduct of h 0 , h 1 i# it is an RPO in C for f 0 , f 1 w.r.t. h 0 , h 1 . 
Proof The proof is a trivial consequence of the definitions of relative coproduct and RPO applied to the categories X/C and C respectively. # This proposition relates RPOs in “ C­Ixt to relative coproducts in (m, n, U)/ “ C­Ixt. In particular we are interested in RPOs for pairs of contexts out of the empty profile #, so I focus on relative coproducts in #/ “ C­Ixt. As shown above, A : G­Inc #/ “ C­Ixt is an isomorphism of categories, so preserves relative coproducts. These facts underpin Corollary 6.28 which shows how to derive RPOs for “ C­Ixt from relative coproducts in G­Inc. Let us look at an example of this derivation in the simple Chapter 6. RPOs for action graph contexts 98 # # 0 # 1 G 0 G 1 G G 0 G 1 A(# 1 ) A(# 0 ) A(# 0 ) t t # # # 1 # 0 G 2 A(# 1 ) A(#) Figure 6.5: A relative coproduct in G­Inc becomes an RPO in “ C­Ixt situation depicted in Figure 6.5. The left diagram shows a relative coproduct # 0 , # 1 , # for # 0 , # 1 in G­Inc, which becomes an RPO in “ C­Ixt via A. (Recall that an arrow A(#) of #/ “ C­Ixt is also an arrow of “ C­Ixt.) Note that A(# 0 ) (# G 0 p 0 ) = A(# 1 ) (# G 1 p 1 ) = (# G 2 p 2 ) and A(#) A(# i ) = A(# # i ) = A(#), thus the proposed RPO commutes correctly. The source and target maps of # 0 , # 1 and # are shown as dotted lines. For example the embedding of G 0 's single arc into G 2 is represented by the hole in the context A(# 0 ). Since G has no targets, the target map # T is empty; this corresponds to the discarding of the targets t and t # , represented by the aborted arcs in A(#). Just as in the action graph b of Figure 5.2 in Example 5.1 (arithmetic), the extra arc of G 2 connecting the control node to t # may seem at first superfluous. But it is essential, since a competing candidate can have a target like t # , sourced by the control node, that is mapped back to G 1 but not to G 0 . This distinguishes it from t, which is mapped to both. By Inc­Targs there are no other possibilities (such as a target mapped to G 0 but not to G 1 ). Thus G 2 contains precisely the targets it needs to be ``as good as'' any candidate. The formal construction of G 2 is given later in this chapter. 6.4 Existence of relative coproducts Our task now is to establish relative coproducts in G­Inc. They are not always present! Figure 6.6 shows a case with no controls --- only ``wiring'' --- in which there is no relative coproduct. G 0 is a graph with a single input source s 0 and output target t 0 , with s 0 = src 0 (t 0 ); G 1 is similar. G has a single source but no target. H and H # are candidate relative coproducts--- one with a single arc and one with two sources and no arcs. The 6.4. Existence of relative coproducts 99 H H # G #G 2 G 1 G 0 s 0 s 1 t 0 t 1 # 0 # 1 Figure 6.6: A case where no relative coproduct exists (and # 0 , # 1 do not satisfy Chaste) source and targets maps of the inclusion embeddings into H and H # are shown as dotted lines. These maps are not shown for inclusion embeddings into G (they are simple: all sources go to the single source in G, and G has no target to map). Now any relative coproduct G 2 must possess inclusion embeddings from G i and to H and H # as shown by dashed lines, making the diagram commute. But no such inclusion embeddings can exist. The reader may enjoy proving this. However, relative coproducts exist for many pairs of inclusion embeddings. We define below a condition Chaste, which characterises exactly these pairs, i.e. is a necessary and su#cient condition for the existence of relative coproducts. 
First we define a key equivalence relation on which Chaste depends. Definition 6.9 (source coalescing (#)) Let # i : G i G (i = 0, 1) be a pair of inclusion embeddings. The source coalescing of # 0 , # 1 is the smallest equivalence relation # on S 0 + S 1 such that src 0 # T 0 (t) # src 1 # T 1 (t) for all t # Dom # T 0 # Dom # T 1 . # An important property of # is that it relates sources that must be equated in any candidate, as shown by the following proposition: Proposition 6.10 Let # 0 , # 1 , # be any candidate for # 0 , # 1 and let s i be a source of G i for i = 0, 1. If s 0 # s 1 then # S 0 (s 0 ) = # S 1 (s 1 ). # Definition 6.11 (Chaste) Let # i be as in the previous definition. The Chaste condition holds for # 0 , # 1 if for all s i # S in i and t i # T out i (i = 0, 1), if s i = src i (t i ) and # S 0 (s 0 ) = # S 1 (s 1 ) Chapter 6. RPOs for action graph contexts 100 then s 0 # s 1 . # The reader may like to check that Chaste fails for the pair # 0 , # 1 in Figure 6.6. The motivation for the name Chaste comes from the fact that it ensures that the inclusion embeddings do not equate ``promiscuously'' sources which they should not. The Chaste condition is implied by a striking property which is asymmetric between G 0 and G 1 : Definition 6.12 (output­controlled) In any graph, a target is controlled if its source is a bound source. G is output­controlled if all its output targets are controlled; formally: src(T out ) # S bind . (output­controlled) Since Cod src = S in # S bind , an equivalent formulation is: src(T out ) # S in = # . (output­controlled) # Proposition 6.13 If G 1 is output­controlled, then any pair of inclusion embeddings # i : G i G (i = 0, 1) satisfies Chaste. # Our interest in this fact arises because we are mainly concerned to find relative coprod­ ucts, or RPOs, when one of the graphs is a redex. In all cases I have met, the required reaction relation can be achieved with reaction rules whose redexes are output­ controlled. This point is taken up later in Chapter 7. Let us fix a pair # i : G i G (i = 0, 1) of inclusion embeddings. To show that Chaste is su#cient for # 0 , # 1 to have a relative coproduct, we could give a direct construction. However, we also wish to show that Chaste is necessary, and therefore require a second candidate to argue that no relative coproduct exists when Chaste fails. To factor the work of constructing both candidates, the approach taken is to give a general way to lift a triple of partial maps # T 0 , # T 1 , # T to a candidate # 0 , # 1 , #. Only those triples that satisfy certain conditions (given here) can be so lifted; we call them target sca#olds (for # 0 , # 1 ). The drawback to the sca#old approach is that it builds up the relative coproduct in­ crementally, with many lemmas verifying the healthiness of the intermediate constructions interspersed. The reader may wish to look directly at Figure 6.8 which distills out the construction. The proof that the inclusion embeddings shown in Figure 6.8 do actually form a relative coproduct provided that Chaste is satisfied is given in terms of sca#olds in the following pages. Remark The construction of T 2 in Figure 6.8 appears complex, so the reader may wish to pattern­match it against the target sets of the RPO examples given in Section 5.3. As we will see shortly (Corollary 6.28) there is a tight correspondence as we now illustrate. 6.4. 
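Both the source-coalescing relation of Definition 6.9 and the Chaste and output-controlled conditions are directly computable for finite graphs. The sketch below again assumes the tagged-port ActionGraph encoding of the earlier sketches, with each embedding given as a dictionary {'S': total source map, 'T': partial target map}; a small union-find computes the least equivalence relation generated by the coalescing rule. All names are illustrative.

def coalesce(G0, G1, eta0, eta1):
    """Return a find function for the source-coalescing equivalence on the
    disjoint sum S0 + S1, whose elements are tagged (0, s) and (1, s)."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(x, y):
        parent[find(x)] = find(y)
    # Generating rule: the sources reached through eta0 and eta1 from any
    # target of G lying in both domains of definition are coalesced.
    for t in set(eta0['T']) & set(eta1['T']):
        union((0, G0.src[eta0['T'][t]]), (1, G1.src[eta1['T'][t]]))
    return find

def chaste(G0, G1, eta0, eta1):
    """Definition 6.11: input sources of output targets that the embeddings
    identify in G must already be coalesced."""
    find = coalesce(G0, G1, eta0, eta1)
    for t0 in G0.targets():
        for t1 in G1.targets():
            if t0[0] == 'out' and t1[0] == 'out':
                s0, s1 = G0.src[t0], G1.src[t1]
                if (s0[0] == 'in' and s1[0] == 'in'
                        and eta0['S'][s0] == eta1['S'][s1]
                        and find((0, s0)) != find((1, s1))):
                    return False
    return True

def output_controlled(G):
    """Definition 6.12: every output target has a bound source."""
    return all(G.src[t][0] == 'bind' for t in G.targets() if t[0] == 'out')

Proposition 6.13 is visible here: if output_controlled(G1) holds then no output target of G1 has an input source, so the inner test in chaste never fires and Chaste holds vacuously.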
Existence of relative coproducts 101 G 0 G G 1 # 0 # 1 6.7(1) T 0 T 2 T T 1 # T 0 # T 1 # T 0 # T 1 # T 6.7(2) G 0 G 2 G G 1 # 0 # 1 # 0 # 1 # 6.7(3) G 0 G G 3 G 1 # 0 # 1 # 0 # 1 # 6.7(4) T 0 T 2 T T 3 T 1 # T 0 # T 1 # T 0 # T 1 # T # T 0 # T 1 # T # T 6.7(5) G 0 G 2 G G 3 G 1 # 0 # 1 # 0 # 1 # # 0 # 1 # # 6.7(6) Figure 6.7: Lifting a sca#old to a candidate relative coproduct Example 5.1 (arithmetic): Consider the action graphs G 0 = a and G 1 = l 1 in Figure 5.2, embedded in b # . The targets of G 2 = b, the RPO graph, correspond exactly to T 2 ; in particular, the extra arc in b connecting S to the top output target is generated by the set O 1 in the definition of T 2 . Similarly, t # # O 1 in Figure 6.5. Example 5.2 (wiring): Consider the identical action graphs G 0 = a and G 1 = b in Figure 5.3, embedded in an identical graph. The targets {(t 0 , t 2 ), . . . , (t 1 , t 3 )} of K in G = Ca = Db are generated by the set O 0,1 (a subset of the product of the targets from G 0 and G 1 ) in the definition of T 2 . The outline of the argument is as follows. Fix a pair of inclusion embeddings # i : G i G, i = 0, 1, as in Figure 6.7(1). Section 6.5 I define the notion of target sca#old (# T 0 , # T 1 , # T ), which consists of a triple of partial maps (Figure 6.7(2)) satisfying certain properties. A sca#old can be lifted to form a candidate w.r.t. # 0 , # 1 (Figure 6.7(3)). Section 6.6 Fix any other candidate # 0 , # 1 , # (Figure 6.7(4)) and a partial map # T mak­ ing the triangles in Figure 6.7(5) commute. Then # T can be lifted to a mediating inclusion embedding # (Figure 6.7(6)). Under certain conditions # is unique. Section 6.7 I pick a particular sca#old # T 0 , # T 1 , # T and lift it to a candidate # 0 , # 1 , # using the technology of sca#old lifting. Provided that Chaste holds, this candidate is a Chapter 6. RPOs for action graph contexts 102 # # # # The construction builds an action graph G 2 and inclusion embeddings # i : G i G 2 , i = 0, 1 and # : G 2 G. It is convenient to first construct the target set T 2 of G 2 and the target components # T i , # T of the inclusion embeddings: T 2 “ = A+O 0 +O 0,1 +O 1 A “ = {arg(v, k) / v # V 0 # V 1 } O 0,1 “ = {(t 0 , t 1 ) # T out 0 × T out 1 / # S 0 src 0 (t 0 ) = # S 1 src 1 (t 1 )} O i “ = {t i # T out i / # S i src i (t i ) / # # S 1-i (S bind 1-i )} # T i (t) “ = # # # # # # T i (t) if t # A t if t # O i t i if t = (t 0 , t 1 ) # O 0,1 # T (t) “ = # # # # # t if t # A # T i (t) if t / # A and t # Dom # T i \ Dom # T 1-i (# T 0 (t), # T 1 (t)) if t / # A and t # Dom # T 0 # Dom # T 1 . Now let G 2 “ = (V 2 , contr 2 , src 2 ) : (m 2 , n 2 ), where: V 2 “ = V 0 # V 1 contr 2 (v) “ = contr(v) for v # V 2 arg 2 (v, k) “ = # T arg(v, k) for v # V 2 n 2 “ = #T 2 # - #Im arg 2 # . and m 2 is defined below. Take . = to be the least equivalence relation on S 0 + S 1 satisfying the following conditions: s 0 . = s 1 for i = 0, 1, s i # S bind i , s 1-i # S 1-i , and # S 0 (s 0 ) = # S 1 (s 1 ) src 0 # T 0 (t) . = src 1 # T 1 (t) for t # Dom # T 0 # Dom # T 1 . Let [ ] . = : S 0 +S 1 (S 0 +S 1 )/ . = map sources to their equivalence classes. Finally, let the source set of G 2 and the source components of # i , # be as follows: S 2 “ = (S 0 + S 1 )/ . = bind 2 (v, k) “ = [bind i (v, k)] . = if v # V i m 2 “ = #S 2 # - #Im bind 2 # src 2 (t) “ = [src i # T i (t)] . = if t # Dom # T i # S i (s) “ = [s] . = for s # S i # S ([s] . = ) “ = # S i (s) for s # S i . This completes the construction. 
Figure 6.8: The construction of a relative coproduct # 0 , # 1 , # for # i : G i G, i = 0, 1 (without sca#olds) 6.5. Construction of a candidate from a sca#old 103 relative coproduct, thus proving the su#ciency of Chaste. I then exhibit a second target sca#old, “ # T 0 , “ # T 1 , “ # T , lift it to a candidate, and show that the negation of Chaste implies that no putative relative coproduct is better than both #, # 0 , # 1 and “ #, “ # 0 , “ # 1 , thus verifying the necessity of Chaste. 6.5 Construction of a candidate from a sca#old First I define the notion of target sca#old : Definition 6.14 (target sca#old) A target sca#old consists of a set T 2 and a triple of maps (# T , # T 0 , # T 1 ) such that # T : T # T 2 , # T i : T 2 # T i (i = 0, 1), and such that several conditions hold: # T i # # T = # T i for i = 0, 1 Scaf­Comp Dom # T 0 # Dom # T 1 = T 2 Scaf­Tot # S 0 (src 0 (# T 0 t 2 )) = # S 1 (src 1 (# T 1 t 2 )) for t 2 # Dom # T 0 # Dom # T 1 Scaf­S # T i -1 {arg i (v i , k)} = {# T (arg(v i , k))} for v i # V i and i = 0, 1 Scaf­Arg # T i -1 (src -1 i (# S i -1 (# S j S bind j ))) # Dom # T j for i, j # {0, 1} Scaf­Targs # In the presence of Scaf­Comp, the equality in Scaf­Arg can be replaced by #. Scaf­Targs is equivalent to: ``if t 2 # Dom # T i and # S i (src i (# T i t 2 )) # # S 1-i S bind 1-i then t 2 # Dom # T 1-i ''. It is convenient to express the construction of G 2 up to bijection of its source and target sets, rather than in the exact form of Definition 5.6. Let G 2 “ = (V 2 , contr 2 , src 2 ) : (m 2 , n 2 ) where the parts are defined as follows. First we consider the vertices and targets of G 2 : V 2 “ = V 0 # V 1 contr 2 (v) “ = contr(v) for v # V 2 arg 2 (v, k) “ = # T arg(v, k) for v # V 2 n 2 “ = #T 2 # - #Im arg 2 # . Notice that V 0 and V 1 are not necessarily disjoint: V 0 # V 1 contains the vertices of G 0 and G 1 with appropriate overlap. Now we consider the sources and source maps. Take . = to be the least equivalence relation on S 0 + S 1 satisfying the following conditions: s 0 . = s 1 for i = 0, 1, s i # S bind i , s 1-i # S 1-i , and # S 0 (s 0 ) = # S 1 (s 1 ) src 0 # T 0 (t) . = src 1 # T 1 (t) for t # Dom # T 0 # Dom # T 1 . Chapter 6. RPOs for action graph contexts 104 Let [ ] . = : S 0 + S 1 (S 0 + S 1 )/ . = map sources to their equivalence classes. Then, let: S 2 “ = (S 0 + S 1 )/ . = bind 2 (v, k) “ = [bind i (v, k)] . = if v # V i m 2 “ = #S 2 # - #Im bind 2 # src 2 (t) “ = [src i # T i (t)] . = if t # Dom # T i # S i (s) “ = [s] . = for s # S i # S ([s] . = ) “ = # S i (s) for s # S i . Proposition 6.16 below shows that this construction yields a well­defined graph and well­defined inclusion embeddings which form a candidate for # 0 , # 1 . Before getting to this result, we first need a technical lemma used several times in the proposition. Lemma 6.15 If for some i, j # {0, 1}, s # S i and s # # S j and s . = s # then # S i s = # S j s # . Proof For the first rule generating . =, the result follows by definition. For the second rule, the result follows by Scaf­S. # Proposition 6.16 (sca#old lifting) For the construction given above, G 2 is a graph, # 0 , # 1 , # are all well­defined inclusion embeddings, and together they form a candidate for # 0 , # 1 . Proof src 2 is well­defined: Suppose t 2 # Dom # T 0 # Dom # T 1 ; then src 0 (# T 0 t 2 ) . = src 1 (# T 1 t 2 ) by definition of . =. By Scaf­Tot, src 2 is a total function. arg 2 is injective: Suppose arg 2 (v, k) = arg 2 (v # , k # ), i.e. # T (arg(v, k)) = # T (arg(v # , k # )). 
Suppose v # V i . By Scaf­Comp, arg i (v, k) = # T i (arg(v, k)) = # T i (# T (arg(v, k))) = # T i (# T (arg(v # , k # ))) = # T i (arg(v # , k # )) . By Inc­Arg for # i and the injectivity of arg , v = v # and k = k # , as desired. bind 2 is injective: Suppose bind 2 (v, k) = bind 2 (v # , k # ) where v # V i and v # # V j . Then bind i (v, k) . = bind j (v # , k # ). By Lemma 6.15, bind(v, k) = bind(v # , k # ); hence v = v # and k = k # , as desired. # i is an inclusion embedding for i = 0, 1: We consider each non­trivial axiom. Inc­Src: if t 2 # Dom # T i then # S i (src i (# T i t 2 )) = [src i (# T i t 2 )] . = = src 2 t 2 . Inc­Bind: for v i # V i , # S i -1 {bind 2 (v i , k)} = {s i # S i / s i . = bind i (v i , k)}. If s i # S i and s i . = bind i (v i , k) then by Lemma 6.15, # S i s i = # S i (bind i (v i , k)), hence by Inc­Bind, s i = bind i (v i , k). Inc­Arg: Scaf­Arg. Inc­Targs: Suppose src 2 t 2 = bind 2 (v i , k) for v i # V i . By Scaf­Tot it is su#cient to assume t 2 # Dom # T 1-i and prove t 2 # Dom # T i . Thus, src 1-i (# T 1-i t 2 ) . = bind i (v i , k). By Lemma 6.15, # S 1-i (src 1-i (# T 1-i t 2 )) = # S i (bind i (v i , k)) By Scaf­Targs, t 2 # Dom # T i . 6.6. Construction of a mediating inclusion embedding 105 # is an inclusion embedding: First note that # S is well­defined by Lemma 6.15. Inc­Src: Suppose t # Dom # T . By Scaf­Tot, there exists i = 0, 1 such that # T t # Dom # T i . Hence, # S (src 2 (# T t)) = # S i (src i (# T i (# T t))) = src t by Scaf­Comp and Inc­Src for # i . Inc­Bind: Let v i # V i ; then # S (bind 2 (v i , k)) = bind(v i , k) by Inc­Bind for # i . If # S [s i ] . = = bind(v i , k) for s i # S i then by Inc­Bind for # i , s i = bind i (v i , k); if # S [s 1-i ] . = = bind(v i , k) for s 1-i # S 1-i then s 1-i . = bind i (v i , k). Inc­Arg: By definition, # T (arg(v, k)) = arg 2 (v, k). Also, by Scaf­Comp and Inc­Arg for # T , we have that # T t = arg 2 (v, k) implies t = arg(v, k). Inc­Targs: If src t = bind(v i , k) for v i # V i then t # Dom # T i , hence t # Dom # T by Scaf­ Comp. ## i = # i : By Scaf­Comp we only need to show that # S ## S i = # S i . Observe that # S (# S i s i ) = # S [s i ] . = = # S i s i for all s i # S i . # When # T is surjective, the inclusion embeddings induced by the sca#old have an im­ portant property which we make use of later, namely # 0 , # 1 do not identify more input sources than required by # (cf. Proposition 6.10). Lemma 6.17 Suppose # T T = T 2 . Let s i # S in i for i = 0, 1. If s 0 ## s 1 then # S 0 s 0 #= # S 1 s 1 . Proof We show the contrapositive: s 0 . = s 1 implies s 0 # s 1 . Consider any derivation for the antecedent of minimal length: s 0 = s 1 0 . = s 1 1 . = · · · . = s n 0 . = s n 1 = s 1 where s j i # S i for i = 0, 1, 1 # j # n. Suppose for contradiction there exist i, j such that s j i # S bind i . By Lemma 6.15, # S i s i = # S i s j i , hence # S (# S i s i ) = # S (# S i s j i ); by Inc­Bind # S i s i = # S i s j i , contradicting the minimality of the derivation. Thus, there exist t 1 2 , . . . , t n 2 # Dom # T 0 # Dom # T 1 such that s j i = src i (# T i t j 2 ). Because # T T = T 2 , there exist t 1 , . . . , t n # T such that # T t j = t j 2 for 1 # j # n. Hence s j i = src i (# T i (# T t j )) = src i (# T i t j ). Thus s 0 # s 1 as desired. # 6.6 Construction of a mediating inclusion embedding Let G 3 be a graph and # 0 , # 1 , # a triple of inclusion embeddings such that # i : G i G 3 and # : G 3 G and # i = ## i , i.e. 
# 0 , # 1 , # is a relative coproduct candidate for # 0 , # 1 . Our aim to construct an inclusion embedding # mediating between the # 0 , # 1 , # candidate (lifted from the candidate sca#old above) and the # 0 , # 1 , # candidate by lifting a mediating target map # T (a ``mediating sca#old''). Formally, let # T : T 3 # T 2 be a partial map satisfying # T # # T = # T and # i # # T = # T i for i = 0, 1. We wish to lift # T to an inclusion embedding # that mediates between the two candidates. Construct # S : S 2 S 3 as follows: # S [s i ] . = = # S i s i for s i # S i . Now, # has all the required properties, as shown in the following proposition, the only result that makes use of the Chaste condition. Chapter 6. RPOs for action graph contexts 106 Proposition 6.18 (mediator lifting) # S is well­defined; # is an inclusion embedding and mediates between the candidates # 0 , # 1 , # and # 0 , # 1 , #. Proof # S is well­defined: We consider each case of the definition of . =. Case s i # S bind i , s 1-i # S 1-i , and # S i s i = # S 1-i s 1-i : Then # S i s i # S bind 3 by Inc­Bind for # i ; also # S (# S i s i ) = # S (# S 1-i s 1-i ) so # S i s i = # S 1-i s 1-i by Inc­Bind for # S . Case t 2 # Dom # T 0 # Dom # T 1 : Let s i = src (# T i t 2 ) for i = 0, 1. By Lemma 6.15, # S 0 s 0 = # S 1 s 1 . Case s i # S bind i for some i = 0, 1: By Inc­Bind, # S i s i # S bind 3 , hence # S (# S 0 s 0 ) = # S (# S 1 s 1 ) implies that # S 0 s 0 = # S 1 s 1 by Inc­Bind for #. Case s i # S in i for i = 0, 1: We distinguish two cases: Case t 2 # T arg 2 : There exists t # T arg such that # T t = t 2 , by Inc­Arg for #. Thus, by Inc­Src for # i we have that for i = 0, 1: # S i s i = # S i (src i (# T i t 2 )) = # S i (src i (# T i (# T t))) = # S i (src i (# T i t)) = # S i (src i (# T i (# T t))) = src 3 (# T t) Case t 2 # T out 2 : By Inc­Arg for # i , # T i t 2 # T out i for i = 0, 1. By the Chaste condition, s 0 # s 1 . By Proposition 6.10, # S 0 s 0 = # S 1 s 1 . # is an inclusion embedding: Inc­Src: if t 3 # Dom# T , then by Scaf­Tot there exists i = 0, 1 such that # T t 3 # Dom # T i . Hence # S (src 2 (# T t 3 )) = # S [src i (# T i (# T t 3 ))] . = = # S [src i (# T i t 3 )] . = = # S i (src i (# T i t 3 )) = src 3 t 3 by Inc­Src for # i . Inc­Bind: if v i # V i then # S (bind 2 (v i , k)) = # S [bind i (v i , k)] . = = # S i (bind i (v i , k)) = bind 3 (v i , k) by Inc­Bind for # i . Also, if # S [s j ] . = = bind 3 (v i , k) for s j # S j then # S j s j = bind 3 (v i , k), therefore # S j s j = bind(v i , k) = # S i (bind i (v i , k)) by Inc­Bind for #. Case i = j: By Inc­Bind, s j = bind i (v i , k). Case i #= j: Then s j . = bind i (v i , k). In both cases [s j ] . = = bind 2 (v i , k). Inc­Arg: for v i # V i , # T t 3 = arg 2 (v i , k) i# # T i (# T t) = # T i (arg 2 (v i , k)) = arg i (v i , k) (by Inc­Arg for # i ) i# # T i t = arg i (v i , k) i# t = arg 3 (v i , k) (by Inc­Arg for # i ). Inc­Targs: if src 3 t 3 = bind 3 (v i , k) for v i # V i , then t 3 # Dom# T i = Dom (# T i # # T ) by Inc­Targs for # i , hence t 3 # Dom# T . ## i = # i and ## = #: The equalities for the target maps are postulated. For the source map equalities, observe that # S (# S i s i ) = # S [s i ] . = = # S i s i for s i # S i ; also # S (# S [s i ] . = ) = # S (# S i s i ) = # S i s i = # S [s i ] . = . # Under certain conditions the mediator # is unique. These conditions depend on the following general definition concerning partial maps. 
Definition 6.19 (joint­injectivity) Given partial maps f i : X # X i , i = 0, 1, we say that (f 0 , f 1 ) are jointly­injective if two conditions hold: . for x, x # # Dom f i \ Dom f 1-i , if f i x = f i x # then x = x # , for i = 0, 1; 6.7. Construction of a relative coproduct 107 . for x, x # # Dom f 0 # Dom f 1 , if (f 0 x, f 1 x) = (f 0 x # e, f 1 x # ) then x = x # . # Note that (f 0 , f 1 ) are jointly­injective i# their product­pairing #f 0 , f 1 # : X # X 0 + X 0 ×X 1 +X 1 in the category of sets and partial maps is a monomorphism. Proposition 6.20 (uniqueness of the mediator) If (# T 0 , # T 1 ) are jointly­injective then # is a unique mediator. Proof Let # : G 2 G 3 be an inclusion embedding satisfying ## i = # i for i = 0, 1 and ## = #, i.e. # is an another mediating inclusion embedding between the candidates # 0 , # 1 , # and # 0 , # 1 , #. Then # = #: # S = # S : for s i # S i , # S [s i ] . = = # S (# S i s i ) = # S i s i = # S (# S i s i ) = # S [s i ] . = . # T = # T : if t 3 # Dom # T then by Scaf­Tot # T t 3 # Dom # T i for some i = 0, 1; hence t 3 # Dom (# T i # # T ) = Dom# T i = Dom (# T i # # T ); thus t 3 # Dom# T . By symmetry, t 3 # Dom# T i# t 3 # Dom # T , so Dom# T = Dom # T . Also, # T t 3 # Dom # T i i# # T t 3 # Dom # T i , for i = 0, 1. Finally, # T i (# T t 3 ) = # T i t 3 = # T i (# T t 3 ), for i = 0, 1. Hence # T = # T by the joint­injectivity hypothesis for (# T 0 , # T 1 ). # 6.7 Construction of a relative coproduct We now come to the main applied results of the dissertation which characterise precisely the conditions under which relative coproducts exist, and thus under which condition F has all redex­RPOs. The idea is to consider a particular sca#old, prove that it is well­ defined, lift it to a candidate and then show that it is ``better'' than any other competing candidate, subject to condition Chaste. I discuss the su#ciency of Chaste later. We exhibit a target sca#old consisting of a set T 2 and a triple # T 0 , # T 1 , # T , which we lift to a relative coproduct. Let: T 2 “ = A+O 0 +O 0,1 +O 1 A “ = {arg(v, k) / v # V 0 # V 1 } O 0,1 “ = {(t 0 , t 1 ) # T out 0 × T out 1 / # S 0 src 0 (t 0 ) = # S 1 src 1 (t 1 )} O i “ = {t i # T out i / # S i src i (t i ) / # # S 1-i (S bind 1-i )} # T i (t) “ = # # # # # # T i (t) if t # A t if t # O i t i if t = (t 0 , t 1 ) # O 0,1 # T (t) “ = # # # # # t if t # A # T i (t) if t / # A and t # Dom # T i \ Dom # T 1-i (# T 0 (t), # T 1 (t)) if t / # A and t # Dom # T 0 # Dom # T 1 . In Proposition 6.22 below we verify that # T 0 , # T 1 , # T is a target sca#old, so by Proposition 6.16 can be lifted to a candidate # 0 , # 1 , #. We first prove a technical lemma. Chapter 6. RPOs for action graph contexts 108 Lemma 6.21 Let G 3 be a graph and 0 , 1 , a triple of inclusion embeddings such that i : G i G 3 and # : G 3 G and # i = ## i , i.e. # 0 , # 1 , # forms a candidate. 1. If t 3 / # # T A and t 3 # Dom T i \ Dom T 1-i then # T i t 3 # O i . 2. If t 3 / # # T A and t 3 # Dom# T 0 # Dom# T 1 then (# T 0 t 3 , # T 1 t 3 ) # O 0,1 . Proof Note that # T i t 3 # T arg i i# for some k, i and v i # V i , t 3 = arg 3 (v i , k) = T (arg(v i , k)) # T A, by Inc­Arg for # i and #. Thus t 3 / # # T A implies # T i t 3 # T out i if t # Dom T i . We use this observation in each of the following parts. 1. 
By the observation in the head of this proof, T i t # T out i ; by Inc­Src, # S i (src i ( T i t)) = S (src 3 t 3 ); then S (src 3 t 3 ) # # S 1-i S bind 1-i = # S (# S 1-i S bind 1-i ) implies that src 3 t 3 # # S 1-i S bind 1-i by Inc­Bind for #; by Inc­Targs for # 1-i , we have t 3 # Dom# T 1-i , which contradicts the hypothesis. Thus # T i t 3 # O i as desired. 2. By the observation in the head of this proof, # T i t 3 # T out i for i = 0, 1. By Inc­Src, (# T 0 t 3 , # T 1 t 3 ) # O 0,1 . # Proposition 6.22 (healthiness of relative coproduct sca#old) (# T , # T 0 , # T 1 ) is a target sca#old. Proof Scaf­Comp: By Lemma 6.21 instantiated with G 3 = G, # = id G , and # i = # i , we have # T i (# T t) = # # # # # # T i t if t # A # T i t if t / # A and t # Dom # T i \ Dom # T 1-i # T i t if t / # A and t # Dom # T 0 # Dom # T 1 Because the cases above are exhaustive, # T # T i = # T i . Scaf­Tot: By definition. Scaf­S: Let t # Dom # T 0 # Dom # T 1 . If t # A then # S i (src i (# T i t 2 )) = # S i (src i (# T i t 2 )) = src t 2 by Inc­Src for i = 0, 1. If t # O 0,1 then the result follows by definition. Scaf­Arg: Since Scaf­Comp has already been verified, we need only show #. If # T i t 2 = arg i (v i , k) for some v i # V i then t 2 # A. Hence # T i t 2 = # T i t 2 = arg i (v i , k), so t 2 = arg(v i , k) = arg 2 (v i , k) by Inc­Arg for # i . Scaf­Targs: Suppose that t 2 # Dom # T i and # S i (src i (# T i t 2 )) # # S 1-i S bind 1-i . Then t 2 / # O i . If t 2 # A then by Inc­Src for # i , src t 2 = # S i (src i (# T i t 2 )) = # S i (src i (# T i t 2 )) # # S 1-i S bind 1-i . By Inc­ Targs for # 1-i , t 2 # Dom # T 1-i hence t 2 # Dom # T 1-i (since t 2 # A). If t 2 / # A then t 2 # O 0,1 hence t 2 # Dom # T 1-i . # Now, consider any other candidate: let G 3 be a graph and (# 0 , # 1 , #) a triple of inclusion embeddings such that # i : G i G 3 and # : G 3 G and # i = ## i . Construct # T : T 3 # T 2 as follows: # T t 3 “ = # # # # # # T-1 t 3 if t 3 # # T A # T i t 3 if t 3 / # # T A and t 3 # Dom# T i \ Dom# T 1-i (# T 0 t 3 , # T 1 t 3 ) if t 3 / # # T A and t 3 # Dom# T 0 # Dom# T 1 Now we claim that # T satisfies all the conditions used by Proposition 6.18 needed for it to be lifted to a mediating inclusion embedding: 6.7. Construction of a relative coproduct 109 Proposition 6.23 (mediator healthiness) # T is well­defined; # T ## T = # T and # T i ## T = # T i for i = 0, 1 Proof # T is well­defined: If t 3 # # T A then by Inc­Arg for # there exists a unique t # A such that # T t = t 3 . # T # # T = # T : Expanding the LHS, we have: # T (# T t) = # # # # # t if t # Dom# T and # T t # # T A # T i t if t # Dom# T and # T t / # # T A and # T t # Dom# T i \ Dom# T 1-i (# T 0 t, # T 1 t) if t # Dom# T and # T t / # # T A and # T t # Dom# T 0 # Dom# T 1 By Inc­Arg for #, t # Dom# T and # T t # # T A i# t # A Also, because # T i # # T = # T i for i = 0, 1, t # Dom# T and # T t # Dom# T i i# t # Dom # T i Thus # T # # T = # T . # T i # # T = # T i for i = 0, 1: By Lemma 6.21, the LHS of the second equality expands to: # T i (# T t 3 ) = # # # # # # T i (# T -1 t 3 ) if t 3 # # T A # T i t 3 if t 3 / # # T A and t 3 # Dom# T i \ Dom# T 1-i # T i t 3 if t 3 / # # T A and t 3 # Dom# T 0 # Dom# T 1 Note that if t 3 # # T A then t 3 = # T t for some t # A; thus, the first case in the expansion is: # T i (# T -1 t 3 ) = # T i t = # T i t = # T i t 3 by Inc­Arg. Because the cases in the expansion above are exhaustive, # T i # # T = # T i . 
# Moreover, by Proposition 6.20, the mediator # is unique because (# T 0 , # T 1 ) are jointly­ injective, as verified in Proposition 6.25 below after a technical lemma. Lemma 6.24 If t 2 # Dom # T i \ Dom # T 1-i then t 2 # O i or t 2 = arg 2 (v i , k) for some k and v i # V i . Proof If t 2 / # O i then t 2 # A, hence t 2 = arg 2 (v j , k) for some k, j, v j # V j . Since t 2 / # Dom # T 1-i , i = j. # Proposition 6.25 (# T 0 , # T 1 ) are jointly­injective. Proof Case t 2 , t # 2 # Dom # T i \ Dom # T 1-i and # T i t 2 = # T i t # 2 : By Lemma 6.24, there are two cases. Case t 2 # O i : Then t 2 = # T i t # 2 . If t # 2 / # O i then by Lemma 6.24, # T i t # 2 # T arg i , but t 2 # T out i , a contradiction. Thus t # 2 # O i and t 2 = t # 2 . Case t 2 = arg(v i , k) for some v i # V i and k: then arg i (v i , k) = # T i t # 2 thus t # 2 / # O i ; by Lemma 6.24, t # 2 = arg(v # i , k # ) for some v # i # V i and k # ; but then v i = v # i and k = k # , so t 2 = t # 2 . Chapter 6. RPOs for action graph contexts 110 Case t 2 , t # 2 # Dom # T 0 # Dom # T 1 and (# T 0 t 2 , # T 1 t 2 ) = (# T 0 t # 2 , # T 1 t # 2 ): There are two subcases. Case t 2 # O 0,1 : Then (t 2 , t 2 ) = (# T 0 t # 2 , # T 1 t # 2 ), so t # 2 # O 0,1 and t 2 = t # 2 . Case t 2 = arg 2 (v i , k) for some v i # V i and k: Then t # 2 # A and # T i t 2 = # T i t # 2 , hence t 2 = t # 2 by Inc­Arg. # In summary, then: Theorem 6.26 (su#ciency of Chaste for relative coproducts) If a pair of inclusion embeddings # i : G i G (i = 0, 1) satisfies Chaste then the constructed triple # 0 , # 1 , # of inclusion embeddings forms a relative coproduct for # 0 , # 1 in G­Inc. # We now turn to the necessity of Chaste. We first construct a second candidate for # 0 , # 1 as follows. Let “ T 2 “ = Dom # T 0 # Dom # T 1 “ # T : T # “ T “ = t ## t for t # “ T “ # T i : “ T # T i “ = # T i # “ T . Then “ # T 0 , “ # T 1 , “ # T is a target sca#old, hence can be extended to a candidate “ # 0 , “ # 1 , “ #. Now we proceed directly to the result: Theorem 6.27 (necessity of Chaste for relative coproducts) If a pair of inclusion embeddings # i : G i G (i = 0, 1) does not satisfy Chaste then no relative coproduct exists for # 0 , # 1 in G­Inc. Proof Suppose that the Chaste condition does not hold, i.e. there are s i # S in i and t i # T out i such that src i (t i ) = s i , # S 0 (s 0 ) = # S 1 (s 1 ), and s 0 ## s 1 . Suppose for contradiction that there is a graph G 3 and a relative coproduct triple 0 , 1 , such that : G 3 G and i : G i G 3 for i = 0, 1. By the definition of relative coproduct, there exist two mediating inclusion embeddings: # : G 3 G 2 for the candidate # 0 , # 1 , # and # : G 3 “ G 2 for the candidate “ # 0 , “ # 1 , “ #. By construction, “ # T T = “ T , so by Lemma 6.17, “ # S 0 s 0 #= “ # S 1 s 1 , hence # S S 0 (s 0 ) #= # S S 1 (s 1 ), hence S 0 (s 0 ) #= S 1 (s 1 ). However, (t 0 , t 1 ) # O 0,1 , so S i (s i ) = S i src i (t i ) = S i src i # T i (t 0 , t 1 ) = S i src i T i # T (t 0 , t 1 ) = src 3 # T (t 0 , t 1 ) for i = 0, 1, by Inc­Src for i ; so S 0 (s 0 ) = S 1 (s 1 ), a contradiction. # 6.8 Existence of RPOs We are finally ready to deduce the existence of RPOs in the category “ C­Ixt of contexts, which is the goal set at the beginning of this chapter. Recall that in “ C­Ixt a context # G (m, n, U ), whose domain is the empty profile #, is actually a graph G : (m, n) with |G| = U . 6.8. Existence of RPOs 111 Corollary 6.28 Let C 0 G 0 = C 1 G 1 = G in “ C­Ixt, where G 1 is output­controlled. 
Then there exists an RPO for G 0 , G 1 w.r.t. C 0 , C 1 . Proof Let # i = DG i ,G (C i ), for i = 0, 1. Since G 1 is output­controlled the pair # 0 , # 1 satisfies property Chaste by Proposition 6.13, hence possesses a relative coproduct by Theorem 6.26. The iso functor A : G­Inc #/ “ C­Ixt transforms any such relative coproduct into a relative coproduct of C 0 : G 0 G and C 1 : G 1 G, and hence, by Proposition 6.8, into an RPO in “ C­Ixt for G 0 , G 1 with respect to C 0 , C 1 . # Finally, note that if G is output­controlled then # # G is too for any injection # with Dom # # |G|. Thus it is well­defined to say that a context with domain (0, 0) in C­Ixt (the category formed by quotienting out support translations) is output­controlled. Thus: Theorem 6.29 (F has all redex­RPOs) The functorial reactive system F : “ C­Ixt C­Ixt has all redex­RPOs provided that every redex is output­controlled, i.e. pro­ vided that for all (l, r) # Reacts, l is output­controlled. # From this all of the congruence results of Chapter 3 (except the one for multi­hole contexts) are immediately applicable to F : “ C­Ixt C­Ixt, as desired. The next chapter argues that it is tolerable to ask that every redex satisfies certain reasonable constraints, including output­controlled. Chapter 6. RPOs for action graph contexts 112 Chapter 7 Expected properties of redexes In this chapter, I consider the question of which constraints it is reasonable to ask a redex to satisfy in a reactive system. The motivation for this comes from two directions, the first from the applied work in Chapter 6 and the second from the categorical work in Chapter 3. Firstly, the previous chapter's main result (Theorem 6.29) asserted that F : “ C­Ixt C­Ixt has all redex­RPOs provided that every redex is output­controlled. But is this a reasonable constraint? Can we point to redexes that are not output­controlled and show that they induce undesirable reactions? Secondly, as shown in Section 3.4, id­labelled transitions and reactions coincide when redexes have epi preimages (Definition 3.14). Is there a concrete (graph theoretic) char­ acterisation of this requirement? Is this characterisation a reasonable constraint? What form do redexes that do not have such constrained preimages take? Do they lead to undesirable reactions? The first motivation is the stronger one: without RPOs, the edifice of congruence results collapses. And yet, the latter is worth considering in order to understand the space of possible redexes. Delightfully, there is a brief concrete characterisation of when a redex has epi preimages and, moreover, this characterisation consists of three simple conditions, one of which is output­controlled. As a result, the output­controlled condition is on the path to understanding epis, so I argue about the former when it comes up in the course of examining each of the three epi conditions. Let us start then with the problem of finding a concrete characterisation for the epi conditions. Instantiating the property redexes have epi preimages for F : “ C­Ixt C­Ixt yields the following: For all G # “ C­Ixt, if there exists (l, r) # C­Ixt such that F(G) = l, then G is an epi. (As usual we do not distinguish between an action graph G and arrows in “ C­Ixt with domain #.) Notice that this definition does not require that redexes themselves be epis, just that 113 Chapter 7. 
Expected properties of redexes 114 L 0 K L 1 K K (0, 0) (0, 1) K K L 1 L 0 (0, 1) (0, 1) (0, 0) L 0 L 1 = l C 0 C 1 Figure 7.1: In C­Ixt, l is a not an epi # (0, 1, {v}) (0, 1, {v, v # , v 0 , v 1 }) K v v # K v1 L 1 v0 L 0 v0 L 0 v # K v1 L 1 Figure 7.2: In “ C­Ixt, the tracking of node occurrences makes the left­hand graph an epi 115 their preimages be. To see why, consider two compositions in C­Ixt (Figure 7.1). The redex l containing a single control K is as simple and unobjectionable a redex as could be found. Yet it is not an epi in C­Ixt as shown: there are two contexts C 0 , C 1 # C­Ixt for which C 0 l = C 1 l but C 0 #= C 1 . (The latter follows because the hole is attached to L i in C i .) However every preimage of l is an epi in “ C­Ixt (as proved in Proposition 7.3). Consider the situation in “ C­Ixt as shown in Figure 7.2. Then the two compositions are not equal because an arc links v to v 0 in the top one but to v 1 in the bottom. Under which conditions is an action graph G an epi in “ C­Ixt? The answer is given by the following definition which characterises exactly (Proposition 7.3) the epis with domain # of “ C­Ixt. The interesting part of the argument is carried out in terms of inclusion embeddings (Proposition 7.2) with the functors A and D being used to transfer constructions from G­Inc to “ C­Ixt and back. (This is reminiscent of the approach used to construct RPOs.) Definition 7.1 (G has modest wiring) Say that an action graph G has modest wiring i# G satisfies the following properties: src(T out ) # S in = # output­controlled S in # src(T ) no­discarded­inputs src # T out is injective . no­controlled­forks # The negation of each condition is illustrated by each G shown in Figure 7.6. Note that each one of these conditions is concerned only with wiring and not nodes, so holds for G i# it holds for # # G, where # is any injection with Dom # # |G|. As was pointed out immediately after Corollary 6.28, it is thus well­defined to say that an agent, i.e. an arrow in C­Ixt with domain (0, 0), has modest wiring. The reason is that an agent is an equivalence class of action graphs, all of which are a support translation of one another. Thus an agent is C­Ixt has modest wiring i# every (indeed any) preimage in “ C­Ixt out of # has modest wiring. Now, the interesting question is whether it is tolerable to demand that each redex has modest wiring. I believe yes, based on the absence of any example redex that does not have modest wiring. Of course, this is open to contradiction by future work. However, I now argue that the reactions obtained from a putative redex that does not have modest wiring are su#ciently strange that I feel safe in expunging such redexes for the moment. I consider each condition in turn. The first (output­controlled) is indispensable for the existence of RPOs, as is explained at the beginning of this chapter: output­controlled: Consider the redex l in Figure 7.3 which does not satisfy output­ controlled because of the arc going across from an input source to an output Chapter 7. 
Expected properties of redexes 116 K K L M 0 M 1 K 1 M 1 M 0 K 0 L a # a l r C M 0 L M 1 K 0 K 1 Figure 7.3: An undesirable reaction a = Cl Cr = a # generated by a redex l that does not satisfy the condition output­controlled K L M 0 l r C M 0 M 1 K 0 K M 0 M 1 M 1 K 0 a # a L Figure 7.4: An undesirable reaction a = Cl Cr = a # generated by a redex l that does not satisfy the condition no­discarded­inputs 117 l r C L K 0 K 1 K K L K 0 L a # a K 1 Figure 7.5: An undesirable reaction a = Cl Cr = a # generated by a redex l that does not satisfy the condition no­controlled­forks target. This arc in l matches an arc connecting M 0 to M 1 in a (as witnessed by C). The induced reaction a#ects K (which seems reasonable given that K is in l), but also breaks M 0 from M 1 and connects a new control K 1 to M 1 . The latter seem unreasonable because it is so highly nonlocal: the arc in l can match and then modify an arc anywhere in an agent, not necessarily near the site of K. no­discarded­inputs: Consider the redex l in Figure 7.4 which does not satisfy no­ discarded­inputs. It too has a nonlocal a#ect on a: the discarded input source of l is matched by the bound source of M 0 in a (as witnessed by C), resulting in a reaction that connects L to M 0 . no­controlled­forks: Consider the redex l in Figure 7.5 which does not satisfy no­ controlled­forks. The bound source of K has a forked arc emanating from it. Despite this, the forked arc matches a non­forked arc in a (as witnessed by C) that connects K to L, resulting in a reaction that connects K 0 to L and leaves K 1 dangling. Furthermore, in each of these cases, it seems that the redex could be replaced with a simpler one that does satisfy the condition in question. Consider the condition output­ controlled. Let (l, r) be a reaction rule whose redex l is not output­controlled. Consider all arcs connecting an input source to an output target. If the contractum r possesses all these arcs, then it can be shown that the alternative rule (l # , r # ) with all such arcs removed yields a superset of the reactions; the rule could be replaced by (l # , r # ) for Chapter 7. Expected properties of redexes 118 the purpose of deriving an LTS. If on the other hand r does not contain such an arc (as in Figure 7.3) then the rule (l, r) appears to generate non­deterministic behaviour of a kind not normally required. I now prove that if G has modest wiring then G is an epi in “ C­Ixt. The first step is to show a related result for inclusion embeddings in G­Inc and then return to “ C­Ixt in Proposition 7.3. Proposition 7.2 G has modest wiring i# for all inclusion embeddings # 0 , # 1 : G G 2 , we have that # 0 = # 1 . Proof LHS implies RHS: Suppose that G has modest wiring. Let G 2 be an arbitrary graph and let # 0 , # 1 : G G 2 . # S 0 = # S 1 : Consider s # S. We distinguish two cases. case s # S bind : By Inc­Bind, # S 0 (s) = # S 1 (s). case s # S in : By no­discarded­inputs and output­controlled, there ex­ ists arg(v, j) # T arg such that s = src arg(v, j) = Inc­Arg src # T i arg 2 (v, j) for i = 0, 1. Thus # S i (s) = # S i src # T i arg 2 (v, j) = Inc­Src src 2 arg 2 (v, j) for i = 0, 1, so # S 0 (s) = # S 1 (s). # T 0 = # T 1 : Consider t # Dom # T 0 . We distinguish two cases: case # T 0 (t) # T arg : By Inc­Arg, t # Dom # T 1 and # T 0 (t) = # T 1 (t). case # T 0 (t) # T out : By output­controlled, src # T 0 (t) # S bind ; then, # S 1 src # T 0 (t) = Inc­Bind # S 0 src # T 0 (t) = Inc­Src src 2 (t) . By Inc­Targs, t # Dom # T 1 . 
By Inc­Arg, # T 1 (t) # T arg implies # T 0 (t) # T arg , which contradicts the case's hypothesis. Thus # T 1 (t) # T out . By output­controlled, src # T 1 (t) # S bind . By Inc­Src, # S i src # T i (t) = src 2 (t) for i = 0, 1, so by Inc­Bind, src # T 0 (t) = src # T 1 (t). By no­ controlled­forks, # T 0 (t) = # T 1 (t). In both cases # T 0 # # T 1 . By symmetry, # T 0 = # T 1 . RHS implies LHS (outline): Let G be a graph. We argue that each condition in Definition 7.1 is necessary in order for the RHS to be true. output­controlled: Suppose G does not satisfy output­controlled, i.e. there exists t # T out such that src(t) # S in . We know that id G : G G is one 119 G G G id G # t t t 7.6(1) Necessity of output­controlled G G 2 G 2 # 0 # 1 s # s # s # 7.6(2) Necessity of no­discarded­inputs G id G # G G K K K t 0 t 1 t 0 t 1 t 0 t 1 7.6(3) Necessity of no­controlled­forks Figure 7.6: Necessity of Definition 7.1 Chapter 7. Expected properties of redexes 120 inclusion embedding. Let # be the same at id G but with t ## t removed from # T (a partial map). Then # : G G is an inclusion embedding and id G #= #. See Figure 7.6(1) for an example. no­discarded­inputs: Suppose G does not satisfy no­discarded­inputs, i.e. there exists s # # S in such that s # / # src(T ). Suppose G : (m, n). Let G 2 : (m+1,n) be the same as G but with one extra discarded input source, i.e. V 2 “ = V contr 2 “ = contr T 2 “ = T S 2 “ = S # in 2 (m) src 2 : T 2 S 2 “ = t ## src(t) Let # 0 : G G 2 be an inclusion embedding that is essentially the identity, i.e. # S 0 : S S 2 “ = s ## s # T 0 : T 2 T “ = Id T Let # 1 : G G 2 be an inclusion embedding that di#ers only in its action on s # : # S 1 : S S 2 “ = # s ## s if s #= s # s ## in 2 (m) if s = s # # T 1 : T 2 T “ = Id T Then # 0 #= # 1 . See Figure 7.6(2) for an example. no­controlled­forks: Suppose G does not satisfy no­controlled­forks, i.e. there exist t 0 , t 1 # T out such that src(t 0 ) = src(t 1 ) # S bind . We know that id G : G G is one inclusion embedding. Let # be the same at id G but with # T : t i ## t 1-i for i = 0, 1. Then # : G G is an inclusion embedding and id G #= #. See Figure 7.6(3) for an example. # To make the following result easier to read, we take the liberty of omitting the usual decorations on arrows in “ C­Ixt. Proposition 7.3 Let G be a context in “ C­Ixt with domain #. Then G has modest wiring i# G is an epi in “ C­Ixt. Proof Suppose G has modest wiring and that G 2 “ = C 0 G = C 1 G in “ C­Ixt for some C 0 , C 1 . By Proposition 7.2, D G,G 2 (C 0 ) = D G,G 2 (C 1 ) # G­Inc(G, G 2 ), so C 0 = A(D G,G 2 (C 0 )) = A(D G,G 2 (C 1 )) = C 1 since A inverts D. 121 Suppose that G does not have modest wiring. Then by Proposition 7.2, there exist a graph G 2 and # 0 , # 1 # G­Inc(G, G 2 ) such that # 0 #= # 1 . Then A(# 0 ) #= A(# 1 ) since A is faithful but A(# 0 )G = G 2 = A(# 1 )G. Thus G is not an epi in “ C­Ixt. # This result shows that Definition 7.1 provides an exact characterisation of epis in “ C­Ixt, whence we have the following, which rounds out the chapter: Corollary 7.4 For F : “ C­Ixt C­Ixt, if every redex has modest wiring (Definition 7.1) then redexes have epi preimages (Definition 3.14). # The implications of this chapter are not that epis are important per se but that there is a space of possible constraints that may be imposed upon redexes. One can imagine requiring all the nodes in a redex to be connected by wiring. 
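Constraints of this kind are easy to state operationally. As a small illustration, the following Python sketch checks the three modest-wiring conditions of Definition 7.1 on a naive flat encoding of an action graph's wiring; the encoding (sets of binding and input sources, argument and output targets, and a src map) is an assumption made only for this example and is not the formal structure of Chapter 5. A connectedness check of the kind just imagined would additionally need the assignment of ports to nodes, which this toy encoding omits.

```python
# Toy encoding of an action graph's wiring (an assumption for this sketch only):
#   g['bind'], g['inp']  -- binding and input sources (S^bind, S^in)
#   g['arg'],  g['out']  -- argument and output targets (T^arg, T^out)
#   g['src']             -- the source of every target

def output_controlled(g):
    # src(T^out) and S^in are disjoint: no output target is sourced by an input source.
    return all(g['src'][t] not in g['inp'] for t in g['out'])

def no_discarded_inputs(g):
    # S^in is contained in src(T): every input source feeds at least one target.
    used_sources = {g['src'][t] for t in g['arg'] | g['out']}
    return g['inp'] <= used_sources

def no_controlled_forks(g):
    # src restricted to T^out is injective: no two output targets share a source.
    out_sources = [g['src'][t] for t in g['out']]
    return len(out_sources) == len(set(out_sources))

def modest_wiring(g):
    return output_controlled(g) and no_discarded_inputs(g) and no_controlled_forks(g)

# A redex whose only output target is sourced by an input source (cf. Figure 7.3)
# fails output-controlled, and hence fails modest wiring.
bad = {'bind': set(), 'inp': {'s'}, 'arg': set(), 'out': {'t'}, 'src': {'t': 's'}}
assert not output_controlled(bad) and not modest_wiring(bad)
```

Each of the three predicates corresponds directly to one line of Definition 7.1.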
Requiring that all the nodes in a redex be connected by wiring would guarantee that the parts of the redex occur locally in an agent, precluding far-flung instantiations. It would be enlightening if this or other graph-theoretic constraints could be described categorically. Conversely, as we investigate richer graph structure (such as nesting, free names, binding, etc.) we can look specifically for which redexes satisfy known categorical properties, as was done here for epi preimages. If we can build up a library of useful categorical constraints, then it may be possible to classify redexes from different functorial reactive systems according to these constraints. An application of this might be a new way of disproving the existence of encodings between process calculi represented as functorial reactive systems: if one could show that any putative encoding preserves certain categorical properties that hold of the redexes in the source of the encoding but not of the redexes in the target, then one would have a handle on disproving the encoding's existence. It remains to be seen whether such classifications are possible or useful. Yet it seems like a fruitful direction: the power of functorial reactive systems is that they are abstract and thus provide a general setting in which to derive operational congruences. There is no reason why this power cannot be exploited for other ends as well, such as the classification of dynamics. This, however, is future work.

Chapter 8 Conclusions and future work

This final chapter summarises some of the accomplishments of the dissertation and points to areas that require further work.

The main motivation for my research is to provide mathematical tools that ease the construction of new process calculi. Every computational phenomenon (distributed hosts, failures, cryptography, etc.) that gives rise to a new calculus carries with it rules describing how it works, how a computation evolves. These are often reaction rules, which capture the allowable state changes. The redexes are typically composites formed from atoms, so the following questions naturally arise. When an agent has parts of a redex, which atoms does the agent require to form a complete redex? Can we get a definition of labelled transition if we choose the labels to be those small contexts that complete redexes? These questions are not original with me: they are often used by the process calculus community as motivation and intuition for designing specific labelled transition relations. I believe, though, that the original contribution of my work is to give a general way of transforming reaction rules into tractable labelled transitions. By tractable, I mean two things: (i) the labels are small, i.e. they contain only those parts of a redex necessary to complete a whole redex in an agent; (ii) the labelled transitions yield operational equivalences and preorders that are congruences. The key ideas in trying to achieve both desiderata are those of a relative pushout (RPO) and an idem pushout (IPO). With respect to (ii) the results are good: I prove congruence for strong bisimulation, weak bisimulation, the traces preorder, and the failures preorder. For (i), the results are encouraging: IPOs ensure that the labels of transitions contain no junk, i.e. that no lesser label would do. There is, however, more work required in making the labels even simpler by exploiting the uniformity present in reaction rules that contain metavariables.
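The intuition that a label should be a smallest context completing a redex can be illustrated crudely, without any category theory. The sketch below is a toy assumption made only for illustration: a CCS-like calculus with one channel, in which an agent is a multiset of atoms x and x_bar and a reaction consumes a complementary pair. It enumerates parallel contexts of bounded size and keeps those that enable a reaction and for which no strictly smaller context would already do. The dissertation's actual construction uses IPOs rather than enumeration, so this is only an approximation of the idea, not the method developed in Chapters 2 and 3.

```python
from itertools import combinations

# Toy syntax (assumed for this example only): an agent is a multiset of atoms
# over one channel, written as a tuple; 'x_bar' is the output atom, 'x' the input.
ATOMS = ('x', 'x_bar')
COMPLEMENT = {'x': 'x_bar', 'x_bar': 'x'}

def can_react(agent):
    # The toy analogue of "contains a redex": some complementary pair is present.
    return any(COMPLEMENT[a] in agent for a in agent)

def proper_subcontexts(m):
    # All strictly smaller multisets obtained by dropping atoms from m.
    return {tuple(sorted(c)) for k in range(len(m)) for c in combinations(m, k)}

def minimal_labels(agent, max_extra=2):
    """Candidate labels are parallel contexts '- | m'; keep those that enable a
    reaction and contain no junk, i.e. no strictly smaller enabling context."""
    enabling = set()
    for k in range(max_extra + 1):
        for m in combinations(ATOMS * k, k):        # multisets of size k
            m = tuple(sorted(m))
            if can_react(tuple(agent) + m):
                enabling.add(m)
    return {m for m in enabling
            if not any(s in enabling for s in proper_subcontexts(m))}

# The agent consisting of a single input x receives the single label '- | x_bar'.
print(minimal_labels(('x',)))        # {('x_bar',)}
```

Even in this toy setting, the final filter plays the role that the IPO condition plays in the general theory.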
The further simplification of labels via the uniformity of reaction rules is discussed below in the paragraphs concerned with multi-hole redexes.

One of the most attractive qualities of the theory of RPOs is that it is abstract: it is applicable to all categories of contexts and therefore provides a unifying discipline. Before I understood RPOs, I spent several years thinking about dissection results for specific graph-like contexts: ``If C_0 a_0 = C_1 a_1, then what part of a_i is present in C_{1-i} for i = 0, 1?'' I could only get results about nodes (for example, which nodes of a_1 are supplied by C_0) but not about arcs. Trying to work in an ad hoc way with graphs was unsatisfying until the RPO theory arrived: it was only the discipline of a universal property in the category theory sense that sustained me in carrying out the constructions contained in Chapter 6. Sewell's ad hoc reasoning about dissection for term algebras with parallel composition [Sew01] was successful without employing RPOs, but his statement of dissection was complicated. To be fair, his dissections were for multi-hole redexes, which RPOs do not neatly handle. I say more on this below. He is now recasting some of his results in terms of RPOs and other universal constructions.

RPOs thus support the goal of developing a shared theory that can produce labelled transitions and operational congruences for many process calculi. The task that we usually face for each new process calculus, of defining labelled transitions and proving that equivalences and preorders are congruences, is replaced by the task of choosing reactions and proving that RPOs exist. This is attractive leverage gained from RPOs. Nonetheless, constructing RPOs is difficult, as witnessed by the daunting complexity in Chapter 6. It is therefore desirable that it not be done afresh for each new process calculus. This is the impetus for the leading example in this dissertation, namely the construction of RPOs for action calculi-like contexts. Recall that the functorial reactive system F : Ĉ-Ixt → C-Ixt is really a family with varying choices of control signature K and of reaction rules Reacts. The fragment of action calculi considered in Chapter 5 is thus a framework: if one can express the states of a computational model using controls (for atoms) and arcs (for names and sharing), and its reactions in terms of redexes that satisfy the condition output-controlled, then RPOs exist and hence Chapter 3 gives labelled transitions and operational congruences.

A framework is often deficient for two reasons: (i) it provides insufficient benefits to compensate for the effort of casting a specific example in the framework's form; (ii) it is not rich enough to cover all the phenomena that one wants. Criticism (ii) hits home. The fragment of action calculi presented in this dissertation is inadequate for most purposes: there is no nesting of action graphs (needed for prefixing), no free names, no binding, limited reflexion, no choice operator, and no multi-hole redexes (needed to represent faithfully the metavariables in reaction rules). I describe future work below that addresses these issues. Criticism (i), however, is unjustified when applied to the action calculi shown here: getting ``for free'' a labelled transition system and a collection of operational congruences is a worthwhile benefit.
The reverse situation held for action calculi as originally presented by Milner [Mil96]: criticism (ii) was not such a problem, but criticism (i) was: up until now there has been little tangible reward for using action calculi to present specific examples of process calculi.

The central goal of future work is to construct a framework with rich enough structure (perhaps like Milner's original action calculi) to dodge criticism (ii) while still providing the rewards of labelled transitions and operational congruences. To this end, it will be important to pursue several research directions:

nesting: The nesting of agents inside nodes is essential for representing prefixing. For example, in CCS the agent x.y.x̄ contains three levels of nesting, with the name x shared by the outermost and the innermost. In order to represent this with graph-like constructions, we need each node to package up the continuation agent, e.g. as in Figure 8.1, which gives a possible representation of the CCS agent above.

Figure 8.1: Nesting of nodes

At first glance, nesting is a trivial thing: term algebras, for instance, consist of just pure nesting structure with nothing else, and it is straightforward to calculate RPOs for them. However, the real challenge is the combination of wiring and nesting: Figure 8.1 has an arc sourced by x that is forked and has targets at multiple levels. In particular, even the notion of context composition is difficult: when composing contexts that fork an arc at different nesting levels, the forks have to be normalised in some way, either by pushing them in to the innermost level or pulling them out to the outermost. As a stepping stone to handling nesting with rich wiring, Milner is currently looking at a simpler situation, namely the enhancement with nesting of the work we did on linear graphs [LM00b]. In this case the normalisation of forked arcs is not an issue, since arcs cannot be forked or discarded. There are many possible representations for nested graphs. One that seems most promising comes out of the work by Milner referred to here: the idea is to treat a nested graph as a flat graph with an added parent function overlaid. It seems that RPOs can then be calculated by handling the wiring structure and the nesting structure orthogonally. Furthermore, this orthogonality appears to be robust when passing to more complex wiring.

Even without the full power of nesting, it may be possible to handle the π-calculus (or variants thereof) indirectly. Honda and Yoshida [HY94a] give a translation in terms of ``concurrent combinators''; these encode prefixing in a flat structure by using triggers to point to parts of agents that may execute. A derived labelled transition relation for their combinator reaction rules might be too intensional, since the labels could detect exactly the combinators present in an agent. It is worth trying this, though, to see what kind of equivalences would transpire.

free names and binding: It is straightforward to add free names to graphs: the idea is to enrich the set of source ports to include names. It is more subtle, though, to make precise how a context can bind the names of the thing that goes in its hole. It seems likely that the objects of the category of contexts need to contain name sets to ensure that RPOs exist (analogously to the inclusion of node sets in Ĉ-Ixt), i.e.
a context is an arrow of the form (m, n, U, X) → (m′, n′, U′, X′), meaning that its hole is expecting a context with support (nodes) U and names X. Nonetheless, it is not clear whether the downstairs category (the analogue of C-Ixt) should include these name sets. For example, in the π-calculus we write the binding context (νx)− without saying which names can fit in the hole. Perhaps the functor from upstairs to downstairs (like F : Ĉ-Ixt → C-Ixt in this dissertation) will allow two kinds of IPO sliding: the first with respect to nodes (as I have shown here) and the second with respect to names. There has been recent work on modelling names and name binding in a categorical way [TFP99, GP99], and it would be exciting to join those syntactic models together with operational semantics, and in particular with the reactive systems shown here.

Name binding carries with it the problem of scope extrusion in labelled transitions. In the π-calculus, for example, it is a subtle problem to handle the labelled transitions of (νx)(ȳ⟨x⟩). There are many ways of handling this when looking at bisimulation, for example, but most involve augmenting the definition with freshness side conditions to cater explicitly for extrusion. An alternative approach [CS00] keeps track of extruded names as annotations on the agents themselves, thus gaining the power of extrusion while keeping the simplicity of bisimulation without added conditions. Adding these annotations seems almost identical to adding name sets to the objects of a context category (as outlined above).

multi-hole redexes: The basic reaction rule of the π-calculus is

    x̄⟨y⟩.a | x(z).b → a | {y/z}b .    (8.1)

Imagine for a moment that the questions of nesting and free names are overcome. There is still a problem in adequately representing (8.1) in a reactive system. Currently I require that the reaction rules Reacts consist of pairs of agents, not of agent contexts. As a result, the only way to represent (8.1) is via an infinite family of rules, one for each possible instantiation of a and b. Thus there might be labelled transitions of the form

    x̄⟨y⟩  −−( − | x(z).b )−→    (8.2)

for all instantiations of b. But these are ungainly labels: the only important part of any label of the form − | x(z).b is the top-level structure, namely x(z); the b is irrelevant. How can we winnow down the family of labels to a single one? The answer lies in using the uniformity present in (8.1). The a and b there are really metavariables, and the rule can be expressed thus:

    x̄⟨y⟩.-_1 | x(z).-_2 → -_1 | {y/z}-_2 .    (8.3)

This rule consists of a pair of 2-hole contexts which capture the uniformity present in it.
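To make the reading of (8.3) as a pair of 2-hole contexts concrete, here is a small Python sketch that represents the parametric rule as a matching function over a toy abstract syntax and applies it directly, so that a single rule stands in for the infinite family of ground rules above. The tuple encoding of agents, the function names, and the naive (capture-unaware) substitution are all assumptions made for this illustration only; the sketch matches redexes only at top level and omits the surrounding context D.

```python
# Agents as nested tuples (a toy abstract syntax assumed for this example):
#   ('nil',)           the empty agent 0
#   ('out', x, y, P)   x_bar<y>.P
#   ('inp', x, z, P)   x(z).P
#   ('par', P, Q)      P | Q

def subst(term, y, z):
    """Naive substitution {y/z}: replace the name z by y everywhere.
    (A serious implementation would avoid capturing bound occurrences.)"""
    if term[0] == 'nil':
        return term
    if term[0] in ('out', 'inp'):
        tag, a, b, cont = term
        return (tag, y if a == z else a, y if b == z else b, subst(cont, y, z))
    _, p, q = term
    return ('par', subst(p, y, z), subst(q, y, z))

def react_once(agent):
    """One step of the reaction relation generated by the single parametric rule
    (8.3):  x_bar<y>.(-1) | x(z).(-2)  ->  (-1) | {y/z}(-2).
    The two holes are instantiated by whatever continuations the match finds."""
    if agent[0] == 'par':
        p, q = agent[1], agent[2]
        if p[0] == 'out' and q[0] == 'inp' and p[1] == q[1]:   # same channel x
            _, _, y, hole1 = p
            _, _, z, hole2 = q
            return ('par', hole1, subst(hole2, y, z))
    return None    # no top-level redex

# x_bar<y>.0 | x(z).z_bar<w>.0   ->   0 | y_bar<w>.0
lhs = ('par',
       ('out', 'x', 'y', ('nil',)),
       ('inp', 'x', 'z', ('out', 'z', 'w', ('nil',))))
print(react_once(lhs))   # ('par', ('nil',), ('out', 'y', 'w', ('nil',)))
```

A single parametric rule of this kind replaces the infinite family of ground rules, which is exactly the uniformity that a universal construction would have to exploit in order to produce a single label in place of the family (8.2).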
If we relax the condition on Reacts and allow it to contain pairs of contexts (L, R), where L, R : m → n are arbitrary contexts, then it is easy to generate the reaction relation (forgetting about the details of D):

    A → A′  iff  ∃(L, R) ∈ Reacts, C, D. A = DLC and A′ = DRC .

[Figure 8.2: hexagon diagram involving A, L and the arrows B, C, D, F, B′, C′, D′, F′]

Contrast this to Definition 2.2, for which the reaction rules are closed up under context application on the outside, but not also on the inside (as shown here). The hard problem is to define labelled transitions A −−F−→ A′. We cannot rely on IPOs any more, since the domain of A is potentially different from the domain of any redex L. A possible replacement for IPOs might be via hexagon idem pushouts (which are special kinds of hexagon relative pushouts), as suggested by Sewell. Notice how L and A are wrapped up in Figure 8.2 by arrows composed on either side. The hexagon idem pushout property guarantees that there is no lesser hexagon bounded by B′, C′, F′, D′. It seems plausible that such hexagons can yield the single labelled transition

    x̄⟨y⟩  −−( -_1 | x(z).-_2 )−→

from (8.3), which is lighter than having the infinite family shown in (8.2). In [Sew01], Sewell already exploited the uniformity present in reaction rules when looking at term algebras with parallel composition. He achieved lightweight labelled transitions, but made use of complex context dissection lemmas. He is now looking at the problem of recasting his results in terms of universal constructions (such as hexagons).

In conclusion, there are two important lines of research related to multi-hole redexes which mirror what was done in my dissertation: (i) defining labelled transitions by universal constructions and proving congruence for a variety of equivalences and preorders; (ii) showing that the relevant universal constructions exist in categories of interest (term algebras, graphs, etc.).

sum: As discussed in Section 3.5, the proper treatment of summation (such as occurs in CCS and the π-calculus) requires care. It is not good enough for sum to be a free operator (just another control) if we want to describe its reactive behaviour, e.g.

    (x̄.a + b) | (x.a′ + b′) → a | a′
    τ.a + b → a .

Consider the second reaction rule. In order for the sum to be rearranged so that τ.a is on the left and the rest is on the right, we require + to be commutative, associative, and to have the empty agent as an identity. I do not understand how to represent this in graphs. The same problem surfaces for other operators that change the structural congruence (syntactic equality). In the π-calculus, for example, replication is handled by the axiom

    !a ≡ a | !a .    (8.4)

This seems even harder to represent in terms of graphs, since the RHS has, potentially, more nodes than the LHS. Even if the problems of finding graph-theoretic representations of sum and replication can be overcome, it may be difficult to construct RPOs. A possible solution, at least for replication, is to disallow the structural equality in (8.4), and instead confine replication to be input guarded and to have a reaction rule of the form

    x̄⟨y⟩ | !x(z).a → {y/z}a | !x(z).a .

This is commonly done in asynchronous variations of the π-calculus [HT91]. Another way of handling replication is via an encoding in terms of combinators (already mentioned above) [HY94b]. For input-guarded summation, encoding [NP96] is also an option.

full reflexion: The contexts considered in Chapter 5 have only a limited form of reflexion, as enforced by the condition Loose, which prevents tight loops linking a hole to itself. This simplifying assumption makes the composition of contexts and the composition of the associated embeddings easy. With full reflexion, the composition of contexts can create arcs that are formed from arbitrarily many segments of arcs present in the inner and outer context. Indeed, it is difficult to prove even that the composition of reflexive embeddings is associative [Lei01]. Better technology for handling reflexion has been developed in [LM00b] for graphs with linear wiring. Proving the existence of RPOs for full reflexion with non-linear wiring represents a future challenge.

inductive presentation of labelled transitions: This item is not related to computational phenomena, but rather to the form in which derived labelled transitions are presented.
In this dissertation, the way of deriving labelled transitions from reactions yields a direct characterisation. This is in contrast to the inductive presentation usually found in the process calculus literature. Can the gap be bridged, though? In other words, can an inductive presentation of labelled transitions be derived from reaction rules? It seems plausible. For any particular reactive system whose syntax is generated from some constructors, one could instantiate Lemma 2.19 (recalled here) by substituting each constructor for C:

    FC = C′F′ is an IPO        a −−F′−→ a″        C′ ∈ D
    ─────────────────────────────────────────────────────
                        Ca −−F−→ C′a″ .

There are several interesting questions that follow. Under what conditions can we derive an inductive presentation that generates precisely the same labelled transition relation as the direct presentation gives? Under what conditions would an inductive presentation satisfy any of the GSOS rule formats [GV92, Blo93]? If some set of the latter is satisfied, it would be enlightening to compare two different proofs of congruence for strong bisimulation, say: one based on the RPO theory shown in this dissertation and the other based on GSOS reasoning, particularly as provided by recent categorical treatments of the latter [TP97].

It is not surprising that some of these areas (e.g. free names and multi-hole redexes) require changes to the categorical abstractions of a reactive system, not just cleverer ways of constructing RPOs. This pattern is well-established in my dissertation. I started with reactive systems, then passed to functorial reactive systems when it became clear that RPOs needed to be calculated in a separate category, and then considered functorial monoidal reactive systems to cater explicitly for RPOs involving multi-hole contexts.

A test of the future work outlined above is to see what labelled transitions and operational congruences are generated for applications such as CCS, the π-calculus, the λ-calculus, and ambients. It would be exciting (though not likely) to get congruences that coincide exactly with well-known ones (open bisimulation for the π-calculus, say). It should not be a cause for dejection if the derived congruences do not match exactly the historically established ones: the π-calculus comes with a zoo of bespoke congruences, none of which is good for all applications.

The best outcome of this work will be if the mathematical tools proposed here ease the path for exploring new primitives and new patterns of behaviour inherent in distributed computing. The development of mathematical techniques for reasoning about hosts and host failure, agent mobility, and cryptography, for example, is critical for guiding infrastructure and language design. Proof techniques for these models are critical for precise reasoning. Much work needs to be done before the tools of this dissertation are good enough to address these applications. Nonetheless, I see no limits as to what can be so achieved.

Appendix A Review of category theory

I summarise here the main ideas of category theory used in this dissertation. For greater detail, the reader is referred to the classic work by Mac Lane [Mac71]. Another good reference is [BW01] (which is online).

Categories A category C consists of objects X, Y, ... and of arrows f, g, ... between objects. An arrow f has a domain object Dom f and a codomain object Cod f. We write f : X → Y if Dom f = X and Cod f = Y. The arrows from X to Y form a homset, denoted C(X, Y). The objects of C are denoted obj C.
Composition is defined on all pairs of arrows between appropriate objects: f : X Y g : Y Z gf : X Z and is associative: h(gf) = (hg)f . Every object X has a (unique) identity arrow id X : X X; for f : X Y , we have id Y f = f = fid X . Special arrows An arrow f : X Y is an iso i# there exists an arrow g : Y X such that gf = id X and fg = id Y . If g # : Y X has the same properties as g then g = gid Y = gfg # = id X g # = g # , so the inverse of f is unique and is often written f -1 . An arrow f is an epi i# it is right­cancellable, i.e. g 0 f = g 1 f implies g 0 = g 1 . A split epi f is an arrow with a right­inverse, i.e. there exists h such that fh = id. An arrow f is a mono i# it is left­cancellable, i.e. fg 0 = fg 1 implies g 0 = g 1 . A split mono (also known as a retraction) f is an arrow with a left­inverse, i.e. there exists h such that hf = id. Subcategories A subcategory D of C is a category all of whose objects and arrows are objects and arrows of C and whose identities and compositions are the same as in C. 131 Appendix A. Review of category theory 132 Furthermore, D is a full subcategory if every homset in D is equal to the corresponding homset in C, i.e. for all X,Y # obj D, D(X,Y ) = C(X,Y ). (Thus a full subcategory is determined entirely by its objects.) Commutation The composition of arrows is depicted by concatenating them. For ex­ ample, if f : X Y and g : Y Z then gf is shown as X Y f Z g Often the object labels are omitted. A square commutes if every path with common start and end is equal. For example, the following square commutes i# g 0 f 0 = g 1 f 1 : f 0 f 1 g 0 g 1 Coproducts Given two objects X 0 , X 1 , their coproduct consists of an object X (some­ times written X 0 + X 1 ) and a pair of arrows f i : X i X, i = 0, 1, so that for any other pair f # i : X i X # , there exists a unique mediating arrow k : X X # (sometimes written [f # 0 , f # 1 ]) such that the two triangles below commute: X 0 X X # X 1 f # 0 f # 1 f 0 f 1 k . Universal constructions and isos Coproducts are unique up to isomorphism in the following sense: if f 0 , f 1 and f # 0 , f # 1 are both coproducts then the k shown above is an iso. Conversely, if k is an iso and f 0 , f 1 is a coproduct then f # 0 , f # 1 is too. This robustness with respect to isos is not accidental and is not limited to coproducts. In general, universal constructions, like coproducts, pushouts (see below), and relative pushouts (see Definition 2.4) are unique up to isomorphism. Moreover, if one replaces any object Y in such a construction with, say, Y # for which there is an iso k : Y Y # and one composes k with all the arrows in the construction whose codomain is Y and composes k -1 with all the arrows whose domain is Y , then the diagram is still a universal construction. For example, we have the following property about IPOs (Definition 2.5): Y f 0 f 1 g 0 g 1 is an IPO i# Y # kf 0 f 1 g 0 k -1 g 1 is too. 133 Pushouts Give two arrows f 0 , f 1 with common domain, a pushout is a pair g 0 , g 1 such that g 0 f 0 = g 1 f 1 and for any other pair g # 0 , g # 1 satisfying g # 0 f 0 = g # 1 f 1 , there exists a unique k such that kg i = g # i for i = 0, 1: f 0 f 1 g 0 g # 0 g 1 g # 1 kk Slice and coslice categories Given a category C and an object of X, the slice category C/X is defined as follows. An object of C/X is an arrow in C with codomain X; an arrow h : f g of C/X is an arrow of C such that gh = f : X f h g Composition in C/X is just given by composition in C. Finally, id f in C/X is id Cod f in C. 
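Several of the notions reviewed above have elementary instances in the category of finite sets and total functions, where the epis are exactly the surjections, the monos are the injections, and the coproduct is disjoint union. The following Python sketch is an illustration assumed for this appendix only (it is not used anywhere in the dissertation's development); it builds a coproduct with its injections and the mediating arrow, and checks the commuting triangles of the universal property.

```python
# Finite sets are Python sets; an arrow is a dict sending every element of its
# domain to an element of its codomain.  (An illustrative encoding only.)

def is_epi(f, codomain):
    # In the category of sets, the epis are exactly the surjections.
    return set(f.values()) == set(codomain)

def is_mono(f):
    # In the category of sets, the monos are exactly the injections.
    return len(set(f.values())) == len(f)

def coproduct(X0, X1):
    # Disjoint union X0 + X1 with its two injections f0, f1 (elements are tagged).
    X = {(0, x) for x in X0} | {(1, x) for x in X1}
    f0 = {x: (0, x) for x in X0}
    f1 = {x: (1, x) for x in X1}
    return X, f0, f1

def mediator(g0, g1, X0, X1):
    # The unique arrow [g0, g1] : X0 + X1 -> X' with [g0, g1] composed with f_i = g_i.
    return {**{(0, x): g0[x] for x in X0}, **{(1, x): g1[x] for x in X1}}

X0, X1 = {'a', 'b'}, {'b', 'c'}
X, f0, f1 = coproduct(X0, X1)          # 'b' occurs twice in X, tagged 0 and 1
assert is_mono(f0) and is_mono(f1)

g0 = {'a': 1, 'b': 2}                  # another pair of arrows into X' = {1, 2, 3}
g1 = {'b': 2, 'c': 3}
k = mediator(g0, g1, X0, X1)
assert all(k[f0[x]] == g0[x] for x in X0)   # left triangle commutes
assert all(k[f1[x]] == g1[x] for x in X1)   # right triangle commutes
assert is_epi(k, {1, 2, 3})                 # here k also happens to be surjective
```

Uniqueness of the mediator is forced by the two triangles: its value on each tagged element is determined by g0 or g1.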
The coslice category X/C is the dual notion: an object in X/C is an an arrow in C with domain X; an arrow h : f g of X/C is an arrow of C such that hf = g: X f g h Composition in X/C is just given by composition in C. Finally, id f in X/C is id Dom f in C. Strict monoidal categories A strict monoidal category C has a tensor #, which is a binary operation on objects and arrows, and a unit object U . On objects, the tensor is associative and has identity U . On arrows, the tensor satisfies the following inference rule: f 0 : X 0 Y 0 f 1 : X 1 Y 1 f 0 # f 1 : X 0 #X 1 Y 0 # Y 1 and is associative and with identity id U . Furthermore, compositions and tensors can be rearranged as follows, (g 0 # g 1 )(f 0 # f 1 ) = (g 0 f 0 ) # (g 1 f 1 ) provided that the arrows match up appropriately: f i : X i Y i and g i : Y i Z i for i = 0, 1. Appendix A. Review of category theory 134 Functors A functor F : C 0 C 1 maps objects and arrows of category C 0 to those of C 1 , such that the following properties hold: f : X Y F (f) : F (X) F (Y ) F (gf) = F (g)F (f) F (id X ) = id F (X) We say that F is faithful if it is injective on arrows; F is full if it maps surjectively the homeset C 0 (X, Y ) onto C 1 (F (X), F (Y )) for all objects X,Y in C 0 . Appendix B Labelled transitions via retractions As promised in Section 3.4, this appendix presents a new definition of labelled transition, which is denoted by · r . This definition has two properties: (i) it recovers the reaction re­ lation, i.e. id = ; (ii) it does not involve case analysis (unlike · c , see Definition 3.10). The key idea is to make use of retractions, otherwise knows as split monos (see p. 22 in [BW01]). We say that an arrow • f is a retraction if it has a left inverse, i.e. if there exists an arrow f such that f • f = id. The development follows closely that of Section 3.4: I present the definition of · r ; show how to recover the reaction relation; prove cutting and pasting results; prove that the induced strong bisimulation # r is a congruence; and, finally, compare # r to # c . These results (specifically Lemma B.4 and Proposition B.6) assume an added property of functorial reactive systems not included in Definition 3.1: If F is a functorial reactive system then F creates left inverses, i.e. if id = C 1 F(C 0 ) then there exists C 1 # “ C such that id = C 1 C 0 and F(C 1 ) = C 1 . This added property holds for all functors constructed in the form shown in Chapter 4, as proved in Theorem 4.14. As a result, the functor F : “ C­Ixt C­Ixt (Chapter 5) creates left inverses, so the results presented here are applicable to graph contexts. # # # # a l • RF D B.1 Definition B.1 (labelled transition by retractions; cf. Definition 3.10) a F r a # i# there exists a, l, F, D, R, • R # “ C and r # C such that Figure B.1 is an IPO in “ C and a # = F(RD)r F(D) # D (F(l), r) # Reacts R • R = id F(a) = a F(F) = F . # It follows immediately from this definition that id r # . The interesting question is the converse: Proposition B.2 (recovery of the reaction relation for · r ) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. Then a a # implies a id r a # . 135 Appendix B. Labelled transitions via retractions 136 # # # # a l id D • R D # R B.2 a l • R id D # B.3 Proof By Proposition 3.2, there exist a, l, D # “ C and r # C such that a = Dl and a # = F(D)r F(D) # D (F(l), r) # Reacts F(a) = a . Thus the outer square in Figure B.2 commutes. Since F has all redex­RPOs, we can construct • R, R, D # forming an RPO as shown in Figure B.2. 
Thus a # = F(D)r = F(RD # )r and R • R = id. Moreover, by Proposition 2.7, the square in Figure B.3 is an IPO. Hence a id r a # as desired. # # # # # C F # • RF C # B.4 Lemma B.3 (portable IPO cutting; cf. Lemma 3.5) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. If Ca F r a # then there exist a ## # C, R # “ C and an IPO square shown in Figure B.4 such that a F(F # ) r a ## and a # = F(RC # )a ## F(C # ) # D R • R = id F(C) = C F(F) = F . # # # # a l C F # • RF D D # C # B.5 Proof By the definition of F r and the hypothesis that F creates compositions, there exist a, C, l, F, R, • R # “ C and r # C such that the big rectangle in Figure B.5 is an IPO and a # = F(RD)r F(D) # D (F(l), r) # Reacts R • R = id F(a) = a F(C) = C F(F) = F . Because F has all redex­RPOs, there exist F # , D # , C # forming an RPO in “ C, as in Figure B.5. Then F(C # ) # D since F(C # )F(D # ) = F(D) # D. By Proposition 2.7, the small left­hand square of Figure B.5 is an IPO. Because F has all redex­RPOs, Proposition 2.9 implies that the small right­hand square is an IPO too. By definition, a F(F # ) r a ## and a # = F(RC # )a ## , where a ## “ = F(D # )r, as desired. # # # # # C F # • RF C # B.6 Lemma B.4 (portable IPO pasting; cf. Lemma 3.7) Suppose F : “ C C is a functorial reactive system and has all redex­RPOs. If a F(F # ) r a # , Figure B.6 is an IPO with F(C # ) # D, and there exists R such R • R = id then F(C)a F(F) r F(RC # )a # . # # # # a l • T 0 F # 0 D B.7 C 0 F # 0 • R 0 F 0 C # 0 B.8 Proof Since a F(F # ) r a # , there exist a, l, F # 0 , D, T 0 , • T 0 # “ C and r # C such that Figure B.7 is an IPO in “ C and a # = F(T 0 D)r F(D) # D (F(l), r) # Reacts T 0 • T 0 = id F(a) = a F(F # 0 ) = F(F # ) . 137 Since F allows IPO sliding and creates compositions, there exist C 0 , • R 0 , F 0 , C # 0 # “ C such that Figure B.8 is an IPO and F(C 0 ) = F(C) F( • R 0 ) = F( • R) F(F 0 ) = F(F) F(C # 0 ) = F(C # ) . Since id = F(id) = F(R)F( • R) = F(R)F( • R 0 ) and F creates left inverses, there exists R 0 such that id = R 0 • R 0 and F(R 0 ) = F(R). # # # # C # 0 • T 0 id C # 0 T 0 B.9 a l C 0 • R 0 F 0 K 0 D K 1 C # 0 T 0 K B.10 a l C 0 K 0 K 1 B.11 Pasting Figure B.8 vertically to the commuting square shown in Figure B.9 and then pasting hori­ zontally the composite to Figure B.7 results in the outer rectangle shown in Figure B.10. Because F has all redex­RPOs, there exist K 0 , K 1 , K forming an RPO inside, as shown. By Proposition 2.7, the rectangle in Figure B.11 is an IPO. # # # # a l C # 0 F # 0 C # 0 T 0 D • T 0 F # 0 D C # 0 T 0 B.12 a l C # 0 F # 0 C # 0 T 0 D • T 0 F # 0 D C # 0 T 0 K K 0 C 0 K 1 JJ B.13 Since F has all redex­RPOs and Figure B.7 is an IPO, Proposition 2.8 implies that Figure B.12 is an RPO. But K 0 C 0 , K 1 , K is a candidate, so there exists a unique J as shown in Figure B.13. Now F(K)F(J)F( • T 0 ) = F(KJ • T 0 ) = Figure B.13 F(C # 0 ) = F(C # ) # D so F(J) # D. # # # # C 0 F # 0 • R 0 F 0 C # 0 K 0 J • T 0 K • K • K B.14 a l C 0 • K • R 0 F 0 JD B.15 Thus, K 0 , J • T 0 , K is a candidate for Figure B.7, an IPO, as shown in Figure B.14. Thus there exists a unique • K such that K • K = id, • K • R 0 F 0 = K 0 , and • KC # 0 = J • T 0 . We re­ draw Figure B.11, an IPO, with equivalent arrows in Figure B.15. Recall that F(J) # D, so F(JD) # D. Thus: F(C)a = F(C 0 )a F(F 0 ) r F(R 0 KJD)r = Figure B.13 F(RC # )F(T 0 D)r = F(RC # )a # , so F(C)a F(F) r F(RC # )a # , as desired. # Theorem B.5 (congruence for # r ; cf. 
Theorem 3.12) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # r is a congruence, i.e. a # r b implies Ca # r Cb for all C # C of the required domain. Proof By symmetry, it is su#cient to show that the following relation is a strong sim­ ulation: S “ = {(Ca, Cb) / a # r b and C # C} . Suppose that a # r b and C # C, and thus (Ca, Cb) # S. Suppose Ca F r a # . Appendix B. Labelled transitions via retractions 138 # # # # C F # • RF C # B.16 By Lemma B.3, there exist a ## # C, R # “ C and an IPO square shown in Figure B.16 such that a F(F # ) r a ## and a # = F(RC # )a ## R • R = id F(C # ) # D F(C) = C F(F) = F . Since a # r b, there exists b ## such that b F(F # ) r b ## and a ## # r b ## . By Lemma B.4, Cb F r F(RC # )b ## . Hence (F(RC # )a ## , F(RC # )b ## ) # S since a ## # r b ## , as desired. # Proposition B.6 (# c # # r ; cf. Proposition 3.13) Let F : “ C C be a functorial reactive system which has all redex­RPOs. Then # c # # r . Proof We show that # c is a strong bisimulation w.r.t. the labelled transition relation · r . By symmetry, it is su#cient to shown that # c is a strong simulation over · r . Consider any a, b for which a # c b. Suppose a F r a # . By definition, there exists F, R, • R # “ C and a ## # C such that a F( • RF) c a ## and a # = F(R)a ## R • R = id F(F) = F . Since a # c b, there exists b ## such that b F( • RF) c b ## and a ## # c b ## . We now distinguish two cases: case F( • RF) is an iso: In this case, F( • RF)b b ## . Since F(R)F( • R) = id # D, we have that F(R) # D, thus F(F)b F(R)b ## . Since F creates isos, • RF is an iso in “ C, so has an inverse K. Thus K • R is an inverse for F, so F is an iso, so F(F) = F is an iso. Thus b F c F(R)b ## . Since # c is a congruence, a # = F(R)a ## # c F(R)b ## , as desired. # # # # b l • R 0 F 0 D B.17 case F( • RF) is not an iso: In this case, b F( • RF) b ## . Since F creates compositions, there exist b, l, F 0 , • R 0 , D # “ C and r # C such that Figure B.17 is an IPO in “ C and b ## = F(D)r F(D) # D (F(l), r) # Reacts F(b) = b F(F 0 ) = F = F(F) F( • R 0 ) = F( • R) . Since id = F(id) = F(R)F( • R) = F(R)F( • R 0 ) and F creates left inverses, there exists R 0 such that id = R 0 • R 0 and F(R 0 ) = F(R). Thus b F r F(R 0 D)r = F(R)b ## . Since # c is a congruence, a # = F(R)a ## # c F(R)b ## , as desired. # Bibliography Curly braces enclose pointers back to the pages in this dissertation that cite the work. [AG97] M. Abadi and A. D. Gordon. A calculus for cryptographic protocols: the spi calculus. In Proc. 4th ACM Conf. on Computer and Communications Security, Z˜urich, pages 36--47. ACM Press, 1997. {6} [Bar84] H. P. Barendregt. The Lambda Calculus: Its Syntax and Semantics. North Holland, revised edition, 1984. {7, 8} [BB90] G. Berry and G. Boudol. The chemical abstract machine. In Proc. 17th Annual Symposium of Principles of Programming Languages, pages 81--94. ACM Press, 1990. {6} [BB92] G. Berry and G. Boudol. The chemical abstract machine. Theoretical Computer Science, 96(1):217--248, 1992. {6} [BF00] M. Bunge and M. P. Fiore. Unique factorisation lifting functors and categories of linearly­controlled processes. Mathematical Structures in Computer Science, 10(2):137--163, 2000. {69} [Blo93] B. Bloom. Structural operational semantics for weak bisimulations. Technical Re­ port TR­93­1373, Department of Computer Science, Cornell University, August 1993. {14, 45, 129} [BW01] M. Barr and C. F. Wells. Toposes, triples and theories. Version 1.1. 
Available from: http://www.cwru.edu/artsci/math/wells/pub/ttt.html, 2001. {131, 135}
[CG98] L. Cardelli and A. D. Gordon. Mobile ambients. In Foundations of Software Science and Computation Structure, First International Conference, FoSSaCS '98, Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS '98, Lisbon, Portugal, March 28 -- April 4, 1998, Proceedings, volume 1378 of Lecture Notes in Computer Science. Springer-Verlag, 1998. {7}
[CLM00] G. L. Cattani, J. J. Leifer, and R. Milner. Contexts and embeddings for closed shallow action graphs. Technical Report 496, Computer Laboratory, University of Cambridge, July 2000. {16, 74, 87, 88, 89, 93, 94, 95, 96}
[CM91] A. Corradini and U. Montanari. An algebra of graphs and graph rewriting. In 4th Biennial Conference on Category Theory and Computer Science, Proceedings, volume 530 of Lecture Notes in Computer Science, pages 236--260. Springer-Verlag, 1991. {15}
[Con72] F. Conduché. Au sujet de l'existence d'adjoints à droite aux foncteurs ``image réciproque'' dans la catégorie des catégories. Comptes rendus de l'Académie des sciences A, pages 891--894, 1972. {69}
[CS00] G. L. Cattani and P. Sewell. Models for name-passing processes: interleaving and causal. In 15th Annual IEEE Symposium on Logic in Computer Science, 26--29 June 2000, Santa Barbara, California, USA, pages 322--332. IEEE Press, 2000. {126}
[DH84] R. De Nicola and M. C. B. Hennessy. Testing equivalences for processes. Theoretical Computer Science, 34:83--133, 1984. {14}
[Ehr79] H. Ehrig. Introduction to the algebraic theory of graph grammars. In Proc. First International Workshop on Graph Grammars and their Application to Computer Science and Biology, volume 73 of Lecture Notes in Computer Science, pages 1--69. Springer-Verlag, 1979. {15}
[FF86] M. Felleisen and D. P. Friedman. Control operators, the SECD-machine and the lambda-calculus. In M. Wirsing, editor, Formal Description of Programming Concepts III, pages 193--217. North Holland, 1986. {7, 21}
[FG98] C. Fournet and G. Gonthier. A hierarchy of equivalences for asynchronous calculi. In Automata, Languages and Programming, 25th International Colloquium, ICALP '98, Aalborg, Denmark, July 13--17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science, pages 844--855. Springer-Verlag, 1998. {14}
[FGL+96] C. Fournet, G. Gonthier, J.-J. Lévy, L. Maranget, and D. Rémy. A calculus of mobile agents. In CONCUR '96, Concurrency Theory, 7th International Conference, Pisa, Italy, August 26--29, 1996, Proceedings, volume 1119 of Lecture Notes in Computer Science, pages 406--421. Springer-Verlag, 1996. {7}
[Fou98] C. Fournet. The Join-Calculus: a Calculus for Distributed Mobile Programming. PhD thesis, École Polytechnique, November 1998. {14}
[GP99] M. J. Gabbay and A. M. Pitts. A new approach to abstract syntax involving binders. In 14th Annual Symposium on Logic in Computer Science, 2--5 July, 1999, Trento, Italy, pages 214--224. IEEE Press, 1999. {126}
[GV92] J. F. Groote and F. W. Vaandrager. Structural operational semantics and bisimulation as a congruence. Information and Computation, 100(2):202--260, 1992. {14, 129}
[Has99] M. Hasegawa. Models of Sharing Graphs: a Categorical Semantics of let and letrec. BCS Distinguished Dissertation Series. Springer-Verlag, 1999. {76}
[Hoa78] C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666--677, August 1978. {4}
[Hoa85] C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985. {4, 47, 50}
[Hon00] K. Honda. Elementary structures for process theory (1): sets with renaming. Mathematical Structures in Computer Science, 10(5):617--663, October 2000. {64}
[HT91] K. Honda and M. Tokoro. An object calculus for asynchronous communication. In ECOOP '91: European Conference on Object-Oriented Programming, Geneva, Switzerland, July 15--19, 1991, Proceedings, volume 512 of Lecture Notes in Computer Science, pages 133--147. Springer-Verlag, 1991. {128}
[HY94a] K. Honda and N. Yoshida. Combinatory representation of mobile processes. In POPL '94: 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Portland, Oregon, January 17--21, 1994, pages 348--360. ACM Press, 1994. {126}
[HY94b] K. Honda and N. Yoshida. Replication in concurrent combinators. In Theoretical Aspects of Computer Software, International Conference TACS '94, Sendai, Japan, April 19--22, 1994, Proceedings, volume 789 of Lecture Notes in Computer Science, pages 786--805. Springer-Verlag, 1994. {128}
[HY95] K. Honda and N. Yoshida. On reduction-based process semantics. Theoretical Computer Science, 152(2):437--486, November 1995. {14}
[Joh99] P. Johnstone. A note on discrete Conduché fibrations. Theory and Applications of Categories, 5(1):1--11, 1999. {69}
[JR99] A. Jeffrey and J. Rathke. Towards a theory of bisimulation for local names. In 14th Annual Symposium on Logic in Computer Science, 2--5 July, 1999, Trento, Italy, pages 56--66. IEEE Press, 1999. {15}
[JSV96] A. Joyal, R. Street, and D. Verity. Traced monoidal categories. Mathematical Proceedings of the Cambridge Philosophical Society, 119(3):425--446, 1996. {75}
[Kön99] B. König. Generating type systems for process graphs. In CONCUR '99: Concurrency Theory, 10th International Conference, Eindhoven, The Netherlands, August 24--27, 1999, Proceedings, volume 1664 of Lecture Notes in Computer Science, pages 352--367. Springer-Verlag, 1999. {15}
[Laf90] Y. Lafont. Interaction nets. In Conference Record of the Seventeenth Annual ACM Symposium on Principles of Programming Languages, San Francisco, California, January 1990, pages 95--108. ACM Press, 1990. {76}
[Lei01] J. J. Leifer. A category of action graphs and reflexive embeddings. Technical report, Computer Laboratory, University of Cambridge, 2001. To appear. {129}
[Lév78] J.-J. Lévy. Réductions correctes et optimales dans le lambda calcul. PhD thesis, Université Paris VII, 1978. {8}
[LM00a] J. J. Leifer and R. Milner. Deriving bisimulation congruences for reactive systems. In C. Palamidessi, editor, CONCUR 2000 -- Concurrency Theory, 11th International Conference, University Park, PA, USA, August 22--25, 2000, Proceedings, volume 1877 of Lecture Notes in Computer Science, pages 243--258. Springer-Verlag, 2000. {16}
[LM00b] J. J. Leifer and R. Milner. Shallow linear action graphs and their embeddings. Technical Report 508, Computer Laboratory, University of Cambridge, November 2000. {81, 125, 129}
[Low96] G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using FDR. In T. Margaria and B. Steffen, editors, Tools and Algorithms for Construction and Analysis of Systems, Second International Workshop, TACAS '96, Passau, Germany, March 27--29, 1996, Proceedings, volume 1055 of Lecture Notes in Computer Science, pages 147--166. Springer-Verlag, 1996. {4, 50}
[Mac71] S. Mac Lane. Categories for the Working Mathematician. Springer-Verlag, 1971. {131}
[Mil80] R. Milner. A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer-Verlag, 1980. {4}
[Mil88] R. Milner. Communication and Concurrency. Prentice-Hall, 1988. {4, 34, 44, 45, 47}
[Mil90] R. Milner. Functions as processes. Technical Report RR-1154, INRIA, Sophia Antipolis, February 1990. {6, 21}
[Mil92] R. Milner. Functions as processes. Mathematical Structures in Computer Science, 2(2):119--141, 1992. {41}
[Mil94] R. Milner. Action calculi V: reflexive molecular forms. Third draft; with an appendix by O. Jensen. Available from: ftp://ftp.cl.cam.ac.uk/users/rm135/ac5.ps.Z, 1994. {75}
[Mil96] R. Milner. Calculi for interaction. Acta Informatica, 33(8):707--737, 1996. {10, 74, 77, 124}
[Moo56] E. F. Moore. Gedanken-experiments on sequential machines. In C. E. Shannon and J. McCarthy, editors, Automata Studies, pages 129--153. Princeton University Press, 1956. {3}
[MPW89] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, parts I and II. Technical Reports ECS-LFCS-89-85 and ECS-LFCS-89-86, Laboratory for Foundations of Computer Science, University of Edinburgh, 1989. {6}
[MPW92] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, parts I and II. Information and Computation, 100(1):1--77, September 1992. {6}
[MS92] R. Milner and D. Sangiorgi. Barbed bisimulation. In Automata, Languages and Programming, 19th International Colloquium, ICALP '92, Vienna, Austria, July 13--17, 1992, Proceedings, volume 623 of Lecture Notes in Computer Science. Springer-Verlag, 1992. {14}
[MSS00] P. Mateus, A. Sernadas, and C. Sernadas. Precategories for combining probabilistic automata. In M. Hofmann, G. Rosolini, and D. Pavlovic, editors, Proc. CTCS '99, volume 29 of Electronic Notes in Theoretical Computer Science. Elsevier Science, 2000. {63}
[NP96] U. Nestmann and B. C. Pierce. Decoding choice encodings. In CONCUR '96, Concurrency Theory, 7th International Conference, Pisa, Italy, August 26--29, 1996, Proceedings, volume 1119 of Lecture Notes in Computer Science. Springer-Verlag, 1996. {128}
[Par81] D. Park. Concurrency and automata on infinite sequences. In P. Deussen, editor, Proc. 5th GI Conference, volume 104 of Lecture Notes in Computer Science, pages 167--183. Springer-Verlag, 1981. {5, 31}
[Pau98] L. C. Paulson. The inductive approach to verifying cryptographic protocols. Journal of Computer Security, 6:85--128, 1998. {47}
[Plo75] G. D. Plotkin. Call-by-name, call-by-value, and the lambda-calculus. Theoretical Computer Science, 1(2):125--159, December 1975. {21}
[Plo81] G. D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI-FN-19, Department of Computer Science, University of Aarhus, 1981. {4}
[Rei85] W. Reisig. Petri Nets: An Introduction. EATCS Monographs on Theoretical Computer Science. Springer-Verlag, 1985. {4}
[Ros94] A. W. Roscoe. Model-checking CSP. In A. W. Roscoe, editor, A Classical Mind: Essays in Honour of C. A. R. Hoare, pages 353--378. Prentice-Hall, 1994. {4, 50}
[Ros98] A. W. Roscoe. The Theory and Practice of Concurrency. Prentice-Hall, 1998. {3, 4, 48, 50}
[Sew98] P. Sewell. Global/local subtyping and capability inference for a distributed pi-calculus. In Automata, Languages and Programming, 25th International Colloquium, ICALP '98, Aalborg, Denmark, July 13--17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science. Springer-Verlag, 1998. {7}
[Sew00] P. Sewell. Applied pi -- a brief tutorial. Technical Report 498, Computer Laboratory, University of Cambridge, August 2000. {41}
[Sew01] P. Sewell. From rewrite rules to bisimulation congruences. Theoretical Computer Science, 272(1--2), 2001. {10, 15, 21, 55, 60, 81, 124, 127}
[SV99] P. Sewell and J. Vitek. Secure composition of insecure components. In Proc. 12th Computer Security Foundations Workshop. IEEE Press, June 1999. {7}
[SV00] P. Sewell and J. Vitek. Secure composition of untrusted code: wrappers and causality types. In Proc. 13th Computer Security Foundations Workshop. IEEE Press, July 2000. {47}
[SW01] D. Sangiorgi and D. Walker. The pi-calculus: a Theory of Mobile Processes. Cambridge University Press, 2001. {7}
[TFP99] D. Turi, M. P. Fiore, and G. D. Plotkin. Abstract syntax and variable binding. In 14th Annual Symposium on Logic in Computer Science, 2--5 July, 1999, Trento, Italy, pages 193--202. IEEE Press, 1999. {126}
[TP97] D. Turi and G. Plotkin. Towards a mathematical operational semantics. In 12th Annual IEEE Symposium on Logic in Computer Science, Warsaw, Poland, June 29 -- July 2, 1997, pages 280--291. IEEE Press, 1997. {14, 129}
[WN95] G. Winskel and M. Nielsen. Models for concurrency. In D. M. Gabbay, S. Abramsky, and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, volume 4, pages 1--148. Oxford University Press, 1995. {4}