Vector Clocks for Representing Temporal Relations between Distributed Events
In this blog I’ll review the use of vector clocks for comparing the times of occurrence of dispersed events in a distributed system. Vector clocks are well-known in the literature, and there are plenty of resources on the net about them. See, for example, the Wikipedia entry about them for original references and credits.
Vector clocks are useful in a scenario where we are not able or willing to make assumptions about clock drift rates at independent nodes of a distributed system, or about message transmission rates between nodes. In such cases, our primary source of information about temporal relations between events across nodes is the fact that for each message sent from node i to node j, the transmittal of the message from node i happens before the receipt of the message at node j. For example, suppose that a message is transmitted from node 1 when node 1’s clock value is 1, and is received at node 2 when node 2’s clock value is 3. If the clock value and the node of origin are piggy-backed on the message, then node 2 will know that its clock value of 3 is after node 1’s clock value of 1.
Based solely on the fact that message traversal entails some (unknown) delay, and that clock values at each node are increasing over time, we would like to be able to determine, for any two events in the system, whether one occurred before the other, or whether nothing can be asserted about their temporal relation.
My primary motivation for going over vector clocks here is to set the stage for my next blog about the use of vector clocks in update-anywhere replicated data stores.
But before we get there, here is a concrete example of the direct application of vector clocks. Consider a battlefield situation where a number of independent moving agents are observing a number of moving objects and communicating the positions of these objects to each other. The observers may get in and out of communication range of each other, and their communications equipment may get jammed or damaged and later be unjammed and repaired. Vector clocks make it possible to determine when an observation in such a scenario is obsoleted by another, so that decisions may be made based on the most recent observations for each object of interest.
Asynchronous Event System
The kind of event system described above is known as an asynchronous event system.
Asynchronous event system. A system of events in which each event is marked by a node+clock pair, where intra-node temporal relations between events are based on clock values at a given node, and where inter-node temporal relations between events rely on message transmittals being earlier than corresponding message receipts.
Figure 1 depicts the elements of an asynchronous event system.
Figure 1. Nodes, Clocks, and Messages in an Asynchronous Event System
The notation E(node, clock) is used to represent an event that occurred at a given node and a given clock value at that node. We’ll call the pairing of a node and a clock, a node-clock (e.g., (3, 6) – the clock value of 6 at node 3). Nodes and clock values are represented as non-negative integers. We’ll typedef these integers to Node, and Clock, respectively, and represent their pairing as a class named NodeClock.
The Predecessor Relation between Events
In this scenario, the events can be thought of as forming the vertices of a graph with two types of edges: a last edge from an event to the last event that immediately preceded it at the same node; and a source edge from a message receipt event to the corresponding message transmittal event. Let’s call this graph the predecessor graph of the set of events, and let’s call the direct relation represented by this graph, that is, the union of the last and the source relations, the immediate-predecessor relation.
Figure 2 depicts the predecessor graph of the distributed events in Figure 1.
Figure 2. Predecessors of Events
Suppose we have two references E1 and E2 to events in such a system. Then E2 succeeds E1 in time, as far as we know, if and only if there is a path from E2 to E1 in the predecessor graph of events. Of course, if E1 and E2 refer to the same event, then their times are also comparable. (For simplicity, I am making the (inessential) assumption that at each node, different events occur at different clock values.) Thus, comparability in time leads directly to the reflexive-transitive closure of the immediate-predecessor relation, which I will call simply the predecessor relation.
predecessor: reflexive-transitive closure of (source + last)
Representing the Predecessor Relation
Vector clocks are a natural data structure motivated by the need for an efficient representation of the predecessor relation.
The direct (and inefficient) representation of the predecessor relation would include, for each event, a list of all node-clocks of events that precede it (directly or transitively) in the predecessor graph.
class Event {
Node origin;
Set<NodeClock> predecessors;
}
In this representation, the predecessor set for event E(3, 5) of Figure 2 is:
(3, 5),
(3, 4),
(2, 4),
(2, 3),
(1, 1)
But we can keep the size of this representation bounded by removing, for each predecessor node, all but the latest node-clock for that node. In our example, the predecessors of event E(3, 5) at node 2 include both E(2, 4), and E(2, 3). But we would know by comparing clock values at node 2, that event E(2, 3) is a predecessor of event E(2, 4). So there is no need keep (2, 3) explicitly in the predecessor set. Since E(2, 4) is the latest predecessor of E(3, 5) at node 2, we know that any other event originating from node 2 whose clock value is less than 4 is also a predecessor of E(3, 5).
So we end up with a representation in which the set of predecessors can be replaced by a map of latest predecessors:
class Event {
Node origin;
Map<Node, Clock> latestPredecessors;
}
The map of latest predecessors is called a vector clock.
In what follows, I’ll use the terms vector clock and latest predecessors interchangeably. The former is standard terminology. The latter is more intention-revealing in this writeup.
vector clock of event: map of latest predecessors of an event for each node
The vector clock of event E(3, 5) in our running example is then:
(1, 1),
(2, 4),
(3, 5)
Clock values for particular nodes in a vector clock may be represented by array reference notation. For example, E.vectorClock[1] (or E.latestPredecessors[1]) refers to the clock value of event E’s latest predecessor originating at node 1.
Direct Test of Temporal Succession
To summarize, including vector clocks in the representation of events provides for an easy test of temporal succession:
For any two events E1(n1, c1) and E(n2, c2):
E1 is a predecessor of E2 (E2 succeeds E1), if and only if,
c1 <= E2.latestPredecessors[n1] (c1 <= E2.vectorClock[n1]) … (1)
Of course, for the latter comparison to be true, E2 must have a latest predecessor for node 1. To finesse this case, we may use a clock value of -1 for each node that has no predecessor in a vector clock.
Vector Clock Dominance Test of Temporal Succession
For our direct test (1) to be useful, we need to keep track of both the vector clocks of events, and the nodes of origination of events.
But what if we don’t keep track of the nodes of origination?
In that case, we can use an equivalent test based solely on vector clocks: the vector clock dominance test. A vector clock vc1 is dominated by another, vc2, if for every node, the clock value of vc1 is no greater than the corresponding clock value of vc2.
It is then easy to demonstrate that:
vector clock dominance is equivalent to temporal succession
Or, in symbols:
E1(n1, c1) <= E2(n2, c2), if and only if,
E1.latestPredecessors <= E2.latestPredecessors
where the relational symbol <= is overloaded to depict both the is-dominated-by relation between vector clocks, and the temporal succession relation between events.
The equivalence of vector clock dominance and temporal succession is a fairly simple consequence of the direct test for temporal succession established above. Here are the details for completeness.
Suppose E1(n1, c1) <= E2(n2, c2). Clearly, since E2 succeeds E1, E2’s predecessors must include all of E1’s predecessors, and so E2’s latest predecessor at each node can’t have occurred any earlier than E1’s latest predecessor. In other words, temporal succession implies vector clock dominance.
Suppose, on the other hand, that E2’s vector clock dominates E1’s: E1(c1, n1).latestPredecessors <= E2(n2, c2).latestPredecessors. Then by the definition of vector clock dominance:
E1(n1, c1).latestPredecessor[n1] <= E2(n2, c2).latestPredecessors[n1] … (2)
(this relation holds for every node, and in particular for node n1). But since the predecessor relation was defined to be reflexive, the latest predecessor of E1(n1, c1) at node n1 (its node of origination), is E1(n1, c1) itself. So (2) simplifies to:
c1 <= E2.latestPredecessors[n1]
which by (1) above implies E1 <= E2.
Propagating Vector Clocks
It is easy to see that when a new event occurs, its vector clock has to become the maximal merge of the vector-clocks of its immediate predecessors plus its own node-clock. The maximal merge of a set S of vector clocks is a vector clock in which the clock value for each node, n, is the maximum clock value for n within the vector clocks in S. This type of computation can easily be performed if we piggy-back the vector clock of a message transmittal event onto the message.
This completes our overview of the use of vector clocks in asynchronous event systems.
What’s Next
As mentioned earlier, this blog forms the backdrop for my next blog on the use of vector clocks in update-anywhere replicated data stores. We’ll see there that just as in our battlefield example, vector clocks help remove obsoleted observations from consideration for each agent, in a replicated data store, vector clocks help remove obsoleted versions of a data item from the data store for each replica of the store.

