StructureSo6 has two kind of components: So6 queues and So6 workspaces.
> WorkspaceA worskspace is a directory where the user can work insulated. A workspace has a unique identifier. When the user modifies files, he generates operations. A sequence of operations generated in a workspace is called a patch.A workspace is connected to one or more So6 queues. A workspace connection allows to generate patches, send local patches or retrieve remotes patches from connected queues. A user can
- Commit its workspace according to a workspace connection. The system detects local operations generated since last commit, aggregate it into a patch and send it to the queue.
- Update its workspace according to a workspace connection. The system retreive unconsummed patches from the queue and merge it with local changes.
> QueueA queue is a timestamper that stores a sequence of patches. A Patch is identified by the pair of the first and last operation timestamp i.e. 3..118,119..201, etc… Operations are timestamped when a user send a patch to the queue. When a user create a queue, the timestamper is initialized to 0 and the sequence of patches is empty. If a user want to send a patch file containing 2 operations, then operations will be timestamped with 1 and 2 and the patch file will be stored as 1..2 in the queue.Each Workspace connection remember the last timestamp consummed by this workspace connection. Figure 1 - Queue and workspacesThis is illustrated in figure 1. This example presents one queue, three workspaces and three workspace connections. The queue contain 3 patches: 1..2,2..3,3..5. User "seb" has consummed all the patches, "momo" has consummed patches 1..2 and 2..3 and "hala" has consummed only the 1..2 patch. Patches has been generated using the commit operator. "Momo" and "hala" can synchronize their workspace using the update operator.In So6, the copy-modify-merge paradigm is implemented as follows:
- copy is done by creating and updating a workspace.
- modify is done locally by calling regular tools on workspace files.
- merge is done when updating if there are concurrent operations. Concurrent operations occurs when changes are made locally while others users commit remote operations. Local patch and remote patch has to be merged.
Transformational ApproachThe model of transformational approach considers n sites. Each site has a copy of the shared objects. When an object is modified on one site, the operation is executed immediately and sent to others sites to be executed again. So every operation is processed in four steps:
- generation on one site
- broadcast to others sites
- reception by others sites
- execution on other sites
Figure 2 - Integration of two concurrent operationsIn the operational transformation approach, received operations are transformed according to local concurrent operations and then executed. This transformation is done by calling transformation functions. A transformation function T takes two concurrent operations op1 and op2 defined on the same state s and returns op' 1. op' 1 is equivalent to op1 but defined on a state where op2 has been applied. We illustrate the effect of a transformation function in figure.When op2 is received on site 1, op2 needs to be transformed according to op1. The integration algorithm calls the transformation function as follows :
- T(Ins(5,s), Ins(2,f)) = Ins(6,s) [T (op2, op1) = op'2]
- T(Ins(2,f), Ins(5,s)) = Ins(2,f) [T (op1, op2) = op'1]
This example makes it clear that the transformational approach defines two main components: the integration algorithm and the transformation functions . The Integration algorithm is responsible for receiving, broadcasting and executing operations. It is independent of the type of shared data, it calls transformation functions when needed. The transformation functions are responsible for merging two concurrent operations defined on the same state. They are specific to the type of shared data (String in our example).A more theoretical model is defined in // ref Sun.ea:98,Suleiman.ea:98,Sun.ea:02,sun02dec .To be correct, an integration algorithm has to ensure three general properties:
T(Ins(p 1,c 1),Ins(p 2,c 2)):- if (p 1 < p 2) then return Ins(p 1, c 1) else return Ins(p 1+1, c 1) endif
- Convergence : When the system is idle (no operation in pipes), all copies are identical.
- Causality : If on one site, an operation op2 has been executed after op1, then op2 must be executed after op1 in all sites.
- Intention preservation : If an operation opi has to be transformed into op' i, then the effects of op' i have to be equivalent to opi.
- The condition C1 defines a state equivalence . The state generated by the execution of op 1 followed by T(op2,op1) must be the same as the state generated by the execution of op2 followed by T(op1,op2) :
C1: op1 • T(op 2,op 1) = op2 • T(op 1,op 2)
- The condition C2 ensures that the transformation of an operation according to a sequence of concurrent operations does not depend on the order in which operations of the sequence are transformed :
C2: T(op 3, op 1 • T(op 2,op 1)) = T(op 3, op 2 • T(op 1,op 2))
- Choose a integration algorithm. Depending of the algorithm, C2 may be required or not on underlying transformation functions
- Define shared data types with their operations
- Write transformation functions for all combination of operations.
T(Ins(p 1,c 1),Ins(p 2,c 2)):- T(Ins(p 1,c 1),Del(p 2)):- T(Del(p 1),Ins(p 2,c 2)):- T(Del(p 1),Del(p 2)):-
- Prove the required conditions on these transformation functions.
So6 Integration Algorithm
Generic synchronization algorithm In the transformational approach, the integration algorithm has the responsibility of receiving, integrating, broadcasting and executing operations. Among the existing algorithms, SOCT4 deferred broadcast is the most suitable algorithm for our synchronization needs. SOCT4 is based on a continuous global order of operations and requires only C1 to be verified by transformation functions. Each operation is sent with a unique global timestamp. An operation on a site S with a given timestamp cannot be sent if all the preceding operations based on the timestamp order have been received and executed.Our synchronization algorithm based on SOCT4 is presented in figureSynchronizing a site S takes two parameters: log and N s. log is the sequence of operations executed locally since the last synchronization. N s is an integer which contains the timestamp of the last operation received or sent by site s. We define two functions:
Sync(log,N s) :- while ((op r = getOp(N s+1))!= ) for (i=0;i<log.size();i++) op l=log[i]; log[i]=T(op l,op i) op' i=T(op i,op l); endfor execute(op'i) N s=N s+1 endwhile for (i=0;i<log.size();i++) op' l=log[i]; if send(op' l,N s+1) then N s=N s+1 else error 'need to synchronize' endif endfor
- getOp(int ticket) : retrieves the operation identified by the timestamp ticket. If no operation is available, getOp return Ø
- send(Operation op,int ticket) : sends a local operation with the timestamp ticket. If ticket already exists, it means that a concurrent synchronization is in progress. In this case, the operation send returns false. The current state is not corrupted, it requires just to start again another synchronization.
Suppose we want to synchronize two sites as illustrated in figure. At the beginning, each site has N s=0. site 1 performed two local operations op 1, op 2, and site 2 performed two concurrent local operations op 3, op 4.
Figure 3 - Scenario of integration