Resource Menu



Structure

So6 has two kind of components: So6 queues and So6 workspaces.

> Workspace

A worskspace is a directory where the user can work insulated. A workspace has a unique identifier. When the user modifies files, he generates operations. A sequence of operations generated in a workspace is called a patch.

A workspace is connected to one or more So6 queues. A workspace connection allows to generate patches, send local patches or retrieve remotes patches from connected queues. A user can

  • Commit its workspace according to a workspace connection. The system detects local operations generated since last commit, aggregate it into a patch and send it to the queue.
  • Update its workspace according to a workspace connection. The system retreive unconsummed patches from the queue and merge it with local changes.

> Queue

A queue is a timestamper that stores a sequence of patches. A Patch is identified by the pair of the first and last operation timestamp i.e. 3..118,119..201, etc… Operations are timestamped when a user send a patch to the queue. When a user create a queue, the timestamper is initialized to 0 and the sequence of patches is empty. If a user want to send a patch file containing 2 operations, then operations will be timestamped with 1 and 2 and the patch file will be stored as 1..2 in the queue.

Each Workspace connection remember the last timestamp consummed by this workspace connection.

Figure 1 - Queue and workspaces

This is illustrated in figure 1. This example presents one queue, three workspaces and three workspace connections. The queue contain 3 patches: 1..2,2..3,3..5. User "seb" has consummed all the patches, "momo" has consummed patches 1..2 and 2..3 and "hala" has consummed only the 1..2 patch. Patches has been generated using the commit operator. "Momo" and "hala" can synchronize their workspace using the update operator.

In So6, the copy-modify-merge paradigm is implemented as follows:

  • copy is done by creating and updating a workspace.
  • modify is done locally by calling regular tools on workspace files.
  • merge is done when updating if there are concurrent operations. Concurrent operations occurs when changes are made locally while others users commit remote operations. Local patch and remote patch has to be merged.

Transformational Approach

The model of transformational approach considers n sites. Each site has a copy of the shared objects. When an object is modified on one site, the operation is executed immediately and sent to others sites to be executed again. So every operation is processed in four steps:

  • generation on one site
  • broadcast to others sites
  • reception by others sites
  • execution on other sites
The execution context of a received operation op i may be different from its generation context. In this case, the integration of opi by other sites may lead to inconsistencies between replicas. We illustrate this behavior in figure.

There are two sites site 1 and site 2 working on a shared data of type String. We consider that a String object can be modified with the operation Ins(p,c) for inserting a character c at position p in the string. We suppose the position of the first character in string is 0. user 1 and user 2 generate two concurrent operations : op1 = Ins(2,f) and op2 = Ins(5,s). When op1 is received and executed on site 2, it produces the expected string "effects". But, when op2 is received on site 1, it does not take into account that op1 has been executed before it. So, we obtain a divergence between site 1 and site 2.

Incorrect integration Integration with transformation

Figure 2 - Integration of two concurrent operations

In the operational transformation approach, received operations are transformed according to local concurrent operations and then executed. This transformation is done by calling transformation functions. A transformation function T takes two concurrent operations op1 and op2 defined on the same state s and returns op' 1. op' 1 is equivalent to op1 but defined on a state where op2 has been applied. We illustrate the effect of a transformation function in figure.

When op2 is received on site 1, op2 needs to be transformed according to op1. The integration algorithm calls the transformation function as follows :

  • T(Ins(5,s), Ins(2,f)) = Ins(6,s) [T (op2, op1) = op'2]
The insertion position of op2 is incremented because op1 has inserted an f before s in state effect. Next, op' 2 is executed on site 1. In the same way, when op1 is received on site 2, the transformation algorithm calls:
  • T(Ins(2,f), Ins(5,s)) = Ins(2,f) [T (op1, op2) = op'1]
In this case the transformation function returns op' 1=op1 because, f is inserted before s. Intuitively we an write the transformation function as follows:
T(Ins(p 1,c 1),Ins(p 2,c 2)):- 
  if (p 1 < p 2) then
    return Ins(p 1, c 1)
  else
    return Ins(p 1+1, c 1)
  endif

This example makes it clear that the transformational approach defines two main components: the integration algorithm and the transformation functions . The Integration algorithm is responsible for receiving, broadcasting and executing operations. It is independent of the type of shared data, it calls transformation functions when needed. The transformation functions are responsible for merging two concurrent operations defined on the same state. They are specific to the type of shared data (String in our example).

A more theoretical model is defined in // ref Sun.ea:98,Suleiman.ea:98,Sun.ea:02,sun02dec .

To be correct, an integration algorithm has to ensure three general properties:

  • Convergence : When the system is idle (no operation in pipes), all copies are identical.
  • Causality : If on one site, an operation op2 has been executed after op1, then op2 must be executed after op1 in all sites.
  • Intention preservation : If an operation opi has to be transformed into op' i, then the effects of op' i have to be equivalent to opi.
To ensure these properties, it has been proved that the underlying transformation functions must satisfy two conditions:
  • The condition C1 defines a state equivalence . The state generated by the execution of op 1 followed by T(op2,op1) must be the same as the state generated by the execution of op2 followed by T(op1,op2) :
C1: op1 • T(op 2,op 1) = op2 • T(op 1,op 2)
  • The condition C2 ensures that the transformation of an operation according to a sequence of concurrent operations does not depend on the order in which operations of the sequence are transformed :
C2: T(op 3, op 1 • T(op 2,op 1)) = T(op 3, op 2 • T(op 1,op 2))

In order to use the transformational model, we must follow these steps :

  • Choose a integration algorithm. Depending of the algorithm, C2 may be required or not on underlying transformation functions
  • Define shared data types with their operations
  • Write transformation functions for all combination of operations.
For example, on a string object with Ins(p,c), Del(p), we have to define :
T(Ins(p 1,c 1),Ins(p 2,c 2)):-
T(Ins(p 1,c 1),Del(p 2)):-
T(Del(p 1),Ins(p 2,c 2)):-
T(Del(p 1),Del(p 2)):-
  • Prove the required conditions on these transformation functions.
See also Operational Transformation.


So6 Integration Algorithm

Sync(log,N s) :-
  while ((op r = getOp(N s+1))!= )
    for (i=0;i<log.size();i++)
      op l=log[i];
      log[i]=T(op l,op i)
      op' i=T(op i,op l);
    endfor
    execute(op'i)
    N s=N s+1
  endwhile

for (i=0;i<log.size();i++) op' l=log[i]; if send(op' l,N s+1) then N s=N s+1 else error 'need to synchronize' endif endfor

Generic synchronization algorithm

In the transformational approach, the integration algorithm has the responsibility of receiving, integrating, broadcasting and executing operations. Among the existing algorithms, SOCT4 deferred broadcast is the most suitable algorithm for our synchronization needs. SOCT4 is based on a continuous global order of operations and requires only C1 to be verified by transformation functions. Each operation is sent with a unique global timestamp. An operation on a site S with a given timestamp cannot be sent if all the preceding operations based on the timestamp order have been received and executed.

Our synchronization algorithm based on SOCT4 is presented in figure

Synchronizing a site S takes two parameters: log and N s. log is the sequence of operations executed locally since the last synchronization. N s is an integer which contains the timestamp of the last operation received or sent by site s. We define two functions:

  • getOp(int ticket) : retrieves the operation identified by the timestamp ticket. If no operation is available, getOp return Ø
  • send(Operation op,int ticket) : sends a local operation with the timestamp ticket. If ticket already exists, it means that a concurrent synchronization is in progress. In this case, the operation send returns false. The current state is not corrupted, it requires just to start again another synchronization.

Suppose we want to synchronize two sites as illustrated in figure. At the beginning, each site has N s=0. site 1 performed two local operations op 1, op 2, and site 2 performed two concurrent local operations op 3, op 4.

Figure 3 - Scenario of integration

1. At point s1, site 1 synchronizes. It calls sync([op 1,op 2],0). There is no concurrent operation available, so we just send op 1,op 2 as it is to site 2. Now, N s=2 on site 1.

2. At point s2, site 2 synchronizes by calling sync([op 3,op 4],0). The following transformation functions are called:

op'' 1, op'' 2 are executed on site 2. op'' 1, op'' 2 are sent to others sites. Now N s=4 on site 2.

3. At point s3, site 1 synchronizes again by calling sync([],2). There is no more local concurrent operations, so remote operations are executed without transformation on site 2 and N s=4.

4. After point s3, site 1 has executed the following sequence:

and site 2 has executed the following equivalent sequence:

This equivalence is ensured if transformation functions verify C1. It is clear in this example that conflicts detection and conflicts resolution are delegated to transformation functions. However, the problem is now simpler. A transformation function detects and resolves conflicts for one combination of two concurrent operations defined on the same state . If one transformed operation has an effect on the next operation, the cascading effects are handled by the integration algorithm.

This algorithm is a safe generic synchronizer if underlying transformation functions verify condition C1. It preserves convergence, causality and intention.



Last edited by Guest at Aug 28, 2006 10:34 AM - Edit content - View source