Many tools from different research areas deal with synchronization. We compare our work with file synchronizers, PDA synchronizers, configuration management tools, synchronization in distributed systems, and replication in database systems.

Synchronizers

The overall goal of a file synchronizer is to detect conflicting updates and propagate non-conflicting ones. To achieve this goal, the semantics of the file system primitives must be well defined, as in Unison. However, this approach has several drawbacks:

  • The approach is restricted to a file system.
  • Synchronization is often limited to two replicas.
  • Reconciliation is coarse grained. It does not attempt to synchronize file contents.
  • A general correctness criterion is not defined.
  • The system interacts with the user each time a conflict is detected. If there are 100 conflicts, the system will interact with the user 100 times.
Compared with this kind of synchronizer, So6 handles n replicas, ensures convergence, causality and intention preservation, synchronizes file contents, and resolves conflicts automatically in all cases.

PDA synchronizers

ActiveSync, HotSync and I-Sync are now widely used to synchronize data between desktop computers and PDAs. These synchronizers handle several kinds of data, such as address books, calendars, tasks, notes, bookmarks and files.

However, this approach is an extension of the file synchronizer approach: it detects conflicting updates and propagates non-conflicting ones. It therefore has exactly the same problems: no correctness criterion and problems with conflict resolution.

The genericity of the transformational approach makes it easy to write such synchronizers. We can define a calendar type with three operations: AddRendezVous, RemoveRendezVous and UpdateRendezVous. We then define all the transformation functions and prove that they satisfy condition C1. The result is a safe synchronizer that ensures convergence, causality and intention preservation.
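The calendar example can be sketched in Python. This is a minimal illustration, not the So6 implementation. It assumes that C1 is the usual convergence condition of operational transformation: for two operations op1 and op2 generated on the same state, executing op1 followed by T(op2, op1) yields the same state as executing op2 followed by T(op1, op2). The `Op` record, the `enabled` preconditions, and the "removal wins, otherwise highest (site, text) wins" policy are all hypothetical choices made for the example, and the exhaustive check over a few small calendars stands in for a real proof.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Op:
    kind: str       # "add", "remove", "update" or "nop"
    rid: str = ""   # rendezvous identifier
    text: str = ""  # rendezvous description (unused by remove and nop)
    site: int = 0   # issuing site, used only to break ties

NOP = Op("nop")

def enabled(cal, op):
    """Precondition: add needs a fresh id, remove/update an existing one."""
    if op.kind == "add":
        return op.rid not in cal
    if op.kind in ("remove", "update"):
        return op.rid in cal
    return True

def apply(cal, op):
    """Return a new calendar (dict id -> text) with op applied."""
    cal = dict(cal)
    if op.kind in ("add", "update"):
        cal[op.rid] = op.text
    elif op.kind == "remove":
        cal.pop(op.rid, None)
    return cal

def transform(op1, op2):
    """Rewrite op1 so that it can run after the concurrent op2."""
    if NOP in (op1, op2) or op1.rid != op2.rid:
        return op1          # independent operations commute
    if op2.kind == "remove":
        return NOP          # the rendezvous is already gone
    if op1.kind == "remove":
        return op1          # assumed policy: removal wins over an update
    # add/add or update/update: the higher (site, text) pair wins
    if (op1.site, op1.text) > (op2.site, op2.text):
        return Op("update", op1.rid, op1.text, op1.site)
    return NOP

def satisfies_c1(cal, op1, op2):
    left = apply(apply(cal, op1), transform(op2, op1))
    right = apply(apply(cal, op2), transform(op1, op2))
    return left == right

def check_c1():
    """Brute-force C1 over small calendars and every enabled pair."""
    states = [{}, {"r1": "lunch"}, {"r1": "lunch", "r2": "review"}]
    ops = [Op(kind, rid, text, site)
           for kind in ("add", "remove", "update")
           for rid in ("r1", "r2")
           for text in ("lunch", "review")
           for site in (1, 2)]
    return all(satisfies_c1(cal, a, b)
               for cal in states
               for a, b in product(ops, repeat=2)
               if a.site != b.site and enabled(cal, a) and enabled(cal, b))
```

Calling `check_c1()` exercises every pair of concurrent operations on the sample calendars. A different conflict policy can be tried by editing `transform` alone and re-running the check.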

CM and Merge Tools

In configuration management environments, users can work in parallel, let their data diverge, and reconcile later using the copy-modify-merge paradigm. If we look more closely at how this is done, we observe that reconciliation relies on tight cooperation between the version manager and the merge tools.

  • When a reconciliation is required (most often when a user updates his workspace), the version manager provides the required versions to the merge tools. The merge is done locally, in the workspace of the user.
  • Merge tools extract concurrent logs of operations from the different versions using diff algorithms. Of course, diff algorithms are specific to data types.
  • Finally, concurrent operations are merged using ad hoc algorithms specific to data types.
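The log extraction step for text files can be sketched as follows. This is a hedged sketch using Python's standard `difflib` module; it does not reproduce the algorithm of any particular merge tool. The `extract_log` function and its `("del" | "ins", position, line)` operation format are hypothetical, and positions refer to line indices in the base version.

```python
import difflib

def extract_log(base, version):
    """Derive a list of line operations that turn `base` into `version`.

    Both arguments are lists of lines. Each operation is a tuple
    (kind, position_in_base, line) with kind "del" or "ins".
    """
    ops = []
    matcher = difflib.SequenceMatcher(a=base, b=version, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            ops.extend(("del", i1, line) for line in base[i1:i2])
        if tag in ("insert", "replace"):
            ops.extend(("ins", i1, line) for line in version[j1:j2])
    return ops
```

Running `extract_log` once per workspace against the common base version gives the two concurrent logs that a merge step would then have to combine.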
The transformational model is more general, more uniform and safer than this model. In the configuration management approach, each merge tool has its own merge algorithm: one tool merges two divergent file systems, another merges two divergent text files, and yet another merges two divergent XML files. These tools may not be consistent with one another, because they do not necessarily apply the same strategy. For example, with CVS, compensation is used by the text file merge tool but not by the file system merge tool.

In the transformational approach, the merge algorithm is shared by all transformation functions. It preserves Convergence, Causality and Intention (CCI) if the underlying transformation functions satisfy condition C1. In this way, we can extend the synchronizer by adding new transformation functions without violating the CCI properties.
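To illustrate how one merge algorithm can be reused across data types, here is a minimal sketch of a two-log merge that takes the transformation function as a parameter. It is not the So6 algorithm: it handles only two concurrent logs generated from the same state, and it assumes the supplied function satisfies C1. The set data type, its "add wins over concurrent del" policy, and all names (`merge`, `replay`, `NOP`) are assumptions made for the example.

```python
NOP = ("nop", None)

def apply(state, op):
    """Apply ("add", x), ("del", x) or NOP to a frozenset."""
    kind, item = op
    if kind == "add":
        return state | {item}
    if kind == "del":
        return state - {item}
    return state

def transform(op1, op2):
    """Rewrite op1 to run after the concurrent op2 (add wins over del)."""
    if op1[0] == "del" and op2[0] == "add" and op1[1] == op2[1]:
        return NOP
    return op1

def merge(local_log, remote_log, transform):
    """Generic merge: knows nothing about the data type.

    Returns (remote_out, local_out): the remote operations rewritten to
    run after local_log, and the local ones rewritten to run after
    remote_log.
    """
    local = list(local_log)
    remote_out = []
    for r in remote_log:
        next_local = []
        for l in local:
            # Transform both operations against each other, pairwise.
            r, l = transform(r, l), transform(l, r)
            next_local.append(l)
        remote_out.append(r)
        local = next_local
    return remote_out, local

def replay(state, log):
    for op in log:
        state = apply(state, op)
    return state
```

Supporting a new data type in this sketch means supplying another `transform` (and `apply`); `merge` itself stays unchanged.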

Distributed systems

Maintaining the consistency of shared data is a major issue in distributed systems. Coda, Bayou and Ficus allow users to work while disconnected and use reconciliation procedures when they reconnect.

Bayou first used an epidemic algorithm to propagate changes between weakly consistent replicas. When a conflict is detected, the merge procedures associated with the operations are executed. If a merge procedure cannot find a solution, conflict resolution is delegated to users. Bayou uses a total update ordering. Other systems use a partial update ordering and take advantage of update commutativity. Causality is used to determine the partial ordering.
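One common way to derive a causal partial order is to stamp each update with a vector clock. The text above does not say which mechanism Coda, Bayou or Ficus use, so the following is only an illustrative sketch under that assumption; clocks are represented as dicts from site name to counter, and the function names are hypothetical.

```python
def leq(a, b):
    """Componentwise comparison of two vector clocks (missing entry = 0)."""
    return all(a.get(site, 0) <= b.get(site, 0) for site in set(a) | set(b))

def happened_before(a, b):
    """True if the update stamped `a` causally precedes the one stamped `b`."""
    return leq(a, b) and not leq(b, a)

def concurrent(a, b):
    """True if neither update causally precedes the other."""
    return not leq(a, b) and not leq(b, a)
```

Updates for which `concurrent` holds are the ones that require a merge procedure or a transformation; causally ordered updates can simply be applied in order.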

Distributed systems and the transformational approach are similar in many respects: both detect conflicts, merge procedures and transformation functions look alike, commutativity and condition C1 are quite similar, and causality is used in both. However, the transformational approach can transform operations. C1 is a sort of "transformational commutativity", which makes it possible to compute more complex convergence states. Unlike merge procedures, transformation functions ensure convergence in all cases.

IceCube is a generic approach for reconciling divergent data. IceCube does not define a general correctness criterion for synchronization but uses semantic constraints that the reconciliation algorithm has to preserve. IceCube considers two kinds of constraints:

  • Static constraints can be evaluated without using the state of the replicas. Commutativity of operations can be expressed as a static constraint.
  • Dynamic constraints can refer to the state of the replicas.
Basically, IceCube explores all possible combinations of concurrent actions. First, IceCube rejects all combinations that violate static constraints. For the others, IceCube simulates their integration on the replicas and rejects the combinations that violate dynamic constraints. The resulting combinations are ranked and proposed to the user.
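The two-stage filtering described above can be sketched as a brute-force search. This is not IceCube's actual scheduler or API: it simply enumerates every ordering of the actions, and the ranking step is left out. The function name, its parameters, and the bank account example are hypothetical.

```python
from itertools import permutations

def candidate_schedules(actions, initial, apply, static_ok, dynamic_ok):
    """Return (schedule, final_state) for every ordering that survives.

    static_ok(schedule) is checked without looking at any state;
    dynamic_ok(state, action) is checked while simulating the schedule.
    """
    survivors = []
    for schedule in permutations(actions):
        if not static_ok(schedule):
            continue                      # stage 1: static constraints
        state, valid = initial, True
        for action in schedule:           # stage 2: simulated integration
            if not dynamic_ok(state, action):
                valid = False
                break
            state = apply(state, action)
        if valid:
            survivors.append((schedule, state))
    return survivors

# Example: an account balance that must never become negative.
def apply_amount(balance, action):
    kind, amount = action
    return balance + amount if kind == "deposit" else balance - amount

def never_negative(balance, action):
    return apply_amount(balance, action) >= 0
```

With n actions the loop visits n! orderings, which is the combinatorial cost that static constraints are meant to reduce.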

This approach is interesting because IceCube looks for the combinations of concurrent operations that minimize reconciliation conflicts. On this point, the transformational approach may not find the optimal reconciliation. On the other hand, IceCube has some intrinsic drawbacks:

  • Combinatorial explosion can occur during the first stage of reconciliation, even if static constraints restrict the number of possible schedules.
  • Constraints are specific to applications and have to be defined.
  • IceCube is interactive.
  • IceCube does not transform operations. What happens if there are just two concurrent operations, mkfile("/a") and mkdir("/a")? All possible schedules are bad. In this situation, IceCube will just ask users what to do, like a classical file synchronizer.

Database Systems

Replication and database consistency have been investigated extensively. Replication conflicts can occur in a replication environment that permits concurrent updates to the same data at multiple sites. If two transactions working on two different replicas update the same row at the same time, a conflict can occur.

Oracle provides built-in methods for resolving update conflicts. The "latest timestamp" method resolves a conflict in favor of the most recent update. The "additive" method adds the difference of two conflicting "update value" operations to the current value. The "overwrite" method replaces the current value with the new value. Users can also define their own conflict resolution methods. If convergence cannot be achieved, a notification is sent to the administrator. Some built-in resolution methods seem to preserve convergence, but not for all kinds of conflicts (uniqueness and delete/update) and not for all configurations of replicas. The transformational approach is more general than replica management in database systems: we can implement the built-in or user-defined resolution methods of Oracle as transformation functions and formally prove convergence.
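The three resolution methods can be simulated with plain functions to see when two replicas converge. This sketch does not use or reproduce any Oracle API; the `Update` record, the resolver signatures and the two-site `exchange` scenario are hypothetical, and the additive rule is read here as "current value plus (new value minus old value)".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Update:
    old: int        # value the update was based on
    new: int        # value the update wrote
    timestamp: int  # time at which the update was made

def latest_timestamp(value, ts, incoming):
    """Keep whichever value carries the most recent timestamp."""
    if incoming.timestamp > ts:
        return incoming.new, incoming.timestamp
    return value, ts

def additive(value, ts, incoming):
    """Add the incoming update's delta to the current value."""
    return value + (incoming.new - incoming.old), max(ts, incoming.timestamp)

def overwrite(value, ts, incoming):
    """Replace the current value with the incoming one."""
    return incoming.new, incoming.timestamp

def exchange(resolve, upd_a, upd_b):
    """Each site applies its own update, then resolves the other's.

    Returns the final values at site A and site B.
    """
    value_a, _ = resolve(upd_a.new, upd_a.timestamp, upd_b)
    value_b, _ = resolve(upd_b.new, upd_b.timestamp, upd_a)
    return value_a, value_b
```

In this two-site scenario, with site A changing 100 to 110 and site B changing 100 to 70, the timestamp and additive resolvers leave both sites with the same value, whereas applying `overwrite` symmetrically at both sites leaves them swapped. This matches the remark that convergence depends on the kind of conflict and on the replica configuration.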



Last edited by Pascal Molli at Feb 2, 2006 11:33 AM