What is happening during a sync in detail?

Category: azure sqlazure


Lisa Dnlt on Tue, 07 May 2013 10:41:44

I've got some generell questions about the detailed workflow during a sync:

1. In which order are rows inserted during a synchronisation?
Is it dependent on the order of how I choosed the tables during the definition of the sync rule or does data sync register the
relationships between the tables and inserts them correctly?

2.How does the hub wins strategy to resolve conflict work with the following scenario?

There are two users: U1 and U2 and two databases B1 and B2 and one Hub (U1 writes to B1 and U2 writes to B2).
There is a table with 5 columns and 5 rows.
The synchronisation direction is bidirectional.
First case:

U1 first changes data in row 1, 3, and 5.
U2 then changes data in row 2 and 4.

1. Sync from B1 to B2 (changes of U1)
changes in row 1, 3, and 5 should be synched to B2

2. Sync from B2 to B1 (changes of U2)
changes in row 2 and 4 should be synched to B1

So I always need two syncs to have the same data on both databases?

Are my assumptions right for this case?

Second case:

U1 first changes data in row 1 in column 1 and 2 in B1.
U2 then changes data in 1 in column 3 and 4 in B2.

What will now happen at the next sync?
My assumption is that acording to hub wins only the changes made by U1 will be synchronised.
But What about the changes made by U2?

3. I've got 2 databases with different schemas. I want to sync one table from db1 to db2, but this table does only exist within db1. Will there be problems when I create a sync group and I define in the sync rule that I do only want to sync this table to db2?

4. When I create a new table in a databse within a sync group will I have to change the sync rule to make sure that this new tale will be created and sync to the other member databases?

5. when I create a sync group is it important which database (Hub or Member) I use to define the sync rule assuming that the schemas are the same?


JuneT on Wed, 08 May 2013 02:01:51

let me have a crack at this:

1. if you have fk relationships, the service should be smart enough to sync the parent table first before the child.

2. First case - yes, assuming no further changes are occuring while you sync, it will take at least two syncs to get all changes to the member databases. just to clarify, member databases never synched with one another. they always sync with the hub. other members get the change from the hub.

2. Second case - change tracking is at the row level, it doest matter which column you updated. the order in which the members are synched is non-deterministic as well. so you can't assume U1/B1 will sync first. in your hub wins scenario,  whichever change from either U1/B1 or U2/B2 gets to the server first will be change that will be synched to the other. if U1/B1 change gets there first, U2/B2 changes lose the conflict resolution and will have the changes from U1/B1. same thing happens if U2/B2 syncs first.

3. If the table is not existing on the other member, it will be created during provisioning. when you pick a table, you cannot explicitly specify that it should only sync to a specific database. it will be synched to all members of the sync group. if you want to restrict that, create a separate sync group.

4. you have to explicitly tell the service which tables you want to sync via the sync group definition. the service will not pick up and automatically update itself to reflect schema changes. so you will have to add the new table to the sync group.

5. it doesnt really matter where you pick the table. for performance reasons though, it's faster to grab the list of tables from an Azure-based database within the same region as your sync service. If you pick from an on-premise database or an azure db in another region, that requires a round-trip from the Azure-based service to your on-premise DB/azure db to get the schema.