Replication in SQL Server is a object level high availability feature. Normally in other high availability technologies we replicating entire database but in this we can replicate database objects like tables, stored procedures, triggers, functions etc.
Do you want to replicate your data from one server to another server? Or do you want to migrate data from old version to newer version in your database?
Then SQL Server replication is the perfect solution for your problem.
You can replicate database objects like tables, stored procedures, triggers, cursors etc. from one server to another server.
Replication is mainly used as a data migration technology although it is one of the high availability options. The concept of replication is to migrate data and database objects from one server to another server.
Why is replication a powerful migration technology?
1. Replication can be used for multiple purposes
We can maintain 2 servers, both databases are completely synchronized, whatever changes are being done on the primary server goes to the secondary server.
Load distribution is possible in replication as both primary and secondary servers are available to users. We can split users among the 2 servers.
After initial synchronization between 2 servers, database replication tracks the changes, and these changes migrate from source server to subscriber server instead of the entire database.
2. Replication has a filter mechanism, while moving data from one server to another server we can use a filter or condition like dept equal to sales or emp salary greater than 100, while migrating data replication checks whether data meeting condition or not. If met data will migrate, if not it ignores the data. This is completely lack in other migration technologies like backup and restore, import and export
3. Replication can be done on object level, we can select tables, stored procedures, functions as well as along with data.
4. Replication can be configured between heterogeneous databases. We can migrate data from SQL Server to Non-SQL Servers like oracle, DB2, Teradata etc.
5. Replication has scheduling options; we can define schedules with high latency or low latency.
How Replication works
Replication requires 3 servers and 3 databases along with it require articles, publications, and agents.
Publisher is a source server containing primary database called publication database here we can select entire database or few tables, stored procedures of database.
Subscriber is a destination server whatever data coming it stores in subscription database
Entire replication process carried in distribution server. It establish connection with publisher identifies data ready for replication copies data to subscriber.
- Local distributor
Publisher and distributor on same machine
Publisher and distributor on different machines.
Articles are the database objects that being replicated.
Publications are collection of articles.
Agents are executables (jobs) used to perform the replication operations.
Replication uses news paper terminology, publisher (News printing person) has publication database in that articles means tables, stored procedures will be available for migration.
Distributor (Delivery person) collect the articles from publication database and forward the same data to subscriber using agents and distribution database
Subscriber (Receiver) receives all the data and keeps in subscription database
When we configure replication, we need to get details
- Replication model
- Data flow
- Publisher, subscriber, and local distributor
- Publisher, subscriber, and remote distributor
- Central publisher and multiple subscribers
- Multiple publisher and central subscriber
- Hybrid (combination of any 2 above models)
We need get the details whether flow is one way or two way
Data flows from publisher to subscriber
If both servers in use need to configure replication. By way as the data moves from publisher to subscriber and get data from subscriber to publisher
Latency is time interval, schedule the data movement either continuous or hourly or weekly.
Types of replication
- Snapshot replication
- Transaction replication
- Peer to peer replication
- Merge replication
Snapshot is going to make exact copy of an existing database; it is one time replication. 2 agents are created in snapshot replication.
- Snapshot agent
It perform snapshot generation
- Distribution agent
Snapshot generation process
- Snapshot agent establishes connection between publisher and distributor.
- Snapshot agent identifies what are objects participated in replication process applies schema locks on those objects while copying structure.
- Snapshot generates script of tables and output will be copies as ‘tablename.sch’. Snapshot folder is a storage location
Data will be copied with .bcp
Triggers will be copied with .trg
Indexes will be copies with .idx
Other constraints copy with .pre
If there are 5 tables participated in replication, for each table 5 files will be generated total 25 files will be generated and these files will copy from publisher database to snapshot folder
- Once snapshot generated automatically locks will be removed on objects and breaks connection between publisher and distributor.
- Distribution agent works from here. It goes to snapshot folder identifies tables and creates same files in subscriber. Once files copy done to subscriber the relation between distributor and subscriber also breaks.
When we go for snapshot
- When source database has constant data
- When huge latency is accepted
- When we do not want changes in subscriber regularly.
Transaction Replication is one way replication. Initial synchronization will be done using snapshot replication. Snapshot generation process is same in all types of replications. Ongoing changes should be tracked and copy those changes to subscriber continuously using log reader agent.
In transaction replication changes will be tracked through log files. While copying modifications locks will be applied once schema is copies locks will be released, log read will read ongoing changes by scanning log file of the database. From here distribution agent will copy data to subscriber. This we call as concurrent snapshot.
Peer to peer replication
Peer to peer replication is introduced in SQL Server 2005 and enhanced in SQL Server 2008, Each server will act as a publisher and a subscriber. Data will move to all the servers
We have conflicts when one server fails in this replication. In SQL Server 2008 we can add or remove servers without interrupting the other servers. Conflicts handling introduced in SQL Server 2008.
Merge replication allows publisher and subscriber to make changes to data independently and then merge the results. We have conflicts with merge replication. It uses merge agent.
What are the types of replication in SQL Server?
1. Snapshot Replication
2. Transactional Replication
3. Peer to Peer Replication
4. Merge Replication
5. Transactional Replication with updatable subscription (this feature was deprecated from SQL Server 2012)
How Replication works?
Replication requires 3 servers publisher, distributor and subscriber along with it requires articles, publications and agents. Publisher is a source server containing primary database called publication database here we can select entire database or few tables, stored procedures of database
What are the benefits of replication in SQL Server?
High availability, Scalability, Migration, we have filtering mechanism like while migrating it checks the whether data is meeting the condition or not. Can be configured between heterogenous databases like oracle, DB2, Teradata etc
What is the difference between log shipping and replication?
Log shipping is database level disaster recovery option while replication is object level high availability feature