The other day at work, the following question came up: if you initialize replication from a backup, can you drop tables that don’t belong to the publication (IE those that are not articles) without breaking replication? TL;DR Yes.
Why did this question come up? Actually, there was a good reason—data security. After we initialized the subscription via a backup, since a snapshot would be large and difficult to perform, we’d like to get rid of the data that isn’t going to be used in the replication solution. Dropping those extraneous tables would help prevent unauthorized data access, and would keep the subscriber focused on only the data that we needed. It would also allow for other management benefits, such as a smaller database size. (Or the inverse, more space for replicated transactions!)
I’m rather new to replication as a whole, and so I wasn’t sure if it was possible, or if it would break the replication given that you’d be creating some transactions on the subscriber. So I did the following experiment to prove that you could drop non-replicated tables without affecting the movement of transactions from publisher to subscriber.
Setting up our Publisher
I began by setting up a new database,
ReplOrigin, with two tables inside of it,
ReplData. Yes, all those names are pretty loaded in meaning –
ReplOrigin is meant to be my publisher,
NonReplData will hold a small amount of data that is not included in the publication (but will be restored by the backup), and
ReplData will be the table that is included as an article. While we control the data entering into
NonReplData, we’re simply generating 10,000 random-ish rows for the
-- Run this part to set up the ReplOrigin database USE master; GO IF NOT EXISTS ( SELECT d.name FROM sys.databases d WHERE d.name = 'ReplOrigin' ) CREATE DATABASE ReplOrigin; GO USE ReplOrigin; GO IF NOT EXISTS ( SELECT s.name + '.' + t.name FROM sys.schemas s JOIN sys.tables t ON s.schema_id = t.schema_id WHERE s.name = 'dbo' AND t.name = 'NonReplData' ) CREATE TABLE dbo.NonReplData ( Id INT NOT NULL IDENTITY PRIMARY KEY, DateCreated DATE NOT NULL DEFAULT GETDATE(), Name VARCHAR(40) NOT NULL ); IF NOT EXISTS ( SELECT s.name + '.' + t.name FROM sys.schemas s JOIN sys.tables t ON s.schema_id = t.schema_id WHERE s.name = 'dbo' AND t.name = 'ReplData' ) CREATE TABLE dbo.ReplData ( Id INT NOT NULL IDENTITY PRIMARY KEY, DateProcessed DATETIME NOT NULL DEFAULT GETDATE(), Value INT NOT NULL ); GO BEGIN TRAN; INSERT INTO dbo.NonReplData (Name) VALUES ('Kendra Little'), ('Matthew Velic'), ('Jeremiah Peschka'); COMMIT; BEGIN TRAN; INSERT INTO dbo.ReplData (Value) SELECT TOP 10000 CAST(RAND(CHECKSUM(NEWID())) * 1000000 + 1 AS INT) AS Value FROM sys.all_columns c1 CROSS JOIN sys.all_columns c2; COMMIT; GO SELECT Id, DateCreated, Name FROM dbo.NonReplData; SELECT COUNT(*) AS ReplOriginDataCount FROM dbo.ReplData;
You can see in the following screenshots that I now have
ReplOrigin listed as a database, and that I have some results from my initial table inserts.
Setting up Replication
While I initially used the GUI to help guide my work, I generated a script that I included here, albeit formatted more vertically. What’s notable is that I’ve set the
@allow_initialize_from_backup argument equal to
true in the
sp_addpublication procedure near the bottom of the setup script. In order to run this in your test or personal lab environment, you’ll have to change the following details; I’d recommend a Find-and-Replace.
- I’m using my computer,
Matt-Rig, as a local distributor.
- My data folder resides in
- My transaction logs reside in
- I created my snapshot folder at
-- Doing the replication bits -- Installing the server as a Distributor. USE master; GO EXECUTE sp_adddistributor @distributor = N'MATT-RIG', @password = N''; GO EXECUTE sp_adddistributiondb @database = N'distribution', @data_folder = N'S:\MSSQL\Data', @log_folder = N'S:\MSSQL\TLogs', @log_file_size = 2, @min_distretention = 0, @max_distretention = 72, @history_retention = 48, @security_mode = 1 GO USE [distribution]; GO IF NOT EXISTS ( SELECT * FROM sysobjects WHERE name = 'UIProperties' AND type = 'U ' ) BEGIN; CREATE TABLE UIProperties ( id INT ); END; IF EXISTS ( SELECT * FROM ::fn_listextendedproperty('SnapshotFolder', 'user', 'dbo', 'table', 'UIProperties', null, null) ) EXECUTE sp_updateextendedproperty N'SnapshotFolder', N'S:\MSSQL\ReplData', 'user', dbo, 'table', 'UIProperties'; ELSE EXECUTE sp_addextendedproperty N'SnapshotFolder', N'S:\MSSQL\ReplData', 'user', dbo, 'table', 'UIProperties'; GO EXECUTE sp_adddistpublisher @publisher = N'Matt-Rig', @distribution_db = N'distribution', @security_mode = 1, @working_directory = N'S:\MSSQL\ReplData', @trusted = N'false', @thirdparty_flag = 0, @publisher_type = N'MSSQLSERVER'; GO USE [ReplOrigin]; GO EXECUTE sp_replicationdboption @dbname = N'ReplOrigin', @optname = N'publish', @value = N'true'; GO -- Adding the transactional publication USE [ReplOrigin]; GO EXECUTE sp_addpublication @publication = N'ReplOriginPub', @description = N'Transactional publication of database ''ReplOrigin'' from Publisher ''MATT-RIG''.', @sync_method = N'concurrent', @retention = 0, @allow_push = N'true', @allow_pull = N'true', @allow_anonymous = N'true', @enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true', @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous', @allow_subscription_copy = N'false', @add_to_active_directory = N'false', @repl_freq = N'continuous', @status = N'active', @independent_agent = N'true', @immediate_sync = N'true', @allow_sync_tran = N'false', @autogen_sync_procs = N'false', @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1, @allow_initialize_from_backup = N'true', @enabled_for_p2p = N'false', @enabled_for_het_sub = N'false'; GO EXECUTE sp_addpublication_snapshot @publication = N'ReplOriginPub', @frequency_type = 1, @frequency_interval = 0, @frequency_relative_interval = 0, @frequency_recurrence_factor = 0, @frequency_subday = 0, @frequency_subday_interval = 0, @active_start_time_of_day = 0, @active_end_time_of_day = 235959, @active_start_date = 0, @active_end_date = 0, @job_login = null, @job_password = null, @publisher_security_mode = 1; USE [ReplOrigin]; GO EXECUTE sp_addarticle @publication = N'ReplOriginPub', @article = N'ReplDataTrans', @source_owner = N'dbo', @source_object = N'ReplData', @type = N'logbased', @description = null, @creation_script = null, @pre_creation_cmd = N'drop', @schema_option = 0x000000000803509F, @identityrangemanagementoption = N'manual', @destination_table = N'ReplData', @destination_owner = N'dbo', @vertical_partition = N'false', @ins_cmd = N'CALL sp_MSins_dboReplData', @del_cmd = N'CALL sp_MSdel_dboReplData', @upd_cmd = N'SCALL sp_MSupd_dboReplData'; GO
After running the replication script, you should receive a confirmation along the lines of the following:
By refreshing and expanding the Replication folder in the Object Explorer, you’ll see the new Local Publication that we created, but you’ll notice that we haven’t subscribed anything to it yet.
Also, if you refresh your System Databases folder, you’ll see the brand new
Backing up and Restoring
At this point, we can backup our
ReplOrigin database, and then use that backup to restore to our new database,
ReplDest, which will act as a subscriber. Again, here’s the pertinent parts of the script you’ll need to change in order to get it to run in your non-production lab.
- My backup folder resides in
- My datafolder resides in
- My transaction logs reside in
-- Creating our backup to initialize the new subscriber USE master; GO BACKUP DATABASE ReplOrigin TO DISK = 'S:\MSSQL\Backup\ReplOrigin_FULL.bak' WITH CHECKSUM; -- Create a new database, our subscriber, from the backup RESTORE DATABASE ReplDest FROM DISK = 'S:\MSSQL\Backup\ReplOrigin_FULL.bak' WITH CHECKSUM, MOVE 'ReplOrigin' TO 'S:\MSSQL\Data\ReplDest.mdf', MOVE 'ReplOrigin_log' TO 'S:\MSSQL\TLogs\ReplDest_log.ldf';
After you’ve restored, you’ll note that the
ReplDest database has been added to the Object Explorer.
Subscribing to the Publisher
At this point, we haven’t done anything special. We’ve created our initial database, created a publisher, subscription, and an article. We’ve used a backup to “initialize our subscriber” but we haven’t actually added
ReplDest as a subscriber yet. We can do this now. You should note that for the
@sync_type I’ve provided
initialize from backup and I’ve included the pertinent backup information for the
@backupdevicename arguments. Again, be sure to change
S:\MSSQL\Backup to your own backup location.
-- Create a new subscription at the Publisher (ReplOrigin) USE ReplOrigin; GO EXECUTE sp_addsubscription @publication = 'ReplOriginPub', @article = 'all', @subscriber = 'Matt-Rig', @destination_db = 'ReplDest', @sync_type = 'initialize with backup', @status = 'active', @update_mode = 'read only', @backupdevicetype = 'disk', @backupdevicename = 'S:\MSSQL\Backup\ReplOrigin_FULL.bak'; GO
Refreshing the Replication folder in the Object Explorer, you’ll see that we’ve created a subscriber to our publication.
We can now test our replication by generating new transactions in the
ReplData table in the
ReplOrigin database. Then we can count the tables in each database to ensure that the replication is moving those rows. (Granted, you may need to wait a few seconds for the change to occur.)
-- Generating more transactions to test the replication USE ReplOrigin; GO BEGIN TRAN; INSERT INTO dbo.ReplData (Value) SELECT TOP 10000 CAST(RAND(CHECKSUM(NEWID())) * 1000000 + 1 AS INT) AS Value FROM sys.all_columns c1 CROSS JOIN sys.all_columns c2; COMMIT; GO SELECT COUNT(*) AS ReplOriginDataCount FROM dbo.ReplData; USE ReplDest; GO SELECT COUNT(*) AS ReplDestDataCount FROM dbo.ReplData;
Comparing those counts, we can see that the new rows have been replicated from
ReplDest as expected.
Dropping Non-Replicated Tables
We’ve finally reached our experiment phase! Running the following script with drop the
NonReplData table from the
-- Testing dropping the Non-Replication table from ReplDest USE ReplDest; GO IF EXISTS ( SELECT s.name + '.' + t.name FROM sys.schemas s JOIN sys.tables t ON s.schema_id = t.schema_id WHERE s.name = 'dbo' AND t.name = 'NonReplData' ) DROP TABLE dbo.NonReplData; GO
Once that has completed, you can generate another batch of transactions to be replicated from
ReplOrigin, again counting the tables to ensure that they are being replicated.
-- Adding more transactions to our replication database to see if it works since we changed the underlying schema somewhat USE ReplOrigin; GO BEGIN TRAN; INSERT INTO dbo.ReplData (Value) SELECT TOP 10000 CAST(RAND(CHECKSUM(NEWID())) * 1000000 + 1 AS INT) AS Value FROM sys.all_columns c1 CROSS JOIN sys.all_columns c2; COMMIT; GO SELECT COUNT(*) AS ReplOriginDataCount FROM dbo.ReplData; USE ReplDest; GO SELECT COUNT(*) AS ReplDestDataCount FROM dbo.ReplData;
Success! We can delete the non-replicated tables without affecting or breaking the replication, which means we can provide a higher level of data security and appropriate access after we’ve initialized from a backup.
Once you are finished testing, you’ll want to clean up your lab. Then you’ll be able to run the whole thing over again, if you’d like. Again, be sure to change
Matt-Rig to your appropriate publisher/subscriber.
-- Cleanup USE ReplOrigin; GO EXECUTE sp_dropsubscription @publication = 'ReplOriginPub', @article = N'all', @subscriber = 'MATT-RIG'; GO EXECUTE sp_dropsubscriber @subscriber = 'Matt-Rig'; GO EXECUTE sp_droppublication @publication = 'ReplOriginPub'; GO EXECUTE sp_replicationdboption @dbname = 'ReplOrigin', @optname = 'publish', @value = 'false'; GO EXECUTE sp_dropdistpublisher @publisher = 'Matt-Rig'; GO EXECUTE sp_dropdistributiondb @database = 'distribution'; GO EXECUTE sp_dropdistributor; GO USE master; GO IF EXISTS ( SELECT d.name FROM sys.databases d WHERE d.name = 'ReplDest' ) DROP DATABASE ReplDest; GO IF EXISTS ( SELECT d.name FROM sys.databases d WHERE d.name = 'ReplOrigin' ) DROP DATABASE ReplOrigin; GO
As I mentioned, I’m still new when it comes to replication, but I was glad I was able to complete a small, contained experiment in order to answer a business related question. Even more than that, I got to learn a bunch of new (to me) replication related procedures, which is the best part because I love to do as much using T-SQL as I can.