欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

SQLServerReplicationII

程序员文章站 2022-03-09 21:57:51
sql server replication scripting setup 大概的一个拓扑, 我们将三种角色分别安装在三台不同的 sql server 服务器上 1.1 distributor 1...

sql server replication scripting setup

大概的一个拓扑, 我们将三种角色分别安装在三台不同的 sql server 服务器上
1.1 distributor
1.2 publisher, publication, article
1.3 subscriber , subscription
1.4 agents , schedulers

简单的一个实现, 以 snapshot replication 为例子

2.1 distributor script :
我们会在 distributor 角色的 sql server 服务器上,将这台服务器设置为 distributor ,建立 distribution , 并配置一个允许使用它作为 distributor 的 publisher。

2.1.1 sp_adddistributor 第一次执行的时候,必须指定password, 这个 password 是 distributor_admin 密码。 在 publisher 连接 distributor 的时候,也必须指定这个 distributor_admin 的密码,用来通信。下面这个例子其实还需要为 sp_adddistributor 参数 @password 赋值.

2.1.2 三大要素: distributor 所用到的 instance , 以 servername\instancename 命名; distributor 用到的数据库 distribution (名字可以更改); snapshot 用到的存储路径.

-- this script uses sqlcmd scripting variables. they are in the form
-- $(myvariable). for information about how to use scripting variables
-- on the command line and in sql server management studio, see the
-- "executing replication scripts" section in the topic
-- "programming replication using system stored procedures".
-- install the distributor and the distribution database.
declare @distributor as sysname;
declare @distributiondb as sysname;
declare @publisher as sysname;
declare @directory as nvarchar(500);
declare @publicationdb as sysname;
-- specify the distributor name.
set @distributor = $(distpubserver);
-- specify the distribution database.
set @distributiondb = n'distribution';
-- specify the publisher name.
set @publisher = $(distpubserver);
-- specify the replication working directory.
set @directory = n'\\' + $(distpubserver) + '\repldata';
-- specify the publication database.
set @publicationdb = n'adventureworks2008r2';
-- install the server mydistpub as a distributor using the defaults,
-- including autogenerating the distributor password.
use master
exec sp_adddistributor @distributor = @distributor;
-- create a new distribution database using the defaults, including
-- using windows authentication.
use master
exec sp_adddistributiondb @database = @distributiondb,
@security_mode = 1;
go
-- create a publisher and enable adventureworks2008r2 for replication.
-- add mydistpub as a publisher with mydistpub as a local distributor
-- and use windows authentication.
declare @distributiondb as sysname;
declare @publisher as sysname;
-- specify the distribution database.
set @distributiondb = n'distribution';
-- specify the publisher name.
set @publisher = $(distpubserver);
use [distribution]
exec sp_adddistpublisher @publisher=@publisher,
@distribution_db=@distributiondb,
@security_mode = 1;
go

2.2 publication script , article script

2.2.1 首先要做的两点,就是:一启动 publisher 的角色;二配置要使用的 distributor . 这里使用到的存储过程 sp_replicationdboption .

2.2.2 在第一步里指定的 replication database, 执行 sp_addpublication 来添加 publication.

-- create a new transactional publication with the required properties.
exec sp_addpublication
@publication = @publication,
@status = n'active',
@allow_push = n'true',
@allow_pull = n'true',
@independent_agent = n'true';
-- create a new snapshot job for the publication, using a default schedule.
exec sp_addpublication_snapshot
@publication = @publication,
@job_login = @login,
@job_password = @password,
-- explicitly specify the use of windows integrated authentication (default)
-- when connecting to the publisher.
@publisher_security_mode = 1;
go

2.2.3 添加 article

declare @publication as sysname;
declare @table as sysname;
declare @filterclause as nvarchar(500);
declare @filtername as nvarchar(386);
declare @schemaowner as sysname;
set @publication = n'advworksproducttran';
set @table = n'product';
set @filterclause = n'[discontinueddate] is null';
set @filtername = n'filter_out_discontinued';
set @schemaowner = n'production';
-- add a horizontally and vertically filtered article for the product table.
-- manually set @schema_option to ensure that the production schema
-- is generated at the subscriber (0x8000000).
exec sp_addarticle
@publication = @publication,
@article = @table,
@source_object = @table,
@source_owner = @schemaowner,
@schema_option = 0x80030f3,
@vertical_partition = n'true',
@type = n'logbased',
@filter_clause = @filterclause;
-- (optional) manually call the stored procedure to create the
-- horizontal filtering stored procedure. since the type is
-- 'logbased', this stored procedures is executed automatically.
exec sp_articlefilter
@publication = @publication,
@article = @table,
@filter_clause = @filterclause,
@filter_name = @filtername;
-- add all columns to the article.
exec sp_articlecolumn
@publication = @publication,
@article = @table;
-- remove the daystomanufacture column from the article
exec sp_articlecolumn
@publication = @publication,
@article = @table,
@column = n'daystomanufacture',
@operation = n'drop';
-- (optional) manually call the stored procedure to create the
-- vertical filtering view. since the type is 'logbased',
-- this stored procedures is executed automatically.
exec sp_articleview
@publication = @publication,
@article = @table,
@filter_clause = @filterclause;
go

2.3 subscription script 以 push subscription 为例子. 所有的操作都在publisher, publication 里面执行。

2.3.1 判断 publication 是不是可以被 push 或者 pull
sp_helppublication
2.3.3 添加 push subscription
sp_addsubscription
2.3.4 添加 push distributor agent
sp_addpushsubscription_agent
默认是一天执行一次 snapshot push over ,那么怎么去修改这个同步间隔呢?
sp_add_schedule

监控健康指标
3.1 replication monitor

移除 replication , 察看 distribution 数据库元数据的更改

4.1 先移除 subscriber 和 subscription
sp_dropsubscription( publication database), sp_subscription_cleanup(subscriber database)

4.2 再移除 publisher 和 publication

4.3 再移除 distributor
4.4 细节解说

to disable publishing and distribution
stop all replication-related jobs. for a list of job names, see the “agent security under sql server agent” section of replication agent security model.
at each subscriber on the subscription database, execute sp_removedbreplication to remove replication objects from the database. this stored procedure will not remove replication jobs at the distributor.
at the publisher on the publication database, execute sp_removedbreplication to remove replication objects from the database.
if the publisher uses a remote distributor, execute sp_dropdistributor.
at the distributor, execute sp_dropdistpublisher. this stored procedure should be run once for each publisher registered at the distributor.
at the distributor, execute sp_dropdistributiondb to delete the distribution database. this stored procedure should be run once for each distribution database at the distributor. this also removes any queue reader agent jobs associated with the distribution database.
at the distributor, execute sp_dropdistributor to remove the distributor designation from the server.

当publisher, distributor, subscriber都配置好以后,我们可以通过下面的脚本来查看各个服务器充当的角色,以及配置的属性:

查看服务器的角色:
select server_id,name ,is_remote_login_enabled,is_system,is_publisher,is_subscriber,is_distributor,is_nonsql_subscriber from sys.servers ;
select name,is_published,is_subscribed,is_merge_published,is_distributor from sys.databases where name in( 'siebeldb','distribution')

配置原理详细解析:

**

1. distributor :

** distributor在每一个replication环境中都必须存在,它的作用是保存snapshot replication, 在publisher, subscriber之间通信,从publisher接收replication,并分发到各个subscriber上。在配置distributor的时候,特别需要注意的两件事就是配置保存snapshot replication文件的文件夹,第二就是指定是否是远程distributor,一般来说都会用一个专门的server来做distributor,来隔离与业务的资源利用冲突。

step 1 指定一个server为distributor角色

     sp_adddistributor [ @distributor= ] 'distributor'   
[ , [ @heartbeat_interval= ] heartbeat_interval ]
[ , [ @password= ] 'password' ]
[ , [ @from_scripting= ] from_scripting ]

[ @distributor=] ‘distributor’ is the distribution server name. distributor is sysname, with no default. this parameter is only used if setting up a remote distributor. it adds entries for the distributor properties in the msdb..msdistributor table.

如果distributor对于publisher来说是台远程服务器,不是本机instance上的服务器,那么我们就需要配置@distributor这个参数。相关属性的值也会被记录在msdb..msdistributor表里面。特别需要注意的是,这个命令是运行在distributor上面的,而不是用来给publisher添加一个distributor。这个命令会将本服务器标示为distributor,并且在msdb..msdistributor里面添加一个distributor的配置信息,如果是单独的一台服务器,那么就会给这个服务器贴一个专门是distributor的标签。既然已经涉及到服务器层面,这就意味着整个instance就只能配置一台数据库专用来做distribution.

[ @heartbeat_interval=] heartbeat_interval
is the maximum number of minutes that an agent can go without logging a progress message. heartbeat_interval is int, with a default of 10 minutes. a sql server agent job is created that runs on this interval to check the status of the replication agents that are running.

这里的心跳设置用来检查replication各个代理之间通信的状况,默认是10分钟。当新加一个distributor之后会自动创建一个job来实现心跳的功能。这个属性会在msdb..msdistributor表里面记录下来。

[ @password=] ‘password’]
is the password of the distributor_admin login. password is sysname, with a default of null. if null or an empty string, password is reset to a random value. the password must be configured when the first remote distributor is added. distributor_admin login and password are stored for linked server entry used for a distributor rpc connection, including local connections. if distributor is local, the password for distributor_admin is set to a new value. for publishers with a remote distributor, the same value for password must be specified when executing sp_adddistributor at both the publisher and distributor. sp_changedistributor_password can be used to change the distributor password.

如果distributor是建立在远程服务器上的,那么我们需要为distributor_admin这个用户设定一个密码。这个密码在publisher配置distributor的时候,也需要指定。

[ @from_scripting= ] from_scripting
identified for informational purposes only. not supported. future compatibility is not guaranteed.
返回值:0-成功;1-失败

上面的脚本还只是用来配置一个角色,但是这个角色对应的数据库还没有真正建立起来,下面的脚本就是用来创建distributor数据库的,而且事先我们不需要这个数据库存在。

这里要思考的问题是:1同一个instance上面可以创建多少个distribution数据库?2同一个distribution数据库可以支撑多少个publisher?

step 2 创建一个distribution数据库

下面这个脚本是运行在distributor上的,为distributor创建一个distribution数据库,这个数据库的名字当然是可以任意指定的。前提是必须先运行sp_adddistributor配置distributor.

sp_adddistributiondb [ @database= ] 'database'   
    [ , [ @data_folder= ] 'data_folder' ]   
    [ , [ @data_file= ] 'data_file' ]   
    [ , [ @data_file_size= ] data_file_size ]   
    [ , [ @log_folder= ] 'log_folder' ]   
    [ , [ @log_file= ] 'log_file' ]   
    [ , [ @log_file_size= ] log_file_size ]   
    [ , [ @min_distretention= ] min_distretention ]   
    [ , [ @max_distretention= ] max_distretention ]   
    [ , [ @history_retention= ] history_retention ]   
    [ , [ @security_mode= ] security_mode ]   
    [ , [ @login= ] 'login' ]   
    [ , [ @password= ] 'password' ]   
    [ , [ @createmode= ] createmode ]  
    [ , [ @from_scripting = ] from_scripting ]  

[ @min_distretention=] min_distretention
is the minimum retention period, in hours, before transactions are deleted from the distribution database. min_distretention is int, with a default of 0 hours.
[ @max_distretention=] max_distretention
is the maximum retention period, in hours, before transactions are deleted. max_distretention is int, with a default of 72 hours. subscriptions that have not received replicated commands that are older than the maximum distribution retention period are marked as inactive and need to be reinitialized. raiserror 21011 is issued for each inactive subscription. a value of 0 means that replicated transactions are not stored in the distribution database.

@min_distretention, @max_distretention, 指的是为重复事务保留的最小,最大时间长。如果某一个subscriber连最大值之前的重复日志都没有接收的话,这个subscriber就被标识为inactive,需要重新reinitialized.

[ @history_retention=] history_retention
is the number of hours to retain history. history_retention is int, with a default of 48 hours.
[ @security_mode=] security_mode
is the security mode to use when connecting to the distributor. security_mode is int, with a default of 1. 0 specifies sql server authentication; 1specifies windows integrated authentication.
指定为0 ,表示采用的安全认证方式是sql server帐户认证;如果为1,说明采用的是windows server认证方式。值得思考的是,我们怎么可以指定这个数据库是既可以用windows认证也可以用sql server帐户认证?

[ @login=] ‘login’
is the login name used when connecting to the distributor to create the distribution database. this is required if security_mode is set to 0. login issysname, with a default of null.
只有当security_mode指定为0的时候,需指定这个参数值。

[ @password=] ‘password’
is the password used when connecting to the distributor. this is required if security_mode is set to 0. password is sysname, with a default of null.

[ @createmode=] createmode
createmode is int, with a default of 1, and can be one of the following values.
这个参数默认是1,创建数据库或者使用现有的数据库,然后在这些数据库上调用instdist.sql来创建一些replication的对象,包括原数据表,存储过程,试图等。

虽说我们可以在配置这个distributor对应的数据库时候,直接创建数据库,但是我们还是倾向于先常规的创建一个数据库,指定一些特别的参数,然后配置这个数据库为distributor database.

当然我们要查询是不是distributor搭建成功,来决定是否需要配置其它的项目:

exec sp_get_distributor ;

返回一个数据集;
- distribution db installed: distribution数据库是不是创建完成
- is distribution publisher: distribution, publisher是不是在同一个instance上面
- has remote distribution publisher:是否有远程服务器数据库作为publisher存在

搭建完成distributor的时候我们需要再次检查属性配置或者我们新接手了一个repliaction环境需要熟悉这些配置,我们可以用下面的t-sql脚本:
- 返回distributor, distribution数据库, working directory的配置信息: sp_helpdistributor ;
- 返回指定distribution数据库的信息: sp_helpdistributiondb

这两个脚本可以返回很多有用的信息,比如snapshot file location(working directory), history cleanup agent job , distribution cleanup agent job, 用来给publisher配置distributor的server name, login name等。

如果需要更改这些属性,可以用下面的脚本来执行:
1. at the distributor, execute sp_changedistributor_propertyto modify distributor properties.
2. at the distributor, execute sp_changedistributiondbto modify distribution database properties.
3. at the distributor, execute sp_changedistributor_passwordto change the distributor password.
4. at the distributor, execute sp_changedistpublisherto change the properties of a publisher using the distributor.

step 3 指定一个publisher为distributor支持的 publisher

sp_adddistpublisher [ @publisher= ] 'publisher'   
        , [ @distribution_db= ] 'distribution_db'   
    [ , [ @security_mode= ] security_mode ]   
    [ , [ @login= ] 'login' ]   
    [ , [ @password= ] 'password' ]   
    [ , [ @working_directory= ] 'working_directory' ]   
    [ , [ @trusted= ] 'trusted' ]   
    [ , [ @encrypted_password= ] encrypted_password ]   
    [ , [ @thirdparty_flag = ] thirdparty_flag ]  
    [ , [ @publisher_type = ] 'publisher_type' ]  

这个脚本可以在distributor角色服务器的任何数据库上执行,因为distributing replication(分发副本)这个操作是服务器级别的,所以这个脚本一系列操作的本质也是作用在服务器上,因此在哪个数据内执行这个命令就无所谓了。

@publisher这个参数尤其要注意,暂不敢确定到底是publisher的instance名称还是其它 @thirdparty_flag, @publisher_type两个参数用来指定publisher是不是非sql server数据库 @working_directory,就是snapshot replication存在的地方

step 4 为publisher配置一个可用的distributor(仅用在分布式replication中)

指定一个被distributor支持的publisher,和为publibsher配置一个可用的distributor是一个连通的过程,相当于企业之间互相签订合同一样,双方各执一份。在publisher上配置一个distributor的命令和在distributor上配置一个distributor一样:

sp_adddistributor [ @distributor= ] 'distributor'   
    [ , [ @heartbeat_interval= ] heartbeat_interval ]   
    [ , [ @password= ] 'password' ]   
[ , [ @from_scripting= ] from_scripting ]  

这里password就起到认证授权的作用了。

因为每增加一个distributor都会往sys.servers表里写入一个象征性的服务器名一样,这里面的服务器名带着各自的instance名称,同时我们会给distributor所在的服务器定义一个名称,比如repl_distributor。 这个名称很重要,暂时我们可以理解为是一台服务器的名称,其它服务器或者客户端都可以通过这个名称来连接,实际上是用在replication架构中。

step 5 配置publisher,distributor, subscriber数据库的元数据

sp_replicationdboption [ @dbname= ] 'db_name'   
        , [ @optname= ] 'optname'   
        , [ @value= ] 'value'   
    [ , [ @ignore_distributor= ] ignore_distributor ]  
    [ , [ @from_scripting = ] from_scripting ]  

@optname,可以是:
- merge publish: 指定数据库可以用作merge publication
- publish:除了merge publish之外的publication
- subscriber:指定数据库为subscription数据库
- sync with backup
@value: false, 丢弃某个数据库的角色;true,设定某个数据库的角色
@ignore_distributor: false,可以不连接distributor来更新信息,比如若想丢弃publication数据库角色,而distributor又连不通,就需要false.

**

2 publisher :

** 通过publication,我们可以将publisher数据库里的数据以及数据库对象发布到其它服务器的指定数据库里面,有些对象在merge publication中不适用,比如stored procedures –execution(transact-sql, clr), indexed views as tables,除此之外,其它对象都可以在三种publication中使用,这些对象可以是 :

tables, partitioned tables, stored procedures – definition (transact-sql and clr), views, indexed views, user-defined types (clr), user-defined functions (transact-sql and clr), alias data types, full text indexes, schema objects (constraints, indexes, user dml triggers, extended properties, and collation).

在没有通过查找msdn来获取publish细节之前,我们先自我假设下,正确搭建一个publisher来publish一些publication, 大概需要哪些步骤?
- 正确搭建一个distributor
- 在distributor上增加一个publisher的认证:sp_adddistpublisher
- 在publisher上指定我们要做publish的数据库
- 在publisher上配置我们要使用的distributor
- 创建一份publication
- 发布publication

这是replication架构的前半部分配置,之后还需要配置subscriber和subscription。最后需要配置的是读写分离的逻辑实现,概念参见《构建高性能web站点》郭欣(著),采用了haproxy机制,对读写sql进行路由分离,写的sql都放到publisher服务器上,而读sql都转到subscriber服务器上,对subscriber做network load balance.

步骤一写出来就知道自己往什么方向着手了,创建一份publication是有讲究的,怎么创建,哪些对象是可以被published的,都有哪些方式,怎么做全量,怎么做增量,publication的定义都存储在哪里? publication的pipeline,过程控制的点,频率怎么控制,延迟如何管控,在本节都需要详细地阐述。

publisher可以是和distributor在同一个instance上面,也可以是远程服务器上的某一个instance。搭建本地publisher和远程publisher的区别在哪里?

首先我们先来看下本地publisher怎么搭建:
- 先搭建一个distributor,配置distributor服务器属性,创建distribution数据库
- 在distributor上指定一个publisher的认证sp_adddistpublisher
- 在distributor上指定一个数据库为publication数据库,设定某一种特定的publication方式,snapshot replication, transaction replication,merge replication

接着再分析下远程publisher怎么搭建:
- 先搭建一个distributor,配置distributor服务器属性,创建distribution数据库
- 在distributor上指定一个publisher的认证sp_adddistpublisher
- 在publisher上配置一个distributor,指定一个数据库为publication数据库,设定某一种特定的publication方式,snapshot replication, transaction replication,merge replication

distributor和publisher其实就类似内容生产商与分销商,必须一一签订合同。本地关系就简单些,远程关系就双方人手一份合同,有趣的是授权认证的方式,如果双方是通过私人掮客签订合同的,那么需要同一个掮客来回搭线,类似于security model为0,使用sql server 帐户密码;如果双方是通过中介公司签订合同的,那么同一个公司的任何业务员都能办理这桩事,所以security model可以设置为1,使用windows帐户认证。

假如我现在已经搭好了distributor,也指定了publisher,无论是本地的还是远程的,那么这些元数据存储在哪里? 既然distributor,publisher都签订了合同,那么双方都应该有保存,这份保存就是我们要找的东西。

gui方式: ssms – replication-replication monitor ; sql command prompt:sqlmonitor.exe t-sql方式:
sp_replmonitorhelppublisher
sp_replmonitorhelppublisher [ [ @publisher = ] ‘publisher’ ]
[ , [ @refreshpolicy = ] refreshpolicy ]
@publisher可以指定一个具体的服务器名,也可以不设置,返回包括所有的publisher。 返回的结果集要注意的就是status字段,它 有6个可能值,意思分别是: 1: started; 2:succeeded; 3:in progress; 4:idle ;5:retrying ;6:failed .
exec sp_helpdistributor

既然我们说了是双方合同,那么publisher上也应该有副本存在,sp_helpdistributor就可以帮我们查询到使用的distributor信息。

关于publication:

use distribution
go
exec sp_replmonitorhelppublication

上面这个脚本只能用来看哪些publication 存在,以及对应的job,但是publication具体定义看不到.

上面的说明讨论都还只是第一步,还没有进入到创建publication阶段,所以称之为准备阶段。接下来就讨论publication阶段了。
不同的publication使用的命令方式不同,总体上分为两种: snapshot或者transactional publication;merge publication. 在创建snapshot, transactional publication的时候我们使用sp_addpublication.

sp_addpublication [ @publication = ] 'publication'  
    [ , [ @taskid = ] tasked ]  
    [ , [ @restricted = ] 'restricted' ]  
    [ , [ @sync_method = ] 'sync_method' ]  
    [ , [ @repl_freq = ] 'repl_freq' ]  
    [ , [ @description = ] 'description' ]  
    [ , [ @status = ] 'status' ]  
    [ , [ @independent_agent = ] 'independent_agent' ]  
    [ , [ @immediate_sync = ] 'immediate_sync' ]  
    [ , [ @enabled_for_internet = ] 'enabled_for_internet' ]  
    [ , [ @allow_push = ] 'allow_push'  
    [ , [ @allow_pull = ] 'allow_pull' ]  
    [ , [ @allow_anonymous = ] 'allow_anonymous' ]  
    [ , [ @allow_sync_tran = ] 'allow_sync_tran' ]  
    [ , [ @autogen_sync_procs = ] 'autogen_sync_procs' ]  
    [ , [ @retention = ] retention ]  
    [ , [ @allow_queued_tran= ] 'allow_queued_updating' ]  
    [ , [ @snapshot_in_defaultfolder= ] 'snapshot_in_default_folder' ]  
    [ , [ @alt_snapshot_folder= ] 'alternate_snapshot_folder' ]  
    [ , [ @pre_snapshot_script= ] 'pre_snapshot_script' ]  
    [ , [ @post_snapshot_script= ] 'post_snapshot_script' ]  
    [ , [ @compress_snapshot= ] 'compress_snapshot' ]  
    [ , [ @ftp_address = ] 'ftp_address' ]  
    [ , [ @ftp_port= ] ftp_port ]  
    [ , [ @ftp_subdirectory = ] 'ftp_subdirectory' ]  
    [ , [ @ftp_login = ] 'ftp_login' ]  
    [ , [ @ftp_password = ] 'ftp_password' ]  
    [ , [ @allow_dts = ] 'allow_dts' ]  
    [ , [ @allow_subscription_copy = ] 'allow_subscription_copy' ]  
    [ , [ @conflict_policy = ] 'conflict_policy' ]  
    [ , [ @centralized_conflicts = ] 'centralized_conflicts' ]   
    [ , [ @conflict_retention = ] conflict_retention ]  
    [ , [ @queue_type = ] 'queue_type' ]  
    [ , [ @add_to_active_directory = ] 'add_to_active_directory' ]  
    [ , [ @logreader_job_name = ] 'logreader_agent_name' ]  
    [ , [ @qreader_job_name = ] 'queue_reader_agent_name' ]  
    [ , [ @publisher = ] 'publisher' ]   
    [ , [ @allow_initialize_from_backup = ] 'allow_initialize_from_backup' ]  
    [ , [ @replicate_ddl = ] replicate_ddl ]  
    [ , [ @enabled_for_p2p = ] 'enabled_for_p2p' ]  
    [ , [ @publish_local_changes_only = ] 'publish_local_changes_only' ]  
    [ , [ @enabled_for_het_sub = ] 'enabled_for_het_sub' ]  
    [ , [ @p2p_conflictdetection = ] 'p2p_conflictdetection' ]  
    [ , [ @p2p_originator_id = ] p2p_originator_id  
    [ , [ @p2p_continue_onconflict = ] 'p2p_continue_onconflict'  
    [ , [ @allow_partition_switch = ] 'allow_partition_switch'  
    [ , [ @replicate_partition_switch = ]'replicate_partition_switch'  

定义好一个publication,就要为publication增加article的定义,然后创建一个publication的snapshot,执行完这个snapshot的生成,接着就可以做transactional的增量了。
所以这里的问题就在于publication是否定为transactional还是snapshot了。不论是否定义publication为transactional还是snapshot,我们都需要做的一步就是生成这份publication的snapshot.

sp_addpublication来定一个publication
怎么控制这个publication是snapshot, transactional, merge publication? 首先在增加publication之前,我们应该先指定一个发布数据库 ,使用sp_repliactiondboption来启用某一个数据库的发布功能。

如果我们需要建立的是transactional replication,我们还需要创建一个logreader agent(job).
执行sp_addlogreader_agent来添加这个agent job,执行sp_helplogreader_agent可以查询到我们新建的这个agent job. 每一个publication database只能有一个同账户的log reader agent job, 如果是非sql server数据库的publiaction database,那么必须设定特定账户用户及密码,而且连接publisher的授权认证模式也必须是0,@publisher_security_mode = 0.

接下来我们添加publication. 执行sp_addpublication, 设定是否可以被pull, push, 设置@repl_freq,为snapshot replication设置snapshot, 为transactional replication设置continuous. transactional replication是默认值

这里摘抄msdn上的一个小例子:

-- to avoid storing the login and password in the script file, the values 
-- are passed into sqlcmd as scripting variables. for information about 
-- how to use scripting variables on the command line and in sql server
-- management studio, see the "executing replication scripts" section in
-- the topic "programming replication using system stored procedures".
declare @publicationdb as sysname;
declare @publication as sysname;
declare @login as sysname;
declare @password as sysname;
set @publicationdb = n'adventureworks'; 
set @publication = n'advworksproducttran'; 
-- windows account used to run the log reader and snapshot agents.
set @login = $(login); 
-- this should be passed at runtime.
set @password = $(password); 
-- enable transactional or snapshot replication on the publication database.
exec sp_replicationdboption 
  @dbname=@publicationdb, 
  @optname=n'publish',
  @value = n'true';
-- execute sp_addlogreader_agent to create the agent job. 
exec sp_addlogreader_agent 
  @job_login = @login, 
  @job_password = @password,
  -- explicitly specify the use of windows integrated authentication (default) 
  -- when connecting to the publisher.
  @publisher_security_mode = 1;
-- create a new transactional publication with the required properties. 
exec sp_addpublication 
  @publication = @publication, 
  @status = n'active',
  @allow_push = n'true',
  @allow_pull = n'true',
  @independent_agent = n'true';
-- create a new snapshot job for the publication, using a default schedule.
exec sp_addpublication_snapshot 
  @publication = @publication, 
  @job_login = @login, 
  @job_password = @password,
  -- explicitly specify the use of windows integrated authentication (default) 
  -- when connecting to the publisher.
  @publisher_security_mode = 1;
go
sp_addpublication_snpashot来定义一个publication的snapshot
为什么要定义这个publication的snapshot,既然可以生成snapshot,为什么还需要定义它,而不直接从publication中生成snapshot? 连着上面的分析,可知默认情况下,publication是transactional级别的,所以并没有一个全量的snapshot给到我们新建的transactional publication, 所以这里要为transactional publication新建一个snapshot. 其二,无论publication定义为哪种publication,都只是定义,没有实际执行的执行者。

这个地方隐藏的一点就是,当执行完这个命令之后,就新建了一个snapshot agent job.
这个snapshot的agent job创建,必须是在publisher database上进行,当我们为某一个publication创建一个snapshot agentj job的时候,publisher通过与distributor通信,把这个新建的agent job注册到了distributor。 如果是在非publisher数据库上执行,则会出现下面的错误:
msg 18757, level 16, state 1, procedure sp_msrepl_addpublication_snapshot, line 76
unable to execute procedure. the database is not published. execute the procedure in a database that is published for replication.

sp_addarticle来往publication里面添加需要复制的对象和数据
这么说来,publication与article之间是一对多的关系。而article是肯定不能被包含在不同的publication之间的,如果被包含在不同的publication之间,同步逻辑该怎么写?

我们以msdn上的一个例子来阐述:
https://msdn.microsoft.com/en-us/library/ms173857.x
为sp_addarticle指定的参数主要有:
@publication: 一个数据库可能有多个publication,在这里我们要指定某一个具体的数据库里已有的publication;
@article: 指定一个 article名称;
@source_object:来指定要同步的对象
@destination_table: 如果target table/stored procedure不是同一个名字,就需要指定;
@type有很多种,比如logbased,同步表的时候就要指定logbased了;
@ins_cmd, @del_cmd,@upd_cmd就是用来更改同步逻辑的。详细参考:specify how changes are propagated for transactional articles ( https://msdn.microsoft.com/en-us/library/ms152489.aspx).

declare @publication    as sysname;
declare @table as sysname;
declare @filterclause as nvarchar(500);
declare @filtername as nvarchar(386);
declare @schemaowner as sysname;
set @publication = n'advworksproducttran'; 
set @table = n'product';
set @filterclause = n'[discontinueddate] is null'; 
set @filtername = n'filter_out_discontinued';
set @schemaowner = n'production';
-- add a horizontally and vertically filtered article for the product table.
-- manually set @schema_option to ensure that the production schema 
-- is generated at the subscriber (0x8000000).
exec sp_addarticle 
  @publication = @publication, 
  @article = @table, 
  @source_object = @table,
  @source_owner = @schemaowner, 
  @schema_option = 0x80030f3,
  @vertical_partition = n'true', 
  @type = n'logbased',
  @filter_clause = @filterclause;
-- (optional) manually call the stored procedure to create the 
-- horizontal filtering stored procedure. since the type is 
-- 'logbased', this stored procedures is executed automatically.
exec sp_articlefilter 
  @publication = @publication, 
  @article = @table, 
  @filter_clause = @filterclause, 
  @filter_name = @filtername;
-- add all columns to the article.
exec sp_articlecolumn 
  @publication = @publication, 
  @article = @table;
-- remove the daystomanufacture column from the article
exec sp_articlecolumn 
  @publication = @publication, 
  @article = @table, 
  @column = n'daystomanufacture', 
  @operation = n'drop';
-- (optional) manually call the stored procedure to create the 
-- vertical filtering view. since the type is 'logbased', 
-- this stored procedures is executed automatically.
exec sp_articleview 
  @publication = @publication, 
  @article = @table,
  @filter_clause = @filterclause;
go

msg 156, level 15, state 1: incorrect syntax near the keyword ‘from’.
msg 21745, level 16, state 1, procedure sp_msrepl_articleview, line 301
cannot generate a filter view or procedure. verify that the value specified for the @filter_clause parameter of sp_addarticle can be added to the where clause of a select statement to produce a valid query.
msg 20027, level 11, state 1, procedure sp_msrepl_articlecolumn, line 181
the article ‘region’ does not exist.
如果没有使用任何的条件筛选,我们就不要在sp_addarticle里面指定filter,要不然就出现上面的错误。

sp_startpubliaction_snapshot来生成一份publication snapshot
这份publication snapshot是由什么来生成的,如果是snapshot replication,那么可以理解每次replication都是全量抽取的,如果是transactional repliaction,则这份snapshot应该只生成一次。
由于snapshot生成一次耗时,耗网络宽带,如果基于数据库作snapshot还要锁表,不适合用来做日常的同步更新,这里我们要设置snapshot agent job只跑一次。

在执行这段脚本的时候,特别要注意 job的执行者对working directory有可读可写权限,如果没有,我们可以修改这working directory :

exec sp_changedistpublisher
@publisher = 'vm-b9cb-cd02\mssqlserver2014',
@property = 'working_directory',
@value = 'e:\data_bu\snapshot' ;
--d:\data\data_srv\dbagroup\sql2014\mssql12.mssqlserver2014\mssql\repldata

上面的脚本就是用来改写working directory.

transactional publication, snapshot publication是怎么驱动replication的
我们知道replication,可以用push,pull的方式来进行,那么在publisher服务器上我们定义好publication, 添加完我们需要同步的对象与数据, 到这里配置publication的工作就结束了
当subscription也配置好之后,我们就可以启动相应的agent job来启用replication.

**

3 subscriber:

**从订阅者的角度来说,主要做的事情就是pull publication,就是主动拉取publication. 那么publisher要推publication该怎么配置? 一份subscription把publisher与subscriber连接起来,首先在distributor上肯定是定义一些subscribers,subscriber上是不是也需要授权distributor可以访问? 然后为每个subscriber配置一些subscription,在这个步骤中应该可以指定这份subscription是pull还是push形式,频率有多少。
- synchronization agent : 同步代理job在pull subscription时候,运行在subscriber服务器上,在push subcription的时候,运行在distributor服务器上: