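Redis Persistence
Author: Wei Tao (韦涛)
Why it is recommended: As a key component of a distributed caching architecture, Redis often holds fairly important data while absorbing highly concurrent traffic. That data therefore needs to be persisted, and understanding the persistence configuration matters both for avoiding data loss and for recovering after a Redis failure in production.
Redis offers two persistence mechanisms: RDB (Redis Database) snapshots and the AOF (Append Only File).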
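I. RDB (Redis Database) Snapshots
Snapshot persistence. Redis can create a snapshot to obtain a copy of the in-memory data as it was at a particular point in time. Once a snapshot exists, it can be backed up, copied to other servers to create replicas holding the same data (Redis master-slave replication, used mainly to improve Redis performance), or left in place to be loaded when the server restarts.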
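1. Basic principle
At configured time intervals, Redis writes a snapshot of the in-memory dataset to disk. On recovery, the snapshot file is read straight back into memory.
To persist, Redis forks a separate child process. The child first writes the data to a temporary file and, once the write has finished, replaces the previous snapshot file with that temporary file. The main process performs no disk I/O for the snapshot during this time, which keeps its performance very high.
When a large dataset has to be restored and some loss of the most recent data is acceptable, RDB is more efficient than AOF. The drawback of RDB is that data written after the last snapshot may be lost: if Redis, the operating system, or the hardware crashes before a new snapshot file is complete, Redis loses everything written since the most recent completed snapshot.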
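The "write a temporary file, then swap it in" step can be sketched in a few lines of Python. This is only an illustration of the pattern, not how Redis itself is implemented: the function names are hypothetical, and JSON stands in for the real RDB format.

```python
import json
import os
import tempfile


def write_snapshot(data: dict, target_path: str) -> None:
    """Write `data` to a temporary file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temporary file lives in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)  # JSON is a stand-in for the RDB format
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        # Readers see either the old snapshot or the new one, never a partial file.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(target_path: str) -> dict:
    """Read a snapshot written by write_snapshot back into memory."""
    with open(target_path, encoding="utf-8") as snapshot:
        return json.load(snapshot)
```

If the process dies while the temporary file is being written, the previous snapshot is still intact, which is the property the paragraph above relies on.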
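2. fork
fork creates a copy of the current process. All of the new process's data (variables, environment variables, program counter, and so on) has the same values as in the original process, yet it is an entirely new process, and it runs as a child of the original.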
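A small POSIX-only Python sketch can make this concrete. The function name and the sample key are made up for the illustration. The child receives a copy of the parent's data as it was at the moment of the fork, so later changes in either process are invisible to the other.

```python
import os
from typing import Tuple


def snapshot_in_child(data: dict) -> Tuple[str, dict]:
    """Fork; the child reports the data it sees while the parent keeps changing its own."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy of `data`.
        try:
            os.close(read_fd)
            data["written_by"] = "child"  # changes only the child's copy
            os.write(write_fd, repr(sorted(data.items())).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately, without running parent cleanup
    # Parent process: keeps serving "writes" after the fork.
    os.close(write_fd)
    data["name"] = "changed-after-fork"
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read().decode()
    os.waitpid(pid, 0)
    return child_view, data
```

Calling `snapshot_in_child({"name": "weitao"})` returns a child view that still contains `weitao`, while the parent's dictionary holds the changed value and no `written_by` key.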
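3. RDB saves to the dump.rdb file
By default, RDB saves to a file named dump.rdb. More precisely, the snapshot is written to the file named by the dbfilename option, in the directory given by the dir option.
4. Triggering an RDB snapshot
Configuration file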
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# Omitting every save directive, or setting save "", disables snapshotting
# save ""

# Sets how many updates within how many seconds cause the data to be written to the
# snapshot file dump.rdb (that is, trigger RDB); several conditions can be combined
# save <seconds> <changes>
# The default Redis configuration file provides three conditions:
# save 900 1
# save 300 10
# save 60 10000
# meaning that 1 change within 900 seconds (15 minutes), 10 changes within 300 seconds
# (5 minutes), or 10000 changes within 60 seconds triggers RDB to write a new dump.rdb.

# When the configured save policy is not met, run the save or bgsave command manually
# to trigger RDB immediately and generate a new dump.rdb file
# save: only saves and does nothing else; everything else is blocked
# bgsave: Redis takes the snapshot asynchronously in the background and keeps serving clients
# lastsave: returns the time of the last successful snapshot

# Running FLUSHALL or SHUTDOWN also produces a fresh dump.rdb immediately
# (after FLUSHALL the new file is empty)

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
# Setting this to no means you either accept inconsistent data or have other
# means of detecting and handling the problem
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
# Whether to compress data when writing the local database file; the default is yes
# and Redis uses LZF compression. Turning it off saves CPU time but can make the
# database file much larger
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
# After storing a snapshot, Redis can verify the data with the CRC64 algorithm.
# This costs roughly 10% in performance; disable it if you want maximum performance
rdbchecksum yes

# The filename where to dump the DB
# Name of the local database file; the default is dump.rdb
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
# Directory in which the local database file is stored
dir ./

Triggering manually with save or bgsave:
127.0.0.1:6379> set name weitao
OK
127.0.0.1:6379> save
127.0.0.1:6379> set name weitao
OK
127.0.0.1:6379> bgsave

Triggering with FLUSHALL:
127.0.0.1:6379> set name weitao
OK
127.0.0.1:6379> FLUSHALL

Triggering with SHUTDOWN:
127.0.0.1:6379> set name weitao
OK
127.0.0.1:6379> SHUTDOWN
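How the save points combine can be modeled with a short Python function. This is a simplified sketch rather than the actual Redis code: the function and constant names are hypothetical, and it assumes a save point fires once at least the given number of seconds has elapsed and at least the given number of changes has accumulated since the last snapshot.

```python
from typing import Iterable, Tuple

# (seconds, changes) pairs mirroring "save 900 1", "save 300 10", "save 60 10000".
DEFAULT_SAVE_POINTS = ((900, 1), (300, 10), (60, 10000))


def should_snapshot(seconds_since_last_save: float,
                    changes_since_last_save: int,
                    save_points: Iterable[Tuple[int, int]] = DEFAULT_SAVE_POINTS) -> bool:
    """Return True if any save point is satisfied (the points are OR-ed together)."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

With the default points, 10 changes after 301 seconds trigger a snapshot, while 5 changes after 30 seconds do not. Passing an empty collection of save points models `save ""`, where no automatic snapshot is ever triggered.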
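5. Recovery
Move the backup file (dump.rdb) into the Redis working directory (the directory set by dir) and start the service; the snapshot is loaded at startup.
Use the CONFIG GET dir command to find that directory.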
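The recovery step amounts to working out where Redis expects the snapshot and copying the backup there. The sketch below assumes the configuration text is available as a string and that Redis is stopped while the file is copied. Both function names are hypothetical helpers, not part of any Redis tooling.

```python
import os
import shutil


def snapshot_location(conf_text: str, base_dir: str = ".") -> str:
    """Return the path where Redis expects the RDB file, from `dir` and `dbfilename`."""
    settings = {"dir": "./", "dbfilename": "dump.rdb"}  # defaults from the listing above
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() in settings:
            settings[parts[0].lower()] = parts[1].strip().strip('"')
    # A relative `dir` is resolved against base_dir (where the server is started).
    return os.path.normpath(
        os.path.join(base_dir, settings["dir"], settings["dbfilename"]))


def restore_backup(backup_path: str, conf_text: str, base_dir: str = ".") -> str:
    """Copy a backup snapshot to the configured location and return that path."""
    target = snapshot_location(conf_text, base_dir)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    shutil.copy2(backup_path, target)
    return target
```

On a running server, CONFIG GET dir gives the same directory without having to parse the configuration file.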
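6. Advantages
Well suited to restoring large datasets.
Suitable where the requirements on data completeness are not strict.
7. Disadvantages
Data is persisted only at the configured intervals, so if Redis goes down unexpectedly, anything written after the last snapshot may be lost.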
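8. Summary
9. Ways to create a snapshot
BGSAVE command: a client sends BGSAVE to Redis to create a snapshot. On platforms that support BGSAVE (essentially all of them except Windows), Redis calls fork to create a child process; the child writes the snapshot to disk while the parent continues to handle command requests.
SAVE command: a client can also send SAVE to create a snapshot. A Redis server that receives SAVE does not respond to any other command until the snapshot is complete. SAVE is not commonly used; it is normally reserved for cases where there is not enough memory to run BGSAVE, or where waiting for the persistence operation to finish does not matter.
save option: if the save option is set (it usually is by default), for example save 60 10000, then, counting from the most recent snapshot, Redis automatically triggers BGSAVE once the condition "10000 writes within 60 seconds" is met.
SHUTDOWN command: when Redis receives a shutdown request through SHUTDOWN, or receives a standard TERM signal, it runs SAVE, blocking all clients and executing no further client commands, and shuts the server down once SAVE has finished.
One Redis server connecting to another: when a Redis server connects to another and sends SYNC to start replication, the master runs BGSAVE unless a BGSAVE is already in progress or has only just finished.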
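II. AOF (Append Only File)
Compared with snapshot persistence, AOF persistence is closer to real time, which has made it the mainstream persistence scheme. By default Redis does not enable AOF persistence; it is turned on with the appendonly parameter: appendonly yes. Once AOF is enabled, every command that modifies data in Redis is written to the AOF file on disk. The AOF file is stored in the same location as the RDB file, set by the dir parameter, and its default name is appendonly.aof.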
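1. Basic principle
Every write operation is recorded as a log entry: Redis records all write commands (reads are not recorded), and the file may only be appended to, never modified in place. At startup Redis reads the file to rebuild the data; in other words, it replays every recorded write command from first to last to restore the dataset.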
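The log-and-replay idea can be shown with a toy key-value store in Python. This is a sketch of the principle only: the class name is hypothetical, it uses a plain-text line format rather than the real AOF encoding, and it assumes keys contain no spaces and values contain no newlines.

```python
import os
from typing import Dict, List, Optional


class TinyAppendOnlyLog:
    """Toy store that appends SET/DEL commands to a log and replays them on startup."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.data: Dict[str, str] = {}
        self._replay()  # rebuild the in-memory state from the existing log
        self._log = open(path, "a", encoding="utf-8")  # append-only handle

    def _replay(self) -> None:
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:  # first command to last, in order
                self._apply(line.rstrip("\n").split(" ", 2))

    def _apply(self, command: List[str]) -> None:
        if command[0] == "SET":
            self.data[command[1]] = command[2]
        elif command[0] == "DEL":
            self.data.pop(command[1], None)

    def _write(self, command: List[str]) -> None:
        self._log.write(" ".join(command) + "\n")
        self._log.flush()
        self._apply(command)

    def set(self, key: str, value: str) -> None:
        self._write(["SET", key, value])

    def delete(self, key: str) -> None:
        self._write(["DEL", key])

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)  # reads are never logged

    def close(self) -> None:
        self._log.close()
```

Creating a second `TinyAppendOnlyLog` on the same path after closing the first rebuilds the same dictionary purely from the logged commands.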
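2. AOF saves to the appendonly.aof file
With AOF persistence enabled, Redis creates appendonly.aof the first time the service starts and writes every subsequent write command to it. As long as RDB persistence is also enabled, dump.rdb is still produced and updated according to its save policy, so the two coexist. At startup, however, Redis loads appendonly.aof in preference to dump.rdb.
For example, suppose both persistence files exist: appendonly.aof has recorded the write commands for k1 v1, …, k26 v26, while dump.rdb holds only k1 v1, …, k12 v12.
[Figure panels: AOF persistence enabled; AOF persistence disabled]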
3. Configuration file
############################## APPEND ONLY MODE ###############################
# AOF

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
# Whether to log every update operation.
# By default Redis writes data to disk asynchronously,
# so leaving this off can lose a period of writes on power failure:
# Redis syncs its data file only according to the save conditions above,
# so some data exists only in memory for a while.
# Default: no
appendonly no

# The name of the append only file (default: "appendonly.aof")
# Name of the update log file; the default is appendonly.aof

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
# Sets when the update log is synced to disk; there are three possible values:
# no: leave it to the operating system to sync cached data to disk (fast)
# always: synchronous persistence; fsync() is called after every update to write the data to disk (slow, safe)
# everysec: asynchronous; sync once per second (compromise, the default)

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
# Whether fsync is suspended while a rewrite is running; the default no keeps
# fsync active and protects data safety

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
# Sets the baseline for rewriting.
# Redis records the size of appendonly.aof after the last rewrite. With the default
# configuration, a rewrite is triggered when the current file has grown to twice
# that size and is larger than 64 MB

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
aof-use-rdb-preamble yes

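4. Sync policies
The Redis configuration file offers three sync policies:
appendfsync always    # sync to the AOF file on disk every time data is modified; this severely slows Redis down
appendfsync everysec  # sync once per second, explicitly flushing multiple write commands to disk
appendfsync no        # let the operating system decide when to sync
appendfsync always keeps data loss to a minimum, but it demands a large number of disk writes, each carrying just one command, which hurts Redis throughput considerably. Users of solid-state drives should be cautious with appendfsync always, because it can noticeably shorten the drive's service life.
To balance data safety and write performance, consider appendfsync everysec, which syncs the AOF file once per second with almost no effect on Redis performance. Even if the system crashes, at most about one second of data is lost. When the disk is busy with writes, Redis also gracefully slows itself down to match the disk's maximum write speed.
appendfsync no is generally not recommended: Redis may lose an indeterminate amount of data, and if the disk cannot keep up with the writes, Redis write operations block once the buffer of pending data fills up, which slows request handling.
Although AOF persistence flexibly offers several options to meet different applications' data-safety requirements, it has a drawback of its own: the AOF file can grow very large.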
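The difference between the three policies comes down to when fsync is called. The Python sketch below models that with a hypothetical writer class. It is an assumption-laden simplification: in particular, it checks the one-second rule at the time of each append instead of using a background timer, and the injectable clock exists only to make the behavior easy to observe.

```python
import os
import time


class PolicyLogWriter:
    """Append-only writer that calls fsync according to an appendfsync-style policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.clock = clock
        self.fsync_calls = 0  # counted so the effect of each policy is visible
        self._file = open(path, "ab")
        self._last_sync = clock()

    def append(self, command: bytes) -> None:
        self._file.write(command + b"\n")
        self._file.flush()  # hand the bytes to the operating system's cache
        if self.policy == "always":
            self._sync()  # force to disk after every command
        elif self.policy == "everysec" and self.clock() - self._last_sync >= 1.0:
            self._sync()  # force to disk at most about once per second
        # policy "no": never fsync here; the operating system decides when.

    def _sync(self) -> None:
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        self._last_sync = self.clock()

    def close(self) -> None:
        self._file.close()
```

Appending three commands under `always` produces three fsync calls, while under `everysec` commands that arrive within the same second share a single fsync. That is the trade-off between safety and speed described above.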
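5. Recovering from a damaged AOF file
If Redis goes down unexpectedly while running, the last write to appendonly.aof may be incomplete, leaving the file damaged. A Redis service with AOF persistence enabled can then fail with an error while loading appendonly.aof on restart and refuse to start. In that case, repair the file with redis-check-aof --fix appendonly.aof.
6. Rewrite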
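Because AOF persistence works by appending to appendonly.aof, the file keeps growing. A rewrite mechanism was added to counter this: when the size of appendonly.aof exceeds the configured threshold, Redis compacts its contents, dropping redundant commands and keeping only the minimal set of commands needed to restore the data. A rewrite can also be triggered manually with the BGREWRITEAOF command. BGREWRITEAOF works much like BGSAVE does when creating a snapshot, so an AOF rewrite also needs a child process and brings the same performance and memory-usage issues as snapshot persistence. Worse, if left unchecked, the AOF file can grow to several times the size of the snapshot file.
How the rewrite works
When appendonly.aof has grown too large, Redis forks a new process to rewrite the file (again writing a temporary file first and renaming it at the end). The new process walks the data in its memory, emitting one set command per entry. The rewrite does not read the old file; it re-expresses the entire in-memory database as commands in a new appendonly.aof.
Trigger mechanism
Just as snapshot persistence can run BGSAVE automatically through the save option, AOF persistence can run BGREWRITEAOF automatically through two options:
auto-aof-rewrite-percentage
auto-aof-rewrite-min-size
For example, suppose the following options are set and AOF persistence is enabled. Redis then runs BGREWRITEAOF when the AOF file is larger than 64mb and is at least twice as large (100% bigger) as it was after the last rewrite. This is also the default configuration.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb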
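Both halves of the rewrite mechanism can be sketched in Python: producing a minimal command list from the in-memory data, and deciding when a rewrite is due. The function names are hypothetical. The trigger arithmetic is a simplified model of the rule described above, not code taken from Redis.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, str, str]


def rewrite_from_memory(state: Dict[str, str]) -> List[Command]:
    """Emit one SET command per key; the old log is never read."""
    return [("SET", key, value) for key, value in state.items()]


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100,
                   min_size: int = 64 * 1024 * 1024) -> bool:
    """Decide whether an automatic rewrite is due.

    current_size: size of the AOF file now, in bytes.
    base_size:    size of the AOF file after the last rewrite, in bytes.
    percentage:   models auto-aof-rewrite-percentage (0 disables the feature).
    min_size:     models auto-aof-rewrite-min-size.
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage
```

However many times a key was overwritten in the old log, `rewrite_from_memory` emits a single command for it. With the defaults, a 128 MB file whose base size was 64 MB qualifies for a rewrite, while a 60 MB file never does because it is under the minimum size.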
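7. Advantages
The sync policy can be chosen to trade durability against speed:
appendfsync no: leave it to the operating system to sync cached data to disk (fast)
appendfsync always: synchronous persistence; fsync() is called after every update to write the data to disk (slow, safe)
appendfsync everysec: asynchronous; sync once per second (compromise, the default)
8. Disadvantages
For the same dataset, the AOF file is much larger than the RDB file, and recovery is slower than with RDB.
AOF runs slower than RDB: with everysec the efficiency is good, and with no sync it matches RDB.
9. Summary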
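III. Comparing the Two: Performance Recommendations
Because the RDB file serves only as a backup, it is advisable to persist RDB files only on the slave, and one backup every 15 minutes is enough: keep only the save 900 1 rule.
If AOF is enabled, the benefit is that even in the worst case no more than two seconds of data are lost, and the startup script is simple because it only has to load its own AOF file. The costs are, first, continuous I/O, and second, the almost unavoidable blocking at the end of an AOF rewrite, when the new data produced during the rewrite is written to the new file. As long as disk space allows, reduce the frequency of AOF rewrites as far as possible: the default base size of 64M is too small and can be set to 5G or more, and the default rule of rewriting once the file exceeds its original size by 100% can be changed to a suitable value.
If AOF is not enabled, high availability can also be achieved with master-slave replication alone. This saves a large amount of I/O and avoids the system fluctuations that rewrites cause. The cost is that if the master and slave go down at the same time, more than ten minutes of data are lost, and the startup script has to compare the RDB files on the master and the slave and load the newer one. Sina Weibo chose this architecture.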
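IV. Persistence Improvements in Redis 4.0
Redis 4.0 introduced mixed RDB and AOF persistence (off by default in 4.0 and enabled through the aof-use-rdb-preamble option; the AOF configuration listing above already sets it to yes).
With mixed persistence on, an AOF rewrite writes the RDB content directly at the start of the AOF file. This combines the strengths of RDB and AOF: loading is fast, and little data is lost. There is a drawback too: the RDB portion of the AOF file is in the compressed RDB format rather than the AOF command format, so the file is less readable.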
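Whether a given AOF file carries an RDB preamble can be checked by looking at its first bytes, since the configuration comments note that such a file starts with the "REDIS" string. The helper below is a hypothetical sketch that assumes a single-file AOF as described in this article.

```python
def has_rdb_preamble(aof_path: str) -> bool:
    """Return True if the file begins with the "REDIS" magic string of an RDB payload."""
    with open(aof_path, "rb") as aof:
        return aof.read(5) == b"REDIS"
```

A plain AOF file begins with a logged command instead, so the check returns False for it.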