Redis之数据持久化
介绍
Redis是内存数据库,断电之后,内存中的数据会自动被抹除。这样就需要进行持久化操作,将内存中的数据同步到硬盘中,Redis支持两种数据持久化方法。RDB和AOF模式。
RDB
1、介绍
RDB(Redis DataBase)可以在指定的时间间隔内将内存中的数据集(全部键值对),写入到磁盘中,生成经过压缩的二进制RDB文件,通过该文件可以还原数据库的状态。
2、命令及原理
支持两个命令,都可以在客户端手动执行:
- SAVE:SAVE命令由主进程执行,所以会阻塞主进程(客户端发送的所有命令都会被拒绝),直到主进程执行完SAVE命令、重新开始接收命令请求之后,客户端发送的命令才会被处理。命令都进入了TCP套接字的缓冲区,存在于内核,并没有读取到用户区。这是epoll全部可读。
- BGSAVE:所谓的后台备份,BGSAVE命令会派生(fork)出一个子进程,专门负责创建RDB文件,服务器继续处理命令请求。
- BGSAVE原理:fork的作用是复制一个与当前进程一样的进程。新进程的所有数据(变量、环境变量、程序计数器等)数值都和原进程一致,但是是一个全新的进程,并作为原进程的子进程。
3、间接性保存
因为后台备份不会阻塞服务器,所以Redis允许用户通过配置文件设置SAVE选项,让服务器每隔一段时间自动执行一次BGSAVE命令。下面是关于RDB的配置说明。
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
/*
save "",表示不执行RBD保存。
每个900s,修改1次,执行一次BGSAVE
或每隔300s,修改10次,执行一次BGSAVE
或每隔60s,修改10000次,执行一次BGSAVE
三个条件,任意一个满足,就会执行BGSAVE。
*/
# save ""
save 900 1
save 300 10
save 60 10000
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
/*
后台备份不关注数据一致性,不一致的时候,继续接收写。
*/
stop-writes-on-bgsave-error yes
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
/*
对于存储到磁盘中的快照,可以设置是否进行压缩存储。
如果是的话,redis会采用LZF算法进行压缩。
如果你不想消耗CPU来进行压缩的话,可以设置为关闭此功能。
*/
rdbcompression yes
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
/*
在存储快照后,还可以让redis使用CRC64算法来进行数据校验,但是这样做会增加大约
10%的性能消耗,如果希望获取到最大的性能提升,可以关闭此功能。
*/
rdbchecksum yes
# The filename where to dump the DB
/*
当前服务器RDB文件的名字,最好每每个服务器对应一个名字,防止文件覆盖。
*/
dbfilename dump.rdb
4、RDB优缺点
适合大规模的数据恢复,对数据完整性和一致性要求不高,在一定间隔时间做一次备份,所以如果redis意外down掉的话,就会丢失最后一次快照后的所有修改,fork的时候,内存中的数据被克隆了一份,大致2倍的膨胀性也是需要考虑。
AOF
有了前面的RDB为什么还需要有牛逼的AOF呢?因为对于RDB都会有一个实现限制,所以当服务器出现宕机,会出现大量的数据丢失。于是出现了AOF,尽量减小数据的丢失。
1、介绍
AOF(Append Only File),与RDB持久化通过保存数据库中的键值对来记录数据库状态不同,AOF持久化是通过保存Redis服务器所执行的写命令来记录数据库状态的。被写人 AOF文件的所有命令都是以Redis的命令请求协议格式保存的,因为Redis的命令请求协议是纯文本格式,所以我们可以直接打开一个AOF文件,观察其中的写命令。
2、命令及同步原理
AOF持久化功能实现分为命令追加(append)、文件写入(write)、文件同步(sync)三个步骤。下面说明Redis内部如何实现这几个操作。
命令追加:
当AOF持久化功能处于打开状态时,服务器在每次执行完一个写命令之后,会以协议格式将被执行的写命令追加到服务器状态的aof_buf
缓冲区的末尾。
struct redisServer {
//....................
/* AOF persistence */
int aof_state; /* AOF_(ON|OFF|WAIT_REWRITE) */
int aof_fsync; /* Kind of fsync() policy */
char *aof_filename; /* Name of the AOF file */
int aof_no_fsync_on_rewrite; /* Don't fsync if a rewrite is in prog. */
int aof_rewrite_perc; /* Rewrite AOF if % growth is > M and... */
off_t aof_rewrite_min_size; /* the AOF file is at least N bytes. */
off_t aof_rewrite_base_size; /* AOF size on latest startup or rewrite. */
off_t aof_current_size; /* AOF current size. */
int aof_rewrite_scheduled; /* Rewrite once BGSAVE terminates. */
pid_t aof_child_pid; /* PID if rewriting process */
list *aof_rewrite_buf_blocks; /* Hold changes during an AOF rewrite. */
sds aof_buf; /* AOF buffer, written before entering the event loop */
int aof_fd; /* File descriptor of currently selected AOF file */
int aof_selected_db; /* Currently selected DB in AOF */
time_t aof_flush_postponed_start; /* UNIX time of postponed AOF flush */
time_t aof_last_fsync; /* UNIX time of last fsync() */
time_t aof_rewrite_time_last; /* Time used by last AOF rewrite run. */
time_t aof_rewrite_time_start; /* Current AOF rewrite start time. */
int aof_lastbgrewrite_status; /* C_OK or C_ERR */
unsigned long aof_delayed_fsync; /* delayed AOF fsync() counter */
int aof_rewrite_incremental_fsync;/* fsync incrementally while rewriting? */
int aof_last_write_status; /* C_OK or C_ERR */
int aof_last_write_errno; /* Valid if aof_last_write_status is ERR */
int aof_load_truncated; /* Don't stop on unexpected AOF EOF. */
int aof_use_rdb_preamble; /* Use RDB preamble on AOF rewrites. */
/* AOF pipes used to communicate between parent and child during rewrite. */
int aof_pipe_write_data_to_child;
int aof_pipe_read_data_from_parent;
int aof_pipe_write_ack_to_parent;
int aof_pipe_read_ack_from_child;
int aof_pipe_write_ack_to_child;
int aof_pipe_read_ack_from_parent;
int aof_stop_sending_diff; /* If true stop sending accumulated diffs
to child process. */
sds aof_child_diff; /* AOF diff accumulator child side. */
//............................
}
上述就是服务器结构体中的aof_buf
。这个缓冲区肯定是先保存在用户空间的内存中。每次服务器执行SET
指令,后就会在sds这个aof_buf
字符串后面添加对应的命令,这就是追加的过程,其实就是一个简单的动态字符串而已。
写入和同步:
Linux关于文件写入的方式,这是重点重点重点(参考Unix环境高级编程):
传统的Uinx系统实现在内核中设有缓冲区高速缓存或页高速缓存,大多数磁盘I/O都通过缓冲区进行。当我们向文件写入数据时,为了提高效率,内核通常先将数据复制到缓冲区中,然后排入队列,晚些时候再写入磁盘,这种方式被称为延迟写。通常,当内核需要重用缓冲区来存放其他磁盘块数据时,它会把所有延迟写数据块写入磁盘。为了保证磁盘上实际文件系统与缓冲区中内容的一致性,系统提供了sync、fsync、fdatasync
三个函数
完全是为了解决磁盘写入慢的问题。
include<unistd.h>
int fsync(int fd);
/*
叫做同步,不进行延迟写。
对文件描述为fd,等待写磁盘结束才返回,通常用于数据库应用程序,确保修改过得块可以立即磁盘。
*/
int fdatasync(int fd);
/*
类似于fsync,但是只影响文件数据部分。
*/
void sync (void);
/*
异步操作:只是将所有修改过得块缓冲区从用户区排入写队列内核区,然后返回,不必等待写磁盘结束,何时真正写入磁盘,由操作系统的文件系统决定。不确定。
通常update守护进行周期调用,定时flush内核的块缓冲区。
*/
//返回值:若成功,返回0; 若出错,返回-1
在Redis服务器事件循环中,会调用flushAppendOnlyFile
函数,考虑是否需要将aof_buf
缓冲区中的内容写入(write)和保存到AOF文件里面。这个根据配置文件决定。
3、AOF重写
https://yq.aliyun.com/articles/193034
因为AOF持久化是通过保存被执行的写命令来记录数据库状态的,所以随着服务器运行时间的流逝,AOF文件中的内容会越来越多(因为不仅需要记录数据,还需要记录命令),文件的体积也会越来越大,如果不加以控制的话,体积过大的AOF文件很可能对Redis服务器、甚至整个宿主计算机造成影响,并且AOF文件的体积越大,使用AOF文件来进行数据还原所需的时间就越多。
重写的原理:多个添加操作,能否经过处理变成一条添加的操作,那么存储会节省内存?于是,先从数据库中读取键现在的值,然后用一条命令去记录键值对,代替之前记录这个键值对的多条命令,这就是 AOF 重写功能的实现原理。
在配置文件中,当AOF文件超过一定大小之后,就可以配置AOF重写功能了。这个重写是通过后台重写进程处理的。过程如下:
- 子进程进行AOF重写期间,服务器进程(父进程)可以继续处理命令请求。
- 子进程带有服务器进程的数据副本,使用子进程而不是线程,可以在避免使用锁的情况下,保证数据的安全性。
- 子进程在进行AOF重写期间,服务器进程还需要继续处理命令请求,而新的命令可能会对现有的数据库状态进行修改,从而使得服务器当前的数据库状态和重写后的 AOF 文件所保存的数据库状态不一致。为了解决这种数据不一致问题。Redis 服务器设置了一个AOF重写缓冲区,这个缓冲区在服务器创建子进程之后开始使用,当Redis服务器执行完一个写命令之后,它会同时将这个写命令发送给AOF缓冲区和AOF重写缓冲区。
- 当子进程完成AOF重写工作之后,它会向父进程发送一个信号,父进程在接到该信号之后,会调用一个信号处理函数。首先将AOF重写缓冲区中的所有内容写入到新AOF文件中,这时新AOF文件所保存的数据库状态将和服务器当前的数据库状态一致。然后对新的AOF文件进行改名,原子地 (atomic) 覆盖现有的AOF文件,完成新旧两个AOF文件的替换,父进程就可以接收处理了。在整个AOF后台重写过程中,只有信号处理函数执行时会对服务器进程(父进程)造成阻塞,在其他时候AOF后台重写都不会阻塞父进程,这将AOF重写对服务器性能造成的影响降到了最低。
4、配置文件
############################## APPEND ONLY MODE ###############################
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
/*
是否开启AOF选项
*/
appendonly no
# The name of the append only file (default: "appendonly.aof")
/*
AOF文件名
*/
appendfilename "appendonly.aof"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
/*
append追加策略:
always在每次事件循环中,都将aof_buf中缓冲区内容写入(write)并同步(fsync)到AOF文件。
这种叫做同步持久化,效率最差,但是安全性最高,即使出现故障停机,AOF持久化也就丢失一个事件循环中的命令。
everysec在每次事件循环中,都将aof_buf中缓冲区内容写入(write)到AOF文件,这时候并不同步。
如果距离上次同步超过了1s,那么才对AOF文件进行同步(fsync),并且同步由专门的线程负责。
异步同步,服务器每次将值写入AOF,但是每隔1s才进行一次同步,从效率上,everysec模块足够快,就算宕机,也就丢失1s中内的数据。
no在每次事件循环中,都将aof_buf中缓冲区内容写入(write)到AOF文件,不同步,至于合适同步由操作系统决定。因为flushAppendOnlyFile函数无需调用
fsync函数,不需要等待磁盘,所以写入速度是最快的,不过因为这种模式会在系统缝存中积累一段时间的写入数据,所以该模式的单次同步时长通常是三种模式中时间最长的。
*/ 从平摊操作的角度来看,no模式和everysec模式的效率类似,当出现故障停机使用no模式的服务器将丢失上次同步AOF文件之后的所有写命令数据。
# appendfsync always
appendfsync everysec
# appendfsync no
# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
/*
fsync同步函数可能占用大量IO操作,导致阻塞其他操作。
所以可以切换成sync异步模式,但是可能导致数据不完整。
*/
no-appendfsync-on-rewrite no
# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
/*
自动重写,当AOF文件增长过大,则会自动fork子进程,重写AOF的大小。
详解,原理在前面已经讲解了,具体的代码就不细心揣摩了。
auto-aof-rewrite-percentage 代表全写先前AOF的多少,100代表全部重写
size代表当AOF到达多大才进行重写,大点的公司一般会是4G左右。
*/
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes
# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
#
# This is currently turned off by default in order to avoid the surprise
# of a format change, but will at some point be used as the default.
/*
是否使用AOF和RDB一起持久化
*/
aof-use-rdb-preamble no
4、AOF对比于RDB
- RDB持久化方式能够在指定的时间间隔能对数据进行快照存储。
- AOF持久化方式记录每次对服务器写的操作,当服务器重启的时候会重新执行这些
命令来恢复原始的数据,AOF命令以Redis协议追加保存每次写的操作到文件末尾.。Redis还能对AOF文件进行后台重写,使得AOF文件的体积不至于过大。 - 作为分布式缓存可以不使用任何持久化方式。
- 同时开启两种持久化方式,在这种情况下,当redis重启的时候会优先载入AOF文件来恢复原始的数据,因为在通常情况下AOF文件保存的数据集要比RDB文件保存的数据集要完整。RDB的数据不实时,同时使用两者时服务器重启也只会找AOF文件。那要不要只使用AOF呢?作者建议不要,因为RDB更适合用于备份数据库(AOF在不断变化不好备份),快速重启,而且不会有AOF可能潜在的bug,留着作为一个万一的手段。
- 性能建议,因为RDB文件只用作后备用途,建议只在Slave上持久化RDB文件,而且只要15分钟备份一次就够了,只保留save 900 1这条规则。
- 如果使用了AOF,好处是在最恶劣情况下也只会丢失不超过两秒数据,代价一是带来了持续的IO;二是AOF rewrite的最后将rewrite过程中产生的新数据写到新文件造成的阻塞几乎是不可避免的(通常启动后台进程减少主进程的命令处理)。只要硬盘许可,应该尽量减少AOF rewrite的频率,AOF重写的基础大小默认值64M太小了,可以设到5G以上。默认超过原大小100%大小时重写可以改到适当的数值。