Redis Persistence Mechanisms Explained

2022-09-08 13:59:52

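Redis persistence mechanisms

Redis keeps all of its data in memory, so everything is lost when the process exits. To protect the data, Redis supports two persistence mechanisms, RDB and AOF. RDB can be thought of as a snapshot of Redis at a point in time and is well suited to disaster recovery. AOF is a log of write operations. This article covers RDB, AOF, and using the two together in hybrid mode.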

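RDB snapshots

RDB works like a camera photographing the data Redis holds in memory: it generates a snapshot and saves it to disk. RDB persistence can be triggered manually or automatically. Loading an RDB file on restart is fast, but RDB cannot persist in real time, so it is generally used for cold backups and for transferring data during replication.

How RDB is triggered

Manual triggering

1. Blocking: the save command. It saves synchronously in the Redis main thread, which blocks the server and makes the service unavailable until the RDB process finishes. Do not use it in production, however small the dataset.
2. Non-blocking: the bgsave command. It creates a child process with fork() and saves in the background. The server is blocked only during the fork itself, not for the whole RDB process, and the fork is usually short. For this reason everything inside Redis that involves RDB uses bgsave. Note that for both RDB and AOF, thanks to copy-on-write, the forked child does not copy the parent's physical memory; it copies only the parent's page table, that is, the mapping from virtual memory to physical memory.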

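Automatic triggering

We usually do not generate RDB files by running commands directly. Redis can trigger RDB persistence automatically, and the settings all live in redis.conf: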

################################ SNAPSHOTTING  ################################

# Save the DB to disk.
#
# save <seconds> <changes>
#
# Redis will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# Snapshotting can be completely disabled with a single empty string argument
# as in following example:
#
# save ""
#
# Unless specified otherwise, by default Redis will save the DB:
#   * After 3600 seconds (an hour) if at least 1 key changed
#   * After 300 seconds (5 minutes) if at least 100 keys changed
#   * After 60 seconds if at least 10000 keys changed
#
# Note: with the defaults above, a save happens when
#   at least 1 key changed within 3600 seconds (1 hour),
#   at least 100 keys changed within 300 seconds (5 minutes), or
#   at least 10000 keys changed within 60 seconds (1 minute).
# Meeting any one of these save points triggers an RDB save.
# (Older redis.conf files used 900 1 and 300 10 for the first two save points.)
# save "" disables RDB.
#
# You can set these explicitly by uncommenting the three following lines.
#
save 3600 1
save 300 100
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
#
# Note: if set to yes, Redis stops accepting writes when bgsave fails, so the
# user learns that data is not being stored on disk correctly. Otherwise
# nobody may notice the problem, and a disaster could follow.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# By default compression is enabled as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
#
# Note: whether to compress the RDB file; LZF compression costs extra CPU.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
#
# Note: whether to verify the RDB file with a checksum.
rdbchecksum yes

# Enables or disables full sanitation checks for ziplist and listpack etc when
# loading an RDB or RESTORE payload. This reduces the chances of a assertion or
# crash later on while processing commands.
# Options:
#   no         - Never perform full sanitation
#   yes        - Always perform full sanitation
#   clients    - Perform full sanitation only for user connections.
#                Excludes: RDB files, RESTORE commands received from the master
#                connection, and client connections which have the
#                skip-sanitize-payload ACL flag.
# The default should be 'clients' but since it currently affects cluster
# resharding via MIGRATE, it is temporarily set to 'no' by default.
#
# sanitize-dump-payload no

# The filename where to dump the DB
#
# Note: the name of the RDB file, dump.rdb by default.
dbfilename dump.rdb

# Remove RDB files used by replication in instances without persistence
# enabled. By default this option is disabled, however there are environments
# where for regulations or other security concerns, RDB files persisted on
# disk by masters in order to feed replicas, or stored on disk by replicas
# in order to load them for the initial synchronization, should be deleted
# ASAP. Note that this option ONLY WORKS in instances that have both AOF
# and RDB persistence disabled, otherwise is completely ignored.
#
# An alternative (and sometimes better) way to obtain the same effect is
# to use diskless replication on both master and replicas instances. However
# in the case of replicas, diskless is not always an option.
#
# Note: whether to delete RDB files that exist only for replication on
# instances that have persistence disabled.
rdb-del-sync-files no

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
#
# Note: the directory where the RDB file is stored. This setting matters.
dir /var/lib/redis/6379

To summarize the settings:

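  1. **save m n:** Redis triggers an RDB save automatically when at least n changes have been made within m seconds. This is a key setting.
  2. **stop-writes-on-bgsave-error:** if yes, Redis stops accepting writes when a bgsave fails.
  3. **rdbcompression:** whether to compress the RDB file; LZF compression costs extra CPU.
  4. **rdbchecksum:** whether to verify the RDB file with a checksum.
  5. **dbfilename:** the name of the RDB file, dump.rdb by default.
  6. **dir:** the directory where the RDB file is stored. This setting matters.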
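
As an illustration of how the save points combine, here is a minimal Python sketch, not Redis's actual code. The names `SavePoint` and `should_snapshot` are made up for this example. It assumes a save point fires once at least `seconds` have passed since the last save and at least `changes` writes have accumulated; the exact boundary comparison in Redis may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SavePoint:
    """One `save <seconds> <changes>` line (hypothetical helper type)."""
    seconds: int
    changes: int


# The three save points shown in the redis.conf excerpt above.
DEFAULT_SAVE_POINTS = (
    SavePoint(3600, 1),
    SavePoint(300, 100),
    SavePoint(60, 10000),
)


def should_snapshot(elapsed_seconds, dirty_changes, save_points=DEFAULT_SAVE_POINTS):
    """Return True if any single save point is satisfied.

    Within one save point both conditions must hold (time AND changes);
    across save points it is enough that one of them matches.
    """
    return any(
        elapsed_seconds >= sp.seconds and dirty_changes >= sp.changes
        for sp in save_points
    )


if __name__ == "__main__":
    print(should_snapshot(61, 10000))  # busy instance: the 60-second rule matches
    print(should_snapshot(61, 9999))   # not enough changes for any rule yet
    print(should_snapshot(3601, 1))    # quiet instance: the hourly rule matches
```

An empty list of save points never triggers, which mirrors what `save ""` does.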

How RDB works

  • It has to be fast
  • It has to keep memory usage under control

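When Redis needs to save the dump.rdb file, the server does the following:

  1. Redis runs the bgsave command and checks whether a child process, such as an RDB or AOF child, is already running. If one is, bgsave returns immediately.
  2. The parent forks a child process. The Redis parent process is blocked during the fork, which is normally very fast.
  3. Once fork returns, the parent goes back to serving other commands (copy-on-write keeps the child's view unchanged).
  4. The child process writes the in-memory data to a new RDB file.
  5. The child tells the parent that it has finished, and the parent replaces the old RDB file with the new one.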
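
The steps above can be sketched in Python on a Unix-like system (`os.fork` is not available on Windows). This is only an illustration under stated assumptions: `bgsave` and `demo_bgsave` are hypothetical names, and the "RDB file" is plain JSON instead of Redis's binary format. Following the step list above, the parent installs the new file after the child reports success.

```python
import json
import os
import tempfile


def bgsave(dataset, tmp_path):
    """Fork a child that dumps `dataset` to tmp_path; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: it sees the dataset exactly as it was at fork time.
        status = 1
        try:
            with open(tmp_path, "w") as f:
                json.dump(dataset, f)
                f.flush()
                os.fsync(f.fileno())
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    return pid


def demo_bgsave():
    dataset = {"counter": 3}
    with tempfile.TemporaryDirectory() as workdir:
        rdb_path = os.path.join(workdir, "dump.json")  # stand-in for dump.rdb
        tmp_path = rdb_path + ".tmp"
        pid = bgsave(dataset, tmp_path)
        dataset["counter"] = 6  # the parent keeps serving writes after the fork
        _, status = os.waitpid(pid, 0)  # the child reports completion
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            raise RuntimeError("background save failed")
        os.replace(tmp_path, rdb_path)  # swap in the new file atomically
        with open(rdb_path) as f:
            snapshot = json.load(f)
    return snapshot, dataset


if __name__ == "__main__":
    print(demo_bgsave())
```

The snapshot holds the value from the moment of the fork, while the parent's copy already contains the later write.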

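Pros and cons

Pros

  • Well suited to full backups
  • RDB maximizes Redis performance: the work is done by a forked child (fast), and copy-on-write saves memory
  • Recovery is faster than with AOF

Cons

  • RDB cannot persist in real time, so data can be lost. For example, if an RDB save runs every half hour, one ran at 8:00, and the server goes down at 8:30, the data written between 8:00 and 8:30 is lost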

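Restoring data from an RDB file

Copy dump.rdb into the directory configured by dir (this is the bin directory under the Redis installation directory if Redis is started from there with the default dir ./) and restart the Redis service. In practice, dump.rdb is usually also backed up elsewhere in case the machine's disk fails.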

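AOF

RDB cannot guarantee that recent writes survive: if the Redis process crashes, the data written since the last RDB save is lost. AOF addresses this need for near-real-time persistence. It records every write command in a separate log and re-executes the commands in the AOF file on restart to restore the data. AOF first appends each command to the AOF buffer and then writes it to disk according to the configured policy (appendfsync); the settings are covered later. Next we look at the AOF rewrite command.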
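
To make the "log every write, replay on restart" idea concrete, here is a small Python sketch. It is an assumption-laden toy: real AOF files use the Redis protocol format, whereas this example uses one space-separated command per line, supports only SET, INCR and DEL, and the helper names `append_command`, `replay` and `demo_aof` are invented for illustration.

```python
import os
import tempfile


def append_command(aof_file, *args, fsync=True):
    """Append one write command to the log; optionally force it to disk."""
    aof_file.write(" ".join(args) + "\n")
    aof_file.flush()
    if fsync:
        os.fsync(aof_file.fileno())  # comparable in spirit to appendfsync always


def replay(path):
    """Rebuild the dataset by re-executing every logged command in order."""
    data = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            command, key = parts[0].upper(), parts[1]
            if command == "SET":
                data[key] = parts[2]
            elif command == "INCR":
                data[key] = str(int(data.get(key, "0")) + 1)
            elif command == "DEL":
                data.pop(key, None)
    return data


def demo_aof():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "appendonly.aof")
        with open(path, "a") as aof:
            append_command(aof, "SET", "a", "1")
            append_command(aof, "INCR", "a")
            append_command(aof, "SET", "b", "x")
            append_command(aof, "DEL", "b")
        return replay(path)  # simulates a restart


if __name__ == "__main__":
    print(demo_aof())
```

Passing `fsync=False` leaves the flush to the operating system, which trades durability for speed in the same way the appendfsync policies do.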

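Before and after Redis 4.0

How AOF is triggered

Manual triggering

Use the bgrewriteaof command: the Redis main process forks a child to perform the AOF rewrite, and the child writes the result to a new AOF file so that the old file is not affected. Because fork uses copy-on-write, the child cannot see data produced after it was created. Redis keeps those new writes in the "AOF rewrite buffer"; at the end the parent writes the contents of the AOF rewrite buffer to the new AOF file and then replaces the old file with the new one.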

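Automatic triggering

As with RDB, this is configured in redis.conf, and the settings can also be changed with the CONFIG SET command. Let's look at the AOF-related configuration first: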

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check https://redis.io/topics/persistence for more information.

appendonly no

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading, Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, then continues loading the AOF
# tail.
aof-use-rdb-preamble yes

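appendonly no: whether to log every write operation. By default Redis writes data to disk asynchronously, through snapshots. With AOF off, a power failure can lose some recent writes, because the data file is only synced according to the save conditions above and some data therefore exists only in memory for a while. The default is no.
appendfilename appendonly.aof: the name of the AOF file, appendonly.aof by default.
appendfsync always: after a command is written to the AOF buffer, every write is synced to disk. Slow, but the safest.
appendfsync everysec: after commands are written to the AOF buffer, a dedicated thread syncs them to disk once per second. Relatively fast; 1 to 2 seconds of data may be lost. Recommended, and the Redis default.
appendfsync no: after commands are written to the AOF buffer, the operating system decides when to flush them to disk.
no-appendfsync-on-rewrite no: whether to skip fsync while an AOF rewrite is running in the background. The default, no, means fsync is still called (whether or not a background child is writing to disk). While Redis writes an RDB file or rewrites the AOF file in the background there is heavy disk I/O, and on some Linux systems the fsync call may block.
auto-aof-rewrite-percentage 100: Redis remembers the size of the AOF file after the most recent rewrite. When the current AOF file has grown beyond that size by more than this percentage, a rewrite is triggered. The default is 100.
auto-aof-rewrite-min-size 64mb: the minimum file size for an automatic rewrite (files smaller than 64mb are not rewritten automatically).
aof-use-rdb-preamble yes: enables hybrid persistence, for faster AOF rewrites and faster data recovery at startup.
aof-load-truncated yes: whether to load the file or exit with an error when the end of the AOF file is truncated. With yes, the truncated AOF file is loaded and a log message informs the user. With no, the server reports an error and refuses to start; the user then has to repair the AOF file with the redis-check-aof tool before restarting.
Note: when aof-use-rdb-preamble is yes, an AOF rewrite no longer generates write commands from the current contents. Instead it first writes an RDB-format snapshot at the start of the file and then appends the incremental write commands that arrived while the RDB part was being generated.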
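
The interaction between auto-aof-rewrite-percentage and auto-aof-rewrite-min-size can be sketched as a small Python function. This follows the description in the configuration comments above; `should_rewrite` is a hypothetical name, and the integer arithmetic and strict size comparison are assumptions, not Redis source code.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the latest rewrite (or at startup)
    percentage:   auto-aof-rewrite-percentage; 0 disables automatic rewrites
    min_size:     auto-aof-rewrite-min-size, in bytes
    """
    if percentage == 0:
        return False
    if current_size <= min_size:
        return False  # still small, not worth rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage


if __name__ == "__main__":
    print(should_rewrite(128 * MB, 64 * MB))  # doubled since the last rewrite
    print(should_rewrite(100 * MB, 64 * MB))  # grew by only about 56 percent
    print(should_rewrite(32 * MB, 1 * MB))    # grew a lot but is below min-size
```

With the defaults shown above, a file whose base size was 64mb is rewritten once it reaches 128mb.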

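How AOF rewriting works

  1. Redis calls fork(), so there are now two processes, a parent and a child.
  2. The child starts writing the new AOF to a temporary file.
  3. The parent accumulates new changes in an in-memory buffer (and at the same time keeps writing them to the old AOF file, so we are safe even if the rewrite fails).
  4. When the child finishes rewriting the file, the parent receives a signal and appends the in-memory buffer to the end of the file the child produced.
  5. Done! Redis atomically renames the new file over the old one and starts appending new data to the new file.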

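Pros and cons of AOF

Pros

  • With the everysec policy, at most about 1 to 2 seconds of data are lost.
  • Even if the file ends with a half-written command for some reason (a full disk or something else), the redis-check-aof tool can repair it easily.
  • When the AOF file grows too large, Redis rewrites it automatically in the background. The rewrite is safe, and the rewritten AOF file is smaller.

Cons

  • For the same dataset, the AOF file is usually larger than the equivalent RDB file.
  • AOF can be slower than RDB, depending on the fsync policy.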

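AOF rewrite

As you might guess, the AOF file keeps growing as write operations are executed. For example, if you increment a counter 100 times, your dataset contains a single key holding the final value, but the AOF contains 100 entries. 99 of them are not needed to rebuild the current state.

So Redis supports an interesting feature: rebuilding the AOF in the background without affecting service to clients. Whenever you send BGREWRITEAOF, Redis writes a new AOF file containing the shortest sequence of commands needed to rebuild the dataset currently in memory. If you use AOF with Redis 2.2, you need to run BGREWRITEAOF from time to time. Redis 2.4 can trigger the log rewrite automatically (see the example configuration file shipped with Redis 2.4 for more information).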
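
The counter example above can be reproduced with a short Python sketch. It is a simplification under stated assumptions: commands are tuples, only SET, INCR and DEL are understood, and `apply_commands` and `compact` are illustrative names. The point is only that the rewritten log is generated from the final state, not from the old log entries.

```python
def apply_commands(commands):
    """Execute logged commands in order and return the resulting dataset."""
    state = {}
    for command, key, *rest in commands:
        if command == "SET":
            state[key] = rest[0]
        elif command == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif command == "DEL":
            state.pop(key, None)
    return state


def compact(commands):
    """Return a minimal log: one SET per key that still exists."""
    state = apply_commands(commands)
    return [("SET", key, value) for key, value in state.items()]


if __name__ == "__main__":
    old_log = [("INCR", "counter")] * 100
    new_log = compact(old_log)
    print(len(old_log), "entries before,", len(new_log), "after")
    print(new_log)
```

Replaying the compacted log yields the same dataset as replaying the original 100 entries.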

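Enabling AOF and using RDB and AOF together (hybrid mode)
The top of the AOF file is RDB data, followed by the command log.
After an AOF rewrite
The top of the AOF file is again RDB data, followed by the command log written since the rewrite.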

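Background: fork and copy-on-write

On Linux, when the fork system call creates a child process, it does not copy all of the memory pages the parent occupies. The child shares the same pages with the parent, and a page is copied only when the child or the parent modifies it.
A process's memory can be divided into virtual memory and physical memory.

  • Physical memory: the RAM installed in the machine. With 2GB of RAM installed, the system has a physical address space of 0 to 2GB.
  • Virtual memory: an address space virtualized in software. On a 32-bit operating system, each process has its own 4GB virtual address space.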


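Copy-on-write works roughly as follows:

  • When the child process is created, the parent's mapping between virtual memory and physical memory is copied to the child.
  • When the child or the parent modifies data in memory, copy-on-write kicks in: the affected page is copied and the mapping is updated. For example, if the parent changes a value from 3 to 6, new space is allocated in physical memory to hold 6 and the parent's pointer is redirected to it, while the child's pointer still refers to 3.

When the child is created, parent and child point at the same physical memory; the physical memory used by the parent is not duplicated. This has two benefits:

  • Creating the child process is faster.
  • Processes use less physical memory.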
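
The 3-versus-6 example can be observed from user space with a short Python sketch on a Unix-like system. It shows the visible effect (each process keeps its own view after the fork), not the page-level mechanics, which the kernel handles. The function name `cow_demo` and the use of pipes for synchronization are choices made for this illustration.

```python
import os


def cow_demo():
    """Return (value seen by the parent, value seen by the child)."""
    data = [3]
    result_r, result_w = os.pipe()  # child -> parent: the value the child sees
    ready_r, ready_w = os.pipe()    # parent -> child: "I have modified the data"

    pid = os.fork()
    if pid == 0:
        # Child: wait until the parent has changed its copy, then report ours.
        try:
            os.close(result_r)
            os.close(ready_w)
            os.read(ready_r, 1)
            os.write(result_w, str(data[0]).encode())
        finally:
            os._exit(0)

    # Parent: modify the value after the fork, then let the child continue.
    os.close(result_w)
    os.close(ready_r)
    data[0] = 6
    os.write(ready_w, b"x")
    os.close(ready_w)
    child_value = int(os.read(result_r, 16).decode())
    os.close(result_r)
    os.waitpid(pid, 0)
    return data[0], child_value


if __name__ == "__main__":
    parent_value, child_value = cow_demo()
    print("parent sees", parent_value, "child sees", child_value)
```

The child reads its value only after the parent has written 6, yet it still sees 3, which is the behavior an RDB or AOF rewrite child relies on.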

Author: MR_YANGMIN
Original link: https://blog.csdn.net/weixin_45239670/article/details/123486611
Updated: 2022-09-08 13:59:52