欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

Redis源码分析(七)---zipmap压缩图

程序员文章站 2022-05-01 11:23:01
...

如果有看过之前我分析的ziplist压缩列表的分析的话,理解这个我觉得不是什么特别的难题。ziplist压缩列表和zipmap都采用了动态分配字节的做法表示长度,比如通过固定的字节表示节省了不少的空间。同样带来的问题就是复杂的指针移动,和字符位置移动。但总的

如果有看过之前我分析的ziplist压缩列表的分析的话,理解这个我觉得不是什么特别的难题。ziplist压缩列表和zipmap都采用了动态分配字节的做法表示长度,比如通过固定的字节表示节省了不少的空间。同样带来的问题就是复杂的指针移动,和字符位置移动。但总的来说,一定是利大于弊了,要不然设计者也不会这么做。ziplist保存的使用一个列表,zipmap就保存的则是一个个键值对,通过key:value key:value的形式连着。下面我给出zipmap的结构构成,zipmap其实也就是一个超级长的字符串。

"foo""bar""hello""world" 
里面涉及了几个变量zmlen,len,free,下面给出完整的解释:
/* String -> String Map data structure optimized for size.
 * This file implements a data structure mapping strings to other strings
 * implementing an O(n) lookup data structure designed to be very memory
 * efficient.
 *
 * The Redis Hash type uses this data structure for hashes composed of a small
 * number of elements, to switch to a hash table once a given number of
 * elements is reached.
 *
 * Given that many times Redis Hashes are used to represent objects composed
 * of few fields, this is a very big win in terms of used memory.
 *
 * zipmap压缩表和ziplist十分类似,都做到了内存操作效率比较高的
 * --------------------------------------------------------------------------
 *
 * Copyright (c) 2009-2010, Salvatore Sanfilippo 
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of Redis nor the names of its contributors may be used
 *     to endorse or promote products derived from this software without
 *     specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

/* Memory layout of a zipmap, for the map "foo" => "bar", "hello" => "world":
 *
 * "foo""bar""hello""world"
 *
 *  is 1 byte length that holds the current size of the zipmap.
 * When the zipmap length is greater than or equal to 254, this value
 * is not used and the zipmap needs to be traversed to find out the length.
 * 占有着1个字节,所以他的最多可代表的数量是254,当zipmap中的元素记录超过这个数时,
 * 那只能从前往后后遍历算大小了,和ziplist是不一样的。
 *
 *  is the length of the following string (key or value).
 *  lengths are encoded in a single value or in a 5 bytes value.
 * If the first byte value (as an unsigned 8 bit value) is between 0 and
 * 252, it's a single-byte length. If it is 253 then a four bytes unsigned
 * integer follows (in the host byte ordering). A value of 255 is used to
 * signal the end of the hash. The special value 254 is used to mark
 * empty space that can be used to add new key/value pairs.
 * 代表了后面字符串key 或 value的值的长度,长度一般被编码1个字节或5个字节表示,这个和ziplist类似
 * 如果后面的字符串长度小于等于252个,可与用单字节表示,其他253,254等长度被用来表示其他作用了,当超过这个数时候
 * 则直接按5字节的方式存储长度。
 *
 *  is the number of free unused bytes after the string, resulting
 * from modification of values associated to a key. For instance if "foo"
 * is set to "bar", and later "foo" will be set to "hi", it will have a
 * free byte to use if the value will enlarge again later, or even in
 * order to add a key/value pair if it fits.
 * 一般来表示后面的value长度的空闲值,当key:value=“foo”:"bar",后来被改为“foo”:"hi",空闲长度就为1了
 *
 *  is always an unsigned 8 bit number, because if after an
 * update operation there are more than a few free bytes, the zipmap will be
 * reallocated to make sure it is as small as possible.
 * 的数字一般比较小,如果空闲太大,zipmap会进行调整大小使map整体变得尽可能小
 *
 * The most compact representation of the above two elements hash is actually:
 * 这是一个例子:
 * "foo""bar""hello""world" 
 * key字符后面同前
 * "\x02\x03foo\x03\x00bar\x05hello\x05\x00world\xff"
 *
 * Note that because keys and values are prefixed length "objects",
 * the lookup will take O(N) where N is the number of elements
 * in the zipmap and *not* the number of bytes needed to represent the zipmap.
 * This lowers the constant times considerably.
 */
说到键值对,里面最最重要的方法当然是根据key ,setValue的方法了,方法如下:
/* Set key to value, creating the key if it does not already exist.
 * If 'update' is not NULL, *update is set to 1 if the key was
 * already preset, otherwise to 0. */
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update) {
    unsigned int zmlen, offset;
    unsigned int freelen, reqlen = zipmapRequiredLength(klen,vlen);
    unsigned int empty, vempty;
    unsigned char *p;

    freelen = reqlen;
    if (update) *update = 0;
    //寻找key的位置
    p = zipmapLookupRaw(zm,key,klen,&zmlen);
    if (p == NULL) {
        /* Key not found: enlarge */
        //key的位置没有找到,调整zipmap的大小,准备添加操作
        zm = zipmapResize(zm, zmlen+reqlen);
        p = zm+zmlen-1;
        zmlen = zmlen+reqlen;

        /* Increase zipmap length (this is an insert) */
        //如果头字节还没有达到最大值,则递增
        if (zm[0] = ZIPMAP_VALUE_MAX_FREE) {
        /* First, move the tail  bytes to the front, then resize
         * the zipmap to be  bytes smaller. */
        offset = p-zm;
        memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1));
        zmlen -= empty;
        zm = zipmapResize(zm, zmlen);
        p = zm+offset;
        vempty = 0;
    } else {
        vempty = empty;
    }

    /* Just write the key + value and we are done. */
    /* Key: */
    //定位到插入的位置,首先写入key值
    p += zipmapEncodeLength(p,klen);
    memcpy(p,key,klen);
    p += klen;
    /* Value: */
    //key值后面是value值,再次写入
    p += zipmapEncodeLength(p,vlen);
    *p++ = vempty;
    memcpy(p,val,vlen);
    return zm;
}
map里返回长度的方法有点特别,就直接定位了就用一个字节存储长度:
/* Return the number of entries inside a zipmap */
/* 返回map的长度 */
unsigned int zipmapLen(unsigned char *zm) {
    unsigned int len = 0;
    //如果第一个长度小于最大值,则直接返回
    if (zm[0] 
平常我们在redis客户端执行set key "value"命令的时候,调用的其实就是set方法,如下: 
    zm = zipmapSet(zm,(unsigned char*) "name",4, (unsigned char*) "foo",3,NULL);
    zm = zipmapSet(zm,(unsigned char*) "surname",7, (unsigned char*) "foo",3,NULL);
    zm = zipmapSet(zm,(unsigned char*) "age",3, (unsigned char*) "foo",3,NULL);
比ziplist方法简单许多了,最后给出头文件
/* String -> String Map data structure optimized for size.
 *
 * See zipmap.c for more info.
 *
 * --------------------------------------------------------------------------
 *
 * Copyright (c) 2009-2010, Salvatore Sanfilippo 
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of Redis nor the names of its contributors may be used
 *     to endorse or promote products derived from this software without
 *     specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

#ifndef _ZIPMAP_H
#define _ZIPMAP_H

unsigned char *zipmapNew(void);  //创建一个新的压缩图
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update); //设置压缩图中的某个键值对
unsigned char *zipmapDel(unsigned char *zm, unsigned char *key, unsigned int klen, int *deleted);  //删除压缩图上的某个键值对
unsigned char *zipmapRewind(unsigned char *zm);   //将在zipmapNext中被调用到
unsigned char *zipmapNext(unsigned char *zm, unsigned char **key, unsigned int *klen, unsigned char **value, unsigned int *vlen); //取得此键值对的下一个键值对
int zipmapGet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char **value, unsigned int *vlen); //获取某个键值对
int zipmapExists(unsigned char *zm, unsigned char *key, unsigned int klen); //某个key值在zipmap中是否存在
unsigned int zipmapLen(unsigned char *zm); //zipmap压缩图的总键值对数
size_t zipmapBlobLen(unsigned char *zm); //压缩图的序列化到文件中所需大小
void zipmapRepr(unsigned char *p);  //输出的压缩图的具体信息,用于测试

#endif
最后,基于本人对redis源代码分析有一段时间了,我把分析好的代码,同步到了我的个人github上了,放上地址大家可以一起学习:

github:https://github.com/linyiqun/Redis-Code