SQLite3 Source Code Study: PageCache Analysis

程序员文章站 2022-07-05 22:59:11
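The previous article covered pcache1, the pluggable page-cache implementation. SQLite wraps another layer on top of it, mainly to manage dirty pages (cached pages that have been modified): adding them, removing them, and recycling them. This layer is implemented in pcache.c.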

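1. Data structures

In pcache, a PCache object serves as the connection handle, and each cached page is represented by a PgHdr.

All dirty pages in the page cache are linked together in a doubly linked list, with the structure shown in the figure below: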

[Figure: doubly linked list of dirty PgHdr pages]

Here pCache->pDirty is the head of the list and pCache->pDirtyTail is its tail.
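
To make the relationship concrete, here is a minimal sketch of the two structures, reduced to the dirty-list fields discussed in this article. The names DemoPgHdr, DemoPCache and demoDirtyListCount() are hypothetical and exist only for illustration; the real PgHdr and PCache definitions carry many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for PgHdr: only the dirty-list links are kept. */
typedef struct DemoPgHdr DemoPgHdr;
struct DemoPgHdr {
  unsigned pgno;          /* Page number */
  DemoPgHdr *pDirtyNext;  /* Next (older) page on the dirty list */
  DemoPgHdr *pDirtyPrev;  /* Previous (newer) page on the dirty list */
};

/* Simplified stand-in for PCache: head and tail of the dirty list. */
typedef struct DemoPCache {
  DemoPgHdr *pDirty;      /* Head: most recently dirtied page */
  DemoPgHdr *pDirtyTail;  /* Tail: least recently dirtied page */
} DemoPCache;

/* Walk the list from the head, checking that every back link matches
** and that the last node reached is the recorded tail. Returns the
** number of pages on the list, or -1 if a link is inconsistent. */
static int demoDirtyListCount(const DemoPCache *p){
  int n = 0;
  const DemoPgHdr *pPrev = NULL;
  const DemoPgHdr *pPg;
  assert( p!=NULL );
  for(pPg=p->pDirty; pPg; pPg=pPg->pDirtyNext){
    if( pPg->pDirtyPrev!=pPrev ) return -1;
    pPrev = pPg;
    n++;
  }
  return pPrev==p->pDirtyTail ? n : -1;
}
```

An empty list has both pDirty and pDirtyTail set to NULL, which the check above reports as a count of 0.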

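2. Adding and removing dirty pages

The list is kept in LRU order. New elements are inserted at the head, so page p is newer than p->pDirtyNext. pCache->pDirty points to the newest page and pCache->pDirtyTail points to the oldest.

Insertion and removal are both handled by pcacheManageDirtyList():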

/* Allowed values for second argument to pcacheManageDirtyList() */
#define PCACHE_DIRTYLIST_REMOVE   1    /* Remove pPage from dirty list */
#define PCACHE_DIRTYLIST_ADD      2    /* Add pPage to the dirty list */
#define PCACHE_DIRTYLIST_FRONT    3    /* Move pPage to the front of the list */
 
/*
** Manage pPage's participation on the dirty list.  Bits of the addRemove
** argument determines what operation to do.  The 0x01 bit means first
** remove pPage from the dirty list.  The 0x02 means add pPage back to
** the dirty list.  Doing both moves pPage to the front of the dirty list.
*/
static void pcacheManageDirtyList(PgHdr *pPage, u8 addRemove){
  PCache *p = pPage->pCache;
 
  pcacheTrace(("%p.DIRTYLIST.%s %d\n", p,
                addRemove==1 ? "REMOVE" : addRemove==2 ? "ADD" : "FRONT",
                pPage->pgno));  // Print debug trace
  // Remove the page from the list
  if( addRemove & PCACHE_DIRTYLIST_REMOVE ){
    assert( pPage->pDirtyNext || pPage==p->pDirtyTail );
    assert( pPage->pDirtyPrev || pPage==p->pDirty );

    /* Update the PCache1.pSynced variable if necessary. */
    if( p->pSynced==pPage ){
      p->pSynced = pPage->pDirtyPrev;
    }

    if( pPage->pDirtyNext ){
      // Make the next node point back to the previous node
      pPage->pDirtyNext->pDirtyPrev = pPage->pDirtyPrev;
    }else{
      assert( pPage==p->pDirtyTail );
      // The removed page was the tail, so update the tail pointer
      p->pDirtyTail = pPage->pDirtyPrev;
    }
    if( pPage->pDirtyPrev ){
      // Make the previous node point forward to the next node
      pPage->pDirtyPrev->pDirtyNext = pPage->pDirtyNext;
    }else{
      /* If there are now no dirty pages in the cache, set eCreate to 2. 
      ** This is an optimization that allows sqlite3PcacheFetch() to skip
      ** searching for a dirty page to eject from the cache when it might
      ** otherwise have to.  */
      assert( pPage==p->pDirty );
      // The removed page was the head, so update the head pointer
      p->pDirty = pPage->pDirtyNext;
      assert( p->bPurgeable || p->eCreate==2 );
      if( p->pDirty==0 ){         /*OPTIMIZATION-IF-TRUE*/
        assert( p->bPurgeable==0 || p->eCreate==1 );
        // No dirty pages remain, so set p->eCreate to 2
        p->eCreate = 2;
      }
    }
    pPage->pDirtyNext = 0;
    pPage->pDirtyPrev = 0;
  }
  // Insert the page at the head of the list
  if( addRemove & PCACHE_DIRTYLIST_ADD ){
    assert( pPage->pDirtyNext==0 && pPage->pDirtyPrev==0 && p->pDirty!=pPage );

    pPage->pDirtyNext = p->pDirty;
    if( pPage->pDirtyNext ){
      assert( pPage->pDirtyNext->pDirtyPrev==0 );
      // Link the old head back to the new page
      pPage->pDirtyNext->pDirtyPrev = pPage;
    }else{
      // The list was empty, so the new page is also the tail
      p->pDirtyTail = pPage;
      if( p->bPurgeable ){
        assert( p->eCreate==2 );
        // Dirty pages now exist, so set p->eCreate to 1
        p->eCreate = 1;
      }
    }
    // Update the head of the list
    p->pDirty = pPage;
 
    /* If pSynced is NULL and this page has a clear NEED_SYNC flag, set
    ** pSynced to point to it. Checking the NEED_SYNC flag is an 
    ** optimization, as if pSynced points to a page with the NEED_SYNC
    ** flag set sqlite3PcacheFetchStress() searches through all newer 
    ** entries of the dirty-list for a page with NEED_SYNC clear anyway.  */
    if( !p->pSynced 
     && 0==(pPage->flags&PGHDR_NEED_SYNC)   /*OPTIMIZATION-IF-FALSE*/
    ){
      // p->pSynced is a hint used to quickly locate a dirty page that
      // has already been synced (NEED_SYNC clear)
      p->pSynced = pPage;
    }
  }
  pcacheDump(p);
}
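
The three operation codes are bit flags: PCACHE_DIRTYLIST_FRONT (3) equals PCACHE_DIRTYLIST_REMOVE (1) | PCACHE_DIRTYLIST_ADD (2), so one call with FRONT runs both branches and moves the page to the head. The following is a stripped-down sketch of the same unlink/push logic, without pSynced, eCreate or tracing. All names here (DemoNode, DemoList, demoManageList, DEMO_*) are hypothetical and not part of SQLite.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_REMOVE 1   /* Unlink the node from the list */
#define DEMO_ADD    2   /* Push the node at the head of the list */
#define DEMO_FRONT  3   /* DEMO_REMOVE|DEMO_ADD: move the node to the head */

typedef struct DemoNode DemoNode;
struct DemoNode { int pgno; DemoNode *pNext; DemoNode *pPrev; };
typedef struct DemoList { DemoNode *pHead; DemoNode *pTail; } DemoList;

static void demoManageList(DemoList *p, DemoNode *pNode, unsigned op){
  if( op & DEMO_REMOVE ){
    if( pNode->pNext ){
      pNode->pNext->pPrev = pNode->pPrev;
    }else{
      assert( p->pTail==pNode );   /* No successor: must be the tail */
      p->pTail = pNode->pPrev;
    }
    if( pNode->pPrev ){
      pNode->pPrev->pNext = pNode->pNext;
    }else{
      assert( p->pHead==pNode );   /* No predecessor: must be the head */
      p->pHead = pNode->pNext;
    }
    pNode->pNext = pNode->pPrev = NULL;
  }
  if( op & DEMO_ADD ){
    pNode->pNext = p->pHead;
    if( p->pHead ){
      p->pHead->pPrev = pNode;     /* Old head now follows the new node */
    }else{
      p->pTail = pNode;            /* First node is also the tail */
    }
    p->pHead = pNode;
  }
}
```

Because the REMOVE branch runs before the ADD branch and clears both links, the ADD branch always sees a detached node, which is the precondition the assert at the top of the real ADD branch checks.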

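3. Fetching a page

The interface for fetching a page is sqlite3PcacheFetch(). It calls into the pcache1 plugin through sqlite3GlobalConfig.pcache2.xFetch(). If the requested page is not in the cache, the third argument passed to xFetch(), eCreate, controls the policy for creating a cache page.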

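The value of eCreate is determined by createFlag and pCache->eCreate, and pCache->eCreate is in turn determined by pCache->bPurgeable and pCache->pDirty. The truth tables are as follows (for pCache->pDirty, 0 means NULL and 1 means non-NULL):

pCache->bPurgeable    pCache->pDirty    pCache->eCreate
        0                   0                  2
        0                   1                  2
        1                   1                  1
        1                   0                  2

pCache->eCreate    createFlag    eCreate
       1               0            0
       2               0            0
       1               3            1
       2               3            2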
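
Both tables can be written as two small functions. This is only a sketch of the arithmetic; the function names demoCacheECreate() and demoFetchECreate() are hypothetical and do not exist in SQLite.

```c
#include <assert.h>

/* Value cached in pCache->eCreate: 1 only when the cache is purgeable
** and at least one dirty page exists, otherwise 2. */
static int demoCacheECreate(int bPurgeable, int hasDirty){
  return (bPurgeable && hasDirty) ? 1 : 2;
}

/* Value passed to xFetch(). createFlag is 0 or 3 (binary 11), so the
** bitwise AND yields 0 when creation is not requested and the cached
** value (1 or 2) when it is. */
static int demoFetchECreate(int createFlag, int cacheECreate){
  assert( createFlag==0 || createFlag==3 );
  return createFlag & cacheECreate;
}
```

Passing 3 rather than 1 as the "create" flag is what lets a single AND select between the two non-zero policies without a branch.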

sqlite3_pcache_page *sqlite3PcacheFetch(
  PCache *pCache,       /* Obtain the page from this cache */
  Pgno pgno,            /* Page number to obtain */
  // createFlag is passed as either 0 or 3 (binary 11)
  int createFlag        /* If true, create page if it does not exist already */
){
  int eCreate;
  sqlite3_pcache_page *pRes;
 
  assert( pCache!=0 );
  assert( pCache->pCache!=0 );
  assert( createFlag==3 || createFlag==0 );
  // See row 3 of the first truth table
  assert( pCache->eCreate==((pCache->bPurgeable && pCache->pDirty) ? 1 : 2) );
  // How xFetch() handles each eCreate value is covered in the previous article
  /* eCreate defines what to do if the page does not exist.
  **    0     Do not allocate a new page.  (createFlag==0)
  **    1     Allocate a new page if doing so is inexpensive.
  **          (createFlag==1 AND bPurgeable AND pDirty)
  **    2     Allocate a new page even it doing so is difficult.
  **          (createFlag==1 AND !(bPurgeable AND pDirty)
  */
  // In other words: when the cache slots are purgeable and dirty pages
  // exist, eCreate is 1, so once the number of cached pages reaches the
  // maximum some slots are held in reserve and no cache page is recycled
  // or created.
  // See the second truth table
  eCreate = createFlag & pCache->eCreate;
  assert( eCreate==0 || eCreate==1 || eCreate==2 );
  assert( createFlag==0 || pCache->eCreate==eCreate );
  // That is, eCreate==1+!(pCache->bPurgeable&&pCache->pDirty):
  // eCreate is 1 only when both bPurgeable and pDirty are set
  assert( createFlag==0 || eCreate==1+(!pCache->bPurgeable||!pCache->pDirty) );
  pRes = sqlite3GlobalConfig.pcache2.xFetch(pCache->pCache, pgno, eCreate);
  pcacheTrace(("%p.FETCH %d%s (result: %p)\n",pCache,pgno,
               createFlag?" create":"",pRes));
  return pRes;
}

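The page returned is an object of type sqlite3_pcache_page. As the previous article explained, PgHdr1 extends this type.

Given this object, sqlite3PcacheFetchFinish() is called to obtain the PgHdr object and initialize it. One interesting detail: when the PgHdr has not been initialized yet, sqlite3PcacheFetchFinish() calls pcacheFetchFinishWithInit(), which performs the initialization and then calls sqlite3PcacheFetchFinish() again, an indirect recursion.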

PgHdr *sqlite3PcacheFetchFinish(
  PCache *pCache,             /* Obtain the page from this cache */
  Pgno pgno,                  /* Page number obtained */
  sqlite3_pcache_page *pPage  /* Page obtained by prior PcacheFetch() call */
){
  PgHdr *pPgHdr;
 
  pPgHdr = (PgHdr *)pPage->pExtra;
 
  if( !pPgHdr->pPage ){
    return pcacheFetchFinishWithInit(pCache, pgno, pPage);
  }
  ……
  return pPgHdr;
}
 
static SQLITE_NOINLINE PgHdr *pcacheFetchFinishWithInit(
  PCache *pCache,             /* Obtain the page from this cache */
  Pgno pgno,                  /* Page number obtained */
  sqlite3_pcache_page *pPage  /* Page obtained by prior PcacheFetch() call */
){
  PgHdr *pPgHdr;
  assert( pPage!=0 );
  pPgHdr = (PgHdr*)pPage->pExtra;
  ……
  return sqlite3PcacheFetchFinish(pCache,pgno,pPage);
}
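
The control flow of this fast-path/slow-path pair can be sketched in isolation. The names DemoHdr, demoFetchFinish() and demoFetchFinishWithInit() are hypothetical, and the bodies are assumptions made for illustration (the listing above elides the real ones): the slow path fills in the header once, and the re-entered fast path then only counts a reference.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoHdr {
  void *pPage;   /* NULL until the header has been initialized */
  int nRef;      /* Reference count (assumed for this sketch) */
} DemoHdr;

static DemoHdr *demoFetchFinish(DemoHdr *pHdr, void *pRaw);

/* Slow path: initialize the header once, then re-enter the fast path.
** pRaw must be non-NULL, otherwise the recursion would never end. */
static DemoHdr *demoFetchFinishWithInit(DemoHdr *pHdr, void *pRaw){
  assert( pRaw!=NULL );
  assert( pHdr->pPage==NULL );
  pHdr->pPage = pRaw;
  pHdr->nRef = 0;
  return demoFetchFinish(pHdr, pRaw);
}

/* Fast path: an initialized header only needs its reference counted. */
static DemoHdr *demoFetchFinish(DemoHdr *pHdr, void *pRaw){
  if( pHdr->pPage==NULL ){
    return demoFetchFinishWithInit(pHdr, pRaw);
  }
  pHdr->nRef++;
  return pHdr;
}
```

The recursion is at most one level deep: after the slow path runs, pPage is non-NULL, so the second entry into the fast path cannot take the initialization branch again. Keeping the rarely executed initialization in a separate function also keeps the common path short.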

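4. Handling a failed page fetch

If the fetch fails, the number of cached pages has reached the limit. In that case, look for a dirty page that has already been synced and recycle it. If none is found, pick the oldest unreferenced page, flush it to disk, and recycle it. However, if that page has not been synced, the exclusive lock is usually not held yet and the flush returns SQLITE_BUSY.

Whether or not a dirty page was successfully recycled, a new cache page is then allocated for the page whose fetch failed; that is, eCreate is forced to 2.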

/*
** If the sqlite3PcacheFetch() routine is unable to allocate a new
** page because no clean pages are available for reuse and the cache
** size limit has been reached, then this routine can be invoked to 
** try harder to allocate a page.  This routine might invoke the stress
** callback to spill dirty pages to the journal.  It will then try to
** allocate the new page and will only fail to allocate a new page on
** an OOM error.
**
** This routine should be invoked only after sqlite3PcacheFetch() fails.
*/
int sqlite3PcacheFetchStress(
  PCache *pCache,                 /* Obtain the page from this cache */
  Pgno pgno,                      /* Page number to obtain */
  sqlite3_pcache_page **ppPage    /* Write result here */
){
  PgHdr *pPg;
  if( pCache->eCreate==2 ) return 0;
  // pCache->szSpill is the configured threshold above which pages may be spilled
  if( sqlite3PcachePagecount(pCache)>pCache->szSpill ){
    /* Find a dirty page to write-out and recycle. First try to find a 
    ** page that does not require a journal-sync (one with PGHDR_NEED_SYNC
    ** cleared), but if that is not possible settle for any other 
    ** unreferenced dirty page.
    **
    ** If the LRU page in the dirty list that has a clear PGHDR_NEED_SYNC
    ** flag is currently referenced, then the following may leave pSynced
    ** set incorrectly (pointing to other than the LRU page with NEED_SYNC
    ** cleared). This is Ok, as pSynced is just an optimization.  */
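    // First search for an already-synced page, starting from pCache->pSynced
    for(pPg=pCache->pSynced; 
        pPg && (pPg->nRef || (pPg->flags&PGHDR_NEED_SYNC)); 
        pPg=pPg->pDirtyPrev
    );
    // Record the result of the search in pCache->pSynced
    pCache->pSynced = pPg;
    // If none was found, settle for the oldest unreferenced page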
    if( !pPg ){
      for(pPg=pCache->pDirtyTail; pPg && pPg->nRef; pPg=pPg->pDirtyPrev);
    }
    if( pPg ){
      int rc;
#ifdef SQLITE_LOG_CACHE_SPILL
      sqlite3_log(SQLITE_FULL, 
                  "spill page %d making room for %d - cache used: %d/%d",
                  pPg->pgno, pgno,
                  sqlite3GlobalConfig.pcache.xPagecount(pCache->pCache),
                numberOfCachePages(pCache));
#endif
      pcacheTrace(("%p.SPILL %d\n",pCache,pPg->pgno));
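      // xStress and pStress are supplied when sqlite3PcacheOpen() is called.
      // xStress flushes the dirty page to disk and removes it from the dirty list
      rc = pCache->xStress(pCache->pStress, pPg);
      pcacheDump(pCache);
      // Returns SQLITE_BUSY if the required lock is not available
      if( rc!=SQLITE_OK && rc!=SQLITE_BUSY ){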
        return rc;
      }
    }
  }
  // Whether or not the page count is over the limit, create a new cache page
  *ppPage = sqlite3GlobalConfig.pcache2.xFetch(pCache->pCache, pgno, 2);
  return *ppPage==0 ? SQLITE_NOMEM_BKPT : SQLITE_OK;
}
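
The two-pass victim search is the core of this function, so it may help to see it on its own. In this sketch the names DemoPg, demoPickVictim() and DEMO_NEED_SYNC are hypothetical, and the flag value is chosen arbitrarily for the example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NEED_SYNC 0x008  /* Hypothetical flag value for this sketch */

typedef struct DemoPg DemoPg;
struct DemoPg {
  int nRef;            /* Number of users of this page */
  unsigned flags;      /* DEMO_NEED_SYNC or 0 */
  DemoPg *pDirtyPrev;  /* Next newer page on the dirty list */
};

/* Pick a dirty page to spill. pSynced is the starting hint and pTail is
** the oldest dirty page. Both passes walk toward newer pages.
** Returns NULL if every dirty page is still referenced. */
static DemoPg *demoPickVictim(DemoPg *pSynced, DemoPg *pTail){
  DemoPg *pPg;
  /* Pass 1: an unreferenced page that needs no journal sync. */
  for(pPg=pSynced; pPg && (pPg->nRef || (pPg->flags & DEMO_NEED_SYNC));
      pPg=pPg->pDirtyPrev);
  if( pPg==NULL ){
    /* Pass 2: settle for the oldest unreferenced page. */
    for(pPg=pTail; pPg && pPg->nRef; pPg=pPg->pDirtyPrev);
  }
  return pPg;
}
```

Both loops have empty bodies; the work is done entirely in the loop condition, exactly as in the listing above. A NULL result means nothing can be spilled, and the caller falls through to the forced allocation.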

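5. Conclusion

That covers most of the page cache. In addition, pcacheSortDirtyList() re-sorts the dirty pages by page number using a merge sort on linked lists, which will be covered in the next article. The remaining functions are easy to understand.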
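
For readers who have not seen merge sort applied to a linked list, here is a generic textbook version that orders nodes by page number. It is not taken from pcache.c and makes no claim to match how pcacheSortDirtyList() is written; DemoPage, demoMerge() and demoSortByPgno() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoPage DemoPage;
struct DemoPage { unsigned pgno; DemoPage *pNext; };

/* Merge two lists that are already sorted by ascending pgno. */
static DemoPage *demoMerge(DemoPage *pA, DemoPage *pB){
  DemoPage head;              /* Dummy node; only its pNext is used */
  DemoPage *pTail = &head;
  head.pNext = NULL;
  while( pA && pB ){
    if( pA->pgno<pB->pgno ){
      pTail->pNext = pA; pA = pA->pNext;
    }else{
      pTail->pNext = pB; pB = pB->pNext;
    }
    pTail = pTail->pNext;
  }
  pTail->pNext = pA ? pA : pB;  /* Append whichever list remains */
  return head.pNext;
}

/* Top-down merge sort: split with slow/fast pointers, sort each half
** recursively, then merge the two sorted halves. */
static DemoPage *demoSortByPgno(DemoPage *pList){
  DemoPage *pSlow, *pFast, *pRight;
  if( pList==NULL || pList->pNext==NULL ) return pList;
  pSlow = pList;
  pFast = pList->pNext;
  while( pFast && pFast->pNext ){
    pSlow = pSlow->pNext;
    pFast = pFast->pNext->pNext;
  }
  pRight = pSlow->pNext;      /* Second half starts after the midpoint */
  pSlow->pNext = NULL;        /* Terminate the first half */
  return demoMerge(demoSortByPgno(pList), demoSortByPgno(pRight));
}
```

Merge sort suits linked lists because merging only relinks nodes and needs no random access or extra array.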

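Two open questions remain:

1. Why is the page-count limit applied to a fetch only when dirty pages exist, i.e. why is eCreate 1 only when pCache->pDirty is non-NULL?

2. When sqlite3PcacheFetchStress() recycles a dirty page, why does it look for an already-synced page first?

The answers are not visible from the page cache module alone and probably require knowledge of transaction handling and the journal module. I will come back to these two questions once I fully understand the pager module.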