GlusterFS Memory Pool (mem-pool): Implementation Principles and Code Walkthrough
My Sina Weibo: http://weibo.com/freshairbrucewoo.
Feel free to get in touch so we can exchange ideas and improve together.
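I have been studying the glusterfs source code recently and have made a few small changes of my own. I started with version 3.2.5 because, according to colleagues and what I found online, it is currently the most stable release. The glusterfs implementation is fairly complex, so I will not go into its design and architecture here; plenty of material is available online (there are good introductory articles on CSDN blogs).
One benefit of studying an open-source system is that you can see exactly how it is implemented. Papers on the subject only convey the principles, whereas a real project needs a working implementation. An open-source system may not fit your own system as it stands, but if it can be adapted, adapting it is usually well worth the effort. The biggest benefit, though, is learning from well-written code. This post covers the memory pool technique used in glusterfs.
The memory pool is implemented in mem-pool.c and mem-pool.h. Start with the definition of the memory pool structure in the header file: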
struct mem_pool {
        struct list_head  list;               /* standard doubly linked list of free chunks */
        int               hot_count;          /* number of chunks currently in use */
        int               cold_count;         /* number of chunks not in use */
        gf_lock_t         lock;
        unsigned long     padded_sizeof_type; /* object size plus per-chunk header */
        void             *pool;               /* start address of the pool memory */
        void             *pool_end;           /* end address of the pool memory */
        int               real_sizeof_type;   /* real size of the objects stored in the pool */
        uint64_t          alloc_count;        /* number of mem_get calls */
        uint64_t          pool_misses;        /* number of times the pool was exhausted */
        int               max_alloc;          /* peak number of chunks in use at once */
        int               curr_stdalloc;      /* heap fallback allocations currently outstanding */
        int               max_stdalloc;       /* peak value of curr_stdalloc */
        char             *name;
        struct list_head  global_list;        /* links this pool into the global pool list */
};
Next, let us go through a few key functions. The first is mem_pool_new_fn, which creates a new memory pool object and allocates memory for the requested object size and count. Each chunk also carries some extra bookkeeping space: room for the list pointers (the chunks are managed through the generic linked list), a pointer back to the owning pool, and a flag that records whether the chunk is in use. The code below shows the details, with comments on the key lines:
struct mem_pool *
mem_pool_new_fn (unsigned long sizeof_type,
                 unsigned long count, char *name)
{
        struct mem_pool  *mem_pool = NULL;
        unsigned long     padded_sizeof_type = 0;
        void             *pool = NULL;
        int               i = 0;
        int               ret = 0;
        struct list_head *list = NULL;
        jdfs_ctx_t       *ctx = NULL;

        if (!sizeof_type || !count) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR, "invalid argument");
                return NULL;
        }
        /* chunk size = object + list head + pool pointer + int (the in_use flag) */
        padded_sizeof_type = sizeof_type + GF_MEM_POOL_PAD_BOUNDARY;

        mem_pool = GF_CALLOC (sizeof (*mem_pool), 1, gf_common_mt_mem_pool);
        if (!mem_pool)
                return NULL;

        /* name records which xlator allocated the pool and what it is for */
        ret = gf_asprintf (&mem_pool->name, "%s:%s", THIS->name, name);
        if (ret < 0)
                return NULL; /* NOTE: mem_pool itself is not freed on this path */

        if (!mem_pool->name) {
                GF_FREE (mem_pool);
                return NULL;
        }

        LOCK_INIT (&mem_pool->lock);
        INIT_LIST_HEAD (&mem_pool->list);
        INIT_LIST_HEAD (&mem_pool->global_list);

        mem_pool->padded_sizeof_type = padded_sizeof_type; /* total padded chunk size */
        mem_pool->cold_count = count;             /* every chunk starts out cold (unused) */
        mem_pool->real_sizeof_type = sizeof_type; /* real size of the stored object */

        /* allocate count chunks of padded_sizeof_type bytes each */
        pool = GF_CALLOC (count, padded_sizeof_type, gf_common_mt_long);
        if (!pool) {
                GF_FREE (mem_pool->name);
                GF_FREE (mem_pool);
                return NULL;
        }

        for (i = 0; i < count; i++) {
                list = pool + (i * (padded_sizeof_type)); /* head of the i-th chunk */
                INIT_LIST_HEAD (list);
                list_add_tail (list, &mem_pool->list);    /* add it to the pool's free list */
        }

        mem_pool->pool = pool; /* remember the allocated region */
        mem_pool->pool_end = pool + (count * (padded_sizeof_type)); /* end of the region */

        /* add this pool to the global list */
        ctx = jdfs_ctx_get ();
        if (!ctx)
                goto out;

        list_add (&mem_pool->global_list, &ctx->mempool_list);

out:
        return mem_pool;
}
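To make the size calculation concrete, here is a minimal sketch of the per-chunk layout that the comment in the code describes: a list head, a pointer back to the owning pool, and an int in_use flag, followed by the caller's object. The names toy_list_head, TOY_HDR_SIZE, toy_padded_size and toy_chunk_at are hypothetical and exist only for this illustration; the real macros live in mem-pool.c and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct list_head used by glusterfs. */
struct toy_list_head {
        struct toy_list_head *next;
        struct toy_list_head *prev;
};

/* Assumed header in front of every object:
 * [list head][pointer to owning pool][int in_use] */
#define TOY_LIST_SIZE (sizeof (struct toy_list_head))
#define TOY_PTR_SIZE  (sizeof (void *))
#define TOY_HDR_SIZE  (TOY_LIST_SIZE + TOY_PTR_SIZE + sizeof (int))

/* Size of one chunk: header plus the caller's object. */
static size_t
toy_padded_size (size_t sizeof_type)
{
        return sizeof_type + TOY_HDR_SIZE;
}

/* Address of the i-th chunk head inside a slab of equally sized chunks. */
static unsigned char *
toy_chunk_at (unsigned char *slab, size_t sizeof_type, size_t i)
{
        assert (slab != NULL);
        return slab + i * toy_padded_size (sizeof_type);
}
```

With a 64-byte object, chunk i starts at offset i * (64 + TOY_HDR_SIZE) from the start of the slab, which is the same stride the loop in mem_pool_new_fn uses when it threads each chunk onto the free list.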
To use memory from the pool, take one object's worth of memory out of it. Different object types need separate pools; each pool only holds chunks for a single object type. The implementation, with comments:
void *
mem_get (struct mem_pool *mem_pool)
{
        struct list_head  *list = NULL;
        void              *ptr = NULL;
        int               *in_use = NULL;
        struct mem_pool  **pool_ptr = NULL;

        if (!mem_pool) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR, "invalid argument");
                return NULL;
        }

        LOCK (&mem_pool->lock);
        {
                mem_pool->alloc_count++;
                if (mem_pool->cold_count) { /* any unused chunks left in the pool? */
                        list = mem_pool->list.next; /* take the first one */
                        list_del (list);            /* unlink it from the free list */

                        mem_pool->hot_count++;
                        mem_pool->cold_count--;

                        /* track the peak number of chunks in use */
                        if (mem_pool->max_alloc < mem_pool->hot_count)
                                mem_pool->max_alloc = mem_pool->hot_count;

                        ptr = list;
                        /* the in_use flag sits after the list head and the pool pointer */
                        in_use = (ptr + GF_MEM_POOL_LIST_BOUNDARY +
                                  GF_MEM_POOL_PTR);
                        *in_use = 1; /* mark this chunk as in use */

                        goto fwd_addr_out;
                }

                /* This is a problem area. If we've run out of
                 * chunks in our slab above, we need to allocate
                 * enough memory to service this request.
                 * The problem is, these individual chunks will fail
                 * the first address range check in __is_member. Now, since
                 * we're not allocating a full second slab, we wont have
                 * enough info perform the range check in __is_member.
                 *
                 * I am working around this by performing a regular allocation
                 * , just the way the caller would've done when not using the
                 * mem-pool. That also means, we're not padding the size with
                 * the list_head structure because, this will not be added to
                 * the list of chunks that belong to the mem-pool allocated
                 * initially.
                 *
                 * This is the best we can do without adding functionality for
                 * managing multiple slabs. That does not interest us at present
                 * because it is too much work knowing that a better slab
                 * allocator is coming RSN.
                 */
                mem_pool->pool_misses++;   /* one more pool miss */
                mem_pool->curr_stdalloc++; /* one more outstanding heap allocation */
                if (mem_pool->max_stdalloc < mem_pool->curr_stdalloc)
                        mem_pool->max_stdalloc = mem_pool->curr_stdalloc;
                /* allocate a single chunk from the heap;
                 * NOTE: the result is not checked for NULL before it is used below */
                ptr = GF_CALLOC (1, mem_pool->padded_sizeof_type,
                                 gf_common_mt_mem_pool);
                gf_log_callingfn ("mem-pool", GF_LOG_DEBUG, "Mem pool is full. "
                                  "Callocing mem");

                /* Memory coming from the heap need not be transformed from a
                 * chunkhead to a usable pointer since it is not coming from
                 * the pool.
                 */
        }
fwd_addr_out:
        pool_ptr = mem_pool_from_ptr (ptr);
        *pool_ptr = (struct mem_pool *)mem_pool; /* record the owning pool in the header */
        ptr = mem_pool_chunkhead2ptr (ptr);      /* skip the header to the usable memory */
        UNLOCK (&mem_pool->lock);

        return ptr;
}
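The fwd_addr_out step is what later allows mem_put to take nothing but the object pointer: the owning pool is written into the chunk header, and the caller receives the address just past that header. The following is a rough sketch of that round trip, assuming the header layout described earlier (list head, pool pointer, int flag). All toy_* names are hypothetical, and memcpy is used here in place of the pointer casts in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pool type; only its address matters in this sketch. */
struct toy_pool {
        const char *name;
};

#define TOY_LIST_SIZE (2 * sizeof (void *)) /* assumed: next + prev pointers */
#define TOY_HDR_SIZE  (TOY_LIST_SIZE + sizeof (struct toy_pool *) + sizeof (int))

/* Stamp the owning pool into the header and return the user pointer. */
static void *
toy_head_to_ptr (unsigned char *head, struct toy_pool *pool)
{
        assert (head != NULL);
        memcpy (head + TOY_LIST_SIZE, &pool, sizeof (pool));
        return head + TOY_HDR_SIZE;
}

/* Recover the owning pool from nothing but the user pointer. */
static struct toy_pool *
toy_pool_of (void *ptr)
{
        unsigned char   *head = (unsigned char *)ptr - TOY_HDR_SIZE;
        struct toy_pool *pool = NULL;

        memcpy (&pool, head + TOY_LIST_SIZE, sizeof (pool));
        return pool;
}
```

Because the same stamp is applied to heap-allocated fallback chunks, a pointer handed out while the pool was exhausted can still find its pool on release.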
Once we are done with a chunk, it must be returned to the pool so that later code can reuse it. Before returning it, we first check whether the pointer actually belongs to the pool. There are three possible results: yes, no, and error (the pointer lies within the pool's address range but does not fall on a chunk boundary). The implementation:
/* check whether the memory ptr points to is a member of pool */
static int
__is_member (struct mem_pool *pool, void *ptr)
{
        if (!pool || !ptr) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR, "invalid argument");
                return -1;
        }

        /* outside [pool, pool_end): not a member */
        if (ptr < pool->pool || ptr >= pool->pool_end)
                return 0;

        /* inside the range: the chunk head must sit on a chunk boundary */
        if ((mem_pool_ptr2chunkhead (ptr) - pool->pool)
            % pool->padded_sizeof_type)
                return -1;

        return 1;
}
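The three-way result is easier to see with concrete numbers. Below is a small standalone sketch of the same check, under the assumption that a chunk is a fixed-size header followed by the object. The function toy_is_member and its parameters are hypothetical; unlike the original, it also rejects a pointer that falls inside the very first header, since the unsigned arithmetic used here would otherwise wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if ptr is the user pointer of a chunk in [base, end),
 * 0 if it lies outside the slab, and -1 if it lies inside the slab
 * but not at the start of a chunk's user area. */
static int
toy_is_member (const unsigned char *base, const unsigned char *end,
               size_t stride, size_t hdr, const unsigned char *ptr)
{
        uintptr_t b = (uintptr_t)base;
        uintptr_t e = (uintptr_t)end;
        uintptr_t p = (uintptr_t)ptr;

        if (!base || !ptr || stride == 0)
                return -1;
        if (p < b || p >= e)
                return 0;
        if (p - b < hdr)                /* inside the first chunk's header */
                return -1;
        if ((p - hdr - b) % stride)     /* chunk head not on a chunk boundary */
                return -1;
        return 1;
}
```

For a slab of three 40-byte chunks with an 8-byte header, the only valid user pointers are at offsets 8, 48 and 88; offset 50 is inside the slab but misaligned, and offset 120 is already past the end.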
mem_put, the function that returns a chunk to the pool, acts on the result of that check:
/* return ptr to the memory pool it came from */
void
mem_put (void *ptr)
{
        struct list_head  *list = NULL;
        int               *in_use = NULL;
        void              *head = NULL;
        struct mem_pool  **tmp = NULL;
        struct mem_pool   *pool = NULL;

        if (!ptr) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR, "invalid argument");
                return;
        }

        list = head = mem_pool_ptr2chunkhead (ptr); /* step back to the chunk head */
        tmp = mem_pool_from_ptr (head);
        if (!tmp) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR,
                                  "ptr header is corrupted");
                return;
        }

        pool = *tmp;
        if (!pool) {
                gf_log_callingfn ("mem-pool", GF_LOG_ERROR,
                                  "mem-pool ptr is NULL");
                return;
        }
        LOCK (&pool->lock);
        {

                switch (__is_member (pool, ptr))
                {
                case 1: /* the chunk belongs to the pool */
                        in_use = (head + GF_MEM_POOL_LIST_BOUNDARY +
                                  GF_MEM_POOL_PTR); /* locate the in_use flag */
                        /* a chunk that is not in use was already freed: log and skip */
                        if (!is_mem_chunk_in_use(in_use)) {
                                gf_log_callingfn ("mem-pool", GF_LOG_CRITICAL,
                                                  "mem_put called on freed ptr %p of mem "
                                                  "pool %p", ptr, pool);
                                break;
                        }
                        pool->hot_count--;
                        pool->cold_count++;
                        *in_use = 0;
                        list_add (list, &pool->list); /* back onto the pool's free list */
                        break;
                case -1: /* error: abort the program */
                        /* For some reason, the address given is within
                         * the address range of the mem-pool but does not align
                         * with the expected start of a chunk that includes
                         * the list headers also. Sounds like a problem in
                         * layers of clouds up above us. ;)
                         */
                        abort ();
                        break;
                case 0: /* not pool memory: free it directly */
                        /* The address is outside the range of the mem-pool. We
                         * assume here that this address was allocated at a
                         * point when the mem-pool was out of chunks in mem_get
                         * or the programmer has made a mistake by calling the
                         * wrong de-allocation interface. We do
                         * not have enough info to distinguish between the two
                         * situations.
                         */
                        pool->curr_stdalloc--; /* one fewer outstanding heap allocation */
                        GF_FREE (list);
                        break;
                default:
                        /* log error */
                        break;
                }
        }
        UNLOCK (&pool->lock);
}
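The interplay between the in_use flag and the hot and cold counters can be tried out in isolation. The sketch below is a deliberately simplified toy, not the glusterfs design: it swaps the linked list and in-chunk header for a fixed array of slots with a stack of free indices, it has no locking, and an exhausted pool returns -1 instead of falling back to the heap. Every name in it (toy_pool, toy_init, toy_get, toy_put, TOY_SLOTS) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_pool {
        int in_use[TOY_SLOTS];   /* per-slot flag, 1 while handed out */
        int free_idx[TOY_SLOTS]; /* stack of free slot indices */
        int cold_count;          /* slots available */
        int hot_count;           /* slots handed out */
};

static void
toy_init (struct toy_pool *p)
{
        int i;

        for (i = 0; i < TOY_SLOTS; i++) {
                p->in_use[i] = 0;
                p->free_idx[i] = i;
        }
        p->cold_count = TOY_SLOTS; /* everything starts out cold */
        p->hot_count = 0;
}

/* Returns a slot index, or -1 when the pool is exhausted. */
static int
toy_get (struct toy_pool *p)
{
        int slot;

        if (p->cold_count == 0)
                return -1;
        slot = p->free_idx[--p->cold_count];
        p->in_use[slot] = 1;
        p->hot_count++;
        return slot;
}

/* Returns 0 on success, -1 if the slot was not in use (a double put). */
static int
toy_put (struct toy_pool *p, int slot)
{
        assert (slot >= 0 && slot < TOY_SLOTS);
        if (!p->in_use[slot])
                return -1;
        p->in_use[slot] = 0;
        p->hot_count--;
        p->free_idx[p->cold_count++] = slot;
        return 0;
}
```

Putting the same slot back twice fails the second time and leaves the counters untouched, which mirrors the "mem_put called on freed ptr" branch in mem_put.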
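Besides the functions above, there is also mem_pool_destroy for tearing down a pool, some wrapper functions around the system allocator, and a debugging aid that records allocation information. These are relatively simple and can be understood by reading the source directly.
Time to leave work, so that is all for today. Next I plan to cover how iobuf works and walk through its source code.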