上面的一篇粗略的介绍了一下python的对象结构,这篇来分析一个非常重要的部分,内存分配。。。
好像自己看的源代码,只要是跟C语言相关的,都在内存处理方面做了相当多的工作。。。。例如nginx,它也有实现自己的pool,python当然也不例外。。。。
python在内存分配上面分成了4个层次吧。。。
_____ ______ ______ ________
[ int ] [ dict ] [ list ] ... [ string ] Python core |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
_______________________________ | |
[ Python's object allocator ] | |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
______________________________________________________________ |
[ Python's raw memory allocator (PyMem_ API) ] |
+1 | <----- Python memory (under PyMem manager's control) ------> | |
__________________________________________________________________
[ Underlying general-purpose allocator (ex: C library malloc) ]
0 | <------ Virtual memory allocated for the python process -------> |
上面是直接从源代码里面copy出来的注释。。。。
(0)这个是最底层的C层面上的内存操作,也就是malloc和free了。。
(1)这个事python在C层面上的操作做了一层简单的封装。。例如PyMem_MALLOC,PyMem_FREE,它们是在malloc和free上面做了很简单的包装:
//这里对malloc,realloc与free做了简单的宏封装,对于malloc,如果为0,然么分配1
#define PyMem_MALLOC(n) ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
: malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n) ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
: realloc((p), (n) ? (n) : 1))
#define PyMem_FREE free
#endif /* PYMALLOC_DEBUG */
/*
* Type-oriented memory interface
* ==============================
*
* Allocate memory for n objects of the given type. Returns a new pointer
* or NULL if the request was too large or memory allocation failed. Use
* these macros rather than doing the multiplication yourself so that proper
* overflow checking is always done.
*/
//通过感知type的大小来分内存
#define PyMem_New(type, n) \
( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
#define PyMem_NEW(type, n) \
( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )
上面还有New啥的,也就是扩展了类型大小的感知部分。。。。
(3)这一层就是最重点的部分了,PyObject_Malloc,PyObject_Free就属于这一层。。。。python就是在这一层实现了内存池,用于高效的进行内存分配。。。。(源代码集中在obmalloc.c里面)
几点python内存分配方面的常识:
(1)python的内存分配大体分为了两个部分,首先是小内存分配,这里主要是512字节以内,以及大于512字节的分配两类
(2)在内存分配方面按照8字节对齐的方式,例如要分配12字节的内存,其实最终将会占用16字节的大小
* Request in bytes Size of allocated block Size class idx
* ----------------------------------------------------------------
* 1-8 8 0
* 9-16 16 1
* 17-24 24 2
* 25-32 32 3
* 33-40 40 4
* 41-48 48 5
* 49-56 56 6
* 57-64 64 7
* 65-72 72 8
* ... ... ...
* 497-504 504 62
* 505-512 512 63
上面也是直接从源代码里面copy出来的注释,很鲜明的表现出了python内存分配方面的对齐策略。。。
接下来来看几个非常重要的宏定义:
#define ALIGNMENT 8 /* must be 2^N */
#define ALIGNMENT_SHIFT 3
#define ALIGNMENT_MASK (ALIGNMENT - 1)
/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES (SMALL_REQUEST_THRESHOLD / ALIGNMENT) //==64
//页大小4kb
#define SYSTEM_PAGE_SIZE (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK (SYSTEM_PAGE_SIZE - 1)
#define ARENA_SIZE (256 << 10) /* 256KB */
#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif
//这里定义的pool的大小为4k
#define POOL_SIZE SYSTEM_PAGE_SIZE /* must be 2^N 4kb*/
#define POOL_SIZE_MASK SYSTEM_PAGE_SIZE_MASK //4*1024-1
上面的宏这里就不详细具体的说明了,,,它主要是确定了如下的信息
(1)一个Arena的大小为256KB(它用来管理pool)
(2)一个pool的大小为4kb
好了,接下来来看比较重要的pool的头定义:
//内存池头部
//通过szidx编号可以知道当前这个pool是用来分配多大大小的内存的pool
struct pool_header {
union { block *_padding;
uint count; } ref; /* number of allocated blocks */ //当前pool上面分配的block的数量
block *freeblock; /* pool's free list head */ //指向下一个可用的block,这里构成了一个链表, 它是一个离散的链表,很有意思
struct pool_header *nextpool; /* next pool of this size class */ //通过这两个指针形成pool的双链表
struct pool_header *prevpool; /* previous pool "" */
uint arenaindex; /* index into arenas of base adr */ //在arena里面的索引
uint szidx; /* block size class index */ //分配内存的类别,8字节,16或者。。。
uint nextoffset; /* bytes to virgin block */ //下一个可用的block的内存偏移量
uint maxnextoffset; /* largest valid nextoffset */ //最后一个block距离开始位置的距离
};
上面的注释非常详细的说明了各个字段的用处。。。另外这里可以看到有一个szidx字段,它与上面内存分配时候的内存对齐表上的szidx相对应,其实每一个pool都是用来分配固定大小的内存的,例如szidx为0,那么这个pool就是用来分配8字节的,szidx为1就是用来分配16个字节的。。。。这个以后看代码就能明白。。。
这个样子每个pool都只分配一种大小的内存块就方便的多了。。特别是对于内存偏移的计算都相当的方便
python在分配小内存的时候,是按照block的单位来进行分配的,例如szidx为0的pool,它的一个block大小就是8字节。。通过freeblock指针来形成一个block的离散的单链表(嗯,这个实现也是非常的trick,看了好久才看明白)。。。
(嗯,其实内存池这部分的实现还有很多的trick,尼玛。。。看这些trick的实现真心消耗脑细胞啊。。。擦。。只能怪自己这方面确实才疏学浅,,要看这么久才能理解。。。)
//这个可以理解为用来管理pool
struct arena_object {
/* The address of the arena, as returned by malloc. Note that 0
* will never be returned by a successful malloc, and is used
* here to mark an arena_object that doesn't correspond to an
* allocated arena.
*/
uptr address; //指向分配的256kb的首地址,这里通过0来表明当前没有进行分配
/* Pool-aligned pointer to the next pool to be carved off. */
block* pool_address;
/* The number of available pools in the arena: free pools + never-
* allocated pools.
*/
uint nfreepools; //可用的pool
/* The total number of pools in the arena, whether or not available. */
uint ntotalpools; //在当前arena的pool的总数
/* Singly-linked list of available pools. */
struct pool_header* freepools; //pool链表的头部
/* Whenever this arena_object is not associated with an allocated
* arena, the nextarena member is used to link all unassociated
* arena_objects in the singly-linked `unused_arena_objects` list.
* The prevarena member is unused in this case.
*
* When this arena_object is associated with an allocated arena
* with at least one available pool, both members are used in the
* doubly-linked `usable_arenas` list, which is maintained in
* increasing order of `nfreepools` values.
*
* Else this arena_object is associated with an allocated arena
* all of whose pools are in use. `nextarena` and `prevarena`
* are both meaningless in this case.
*/
struct arena_object* nextarena;
struct arena_object* prevarena;
};
上面这个是另外一个非常重要的结构,可以理解为它是用来管理pool的,它的address指针将会指向一个分配的256kb内存,pool将会在这个上面产生。。。。。
接下来先来看看Arena结构的创建过程吧:
//分配一个arena_object,其实这个也是做了缓存的
static struct arena_object*
new_arena(void)
{
struct arena_object* arenaobj; //这里先创建一个arena的指针
uint excess; /* number of bytes above pool alignment */
void *address; //如果新创建的话,这个用来指向申请的256KB内存
int err;
#ifdef PYMALLOC_DEBUG
if (Py_GETENV("PYTHONMALLOCSTATS"))
_PyObject_DebugMallocStats();
#endif
if (unused_arena_objects == NULL) { //当前没有可用的arena,那么这里需要创建
uint i;
uint numarenas;
size_t nbytes;
/* Double the number of arena objects on each allocation.
* Note that it's possible for `numarenas` to overflow.
*/
//最开始maxarenas为0,也就是说第一次创建Arena结构的时候,就将会一次性创建16个,以后直接翻倍
numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS; //INITIAL_ARENA_OBJECTS=16
if (numarenas <= maxarenas) //出现这种情况,只能说尼玛,这都能整形溢出 啊
return NULL; /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
return NULL; /* overflow */
#endif
nbytes = numarenas * sizeof(*arenas); //接下来分配arenas结构体所需要的内存
arenaobj = (struct arena_object *)realloc(arenas, nbytes); //分配内存地址
if (arenaobj == NULL)
return NULL;
arenas = arenaobj;
/* We might need to fix pointers that were copied. However,
* new_arena only gets called when all the pages in the
* previous arenas are full. Thus, there are *no* pointers
* into the old array. Thus, we don't have to worry about
* invalid pointers. Just to be sure, some asserts:
*/
assert(usable_arenas == NULL);
assert(unused_arena_objects == NULL);
/* Put the new arenas on the unused_arena_objects list. */
//这里相当于是初始化刚刚创建的arena结构体
for (i = maxarenas; i < numarenas; ++i) {
arenas[i].address = 0; //通过将这个地址赋值为0,表示当前arena没有分配可用的内存
arenas[i].nextarena = i < numarenas - 1 ?
&arenas[i+1] : NULL;
}
unused_arena_objects = &arenas[maxarenas]; //这里将unused_arena_objects指向当前可用的第一个
maxarenas = numarenas;
}
/* Take the next available arena object off the head of the list. */
assert(unused_arena_objects != NULL);
arenaobj = unused_arena_objects;
unused_arena_objects = arenaobj->nextarena; //将unused_arena_objects指针指向下一个arena结构
assert(arenaobj->address == 0);
//接下来分配数据内存
#ifdef ARENAS_USE_MMAP
address = mmap(NULL, ARENA_SIZE, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
err = (address == MAP_FAILED);
#else
address = malloc(ARENA_SIZE); //这里分配256字节的内存
err = (address == 0);
#endif
if (err) {
/* The allocation failed: return NULL after putting the
* arenaobj back.
*/
arenaobj->nextarena = unused_arena_objects;
unused_arena_objects = arenaobj;
return NULL;
}
//将Arena结构的address指向分配的地址
arenaobj->address = (uptr)address;
//更新计数器
++narenas_currently_allocated;
#ifdef PYMALLOC_DEBUG
++ntimes_arena_allocated;
if (narenas_currently_allocated > narenas_highwater)
narenas_highwater = narenas_currently_allocated;
#endif
arenaobj->freepools = NULL; //这里pool头部指针设置为null
/* pool_address <- first pool-aligned address in the arena
nfreepools <- number of whole pools that fit after alignment */
arenaobj->pool_address = (block*)arenaobj->address;
arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE; //其实这里是64个可用的pool,正好对象64中类型的小内存分配
assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
//下面是做一次内存对齐,最终保证pool_address的地址是4kb的整数倍,这个主要是方便以后内存计算
excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
if (excess != 0) { //这个意思是当前的内存地址不是4kb的整数倍,那么需要进行一次pool地址的对齐
--arenaobj->nfreepools; //可用数-1
arenaobj->pool_address += POOL_SIZE - excess;
}
arenaobj->ntotalpools = arenaobj->nfreepools; //总共可用的pool的数量
return arenaobj;
}
嗯,代码上面的注释应该说的很清楚了吧。。。。创建结构体对象,然后分配内存,用adress指针来指向。。
初始化能分配的pool的数量,以及起始的地址。。。
好了。。。今天就先写到这吧。。。好晚了。。。感觉上班还真心有点累啊。。。。。还是在学校轻松。。。
明天在来分析最为重要的PyObject_Malloc和PyObject_Free两个函数吧。。。