Golang 字典 map-CSDN博客

本文链接：https://blog.csdn.net/wan212000/article/details/126123261

这里填写标题

1. Golang 字典 map

1. Golang 字典 map

对于 map 的使用, 大家肯定都会, 所以基础的知识讲解的不多, 主要是对 map 的底层结构进行了详细的讲解, 正所谓知其然, 必知其所以然! 对于 map 的底层结构设计, 感觉有些意思, 特别是对于它的 hash 方式(取前 N 位和后 M 位), 再结合它的扩容和缩容, 其实可以从中提炼一些共性的东西。然后对于 Rehash, 这个和 Redis 的 Rehash 原因一样, 这块应该是业内的通用设计方法, 感兴趣的同学可以看看 redis 的字典结构和它 rehash 的方法。

1.1. 基本用法

字典属于引用类型, 初始化方式主要有 2 种, 分别为:

m1 := make(map[string]int)
m2 := map[string]int{
	"lvmenglou": 32, 
    "litinajie": 28,
}

字典是被设计成 “not addressable”, 所以不能直接修改 value 成员, 如果需要修改 value 成员, 需要对元素整体替换:

type user struct {
	name string
	age  byte
}

m := map[int]user{1: {"lvmenglou": 32}}
m[1].age += 1 // 错误：cannot assign to m[1].age

不能对 nil 字典进行写操作, 否则会触发 panic, 但是可以读。

var m map[string]intm["a"] = 1 // panic: assignment to entry in nil map

nil 与空字典, 这个需要注意:

var m1 map[string]int  // nil 字典 
m2 := map[string]int{} // 空字典, 不等于 nil

map 属于非线程安全, 如果多个线程同时对一个 map 进行读、写、删除操作, 会触发 panic, 可以使用支持线程安全的 sync.Map。

1.2. 内存模型

先看一下 map 的数据结构:

type hmap struct {
	count      int
	B          uint8          // bucket 的数量是 2^B, 最多可以放 loadFactor * 2^B 个元素, 再多就要 hashGrow 了
	hash0      uint32         // hash seed
	buckets    unsafe.Pointer //2^B 大小的数组, 如果 count == 0 的话, 可能是 nil
	oldbuckets unsafe.Pointer // 扩容的时候, buckets 长度会是 oldbuckets 的两倍, 只有在 growing 时候为空。
	// 其它省略。..
}

type bmap struct {
	topbits  [8]uint8
	keys     [8]keytype
	values   [8]valuetype
	pad      uintptr
	overflow uintptr
}

B 是 map 的 bmap 数组长度的对数, 每个 bmap 里面存储了 kv 对, buckets 是一个指针, 指向实际存储的 bmap 数组的首地址。

每个 bmap 里面最多存储 8 个 key, 下图是 bmap 的内存模型, HOB Hash 指的就是 top hash 字段, 每个 bucket 设计成最多只能放 8 个 key-value 对, 如果有第 9 个 key-value 落入当前的 bucket, 那就需要再构建一个 bucket , 通过 overflow 指针连接起来。

1.3. 查找数据

key 经过哈希计算后得到哈希值, 哈希值是 64 个 bit 位(针对 64 位机), 假如一个 key 经过哈希函数计算后, 得到的哈希结果是:

10010111 | 000011110110110010001111001010100010010110010101010 │ 01010

其中最后 5 位是 01010, 值为 10, 表示 10 号桶。最前面 8 位 10010111, 值是 151, 用来在 bmap 中找数据用的。

1.4. 扩容缩容

一个 map 最多只能装 8*2^B 个数据, 当数据量快满时, 为了减少查询 Hash 冲突, 就需要进行扩容。当 bucket 数量过多, 然后数据又非常少时, 就需要进行缩容(之前情况一般出现在数据大量删除的情况)。判断需要扩容和缩容的临界值, 需要引入"装载因子", 感兴趣的同学可以自行百度。当进行扩容时, 比如 B=5 扩容到 B=6, 最低位需要扩到 6 位, 然后重新 Hash, 找到在 []bmap 对应的 hash 值, 最高位不变。缩容的话, 就是相反的方式。(因为最低位的第 6 位是 0 或者 1, 所以第 6 位为 0 的数据, 因为值不变, 所以不会重新进行 Hash, 第 6 位为 1 的数据, 需要被 Hash 到扩容后的桶中)

为了更好举例, 扩容前 B=2, 共有 4 个 bmap。

假设 overflow 太多, 触发了等量扩容, 需要将数据变得更紧凑。

假设针对上面情况, 触发了 2 倍扩容, 将 B=2 扩容到 B=3。

1.5. 迭代遍历

本来 map 的遍历过程比较简单: 遍历所有的 bucket 以及它后面挂的 overflow bucket, 然后挨个遍历 bucket 中的所有 cell。每个 bucket 中包含 8 个 cell, 从有 key 的 cell 中取出 key 和 value, 完成遍历。但是遍历如果发生在扩容的过程中, 就会涉及到遍历新老 bucket 的过程。所以在遍历过成功, 如果 map 在库容, 需要对新旧数据同时进行遍历, 下面是扩容过程示例, 图中进行二倍扩容后, *oldbuckets 中的 1 已经全部搬迁到了 *buckets 中, 所以遍历时, 需要对 *oldbuckets 和 *buckets 都进行遍历。

1.6. 总结

对于 map 的使用, 大家肯定都会, 所以基础的知识讲解的不多, 主要是对 map 的底层结构进行了详细的讲解, 正所谓知其然, 必知其所以然! 对于 map 的底层结构设计, 感觉有些意思, 特别是对于它的 hash 方式(取前 N 位和后 M 位), 再结合它的扩容和缩容, 其实可以从中提炼一些共性的东西。然后对于 Rehash, 这个和 Redis 的 Rehash 原因一样, 这块应该是业内的通用设计方法, 感兴趣的同学可以看看 redis 的字典结构和它 rehash 的方法。

1.7. Deep copy

From: https://gist.github.com/soroushjp/0ec92102641ddfc3ad5515ca76405f4d

deepcopy.go

// Package deepcopy provides a function for deep copying map[string]interface{}
// values. Inspired by the StackOverflow answer at:
// http://stackoverflow.com/a/28579297/1366283
//
// Uses the golang.org/pkg/encoding/gob package to do this and therefore has the
// same caveats.
// See: https://blog.golang.org/gobs-of-data
// See: https://golang.org/pkg/encoding/gob/
package deepcopy

import (
	"bytes"
	"encoding/gob"
)

func init() {
	gob.Register(map[string]interface{}{})
}

// Map performs a deep copy of the given map m.
func Map(m map[string]interface{}) (map[string]interface{}, error) {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)
	dec := gob.NewDecoder(&buf)
	err := enc.Encode(m)
	if err != nil {
		return nil, err
	}
	var copy map[string]interface{}
	err = dec.Decode(&copy)
	if err != nil {
		return nil, err
	}
	return copy, nil
}

deepcopy_test.go

package deepcopy

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestMap(t *testing.T) {
	testCases := []struct {
		// original and expectedOriginal are the same value in each test case. We do
		// this to avoid unintentionally asserting against a mutated
		// expectedOriginal and having the test pass erroneously. We also do not
		// want to rely on the deep copy function we are testing to ensure this does
		// not happen.
		original         map[string]interface{}
		transformer      func(m map[string]interface{}) map[string]interface{}
		expectedCopy     map[string]interface{}
		expectedOriginal map[string]interface{}
	}{
		// reassignment of entire map, should be okay even without deepcopy.
		{
			original: nil,
			transformer: func(m map[string]interface{}) map[string]interface{} {
				return map[string]interface{}{}
			},
			expectedCopy:     map[string]interface{}{},
			expectedOriginal: nil,
		},
		{
			original: map[string]interface{}{},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				return nil
			},
			expectedCopy:     nil,
			expectedOriginal: map[string]interface{}{},
		},
		// mutation of map
		{
			original: map[string]interface{}{},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				m["foo"] = "bar"
				return m
			},
			expectedCopy: map[string]interface{}{
				"foo": "bar",
			},
			expectedOriginal: map[string]interface{}{},
		},
		{
			original: map[string]interface{}{
				"foo": "bar",
			},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				m["foo"] = "car"
				return m
			},
			expectedCopy: map[string]interface{}{
				"foo": "car",
			},
			expectedOriginal: map[string]interface{}{
				"foo": "bar",
			},
		},
		// mutation of nested maps
		{
			original: map[string]interface{}{},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				m["foo"] = map[string]interface{}{
					"biz": "baz",
				}
				return m
			},
			expectedCopy: map[string]interface{}{
				"foo": map[string]interface{}{
					"biz": "baz",
				},
			},
			expectedOriginal: map[string]interface{}{},
		},
		{
			original: map[string]interface{}{
				"foo": map[string]interface{}{
					"biz": "booz",
					"gaz": "gooz",
				},
			},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				m["foo"] = map[string]interface{}{
					"biz": "baz",
				}
				return m
			},
			expectedCopy: map[string]interface{}{
				"foo": map[string]interface{}{
					"biz": "baz",
				},
			},
			expectedOriginal: map[string]interface{}{
				"foo": map[string]interface{}{
					"biz": "booz",
					"gaz": "gooz",
				},
			},
		},
		// mutation of slice values
		{
			original: map[string]interface{}{
				"foo": []string{"biz", "baz"},
			},
			transformer: func(m map[string]interface{}) map[string]interface{} {
				m["foo"].([]string)[0] = "hiz"
				return m
			},
			expectedCopy: map[string]interface{}{
				"foo": []string{"hiz", "baz"},
			},
			expectedOriginal: map[string]interface{}{
				"foo": []string{"biz", "baz"},
			},
		},
	}
	for i, tc := range testCases {
		copy, err := Map(tc.original)
		assert.NoError(t, err)
		assert.Exactly(t, tc.expectedCopy, tc.transformer(copy), "copy was not mutated. test case: %d", i)
		assert.Exactly(t, tc.expectedOriginal, tc.original, "original was mutated. test case: %d", i)
	}
}