In the Go language, Map is implemented based on a Hash Table. The core idea of a hash table is to map a key to a storage location (bucket) using a hash function, enabling fast data lookup, insertion, and deletion.

Basic Concepts

Map

Also known as a dictionary, hash table, or associative array, a map is a widely used data structure in computer science for storing key-value pairs, where each key appears only once. Common implementations include hash tables, balanced binary search trees, skip lists, and prefix trees.

Hash Table

Hash tables are the most common implementation of maps. They use a hash function to map keys to storage locations (buckets), enabling fast lookup, insertion, and deletion operations.

  • Average time complexity for lookup, insertion, and deletion is O(1).
  • Hash Collision: When multiple keys are hashed to the same location, collisions are resolved using methods such as chaining or open addressing.
  • Resizing Mechanism: When the load factor (number of elements / number of buckets) exceeds a threshold, the hash table is resized, typically by doubling the number of buckets and rehashing all elements.

Hash tables are used in scenarios requiring efficient lookup, insertion, and deletion, such as database indexing and caching systems.

Balanced Binary Search Tree

Some languages (e.g., C++’s std::map) use balanced binary search trees (e.g., red-black trees) to implement maps.

  • Time complexity for lookup, insertion, and deletion is O(log n).
  • Ordering: Key-value pairs are stored in order, enabling efficient range queries and traversal.
  • No need to handle hash collisions, but performance is slightly lower than hash tables.

Balanced binary search trees are used in scenarios requiring ordered storage and range queries, such as dictionaries and sorted data.

Hash Collision

When two or more keys produce the same hash value, they are stored in the same bucket, resulting in a hash collision. Common resolution methods include chaining and open addressing, with Go using the former.

Bucket

In Go’s map implementation, a Bucket is the basic storage unit of a hash table. Each bucket can store multiple key-value pairs. Specifically:

  • Bucket Structure: Each bucket is a fixed-size array, typically storing 8 key-value pairs. If a hash collision occurs (i.e., multiple keys are hashed to the same bucket), the bucket stores additional key-value pairs in a linked list.
  • Role of the Hash Function: The hash function maps a key to a bucket index. Go’s hash function generates a hash value based on the key’s type, then maps the hash value to a bucket index using modulo or other methods.
  • Hash Collision: When multiple keys are hashed to the same bucket, Go’s map stores these key-value pairs in the same bucket. If the bucket is full, Go uses a linked list to chain additional key-value pairs to the bucket.

Load Factor

The load factor is a measure of how densely a hash table is populated, determining when the hash table needs to be resized. The load factor is calculated as:

1 Load Factor = Number of Elements / Number of Buckets

In Go’s map, the default load factor is 6.5. When the load factor exceeds this threshold, Go triggers the map’s resizing operation.

  • Performance Balance: A higher load factor increases space utilization but also increases the probability of hash collisions, degrading the performance of lookup, insertion, and deletion operations.
  • Resizing Timing: When the load factor exceeds the threshold, Go resizes the map to reduce hash collisions and maintain performance.

Go Implementation

The key’s data type must be comparable. Incomparable types include slices, maps, and functions.

Source Code

 1// A header for a Go map.
 2type hmap struct {
 3    // Number of elements in the map, must be the first field in the struct because the built-in len function reads from here
 4	count     int 
 5    // Map status flags, such as whether it is being written or migrated, because map is not thread-safe, flags are checked during operations
 6	flags     uint8
 7    // log_2 of buckets (can hold up to loadFactor * 2^B elements, i.e., 6.5*2^B, more than that requires hashGrow)
 8	B         uint8
 9    // Approximate number of overflow buckets, used to determine if too many overflow buckets are created
10	noverflow uint16
11    // Hash seed, random hash seed to prevent hash collision attacks
12	hash0     uint32
13	// Pointer to the buckets array storing data, size 2^B, may be nil if count == 0
14	buckets    unsafe.Pointer
15    // Pointer to the old buckets array, half the size, non-nil only during growing, used to determine if in resizing state
16	oldbuckets unsafe.Pointer
17    // Resizing progress flag, buckets below this address have been migrated
18	nevacuate  uintptr
19	clearSeq   uint64
20	// Reduces GC scanning, used when both key and value can be inlined
21	extra *mapextra
22}
23
24// mapextra holds fields that are not present on all maps.
25type mapextra struct {
26	// If key and value do not contain pointers and can be inlined (<=128 bytes)
27	// Use extra to store overflow buckets to avoid GC scanning the entire map
28	// However, bmap.overflow is also a pointer, breaking the assumption that bmap does not contain pointers
29    // In this case, overflow pointers are stored in the extra field
30
31	// Stores overflow bucket list
32	overflow    *[]*bmap
33    // Stores old overflow bucket list
34	oldoverflow *[]*bmap
35	// Pointer to free overflow bucket
36	nextOverflow *bmap
37}
38
39// A bucket for a Go map.
40type bmap struct {
41    // Stores the high 8 bits of the hash values of the 8 keys in the bucket, tophash[0] < minTopHash indicates the bucket is in migration
42	tophash [abi.OldMapBucketCount]uint8
43	// The following fields are not explicitly defined in bmap but are added by the compiler during compilation
44    // keys              // Each bucket can hold up to 8 keys
45    // values            // 8 keys correspond to 8 values
46    // overflow pointer  // Overflow bucket created after a hash collision
47}

Graphical Representation

1+-------------------+
2|       Map         |
3+-------------------+
4|   Hash Function   |
5+-------------------+
6|   Bucket Array    |  --> Points to multiple buckets
7+-------------------+
8|   Other Metadata  |
9+-------------------+
1+-------------------+
2|      Bucket       |
3+-------------------+
4|   tophash [8]     |  --> Stores the high 8 bits of the hash value of each key-value pair (for quick matching)
5+-------------------+       
6|   keys [8]        |  --> Stores keys
7+-------------------+
8|   values [8]      |  --> Stores values
9+-------------------+
 1+-------------------+       +-------------------+
 2|      Bucket       |       |   Overflow Bucket |
 3+-------------------+       +-------------------+
 4|   tophash [8]     |       |    tophash [8]    |
 5+-------------------+       +-------------------+
 6|     keys [8]      |       |     keys [8]      |
 7+-------------------+       +-------------------+
 8|    values [8]     |       |    values [8]     |
 9+-------------------+       +-------------------+
10|    overflow *     |  -->  |    overflow *     |  --> ...
11+-------------------+       +-------------------+
 1+-------------------+
 2|       Map         |
 3+-------------------+
 4|   Hash Function   |
 5+-------------------+
 6|   Bucket Array    |  -->  +-------------------+
 7+-------------------+       |      Bucket       |
 8|   Other Metadata  |       +-------------------+
 9+-------------------+       |   tophash [8]     |
10                            +-------------------+
11                            |     keys [8]      |
12                            +-------------------+
13                            |    values [8]     |
14                            +-------------------+
15                            |    overflow *    |  --> +-------------------+
16                            +-------------------+      |   Overflow Bucket |
17                                                       +-------------------+
18                                                       |   tophash [8]     |
19                                                       +-------------------+
20                                                       |   keys [8]        |
21                                                       +-------------------+
22                                                       |   values [8]      |
23                                                       +-------------------+
24                                                       |   overflow *      |  --> ..
25                                                       +-------------------+

When allocating memory, bmap allocates a larger memory space, with the first 8 bytes being bmap, followed by 8 keys, 8 values, and 1 overflow pointer.

map

Lookup

Key Location Process

Assume the key is hashed to produce a 64-bit hash value, such as:

1 10010111 | 000011110110110010001111001010100010010110010101010 │ 01010

The low bits of the hash determine the bucket location, and the high bits determine the position within the bucket.

The low bits of the hash are the lower B bits of the hash value. For example, if B is 5, 2^5=32, meaning the map has 32 buckets. Only the lower 5 bits of the hash value are needed to determine which bucket the key-value pair falls into. Here, 01010 is bucket 10.

The high bits of the hash, or tophash, are the upper 8 bits of the hash value. The tophash determines the key’s index within the bucket. Here, it is 10010111, or 151.

In summary, in bucket 10, find the index with tophash 151, which is index 2, the index of the desired key. If the key is not found in this bucket and overflow is not nil, continue searching in the overflow bucket.

 1func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
 2	// ……
 3	
 4	// If h is nil or has no elements, return zero value
 5	if h == nil || h.count == 0 {
 6		return unsafe.Pointer(&zeroVal[0])
 7	}
 8	
 9	// Read and write conflict
10	if h.flags&hashWriting != 0 {
11		throw("concurrent map read and map write")
12	}
13	
14	// Hash algorithm for different key types is determined at compile time
15	alg := t.key.alg
16	
17	// Calculate hash value, adding hash0 for randomness
18	hash := alg.hash(key, uintptr(h.hash0))
19	
20	// For example, B=5, then m is 31, binary is all 1s
21	// To get bucket num, hash is ANDed with m,
22	// Making bucket num determined by the lower 8 bits of hash
23	m := uintptr(1)<<h.B - 1
24	
25	// b is the bucket address
26	b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))
27	
28	// If oldbuckets is not nil, resizing is in progress
29	if c := h.oldbuckets; c != nil {
30	    // If not same size resizing (see resizing section)
31	    // Solution for condition 1
32		if !h.sameSizeGrow() {
33			// New bucket count is twice the old count
34			m >>= 1
35		}
36		
37		// Find key's bucket position in the old map
38		oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize)))
39		
40		// If oldb has not been migrated to the new bucket
41		// Search in the old bucket
42		if !evacuated(oldb) {
43			b = oldb
44		}
45	}
46	
47	// Calculate high 8 bits of hash
48	// Equivalent to right shifting 56 bits, taking only the high 8 bits
49	top := uint8(hash >> (sys.PtrSize*8 - 8))
50	
51	// Add minTopHash
52	if top < minTopHash {
53		top += minTopHash
54	}
55	for {
56	    // Iterate over the 8 positions in the bucket
57		for i := uintptr(0); i < bucketCnt; i++ {
58		    // tophash does not match, continue
59			if b.tophash[i] != top {
60				continue
61			}
62			// tophash matches, locate key position
63			k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
64			// key is a pointer
65			if t.indirectkey {
66			    // Dereference
67				k = *((*unsafe.Pointer)(k))
68			}
69			// If key matches
70			if alg.equal(key, k) {
71			    // Locate value position
72				v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
73				// Dereference value
74				if t.indirectvalue {
75					v = *((*unsafe.Pointer)(v))
76				}
77				return v
78			}
79		}
80		
81		// Bucket searched (not found), continue searching in overflow bucket
82		b = b.overflow(t)
83		// Overflow bucket also searched, key not found
84		// Return zero value
85		if b == nil {
86			return unsafe.Pointer(&zeroVal[0])
87		}
88	}
89}