HashMap

HashMap 是 Java 集合框架中最核心的类之一，也是面试的高频考点。理解其底层数据结构、哈希冲突解决机制和扩容过程，是掌握 Java 集合框架的关键。

一、底层数据结构

1.1 JDK 1.8 数据结构

HashMap 采用数组 + 链表 + 红黑树的数据结构。

                    ┌─────────────────────────────────────────────────┐
                    │                   HashMap                       │
                    ├─────────────────────────────────────────────────┤
                    │                                                 │
    index 0         │   ┌─────┐                                       │
    index 1         │   │ null│                                       │
    index 2         │   ├─────┼──→ ┌─────┐──→ ┌─────┐──→ ┌─────┐      │
                    │   │  2  │    │ K,V │    │ K,V │    │ K,V │      │
                    │   └─────┘    └─────┘    └─────┘    └─────┘      │
                    │              链表节点                          │
    index 3         │   ┌─────┐                                       │
                    │   │ null│                                       │
                    │   └─────┘                                       │
    index 4         │   ┌─────┼──→ ┌─────────────────────────┐        │
                    │   │  4  │    │        红黑树节点        │        │
                    │   └─────┘    │  (链表长度 ≥ 8 时转换)  │        │
                    │              └─────────────────────────┘        │
    ...             │   ...                                          │
                    │                                                 │
    index n-1       │   ┌─────┐                                       │
                    │   │ null│                                       │
                    │   └─────┘                                       │
                    │                                                 │
                    │   table 数组（Node<K,V>[]）                      │
                    └─────────────────────────────────────────────────┘

1.2 Node 节点结构

// JDK 1.8 Node 节点
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;       // 哈希值
    final K key;          // 键
    V value;              // 值
    Node<K,V> next;       // 下一个节点
    
    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
 
// 红黑树节点
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;   // 父节点
    TreeNode<K,V> left;     // 左子节点
    TreeNode<K,V> right;    // 右子节点
    TreeNode<K,V> prev;     // 前驱节点
    boolean red;            // 颜色
}

1.3 重要参数

public class HashMap<K,V> {
    // 默认初始容量（16，必须是 2 的幂）
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    
    // 最大容量
    static final int MAXIMUM_CAPACITY = 1 << 30;
    
    // 默认负载因子（0.75）
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    
    // 链表转红黑树的阈值（链表长度 ≥ 8）
    static final int TREEIFY_THRESHOLD = 8;
    
    // 红黑树转链表的阈值（树节点数 ≤ 6）
    static final int UNTREEIFY_THRESHOLD = 6;
    
    // 转红黑树时数组的最小容量
    static final int MIN_TREEIFY_CAPACITY = 64;
    
    // 存储元素的数组
    transient Node<K,V>[] table;
    
    // 元素个数
    transient int size;
    
    // 扩容阈值：threshold = capacity * loadFactor
    int threshold;
    
    // 负载因子
    final float loadFactor;
}

二、哈希计算

2.1 哈希函数

// 计算 key 的哈希值
static final int hash(Object key) {
    int h;
    // 高 16 位异或低 16 位（扰动函数）
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
 
// 计算数组下标
// index = (n - 1) & hash
// n 是数组长度，必须是 2 的幂

2.2 为什么用扰动函数？

                原始 hashCode
                ┌─────────────────────────────────────┐
                │ h0000000 00000000 00000000 0000abcd │
                └─────────────────────────────────────┘
                                │
                                │ h >>> 16
                                ▼
                ┌─────────────────────────────────────┐
                │ 00000000 00000000 h0000000 00000000 │
                └─────────────────────────────────────┘
                                │
                                │ XOR (^)
                                ▼
                ┌─────────────────────────────────────┐
                │ h0000000 00000000 h0000000 0000abcd │  ← 扰动后的哈希值
                └─────────────────────────────────────┘

好处：让高 16 位也参与下标计算，减少哈希冲突

2.3 为什么数组长度是 2 的幂？

// 使用位运算代替取模
index = (n - 1) & hash;
 
// 等价于
index = hash % n;  // 但位运算效率更高
 
// 前提：n 是 2 的幂
// n = 16, n - 1 = 15 = 0000 1111
// 任何数 & 0000 1111 结果都在 0-15 之间

三、put 操作流程

3.1 流程图

┌─────────────────────────────────────────────────────────────┐
│                    HashMap put 流程                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. 计算 key 的 hash 值                                      │
│           │                                                 │
│           ▼                                                 │
│  2. table 是否为空或长度为 0？                                │
│           │                                                 │
│       ┌───┴───┐                                             │
│       是      否                                             │
│       │       │                                             │
│       ▼       │                                             │
│   初始化扩容  │                                             │
│       │       │                                             │
│       └───┬───┘                                             │
│           │                                                 │
│           ▼                                                 │
│  3. 计算 index = (n-1) & hash                                │
│     检查 table[index] 是否为空                               │
│           │                                                 │
│       ┌───┴───┐                                             │
│       空    非空                                             │
│       │       │                                             │
│       ▼       ▼                                             │
│   直接插入  遍历查找                                          │
│           │                                                 │
│       ┌───┴───────────────────────┐                         │
│       │           │               │                         │
│   key 相同    链表节点         红黑树节点                     │
│       │           │               │                         │
│       ▼           ▼               ▼                         │
│   覆盖 value  继续遍历        树查找                         │
│           │               │                                 │
│           ▼               ▼                                 │
│       插入新节点      插入树节点                              │
│           │               │                                 │
│           └───────┬───────┘                                 │
│                   │                                         │
│                   ▼                                         │
│  4. 检查链表长度是否 ≥ 8                                      │
│           │                                                 │
│       ┌───┴───┐                                             │
│       是      否                                             │
│       │       │                                             │
│       ▼       │                                             │
│   转红黑树    │                                             │
│       │       │                                             │
│       └───┬───┘                                             │
│           │                                                 │
│           ▼                                                 │
│  5. size++，检查是否需要扩容                                  │
│           │                                                 │
│       ┌───┴───┐                                             │
│    size >   size ≤                                           │
│   threshold threshold                                        │
│       │       │                                             │
│       ▼       ▼                                             │
│     扩容    完成                                             │
│                                                             │
└─────────────────────────────────────────────────────────────┘

3.2 源码分析

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
 
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    
    // 1. 表为空则初始化
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    
    // 2. 计算下标，该位置为空则直接插入
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        
        // 3. key 相同，直接覆盖
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        
        // 4. 红黑树节点
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        
        // 5. 链表节点
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // 链表长度 ≥ 8 转红黑树
                    if (binCount >= TREEIFY_THRESHOLD - 1)
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        
        // 6. 存在相同 key，覆盖 value
        if (e != null) {
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    
    ++modCount;
    
    // 7. 检查是否需要扩容
    if (++size > threshold)
        resize();
    
    afterNodeInsertion(evict);
    return null;
}

四、扩容机制

4.1 扩容触发条件

当 size > threshold（threshold = capacity × loadFactor）时触发扩容。

// 默认情况
capacity = 16
loadFactor = 0.75
threshold = 16 * 0.75 = 12
 
// 当元素个数超过 12 时触发扩容

4.2 扩容过程

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    
    // 1. 计算新容量
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1;  // 新容量 = 旧容量 * 2
    }
    else if (oldThr > 0)  // 指定了初始容量
        newCap = oldThr;
    else {               // 默认值
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    
    // 2. 创建新数组
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    
    // 3. 数据迁移
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    // 单个节点，直接计算新位置
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    // 红黑树拆分
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else {
                    // 链表拆分
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            // 原位置
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            // 原位置 + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

4.3 扩容时链表拆分原理

扩容前（容量 16）：
index = hash & 15 = hash & 0000 1111

扩容后（容量 32）：
index = hash & 31 = hash & 0001 1111

区别：多了一位（第 5 位）

如果 hash 第 5 位 = 0：新 index = 原 index
如果 hash 第 5 位 = 1：新 index = 原 index + 16

┌─────────────────────────────────────────────┐
│  原链表（index = 4）                         │
│  A → B → C → D → E                          │
│                                             │
│  扩容后拆分为两条链：                         │
│  链1（index = 4）：A → C → E                 │
│  链2（index = 20）：B → D                    │
│                                             │
│  通过 hash & oldCap 判断：                    │
│  = 0：留在原位置                              │
│  = 1：移动到原位置 + oldCap                   │
└─────────────────────────────────────────────┘

五、get 操作流程

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
 
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    
    // 1. 计算下标
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        
        // 2. 检查第一个节点
        if (first.hash == hash &&
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        
        // 3. 遍历链表或红黑树
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

六、JDK 7 vs JDK 8 区别

对比项	JDK 7	JDK 8
数据结构	数组 + 链表	数组 + 链表 + 红黑树
链表插入	头插法	尾插法
扩容顺序	倒序（链表反转）	正序（保持顺序）
哈希计算	4 次扰动	1 次扰动
红黑树	无	链表 ≥ 8 转，≤ 6 转回

6.1 头插法 vs 尾插法

// JDK 7 头插法（并发扩容可能导致死循环）
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];  // 头插法
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
 
// JDK 8 尾插法（保持顺序，避免死循环）
// 见 resize() 源码分析

头插法问题（JDK 7）：
扩容前：A → B → C
扩容后：C → B → A（顺序反转）

并发扩容可能导致环形链表：
线程1：处理到 B，暂停
线程2：完成扩容，B → A
线程1：恢复执行，A → B
结果：A → B → A → B ...（死循环）

七、常见面试题

Q1: HashMap 的底层数据结构？

A: JDK 8 采用数组 + 链表 + 红黑树：

数组：存储元素的主体
链表：解决哈希冲突
红黑树：链表长度 ≥ 8 且数组长度 ≥ 64 时转换

Q2: 为什么链表转红黑树的阈值是 8？

源码注释：根据泊松分布，链表长度达到 8 的概率极低（约 0.00000006%）
红黑树节点大小是链表节点的 2 倍，转换有成本
8 是时间换空间的平衡点

Q3: HashMap 是线程安全的吗？

A: 不是。线程安全的替代方案：

// 1. ConcurrentHashMap（推荐）
Map<String, String> map = new ConcurrentHashMap<>();
 
// 2. Collections.synchronizedMap
Map<String, String> map = Collections.synchronizedMap(new HashMap<>());
 
// 3. Hashtable（不推荐，性能差）
Map<String, String> map = new Hashtable<>();

Q4: 为什么负载因子是 0.75？

太小（如 0.5）：空间浪费，频繁扩容
太大（如 1.0）：哈希冲突增加，查询效率降低
0.75 是空间和时间的平衡点

Q5: HashMap 的扩容过程？

新容量 = 旧容量 × 2
创建新数组
数据迁移：链表拆分为两组（原位置 / 原位置+旧容量）
红黑树拆分或退化

Q6: 为什么 JDK 8 改用尾插法？

头插法在并发扩容时可能导致链表成环（死循环）
尾插法保持节点顺序，避免此问题
但 HashMap 仍不是线程安全的

Q7: HashMap 如何解决哈希冲突？

链地址法：相同哈希值的元素存入链表
扰动函数：让高 16 位参与运算，减少冲突
红黑树优化：链表过长时转为红黑树，提高查询效率

八、总结

特性	说明
数据结构	数组 + 链表 + 红黑树
初始容量	16
负载因子	0.75
扩容倍数	2 倍
转红黑树	链表 ≥ 8 且容量 ≥ 64
退回链表	树节点 ≤ 6
哈希计算	`(h = key.hashCode()) ^ (h >>> 16)`
下标计算	`(n - 1) & hash`
线程安全	否

最后更新：2026年3月2日

ArrayList vs LinkedList ConcurrentHashMap