【Java专题05】集合框架源码:从HashMap到ConcurrentHashMap-夜雨聆风

【Java专题05】集合框架源码:从HashMap到ConcurrentHashMap

面试场景还原

面试官：”请说一下HashMap的底层实现，JDK1.7和1.8有什么区别？为什么HashMap不是线程安全的？ConcurrentHashMap是如何保证线程安全的？在什么情况下HashMap会出现死链问题？”

你：（脑海中浮现出数组+链表+红黑树的结构图，从数据结构开始娓娓道来…）

考察重点

这道题是Java面试的必考题，面试官想考察：

数据结构理解：对数组、链表、红黑树的掌握程度
源码功底：是否读过HashMap、ConcurrentHashMap的核心源码
并发理解：是否明白线程不安全的具体表现和原因
版本演进：JDK1.7到1.8的优化和改进

一、HashMap底层结构

1.1 整体架构（JDK1.8）

HashMap在JDK1.8中采用数组 + 链表 + 红黑树的数据结构：

┌─────────────────────────────────────────────────────────────┐│                        HashMap                              │├─────────────────────────────────────────────────────────────┤│  ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐  ││  │  0  │  1  │  2  │  3  │  4  │  5  │  6  │  7  │ ... │  │  table数组│  └──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴─────┘  ││     │     │     │     │     │     │     │                    ││     ▼     ▼     ▼     ▼     ▼     ▼     ▼                    ││   Node  Node  Node  Node  Node  Node  Node                   ││     │                       │                               ││     ▼                       ▼                               ││  链表（长度<8）          红黑树（长度>=8）                   │└─────────────────────────────────────────────────────────────┘

1.2 核心数据结构

Node节点（链表节点）：

static class Node<K,V> implements Map.Entry<K,V> {    final int hash;      // key的hash值    final K key;         // key    V value;             // value    Node<K,V> next;      // 指向下一个节点的指针    Node(int hash, K key, V value, Node<K,V> next) {        this.hash = hash;        this.key = key;        this.value = value;        this.next = next;    }}

TreeNode节点（红黑树节点）：

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {    TreeNode<K,V> parent;   // 父节点    TreeNode<K,V> left;     // 左子节点    TreeNode<K,V> right;    // 右子节点    TreeNode<K,V> prev;     // 前驱节点（用于删除）    boolean red;            // 颜色：红色或黑色}

1.3 核心字段

public class HashMap<K,V> {    // 默认初始容量 16（必须是2的幂次方）    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;    // 最大容量 2^30    static final int MAXIMUM_CAPACITY = 1 << 30;    // 默认负载因子 0.75    static final float DEFAULT_LOAD_FACTOR = 0.75f;    // 树化阈值 8    static final int TREEIFY_THRESHOLD = 8;    // 解树化阈值 6    static final int UNTREEIFY_THRESHOLD = 6;    // 最小树化容量 64（数组长度小于64时先扩容）    static final int MIN_TREEIFY_CAPACITY = 64;    // 存储元素的数组    transient Node<K,V>[] table;    // 元素个数    transient int size;    // 扩容阈值（capacity * load factor）    int threshold;    // 负载因子    final float loadFactor;}

1.4 为什么容量必须是2的幂次方？

核心原因：为了通过位运算快速计算索引位置，替代取模运算。

// 计算索引：hash & (n - 1)// 当n是2的幂次方时，等价于 hash % nindex = (n - 1) & hash

举例：n=16，n-1=15（二进制1111）

hash:    1101 1010 1011 0010 1110 1010 1011 0010& 1111:  0000 0000 0000 0000 0000 0000 0000 1111结果:    0000 0000 0000 0000 0000 0000 0000 0010 = 2

优点：

位运算比取模快得多
均匀分布，减少hash碰撞

二、HashMap核心方法源码解析

2.1 put方法流程

public V put(K key, V value) {    return putVal(hash(key), key, value, false, true);}// 计算hash值static final int hash(Object key) {    int h;    // 高16位与低16位异或，增加随机性    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);}final V putVal(int hash, K key, V value, boolean onlyIfAbsent,               boolean evict) {    Node<K,V>[] tab; Node<K,V> p; int n, i;    // 1. 数组为空，初始化扩容    if ((tab = table) == null || (n = tab.length) == 0)        n = (tab = resize()).length;    // 2. 计算索引，如果该位置为空，直接插入    if ((p = tab[i = (n - 1) & hash]) == null)        tab[i] = newNode(hash, key, value, null);    else {        Node<K,V> e; K k;        // 3. 如果头节点hash相等且key相等，覆盖        if (p.hash == hash &&            ((k = p.key) == key || (key != null && key.equals(k))))            e = p;        // 4. 如果是红黑树节点，插入红黑树        else if (p instanceof TreeNode)            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);        else {            // 5. 链表遍历            for (int binCount = 0; ; ++binCount) {                if ((e = p.next) == null) {                    p.next = newNode(hash, key, value, null);                    // 6. 链表长度 >= 8，转换为红黑树                    if (binCount >= TREEIFY_THRESHOLD - 1)                        treeifyBin(tab, hash);                    break;                }                // 找到相同key，覆盖                if (e.hash == hash &&                    ((k = e.key) == key || (key != null && key.equals(k))))                    break;                p = e;            }        }        // 覆盖操作        if (e != null) {            V oldValue = e.value;            if (!onlyIfAbsent || oldValue == null)                e.value = value;            afterNodeAccess(e);            return oldValue;        }    }    ++modCount;    // 7. 元素个数超过阈值，扩容    if (++size > threshold)        resize();    afterNodeInsertion(evict);    return null;}

put流程总结：

1. 计算hash(key)2. 如果数组为空，初始化扩容3. 计算索引 (n-1) & hash4. 如果该位置为空，直接插入5. 否则判断是链表还是红黑树   - 红黑树：插入树中   - 链表：遍历链表，尾部插入     * 找到相同key：覆盖     * 未找到且长度>=8：转换为红黑树6. 元素个数>阈值，扩容

2.2 get方法流程

public V get(Object key) {    Node<K,V> e;    return (e = getNode(hash(key), key)) == null ? null : e.value;}final Node<K,V> getNode(int hash, Object key) {    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;    // 1. 数组不为空，且索引位置不为空    if ((tab = table) != null && (n = tab.length) > 0 &&        (first = tab[(n - 1) & hash]) != null) {        // 2. 检查第一个节点        if (first.hash == hash &&            ((k = first.key) == key || (key != null && key.equals(k))))            return first;        // 3. 遍历后续节点        if ((e = first.next) != null) {            // 红黑树查找            if (first instanceof TreeNode)                return ((TreeNode<K,V>)first).getTreeNode(hash, key);            // 链表遍历            do {                if (e.hash == hash &&                    ((k = e.key) == key || (key != null && key.equals(k))))                    return e;            } while ((e = e.next) != null);        }    }    return null;}

2.3 扩容机制（resize）

扩容触发条件：size > threshold（元素个数 > 容量 × 负载因子）

扩容流程：

创建新数组，容量为原来的2倍
重新计算每个元素在新数组中的位置
迁移元素（JDK1.8优化：要么在原位置，要么在原位置+旧容量）

final Node<K,V>[] resize() {    Node<K,V>[] oldTab = table;    int oldCap = (oldTab == null) ? 0 : oldTab.length;    int oldThr = threshold;    int newCap, newThr = 0;    // 1. 计算新容量和新阈值    if (oldCap > 0) {        // 超过最大容量，不再扩容        if (oldCap >= MAXIMUM_CAPACITY) {            threshold = Integer.MAX_VALUE;            return oldTab;        }        // 新容量 = 旧容量 × 2        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&                 oldCap >= DEFAULT_INITIAL_CAPACITY)            newThr = oldThr << 1;  // 新阈值 = 旧阈值 × 2    }    // 初始化容量    else if (oldThr > 0)        newCap = oldThr;    else {        newCap = DEFAULT_INITIAL_CAPACITY;        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);    }    // 2. 创建新数组    @SuppressWarnings({"rawtypes","unchecked"})    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];    table = newTab;    // 3. 迁移旧数据    if (oldTab != null) {        for (int j = 0; j < oldCap; ++j) {            Node<K,V> e;            if ((e = oldTab[j]) != null) {                oldTab[j] = null;                // 单个节点，直接迁移                if (e.next == null)                    newTab[e.hash & (newCap - 1)] = e;                // 红黑树节点，拆分树                else if (e instanceof TreeNode)                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);                // 链表节点，拆分为两个链表                else {                    Node<K,V> loHead = null, loTail = null;  // 低位链表（原位置）                    Node<K,V> hiHead = null, hiTail = null;  // 高位链表（原位置+oldCap）                    Node<K,V> next;                    do {                        next = e.next;                        // 判断节点在扩容后的位置                        if ((e.hash & oldCap) == 0) {                            // 低位链表（位置不变）                            if (loTail == null)                                loHead = e;                            else                                loTail.next = e;                            loTail = e;                        } else {                            // 高位链表（位置 = 原位置 + oldCap）                            if (hiTail == null)                                hiHead = e;                            else                                hiTail.next = e;                            hiTail = e;                        }                    } while ((e = next) != null);                    // 将低位链表放到原位置                    if (loTail != null) {                        loTail.next = null;                        newTab[j] = loHead;                    }                    // 将高位链表放到新位置                    if (hiTail != null) {                        hiTail.next = null;                        newTab[j + oldCap] = hiHead;                    }                }            }        }    }    return newTab;}

JDK1.8扩容优化：

优化前：重新计算每个元素的hash值
优化后：通过(e.hash & oldCap)判断位置

结果为0：放在原位置
结果不为0：放在原位置+旧容量

好处：不需要重新计算hash，提高扩容效率

2.4 树化和解树化

树化条件：

链表长度 ≥ TREEIFY_THRESHOLD（8）
数组长度 ≥ MIN_TREEIFY_CAPACITY（64）

final void treeifyBin(Node<K,V>[] tab, int hash) {    int n, index; Node<K,V> e;    // 数组长度小于64，优先扩容    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)        resize();    else if ((e = tab[index = (n - 1) & hash]) != null) {        TreeNode<K,V> hd = null, tl = null;        // 将链表节点转换为树节点        do {            TreeNode<K,V> p = replacementTreeNode(e, null);            if (tl == null)                hd = p;            else {                p.prev = tl;                tl.next = p;            }            tl = p;        } while ((e = e.next) != null);        // 将红黑树放到数组位置        if ((tab[index] = hd) != null)            hd.treeify(tab);    }}

为什么树化阈值是8？

基于泊松分布，链表长度达到8的概率极低（0.00000006）
当链表长度超过8时，说明hash碰撞非常严重，需要红黑树提高查找效率（O(log n) vs O(n)）

为什么解树化阈值是6？

避免频繁的树化和解树化
8和6之间留了缓冲，防止阈值相同导致频繁转换

三、JDK1.7 vs JDK1.8 HashMap对比

特性	JDK1.7	JDK1.8
数据结构	数组+链表	数组+链表+红黑树
插入方式	头插法	尾插法
hash计算	多次扰动	一次扰动（hash ^ (hash >>> 16)）
扩容迁移	重新计算hash	(e.hash & oldCap) == 0 判断
并发问题	扩容死链	数据覆盖

JDK1.7头插法导致的问题：

// 头插法：新节点插入到链表头部voidaddEntry(int hash, K key, V value, int bucketIndex) {    Entry<K,V> e = table[bucketIndex];    table[bucketIndex] = new Entry<>(hash, key, value, e);}

头插法在并发扩容时会产生循环链表：

初始链表: 1 -> 2 -> 3线程A扩容时，rehash后链表: 3 -> 2 -> 1线程B同时操作，可能出现: 1 -> 2 -> 3 -> 2 ... 循环链表

JDK1.8尾插法解决了死链问题：

// 尾插法：新节点插入到链表尾部p.next = newNode(hash, key, value, null);

四、ConcurrentHashMap线程安全实现

4.1 JDK1.7 ConcurrentHashMap（分段锁）

数据结构：Segment数组 + HashEntry数组 + 链表

┌─────────────────────────────────────────────────────────────┐│                    ConcurrentHashMap                        │├─────────────────────────────────────────────────────────────┤│  ┌──────────────────────────────────────────────────────┐  ││  │ Segment数组（可重入锁，默认16个）                      │  ││  │ ┌─────────┬─────────┬─────────┬─────────┬─────────┐  │  ││  │ │Segment0 │Segment1 │Segment2 │Segment3 │Segment4 │  │  ││  │ └────┬────┴────┬────┴────┬────┴────┬────┴────┬────┘  │  ││  └──────┼─────────┼─────────┼─────────┼─────────┼────────┘  ││         ▼         ▼         ▼         ▼         ▼          ││      HashEntry HashEntry HashEntry HashEntry HashEntry       ││         │                   │                               ││         ▼                   ▼                               ││      链表                 链表                               │└─────────────────────────────────────────────────────────────┘

分段锁原理：

默认16个Segment，每个Segment独立加锁
不同Segment的写操作可以并发执行
理论上并发度为16

static final class Segment<K,V> extends ReentrantLock {    transient volatile HashEntry<K,V>[] table;    transient int count;}

4.2 JDK1.8 ConcurrentHashMap（CAS + synchronized）

数据结构：数组 + 链表 + 红黑树（与HashMap一致）

线程安全实现：

CAS操作：用于更新数组、节点等
synchronized：锁定链表或红黑树的头节点

public class ConcurrentHashMap<K,V> {    // 数组，volatile保证可见性    transient volatile Node<K,V>[] table;    // 扩容时使用的数组    private transient volatile Node<K,V>[] nextTable;    // 控制标识符（初始化、扩容等状态）    private transient volatile int sizeCtl;}

4.3 put方法源码（JDK1.8）

public V put(K key, V value) {    return putVal(key, value, false);}final V putVal(K key, V value, boolean onlyIfAbsent) {    if (key == null || value == null) throw new NullPointerException();    int hash = spread(key.hashCode());    int binCount = 0;    for (Node<K,V>[] tab = table;;) {        Node<K,V> f; int n, i, fh;        // 1. 数组为空，初始化        if (tab == null || (n = tab.length) == 0)            tab = initTable();        // 2. 当前桶为空，CAS直接插入        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {            if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))                break;        }        // 3. 正在扩容，帮助扩容        else if ((fh = f.hash) == MOVED)            tab = helpTransfer(tab, f);        else {            V oldVal = null;            // 4. 锁住链表头节点            synchronized (f) {                if (tabAt(tab, i) == f) {                    // 链表                    if (fh >= 0) {                        binCount = 1;                        for (Node<K,V> e = f;; ++binCount) {                            K ek;                            if (e.hash == hash &&                                ((ek = e.key) == key ||                                 (ek != null && key.equals(ek)))) {                                oldVal = e.val;                                if (!onlyIfAbsent)                                    e.val = value;                                break;                            }                            Node<K,V> pred = e;                            if ((e = e.next) == null) {                                pred.next = new Node<K,V>(hash, key, value, null);                                break;                            }                        }                    }                    // 红黑树                    else if (f instanceof TreeBin) {                        Node<K,V> p;                        binCount = 2;                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) {                            oldVal = p.val;                            if (!onlyIfAbsent)                                p.val = value;                        }                    }                }            }            // 5. 链表长度超过8，转换为红黑树            if (binCount != 0) {                if (binCount >= TREEIFY_THRESHOLD)                    treeifyBin(tab, i);                if (oldVal != null)                    return oldVal;                break;            }        }    }    addCount(1L, binCount);    return null;}

关键点：

tabAt：Unsafe.getObjectVolatile，保证可见性
casTabAt：Unsafe.compareAndSwapObject，CAS更新
synchronized：只锁链表头节点，粒度更细

4.4 get方法源码（JDK1.8）

public V get(Object key) {    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;    int h = spread(key.hashCode());    if ((tab = table) != null && (n = tab.length) > 0 &&        (e = tabAt(tab, (n - 1) & h)) != null) {        // 头节点匹配        if ((eh = e.hash) == h) {            if ((ek = e.key) == key || (ek != null && key.equals(ek)))                return e.val;        }        // 红黑树或正在扩容        else if (eh < 0)            return (p = e.find(h, key)) != null ? p.val : null;        // 链表遍历        while ((e = e.next) != null) {            if (e.hash == h &&                ((ek = e.key) == key || (ek != null && key.equals(ek))))                return e.val;        }    }    return null;}

get方法为什么不需要加锁？

table使用volatile保证可见性
Node的val和next使用volatile保证可见性
读取的是内存中的最新值，不会读到过时数据

4.5 JDK1.7 vs JDK1.8 ConcurrentHashMap对比

特性	JDK1.7	JDK1.8
锁机制	Segment分段锁（ReentrantLock）	CAS + synchronized
锁粒度	一个Segment锁一个HashEntry数组	一个synchronized锁一个链表头
并发度	默认16，可扩展	理论上是数组长度
数据结构	Segment + HashEntry + 链表	Node + 链表 + 红黑树
扩容方式	每个Segment独立扩容	支持并发扩容

为什么JDK1.8用synchronized替代ReentrantLock？

synchronized在JDK1.6后经过优化（锁升级），性能不亚于ReentrantLock
锁粒度更细，降低锁竞争
代码更简洁，维护成本低

五、常见面试陷阱

5.1 死链问题（JDK1.7）

问题描述：多线程环境下，HashMap扩容时可能产生循环链表，导致get操作死循环。

产生原因：头插法 + 并发扩容

复现场景：

// 初始状态：数组长度2，链表 a -> b// 两个线程同时触发扩容到长度4// 线程A：rehash后链表 b -> a// 线程B：同时操作，可能导致 a -> b -> a 循环链表

解决方案：

使用ConcurrentHashMap
升级到JDK1.8（尾插法解决了死链）

5.2 数据覆盖问题（JDK1.8）

问题描述：多线程put时，可能导致数据丢失。

场景：

// 线程A和线程B同时put，key的hash相同// 线程A判断桶为空，准备插入，但还未插入// 线程B也判断桶为空，插入成功// 线程A继续插入，覆盖了线程B的数据

原因：(f = tabAt(tab, i)) == null判断后，CAS之前存在时间窗口。

解决方案：使用ConcurrentHashMap

5.3 红黑树退化问题

问题：频繁删除导致红黑树退化为链表

解决：当树节点数量小于UNTREEIFY_THRESHOLD（6）时，退化为链表

5.4 扩容期间的get问题

问题：扩容期间get数据，可能找不到

解决：ConcurrentHashMap在扩容时，旧的table和新的table同时存在，get时先查旧表，找不到再查新表

// 节点hash为MOVED（-1）表示正在扩容static final class ForwardingNode<K,V> extends Node<K,V> {    final Node<K,V>[] nextTable;    ForwardingNode(Node<K,V>[] tab) {        super(MOVED, null, null, null);        this.nextTable = tab;    }    Node<K,V> find(int h, Object k) {        // 到新表中查找        Node<K,V> e = (Node<K,V>)nextTable;        // ...    }}

六、实战：HashMap使用最佳实践

6.1 初始化容量设置

// ❌ 错误：默认容量16，扩容3次Map<String, String> map = new HashMap<>();for (int i = 0; i < 1000; i++) {    map.put("key" + i, "value" + i);}// ✅ 正确：预估容量，避免扩容int expectedSize = 1000;int capacity = (int) (expectedSize / 0.75f) + 1;Map<String, String> map = new HashMap<>(capacity);

6.2 重写equals和hashCode

public class User {    private Long id;    private String name;    @Override    public boolean equals(Object o) {        if (this == o) return true;        if (o == null || getClass() != o.getClass()) return false;        User user = (User) o;        return Objects.equals(id, user.id);    }    @Override    public int hashCode() {        return Objects.hash(id);  // 只使用id计算hash    }}

6.3 线程安全场景选择

场景	推荐方案
单线程	HashMap
多线程，读多写少	Collections.synchronizedMap
多线程，写操作频繁	ConcurrentHashMap
需要排序	ConcurrentSkipListMap

七、思维导图总结

集合框架（HashMap → ConcurrentHashMap）├── HashMap底层│   ├── 数组 + 链表 + 红黑树│   ├── 核心参数：容量16、负载因子0.75、树化阈值8│   └── 容量必须是2的幂次方├── 核心方法│   ├── put：hash → 索引 → 插入（尾插法）│   ├── get：hash → 索引 → 查找│   └── resize：扩容2倍，优化迁移（e.hash & oldCap）├── JDK1.7 vs JDK1.8│   ├── 数据结构：链表 → 红黑树│   ├── 插入方式：头插法 → 尾插法│   └── 并发问题：死链 → 数据覆盖├── ConcurrentHashMap│   ├── JDK1.7：分段锁（Segment）│   ├── JDK1.8：CAS + synchronized│   └── 并发扩容：ForwardingNode└── 面试陷阱    ├── 死链问题（JDK1.7）    ├── 数据覆盖（JDK1.8）    └── 扩容期间的可见性

八、面试Tips

加分话术

数据结构：”HashMap底层是数组+链表+红黑树，链表长度超过8且数组长度超过64时会转换为红黑树，查找复杂度从O(n)降到O(log n)。”
扩容优化：”JDK1.8扩容时通过(e.hash & oldCap)判断节点位置，不用重新计算hash，提高了扩容效率。”
并发实现：”ConcurrentHashMap在JDK1.8中使用CAS加synchronized，只锁链表头节点，比JDK1.7的分段锁粒度更细，并发度更高。”
历史问题：”JDK1.7的头插法在并发扩容时会产生循环链表，导致死循环，JDK1.8改为尾插法解决了这个问题。”

常见追问准备

HashMap的初始容量为什么是16？
负载因子为什么是0.75？
红黑树和链表的转换边界为什么是8和6？
ConcurrentHashMap的size()方法如何实现？
HashMap在多线程下还有什么问题？

避坑指南

❌ 不要混淆”树化阈值8″和”最小树化容量64″
❌ 不要认为ConcurrentHashMap完全无锁（仍有synchronized）
❌ 不要忽略hashCode()和equals()的重写规则
✅ 强调”2的幂次方”是为了位运算替代取模
✅ 区分JDK1.7和1.8的并发问题差异

下期预告

【Java专题06】Spring框架核心：从IoC源码到循环依赖解决

内容预告：

Spring IoC容器启动流程
Bean的生命周期详解
依赖注入的三种方式
循环依赖的解决原理（三级缓存）
AOP代理的实现机制