Java学习之HashMap源码阅读

文章目录

HashMap源码阅读
- 一、阅读说明
- 二、JDK1.8之前和JDK1.8之后的区别
- - 1. 底层数据结构的不同
  - 2. hash碰撞后的链表插入方式不同
  - 3. Entry替换成了Node
- 三、重要的参数
- 四、用到的数据结构
- 五、重要的方法
- 六、总结
- 参考来源：

HashMap源码阅读一、阅读说明

本文所读HashMap源码系源于jdk14.0.1。

二、JDK1.8之前和JDK1.8之后的区别 1. 底层数据结构的不同

JDK1.8之前：数组+链表。
JDK1.8之后：数组+链表+红黑树。JDK1.8对HashMap做了优化，当链表长度超过8，且数组大小超过64，就将链表转换为红黑树。

2. hash碰撞后的链表插入方式不同

jdk1.7在发生hash碰撞后，会在链表头部插入新加入的元素
jdk1.8之后则在发生hash碰撞后采用尾插法，即将新加入的元素插入链表尾部

3. Entry替换成了Node 三、重要的参数

默认初始容量(16)

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

最大容量(<= 1<<30)

static final int MAXIMUM_CAPACITY = 1 << 30;

默认负载因子

static final float DEFAULT_LOAD_FACTOR = 0.75f;

转化为红黑树的阈值(8)

static final int TREEIFY_THRESHOLD = 8;

从红黑树树退化为链表的阈值(6)

static final int UNTREEIFY_THRESHOLD = 6;

转化为红黑树的最小容量(64)
```
static final int MIN_TREEIFY_CAPACITY = 64;
```
注意：数组的数量必须大于64才触发转化红黑树，小于64时只会扩容
存储元素的Node数组
```
transient Node[] table;
```
存放具体元素的集合
```
transient Set> entrySet;
```
引发扩容的临界值
```
int threshold;
```
注意：当map的实际大小超过这个临界值（容量*负载因子），并且当前位置已有元素时，就会进行扩容
负载因子
```
final float loadFactor;
```

四、用到的数据结构

当发生hash冲突时，需要用链表来存放这些hash冲突的内容。在冲突数量较少的情况下，可以优先使用链表，但是当链表长度大于8且数组的大小大于64时，需要将链表转化为红黑树来提升效率。因此，HashMap中会存在两种数据结构Node和TreeNode。

Node

源码如下：

static class Node implements Map.Entry {
        final int hash;
        final K key;
        V value;
        Node next;

        Node(int hash, K key, V value, Node next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry e = (Map.Entry)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

理解：

Node节点会存储四个属性，首先有一个存储哈希值的int类型hash变量，其次是存储HashMap的key值以及value值的两个变量，最后是指向下一个节点的next变量。

TreeNode

源码如下：

static final class TreeNode extends linkedHashMap.Entry {
        TreeNode parent;  // red-black tree links
        TreeNode left;
        TreeNode right;
        TreeNode prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node next) {
            super(hash, key, val, next);
        }
        //篇幅过长，省略不粘贴了
}

理解：

TreeNode为红黑树节点，共有五个属性，分别为父节点、左子节点、右子节点、父节点以及当前节点的颜色。

五、重要的方法

put方法

源码如下：

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
    	//首先定义tab数组和节点p
        Node[] tab; Node p; int n, i;
    	//如果table为空或者table的长度为0，需要扩容
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
    	//根据哈希值找到对应的table下标，如果这个下标对应的位置为null，表示还没有任何元素，则在此处建立链表，并且把对应的key-value存起来
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {//根据哈希值找到对应的table下标，如果这个下标对应的位置不为null，表示已经有元素在这里
            Node e; K k;
            //p.hash是这个位置(即key值相同的Node链表节点)对应的hash值，要加入的元素的hash值与p.hash一样，也就是说有key相等的Node，直接放在这个位置
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            //如果这个节点是TreeNode类型的，那说明它是用红黑树存储的，则以红黑树方式存储
            else if (p instanceof TreeNode)
                e = ((TreeNode)p).putTreeval(this, tab, hash, key, value);
            else {
                //否则，就说明它是链表存储的，则以链表形式存储
                for (int binCount = 0; ; ++binCount) {
                    //移动到链表尾部
                    if ((e = p.next) == null) {
                        //binCount是记录链表节点个数的
                        p.next = newNode(hash, key, value, null);
                        //如果链表节点数大于或等于TREEIFY_THRESHOLD，则转化为红黑树
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    //如果当前节点的key和要插入元素的key相等，表明已有该元素的位置，直接break
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            //处理key对应的value值
            if (e != null) { // existing mapping for key
                //更新旧值，即决定是否将原来的值覆盖掉，如果onlyIfAbsent为false或者原来节点的旧值为null，则需要更新旧值
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        //modCount和线程安全有关，fail-fast机制，modCount就是修改次数，需要注意的是HashMap是线程不安全的
        ++modCount;
    	//如果size超过threshold阈值并且当前位置已有元素，就触发扩容
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

理解：给HashMap添加元素时，建立一个table，把要添加的key-value值封装在Node或者TreeNode中，以链表或红黑树的形式解决hash冲突。此外，添加完之后，如果map里的元素个数（注意不包括当前添加的这个）大于threshold阈值（size*负载因子），则触发扩容。

get方法

源码如下：

public V get(Object key) {
        Node e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    final Node getNode(int hash, Object key) {
        Node[] tab; Node first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node，检测第一个节点，如果第一个节点就是要get的key，直接返回对应的节点
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            //前面未能返回，说明第一个节点并不是要get的key，则判断其他节点
            if ((e = first.next) != null) {
                //如果是树节点，就要按照红黑树的查找方式取节点
                if (first instanceof TreeNode)
                    return ((TreeNode)first).getTreeNode(hash, key);
                do {
                    //不是树节点，则循环遍历链表，直到找到要get的key，将其对应的节点返回
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        //没有这个key值，返回null
        return null;
    }

resize方法

源码如下：

先看构造函数中的threshold：

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
	}
//对于tableSizeFor

    static final int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

理解：tableSizeFor的功能（不考虑大于最大容量的情况）是返回大于输入参数且最近的2的整数次幂的数。

    final Node[] resize() {
        Node[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        //数组长度大于0
        if (oldCap > 0) {
            //数组容量大于或等于最大容量，threshold直接等于整型最大值，并直接返回原数组
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            //新数组容量（旧数组的两倍）未大于最大容量，且旧数组的容量大于或等于默认初始化容量，则新的threshold为旧的threshold的两倍
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        //旧的threshold大于0且数组容量等于0，则新的容量为旧的threshold值
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        //旧的threshold等于0且数组容量等于0，则新的容量为默认初始容量，新的threshold为默认负载因子*默认初始容量
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        //如果经过上面的步骤后，新的threshold等于0，则表明需要设置新的threshold
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        //更新threshold
        threshold = newThr;
        //新建一个table，容量为newCap，把原来table的东西拷贝到新的table里
        @SuppressWarnings({"rawtypes","unchecked"})
        Node[] newTab = (Node[])new Node[newCap];
        table = newTab;
        //如果旧数组不为null，就进行拷贝
        if (oldTab != null) {
            //for循环遍历旧数组的每一个元素
            for (int j = 0; j < oldCap; ++j) {
                Node e;
                //当前遍历到的旧数组不为null，可以进行操作
                if ((e = oldTab[j]) != null) {
                    //把旧数组置为null
                    oldTab[j] = null;
                    //接下来的操作都是放在新数组上进行的
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    //如果节点为树节点，则采用红黑树的方式
                    else if (e instanceof TreeNode)
                        ((TreeNode)e).split(this, newTab, j, oldCap);
                    //不是树节点，则为链表方式
                    else { // preserve order
                        Node loHead = null, loTail = null;
                        Node hiHead = null, hiTail = null;
                        Node next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

理解：扩容首先对容量和临界值进行更新，更新完成后根据是红黑树还是链表（采用尾插法），将旧数组的元素拷贝到新数组。

remove方法

源码如下：

public V remove(Object key) {
        Node e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }

    
    final Node removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node[] tab; Node p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node node = null, e; K k; V v;
            //看看第一个节点能不能匹配上，能则把节点用node记下
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            //第一个节点不能匹配上，则看下一个节点
            else if ((e = p.next) != null) {
                //得先分辨是树还是链表，是树则按照树的方式处理（获取该key对应的节点）
                if (p instanceof TreeNode)
                    node = ((TreeNode)p).getTreeNode(hash, key);
                //是链表则用链表的方式
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            //对记录下来的node进行处理
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                //是树节点，则用红黑树的方式删除节点
                if (node instanceof TreeNode)
                    ((TreeNode)node).removeTreeNode(this, tab, movable);
                //是链表，则用链表的方式删除节点，若node为链表第一个元素，直接node.next给index对应的tab位置即可
                else if (node == p)
                    tab[index] = node.next;
                //不是链表第一个元素，需要这样处理
                else
                    p.next = node.next;
                //改变的次数
                ++modCount;
                //map的大小
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

理解：删除节点也是分为红黑树和链表的情况，分别处理。

containsKey方法

源码如下:
```
public boolean containsKey(Object key) {
        return getNode(hash(key), key) != null;
    }
```
理解：它内部调用了getNode()方法，因此也get()方法类似，只是get完之后判断是否为null来确定map里有没有这个key。

六、总结

扩容机制

当向map中添加元素时，若map中的条目数量超过了临界值（负载因子*当前容量），那么就要考虑扩容了。扩容的resize()方法一般发生在put元素时，其，一如果table为空或者table的长度为0，需要扩容；其二，在添加完元素后，判断++size > threshold是否成立，成立则触发扩容。

链表和红黑树的转换

先看看put()方法里的部分源码：

//如果链表节点数大于或等于TREEIFY_THRESHOLD（8），则转化为红黑树
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
    treeifyBin(tab, hash);
break;

这个treeifyBin的源码如下：

 
    final void treeifyBin(Node[] tab, int hash) {
        int n, index; Node e;
        //如果数组table为null或者其当前容量小于最小转化为树的容量MIN_TREEIFY_CAPACITY（64），则先进行resize()看是否需要扩容
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        //当前容量大于或等于MIN_TREEIFY_CAPACITY（64）时，才进行链表到红黑树的转化
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode hd = null, tl = null;
            do {
                TreeNode p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                //链表转化为红黑树的方法
                hd.treeify(tab);
        }
    }

理解：如果链表节点数大于或等于数组为空或者TREEIFY_THRESHOLD（8），调用treeifyBin(tab, hash)。在treeifyBin(tab, hash)中，当前容量小于MIN_TREEIFY_CAPACITY（64），就先扩容；否则将链表转化为红黑树。

参考来源：

Java：HashMap源码阅读（带图）
HashMap中的modCount

Java学习之HashMap源码阅读

Java相关栏目本月热门文章