HashMap_Java

迷惑点，记录在这：

UNTREEIFY_THRESHOLD 和capacity 的关系还是没太明白，希望看到的友友解答下

扩容阈值 == 负载因子 * table容量

1. 为什么HashMap的阈值是0.75，为什么JDK1.8后链表切换红黑树长度为8

查看源码，HashMap上面有一段注释：

* Because TreeNodes are about twice the size of regular nodes, we
* use them only when bins contain enough nodes to warrant use
* (see TREEIFY_THRESHOLD). And when they become too small (due to
* removal or resizing) they are converted back to plain bins.  In
* usages with well-distributed user hashCodes, tree bins are
* rarely used.  Ideally, under random hashCodes, the frequency of
* nodes in bins follows a Poisson distribution
* (http://en.wikipedia.org/wiki/Poisson_distribution) with a
* parameter of about 0.5 on average for the default resizing
* threshold of 0.75, although with a large variance because of
* resizing granularity. Ignoring variance, the expected
* occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
* factorial(k)). The first values are:
*
* 0:    0.60653066
* 1:    0.30326533
* 2:    0.07581633
* 3:    0.01263606
* 4:    0.00157952
* 5:    0.00015795
* 6:    0.00001316
* 7:    0.00000094
* 8:    0.00000006

树的结点占用空间是常规节点的2倍，我们使用树节点只有在容纳足够多的元素的时候。并且当元素个数减少足够小恢复普通节点。在随机HashCode的情况下，结点的使用频率遵循泊松分布。

源码中的参数：

 
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    
    static final int MAXIMUM_CAPACITY = 1 << 30;  // 最大容量

    
    static final float DEFAULT_LOAD_FACTOR = 0.75f;  // 默认负载因子

    
    static final int TREEIFY_THRESHOLD = 8;  // 当链表长度为8转换成红黑树； 大于2至少为8

    
    static final int UNTREEIFY_THRESHOLD = 6;

    
    static final int MIN_TREEIFY_CAPACITY = 64;  // 树化最小数组容量

当table > 64 链表大于8 发生树化，table < 64,时，使用put添加元素，如果table>阈值且index下标位置上不等于空，进行table的2倍扩容；默认负载因子0.75

HashMap在jdk1.8之后引入了红黑树的概念，表示若桶中链表元素超过8时，会自动转化成红黑树；若桶中元素小于等于6时，树结构还原成链表形式。

平均查找长度：

顺序查找：

从表的一端开始，顺序扫描线性表，依次将扫描到的节点关键字和给定值k相比较。

等概率条件下...平均查找长度：ASL = (n+....+2+1)/n= (n+1)/2。

原因：

　　红黑树的平均查找长度是log(n)，长度为8，查找长度为log(8)=3，链表的平均查找长度为n/2，当长度为8时，平均查找长度为8/2=4，这才有转换成树的必要；链表长度如果是小于等于6，6/2=3，虽然速度也很快的，但是转化为树结构和生成树的时间并不会太短。

选择6和8的原因：

避免频繁插入删除在 len = 8 徘徊，频繁切换

2. put操作

If the map previously contained a mapping for the key, the old value is replaced.
当映射后的值相同，旧值被替换；Set中好像是抛弃

源码： resize方法（）

Initializes or doubles table size.  If null, allocates in accord with initial capacity target held in field threshold. Otherwise, because we are using power-of-two expansion, the elements from each bin must either stay at same index, or move
with a power of two offset in the new table.
用来初始化或进行二倍扩容；如果为空，用初始的容量进行分配。由于我们使用二倍扩容，每一个元素要么保持在原地，要么为 oldIndex + capacity

Node[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        // newThr: 扩容后下次扩容的条件； newCap: 扩容后table数组的大小
        int newCap, newThr = 0;
        // 正常扩容
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        // 使用构造方法进行创建的
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        // 没有经过初始化， 即new HashMap<>()
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // 通过newCap * 负载因子进行计算
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }

// 省略。。。

跟着这个 up看源码：

HashMap夺命14问，你能坚持到第几问？_wenwenaier的博客-CSDN博客

3. hash（）实现

return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);

使用hashCode() 计算出h1然后与h1右移16进行高位异或

4. JDK1.7中的rehash

这个值一般都为false, 为true的时候表示需要重新计算一次hash, 计算结果主要来自容量和默认threshold的关系，可以在IDEA中通过参数 -D jdk.map.althashing.threshold = xxx设置；当容量大于等于这个数发生重新hash计算；默认是Integer.MAX_VALUE

5. fail-fast机制

fail-fast 机制，即快速失败机制，是java集合(Collection)中的一种错误检测机制。当在迭代集合的过程中该集合在结构上发生改变的时候，就有可能会发生fail-fast，即抛出 ConcurrentModificationException异常。fail-fast机制并不保证在不同步的修改下一定会抛出异常，它只是尽最大努力去抛出，所以这种机制一般仅用于检测bug。

HashMap中有这样一个字段：modCount; 当进行put 和 remove 的时候都会进行 ++

This field is used to make iterators on Collection-views of the HashMap fail-fast. 
这个值用来使迭代HashMap容器视图是快速失败

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // 省略
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

final Node removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node[] tab; Node p; int n, index;
        // 省略

            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    tab[index] = node.next;
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }

实现原理：

在迭代的时候会调用next（）方法，如果当前ModCount和expectedModCount不相等，说明在遍历过程中有线程调用了remove（）或者 put（）方法，修改了结构，发生并发异常，终止操作；这就是fail-fast机制

final Node nextNode() {
       Node[] t;
       Node e = next;
       if (modCount != expectedModCount)
           throw new ConcurrentModificationException();
       if (e == null)
           throw new NoSuchElementException();
       if ((next = (current = e).next) == null && (t = table) != null) {
           do {} while (index < t.length && (next = t[index++]) == null);
       }
        return e;
}

6. JDK 1.7中HashMap线程不安全

多线程环境下，两个线程创建HashMap，然后进行扩容操作，使用双指针进行元素转移；假设线程2被阻塞然后线程1完成元素扩容，但是线程2的结点指针是执行线程1的，在恢复后从线程1上将元素转移到自己的容器中，但是在转移过程中，使用的where(next != null) ，最终可能导致执行e.next后产生环形链表

这是1.7中的情况，下面是1.8中的代码，使用了do ... while循环

do {
	next = e.next;
	if ((e.hash & oldCap) == 0) {
		if (loTail == null)
			loHead = e;
		else
			loTail.next = e;
		loTail = e;
	}
	else {
		if (hiTail == null)
			hiHead = e;
		else
			hiTail.next = e;
		hiTail = e;
	}
} while ((e = next) != null);
if (loTail != null) {
	loTail.next = null;
	newTab[j] = loHead;
}
if (hiTail != null) {
	hiTail.next = null;
	newTab[j + oldCap] = hiHead;
}

HashMap

Java相关栏目本月热门文章