HashMap
1. The HashMap family

```java
public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable
```

As the declaration shows, HashMap is an implementation class of the Map interface.
2. Basics

A HashMap is built from an array whose slots hold either a linked list or a red-black tree. For every key that is put in, a hash value is computed and mapped to an array index. If that index is already occupied, the new entry joins the bucket at that position; whether the bucket is a linked list or a red-black tree depends on how many elements it holds.

```java
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
static final int MAXIMUM_CAPACITY = 1 << 30;
static final float DEFAULT_LOAD_FACTOR = 0.75f;
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
static final int MIN_TREEIFY_CAPACITY = 64;
```

```java
transient Node<K,V>[] table;
transient Set<Map.Entry<K,V>> entrySet;
transient int size;
transient int modCount;
int threshold;
final float loadFactor;
```
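The load factor and capacity together decide when the table grows: once `size` exceeds `capacity * loadFactor`, a resize is triggered. A minimal sketch with the default constants (`ThresholdDemo` is an illustrative class, not HashMap's internal code):

```java
// Sketch of how the resize threshold follows from capacity * loadFactor.
public class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the defaults, the 13th entry triggers the first resize.
        System.out.println(threshold(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR)); // 12
        System.out.println(threshold(32, DEFAULT_LOAD_FACTOR)); // 24
    }
}
```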
Each occupied slot of the array holds a Node, the head of the bucket described above; in the common case the bucket is a linked list.

```java
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this) {
            return true;
        }
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue())) {
                return true;
            }
        }
        return false;
    }
}
```
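To make the chaining concrete, here is a stripped-down, self-contained sketch (an illustrative `BucketDemo`, not the JDK class) that links two colliding entries through `next` and walks the chain:

```java
public class BucketDemo {
    // A stripped-down copy of the Node structure: each entry points to the
    // next entry that hashed to the same slot.
    static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Walks a bucket's chain and counts the colliding entries.
    static int bucketLength(Node<?, ?> head) {
        int count = 0;
        for (Node<?, ?> e = head; e != null; e = e.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Two entries with the same hash: the newer one is linked in via next.
        Node<String, Integer> head = new Node<>(42, "a", 1, null);
        head.next = new Node<>(42, "b", 2, null);
        System.out.println(bucketLength(head)); // 2
    }
}
```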
3. HashMap initialization

```java
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0) {
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    }
    if (initialCapacity > MAXIMUM_CAPACITY) {
        initialCapacity = MAXIMUM_CAPACITY;
    }
    if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    }
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
}

public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
```
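Note that the two-argument constructor stores `tableSizeFor(initialCapacity)` in `threshold`; `tableSizeFor` rounds the requested capacity up to the next power of two, and the table itself is only allocated lazily on the first put. A sketch of the rounding (the bit-smearing below follows the JDK 8 implementation; `TableSizeDemo` is an illustrative wrapper class):

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Rounds cap up to the next power of two: copying the highest set bit
    // into every lower position turns n into 0...0111...1, and adding 1
    // then yields a power of two.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16 (already a power of two)
        System.out.println(tableSizeFor(17)); // 32
    }
}
```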
4. put

```java
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
```
On a put, the key's hash value is computed first:

```java
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```

If the key is null, the hash is 0. Otherwise, `(h = key.hashCode()) ^ (h >>> 16)` acts as a spreading (perturbation) function: h is shifted right by 16 bits and XORed with itself. Because the shifted value has zeros in its high 16 bits, the high 16 bits of the result are unchanged; the low 16 bits are XORed with the original high 16 bits. In effect the high half of the hash code is mixed into the low half, which lowers the collision rate, since only the low bits are used for indexing when the table is small. There is more to this topic; see [this discussion of hash computation](https://www.zhihu.com/question/20733617).
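The effect of the spreading step can be seen with two hash codes that differ only in their high 16 bits: without spreading they collide in a 16-slot table, after spreading they land in different slots (`HashSpreadDemo` is an illustrative class; the `hash` method mirrors the JDK 8 source above):

```java
public class HashSpreadDemo {
    // JDK 8's spreading function: XOR the high 16 bits into the low 16 bits.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16; // table length: only the low 4 bits pick the slot
        int h1 = 0x0001_0005; // two hash codes differing only in the high 16 bits
        int h2 = 0x0002_0005;

        // Without spreading, both keys fall into slot 5...
        System.out.println((n - 1) & h1); // 5
        System.out.println((n - 1) & h2); // 5

        // ...after spreading, the high bits separate them.
        System.out.println((n - 1) & (h1 ^ (h1 >>> 16))); // 4
        System.out.println((n - 1) & (h2 ^ (h2 >>> 16))); // 7
    }
}
```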
Once the hash is available, the actual put happens in putVal:

```java
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab;
    Node<K,V> p;
    int n, i;
    // Lazily allocate the table on first use.
    if ((tab = table) == null || (n = tab.length) == 0) {
        n = (tab = resize()).length;
    }
    // Empty slot: place the new node directly.
    if ((p = tab[i = (n - 1) & hash]) == null) {
        tab[i] = newNode(hash, key, value, null);
    } else {
        Node<K,V> e;
        K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k)))) {
            e = p; // the head of the bucket already holds this key
        } else if (p instanceof TreeNode) {
            e = ((TreeNode<K,V>) p).putTreeVal(this, tab, hash, key, value);
        } else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // Convert the list to a tree once it grows past the threshold.
                    if (binCount >= TREEIFY_THRESHOLD - 1) {
                        treeifyBin(tab, hash);
                    }
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                    break;
                }
                p = e;
            }
        }
        if (e != null) { // key already present: replace the value
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null) {
                e.value = value;
            }
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold) {
        resize();
    }
    afterNodeInsertion(evict);
    return null;
}
```
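A side note on the indexing in putVal: because the table length n is always a power of two, `(n - 1) & hash` keeps exactly the low bits of the hash, behaving like a cheap modulo that stays in range even for negative hash values. A minimal sketch (`IndexDemo` is an illustrative class):

```java
public class IndexDemo {
    // For a power-of-two table length n, (n - 1) & hash keeps exactly the
    // low log2(n) bits of the hash -- an always-in-range "modulo".
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        System.out.println(index(53, 16)); // 5, same as 53 % 16
        System.out.println(index(-7, 16)); // 9, still in [0, 16) for a negative hash
    }
}
```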
In putVal above, when the table has not been initialized yet, or when the size exceeds the resize threshold, the resize function is called to (re)allocate the table:

```java
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                   oldCap >= DEFAULT_INITIAL_CAPACITY) {
            newThr = oldThr << 1; // double both capacity and threshold
        }
    } else if (oldThr > 0) {
        newCap = oldThr; // initial capacity was stored in threshold
    } else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float) newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY
                  ? (int) ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes", "unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[]) new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null) {
                    // Single node: recompute its slot directly.
                    newTab[e.hash & (newCap - 1)] = e;
                } else if (e instanceof TreeNode) {
                    ((TreeNode<K,V>) e).split(this, newTab, j, oldCap);
                } else {
                    // Split the list into a "lo" list (index unchanged)
                    // and a "hi" list (index + oldCap).
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) {
                                loHead = e;
                            } else {
                                loTail.next = e;
                            }
                            loTail = e;
                        } else {
                            if (hiTail == null) {
                                hiHead = e;
                            } else {
                                hiTail.next = e;
                            }
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```
One point worth noting here. When computing an element's index we use (n - 1) & hash, and because n comes from tableSizeFor, n is always a power of two. For example, with n = 8 the binary representation of n is 1000 and n - 1 is 0111.

When the table grows, the new capacity is twice n, here 16, which is 10000 in binary, so n - 1 becomes 01111.

For any given element, the old and new index differ only through the extra high bit of the new mask:

old capacity, n - 1: 00111
new capacity, n - 1: 01111

XORing these two masks gives 01000, which is exactly oldCap. Therefore, if (e.hash & oldCap) == 0, the element's index stays the same even after the resize; if that bit is 1, the element has to move.

This is also why a bucket has to be split during a resize: the entries in one bucket all produced the same (n - 1) & hash under the old n, which is why they were stored in the same slot, but now that n has changed, some of them may map to a different slot after recomputation. That is what the loHead and hiHead lists later in resize are for.

The calculation above also shows that the difference between the old and new index is exactly the old capacity, so the new index equals the old index plus oldCap.
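Putting the rule into numbers (oldCap = 8, newCap = 16): an entry whose hash has a 0 at the oldCap bit keeps its index, while one with a 1 there moves up by exactly oldCap (`ResizeIndexDemo` is an illustrative class):

```java
public class ResizeIndexDemo {
    // Slot of a hash in a table of capacity cap (cap is a power of two).
    static int index(int hash, int cap) {
        return (cap - 1) & hash;
    }

    public static void main(String[] args) {
        int oldCap = 8, newCap = 16;
        int h1 = 0b00101; // (h1 & oldCap) == 0 -> slot unchanged after doubling
        int h2 = 0b01101; // (h2 & oldCap) != 0 -> slot moves up by oldCap

        System.out.println(index(h1, oldCap)); // 5
        System.out.println(index(h1, newCap)); // 5, same slot

        System.out.println(index(h2, oldCap)); // 5
        System.out.println(index(h2, newCap)); // 13 == 5 + oldCap
    }
}
```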