首页 > 代码库 > HashMap和HashSet原理及底层实现
HashMap和HashSet原理及底层实现
HashMap底层用哈希算法实现,下面看一下哈希算表的整体概括:
当map.put(“key”,”values”);的时候,底层是这样的:
static final Entry<?,?>[] EMPTY_TABLE = {};
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
int threshold; // 临界值 它等于HashMap的容量乘以负载因子
final float loadFactor;// 负载因子
public V put(K key, V value) {
// 如果table为空,则使其不为空
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
// 如果key为null,调用putForNullKey处理
if (key == null)
return putForNullKey(value);
int hash = hash(key);
// 搜索指定hash值对应的索引
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
// 如果hash值相同,并且equals比较返回true,则覆盖,然后返回被覆盖的
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
// 如果i索引处的entry为null,表明此处还没有entry
modCount++;
addEntry(hash, key, value, i);
return null;
}
// 添加entry
void addEntry(int hash, K key, V value, int bucketIndex) {
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);//原来长度的2倍
hash = (null != key) ? hash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
createEntry(hash, key, value, bucketIndex);
}
void createEntry(int hash, K key, V value, int bucketIndex) {
Entry<K,V> e = table[bucketIndex];
// 头插法建立链
table[bucketIndex] = new Entry<>(hash, key, value, e);
size++;
}
void resize(int newCapacity) {
Entry[] oldTable = table;//先记录下来table
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE; //static final int MAXIMUM_CAPACITY = 1 << 30;
return;
}
Entry[] newTable = new Entry[newCapacity];//这个是原来长度的2倍
transfer(newTable, initHashSeedAsNeeded(newCapacity));
table = newTable;
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable, boolean rehash) {// rehash 是否重新hash
int newCapacity = newTable.length;
for (Entry<K,V> e : table) {
while(null != e) {
Entry<K,V> next = e.next;
if (rehash) {
e.hash = null == e.key ? 0 : hash(e.key);
}
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
}
}
}
/**
* Initialize the hashing mask value. We defer(延迟) initialization until we
* really need it.
*/
final boolean initHashSeedAsNeeded(int capacity) {
boolean currentAltHashing = hashSeed != 0;
boolean useAltHashing = sun.misc.VM.isBooted() &&
(capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
boolean switching = currentAltHashing ^ useAltHashing;
if (switching) {
hashSeed = useAltHashing
? sun.misc.Hashing.randomHashSeed(this)
: 0;
}
return switching;
}
// 内部类 entry
static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
Entry<K,V> next;// 指向下一个entry
int hash;
/**
* Creates new entry.
*/
Entry(int h, K k, V v, Entry<K,V> n) {
value = v;
next = n;
key = k;
hash = h;
}
说明:
对于HashMap及其子类而言,它们采用Hash算法来决定集合中元素的存储位置。当系统开始初始化HashMap时,系统会创建一个长度为capacity的Entry数组,这个数组里可以存储元素的位置被称为“桶(bucket)”,每个bucket都有其指定索引,系统可以根据其索引快速访问该bucket里存储的元素。当每个bucket只存储一个元素时,HashMap性能最好。当解决冲突而产生的链越长,性能越差。
装填因子load factor,默认值是0.75,这个是空间和时间的折衷,增大装填因子,可以减小Hash表所占用的空间,但会增加查找时间,减小装填因子,会提高数据查询性能,但会增加Hash表所占用的内存空间。
在new 一个hashMap的时候,可以适当的传入要建立的大小,传入的应该是2的n次幂。
HashSet的底层实现
HashSet底层是通过HashMap实现的,看如下的构造函数,构造HashSet的时候底层就构造了一个HashMap
public HashSet() {
map = new HashMap<>();
}
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
add的时候调用map的put方法,value始终是PRESENT。
The general contract of hashCode is:
· Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
· If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
· It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
HashMap和HashSet原理及底层实现