首页 > 代码库 > 【Java集合系列四】HashSet和LinkedHashSet解析
【Java集合系列四】HashSet和LinkedHashSet解析
2017-07-29 16:58:13
一、简介
1、Set概念
Set可以理解为集合,非常类似数据概念中的集合,集合三大特征:1、确定性;2、互异性;3、无序性,因此Set实现类也有类似的特征。
2、HashSet
HashSet继承自AbstractSet,实现了Set接口,但是其源码非常少,也非常简单。内部使用HashMap来存储数据,数据存储在HashMap的key中,value都是同一个默认值:
二、HashSet几个重要的方法
1、add(E e)
HashSet的确定性,也可以理解为唯一性,是通过HashMap的put方法来保证的,往HashMap中put数据时,如果key是一样的,只会替换key对应的value,不会新插入一条数据。所以往HashSet中add相同的元素没有什么用,这里的相同是通过equals方法保证的,具体的在HashMap中细说。
2、remove(Object o)
简单粗暴,从HashMap中移除一条数据。
3、contains(Object o)
4、iterator()
5、其他
其他的方法诸如:size()、isEmpty()、contains()、clear()等都完全委托给了HashMap。需要注意的是:HashSet没有提供set、get等方法。
源码如下:
1 package java.util; 2 3 import java.io.InvalidObjectException; 4 5 /** 6 * This class implements the <tt>Set</tt> interface, backed by a hash table 7 * (actually a <tt>HashMap</tt> instance). It makes no guarantees as to the 8 * iteration order of the set; in particular, it does not guarantee that the 9 * order will remain constant over time. This class permits the <tt>null</tt> 10 * element. 11 * 12 * <p>This class offers constant time performance for the basic operations 13 * (<tt>add</tt>, <tt>remove</tt>, <tt>contains</tt> and <tt>size</tt>), 14 * assuming the hash function disperses the elements properly among the 15 * buckets. Iterating over this set requires time proportional to the sum of 16 * the <tt>HashSet</tt> instance‘s size (the number of elements) plus the 17 * "capacity" of the backing <tt>HashMap</tt> instance (the number of 18 * buckets). Thus, it‘s very important not to set the initial capacity too 19 * high (or the load factor too low) if iteration performance is important. 20 * 21 * <p><strong>Note that this implementation is not synchronized.</strong> 22 * If multiple threads access a hash set concurrently, and at least one of 23 * the threads modifies the set, it <i>must</i> be synchronized externally. 24 * This is typically accomplished by synchronizing on some object that 25 * naturally encapsulates the set. 26 * 27 * If no such object exists, the set should be "wrapped" using the 28 * {@link Collections#synchronizedSet Collections.synchronizedSet} 29 * method. This is best done at creation time, to prevent accidental 30 * unsynchronized access to the set:<pre> 31 * Set s = Collections.synchronizedSet(new HashSet(...));</pre> 32 * 33 * <p>The iterators returned by this class‘s <tt>iterator</tt> method are 34 * <i>fail-fast</i>: if the set is modified at any time after the iterator is 35 * created, in any way except through the iterator‘s own <tt>remove</tt> 36 * method, the Iterator throws a {@link ConcurrentModificationException}. 37 * Thus, in the face of concurrent modification, the iterator fails quickly 38 * and cleanly, rather than risking arbitrary, non-deterministic behavior at 39 * an undetermined time in the future. 40 * 41 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed 42 * as it is, generally speaking, impossible to make any hard guarantees in the 43 * presence of unsynchronized concurrent modification. Fail-fast iterators 44 * throw <tt>ConcurrentModificationException</tt> on a best-effort basis. 45 * Therefore, it would be wrong to write a program that depended on this 46 * exception for its correctness: <i>the fail-fast behavior of iterators 47 * should be used only to detect bugs.</i> 48 * 49 * <p>This class is a member of the 50 * <a href="http://www.mamicode.com/{@docRoot}/../technotes/guides/collections/index.html"> 51 * Java Collections Framework</a>. 52 * 53 * @param <E> the type of elements maintained by this set 54 * 55 * @author Josh Bloch 56 * @author Neal Gafter 57 * @see Collection 58 * @see Set 59 * @see TreeSet 60 * @see HashMap 61 * @since 1.2 62 */ 63 64 public class HashSet<E> 65 extends AbstractSet<E> 66 implements Set<E>, Cloneable, java.io.Serializable 67 { 68 static final long serialVersionUID = -5024744406713321676L; 69 70 private transient HashMap<E,Object> map; 71 72 // Dummy value to associate with an Object in the backing Map 73 private static final Object PRESENT = new Object(); 74 75 /** 76 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has 77 * default initial capacity (16) and load factor (0.75). 78 */ 79 public HashSet() { 80 map = new HashMap<>(); 81 } 82 83 /** 84 * Constructs a new set containing the elements in the specified 85 * collection. The <tt>HashMap</tt> is created with default load factor 86 * (0.75) and an initial capacity sufficient to contain the elements in 87 * the specified collection. 88 * 89 * @param c the collection whose elements are to be placed into this set 90 * @throws NullPointerException if the specified collection is null 91 */ 92 public HashSet(Collection<? extends E> c) { 93 map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16)); 94 addAll(c); 95 } 96 97 /** 98 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has 99 * the specified initial capacity and the specified load factor. 100 * 101 * @param initialCapacity the initial capacity of the hash map 102 * @param loadFactor the load factor of the hash map 103 * @throws IllegalArgumentException if the initial capacity is less 104 * than zero, or if the load factor is nonpositive 105 */ 106 public HashSet(int initialCapacity, float loadFactor) { 107 map = new HashMap<>(initialCapacity, loadFactor); 108 } 109 110 /** 111 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has 112 * the specified initial capacity and default load factor (0.75). 113 * 114 * @param initialCapacity the initial capacity of the hash table 115 * @throws IllegalArgumentException if the initial capacity is less 116 * than zero 117 */ 118 public HashSet(int initialCapacity) { 119 map = new HashMap<>(initialCapacity); 120 } 121 122 /** 123 * Constructs a new, empty linked hash set. (This package private 124 * constructor is only used by LinkedHashSet.) The backing 125 * HashMap instance is a LinkedHashMap with the specified initial 126 * capacity and the specified load factor. 127 * 128 * @param initialCapacity the initial capacity of the hash map 129 * @param loadFactor the load factor of the hash map 130 * @param dummy ignored (distinguishes this 131 * constructor from other int, float constructor.) 132 * @throws IllegalArgumentException if the initial capacity is less 133 * than zero, or if the load factor is nonpositive 134 */ 135 HashSet(int initialCapacity, float loadFactor, boolean dummy) { 136 map = new LinkedHashMap<>(initialCapacity, loadFactor); 137 } 138 139 /** 140 * Returns an iterator over the elements in this set. The elements 141 * are returned in no particular order. 142 * 143 * @return an Iterator over the elements in this set 144 * @see ConcurrentModificationException 145 */ 146 public Iterator<E> iterator() { 147 return map.keySet().iterator(); 148 } 149 150 /** 151 * Returns the number of elements in this set (its cardinality). 152 * 153 * @return the number of elements in this set (its cardinality) 154 */ 155 public int size() { 156 return map.size(); 157 } 158 159 /** 160 * Returns <tt>true</tt> if this set contains no elements. 161 * 162 * @return <tt>true</tt> if this set contains no elements 163 */ 164 public boolean isEmpty() { 165 return map.isEmpty(); 166 } 167 168 /** 169 * Returns <tt>true</tt> if this set contains the specified element. 170 * More formally, returns <tt>true</tt> if and only if this set 171 * contains an element <tt>e</tt> such that 172 * <tt>(o==null ? e==null : o.equals(e))</tt>. 173 * 174 * @param o element whose presence in this set is to be tested 175 * @return <tt>true</tt> if this set contains the specified element 176 */ 177 public boolean contains(Object o) { 178 return map.containsKey(o); 179 } 180 181 /** 182 * Adds the specified element to this set if it is not already present. 183 * More formally, adds the specified element <tt>e</tt> to this set if 184 * this set contains no element <tt>e2</tt> such that 185 * <tt>(e==null ? e2==null : e.equals(e2))</tt>. 186 * If this set already contains the element, the call leaves the set 187 * unchanged and returns <tt>false</tt>. 188 * 189 * @param e element to be added to this set 190 * @return <tt>true</tt> if this set did not already contain the specified 191 * element 192 */ 193 public boolean add(E e) { 194 return map.put(e, PRESENT)==null; 195 } 196 197 /** 198 * Removes the specified element from this set if it is present. 199 * More formally, removes an element <tt>e</tt> such that 200 * <tt>(o==null ? e==null : o.equals(e))</tt>, 201 * if this set contains such an element. Returns <tt>true</tt> if 202 * this set contained the element (or equivalently, if this set 203 * changed as a result of the call). (This set will not contain the 204 * element once the call returns.) 205 * 206 * @param o object to be removed from this set, if present 207 * @return <tt>true</tt> if the set contained the specified element 208 */ 209 public boolean remove(Object o) { 210 return map.remove(o)==PRESENT; 211 } 212 213 /** 214 * Removes all of the elements from this set. 215 * The set will be empty after this call returns. 216 */ 217 public void clear() { 218 map.clear(); 219 } 220 221 /** 222 * Returns a shallow copy of this <tt>HashSet</tt> instance: the elements 223 * themselves are not cloned. 224 * 225 * @return a shallow copy of this set 226 */ 227 @SuppressWarnings("unchecked") 228 public Object clone() { 229 try { 230 HashSet<E> newSet = (HashSet<E>) super.clone(); 231 newSet.map = (HashMap<E, Object>) map.clone(); 232 return newSet; 233 } catch (CloneNotSupportedException e) { 234 throw new InternalError(e); 235 } 236 } 237 238 /** 239 * Save the state of this <tt>HashSet</tt> instance to a stream (that is, 240 * serialize it). 241 * 242 * @serialData The capacity of the backing <tt>HashMap</tt> instance 243 * (int), and its load factor (float) are emitted, followed by 244 * the size of the set (the number of elements it contains) 245 * (int), followed by all of its elements (each an Object) in 246 * no particular order. 247 */ 248 private void writeObject(java.io.ObjectOutputStream s) 249 throws java.io.IOException { 250 // Write out any hidden serialization magic 251 s.defaultWriteObject(); 252 253 // Write out HashMap capacity and load factor 254 s.writeInt(map.capacity()); 255 s.writeFloat(map.loadFactor()); 256 257 // Write out size 258 s.writeInt(map.size()); 259 260 // Write out all elements in the proper order. 261 for (E e : map.keySet()) 262 s.writeObject(e); 263 } 264 265 /** 266 * Reconstitute the <tt>HashSet</tt> instance from a stream (that is, 267 * deserialize it). 268 */ 269 private void readObject(java.io.ObjectInputStream s) 270 throws java.io.IOException, ClassNotFoundException { 271 // Read in any hidden serialization magic 272 s.defaultReadObject(); 273 274 // Read capacity and verify non-negative. 275 int capacity = s.readInt(); 276 if (capacity < 0) { 277 throw new InvalidObjectException("Illegal capacity: " + 278 capacity); 279 } 280 281 // Read load factor and verify positive and non NaN. 282 float loadFactor = s.readFloat(); 283 if (loadFactor <= 0 || Float.isNaN(loadFactor)) { 284 throw new InvalidObjectException("Illegal load factor: " + 285 loadFactor); 286 } 287 288 // Read size and verify non-negative. 289 int size = s.readInt(); 290 if (size < 0) { 291 throw new InvalidObjectException("Illegal size: " + 292 size); 293 } 294 295 // Set the capacity according to the size and load factor ensuring that 296 // the HashMap is at least 25% full but clamping to maximum capacity. 297 capacity = (int) Math.min(size * Math.min(1 / loadFactor, 4.0f), 298 HashMap.MAXIMUM_CAPACITY); 299 300 // Create backing HashMap 301 map = (((HashSet<?>)this) instanceof LinkedHashSet ? 302 new LinkedHashMap<E,Object>(capacity, loadFactor) : 303 new HashMap<E,Object>(capacity, loadFactor)); 304 305 // Read in all elements in the proper order. 306 for (int i=0; i<size; i++) { 307 @SuppressWarnings("unchecked") 308 E e = (E) s.readObject(); 309 map.put(e, PRESENT); 310 } 311 } 312 313 /** 314 * Creates a <em><a href="http://www.mamicode.com/Spliterator.html#binding">late-binding</a></em> 315 * and <em>fail-fast</em> {@link Spliterator} over the elements in this 316 * set. 317 * 318 * <p>The {@code Spliterator} reports {@link Spliterator#SIZED} and 319 * {@link Spliterator#DISTINCT}. Overriding implementations should document 320 * the reporting of additional characteristic values. 321 * 322 * @return a {@code Spliterator} over the elements in this set 323 * @since 1.8 324 */ 325 public Spliterator<E> spliterator() { 326 return new HashMap.KeySpliterator<E,Object>(map, 0, -1, 0, 0); 327 } 328 }
三、LinkedHashSet
1、LinekdHashSet简介
LinkedHashSet继承自HashSet,源码更少、更简单,唯一的区别是LinkedHashSet内部使用的是LinkHashMap。这样做的意义或者好处就是LinkedHashSet中的元素顺序是可以保证的,也就是说遍历序和插入序是一致的。
2、Demo使用
打印日志如上,HashSet和HashMap都不保证顺序,Link**能保证顺序。
源码如下:
1 public class MainSet { 2 public static void main(String[] args) { 3 Object value = http://www.mamicode.com/new Object(); 4 HashMap<String, Object> hashMap = new HashMap<>(); 5 HashSet<String> hashSet = new HashSet<>(); 6 LinkedHashMap<String, Object> linkedHashMap = new LinkedHashMap<>(); 7 LinkedHashSet<String> linkedHashSet = new LinkedHashSet<>(); 8 hashSet.add("java"); 9 hashMap.put("java", value); 10 linkedHashSet.add("java"); 11 linkedHashMap.put("java", value); 12 13 hashSet.add("golang"); 14 hashMap.put("golang", value); 15 linkedHashSet.add("golang"); 16 linkedHashMap.put("golang", value); 17 18 hashSet.add("python"); 19 hashMap.put("python", value); 20 linkedHashSet.add("python"); 21 linkedHashMap.put("python", value); 22 23 hashSet.add("ruby"); 24 hashMap.put("ruby", value); 25 linkedHashSet.add("ruby"); 26 linkedHashMap.put("ruby", value); 27 28 hashSet.add("scala"); 29 hashMap.put("scala", value); 30 linkedHashSet.add("scala"); 31 linkedHashMap.put("scala", value); 32 33 hashSet.add("c"); 34 hashMap.put("c", value); 35 linkedHashSet.add("c"); 36 linkedHashMap.put("c", value); 37 38 System.out.println("默认插入序:\njava\tgolang\tpython\truby\tscala\tc"); 39 40 System.out.println(" \nHashSet:-------------------"); 41 for (String str : hashSet) { 42 System.out.print(str + "\t"); 43 } 44 45 System.out.println(" \n\nHashMap:-------------------"); 46 for (Map.Entry<String, Object> entry : hashMap.entrySet()) { 47 System.out.print(entry.getKey() + "\t"); 48 } 49 50 System.out.println(" \n\nLinkedHashSet:-------------------"); 51 for (String str : linkedHashSet) { 52 System.out.print(str + "\t"); 53 } 54 55 System.out.println(" \n\nLinkedHashMap:-------------------"); 56 for (Map.Entry<String, Object> entry : linkedHashMap.entrySet()) { 57 System.out.print(entry.getKey() + "\t"); 58 } 59 } 60 }
3、debug中奇异现象
如上,我们明明添加了6个元素,但是table中只有5个,怎么回事呢?初步猜测应该是“c”元素和其中某一个元素位于同一个bucket,验证如下:
我们发现,“java”竟然和“c”位于同一个bucket,他俩在同一个链表中。唯一的疑惑是:“c”是后加入的元素,按理说应该在链表的表头才对啊,这个问题还需要探究。
同时这里也介绍了一种能debug到HashMap内部数据结构的方法,但是需要注意2个问题:
1、需要在AS中设置一下,否则debug看到的信息不是这样的,如下图:
2、直接使用table[3].next是不行的, 需要像图上那样,些完整的包名才行。
4、继续debug
如上所示,设置HashSet之后,同样能看到如下信息:
看看LinkedHashMap,如下图,LinkedHashMap中多了head和tail,这是指向表头、表尾的指针,head指向“java”,tail指向“c”,这和我们的插入序保持一致,但是实际存储和之前是一样的。
LinkedHashSet如下图:
【Java集合系列四】HashSet和LinkedHashSet解析