.NET Framework 中的集合类型在这几年经历了显著的发展。得益于微软新的开放策略,让我们可以展现一个常用数据结构的两个版本:在.NET 和 Mono 中的哈希表。
理论上,哈希表是一个非常简单的构造,就是数组或链表的集合被划分到有限数量的存储体中。然而,即使你擅长于多线程逻辑,在获取该数据结构的元素项时,仍然有些复杂。 // Copyright © Microsoft Corporation. All rights reserved.
public virtual Object this[Object key] {
get {
if (key == null) {
throw new ArgumentNullException(“key”, Environment.GetResourceString(“ArgumentNull_Key”));
}
uint seed;
uint incr;
// Take a snapshot of buckets, in case another thread does a resize
bucket[] lbuckets = buckets;
uint hashcode = InitHash(key, lbuckets.Length, out seed, out incr);
int ntry = 0;
bucket b;
int bucketNumber = (int) (seed % (uint)lbuckets.Length);
do
{
int currentversion;
// A read operation on hashtable has three steps:
// (1) calculate the hash and find the slot number.
// (2) compare the hashcode, if equal, go to step 3. Otherwise end.
// (3) compare the key, if equal, go to step 4. Otherwise end.
// (4) return the value contained in the bucket.
// After step 3 and before step 4. A writer can kick in a remove the old item and add a new one
// in the same bukcet. So in the reader we need to check if the hash table is modified during above steps.
//
// Writers (Insert, Remove, Clear) will set ‘isWriterInProgress’ flag before it starts modifying
// the hashtable and will ckear the flag when it is done. When the flag is cleared, the ‘version’
// will be increased. We will repeat the reading if a writer is in progress or done with the modification
// during the read.
//
// Our memory model guarantee if we pick up the change in bucket from another processor,
// we will see the ‘isWriterProgress’ flag to be true or ‘version’ is changed in the reader.
//
int spinCount = 0;
do {
// this is violate read, following memory accesses can not be moved ahead of it.
currentversion = version;
b = lbuckets[bucketNumber];
// The contention between reader and writer shouldn’t happen frequently.
// But just in case this will burn CPU, yield the control of CPU if we spinned a few times.
// 8 is just a random number I pick.
if( (++spinCount) % 8 == 0 ) {
Thread.Sleep(1); // 1 means we are yeilding control to all threads, including low-priority ones.
}
} while ( isWriterInProgress || (currentversion != version) );
if (b.key == null) {
return null;
}
if (((b.hash_coll & 0x7FFFFFFF) == hashcode) &&
KeyEquals (b.key, key))
return b.val;
bucketNumber = (int) (((long)bucketNumber + incr)% (uint)lbuckets.Length);
} while (b.hash_coll < 0 && ++ntry < lbuckets.Length);
return null;
}
是的,各位读者请注意,这里存在一个伪自循环锁(pseudo-spin lock),并调用了 Threading.Sleep。
而在.NET 2.0 和泛型集合中,微软决定放弃集合的内部锁定机制。相反,要求任何锁定都必须在外部调用。我们在 Generic.Dictionary 类中可以发现更为简洁的代码。
// Copyright © Microsoft Corporation. All rights reserved.
private int FindEntry(TKey key) {
if( key == null) {
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
}
if (buckets != null) {
int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) {
if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
}
}
return -1;
}
Mono 版本的 Hashtable 和 Dictionary 在实现上仍然是大相径庭,而且也不同于微软的实现。
// Copyright © 2004-2005 Novell, Inc ( http://www.novell.com )
public virtual Object this[Object key]
{
get
{
if (key == null)
throw new ArgumentNullException(“key”, “null key”);
Slot[] table = this.table;
int[] hashes = this.hashes;
uint size = (uint)table.Length;
int h = this.GetHash(key) & Int32.MaxValue;
uint indx = (uint)h;
uint step = (uint)((h >> 5) + 1) % (size - 1) + 1;
for (uint i = size; i > 0; i–)
{
indx %= size;
Slot entry = table[indx];
int hashMix = hashes[indx];
Object k = entry.key;
if (k == null)
break;
if (k == key || ((hashMix & Int32.MaxValue) == h
&& this.KeyEquals(key, k)))
{
return entry.value;
}
if ((hashMix & CHAIN_MARKER) == 0)
break;
indx += step;
}
return null;
}
而 Mono 的 dictionary 实现则为:
// Copyright © 2004 Novell, Inc ( http://www.novell.com)
// Copyright © 2005 David Waite
// Copyright © 2007 HotFeet GmbH ( http://www.hotfeet.ch )
public TValue this [TKey key] {
get {
if (key == null)
throw new ArgumentNullException (“key”);
// get first item of linked list corresponding to given key
int hashCode = hcp.GetHashCode (key);
int cur = table [(hashCode & int.MaxValue) % table.Length] - 1;
// walk linked list until right slot is found or end is reached
while (cur != NO_SLOT) {
// The ordering is important for compatibility with MS and strange
// Object.Equals () implementations
if (linkSlots [cur].HashCode == hashCode && hcp.Equals (keySlots [cur], key))
return valueSlots [cur];
cur = linkSlots [cur].Next;
}
throw new KeyNotFoundException ();
}
好了,现在你可以看到,四种方法本质上实现了同样的功能。毋庸置疑,使用 Sleep 命令的那种是最差的,至于哪一个最好,我想还得由你来决定。
评论