Hash from Adrian Ettlinger on 2001-12-11 (stdin)

From: Adrian Ettlinger <aettlinger_at_worldnet.att.net>
Date: Tue, 11 Dec 2001 08:07:50 -0500

For Bill Raymond and Tom Holroyd:

    Hey, I'm enjoying reading this technical discussion, although a lot of
it is over my head. I never heard of MD5, and I'm not sure I really want to
learn about it.

    But....... let me, at my sophomoric level, talk a bit about my
experience with hash tables. I first learned about the technique from Don
Woods' code. I assume the term "collision" means when two different
positions happen to generate an identical hash value. Don's code provided
only 19001 different hash values, so with 100,000 positions in storage,
there have to be many, many collisions. The way Don's code handles the
problem is that, whenever a collision is generated, a chaining sequence is
originated so that all positions with identical hash values are compared
brute force. This is the main factor which caused the solver to run very,
very slow with large trees, slowing down rather rapidly to the pace of a
crawl. I once mentioned this problem to a "guru" friend, and he tipped me
off to a process he called "double hash", where you compute two hash values
based on two different formulae, and compare the combination of the two
values. I introduced this technique, and it yielded a very dramatic
improvement in the FcPro solver's ability to run much larger trees at a much
faster pace. Chaining is still used, but the maximum chain that develops
becomes much much shorter. Rereading Bill's latest, I see the term
"secondary search". May I assume that that's the accepted term for what I'm
calling "chaining"?

    So, my question after all that description is this. What do you guys
who really know what you're doing think of that? Or put another way, how do
they handle it "downtown"?

Best regards, ---------------Adrian
Received on Tue Dec 11 2001 - 05:44:23 IST