[Boards: 3 / a / aco / adv / an / asp / b / bant / biz / c / can / cgl / ck / cm / co / cock / d / diy / e / fa / fap / fit / fitlit / g / gd / gif / h / hc / his / hm / hr / i / ic / int / jp / k / lgbt / lit / m / mlp / mlpol / mo / mtv / mu / n / news / o / out / outsoc / p / po / pol / qa / qst / r / r9k / s / s4s / sci / soc / sp / spa / t / tg / toy / trash / trv / tv / u / v / vg / vint / vip / vp / vr / w / wg / wsg / wsr / x / y ] [Search | Free Show | Home]

How the fuck do hash maps work?

This is a blue board which means that it's for everybody (Safe For Work content only). If you see any adult content, please report it.

Thread replies: 10
Thread images: 1

File: KermitMFC.jpg (109KB, 743x574px) Image search: [Google]
KermitMFC.jpg
109KB, 743x574px
Am I maximum retard?
>cannot comprehend hash maps
I get the concept, you have an array of pairs, you hash the lefthand side of the pair to get an array index, you go there, either your array is actually an array of lists of pairs and you add to the list or it's not and you start jumping around until you find an empty spot, in the latter case if the ratio of data to spots exceeds a certain threshold you grow the array and rehash all the pairs, right?
My problem is this.
Universal double hashing.
You need to pick two hash functions randomly from some pool of hash functions that are all valid.
How do you do that? How can you just up and pick two hash functions? Where do you even get the pool from? Writing one hash function seems hard enough.
And then, not only that, but they need to be coprime to the table length.
Do they?
Because I know in linear probing and quadratic probing that's a problem. But with hashing, wouldn't the coefficient be more unpredictable / pseudorandom? So would having it be a coprime actually be unnecessary?
And even if so, once again, how do you actually choose the hash functions? How do you define a whole pool of hash functions and then just choose two out of it?
Speaking of which, I've heard you should actually pick the two hash functions from two pools defined completely differently. Is this true and why?
Also, if the coefficient is completely unpredictable, what happens if an insertion operation fails? Let's say it hasn't even tried all the spots, it just fails anyway because all the spots it did try were occupied and then it just looped back around to the first spot it tried without having tried them all. The dreaded situation with linear and quadratic probing. Seems considerably less likely with double hashing, but still possible, no?
If that happens, should you just immediately grow the table, without regard to load factor?
Also, is there some point where if the load factor gets low enough, you should shrink the table?
>>
>>61388164
>Am I maximum retard?
probably
>>
>>61388164
>You need to pick two hash functions randomly from some pool of hash functions that are all valid.
>How can you just up and pick two hash functions?
>randomly

So double hashing is just putting hash tables in your hash table. When you have a single hashing function, maybe some of your data collides, and then you have a bunch of linked lists. If you take that colliding data and hash it again with a function independent of the first hash function, it will produce values that don't collide among the records that produced colliding values for the first hash function (though they may collide with other records not in the same bucket).
>>
> How can you just up and pick two hash functions? Where do you even get the pool from?
The same hash function with random value added. It's called 'keyed hash function'. It's pretty much the same concept as salt. See SipHash on wikipedia.
>>
>>61388164
There is countless info on how hash maps work. Holy fuck you're retarded. A simple video on youtube wouldv'e saved you the time you used to write here insted.
Simplest example. You want to insert and search some data from a hash table, in constant time. You create a simple ARRAY, you get your data(suppose it's strings here), and you put it through a function: like summing all the characters of the string, and then "mod (len(ARRAY)).". Some data might generate the same result from the hash function, and you can solve it using: Linked Lists or chaining operations. You don't understand the core principles, you will never understand the advanced ideas, like the perfect hashing function from Knuth.
>>
>>61388164
Look up how Java hashmaps work, seems to be standard in univrsities. I think Java uses a load factor of about 75%, before growing the hash.
>>
>>61389081
did you even read op's post you fucking braindead retard?
>>
>>61388164
it's literally take some value of type key, map it to a discrete value that maps to some address in an array and on collision, use some type of scanning or another hash round.

on exhaustion of address space it's literally like any other kind of resizing array, alloc a geometrically larger fucking array and then reinsert the key/values using the same hashing functions.
>>
>>61389179
no cuz he's a retard.
>>
>>61388178
fpbp
Thread posts: 10
Thread images: 1


[Boards: 3 / a / aco / adv / an / asp / b / bant / biz / c / can / cgl / ck / cm / co / cock / d / diy / e / fa / fap / fit / fitlit / g / gd / gif / h / hc / his / hm / hr / i / ic / int / jp / k / lgbt / lit / m / mlp / mlpol / mo / mtv / mu / n / news / o / out / outsoc / p / po / pol / qa / qst / r / r9k / s / s4s / sci / soc / sp / spa / t / tg / toy / trash / trv / tv / u / v / vg / vint / vip / vp / vr / w / wg / wsg / wsr / x / y] [Search | Top | Home]

I'm aware that Imgur.com will stop allowing adult images since 15th of May. I'm taking actions to backup as much data as possible.
Read more on this topic here - https://archived.moe/talk/thread/1694/


If you need a post removed click on it's [Report] button and follow the instruction.
DMCA Content Takedown via dmca.com
All images are hosted on imgur.com.
If you like this website please support us by donating with Bitcoins at 16mKtbZiwW52BLkibtCr8jUg2KVUMTxVQ5
All trademarks and copyrights on this page are owned by their respective parties.
Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.
If you need information for a Poster - contact them.