It looks to me like the idea is, as you generally describe, that you segment your table into a 2D structure (well, conceptually) and proceed to fill one ‘row’ at a time until it’s about 75% full, at which point you move on to the next one.
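To make that concrete, here’s a toy sketch of the scheme as I’ve just described it — a table split into fixed-size “rows”, each probed internally and considered full at a 75% load factor, with inserts spilling to the next row. The names and structure here are my guesses, not the paper’s actual algorithm:

```python
class SegmentedTable:
    """Toy 2D hash table: fill each 'row' to ~75%, then move on."""

    def __init__(self, rows=4, row_size=16):
        self.rows = [[None] * row_size for _ in range(rows)]
        self.counts = [0] * rows  # occupied slots per row

    def insert(self, key):
        for r, row in enumerate(self.rows):
            if self.counts[r] / len(row) >= 0.75:
                continue  # this row is "full enough"; spill to the next
            i = hash(key) % len(row)
            for step in range(len(row)):  # probe within this row only
                slot = (i + step) % len(row)
                if row[slot] is None or row[slot] == key:
                    if row[slot] is None:
                        self.counts[r] += 1
                    row[slot] = key
                    return True
        return False  # every row at/over 75%: presumably some other mode kicks in

    def contains(self, key):
        for row in self.rows:
            i = hash(key) % len(row)
            for step in range(len(row)):
                slot = (i + step) % len(row)
                if row[slot] is None:
                    break  # empty slot: key can't be in this row
                if row[slot] == key:
                    return True
        return False
```

This is just to illustrate the filling pattern; it says nothing about why the paper’s probe sequences would be fast.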
I don’t have time to fully grok the paper, but they claim this makes insertion consistently fast (I believe this until we’re at 75% of total capacity, but maybe they have some other mode for filling once every row is at 75%?). They also claim retrieval is fast, and I didn’t read enough to understand how retrieval even works, or why it is faster.
I’ll point out that there are a lot of times when it would be really nice to have a nearly full hash table still, you know, work. You can’t always resize one during execution of a program, and in some environments memory counts for a lot. That said, I would like to see and play with an implementation — I’m not sure this is ‘worth it’ in the general case.
It is also probably cache inefficient, as are most things about hash tables, with the exception of linear probing for reading out of a fairly full one, in which case you get to just keep pulling stuff directly out of contiguous memory to check it. So, it’s not clear to me that this is worth it performance-wise. Anyway, I’d like to fully understand it; it seems like an interesting new idea.
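For comparison, the cache-friendly case I mean is the plain linear-probe lookup — starting at the hashed slot and scanning forward through adjacent slots until you hit the key or an empty slot. A minimal sketch (my own toy code, nothing from the paper):

```python
def linear_probe_find(table, key):
    """Look up key in a flat open-addressed table using linear probing.

    The probe walks sequentially through adjacent slots, which is the
    access pattern that makes it cache-friendly on a fairly full table.
    """
    n = len(table)
    i = hash(key) % n
    for step in range(n):
        slot = table[(i + step) % n]
        if slot is None:
            return False  # empty slot: key is definitely absent
        if slot == key:
            return True
    return False  # scanned the whole (full) table without finding it
```

The point is just that each probe touches the next slot over, so a handful of probes often land in the same cache line.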