arXiv papers

Optimal Bounds for Open Addressing Without Reordering

Martin Farach-Colton, Andrew Krapivin, William Kuszmaul

https://arxiv.org/abs/2501.02305

In this paper, we revisit one of the simplest problems in data structures: the task of inserting elements into an open-addressed hash table so that elements can later be retrieved with as few probes as possible. We show that, even without reordering elements over time, it is possible to construct a hash table that achieves far better expected search complexities (both amortized and worst-case) than were previously thought possible. Along the way, we disprove the central conjecture left by Yao in his seminal paper Uniform Hashing is Optimal. All of our results come with matching lower bounds.


Undergraduate Upends a 40-Year-Old Data Science Conjecture

Tiny Pointers

Michael A. Bender, Alex Conway, Martín Farach-Colton, William Kuszmaul, Guido Tagliavini

https://arxiv.org/abs/2111.12800

This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional \( log n \) -bit pointers can be replaced with \( O(log n) \) -bit tiny pointers at the cost of only a constant-factor time overhead. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an element in an array filled to load factor , then the optimal tiny-pointer size is bits in the fixed-size case, and expected bits in the variable-size case. Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest.
Using tiny pointers, we revisit five classic data-structure problems: the data-retrieval problem, succinct dynamic binary search trees, space-efficient stable dictionaries, space-efficient dictionaries with variable-size keys, and the internal-memory stash problem. These are all well-studied problems, and in each case tiny pointers allow for us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.

Du-Hwang Characteristic Area: Catch-22

A.O.Ivanov, A.A.Tuzhilin

https://arxiv.org/abs/1402.6079

The paper is devoted to description of two interconnected mistakes generated by the gap in the Du and Hwang approach to Gilbert-Pollack Steiner ratio conjecture.