Convergence of the Number of Period sets in Strings

Abstract

Consider words of length $n$. The set of all periods of a word of length $n$ is a subset of $0,1,2,łdots,n-1$. However, not every subset of $0,1,2,łdots,n-1$ can be a valid set of periods. In a seminal paper in 1981, Guibas and Odlyzko proposed encoding the set of periods of a word into a binary string of length $n$, called an autocorrelation, where a $1$ at position $i$ denotes the period $i$. They considered the question of recognizing a valid period set, and also studied the number $ąppa_n$ of valid period sets for strings of length $n$. They conjectured that $łn p̨pa_n$ asymptotically converges to a constant times $(łn n)^2$. Although improved lower bounds for $łn kp̨a_n/(łn n)^2$ were proved in 2001, the question of a tight upper bound has remained open since Guibas and Odlyzko’s paper. Here, we exhibit an upper bound for this fraction, which implies its convergence and closes this longstanding conjecture. Moreover, we extend our result to find similar bounds for the number of correlations: a generalization of autocorrelations that encodes the overlaps between two strings.

Publication
Algorithmica