approximation

Superstring

Algorithms for shortest superstring questions

Practical lower and upper bounds for the Shortest Linear Superstring

Given a set P of words, the Shortest Linear Superstring (SLS) problem is an optimisation problem that asks for a superstring of P of minimal length. SLS has applications in data compression, where a superstring is a compact representation of P, and …

Superstrings with multiplicities

A superstring of a set of words $P = s_1, ..., s_p $ is a string that contains each word of P as substring. Given P, the well known Shortest Linear Superstring problem (SLS), asks for a shortest superstring of P. In a variant of SLS, called …

Approximation of Greedy Algorithms for Max-ATSP, Maximal Compression, Maximal Cycle Cover, and Shortest Cyclic Cover of Strings

Covering a directed graph by a Hamiltonian path or a set of words by a superstring belong to well studied optimisation problems that prove difficult to approximate. Indeed, the Maximum Asymmetric Travelling Salesman Problem (Max-ATSP), which asks for …

Longest common subsequence problem for unoriented and cyclic strings

Given a finite set of strings X, the Longest Common Subsequence problem (LCS) consists in finding a subsequence common to all strings in X that is of maximal length. LCS is a central problem in stringology and finds broad applications in text …

Hardness of Optimal Spaced Seed Design

Speeding up approximate pattern matching is a line of research in stringology since the 80's. Practically fast approaches belong to the class of filtration algorithms, in which text regions dissimilar to the pattern are excluded (filtered out) in a …

Exact and Efficient Computation of the Expected Number of Missing and Common Words in Random Texts

The number of missing words (NMW) of length q in a text, and the number of common words (NCW) of two texts are useful text statistics. Knowing the distribution of the NMW in a random text is essential for the construction of so-called monkey tests …