Algorithms on strings and labelled graphs


In this project, I focus on designing efficient algorithms on strings in many different models, and, in particular, when the string (or a set of strings) is represented by a labelled graph. In ESA 2022, my team published a paper about finding approximate occurrences of a pattern (a short string) in a text (a longer string), when the pattern can be cyclically rotated. In the future we plan to extend our ESA 2022 results to work more efficiently with the edit distance metric.

In LATIN 2022 we have published a paper about new algorithms for finding approximate occurrences of a pattern in a so-called elastic-degenerate (ED) text. An ED string can be represented with a labelled directed acyclic graph, while our algorithms employ various techniques from computational geometry. In the future we plan to extend the notion of pattern matching of ED strings, to allow the pattern to be elastic-degenerate as well.

Together with another group of researchers, we have also published a paper in SPIRE 2022 about a new notion of covers: a subsequence cover is a word whose subsequence occurrences cover all the positions of the input string. This continues a long line of papers about covering strings in many different models.

Our immediate goal is to design a data structure, over a dictionary of strings, to support suffix/prefix queries: given two dictionary elements, return the longest prefix of one that is also a suffix of the other.

Supervisors Leen Stougie (CWI)
PostDoc Wiktor Zuba
Location Centrum Wiskunde en Informatica(CWI)


This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement Grant Agreement No 101034253.