Substring versus Subsequence

  • Substring : A pattern P is called a substring of Text T if the pattern appears in the Text in a continuous fashion.
  • Subsequence : A pattern P is called a subsequence of Text T if the pattern preserves the relative ordering of characters within the text T and it might not appear in a continuous fashion. e.g., the prime numbers are a subsequence of the positive integers.

  • Subsequence is a generalisation of substring, suffix, and prefix. Finding the longest string which is a subsequence of two or more strings is known as the longest common subsequence problem.

Example: The string “anna” is a subsequence of the string “banana”:

1
2
3
banana
|| ||
an na
  • Substring

A substring of a string is a prefix of a suffix of the string, and equivalently a suffix of a prefix. If one string is a substring of another, it is also a subsequence, which is a more general concept.

Example: The string “ana” is a substring (and subsequence) of banana at two different offsets:

1
2
3
4
5
banana
|||||
ana||
|||
ana