HackerRank - Regex - Backreferences

© HackerRank

Matching Same Text Again & Again

\group_number

A backreference \group_number matches the same text that the capturing group with that number matched earlier in the regex. For example, \1 refers to the first capturing group.

For Example

(\d)\1

It can match 00, 11, 22, 33, 44, 55, 66, 77, 88 or 99.
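
The behaviour can be checked quickly in Python, whose built-in re module supports numbered backreferences. This is only a minimal sketch, and the name DOUBLE_DIGIT is illustrative rather than part of the original notes.

```python
import re

# (\d) captures one digit; \1 must then match that same digit again.
DOUBLE_DIGIT = re.compile(r"(\d)\1")

print(bool(DOUBLE_DIGIT.fullmatch("77")))  # the same digit twice
print(bool(DOUBLE_DIGIT.fullmatch("78")))  # two different digits
```

The second call reports no match because \1 requires the text captured by group 1, not merely another digit.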

Task

ab #1?AZa$ab #1?AZa$

^([a-z])(\w)(\s)(\W)(\d)(\D)([A-Z])([a-zA-Z])([aeiouAEIOU])(\S)\1\2\3\4\5\6\7\8\9\10$
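
As a rough check of the pattern above, the following Python sketch runs it against the sample string. The names PATTERN and match are illustrative. In Python's re module, \10 here refers to group 10; other flavors may treat multi-digit references differently.

```python
import re

# Ten capturing groups followed by ten backreferences, \1 through \10.
PATTERN = re.compile(
    r"^([a-z])(\w)(\s)(\W)(\d)(\D)([A-Z])([a-zA-Z])([aeiouAEIOU])(\S)"
    r"\1\2\3\4\5\6\7\8\9\10$"
)

match = PATTERN.match("ab #1?AZa$ab #1?AZa$")
# Show what each of the ten groups captured from the first half of the string.
print(match.groups() if match else None)
```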

Backreferences To Failed Groups

A backreference to a capturing group that matched nothing is different from a backreference to a capturing group that did not participate in the match at all.

Capturing group that matches nothing

(b?)o\1

matches o.

Here, b? is optional and matches nothing.
Thus, (b?) matches successfully and captures an empty string.
o matches o, and \1 successfully matches the empty string captured by the group.

Capturing group that didn’t participate in the match at all

(b)?o\1

does not match o.

In most regex flavors (JavaScript is an exception), (b)?o\1 fails to match o.

Here, (b) fails to match at all. Since the whole group is optional, the regex engine proceeds to match o.
The engine then arrives at \1, which references a group that did not participate in the match attempt at all.
Thus, the backreference fails, and so does the overall match.
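
The difference between the two cases can be observed in Python, whose re module is one of the flavors where a backreference to a non-participating group fails. This is a small sketch, and the variable names are illustrative.

```python
import re

# Group 1 participates but captures the empty string, so \1 matches "".
empty_capture = re.fullmatch(r"(b?)o\1", "o")

# Group 1 does not participate at all, so \1 fails in Python's re.
unset_group = re.fullmatch(r"(b)?o\1", "o")

print(empty_capture is not None)
print(unset_group is not None)
# When group 1 does capture "b", the second pattern matches "bob".
print(re.fullmatch(r"(b)?o\1", "bob") is not None)
```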

Task

12-34-56-78

12345678

1. ^\d{2}(-?)\d{2}\1\d{2}\1\d{2}$

2. ^\d{2}(-?)(\d{2}\1){2}\d{2}$
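
Both patterns can be compared side by side with a short Python sketch. The names EXPLICIT and COMPACT and the third sample string, which mixes a separator with no separator, are assumptions added for illustration.

```python
import re

# Both patterns capture the optional separator once and reuse it via \1.
EXPLICIT = re.compile(r"^\d{2}(-?)\d{2}\1\d{2}\1\d{2}$")
COMPACT = re.compile(r"^\d{2}(-?)(\d{2}\1){2}\d{2}$")

for text in ["12-34-56-78", "12345678", "12-3456-78"]:
    print(text, bool(EXPLICIT.match(text)), bool(COMPACT.match(text)))
```

Because (-?) always participates, \1 repeats either the hyphen or the empty string, which is why an inconsistent use of the separator is rejected.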

Branch Reset Groups

NOTE - Branch reset groups are supported by Perl, PHP, Delphi and R.

(?|regex)

A branch reset group contains alternatives, each with its own capturing groups, for example (?|(regex1)|(regex2)).
The alternatives in a branch reset group share the same capturing group numbers, so here both (regex1) and (regex2) are group 1.

(?|(Haa)|(Hee)|(bye)|(k))\1

matches HaaHaa and kk.

Task

12-34-56-78

12:34:56:78

12---34---56---78

12.34.56.78

^\d{2}(?|(-)|(:)|(---)|(\.))(\d{2}\1){2}\d{2}$
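
Python's built-in re module is not among the flavors listed above and does not accept the (?|...) syntax. For this particular task, where every alternative is a single group, one ordinary group containing the alternation gives the same result. The sketch below relies on that substitution, and the name SEPARATED and the mixed-separator sample are assumptions added for illustration.

```python
import re

# No branch reset here: a single ordinary group holds all the alternatives,
# so whichever separator matches is always captured as group 1.
SEPARATED = re.compile(r"^\d{2}(---|-|:|\.)(\d{2}\1){2}\d{2}$")

samples = [
    "12-34-56-78",
    "12:34:56:78",
    "12---34---56---78",
    "12.34.56.78",
    "12-34:56-78",  # mixed separators, should be rejected
]
for text in samples:
    print(text, bool(SEPARATED.match(text)))
```

A branch reset group becomes necessary only when the alternatives need differently shaped groups that should still share a number.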

Forward References

NOTE - Forward references are supported by the JGsoft, .NET, Java, Perl, PCRE, PHP, Delphi and Ruby regex flavors.

A forward reference is a backreference to a capturing group that appears later in the regex.
Forward references are only useful inside a repeated group.
On a later iteration of the repetition, the regex engine can then evaluate the backreference after the referenced group has already matched.

(\2amigo|(go!))+

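matches go!go!amigo.

On the first iteration, \2 fails because group 2 has not matched yet, so the second alternative (go!) matches go!.
On a later iteration, \2 matches the captured go!, so \2amigo matches go!amigo.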
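
Flavor support matters here. Python's built-in re module is not in the list above, and it rejects a forward reference when the pattern is compiled. The following sketch shows one way to probe this; the helper name compiles is hypothetical.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"(\2amigo|(go!))+"))  # forward reference to group 2
print(compiles(r"(go!)\1"))           # ordinary backreference
```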

Task

tactactic

tactactictactic

^(\2tic|(tac))+$
