Grep regex capture group tutorial

11/21/2023

Below I'm issuing the regular expression (a\b) \1 to grep and inserting a couple of test strings through stdin. The parts in bold mean that there was a match.

The capturing group can be described in words as "the character a followed by a word boundary". When processing the string a ab, the regex engine matches the character a, sees that it is followed by something which isn't a "word character", and thus matches \b. So far so good.

But then it should be checking whether \1 matches ab, and as far as I can tell it shouldn't, because the a in ab is followed by a word character. I don't understand what's going on!

The issue is that \1 refers to the matched text, not a regex. The back-reference '\n', where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression. In our case, the matched text is the character a. Since \1 is text, not a regex, it does not care what follows the a.

To demonstrate that \1 is not a regex, let's try:

$ echo '.a' | grep -E '(.)\1'

Although . is normally regex-active and matches any character, \1 will only match a period, so this command finds no match.

If we want \1 to be a word, add a word boundary to it:

$ grep -E '(a\b) \1\b' file

Because \1\b requires a word boundary after \1, the second line no longer matches.

After accepting an answer, I realized I actually still don't understand what's going on. Building up from the examples above:

$ cat tests

This is telling me that the capture group includes everything except word boundaries at the right edge of the string, which I still don't understand.
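The difference between a group and its back-reference can be checked with a short script. This is a minimal sketch, assuming GNU grep (whose -E mode accepts both back-references and \b). The string a ab comes from the text above; the companion line a a and the variable names are assumptions made for illustration.

```shell
# Test strings: 'a ab' is from the text; 'a a' is an assumed companion line.
input=$(printf 'a a\na ab\n')

# \1 repeats only the captured text "a"; the \b inside the group is not replayed,
# so the "a" at the start of "ab" still satisfies \1.
plain=$(printf '%s\n' "$input" | grep -E '(a\b) \1')

# Adding \b after \1 requires the repeated "a" to end a word as well.
bounded=$(printf '%s\n' "$input" | grep -E '(a\b) \1\b')

# '.' inside the group matches any character, but \1 repeats only what was captured.
dot_a=$(echo '.a' | grep -E '(.)\1' || true)   # no match: "." is not followed by "."
aa=$(echo 'aa' | grep -E '(.)\1')              # match: "a" is followed by "a"

printf 'plain:\n%s\n\nbounded:\n%s\n' "$plain" "$bounded"
```

Under these assumptions, the first pattern selects both lines and the second selects only a a, which is consistent with \1 standing for the literal text a and not for the sub-pattern a\b.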
