July 15, 2015

Begin and end index of a regular expression pattern matched string

This is a quick tip on regular expressions.  Nothing too exciting, but it's something I use often enough yet can never remember, so I always have to look it up.

The String.indexOf("match me") method is handy for finding the beginning index of a substring within a String (it returns -1 if the substring isn't there).  Add "match me".length() to that beginning index and you have the ending index as well.  This works for finding simple substrings, but what if the pattern is more complicated and needs a regular expression?  Let's take a look at the code in Listing 1.
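Before that, for comparison, here's a quick sketch of the plain indexOf approach described above.  The class name SubstringIndices and the helper indicesOf are made up for this illustration; only String.indexOf and String.length come from the standard library.

```java
class SubstringIndices {

    // Hypothetical helper: returns {begin, end} for the first occurrence of
    // target in text, or null when target is absent. The end index is exclusive.
    static int[] indicesOf(String text, String target) {
        int begin = text.indexOf(target); // -1 when the substring is not found
        if (begin < 0) {
            return null;
        }
        return new int[] { begin, begin + target.length() };
    }

    public static void main(String[] args) {
        String text = "hello\n package doctor;  \nname // comment\n";
        String target = "package doctor;";

        int[] span = indicesOf(text, target);
        if (span != null) {
            System.out.printf("Begin index: %d%n", span[0]);
            System.out.printf("End index:   %d%n", span[1]);
            System.out.printf("Found:       [%s]%n", text.substring(span[0], span[1]));
        } else {
            System.out.println("Substring not found");
        }
    }
}
```

This only works when you know the exact text up front.  It can't cope with optional whitespace or "anything up to the semicolon", which is where the regular expression comes in.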

Listing 1: Simple pattern matching.
package pattern.matching;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatching {

    public static void main(String[] args) throws Exception {
        StringBuilder searchMe = new StringBuilder();
        searchMe.append("hello").append("\n");
        searchMe.append(" package doctor;  ").append("\n");
        searchMe.append("name // comment").append("\n");
        searchMe.append("/* continue */").append("\n");
        searchMe.append("int yesterday = 9;").append("\n");
        searchMe.append("tomorrow!").append("\n");
        
        System.out.printf("BEFORE\n------\n%s", searchMe.toString());
        
        final Pattern pattern = Pattern.compile("^ *package .*; *$", Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(searchMe.toString());

        if (matcher.find()) {
            System.out.println();
            System.out.printf("Start index: %d\n", matcher.start());
            System.out.printf("End index:   %d\n", matcher.end());
            System.out.printf("Match found: [%s]\n", matcher.group());
            // Use the matcher's own indices rather than hard-coded values.
            searchMe.replace(matcher.start(), matcher.end(), "I've been replaced...so sad :(");
        } else {
            System.out.println("Matcher matched nothing! ");
        }
        
        System.out.println();
        System.out.printf("AFTER\n-----\n%s\n", searchMe.toString());       
    }
}

Listing 2: Output
BEFORE
------
hello
 package doctor;  
name // comment
/* continue */
int yesterday = 9;
tomorrow!

Start index: 6
End index:   24
Match found: [ package doctor;  ]

AFTER
-----
hello
I've been replaced...so sad :(
name // comment
/* continue */
int yesterday = 9;
tomorrow!

Listing 1 shows a simple regular expression match against a String.  The pattern looks for a package declaration on a line of its own, allowing leading and trailing spaces; Pattern.MULTILINE makes ^ and $ match at the start and end of each line rather than of the whole input.  When Matcher.find succeeds, Matcher.start and Matcher.end give the beginning and ending indices within the String, where the ending index is exclusive (one past the last matched character).  StringBuilder.replace then swaps the matched text for another String.  Listing 2 shows the output of the application: the before state, the matcher information, and the after state.
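If the text might contain more than one match, the same start/end idea still works, with one catch: replacing text in the StringBuilder shifts every index after it.  Here's a rough sketch of one way around that, which is to collect the indices first and then replace from the last match back to the first.  The class ReplaceAllMatches, the helper replaceEach, and the sample input are my own inventions for illustration, not part of Listing 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllMatches {

    // Hypothetical helper: replaces every match of pattern in text with
    // replacement, using the begin/end indices reported by the Matcher.
    // Returns the number of matches replaced.
    static int replaceEach(StringBuilder text, Pattern pattern, String replacement) {
        // Match against a snapshot so later edits to text don't affect the matcher.
        Matcher matcher = pattern.matcher(text.toString());
        Deque<int[]> spans = new ArrayDeque<>();
        while (matcher.find()) {
            // push() adds to the front, so iteration visits the last match first.
            spans.push(new int[] { matcher.start(), matcher.end() });
        }
        int count = spans.size();
        // Replace back to front so earlier indices are not shifted by the edits.
        for (int[] span : spans) {
            text.replace(span[0], span[1], replacement);
        }
        return count;
    }

    public static void main(String[] args) {
        StringBuilder searchMe = new StringBuilder();
        searchMe.append(" package doctor;  ").append("\n");
        searchMe.append("name // comment").append("\n");
        searchMe.append("package nurse;").append("\n");

        Pattern pattern = Pattern.compile("^ *package .*; *$", Pattern.MULTILINE);
        int replaced = replaceEach(searchMe, pattern, "// package removed");

        System.out.printf("Replaced %d match(es)%n%s", replaced, searchMe);
    }
}
```

If you don't need the indices for anything else, Matcher.replaceAll does the whole job in one call; the loop above is only worth it when you want to report or act on each position.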

So pretty simple :)
Enjoy!