I Will Not Repeat Myself I Will Not…

In my post about an optimisation surprise, one of the functions that the profiler warned me about was doing something relatively innocuous:

List<String> findAll(String input) {
   return myMap.entrySet().stream()
      .filter(entry -> input.matches(entry.getKey())

Let’s just unpack that (because it slightly confused me). The map – myMap is a map of String to String where the key is a regex, and the value is a value we might want to extract if the regex matches some input.

The matches function of String provides us regex matching with a target regex String. What’s not to like?

Let’s look inside String.matches for a moment…

public boolean matches(String regex) {
   return Pattern.matches(regex, this);

Ok… it uses Pattern… let’s look at that:

public static boolean matches(String regex, CharSequence input) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);
    return m.matches();

Aha… it should be obvious from the fact that we’re using only String objects and not compiled regular expressions inside Pattern objects… but every iteration of this will have to re-parse the regular expressions involved.

So if you’re going to call the original function a lot (clue: a very lot in this case!), then not pre-parsing the regular expressions will be costly.

All I needed to do was make the map a map of Pattern to String and I had shaved an enormous amount of unnecessary work off the job.

In practice, there was more to it than that, but let this article make you be very wary of frequent calls to String.matches.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s