Java Regex - Java Regular Expressions (java.util.regex)

Jakob Jenkov
Last update: 2014-12-16

Java regex is the official Java regular expression API. The term Java regex is an abbreviation of Java regular expression; what a regular expression is will be explained in the next section. The Java regex API is located in the java.util.regex package, which has been part of standard Java (Java SE) since Java 1.4. This Java regex tutorial will explain how to use this API to match regular expressions against text.

Regular Expressions

A regular expression is a textual pattern used to search in text. You do so by "matching" the regular expression against the text. The result of matching a regular expression against a text is either a true / false value, specifying if the regular expression matched the text, or a set of matches - one match for every occurrence of the regular expression found in the text.

For instance, you could use a regular expression to search an HTML page for email addresses, URLs, telephone numbers etc. This would be done by matching different regular expressions against the HTML page. The result of matching each regular expression against the HTML page would be a set of matches - one set of matches for each regular expression (each regular expression may match more than one time).

Matching regular expressions against text is exactly what you can do with Java regex - the Java regular expression API.
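
The two kinds of result described above can be sketched like this. The class name, the method names and the email pattern are assumptions made for this illustration. The pattern is deliberately simplified and is not a complete validator for real email addresses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexResultKinds {

    // Hypothetical, simplified email pattern used only for illustration.
    static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    // Result kind 1: a true / false answer for the text as a whole.
    static boolean isEmail(String text) {
        return EMAIL.matcher(text).matches();
    }

    // Result kind 2: one match for every occurrence found in the text.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<p>Contact alice@example.com or bob@example.org.</p>";
        System.out.println(isEmail("alice@example.com")); // true
        System.out.println(isEmail(html));                // false
        System.out.println(findEmails(html));             // [alice@example.com, bob@example.org]
    }
}
```

To search for URLs or telephone numbers as well, you would compile one more Pattern per kind of item and collect a separate list of matches for each.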

Java Regex Core Classes

The Java regex API consists of two core classes, Pattern and Matcher (java.util.regex.Pattern and java.util.regex.Matcher).

The Pattern class is used to create patterns (regular expressions). A pattern is a precompiled regular expression in object form (as a Pattern instance), capable of matching itself against a text.

The Matcher class is used to match a given regular expression (Pattern instance) against a text multiple times. In other words, to look for multiple occurrences of the regular expression in the text. The Matcher will tell you where in the text (character index) it found the occurrences. You can obtain a Matcher instance from a Pattern instance.
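
Here is a minimal sketch of how the two classes work together. The class name, the method name and the digit pattern are assumptions chosen for this illustration. The Pattern is compiled once, and a new Matcher is obtained from it for every text to be searched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternMatcherSketch {

    // Compiled once. A Pattern instance is immutable and can be reused.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns "start-end" for every run of digits found in the text.
    // start is the index of the first matched character, end is the
    // index just after the last matched character.
    static List<String> positions(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(text); // one Matcher per text
        while (matcher.find()) {
            result.add(matcher.start() + "-" + matcher.end());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(positions("a1b22"));            // [1-2, 3-5]
        System.out.println(positions("order 42 shipped")); // [6-8]
    }
}
```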

Both the Pattern and Matcher classes are covered in detail in their own texts. See links above, or in the top left of every page in this Java regex tutorial trail.

Another key aspect of regular expressions is the regular expression syntax. Java is not the only programming language that has support for regular expressions. Most modern programming languages support regular expressions. The syntax used in each language to define regular expressions is not exactly the same, though. Therefore you will need to learn the syntax used by your programming language. The syntax used by the Java regex API is covered in detail in the text about the Java regular expression syntax.

Java Regular Expression Example

Here is a simple Java regex example that uses a regular expression to check if a text contains the substring http:// :

String text    =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";

String pattern = ".*http://.*";

boolean matches = Pattern.matches(pattern, text);

System.out.println("matches = " + matches);

The text variable contains the text to be checked with the regular expression.

The pattern variable contains the regular expression as a String. The regular expression matches all texts which contain zero or more characters (.*) followed by the text http:// followed by zero or more characters (.*).

The third statement uses the Pattern.matches() static method to check if the regular expression (pattern) matches the text. Pattern.matches() only returns true if the regular expression matches the entire text, which is why the pattern has a .* on both sides of http://. If the regular expression does not match the entire text, Pattern.matches() returns false.
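
The following sketch compares Pattern.matches() with Matcher.find() to show the difference between matching a whole text and finding a match somewhere inside it. The class and method names are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

class WholeTextVersusFind {

    // Pattern.matches() succeeds only if the regex matches the entire text.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    // Matcher.find() succeeds if the regex matches anywhere inside the text.
    static boolean containsMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "This is the text to be searched "
                + "for occurrences of the http:// pattern.";

        System.out.println(matchesWhole("http://", text));          // false
        System.out.println(matchesWhole(".*http://.*", text));      // true
        System.out.println(containsMatch("http://", text));         // true
        // .* also matches zero characters, so the bare string matches too.
        System.out.println(matchesWhole(".*http://.*", "http://")); // true
    }
}
```

Note that by default the . character does not match line terminators, so the .*http://.* pattern would not match a text that spans several lines unless the Pattern.DOTALL flag is used.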

The example does not actually check if the found http:// string is part of a valid URL, with domain name and suffix (.com, .net etc.). The regular expression just checks for an occurrence of the string http://.
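
A somewhat stricter check could look like the sketch below. The pattern is an assumption made for this illustration: it requires http:// to be followed by a host name that contains at least one dot. It is not a complete URL validator, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class UrlLikeCheck {

    // Hypothetical pattern: "http://" followed by a host name with at
    // least one dot, e.g. http://example.com. Not a full URL grammar.
    static final Pattern URL_LIKE =
            Pattern.compile("http://[\\w-]+(\\.[\\w-]+)+");

    // True if something that looks like an http URL occurs in the text.
    static boolean containsUrlLike(String text) {
        return URL_LIKE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsUrlLike("occurrences of the http:// pattern.")); // false
        System.out.println(containsUrlLike("Visit http://example.com today."));     // true
    }
}
```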

Here is another Java regex example which uses the Matcher class to locate multiple occurrences of the substring "is" inside a text:

String text    =
        "This is the text which is to be searched " +
        "for occurrences of the word 'is'.";

String patternString = "is";

Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);

int count = 0;
while(matcher.find()) {
    count++;   // number of occurrences found so far
    System.out.println("found: " + count + " : "
            + matcher.start() + " - " + matcher.end());
}

From the Pattern instance a Matcher instance is obtained. Via this Matcher instance the example finds all occurrences of the regular expression in the text.
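
Note that the regular expression is matches the two characters wherever they occur, also inside the word This. If you only want the standalone word, one option is to surround the regular expression with word boundaries (\b). The sketch below compares the two. The class and method names are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryCount {

    // Counts how many times the regex is found in the text.
    static int countMatches(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "This is the text which is to be searched "
                + "for occurrences of the word 'is'.";

        // Matches "is" anywhere, including inside "This".
        System.out.println(countMatches("is", text));       // 4
        // \b requires a word boundary on each side of "is".
        System.out.println(countMatches("\\bis\\b", text)); // 3
    }
}
```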

Java Regex API in Java 6

Although Java regex has been part of standard Java since Java 1.4, this Java regex tutorial covers the Java regex API released with Java 6.

Feel Free to Contact Me

If you disagree with anything I write here in this Java regex tutorial, or just have comments or questions about the java.util.regex package, feel free to send me an email. You wouldn't be the first to do so. You can find my email address on the about page.

Copyright  Jenkov Aps