I think you will agree that splitting a string is a very common task in Java.
Usually, you need to cut a string at a delimiter and collect the remaining parts into an array or a list.
I'll explain the 5 most popular ways to split a string in Java.
How to Split String in Java: Methods Overview
There are 3 ways to split a string in core Java and 2 split methods in popular third-party libraries.
So what is the difference?
- If you need to split a string into an array, use String.split(regex).
- If you need to split a string into a collection (list, set or whatever you want), use Pattern.compile(regex).splitAsStream(input).
- StringTokenizer is a legacy class. I don't know why it's not deprecated, but use String.split(regex) instead. Anyway, you should know how it works, which is why I provide a code example.
- com.google.common.base.Splitter was popular before the Java 8 release. It provides an API more or less similar to Pattern.compile(regex).splitAsStream(input).
- org.apache.commons.lang3.StringUtils.split(s) provides a built-in null check, so it is sometimes more convenient than String.split(regex).
So let’s take a deeper look.
String.split(regex)
Splits a string into an array of strings by regex delimiter.
Method signature:
public String[] split(String regex);
The parameter String regex is the delimiting regular expression.
Example:
package com.explainjava;
public class Demo {
public static void main(String[] args) {
String[] parts = "10,20".split(",");
String part1 = parts[0];
String part2 = parts[1];
System.out.println(part1 + " and " + part2);
}
}
Output:
10 and 20
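One detail that is easy to miss: String.split(regex) drops trailing empty strings, but keeps leading and inner ones. Here is a small sketch that shows this; the class name SplitBasicsDemo and the helper splitByComma are just names I picked for illustration.

```java
import java.util.Arrays;

class SplitBasicsDemo {

    // Splits on a comma; the argument is still treated as a regular expression.
    static String[] splitByComma(String input) {
        return input.split(",");
    }

    public static void main(String[] args) {
        // Trailing empty strings are removed from the result.
        System.out.println(Arrays.toString(splitByComma("10,20,,")));  // [10, 20]
        // Leading and inner empty strings are kept.
        System.out.println(Arrays.toString(splitByComma(",10,,20")));  // [, 10, , 20]
        // No delimiter found: the whole input is the only element.
        System.out.println(Arrays.toString(splitByComma("1020")));     // [1020]
    }
}
```

So if you read parts[1] like in the example above, make sure the input really contains the delimiter, otherwise you get an ArrayIndexOutOfBoundsException.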
An overloaded version with an additional int limit parameter is available as well.
The limit controls how many times the pattern is applied, and therefore the maximum number of elements in the resulting array.
Method signature:
public String[] split(String regex, int limit);
Example:
package com.explainjava;
public class Demo {
public static void main(String[] args) {
String s = "Welcome to EXPLAINJAVA.COM!";
String[] parts = s.split("\\s", 2);
String part1 = parts[0];
String part2 = parts[1];
System.out.println("First part: " + part1);
System.out.println("Second part: " + part2);
}
}
Output:
First part: Welcome
Second part: to EXPLAINJAVA.COM!
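The limit can also be zero or negative, and each case behaves a bit differently. The sketch below compares the three cases on the same input; SplitLimitDemo and its split helper are illustrative names of mine, not part of any library.

```java
import java.util.Arrays;

class SplitLimitDemo {

    // Thin wrapper around String.split(regex, limit) with a comma delimiter.
    static String[] split(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";
        // Positive limit: at most 2 elements, the rest stays in the last one.
        System.out.println(Arrays.toString(split(input, 2)));   // [a, b,c,,]
        // Zero: no cap on length, trailing empty strings are discarded.
        System.out.println(Arrays.toString(split(input, 0)));   // [a, b, c]
        // Negative: no cap on length, trailing empty strings are kept.
        System.out.println(Arrays.toString(split(input, -1)));  // [a, b, c, , ]
    }
}
```

A limit of 0 is what the one-argument split(regex) uses.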
Pattern.compile(regexp).splitAsStream(input)
This method splits a string into a stream, which you can then collect into a List, a Set, or even a Map.
Example:
package com.explainjava;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
public class Main {
public static void main(String[] args) {
List<String> strings = Pattern.compile("\\|")
.splitAsStream("010|020202")
.collect(Collectors.toList());
System.out.println(strings);
}
}
Output:
[010, 020202]
Since: Java 8
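Collecting into a Set or a Map works the same way as collecting into a List. Below is a minimal sketch; the class and method names are mine, and toMap assumes that every pair looks like key=value.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitAsStreamDemo {

    private static final Pattern COMMA = Pattern.compile(",");
    private static final Pattern EQUALS = Pattern.compile("=");

    // Collects unique parts, keeping the order in which they first appear.
    static Set<String> toSet(String input) {
        return COMMA.splitAsStream(input)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Parses "key=value" pairs. Assumption: each pair contains an '=' sign.
    // If a key repeats, the last value wins.
    static Map<String, String> toMap(String input) {
        return COMMA.splitAsStream(input)
                .map(pair -> EQUALS.split(pair, 2))
                .collect(Collectors.toMap(kv -> kv[0], kv -> kv[1],
                        (first, second) -> second, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(toSet("a,b,a"));     // [a, b]
        System.out.println(toMap("x=1,y=2"));   // {x=1, y=2}
    }
}
```

Compiling the Pattern once and reusing it is also a bit cheaper than calling String.split(regex) with a complex regex in a loop.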
StringTokenizer
java.util.StringTokenizer is a legacy class, and I do not recommend using it anymore.
This class is kept for compatibility reasons.
The main question for me is: why isn't it deprecated?!
To parse the string, you have to write a loop using the hasMoreTokens() and nextToken() methods.
Example:
package com.explainjava;
import java.util.StringTokenizer;
public class Demo {
public static void main(String[] args) {
StringTokenizer strings = new StringTokenizer("Welcome to EXPLAINJAVA.COM!", ".");
while(strings.hasMoreTokens()){
String substring = strings.nextToken();
System.out.println(substring);
}
}
}
Output:
Welcome to EXPLAINJAVA
COM!
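If you still meet StringTokenizer in old code, keep in mind that it behaves differently from String.split: every character of the delimiter string is a separate delimiter (no regex), and empty tokens are skipped. A small comparison sketch, with class and method names chosen by me:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerVsSplitDemo {

    // Collects all tokens into a list; each char of delimiterChars is a delimiter.
    static List<String> tokenize(String input, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiterChars);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // StringTokenizer silently skips the empty token between the two commas.
        System.out.println(tokenize("a,,b;c", ",;"));               // [a, b, c]
        // String.split keeps it.
        System.out.println(Arrays.asList("a,,b;c".split("[,;]")));  // [a, , b, c]
    }
}
```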
Google Guava Splitter
Google Guava is an open-source set of common libraries for Java, mainly developed by Google engineers.
If you want to use Guava, add the Maven dependency:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>23.0</version>
</dependency>
Splitter has a rich API: it can omit empty strings, trim results, set a limit, return a list, and so on.
The separator can be specified as a single character, a fixed string, a regular expression or a CharMatcher instance.
In my opinion, since Java 8 it can be replaced with the Java Stream API (Pattern.compile(regex).splitAsStream(input)).
Example:
package com.explainjava;
import com.google.common.base.Splitter;
public class Main {
public static void main(String[] args) {
String s = "Welcome to EXPLAINJAVA.COM!";
Iterable<String> result = Splitter.on(" ").split(s);
System.out.println(result);
}
}
Output:
[Welcome, to, EXPLAINJAVA.COM!]
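To illustrate the point about replacing Splitter with the Stream API, here is a rough plain-Java counterpart of trimming results and omitting empty strings. It is only a sketch: StreamSplitterDemo and splitTrimmedNonEmpty are names I made up, and String.trim() does not treat whitespace exactly like Guava does.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class StreamSplitterDemo {

    private static final Pattern COMMA = Pattern.compile(",");

    // Splits by comma, trims every part and drops the parts that end up empty.
    static List<String> splitTrimmedNonEmpty(String input) {
        return COMMA.splitAsStream(input)
                .map(String::trim)
                .filter(part -> !part.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(splitTrimmedNonEmpty(" a , ,b,, c "));  // [a, b, c]
    }
}
```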
Apache Commons StringUtils
The Apache Commons Lang library has its own utility class for working with strings: StringUtils.
Add the Maven dependency:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.6</version>
</dependency>
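The StringUtils split method looks similar to String.split(regex):
public static String[] split(String str);
This one-argument version splits on whitespace. The overload split(String str, String separatorChars) takes the separator characters as a second argument, and they are treated as plain characters, not as a regex.
The main difference is that StringUtils.split() contains a built-in null check: a null input returns null instead of throwing a NullPointerException.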
Example:
package com.explainjava;
import java.util.Arrays;
import org.apache.commons.lang3.StringUtils;
public class Main {
public static void main(String[] args) {
String[] strings = StringUtils.split("Welcome to EXPLAINJAVA.COM!", " ");
System.out.println(Arrays.toString(strings));
}
}
Output:
[Welcome, to, EXPLAINJAVA.COM!]
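If you don't want to add a library only for the null check, you can get a similar effect with a tiny wrapper around String.split. This is a hypothetical helper of my own (splitOrNull), not how StringUtils is implemented, and unlike StringUtils it still interprets the delimiter as a regex.

```java
import java.util.Arrays;

class NullSafeSplitDemo {

    // Hypothetical helper: returns null for a null input instead of throwing
    // a NullPointerException, otherwise delegates to String.split(regex).
    static String[] splitOrNull(String input, String regex) {
        return input == null ? null : input.split(regex);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOrNull("Welcome to EXPLAINJAVA.COM!", " ")));
        // [Welcome, to, EXPLAINJAVA.COM!]
        System.out.println(Arrays.toString(splitOrNull(null, " ")));
        // null
    }
}
```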
Popular Split Regular Expressions
I think it would be useful to provide some frequently requested regular expressions for splitting a string.
Split String by Space
public static void main(String[] args) {
String[] split = "1 2 3".split(" ");
System.out.println(Arrays.toString(split));
}
Split String by Whitespace
public static void main(String[] args) {
String[] split = "1 2\n3\t45".split("\\s");
System.out.println(Arrays.toString(split));
}
\\s
matches a single whitespace character. By default in Java that means space, tab, line feed, vertical tab, form feed and carriage return. Unicode whitespace such as the no-break space or next line is matched only when the pattern is compiled with the Pattern.UNICODE_CHARACTER_CLASS flag.
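Because \\s matches exactly one character, two whitespace characters in a row produce an empty string between them. If that is not what you want, \\s+ treats a whole run of whitespace as one delimiter. A short sketch (the class and method names are mine):

```java
import java.util.Arrays;

class WhitespaceRunsDemo {

    // Trims the input first so leading whitespace does not produce an empty
    // first element, then splits on runs of one or more whitespace characters.
    static String[] splitOnRuns(String input) {
        return input.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String input = "1  2\t\t3";
        // One whitespace character per match: empty strings appear.
        System.out.println(Arrays.toString(input.split("\\s")));  // [1, , 2, , 3]
        // One or more whitespace characters per match.
        System.out.println(Arrays.toString(splitOnRuns(input)));  // [1, 2, 3]
    }
}
```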
Split String by Comma
public static void main(String[] args) {
String[] split = "1,2,3".split(",");
System.out.println(Arrays.toString(split));
}
Split String by Slash
public static void main(String[] args) {
String[] split = "1/2/3".split("/");
System.out.println(Arrays.toString(split));
}
Split String by Backslash
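First of all, you can't write this string in Java source as "1\2\3", because a backslash inside a string literal starts an escape sequence.
Each backslash has to be escaped like this: "1\\2\\3".
If you read the string from a file, no escaping is needed: the backslashes are already part of the string.
The delimiter has to be escaped twice, once for the string literal and once for the regex, which gives "\\\\".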
public static void main(String[] args) {
String[] split = "1\\2\\3".split("\\\\");
System.out.println(Arrays.toString(split));
}
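If counting backslashes is not your favorite activity, Pattern.quote(String) turns any string into a regex that matches it literally, so you only need the string-literal escaping. A minimal sketch; splitLiteral is a helper name I chose for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralDelimiterDemo {

    // Splits by the delimiter taken literally, with no regex meaning at all.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "\\" is a single backslash character after string-literal escaping.
        System.out.println(Arrays.toString(splitLiteral("1\\2\\3", "\\")));  // [1, 2, 3]
        // The same helper works for other regex metacharacters.
        System.out.println(Arrays.toString(splitLiteral("1.2.3", ".")));     // [1, 2, 3]
        System.out.println(Arrays.toString(splitLiteral("1|2|3", "|")));     // [1, 2, 3]
    }
}
```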
Split String by Question Mark
public static void main(String[] args) {
String[] split = "1?2?3".split("\\?");
System.out.println(Arrays.toString(split));
}
Split String by Dollar Sign
public static void main(String[] args) {
String[] split = "1$2$3".split("\\$");
System.out.println(Arrays.toString(split));
}
Split String by Colon
public static void main(String[] args) {
String[] split = "1:2:3".split(":");
System.out.println(Arrays.toString(split));
}
Split String by Dot
public static void main(String[] args) {
String[] split = "1.2.3".split("\\.");
System.out.println(Arrays.toString(split));
}
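Forgetting to escape the dot is probably the most common mistake here. An unescaped "." matches every character, so every part is empty and the result is an empty array. The sketch below just counts the parts; the names are mine.

```java
class UnescapedDotDemo {

    // Returns how many parts String.split produces for the given regex.
    static int countParts(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        // "." matches any character: all parts are empty and get discarded.
        System.out.println(countParts("1.2.3", "."));    // 0
        // "\\." matches a literal dot.
        System.out.println(countParts("1.2.3", "\\."));  // 3
    }
}
```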
Split String by Plus
public static void main(String[] args) {
String[] split = "1+2+3".split("\\+");
System.out.println(Arrays.toString(split));
}
Split String by Pipe
public static void main(String[] args) {
String[] split = "1|2|3".split("\\|");
System.out.println(Arrays.toString(split));
}
Split String by Tab
public static void main(String[] args) {
String[] split = "1\t2\t3".split("\t");
System.out.println(Arrays.toString(split));
}
Split String by Multiple Delimiters
public static void main(String[] args) {
String[] split = "1,2|3.4$5".split("[,|.$]");
System.out.println(Arrays.toString(split));
}
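Real input often has spaces around the delimiters as well. One way to handle that, sketched below under the assumption that the delimiters are comma, semicolon and pipe, is to make the optional whitespace part of the delimiter regex. The class and method names are illustrative.

```java
import java.util.Arrays;

class MultiDelimiterDemo {

    // Splits on ',', ';' or '|' together with any whitespace around them.
    static String[] splitValues(String input) {
        return input.trim().split("\\s*[,;|]\\s*");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitValues("1 , 2;3 |4")));  // [1, 2, 3, 4]
    }
}
```

Inside a character class, characters like | and . lose their special meaning, so they don't need to be escaped there.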
Any questions? Please, ask me.