How to Encode String in UTF-8 in Java

Rupam Yadav Feb 02, 2024
  1. Encode a String to UTF-8 by Converting It to Bytes Array and Using new String()
  2. Encode a String to UTF-8 Using StandardCharsets.UTF_8.encode and StandardCharsets.UTF_8.decode(byteBuffer)
  3. Encode Strings From a File to UTF-8 Using Files.readString()

Encoding and decoding come into play whenever we convert a string to bytes in a particular character set, or turn such bytes back into a string.

UTF-8, short for Unicode Transformation Format - 8 bit, is a variable-width encoding that represents each Unicode code point using one to four bytes.
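
To make the one-to-four-byte range concrete, here is a minimal sketch that counts the UTF-8 bytes of a few sample characters. The class name Utf8ByteCounts and the helper utf8Length are illustrative names chosen for this example, and the characters are written as Unicode escapes so the result does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteCounts {
  // Returns how many bytes the given text occupies once encoded as UTF-8.
  static int utf8Length(String text) {
    return text.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    System.out.println(utf8Length("A"));            // U+0041, Latin letter: prints 1
    System.out.println(utf8Length("\u00E4"));       // U+00E4, a with diaeresis: prints 2
    System.out.println(utf8Length("\u3053"));       // U+3053, Hiragana KO: prints 3
    System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, written as a surrogate pair: prints 4
  }
}
```

ASCII characters keep their single-byte form, which is why plain English text has the same size in UTF-8 as in ASCII.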

Below, we look at how to encode a string and a file’s contents in UTF-8.

Encode a String to UTF-8 by Converting It to Bytes Array and Using new String()

In the first method, we convert the string to an array of bytes and then create a new string from those bytes using the UTF-8 charset.

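We create a string japaneseString that contains Japanese characters. Next, we convert the string to a byte array, because an encoding applies to bytes rather than to the String object itself. japaneseString.getBytes() returns a byte[]; when called without an argument it uses the platform's default charset, so passing StandardCharsets.UTF_8 to getBytes() is the way to guarantee UTF-8 bytes on every system.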

Now we create a new String using new String() and pass in two arguments: the byte array japaneseBytesArray and the charset used to interpret those bytes.

We use the StandardCharsets class and access its UTF_8 field to get the charset. The encodedString contains the text obtained by decoding the byte array as UTF-8.

import java.nio.charset.StandardCharsets;

public class JavaExample {
  public static void main(String[] args) {
    String japaneseString = "これはテキストです";
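    // Name the charset explicitly; getBytes() with no argument uses the platform default.
    byte[] japaneseBytesArray = japaneseString.getBytes(StandardCharsets.UTF_8);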

    String encodedString = new String(japaneseBytesArray, StandardCharsets.UTF_8);

    System.out.println(encodedString);
  }
}

Output:

これはテキストです
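
The round trip above only returns the original text when the same charset is used in both directions. The following sketch illustrates that point by decoding UTF-8 bytes once as UTF-8 and once as ISO-8859-1. The class CharsetMismatchDemo and the helper roundTrip are hypothetical names introduced for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
  // Encodes text with one charset and decodes the resulting bytes with another.
  static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
    byte[] bytes = text.getBytes(encodeWith);
    return new String(bytes, decodeWith);
  }

  public static void main(String[] args) {
    String original = "\u3053\u308C\u306F"; // three Hiragana characters, written as escapes
    String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
    String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

    System.out.println(original.equals(matched));    // true: same charset both ways
    System.out.println(original.equals(mismatched)); // false: the bytes were misread
    System.out.println(mismatched.length());         // 9: each of the 9 bytes became one char
  }
}
```

This kind of mismatch is a common source of garbled text, so it usually pays to name the charset at both the encoding and the decoding step.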

Encode a String to UTF-8 Using StandardCharsets.UTF_8.encode and StandardCharsets.UTF_8.decode(byteBuffer)

We can use the StandardCharsets class to encode a string in a specified charset such as UTF-8.

We create a japaneseString and then call encode() on StandardCharsets.UTF_8, which is a Charset object. We pass japaneseString to encode(), and it returns a ByteBuffer.

The text is now held as bytes in a ByteBuffer, so we call the decode() method of StandardCharsets.UTF_8, which takes the ByteBuffer as an argument and returns a CharBuffer. Finally, we convert the result to a string using toString().

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class JavaExample {
  public static void main(String[] args) {
    String japaneseString = "これはテキストです";
    ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(japaneseString);

    String encodedString = StandardCharsets.UTF_8.decode(byteBuffer).toString();

    System.out.println(encodedString);
  }
}

Output:

これはテキストです
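
If the raw UTF-8 bytes are needed rather than a decoded string, they can be copied out of the ByteBuffer. The sketch below is one way to do that, assuming a plain byte[] is the desired result. The class ByteBufferToArray and the helpers toUtf8Bytes and toHex are illustrative names, not part of the Java API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferToArray {
  // Copies only the readable bytes of the encoded buffer into a new array.
  static byte[] toUtf8Bytes(String text) {
    ByteBuffer buffer = StandardCharsets.UTF_8.encode(text);
    byte[] bytes = new byte[buffer.remaining()];
    buffer.get(bytes);
    return bytes;
  }

  // Formats bytes as space-separated, two-digit uppercase hex values.
  static String toHex(byte[] bytes) {
    StringBuilder hex = new StringBuilder();
    for (byte b : bytes) {
      hex.append(String.format("%02X ", b));
    }
    return hex.toString().trim();
  }

  public static void main(String[] args) {
    // U+3053 (Hiragana KO), written as an escape.
    System.out.println(toHex(toUtf8Bytes("\u3053"))); // prints E3 81 93
  }
}
```

The sketch sizes the array with remaining() instead of calling buffer.array(), because the buffer's backing array may be larger than the encoded content.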

Encode Strings From a File to UTF-8 Using Files.readString()

In the last example, instead of working with a single string, we read a file and decode all of its contents as UTF-8.

First, we create a text file and add some text to it, saved in UTF-8. To refer to the file, we call Paths.get() with the file’s location as an argument, which returns a Path object.

We call the readString() method of the Files class (available since Java 11), which takes two arguments: the Path object and the charset to use, here StandardCharsets.UTF_8.

We get the resulting string readString and print it to the output.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JavaExample {
  public static void main(String[] args) {
    try {
      Path path = Paths.get(
          "C:\\Users\\User1\\IdeaProjects\\Java Examples\\src\\main\\java\\example_file.txt");
      String readString = Files.readString(path, StandardCharsets.UTF_8);
      System.out.println(readString);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

Output:

これはテキストです
Tämä on tekstiä
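
For a version that does not depend on a file already existing on disk, the sketch below writes text to a temporary file as UTF-8 and reads it back. It uses Files.write() and Files.readAllBytes(), which also work on Java versions before 11. The class Utf8FileRoundTrip, the helper writeAndRead, and the temporary file prefix are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8FileRoundTrip {
  // Writes text to a temporary file as UTF-8, reads it back, then deletes the file.
  static String writeAndRead(String text) {
    try {
      Path tempFile = Files.createTempFile("utf8-example", ".txt");
      try {
        Files.write(tempFile, text.getBytes(StandardCharsets.UTF_8));
        // On Java 11+, Files.readString(tempFile, StandardCharsets.UTF_8) is an alternative.
        return new String(Files.readAllBytes(tempFile), StandardCharsets.UTF_8);
      } finally {
        Files.deleteIfExists(tempFile);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String text = "T\u00E4m\u00E4 on teksti\u00E4"; // the Finnish sample line, written with escapes
    System.out.println(text.equals(writeAndRead(text))); // true
  }
}
```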
Author: Rupam Yadav

Rupam Saini is an Android developer who also sometimes works as a web developer. He likes to read books and write about various things.
