How to Create and Read PDF in Java Using the iText Library

Mohd Mohtashim Nawaz Feb 16, 2024
  1. PDF and Libraries to Work With PDF Files
  2. Features of the iText Library
  3. Steps to Install the iText Library in Eclipse
  4. Steps to Create a PDF File Using the iText Library in Java
  5. Steps to Read the PDF File Using the iText Library in Java

The iText library is an open-source library to create, manipulate and read PDF files in Java. This article discusses the iText library, its installation in Eclipse, and creating and reading PDF files in Java using the iText library.

PDF and Libraries to Work With PDF Files

The Portable Document Format (PDF) is a widely used file format for exchanging documents. A PDF file looks the same regardless of the hardware, operating system, or software used to view it.

This makes it a popular choice for documents that combine text, images, and other content.

Many libraries are available to create, read and work with PDF files. Some of these libraries are given below.

  1. iText - The community version of iText is an open-source library. It reads, creates, and manipulates PDF files using Java.

    It has a hierarchical structure and can generate arbitrarily complex PDF files. The iText library is available for Java and .NET.

  2. Adobe PDF Library - Adobe developed this library to create, manipulate, and read PDF files. We can use this library to print PDF files as well.

    This library works with different languages such as C++, Java, and .NET.

  3. PDFBox - This is another open-source library. Apache developed this library to create, edit, and view PDF files, and it can be used with Java.

  4. Jasper Reports - This reporting tool can generate reports in PDF files.

Features of the iText Library

Let us look at some of the iText library features.

  1. Creating PDF Files - We can create arbitrarily complex and interactive PDF files using the iText library. We can also insert images into the PDF file.
  2. We can create bookmarks, add page numbers, and add watermarks to the PDF file using the iText library.
  3. We can perform split and merge operations on the PDF Files.
  4. The iText library provides a facility to work with interactive forms in PDF files.
  5. Saving PDF pages as images in formats such as JPG or PNG is not part of the core library; iText offers this through a separate add-on (pdfRender).

Steps to Install the iText Library in Eclipse

The iText library is third-party open-source software, so it must be added to your Java project before you can use it. This article guides you through the steps to add iText to a project in Eclipse.

Eclipse is one of the most popular IDEs used for application development in Java. This article assumes that you have already installed Java and Eclipse.

Even if you work on any other IDE, the installation process is similar.

Let us see the steps to add the iText library to your Eclipse project.

  • Create a project in Eclipse (File -> New -> Java Project).
  • Right-click on the project; a context menu appears.
  • Click on Configure -> Convert to Maven Project.
  • You will observe that a new file named pom.xml appears in your project folder.
  • Open the pom.xml and add the following dependencies inside the <project> element, after the closing </build> tag.
    <dependencies>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>barcodes</artifactId>
    	<version>7.2.1</version>
    	<!-- barcodes depends on kernel -->
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>font-asian</artifactId>
    	<version>7.2.1</version>
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>forms</artifactId>
    	<version>7.2.1</version>
    	<!-- forms depends on kernel and layout -->
      </dependency>
    
      <dependency>
    	  <groupId>com.itextpdf</groupId>
    	  <artifactId>hyph</artifactId>
    	  <version>7.2.1</version>
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>io</artifactId>
    	<version>7.2.1</version>
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>kernel</artifactId>
    	<version>7.2.1</version>
    	<!-- kernel depends on io -->
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>layout</artifactId>
    	<version>7.2.1</version>
    	<!-- layout depends on kernel -->
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>pdfa</artifactId>
    	<version>7.2.1</version>
    	<!-- pdfa depends on kernel -->
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>sign</artifactId>
    	<version>7.2.1</version>
    	<!-- sign depends on kernel, layout and forms -->
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>commons</artifactId>
    	<version>7.2.1</version>
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>styled-xml-parser</artifactId>
    	<version>7.2.1</version>
      </dependency>
    
      <dependency>
    	<groupId>com.itextpdf</groupId>
    	<artifactId>svg</artifactId>
    	<version>7.2.1</version>
      </dependency>
    
    <dependency>
    	<groupId>org.apache.logging.log4j</groupId>
    	<artifactId>log4j-api</artifactId>
    	<version>2.13.3</version>
    </dependency>
    
    <dependency>
    	<groupId>org.apache.logging.log4j</groupId>
    	<artifactId>log4j-core</artifactId>
    	<version>2.13.3</version>
    </dependency>
    
    <dependency>
    	<groupId>org.apache.logging.log4j</groupId>
    	<artifactId>log4j-slf4j-impl</artifactId>
    	<version>2.13.3</version>
    </dependency>
    
    </dependencies>
    

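    Note that 7.2.1 was the latest version of the iText library at the time of writing. You can upgrade to a newer version if one is available. The Log4j version shown (2.13.3) is affected by the Log4Shell vulnerability (CVE-2021-44228), so use a patched release such as 2.17.1 or later.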

Saving the pom.xml imports the necessary libraries into the project. You need an internet connection because the libraries are downloaded from the Maven repository.
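To confirm that the dependencies were actually resolved, one option is to check at runtime whether a couple of iText classes can be loaded. The following is a minimal sketch that uses only the standard library; the class name ITextClasspathCheck and the helper isOnClasspath() are hypothetical names chosen for this illustration, and the two iText class names are the ones imported in the examples later in this article.

```java
class ITextClasspathCheck {
  // Returns true if the named class can be located on the classpath.
  // The class is looked up without being initialized.
  static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false, ITextClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String[] expected = {
      "com.itextpdf.kernel.pdf.PdfDocument", // from the kernel artifact
      "com.itextpdf.layout.Document" // from the layout artifact
    };
    for (String name : expected) {
      System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
    }
  }
}
```

If either class is reported as MISSING, re-check the pom.xml and update the Maven project so that the dependencies are downloaded again.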

Steps to Create a PDF File Using the iText Library in Java

Once the libraries are installed, you can use the iText library to create PDF files from a Java program.

The iText library has a class named PdfWriter that opens a new PDF file for writing. Once the file is open, you can add text, images, and other elements.

Let us look at the steps to create a PDF file and prepare it for adding content.

  1. Create an instance of the PdfWriter class by passing the file’s name as a parameter to the constructor.

  2. Create an instance of the PdfDocument class by passing the PdfWriter instance to the constructor. This class is responsible for writing to the PDF file.

  3. Finally, create a Document class instance, passing the PdfDocument instance to its constructor.

    This class is used to add individual elements to the PDF file.

  4. At this point, you are ready to write text and images to the PDF file.
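Because PdfWriter opens the target file for writing, a file name that points into a directory that does not exist yet will typically make the first step fail. The sketch below shows one way to prepare the output location beforehand using only the standard library. The class OutputPathHelper and the method prepareOutputPath() are hypothetical names introduced for this illustration and are not part of iText.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OutputPathHelper {
  // Creates any missing parent directories of the target file and
  // returns the absolute path as a string. The file itself is not created.
  static String prepareOutputPath(String file) {
    Path target = Paths.get(file).toAbsolutePath();
    Path parent = target.getParent();
    try {
      if (parent != null) {
        Files.createDirectories(parent);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return target.toString();
  }

  public static void main(String[] args) {
    // Uses the system temp directory so the demo does not touch the project folder.
    String wanted =
        Paths.get(System.getProperty("java.io.tmpdir"), "itext-demo", "sample_pdf.pdf").toString();
    String path = prepareOutputPath(wanted);
    System.out.println("Ready to write: " + path);
  }
}
```

The returned string can then be passed to the PdfWriter constructor in place of a bare file name.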

Steps to Write the Text to the PDF File in Java Using the iText Library

Let us see the steps to write text to the file.

  1. The Paragraph class represents a block of text in the PDF, so create an instance of the Paragraph class.
  2. Add the text by calling the add() method.
  3. Change the text's appearance by calling methods such as setTextAlignment(), setFont(), and setFontSize().
  4. Add the Paragraph instance to the Document instance by calling the Document's add() method.

Once you have added all the elements to the document, close it by calling the close() method of the Document class.

Code Example to Create PDF in Java Using the iText Library

import com.itextpdf.io.font.constants.StandardFonts;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.layout.properties.TextAlignment;
import java.io.IOException;

public class PdfCreateExample {
  public static void main(String[] args) {
    String file = "sample_pdf.pdf";
    try {
      createPdf(file);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }

  private static void createPdf(String file) throws IOException {
    // Writer -> PdfDocument -> Document, as described in the steps above
    PdfWriter writer = new PdfWriter(file);
    PdfDocument pdfDoc = new PdfDocument(writer);
    Document doc = new Document(pdfDoc);

    // One of the standard PDF fonts
    PdfFont myFont = PdfFontFactory.createFont(StandardFonts.TIMES_ROMAN);

    // Centered heading in Times Roman, 28 pt
    Paragraph p1 = new Paragraph();
    p1.add("Hello, This is Delftstack!");
    p1.setTextAlignment(TextAlignment.CENTER);
    p1.setFont(myFont);
    p1.setFontSize(28);
    doc.add(p1);

    // Second paragraph in the default font, 18 pt
    Paragraph p2 = new Paragraph();
    p2.add("We help you understand the concepts.");
    p2.setFontSize(18);
    doc.add(p2);

    // Closing the Document finishes writing the PDF file
    doc.close();
  }
}

The PDF file created by the above code is shown below.

[Image: Created PDF File]
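Apart from opening the result in a viewer, a quick sanity check is to confirm that the file starts with the PDF header, since PDF files begin with the bytes %PDF- followed by a version number. The sketch below uses only the standard library. PdfHeaderCheck, hasPdfHeader(), and looksLikePdf() are hypothetical names for this illustration, and the check only inspects the first bytes, so it does not validate the rest of the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class PdfHeaderCheck {
  private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

  // Returns true if the content begins with the "%PDF-" marker.
  static boolean hasPdfHeader(byte[] content) {
    if (content == null || content.length < PDF_MAGIC.length) {
      return false;
    }
    return Arrays.equals(Arrays.copyOf(content, PDF_MAGIC.length), PDF_MAGIC);
  }

  // Reads the whole file for simplicity; fine for small files in a sketch like this.
  static boolean looksLikePdf(Path file) throws IOException {
    return hasPdfHeader(Files.readAllBytes(file));
  }

  public static void main(String[] args) throws IOException {
    Path file = Paths.get(args.length > 0 ? args[0] : "sample_pdf.pdf");
    if (!Files.exists(file)) {
      System.out.println(file + " does not exist yet; run the creation example first.");
      return;
    }
    System.out.println(
        file + (looksLikePdf(file) ? " starts with a PDF header." : " does not look like a PDF."));
  }
}
```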

Steps to Read the PDF File Using the iText Library in Java

The iText library provides a PdfReader class to read a PDF file. The PDF file can be read by following the steps given below.

  1. First, create an instance of the PdfReader class by passing the file’s path to the constructor.
  2. Create a PdfDocument class instance by passing the PdfReader instance to the constructor.
  3. If your PDF file contains multiple pages, you need to loop through each page. To get the number of pages, invoke the getNumberOfPages() method on the PdfDocument instance.
  4. Loop through each page. Page numbers start at 1.
    1. Invoke the getTextFromPage() method of the PdfTextExtractor class, passing the document’s current page.
    2. To get the current page, invoke the getPage() method of the PdfDocument class and pass the current page number.
    3. The getTextFromPage() method is static, so you do not need an instance of PdfTextExtractor.
    4. The method returns all the text on the current page. You can store it in a String variable.
  5. Process the text (for example, display it on the console).

Code Example to Read PDF in Java Using the iText Library

import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;
import java.io.IOException;

public class PdfReadExample {
  public static void main(String[] args) {
    String file = "sample_pdf.pdf";
    try {
      readPdf(file);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }

  private static void readPdf(String file) throws IOException {
    // try-with-resources closes the PdfDocument (and its reader) when done
    try (PdfDocument doc = new PdfDocument(new PdfReader(file))) {
      int num = doc.getNumberOfPages();

      // Page numbers are 1-based
      for (int i = 1; i <= num; i++) {
        String str = PdfTextExtractor.getTextFromPage(doc.getPage(i));
        System.out.println(str);
      }
    }
  }
}

Output:

Hello, This is Delftstack!
We help you understand the concepts.
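Printing is only one way to process the extracted text. As a small illustration, the sketch below searches the text of each page for a keyword and reports the matching page numbers. It assumes the per-page strings have already been collected into a list (for example, from the getTextFromPage() calls in the loop above); PageTextSearch and pagesContaining() are hypothetical names, and the second page in the sample data is made up for the demo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class PageTextSearch {
  // Returns the 1-based numbers of the pages whose text contains the keyword,
  // ignoring case. pageTexts holds one string per page, in page order.
  static List<Integer> pagesContaining(List<String> pageTexts, String keyword) {
    String needle = keyword.toLowerCase(Locale.ROOT);
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < pageTexts.size(); i++) {
      if (pageTexts.get(i).toLowerCase(Locale.ROOT).contains(needle)) {
        hits.add(i + 1); // PDF page numbers start at 1
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Stand-in for text extracted from a PDF, one entry per page.
    List<String> pageTexts = Arrays.asList(
        "Hello, This is Delftstack!\nWe help you understand the concepts.",
        "A second, hypothetical page.");
    System.out.println(pagesContaining(pageTexts, "delftstack")); // prints [1]
  }
}
```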

Conclusion

This article discusses the basics of reading and writing PDF files using the iText library. The iText library can also perform more complex operations on PDF files.

To read more about the iText library and its functionalities, visit the documentation.
