How to Pretty Print XML Output Pretty in Python

Vaibhav Vaibhav Feb 16, 2024
  1. Make XML Output Pretty in Python Using BeautifulSoap Library
  2. Make XML Output Pretty in Python Using lxml Library
How to Pretty Print XML Output Pretty in Python

When reading text files, HTML files, XML files, etc., a file’s content is poorly structured and contains inconsistent indentation. This disparity makes it difficult to understand the output. This problem can be solved by beautifying the output of such files. Beautification includes fixing inconsistent indentation, removing random spaces, etc.

In this article, we will learn how to make an XML file output prettier. So that we all are on the same page, the following Python codes will consider this XML file. If you wish to use the same file, all you have to do is copy the XML content into a new file by the name of books.xml.

Make XML Output Pretty in Python Using BeautifulSoap Library

BeautifulSoup is a Python-based library for parsing HTML and XML files. It is generally used for web scraping data out of websites and documents. We can use this library to beautify an XML document’s output.

In case you don’t have this library installed on your machine, use either of the following pip commands.

pip install beautifulsoup4
pip3 install beautifulsoup4

We have to install two more modules to work with this library: html5lib and lxml. The html5lib library is a Python-based library to parse HTML or Hypertext Markup Language documents. And, the lxml library is a collection of utilities to work with XML files. Use the following pip commands to install these libraries on your machine.

pip install html5lib
pip install lxml 

Refer to the following Python code to understand the usage.

from bs4 import BeautifulSoup

filename = "./books.xml"
bs = BeautifulSoup(open(filename), "xml")
xml_content = bs.prettify()
print(xml_content)

Output:

<?xml version="1.0" encoding="utf-8"?>
<catalog>
 <book id="bk101">
  <author>
   Gambardella, Matthew
  </author>
  <title>
   XML Developer's Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   44.95
  </price>
  <publish_date>
   2000-10-01
  </publish_date>
  <description>
   An in-depth look at creating applications
    with XML.
  </description>
 </book>
 
 ...

 <book id="bk111">
  <author>
   O'Brien, Tim
  </author>
  <title>
   MSXML3: A Comprehensive Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   36.95
  </price>
  <publish_date>
   2000-12-01
  </publish_date>
  <description>
   The Microsoft MSXML3 parser is covered in
    detail, with attention to XML DOM interfaces, XSLT processing,
    SAX and more.
  </description>
 </book>
 <book id="bk112">
  <author>
   Galos, Mike
  </author>
  <title>
   Visual Studio 7: A Comprehensive Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   49.95
  </price>
  <publish_date>
   2001-04-16
  </publish_date>
  <description>
   Microsoft Visual Studio 7 is explored in depth,
    looking at how Visual Basic, Visual C++, C#, and ASP+ are
    integrated into a comprehensive development
    environment.
  </description>
 </book>
</catalog>

The above code considers that the books.xml file is in the current working directory. Or, we can also set the full path to the file in the filename variable. The prettify() function beautifies the output of the XML file. Note that the output has been trimmed for understanding purposes.

Make XML Output Pretty in Python Using lxml Library

We can use the lxml library alone to beautify an XML file’s output. Refer to the following Python code for the same.

from lxml import etree

filename = "./books.xml"
f = etree.parse(filename)
content = etree.tostring(f, pretty_print=True, encoding=str)
print(content)

Output:

<?xml version="1.0" encoding="utf-8"?>
<catalog>
 <book id="bk101">
  <author>
   Gambardella, Matthew
  </author>
  <title>
   XML Developer's Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   44.95
  </price>
  <publish_date>
   2000-10-01
  </publish_date>
  <description>
   An in-depth look at creating applications
    with XML.
  </description>
 </book>
 
 ...

 <book id="bk111">
  <author>
   O'Brien, Tim
  </author>
  <title>
   MSXML3: A Comprehensive Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   36.95
  </price>
  <publish_date>
   2000-12-01
  </publish_date>
  <description>
   The Microsoft MSXML3 parser is covered in
    detail, with attention to XML DOM interfaces, XSLT processing,
    SAX and more.
  </description>
 </book>
 <book id="bk112">
  <author>
   Galos, Mike
  </author>
  <title>
   Visual Studio 7: A Comprehensive Guide
  </title>
  <genre>
   Computer
  </genre>
  <price>
   49.95
  </price>
  <publish_date>
   2001-04-16
  </publish_date>
  <description>
   Microsoft Visual Studio 7 is explored in depth,
    looking at how Visual Basic, Visual C++, C#, and ASP+ are
    integrated into a comprehensive development
    environment.
  </description>
 </book>
</catalog>

Take a note of the pretty_print = True argument. This flag makes the output prettier.

Vaibhav Vaibhav avatar Vaibhav Vaibhav avatar

Vaibhav is an artificial intelligence and cloud computing stan. He likes to build end-to-end full-stack web and mobile applications. Besides computer science and technology, he loves playing cricket and badminton, going on bike rides, and doodling.

Related Article - Python XML