Save HTML as PDF in Python

Manav Narula Jul-09, 2021 May-31, 2021
  1. Use the wkhtmltopdf API With Pdfkit to Save HTML as a PDF Using Python
  2. Use the weasyprint Module to Save HTML as a PDF Using Python
  3. Use the PyQt Module to Save HTML as a PDF Using Python

HTML is the standard markup language for web pages. Python has several libraries for connecting to websites and working with their content.

PDF is a portable document format: a PDF file looks the same on different devices and does not depend on the software used to create it.

In this tutorial, we will save an HTML webpage as a PDF using Python.

Use the wkhtmltopdf API With Pdfkit to Save HTML as a PDF Using Python

wkhtmltopdf is an open-source command-line tool that converts an HTML webpage to a PDF. In Python, we use the pdfkit module as a wrapper around it. The functions of this module accept a single page or a list of pages and save the result as a PDF file.

We can read the content directly from the webpage URL or an HTML file saved on the device. The from_url() function reads content from a URL, and the from_file() function reads from a file.

The second argument of each function is the name or path of the output PDF file.

The following code shows from_url() in use.

import pdfkit
pdfkit.from_url('https://www.delftstack.com/', 'sample.pdf')
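The same pattern applies to local files and raw HTML. The sketch below is one possible way to choose between from_url(), from_file(), and from_string() (also part of pdfkit) based on the input. The helper names pick_pdfkit_function and html_to_pdf are hypothetical, and the rule used to tell the three inputs apart is only an assumption made for illustration.

```python
from urllib.parse import urlparse


def pick_pdfkit_function(source: str) -> str:
    """Return the name of the pdfkit function that suits `source`.

    Hypothetical helper: the detection rule is a simple assumption,
    not something pdfkit does itself.
    """
    if urlparse(source).scheme in ("http", "https"):
        return "from_url"      # a web address
    if source.lstrip().startswith("<"):
        return "from_string"   # looks like raw HTML markup
    return "from_file"         # otherwise treat it as a file path


def html_to_pdf(source: str, output_path: str) -> None:
    """Convert a URL, an HTML file, or an HTML string to a PDF file."""
    # Imported here so the helper above works without pdfkit installed.
    import pdfkit

    convert = getattr(pdfkit, pick_pdfkit_function(source))
    convert(source, output_path)
```

With this sketch, html_to_pdf('page.html', 'page.pdf') would call pdfkit.from_file(), while html_to_pdf('<h1>Hello</h1>', 'hello.pdf') would call pdfkit.from_string().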

pdfkit can also return the PDF as bytes instead of writing a file. To do this, pass False in place of the PDF file name and store the return value in a variable.
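As a minimal sketch of the in-memory variant, the function below passes False as the output and returns the resulting bytes. The names url_to_pdf_bytes and looks_like_pdf are hypothetical; the second one only checks for the "%PDF-" marker that PDF files start with.

```python
def url_to_pdf_bytes(url: str) -> bytes:
    """Render `url` with pdfkit and return the PDF as bytes."""
    # Imported lazily; this call needs pdfkit and wkhtmltopdf installed.
    import pdfkit

    # Passing False instead of a file name makes pdfkit return the data.
    return pdfkit.from_url(url, False)


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF files begin with the bytes '%PDF-'."""
    return data.startswith(b"%PDF-")
```

The returned bytes could then be sent in an HTTP response or written to disk later with open('sample.pdf', 'wb').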

Remember to install wkhtmltopdf from its official website before using this method. pdfkit is only a wrapper and calls the wkhtmltopdf executable to do the conversion.
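If wkhtmltopdf is installed but pdfkit cannot find it, the path can be passed explicitly through pdfkit.configuration(). The sketch below first looks the executable up on the PATH; the helper names find_wkhtmltopdf and make_config are hypothetical.

```python
import shutil
from typing import Optional


def find_wkhtmltopdf(name: str = "wkhtmltopdf") -> Optional[str]:
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


def make_config(path: str):
    """Build a pdfkit configuration that points at a specific executable."""
    # Imported lazily so the lookup above works without pdfkit installed.
    import pdfkit

    return pdfkit.configuration(wkhtmltopdf=path)
```

The resulting object would be passed as pdfkit.from_url(url, 'sample.pdf', configuration=make_config(path)).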

Use the weasyprint Module to Save HTML as a PDF Using Python

The weasyprint module renders web pages into document formats such as PDF. We pass the URL to its HTML class and save the result as a PDF using the write_pdf() method.

For example,

import weasyprint
# write_pdf() returns None when given a file name, so nothing is assigned
weasyprint.HTML('https://www.delftstack.com/').write_pdf('sample.pdf')
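weasyprint can also render an HTML string and apply extra CSS, which is a convenient way to control the page size and margins of the PDF. The sketch below assumes a recent weasyprint on Python 3; the helper names page_css and html_string_to_pdf are hypothetical.

```python
def page_css(size: str = "A4", margin: str = "2cm") -> str:
    """Build a CSS @page rule that sets the paper size and margins."""
    return f"@page {{ size: {size}; margin: {margin}; }}"


def html_string_to_pdf(html: str, output_path: str, css: str = "") -> None:
    """Render an HTML string to a PDF file, optionally with extra CSS."""
    # Imported lazily; weasyprint needs its system libraries installed.
    import weasyprint

    stylesheets = [weasyprint.CSS(string=css)] if css else None
    weasyprint.HTML(string=html).write_pdf(output_path, stylesheets=stylesheets)
```

For instance, html_string_to_pdf('<h1>Report</h1>', 'report.pdf', page_css('Letter', '1in')) would produce a Letter-sized page with one-inch margins.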

weasyprint depends on several system libraries, such as Pango, that must be installed first, so it can be less convenient to set up than the other methods. Also, weasyprint has dropped support for Python 2, so it requires Python 3.

Use the PyQt Module to Save HTML as a PDF Using Python

PyQt is a set of Python bindings for the Qt framework, with a vast range of functionality for GUI development and more. We can load a webpage from its URL in a QWebView and print the loaded page to a PDF file with QPrinter.

The following code uses PyQt4.

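import sys
from PyQt4.QtCore import QObject, QUrl, SIGNAL
from PyQt4.QtGui import QApplication, QPrinter
from PyQt4.QtWebKit import QWebView

app = QApplication(sys.argv)

# Load the page in a web view (the view is never shown on screen)
w = QWebView()
w.load(QUrl('https://www.delftstack.com'))

# Configure a printer that writes an A4 PDF file
p = QPrinter()
p.setPageSize(QPrinter.A4)
p.setOutputFormat(QPrinter.PdfFormat)
p.setOutputFileName("sample.pdf")

def convertIt():
    # Print the loaded page to the PDF, then stop the event loop
    w.print_(p)
    QApplication.exit()

# Run the conversion once the page has finished loading
QObject.connect(w, SIGNAL("loadFinished(bool)"), convertIt)
sys.exit(app.exec_())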
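
PyQt4 and QtWebKit are no longer maintained, so the code above may not install on a current Python. As a hedged alternative (a different technique from the one above, not the article's original method), the sketch below uses PyQt5 with the separate PyQtWebEngine package and QWebEnginePage.printToPdf(). The helper names with_pdf_suffix and save_url_as_pdf are hypothetical.

```python
import sys
from pathlib import Path


def with_pdf_suffix(path: str) -> str:
    """Return `path` with a .pdf extension, replacing any other extension."""
    return str(Path(path).with_suffix(".pdf"))


def save_url_as_pdf(url: str, output_path: str) -> None:
    """Load `url` in an off-screen page and print it to a PDF file."""
    # Lazy imports: assumes PyQt5 and PyQtWebEngine are installed.
    from PyQt5.QtCore import QUrl
    from PyQt5.QtWebEngineWidgets import QWebEnginePage
    from PyQt5.QtWidgets import QApplication

    app = QApplication(sys.argv)
    page = QWebEnginePage()
    target = with_pdf_suffix(output_path)

    # Start printing once loading ends; quit when the file has been written.
    page.loadFinished.connect(lambda ok: page.printToPdf(target))
    page.pdfPrintingFinished.connect(lambda path, ok: app.quit())

    page.load(QUrl(url))
    app.exec_()
```

Calling save_url_as_pdf('https://www.delftstack.com', 'sample.pdf') would block until the PDF has been written.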
Author: Manav Narula