Python Requests Pagination

  1. What is Pagination in Python
  2. Python Pagination with Next Button
  3. Python Pagination Without the Next Button
  4. Python Pagination With Infinite Scroll
  5. Pagination With Load More Button

This article explains what pagination is and how to handle paginated content in Python with the requests library. It covers websites that paginate with a next button, with numbered page links, with infinite scroll, and with a load more button.

What is Pagination in Python

A web application rarely fits all of its content onto a single page. Instead, the content is split across several pages, which keeps each page quick to load and easier to read.

This process of spreading content over several pages is called pagination. When implementing pagination, we need to consider factors such as the total page count, the type of content, how the content is grouped into categories, and the numbering scheme used for the pages.
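The total page count mentioned above follows directly from the number of items and the page size. The following is a minimal sketch of that arithmetic; the helper names page_count and get_page and the 25 sample records are illustrative assumptions, not part of any library.

```python
import math


def page_count(total_items: int, per_page: int) -> int:
    """Return how many pages are needed to show total_items."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return math.ceil(total_items / per_page)


def get_page(items: list, page: int, per_page: int) -> list:
    """Return the items on a 1-based page number (empty list past the end)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    return items[start:start + per_page]


items = list(range(1, 26))          # 25 hypothetical records
print(page_count(len(items), 10))   # 3
print(get_page(items, 3, 10))       # [21, 22, 23, 24, 25]
```

The last page is usually shorter than the others, which is why the page count is rounded up rather than down.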

Python Pagination with Next Button

Pagination is not limited to what users see on the front end of a website. APIs used at the back end are often paginated as well. Python offers several libraries that help us work with paginated content.

We will begin with the requests module, which downloads a page. To locate content within the downloaded page, we will use BeautifulSoup4.

We will also install the lxml library, which BeautifulSoup uses here as its HTML parser.

Installation command:

pip install requests beautifulsoup4 lxml

The above command installs the requests, beautifulsoup4, and lxml packages.

import requests
from bs4 import BeautifulSoup

findurl = 'http://books.toscrape.com/catalogue/category/books/fantasy_19/index.html'
getresponse = requests.get(findurl, timeout=10)
getresponse.raise_for_status()  # stop early on an HTTP error
getsoup = BeautifulSoup(getresponse.text, "lxml")
# li.current is the pager element that reads "Page X of Y"
footer_element = getsoup.select_one('li.current')
print(footer_element.text.strip())

Output:

Page 1 of 3

The previous code snippet captures the pager text in the footer of the page at the given URL. You can change the URL as required.

The requests library sends a GET request to the URL.

On the soup object, we use a CSS selector. To locate a different element, pass its selector to getsoup.select_one(selector).
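The snippet above only reads the pager text. To move from page to page, one option is to follow the link inside the next button until it no longer appears. The following is a minimal sketch, assuming the next link matches the selector li.next a (as it does on books.toscrape.com at the time of writing). The helper names find_next_url, fetch_html, and crawl, and the max_pages safety cap, are illustrative. The sketch uses Python's built-in html.parser instead of lxml so that the parsing step has no extra dependency.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_next_url(html, current_url, selector="li.next a"):
    """Return the absolute URL of the next page, or None on the last page."""
    soup = BeautifulSoup(html, "html.parser")
    next_link = soup.select_one(selector)
    if next_link is None or not next_link.get("href"):
        return None
    # hrefs are usually relative, so resolve them against the current page
    return urljoin(current_url, next_link["href"])


def fetch_html(url):
    """Download a page with requests (imported here so parsing works offline)."""
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Yield (url, html) for each page, following the next link."""
    url, seen = start_url, set()
    # stop on the last page, on a repeated URL, or at the safety cap
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        url = find_next_url(html, url)
```

Calling for url, html in crawl(findurl): print(url) would then visit each page of the category in turn. Passing a different fetch function makes it possible to try the loop on saved HTML without touching the network.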

The above code targets webpages that provide a next button for navigation. Pagination can also appear without a next button: a website may use numbered page links, infinite scroll, or a load more button.

Note: We find the selectors used in the code above by pressing F12 on the desired website and examining the markup of the required elements in the browser's developer tools.

[Figure: How to inspect a website]

Python Pagination Without the Next Button

Some websites, instead of a next button, show page numbers such as 1, 2, 3, 4 to move between pages. This makes it even easier for a user to jump to a specific page.

In this case, we retrieve the first page, collect the numbered page links from it, and then visit each link in a loop.

Example code:

# Handling pages with numbered page links
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin


def process_pages():
    get_url = 'https://www.gosc.pl/doc/791526.Zaloz-zbroje'
    response = requests.get(get_url, timeout=10)
    soup = BeautifulSoup(response.text, 'lxml')
    # collect the numbered page links from the first page
    page_link_el = soup.select('.pgr_nrs a')
    # process the first page here, using soup
    for link_el in page_link_el:
        # hrefs may be relative, so resolve them against the first page's URL
        link = urljoin(get_url, link_el.get('href'))
        response = requests.get(link, timeout=10)
        soup = BeautifulSoup(response.text, 'lxml')
        print(response.url)
        # process the remaining pages here, using soup


if __name__ == '__main__':
    process_pages()

Output:

https://www.gosc.pl/doc/791526.Zaloz-zbroje/2
https://www.gosc.pl/doc/791526.Zaloz-zbroje/3
https://www.gosc.pl/doc/791526.Zaloz-zbroje/4

Python Pagination With Infinite Scroll

As the name suggests, this type of pagination has no next button or page numbers. Instead, we keep scrolling to load more content.

A simple example of such pagination can be any e-commerce website. We’re shown a certain number of products at a time, and once we scroll down, we’re shown the next products.

In such scenarios, we do not have to deal with multiple page URLs.

Instead, the page makes an asynchronous call to an API to fetch more content as we scroll. We can find that call in the Network tab of the browser's developer tools and then request it directly.
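Once the underlying API call is known, it can be requested in a loop instead of scrolling. The following is a minimal sketch under stated assumptions: the endpoint https://example.com/api/products, its offset and limit parameters, and the JSON list it returns are all hypothetical, and real sites differ. The function names fetch_batch and iter_items are illustrative as well.

```python
def fetch_batch(offset, limit, url="https://example.com/api/products"):
    """Request one batch from a hypothetical JSON endpoint."""
    import requests  # imported here so the loop below can be tried offline

    response = requests.get(
        url, params={"offset": offset, "limit": limit}, timeout=10
    )
    response.raise_for_status()
    return response.json()  # assumed to be a JSON list of items


def iter_items(fetch=fetch_batch, limit=20, max_batches=100):
    """Yield items batch by batch until the API returns an empty batch."""
    offset = 0
    for _ in range(max_batches):  # safety cap against endless loops
        batch = fetch(offset, limit)
        if not batch:
            break
        yield from batch
        offset += len(batch)


# Offline stand-in for the API: 45 fake products served in slices.
catalog = [f"product-{n}" for n in range(45)]


def fake_fetch(offset, limit):
    return catalog[offset:offset + limit]


print(len(list(iter_items(fetch=fake_fetch))))  # 45
```

The loop stops as soon as a batch comes back empty, which mirrors the moment when scrolling further loads nothing new.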

Pagination With Load More Button

This pagination method resembles infinite scroll. The difference is that the next batch of content loads when we click a button rather than when we scroll.

In this case, the number of items still to be loaded decreases every time we click the load more button. For example, suppose a website has 500 images in total and shows 30 images at a time.

With every click on the load more button, we are presented with the next 30 images, and the counter of remaining images drops by 30. The API used in the example below reports this counter in a remaining field.
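To make the countdown concrete, the sketch below simulates it offline with the numbers from the paragraph above (500 images, 30 per request). The load_more function is a hypothetical stand-in for a server response, not a real API. It returns one batch together with a remaining count, and the loop stops when that count reaches zero.

```python
TOTAL_IMAGES = 500
PER_CLICK = 30


def load_more(page):
    """Simulate one response: the batch for a 1-based page plus the items left."""
    start = (page - 1) * PER_CLICK
    batch = list(range(start, min(start + PER_CLICK, TOTAL_IMAGES)))
    remaining = max(TOTAL_IMAGES - (start + len(batch)), 0)
    return {"items": batch, "remaining": remaining}


page, loaded = 1, 0
while True:
    data = load_more(page)
    loaded += len(data["items"])
    if data["remaining"] <= 0:
        break  # nothing left to load
    page += 1

print(page, loaded)  # 17 500
```

Seventeen requests are needed because the final batch holds only the last 20 images.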

Example code:

import requests

url = 'http://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page={}'
page_counter = 1
while True:
    getresponse = requests.get(url.format(page_counter), timeout=10)
    getresponse.raise_for_status()  # stop on an HTTP error
    data = getresponse.json()
    # Process data
    # ...
    print(getresponse.url)  # only for debug
    # 'remaining' reports how many items are still to be loaded
    if data.get('remaining') and int(data.get('remaining')) > 0:
        page_counter += 1
    else:
        break

Output:

https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=1
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=2
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=3
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=4
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=5
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=6
https://smarthistory.org/wp-json/smthstapi/v1/objects?tag=938&page=7
...

The above code requests each page in turn, printing the URL and incrementing the page number until the API reports that nothing remains. For the above code, the total number of pages is 34.

We hope you find this article helpful in understanding the concept of pagination in Python.

Author: Abid Ullah
