Python Teradata Connection

Python Teradata Connection

  1. the Teradata Database in Python
  2. Use PyODBC Library to Connect to Teradata in Python
  3. Use Teradata SQL to Connect to Teradata in Python
  4. Use Teradata Module to Connect to Teradata in Python

This Python article will show how to connect to Teradata using Python. The options to connect to Teradata and export tables to Pandas are numerous.

This article will discuss some essential proper guidelines and look at how the Teradata module works before we discuss ways of connecting.

the Teradata Database in Python

You can script powerful interactions with the Teradata Database using Python with the Teradata module.

Adopting the udaSQL philosophy offers a DevOps-oriented SQL engine in which developers can concentrate on their SQL logic rather than performing external configuration, query banding, and logging.

We can download the Teradata module under the MIT license. The next step is to install and download the package from PyPI.

Due to its open-source nature, this module is supported by the community. However, customers and engineers are not provided support for the interoperability of this module with third-party applications, such as Teradata’s ODBC driver and sqlalchemy-Teradata.

Use PyODBC Library to Connect to Teradata in Python

The Pandas data frame can be created with SQL and uploaded to Teradata with a Teradata data frame.

  1. An environment with Pandas installed in Python.
  2. The Teradata database’s hostname/IP address and connection method are known.
  3. An ODBC driver must be installed on the machine you are attempting to connect to Teradata.

Alternatively, you can contact your company’s DBA if you are unsure of Teradata’s database details. Without the Teradata ODBC driver, you may need to use JDBC and PySpark.

In this tutorial, we will also discuss other methods.

Use the PyODBC Library

With PyODBC, you can easily connect to ODBC databases using Python. Furthermore, this library implements the DB API 2.0 specification with even more Pythonic features.

In your Python environment, you should install the PyODBC library. Then, you can install the library with either Conda or Pip.

pip install pyodbc

OR

conda install pyodbc

A Python package called pip is used to install packages, libraries, and modules. For example, after pip has installed the PyODBC module and Teradata’s ODBC driver, run the following Python code to list the existing drivers.

Verify that Teradata appears in this list by paying attention to the Teradata driver’s name.

pyodbc.drivers()

python teradata connection - pyodbc

Extract Data From Teradata Into Pandas Data Frame

The following code fence shows how to extract data from SQL into Pandas data frames. But, first, check the next output data frame.

teradata_df.head()

Python’s df.head() always returns the first above 5 rows. So here, it will show the first five rows from the default table of Teradata from 0-4.

python teradata connection - df_head

teradata_df.info()

python teradata connection - teradataInfo

There is much pertinent information in the connection string, such as the hostname, driver, username, password, and authentication protocol.

You may need to pass optional parameters depending on your Teradata settings. Parameters that are ODBC-compatible can be passed to PyODBC.

Upload Data Frame to Teradata From Pandas

The process of pulling data is simple, but the process of uploading data is more complicated.

  1. The Pandas data frame needs to be transformed into a schema.
  2. We should break up the data frame into chunks. The ODBC database has a maximum of 1MB per insert, so it will fail if your data frame is large.
  3. We should insert the records sequentially.

You can upload the data frame by following the code below.

cnxn.commit()
print('Query complete. Running time is %s sec/s.'%(round(end_time-start_time)))

The cnxn.commit() command will commit the changes and make them permanent.

Use Teradata SQL to Connect to Teradata in Python

To use this package, you do not need to install Teradata drivers (other than this one).

import teradatasql

with teradatasql.connect(host='name', user='name', password='*****') as connect:
    df = pd.read_sql(query, connect)

Once the terasql imports, the Teradata will connect with the following parameters host, username & password. Then after a successful connection, the query will be read and executed.

Another way is to use the Giraffez module. This module has many useful features, such as MLOAD, FASTLOAD, BULKEXPORT, etc. However, there are only a few requirements for beginners (e.g., C/C++ compiler, Teradata CLIv2, and TPT API headers/lib files).

Please note that the context manager has been updated to ensure that sessions are closed as of 13-07-2018. DF can be used to send data to Teradata.

Using the rest method, we can eliminate the 1MB limit imposed by odbc and the dependence on the odbc driver. We should use the host IP address in place of the driver argument.

import teradata
import pandas as pd

udaExec = teradata.UdaExec (appName="webApp", version="1.0", logConsole=False) 
with udaExec.connect(method="rest_one",system="DB_Name", username="user_name",
                      password="*******", host="HOST_IP_ADDRESS") as connect:

    data = [tuple(x) for x in df.to_records(index=False)]

    connect.executemany("INSERT INTO DATABASE.TABLEWITH5COL values(?,?,?,?,?)",data,batch=True)

To avoid the HY001[ODBC Teradata Driver] Memory allocation error, chunk your data into less than 1MB chunks when using the ODBC Teradata Driver. For example:

import teradata
import pandas as pd
import numpy as np

udaExec = teradata.UdaExec (appName="test", version="1.0", logConsole=False)
with udaExec.connect(method="odbc",system="DBName", username="User_Name",
                      password="*******", driver="Driver_Name") as connect:
    chunks_df = np.array_split(huge_df, 100)
     for i,_ in enumerate(chunks_df):
        data = [tuple(x) for x in chuncks_df[i].to_records(index=False)]
        connect.executemany("INSERT INTO DATABASE.TABLEWITH5COL v

Following is another easy way to connect Teradata with Python.

Use Teradata Module to Connect to Teradata in Python

By executing the following command if pip is already installed, you can install this module directly:

pip install Teradata

You can download the package at the following URL if you don’t already have it:https://pypi.python.org/pypi/teradata.

Once you download the teradata package, unzip it and then use the command prompt to navigate to the directory that contains setup.py, plus execute the following command to install:

python setup.py install

Example code:

import teradata
import sys

udaExec = teradata.UdaExec(
appName="HelloPeople", version="1.0", logConsole=False)
session = udaExec.connect(method="odbc", dsn="td16vm",
username="", password="", autocommit=True,
transactionMode="Teradata")

for row in session.execute('select getqueryband();'):
    print(row)
for row in session.execute('select top 20 tablename, tablekind from dbc.tables;'):
    print(row)

session.close()
input('Type <Enter> to exit...')

To connect to Teradata, we must configure these parameters: transaction mode is Teradata; ODBC is the connecting method (the other option is REST), DSN is td16vm, configured with the following parameters in the computer.

In the next step, you will need to create a virtual machine for Teradata. Here’s the running result for the above sample code:

python teradata connection - connect teradata

As discussed, several ways of connecting Teradata to Python exist. With step-by-step directions, all the possible ways to connect the Teradata module in Python have been revealed here.

Author: Abid Ullah
Abid Ullah avatar Abid Ullah avatar

My name is Abid Ullah, and I am a software engineer. I love writing articles on programming, and my favorite topics are Python, PHP, JavaScript, and Linux. I tend to provide solutions to people in programming problems through my articles. I believe that I can bring a lot to you with my skills, experience, and qualification in technical writing.

LinkedIn