In this article, we will be going through the general definition of SQLAlchemy ORM, how it compares to a pandas DataFrame and how we can convert an SQLAlchemy ORM object to a pandas DataFrame. However, we assume you are already familiar with how a pandas DataFrame and a relational database are set up for this article.
SQLAlchemy is a module that helps enhance the communication experience between a python script and the database connected to the script. This module is typically used as an ORM or Object Relational Mapping tool.
An ORM essentially “translates” the classes in a python script. With ORMs, we can write simple SQL queries to perform database-related actions.
It is through an ORM that, in a python script, the classes can be converted into a SQL table, and the functions are converted to queries. Hence, SQLAlchemy is often referred to as a bridge between a python script and a relational database.
Convert an SQLAlchemy ORM to a DataFrame
These functions will read the SQL query or table into the dataframe object. The
read_sql() function delegates the process to the
read_sql_query() or the
read_sql_table() functions, according to the parameters provided to the function.
Here are some of the parameters typically used in the
||As the name suggests, this is the name of the database table.|
||Typically contains the data connection URI.|
||This parameter contains the database schema name that is supposed to be queried. If there is no schema for the database, we can set this parameter to
||These are the indexes or columns. These are optional, and their default value of
||This parameter converts non-string values to floating point values. The default for this parameter is set to be
||These are the list of column names that can be parsed as dates. Their default value is set to
||This parameter is a list of column names to be selected from a database table. The default value for this parameter is set to
||This parameter is the number of rows for each chunk. It is typically used to enter a large amount of data simultaneously. If it is not required to be set, we can set it to the default value, which is set to
||This parameter can be a simple datatype name, int, bool, etc., or a dictionary of columns with their datatypes specified.|
The syntax for all these methods:
pandas.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None, dtype=None)
Typically, in Python scripts, we use this much more straightforward and simplified way to read into an SQL database:
data_frame_obj = pd.read_sql(query.statement, db_engine)
This line of code above will read the query defined in our python script and read it into our Dataframe object.
Let’s look at this example script to better understand how we can apply this.
import pandas import sqlalchemy as db from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import sessionmaker Base = declarative_base() engine = db.create_engine("mysql+pymysql:\ //root:@localhost/jobs") Session = sessionmaker(bind=engine) session = Session() class Jobs(Base): __tablename__ = 'jobs' job_id = db.Column(db.Integer, primary_key=True) name = db.Column(db.String(50)) description = db.Column(db.String(50)) status = db.Column(db.String(10)) rows = [ Jobs(job_id=1, name="add", description="add data to database", status="pending"), Jobs(job_id=2, name="view", description="view data from database", status="pending"), Jobs(job_id=3, name="search", description="search data from database", status="pending"), ] dataFrame = pandas.DataFrame([vars(row) for row in rows]) dataFrame.to_sql('jobs', engine, if_exists='replace') jobs_df = pandas.read_sql_query( sql=session.query(Jobs).statement, con=engine ) print(jobs_df)
To run this script successfully, you must make sure that you have the following libraries installed.
Once you have all these libraries set up, run the following command in the terminal of your choice to set up your docker:
docker run --name mysql-db -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -p 3306:3306 -d mysql:5.6
Next, we can enter our container with this command to create the database:
docker exec -it mysql-db mysql
We run the create database query like so:
create database jobs;
Now that our environment is set up, we can exit the container and run our python script to get the following results.
This means we have successfully converted our SQLAlchemy ORM to a Pandas Dataframe.