SQL in Python
This LOYO project assumes you have a basic knowledge of SQL and databases. The Python module we will be talking about creates a SQLite client within your Python environment.
SQLite is an extremely simple relational database tool that saves all of your work into a single file. Many SQL clients support these files, so whatever database you connect to within this Python SQLite module can be used for a variety of applications.
The module in Python that creates a SQLite client within your environment is called sqlite3. Using two applications to work with the same data set could become frustrating and annoying, so SQLite allows you to do both within your Python environment and seamlessly move between them.
Getting SQLite in Your Environment is Simple
SQLite is a built-in Python library, meaning you don't have to jump through any hoops or install packages. Simply enter the line below into your code:
Making a Connection
Just like any other SQL client environment, you will need to make a connection to a database.
We can do this by running the code below.
This will link to a database if one with this name already exists, or create a new one if not. You can see that the LOYO database was created and added to the files for this project.
Obviously, this database is completely empty because we just created it. Let's look forward to some basic SQL that you can use in this Python module to interact with your database.
To create a table, use the SQL statement as you would inside any other client. After you have written the CREATE TABLE
statement, simply wrap it in triple quotes, place it inside the connection.execute()
function. The output of this function is an iterable of tuples. Place this function
within the control of with connection:
and you will have created a table in the database.
The above code creates a table of our group members' names, majors, ages, and spirit animals! Our new table contains the columns and their data types, but still has no records in it.
Adding Records
We can add records to our new table by using the INSERT
statement and entering the values for all columns of each record. A usual SQL INSERT
statement includes the values within the statement itself,
but in the SQLite Python module, they must be ?
as a placeholder to be added later. This ?
syntax is unique to SQLite and is used for the connection.execute()
method, allowing you to bind many sets of data values to one SQL statement for execution.
The data for each row must be entered as a tuple within a list.
Now, we have a table in our database that stores information about all of our groupmates.
Querying our Table
We can now query the table the same way as we would in any other SQL client environment.
The above code ran a query to see which one of our groupmates' spirit animal is a tree frog. From this example, we can see that our table was successfully created and that we can query our table using SQL statements directly in Python using some simple code.
Notice you need to use a print statement while looping through the rows of a query to see the results.
Why Does This Help Us?
Pandas
All of our work within Python this semester has primarily been working with Pandas. All of the data frames that we work with model relational data. Using SQLite, you can easily convert your data frames to relational database models. This could allow us to seamlessly link several related data frames that we have worked with this semester into a relational database with foreign keys and integrity constraints within our Python Environment.
Transitioning from a DataFrame to a table in a relational database model is as simple as using Pandas' df.to_sql(table_name, connection)
method.
sqlite3 also allows you to pull a query from a database into Python and turn it into a Pandas data frame. Let's take a look below:
You can see that the simple pd.read_sql()
function turned our table we created at the beginning of this project into a Pandas data frame.
Database Dump
If you extract a database that is small enough into a sqlite file through a SQL interface such as Oracle's SQL Developer, you can create a new connection with sl.connect('file.db')
and read in the tables from that database as pandas dataframes using pd.read_sql()
.
Accessing Our Database Outside of Python
We can also access this database in a SQL client outside of Python. This is as simple as downloading the database that we made within the project to our computer and opening within a SQL client like SQL Developer. This lets you easily take data from within Python and use it in a more SQL friendly environment.
Further Information
For more information about using SQLite3 in Python, check out the blog post at https://towardsdatascience.com/do-you-know-python-has-a-built-in-database-d553989c87bd
This short youtube video also does a good job explaining the basics of using SQLite3 in Python. https://www.youtube.com/watch?v=girsuXz0yA8