Python Library: aiosql
While it’s de rigueur in most languages to use an object-relational mapper (ORM) to translate between your internal representation (typically object-oriented) and the relational model, I find that you end up with a bit of the “worst of both worlds” in the end. This article isn’t attempting to dig into all the challenges with ORMs. Instead, I want to look at a Python library for an alternative pattern.
In the olden times, when I was still writing Clojure code like the cool kids, I
was a big fan of Kris Jenkins’ yesql
library. The basic idea was that you wrote SQL in SQL to query and manipulate
your RDBMS, and then you would get back very basic Clojure types to work with. A
while back, though, I stumbled over Will Vaughn’s
aiosql
, which does something of the
same but for Python.
While aiosql
can operate in either syncronous or asynchronous code, it’s much
easier to explain without the bits of Python async detritus in the middle. For
this exploration, we’re going to just use a simple SQLite
database as the backend. I’m also going to skip over whatever machinations you
use to install 3rd party packages. My personal preference is for
PDM
.
First, we need to import a few things:
>>> import sqlite3
>>> import aiosql
Then, we can build a bunch of queries from SQL. This is where aiosql
differentiates itself from the rest of the world. You write these in normal SQL,
with access to all the crazy capabilities innate in the language. For now,
we’ll just use a single triple-quoted string, but you can load from file(s) just
as easily:
>>> SQL_QUERIES = """
-- name: create_table#
-- Create the tables necessary for this demonstration
CREATE TABLE greetings (
greeting_id INTEGER PRIMARY KEY,
greeting TEXT NOT NULL
);
-- name: insert_initial_values!
-- Insert some initial values for us to play with
INSERT INTO greetings (greeting_id, greeting) VALUES
(NULL, "First greeting"),
(NULL, "Another greeting");
-- name: get_all_greetings
-- Get all of the greetings in the database
SELECT greeting_id, greeting FROM greetings;
-- name: get_greeting^
-- Get a single greeting
SELECT greeting_id, greeting FROM greetings WHERE greeting_id = :greeting_id;
-- name: add_greeting<!
-- Add a single greeting, returning the greeting_id
INSERT INTO greetings (greeting_id, greeting) VALUES (NULL, :greeting);
"""
>>> queries = aiosql.from_str(SQL_QUERIES, "sqlite3")
Now we have an object, queries
, that contains a set of functions we can use.
The string "sqlite3"
tells the library what dialect you’re going to be using.
Currently, it
suppoprts
a couple sync and async libraries for SQLite and PostgreSQL, but really, what
other database do you need?
But let’s talk about a few things that might look a bit different from just
“pure” SQL. First, we use SQL comments (--
) to annotate our queries. This
isn’t just good practice, it’s necessary for the library to parse them into
functions. Specifically, it looks at the name:
comment to identify the
function. This is anything that’s a valid Python
identifier
.
But wait, you might be saying, there’s some crazy characters there at the end on
some of them. These are query type annotations:
#
: This is a script, and it really only has a sense of “done”. This is typically used for things like DDL.!
: The statement is DDL/DML, and will make changes, but won’t return any results.^
: The statement will return a single value. This means that what you’ll get back will be a single row and it won’t be contained in Python list.<!
: Now we’re getting a bit Perly with our syntax, but in short it is used when you are doing an insert/update and want. The details are a bit more complicated, but basically it lets you return things from a query, like the primary key/row id of the inserted row, or something from aRETURNING
clause in PostgreSQL for example.
Moving on, we also will need a SQLite database for it to work on:
>>> db = sqlite3.connect("./testdb.sqlite3")
So, now we have two pieces: a set of queries, and a connection handle to a database. Let’s get started. First, let’s create the test database schema:
>>> queries.create_table(db)
"DONE"
Now the table exists, and we can insert some sample data into it:
>>> queries.insert_initial_values(db)
Now, we have a little sample data. We can see it here:
>>> queries.get_all_greetings(conn)
[(1, 'First greeting'), (2, 'Another greeting')]
You can see that what’s returned from the database is just basic Python data structures. A tuple for each row, and a all of the rows wrapped in a list. You can, however, leverage a capability in the driver to get you something a bit smarter:
>>> db.row_factory = sqlite3.Row
>>> queries.get_all_greetings(db)
[<sqlite3.Row object at 0x101f4d2a0>, <sqlite3.Row object at 0x101f84430>]
Now we can be a little more elegant when we’re asking the returned rows:
>>> result = queries.get_all_greetings(db)
>>> result[0]
<sqlite3.Row object at 0x101f85b40>
>>> tuple(result[0])
(1, 'First greeting')
>>> result[0].keys()
['greetings_id', 'greeting']
>>> result[0]["greeting"]
'First greeting'
Obviously, in most cases, you’re writing SQL that takes input. So, in order to
get input, you need to insert variables. You can see those in get_greeting
and
add_greeting
in the form of :variablename
. These get passed in as arguments
to the query function call. For example:
>>> queries.add_greeting(db, "A Whole New Greeting")
3
>>> queries.get_greeting(db, 3)["greeting"]
"A Whole New Greeting"
So far, I think these are all super simple, but what if you want to do something a bit more complicated with SQL? Well, let’s take the chinook database and play with it. Let’s say we want to see who all the music listeners of a specific genre are. This is the SQL we might use:
SELECT DISTINCT c.email,
c.firstname,
c.lastname,
g.name
FROM customers c
JOIN invoices i ON c.customerid = i.customerid
JOIN invoice_items il ON il.invoicelineid = i.invoiceid
JOIN tracks t ON il.trackid = t.trackid
JOIN genres g ON t.genreid = g.genreid
WHERE g.name = 'Rock'
ORDER BY 1;
We can simply replace the WHERE g.name = 'Rock'
with WHERE g.name = :genre
,
and then:
>>> queries.get_top_listeners_for_genre(db, genre="Rock")
[('aaronmitchell@yahoo.ca', 'Aaron', 'Mitchell', 'Rock'),
('alero@uol.com.br', 'Alexandre', 'Rocha', 'Rock'),
('astrid.gruber@apple.at', 'Astrid', 'Gruber', 'Rock'),
...
('steve.murray@yahoo.uk', 'Steve', 'Murray', 'Rock'),
('terhi.hamalainen@apple.fi', 'Terhi', 'Hämäläinen', 'Rock'),
('tgoyer@apple.com', 'Tim', 'Goyer', 'Rock'),
('vstevens@yahoo.com', 'Victor', 'Stevens', 'Rock'),
('wyatt.girard@yahoo.fr', 'Wyatt', 'Girard', 'Rock')]
Now, I realize that for many developers, SQL isn’t the most comfortable syntax, but for more complex database schemas, and especially for legacy databases that were designed for a totally different model, being able to work and access 100% of the richness of SQL is exceptionally powerful. It may not be the perfect answer, but it’s one to keep in your toolbelt.
Finally, if you want a bit more complicated version, you can take a look at this gist I
wrote a while back. It combined not just aiosql
, but also the brilliant
Pydantic
module and FastAPI
: