Using tokio and sqlx to connect to the PlanetScale database

Extra Lesson Details

Code Changes (GitHub Diff)

Something may have changed and we'll put it here if it does

To insert data into the database, we'll need to set up a database connection. sqlx is an async SQL client so we need to talk about async Rust.

In Rust, the equivalent of the Promise type in JavaScript is defined by the Rust standard library. It's called a Future. That's where Rust's standard library's responsibilities end though. The runtime that is responsible for running and polling Futures can be swapped out according to your programs needs. The runtime we'll be using is in a crate called tokio which is the longest lived, most stable async runtime in the Rust ecosystem.

cargo add tokio -p upload-pokemon-data

We need to go into our Cargo.toml and modify the tokio dependency. We'll enable the macros feature so that we can use the tokio::main macro.

tokio = { version = "1.11.0", features = ["macros"] }

We can use the tokio::main macro on our main function and use the async keyword to turn our main function into an async function.

async fn main() -> Result<(), csv::Error> {

The macro from the tokio crate will rewrite our main function to wrap it in the tokio runtime. We could set the runtime up ourselves but we're only concerned with using async in this course, not the internals of async runtimes, so we'll let the macro do the work for us.

Having an async function is important for us because it will let us await our sql queries, which are async functions themselves.

Before we can make queries, we need to set up our database connection. sqlx offers us the MySqlPoolOptions::new() builder to connect a pool to our database url. The connect function is async and we can't continue without the database connection, so we await the result before continuing. .connect returns a Result, so we can take the same approach as before and use ? to forward the error to color_eyre's report if anything goes wrong.

fn main() -> eyre::Result<()> {
let database_url = env::var("DATABASE_URL")
.wrap_err("Must have a DATABASE_URL set")
.suggestion("Run `pscale connect <database> <branch>` to get a connection")?;
let pool = MySqlPoolOptions::new()

We can also use eyre's traits to add some suggestion text since many people often forget the format of a mysql connection string:

let pool = MySqlPoolOptions::new()
.suggestion("database urls must be in the form `mysql://username:password@host:port/database`")?;

With the tokio runtime set up and the connection pool also set up, we can now insert pokemon into our database.

We already have insert_pokemon in scope because we added use db::* earlier in the course. At the bottom of our for loop we can call insert_pokemon with a reference to the MySql connection pool and to the pokemon_row we want to insert. insert_pokemon is async, so we'll await it and use ? to handle the error.

Also note that I've changed the println! to be more readable when this script prints out all >1000 pokemon.

for result in rdr.deserialize() {
let record: PokemonCsv = result?;
let pokemon_row: PokemonTableRow = record.into();
"{} {:?} {}",
insert_pokemon(&pool, &pokemon_row).await?;

At this point we'll be able to run the script and insert the pokemon! If you've lost or stopped your database connection along the way, you can get it started again with this pscale command.

pscale connect pokemon new-tables

and we can run our program to insert all the pokemon into the pokemon table.

DATABASE_URL=mysql:// cargo run

This will take a few minutes, that's partially why we left the println! in, so we could see progress.

There are two reasons this insertion is slow.

First: sqlx doesn't have batching. Some other clients do. So instead of sending a single large request we're sending a lot of smaller requests.

Second: Currently we stop at every await (because that's how await works) which means we only ever have one SQL query running.

To be clear: this is a perfectly fine way to work with async/await, especially if you're just getting started with Rust or async Rust but also because we only really ever have to run this script once, so how easy it is for us to write and run needs to be traded off against how long it takes to run. Even if this script took a whomping 15 minutes to run, that would only give us 15 minutes worth of time to improve the performance (because we only have to run the script once).

However, there are ways to execute our futures that will result in more requests running simultaneously.

Getting the PlanetScale connection string from an environment variable and handling errors with color_eyre

Showing Iterator progress bars with indicatif