Setting up database migrations with Knex

Connecting to our database to directly create tables, rename columns, or any other refactoring is so last century.

Rather than directly connect to the database environment on our RDS instances, we can make changes through code. This code allows us to document what our database looks like and the steps to get it there, while simultaneously making it easier to keep our development and production databases synced up. This is another example of Infrastructure as Code (IaC) that allows us to create our environment programmatically rather than through manual intervention.

One way to do this is to use migrations in the knex database library.

Step 0: Have working code and a database

If your code doesn't work before you do this, it's not going to magically work afterward. You also need a database to connect to (either locally, and/or in your production environment). If you're using Elastic Beanstalk, you've probably got an RDS instance and the appropriate variables configured already. I'm going to assume that's what you're using.

I'm assuming here that you have set up the NODE_ENV environment variable in your production environment, and that you have RDS connected through Elastic Beanstalk.

Step 1: Setting up your deployment

Create a new file that can be used by knex to manage your database connection. The file goes in the root of your project directory, and is called knexfile.js. This file contains two database configurations: development and production:

// knexfile.js
module.exports = {
  development: {
    client: 'pg',
    connection: {
      host: 'localhost',
      user: 'postgres',
      password: 'SuperSecretPassword',
      database: 'music',
      port: 5432,
    },
    migrations: {
      directory: './migrations'
    }
  },

  production: {
    client: 'pg',
    connection: {
      host: process.env.RDS_HOSTNAME,
      user: process.env.RDS_USERNAME,
      password: process.env.RDS_PASSWORD,
      database: process.env.RDS_DB_NAME,
      port: process.env.RDS_PORT || 5432,
      ssl: { rejectUnauthorized: false }
    },
    migrations: {
      directory: './migrations'
    }
  }
};

Now we'll update our index.js file to use this knexfile for configuration so we don't have to keep it updated in two places. Replace the const knex line with the following:

// in index.js
const knexConfig = require("./knexfile");
const environment = process.env.NODE_ENV || "development";
const knex = require("knex")(knexConfig[environment]);

Now our application will use the development configuration in our dev environment, and the production config in the production environment.

Step 2: Create a migration

Migration files are the key here. They allow you to specify the steps to create or update your tables in Javascript, then run those commands through knex.

Create a new migration by typing this command at the terminal:

npx knex migrate:make myfirsttable

This command creates a new migration file in a folder called migrations/ in your project directory.

💡
I'll fully admit that I used AI to come up with the exact syntax of the migration file. There's lots of documentation online to get you familiar with how to set those up.

Inside that file, you'll write two functions that get exported: up and down. up is run when you migrate, and down is run when you rollback a migration (if you made a mistake and want to go back. The two are basically inverses of each other - if you create a table in up, then down should drop the table.

For example, my application has a table called bands with several columns. Here is the migrations file to create that table:

// migrations/datetimestring_create_bands_table.js
/**
 * @param { import("knex").Knex } knex
 * @returns { Promise<void> }
 */
exports.up = function(knex) {
  return knex.schema.createTable('bands', function(table) {
    table.increments('band_id').primary();
    table.string('band_name', 255).notNullable();
    table.string('lead_singer', 255);
    table.string('music_genre', 100);
    table.string('still_rocking', 1).defaultTo('Y');
    table.integer('rating');
  });
};

/**
 * @param { import("knex").Knex } knex
 * @returns { Promise<void> }
 */
exports.down = function(knex) {
  return knex.schema.dropTableIfExists('bands');
};

This creates the table with columns for band_id (auto-incrementing primary key), band name, etc. As you can see, the down function drops the table.

Step 3: Run the migration

Once you have your migration(s) set up, you can run them with the following command:

npx knex migrate:latest

BAM! You've got yourself a database table. Or lots of tables if you make more migrations, or create lots of tables in one migration.

In addition to changing the table you directed it to change, knex uses two extra tables: knex_migrations and knex_migrations_lock to track which migrations have been run.

Step 4: Tell Beanstalk to run migrations

Now we need to tell Elastic Beanstalk to run the migrations when you update your code. That way when you make a new migration and update your site, it will automatically migrate the database.

⚠️
Automatically migrating your database is great for convenience, but it can occasionally break things. Make sure that your migrations work on your own machine before deploying them, and be ready to rollback if you need to.

To tell Beanstalk to migrate, create a new file in the .ebextensions folder called 01_migrations.config:

# .ebextensions/01_migrations.config
container_commands:
  01_migrate:
    command: "npx knex migrate:latest"
    leader_only: true

This will automatically run the migrations when you upload new code. leader_only ensures that only one server tries to run the migrations so you don't end up with race conditions.

Commit your changes and upload the code to Beanstalk to see the migrations in action. Use the logs to troubleshoot if you run into issues.

Step 5: Rollback (only if needed)

If you make an error in your migration file, you can rollback a change with

npx knex migrate:rollback

This runs the down function, undoing the changes made in a migration. It also updates knex's migration tracking so it knows what migrations have been run.

🐊 Pitfalls to watch out for

  • If you are using migrations to make changes, only make changes with migrations. Manual table edits are likely to break migrations, and mean that migrations no longer have full control over the state of your database. A mix of manual and automated edits is a recipe for confusion and frustration.
  • Make sure your database migrations are consistent across the team - if one user creates a migration to create a table, make sure to commit and share that code. For a small team, it might be best to have one person be the "migration czar."
  • Create the exports.down function so that you can rollback when needed. Without them, you might be forced to restart your database from scratch if something gets messed up.
  • Once a change is in production, it might be easier to do a "forward fix," creating a new migration to undo a change, rather than rolling back. Rollbacks are easy in dev, but may cause problems in production.