January 7, 2025
Migrating from SQLite to PostgreSQL
This article discusses what might be involved if you need to migrate from SQLite (a file-based database) to PostgreSQL (a powerful client-server-based database). The idea is to give an overview of a migration process without going into the specifics of a specific database, framework, or application to give you ideas about how you might approach this if you are finding yourself in a situation where you need to migrate between these two database systems.
SQLite is great, why migrate?
While SQLite is a great database for small applications or systems that are primarily read-only, it has limitations that can make it unsuitable for larger, more demanding use cases. You may have started a smaller application using SQLite. As it grew in complexity and usage, you may notice issues with write contention due to SQLite's single-writer architecture. Beyond write concurrency, you might need more advanced capabilities, such as efficient parallel queries, reliable replication, and comprehensive tooling for backups and scaling. These are areas where Postgres excels.
Additionally, you may find that you need features of Postgres that SQLite doesn't offer, such as stronger transactional integrity, a richer type system, superior indexing methods, full-text search, sophisticated extensions like PostGIS, and robust support for multi-user access. All these offer the ability to handle significantly more complexity as your application evolves.
Steps for a successful migration
Migrating a database could be fairly straightforward if you have a simple database and don't need any data preserved and migrated into the new database. This might be the case if you're working with a still-in-development application that you haven't yet put into production. However, larger systems with complex schemas and data relationships and plenty of potentially messy data will require more planning and effort. Unfortunately, in these cases, there's no magic bullet that will make the migration process completely painless. However, with careful planning and execution, you can minimize the impact on your application and users and end up with a more robust and scalable database system.
The keys to a successful migration are:
- Understand the differences between SQLite and PostgreSQL (and, consequently, your database schema and the new schema you'll create in PostgreSQL).
This article won't discuss these differences in detail, but a good overview can be found in our other article PostgresSQL vs. SQLite: The key differences and advantages of each.
- Understand your codebase and all the queries (and/or ORM calls) that interact with your database.
Any complex queries will likely have to be changed or rewritten to work correctly with PostgreSQL. This is especially true if you're using SQLite-specific features or syntax that isn't supported in PostgreSQL. If you are using an ORM, like Django ORM or Laravel's Eloquent, it may abstract away most of the differences between the two databases, but you should still be aware of what's going on under the hood. In such a case, ensure any raw database queries are noted and tested during the process before deploying to production.
- Plan your migration carefully, testing with a realistic dataset and workload on a local or staging environment.
It's important to create a migration and implement it on a staging environment that you can easily take down and restore in case things go wrong. This will allow you to test your migration process and ensure that everything works as expected before deploying to production. It would be very helpful to get a complete copy of your production database, if possible, to fully test the migration process before attempting to run it on the production database.
One situation that is very possible, given SQLite's flexibility in data types, is that you may have data that is not compatible with PostgreSQL's stricter data types. For example, SQLite allows you to store any type of data in any column, while PostgreSQL is more strict about data types. This could mean that you have to substantially clean up your data before a successful migration to PostgreSQL is possible, or you may have to modify your schema to accommodate the data you have.
- Ensure you have a rollback plan in case things go wrong.
There's a good chance things will not go smoothly on the first try. You should have a plan in place to roll back your application and database to the prior state if things go wrong. This could involve taking a backup of your SQLite database immediately before starting the migration or having a plan to restore your database and application from a backup or alternate git branch if things go wrong. If at all possible, you should notify your users of the upcoming downtime and migration, then take your application offline at a time when it is not heavily used so that you can run the migration without the database actively changing. If things do go wrong, ensure you can notify your users of the issue and that you have a plan to restore service as quickly as possible.
- Verify that everything is working in production as expected.
Once your migration is complete, before putting it back online for users, you should have automated tests as well as a manual testing process that you can run through to ensure that all the parts of your application that use the database are working as expected. You should also monitor your database for issues (like slow queries) and ensure that your application is performing as expected. It is common to miss indexes during a migration like this which can tank your application's performance.
- Keep improving your database schema and queries as you learn more about PostgreSQL and its capabilities.
As your application grows and you become confident that the application is working as expected, you should continue to monitor and improve your database schema and queries. PostgreSQL has a lot of advanced features that can help you optimize your database for performance and scalability, and you should take advantage of these features as you learn more about them.
As you can imagine, nailing all of the above points is non-trivial, and it could take a lot of work and time. You certainly don't want to rush a process like this. After all, your database is the heart of your application, and screwing up this delicate and potentially risky procedure could mean losing data or corrupting it, resulting in downtime and lost revenue.
Creating a migration plan for a PostgreSQL version of your SQLite database
Obviously, a very significant early part of a migration to a new database type is to recreate your database schema in the new relational database management system (PostgreSQL, in this case). This involves creating tables, indexes, constraints, and other database objects in the new database. Once this is complete, you'll also need to migrate the table data from SQLite to PostgreSQL, which will be discussed later.
There are two major approaches you could take to the task of creating a new schema in PostgreSQL:
-
Manual recreation: This involves manually creating the schema in PostgreSQL, using the
CREATE TABLE
,CREATE INDEX
, and other SQL commands (or the equivalent in your framework if you're using a migration system in something like Laravel). This is a good approach if you have a small database with a simple schema or if you want to take the opportunity to redesign your schema from scratch. -
Automated migration tools: There are tools available, notably
pgloader
, that can help you automate the process of creating a new schema in PostgreSQL based on your existing SQLite schema. These tools can analyze your SQLite database and generate the necessary SQL commands to create the equivalent schema in PostgreSQL. This is a good approach if you have a large or complex schema that would be time-consuming to recreate manually.
Additionally, once the new schema is created, you will want to ask yourself if any design flaws in the original database have been carried over to the new schema. This is a good opportunity to refactor your schema to leverage the capabilities of PostgreSQL or improve the design to suit your application's needs better. If you are making such changes, you'll need to carefully note these and ensure that your application code is updated to reflect the changes.
However, if you are happy with the original schema and don't feel it needs changes, you'll likely have a more straightforward migration by keeping things as similar as possible between the two databases.
Example database schema
Let's consider a simple database schema for a basic blog application in SQLite that we will use to demonstrate the migration process.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
username TEXT NOT NULL UNIQUE,
email TEXT NOT NULL UNIQUE,
password TEXT NOT NULL -- Hashed password
);
CREATE TABLE posts (
id INTEGER PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
user_id INTEGER NOT NULL,
tags TEXT, -- Comma-separated list of tags
created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE comments (
id INTEGER PRIMARY KEY,
post_id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
content TEXT NOT NULL,
created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
Manual recreation of the schema
If your schema is not too extensive, you might want to manually recreate it to ensure that you understand the differences between SQLite and PostgreSQL and can make any necessary changes to the schema as you go. Here's an example of what the above blog schema might look like in PostgreSQL:
CREATE TABLE users (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
username VARCHAR(255) NOT NULL UNIQUE,
email VARCHAR(255) NOT NULL UNIQUE,
password TEXT NOT NULL -- Hashed password
);
CREATE TABLE posts (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
user_id INTEGER NOT NULL,
tags TEXT[], -- Array of tags instead of a comma-separated list
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
);
CREATE TABLE comments (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
post_id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (post_id) REFERENCES posts (id) ON DELETE CASCADE,
FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
);
Using pgloader
to migrate the schema
Depending on the schema you want to use and the amount of data you have, you may want to use a tool like pgloader
to help you migrate your schema from SQLite to PostgreSQL. pgloader
is a powerful tool that can help you automate the process of migrating your schema and data from one database to another.
Using a database based on the above, we'll show an example of what using pgloader
might look like to migrate the schema to PostgreSQL.
First, you'll need to install pgloader
if you haven't already done so. You can do this on Ubuntu with:
sudo apt-get install pgloader
Or on macOS with:
brew install pgloader
You can read more about installing pgloader
in the official documentation.
Next, you'll need to create a configuration file for pgloader
that tells it how to connect to your SQLite database and your PostgreSQL database, and how to migrate the schema and data. Here's an example configuration file that you could use to migrate the above schema:
LOAD DATABASE
FROM sqlite:///full/path/to/your/sqlite.db
INTO postgresql://user:password@localhost/database
WITH include no drop, create tables, create indexes, reset sequences
SET work_mem to '128MB',
maintenance_work_mem to '512 MB';
Note that the above configuration file requires you to put the full path to your SQLite DB file and add the credentials and name of your PostgreSQL database. You can also set other options in the configuration file to control how pgloader
migrates the data, but the above is a good starting point.
The work_mem
and maintenance_work_mem
settings are important for controlling how much memory pgloader
uses when migrating the data. You may need to adjust these settings based on the size of your database and the amount of memory you have available.
If everything is correctly specified in your configuration file, you can run pgloader
with the following command (in our example, we've named the configuration file sqlite-to-pgsql.load
):
pgloader sqlite-to-pgsql.load
If this is working correctly and pgloader can access both databases, you'll see output indicating that the migration is in progress. Once it's complete, you should have a new database in PostgreSQL with the same schema as your SQLite database. For a small example set of data, the output of the migration looks like this:
pgloader sqlite_to_pgsql.load
2024-12-16T19:16:27.003816-05:00 LOG pgloader version "3.6.9"
2024-12-16T19:16:27.004632-05:00 LOG Parsing commands from file #P"/Users/jdlien/code/sqlite-to-postgres/database/sqlite_to_pgsql.load"
2024-12-16T19:16:27.042300-05:00 LOG Migrating from #<SQLITE-CONNECTION sqlite:///Users/jdlien/code/sqlite-to-postgres/database/database.sqlite {700BC4C313}>
2024-12-16T19:16:27.042436-05:00 LOG Migrating into #<PGSQL-CONNECTION pgsql://jdlien@localhost:5432/sqlite_to_postgres {700BC4C3A3}>
2024-12-16T19:16:27.200789-05:00 ERROR PostgreSQL Database error 42830: there is no unique constraint matching given keys for referenced table "users"
QUERY: ALTER TABLE posts ADD FOREIGN KEY(user_id) REFERENCES users(id) ON UPDATE NO ACTION ON DELETE NO ACTION
2024-12-16T19:16:27.201216-05:00 ERROR PostgreSQL Database error 42830: there is no unique constraint matching given keys for referenced table "users"
QUERY: ALTER TABLE comments ADD FOREIGN KEY(user_id) REFERENCES users(id) ON UPDATE NO ACTION ON DELETE NO ACTION
2024-12-16T19:16:27.201798-05:00 ERROR PostgreSQL Database error 42830: there is no unique constraint matching given keys for referenced table "posts"
QUERY: ALTER TABLE comments ADD FOREIGN KEY(post_id) REFERENCES posts(id) ON UPDATE NO ACTION ON DELETE NO ACTION
2024-12-16T19:16:27.204274-05:00 LOG report summary reset
table name errors rows bytes total time
----------------------- --------- --------- --------- --------------
fetch 0 0 0.000s
fetch meta data 0 14 0.015s
Create Schemas 0 0 0.001s
Create SQL Types 0 0 0.002s
Create tables 0 11 0.036s
Set Table OIDs 0 11 0.003s
----------------------- --------- --------- --------- --------------
migrations 0 5 0.2 kB 0.017s
users 0 13 2.2 kB 0.017s
password_reset_tokens 0 0 0.010s
sessions 0 0 0.010s
cache 0 0 0.019s
cache_locks 0 0 0.019s
jobs 0 0 0.024s
job_batches 0 0 0.024s
failed_jobs 0 0 0.028s
posts 0 20 5.4 kB 0.028s
comments 0 40 7.8 kB 0.033s
----------------------- --------- --------- --------- --------------
COPY Threads Completion 0 4 0.035s
Reset Sequences 0 6 0.016s
Create Foreign Keys 3 0 0.002s
Create Triggers 0 0 0.000s
Install Comments 0 0 0.000s
----------------------- --------- --------- --------- --------------
Total import time โ 78 15.5 kB 0.053s
Looking at the above output, we can see that the migration was successful, with all the tables and rows being imported into the new PostgreSQL database. However, there were some errors when creating foreign keys. At this point, it'll be important to verify that all the data has been correctly imported into the new database and that the schema is correct. The foreign keys may need to be created manually (or via migration scripts) to ensure referential integrity in the future due to the differences in how SQLite and PostgreSQL handle foreign keys.
It's important to ensure that any primary keys, constraints, and indexes have been correctly migrated and that the data in the new database is consistent with the data in the old database.
When pgloader
isn't enough
In many cases, pgloader
might not get the job done for you. This is more likely if you have a complex schema with a lot of data and you are finding you have many issues with incompatible data types or constraints. SQLite's flexible types contribute to these issues; if the application code didn't validate the input data very well, you might have a lot of mixed data that doesn't fit into PostgreSQL's strict data types. Additionally, if the original schema was suboptimal and you wanted to make many changes to the new schema, pgloader
is not going to be much help.
In such cases, you may need to write a custom migration script. Of course, how you approach this will depend on the environment, languages, and framework you are working with, but essentially, the approach you would take is as follows, with a few examples of code in Python:
TODO: Overview of what a migration script might look like.
- First, ensure you're connected to both databases.
import re # For regular expressions used in data cleaning
import sqlite3 # For SQLite
import psycopg2 # For PostgreSQL
# Connect to SQLite
sqlite_conn = sqlite3.connect('source.db')
sqlite_cursor = sqlite_conn.cursor()
# Connect to PostgreSQL
pg_conn = psycopg2.connect(
dbname="destination_db", user="user", password="password", host="localhost"
)
pg_cursor = pg_conn.cursor()
- Next, query a chunk of data at a time from the original SQLite database. Note that it's better to explicitly specify all the column names rather than using
SELECT *
to avoid issues with column order or missing/additional columns.
chunk_size = 1000 # Adjust as needed based on memory constraints
offset = 0
while True:
sqlite_cursor.execute("SELECT id, username, email, password, created_at FROM users LIMIT ? OFFSET ?", (chunk_size, offset))
rows = sqlite_cursor.fetchall()
if not rows:
break # Exit when no more rows
# The function below accepts the rows, applies the transformations,
# and inserts them into the PostgreSQL database.
transform_and_insert(rows)
offset += chunk_size
Below is an example transform_and_insert
function. What you need in your particular case may look nothing like this. Still, this script gives you an example of the sorts of transforms, validation, and sanitization that you may want to do before inserting data into the new database.
import re
from datetime import datetime
def transform_and_insert(rows):
"""
Transforms the SQLite rows and inserts them into the PostgreSQL database.
Handles sanitization, transformation, and checks before insertion.
"""
transformed_rows = []
email_pattern = re.compile(r"[^@]+@[^@]+\.[^@]+") # Simple example email validation; this is not comprehensive
for row in rows:
try:
id, username, email, password, created_at = row
# 1. Strip leading/trailing whitespace from text fields
username = username.strip() if username else None
email = email.strip().lower() if email else None # Standardize email case
password = password.strip() if password else None
# 2. Sanitize and validate email
if not email or not email_pattern.match(email):
print(f"Skipping row {id}: Invalid email '{email}'")
continue # Skip this row if email is invalid
# 3. Handle missing or invalid 'created_at' dates
if not created_at:
created_at = '1970-01-01 00:00:00' # Default to epoch
else:
# Attempt to parse created_at; if invalid, default it
try:
created_at = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
except ValueError:
print(f"Row {id}: Invalid date '{created_at}', using default date")
created_at = datetime(1970, 1, 1)
# 4. Check required fields (e.g., username and password must not be null)
if not username or not password:
print(f"Skipping row {id}: Missing required fields (username or password)")
continue # Skip if essential fields are missing
# 5. Transform data to match PostgreSQL requirements
# PostgreSQL might use SERIAL, so IDs might be optional
transformed_row = (id, username, email, password, created_at)
transformed_rows.append(transformed_row)
except Exception as e:
print(f"Error processing row {row}: {e}")
# 6. Bulk insert transformed rows into PostgreSQL
if transformed_rows:
try:
insert_query = """
INSERT INTO users (id, username, email, password, created_at)
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (id) DO NOTHING;
"""
pg_cursor.executemany(insert_query, transformed_rows)
pg_conn.commit()
print(f"Inserted {len(transformed_rows)} rows successfully")
except Exception as e:
print(f"Failed to insert rows into PostgreSQL: {e}")
pg_conn.rollback()
The process of writing a transform script like this can be incredibly tedious and time-consuming on a large database with many tables and columns, but it has the significant advantage that you have control over every step of the process and can perform any transforms and error handling necessary to ensure that the data is clean and ready to be used in the new database.
Example migration plan document
The following is an outline of an empty migration plan that you can use to get started on internal documentation for your migration project. This will be especially helpful if you are working with a team and need to keep everyone on the same page.
Migration plan: SQLite to PostgreSQL migration for application_name
Overview
Rationale for migration, high-level goals, and objectives.
Scope
- Description of the database schema, data, and application components to be migrated.
- List of tables, columns, and relationships to be migrated.
- Approximate number of records to be migrated and complexity.
Migration strategy
- Specify the migration approach: manual schema recreation, automated migration tools (like pgloader), or custom migration scripts.
- Considerations for schema changes, data types, and constraints.
- Data migration strategy: full migration, partial migration, or phased migration.
- Note the downtime required for the migration process.
- Give a planned date for the migration of production data.
- Create a communication plan for stakeholders and users.
- Idenfify any code branches or versions that will be used after migration.
Potential problems
- Known issues or challenges that may arise during the migration process.
- Data type incompatibilities between SQLite and PostgreSQL.
- Known problematic data that may require manual intervention.
- Note any potentially problematic application code or SQL queries.
Testing plan
- Note testing system and test scripts that will be run to ensure that the application is working post-migration.
- List steps to take to verify data integrity and consistency.
- Write manual testing steps for the application after migration.
Migration checklist
This checklist can be used to track the progress of the migration process. Of course, this is a general guideline, and you should adjust it to fit your specific needs.
Pre-migration
- Create a copy/backup of the SQLite database to be used in the dev/staging environment.
- Prepare the PostgreSQL database in staging and ensure connectivity.
- Review the SQLite schema and identify differences with PostgreSQL.
- Identify data types and constraints that need to be adjusted.
- Install and configure migration tools (e.g., pgloader).
- Develop custom migration scripts if needed.
- Review application code for SQLite-specific queries or functions.
Test migration
- Run the migration tools (like
pgloader
) on a copy of the SQLite database to test the migration process. - Run any custom migration scripts on a copy of the SQLite database to test the migration process.
- Verify that the schema is correctly migrated to PostgreSQL.
- Run automated tests to ensure data integrity and consistency.
- Manually test the application to ensure it works as expected.
- Note the amount of time that the migration process takes, separately noting the time for testing.
- If the process fails, note the error, take steps to resolve it, and repeat the whole process starting from scratch.
- Repeat until the migration process is successful in the staging environment.
- Run the migration tools (like
Production migration
- Notify users and stakeholders of the upcoming downtime for the migration.
- Take the application offline.
- Make a final backup of the SQLite database.
- Run the migration tools on the production database.
- Run any custom migration scripts to apply final changes to the production database.
- Verify that the schema is correctly migrated to PostgreSQL.
- Run automated tests to ensure data integrity and consistency.
- Manually test the application to ensure it works as expected.
- If the migration failed, determine whether to roll back to the previous state or attempt to fix the issue.
- If the migration was successful, bring the application back online.
- If necessary, notify stakeholders and users that the migration is complete.
Post-migration
- Monitor the application for any issues post-migration.
- Monitor the database for performance issues or slow queries and optimize (like with indexes) as needed.
- When the application is stable, consider further enhancements to the PostgreSQL schema.
Summary
Migrating from SQLite to PostgreSQL can be a complex and time-consuming process, but with careful planning and execution, you can minimize the impact on your application and users. Understanding the differences between SQLite and PostgreSQL, creating a migration plan, and testing the migration thoroughly are key to a successful migration. Whether you choose to manually recreate the schema, use automated migration tools like pgloader
, or write custom migration scripts, the goal is to ensure that your data is migrated accurately and that your application works as expected post-migration.
Keeping in mind the guidelines from this article and using the migration plan and checklist should give you a good start for fleshing out your own plan so that you can execute a smooth migration of your application's data. As long as you don't rush, ensure you have thorough testing, and have your team on the same page about the process, you should be able to successfully migrate your database from SQLite to PostgreSQL.