How to restore specific data from previous backup on Postgres Heroku? (Eg. Accidentally deleted rows)
Here’s the situation : With Heroku & Postgres, you can have automatically generated backups dump file. But what can you do with it?
- Dump it on your database, if you want to fully go back to the backup state
- Dump it locally to “have a look”, or to use production data in development environment
- Set back specific rows of your database in a previous state (eg. restore accidentally deleted rows)
I found myself so much struggling about latter point that I wanted to share how I have done it.
Summary / TL;DR
In 3 steps you’ll be able to execute very simply:
First install the backup locally, second get a SQL script, third open your localhost to the outside world with ngrok.
Let’s go?
1. Download your dump file on Heroku and dump it somewhere:
- You can do that on a remote database if you have some servers available. But if like me you don’t want to provision another production database on Heroku or somewhere else, locally will totally do.
- I like to use PGAdmin (available on Linux, Mac and Windows), but using command line and
psql
will also do (by reading this post by example) - In PGAdmin, you’d do
Create a database
. Then right click on it and use therestore
function. Select your dump file, clickRestore
and you’re all set : your backup data is available locally! Good job!
2. Access it from your remote database
I wanted to do the following:
And I would be all set. Super easy, right? Pretty obvious? This must have been done hundreds of times already. Well, no!
There is a utility called db_link
in Postgres 9.1+, but it is pretty constraining as the following syntax applies:
Every column name needs to be repeated twice including its type. Pretty heavy, we are far from the simple SELECT * FROM backup_db.table_name
So the idea here is to use the information_schema
table content, which describes each table with its column names, its types etc. I found this question on SO: http://stackoverflow.com/questions/18317216/specify-dblink-column-definition-list-from-a-local-existing-type which helped me a lot (Thanks bentrm).
But its solution was a two steps process, first generating a function, then querying it:
And I was still aiming at a 1 liner. After some little pain (not being a SQL Guru), here is the Gist:
I now can do:
Awesome, right?
3. Access localhost remotely
If your remote database is already available from the internet = has an IP address, a domain name (Eg. for Heroku it will look like: ec2-54-217-229-169.eu-west-1.compute.amazonaws.com:5672/df68cfpbufjd9p
) you can skip this step. But if you use your local database, you need to make it available from the outside world (so that the Heroku database can access it).
For this, I use the wonderful ngrok.
Once installed I only need to enter the following command:
And you’d only need to plug db_link
(in the gist) to host=ngrock.com port=51727
and you are good to go!
4. Going further
There are many possible improvements to this. Here are some I see already:
- Considering the script as a default feature to
db_link
function - Being more error-proof if database structures are different in backup and production
- Making comparison tool between database results and backup results (to only return diffing lines)
- Handle simple joins
- And even further would be to have an application level adapter (Eg. ActiveRecord in Rails) that could allow manipulation of backend objects instead of raw SQL like now
Hope I was clear! Please ask for more details otherwise.
Read Further
The definitive guide to using Docker as a local development environment (part 2)
Learn how to plug a DB to your Dockerized server
Continue →The day I stopped using Heroku
Heroku has been a great service, but also has too much control on my work.
Continue →