Restore data from InnoDB file (idb & frm) using TwinDB toolkit

We have been told many times that modifying live database should be done with extreme care, we should always make a backup before doing something big to the database. However, there are countless stories on the Internet about losing data due to various reason, one of them is forgetting to create a backup (Gitlab is an example: https://about.gitlab.com/2017/02/01/gitlab-dot-com-database-incident/). I was facing the same issue when upgrading MySQL server to a new version. Luckily I was able to restore most of the data but it was still a very good lesson for me. One of lesson I learned is how we could restore the data from the *.ibd and *.frm file.

The database I worked with had many tables. There were about 5 of them using MyISAM engine while others were using InnoDB engine. I was asked to upgrade the entire the server from MySQL 5.7 to 8.0. Well, I was careless when following the In-place upgrade recommended here without creating a backup first: https://dev.mysql.com/doc/refman/8.0/en/upgrading-strategies.html. What I did was that I started a new instance of MySQL 8.0 (running in parallel with MySQL 5.7) while pointing its datadir option to the same datadir being used by MySQL 5.7, without shutting down 5.7 first. When MySQL 8.0 started, it modified the ibdata1 file somehow and caused this file to be no longer compatible with MySQL 5.7. At the same time, the 8.0 server could not update this ibdata1 file correctly and it failed to load the database into the server. As the result, I could no longer use or see this database in both 5.7 and 8.0.

After many failed tries of recovering this ibdata1 file, I had to import the SQL dump file given by the client into the MySQL 8.0. This dump file was actually a backup done by him (thanks god) but it was 1 month older than the newest data we had. That was not cool and I had to find a way to restore the data somehow.

Luckily most of the important & big tables are using MyISAM engine so I can simply copy the *.MYD and *.MYI files into the datadir of the new MySQL and immediately I had restored them successfully. But it was tricky with other tables. What I had were two ibd and frm files for each table using InnoDB engine. We cannot simply copying these two files into the data directory because the MySQL would complain about incorrect tablespace or something. Trust me, I tried many methods which recommends using DISCARD TABLESPACE and IMPORT TABLESPACE but none of them worked with me for some reason (Ex: https://medium.com/magebit/recover-innodb-database-from-frm-and-ibd-files-99fdb0deccad, you should try it first before reading further)

Finally I found an article which gave me some leads: https://mattstauffer.com/blog/restoring-lost-innodb-mysql-databases-after-all-data-and-hope-was-lost-with-drop-tables/. In that article, the author mentions the use of Twindb toolkit for restoring the data. I went to Twindb website, tried their tool: https://recovery.twindb.com/. Amazingly after uploading my ibd and frm files, I could see all the data. However, I could not actually get the data because they asked me to pay first ($99 USD, a bit too much for me). But at least I knew that my data was recoverable and Twindb tool could do it. Reading the previous article further, I found that they used TwinDB data recovery toolkit (https://twindb.com/undrop-tool-for-innodb/, https://github.com/twindb/undrop-for-innodb). The issue was that there were no explanation on how I could 'actually' retrieve the data. The instruction at https://twindb.com/undrop-tool-for-innodb/ only shows me how to extract the table structure and run some commands for fetching the records, but I wanted to export it into something which I can easily import back to my MySQL server. How could I do that? (I still don't know why they did not say anything about that)

Anyway, I managed to run the tool and get the actual data which I could import into the database using mysql command. Here is how I did it

1. Install the Twindb data recovery tool

It means cloning the git repos (https://github.com/twindb/undrop-for-innodb), run make to build the binary file. After this step, you will get the binary files called stream_parser and c_parser in the folder which you cloned

2. Create a .sql file containing the SQL for creating your table.

How to get the table structure? You can either use the info in your current database or you can extract the CREATE statement from the .frm file using mysqlfrm command (https://dev.mysql.com/doc/mysql-utilities/1.5/en/mysqlfrm.html)

3. Parse the innodb file.

Here is the command I used (assuming I have a file called people.ibd containing the data of people table in my database.

./stream_parser -f people.ibd

After parsing, you will get a folder called pages-people.idb. Inside this folder you will see another folder called FIL_PAGE_INDEX, and again inside this subfolder is a file with a long serial as its name and the extension is .page. We are ready to move to the next step

4. Extract the database into .sql file

Here is the command I used

./c_parser -6f pages-people.ibd/FIL_PAGE_INDEX/0000000000002305.page -t people-create.sql > dumps/default/people 2>dumps/default/people.sql

Explanation:

  • -6f: 6 mean the ibd file was generated by a version of MySQL 5.6+ (in my case it was 5.7), f for specifying the .page file we are going to parse
  • -t people-create.sql: the file contains the CREATE TABLE statement as I said in step 2
  • > dumps/default/people: the dump data will be in this file. This is actually a text file which is compatible with the command LOAD LOCAL DATA FILE. dumps/default is simply the folder I used for storing the exported data.
  • 2>dumps/default/people.sql: this is the .sql file which will contain the LOAD LOCAL DATA FILE statement. So in the end we can simply run this file to import the data.

But the tool is not perfect, and it cannot magically restore 100% the data you have lost (in my case it was 98% though). There might be some additional things you need to do (which happened to me):

  • Some value was exported incorrectly so I had to modified it manually in the output text file (I also adjusted many after importing)
  • Your MySQL server might not allow running LOAD LOCAL DATA FILE so you need to check and enable the option (in the mysql config file)

Anyway, I am glad to see most of my data return and it was worth doing all those complex steps. Actually I hope you will not need this blog because what you should do is to make sure creating a backup first. Hope this helps some of you who are frustring.