Saturday, June 24, 2006

MySQL storage engines

In this post I discuss the MySQL storage engines (table types). MySQL supports a pluggable storage engine architecture, which means that you can write your own storage engine and have MySQL store data through it while still using MySQL's SQL syntax. Sounds cool, right? But let's not dig that deep now. Let's first see which storage engines MySQL provides and their benefits and drawbacks.

MyISAM Storage Engine

Based on the older ISAM storage engine, MyISAM is currently the default storage engine in MySQL. So if you create a table without specifying a storage engine, the table uses MyISAM. MySQL databases are generally directories, and each MyISAM table is stored in 3 files -
.frm => the table format
.MYD => table data
.MYI => table indexes

Points to Remember:

  • The tables are machine/OS independent and can be moved by simply copying the 3 files to a MySQL installation on another machine.

  • Limit of 2^32 rows which can be increased to (2^32)^2.

  • Maximum number of indexes per table = 64, which can be increased up to 128 by recompiling.

  • Maximum number of columns per index = 16.

  • Blob and TEXT columns can be indexed.

  • Table size is almost unlimited (256TB). However, the table gives a "Table is full" error when it reaches 4 GB. Some parameters (MAX_ROWS and AVG_ROW_LENGTH) in the table definition then need to be changed to make the table expandable.

  • Transactions are not supported

  • Fulltext indexing and searching is supported though it is very slow.

  • Granularity of locking is TABLE. So if an insert and a select arrive one after another, the insert locks the complete table and the select has to wait until the lock is released.

  • The tables do get corrupted sometimes; to recover one, you need to run the "REPAIR TABLE" command.

  • Does not support foreign keys.
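
The 4 GB and 256TB figures in the list above can be related with a little arithmetic. The sketch below assumes (the list does not say so) that the ceiling comes from the width of the pointer MyISAM uses to address rows in the .MYD file: a 4-byte pointer reaches 2^32 bytes and a 6-byte pointer reaches 2^48 bytes. The helper name is made up for illustration.

```python
def max_data_file_bytes(pointer_bytes):
    """Largest data file addressable with a row pointer of the given width."""
    return 2 ** (8 * pointer_bytes)

GIB = 2 ** 30
TIB = 2 ** 40

# Assumption: a 4-byte pointer is what produces the "Table is full" error at 4 GB.
four_byte_limit_gib = max_data_file_bytes(4) // GIB

# Assumption: a 6-byte pointer is what gives the 256TB ceiling.
six_byte_limit_tib = max_data_file_bytes(6) // TIB

# Row-count limits quoted in the list above.
default_row_limit = 2 ** 32
raised_row_limit = (2 ** 32) ** 2

print(four_byte_limit_gib, "GiB with a 4-byte pointer")
print(six_byte_limit_tib, "TiB with a 6-byte pointer")
print(default_row_limit, "rows by default,", raised_row_limit, "rows at most")
```

Seen this way, changing MAX_ROWS and AVG_ROW_LENGTH simply tells MySQL how much data the table must be able to address, so that it can pick a wide enough pointer.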

InnoDB Storage Engine

InnoDB provides transactions and relational database support in MySQL: it supports foreign keys, commit, rollback and crash recovery. InnoDB creates two log files, "ib_logfile0" and "ib_logfile1", and a data file, "ibdata1", in the MySQL data directory, where it stores its tables. You can specify the path of the data file, its initial and maximum size, and the number of data files to create. Placing different data files on different disks can improve I/O throughput.

The table definitions are stored in the database directory with a .frm extension, whereas the data is stored in the "ibdata1" tablespace. InnoDB stores the row data in a "clustered index". Accessing a row through the clustered index is fast because the row data is on the same page where the index search leads. If a table is large, the clustered index architecture often saves a disk I/O.

innodb_data_home_dir = /ibdata
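
The number of data files and their initial/maximum sizes are set with the innodb_data_file_path option, which sits next to innodb_data_home_dir in the server configuration. As a rough sketch, the small Python helper below assembles such a line in the file_name:size[:autoextend[:max:max_size]] form described in the MySQL manual, with several files separated by semicolons. The file names and sizes are purely illustrative, and the helper functions are hypothetical.

```python
def data_file_spec(name, size, autoextend=False, max_size=None):
    """Build one entry: file_name:size[:autoextend[:max:max_size]]."""
    spec = f"{name}:{size}"
    if autoextend:
        spec += ":autoextend"
        if max_size is not None:
            spec += f":max:{max_size}"
    return spec

def data_file_path_line(specs):
    """Join the entries with ';' into a single configuration line."""
    return "innodb_data_file_path = " + ";".join(specs)

# Illustrative values only: two data files, the last one allowed to grow.
config_line = data_file_path_line([
    data_file_spec("ibdata1", "50M"),
    data_file_spec("ibdata2", "50M", autoextend=True, max_size="500M"),
])
print(config_line)
```

Only the last data file in the list may carry the autoextend attribute, which is why the sketch puts it on ibdata2.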

Points to remember:

  • Moving databases/tables is not easy. You need to take a dump and import it into the other MySQL installation.

  • Table cannot contain more than 1000 columns

  • Although InnoDB supports row sizes larger than 65535 internally, you cannot define a row containing VARCHAR columns with a combined size larger than 65535.

  • Combined size of the InnoDB log files must be less than 4GB.

  • Minimum tablespace size is 10MB and maximum tablespace size is 64TB. This is also the maximum size of a table.

  • Does not support FULLTEXT indexes.

  • InnoDB does not keep an internal count of rows in a table. So to process a select count(*) from tablename, InnoDB must scan the complete index which may take some time.

  • DELETE FROM tablename does not recreate the table; it deletes the rows one by one.

  • Supports foreign key constraints and transactions (commit and rollback).

  • Has row-level locking and also provides consistent non-locking reads in SELECT statements, which gives better multi-user concurrency and performance.

Memory/Heap Storage Engine

As the name suggests, the tables are created in memory (RAM). Since the table data is stored in memory, it is lost if the MySQL server is shut down or crashes, but the tables continue to exist because each table definition is a .frm file stored in the database directory. After a MySQL restart, the tables are therefore empty. By default the indexes on these tables are HASH indexes, which makes access to these tables extremely fast. You can create temporary HEAP tables for high-frequency reads and writes.

Points to remember:

  • Can have 32 indexes per table and 16 columns per index.

  • Type of indexing can be decided to be either HASH or BTREE.

  • Using HASH indexes can lead to slower updates and deletes on the table if the degree of key duplication is high.

  • Cannot contain BLOB or TEXT columns

  • The server needs sufficient memory to hold all MEMORY tables that are in use at the same time.

  • To populate a MEMORY table when the MySQL server starts, the --init-file option can be used.

  • It has table-level locking.

  • Memory can be freed by using either the TRUNCATE TABLE or the DROP TABLE command.

Merge Storage Engine

Using the MERGE storage engine, we can make a collection of identical MyISAM tables look and work like one. Each MERGE table is made up of two files:
.frm - table definition/format
.MRG - names of tables that should be used as one
MERGE tables support SELECT, DELETE, UPDATE and INSERT. Dropping a MERGE table drops only the merge specification; the underlying tables are unaffected.

Points to remember:

  • A MERGE table cannot enforce uniqueness over the underlying tables.

  • A MERGE table can be used to split big tables into parts and refer to them through the MERGE table name.

  • A MERGE table does not have its own indexes; it uses the indexes of its underlying tables.

  • You cannot search MERGE tables using FULLTEXT indexes. FULLTEXT indexes can be created on the underlying MyISAM tables, but they cannot be used through the MERGE table.

  • Merge tables use more file descriptors. If 10 clients are using a MERGE table that maps to 10 tables, the server uses (10 × 10) + 10 file descriptors.

  • Key reads are slower, since on performing a SELECT the MERGE storage engine has to issue a call to each of the underlying tables one by one.

  • INSERTs can be directed to either the first or the last table using the INSERT_METHOD parameter during MERGE table creation.

  • REPLACE does not work

  • When you create a MERGE table, there is no check to ensure that the underlying tables exist and have identical structures.

  • There is a limit of 2^32 rows to a MERGE table.
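
The file descriptor figure in the list above is easy to reproduce. The sketch below assumes the usual reading of that formula: every client opens its own descriptor for each underlying data file, while the index file descriptors (one per underlying table) are shared among the clients. The function name is made up for illustration.

```python
def merge_file_descriptors(clients, underlying_tables):
    """Descriptors needed when `clients` clients use one MERGE table."""
    # Assumption: one data-file descriptor per client per underlying table.
    data_descriptors = clients * underlying_tables
    # Assumption: one shared index-file descriptor per underlying table.
    index_descriptors = underlying_tables
    return data_descriptors + index_descriptors

# The case from the list: 10 clients, MERGE table mapped to 10 tables.
print(merge_file_descriptors(10, 10))
```

The count grows with the product of clients and tables, so a MERGE table over many parts on a busy server can hit the operating system's open-files limit sooner than expected.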

BDB(BerkeleyDB) Storage Engine

Sleepycat's Berkeley DB is a database engine that provides developers with fast, reliable, local persistence with zero administration. MySQL provides the BDB transactional storage engine on top of it. You need to have BDB installed on your system and compile MySQL with BDB support to use the BDB storage engine. Each BDB table is stored in two files:
.frm -> table definition/format
.db -> contains table data and indexes

Points to remember:

  • BDB tables can have up to 31 indexes per table and 16 columns per index.

  • MySQL needs a primary key for each BDB table. If not provided, MySQL would create a hidden primary key and use it.

  • If all columns that you access in a BDB table are part of the same index or part of the primary key, MySQL can execute the query without having to access the actual row.

  • Sequential scanning is slower for BDB tables than for MyISAM tables because the data in BDB tables is stored in B-trees and not in a separate data file.

  • There are often holes in the BDB table to allow you to insert new rows in the middle of the index tree. This makes BDB tables somewhat larger than MyISAM tables.

  • SELECT COUNT(*) FROM tablename is slow for BDB tables, because no row count is maintained in the table.

  • Internal locking in BDB tables is done at the page level.

  • Opening many BDB tables at the same time may be quite slow.

  • Each BDB table stores in its .db file the path to the file as it was created, so it is not possible to move BDB table files from one database directory to another. To move a BDB table, a mysqldump and import are required.

FEDERATED Storage Engine

The FEDERATED storage engine can be used to access tables of remote databases as if they were local. When a FEDERATED table is created, a .frm file is created in the database directory. This file contains just the table definition; the actual table remains in the remote MySQL database.
The flow of information for FEDERATED tables is as follows:

SQL calls issued locally
MySQL handler API (data in handler format)
MySQL client API (data converted to SQL calls)
Remote database -> MySQL client API
Convert result sets (if any) to handler format
Handler API -> Result rows or rows-affected count to local

The structure of the FEDERATED table must be exactly the same as the remote table structure, except that the ENGINE must be FEDERATED and the CONNECTION parameter must contain the connection string that tells the FEDERATED engine how to connect to the remote MySQL server and access the table there.

connection string format:
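
As a sketch, assuming the URL-style form given in the MySQL manual, scheme://user_name[:password]@host_name[:port_num]/db_name/tbl_name with mysql as the scheme, a connection string could be assembled like this. The helper function is hypothetical and the user, host, database and table names are illustrative only.

```python
def federated_connection(user, host, database, table, password=None, port=None):
    """Build mysql://user[:password]@host[:port]/database/table."""
    credentials = user if password is None else f"{user}:{password}"
    endpoint = host if port is None else f"{host}:{port}"
    return f"mysql://{credentials}@{endpoint}/{database}/{table}"

# Illustrative values: remote server on a non-default port, no password.
connection = federated_connection(
    "fed_user", "remote_host", "federated", "test_table", port=9306
)
print(connection)
```

The resulting string is what goes into the CONNECTION parameter of the FEDERATED table definition.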

Points to remember:

  • The remote table that a FEDERATED table points to must exist before you try to access the table through the FEDERATED table.

  • It is possible for one FEDERATED table to point to another, but you must be careful not to create a loop.

  • Transactions are not supported

  • There is no way for the FEDERATED engine to know if the remote table has changed.

  • The FEDERATED storage engine supports SELECT, INSERT, UPDATE, DELETE, and indexes. It does not support ALTER TABLE, or any Data Definition Language statements other than DROP TABLE. The DROP TABLE statement would drop only the local table and not the remote table.

  • FEDERATED tables do not work with the query cache.

ARCHIVE Storage Engine

Used for storing large amounts of data without indexes in a very small footprint. When an ARCHIVE table is created, the following files are created in the database directory:
.frm -> table definition
.ARZ -> DATA file
The engine uses zlib lossless data compression.

An Archive table is almost 75% smaller than the equivalent MyISAM table and almost 83% smaller than the equivalent table in the transactional InnoDB engine. Even after the MyISAM table is compressed, the Archive table is still 7% smaller. Insert operations on an Archive table are faster than on a MyISAM table, and a table scan on an Archive table is also found to be faster than on a MyISAM table.
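
To put these percentages side by side, here is a small worked calculation. It is only a sketch: it takes the rounded percentages above at face value and normalises an uncompressed MyISAM table to 100 units, so the results are rough ratios rather than measurements.

```python
myisam = 100.0  # normalised size of the uncompressed MyISAM table

# Archive is about 75% smaller than MyISAM.
archive = myisam * (1 - 0.75)

# Archive is about 83% smaller than InnoDB, so archive = 17% of InnoDB.
innodb = archive / (1 - 0.83)

# Archive is about 7% smaller than compressed MyISAM.
compressed_myisam = archive / (1 - 0.07)

for name, size in [
    ("InnoDB", innodb),
    ("MyISAM", myisam),
    ("compressed MyISAM", compressed_myisam),
    ("Archive", archive),
]:
    print(f"{name}: {size:.1f} units")
```

On these numbers the same data would take roughly six times as much space in InnoDB as in an Archive table.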

Points to remember:

  • Supports only INSERTS and SELECTS

  • Does NOT support DELETE, REPLACE and UPDATE

  • Has row level locking

  • Rows are compressed as they are inserted.

  • A SELECT statement performs a complete table scan and uncompresses rows on demand. There is no ROW cache.

CSV Storage Engine

Stores data in text files using comma-separated values format. When a table is created, 2 files are created in the database directory:
.frm -> table definition
.CSV -> data file. A normal text file with comma separated values.

This engine does not support indexing. The data file (.CSV) can be read by spreadsheet applications such as OpenOffice, MS Office or StarOffice.

In addition to these, there are two other storage engines:

EXAMPLE Storage engine: It is a stub engine and does nothing. You cannot do anything with this engine except create tables. Tables are created as .frm files in the database directory. No data file is created and no data can be stored using this engine.
BLACKHOLE Storage engine: It acts as a black hole, accepts data and throws it away. It does not store anything. Selects always return an empty result set.

This summarizes the mysql storage engines. For details the following url can be referred.

The following link can be used to select a storage engine.

Next ... let's see what I can write


Anonymous said...

A few notes on archive:

1) Supports interlaced bulk inserts (so no locking for these).
2) For high concurrent insert it outperforms all other engines (aka logging purposes)
3) Does support blobs.
4) Does support ORDER BY (what gave you the impression that it did not?)
5) Rows are compressed in batch operations.

Your information for federated is a bit out of date as well.



Jayant Kumar said...

Thanks for pointing these out...

Will update the blog entry...


Anonymous said...

"Fulltext indexing and searching is supported though it is very slow."
this is so not true!

Jayant Kumar said...

Check out the fulltext search benchmarking results in my previous blogs...

manoj said...

This is very good information, explained with a comparison of table types.

Jayant Kumar said...

thanks manoj....

Imran M said...

Very useful overview. Helped me make the right architecture choice!

Anonymous said...

Thanks for the info, I'm a little confused about not being able to delete data from the archive storage engine.

I am trying to test things out and inserting a lot of data, but if I make a mistake how can I delete the data I have just added? Do I have to change the data type and then delete then change back?