尝试驾驭Archive存储引擎
截止作者撰写本文的时候,MySQL5.0还仅在Max版本中支持Archive存储引擎。想了解您的MySQL支持何种存储引擎,您可以在MySQL客户端中输入SHOW ENGINES命令。
mysql> show engines;
+------------+---------+----------------------------------------------------------------+
| Engine | Support | Comment |
+------------+---------+----------------------------------------------------------------+
| MyISAM | YES | Default engine as of MySQL 3.23 with great performance |
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables |
| HEAP | YES | Alias for MEMORY |
| MERGE | YES | Collection of identical MyISAM tables |
| MRG_MYISAM | YES | Alias for MERGE |
| ISAM | NO | Obsolete storage engine, now replaced by MyISAM |
| MRG_ISAM | NO | Obsolete storage engine, now replaced by MERGE |
| InnoDB | DEFAULT | Supports transactions, row-level locking, and foreign keys |
| INNOBASE | YES | Alias for INNODB |
| BDB | YES | Supports transactions and page-level locking |
| BERKELEYDB | YES | Alias for BDB |
| NDBCLUSTER | NO | Clustered, fault-tolerant, memory-based tables |
| NDB | NO | Alias for NDBCLUSTER |
| EXAMPLE | NO | Example storage engine |
| ARCHIVE | YES | Archive storage engine |
| CSV | NO | CSV storage engine |
| FEDERATED | YES | Federated MySQL storage engine |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) |
+------------+---------+----------------------------------------------------------------+
为了测试Archive引擎压缩数据的效果,我们先以一个包含10万行数据的表为数据源,创建几种不同数据引擎的表,看看存储引擎的表现。先声明:以下的性能测试均是在一台拥有2GHz奔腾M处理器和1GB内存的Dell笔记本电脑上,运行WindowsXP系统和MySQL 5.0.10 β版。
mysql> create table test_myisam engine=myisam as select * from client_transaction_hist;
Query OK, 112050 rows affected (1.06 sec)
Records: 112050 Duplicates: 0 Warnings: 0
mysql> create table test_innodb engine=innodb as select * from client_transaction_hist;
Query OK, 112050 rows affected (3.72 sec)
Records: 112050 Duplicates: 0 Warnings: 0
mysql> create table test_archive engine=archive as select * from client_transaction_hist;
Query OK, 112050 rows affected (1.92 sec)
Records: 112050 Duplicates: 0 Warnings: 0
mysql> SELECT table_name table_name,
-> engine,
-> ROUND(data_length/1024/1024,2) total_size_mb,
-> table_rows
-> FROM information_schema.tables
-> WHERE table_schema = 'gim' and
-> table_name like 'test%'
-> ORDER BY 3;
+--------------+---------+---------------+------------+
| table_name | engine | total_size_mb | table_rows |
+--------------+---------+---------------+------------+
| test_archive | ARCHIVE | 1.64 | 112050 |
| test_myisam | MyISAM | 6.46 | 112050 |
| test_innodb | InnoDB | 9.52 | 112050 |
+--------------+---------+---------------+------------+
可以看到,Archive表比MyISAM表要小大约75%,比支持事务处理的InnoDB表小大约83%。接下来再让我们将MyISAM表压缩,比较看看:
C:\Program Files\MySQL\MySQL Server 5.0\data\gim>..\..\bin\myisampack test_myisam.MYI
Compressing test_myisam.MYD: (112050 records)
- Calculating statistics
- Compressing file
67.76%
+--------------+---------+---------------+------------+
| table_name | engine | total_size_mb | table_rows |
+--------------+---------+---------------+------------+
| test_archive | ARCHIVE | 1.64 | 112050 |
| test_myisam | MyISAM | 2.08 | 112050 |
| test_innodb | InnoDB | 9.52 | 112050 |
+--------------+---------+---------------+------------+
即便是启用了压缩之后,Archive表依然比MyISAM表小约7%。那么执行INSERT插入的速度呢?前面的例子使用CREATE TABLE … AS SELECT …语句向Archive表中插入数据时,性能表现不如MyISAM表好。但这是数据少的情况,当数据量非常大的时候Archive的插入性能表现会较MyISAM为佳。在接下来的例子里,我们要用一个存储过程插入100万行记录:
mysql> create table insert_test (c1 int, c2 varchar(20)) engine=myisam;
Query OK, 0 rows affected (0.00 sec)
mysql> drop procedure test_insert;
Query OK, 0 rows affected (0.00 sec)
mysql> delimiter //
mysql> create procedure test_insert()
-> begin
-> declare v_ctr tinyint;
-> set v_ctr = 0;
-> while v_ctr <> do
-> insert into insert_test values (1,'testing insert');
-> set v_ctr = v_ctr + 1;
-> end while;
-> end
-> //
Query OK, 0 rows affected (0.00 sec)
mysql> delimiter ;
mysql> call test_insert();
Query OK, 1 row affected (33.06 sec)
mysql> truncate table insert_test;
Query OK, 0 rows affected (0.01 sec)
mysql> alter table insert_test engine=archive;
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
mysql> call test_insert();
Query OK, 1 row affected (21.42 sec)
正如你所看到的,在这个INSERT测试例子中,Archive引擎比MyISAM更快。在某些情况下,使用INSERT DELAYED(延迟写入)选项能让Archive引擎的插入速度更快。启用DELAYED选项时,Archive引擎会使用一个压缩缓冲区来保存插入的数据,这当然也能起到提高性能的作用。还有一种方法是直接对标准MyISAM表执行ALTER TABLE语句转换成Archive表:
mysql> alter table myisam_insert engine=archive;
Query OK, 3000000 rows affected, 0 warning (8.84 sec)
Records: 3000000 Duplicates: 0 Warnings: 0
上面的测试显示,将拥有300万行的MyISAM表转换成Archive表只花了不到9秒的时间,太棒了!
读取性能又如何呢?Archive和其他MySQL存储引擎相比较,有何不同?Archive表的第一个弊端是它不支持索引,所以任何SELECT指令都会扫描整个表。MySQL团队正在测试Archive表的索引支持,可能在更高的版本中会发布。但是到目前为止,唯一能帮助提高SELECT执行速度的就是MySQL查询缓存了。
我们来测试一下三种不同存储引擎的数据表(刚超过10万行)执行完全扫描的性能表现:
mysql> select count(*) from test_myisam where client_id = 50;
+----------+
| count(*) |
+----------+
| 24 |
+----------+
1 row in set (0.25 sec)
mysql> select count(*) from test_innodb where client_id = 50;
+----------+
| count(*) |
+----------+
| 24 |
+----------+
1 row in set (0.59 sec)
mysql> select count(*) from test_archive where client_id = 50;
+----------+
| count(*) |
+----------+
| 24 |
+----------+
1 row in set (0.41 sec)
Archive引擎似乎表现还不错,接下来我们测试一下扫描包含300万行数据的表:
mysql> select count(*) from myisam_insert where c1 = 1;
+----------+
| count(*) |
+----------+
| 3000000 |
+----------+
1 row in set (1.05 sec)
mysql> select count(*) from archive_insert where c1 = 1;
+----------+
| count(*) |
+----------+
| 3000000 |
+----------+
1 row in set (2.20 sec)
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*) from archive_insert where c1 = 1;
+----------+
| count(*) |
+----------+
| 3000000 |
+----------+
1 row in set (0.00 sec)
mysql> show status like 'qcache_hits';
+-------------------------+----------+
| Variable_name | Value |
+-------------------------+----------+
| Qcache_hits | 1 |
+-------------------------+----------+
在上面的例子中,Archive表确实比MySQL表慢一些,但是它也在大约2秒的时间里完成了扫描。另外这个例子也表明,Archive表的扫描结果被放入了查询缓冲区。所以相同的查询请求几乎是立即得到回复(因为缓存命中)。
读取数据的时候,Archive引擎启用了一种称为“快照”的技术,这一点和InnoDB很相似。所以大量的并发读取操作不会阻止写入操作的执行。Archive表使用行锁定技术,这一点也和InnoDB一样。
Archive表的性能是否可能超过MyISAM?答案是肯定的。根据MySQL工程师的资料,当表内的数据达到1.5GB这个量级,CPU又比较快的时候,Archive表的执行性能就会超越MyISAM表。因为这个时候,CPU会取代I/O子系统成为性能瓶颈。别忘了Archive表比其他任何类型的表执行的物理I/O操作都要少。
记住一点,Archive存储引擎设计的出发点不仅是追求SQL查询时卓越的性能,而且更重要的是为备份提供便利。当我们需要备份历史数据的时候,Archive引擎能在实质上减少需要转移的数据量。
较小的空间占用也能在你移植MySQL数据的时候发挥作用。当你需要把数据从一台MySQL服务器转移到另一台的时候,Archive表可以方便地移植到新的MySQL环境,你只需将保存Archive表的底层文件复制过去就可以了。