11g引入了大量compress相关的特性,其中之一便是dbms_compression包;GET_COMPRESSION_RATIO函数可以帮助我们了解压缩某个表后各种可能的影响。换而言之,这个函数可以让我们在具体实施表压缩技术或者测试前,对于压缩后的效果能有一个基本的印象。该包在11gr2中被首次引入,故而使用之前版本的包括11gr1都无缘得用。其次除OLTP压缩模式之外的柱形混合压缩只能在基于Exdata存储的表空间上实现。使用DBMS_COMPRESSION包获取的相关压缩信息是十分准确的,因为在评估过程中Oracle通过实际采样并建立模型表以尽可能还原逼真的数据。 我们可以通过trace来分析其评估过程中的具体操作,可以分成2步: 1. 建立原表的样本表,其采样值基于原表的大小:
SQL> create table samp_dss_nation tablespace SCRATCH as select * from dss_nation sample block (50);

Table created.
2. 基于采用表建立对应压缩类型的模型表:
SQL> create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation;
create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation
*
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage
可以看到在实际建立过程中Oracle将拒绝在非Exdata存储的表空间上建立该类柱形混合压缩(包括:COMP_FOR_QUERY_HIGH,COMP_FOR_QUERY_LOW,COMP_FOR_ARCHIVE_HIGH,CO MP_FOR_ARCHIVE_LOW)。但DBMS_COMPRESSION在进行评估时可以绕过Oracle对于该类操作的LOCK. 要在没有Exdata存储设备的情况下使用dbms_compression包评测OLTP压缩模式外的柱状混合压缩模式时 (hybrid columnar compression is only supported in tablespaces residing on Exadata storage),首先需要打上patch 8896202:
[[email protected] admin]$ /s01/dbhome_1/OPatch/opatch lsinventory
Invoking OPatch 11.1.0.6.6

Oracle Interim Patch Installer version 11.1.0.6.6
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /s01/dbhome_1
Central Inventory : /s01/oraInventory
from           : /etc/oraInst.loc
OPatch version    : 11.1.0.6.6
OUI version       : 11.2.0.1.0
OUI location      : /s01/dbhome_1/oui
Log file location : /s01/dbhome_1/cfgtoollogs/opatch/opatch2010-06-02_23-08-33PM.log

Patch history file: /s01/dbhome_1/cfgtoollogs/opatch/opatch_history.txt

Lsinventory Output file location : /s01/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2010-06-02_23-08-33PM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1):

Oracle Database 11g                                                  11.2.0.1.0
There are 1 products installed in this Oracle Home.

Interim patches (1) :

Patch  8896202      : applied on Wed Jun 02 21:55:44 CST 2010
Unique Patch ID:  11909460
Created on 29 Oct 2009, 15:21:45 hrs US/Pacific
Bugs fixed:
8896202

该patch用以:ENABLE COMPRESSION ADVISOR TO ESTIMATE EXADATA HCC COMPRESSION RATIOS
接着我们还需要运行被修改后的DBMSCOMP包创建SQL,具体操作为:
SQL> @?/rdbms/admin/prvtcmpr.plb

Package created.

Grant succeeded.

Package body created.

No errors.

Package body created.

No errors.

Type body created.

No errors.
SQL> @?/rdbms/admin/dbmscomp.sql

Package created.

Synonym created.

Grant succeeded.

No errors.

DBMS_COMPRESSION包在对表压缩进行评估时,默认表最少数据为1000000行,可能在你的测试库中没有这么多数据,我们可以修改这个下限;

通过将COMP_RATIO_MINROWS常数修改为1后,就可以分析最小为1行的表了:

SQL>create or replace package sys.dbms_compression authid current_user is

  COMP_NOCOMPRESS       CONSTANT NUMBER := 1;
  COMP_FOR_OLTP         CONSTANT NUMBER := 2;
  COMP_FOR_QUERY_HIGH   CONSTANT NUMBER := 4;
  COMP_FOR_QUERY_LOW    CONSTANT NUMBER := 8;
  COMP_FOR_ARCHIVE_HIGH CONSTANT NUMBER := 16;
  COMP_FOR_ARCHIVE_LOW  CONSTANT NUMBER := 32;

  COMP_RATIO_MINROWS CONSTANT NUMBER := 10;
  COMP_RATIO_ALLROWS CONSTANT NUMBER := -1;

  PROCEDURE get_compression_ratio(scratchtbsname IN varchar2,
                                  ownname        IN varchar2,
                                  tabname        IN varchar2,
                                  partname       IN varchar2,
                                  comptype       IN number,
                                  blkcnt_cmp     OUT PLS_INTEGER,
                                  blkcnt_uncmp   OUT PLS_INTEGER,
                                  row_cmp        OUT PLS_INTEGER,
                                  row_uncmp      OUT PLS_INTEGER,
                                  cmp_ratio      OUT NUMBER,
                                  comptype_str   OUT varchar2,
                                  subset_numrows IN number DEFAULT COMP_RATIO_MINROWS);

  function get_compression_type(ownname IN varchar2,
                                tabname IN varchar2,
                                row_id  IN rowid) return number;

  PROCEDURE incremental_compress(ownname         IN dba_objects.owner%type,
                                 tabname         IN dba_objects.object_name%type,
                                 partname        IN dba_objects.subobject_name%type,
                                 colname         IN varchar2,
                                 dump_on         IN number default 0,
                                 autocompress_on IN number default 0,
                                 where_clause    IN varchar2 default '');

end dbms_compression;

Package created.

SQL> alter package dbms_compression compile body;

Package body altered.

接下来我们通过建立一个基于TPC-D的测试的Schema,保证各表上有较多的数据,并且数据有一定的拟真度:

SQL> select table_name,num_rows,blocks from user_tables ;

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
DSS_SUPPLIER                        20000        496
DSS_PART                           400000       7552
DSS_REGION                              5          5
DSS_PARTSUPP                      1600000      29349
DSS_LINEITEM                     12000000     221376
DSS_ORDER                         3000000      48601
DSS_CUSTOMER                       300000       6922
DSS_NATION                             25          5

现在可以进行压缩评估了,我们针对测试模型Schema编辑以下匿名块并运行

SQL> set serveroutput on;
SQL> declare
  cmp_blk_cnt   binary_integer;
  uncmp_blk_cnt binary_integer;
  cmp_rows      binary_integer;
  uncmp_rows    binary_integer;
  cmp_ratio     number;
  cmp_typ       varchar2(100);
BEGIN
  for i in (SELECT TABLE_NAME
              from dba_tables
             where compression = 'DISABLED'
               and owner = 'MACLEAN' and num_rows>1000000) loop
    for j in 1 .. 5 loop
      dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH',
                                             ownname        => 'MACLEAN',
                                             tabname        => i.table_name,
                                             partname       => NULL,
                                             comptype       => power(2, j),
                                             blkcnt_cmp     => cmp_blk_cnt,
                                             blkcnt_uncmp   => uncmp_blk_cnt,
                                             row_cmp        => cmp_rows,
                                             row_uncmp      => uncmp_rows,
                                             cmp_ratio      => cmp_ratio,
                                             comptype_str   => cmp_typ);
      dbms_output.put_line(i.table_name || '--' || 'compress_type is ' ||
                           cmp_typ || ' ratio :' ||
                           to_char(cmp_ratio, '99.9') || '%');

    end loop;
  end loop;
end;
/
DSS_ORDER--compress_type is "Compress For OLTP" ratio :  1.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Query High" ratio :  2.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Query Low" ratio :  1.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Archive High" ratio :  2.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Archive Low" ratio :  2.7%
DSS_PARTSUPP--compress_type is "Compress For OLTP" ratio :   .9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Query High" ratio :  1.8%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Query Low" ratio :  1.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Archive High" ratio :  1.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Archive Low" ratio :  1.8%
DSS_LINEITEM--compress_type is "Compress For OLTP" ratio :  1.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Query High" ratio :  3.5%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Query Low" ratio :  2.3%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Archive High" ratio :  4.3%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Archive Low" ratio :  3.7%

PL/SQL procedure successfully completed.
可以从上述测试看到,"Compress For Archive High"压缩率最高,该类型最适合于数据归档存储,但其算法复杂度高于"Compress For Archive Low",压缩耗时亦随之上升。 总体压缩率都较低,这同TPC-D测试的数据建模有一定关联,我们再使用一组TPC-H的测试数据来模拟压缩:
SQL> conn liu/liu;
Connected.

SQL> select num_rows,blocks,table_name from user_tables;

  NUM_ROWS     BLOCKS TABLE_NAME
---------- ---------- ------------------------------
   3000000      46817 H_ORDER
    300000       6040 H_CUSTOMER
  12000000     221376 H_LINEITEM
        25          5 H_NATION
    400000       7552 H_PART
         5          5 H_REGION
   1600000      17491 H_PARTSUPP
     20000        496 H_SUPPLIER

8 rows selected.

SQL> set serveroutput on;
SQL> declare
  cmp_blk_cnt   binary_integer;
  uncmp_blk_cnt binary_integer;
  cmp_rows      binary_integer;
  uncmp_rows    binary_integer;
  cmp_ratio     number;
  cmp_typ       varchar2(100);
BEGIN
  for i in (SELECT TABLE_NAME
              from dba_tables
             where compression = 'DISABLED'
               and owner = 'LIU' and num_rows>1000000) loop
    for j in 1 .. 5 loop
      dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH',
                                             ownname        => 'LIU',
                                             tabname        => i.table_name,
                                             partname       => NULL,
                                             comptype       => power(2, j),
                                             blkcnt_cmp     => cmp_blk_cnt,
                                             blkcnt_uncmp   => uncmp_blk_cnt,
                                             row_cmp        => cmp_rows,
                                             row_uncmp      => uncmp_rows,
                                             cmp_ratio      => cmp_ratio,
                                             comptype_str   => cmp_typ);
      dbms_output.put_line(i.table_name || '--' || 'compress_type is ' ||
                           cmp_typ || ' ratio :' ||
                           to_char(cmp_ratio, '99.9') || '%');

    end loop;
  end loop;
end;
/
H_ORDER--compress_type is "Compress For OLTP" ratio :  1.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Query High" ratio :  5.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Query Low" ratio :  2.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Archive High" ratio :  7.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Archive Low" ratio :  5.5%
H_PARTSUPP--compress_type is "Compress For OLTP" ratio :   .9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Query High" ratio :  5.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Query Low" ratio :  2.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Archive High" ratio :  7.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Archive Low" ratio :  5.3%
H_LINEITEM--compress_type is "Compress For OLTP" ratio :  1.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Query High" ratio :  5.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Query Low" ratio :  3.0%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Archive High" ratio :  7.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Archive Low" ratio :  5.6%

PL/SQL procedure successfully completed.
可以看到相比TPC-D的测试用数据,TPC-H建立的数据更具可压缩性。 PS: TPC-D represents a broad range of decision support (DS) applications that require complex, long running queries against large complex data structures. Real-world business questions were written against this model, resulting in 17 complex queries. The TPC Benchmark