欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  科技

HPvertica收集表的统计信息方法介绍

程序员文章站 2022-04-22 23:49:59
一.通过PROJECTION_COLUMNS的statistics_type查看表的列的统计信息类型,和收集时间 SELECT projection_name,projection_colu...

一.通过PROJECTION_COLUMNS的statistics_type查看表的列的统计信息类型,和收集时间

SELECT projection_name,projection_column_name,

statistics_type,statistics_updated_timestamp,

column_position

FROM PROJECTION_COLUMNS

where table_schema='FDL' and table_name = 'FDL_T01_XXX';

--statistics_type最好为full

二.通过PROJECTIONS的HAS_STATISTICS判断表是否有统计信息

ELECT projection_basename,projection_name

segment_expression,segment_range,verified_fault_tolerance,

HAS_STATISTICS,is_segmented

FROM PROJECTIONS

where projection_schema='FDL' and projection_basename='FDL_T01_XXX';

--同样可以查看分段的类型,分布

三.通过PROJECTION_STORAGE的row_count,used_bytes等统计信息,判断统计信息是否准确

select NODE_NAME,projection_name,projection_column_count,

row_count,used_bytes

from PROJECTION_STORAGE

where ANCHOR_TABLE_SCHEMA='FDL' and ANCHOR_TABLE_NAME='FDL_T01_XXX'

order by projection_name,node_name;

四.收集单表统计信息

select ANALYZE_STATISTICS('FDL.FDL_T01_XXX');

五.批量收集表的统计信息

select 'select ANALYZE_STATISTICS('''||(table_schema||'.'||table_name)||''');' from tables where table_schema='FDL' order by table_name;

注1.关于统计信息

Number of rows in the table

Number of distinct values of each column (cardinality)

Minimum/maximum values of each column

Ahistogramof the distribution of values in each column

Disk space that the column occupie

注2.关于执行计划优化器

The access path with the fewest expected I/O operations and lowest CPU, memory, and network usage

Multiple eligibleprojectionsto answer the query

Join types of differing algorithms (hash join/merge join or group by hash/group by pipelined)

The order in which to perform joins

Query predicates

Data re-distribution algorithms (broadcast and re-segmentation) across nodes in the cluster