
GPDB Administrator Notes (1): Database Objects

程序员文章站 2022-05-22 15:59:08

Database object management

1. Create a database
create database new_dbname;
createdb -h localhost -p 5432 mydb

2. Clone a database

3. List databases
libo=# \l
                List of databases
   Name    |  Owner  | Encoding |  Access privileges
-----------+---------+----------+---------------------
 libo      | gpadmin | UTF8     |
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
(4 rows)
select * from pg_database;
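
Item 2 in the list above names cloning but gives no command. A minimal sketch, assuming a source database called old_dbname (a placeholder, like new_dbname): a new database can be created from an existing one by naming it as the template. No other sessions should be connected to the source database while it is being copied.

```sql
-- Clone old_dbname into new_dbname; both names are placeholders.
CREATE DATABASE new_dbname TEMPLATE old_dbname;
```
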
4. Change database attributes
libo=# alter database libo owner to libo;
ALTER DATABASE
libo=# \l
                List of databases
   Name    |  Owner  | Encoding |  Access privileges
-----------+---------+----------+---------------------
 libo      | libo    | UTF8     |
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
(4 rows)

5. Create a filespace with gpfilespace
[gpadmin@mdw ~]$ gpfilespace -o gpfilespace_config
20140303:10:43:03:012223 gpfilespace:mdw:gpadmin-[INFO]:-
A tablespace requires a file system location to store its database
files. A filespace is a collection of file system locations for all components
in a Greenplum system (primary segment, mirror segment and master instances).
Once a filespace is created, it can be used by one or more tablespaces.


20140303:10:43:03:012223 gpfilespace:mdw:gpadmin-[INFO]:-getting config
Enter a name for this filespace
> libodisk

Checking your configuration:
Your system has 2 hosts with 2 primary and 2 mirror segments per host.
Your system has 1 hosts with 0 primary and 0 mirror segments per host.

Configuring hosts: [sdw2, sdw1]

Please specify 2 locations for the primary segments, one per line:
primary location 1> /home/gpadmin/GPDB/data/d1
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg2 - Directory conflicts with existing datadir
primary location 1> /home/gpadmin/GPDB/data/d1
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg2 - Directory conflicts with existing datadir
primary location 1> /home/gpadmin/GPDB/data/d1
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg2 - Directory conflicts with existing datadir
primary location 1> /home/gpadmin/GPDB/data/d2
[Error] sdw2: /home/gpadmin/GPDB/data/d2/gpseg3 - Directory conflicts with existing datadir
primary location 1>
primary location 1>
primary location 1>
primary location 1>
primary location 1>
primary location 1> /home/gpadmin/GPDB/data/d1/gpseg0
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg0 : No such file or directory

primary location 1> /home/gpadmin/GPDB/data/d1/gpseg2
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg2/gpseg2 - Subdirectory of existing datadir
primary location 1> /home/gpadmin/GPDB/data/d1/
[Error] sdw2: /home/gpadmin/GPDB/data/d1/gpseg2 - Directory conflicts with existing datadir
primary location 1> /home/gpadmin/GPDB/data/d3
[Error] sdw2: /home/gpadmin/GPDB/data/d3 : No such file or directory

primary location 1> /home/gpadmin/GPDB/data/d3
[Error] sdw1: /home/gpadmin/GPDB/data/d3 : No such file or directory

primary location 1> /home/gpadmin/GPDB/data/d3
primary location 2> /home/gpadmin/GPDB/data/d3

Please specify 2 locations for the mirror segments, one per line:
mirror location 1> /home/gpadmin/GPDB/data/m3
mirror location 2> /home/gpadmin/GPDB/data/m3

Configuring hosts: [mdw]

Enter a file system location for the master
master location> /home/gpadmin/GPDB/data/master
20140303:10:51:07:012223 gpfilespace:mdw:gpadmin-[INFO]:-Creating configuration file...
20140303:10:51:07:012223 gpfilespace:mdw:gpadmin-[INFO]:-[created]
20140303:10:51:07:012223 gpfilespace:mdw:gpadmin-[INFO]:-
To add this filespace to the database please run the command:
gpfilespace --config /home/gpadmin/gpfilespace_config

[gpadmin@mdw ~]$ gpfilespace -c gpfilespace_config
20140303:10:51:29:012482 gpfilespace:mdw:gpadmin-[INFO]:-
A tablespace requires a file system location to store its database
files. A filespace is a collection of file system locations for all components
in a Greenplum system (primary segment, mirror segment and master instances).
Once a filespace is created, it can be used by one or more tablespaces.


20140303:10:51:30:012482 gpfilespace:mdw:gpadmin-[INFO]:-getting config
Reading Configuration file: 'gpfilespace_config'
20140303:10:51:30:012482 gpfilespace:mdw:gpadmin-[INFO]:-Performing validation on paths
..............................................................................

20140303:10:51:30:012482 gpfilespace:mdw:gpadmin-[INFO]:-Connecting to database
20140303:10:51:31:012482 gpfilespace:mdw:gpadmin-[INFO]:-Filespace "libodisk" successfully created

Create a tablespace
libo=# create tablespace libospace filespace libodisk;
CREATE TABLESPACE
libo=# grant create on tablespace libospace to libo;
GRANT
libo=# set default_tablespace=libospace;
SET
libo=# create table test (id int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
libo=# drop table test;
DROP TABLE
libo=# create table test (i int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'i' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
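
The session above relies on default_tablespace. As a sketch of an alternative, the tablespace and the distribution key can both be named explicitly in the CREATE TABLE statement, which also avoids the NOTICE about a missing DISTRIBUTED BY clause. The table name test_explicit is hypothetical.

```sql
-- Place the table in libospace and state the distribution key explicitly.
CREATE TABLE test_explicit (id int)
TABLESPACE libospace
DISTRIBUTED BY (id);

-- Check which tablespace each table landed in.
-- pg_tables.tablespace is NULL for tables in the database's default tablespace.
SELECT tablename, tablespace
FROM pg_tables
WHERE tablename IN ('test', 'test_explicit');
```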

List the existing tablespaces and their filespace locations:
SELECT spcname as tblspc, fsname as filespc, fsedbid as seg_dbid, fselocation as datadir
FROM pg_tablespace pgts, pg_filespace pgfs, pg_filespace_entry pgfse
WHERE pgts.spcfsoid=pgfse.fsefsoid AND pgfse.fsefsoid=pgfs.oid ORDER BY tblspc, seg_dbid;
   tblspc   |  filespc  | seg_dbid |                datadir
------------+-----------+----------+----------------------------------------
 libospace  | libodisk  |        1 | /home/gpadmin/GPDB/data/master/gpseg-1
 libospace  | libodisk  |        2 | /home/gpadmin/GPDB/data/d3/gpseg0
 libospace  | libodisk  |        3 | /home/gpadmin/GPDB/data/d3/gpseg1
 libospace  | libodisk  |        4 | /home/gpadmin/GPDB/data/d3/gpseg2
 libospace  | libodisk  |        5 | /home/gpadmin/GPDB/data/d3/gpseg3
 libospace  | libodisk  |        6 | /home/gpadmin/GPDB/data/m3/gpseg0
 libospace  | libodisk  |        7 | /home/gpadmin/GPDB/data/m3/gpseg1
 libospace  | libodisk  |        8 | /home/gpadmin/GPDB/data/m3/gpseg2
 libospace  | libodisk  |        9 | /home/gpadmin/GPDB/data/m3/gpseg3
 pg_default | pg_system |        1 | /home/gpadmin/GPDB/data/gpseg-1
 pg_default | pg_system |        2 | /home/gpadmin/GPDB/data/d1/gpseg0
 pg_default | pg_system |        3 | /home/gpadmin/GPDB/data/d2/gpseg1
 pg_default | pg_system |        4 | /home/gpadmin/GPDB/data/d1/gpseg2
 pg_default | pg_system |        5 | /home/gpadmin/GPDB/data/d2/gpseg3
 pg_default | pg_system |        6 | /home/gpadmin/GPDB/data/m1/gpseg0
 pg_default | pg_system |        7 | /home/gpadmin/GPDB/data/m2/gpseg1
 pg_default | pg_system |        8 | /home/gpadmin/GPDB/data/m1/gpseg2
 pg_default | pg_system |        9 | /home/gpadmin/GPDB/data/m2/gpseg3
 pg_global  | pg_system |        1 | /home/gpadmin/GPDB/data/gpseg-1
 pg_global  | pg_system |        2 | /home/gpadmin/GPDB/data/d1/gpseg0
 pg_global  | pg_system |        3 | /home/gpadmin/GPDB/data/d2/gpseg1
 pg_global  | pg_system |        4 | /home/gpadmin/GPDB/data/d1/gpseg2
 pg_global  | pg_system |        5 | /home/gpadmin/GPDB/data/d2/gpseg3
 pg_global  | pg_system |        6 | /home/gpadmin/GPDB/data/m1/gpseg0
 pg_global  | pg_system |        7 | /home/gpadmin/GPDB/data/m2/gpseg1
 pg_global  | pg_system |        8 | /home/gpadmin/GPDB/data/m1/gpseg2
 pg_global  | pg_system |        9 | /home/gpadmin/GPDB/data/m2/gpseg3
(27 rows)

Show the current schema
libo=# select current_schema();
current_schema
----------------
public
(1 row)
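
current_schema() reports the schema at the front of the search path. A short sketch of how that value changes, assuming a hypothetical schema named myschema:

```sql
-- myschema is a hypothetical name used only for illustration.
CREATE SCHEMA myschema;

-- Put the new schema first in the search path for this session.
SET search_path TO myschema, public;

-- Both statements should now reflect the new setting.
SHOW search_path;
SELECT current_schema();
```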

Declare the distribution key when creating a table
=> CREATE TABLE products (name varchar(40), prod_id integer, supplier_id integer)
DISTRIBUTED BY (prod_id);
=> CREATE TABLE random_stuff (things text, doodads text, etc text)
DISTRIBUTED RANDOMLY;
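
Whether a distribution key spreads rows evenly can be checked after loading data. A sketch, assuming the products table above has been populated: the system column gp_segment_id identifies the segment that stores each row, so grouping by it shows the row count per segment.

```sql
-- Count rows per segment; strongly uneven counts suggest data skew.
SELECT gp_segment_id, count(*) AS row_count
FROM products
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```
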
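Choosing a table storage model
GPDB supports several storage models, which can also be mixed. When you
create a table, several options determine how its data is stored on disk.
This section covers those options and how to pick the best storage model
for your workload.
- Choosing heap or append-only (AO) storage
- Choosing row-oriented or column-oriented storage
- Using compression (AO tables only)
- Checking the compression and distribution of an append-only (AO) table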

Create a column-oriented table
=> CREATE TABLE bar (a int, b text) WITH (appendonly=true, orientation=column)
DISTRIBUTED BY (a);
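
Compression is requested through the same WITH clause. A sketch, assuming a hypothetical table name bar_zlib and that zlib compression is available in the installed GPDB version:

```sql
-- Append-only, column-oriented table compressed with zlib at level 5.
CREATE TABLE bar_zlib (a int, b text)
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (a);
```
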
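Checking the compression and distribution of an AO table
GPDB provides built-in functions for checking the compression ratio and the
distribution of an AO table. Both functions accept either the object ID or
the name of the table as an argument. The table name may need to be
schema-qualified.
The compression ratio is returned as a plain ratio. For example, a returned
value of 3.19, or 3.19:1, means that the uncompressed table is a little more
than three times the size of the compressed table.
The distribution information shows how many rows of the table each instance
stores. For example, on a system with 4 instances whose dbid values range
from 0 to 3, the function returns a result set like the following: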
=# SELECT get_ao_distribution('lineitem_comp');
 get_ao_distribution
---------------------
 (0,7500721)
 (1,7501365)
 (2,7499978)
 (3,7497731)
(4 rows)
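
The compression ratio is queried the same way. A sketch, assuming the same lineitem_comp table and the built-in get_ao_compression_ratio function:

```sql
-- Returns the ratio of uncompressed size to compressed size for the AO table.
SELECT get_ao_compression_ratio('lineitem_comp');
```
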
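Setting compression through a TYPE
A type can carry three compression parameters. For the syntax and the
restrictions on adding these parameters to a type, see the CREATE TYPE
command. The following command uses this compact form to create a
compressed column:
CREATE TABLE t2 (c1 comptype) WITH (APPENDONLY=true, ORIENTATION=column);
Here comptype is defined as: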
CREATE TYPE comptype (
internallength = 4,
input = comptype_in,
output = comptype_out,
alignment = int4,
default = 123,
passedbyvalue,
compresstype="quicklz",
blocksize=65536,
compresslevel=1
);
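
Redistributing table data
To redistribute the data of a table that uses a random distribution policy,
or of a table whose distribution policy is not being changed, use
REORGANIZE=TRUE. This may be necessary to correct data skew, and it is also
needed after new segment resources are added. For example:
ALTER TABLE sales SET WITH (REORGANIZE=TRUE);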
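
REORGANIZE=TRUE keeps the current distribution policy. When the key itself is the cause of the skew, the policy can be changed instead, which redistributes the data as part of the command. A sketch, assuming sales has a column named customer_id (hypothetical):

```sql
-- Switch to a different hash distribution key; customer_id is an assumed column.
ALTER TABLE sales SET DISTRIBUTED BY (customer_id);
```
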
Partitioned table maintenance

Add a partition
ALTER TABLE sales ADD PARTITION
START (date '2009-02-01') INCLUSIVE
END (date '2009-03-01') EXCLUSIVE;

Drop a partition
ALTER TABLE sales DROP PARTITION FOR (RANK(1));

Truncate a partition
ALTER TABLE sales TRUNCATE PARTITION FOR (RANK(1));

Exchange a partition
CREATE TABLE jan08 (LIKE sales) WITH (appendonly=true);
INSERT INTO jan08 SELECT * FROM sales_1_prt_1 ;
ALTER TABLE sales EXCHANGE PARTITION FOR (DATE '2008-01-01') WITH TABLE jan08;

Split a partition
For example, to split a monthly partition into one partition for days 1-15 and another for days 16-31:
ALTER TABLE sales SPLIT PARTITION FOR ('2008-01-01')
AT ('2008-01-16')
INTO (PARTITION jan081to15, PARTITION jan0816to31);
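
After maintenance operations it helps to confirm the resulting layout. A sketch, assuming the pg_partitions catalog view available in the GPDB releases these notes target:

```sql
-- List the partitions of sales with their boundaries and rank.
SELECT partitionboundary, partitiontablename, partitionname,
       partitionlevel, partitionrank
FROM pg_partitions
WHERE tablename = 'sales';
```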