POSTGRESQL的常用索引

索引类型	索引功能
btree	快速的数据查找功能,对于大于,小于,等于, 半模糊, 等查找提供搜索功能
hash	快速的数据查找功能,对于等值,并且字段较长的字段查找有特殊的优化功能
gin	一种倒排索引,对于数组,全文检索,中的相交,包含,大于,左右边的查询有特殊的优化
gist	一种通用的索引入口,对于范围,空间,几何,空间等数据有特殊的查询优化

POSTGRESQL的索引类型本身是比较丰富的,对于不同的查询的方式有不同的对应的索引来进行建立

BTREE索引

BTREE 是各种数据库必备的索引类型,提供等值,不等值,大于小于以及半模糊查询等方式提供数据的搜索

create table t_btree(id int, info text);
insert into t_btree select generate_series(1,10000), md5(random()::text);
explain (analyze,verbose,timing,costs,buffers) select * from t_btree where id=1;

create index idx_t_btree_1 on t_btree using btree (id);
explain (analyze,verbose,timing,costs,buffers) select * from t_btree where id=1;

SELECT
relname AS "表名",
    pg_size_pretty(pg_total_relation_size(relname::regclass)) AS "表总大小",
    pg_size_pretty(pg_relation_size(relname::regclass)) AS "表大小",
    pg_size_pretty(pg_indexes_size(relname::regclass)) AS "索引大小"
FROM pg_stat_all_tables
WHERE relname = 't_btree';

HASH索引

HASH 索引是针对大字段的查询,由于BTREE 字段每行大小最大为 2704 ,所以超出2704 BYTE一行的INDEX 并且是等值运算的,很适合HASH 索引

1、建表，插入数据
create table t_hash (id int,info text);
insert into t_hash select generate_series(1,100),repeat(md5(random()::text),10000);

2、btree索引
create index idx_t_hash_1 on t_hash using btree (info);

3、hash索引
create index idx_t_hash_1 on t_hash using hash (info);
SELECT pg_size_pretty(pg_relation_size('idx_t_hash_1'));

例：假设我们正在设计一个简单的用户管理系统，其中用户表 users 包含用户的 id（唯一标识符）、username（用户名）和 email（电子邮件地址）。我们预计会有大量的查询操作是基于 email 字段进行的，并且 email 字段具有唯一性约束。因此，可以在 email 字段上使用哈希索引来优化这些查询。

1.建表
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL UNIQUE
);

2.造数
DO $$
DECLARE
    i INT := 1;
BEGIN
    WHILE i <= 30000 LOOP
        INSERT INTO users1 (username, email) VALUES (
            'user' || i,
            'user' || i || '@example123456789example.com'
        );
        i := i + 1;
    END LOOP;
END $$;

3.创建btree索引
CREATE INDEX idx_email ON users using btree(email);
SELECT pg_size_pretty(pg_relation_size('idx_email'));
explain (analyze,verbose,timing,costs,buffers) SELECT * FROM users WHERE email = 'user999@example123456789example.com';

4.删除btree索引
drop index idx_email;

5.创建hash索引
CREATE INDEX idx_email ON users using hash(email);
SELECT pg_size_pretty(pg_relation_size('idx_email'));

explain (analyze,verbose,timing,costs,buffers) SELECT * FROM users WHERE email = 'user999@example123456789example.com';

大字段进行等值查询时，使用HASH 索引速度更快

hash 索引的缺点，创建索引的情况下，是需要CPU 进行计算的，耗费CPU大量资源

hash 索引主要的查询方式和对象是等值计算 select * from table where id = ‘223’; 用hash索引

GIN索引

GIN 索引对于多列值的查找有特殊的功能

create table t_gin3 (id int, c1 int, c2 int, c3 int, c4 int, c5 int, c6 int, c7 int, c8 int, c9 int);
insert into t_gin3 select generate_series(1,10000), random()*10, random()*20, random()*30, random()*40, random()*50, random()*60, random()*70, random()*80, random()*90;
explain (analyze,verbose,timing,costs,buffers) select * from t_gin3 where c5 = 5 or c6 =9 or c7=19 or c4 = 3;

CREATE EXTENSION btree_gin;
create index idx_t_gin3_1 on t_gin3 using gin (c1,c2,c3,c4,c5,c6,c7,c8,c9);

SELECT
relname AS "表名",
    pg_size_pretty(pg_total_relation_size(relname::regclass)) AS "表总大小",
    pg_size_pretty(pg_relation_size(relname::regclass)) AS "表大小",
    pg_size_pretty(pg_indexes_size(relname::regclass)) AS "索引大小"
FROM pg_stat_all_tables
WHERE relname = 't_gin3';

explain (analyze,verbose,timing,costs,buffers) select * from t_gin3 where c5 = 5 or c6 =9 or c7=19 or c4 = 3;

联合索引

条件中经常组合出现的列可以考虑建立联合索引。

假设我们有一个订单表 orders，包含以下字段：

order_id：订单编号

customer_id：客户编号

order_date：订单日期

amount：订单金额

1、建表

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    amount DECIMAL(10, 2) NOT NULL
);

2、插入30w条数据

INSERT INTO orders (customer_id, order_date, amount)
SELECT
    (ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10000) + 1 AS customer_id,
    CURRENT_DATE - INTERVAL '1 year' * (ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10) AS order_date,
    (ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10000) / 100.0 AS amount
FROM generate_series(1, 300000);

3、创建单个索引

CREATE INDEX idx_customer_id ON orders (customer_id);
CREATE INDEX idx_order_date ON orders (order_date);

4、查询语句

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 1234 AND order_date BETWEEN '2024-01-01' AND '2024-12-31';

5、索引大小
SELECT pg_size_pretty(pg_relation_size('idx_customer_id'));
SELECT pg_size_pretty(pg_relation_size('idx_order_date'));

6、删除单个索引
drop index idx_customer_id;
drop index idx_order_date;

7、创建联合索引

CREATE INDEX idx_customer_id_order_date ON orders (customer_id, order_date);

8、查询语句
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 1234 AND order_date BETWEEN '2024-01-01' AND '2024-12-31';

9、索引大小

SELECT pg_size_pretty(pg_relation_size('idx_customer_id_order_date'));

性能讲解

单个索引：虽然可以加速对 customer_id 的过滤，但对 order_date 的过滤仍需要额外的扫描和过滤操作，导致查询效率较低。

联合索引：能够同时利用 customer_id 和 order_date 的过滤条件，直接定位到符合条件的数据，减少了扫描范围，显著提高了查询性能。

同时，联合索引占用更少的空间

注意事项

联合索引的效率取决于查询条件是否覆盖索引的所有列。如果查询条件中缺少联合索引的先导列（如 customer_id），则联合索引可能无法被充分利用。

创建联合索引时，应根据查询模式和数据分布合理选择列的顺序。

索引推荐（openGauss）

单query索引推荐功能基于查询语句的语义信息和数据库的统计信息，对用户输入的单条查询语句生成推荐的索引。单query索引推荐功能支持用户在数据库中直接进行操作，本功能涉及的函数接口如下：

gs_index_advise

功能描述：针对单条查询语句生成推荐索引。

参数说明：SQL语句字符串。参数类型是文本型，如果参数中存在如单引号（'）等特殊字符，可以使用单引号（'）进行转义。

示例1：普通表推荐索引

1、创建测试表并插入数据
create table tb_ai_01 (c_w_id integer not null,c_d_id integer not null,c_id integer not null);
insert into tb_ai_01 values(generate_series(1, 200000),generate_series(1, 200000),generate_series(1, 200000));
2、分析表
analyze tb_ai_01;

3、执行单索引推荐
select "table", "column" from gs_index_advise('SELECT c_d_id from tb_ai_01 where c_w_id > 10 and c_w_id < 20');

4、按推荐创建索引，再次执行索引推荐语句
CREATE INDEX idx on tb_ai_01(c_w_id);
select "table", "column" from gs_index_advise('SELECT c_d_id from tb_ai_01 where c_w_id > 10 and c_w_id < 20');

示例2；分区表索引推荐

1、创建分区表并插入数据
create table tb_ai_par_01(id int, val1 int, val2 text)
partition by range(id) (
partition tb_par_range_p1 values less than (200),
partition tb_par_range_p2 values less than (400),
partition tb_par_range_p3 values less than (600),
partition tb_par_range_p4 values less than (800),
partition tb_par_range_p5 values less than (1000),
partition tb_par_range_p6 values less than (1200),
partition tb_par_range_p7 values less than (1400),
partition tb_par_range_p8 values less than (1600),
partition tb_par_range_p9 values less than (MAXVALUE)
)enable row movement;

insert into tb_ai_par_01 values(generate_series(1, 200000), generate_series(1, 200000), generate_series(1, 200000));

2、分析分区表
analyze tb_ai_par_01;

3、生成推荐索引
select "table", "column", "indextype" from gs_index_advise('select id,val1 from tb_ai_par_01 where val1 = 2000;');