POSTGRESQL的常用索引
| 索引类型 | 索引功能 |
| btree | 快速的数据查找功能,对于大于,小于,等于, 半模糊, 等查找提供搜索功能 |
| hash | 快速的数据查找功能,对于等值,并且字段较长的字段查找有特殊的优化功能 |
| gin | 一种倒排索引,对于数组,全文检索,中的相交,包含,大于,左右边的查询有特殊的优化 |
| gist | 一种通用的索引入口,对于范围,空间,几何,空间等数据有特殊的查询优化 |
POSTGRESQL的索引类型本身是比较丰富的,对于不同的查询的方式有不同的对应的索引来进行建立
BTREE索引
BTREE 是各种数据库必备的索引类型,提供等值,不等值,大于小于以及半模糊查询等方式提供数据的搜索
create table t_btree(id int, info text);
insert into t_btree select generate_series(1,10000), md5(random()::text);
explain (analyze,verbose,timing,costs,buffers) select * from t_btree where id=1;

create index idx_t_btree_1 on t_btree using btree (id);
explain (analyze,verbose,timing,costs,buffers) select * from t_btree where id=1;

SELECT
relname AS "表名",
pg_size_pretty(pg_total_relation_size(relname::regclass)) AS "表总大小",
pg_size_pretty(pg_relation_size(relname::regclass)) AS "表大小",
pg_size_pretty(pg_indexes_size(relname::regclass)) AS "索引大小"
FROM pg_stat_all_tables
WHERE relname = 't_btree';

HASH索引
HASH 索引是针对大字段的查询,由于BTREE 字段每行大小最大为 2704 ,所以超出2704 BYTE一行的INDEX 并且是等值运算的,很适合HASH 索引
1、建表,插入数据
create table t_hash (id int,info text);
insert into t_hash select generate_series(1,100),repeat(md5(random()::text),10000);
2、btree索引
create index idx_t_hash_1 on t_hash using btree (info);
3、hash索引
create index idx_t_hash_1 on t_hash using hash (info);
SELECT pg_size_pretty(pg_relation_size('idx_t_hash_1'));

例:假设我们正在设计一个简单的用户管理系统,其中用户表 users 包含用户的 id(唯一标识符)、username(用户名)和 email(电子邮件地址)。我们预计会有大量的查询操作是基于 email 字段进行的,并且 email 字段具有唯一性约束。因此,可以在 email 字段上使用哈希索引来优化这些查询。
1.建表
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(255) NOT NULL,
email VARCHAR(255) NOT NULL UNIQUE
);
2.造数
DO $$
DECLARE
i INT := 1;
BEGIN
WHILE i <= 30000 LOOP
INSERT INTO users1 (username, email) VALUES (
'user' || i,
'user' || i || '@example123456789example.com'
);
i := i + 1;
END LOOP;
END $$;
3.创建btree索引
CREATE INDEX idx_email ON users using btree(email);
SELECT pg_size_pretty(pg_relation_size('idx_email'));
explain (analyze,verbose,timing,costs,buffers) SELECT * FROM users WHERE email = 'user999@example123456789example.com';
4.删除btree索引
drop index idx_email;
5.创建hash索引
CREATE INDEX idx_email ON users using hash(email);
SELECT pg_size_pretty(pg_relation_size('idx_email'));
explain (analyze,verbose,timing,costs,buffers) SELECT * FROM users WHERE email = 'user999@example123456789example.com';



大字段进行等值查询时,使用HASH 索引速度更快
- hash 索引的缺点,创建索引的情况下,是需要CPU 进行计算的,耗费CPU大量资源
- hash 索引主要的查询方式和对象是 等值计算 select * from table where id = ‘223’; 用hash索引
GIN索引
GIN 索引对于多列值的查找有特殊的功能
create table t_gin3 (id int, c1 int, c2 int, c3 int, c4 int, c5 int, c6 int, c7 int, c8 int, c9 int);
insert into t_gin3 select generate_series(1,10000), random()*10, random()*20, random()*30, random()*40, random()*50, random()*60, random()*70, random()*80, random()*90;
explain (analyze,verbose,timing,costs,buffers) select * from t_gin3 where c5 = 5 or c6 =9 or c7=19 or c4 = 3;


CREATE EXTENSION btree_gin;
create index idx_t_gin3_1 on t_gin3 using gin (c1,c2,c3,c4,c5,c6,c7,c8,c9);

SELECT
relname AS "表名",
pg_size_pretty(pg_total_relation_size(relname::regclass)) AS "表总大小",
pg_size_pretty(pg_relation_size(relname::regclass)) AS "表大小",
pg_size_pretty(pg_indexes_size(relname::regclass)) AS "索引大小"
FROM pg_stat_all_tables
WHERE relname = 't_gin3';

explain (analyze,verbose,timing,costs,buffers) select * from t_gin3 where c5 = 5 or c6 =9 or c7=19 or c4 = 3;

联合索引
条件中经常组合出现的列可以考虑建立联合索引。
假设我们有一个订单表 orders,包含以下字段:
order_id:订单编号
customer_id:客户编号
order_date:订单日期
amount:订单金额
1、建表
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
amount DECIMAL(10, 2) NOT NULL
);
2、插入30w条数据
INSERT INTO orders (customer_id, order_date, amount)
SELECT
(ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10000) + 1 AS customer_id,
CURRENT_DATE - INTERVAL '1 year' * (ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10) AS order_date,
(ABS((RANDOM() * 1000000000000000000)::BIGINT) % 10000) / 100.0 AS amount
FROM generate_series(1, 300000);

3、创建单个索引
CREATE INDEX idx_customer_id ON orders (customer_id);
CREATE INDEX idx_order_date ON orders (order_date);
4、查询语句
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 1234 AND order_date BETWEEN '2024-01-01' AND '2024-12-31';
5、索引大小
SELECT pg_size_pretty(pg_relation_size('idx_customer_id'));
SELECT pg_size_pretty(pg_relation_size('idx_order_date'));
6、删除单个索引
drop index idx_customer_id;
drop index idx_order_date;

7、创建联合索引
CREATE INDEX idx_customer_id_order_date ON orders (customer_id, order_date);
8、查询语句
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 1234 AND order_date BETWEEN '2024-01-01' AND '2024-12-31';
9、索引大小
SELECT pg_size_pretty(pg_relation_size('idx_customer_id_order_date'));

性能讲解
单个索引:虽然可以加速对 customer_id 的过滤,但对 order_date 的过滤仍需要额外的扫描和过滤操作,导致查询效率较低。
联合索引:能够同时利用 customer_id 和 order_date 的过滤条件,直接定位到符合条件的数据,减少了扫描范围,显著提高了查询性能。
同时,联合索引占用更少的空间
注意事项
联合索引的效率取决于查询条件是否覆盖索引的所有列。如果查询条件中缺少联合索引的先导列(如 customer_id),则联合索引可能无法被充分利用。
创建联合索引时,应根据查询模式和数据分布合理选择列的顺序。
索引推荐(openGauss)
单query索引推荐功能基于查询语句的语义信息和数据库的统计信息,对用户输入的单条查询语句生成推荐的索引。单query索引推荐功能支持用户在数据库中直接进行操作,本功能涉及的函数接口如下:
gs_index_advise
功能描述:针对单条查询语句生成推荐索引。
参数说明:SQL语句字符串。 参数类型是文本型,如果参数中存在如单引号(')等特殊字符,可以使用单引号(')进行转义。
示例1:普通表推荐索引
1、创建测试表并插入数据
create table tb_ai_01 (c_w_id integer not null,c_d_id integer not null,c_id integer not null);
insert into tb_ai_01 values(generate_series(1, 200000),generate_series(1, 200000),generate_series(1, 200000));
2、分析表
analyze tb_ai_01;

3、执行单索引推荐
select "table", "column" from gs_index_advise('SELECT c_d_id from tb_ai_01 where c_w_id > 10 and c_w_id < 20');

4、按推荐创建索引,再次执行索引推荐语句
CREATE INDEX idx on tb_ai_01(c_w_id);
select "table", "column" from gs_index_advise('SELECT c_d_id from tb_ai_01 where c_w_id > 10 and c_w_id < 20');

示例2;分区表索引推荐
1、创建分区表并插入数据
create table tb_ai_par_01(id int, val1 int, val2 text)
partition by range(id) (
partition tb_par_range_p1 values less than (200),
partition tb_par_range_p2 values less than (400),
partition tb_par_range_p3 values less than (600),
partition tb_par_range_p4 values less than (800),
partition tb_par_range_p5 values less than (1000),
partition tb_par_range_p6 values less than (1200),
partition tb_par_range_p7 values less than (1400),
partition tb_par_range_p8 values less than (1600),
partition tb_par_range_p9 values less than (MAXVALUE)
)enable row movement;
insert into tb_ai_par_01 values(generate_series(1, 200000), generate_series(1, 200000), generate_series(1, 200000));
2、分析分区表
analyze tb_ai_par_01;
3、生成推荐索引
select "table", "column", "indextype" from gs_index_advise('select id,val1 from tb_ai_par_01 where val1 = 2000;');

总结
- 查询经常用到的字段建索引
- 超长字段不加索引
- 重复值多的字段不加索引
- 经常组合出现作为条件的列建联合索引
- 根据业务场景选择合适的索引类型
Comments NOTHING