How to Use Bloom Indexes in PostgreSQL

2024-04-02 19:04:59, posted by 薄情痞子

This article explains how to use Bloom indexes in PostgreSQL: what a Bloom index is, how it is laid out, and how it compares with a multi-column B-tree index on a sample table.

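Introduction
A Bloom index is derived from the Bloom filter. A Bloom filter can tell very quickly, using little space, whether a value belongs to a set. Its drawback is that it produces false positives, so a recheck is needed to confirm that the value really is in the set. A Bloom filter never produces false negatives, however: if the filter reports that a value is absent, it is absent.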
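
To make the false-positive and false-negative behavior concrete, here is a minimal Bloom filter sketch in Python. The class name, the bit-array size, the number of hash functions, and the use of salted SHA-256 are all assumptions chosen for illustration; this is not the hashing scheme PostgreSQL uses.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter over a fixed-size bit array (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


bf = BloomFilter()
members = set(range(0, 200, 2))
for v in members:
    bf.add(v)

# No false negatives: every inserted value is reported as possibly present.
assert all(bf.might_contain(v) for v in members)

# False positives are possible, so each hit is rechecked against the real data.
candidates = [v for v in range(1000, 2000) if bf.might_contain(v)]
confirmed = [v for v in candidates if v in members]  # the recheck step
print(len(candidates), len(confirmed))
```

None of the values 1000 to 1999 were inserted, so any candidates the filter returns for that range are false positives, and the recheck discards all of them.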

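Structure
[Figure: structure of a Bloom index]

The first page holds the index metadata. After it, every table row has one index entry consisting of a bit array (the signature) and the TID of the corresponding row.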
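
The signature-plus-TID layout can be sketched as a small in-memory model. This is a simplified illustration under stated assumptions: the hash (salted SHA-256), the helper names, and the tuple used as a stand-in TID are hypothetical, and only the constants 64 and 4 are taken from the `length` and `colN` settings used later in this article.

```python
import hashlib

SIGNATURE_BITS = 64   # mirrors length=64 in the article's CREATE INDEX
BITS_PER_COLUMN = 4   # mirrors col1=4 ... col9=4


def column_bits(col_no, value):
    """Bit positions set for one column value (illustrative hash, not PostgreSQL's)."""
    positions = set()
    for k in range(BITS_PER_COLUMN):
        digest = hashlib.sha256(f"{col_no}:{k}:{value}".encode()).digest()
        positions.add(int.from_bytes(digest[:8], "big") % SIGNATURE_BITS)
    return positions


def signature(values_by_column):
    """OR together the bits of every (column number, value) pair."""
    sig = 0
    for col_no, value in values_by_column.items():
        for pos in column_bits(col_no, value):
            sig |= 1 << pos
    return sig


# Heap rows keyed by a stand-in TID (block number, offset).
heap = {
    (0, 1): {1: 10, 2: 20, 3: 30},
    (0, 2): {1: 11, 2: 21, 3: 31},
    (0, 3): {1: 12, 2: 20, 3: 32},
}

# The index: one (signature, TID) entry per row.
index = [(signature(row), tid) for tid, row in heap.items()]


def search(predicate):
    """Return TIDs of rows matching {column number: value} equality conditions."""
    query_sig = signature(predicate)
    # A row is a candidate if its signature has all of the query's bits set.
    candidates = [tid for sig, tid in index if (sig & query_sig) == query_sig]
    # Recheck against the heap to discard false positives.
    return [tid for tid in candidates
            if all(heap[tid][c] == v for c, v in predicate.items())]


print(search({2: 20}))  # [(0, 1), (0, 3)]
```

Because every column contributes bits to the same signature, the predicate can name any subset of the indexed columns, in any order. That is the property the example below relies on.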

Example
Create the table and insert the data

testdb=# drop table if exists t_bloom;
DROP TABLE
testdb=# CREATE TABLE t_bloom (id int, dept int, id2 int, id3 int, id4 int, id5 int,id6 int,id7 int,details text, zipcode int);
CREATE TABLE
testdb=# 
testdb=# INSERT INTO t_bloom 
testdb-# SELECT (random() * 1000000)::int, (random() * 1000000)::int,
testdb-# (random() * 1000000)::int,(random() * 1000000)::int,(random() * 1000000)::int,(random() * 1000000)::int, 
testdb-# (random() * 1000000)::int,(random() * 1000000)::int,md5(g::text), floor(random()* (20000-9999 + 1) + 9999) 
testdb-# from generate_series(1,16*1024*1024) g;
INSERT 0 16777216
testdb=# 
testdb=# analyze t_bloom;
ANALYZE
testdb=# 
testdb=# select pg_size_pretty(pg_table_size('t_bloom'));
 pg_size_pretty 
----------------
 1619 MB
(1 row)

Create a B-tree index

testdb=# 
testdb=# create index idx_t_bloom_btree on t_bloom using btree(id,dept,id2,id3,id4,id5,id6,id7,zipcode);
CREATE INDEX
testdb=# \di+ idx_t_bloom_btree
                              List of relations
 Schema |       Name        | Type  | Owner |  Table  |  Size  | Description 
--------+-------------------+-------+-------+---------+--------+-------------
 public | idx_t_bloom_btree | index | pg12  | t_bloom | 940 MB | 
(1 row)

Run the queries

testdb=# EXPLAIN ANALYZE select * from t_bloom where id4 = 305294 and zipcode = 13266;
                                                              QUERY PLAN                                                     
---------------------------------------------------------------------------------------------------------
 Index Scan using idx_t_bloom_btree on t_bloom  (cost=0.56..648832.73 rows=1 width=69) (actual time=2648.215..2648.215 rows=0 loops=1)
   Index Cond: ((id4 = 305294) AND (zipcode = 13266))
 Planning Time: 3.244 ms
 Execution Time: 2659.804 ms
(4 rows)
testdb=# EXPLAIN ANALYZE select * from t_bloom where id5 = 241326 and id6 = 354198;
                                                              QUERY PLAN                                                     
---------------------------------------------------------------------------------------------------------
 Index Scan using idx_t_bloom_btree on t_bloom  (cost=0.56..648832.73 rows=1 width=69) (actual time=2365.533..2365.533 rows=0 loops=1)
   Index Cond: ((id5 = 241326) AND (id6 = 354198))
 Planning Time: 1.918 ms
 Execution Time: 2365.629 ms
(4 rows)

Create a Bloom index

testdb=# create extension bloom;
CREATE EXTENSION
testdb=# CREATE INDEX idx_t_bloom_bloom ON t_bloom USING bloom(id, dept, id2, id3, id4, id5, id6, id7, zipcode) 
testdb-# WITH (length=64, col1=4, col2=4, col3=4, col4=4, col5=4, col6=4, col7=4, col8=4, col9=4);
CREATE INDEX
testdb=# \di+ idx_t_bloom_bloom
                              List of relations
 Schema |       Name        | Type  | Owner |  Table  |  Size  | Description 
--------+-------------------+-------+-------+---------+--------+-------------
 public | idx_t_bloom_bloom | index | pg12  | t_bloom | 225 MB | 
(1 row)
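
The reported index size can be sanity-checked with a little arithmetic. The sketch below assumes that each index entry stores only the 64-bit signature plus a 6-byte TID, and it ignores page headers and the metadata page; those are simplifying assumptions, not figures taken from the session above.

```python
ROWS = 16 * 1024 * 1024   # rows inserted by the example
SIGNATURE_BITS = 64       # length=64 in CREATE INDEX
TID_BYTES = 6             # assumed size of a heap tuple pointer

entry_bytes = SIGNATURE_BITS // 8 + TID_BYTES
index_mb = ROWS * entry_bytes / (1024 * 1024)
print(f"{entry_bytes} bytes per entry, about {index_mb:.0f} MB")
```

Under these assumptions the estimate is 224 MB, close to the 225 MB that `\di+` reports and roughly a quarter of the 940 MB B-tree index.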

Run the queries

testdb=# EXPLAIN ANALYZE select * from t_bloom where id4 = 305294 and zipcode = 13266;
                                                              QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t_bloom  (cost=283084.16..283088.18 rows=1 width=69) (actual time=998.727..998.727 rows=0 loops=1)
   Recheck Cond: ((id4 = 305294) AND (zipcode = 13266))
   Rows Removed by Index Recheck: 12597
   Heap Blocks: exact=12235
   ->  Bitmap Index Scan on idx_t_bloom_bloom  (cost=0.00..283084.16 rows=1 width=0) (actual time=234.893..234.893 rows=12597 loops=1)
         Index Cond: ((id4 = 305294) AND (zipcode = 13266))
 Planning Time: 31.482 ms
 Execution Time: 998.975 ms
(8 rows)
testdb=# EXPLAIN ANALYZE select * from t_bloom where id5 = 241326 and id6 = 354198;
                                                              QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t_bloom  (cost=283084.16..283088.18 rows=1 width=69) (actual time=1019.621..1019.621 rows=0 loops=1)
   Recheck Cond: ((id5 = 241326) AND (id6 = 354198))
   Rows Removed by Index Recheck: 13033
   Heap Blocks: exact=12633
   ->  Bitmap Index Scan on idx_t_bloom_bloom  (cost=0.00..283084.16 rows=1 width=0) (actual time=204.873..204.873 rows=13033 loops=1)
         Index Cond: ((id5 = 241326) AND (id6 = 354198))
 Planning Time: 0.441 ms
 Execution Time: 1019.811 ms
(8 rows)

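Judging from these results, when a query filters on an arbitrary combination of indexed columns that does not include the leading column of the B-tree index (id in this example), the Bloom index outperforms the B-tree index: execution takes about 1.0 s instead of 2.4 to 2.7 s, and the index occupies 225 MB instead of 940 MB. The price is the recheck: each Bloom scan returned roughly 13,000 candidate rows, all of which were then discarded as false positives.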
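
The number of rows removed by the recheck can be compared with a back-of-the-envelope estimate. The sketch below assumes an idealized hash in which every bit position is chosen independently and uniformly; that is an assumption made for the calculation, not a description of the hash function the extension actually uses, so only the order of magnitude is meaningful.

```python
ROWS = 16 * 1024 * 1024
SIGNATURE_BITS = 64      # length=64
COLUMNS = 9              # indexed columns
BITS_PER_COLUMN = 4      # col1=4 ... col9=4
QUERY_COLUMNS = 2        # both example queries filter on two columns

# Probability that a given signature bit is set after hashing all nine columns.
p_bit_set = 1 - (1 - 1 / SIGNATURE_BITS) ** (COLUMNS * BITS_PER_COLUMN)

# A non-matching row is a false positive if all of the query's bits happen to be set.
p_false_positive = p_bit_set ** (QUERY_COLUMNS * BITS_PER_COLUMN)
estimated = ROWS * p_false_positive

observed = [12597, 13033]  # "Rows Removed by Index Recheck" in the two plans
print(f"estimated {estimated:.0f}, observed {observed}")
for n in observed:
    print(f"observed false-positive rate: {n / ROWS:.4%}")
```

The idealized estimate lands in the same order of magnitude as the observed counts. Raising `length` or the per-column bit counts should lower the false-positive rate, at the cost of a larger index.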

That concludes this walkthrough of Bloom indexes in PostgreSQL. How well a Bloom index performs in practice depends on your data and queries, so verify it against your own workload.

Article link: https://lsjlt.com/news/63797.html (please cite the source link when reposting)