这篇文章将为大家详细讲解有关Hive中matadata怎么用,小编觉得挺实用的,因此分享给大家做个参考,希望大家阅读完这篇文章后可以有所收获。Hive元数据存储在MySQL库里,数据存储hdfs上;查看元数据库存放地址,查看Hive配置文件
这篇文章将为大家详细讲解有关Hive中matadata怎么用,小编觉得挺实用的,因此分享给大家做个参考,希望大家阅读完这篇文章后可以有所收获。
Hive元数据存储在MySQL库里,数据存储hdfs上;查看元数据库存放地址,查看Hive配置文件路径查看:
$HIVE_HOME/conf/hive-site.xml
<property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:Mysql://localhost:3306/hive_data?createDatabaseIfNotExist=true&characterEncoding=latin1</value> </property>
查看mysql元数据:
mysql> use hive_data
Database changed
mysql> show tables;
+---------------------------+| Tables_in_hive_data |+---------------------------+| bucketing_cols || cds || columns_v2 || database_params || db_privs || dbs || func_ru || funcs || global_privs || idxs || index_params || part_col_privs || part_col_stats || part_privs || partition_key_vals || partition_keys || partition_params || partitions || roles || sd_params || sds || sequence_table || serde_params || serdes || skewed_col_names || skewed_col_value_loc_map || skewed_string_list || skewed_string_list_values || skewed_values || sort_cols || tab_col_stats || table_params || tbl_col_privs || tbl_privs || tbls || version |+---------------------------+
----------------------------------------------------------------
1. Hive版本version表有且只有一条数据,多一条数据Hive会无法启动
mysql> select * from version;+--------+----------------+---------------------------------------+| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |+--------+----------------+---------------------------------------+| 1 | 1.1.0 | Set by MetaStore hadoop@192.168.0.129 |+--------+----------------+---------------------------------------+
2.Hive数据库元数据表【dbs】【database_params】
mysql> select DB_ID,DB_LOCATION_URI,NAME from dbs
+-------+----------------------------------------------------------------+---------------+| db_id | DB_LOCATION_URI | NAME |+-------+----------------------------------------------------------------+---------------+| 1 | hdfs://192.168.0.129:9000/user/hive/warehouse | default || 3 | hdfs://192.168.0.129:9000/user/hive/warehouse/hive_data2.db | hive_data2 || 6 | hdfs://192.168.0.129:9000/user/hive/warehouse/ruozedata_job.db | ruozedata_job |+-------+----------------------------------------------------------------+---------------+
DB_ID -- 数据库ID 【tbls】
DB_LOCATION_URI -- HDFD存放路径
NAME -- 数据库名
3.Hive表内容、结构、属性
mysql> select TBL_ID,CREATE_TIME,DB_ID,SD_ID,TBL_NAME,TBL_TYPE from tbls;
+--------+-------------+-------+-------+---------------+----------------+| TBL_ID | CREATE_TIME | DB_ID | SD_ID | TBL_NAME | TBL_TYPE |+--------+-------------+-------+-------+---------------+----------------+| 7 | 1528299941 | 3 | 7 | emp | EXTERNAL_TABLE || 10 | 1528311773 | 3 | 10 | emp_bak | MANAGED_TABLE || 11 | 1528312267 | 3 | 11 | emp1 | EXTERNAL_TABLE || 16 | 1528403085 | 3 | 16 | dual | MANAGED_TABLE || 17 | 1528484818 | 3 | 17 | JSON | MANAGED_TABLE || 22 | 1529454293 | 3 | 22 | emp_partition | MANAGED_TABLE || 26 | 1529459118 | 3 | 31 | emp_sqoop111 | MANAGED_TABLE || 34 | 1529530688 | 6 | 39 | user_click | EXTERNAL_TABLE || 38 | 1529537107 | 6 | 44 | product_info | EXTERNAL_TABLE || 39 | 1529593387 | 6 | 45 | city_info | MANAGED_TABLE || 41 | 1529606647 | 6 | 46 | product_hot | MANAGED_TABLE |+--------+-------------+-------+-------+---------------+----------------+
TBL_ID -- 表ID 【table_params】【partitions】【partition_keys】
DB_ID -- 库ID
SD_ID -- 序列化ID 【sds】【partitions】
4.Hive文件存储相关元数据:【sds】【serdes】【serde_params】
mysql> select * from sds;
CD_ID --表列ID 【columns_v2】【cds】
SERDE_ID --序列化列ID 【serdes】【serde_params】
mysql> select * from columns_v2;
+-------+---------+--------------+-----------+-------------+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
+-------+---------+--------------+-----------+-------------+
| 7 | NULL | comm | double | 6 |
| 7 | NULL | deptno | int | 7 |
| 7 | NULL | empno | int | 0 |
| 7 | NULL | ename | string | 1 |
| 7 | NULL | hiredate | string | 4 |
| 7 | NULL | job | string | 2 |
| 7 | NULL | mgr | int | 3 |
| 7 | NULL | salary | double | 5 |
| 10 | NULL | comm | double | 6 |
| 10 | NULL | deptno | int | 7 |
| 10 | NULL | empno | int | 0 |
| 10 | NULL | ename | string | 1 |
| 10 | NULL | hiredate | string | 4 |
| 10 | NULL | job | string | 2 |
| 10 | NULL | mgr | int | 3 |
| 10 | NULL | salary | double | 5 |
5.Hive表分区相关的元数据表
mysql> select * from partitions;
+---------+-------------+------------------+-----------------+-------+--------+| PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID |+---------+-------------+------------------+-----------------+-------+--------+| 1 | 1529456274 | 0 | pt=2018-06-19 | 26 | 22 || 6 | 1529530704 | 0 | data=2018-06-20 | 40 | 34 |
PART_ID -- 分区ID 【partition_key_vasls】【partition_params】
mysql> select * from partition_key_vals;
+---------+--------------+-------------+
| PART_ID | PART_KEY_VAL | INTEGER_IDX |
+---------+--------------+-------------+
| 1 | 2018-06-19 | 0 |
| 6 | 2018-06-20 | 0 |
+---------+--------------+-------------+
PART_ID -- 分区ID
PART_KEY_VAL -- 分区字段值
mysql> select * from partition_params;
+---------+-----------------------+-------------+
| PART_ID | PARAM_KEY | PARAM_VALUE |
+---------+-----------------------+-------------+
| 1 | COLUMN_STATS_ACCURATE | true |
| 1 | numFiles | 2 |
| 1 | numRows | 0 |
| 1 | rawDataSize | 0 |
| 1 | totalSize | 734 |
| 1 | transient_lastDdlTime | 1529456274 |
| 6 | COLUMN_STATS_ACCURATE | true |
| 6 | numFiles | 1 |
| 6 | numRows | 0 |
| 6 | rawDataSize | 0 |
| 6 | totalSize | 725264 |
| 6 | transient_lastDdlTime | 1529530704 |
PARAM_KEY -- 分区属性
PARAM_VALUE -- 分区属性值
总结:
关于“Hive中matadata怎么用”这篇文章就分享到这里了,希望以上内容可以对大家有一定的帮助,使各位可以学到更多知识,如果觉得文章不错,请把它分享出去让更多的人看到。
--结束END--
本文标题: Hive中matadata怎么用
本文链接: https://lsjlt.com/news/231629.html(转载时请注明来源链接)
有问题或投稿请发送至: 邮箱/279061341@qq.com QQ/279061341
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
2024-05-24
回答
回答
回答
回答
回答
回答
回答
回答
回答
回答
0