Hive Basics (6): Hive Syntax (2) DML (1) Data Operations (Data Import / Data Export)

1 Data Import

1.1 Loading Data into a Table (Load)

1. Syntax

hive> load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student

[partition (partcol1=val1,…)];

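(1) load data: loads data into a table

(2) local: loads the data from the local file system into the Hive table; without it, the data is loaded from HDFS

(3) inpath: the path of the data to load

(4) overwrite: overwrites the data already in the table; without it, the data is appended

(5) into table: names the table to load into

(6) student: the specific target table

(7) partition: loads the data into the specified partition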
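As a hedged illustration of the optional clauses, the sketch below loads the same file into one partition of a partitioned table and replaces whatever that partition already held. The table student_by_month and its month column are assumptions made up for this example; they are not defined anywhere else in this post.

```sql
-- Hypothetical partitioned table (an assumption for this sketch only)
create table if not exists student_by_month(id string, name string)
partitioned by (month string)
row format delimited fields terminated by '\t';

-- local: read from the local file system
-- overwrite: replace the data already in the target partition
load data local inpath '/opt/module/datas/student.txt'
overwrite into table student_by_month partition (month='201709');
```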

2. Hands-on examples

(0) Create a table

hive (default)> create table student(id string, name string) row format delimited 
fields terminated by '\t';

(1) Load a local file into Hive

hive (default)> load data local inpath '/opt/module/datas/student.txt' into table 
default.student;

(2) Load an HDFS file into Hive

Upload the file to HDFS

hive (default)> dfs -put /opt/module/datas/student.txt /user/atguigu/hive;

Load the data from HDFS

hive (default)> load data inpath '/user/atguigu/hive/student.txt' into table 
default.student;
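Loading from an HDFS path moves the file into the table's directory instead of copying it, which is why example (3) uploads the file again before loading. A minimal check, assuming the table lives under the default warehouse directory /user/hive/warehouse (the location used in section 2.2), might look like this:

```sql
-- After the load, student.txt should no longer be listed at the source path
dfs -ls /user/atguigu/hive;

-- The file should now appear under the table's own directory
-- (assumed default warehouse location)
dfs -ls /user/hive/warehouse/student;
```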

(3) Load data and overwrite the data already in the table

Upload the file to HDFS

hive (default)> dfs -put /opt/module/datas/student.txt /user/atguigu/hive;

Load the data, overwriting the data already in the table

hive (default)> load data inpath '/user/atguigu/hive/student.txt' overwrite into 
table default.student;

1.2 Inserting Data into a Table via a Query (Insert)

1. Create a table

hive (default)> create table student_par(id int, name string) row format 
delimited fields terminated by '\t';

2. Basic insert

hive (default)> insert into table student_par 
values(1,'wangwu'),(2,'zhaoliu');

3. Basic insert mode (from the query result of a single table)

hive (default)> insert overwrite table student_par
 select id, name from student where month='201709';

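insert into: appends the data to the table or partition; existing data is not deleted

insert overwrite: overwrites the data already in the table or partition

Note: older Hive releases do not support inserting into only some of the columns; as far as I know, releases from 1.2.0 onward accept a column list after the table name.

The select above filters on month, so it assumes that student has a month column (for example as a partition column); the create statement in section 1.1 does not define one.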
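To make the difference between the two modes concrete, here is a small sketch against the student_par and student tables created above. The inserted row (3, 'zhangsan') is made-up sample data, and the cast is there only because student.id was declared as string while student_par.id is int.

```sql
-- Appends one row; rows already in student_par stay in place
insert into table student_par values (3, 'zhangsan');

-- Replaces every row in student_par with the query result
insert overwrite table student_par
select cast(id as int), name from student;

-- Inspect the table after each statement to see the effect
select * from student_par;
```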

4. Multi-insert mode (several tables or partitions filled from a single scan of the source table)

hive (default)> from student
 insert overwrite table student partition(month='201707')
 select id, name where month='201709'
 insert overwrite table student partition(month='201706')
 select id, name where month='201709';

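1.3 Creating a Table and Loading Data from a Query (As Select)

See section 4.5.1, Creating Tables, for details.

Create a table from a query result (the rows returned by the query are stored in the newly created table)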

create table if not exists student3
as select id, name from student;

1.4 Specifying the Data Path with Location When Creating a Table

1. Upload the data to HDFS

hive (default)> dfs -mkdir /student;
hive (default)> dfs -put /opt/module/datas/student.txt /student;

2. Create the table and specify its location on HDFS

hive (default)> create external table if not exists student5(
 id int, name string
 )
 row format delimited fields terminated by '\t'
 location '/student';

3. Query the data

hive (default)> select * from student5;

1.5 Importing Data into a Specified Hive Table (Import)

Note: the data must first be exported with export before it can be imported.

hive (default)> import table student2 partition(month='201709') from
'/user/hive/warehouse/export/student';

2 Data Export

2.1 Insert Export

1. Export a query result to the local file system

hive (default)> insert overwrite local directory 
'/opt/module/datas/export/student'
 select * from student;

2. Export a query result to the local file system with formatting

hive (default)> insert overwrite local directory 
'/opt/module/datas/export/student1'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
 select * from student;

3. Export a query result to HDFS (no local keyword)

hive (default)> insert overwrite directory '/user/atguigu/student2'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
 select * from student;

2.2 Exporting to the Local File System with Hadoop Commands

hive (default)> dfs -get /user/hive/warehouse/student/month=201709/000000_0
/opt/module/datas/export/student3.txt;

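2.3 Exporting with Hive Shell Commands

Basic syntax: hive -e 'statement' > file, or hive -f script > file (-e runs a quoted statement, -f runs a script file)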

[atguigu@hadoop102 hive]$ bin/hive -e 'select * from default.student;' >
/opt/module/datas/export/student4.txt;

2.4 Export to HDFS

hive (default)> export table default.student to
'/user/hive/warehouse/export/student';

export and import are mainly used to migrate Hive tables between two Hadoop clusters.
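A round trip on a single cluster can be sketched as follows. The target name student_copy is a hypothetical table that does not exist yet, chosen here so that the import creates it from the exported metadata and data; when migrating between clusters, the exported directory would first have to be copied to the other cluster.

```sql
-- Write the table's metadata and data files to an HDFS directory
export table default.student to '/user/hive/warehouse/export/student';

-- Recreate the table under a new (hypothetical) name from that directory
import table student_copy from '/user/hive/warehouse/export/student';

-- The copy should return the same rows as the original
select * from student_copy;
```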

2.5 Sqoop Export

3 Clearing Data from a Table

Note: Truncate can only delete data from managed tables; it cannot delete data from external tables.

hive (default)> truncate table student;
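For an external table such as student5 from section 1.4, one possible workaround is to switch the table to a managed table first. This is only a sketch under that assumption, and it really does delete the files under the table's location, so it should be used only when the data is no longer needed.

```sql
-- Mark the external table as managed (table property change only)
alter table student5 set tblproperties('EXTERNAL'='FALSE');

-- Truncate is now allowed; this removes the data files under /student
truncate table student5;
```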

posted @ 2020-07-22 19:00 秋華
