首页 > 代码库 > Hive学习笔记——保存select结果,Join,多重插入
Hive学习笔记——保存select结果,Join,多重插入
1. 保存select查询结果的几种方式:
1、将查询结果保存到一张新的hive表中
create table t_tmp as select * from t_p;
2、将查询结果保存到一张已经存在的hive表中(用load的时候,可以是into或者overwrite into,这里是into或者overwrite)
insert into/overwrite table t_tmp select * from t_p;
3、将查询结果保存到指定的文件目录(可以是本地,也可以是hdfs)
insert overwrite local directory ‘/home/hadoop/test‘ select * from t_p;
insert overwrite directory ‘/aaa/test‘ select * from t_p;
2. 关于hive中的各种join
准备数据
1,a
2,b
3,c
4,d
7,y
8,u
2,bb
3,cc
7,yy
9,pp
建表:
create table a(id int,name string) row format delimited fields terminated by ‘,‘; create table b(id int,name string) row format delimited fields terminated by ‘,‘;
导入数据:
load data local inpath ‘/home/hadoop/a.txt‘ into table a; load data local inpath ‘/home/hadoop/b.txt‘ into table b;
实验:
** inner join
select * from a join b on a.id=b.id;
+-------+---------+-------+---------+--+
| a.id | a.name | b.id | b.name |
+-------+---------+-------+---------+--+
| 2 | b | 2 | bb |
| 3 | c | 3 | cc |
| 7 | y | 7 | yy |
+-------+---------+-------+---------+--+
**left join
select * from a left outer join b on a.id=b.id;
+-------+---------+-------+---------+--+
| a.id | a.name | b.id | b.name |
+-------+---------+-------+---------+--+
| 1 | a | NULL | NULL |
| 2 | b | 2 | bb |
| 3 | c | 3 | cc |
| 4 | d | NULL | NULL |
| 7 | y | 7 | yy |
| 8 | u | NULL | NULL |
+-------+---------+-------+---------+--+
**right join
select * from a right outer join b on a.id=b.id;
同上效果,只不过这次b的全部显示,a的会有NULL。
**full join
select * from a full outer join b on a.id=b.id;
+-------+---------+-------+---------+--+
| a.id | a.name | b.id | b.name |
+-------+---------+-------+---------+--+
| 1 | a | NULL | NULL |
| 2 | b | 2 | bb |
| 3 | c | 3 | cc |
| 4 | d | NULL | NULL |
| 7 | y | 7 | yy |
| 8 | u | NULL | NULL |
| NULL | NULL | 9 | pp |
+-------+---------+-------+---------+--+
**left semi join
select * from a left semi join b on a.id = b.id;
+-------+---------+--+
| a.id | a.name |
+-------+---------+--+
| 2 | b |
| 3 | c |
| 7 | y |
+-------+---------+--+
3. 多重插入
from student insert into table student_p partition(part=‘a‘) select * where id<95011; insert into table student_p partition(part=‘a‘) select * where id<95011;
Hive学习笔记——保存select结果,Join,多重插入