Educoder (头歌) Practical Teaching Platform: Hive Comprehensive Application Cases

Time: 2021-06-02 11:35:33

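Notes

    Platform: Educoder (头歌) practical teaching platform

    Applicable exercises:

        Hive Comprehensive Application Case: Student Score Queries

        Hive Comprehensive Application Case: User Education Queries

        Hive Comprehensive Application Case: User Search Log Analysis

    Caution:

        Check that the exercise listed above is the same one you need to complete.

        If it is, copy and paste the code, then click Evaluate.

    Published: 2021-05-26 16:03:14

    Updated: 2021-05-26 16:03:18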

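Main text

Hive Comprehensive Application Case: Student Score Queries

Level 1: Number of students enrolled in each course

Level 2: Number of students enrolled in each course

Level 3: Number of students enrolled in each course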

---------- do not modify ----------
drop database if exists mydb cascade;
set hive.auto.convert.join = false;
set hive.ignore.mapjoin.hint = false;
---------- do not modify ----------


---------- begin ----------
-- Create the mydb database
create database if not exists mydb;

-- Switch to the mydb database
use mydb;

-- Create table my_stu
create table if not exists my_stu(
id string comment 'student id',
name string comment 'name',
sex string comment 'sex',
age string comment 'age',
col string comment 'department')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_student.txt
load data local inpath '/root/data/step2_files/my_student.txt' into table my_stu;

-- Create table my_score
create table if not exists my_score(
id string comment 'student id',
courseid string comment 'course id',
score string comment 'score')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_score.txt
load data local inpath '/root/data/step2_files/my_score.txt' into table my_score;

-- Create table my_course
create table if not exists my_course(
courseid string comment 'course id',
coursename string comment 'course name')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_course.txt
load data local inpath '/root/data/step2_files/my_course.txt' into table my_course;

-- Count how many students are enrolled in each course.
-- t1 pairs each score row with its student; t2 adds the course name.
select t2.coursename, count(*)
from
    (
        select t1.name name, course.coursename coursename
        from
            (
                select stu.name name, score.courseid courseid
                from my_score score, my_stu stu
                where score.id = stu.id
            ) as t1,
            my_course course
        where t1.courseid = course.courseid
    )
    as t2
group by t2.coursename;
---------- end ----------
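
The nested subqueries above combine the three tables with comma-style joins and `where` conditions. As a minimal sketch, the same count can also be written with explicit `join ... on` clauses. It assumes that `mydb` and its three tables have already been created and loaded as shown above; the alias `num_students` is illustrative and not part of the original, and whether the platform's evaluator accepts this variant is not confirmed.

```sql
-- Sketch only: assumes my_stu, my_score and my_course already exist in mydb.
use mydb;

-- Each score row is matched to its student and its course (inner joins),
-- then the rows are counted per course name.
select course.coursename, count(*) as num_students  -- num_students: illustrative alias
from my_score score
join my_stu stu on score.id = stu.id
join my_course course on score.courseid = course.courseid
group by course.coursename;
```

Both forms are inner joins, so a score row whose student id or course id has no match in the other tables is excluded from the count in either version.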

Level 4: Average score for the shujuku course

---------- do not modify ----------
drop database if exists mydb cascade;
set hive.auto.convert.join = false;
set hive.ignore.mapjoin.hint = false;
---------- do not modify ----------


---------- begin ----------
-- Create the mydb database
create database if not exists mydb;

-- Switch to the mydb database
use mydb;

-- Create table my_stu
create table if not exists my_stu(
id string comment 'student id',
name string comment 'name',
sex string comment 'sex',
age string comment 'age',
col string comment 'department')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_student.txt
load data local inpath '/root/data/step2_files/my_student.txt' into table my_stu;

-- Create table my_score
create table if not exists my_score(
id string comment 'student id',
courseid string comment 'course id',
score string comment 'score')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_score.txt
load data local inpath '/root/data/step2_files/my_score.txt' into table my_score;

-- Create table my_course
create table if not exists my_course(
courseid string comment 'course id',
coursename string comment 'course name')
row format delimited fields terminated by ','
stored as textfile;
-- Load data: /root/data/step2_files/my_course.txt
load data local inpath '/root/data/step2_files/my_course.txt' into table my_course;

-- Compute the average score for the shujuku course.
-- t1 finds the course id, t2 averages its scores, t3 supplies the course name.
select t3.coursename, t2.avg_score
from
    (
        select t1.courseid courseid, avg(score.score) avg_score
        from
            (
                select courseid
                from my_course
                where my_course.coursename = 'shujuku'
            ) as t1,
            my_score score
        where t1.courseid = score.courseid
        group by t1.courseid
    ) as t2,
    my_course t3
where t2.courseid = t3.courseid;

---------- end ----------
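
The `score` column is declared as `string`, so `avg(score.score)` in the query above depends on Hive converting the text to a number implicitly. The sketch below makes that conversion explicit with `cast` and folds the three-step query into a single join. It assumes the tables are already loaded and that every score value is numeric text, which the source does not show; it is an illustration, not the exercise's reference answer.

```sql
-- Sketch only: assumes my_course and my_score already exist in mydb
-- and that score holds numeric text such as '85'.
use mydb;

select course.coursename,
       avg(cast(score.score as double)) as avg_score  -- explicit string-to-double conversion
from my_course course
join my_score score on course.courseid = score.courseid
where course.coursename = 'shujuku'
group by course.courseid, course.coursename;
```

Grouping by both `courseid` and `coursename` keeps one output row per course id, which matches the grouping used in the nested version.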

Hive Comprehensive Application Case: User Education Queries

Level 1: Total number of days from each user's birth until now

---------- do not modify ----------
drop database if exists mydb cascade;
---------- do not modify ----------


---------- begin ----------
-- Create the mydb database
create database if not exists mydb;
-- Switch to the mydb database
use mydb;

-- Create table usertab
create table usertab(
    id string,
    sex string,
    time string,
    education string,
    occupation string,
    income string,
    area string,
    desired_area string,
    city_countryside string
)
row format delimited fields terminated by ',';

-- Load data: /root/data.txt
load data local inpath '/root/data.txt' into table usertab;

-- Total number of days from each user's birth date to 2019-06-10.
-- regexp_replace turns the '/' separators in time into '-' before datediff is applied.
select id, datediff('2019-06-10', regexp_replace(time, '/', '-')) from usertab;
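
The query above wraps `time` in `regexp_replace` before passing it to `datediff`, which suggests the birth dates in the data file use slashes, for example `2019/06/01`. That format is an assumption, since the file itself is not shown. The two statements below use only literal values, so they can be run without any table to see what each function returns.

```sql
-- Sketch only: literal values stand in for the time column.

-- Step 1: replace every '/' with '-' to get a yyyy-MM-dd string.
select regexp_replace('2019/06/01', '/', '-');
-- 2019-06-01

-- Step 2: datediff(end, start) returns the number of days from start to end.
select datediff('2019-06-10', regexp_replace('2019/06/01', '/', '-'));
-- 9
```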

Level 2: Highest income for each education level within the same area

---------- do not modify ----------
drop database if exists mydb cascade;
---------- do not modify ----------


---------- begin ----------

-- Create the mydb database
create database if not exists mydb;

-- Switch to the mydb database
use mydb;

-- Create table usertab1
create table usertab1(
    id int,
    sex string,
    time string,
    education string,
    occupation string,
    income string,
    area string,
    desired_area string,
    city_countryside string
)
row format delimited fields terminated by ',';

-- Load data: /root/data1.txt
load data local inpath '/root/data1.txt' into table usertab1;

-- Highest income for each education level within the same area.
-- Rows are ranked by income inside each (area, education) group; rank 1 is kept.
select area,education,income
from(
    select area,education,income,
    row_number() over(
        partition by area, education order by income desc
    ) as t1
    from usertab1
) as t2
where t2.t1 = 1;

---------- end ----------
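
One point to watch in the query above: `income` is declared as `string`, so `order by income desc` compares text rather than numbers, and a value such as `'9000'` would rank above `'10000'`. Whether this matters depends on the data file, which is not shown. If the incomes are plain integers with differing digit counts, a possible variant is to cast before comparing. The sketch below assumes integer-valued income text and uses an illustrative alias `max_income`; it returns the income as an `int`, so its output format may differ from what the platform's evaluator expects.

```sql
-- Sketch only: assumes usertab1 already exists in mydb
-- and that income holds integer text such as '10000'.
use mydb;

select area,
       education,
       max(cast(income as int)) as max_income  -- numeric comparison instead of text comparison
from usertab1
group by area, education;
```

To keep the `row_number()` form of the original instead, the same cast can be applied inside the window clause as `order by cast(income as int) desc`.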

Level 3: Percentage of all users at each education level

---------- do not modify ----------

drop database if exists mydb cascade;
set hive.mapred.mode=nonstrict;
---------- do not modify ----------


---------- begin ----------

-- Create the mydb database
create database if not exists mydb;

-- Switch to the mydb database
use mydb;

-- Create table usertab2
create table usertab2(
    id int,
    sex string,
    time string,
    education string,
    occupation string,
    income string,
    area string,
    desired_area string,
    city_countryside string
)
row format delimited fields terminated by ',';

-- Load data: /root/data.txt
load data local inpath '/root/data.txt' into table usertab2;

-- Percentage of all users at each education level (rounded to two decimal places).
-- t1 holds the count per education level; t2 holds the single total count.
select concat(round(t1.cnted * 100 / t2.cnt, 2),'%'), t1.education
from
    (
        select count(*) as cnted,education
        from usertab2
        group by education
    ) as t1,

    (
        select count(*) as cnt from usertab2
    ) as t2
order by t1.education;

---------- end ----------
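
In the query above, `t1` and `t2` are listed with a comma and no join condition, which is a Cartesian product. Because `t2` contains exactly one row (the total count), every per-education row in `t1` is simply paired with that total. As a sketch of an alternative that avoids the cross join, the total can be computed with a window function over the grouped counts. It assumes `usertab2` is already loaded; the aliases `t` and `pct` are illustrative, and this is not presented as the exercise's reference answer.

```sql
-- Sketch only: assumes usertab2 already exists in mydb.
use mydb;

select concat(round(cnted * 100 / sum(cnted) over (), 2), '%') as pct,  -- share of the overall total
       education
from (
    -- one row per education level with its user count
    select education, count(*) as cnted
    from usertab2
    group by education
) t
order by education;
```

Here `sum(cnted) over ()` uses an empty window, so it adds up the counts across all result rows and attaches that total to each row.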

Hive Comprehensive Application Case: User Search Log Analysis

Original article: https://www.cnblogs.com/brownpeach/p/14813861.html
