06 RDD Programming
- How many students are there in total? map(), distinct(), count()
- How many courses are offered?
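- How many courses did each student take? map(), countByKey()
- How many students are enrolled in each course? map(), countByValue()
- How many courses did Tom take, and what was his score in each? filter(), map() (result as an RDD)
- How many courses did Tom take, and what was his score in each? map(), lookup() (result as a list)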
- Sort Tom's results by score. filter(), map(), sortBy()
- Tom's average score. map(), lookup(), mean()
- For each course, find the number of students enrolled and the average score. combineByKey()
# Each collected element is assumed to be a (course, value) tuple:
# student counts in the first RDD, score totals in the second.
course_list = stu_rdd_cource_count_reduce.collect()
# Collect the totals once and index them by course, instead of re-collecting on every iteration.
score_sums = dict(stu_rdd_cource_sum_reduce.collect())
for sk, rs in course_list:
    zf = score_sums[sk]
    # course name, number of students, average score
    print(sk, rs, round(zf / rs, 2))

- Visualize the results. pyecharts.charts, Bar()
import pyecharts.options as opts
from pyecharts.charts import Bar

# Course names, followed by per-course student counts and average scores in the same order.
x = ['ComputerNetwork', 'Software', 'DataBase', 'Algorithm', 'OperatingSystem', 'Python', 'DataStructure', 'CLanguage']
y = [
    [142, 132, 126, 144, 134, 136, 131, 128],
    [51.9, 50.91, 50.54, 48.83, 54.94, 57.82, 47.57, 50.61],
]
bar = (
    Bar()
    .add_xaxis(x)
    .add_yaxis(series_name='Students', y_axis=y[0])
    .add_yaxis(series_name='Average score', y_axis=y[1])
    .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
    # Keep all global options in a single set_global_opts() call so a later call cannot overwrite the title.
    .set_global_opts(
        title_opts=opts.TitleOpts(title='Courses', pos_left='right'),
        toolbox_opts=opts.ToolboxOpts(is_show=True),
        yaxis_opts=opts.AxisOpts(name='Students'),
        # axislabel_opts is an AxisOpts argument, not a set_global_opts() argument.
        xaxis_opts=opts.AxisOpts(name='Course', axislabel_opts=opts.LabelOpts(rotate=15)),
    )
)
bar.render()  # with no argument, writes render.html in the current directory
bar.render('./bar.html')
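The `x` and `y` lists in the chart code are typed in by hand. As a small sketch, and assuming the per-course results are available as `(course, (student_count, average_score))` tuples (a hypothetical shape, for example the output of a `collect()`), the same lists could be derived instead of copied. The three rows below reuse figures from the chart data above.

```python
# Hypothetical input shape: (course, (student_count, average_score)) tuples.
course_stats = [
    ("ComputerNetwork", (142, 51.9)),
    ("Software", (132, 50.91)),
    ("DataBase", (126, 50.54)),
]

# Category axis: course names in the order they were collected.
x = [course for course, _ in course_stats]

# Two series aligned with x: student counts first, average scores second.
y = [
    [count for _, (count, _avg) in course_stats],
    [avg for _, (_count, avg) in course_stats],
]

print(x)
print(y)
```

Building both lists from one source keeps the course names and their values in the same order, which hand-copied lists do not guarantee.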