
      A Quick Hands-On with CDH 5.13

      Compared with plain Apache Hadoop, whose usability is poor, the commercial Hadoop distributions such as Cloudera, Hortonworks, MapR, and the domestic vendor Transwarp (星環) offer better performance and ease of use. Below is a quick hands-on with CDH (Cloudera Distribution including Apache Hadoop).

      First, download the pre-built QuickStart VM from Cloudera's website at https://www.cloudera.com/downloads/quickstart_vms/5-13.html, unpack it, and open it in your virtualization software. The official recommendation is at least 8 GB of RAM and 2 CPUs; since my laptop has enough headroom, I started it with 8 GB of RAM and 8 CPUs. Every account and password in the VM is cloudera.

      In the VM's browser, open http://quickstart.cloudera/#/.

      Click Get Started to begin the tutorial.


      Tutorial Exercise 1: Import and query relational data

      Use Sqoop to import the MySQL data into HDFS:


      [cloudera@quickstart ~]$ sqoop import-all-tables \
      >     -m 1 \
      >     --connect jdbc:mysql://quickstart:3306/retail_db \
      >     --username=retail_dba \
      >     --password=cloudera \
      >     --compression-codec=snappy \
      >     --as-parquetfile \
      >     --warehouse-dir=/user/hive/warehouse \
      >     --hive-import
      Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
      Please set $ACCUMULO_HOME to the root of your Accumulo installation.
      19/04/29 18:31:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
      19/04/29 18:31:46 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
      19/04/29 18:31:46 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
      19/04/29 18:31:46 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
      19/04/29 18:31:46 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
      
      (many more lines suppressed)
      
                      Failed Shuffles=0
                      Merged Map outputs=0
                      GC time elapsed (ms)=87
                      CPU time spent (ms)=3690
                      Physical memory (bytes) snapshot=443174912
                      Virtual memory (bytes) snapshot=1616969728
                      Total committed heap usage (bytes)=352845824
              File Input Format Counters 
                      Bytes Read=0
              File Output Format Counters 
                      Bytes Written=0
      19/04/29 18:38:27 INFO mapreduce.ImportJobBase: Transferred 46.1328 KB in 85.1717 seconds (554.6442 bytes/sec)
      19/04/29 18:38:27 INFO mapreduce.ImportJobBase: Retrieved 1345 records.
      [cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse/
      Found 6 items
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:32 /user/hive/warehouse/categories
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:33 /user/hive/warehouse/customers
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:34 /user/hive/warehouse/departments
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:35 /user/hive/warehouse/order_items
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:36 /user/hive/warehouse/orders
      drwxrwxrwx   - cloudera supergroup          0 2019-04-29 18:38 /user/hive/warehouse/products
      [cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse/categories/
      Found 3 items
      drwxr-xr-x   - cloudera supergroup          0 2019-04-29 18:31 /user/hive/warehouse/categories/.metadata
      drwxr-xr-x   - cloudera supergroup          0 2019-04-29 18:32 /user/hive/warehouse/categories/.signals
      -rw-r--r--   1 cloudera supergroup       1957 2019-04-29 18:32 /user/hive/warehouse/categories/6e701a22-4f74-4623-abd1-965077105fd3.parquet
      [cloudera@quickstart ~]$ 
      

      Then open Hue at http://quickstart.cloudera:8888/ and query the tables there (invalidate metadata; refreshes Impala's metadata cache so it can see the newly imported tables).
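
      For reference, the statements typed into Hue's Impala editor look roughly like the following. The table names match the directories listed by hadoop fs -ls above; the join query and its column names are only an assumed sketch of the kind of query run against the retail_db sample schema.

      -- Run in the Impala editor in Hue.
      -- Make the tables Sqoop just created visible to Impala, then list them.
      invalidate metadata;
      show tables;

      -- Illustrative query (column names are assumptions about retail_db):
      -- top 10 products by revenue.
      SELECT p.product_name, SUM(oi.order_item_subtotal) AS revenue
      FROM order_items oi
      JOIN products p ON oi.order_item_product_id = p.product_id
      GROUP BY p.product_name
      ORDER BY revenue DESC
      LIMIT 10;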


      Tutorial Exercise 2: Load access-log data into HDFS as an external table and query it

      Create the tables in Hive:

      CREATE EXTERNAL TABLE intermediate_access_logs (
          ip STRING,
          date STRING,
          method STRING,
          url STRING,
          http_version STRING,
          code1 STRING,
          code2 STRING,
          dash STRING,
          user_agent STRING)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
      WITH SERDEPROPERTIES (
          'input.regex' = '([^ ]*) - - \\[([^\\]]*)\\] "([^\ ]*) ([^\ ]*) ([^\ ]*)" (\\d*) (\\d*) "([^"]*)" "([^"]*)"',
          'output.format.string' = "%1$$s %2$$s %3$$s %4$$s %5$$s %6$$s %7$$s %8$$s %9$$s")
      LOCATION '/user/hive/warehouse/original_access_logs';
      
      CREATE EXTERNAL TABLE tokenized_access_logs (
          ip STRING,
          date STRING,
          method STRING,
          url STRING,
          http_version STRING,
          code1 STRING,
          code2 STRING,
          dash STRING,
          user_agent STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      LOCATION '/user/hive/warehouse/tokenized_access_logs';
      
      ADD JAR /usr/lib/hive/lib/hive-contrib.jar;
      
      INSERT OVERWRITE TABLE tokenized_access_logs SELECT * FROM intermediate_access_logs;
      

      Refresh the metadata in Impala and then query the table. (Impala cannot use the RegexSerDe from hive-contrib, which is why the parsed rows are first rewritten into the plain comma-delimited tokenized_access_logs table.)
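
      As a rough sketch, the steps in the Impala editor would be along these lines; the table and column names come from the CREATE TABLE statement above, but the particular aggregation is only an illustrative example rather than the tutorial's exact query.

      -- Make the new Hive table visible to Impala.
      invalidate metadata;

      -- Illustrative query: count requests per URL in the tokenized logs.
      SELECT url, COUNT(*) AS hits
      FROM tokenized_access_logs
      GROUP BY url
      ORDER BY hits DESC
      LIMIT 20;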


      Tutorial Exercise 3: Relationship analysis with Spark


      Tutorial Exercise 4: Collect logs with Flume and index them with Solr for full-text search


      Tutorial Exercise 5: Visualization

      Tutorial is over!

      posted @ 2019-08-19 15:07  九命貓幺