Hadoop ecosystem

2024-07-08 07:58:51

How did it all start? With huge amounts of data on the web!
Nutch was built to crawl this web data.
This huge volume of data had to be saved: HDFS was born!
How to use this data?
The MapReduce framework was built for coding and running analytics: in Java, or in other languages via Hadoop Streaming/Pipes
How to get unstructured data in (web logs, click streams, Apache logs, server logs): FUSE, WebDAV, Chukwa, Flume, Scribe
Hiho and Sqoop for loading data into HDFS: RDBMSs can join the Hadoop bandwagon!
High-level interfaces are required over low-level MapReduce programming: Pig, Hive, Jaql
BI tools with advanced UI reporting (drill-down etc.): Intellicus
Workflow tools over MapReduce processes and high-level languages
Monitor and manage Hadoop, run jobs/Hive queries, view HDFS at a high level: Hue, Karmasphere, Eclipse plugin, Cacti, Ganglia
Support frameworks: Avro (serialization), ZooKeeper (coordination)
More high-level interfaces/uses: Mahout, Elastic MapReduce
OLTP is also possible: HBase
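
To make the MapReduce point above more concrete, here is a minimal word-count sketch in plain Python. It only imitates the map, shuffle/sort and reduce phases inside a single process; it does not use any Hadoop API, and the names mapper, reducer and run_local are hypothetical, chosen for illustration. With Hadoop Streaming, the map and reduce steps would instead be separate scripts reading stdin and writing tab-separated key/value lines to stdout.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts for each word.

    Expects pairs sorted by key, which is what the shuffle/sort step
    between map and reduce provides in a real cluster.
    """
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in one process (illustration only)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))
```

Calling run_local(["Hadoop stores data", "hadoop processes data"]) counts "hadoop" and "data" twice each and the other words once. The high-level tools listed above (Pig, Hive, Jaql) exist largely so that this kind of map and reduce logic does not have to be written by hand.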
