
Configuring Flume to watch a file directory and sink to HDFS

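1: Introduction to Flume

        Flume is a distributed, reliable, and highly available system for aggregating large volumes of log data. It supports custom data senders for collecting data, and it can apply simple processing to the data before writing it to a variety of (customizable) data receivers. The Flume architecture has three parts: Source, Channel, and Sink. A source receives events, a channel buffers them, and a sink delivers them to the destination.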

 

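2: Configuration file

    In this configuration the source is a directory (a spooling directory source). Note that files in this directory must be read-only: they must not be written to after they are placed there, and file names must not repeat. The channel is a file channel and the sink is HDFS. The roll policy used here is to roll the HDFS file when 3600 s have elapsed or when the file size reaches 128M (128000000 bytes), whichever comes first; rolling by event count is disabled (rollCount = 0).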
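The read-only and unique-name requirements above are usually met by writing a file completely somewhere else and then moving it into the watched directory under a fresh name. The following is a minimal sketch of that idea, not part of the original setup: the helper name `spool_file`, the `STAGING_DIR` location, and the timestamp/PID/random naming scheme are all assumptions, and the staging directory is assumed to be on the same filesystem as the spool directory so that `mv` is a plain rename.

```shell
#!/usr/bin/env bash
# SPOOL_DIR matches spoolDir in the configuration below.
# STAGING_DIR is a hypothetical sibling directory, assumed to live on the
# same filesystem so that the final mv is a rename rather than a copy.
SPOOL_DIR="${SPOOL_DIR:-/data/lwq/new_log}"
STAGING_DIR="${STAGING_DIR:-${SPOOL_DIR%/}.staging}"

# spool_file SRC
# Copies SRC to the staging directory, removes write permission, then moves
# the finished copy into the spool directory under a name that is very
# unlikely to repeat. Prints the final path on success.
spool_file() {
  local src="$1"
  local name
  name="$(basename "$src").$(date +%Y%m%d%H%M%S).$$.$RANDOM"
  mkdir -p "$STAGING_DIR" "$SPOOL_DIR" || return 1
  cp "$src" "$STAGING_DIR/$name" &&
    chmod a-w "$STAGING_DIR/$name" &&
    mv "$STAGING_DIR/$name" "$SPOOL_DIR/$name" &&
    printf '%s\n' "$SPOOL_DIR/$name"
}
```

A call such as `spool_file /var/log/app/access.log` (a hypothetical path) would leave one finished, read-only copy in the spool directory, so Flume never sees a file that is still being written.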

agent1.sources = spooldirSource
agent1.channels = fileChannel
agent1.sinks = hdfsSink

agent1.sources.spooldirSource.type=spooldir
agent1.sources.spooldirSource.spoolDir=/data/lwq/new_log
agent1.sources.spooldirSource.channels=fileChannel

agent1.sinks.hdfsSink.type=hdfs
agent1.sinks.hdfsSink.hdfs.path=hdfs://dev228:8020/raw/lwq/%y-%m-%d
agent1.sinks.hdfsSink.hdfs.filePrefix=lwq
agent1.sinks.hdfsSink.hdfs.round = true
# Number of seconds to wait before rolling current file (0 = never roll based on time interval)
agent1.sinks.hdfsSink.hdfs.rollInterval = 3600
# File size to trigger roll, in bytes (0: never roll based on file size)
agent1.sinks.hdfsSink.hdfs.rollSize = 128000000
agent1.sinks.hdfsSink.hdfs.rollCount = 0
agent1.sinks.hdfsSink.hdfs.batchSize = 1000
# Rounded down to the highest multiple of this (in the unit configured using hdfs.roundUnit), less than current time.
agent1.sinks.hdfsSink.hdfs.roundValue = 1
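The listing above names the channel but does not show its type or the sink's channel binding, and a Flume agent needs both before it will start. The time escapes in hdfs.path (%y-%m-%d) also need a timestamp on each event, which a spooling directory source does not add on its own. The following is a sketch of the lines that would likely be needed, written as a shell helper that appends them to the config file. The function name `append_missing_settings` is hypothetical. Using `hdfs.useLocalTimeStamp` rather than a timestamp interceptor is an assumption, and the file channel's checkpoint and data directories are left at their defaults.

```shell
#!/usr/bin/env bash
# append_missing_settings CONF_FILE
# Appends the channel definition, the sink-to-channel binding, and the local
# timestamp switch to CONF_FILE. The property names are standard Flume
# settings; whether the original author used exactly these is an assumption.
append_missing_settings() {
  local conf_file="$1"
  mkdir -p "$(dirname "$conf_file")" || return 1
  cat >> "$conf_file" <<'EOF'

# Channel type (the text says the channel is a file channel)
agent1.channels.fileChannel.type = file
# A sink reads from exactly one channel, so the key is "channel", not "channels"
agent1.sinks.hdfsSink.channel = fileChannel
# Use the agent's clock to resolve %y-%m-%d in hdfs.path
agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
EOF
}
```

For the setup in this article the call would be `append_missing_settings conf/flume-site.xml`.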

 

3: Start command

  

${FLUME_HOME}/bin/flume-ng agent --conf ./conf/ -f conf/flume-site.xml -Dflume.root.logger=DEBUG,console -n agent1 > log.log 2>&1 &
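Once the agent is running in the background, it helps to check that output is landing where hdfs.path says it should. The sketch below only builds the directory name that the %y-%m-%d pattern resolves to for a given day. The helper name `hdfs_day_dir` is hypothetical, and it assumes the shell and the Flume agent agree on the local date.

```shell
#!/usr/bin/env bash
# hdfs_day_dir [DAY]
# Prints the HDFS directory for DAY (format yy-mm-dd, default: today),
# using the base path taken from hdfs.path in the configuration.
hdfs_day_dir() {
  local base="hdfs://dev228:8020/raw/lwq"
  local day="${1:-$(date +%y-%m-%d)}"
  printf '%s/%s\n' "$base" "$day"
}
```

With this in place, `hdfs dfs -ls "$(hdfs_day_dir)"` lists today's output files, and `tail -f log.log` follows the agent log that the start command redirects to. Files that are still being written normally carry a `.tmp` suffix until the sink rolls them.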