Apache Flume online training
Apache Flume is a data ingestion framework that writes event-based data to the Hadoop
Distributed File System (HDFS). Hadoop processes big data, which raises a question:
how is the data generated by many different web servers transferred into HDFS?
The answer is Apache Flume, which is designed for high-volume ingestion of event-based
data into Hadoop.
Flume Source: A Flume source receives data from data generators such as Facebook or
Twitter. The source collects data from the generator and transfers it to a Flume channel
in the form of Flume events.
Flume Channel: A Flume channel is an intermediate store that buffers the events
transferred by a Flume source until they are consumed by a sink.
Flume Sink: A Flume sink delivers data to repositories such as HDFS or HBase. The sink
consumes events from the channel and stores them in the destination store.
Flume Agent: A Flume agent is a long-running Java process that hosts a source, channel
and sink combination. A Flume deployment can have more than one agent. We can consider
Flume as a collection of connected Flume agents that are distributed in nature.
Flume Event: An event is the unit of data transported in Flume; it is the general
representation of a data object in Flume. An event is made up of a payload, stored as a
byte array, with optional headers.
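To make the relationship between these components concrete, here is a minimal sketch in
plain Python. It is a conceptual model only, not Flume's Java API: the names Event,
Channel, run_source, run_sink and hdfs_stand_in are hypothetical, and a Python list
stands in for the destination store.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """Illustrative stand-in for a Flume event: a byte payload plus optional headers."""
    body: bytes
    headers: dict = field(default_factory=dict)


class Channel:
    """In-memory buffer sitting between a source and a sink."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def take(self):
        # Return the oldest buffered event, or None when the channel is empty.
        return self._queue.popleft() if self._queue else None


def run_source(lines, channel):
    """Source: wrap each incoming log line in an Event and hand it to the channel."""
    for line in lines:
        channel.put(Event(body=line.encode("utf-8"), headers={"origin": "web-server"}))


def run_sink(channel, store):
    """Sink: drain the channel and append each payload to the destination store."""
    while (event := channel.take()) is not None:
        store.append(event.body)


channel = Channel()
hdfs_stand_in = []  # hypothetical destination; a real sink would write to HDFS or HBase
run_source(["GET /index.html", "GET /about.html"], channel)
run_sink(channel, hdfs_stand_in)
print(hdfs_stand_in)  # [b'GET /index.html', b'GET /about.html']
```

In this model the source, channel and sink running together play the role of a single
agent; connecting several such pipelines would correspond to a multi-agent flow.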
Advantages of Apache Flume:
Scalable: Flume scales horizontally, i.e., we can add new nodes as demand grows.
Reliable: Apache Flume supports transactions and ensures that no data is lost in the
process of data transmission. It uses separate transactions from source to channel and
from channel to sink.
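The idea behind this reliability guarantee can be sketched as a toy channel in Python. This
is an illustration under simplifying assumptions, not Flume's actual transaction API: the
class TransactionalChannel and the helper deliver are hypothetical, and a single object
tracks both the source-side and the sink-side transaction.

```python
class TransactionalChannel:
    """Toy channel whose puts and takes only become permanent on commit."""

    def __init__(self):
        self._committed = []      # events visible to sinks
        self._pending_puts = []   # staged by a source, not yet committed
        self._pending_takes = []  # handed to a sink, not yet confirmed

    def put(self, event):
        self._pending_puts.append(event)

    def take(self):
        if not self._committed:
            return None
        event = self._committed.pop(0)
        self._pending_takes.append(event)
        return event

    def commit(self):
        # Staged puts become visible; taken events are gone for good.
        self._committed.extend(self._pending_puts)
        self._pending_puts.clear()
        self._pending_takes.clear()

    def rollback(self):
        # Staged puts are discarded; taken events return to the front of the queue.
        self._pending_puts.clear()
        self._committed[:0] = self._pending_takes
        self._pending_takes.clear()

    def size(self):
        return len(self._committed)


def deliver(channel, write):
    """Sink-side transaction: take one event, write it, then commit or roll back."""
    event = channel.take()
    if event is None:
        return False
    try:
        write(event)
    except OSError:
        channel.rollback()  # the event stays in the channel for a later retry
        return False
    channel.commit()
    return True


def failing_write(event):
    raise OSError("destination unavailable")


channel = TransactionalChannel()
channel.put(b"event-1")
channel.commit()  # source-side transaction: source -> channel

stored = []
print(deliver(channel, failing_write), channel.size())  # False 1
print(deliver(channel, stored.append), channel.size())  # True 0
print(stored)  # [b'event-1']
```

The failed write leaves the event in the channel, so a later attempt can still deliver it;
the event is removed only once the sink confirms the write.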
Flume is customizable and provides support for various sources and sinks, such as Kafka,
Avro, spooling directory, Thrift, etc.
In Flume, a single source can transmit data to multiple channels, and those channels in
turn transmit the data to multiple sinks. A single source can therefore deliver data to
multiple sinks.
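A small Python sketch of this fan-out, assuming the simplest case in which every event is
copied to every channel. The functions replicate and drain and the channel and store names
are hypothetical, and plain deques and lists stand in for real channels and destinations.

```python
from collections import deque


def replicate(event, channels):
    """Copy one event into every attached channel (replicating fan-out)."""
    for channel in channels:
        channel.append(event)


def drain(channel, destination):
    """Sink: move everything buffered in one channel to its destination."""
    while channel:
        destination.append(channel.popleft())


# Hypothetical setup: one source feeding two channels, each with its own sink.
hdfs_channel, hbase_channel = deque(), deque()
hdfs_store, hbase_store = [], []

for event in (b"login", b"logout"):
    replicate(event, [hdfs_channel, hbase_channel])

drain(hdfs_channel, hdfs_store)
drain(hbase_channel, hbase_store)
print(hdfs_store == hbase_store == [b"login", b"logout"])  # True
```

Because each channel holds its own copy, a slow or failed destination in this model does
not stop the other sink from draining its channel.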
Flume provides a steady flow of data between reading and writing, i.e., when data arrives
faster than it can be written to the destination, the channel buffers the difference
until the sink catches up.
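One way to picture this steady flow is a short simulation in Python, assuming the channel
acts as a buffer between a bursty source and a sink with a fixed write rate. The function
simulate and its parameters are hypothetical, and the numbers are made up for illustration.

```python
from collections import deque


def simulate(arrivals_per_tick, sink_rate):
    """Return the channel backlog after each tick for a given arrival pattern."""
    channel = deque()
    backlog = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # Source: enqueue this tick's incoming events.
        for _ in range(arrivals):
            channel.append(next_id)
            next_id += 1
        # Sink: write at most sink_rate events per tick.
        for _ in range(min(sink_rate, len(channel))):
            channel.popleft()
        backlog.append(len(channel))
    return backlog


# A burst of 5 events per tick for two ticks, then quiet; the sink handles 2 per tick.
print(simulate([5, 5, 0, 0, 0], sink_rate=2))  # [3, 6, 4, 2, 0]
```

The backlog grows during the burst and then drains to zero, so the sink keeps writing at
its own pace while no event is dropped.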