This bundle provides a complete deployment of the core components of the Apache Hadoop platform along with its monitoring facilities. This document describes how the components of this deployment are connected and how to connect additional services to the deployment to add functionality.
This bundle supports two ways to extend its functionality with additional charms: by relating to the subordinate apache-hadoop-plugin charm, or by deploying a stand-alone charm that connects to HDFS and YARN via the standard relations.
By connecting to the plugin charm via the hadoop-plugin subordinate relation, a charm gains access to a pre-configured Apache Hadoop environment, including the Java libraries and the hdfs command-line utility. The /etc/environment file on the unit will have all of the path and configuration variables defined.
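For example, once a charm is related to the plugin, you can verify the environment from one of its units. The commands below are a minimal sketch using the bundled client service (assuming its first unit is named client/0; any plugin-related unit works the same way):

    juju ssh client/0
    # on the unit: /etc/environment is already populated and the hdfs CLI is on the PATH
    hdfs dfs -ls /
    exit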
The details of the interface protocol used by this relation are documented in the DEV-README.md of the plugin charm.
For example, to deploy apache-hive with this bundle, you would use:
    juju quickstart apache-core-batch-processing
    juju deploy cs:trusty/apache-hive
    juju add-relation apache-hive plugin
This workflow has been bundled and is available as apache-analytics-sql.
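If you want that combined stack without wiring it up by hand, deploying that bundle directly should be equivalent to the steps above:

    juju quickstart apache-analytics-sql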
If a charm does not need the Java libraries, environment configuration, and so on, and only needs the address information for the NameNode and ResourceManager, then it can connect directly to the namenode and resourcemanager services via their namenode and resourcemanager relations, respectively.
The details of the interface protocols used by these relations are documented in the DEV-README.md of the respective charms.
For example, to deploy apache-hue with this bundle, you would use:
    juju quickstart apache-core-batch-processing
    juju deploy cs:trusty/apache-hue
    juju add-relation apache-hue namenode
    juju add-relation apache-hue resourcemanager
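Once the relations have settled, you would typically expose Hue and browse to its web UI at the unit's public address (Hue listens on port 8888 by default; check the apache-hue charm's README for specifics):

    juju expose apache-hue
    juju status apache-hue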
This bundle deploys the following services (for detailed information, including relation specifications, see each charm's DEV-README.md):

- namenode: The two units of this service provide the NameNode in an HA configuration.
- resourcemanager: This service runs the ResourceManager component.
- slave: The three units of this service run the DataNode, NodeManager, and JournalNode components.
- plugin: This subordinate provides a consolidated connection point for Apache Hadoop and provides the Hadoop libraries and configuration for connected clients.
- client: This is a minimal client which serves as a gateway node.
- zookeeper: The three units of this service provide automatic failover for the NameNode.
- ganglia: This service provides the UI for monitoring (see the example following this list).
- ganglia-node: This subordinate collects system-level monitoring data from each node.
- rsyslog-forwarder-ha: This subordinate forwards logs from each node to flume-syslog.
- flume-syslog: This service collects the logs in preparation for ingesting them into HDFS.
- flume-hdfs: This service takes the logs from flume-syslog and ingests them into HDFS.
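As an illustration of the monitoring and log-aggregation pieces, you can expose ganglia to reach its web UI and inspect the ingested logs in HDFS from the gateway unit. The HDFS path below is only an assumption; the actual location depends on the flume-hdfs configuration:

    juju expose ganglia
    juju status ganglia
    juju ssh client/0
    # on the unit: list the syslog events ingested by flume-hdfs (path is an assumption)
    hdfs dfs -ls /user/flume
    exit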