Hadoop first draft
-
Install a Hadoop machine. I used Ambari: https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.0+from+Public+Repositories
-
Two files are relevant: one holds the credentials, the other holds the REST endpoint and related settings.
- Copy jets3t.properties (https://github.com/noobaa/Connectors/blob/master/jets3t.properties) to /usr/local/hadoop/etc/hadoop/jets3t.properties
- Copy core-site.xml (https://github.com/noobaa/Connectors/blob/master/core-site.xml) to /usr/local/hadoop/etc/hadoop/core-site.xml
Edit both files to reflect your NooBaa endpoint and credentials.
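As a rough sanity check after editing, the sketch below greps both files for the settings that usually matter. The key names are assumptions based on the stock s3n connector and JetS3t (fs.s3n.awsAccessKeyId, fs.s3n.awsSecretAccessKey, s3service.s3-endpoint); the files in the Connectors repository may use others, so adjust the list to match. check_noobaa_conf is a hypothetical helper name, not part of Hadoop or NooBaa.

```shell
# check_noobaa_conf DIR
# Hypothetical helper: reports which expected settings are missing from
# core-site.xml and jets3t.properties in DIR. Key names are assumptions.
check_noobaa_conf() {
    conf_dir=$1
    missing=0
    for key in fs.s3n.awsAccessKeyId fs.s3n.awsSecretAccessKey; do
        # -F: treat the key as a fixed string, not a regular expression
        if ! grep -qF "$key" "$conf_dir/core-site.xml" 2>/dev/null; then
            echo "core-site.xml: missing $key"
            missing=1
        fi
    done
    if ! grep -q '^s3service\.s3-endpoint=' "$conf_dir/jets3t.properties" 2>/dev/null; then
        echo "jets3t.properties: missing s3service.s3-endpoint"
        missing=1
    fi
    return $missing
}
```

For example, check_noobaa_conf /usr/local/hadoop/etc/hadoop prints nothing and returns 0 when all three settings are present.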
- Install s3fs (https://github.com/s3fs-fuse/s3fs-fuse), create the output bucket on NooBaa (hadoop-out in this example), and mount it on a local folder. Replace the URL with your own NooBaa endpoint.
mkdir /hadoop-out
/usr/local/bin/s3fs hadoop-out /hadoop-out -o passwd_file=passwd -o use_path_request_style -o url=http://146.148.44.71/ -o sigv2 -o parallel_count=8
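The passwd_file option above expects a credentials file. A minimal sketch of creating one, assuming the single-line ACCESS_KEY:SECRET_KEY format that s3fs documents. s3fs refuses a password file that other users can read, hence the restrictive permissions. write_s3fs_passwd is a hypothetical helper name.

```shell
# write_s3fs_passwd FILE ACCESS_KEY SECRET_KEY
# Hypothetical helper: writes an s3fs password file readable only by its owner.
write_s3fs_passwd() {
    # Create the file under a restrictive umask so it is never world-readable.
    ( umask 077 && printf '%s:%s\n' "$2" "$3" > "$1" )
    # Also tighten permissions in case the file already existed.
    chmod 600 "$1"
}
```

Usage would be something like write_s3fs_passwd passwd "$ACCESS_KEY" "$SECRET_KEY" with your NooBaa keys. After mounting, mountpoint -q /hadoop-out reports whether the mount succeeded.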
-
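Test by creating a bucket with an input folder. In this example the bucket is named hadoop; the job reads its data from the input folder and writes the output to the local /hadoop-out mount. Adjust the version in the jar name to match your installation. MapReduce refuses to write to an output directory that already exists, so if the job fails for that reason, point it at a new subfolder of /hadoop-out instead.
cd /usr/local/hadoop
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.3.jar wordcount s3n://hadoop/input file:///hadoop-out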
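Once the job returns, its result can be checked from the local output folder. The sketch below assumes the standard MapReduce output layout (a _SUCCESS marker plus part-r-* files containing tab-separated word and count); wordcount_top is a hypothetical helper name.

```shell
# wordcount_top DIR [N]
# Hypothetical helper: prints the N most frequent words (default 10) from a
# finished wordcount output directory.
wordcount_top() {
    out_dir=$1
    n=${2:-10}
    if [ ! -e "$out_dir/_SUCCESS" ]; then
        echo "no _SUCCESS marker in $out_dir; the job did not finish" >&2
        return 1
    fi
    # Sort numerically on the count column (field 2), highest first.
    cat "$out_dir"/part-r-* | sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}
```

For example, wordcount_top /hadoop-out 5 lists the five most frequent words.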
---------------- TEMP ----
Download and unpack Hadoop:
wget http://www.us.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
tar -xvf hadoop-2.7.2.tar.gz -C /usr/local
mv /usr/local/hadoop-2.7.2 /usr/local/hadoop
Then edit /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
-
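Set JAVA_HOME (point it to the installed Java).
-
Update HADOOP_CLASSPATH so that it also includes the jars under share/hadoop/tools/lib. The $f below is the loop variable of the for loop that the stock hadoop-env.sh already wraps around these lines:
if [ "$HADOOP_CLASSPATH" ]; then
  export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f:$HADOOP_HOME/share/hadoop/tools/lib/*
else
  export HADOOP_CLASSPATH=$f:$HADOOP_HOME/share/hadoop/tools/lib/*
fi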
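To confirm the change took effect, the classpath Hadoop resolves can be inspected. The hadoop classpath command prints it as a colon-separated list; the sketch below, with a hypothetical helper has_tools_lib, looks for a tools/lib entry in such a list. It assumes the entry shows up with its path intact, which may differ between Hadoop versions.

```shell
# has_tools_lib CLASSPATH
# Hypothetical helper: succeeds if any colon-separated entry of CLASSPATH
# points into share/hadoop/tools/lib.
has_tools_lib() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -q 'share/hadoop/tools/lib'
}
```

For example: has_tools_lib "$(/usr/local/hadoop/bin/hadoop classpath)" && echo "tools/lib is on the classpath"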