Motivation
Currently we are using the default Hadoop command-line client, which is written in Java. There's nothing good about starting a Java application every time we need to touch HDFS.
What to do
We can use the Python `hdfs` module instead.
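One candidate is the `hdfs` package (HdfsCLI), which talks to the cluster over WebHDFS and reads its settings from the file named by `$HDFSCLI_CONFIG`. As a sketch only — the alias name, host, port, and user below are made up for illustration — such a config file could look like:

```ini
[global]
default.alias = prod

[prod.alias]
url = http://namenode1:50070
user = hdfsuser
```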
Difficulties
One difficulty I can see is getting the client configuration.

It can be taken from the system:

- it is stored in `/etc/hadoop/conf/hdfs-site.xml`, in `$HADOOP_CONF_DIR/hdfs-site.xml`, or somewhere else;
- to get information from the XML config file, we must know the filesystem name: it can be found in the same `hdfs-site.xml` (parameter `fs.defaultFS`) or, if absent, in `core-site.xml`;
- the HTTP connection parameters may not be specified (parameter `dfs.http.address`), in which case the address has to be reconstructed from the namenode settings (`dfs.ha.namenodes.$FS_NAME` plus `dfs.namenode.rpc-address.$FS_NAME.$NN_NAME`).

Alternatively, it can be taken from a config file `$HDFSCLI_CONFIG` created by hand (or generated in some automatic way from the system configuration).

Next, we need to somehow detect the active namenode among those found in the system configuration.
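The lookup described above can be sketched in Python. This is an assumption-laden sketch, not a finished implementation: the helper names and the sample XML are invented here; only the property names (`fs.defaultFS`, `dfs.http.address`, `dfs.ha.namenodes.*`, `dfs.namenode.rpc-address.*`) follow standard Hadoop conventions.

```python
# Sketch: read Hadoop *-site.xml files and enumerate candidate namenode
# addresses, as described above. Helper names and sample XML are hypothetical.
import xml.etree.ElementTree as ET

def parse_hadoop_conf(xml_text):
    """Turn a Hadoop *-site.xml document into a {name: value} dict."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name] = prop.findtext("value")
    return props

def namenode_addresses(conf, fs_name):
    """List candidate namenode addresses for a filesystem.

    Prefers dfs.http.address; otherwise falls back to the per-namenode
    rpc-address entries (the RPC port would still need translating to the
    HTTP one before use).
    """
    if "dfs.http.address" in conf:
        return [conf["dfs.http.address"]]
    nn_names = conf.get("dfs.ha.namenodes.%s" % fs_name, "").split(",")
    keys = ["dfs.namenode.rpc-address.%s.%s" % (fs_name, nn) for nn in nn_names]
    return [conf[k] for k in keys if k in conf]

# Hypothetical hdfs-site.xml content for an HA cluster named "mycluster".
HDFS_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>host1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>host2:8020</value></property>
</configuration>"""

conf = parse_hadoop_conf(HDFS_SITE)
fs_name = conf["fs.defaultFS"].split("//", 1)[1]
print(fs_name)                            # mycluster
print(namenode_addresses(conf, fs_name))  # ['host1:8020', 'host2:8020']
```

Detecting the active namenode would then mean probing each candidate in turn — for example issuing a cheap WebHDFS request against each one and keeping the endpoint that answers successfully rather than with a standby error.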
Everything else looks just fine.