CLI

CaffeOnSpark Command Line

spark-submit ... \
    --files <CAFFE_SOLVER_PROTOTXT>,<CAFFE_NET_PROTOTXT> \
    --conf spark.driver.extraLibraryPath="<LD_LIBRARY_PATH>" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="<LD_LIBRARY_PATH>" \
    --class com.yahoo.ml.caffe.CaffeOnSpark <CAFFE_ON_SPARK_JAR> \
        <ARGUMENTS>

CaffeOnSpark is launched through spark-submit, using the command line outlined above.

<CAFFE_SOLVER_PROTOTXT> ... the file path of your Caffe solver prototxt file
<CAFFE_NET_PROTOTXT> ... the file path of your Caffe network prototxt file
<LD_LIBRARY_PATH> ... the library path that CaffeOnSpark uses to locate the required .so files, including libcaffe.so, libcaffedistri.so and their dependencies (e.g. cuda, lmdbjni).
<CAFFE_ON_SPARK_JAR> ... the file path of the CaffeOnSpark jar file (caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar).

<ARGUMENTS> is a collection of CaffeOnSpark arguments, listed below.

Argument	Description
-conf <Conf>	solver prototxt file
-train	training mode
-test	test mode
-features <f1,f2,...>	feature extraction mode. Takes a comma-separated list of features
-label <Label>	blob name for label
-devices <Num>	number of devices (CPU or GPU) per executor. default: 1
-connection <Conn>	network connection interface, either ethernet or infiniband (default)
-model <File>	output model file URL (file:// or hdfs://...)
-weights <File>	input model file providing initial weights for training; used either alone or together with "-snapshot".
-snapshot <File>	input state file URL (file:// or hdfs://) for resuming training. "-weights" must also be specified with this option.
-outputFormat <Format>	feature output format, either json (default) or parquet.
-resize	resize input images. The height and width are specified in the memory data layer of the prototxt file.
-clusterSize <#Executors>	the number of executors to be used. Only used by Spark YARN clusters.
-persistent	indicates that data files should be persisted on the local file system
-lmdb_partitions <Parts>	the number of LMDB RDD partitions. default: cluster size
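
As an illustration of how the placeholders and arguments above fit together, the following is a minimal sketch of a training invocation, not a definitive recipe. Every file name, directory and URL in it is a hypothetical stand-in (only the class name and the jar file name come from this page), and the cluster options elided as "..." in the synopsis are left out on purpose. The script prints the assembled command and runs it only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: assemble a CaffeOnSpark training command from the arguments
# in the table above. All values below are hypothetical placeholders.
set -eu

CAFFE_SOLVER_PROTOTXT="solver.prototxt"      # hypothetical solver prototxt
CAFFE_NET_PROTOTXT="train_test.prototxt"     # hypothetical network prototxt
CAFFE_LIB_PATH="/path/to/caffe/libs"         # hypothetical dir with libcaffe.so etc.
CAFFE_ON_SPARK_JAR="caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar"
MODEL_URL="hdfs:///tmp/example.model"        # hypothetical output model URL

# Build the command as positional parameters so each value stays one word.
# Your own spark-submit cluster options (the "..." in the synopsis, such as
# --master) would go directly after "spark-submit".
set -- spark-submit \
    --files "${CAFFE_SOLVER_PROTOTXT},${CAFFE_NET_PROTOTXT}" \
    --conf "spark.driver.extraLibraryPath=${CAFFE_LIB_PATH}" \
    --conf "spark.executorEnv.LD_LIBRARY_PATH=${CAFFE_LIB_PATH}" \
    --class com.yahoo.ml.caffe.CaffeOnSpark "${CAFFE_ON_SPARK_JAR}" \
        -train \
        -conf "${CAFFE_SOLVER_PROTOTXT}" \
        -devices 1 \
        -connection ethernet \
        -model "${MODEL_URL}"

# Print the assembled command for review; set RUN=1 to actually execute it.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

To resume an interrupted run instead, the same skeleton would take "-snapshot" together with "-weights", since the table states that "-weights" is required whenever "-snapshot" is given.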
