This repository has been archived by the owner on Nov 16, 2019. It is now read-only.
CLI
Andy Feng edited this page Mar 14, 2016
·
2 revisions
spark-submit ... \
--files <CAFFE_SOLVER_PROTOTXT>,<CAFFE_NET_PROTOTXT> \
--conf spark.driver.extraLibraryPath="<LD_LIBRARY_PATH>" \
--conf spark.executorEnv.LD_LIBRARY_PATH="<LD_LIBRARY_PATH>" \
--class com.yahoo.ml.caffe.CaffeOnSpark <CAFFE_ON_SPARK_JAR> \
<ARGUMENTS>
CaffeOnSpark is launched through spark-submit, using the command line outlined above. The placeholders are:
- <CAFFE_SOLVER_PROTOTXT> ... the file path of your Caffe solver prototxt file
- <CAFFE_NET_PROTOTXT> ... the file path of your Caffe network prototxt file
- <LD_LIBRARY_PATH> ... the library path that CaffeOnSpark uses to locate the necessary .so files, including libcaffe.so, libcaffedistri.so and their dependencies (e.g. CUDA, lmdbjni).
- <CAFFE_ON_SPARK_JAR> ... the file path of the CaffeOnSpark jar file (caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar).
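As a rough illustration of how the placeholders fit together, the sketch below assembles the command line from shell variables and prints it instead of running it. Every value is hypothetical (the prototxt file names, the library directories, and the jar location are assumptions, not paths shipped with CaffeOnSpark), and the other spark-submit options elided as `...` in the command above are deliberately left out.

```shell
#!/bin/sh
# Hypothetical values; replace them with the paths of your own installation.
SOLVER=lenet_solver.prototxt                  # <CAFFE_SOLVER_PROTOTXT> (assumed name)
NET=lenet_net.prototxt                        # <CAFFE_NET_PROTOTXT> (assumed name)
NATIVE_LIBS=/opt/caffe/lib:/opt/cuda/lib64    # <LD_LIBRARY_PATH> (assumed directories)
JAR=caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar   # <CAFFE_ON_SPARK_JAR>

# Print the spark-submit command line for the given CaffeOnSpark arguments.
# This is a dry run: nothing is submitted, and shell quoting is not preserved
# in the printed text.
build_submit_cmd() {
    printf '%s ' spark-submit \
        --files "${SOLVER},${NET}" \
        --conf "spark.driver.extraLibraryPath=${NATIVE_LIBS}" \
        --conf "spark.executorEnv.LD_LIBRARY_PATH=${NATIVE_LIBS}" \
        --class com.yahoo.ml.caffe.CaffeOnSpark "${JAR}" \
        "$@"
    printf '\n'
}

# Everything passed to the function ends up in the <ARGUMENTS> position.
build_submit_cmd -train -conf "${SOLVER}" -devices 1
```

Printing the command first makes it easy to check that both prototxt files end up in `--files` and that the same library path is handed to the driver and to the executors before anything is submitted to a cluster.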
<ARGUMENTS> is the collection of CaffeOnSpark arguments listed below.
Argument | Description |
---|---|
-conf <Conf> | solver prototxt file |
-train | training mode |
-test | test mode |
-features <f1,f2,...> | feature extraction mode. Features are given as a comma-separated list |
-label <Label> | blob name for label |
-devices <Num> | number of devices (CPU or GPU) per executor. default: 1 |
-connection <Conn> | network connection interface, either ethernet or infiniband (default) |
-model <File> | output model file URL (file:// or hdfs://...) |
-weights <File> | input model file used as initial weights for training; can be used alone or together with "-snapshot" |
-snapshot <File> | input state file URL (file:// or hdfs://) for resuming training. "-weights" is required together with this option |
-outputFormat <Format> | feature output format, either json (default) or parquet |
-resize | resize input images. The height and width are specified in memory data layer of the prototxt file. |
-clusterSize <#Executors> | the number of executors to be used. Only used by Spark YARN clusters. |
-persistent | persist data files on the local file system |
-lmdb_partitions <Parts> | the number of LMDB RDD partitions. Default: cluster size |
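The table states a few constraints between arguments: "-snapshot" needs "-weights", "-connection" takes ethernet or infiniband, and "-outputFormat" takes json or parquet. A small pre-flight check along the following lines could catch violations before a job is submitted. The function `check_caffe_args` is a hypothetical helper written for this page, not part of CaffeOnSpark, and it only covers the constraints listed in the table; the file names in the usage line are made up.

```shell
#!/bin/sh
# check_caffe_args: hypothetical helper, not part of CaffeOnSpark.
# Returns 0 when the given arguments satisfy the constraints stated in the
# table above, and 1 (with a message on stderr) otherwise.
check_caffe_args() {
    has_weights=0
    has_snapshot=0
    while [ "$#" -gt 0 ]; do
        opt=$1
        shift
        case "$opt" in
            -weights)  has_weights=1 ;;
            -snapshot) has_snapshot=1 ;;
            -connection)
                # The value must be one of the two documented interfaces.
                case "${1:-}" in
                    ethernet|infiniband) ;;
                    *) echo "-connection must be ethernet or infiniband" >&2
                       return 1 ;;
                esac ;;
            -outputFormat)
                # The value must be one of the two documented formats.
                case "${1:-}" in
                    json|parquet) ;;
                    *) echo "-outputFormat must be json or parquet" >&2
                       return 1 ;;
                esac ;;
        esac
    done
    # Resuming from a state file also requires the model weights.
    if [ "$has_snapshot" -eq 1 ] && [ "$has_weights" -eq 0 ]; then
        echo "-snapshot requires -weights" >&2
        return 1
    fi
    return 0
}

# Usage with made-up file names.
check_caffe_args -train -conf solver.prototxt -connection ethernet \
    && echo "arguments look consistent"
```

Option values are simply skipped by the loop because they match none of the cases, which keeps the sketch short at the cost of not detecting a missing value for options other than the two it validates.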