This repository contains the code used to set up a distributed TensorFlow cluster for the CSM213B course project.
- A website with more details is available here.
- Demo video is available here.
- Slides of the final presentation are available here.
- Alex
- Backend
- FrontEnd
- SoftMax_Local
AlexNet is built in TensorFlow. We imported the Caffe-trained model (https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet) and use it only for inference, to get the image classification result. The code includes:
- Running AlexNet on a local machine
- Running AlexNet on a distributed cluster (after partitioning)
- Running AlexNet on a distributed cluster (multiple sessions)
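As a rough illustration of what partitioning could look like, the sketch below splits the layers into contiguous groups and maps each group to a worker task. This is an assumption for illustration, not necessarily the scheme used in this repository: the `partition_layers` helper is hypothetical, the layer names are those of the BVLC AlexNet model, and the device strings use TensorFlow's `/job:<name>/task:<index>` format.

```python
# Hypothetical sketch: split AlexNet's layers into contiguous groups,
# one group per worker task. Layer names follow the BVLC AlexNet model.
ALEXNET_LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]


def partition_layers(layers, num_workers):
    """Map each layer to a TensorFlow device string, in contiguous blocks."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Ceiling division so every layer is assigned to some worker.
    block = -(-len(layers) // num_workers)
    placement = {}
    for index, layer in enumerate(layers):
        task = index // block
        placement[layer] = "/job:worker/task:%d" % task
    return placement


if __name__ == "__main__":
    for layer in ALEXNET_LAYERS:
        print("%s -> %s" % (layer, partition_layers(ALEXNET_LAYERS, 3)[layer]))
```

In a TensorFlow 1.x graph, each device string could then be passed to `tf.device(...)` when building the corresponding layer.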
The backend server is developed in Python using Flask. The backend listens for connections from the frontend and processes their requests. The backend code is in the Backend folder.
- The cluster of Raspberry Pi devices is set up by following the procedure described here.
- The Flask installation is described in detail here.
- Installing distributed TensorFlow on multiple machines is explained on the official page here.
- First add the cluster details to the server file, then start the server process on each device of the cluster, as explained in the official distributed TensorFlow documentation linked above.
- Now you can start the Flask-based backend server to listen for incoming connections.
- A Raspberry Pi with a camera attached, running Python 2.7+, with the installations described in the sections above.
- Network connectivity between the cluster devices, each with a unique IP address assigned in the local subnet. All cluster devices should be able to communicate with each other.
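The "cluster details" mentioned above are, in distributed TensorFlow 1.x, a mapping from job names to lists of `host:port` addresses. The sketch below builds such a mapping and checks that every address is unique. The IP addresses, port, job names, and the `build_cluster` helper are placeholders chosen for illustration, not values taken from this repository.

```python
# Hypothetical addresses; replace with the IPs of your own devices.
PS_HOSTS = ["192.168.1.10:2222"]
WORKER_HOSTS = ["192.168.1.11:2222", "192.168.1.12:2222"]


def build_cluster(ps_hosts, worker_hosts):
    """Return a job-name -> address-list dict, checking addresses are unique."""
    addresses = list(ps_hosts) + list(worker_hosts)
    if len(set(addresses)) != len(addresses):
        raise ValueError("every task needs a unique host:port address")
    for address in addresses:
        host, _, port = address.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError("expected host:port, got %r" % address)
    return {"ps": list(ps_hosts), "worker": list(worker_hosts)}


cluster = build_cluster(PS_HOSTS, WORKER_HOSTS)
# With TensorFlow 1.x installed, each device would then run something like:
#   server = tf.train.Server(tf.train.ClusterSpec(cluster),
#                            job_name="worker", task_index=0)
#   server.join()
```

Each device uses the same cluster dictionary and differs only in its own `job_name` and `task_index`.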
The frontend is developed in JavaScript. The server is hosted on Apache Tomcat and is developed in Eclipse.
- Eclipse for web development can be downloaded from the link.
- Eclipse requires Apache Tomcat to be set up, which is explained in detail here.
- Eclipse requires the Java SDK to be installed.
- The machine running the frontend should be able to communicate with the cluster, so that it can schedule jobs on the cluster backend server at runtime.
The SoftMax_Local folder contains a local implementation of softmax in Python. The code includes:
- Training a naive softmax regression classifier and storing the checkpoint files.
- Running inference with the softmax regression classifier on both the MNIST dataset and our own digit images.
- Running softmax on a single machine and in distributed cluster settings.
- A folder for image preprocessing in TensorFlow.
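For readers unfamiliar with softmax regression, the inference step described above can be sketched in plain Python as follows. This is a minimal illustration, not the TensorFlow code in this folder: the function names and the toy weights are made up for the example, and a real MNIST model would use 784 inputs and 10 classes.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    # Subtract the maximum first so exp() cannot overflow.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict(pixels, weights, biases):
    """Softmax regression: logits[k] = sum_i pixels[i] * weights[i][k] + biases[k]."""
    logits = [
        sum(pixel * row[k] for pixel, row in zip(pixels, weights)) + biases[k]
        for k in range(len(biases))
    ]
    probs = softmax(logits)
    # Return the most likely class index along with all class probabilities.
    return probs.index(max(probs)), probs


# Toy example with 2 input pixels and 2 classes (made-up weights).
label, probs = predict([1.0, 0.0], [[0.0, 2.0], [1.0, 0.0]], [0.0, 0.0])
```

Training adjusts `weights` and `biases` so that the predicted probabilities match the labels; inference only evaluates the two functions above.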