This repository was archived by the owner on Jul 22, 2024. It is now read-only.

Commit 27718e3

Author: Predrag Djurdjevic

tensorrt samples merge

1 parent 891f454 commit 27718e3

29 files changed: +4512 −0 lines changed

vision/tensorrt-samples/README.md (+38)

@@ -0,0 +1,38 @@
# vision-tensorrt-samples

### Common Assumptions

* The repository contains samples based on Nvidia's TensorRT C/C++ samples, modified for ease of use. Target platforms are Power with GPUs and Nvidia Jetson, either on native Linux or within Docker containers.

- Inputs can be arbitrary image files in terms of extension and resolution, passed in as command line arguments
- Outputs are a list of classes and bounding boxes per image, plus debug images with the bounding boxes marked
- The model can be arbitrary as long as it matches a supported model type (SSD and FRCNN for now; Yolo and GoogleNet soon to follow) and <model_name>_trt.prototxt and <model_name>.caffemodel are present, with the names adjusted in the source code
- Batch size can be arbitrary as long as it fits in device memory; adjustable in the source code
- Floating point precision can be arbitrary, which affects accuracy, speed and memory footprint; adjustable in the source code
- The number and names of classes can be arbitrary, depending on the model; also adjustable in the source code. The names could be read from the label file, but due to variations in syntax this is left out for now
- The number of classes is always one more than in the label file, since there is one background class
- The confidence threshold can be adjusted, which determines the number of objects recognized
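The class-count rule above can be illustrated with a small sketch: the class count the network expects is the number of lines in the label file plus one for the background class. The label file name and contents here are hypothetical:

```shell
# Hypothetical label file, one class name per line
printf 'person\ncar\nbicycle\n' > labels.txt

# Classes the network outputs = labels in the file + 1 background class
NUM_LABELS=$(wc -l < labels.txt)
NUM_CLASSES=$((NUM_LABELS + 1))
echo "NUM_CLASSES=$NUM_CLASSES"
```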
### Common Use

* The samples first have to be compiled from source and run from the bin directory with command line parameters

- It is assumed that CUDA, cuDNN, TensorRT, gcc and OpenCV are preinstalled and the environment variables are set (see below)
- Copy the samples and makefiles over the respective TensorRT sample directories
- Modify the code to match the desired model, batch size, floating point precision, image folder and class names
- Compile the source code via make from the respective sample's directory
- Run the binary (release or debug) from the bin folder, passing in the file names as "name.ext" "name.ext" without the folder path
- On the initial run, if the TensorRT engine for the model has not been built before, it will take a little while to parse the model and serialize the engine to file
- On subsequent runs, if no changes were made to the model or engine parameters, the engine will be deserialized from the earlier saved one
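The last two points describe a build-once, reuse-thereafter engine cache. The control flow can be sketched in shell, where `build_engine` and the engine file name are illustrative stand-ins for TensorRT's slow parse-and-serialize step, not part of the samples:

```shell
ENGINE=model_trt.engine

# Stand-in for TensorRT's parse + serialize step (slow on the first run)
build_engine() {
  echo "serialized-engine-bytes" > "$ENGINE"
}

if [ -f "$ENGINE" ]; then
  echo "deserializing engine from $ENGINE"
else
  echo "first run: building and serializing engine"
  build_engine
fi
```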
### Common Prerequisites

* Following are the prerequisite steps to set up the correct native or Docker environment, for both Power and Jetson, at build and runtime

- If building on Power within Docker, it is best to start with the nvidia/cuda-ppc64le:10.1-cudnn7-devel-ubuntu18.04 image and add the latest TensorRT SDK (currently 5.1.3.2, with CUDA 10.1 and cuDNN 7.5, for Power)
- Install or build OpenCV version 3.3.1 or above
- If building natively on Jetson TX2, follow the steps described in the Nvidia JetPack installation. This requires an Ubuntu host machine to initially flash the board via the JetPack manager (currently 4.2.2). All the prerequisites, if checked during installation, are preinstalled and ready for use
  - Note that Host Machine needs to be unchecked, TensorFlow can be unchecked, and Jetson TX2 checked

- Please follow the rest of the prerequisite instructions from the Nvidia samples README.md
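The Power-in-Docker setup above might look like the following. This is a sketch of the environment setup only; the TensorRT archive name is an assumption and must match the package actually downloaded from Nvidia for ppc64le:

```shell
# Start from the CUDA 10.1 / cuDNN 7 devel image for ppc64le
docker pull nvidia/cuda-ppc64le:10.1-cudnn7-devel-ubuntu18.04
docker run --rm -it nvidia/cuda-ppc64le:10.1-cudnn7-devel-ubuntu18.04 bash

# Inside the container, add the TensorRT 5.1.3.2 SDK
# (archive name below is hypothetical):
#   tar xzvf TensorRT-5.1.3.2.<os-arch-cuda-cudnn>.tar.gz -C /opt
#   export LD_LIBRARY_PATH=/opt/TensorRT-5.1.3.2/lib:$LD_LIBRARY_PATH
```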
+64
@@ -0,0 +1,64 @@
SHELL=/bin/bash -o pipefail
TARGET?=$(shell uname -m)
LIBDIR?=lib
VERBOSE?=0
ifeq ($(VERBOSE), 1)
AT=
else
AT=@
endif
CUDA_TRIPLE=x86_64-linux
CUBLAS_TRIPLE=x86_64-linux-gnu
DLSW_TRIPLE=x86_64-linux-gnu
ifeq ($(TARGET), aarch64)
CUDA_TRIPLE=aarch64-linux
CUBLAS_TRIPLE=aarch64-linux-gnu
DLSW_TRIPLE=aarch64-linux-gnu
endif
ifeq ($(TARGET), qnx)
CUDA_TRIPLE=aarch64-qnx
CUBLAS_TRIPLE=aarch64-qnx-gnu
DLSW_TRIPLE=aarch64-unknown-nto-qnx
endif
ifeq ($(TARGET), ppc64le)
CUDA_TRIPLE=ppc64le-linux
CUBLAS_TRIPLE=ppc64le-linux
DLSW_TRIPLE=ppc64le-linux
endif
ifeq ($(TARGET), android64)
DLSW_TRIPLE=aarch64-linux-androideabi
CUDA_TRIPLE=$(DLSW_TRIPLE)
CUBLAS_TRIPLE=$(DLSW_TRIPLE)
endif
export TARGET
export VERBOSE
export LIBDIR
export CUDA_TRIPLE
export CUBLAS_TRIPLE
export DLSW_TRIPLE
samples=sampleCharRNN sampleFasterRCNN sampleGoogleNet sampleINT8 sampleINT8API sampleMLP sampleMNIST sampleMNISTAPI sampleMovieLens sampleOnnxMNIST samplePlugin sampleSSD sampleUffMNIST sampleUffSSD trtexec

# sampleMovieLensMPS should only be compiled for Linux targets.
# The sample uses Linux-specific shared memory and IPC libraries.
ifeq ($(TARGET), x86_64)
samples += sampleMovieLensMPS
endif

.PHONY: all clean help
all:
	$(AT)$(foreach sample,$(samples), $(MAKE) -C $(sample) &&) :

clean:
	$(AT)$(foreach sample,$(samples), $(MAKE) clean -C $(sample) &&) :

help:
	$(AT)echo "Sample building help menu."
	$(AT)echo "Samples:"
	$(AT)$(foreach sample,$(samples), echo "\t$(sample)" &&) :
	$(AT)echo "\nCommands:"
	$(AT)echo "\tall - build all samples."
	$(AT)echo "\tclean - clean all samples."
	$(AT)echo "\nVariables:"
	$(AT)echo "\tTARGET - Specify the target to build for."
	$(AT)echo "\tVERBOSE - Specify verbose output."
	$(AT)echo "\tCUDA_INSTALL_DIR - Directory where CUDA installs to."
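A few illustrative invocations of the top-level makefile above. Note the `&&) :` idiom in the recipes: the `$(foreach …)` expands to a chain of per-sample sub-makes joined by `&&`, terminated by the no-op `:`, so the build stops at the first failing sample:

```shell
# Build all samples for the host architecture (TARGET defaults to uname -m)
make

# Cross-build for 64-bit Arm with full command echoing
make TARGET=aarch64 VERBOSE=1

# Remove build artifacts from every sample
make clean
```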
@@ -0,0 +1,213 @@
.SUFFIXES:
CUDA_TRIPLE?=x86_64-linux
CUBLAS_TRIPLE?=x86_64-linux-gnu
DLSW_TRIPLE?=x86_64-linux-gnu
TARGET?=$(shell uname -m)
ifeq ($(CUDA_INSTALL_DIR),)
$(warning CUDA_INSTALL_DIR variable is not specified, using /usr/local/cuda by default, use CUDA_INSTALL_DIR=<cuda_directory> to change.)
endif
ifeq ($(CUDNN_INSTALL_DIR),)
$(warning CUDNN_INSTALL_DIR variable is not specified, using $$CUDA_INSTALL_DIR by default, use CUDNN_INSTALL_DIR=<cudnn_directory> to change.)
endif
CUDA_INSTALL_DIR?=/usr/local/cuda
CUDNN_INSTALL_DIR?=$(CUDA_INSTALL_DIR)
CUDA_LIBDIR=lib
CUDNN_LIBDIR=lib64
ifeq ($(TARGET), aarch64)
ifeq ($(shell uname -m), aarch64)
CUDA_LIBDIR=lib64
CC = g++
else
CC = aarch64-linux-gnu-g++
endif
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64 -ccbin $(CC)
else ifeq ($(TARGET), x86_64)
CUDA_LIBDIR=lib64
CC = g++
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64
else ifeq ($(TARGET), ppc64le)
CUDA_LIBDIR=lib64
CC = g++
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64
else ifeq ($(TARGET), qnx)
CC = ${QNX_HOST}/usr/bin/aarch64-unknown-nto-qnx7.0.0-g++
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64 -ccbin $(CC)
else ifeq ($(TARGET), android64)
ifeq ($(ANDROID_CC),)
$(error ANDROID_CC must be set to the clang compiler to build for android 64bit, for example /path/to/my-toolchain/bin/aarch64-linux-android-clang++)
endif
CUDA_LIBDIR=lib
ANDROID_FLAGS=-DANDROID -D_GLIBCXX_USE_C99=1 -Wno-sign-compare -D__aarch64__ -Wno-strict-aliasing -Werror -pie -fPIE -Wno-unused-command-line-argument
COMMON_FLAGS+=$(ANDROID_FLAGS)
COMMON_LD_FLAGS+=$(ANDROID_FLAGS)
CC=$(ANDROID_CC)
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64 -ccbin $(CC) --compiler-options="-DANDROID -D_GLIBCXX_USE_C99=1 -Wno-sign-compare"
ANDROID=1
else ########
$(error Auto-detection of platform failed. Please specify one of the following arguments to make: TARGET=[aarch64|x86_64|ppc64le|qnx|android64])
endif

ifdef VERBOSE
AT=
else
AT=@
endif

AR = ar cr
ECHO = @echo

SHELL=/bin/sh

ROOT_PATH=../..
OUT_PATH=$(ROOT_PATH)/bin
OUTDIR=$(OUT_PATH)

define concat
$1$2$3$4$5$6$7$8
endef

#$(call make-depend,source-file,object-file,depend-file)
define make-depend
$(AT)$(CC) -MM -MF $3 -MP -MT $2 $(COMMON_FLAGS) $1
endef

#$(call make-cuda-depend,source-file,object-file,depend-file,flags)
define make-cuda-depend
$(AT)$(CUCC) -M -MT $2 $4 $1 > $3
endef

#########################
INCPATHS=
LIBPATHS=
# add cross compile directories
ifneq ($(shell uname -m), $(TARGET))
INCPATHS += -I"/usr/include/$(DLSW_TRIPLE)" -I"/usr/include/$(CUBLAS_TRIPLE)"
LIBPATHS += -L"../lib/stubs" -L"../../lib/stubs" -L"/usr/lib/$(DLSW_TRIPLE)/stubs" -L"/usr/lib/$(DLSW_TRIPLE)" -L"/usr/lib/$(CUBLAS_TRIPLE)/stubs" -L"/usr/lib/$(CUBLAS_TRIPLE)" -L"$(CUDA_INSTALL_DIR)/targets/$(CUDA_TRIPLE)/$(CUDA_LIBDIR)/stubs" -L"$(CUDA_INSTALL_DIR)/targets/$(CUDA_TRIPLE)/$(CUDA_LIBDIR)"
endif
INCPATHS += -I"../common" -I"$(CUDA_INSTALL_DIR)/include" -I"$(CUDNN_INSTALL_DIR)/include" -I"../include" -I"../../include"
LIBPATHS += -L"$(CUDA_INSTALL_DIR)/$(CUDA_LIBDIR)" -L"$(CUDNN_INSTALL_DIR)/$(CUDNN_LIBDIR)" -L"../lib" -L"../../lib"

.SUFFIXES:
vpath %.h $(EXTRA_DIRECTORIES)
vpath %.cpp $(EXTRA_DIRECTORIES)

COMMON_FLAGS += -Wall -std=c++11 $(INCPATHS)
ifneq ($(ANDROID),1)
COMMON_FLAGS += -D_REENTRANT
endif
ifeq ($(TARGET), qnx)
COMMON_FLAGS += -D_POSIX_C_SOURCE=200112L -D_QNX_SOURCE -D_FILE_OFFSET_BITS=64 -fpermissive
endif

COMMON_LD_FLAGS += $(LIBPATHS) -L$(OUTDIR)

OBJDIR =$(call concat,$(OUTDIR),/chobj)
DOBJDIR =$(call concat,$(OUTDIR),/dchobj)

COMMON_LIBS = -lcudnn -lcublas -lcudart -lopencv_dnn -lopencv_ml -lopencv_objdetect -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_video -lopencv_photo -lopencv_imgproc -lopencv_flann -lopencv_core -ldl -lm -lpthread -lrt -ltbb

ifneq ($(TARGET), qnx)
ifneq ($(ANDROID),1)
COMMON_LIBS += -lrt -ldl -lpthread
endif
endif
ifeq ($(ANDROID),1)
COMMON_LIBS += -lculibos -llog
endif

LIBS =-lnvinfer -lnvparsers -lnvinfer_plugin -lnvonnxparser $(COMMON_LIBS)
DLIBS =-lnvinfer -lnvparsers -lnvinfer_plugin -lnvonnxparser $(COMMON_LIBS)
OBJS =$(patsubst %.cpp, $(OBJDIR)/%.o, $(wildcard *.cpp $(addsuffix /*.cpp, $(EXTRA_DIRECTORIES))))
DOBJS =$(patsubst %.cpp, $(DOBJDIR)/%.o, $(wildcard *.cpp $(addsuffix /*.cpp, $(EXTRA_DIRECTORIES))))
CUOBJS =$(patsubst %.cu, $(OBJDIR)/%.o, $(wildcard *.cu $(addsuffix /*.cu, $(EXTRA_DIRECTORIES))))
CUDOBJS =$(patsubst %.cu, $(DOBJDIR)/%.o, $(wildcard *.cu $(addsuffix /*.cu, $(EXTRA_DIRECTORIES))))

CFLAGS=$(COMMON_FLAGS)
CFLAGSD=$(COMMON_FLAGS) -g
LFLAGS=$(COMMON_LD_FLAGS)
LFLAGSD=$(COMMON_LD_FLAGS)

all: debug release
release : $(OUTDIR)/$(OUTNAME_RELEASE)

debug : $(OUTDIR)/$(OUTNAME_DEBUG)

test: test_debug test_release

test_debug:
	$(AT)cd $(OUTDIR) && ./$(OUTNAME_DEBUG)

test_release:
	$(AT)cd $(OUTDIR) && ./$(OUTNAME_RELEASE)

ifdef MAC
$(OUTDIR)/$(OUTNAME_RELEASE) : $(OBJS) $(CUOBJS)
	$(ECHO) Linking: $@
	$(AT)$(CC) -o $@ $^ $(LFLAGS) $(LIBS)
	# Copy every EXTRA_FILE of this sample to the bin dir
	$(foreach EXTRA_FILE,$(EXTRA_FILES), cp -f $(EXTRA_FILE) $(OUTDIR)/$(EXTRA_FILE); )

$(OUTDIR)/$(OUTNAME_DEBUG) : $(DOBJS) $(CUDOBJS)
	$(ECHO) Linking: $@
	$(AT)$(CC) -o $@ $^ $(LFLAGSD) $(DLIBS)
else
$(OUTDIR)/$(OUTNAME_RELEASE) : $(OBJS) $(CUOBJS)
	$(ECHO) Linking: $@
	$(AT)$(CC) -o $@ $^ $(LFLAGS) -Wl,--start-group $(LIBS) -Wl,--end-group
	# Copy every EXTRA_FILE of this sample to the bin dir
	$(foreach EXTRA_FILE,$(EXTRA_FILES), cp -f $(EXTRA_FILE) $(OUTDIR)/$(EXTRA_FILE); )

$(OUTDIR)/$(OUTNAME_DEBUG) : $(DOBJS) $(CUDOBJS)
	$(ECHO) Linking: $@
	$(AT)$(CC) -o $@ $^ $(LFLAGSD) -Wl,--start-group $(DLIBS) -Wl,--end-group
endif

$(OBJDIR)/%.o: %.cpp
	$(AT)if [ ! -d $(OBJDIR) ]; then mkdir -p $(OBJDIR); fi
	$(foreach XDIR,$(EXTRA_DIRECTORIES), if [ ! -d $(OBJDIR)/$(XDIR) ]; then mkdir -p $(OBJDIR)/$(XDIR); fi;) :
	$(call make-depend,$<,$@,$(subst .o,.d,$@))
	$(ECHO) Compiling: $<
	$(AT)$(CC) $(CFLAGS) -c -o $@ $<

$(DOBJDIR)/%.o: %.cpp
	$(AT)if [ ! -d $(DOBJDIR) ]; then mkdir -p $(DOBJDIR); fi
	$(foreach XDIR,$(EXTRA_DIRECTORIES), if [ ! -d $(DOBJDIR)/$(XDIR) ]; then mkdir -p $(DOBJDIR)/$(XDIR); fi;) :
	$(call make-depend,$<,$@,$(subst .o,.d,$@))
	$(ECHO) Compiling: $<
	$(AT)$(CC) $(CFLAGSD) -c -o $@ $<

######################################################################### CU
$(OBJDIR)/%.o: %.cu
	$(AT)if [ ! -d $(OBJDIR) ]; then mkdir -p $(OBJDIR); fi
	$(foreach XDIR,$(EXTRA_DIRECTORIES), if [ ! -d $(OBJDIR)/$(XDIR) ]; then mkdir -p $(OBJDIR)/$(XDIR); fi;) :
	$(call make-cuda-depend,$<,$@,$(subst .o,.d,$@))
	$(ECHO) Compiling CUDA release: $<
	$(AT)$(CUCC) $(CUFLAGS) -c -o $@ $<

$(DOBJDIR)/%.o: %.cu
	$(AT)if [ ! -d $(DOBJDIR) ]; then mkdir -p $(DOBJDIR); fi
	$(foreach XDIR,$(EXTRA_DIRECTORIES), if [ ! -d $(DOBJDIR)/$(XDIR) ]; then mkdir -p $(DOBJDIR)/$(XDIR); fi;) :
	$(call make-cuda-depend,$<,$@,$(subst .o,.d,$@))
	$(ECHO) Compiling CUDA debug: $<
	$(AT)$(CUCC) $(CUFLAGSD) -c -o $@ $<

clean:
	$(ECHO) Cleaning...
	$(AT)-rm -rf $(OBJDIR) $(DOBJDIR) $(OUTDIR)/$(OUTNAME_RELEASE) $(OUTDIR)/$(OUTNAME_DEBUG)

ifneq "$(MAKECMDGOALS)" "clean"
-include $(OBJDIR)/*.d $(DOBJDIR)/*.d

ifeq ($(DO_CUDNN_CHECK), 1)
# To display newlines in the message.
define _cudnn_missing_newline_5020fd0


endef
SHELL=/bin/bash
CUDNN_CHECK = $(shell echo -e '\#include <cudnn.h>\nint main(){ cudnnCreate(nullptr); return 0; }' | $(CC) -xc++ -o /dev/null $(CFLAGS) $(LFLAGS) - $(COMMON_LIBS) 2> /dev/null && echo 'passed_cudnn_exists_check')
ifneq ($(CUDNN_CHECK), passed_cudnn_exists_check)
$(error $(_cudnn_missing_newline_5020fd0)$(_cudnn_missing_newline_5020fd0)This sample requires CUDNN, but it could not be found.$(_cudnn_missing_newline_5020fd0)Please install CUDNN from https://developer.nvidia.com/cudnn or specify CUDNN_INSTALL_DIR when compiling.$(_cudnn_missing_newline_5020fd0)For example, `make CUDNN_INSTALL_DIR=/path/to/CUDNN/` where /path/to/CUDNN/ contains include/ and lib/ subdirectories.$(_cudnn_missing_newline_5020fd0)$(_cudnn_missing_newline_5020fd0))
endif # ifneq ($(CUDNN_CHECK), passed_cudnn_exists_check)
endif # ifeq ($(DO_CUDNN_CHECK), 1)
endif # ifneq "$(MAKECMDGOALS)" "clean"
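The DO_CUDNN_CHECK logic above compiles a one-line probe program to verify that cuDNN headers and libraries are usable. Run by hand, it is roughly equivalent to the following sketch, assuming g++ and the default install locations this makefile uses (the makefile itself additionally passes its full CFLAGS and LFLAGS):

```shell
# Probe: can we compile and link a minimal cuDNN program?
CUDNN_INSTALL_DIR=${CUDNN_INSTALL_DIR:-/usr/local/cuda}
printf '#include <cudnn.h>\nint main(){ cudnnCreate(nullptr); return 0; }\n' \
  | g++ -xc++ - -o /dev/null \
        -I"$CUDNN_INSTALL_DIR/include" -L"$CUDNN_INSTALL_DIR/lib64" -lcudnn \
        2>/dev/null \
  && echo 'passed_cudnn_exists_check' \
  || echo 'cudnn not found'
```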
+36
@@ -0,0 +1,36 @@
### Common Assumptions

* The repository contains samples based on Nvidia's TensorRT C/C++ samples, modified for ease of use. Target platforms are Power with GPUs and Nvidia Jetson, either on native Linux or within Docker containers.

- Inputs can be arbitrary image files in terms of extension and resolution, passed in as command line arguments
- Outputs are a list of classes and bounding boxes per image, plus debug images with the bounding boxes marked
- The model can be arbitrary as long as it matches a supported model type (SSD and FRCNN for now; Yolo and GoogleNet soon to follow) and <model_name>_trt.prototxt and <model_name>.caffemodel are present, with the names adjusted in the source code
- Batch size can be arbitrary as long as it fits in device memory; adjustable in the source code
- Floating point precision can be arbitrary, which affects accuracy, speed and memory footprint; adjustable in the source code
- The number and names of classes can be arbitrary; also adjustable in the source code. The names could be read from the label file, but due to variations in syntax this is left for a later revision
- The number of classes is always one more than in the label file, since there is one background class
- The confidence threshold can be adjusted, which determines the number of objects recognized

### Common Use

* The samples first have to be compiled and run from the bin directory with command line parameters

- It is assumed that CUDA, cuDNN, TensorRT, gcc and OpenCV are preinstalled and the environment variables are set (see below)
- Copy the samples over the respective TensorRT sample directories
- Modify the code to match the desired model, batch size, floating point precision, image folder and class names
- Compile the source code via make from the respective sample's directory
- Run the binary (release or debug) from the bin folder, passing in the file names as "name.ext" "name.ext", etc.
- On the initial run, if the TensorRT engine for the model has not been built before, it will take a little time to parse the model and serialize the engine to file
- On subsequent runs, if no changes were made to the model, the engine will be deserialized from the earlier saved one

### Common Prerequisites

* Following are the prerequisite steps to set up the correct native or Docker environment, for both Power and Jetson, at build and runtime

- If building on Power within Docker, it is best to start with the nvidia/cuda-ppc64le:10.1-cudnn7-devel-ubuntu18.04 image and add the latest TensorRT SDK (currently 5.1.3.2, with CUDA 10.1 and cuDNN 7.5, for Power)
- Install or build OpenCV version 3.3.1 or above
- If building natively on Jetson TX2, follow the steps described in the Nvidia JetPack installation. This requires an Ubuntu host machine to initially flash the board via the JetPack manager (currently 4.2.2). All the prerequisites, if checked during installation, are preinstalled and ready for use
  - Note that Host Machine needs to be unchecked, TensorFlow can be unchecked, and Jetson TX2 checked

- Please follow the rest of the instructions from the Nvidia samples README.md