
Deploying a quantized network on NVDLA #355

Open
nainag opened this issue Oct 17, 2021 · 1 comment

Comments


nainag commented Oct 17, 2021

Hi,

Has anyone tried deploying a low-precision quantized network (int4, int5, etc.) on NVDLA?

If so, please let me know the steps: were you able to successfully generate the calibration table using TensorRT, and does the hardware support this level of quantization?

I would really appreciate any help in this direction.

Thanks!


mtsanic commented Dec 12, 2021

I don't think NVDLA supports low-precision quantized networks. Even 8-bit (normally quantized) networks are compiled with its own compiler. You might be able to achieve pseudo low precision, i.e. 4-bit values stored in 8-bit data, by providing a suitable calibration table. However, I haven't tried anything like that. The idea might also run into model-support issues, since some models can't be implemented on NVDLA.
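The "pseudo low-precision" idea above can be sketched in numpy: values are quantized with a calibration-derived scale but clamped to the signed int4 range [-8, 7], while still being stored in an int8 container. This is a minimal illustration of the concept, not NVDLA or TensorRT code; the `scale` value and function names are hypothetical.

```python
import numpy as np

def fake_int4_quantize(x, scale):
    """Quantize floats to the signed int4 range [-8, 7],
    but store the result in an int8 container ("pseudo" int4)."""
    q = np.round(x / scale)
    q = np.clip(q, -8, 7)        # restrict to int4 range
    return q.astype(np.int8)     # hardware still sees int8 data

def dequantize(q, scale):
    """Map quantized values back to approximate float values."""
    return q.astype(np.float32) * scale

# Example: quantize a few activations with a hypothetical
# scale taken from a calibration table.
x = np.array([0.9, -1.3, 0.05, 2.0], dtype=np.float32)
scale = 0.25
q = fake_int4_quantize(x, scale)     # values limited to [-8, 7]
x_hat = dequantize(q, scale)         # reconstructed approximation
```

Whether NVDLA's int8 datapath would actually benefit from such restricted values is a separate question; this only shows how 4-bit precision can be emulated inside 8-bit storage.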
