
Deploying a quantized network on NVDLA #355

Open
nainag opened this issue Oct 17, 2021 · 1 comment

Comments


nainag commented Oct 17, 2021

Hi,

Has anyone tried deploying a low-precision quantized network (int4, int5, etc.) on NVDLA?

If so, please let me know the steps: were you able to successfully generate the calibration table using TensorRT, and does the hardware support this level of quantization?

I would really appreciate any help in this direction.

Thanks!


mtsanic commented Dec 12, 2021

I don't think NVDLA supports low-precision quantized networks. Even 8-bit (normally quantized) networks are compiled with its own compiler. You might be able to achieve pseudo low precision, i.e. 4-bit values stored in 8-bit data, by providing a suitable calibration table. However, I haven't tried anything like that. The idea might also run into model-support issues, since some models can't be implemented on NVDLA.
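The "pseudo low-precision" idea above can be sketched in numpy: values are quantized with a calibration-derived scale but clamped to the signed int4 range [-8, 7], while still being stored in an int8 container. This is a minimal illustration of the concept, not NVDLA or TensorRT code; the `scale` value and function names are hypothetical.

```python
import numpy as np

def fake_int4_quantize(x, scale):
    """Quantize floats to the signed int4 range [-8, 7],
    but store the result in an int8 container ("pseudo" int4)."""
    q = np.round(x / scale)
    q = np.clip(q, -8, 7)        # restrict to int4 range
    return q.astype(np.int8)     # hardware still sees int8 data

def dequantize(q, scale):
    """Map quantized values back to approximate float values."""
    return q.astype(np.float32) * scale

# Example: quantize a few activations with a hypothetical
# scale taken from a calibration table.
x = np.array([0.9, -1.3, 0.05, 2.0], dtype=np.float32)
scale = 0.25
q = fake_int4_quantize(x, scale)     # values limited to [-8, 7]
x_hat = dequantize(q, scale)         # reconstructed approximation
```

Whether NVDLA's int8 datapath would actually benefit from such restricted values is a separate question; this only shows how 4-bit precision can be emulated inside 8-bit storage.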
