
Releases: alibaba/TinyNeuralNetwork

Announcing easyquant for speeding up LLM inference via quantization

31 May 09:41
841294e

With quantization, LLM inference can run efficiently with lower resource usage. Please install the package below and try out the examples here. We look forward to your feedback.
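To illustrate the idea behind the speedup, here is a minimal sketch of symmetric per-tensor int8 quantization, the basic building block of quantized inference. This is an illustrative example only, not the easyquant API; all function names here are hypothetical.

```python
# Illustrative sketch of symmetric int8 quantization (not the easyquant API).
# Float weights are mapped to 8-bit integers plus one scale factor, cutting
# memory use roughly 4x versus float32 and enabling faster integer kernels.

def quantize_int8(weights):
    """Quantize a list of floats to int8 with a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.635, 0.9]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered value differs from the original by at most half a scale step.
```

Real toolkits add refinements on top of this (per-channel scales, calibration, activation quantization), but the storage and compute savings come from exactly this float-to-integer mapping.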