fast-llama

A repo for learning fast inference techniques for Llama 2.

Setup

    conda create -p ./env python=3.8 -y && \
    conda activate ./env && \
    pip install vllm && \
    pip install openai && \
    pip install rich
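
After running the setup commands, it can help to confirm that the three installed packages are visible to the interpreter in the activated environment. The following is a minimal sketch using only the standard library; the helper name `find_missing` and the constant `REQUIRED_PACKAGES` are hypothetical and not part of this repo.

```python
import importlib.util

# Package names taken from the pip install commands in the setup step.
REQUIRED_PACKAGES = ("vllm", "openai", "rich")


def find_missing(packages=REQUIRED_PACKAGES):
    """Return the names in `packages` that cannot be located for import.

    Hypothetical helper: it only checks that each top-level module can be
    found, without actually importing it.
    """
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All setup packages are importable.")
```

Run it with the environment activated (`conda activate ./env`). Any names it reports as missing can be reinstalled with the matching `pip install` command from the setup step.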