Using Pretrained LLMs from HuggingFace


In this notebook, we explore the canonical ways to run inference with large language models, including a discussion of GPU devices, conversation state, and quantization. As a reminder, you must go to File > Save a copy in Drive before you can run the cells.