# Using Pretrained LLMs from HuggingFace

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zXVYaEr37QP1T9qPsJqzwJhNc45aEzLl?usp=sharing)

In this notebook, we explore the canonical ways to run inference with large language models. This includes a discussion of GPU devices, conversation state, and quantization.

As a reminder, you must go to File > Save a copy in Drive before you can run the cells.