Deep Learning

Most modern NLP systems are language models. While a language model can be anything that predicts new text, in practice these models are based on deep neural networks. In these notebooks, we will start from the very basics of training a language model from scratch and build up to training much larger and more advanced architectures. Importantly, this material is very hard and is not intended, as of now, to be comprehensive. Instead, I recommend you work through these notebooks alongside other deep learning resources. For example, Jeremy Howard’s Practical Deep Learning for Coders is incredibly helpful. The bottom line is that no one was born knowing deep learning, and it can help to see this material presented in several different ways.
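To make "anything that predicts new text" concrete, here is a minimal sketch of a count-based bigram model. It is not a neural network and is not taken from the notebooks; the function names and the toy sentence are made up for illustration. It only shows the basic job that the deep models in these notebooks do at much larger scale: given some context, assign probabilities to the next token.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word follows each other word (hypothetical helper)."""
    tokens = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the raw follow-up counts for `prev` into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}


# Toy data, invented for this sketch.
toy_text = "the cat sat on the mat the cat ran"
model = train_bigram(toy_text)
probs = next_word_probs(model, "the")
print(probs)  # "cat" follows "the" twice, "mat" once
```

A neural language model replaces the count table with learned parameters, which lets it generalize to contexts it has never seen verbatim.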

Training Named Entity Recognition models with spaCy

In previous notebooks, we’ve seen how to use spaCy for NER, but how did those pretrained models get trained in the first place? This notebook shows you how!

ner-spacy.html

An Introduction to Word Vectors with gensim

Word vectors are the foundation of neural language models. The word2vec algorithm (Mikolov et al., 2013) pioneered this approach, and the gensim package helps us implement it.

w2v-gensim.html

word2vec from Scratch

Learn how to implement the word2vec algorithm in numpy. Build your foundation in training your own neural nets with this very simple architecture.

w2v-from-scratch.html

Building a machine translator with pytorch

Get started with the popular neural network framework pytorch. This notebook was adapted from this source.

rnn-pytorch.html

Decoder-only Transformer Models from Scratch

Understanding the decoder-only transformer, the architecture behind GPTs and LLMs, is vital to navigating contemporary machine learning and artificial intelligence. Work through this notebook to learn the basics. This notebook was adapted from this source.

decoder-pytorch.html

Encoder-only Transformer Models from Scratch

Develop your knowledge of the other side of the transformer architecture, the encoder, pivotal in BERT-style models for information retrieval and organization.

encoder-pytorch.html

The Byte-Pair Tokenization Algorithm

Alongside model development comes text tokenization. In this notebook, learn modern techniques for building a custom tokenizer.

tokenizers.html
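As a rough preview of the byte-pair idea named above, the sketch below learns merges on a toy corpus. It is a simplified, character-level illustration under my own assumptions (whitespace-split words, no end-of-word marker, ties broken by first occurrence), and not the implementation used in the notebook; all function names are hypothetical. The core loop is: count adjacent symbol pairs, merge the most frequent pair into one symbol, and repeat.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-separated corpus."""
    # Each word starts as a tuple of single characters.
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair; first seen wins ties
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words


# Toy corpus, invented for this sketch.
toy_corpus = "low low low lower lowest"
merges, vocab = learn_bpe(toy_corpus, num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
print(vocab)
```

After two merges the frequent string "low" has become a single symbol, while rarer suffixes such as "e", "r", "s", and "t" remain separate characters. Production tokenizers typically operate on bytes and handle word boundaries and special tokens, which this sketch leaves out.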