LANGUAGE MODEL - 1
Introduction
Welcome to my documentation for Makemore Part 1 from Andrej Karpathy's Neural Networks: Zero to Hero series. This section focuses on implementing a bigram character-level language model. Here, I’ve compiled my notes and insights from the lecture to serve as a reference for understanding the foundational concepts and practical implementations discussed.
Overview of Makemore Part 1
In this part of the series, I explored the following topics:
Implementing a Bigram Character-Level Language Model: The lecture introduces the basics of language modeling using bigrams, providing a step-by-step approach to understanding how character-level models can predict sequences of text.
Key Concepts Covered:
- Introduction to Broadcasting and its use in neural networks
- Framework of language modeling, including model training and sampling
- Evaluation of loss functions, particularly negative log likelihood for classification
- Practical insights into the mechanics of character-level predictions (a minimal code sketch follows this list)
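To make these ideas concrete, here is a minimal sketch of a count-based bigram character model along the lines of what the lecture builds. The tiny `words` list and the variable names below are placeholders of my own (the lecture trains on the full names.txt dataset); see the notebooks in the sets below for the actual implementations.

```python
import torch

# Toy word list so the sketch runs stand-alone; this is a placeholder,
# the lecture builds the model from the full names.txt dataset.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

# Character vocabulary; '.' is a special token marking word start and end.
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}
V = len(stoi)

# Count how often each character follows each other character.
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Normalize counts into probabilities. The division broadcasts the (V, 1)
# column of row sums across the (V, V) count matrix.
P = (N + 1).float()                  # +1 smoothing avoids zero probabilities
P = P / P.sum(dim=1, keepdim=True)

# Sampling: start from '.', repeatedly draw the next character from the
# current character's row, and stop when '.' is drawn again.
g = torch.Generator().manual_seed(2147483647)
out, ix = [], 0
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("sample:", "".join(out))

# Evaluation: average negative log likelihood over all training bigrams.
log_likelihood, n = 0.0, 0
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        log_likelihood += torch.log(P[stoi[ch1], stoi[ch2]]).item()
        n += 1
print(f"average nll: {-log_likelihood / n:.4f}")
```

The broadcasting step and the negative log likelihood computation here are the two mechanics the key concepts above refer to; the lecture later reproduces the same result with a single-layer neural network trained by gradient descent.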
Key Resources
Video Lecture:
- I watched the lecture on YouTube: Building Makemore Part 1
Code:
- The Jupyter notebooks and code implementations are available within this documentation itself.
- If you wish to see the repository where I originally worked, you can find it here: Neural Networks - Language Model 1
Structure of Contents
- The lecture documentation has been divided into 3 sets: Set A, Set B, and Set C.
- Each set has its own notes and notebook.
- Notes are timestamped to the corresponding points in the video.
- This keeps each part manageable and easier to follow, as the lecture is long.
Have fun, and happy learning!