LANGUAGE MODEL - 1
Introduction
Welcome to my documentation for Makemore Part 1 from Andrej Karpathy's Neural Networks: Zero to Hero series. This section focuses on implementing a bigram character-level language model. Here, I’ve compiled my notes and insights from the lecture to serve as a reference for understanding the foundational concepts and practical implementations discussed.
Overview of Makemore Part 1
In this part of the series, I explored the following topics:
Implementing a Bigram Character-Level Language Model: The lecture introduces the basics of language modeling using bigrams, providing a step-by-step approach to understanding how character-level models can predict sequences of text.
Key Concepts Covered:
- Introduction to Broadcasting and its use in neural networks
- Framework of language modeling, including model training and sampling
- Evaluation of loss functions, particularly negative log likelihood for classification
- Practical insights into the mechanics of character-level predictions (a minimal code sketch follows this list)
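To make these ideas concrete, here is a minimal sketch of a count-based bigram character model along the lines of what the lecture builds. The tiny `words` list and the variable names below are placeholders of my own (the lecture trains on the full names.txt dataset); see the notebooks in the sets below for the actual implementations.

```python
import torch

# Toy word list so the sketch runs stand-alone; this is a placeholder,
# the lecture builds the model from the full names.txt dataset.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

# Character vocabulary; '.' is a special token marking word start and end.
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}
V = len(stoi)

# Count how often each character follows each other character.
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Normalize counts into probabilities. The division broadcasts the (V, 1)
# column of row sums across the (V, V) count matrix.
P = (N + 1).float()                  # +1 smoothing avoids zero probabilities
P = P / P.sum(dim=1, keepdim=True)

# Sampling: start from '.', repeatedly draw the next character from the
# current character's row, and stop when '.' is drawn again.
g = torch.Generator().manual_seed(2147483647)
out, ix = [], 0
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("sample:", "".join(out))

# Evaluation: average negative log likelihood over all training bigrams.
log_likelihood, n = 0.0, 0
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        log_likelihood += torch.log(P[stoi[ch1], stoi[ch2]]).item()
        n += 1
print(f"average nll: {-log_likelihood / n:.4f}")
```

The broadcasting step and the negative log likelihood computation here are the two mechanics the key concepts above refer to; the lecture later reproduces the same result with a single-layer neural network trained by gradient descent.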
Key Resources
Video Lecture:
- I watched the lecture on YouTube: Building Makemore Part 1
Code:
- The Jupyter notebooks and code implementations are available within this documentation itself.
- If you wish to see the repository where I originally worked, you can find it here: Neural Networks - Language Model 1
Structure of Contents
- The lecture documentation has been divided into 3 sets: Set A, Set B, and Set C.
- Each set has its own notes and notebook.
- Notes are timestamped to the corresponding points in the video.
- This keeps each part manageable and easier to follow, as the lecture is long.
Have fun, and happy learning!