LANGUAGE MODEL - 2
Introduction
Welcome to my documentation for Makemore Part 2 from Andrej Karpathy's Neural Networks: Zero to Hero series. This section focuses on implementing a Multilayer Perceptron (MLP) as a character-level language model. Here, I’ve compiled my notes and insights from the lecture to serve as a reference for understanding the key concepts and practical implementations discussed.
Overview of Makemore Part 2
In this part of the series, I explored the following topics:
Implementing a Multilayer Perceptron (MLP): The MLP architecture is fundamental in neural networks, and this lecture provided a hands-on approach to understanding how multiple layers can learn complex data representations (a minimal sketch appears after the list below).
Key Concepts Covered:
- Model training techniques
- Learning rate tuning strategies
- Hyperparameter adjustments
- Evaluation metrics including loss functions and accuracy
- Insights into overfitting and underfitting
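To make these points concrete, here is a minimal sketch of the kind of MLP character-level model the lecture builds, in the style of Bengio et al.'s neural language model. The hyperparameter values (context of 3 characters, 10-dimensional embeddings, 200 hidden units, minibatches of 32) follow the lecture's defaults, and it assumes a names.txt file with one name per line; treat it as an illustrative outline rather than the exact notebook code.

```python
import torch
import torch.nn.functional as F

# Assumes `names.txt` (one name per line), as used in the lecture.
words = open('names.txt', 'r').read().splitlines()

block_size = 3  # context length: how many characters predict the next one

# Build the character vocabulary; '.' marks the start/end of a word.
chars = sorted(set(''.join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi['.'] = 0

# Build the dataset of (context, next character) pairs.
X, Y = [], []
for w in words:
    context = [0] * block_size
    for ch in w + '.':
        ix = stoi[ch]
        X.append(context)
        Y.append(ix)
        context = context[1:] + [ix]  # slide the window forward
X, Y = torch.tensor(X), torch.tensor(Y)

# Parameters: embedding table C, hidden layer (W1, b1), output layer (W2, b2).
g = torch.Generator().manual_seed(2147483647)
C = torch.randn((27, 10), generator=g)
W1 = torch.randn((block_size * 10, 200), generator=g)
b1 = torch.randn(200, generator=g)
W2 = torch.randn((200, 27), generator=g)
b2 = torch.randn(27, generator=g)
parameters = [C, W1, b1, W2, b2]
for p in parameters:
    p.requires_grad = True

for step in range(10000):
    ix = torch.randint(0, X.shape[0], (32,))           # minibatch indices
    emb = C[X[ix]]                                     # (32, block_size, 10)
    h = torch.tanh(emb.view(-1, block_size * 10) @ W1 + b1)
    logits = h @ W2 + b2                               # (32, 27)
    loss = F.cross_entropy(logits, Y[ix])
    for p in parameters:
        p.grad = None
    loss.backward()
    for p in parameters:
        p.data += -0.1 * p.grad                        # fixed learning rate
```

Note the fixed learning rate in the update step: the lecture arrives at a reasonable value by sweeping candidate rates on an exponential scale and plotting the resulting losses, then decaying the rate later in training. It also splits the data into train, dev, and test sets to diagnose overfitting and underfitting, which the sketch above omits for brevity.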
Key Resources
Video Lecture
- I watched the lecture on YouTube: Building Makemore Part 2
Code:
- The Jupyter notebooks and code implementations are available within this documentation itself.
- If you wish to view the repository where I originally worked, you can find it here: Neural Networks - Language Model 2
Structure of Contents
- The lecture documentation has been divided into 3 sets: Set A, Set B, and Set C.
- Each set has its own notes and notebook.
- Notes have been marked with timestamps to the video.
- This keeps each set manageable and easier to follow, as the lecture is long.
Have fun, and happy learning!