
LANGUAGE MODEL - 2

Introduction

Welcome to my documentation for Makemore Part 2 from Andrej Karpathy's Neural Networks: Zero to Hero series. This section focuses on implementing a Multilayer Perceptron (MLP) as a character-level language model. Here, I’ve compiled my notes and insights from the lecture to serve as a reference for understanding the key concepts and practical implementations discussed.

Overview of Makemore Part 2

In this part of the series, I explored the following topics:

Implementing a Multilayer Perceptron (MLP): The MLP is a fundamental neural network architecture, and this lecture provided a hands-on look at how multiple layers can learn complex representations of the data.
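Below is a minimal sketch of the MLP's forward pass in PyTorch, in the spirit of the model built in the lecture. The sizes (a 27-character vocabulary, a context of 3 previous characters, 10-dimensional embeddings, 200 hidden units) are illustrative assumptions, not fixed by this page.

```python
import torch
import torch.nn.functional as F

vocab_size, block_size, emb_dim, hidden = 27, 3, 10, 200  # assumed sizes

g = torch.Generator().manual_seed(42)
C  = torch.randn((vocab_size, emb_dim), generator=g)           # character embedding table
W1 = torch.randn((block_size * emb_dim, hidden), generator=g)  # hidden layer weights
b1 = torch.randn(hidden, generator=g)
W2 = torch.randn((hidden, vocab_size), generator=g)            # output layer weights
b2 = torch.randn(vocab_size, generator=g)

def forward(X, Y):
    # X: (N, block_size) integer character indices; Y: (N,) next-character targets
    emb = C[X]                                                   # (N, block_size, emb_dim)
    h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1) # hidden layer activations
    logits = h @ W2 + b2                                         # (N, vocab_size)
    return F.cross_entropy(logits, Y)                            # mean negative log-likelihood
```

The embedding table C maps each character index to a small vector; the concatenated context embeddings feed a tanh hidden layer, and cross-entropy over the output logits gives the training loss.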

Key Concepts Covered:

  • Model training techniques (see the training-loop sketch after this list)
  • Learning rate tuning strategies
  • Hyperparameter adjustments
  • Evaluation metrics including loss functions and accuracy
  • Insights into overfitting and underfitting
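To make the training-side concepts above concrete, here is a rough sketch of a training loop with minibatching, a manually tuned learning rate with a late step decay, and a held-out evaluation, reusing forward from the sketch above. Xtr, Ytr, Xdev, and Ydev are assumed placeholder tensors for the training and dev splits; they are not defined on this page.

```python
# parameters must require gradients before training
params = [C, W1, b1, W2, b2]
for p in params:
    p.requires_grad = True

for step in range(200_000):
    # sample a minibatch of 32 examples from the (assumed) training tensors
    ix = torch.randint(0, Xtr.shape[0], (32,))
    loss = forward(Xtr[ix], Ytr[ix])

    # backward pass
    for p in params:
        p.grad = None
    loss.backward()

    # manually tuned learning rate with a simple step decay
    lr = 0.1 if step < 100_000 else 0.01
    for p in params:
        p.data += -lr * p.grad

# comparing the training loss against a held-out dev split exposes
# overfitting: a training loss far below the dev loss means memorization
with torch.no_grad():
    print('train loss:', forward(Xtr, Ytr).item())
    print('dev   loss:', forward(Xdev, Ydev).item())
```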

Key Resources

Video Lecture

Code:

  • The Jupyter notebooks and code implementations are available within this documentation itself.
  • If you wish to view the repository where I originally worked, you can find it here: Neural Networks - Language Model 2

Structure of Contents

  • The lecture documentation is divided into three sets: Set A, Set B, and Set C.
  • Each set has its own notes and notebook.
  • The notes are marked with timestamps into the video.
  • Since the lecture is long, this split keeps each set manageable and easier to follow.

 

Have fun, and happy learning!