LANGUAGE MODEL - 1

Introduction

Welcome to my documentation for Makemore Part 1 from Andrej Karpathy's Neural Networks: Zero to Hero series. This section focuses on implementing a bigram character-level language model. I've compiled my notes and insights from the lecture here as a reference for the foundational concepts and practical implementations it covers.

Overview of Makemore Part 1

In this part of the series, I explored the following topics:

Implementing a Bigram Character-Level Language Model: The lecture introduces the basics of language modeling using bigrams, walking step by step through how a character-level model predicts the next character in a sequence of text.
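
To make the idea concrete before opening the notebooks, here is a minimal, hedged sketch of a count-based bigram model in PyTorch. The `names` list is a hypothetical stand-in for the names.txt dataset used in the lecture, and the code illustrates the general technique rather than the notebook's exact implementation:

```python
import torch

# Hypothetical toy dataset; the lecture works from a much larger names.txt file.
names = ["emma", "olivia", "ava", "isabella", "sophia"]

# Build the character vocabulary; index 0 ('.') marks the start/end of a name.
chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}

# Count how often each character follows each other character.
V = len(stoi)
N = torch.zeros((V, V), dtype=torch.int32)
for name in names:
    chs = ["."] + list(name) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Sample a new name by repeatedly drawing the next character from the
# count row of the current character until the end token is drawn.
g = torch.Generator().manual_seed(2147483647)
ix = 0
out = []
while True:
    p = N[ix].float()
    p = p / p.sum()
    ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

With only a handful of names the samples will not look like much; the full dataset used in the lecture produces far more name-like output.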

Key Concepts Covered:

  • Introduction to broadcasting and its use in neural networks (see the sketch after this list)
  • The framework of language modeling, including model training and sampling
  • Evaluating loss functions, particularly the negative log likelihood used for classification (also in the sketch below)
  • Practical insights into the mechanics of character-level prediction
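
Broadcasting and the negative log likelihood are easiest to see side by side. Continuing from the hypothetical `N`, `stoi`, and `names` in the sketch above, the row sums of the count matrix have shape (V, 1), and broadcasting stretches that column across the (V, V) matrix so each row becomes a probability distribution; the loss is then the average of -log p over the observed bigrams. Again, this is an assumed sketch rather than the notebook's exact code:

```python
# Convert counts to probabilities. P.sum(dim=1, keepdim=True) has shape (V, 1);
# broadcasting stretches that column across the (V, V) matrix so each row is
# divided by its own total. The +1 ("smoothing") keeps every probability > 0.
P = (N + 1).float()
P = P / P.sum(dim=1, keepdim=True)

# Negative log likelihood of the training bigrams under the model:
# the average of -log P(next char | current char) over all observed pairs.
log_likelihood = 0.0
n = 0
for name in names:
    chs = ["."] + list(name) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        log_likelihood += torch.log(P[stoi[ch1], stoi[ch2]])
        n += 1
nll = -log_likelihood / n
print(f"average negative log likelihood: {nll.item():.4f}")
```

A lower average negative log likelihood means the model assigns higher probability to the bigrams that actually occur in the data.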

Key Resources

Video Lecture

Code:

  • The Jupyter notebooks and code implementations are available within this documentation itself.
  • If you wish to see the repository where I originally worked, you can find it here: Neural Networks - Language Model 1

Structure of Contents

  • The lecture documentation is divided into three sets: Set A, Set B, and Set C.
  • Each set has its own notes and notebook.
  • The notes are marked with timestamps into the video.
  • Splitting things up this way keeps each part manageable, since the lecture is long.

 

Have fun, Happy Learning!