Blogs

2024

Evaluating Types of Learning Rates on Mr. Karpathy’s GPT-2
GELU: The Activation Function That Bridges Deterministic and Stochastic Worlds
Understanding Heap Building: The O(n log n) vs. The O(n) Approach