🧠 Deep Learning — Complete Cheatsheet
Biological Neuron → MP Neuron → Perceptron → MLP → Sigmoid → FFNN → Backprop → Gradient Descent
📚 100/100 KE LIYE — HINGLISH EDITION 🔥
1 Biological Neuron & MP Neuron
🌳 Biological Neuron — Tree jaisi structure
Part | Kaam (Function)
Dendrite | Signals receive karta hai (ears 👂)
Soma | Process karta hai (brain 🧠)
Axon | Signal bhejta hai (mouth 🗣️)
Synapse | Connection point (telephone wire 📞)
💡 Brain mein 86 billion neurons hain — sab parallel kaam karte hain!
🚧 McCulloch-Pitts (MP) Neuron — 1943
Step 1 — g(x): Inputs Add Karo
g(x) = x₁ + x₂ + x₃ + ... + xₙ
Step 2 — f: Threshold θ se Compare Karo
Output = 1 if g(x) ≥ θ (fires!)
Output = 0 if g(x) < θ (silent!)
  • Excitatory input = normal vote
  • Inhibitory input = VETO power — agar 1 hai toh output automatic 0 🗳️
🔢 AND / OR Gate Examples
AND Gate — θ = 2 set karo
x₁ | x₂ | Sum | Output
0 | 0 | 0 | 0
1 | 0 | 1 | 0
0 | 1 | 1 | 0
1 | 1 | 2 | 1 ✅
OR Gate — θ = 1 set karo
Fire karo jab koi bhi ek input = 1 ho ✅
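Neeche ek minimal Python sketch hai (sirf illustration ke liye, function name mp_neuron hypothetical hai) jo dikhata hai ki same MP neuron, sab weights = 1 ke saath, sirf θ badal kar AND (θ=2) aur OR (θ=1) ban jaata hai:

```python
# MP neuron sketch: binary inputs, har input ka weight = 1, threshold theta
def mp_neuron(inputs, theta, inhibitory=()):
    # Inhibitory input 1 ho toh output forcefully 0 (VETO power)
    if any(inhibitory):
        return 0
    g = sum(inputs)                 # g(x) = x1 + x2 + ... + xn
    return 1 if g >= theta else 0   # f: threshold se compare

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              "AND:", mp_neuron([x1, x2], theta=2),
              "OR:",  mp_neuron([x1, x2], theta=1))
```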

2 Linear Separability & XOR Problem
📊 Linear Separability Kya Hai?

Ek straight line se 1s aur 0s ko alag kar sako — tab linearly separable hai.

✅ AND — Separable
🔵 🔴
🔵 🔵
Diagonal line khich sakti hai!
❌ XOR — Not Separable
🔴 🔵
🔵 🔴
Diagonal corners mein hain!
⚠️ Key Rule: Single MP/Perceptron neuron sirf linearly separable problems solve kar sakta hai!
🧩 XOR Problem — AI Winter Ka Reason
x₁ | x₂ | XOR Output
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 0
XOR = "Ek ho ya doosra, dono nahi"
🔥 Solution: Multiple layers use karo — yahi MLP hai!
  • Minsky & Papert ne 1960s mein prove kiya: single perceptron can't solve XOR
  • Isse "AI Winter" aa gaya — funding band! ❄️

3 Perceptron — Upgraded Neuron (1958)
⚖️ Perceptron Formula — Weights Introduce Hue!
Main Formula
y = 1 if Σᵢ₌₁ⁿ (wᵢ × xᵢ) ≥ θ, else 0
Bias Form — θ ko w₀ mein fold karo (w₀ = -θ, x₀ = 1)
y = 1 if w₀ + w₁x₁ + w₂x₂ + ... + wₙxₙ ≥ 0
Vector Form
y = 1 if wᵀx ≥ 0 → Side 1 (output = 1)
y = 0 if wᵀx < 0 → Side 2 (output = 0)
Dividing line: wᵀx = 0 (weight vector w is ⊥ to this line!)
💡 Bias (w₀) = Judge ka pre-existing opinion — input dekhne se pehle hi output ko push karta hai!
📊 MP Neuron vs Perceptron
Feature | MP Neuron | Perceptron
Inputs | 0 or 1 | Any real number
Weights | All equal (=1) | Learnable, different
Threshold | Fixed, manual | Learned as bias
Flexibility | Low | Higher ✅

4 Perceptron Learning Algorithm — Self-Learning! 🤖
🔄 Algorithm Steps
1. Weights ko randomly initialize karo
2. Ek random example x pick karo
3. Agar x positive class ka hai (y=1) lekin humne 0 predict kiya: w = w + x (w ko x ke paas lao)
4. Agar x negative class ka hai (y=0) lekin humne 1 predict kiya: w = w - x (w ko x se door lao)
5. Repeat karo jab tak sab examples correctly classify na ho jayein
Update Rules Summary
Predicted 0, Should be 1: w = w + x
Predicted 1, Should be 0: w = w - x
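Yeh update rules ek chhote Python sketch mein (hypothetical example: AND gate ka data, bias ko x₀ = 1 append karke w₀ mein fold kiya hai; variable names sirf illustration ke liye):

```python
import numpy as np

# Perceptron learning rule sketch — AND gate (linearly separable, isliye converge hoga)
X = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]])  # rows: [x0=1, x1, x2]
y = np.array([0, 0, 0, 1])                                   # AND outputs

rng = np.random.default_rng(0)
w = rng.normal(size=3)                    # Step 1: random initialize

converged = False
while not converged:
    converged = True
    for xi, yi in zip(X, y):
        pred = 1 if w @ xi >= 0 else 0
        if yi == 1 and pred == 0:         # predicted 0, should be 1
            w = w + xi                    # w ko x ke paas lao
            converged = False
        elif yi == 0 and pred == 1:       # predicted 1, should be 0
            w = w - xi                    # w ko x se door lao
            converged = False

print("Learned w:", w)
print("Predictions:", [1 if w @ xi >= 0 else 0 for xi in X])  # target: [0, 0, 0, 1]
```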
๐Ÿ“ Geometry: w = w + x Kyun Kaam Karta Hai?

Angle between w and x determines output:

  • angle < 90ยฐ โ†’ dot product > 0 โ†’ output = 1
  • angle > 90ยฐ โ†’ dot product < 0 โ†’ output = 0
Jab w + x karte hain toh w aur x ka angle decrease hota hai โ†’ w x ke direction mein aata hai โ†’ output 1 ho jaata hai! โœ…
Convergence Theorem: Agar data linearly separable hai toh algorithm finite steps mein converge ZAROOR karega!
โš ๏ธ Data linearly separable nahi hai โ†’ Algorithm forever run karega!

5 Multi-Layer Perceptron (MLP) — XOR ka Solution!
🏛️ MLP Structure
Input Layer — Raw data (x₁, x₂, ...) — koi processing nahi
Hidden Layers — Intermediate processing — "thinking" layers 🧑‍🍳
Output Layer — Final answer
XOR ke liye — 4 Hidden Neurons
h₁ fires when x₁=-1, x₂=-1 → case (0,0)
h₂ fires when x₁=+1, x₂=-1 → case (1,0) ✅
h₃ fires when x₁=-1, x₂=+1 → case (0,1) ✅
h₄ fires when x₁=+1, x₂=+1 → case (1,1)
Output: w₁=0, w₂=+1, w₃=+1, w₄=0 → XOR SOLVED!
๐Ÿ“ Representation Power โ€” BIG Theorem!
๐Ÿ”ฅ Theorem: Koi bhi Boolean function with n inputs ko ek network represent kar sakta hai jisme:
Hidden layer = 2โฟ perceptrons Output layer = 1 perceptron
Inputs (n)Hidden Neurons (2โฟ)
24
532
101,024
201,048,576 (!)
โš ๏ธ Catch: n badhne pe neurons exponentially badhte hain โ†’ Yahi reason hai Deep Learning ka! ๐Ÿš€

6 Sigmoid Neuron — Smooth Learning! 📈
📉 Step Function Ka Problem
  • Problem 1: Tiny change → Huge jump (49 marks = FAIL, 50 marks = PASS — unfair!)
  • Problem 2: Step function ka gradient almost everywhere zero hai → Learning nahi ho sakti! 😱
  • Step function = Old light switch (ON/OFF)
  • Sigmoid = Modern dimmer (gradually changes)
๐Ÿ“ Sigmoid Formula
Full Form
ฯƒ(z) = 1 / (1 + eโปแถป) where z = wโ‚€ + wโ‚xโ‚ + wโ‚‚xโ‚‚ + ... + wโ‚™xโ‚™
Properties
Output range: (0, 1) When z = 0 โ†’ ฯƒ = 0.5 (uncertain!) When z โ†’ +โˆž โ†’ ฯƒ โ†’ 1 (confident YES) When z โ†’ -โˆž โ†’ ฯƒ โ†’ 0 (confident NO)
๐ŸŽฒ Sigmoid as Probability!
OutputMatlab
0.9595% chance โ€” spam hai! ๐Ÿ“ง
0.5050-50 โ€” pata nahi ๐Ÿคท
0.033% chance โ€” safe hai โœ…
โœ… Sigmoid smooth & differentiable hai โ†’ Gradient exist karta hai everywhere โ†’ Learning ho sakti hai!
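Sigmoid ki yeh properties ek chhote numpy sketch se directly dekh sakte ho:

```python
import numpy as np

def sigmoid(z):
    # σ(z) = 1 / (1 + e^(-z)) — output hamesha (0, 1) mein rehta hai
    return 1.0 / (1.0 + np.exp(-z))

for z in [-10, -2, 0, 2, 10]:
    print(f"z = {z:>3} → σ(z) = {sigmoid(z):.4f}")
# z = 0 → 0.5 (uncertain), z bada positive → ~1, z bada negative → ~0
```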

7 ML ke 4 Components — Supervised Learning Setup
🏠 4 Components (House Price Example se samjho)
DATA — Training examples {xᵢ, yᵢ} — x = features (size, rooms), y = price
MODEL — Mathematical guess: ŷ = f̂(x; θ)
LEARNING ALGORITHM — Best parameters kaise dhundein → Gradient Descent!
LOSS FUNCTION — Prediction kitna galat hai → MSE!
Model Types
Linear: ŷ = wᵀx
Sigmoid: ŷ = 1/(1 + e^(-wᵀx))
Quadratic: ŷ = xᵀWx
📉 Loss Function — MSE (Mean Squared Error)
MSE Formula
L = (1/N) × Σᵢ₌₁ᴺ (ŷᵢ - yᵢ)²
Squaring kyun? 3 reasons:
  • -3 aur +3 cancel nahi honge (negative errors fix)
  • Big errors zyada penalize hote hain (error 2→4, error 4→16!)
  • Mathematically easy to differentiate
Example
Actual: [45, 72, 28]
Predicted: [42, 78, 27]
Errors²: [9, 36, 1]
MSE = (9+36+1)/3 = 15.33 ← Lower = Better!
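Wahi example numpy se verify kar sakte ho (same numbers, sirf ek chhota sketch):

```python
import numpy as np

# MSE = (1/N) × Σ (ŷᵢ - yᵢ)² — upar wale example ke numbers
y_true = np.array([45, 72, 28])
y_pred = np.array([42, 78, 27])

errors_sq = (y_pred - y_true) ** 2        # [9, 36, 1]
mse = errors_sq.mean()                    # (9 + 36 + 1) / 3
print("Errors²:", errors_sq, "| MSE:", round(mse, 2))   # ≈ 15.33
```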

8 Gradient Descent — Mountain Se Valley Tak! 🏔️
⛰️ Intuition + Math Derivation

Blindfold pehen ke mountain pe khade ho, neeche valley mein jaana hai → Slope feel karo → Downhill step lo → Repeat!

Taylor Series (1st order approximation)
L(θ + ηu) ≈ L(θ) + η × uᵀ∇θL(θ)
Minimize karne ke liye u ka direction chahiye
uᵀ∇L = ||u|| ||∇L|| cos β
Most negative when cos β = -1, i.e. β = 180° → u = -∇θL(θ) (OPPOSITE direction of gradient!)
🔥 Parameter Update Rules
w_(t+1) = w_t - η × ∇w_t
b_(t+1) = b_t - η × ∇b_t
η (eta) = learning rate (step ka size)
✅ Minus sign important hai — gradient ke opposite direction mein jaate hain (downhill)!
🔢 Gradients for Sigmoid Neuron (MSE loss)
Gradient w.r.t. Weight
∇w = ∂L/∂w = (f(x) - y) × f(x) × (1 - f(x)) × x
Gradient w.r.t. Bias
∇b = ∂L/∂b = (f(x) - y) × f(x) × (1 - f(x))
Term | Matlab
(f(x) - y) | Kitna galat predict kiya?
f(x)(1 - f(x)) | Sigmoid derivative (slope)
x | Kis input se relation hai?
Learning Rate η — Goldilocks Zone 🐻
η too large → overshoot, diverge ❌ (pahaad se gir gaye)
η too small → very slow ❌ (renge renge chalte raho)
η just right → fast convergence ✅
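Upar wale ∇w, ∇b formulas aur update rule ko ek single sigmoid neuron pe chalane ka minimal sketch (toy 1-D data aur η ki value sirf demo ke liye hypothetical hain):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 1-D toy data: x bada ho toh y = 1
X = np.array([-2.0, -1.0, 0.5, 2.0, 3.0])
Y = np.array([0, 0, 1, 1, 1])

w, b, eta = 0.0, 0.0, 0.5                  # init + learning rate (demo values)

for epoch in range(1000):
    dw, db = 0.0, 0.0
    for x, y in zip(X, Y):
        f = sigmoid(w * x + b)
        dw += (f - y) * f * (1 - f) * x    # ∇w = (f(x) - y) · f(x) · (1 - f(x)) · x
        db += (f - y) * f * (1 - f)        # ∇b = (f(x) - y) · f(x) · (1 - f(x))
    w -= eta * dw                          # w_(t+1) = w_t - η × ∇w_t
    b -= eta * db                          # b_(t+1) = b_t - η × ∇b_t

print("w =", round(w, 3), "b =", round(b, 3))
print("Predictions:", np.round(sigmoid(w * X + b), 2))
```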

9 Universal Approximation Theorem
🧱 The Big Theorem — Koi bhi Function!
🔥 Theorem: Ek single hidden layer wala multilayer sigmoid network koi bhi continuous function approximate kar sakta hai — kisi bhi desired precision se!
  • Face recognition
  • Weather prediction
  • Medical diagnosis
  • Language translation

LEGO Analogy 🧱: Individual LEGO bricks rectangular hain, lekin enough bricks se koi bhi shape ban sakta hai. Waise hi individual sigmoids S-shaped hain, lekin unhe combine karo aur koi bhi function approximate kar sakte ho!

Individual sigmoids (building blocks): σ₁: ──╯  σ₂: ──╯  σ₃: ╰──
Combined → approximates any target function! → Yahi hai Deep Learning ka foundation! 🚀

10 Feed Forward Neural Network (FFNN) — Assembly Line! 🏭
⚙️ FFNN Structure — Har Layer Mein 2 Operations
Operation 1: Pre-Activation aᵢ (Linear Transformation)
aᵢ(x) = bᵢ + Wᵢ × hᵢ₋₁(x)
Operation 2: Activation hᵢ (Non-Linear)
hᵢ(x) = g(aᵢ(x))
Complete Forward Pass (3-layer network)
ŷ = O(W₃ · g(W₂ · g(W₁x + b₁) + b₂) + b₃)
Parameters θ (sab weights + biases)
θ = {W₁, W₂, ..., W_L, b₁, b₂, ..., b_L}
💡 Input Layer = h₀ = x (koi processing nahi!)
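Yahi do operations (pre-activation + activation) ek chhote numpy forward-pass sketch mein — layer sizes aur random weights sirf illustration ke liye hain, aur activation g tatha output function O dono ko simplicity ke liye sigmoid hi rakha hai:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
sizes = [4, 5, 3, 2]                        # input 4 → hidden 5 → hidden 3 → output 2
W = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]   # Wᵢ: m × n
b = [rng.normal(size=(m, 1)) for m in sizes[1:]]                       # bᵢ: m × 1

def forward(x):
    h = x.reshape(-1, 1)                    # h₀ = x (koi processing nahi)
    for Wi, bi in zip(W, b):
        a = bi + Wi @ h                     # pre-activation: aᵢ = bᵢ + Wᵢ hᵢ₋₁
        h = sigmoid(a)                      # activation:     hᵢ = g(aᵢ)
    return h                                # last layer ka output = ŷ

x = rng.normal(size=4)
print("ŷ =", forward(x).ravel())
```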
🤔 Non-Linearity Kyun Zaruri Hai?
Without Activation — Sab Linear Collapse!
h₁ = W₁x + b₁
h₂ = W₂(W₁x + b₁) + b₂ = (W₂W₁)x + (W₂b₁ + b₂) ← Still JUST ONE LINEAR LAYER! ❌
With Activation — Non-Linear Magic! ✅
h₁ = g(W₁x + b₁) ← Non-linear!
h₂ = g(W₂h₁ + b₂) ← Non-linear combo!
Sigmoid: g(z) = 1/(1+e⁻ᶻ) → output (0,1)
tanh: g(z) = tanh(z) → output (-1,1)
ReLU: g(z) = max(0,z) → output [0,∞)
Weight Matrix Dimensions
Wᵢ shape: m × n (m = neurons in layer i, n = neurons in previous layer)
bᵢ shape: m × 1 (ek bias per neuron)

11 Output Functions & Loss Functions — Right Tool, Right Job! 🎯
📈 Regression — Real Values Predict karo
Output Activation: Linear
f(x) = W_O × a_L + b_O (No squishing — any real number!)
Loss: MSE
L(θ) = (1/N) × Σᵢ Σⱼ (ŷᵢⱼ - yᵢⱼ)²
  • House price: ₹45,73,291
  • Temperature: 27.3°C
  • Stock price: $142.67
🐾 Classification — Categories Predict karo
Output Activation: Softmax
ŷⱼ = e^(a_L,j) / Σᵢ e^(a_L,i)
Properties: 0 < ŷⱼ < 1 ✅, Σⱼ ŷⱼ = 1 ✅ (valid probability!)
Example: [Dog=3.0, Cat=1.0, Bird=0.2]
e^3.0 = 20.09, e^1.0 = 2.72, e^0.2 = 1.22, Sum = 24.03
Dog = 83.6%, Cat = 11.3%, Bird = 5.1% ✅
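Wahi Dog/Cat/Bird numbers numpy se verify ho jaate hain (max subtract karna ek common numerical-stability trick hai, answer change nahi hota):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())       # stability ke liye max subtract karo
    return e / e.sum()

scores = np.array([3.0, 1.0, 0.2])            # [Dog, Cat, Bird]
probs = softmax(scores)
print(np.round(probs, 3))                     # ≈ [0.836, 0.113, 0.051]
print("Sum:", probs.sum())                    # 1.0 — valid probability distribution
```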
📉 Cross Entropy Loss
Full Formula (multi-class, softmax output ke saath)
L(θ) = -(1/N) × Σᵢ Σⱼ yᵢⱼ log(ŷᵢⱼ)
(Binary case: L = -(1/N) × Σᵢ [yᵢ log(ŷᵢ) + (1-yᵢ) log(1-ŷᵢ)])
With One-Hot Labels: Simplifies to
L = -log(ŷₗ) (l = true class)
ŷ | Loss -log(ŷ)
0.99 ✅ | 0.01 (tiny)
0.50 | 0.69 (medium)
0.01 ❌ | 4.61 (HUGE!)
❌ Wrong + Confident = Catastrophic loss!
🌳 Output Selection Decision Tree — Kya use karein?
Output type = Number (regression) → Linear activation + MSE loss
Output type = 2 categories (binary) → Sigmoid + Binary Cross Entropy
Output type = >2 categories (multi-class) → Softmax + Cross Entropy

12 Backpropagation — Galti ka Blame Dono Taraf! 🔄
🔗 Chain Rule — Foundation of Backprop

Analogy: Late ho gaye → Alarm nahi baja → Phone silent tha → Friend ka late text — causes ki chain! 🚗

Basic Chain Rule
dy/dx = (dy/dz) × (dz/dx)
Multiple Paths (sum karo sab)
∂p/∂z = Σₘ (∂p/∂qₘ) × (∂qₘ/∂z)
Forward vs Backward Pass
Forward: Input → L1 → L2 → L3 → Loss
Backward: Loss → L3 → L2 → L1 (gradients!)
💡 Backprop = "Blame" divide karo — har weight ka loss mein kitna contribution tha?
📊 Backprop Gradients — Step by Step
Part 1: Output Layer (Softmax + Cross Entropy)
∂L/∂a_L,i = ŷᵢ - yᵢ (Predicted probability minus True probability!)
Concrete Example 🐾: True=Cat, Predicted=[Dog=0.7, Cat=0.2, Bird=0.1]
Gradients (ŷ - y):
Dog: 0.7 - 0 = +0.7 → too high, push DOWN ⬇️
Cat: 0.2 - 1 = -0.8 → too low, push UP ⬆️
Bird: 0.1 - 0 = +0.1 → slightly high, push down ⬇️
Part 2: Hidden Layer Gradient
∂L/∂hᵢⱼ = Σₘ (∂L/∂aᵢ₊₁,ₘ) × Wᵢ₊₁,ₘ,ⱼ
∂L/∂aᵢⱼ = (∂L/∂hᵢⱼ) × g'(aᵢⱼ)
Part 3: Weight & Bias Gradient
∂L/∂Wᵢ = (∂L/∂aᵢ) × hᵢ₋₁ᵀ
∂L/∂bᵢ = ∂L/∂aᵢ
📏 Activation Derivatives
Sigmoid: g'(z) = σ(z)(1 - σ(z)). Example: σ = 0.8 → g' = 0.8 × 0.2 = 0.16
tanh: g'(z) = 1 - tanh²(z). Example: tanh = 0.9 → g' = 1 - 0.81 = 0.19
ReLU: g'(z) = 1 if z > 0, else 0
⚠️ Sigmoid problem: jab σ ≈ 0 ya 1 → g' ≈ 0 → Vanishing Gradient!
🔄 Full Backprop Algorithm
1. Forward pass → ŷ compute karo
2. Loss L compute karo
3. Output layer gradient: ŷ - y
4. Hidden layers backward (chain rule)
5. Weight/bias gradients compute karo
6. θ update: θ_new = θ_old - η × ∇θ
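Yeh poora loop ek 1-hidden-layer numpy sketch mein (toy XOR data; layer sizes, learning rate aur iterations sab hypothetical demo values hain; hidden activation sigmoid, output softmax + cross entropy — gradients wahi Part 1-3 wale hain):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(a):
    e = np.exp(a - a.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

# Toy data: XOR pattern, 4 samples (columns), 2 classes, one-hot labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float).T   # shape (2, 4)
Y = np.eye(2)[[0, 1, 1, 0]].T                                    # shape (2, 4)

W1, b1 = rng.normal(size=(8, 2)), np.zeros((8, 1))
W2, b2 = rng.normal(size=(2, 8)), np.zeros((2, 1))
eta = 1.0

for it in range(5000):
    # Step 1-2: forward pass + loss
    a1 = W1 @ X + b1; h1 = sigmoid(a1)
    a2 = W2 @ h1 + b2; y_hat = softmax(a2)
    loss = -np.mean(np.sum(Y * np.log(y_hat), axis=0))
    # Step 3: output layer gradient = ŷ - y (mean loss ke liye /N)
    da2 = (y_hat - Y) / X.shape[1]
    # Step 4-5: hidden layer gradient (chain rule) + weight/bias gradients
    dW2, db2 = da2 @ h1.T, da2.sum(axis=1, keepdims=True)
    dh1 = W2.T @ da2
    da1 = dh1 * h1 * (1 - h1)                 # g'(a) = σ(a)(1 - σ(a))
    dW1, db1 = da1 @ X.T, da1.sum(axis=1, keepdims=True)
    # Step 6: θ update = θ - η × ∇θ
    W1 -= eta * dW1; b1 -= eta * db1
    W2 -= eta * dW2; b2 -= eta * db2

print("final loss:", round(float(loss), 4))
print("predictions:", y_hat.argmax(axis=0))   # target labels: [0, 1, 1, 0]
```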
๐Ÿ€ Big Picture โ€” Basketball Analogy
BasketballML Equivalent
Throwing techniqueParameters (w, b)
Ball goes in?Prediction ลท
Basket locationTrue label y
Distance missedLoss/Error
Coach's feedbackGradient
Adjustment sizeLearning rate ฮท
Practice sessionsTraining iterations

13 Gradient Descent Variants — Faster & Smarter! ⚡
🚀 Momentum-Based GD
Regular GD (flat regions mein stuck!)
w_(t+1) = w_t - η × ∇w_t
Momentum GD (ball rolling down hill!) 🎳
uₜ = β × uₜ₋₁ + ∇wₜ (update direction)
wₜ₊₁ = wₜ - η × uₜ
β (beta) = momentum coefficient; β = 0.9 is common choice
💡 Momentum = speed build-up karta hai — flat regions mein bhi aage badhta hai!
  • β = 0: Regular GD
  • β = 0.9: 90% previous direction use karo
  • Too high β: Overshoot ho sakta hai ⚠️
👀 Nesterov Accelerated Gradient (NAG)
"Look Ahead" Strategy 🔮
NAG Step 1: Lookahead point compute karo → w_lookahead = wₜ - β × uₜ₋₁
NAG Step 2: Gradient at lookahead compute karo → uₜ = β × uₜ₋₁ + ∇w_lookahead
NAG Step 3: Update → wₜ₊₁ = wₜ - η × uₜ
Momentum ❌: Current point pe gradient lete ho, phir jump
NAG ✅: Pehle jump karo, gradient wahan lete ho — zyada accurate!
✅ NAG = less oscillation, faster convergence!
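Dono update rules ek 1-D toy problem pe (hypothetical loss L(w) = w², jiska gradient 2w hai; β aur η yahan demo ke liye chhote rakhe hain). Dono hi w = 0 (minimum) ki taraf jaate hain:

```python
# Toy problem: L(w) = w^2, gradient = 2w (hypothetical demo)
def grad(w):
    return 2 * w

eta, beta, steps = 0.1, 0.5, 100

# Momentum: gradient CURRENT point pe lete hain
w, u = 5.0, 0.0
for _ in range(steps):
    u = beta * u + grad(w)        # uₜ = β·uₜ₋₁ + ∇wₜ
    w = w - eta * u               # wₜ₊₁ = wₜ - η·uₜ
print("Momentum → w ≈", round(w, 4))

# NAG: pehle lookahead point pe jao, gradient WAHAN lete hain
w, u = 5.0, 0.0
for _ in range(steps):
    w_look = w - beta * u         # Step 1: lookahead point
    u = beta * u + grad(w_look)   # Step 2: gradient at lookahead
    w = w - eta * u               # Step 3: update
print("NAG      → w ≈", round(w, 4))
```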
📦 Batch vs SGD vs Mini-Batch
Type | Data Used | Speed
Batch GD | Full dataset | Slow/step
SGD | 1 sample | Fast/step
Mini-Batch | B samples | Best! ⭐
Mini-Batch Update Rule
wₜ₊₁ = wₜ - (η/B) × Σ ∇wₜ (sum over batch)
B = 32 or 64 is standard choice!
⭐ Mini-batch GD = Best balance of speed + accuracy → Default choice in deep learning!

14 Learning Rate Scheduling — Start Big, End Small! 📉
📊 Scheduling Methods
Method 1: Step Decay — Fixed intervals pe halve karo
Every 5 epochs: η = η / 2
Epoch 1-5: η = 0.1
Epoch 6-10: η = 0.05 (halved)
Epoch 11-15: η = 0.025 (halved again)
Method 2: Exponential Decay 📉
ηₜ = η₀ × e^(-kt), k = decay rate, t = current step/epoch
Example: η₀ = 0.1, k = 0.1 → t=0: η = 0.100, t=10: η = 0.037, t=50: η = 0.001
Method 3: 1/t Decay (Inverse Time)
ηₜ = η₀ / (1 + k×t)
Stays higher for longer → more exploration!
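Teeno schedules ek chhote sketch mein (η₀, k aur drop interval upar ke example values hain; step decay yahan 0-indexed epochs maanta hai — sirf illustration):

```python
import numpy as np

eta0 = 0.1

def step_decay(epoch, drop_every=5, factor=0.5):
    # Har drop_every epochs ke baad η ko factor se multiply karo (yahan: halve)
    return eta0 * (factor ** (epoch // drop_every))

def exp_decay(t, k=0.1):
    return eta0 * np.exp(-k * t)            # ηₜ = η₀ × e^(-kt)

def inv_time_decay(t, k=0.1):
    return eta0 / (1 + k * t)               # ηₜ = η₀ / (1 + k×t)

for t in [0, 5, 10, 50]:
    print(f"t={t:>2}  step={step_decay(t):.4f}  "
          f"exp={exp_decay(t):.4f}  1/t={inv_time_decay(t):.4f}")
```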
๐Ÿ† Best Combination for Real Projects
โญ Production mein yeh combination use karo:
Mini-batch (B=32 or 64)
+
Momentum or NAG
+
Learning Rate Scheduling

= Fast, stable, good convergence! ๐Ÿš€
Method | Best For
Batch GD | Small datasets, full memory
SGD | Online learning
Mini-Batch ⭐ | Default — everything!
Step Decay | Common practice
Exp Decay | Image classification
Line Search | Research settings

★ MASTER SUMMARY — Poori Journey! 🗺️
🧬 Biological Neuron: Dendrite → Soma → Axon → Synapse
🔌 MP Neuron (1943): Binary inputs, fixed threshold, no learning
⚖️ Perceptron (1958): Real inputs, learnable weights, only linearly separable
🏛️ MLP: Layers stack → solves XOR → any Boolean fn
📈 Sigmoid Neuron: Smooth, probabilistic, differentiable, can learn!
🚀 FFNN + Backprop: Forward pass → Loss → Backward pass → Update
🎯 Exam mein yaad rakho: Data → Model → Loss → Gradient → Update → Repeat! Yahi Deep Learning hai!