[Supervised ML] Gradient Descent - 3
Gradient Descent
Definition
- Gradient descent is an algorithm for finding the minimum of a cost function.
- For linear regression, it finds the w and b in f(x) = wx + b that minimize the cost function.
Goal: find the values of w and b that make the cost function J(w, b) as small as possible.
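Stated formally, assuming the squared-error cost from the previous post (which is also what the code below implements):

$$\min_{w,\,b} J(w, b), \qquad \text{where } J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2 \text{ and } f_{w,b}(x) = wx + b$$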
Example)
Problem: what is the fastest way to get from the top of a hill down to the lowest point on the ground?
How: by using gradient descent, i.e. by repeatedly taking a step in the steepest downhill direction.
Implementation
Gradient descent algorithm
Alpha (α) represents the learning rate.
The learning rate decides how big a step you take each time you move down the hill.
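The update rule itself (as presented in the course: repeated until convergence, with w and b updated simultaneously) is:

$$w := w - \alpha \frac{\partial}{\partial w} J(w, b), \qquad b := b - \alpha \frac{\partial}{\partial b} J(w, b)$$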
Algorithm Intuition
If you draw a tangent line to J(w) at the current value of w, its slope is the derivative of J(w).
In Case 1 the slope is greater than 0, so the update w := w - learning rate × (slope of J(w)) decreases w.
In Case 2 the slope is negative, so the same update increases w; either way, w moves toward the minimum.
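As a quick check with hypothetical numbers, take a current value w = 3 and α = 0.1:

$$w := 3 - 0.1 \times 2 = 2.8 \ (\text{slope } +2,\ w \text{ decreases}), \qquad w := 3 - 0.1 \times (-2) = 3.2 \ (\text{slope } -2,\ w \text{ increases})$$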
Learning Rate
Finding an adequate learning rate is crucial: if α is too small, gradient descent converges very slowly; if it is too large, the updates can overshoot the minimum and fail to converge.
How do we know when gradient descent has reached a local minimum of the cost function?
- Near a local minimum the derivative becomes smaller, so the update steps also become smaller.
- As a result, gradient descent can reach the minimum without ever decreasing the learning rate alpha.
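A minimal sketch of this behavior, assuming a toy one-dimensional cost J(w) = w² (so dJ/dw = 2w) and a fixed learning rate; the printed step sizes shrink every iteration even though alpha never changes:

# Minimal sketch: gradient descent on a toy cost J(w) = w^2 with a fixed learning rate.
# dJ/dw = 2w, so the step alpha * dJ/dw shrinks automatically as w approaches the minimum at 0.
w = 4.0
alpha = 0.1  # fixed learning rate, never decreased
for i in range(10):
    grad = 2 * w          # derivative of J(w) = w^2 at the current w
    step = alpha * grad   # step size shrinks as the derivative shrinks
    w = w - step
    print(f"iteration {i}: w = {w:.4f}, step = {step:.4f}")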
Gradient descent for Linear Regression
Case 1: derivative of the cost function with respect to w
Case 2: derivative of the cost function with respect to b
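Written out, these are the standard derivatives of the squared-error cost (they match what the code below computes):

$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}, \qquad \frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$$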
Substituting these derivatives into the update rule gives the gradient descent algorithm for linear regression.
Code
Batch gradient descent: Each step of gradient descent uses all the training examples
Computing the derivatives of the cost function
# Returns the derivatives of the cost function with respect to w and b
def compute_gradient(x, y, w, b):
    m = len(x)
    dj_dw = 0
    dj_db = 0
    for i in range(m):
        # prediction for this example
        f_wb_i = w * x[i] + b
        # partial derivatives of the cost for this example
        dj_db_i = f_wb_i - y[i]
        dj_dw_i = (f_wb_i - y[i]) * x[i]
        # add to the running totals
        dj_db = dj_db + dj_db_i
        dj_dw = dj_dw + dj_dw_i
    # divide by the number of examples
    dj_db = (1 / m) * dj_db
    dj_dw = (1 / m) * dj_dw
    return dj_dw, dj_db
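For completeness, here is a minimal sketch of the outer gradient descent loop that uses compute_gradient. The function name gradient_descent, the starting values, alpha, and num_iters are illustrative choices, not from the original post.

# Sketch of batch gradient descent for f(x) = wx + b using compute_gradient above.
def gradient_descent(x, y, w_init, b_init, alpha, num_iters):
    w = w_init
    b = b_init
    for _ in range(num_iters):
        # gradients over the full training set (batch gradient descent)
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        # simultaneous update of w and b
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

# Example usage on a tiny made-up dataset (should learn roughly w = 200, b = 100)
x_train = [1.0, 2.0, 3.0]
y_train = [300.0, 500.0, 700.0]
w_final, b_final = gradient_descent(x_train, y_train, w_init=0, b_init=0, alpha=0.01, num_iters=10000)
print(f"w = {w_final:.2f}, b = {b_final:.2f}")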