
[Supervised ML] Gradient Descent - 3

ghwangbo 2023. 7. 27. 14:30

Gradient Descent

 

Definition


- Gradient descent is an algorithm for finding a minimum of the cost function.

- For linear regression, it finds the w and b in f(x) = wx + b that minimize the cost function J(w, b).

Goal: find w and b that minimize J(w, b).

Example)

Gradient Descent example

Problem: What is the fastest way to get down from a hill to the lowest ground?

How: At each position, take a step in the steepest downhill direction. Repeating this process is gradient descent.

 

Implementation


Gradient descent algorithm

Formula (repeat until convergence, updating w and b simultaneously):

w = w - α * ∂J(w, b)/∂w
b = b - α * ∂J(w, b)/∂b

α (alpha) represents the learning rate

 

The learning rate decides how big a step you take when going down the hill.
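As a rough sketch (my own example, not from the lecture), assume a simple quadratic cost J(w) = w², whose derivative is dJ/dw = 2w. The learning rate directly scales the size of one update step:

```python
# Toy cost J(w) = w**2, so dJ/dw = 2*w (assumed for illustration).
def gradient_step(w, alpha):
    dj_dw = 2 * w              # derivative of J at the current w
    return w - alpha * dj_dw   # one gradient descent update

w = 10.0
print(gradient_step(w, 0.01))  # small alpha -> small step (w moves from 10.0 to ~9.8)
print(gradient_step(w, 0.4))   # large alpha -> big step (w jumps to ~2.0)
```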

 

 

Algorithm Intuition

Intuition

If you draw the tangent line to J(w) at the current point, its slope is the derivative of J(w) there.

 

In Case 1 the slope is greater than 0, so the update w - (learning rate) * (slope of J(w)) decreases w.

In Case 2 the slope is negative, so the same update increases w.
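A tiny numeric check of the two cases (my own example, using an assumed cost J(w) = (w - 3)² with its minimum at w = 3):

```python
# Assumed toy cost J(w) = (w - 3)**2, so dJ/dw = 2*(w - 3); minimum at w = 3.
def step(w, alpha=0.1):
    slope = 2 * (w - 3)        # derivative of J at the current w
    return w - alpha * slope

# Case 1: w = 5 gives slope 4 > 0, so the update decreases w toward 3.
print(step(5.0))
# Case 2: w = 1 gives slope -4 < 0, so the update increases w toward 3.
print(step(1.0))
```

In both cases the sign of the slope pushes w in the right direction automatically.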

 

Learning Rate


Finding an adequate learning rate is crucial: if alpha is too small, gradient descent is slow; if it is too large, the updates can overshoot the minimum and fail to converge.

 

How do we know when we are approaching a local minimum of the cost function?

- Near a local minimum the derivative becomes smaller, so the update steps also become smaller.

Because of this, gradient descent can reach a minimum with a fixed learning rate alpha, without decreasing it.
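This can be seen in a short loop (my own sketch, again with the assumed toy cost J(w) = w²): as w nears the minimum the derivative shrinks, so each step taken with a fixed alpha is smaller than the last:

```python
# Toy cost J(w) = w**2 (assumed); fixed alpha, yet the steps shrink.
w, alpha = 8.0, 0.2
steps = []
for i in range(5):
    step_size = alpha * 2 * w   # alpha * dJ/dw, with dJ/dw = 2*w
    w -= step_size
    steps.append(step_size)
print(steps)  # each step is smaller than the one before
print(w)      # w keeps approaching the minimum at 0
```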

 

Gradient descent for Linear Regression


Gradient descent algorithm

Case 1: derivative of the cost function with respect to w

∂J(w, b)/∂w = (1/m) Σᵢ (f_w,b(x⁽ⁱ⁾) - y⁽ⁱ⁾) · x⁽ⁱ⁾

Case 2: derivative of the cost function with respect to b

∂J(w, b)/∂b = (1/m) Σᵢ (f_w,b(x⁽ⁱ⁾) - y⁽ⁱ⁾)

Simplified

Substituting these derivatives into the update rule, the gradient descent algorithm for linear regression reduces to the equations above.
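As a sanity check (my own, not from the course), the analytic derivatives can be compared against a finite-difference approximation of the cost on a small made-up dataset:

```python
# Squared-error cost and its analytic derivatives for f(x) = w*x + b.
def cost(x, y, w, b):
    m = len(x)
    return sum((w * x[i] + b - y[i]) ** 2 for i in range(m)) / (2 * m)

def analytic_gradient(x, y, w, b):
    m = len(x)
    dj_dw = sum((w * x[i] + b - y[i]) * x[i] for i in range(m)) / m
    dj_db = sum((w * x[i] + b - y[i]) for i in range(m)) / m
    return dj_dw, dj_db

# Hypothetical data and parameters, chosen only for the check.
x, y, w, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 0.5, 0.1
eps = 1e-6
# Central finite differences of the cost in each parameter.
num_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
num_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
dj_dw, dj_db = analytic_gradient(x, y, w, b)
print(abs(dj_dw - num_dw) < 1e-5, abs(dj_db - num_db) < 1e-5)  # True True
```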

Code


Batch gradient descent: Each step of gradient descent uses all the training examples

 

Computing the derivatives of the cost function

# returns the derivatives of the cost function with respect to w and b
def compute_gradient(x, y, w, b):
    m = len(x)
    dj_dw = 0
    dj_db = 0

    for i in range(m):
        # prediction for this example
        f_wb_i = w * x[i] + b
        # partial derivatives of the cost for this example
        dj_db_i = f_wb_i - y[i]
        dj_dw_i = (f_wb_i - y[i]) * x[i]
        # add to the running totals
        dj_db = dj_db + dj_db_i
        dj_dw = dj_dw + dj_dw_i

    # divide by the number of examples
    dj_db = (1 / m) * dj_db
    dj_dw = (1 / m) * dj_dw

    return dj_dw, dj_db
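To tie it together, here is a minimal driver loop (my own sketch, not from the course; the inline gradient computation matches compute_gradient above, repeated here so the snippet runs on its own). The training data is hypothetical:

```python
def gradient_descent(x, y, w, b, alpha, num_iters):
    """Run batch gradient descent and return the final w and b."""
    m = len(x)
    for _ in range(num_iters):
        # same gradients as compute_gradient above, inlined
        dj_dw = sum((w * x[i] + b - y[i]) * x[i] for i in range(m)) / m
        dj_db = sum((w * x[i] + b - y[i]) for i in range(m)) / m
        # simultaneous update of w and b
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b

# Hypothetical training data with an exact fit at w = 200, b = 100.
x_train = [1.0, 2.0]
y_train = [300.0, 500.0]
w_final, b_final = gradient_descent(x_train, y_train, 0.0, 0.0, 1.0e-2, 10000)
print(w_final, b_final)  # approaches w = 200, b = 100
```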

 
