[Supervised ML] Multiple linear regression - 4
1. Intuition
Instead of predicting the house price with only the size of the house, we can use multiple features of the house to make a better prediction.
The standard notation for multiple linear regression with n features is f_w,b(x) = w_1*x_1 + w_2*x_2 + ... + w_n*x_n + b.
w and x can be represented as vectors: w = [w_1, w_2, ..., w_n] and x = [x_1, x_2, ..., x_n].
The equation can then be written in the more compact form f_w,b(x) = w · x + b.
This multiplication of two vectors is called the "dot product".
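As a quick sketch (the weights and feature values below are made up for illustration, not taken from the course), a prediction with 4 features looks like this:
import numpy as np

# hypothetical weights and features: size (sqft), bedrooms, floors, age
w = np.array([0.1, 4.0, 10.0, -2.0])
b = 80.0
x = np.array([1200, 3, 1, 40])
f = np.dot(w, x) + b   # w · x + b = 0.1*1200 + 4*3 + 10*1 + (-2)*40 + 80
print(f)               # predicted price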
2. Vectorization
Def: Vectorization performs a mathematical operation on whole vectors at once (e.g. with NumPy), instead of looping over the elements one at a time.
Without Vectorization:
Case 1:
f = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + b
Case 2:
f = 0
for j in range(0, n):
    f = f + w[j] * x[j]
f = f + b
With vectorization:
import numpy as np

w = np.array([1.0, 2.5, -3.3])
b = 4
x = np.array([10, 20, 30])
f = np.dot(w, x) + b   # prediction: dot product plus bias
Vectorization is much faster because np.dot() uses optimized, parallel hardware operations, while the loop versions compute each term sequentially.
With parallelization, the products w[j] * x[j] are computed simultaneously and then summed efficiently.
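A minimal sketch to see the difference (the array size is an assumption, so exact timings will vary by machine):
import numpy as np
import time

n = 1_000_000
w = np.random.rand(n)
x = np.random.rand(n)
b = 4.0

start = time.time()
f_loop = 0.0
for j in range(n):
    f_loop = f_loop + w[j] * x[j]   # sequential: one term at a time
f_loop = f_loop + b
print(f"loop:   {time.time() - start:.4f} s")

start = time.time()
f_vec = np.dot(w, x) + b            # vectorized: all terms at once
print(f"np.dot: {time.time() - start:.4f} s")
Both compute the same prediction, but np.dot() is dramatically faster as n grows.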
Example - Gradient descent with vectorization
Without vectorization:
for j in range(0, 16):
    w[j] = w[j] - 0.1 * d[j]   # update one weight at a time
With vectorization:
w = w - 0.1 * d   # update all weights in a single vector operation (d holds the gradients)
3. Gradient descent for multiple linear regression
Gradient descent notation with one feature:
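For reference, the standard update rules with a single feature (alpha is the learning rate, m the number of training examples), repeated until convergence:
w = w - alpha * (1/m) * sum_{i=1..m} (f_wb(x^(i)) - y^(i)) * x^(i)
b = b - alpha * (1/m) * sum_{i=1..m} (f_wb(x^(i)) - y^(i))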
Gradient descent notation with multiple features:
Since there are multiple weights w_1, ..., w_n, every w_j has to be updated on each iteration, along with b.
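For reference, with n features the updates become, for every j = 1..n (repeated until convergence, all parameters updated simultaneously):
w_j = w_j - alpha * (1/m) * sum_{i=1..m} (f_wb(x^(i)) - y^(i)) * x_j^(i)
b   = b   - alpha * (1/m) * sum_{i=1..m} (f_wb(x^(i)) - y^(i))
where f_wb(x) = w · x + b.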
4. Code
4.1 Computing Cost for multiple linear regression
import numpy as np

def compute_cost(X, y, w, b):
    # X: (m, n) matrix of examples with multiple features
    # y: (m,) targets, w: (n,) weights, b: scalar bias
    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b          # prediction for example i
        cost = cost + (f_wb_i - y[i])**2      # accumulate squared error
    cost = cost / (2 * m)
    return cost
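A quick usage sketch of compute_cost (the training data below is made up, just to show the expected shapes):
X = np.array([[2104, 5, 1, 45],
              [1416, 3, 2, 40],
              [ 852, 2, 1, 35]])        # m = 3 examples, n = 4 features
y = np.array([460, 232, 178])           # target prices
w = np.array([0.1, 4.0, 10.0, -2.0])
b = 80.0
print(compute_cost(X, y, w, b))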
4.2.1 Gradient descent for multiple linear regression (Derivative)
def compute_gradient(X, y, w, b):
    m, n = X.shape
    dj_dw = np.zeros((n,))
    dj_db = 0.
    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]       # prediction error for example i
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]  # accumulate gradient for each w_j
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    return dj_db, dj_dw
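To tie sections 3 and 4 together, here is a minimal gradient descent loop built on top of compute_gradient; the learning rate and number of iterations below are placeholder values, not prescribed settings:
def gradient_descent(X, y, w_init, b_init, alpha, num_iters):
    w = np.copy(w_init)
    b = b_init
    for _ in range(num_iters):
        dj_db, dj_dw = compute_gradient(X, y, w, b)
        w = w - alpha * dj_dw    # update every w_j at once (vectorized)
        b = b - alpha * dj_db
    return w, b

# example call with placeholder hyperparameters:
# w_final, b_final = gradient_descent(X, y, np.zeros(X.shape[1]), 0., 5e-7, 1000)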