I have recently started DeepLearning.AI’s Deep Learning Specialization on Coursera. Below are my lecture notes from the second week of the first course. The lectures examined vectorized logistic regression as a neural network in preparation for more complex neural networks.

## I. Logistic Regression

### Logistic Regression Input and Output

Given $x$, where $x \in \mathbb{R}^{n_x}$,

we want $\hat{y} = P(y = 1 \mid x)$.

Parameters: $w \in \mathbb{R}^{n_x}$, $b \in \mathbb{R}$

**Output option 1 (linear regression)**:

$\hat{y} = w^T x + b$

- $\hat{y}$ is not always in $[0, 1]$, which makes classification awkward

**Output option 2 (logistic regression)**:

$\hat{y} = \sigma(w^T x + b)$, where $\sigma(z) = \dfrac{1}{1 + e^{-z}}$

- Provides output in $[0, 1]$ for easy binary classification (usually $\hat{y} > 0.5$ is designated as class 1 and $\hat{y} \leq 0.5$ is designated as class 0).
- Takes advantage of the “sigmoid” equation visualized below.

```
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
#Create z and sigma
z = np.linspace(-5,5)
sigma = 1/(1+np.exp(-z))
#Draw prediction cut-off line
plt.axhline(0.5, color='black',ls='--')
#Label axes
plt.xlabel('z')
plt.ylabel(r'$\hat{y}$')
#Hide x tick marks and labels
plt.tick_params(axis='x',bottom=False,labelbottom=False)
#Plot graph
plt.plot(z,sigma,'-',lw=3);
```

### Logistic Regression Loss and Cost Function

**Loss Function**:

For an individual instance, the loss function is:

$\mathcal{L}(\hat{y}, y) = -\left(y \log \hat{y} + (1 - y) \log (1 - \hat{y})\right)$

Intuition:

- If $y = 1$: $\mathcal{L}(\hat{y}, y) = -\log \hat{y}$
  - Minimizing this will ensure that $\log \hat{y}$ is large, which will ensure that $\hat{y}$ is large, i.e. close to 1.
- If $y = 0$: $\mathcal{L}(\hat{y}, y) = -\log (1 - \hat{y})$
  - Minimizing this will ensure that $\log (1 - \hat{y})$ is large, which will ensure that $\hat{y}$ is small, i.e. close to 0.

**Cost Function**:

Across the training set, the cost function is:

$J(w, b) = \dfrac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)})$
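As a quick sketch of how this might look in NumPy (the function name `cost` and the toy arrays are my own; `A` holds predictions $\hat{y}$ and `Y` holds labels, each of shape (1, m)):

```
import numpy as np

def cost(A, Y):
    """Cross-entropy cost averaged over m training examples.
    A: predicted probabilities, shape (1, m); Y: labels, shape (1, m)."""
    m = Y.shape[1]
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

#Confident, correct predictions give a cost near 0
A = np.array([[0.9, 0.1]])
Y = np.array([[1, 0]])
print(cost(A, Y))
```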

### Gradient Descent

$J(w, b)$ is a convex function, so gradient descent will not get stuck in a local minimum.

**Gradient Descent Algorithm**:

For cost function $J(w, b)$ and learning rate $\alpha$,

repeat {

$\quad w := w - \alpha \dfrac{\partial J(w, b)}{\partial w}$

$\quad b := b - \alpha \dfrac{\partial J(w, b)}{\partial b}$

}
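The update rule can be illustrated with a minimal 1-D sketch (my own toy example, not from the lecture): minimizing the convex function $J(w) = (w - 3)^2$, whose derivative is $\frac{dJ}{dw} = 2(w - 3)$.

```
#Minimal 1-D gradient descent sketch (illustrative example)
alpha = 0.1   #learning rate
w = 0.0       #initial parameter
for _ in range(100):
    dw = 2 * (w - 3)    #gradient of J(w) = (w - 3)**2 at current w
    w = w - alpha * dw  #gradient descent update
print(w)  #converges toward the minimum at w = 3
```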

## II. Implementing Vectorized Logistic Regression in Python

Note: Capital letters indicate a matrix rather than a single training instance.

### Vectorization

Vectorization is the art of removing explicit for loops. In Python, for loops are much slower than the equivalent vectorized matrix operations.

To create the product of $w^T$ and $X$, use NumPy’s `np.dot()` function and the `.T` attribute for transpose: `np.dot(w.T, X)`.
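A short sketch comparing the two approaches (shapes are my assumption: `w` is (n, 1) and `X` is (n, m)); both produce the same result:

```
import numpy as np

n, m = 4, 5
w = np.random.randn(n, 1)
X = np.random.randn(n, m)
b = 0.5

#Vectorized: one call computes all m values at once
Z_vec = np.dot(w.T, X) + b          #shape (1, m)

#Explicit loop version for comparison (much slower for large m)
Z_loop = np.zeros((1, m))
for i in range(m):
    Z_loop[0, i] = sum(w[j, 0] * X[j, i] for j in range(n)) + b

print(np.allclose(Z_vec, Z_loop))
```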

### Partial Derivatives

In code, we represent the partial derivatives $\frac{\partial J}{\partial w}$, $\frac{\partial J}{\partial b}$, and $\frac{\partial J}{\partial z}$ as `dw`, `db`, and `dz`.

After taking partial derivatives from a computation graph, we find that $\frac{\partial \mathcal{L}}{\partial z}$ (the change in cost with respect to $z$) is equal to:

$\dfrac{\partial \mathcal{L}}{\partial z} = a - y$

And so in code, we will represent this (vectorized across the training set) as `dZ = A - Y`.

### Single Vectorized Step of Logistic Regression

Calculate $Z$:

$Z = w^T X + b$

Calculate $A$ (convert $Z$ to $[0, 1]$ range with the sigmoid function):

$A = \sigma(Z) = \dfrac{1}{1 + e^{-Z}}$

Calculate $dZ$ (the change in cost with respect to $Z$):

$dZ = A - Y$

Calculate $dw$ and $db$, then update $w$ and $b$ (weights and bias):

$dw = \dfrac{1}{m} X \, dZ^T \qquad db = \dfrac{1}{m} \sum_{i=1}^{m} dZ^{(i)}$

$w := w - \alpha \, dw \qquad b := b - \alpha \, db$
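These calculations can be assembled into one function. This is a sketch, not the course’s reference implementation; the helper name `logistic_step` and the toy data are my own:

```
import numpy as np

def logistic_step(w, b, X, Y, alpha):
    """One vectorized step of logistic regression gradient descent.
    w: (n, 1) weights, b: scalar bias, X: (n, m) inputs, Y: (1, m) labels."""
    m = X.shape[1]
    Z = np.dot(w.T, X) + b    #linear combination, shape (1, m)
    A = 1 / (1 + np.exp(-Z))  #sigmoid activation
    dZ = A - Y                #change in cost with respect to Z
    dw = np.dot(X, dZ.T) / m  #gradient for weights, shape (n, 1)
    db = np.sum(dZ) / m       #gradient for bias
    return w - alpha * dw, b - alpha * db

#Toy data: 2 features, 4 examples, separable on the first feature
X = np.array([[0., 1., 2., 3.], [1., 0., 1., 0.]])
Y = np.array([[0, 0, 1, 1]])
w, b = np.zeros((2, 1)), 0.0
for _ in range(1000):
    w, b = logistic_step(w, b, X, Y, alpha=0.1)

preds = (1 / (1 + np.exp(-(np.dot(w.T, X) + b))) > 0.5).astype(int)
print(preds)  #should match Y after training
```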

## III. Processing Images for Classification and Array Notes

### Image Processing

Images are stored as 3 n x m matrices of pixel intensities (one per color channel), i.e. an (n, m, 3) array. To classify them we need to reshape them into a (n x m x 3, 1) column vector.

```
#Create dummy image
n=64
m=100
img = np.random.randn(n,m,3)
print('Shape of standard image: {}'.format(img.shape))
```

```
#Prepare for training or classification
reshaped_img = img.reshape((img.shape[0]*img.shape[1]*3,1))
print('Shape of reshaped image: {}'.format(reshaped_img.shape))
```

### Broadcasting

Broadcasting refers to NumPy’s automatic expansion of array shapes to allow element-wise calculations between arrays of different (but compatible) shapes.

Given an array of shape (m,n), adding/subtracting/dividing/multiplying by an array or real number with the following shapes will convert it as follows:

- (1,n) -> (m,n) – created by copying the row m times
- (m,1) -> (m,n) – created by copying the column n times
- k -> (m,n) – created by filling a (m,n) matrix with k
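A short demonstration of the three cases (the arrays are my own examples):

```
import numpy as np

M = np.zeros((3, 4))               #shape (m, n) = (3, 4)

row = np.arange(4).reshape(1, 4)   #shape (1, n)
col = np.arange(3).reshape(3, 1)   #shape (m, 1)

print((M + row).shape)  #(3, 4) – the row is copied down 3 times
print((M + col).shape)  #(3, 4) – the column is copied across 4 times
print((M + 5).shape)    #(3, 4) – the scalar fills the whole matrix
```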

### Rank 1 Arrays and Assert

**Avoid using rank 1 arrays**:

These arrays have a shape of (n,). Use `reshape` to give them a dimension of (n,1) or (1,n) to avoid tricky bugs in code.

E.g. use:

`X = np.zeros((5,1))`

Instead of:

`X = np.zeros(5)`

**Use assert to check array shape:**

`assert X.shape == (5,1)`
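A short sketch (my own example) of why rank 1 arrays are error-prone and how `reshape` and `assert` fix them:

```
import numpy as np

a = np.random.randn(5)      #rank 1 array: shape (5,)
print(a.shape)              #(5,)
print(np.allclose(a, a.T))  #True – transposing a rank 1 array does nothing

b = np.random.randn(5, 1)   #proper column vector: shape (5, 1)
print(b.T.shape)            #(1, 5) – transpose now behaves as expected

a = a.reshape((5, 1))       #fix a rank 1 array with reshape
assert a.shape == (5, 1)    #use assert to verify the shape
```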