Neural Networks and Deep Learning

Hui Lin @Google

Course Website

https://smi2021.scientistcafe.com/

Types of Neural Networks

Figure adapted from slides by Andrew Ng, Deep Learning Specialization

A Little Bit of History – Perceptron

\[z^{(i)} = w_0 + w_1x_1^{(i)} + w_2x_2^{(i)}\]

\[pred^{(i)}=\begin{cases} 1 & \text{if } z^{(i)}>0\\ -1 & \text{if } z^{(i)}\le 0 \end{cases}\]
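
A minimal NumPy sketch of this prediction rule (the function name and argument layout are illustrative assumptions):

```python
import numpy as np

def perceptron_predict(w, x):
    """w = [w0, w1, w2]; x = [x1, x2]. Returns the predicted class in {-1, +1}."""
    z = w[0] + np.dot(w[1:], x)   # z = w0 + w1*x1 + w2*x2
    return 1 if z > 0 else -1
```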

Perceptron Algorithm

Start with random weights. Set a maximum number of epochs \(M\). In each epoch, visit the training samples in a random permutation and update the weights for each sample \(i\):

\[w_0 = w_0 + \eta(actual^{(i)} - pred^{(i)})\]

\[w_1 = w_1 + \eta(actual^{(i)} - pred^{(i)})x_1^{(i)}\]

\[w_2 = w_2 + \eta(actual^{(i)} - pred^{(i)})x_2^{(i)}\]
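
A minimal NumPy sketch of this training loop (the learning rate `eta`, epoch cap `M`, and the random seed are illustrative assumptions, not part of the original slides):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, M=100):
    """Perceptron learning. X: (m, 2) feature matrix; y: (m,) labels in {-1, +1}."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=3)                    # random initial weights [w0, w1, w2]
    for _ in range(M):                        # at most M epochs
        for i in rng.permutation(len(X)):     # one permutation of samples per epoch
            z = w[0] + w[1] * X[i, 0] + w[2] * X[i, 1]
            pred = 1 if z > 0 else -1
            err = y[i] - pred                 # actual(i) - pred(i); zero when correct
            w[0] += eta * err
            w[1] += eta * err * X[i, 0]
            w[2] += eta * err * X[i, 1]
    return w
```

Note that the update is nonzero only for misclassified samples; on linearly separable data the loop converges, otherwise it keeps cycling until the epoch cap \(M\) is reached.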


Logistic Regression as a Neural Network

\[X=\left[\begin{array}{cccc} x_{1}^{(1)} & x_{1}^{(2)} & \cdots & x_{1}^{(m)}\\ x_{2}^{(1)} & x_{2}^{(2)} & \cdots & x_{2}^{(m)}\\ \vdots & \vdots & \ddots & \vdots\\ x_{n_{x}}^{(1)} & x_{n_{x}}^{(2)} & \cdots & x_{n_{x}}^{(m)} \end{array}\right]\in\mathbb{R}^{n_{x}\times m}\]

\[y=[y^{(1)},y^{(2)},\dots,y^{(m)}] \in \mathbb{R}^{1 \times m}\]

\(\hat{y}^{(i)} = \sigma(w^Tx^{(i)} + b)\) where \(\sigma(z) = \frac{1}{1+e^{-z}}\)
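
A minimal NumPy sketch of this computation, vectorized over all \(m\) samples at once via \(\hat{y} = \sigma(w^{T}X + b)\) (the shapes and random toy data are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, b, X):
    """w: (n_x, 1) weights; b: scalar bias; X: (n_x, m) column-stacked inputs.
    Returns y_hat of shape (1, m), one probability per sample."""
    Z = w.T @ X + b          # z(i) = w^T x(i) + b, for all i at once
    return sigmoid(Z)

# Example usage with random toy data (shapes only, for illustration)
n_x, m = 4, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(n_x, m))
w = rng.normal(size=(n_x, 1))
b = 0.0
y_hat = forward(w, b, X)     # shape (1, m)
```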

Forward Propagation