## 1. Linear Regression

$$\hat{y} = h_\theta(x) = \theta^T x$$

$$J(\theta) = \frac{1}{2}\sum_i \left(h_\theta(x^i) - y^i\right)^2 = \frac{1}{2}\sum_i \left(\theta^T x^i - y^i\right)^2$$

$$\frac{\partial J(\theta)}{\partial \theta} = \sum_i \left(h_\theta(x^i) - y^i\right) x^i$$
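The cost and gradient above translate directly into batch gradient descent. A minimal NumPy sketch (the function name, learning rate, and iteration count are illustrative choices, not from the notes):

```python
import numpy as np

def linear_regression_gd(X, y, lr=0.01, n_iters=5000):
    """Minimize J(theta) = 0.5 * sum_i (theta^T x^i - y^i)^2 by gradient descent.

    X: (M, D) design matrix, one example x^i per row; y: (M,) targets.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        residual = X @ theta - y      # h_theta(x^i) - y^i for every i
        grad = X.T @ residual         # sum_i (h_theta(x^i) - y^i) x^i
        theta -= lr * grad
    return theta
```

For example, fitting points on the line $y = 2x$ recovers $\theta \approx 2$.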

## 2. Logistic Regression

$$P(y=1\mid x) = h_\theta(x) = \frac{1}{1+\exp(-\theta^T x)}$$

where $\sigma(z) = \frac{1}{1+\exp(-z)}$ is called the sigmoid or logistic function. $h_\theta(x)$ can be interpreted as the probability that $y=1$.

$$J(\theta) = -\sum_i \left(y^i \log h_\theta(x^i) + (1-y^i)\log\left(1 - h_\theta(x^i)\right)\right)$$

$$\frac{\partial J(\theta)}{\partial \theta} = \sum_i \left(h_\theta(x^i) - y^i\right) x^i$$

Logistic regression can be viewed as linear regression whose output is squashed into $(0, 1)$ by the sigmoid function.
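Because the gradient has exactly the same form as in linear regression, the training loop is nearly identical; only the prediction passes through the sigmoid. A minimal sketch (function names and hyperparameters are illustrative):

```python
import numpy as np

def sigmoid(z):
    """The logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_gd(X, y, lr=0.1, n_iters=2000):
    """Minimize the cross-entropy cost J(theta) by gradient descent.

    X: (M, D) design matrix; y: (M,) binary labels in {0, 1}.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ theta)    # h_theta(x^i) = P(y^i = 1 | x^i)
        grad = X.T @ (p - y)      # same form as the linear regression gradient
        theta -= lr * grad
    return theta
```

On linearly separable data, thresholding $h_\theta(x)$ at $0.5$ reproduces the labels.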

## 3. Softmax Regression

$$h_\theta(x) = \frac{1}{\sum_{j=1}^K \exp({\theta^j}^T x)} \left[\begin{array}{c}\exp({\theta^1}^T x)\\ \exp({\theta^2}^T x)\\ \vdots\\ \exp({\theta^K}^T x)\end{array}\right]$$

$$J(\theta) = -\sum_{i=1}^M \sum_{k=1}^K \mathbf{1}(y^i = k) \log\frac{\exp({\theta^k}^T x^i)}{\sum_{j=1}^K \exp({\theta^j}^T x^i)}$$

$$\frac{\partial J(\theta)}{\partial \theta^k} = -\sum_{i=1}^M \left[\mathbf{1}(y^i = k) - \frac{\exp({\theta^k}^T x^i)}{\sum_{j=1}^K \exp({\theta^j}^T x^i)}\right] x^i$$
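Stacking the per-class parameters $\theta^k$ as columns of a matrix lets the gradient for all $K$ classes be computed at once: the indicator $\mathbf{1}(y^i = k)$ becomes a one-hot matrix. A minimal sketch, with illustrative names and hyperparameters (the max-subtraction in `softmax` is a standard numerical-stability trick, not part of the formulas above):

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax; subtracting the row max does not change the result."""
    Z = Z - Z.max(axis=1, keepdims=True)
    e = np.exp(Z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_regression_gd(X, y, K, lr=0.1, n_iters=1000):
    """Minimize the softmax cost by gradient descent.

    X: (M, D) design matrix; y: (M,) integer labels in {0, ..., K-1}.
    Returns Theta of shape (D, K), one column per theta^k.
    """
    M, D = X.shape
    Theta = np.zeros((D, K))
    Y = np.eye(K)[y]                  # one-hot rows: Y[i, k] = 1(y^i = k)
    for _ in range(n_iters):
        P = softmax(X @ Theta)        # P[i, k] = P(y^i = k | x^i)
        grad = -X.T @ (Y - P)         # column k is dJ/dtheta^k
        Theta -= lr * grad
    return Theta
```

Predicting the class with the largest probability, `softmax(X @ Theta).argmax(axis=1)`, recovers the labels on easily separable data.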