[Andrew Ng Machine Learning] Lecture 2 Personal Notes

Published: 2026/6/27 4:09:21
tags: machine learning

2-1 Model representation

$m$: number of training examples
$x$: input
$y$: output

Hypothesis:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

2-2 Cost function

In linear regression, we adjust $\theta_0$ and $\theta_1$ so that $h(x)$ fits the data better. That is, we minimize the function $J(\theta_0,\theta_1)$ with respect to $\theta_0$ and $\theta_1$.

Squared error function: the most commonly used approach for regression problems. If the errors were simply added up, positive and negative errors would cancel each other out, so each error is squared first.

Cost function:

$$J(\theta_0,\theta_1) = \frac{1}{2m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$

Goal:

$$\underset{\theta_0,\theta_1}{\text{min}} \ \underbrace{J(\theta_0,\theta_1)}_{\text{cost function}}$$

2-3 Gradient descent

Gradient descent algorithm: repeat until convergence

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0,\theta_1)$$

$\alpha$: learning rate. If $\alpha$ is too large, gradient descent may fail to converge.

Simultaneous update: $\theta_0$ and $\theta_1$ are both updated from the old values in the same step.

As we approach a local minimum, the slope decreases, so the size of each step automatically becomes smaller and smaller. So, no need to decrease $\alpha$ over time.

"Batch" Gradient Descent

"Batch": Each step of gradient descent uses all the training examples.
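
To make the cost function and the batch gradient descent update concrete, here is a minimal Python sketch for one-variable linear regression. It is an illustration under stated assumptions, not code from the lecture: the function names, the toy data, the starting point (0, 0), the learning rate 0.1, and the fixed iteration count are all chosen for this example. The gradients are obtained by differentiating the squared error cost J, which gives (1/m) Σ (h(x) − y) for θ0 and (1/m) Σ (h(x) − y)·x for θ1.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared error cost J = 1/(2m) * sum of (h(x) - y)^2."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent: every step uses all m training examples.

    alpha and iterations are illustrative defaults (assumptions), and a
    fixed iteration count stands in for "repeat until convergence".
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    for _ in range(iterations):
        errors = [hypothesis(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
        # Simultaneous update: both gradients were computed from the old
        # thetas before either parameter is overwritten.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (made up for this example).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1, cost(theta0, theta1, xs, ys))
```

On this toy data the fitted parameters should end up close to 1 and 2, and the cost close to 0. Note that alpha stays fixed for the whole run: the steps shrink on their own because the gradients shrink near the minimum. Choosing a much larger alpha on the same data is a simple way to observe the failure to converge mentioned in the notes.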