Deep Learning (26) Stochastic Gradient Descent IV: Gradients of Loss Functions
1. Mean Squared Error(MSE)
$$\mathrm{loss} = \sum\left[y - f_\theta(x)\right]^2$$

$$\frac{\nabla\,\mathrm{loss}}{\nabla\theta} = -2\sum\left[y - f_\theta(x)\right]\cdot\frac{\nabla f_\theta(x)}{\nabla\theta}$$

(Note the minus sign: differentiating $[y - f_\theta(x)]^2$ brings out $-\nabla f_\theta(x)/\nabla\theta$.) The model $f_\theta(x)$ can be, for example:

$$f_\theta(x) = \mathrm{sigmoid}(XW + b)$$

$$f_\theta(x) = \mathrm{relu}(XW + b)$$

MSE Gradient
Note: if you do not call tape.watch([w, b]), then w and b must be converted to tf.Variable manually, otherwise the tape will not track them.
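A minimal sketch of the MSE-gradient computation above with TensorFlow's GradientTape; the shapes and the toy data are assumptions for illustration, and tape.watch is used as described in the note:

```python
import tensorflow as tf

# Toy data: 4 samples, 3 features, 2 output classes (shapes are illustrative).
x = tf.random.normal([4, 3])
y = tf.one_hot(tf.constant([0, 1, 1, 0]), depth=2)

# Plain tensors, not tf.Variable, hence the tape.watch below.
w = tf.random.normal([3, 2])
b = tf.zeros([2])

with tf.GradientTape() as tape:
    tape.watch([w, b])                 # required because w, b are not Variables
    prob = tf.sigmoid(x @ w + b)       # f_theta(x) = sigmoid(XW + b)
    loss = tf.reduce_sum(tf.square(y - prob))  # loss = sum[y - f_theta(x)]^2

grads = tape.gradient(loss, [w, b])
print(grads[0].shape, grads[1].shape)  # gradients have the same shapes as w, b
```

Replacing tf.sigmoid with tf.nn.relu gives the relu variant of the model with no other changes.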
2. Cross Entropy Loss
CrossEntropy
$$H([0,1,0],[p_0,p_1,p_2]) = D_{KL}(p\,\|\,q) = -1\cdot\log p_1$$

(For a one-hot target $p$, the entropy $H(p)=0$, so the cross entropy $H(p,q)=H(p)+D_{KL}(p\,\|\,q)$ reduces to the KL divergence.) Recall the derivative of the logarithm:

$$\frac{d}{dx}\log_2 x = \frac{1}{x\cdot\ln 2}$$

The predicted distribution comes from softmax:

$$p = \mathrm{softmax}(\mathrm{logits})$$

3. Softmax
Softmax is a "soft" version of max: it maps logits to a probability distribution.

(1) Derivative
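The derivative of softmax is $\frac{\partial p_i}{\partial a_j} = p_i(\delta_{ij} - p_j)$, where $a$ are the logits. A small sketch (the logits values are arbitrary) that checks this closed form against TensorFlow's autodiff Jacobian:

```python
import tensorflow as tf

a = tf.constant([2.0, 1.0, 0.1])   # arbitrary logits for illustration

with tf.GradientTape() as tape:
    tape.watch(a)
    p = tf.nn.softmax(a)

# Autodiff Jacobian: jac[i, j] = dp_i / da_j
jac = tape.jacobian(p, a)

# Closed form: dp_i/da_j = p_i * (delta_ij - p_j)
closed = tf.linalg.diag(p) - tf.tensordot(p, p, axes=0)

print(tf.reduce_max(tf.abs(jac - closed)).numpy())  # difference is ~0
```

The diagonal terms $p_i(1-p_i)$ are positive and the off-diagonal terms $-p_ip_j$ are negative, so raising one logit lowers every other probability.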
(2) Cross-entropy gradient
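When softmax and cross entropy are combined, the gradient with respect to the logits collapses to the simple form $p - y$. A sketch verifying this with GradientTape (the logits and the one-hot target $[0,1,0]$ from above are illustrative):

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
y = tf.constant([[0.0, 1.0, 0.0]])   # one-hot target, as in H([0,1,0], .)

with tf.GradientTape() as tape:
    tape.watch(logits)
    # Fused softmax + cross entropy (numerically stable)
    loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)

grad = tape.gradient(loss, logits)
p = tf.nn.softmax(logits)
print(tf.reduce_max(tf.abs(grad - (p - y))).numpy())  # difference is ~0
```

This is why frameworks fuse the two operations: the combined gradient $p - y$ is both cheaper and more numerically stable than chaining the softmax Jacobian with the log derivative.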
References:
[1] 龙良曲:《深度学习与TensorFlow2入门实战》