cs231n assignment1 softmax

Softmax

The softmax loss for example $i$ is
$$L_{i} = -\log p_{y_{i}} = -\log\left(\frac{e^{f_{y_{i}}}}{\sum_{j} e^{f_{j}}}\right) = -f_{y_{i}} + \log \sum_{j} e^{f_{j}}$$

For the gradient derivation, see the article 【学习笔记】cs231n中assignment1中的 Softmax exercise.
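
The key result of that derivation, used by both implementations below, is the per-example gradient with respect to the weight column of class $j$, where $p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}$ is the softmax probability of class $j$:

$$\frac{\partial L_i}{\partial W_{:,j}} = \left(p_j - \mathbb{1}\{j = y_i\}\right) x_i$$

That is, every column of dW accumulates $p_j \cdot x_i$, and the column of the correct class additionally subtracts $x_i$.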

Code:
The softmax_loss_naive() function in softmax.py:

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]
    num_class = W.shape[1]
    for i in range(num_train):
        score = X[i].dot(W)
        score -= np.max(score)  # shift scores for numerical stability
        correct_score = score[y[i]]  # score of the correct class
        exp_sum = np.sum(np.exp(score))
        loss += np.log(exp_sum) - correct_score
        for j in range(num_class):
            if j == y[i]:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i] - X[i]
            else:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i]
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
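
A quick way to sanity-check the analytic gradient is to compare it against a numerical gradient on a tiny random problem (the notebook does a similar check with its own helper). The sketch below is hypothetical and self-contained: numeric_grad is not part of the assignment code, and it assumes softmax_loss_naive is defined as above, together with the surrounding template code that initializes loss and dW to zero.

import numpy as np

# Hypothetical finite-difference check; assumes softmax_loss_naive(W, X, y, reg)
# returns (loss, dW) as implemented above.
def numeric_grad(f, W, h=1e-5):
    """Centered finite-difference gradient of the scalar function f at W."""
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old = W[ix]
        W[ix] = old + h
        fxph = f(W)          # f evaluated at W + h in this coordinate
        W[ix] = old - h
        fxmh = f(W)          # f evaluated at W - h in this coordinate
        W[ix] = old          # restore the original value
        grad[ix] = (fxph - fxmh) / (2 * h)
        it.iternext()
    return grad

np.random.seed(0)
X = np.random.randn(5, 4)            # 5 examples, 4 features
y = np.random.randint(0, 3, size=5)  # 3 classes
W = 0.01 * np.random.randn(4, 3)
reg = 0.1

loss, dW = softmax_loss_naive(W, X, y, reg)
dW_num = numeric_grad(lambda w: softmax_loss_naive(w, X, y, reg)[0], W)
print('max abs difference:', np.max(np.abs(dW - dW_num)))  # should be ~1e-8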

The softmax_loss_vectorized() function in softmax.py:

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]

    score = X.dot(W)
    # subtract each row's max for numerical stability; score keeps shape (num_train, num_classes)
    score -= np.max(score, axis=1)[:, np.newaxis]
    # scores of the correct classes, shape (num_train,)
    correct_score = score[range(num_train), y]
    exp_score = np.exp(score)
    # per-row sum of exponentials, shape (num_train,)
    sum_exp_score = np.sum(exp_score, axis=1)
    # compute the loss
    loss = np.sum(np.log(sum_exp_score) - correct_score)
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)

    # compute the gradient: softmax probabilities, minus 1 at the correct classes
    margin = exp_score / sum_exp_score.reshape(num_train, 1)
    margin[np.arange(num_train), y] -= 1
    dW = X.T.dot(margin)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
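
With both versions in place, the usual check is to run them on the same data and confirm that the loss and gradient agree while the vectorized version runs much faster. A minimal sketch, assuming X_dev, y_dev and W come from the notebook's earlier setup cells:

import time
import numpy as np

# Compare the two implementations on the same dev data.
tic = time.time()
loss_naive, grad_naive = softmax_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('naive loss: %e computed in %fs' % (loss_naive, toc - tic))

tic = time.time()
loss_vectorized, grad_vectorized = softmax_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))

# The two results should agree up to floating-point error.
print('loss difference: %f' % np.abs(loss_naive - loss_vectorized))
print('gradient difference: %f' % np.linalg.norm(grad_naive - grad_vectorized, ord='fro'))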

Next, choose suitable hyperparameters.
In the softmax notebook:

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
iters = 2000
for lr in learning_rates:
    for rs in regularization_strengths:
        softmax = Softmax()
        loss_hist = softmax.train(X_train, y_train, learning_rate=lr, reg=rs, num_iters=iters)
        # plot the loss curve for this (lr, reg) combination
        plt.plot(loss_hist)
        plt.xlabel('Iteration number')
        plt.ylabel('Loss value')
        plt.show()

        y_train_pred = softmax.predict(X_train)
        acc_train = np.mean(y_train == y_train_pred)
        y_val_pred = softmax.predict(X_val)
        acc_val = np.mean(y_val == y_val_pred)

        results[(lr, rs)] = (acc_train, acc_val)

        # keep the model with the best validation accuracy
        if best_val < acc_val:
            best_val = acc_val
            best_softmax = softmax

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

For hyperparameter selection you can sweep several values of each parameter at once with a fairly large number of iterations, then progressively narrow the search ranges around the best validation accuracy (see the sketch below).
The final test-set accuracy reaches about 0.38.
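
For example, a coarse-to-fine search might look like the following; the grids here are illustrative assumptions, not the exact values used for the 0.38 result above:

# Round 1: coarse grid (illustrative values)
learning_rates = [1e-8, 1e-7, 1e-6, 1e-5]
regularization_strengths = [1e3, 1e4, 2.5e4, 5e4]
# ...run the search loop above, then inspect the results dict...

# Print all (lr, reg) pairs sorted, to see where validation accuracy peaks
for lr, rs in sorted(results):
    acc_train, acc_val = results[(lr, rs)]
    print('lr %e reg %e train acc: %f val acc: %f' % (lr, rs, acc_train, acc_val))

# Round 2: narrow the grid around the best pair from round 1, e.g. if
# lr=1e-7, reg=2.5e4 did best, try values clustered around them:
learning_rates = [5e-8, 1e-7, 2e-7, 5e-7]
regularization_strengths = [1e4, 2e4, 2.5e4, 3e4]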

Inline Question

Inline Question 1

Why do we expect our loss to be close to -log(0.1)? Explain briefly.

Your Answer:

Because W is initialized with small random values, the classifier assigns roughly equal probability to each of the 10 classes, so the probability of the correct class is about 1/10 and the expected loss per example is about -log(0.1) ≈ 2.3.

Inline Question 2 - True or False

Suppose the overall training loss is defined as the sum of the per-datapoint loss over all training examples. It is possible to add a new datapoint to a training set that would leave the SVM loss unchanged, but this is not the case with the Softmax classifier loss.

Your Answer:
True

Your Explanation:
For the SVM loss, a newly added datapoint contributes exactly 0 loss as long as its correct-class score exceeds every other class score by at least the margin, so the total training loss can stay unchanged. For the softmax loss, every datapoint contributes a strictly positive loss no matter how confidently it is classified (the loss only approaches 0), so adding any new datapoint always changes the total loss.
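
In symbols: the softmax probability of the correct class is always strictly less than 1, so each datapoint's softmax loss is strictly positive,

$$p_{y_i} = \frac{e^{f_{y_i}}}{\sum_j e^{f_j}} < 1 \quad\Longrightarrow\quad L_i = -\log p_{y_i} > 0,$$

whereas the SVM loss $\sum_{j \neq y_i} \max(0, f_j - f_{y_i} + \Delta)$ for a single datapoint is exactly 0 whenever the correct-class score clears every margin.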
