
CS231n Assignment 1: KNN


Code for cs231n/classifiers/k_nearest_neighbor.py:

import numpy as np
class KNearestNeighbor(object):
  """ a kNN classifier with L2 distance """

  def __init__(self):
    pass

  def train(self, X, y):
    """
    Train the classifier. For k-nearest neighbors this is just 
    memorizing the training data.

    Inputs:
    - X: A numpy array of shape (num_train, D) containing the training data
      consisting of num_train samples each of dimension D.
    - y: A numpy array of shape (num_train,) containing the training labels,
      where y[i] is the label for X[i].
    """
    self.X_train = X
    self.y_train = y
    
  def predict(self, X, k=1, num_loops=0):
    """
    Predict labels for test data using this classifier.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data consisting
         of num_test samples each of dimension D.
    - k: The number of nearest neighbors that vote for the predicted labels.
    - num_loops: Determines which implementation to use to compute distances
      between training points and testing points.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].  
    """
    if num_loops == 0:
      dists = self.compute_distances_no_loops(X)
    elif num_loops == 1:
      dists = self.compute_distances_one_loop(X)
    elif num_loops == 2:
      dists = self.compute_distances_two_loops(X)
    else:
      raise ValueError('Invalid value %d for num_loops' % num_loops)

    return self.predict_labels(dists, k=k)

  def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the 
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      for j in range(num_train):
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
        # L2 distance, vectorized over the feature dimension only.
        dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
        # Equivalent form from the reference solution:
        # dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
    return dists

  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.
    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      #######################################################################
      # TODO:                                                               #
      # Compute the l2 distance between the ith test point and all training #
      # points, and store the result in dists[i, :].                        #
      #######################################################################
      # Broadcast X[i, :] against every row of X_train, then reduce over features.
      dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))

      #######################################################################
      #                         END OF YOUR CODE                            #
      #######################################################################
    return dists

  def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train)) 
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    # Expand (x - y)^2 = x.x - 2 x.y + y.y for all pairs at once: the cross
    # term is a single matrix product, and the two squared-norm terms
    # broadcast along rows and columns respectively.
    cross = -2 * np.dot(X, self.X_train.T)                 # (num_test, num_train)
    test_sq = np.sum(np.square(X), axis=1, keepdims=True)  # (num_test, 1)
    train_sq = np.sum(np.square(self.X_train), axis=1)     # (num_train,)
    dists = np.sqrt(cross + test_sq + train_sq)
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists

  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].  
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in range(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = []
      #########################################################################
      # TODO:                                                                 #
      # Use the distance matrix to find the k nearest neighbors of the ith    #
      # testing point, and use self.y_train to find the labels of these       #
      # neighbors. Store these labels in closest_y.                           #
      # Hint: Look up the function numpy.argsort.                             #
      #########################################################################
      # argsort orders training indices by distance; take the k nearest labels.
      closest_y = self.y_train[np.argsort(dists[i, :])[:k]]

      #########################################################################
      # TODO:                                                                 #
      # Now that you have found the labels of the k nearest neighbors, you    #
      # need to find the most common label in the list closest_y of labels.   #
      # Store this label in y_pred[i]. Break ties by choosing the smaller     #
      # label.                                                                #
      #########################################################################
      # bincount tallies votes; argmax breaks ties toward the smaller label.
      y_pred[i] = np.argmax(np.bincount(closest_y))
      #########################################################################
      #                           END OF YOUR CODE                            # 
      #########################################################################

    return y_pred
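
A minimal usage sketch of the class above (the arrays here are random stand-in data; in the notebook the actual inputs are CIFAR-10 subsets):

import numpy as np

# Hypothetical toy data in place of the CIFAR-10 subsets used in the notebook.
X_train = np.random.rand(50, 8)
y_train = np.random.randint(0, 3, size=50)
X_test = np.random.rand(10, 8)

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)        # just memorizes the data
y_pred = classifier.predict(X_test, k=5)  # num_loops=0 -> fully vectorized
print(y_pred.shape)                       # (10,)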


Code from knn.ipynb, the cross-validation part:

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and    #
# y_train_folds should each be lists of length num_folds, where                #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
# Hint: Look up the numpy array_split function.                                #
################################################################################
X_train_folds = np.array_split(X_train,num_folds)
y_train_folds = np.array_split(y_train,num_folds)
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}


################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each        #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all     #
# values of k in the k_to_accuracies dictionary.                               #
################################################################################
for k in k_choices:
    k_to_accuracies[k] = []
    
for k in k_choices:
    for i in range(num_folds):
        # Pitfall: the first i folds are selected with [:i], not [:, i];
        # the latter would index columns instead of list elements.
        X_train_cv = np.vstack(X_train_folds[:i] + X_train_folds[i+1:]) #(4000,3072)
        X_test_cv = X_train_folds[i]                                    #(1000,3072)
        y_train_cv = np.hstack(y_train_folds[:i] + y_train_folds[i+1:]) #(4000,)
        y_test_cv = y_train_folds[i]    #(1000,)
       
          
        # No second KNearestNeighbor() is needed here: train() just overwrites
        # the stored data, so reusing the earlier classifier instance is fine.
        classifier.train(X_train_cv,y_train_cv)
        dists_cv = classifier.compute_distances_no_loops(X_test_cv) #(1000,4000)
        y_test_pred_cv = classifier.predict_labels(dists_cv,k)

        num_correct_cv = np.sum(y_test_pred_cv == y_test_cv)
        accuracy_cv = float(num_correct_cv)/y_test_cv.shape[0]
        k_to_accuracies[k].append(accuracy_cv)
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
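
One way to pick the final k from these results (a sketch; the notebook's own plotting and selection cells are not shown in this excerpt) is to average the accuracies over folds:

# Average per-fold accuracies and take the k with the highest mean.
mean_accuracies = {k: np.mean(accs) for k, accs in k_to_accuracies.items()}
best_k = max(mean_accuracies, key=mean_accuracies.get)
print('best k = %d, mean accuracy = %f' % (best_k, mean_accuracies[best_k]))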

Summary of issues encountered while doing the assignment:
1. The numpy.sum function:

numpy.sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>)

It is simply a summation function; see the documentation: https://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html
The parameters used in this assignment are axis and keepdims.
Example parameters and results:

a = np.arange(12).reshape(3,4)

print(a)   
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(np.sum(a))
#66

print(np.sum(a,axis=0))
#[12 15 18 21]

print(np.sum(a,axis=0,keepdims=True))
#[[12 15 18 21]]

print(np.sum(a,axis=1))
# [ 6 22 38]

print(np.sum(a,axis=1,keepdims=True))
# [[ 6]
#  [22]
#  [38]]

2. The numpy.argsort, numpy.argmax, and numpy.bincount functions:
numpy.argsort: sorts an array along the given axis and returns the indices that would sort it.
numpy.argmax: returns the index of the maximum value in an array.
numpy.bincount: counts how many times each non-negative integer occurs in an array; especially handy for tallying a label vector such as y_train. A short example combining the three follows.
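
The distances and labels below are made up for illustration, mirroring how predict_labels uses these three functions:

import numpy as np

distances = np.array([0.9, 0.1, 0.5, 0.3])  # made-up distances to 4 training points
labels = np.array([2, 0, 1, 0])             # their made-up labels

order = np.argsort(distances)   # [1 3 2 0], indices from nearest to farthest
closest = labels[order[:3]]     # labels of the 3 nearest: [0 0 1]
votes = np.bincount(closest)    # [2 1], label 0 appears twice, label 1 once
print(np.argmax(votes))         # 0, the most common label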

3. The numpy.vstack and numpy.hstack functions:
They stack arrays: vstack stacks arrays row-wise (vertically), hstack stacks them column-wise (horizontally).
Example output:

a = np.arange(12).reshape(3,4)
print(a)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
print(np.vstack(a))
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
print(np.hstack(a))
# [ 0  1  2  3  4  5  6  7  8  9 10 11]
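
Note that in the cross-validation code above they are applied to a list of fold arrays rather than to a single array; a small illustration:

import numpy as np

folds = [np.arange(4).reshape(2, 2), np.arange(4, 8).reshape(2, 2)]
print(np.vstack(folds))   # stacks the two folds into a (4, 2) array
# [[0 1]
#  [2 3]
#  [4 5]
#  [6 7]]
labels = [np.array([0, 1]), np.array([1, 0])]
print(np.hstack(labels))  # concatenates the label vectors
# [0 1 1 0]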

4. The no-loop algorithm for computing the KNN dists matrix, and why it works:
This is my own understanding and not fully rigorous, but it can be derived.
A formal proof can come later; briefly, the idea is this.
Expanding the squared L2 distance, $\|a - b\|^2 = a \cdot a^T - 2\,a \cdot b^T + b \cdot b^T$, gives
$dists[i,j]^2 = X_{train(j)} \cdot X_{train(j)}^T + X_{(i)} \cdot X_{(i)}^T - 2\,X_{(i)} \cdot X_{train(j)}^T$
We can first derive the formula for a whole column $dists[:,j]$ and then for the full $dists$ matrix, or apply numpy.sum and broadcasting directly to the matrices to get the result we want, as compute_distances_no_loops does above.
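
As a quick sanity check (a sketch with random stand-in data), the vectorized result can be compared against the two-loop version:

import numpy as np

X_train = np.random.rand(20, 5)
X_test = np.random.rand(7, 5)

clf = KNearestNeighbor()
clf.train(X_train, np.zeros(20, dtype=int))  # labels are irrelevant for distances

d_slow = clf.compute_distances_two_loops(X_test)
d_fast = clf.compute_distances_no_loops(X_test)
print(np.allclose(d_slow, d_fast))  # True, up to floating-point error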