SIFT algorithm and image classification: Learning notes 2
SIFT algorithm
SIFT (Scale-Invariant Feature Transform) extracts features in the following steps:
1、Image scale space
To recognize an object consistently at different scales, we need to consider the characteristics of the image at different scales, that is, we need the scale space of the image. It is obtained by applying Gaussian blur with different values of σ: the larger σ is, the blurrier the resulting image.
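As a minimal sketch of this step, the snippet below blurs a small synthetic image with Gaussian kernels of increasing σ using NumPy only. The helper names (gaussian_kernel, gaussian_blur), the random test image and the 3σ kernel radius are assumptions for illustration; in practice cv2.GaussianBlur does the same job.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3*sigma and normalized to sum to 1."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: filter every row, then every column."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    # Reflect the borders so the output has the same size as the input
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
image = rng.random((64, 64))   # synthetic stand-in for a grayscale image
scale_space = {sigma: gaussian_blur(image, sigma) for sigma in (1.0, 2.0, 4.0)}
for sigma, blurred in scale_space.items():
    # The spread of the pixel values shrinks as sigma grows (the image gets blurrier)
    print(sigma, blurred.shape, float(blurred.std()))
```

The blur is written as two 1-D passes because a 2-D Gaussian is separable, which keeps the sketch short.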
2、Multiresolution pyramid
By applying Gaussian blur to the image at different sizes, we obtain scale spaces at different resolutions. Together they form the multiresolution pyramid.
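A rough sketch of how such a pyramid can be assembled: each octave blurs the image with a series of σ values, then the image is halved in size for the next octave. The function names, the number of octaves and levels, and the base σ of 1.6 are illustrative assumptions, not values taken from these notes. As a simplification, each octave here starts from a plainly downsampled copy of the input.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a kernel truncated at about 3*sigma."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def build_pyramid(img, octaves=3, levels=4, sigma0=1.6):
    """Return a list of octaves; each octave is a list of (sigma, blurred image)."""
    k = 2 ** (1.0 / (levels - 1))   # sigma doubles from the first to the last level
    pyramid = []
    base = img.astype(np.float64)
    for _ in range(octaves):
        octave = [(sigma0 * k ** i, gaussian_blur(base, sigma0 * k ** i))
                  for i in range(levels)]
        pyramid.append(octave)
        base = base[::2, ::2]       # halve the resolution for the next octave
    return pyramid

image = np.random.default_rng(0).random((64, 64))
pyramid = build_pyramid(image)
for o, octave in enumerate(pyramid):
    print(o, octave[0][1].shape, [round(s, 2) for s, _ in octave])
```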
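3、Difference of Gaussians (DoG operator)
The difference between two adjacent levels of the scale space is obtained by subtracting them: D(x, y, σ) = L(x, y, kσ) - L(x, y, σ), where L(x, y, σ) is the image blurred with a Gaussian of scale σ and k is the ratio between adjacent scales.
Each pixel (x, y) is then compared with its neighborhood in the same scale and in the two adjacent scales to find the extreme values. For example, a pixel in a middle layer is compared with 26 values: its 8 neighbors in the same layer and 9 pixels in each of the layers above and below.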
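The comparison with the 26 neighbors can be sketched as follows. The stack of blurred images is assumed to be a NumPy array indexed as [scale, row, column]; the function names and the synthetic data are assumptions for illustration.

```python
import numpy as np

def difference_of_gaussians(blurred):
    """blurred: array [scale, row, col] of increasingly blurred images.
    Returns the differences between adjacent scales."""
    return blurred[1:] - blurred[:-1]

def is_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly larger or strictly smaller than all
    26 neighbors: 8 in its own scale and 9 in each adjacent scale."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = dog[s, y, x]
    neighbors = np.delete(cube.ravel(), 13)   # index 13 is the center of the 3x3x3 cube
    return bool(center > neighbors.max() or center < neighbors.min())

def find_extrema(dog):
    """Scan all interior positions and collect the (scale, row, col) of extrema."""
    n_scales, height, width = dog.shape
    return [(s, y, x)
            for s in range(1, n_scales - 1)
            for y in range(1, height - 1)
            for x in range(1, width - 1)
            if is_extremum(dog, s, y, x)]

# Synthetic stack of 4 "blurred" images in which one pixel stands out at one scale
blurred = np.zeros((4, 7, 7))
blurred[2, 3, 3] = 1.0
dog = difference_of_gaussians(blurred)
print(dog.shape, find_extrema(dog))
```

Positions on the border of the image and in the first and last DoG layers are skipped because they do not have all 26 neighbors.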
4、Accurate positioning of key points
The extreme points found on the discrete grid are refined by fitting a curve to the DoG function in scale space, which locates the key points accurately.
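One common way to do this fitting, shown here as a sketch, is a second-order Taylor expansion of the DoG function around the discrete extremum: the sub-pixel offset solves H · offset = -g, where g and H are the gradient and Hessian estimated by finite differences. The function name and the [scale, row, col] cube layout are assumptions for illustration.

```python
import numpy as np

def refine_extremum(cube):
    """cube: 3x3x3 DoG values around a discrete extremum, indexed [scale, row, col].
    Returns the offset (dx, dy, ds) of the fitted extremum from the center."""
    c = cube[1, 1, 1]
    # Gradient by central differences, ordered (x, y, scale)
    g = 0.5 * np.array([cube[1, 1, 2] - cube[1, 1, 0],
                        cube[1, 2, 1] - cube[1, 0, 1],
                        cube[2, 1, 1] - cube[0, 1, 1]])
    # Second derivatives by finite differences
    dxx = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    dyy = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    dss = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    dxy = 0.25 * (cube[1, 2, 2] - cube[1, 2, 0] - cube[1, 0, 2] + cube[1, 0, 0])
    dxs = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    dys = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    hessian = np.array([[dxx, dxy, dxs],
                        [dxy, dyy, dys],
                        [dxs, dys, dss]])
    # Setting the derivative of the quadratic model to zero gives hessian @ offset = -g
    return -np.linalg.solve(hessian, g)

# Sample a quadratic with its peak at (x, y, s) = (0.2, -0.1, 0.3) on the 3x3x3 grid
s, y, x = np.meshgrid([-1, 0, 1], [-1, 0, 1], [-1, 0, 1], indexing="ij")
cube = -((x - 0.2) ** 2 + 2 * (y + 0.1) ** 2 + 0.5 * (s - 0.3) ** 2)
offset = refine_extremum(cube)
print(np.round(offset, 3))
```

Because the test function is itself quadratic, the finite differences are exact and the recovered offset matches the true peak position.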
5、Main direction and feature description of key points
The gradients in the neighborhood of each key point are calculated, and the dominant gradient direction is taken as the main direction of the key point. The gradients in the neighborhood are then accumulated into orientation histograms, similar to the HOG feature, to form the feature vector of the key point. This completes the extraction of SIFT features.
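A minimal sketch of the main-direction step: compute the gradient magnitude and angle of every pixel in a patch around the key point, accumulate the magnitudes into an orientation histogram, and take the peak. The 36 bins of 10° each, the function name and the test patch are assumptions for illustration; the Gaussian weighting of the patch is omitted.

```python
import numpy as np

def main_direction(patch, bins=36):
    """Return the center (in degrees) of the strongest bin of the
    magnitude-weighted gradient orientation histogram of a patch."""
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    gy, gx = np.gradient(patch.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(angle, bins=bins, range=(0.0, 360.0), weights=magnitude)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])

# Intensity grows equally along x and y, so every gradient points at 45 degrees
# (measured in image coordinates, where y increases downwards)
yy, xx = np.mgrid[0:9, 0:9]
patch = xx + yy
print(main_direction(patch))
```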
OpenCV SIFT function
The OpenCV library provides functions to compute SIFT features. First create a SIFT object; its detect method returns the key points of the input image.
sift = cv2.xfeatures2d.SIFT_create()
For example:
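import cv2
import numpy as np
# Read the image and convert it to grayscale
img = cv2.imread('cat.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Create the SIFT object (OpenCV 4.4 and later also provide cv2.SIFT_create())
sift = cv2.xfeatures2d.SIFT_create()
# Detect the key points and draw them on the image
kp = sift.detect(gray, None)
img = cv2.drawKeypoints(gray, kp, img)
cv2.imshow('drawKeypoints', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Compute the descriptors of the key points
kp, des = sift.compute(gray, kp)
print(np.array(kp).shape)
print(des.shape)
print(des[0])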
Result:
(1675,)
(1675, 128)
[ 3. 0. 0. 2. 16. 47. 126. 85. 98. 2. 2. 1. 0. 1.
76. 126. 96. 17. 2. 0. 2. 8. 14. 37. 11. 3. 0. 2.
37. 39. 54. 13. 0. 0. 0. 5. 16. 27. 126. 67. 13. 11.
16. 60. 69. 20. 69. 33. 126. 67. 19. 14. 3. 0. 1. 23.
42. 12. 0. 3. 38. 34. 18. 16. 0. 0. 0. 51. 59. 5.
22. 12. 26. 39. 24. 126. 126. 2. 1. 1. 126. 126. 13. 14.
3. 0. 0. 27. 45. 30. 6. 6. 3. 9. 12. 19. 0. 0.
0. 15. 12. 0. 0. 0. 9. 38. 21. 75. 26. 0. 0. 0.
126. 126. 10. 5. 0. 0. 0. 7. 68. 44. 5. 2. 0. 0.
5. 22.]
The image drawn by drawKeypoints shows the positions of the key points. The printed shapes show that 1675 key points were found and that each one has a 128-dimensional feature vector.
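Using SVM to classify images based on SIFT features
step 1: compute SIFT features.
# Compute the feature descriptors of one image
def SiftFeature(img):
    # Note: SURF_create gives 64-dimensional SURF descriptors, matching
    # reshape(0, 64) in build_center; for SIFT use SIFT_create() and 128.
    sift = cv2.xfeatures2d.SURF_create()
    keypoints, features = sift.detectAndCompute(img, None)
    return features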
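step 2: use k-means to cluster the features
# Cluster all features with k-means; the cluster centers form the vocabulary
def learnVocabulary(features):
    species = 2  # number of clusters
    # Stop after 20 iterations or when the required accuracy of 0.1 is reached
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 0.1)
    flags = cv2.KMEANS_RANDOM_CENTERS
    # Run k-means 20 times with different random initial centers and keep the best result
    compactness, labels, centers = cv2.kmeans(features, species, None, criteria, 20, flags)
    return centers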
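To make clear what the clustering in this step does, here is a plain NumPy sketch of the basic k-means iteration (Lloyd's algorithm). It is a simplified stand-in, not OpenCV's implementation. The function name, the fixed iteration count and the synthetic data are assumptions; the initialization picks the first point and then the point farthest from the centers chosen so far, which keeps the example deterministic, whereas cv2.KMEANS_RANDOM_CENTERS starts from random centers.

```python
import numpy as np

def kmeans(points, k, iterations=20):
    """Basic k-means: returns (labels, centers) for an (n, d) array of points."""
    # Farthest-first initialization (an assumption made for determinism)
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=np.float64)
    for _ in range(iterations):
        # Assign every point to its nearest center
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move every center to the mean of its points (keep it if it has none)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated synthetic blobs stand in for the descriptors
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
features = np.vstack([blob_a, blob_b]).astype(np.float32)
labels, centers = kmeans(features, k=2)
print(np.round(centers, 1))
```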
step 3: compute the feature vector of an image
# Count how many features of an image fall into each cluster
def calcFeatVec(features, centers):
    featVec = np.zeros((1, 2))
    for i in range(0, features.shape[0]):
        fi = features[i]
        # Euclidean distance from this feature to each of the 2 centers
        diffMat = np.tile(fi, (2, 1)) - centers
        sqSum = (diffMat**2).sum(axis=1)
        dist = sqSum**0.5
        # Add one count to the nearest center
        sortedIndices = dist.argsort()
        idx = sortedIndices[0]
        featVec[0][idx] += 1
    return featVec
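The same counting can be written without the Python loop. The sketch below uses NumPy broadcasting to compute all distances at once and np.bincount to build the histogram; the function name calc_feat_vec_fast and the toy data are assumptions for illustration. It works for any number of centers rather than only 2.

```python
import numpy as np

def calc_feat_vec_fast(features, centers):
    """Histogram of nearest-center indices, shape (1, number of centers)."""
    # dist[i, j] = Euclidean distance between feature i and center j
    dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(centers))
    return counts.astype(np.float64).reshape(1, -1)

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[1.0, 0.0], [9.0, 9.5], [0.5, 0.5], [-1.0, 2.0]])
print(calc_feat_vec_fast(features, centers))   # [[3. 1.]]
```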
step 4: build and save the cluster centers, then compute the feature vectors of the images
import glob
import matplotlib.pyplot as plt
# Build the cluster centers from all images in a folder and save them
def build_center(path):
    images = [plt.imread(f) for f in glob.glob('%s/*.jpg' % path)]
    features = np.float32([]).reshape(0, 64)
    img_f = list(map(SiftFeature, images))
    for folder in img_f:
        # Skip images in which no key point was found
        if folder is None:
            continue
        features = np.append(features, folder, axis=0)
    centers = learnVocabulary(features)
    filename = "f:/picture/svm/svm_centers.npy"
    np.save(filename, centers)
# Compute the feature vector and the label of every image in a folder
def cal_vec(train_path):
    centers = np.load("f:/picture/svm/svm_centers.npy")
    data_vec = np.float32([]).reshape(0, 2)
    labels = np.float32([])
    cate = [plt.imread(path) for path in glob.glob('%s/*.jpg' % train_path)]
    i = 0  # number of skipped images
    for idx, img in enumerate(cate):
        img_f = SiftFeature(img)
        # Skip images in which no key point was found
        if img_f is None:
            i += 1
            continue
        img_vec = calcFeatVec(img_f, centers)
        data_vec = np.append(data_vec, img_vec, axis=0)
        # The label is the index of the image among the images that were not skipped
        labels = np.append(labels, idx - i)
    print('data_vec:', data_vec.shape)
    print('image features vector done!')
    return data_vec, labels
step 5: train and test
from sklearn import svm
import joblib
# Train an SVM on the feature vectors and save the model
def SVM_Train(data_vec, labels):
    clf = svm.SVC(decision_function_shape='ovo')
    clf.fit(data_vec, labels)
    joblib.dump(clf, "f:/picture/svm/svm_model.m")
# Load the model and return the accuracy and the predictions for the test images
def SVM_Test(test_path):
    clf = joblib.load("f:/picture/svm/svm_model.m")
    centers = np.load("f:/picture/svm/svm_centers.npy")
    data_vec, labels = cal_vec(test_path)
    res = clf.predict(data_vec)
    num_test = data_vec.shape[0]
    # Count the predictions that equal the true labels
    acc = 0
    for i in range(num_test):
        if labels[i] == res[i]:
            acc = acc+1
    return acc/num_test, res
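The accuracy loop at the end of SVM_Test can also be expressed with NumPy. A small sketch with made-up label arrays (the function name and the values are for illustration only):

```python
import numpy as np

def accuracy(labels, predictions):
    """Fraction of predictions that equal the true labels."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    return float(np.mean(labels == predictions))

true_labels = np.array([0, 1, 1, 0, 1])
predicted = np.array([0, 1, 0, 0, 1])
print(accuracy(true_labels, predicted))   # 0.8
```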
After running the code, I found that the classification results are not very good, so there is still room for improvement.
Reference:
1、https://www.cnblogs.com/gzyc/p/11221963.html
2、https://blog.csdn.net/qq_31347869/article/details/88071930
3、https://blog.csdn.net/weixin_42486554/article/details/103732613