kNN + PCA #13

Closed
dzisandy opened this issue Jan 12, 2020 · 1 comment

dzisandy commented Jan 12, 2020

Dear @mxbi, as you asked in the previous issue (KMNIST kNN #10), I've calculated scores for K-49 and MNIST.

  • Code for K-49, using balanced accuracy as specified in the dataset description:
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
from sklearn.decomposition import PCA


def load(f):
    return np.load(f)['arr_0']

# Load the data
x_train = load('k49-train-imgs.npz')
x_test = load('k49-test-imgs.npz')
y_train = load('k49-train-labels.npz')
y_test = load('k49-test-labels.npz')

# Flatten images
x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)

pca = PCA(n_components=60, random_state=0)
x_train = pca.fit_transform(x_train)
x_test = pca.transform(x_test)
clf = KNeighborsClassifier(n_neighbors=4, weights='distance', n_jobs=-1)
clf.fit(x_train, y_train)
p_test = clf.predict(x_test)

# Balanced accuracy (mean of per-class accuracies), as specified in the K-49 description
accs = []
for cls in range(49):
    mask = (y_test == cls)
    cls_acc = (p_test == cls)[mask].mean()
    accs.append(cls_acc)

accs = np.mean(accs)
print('Test accuracy:', accs)

The result is Test accuracy: 0.8679612391951115
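For readers unfamiliar with the metric, here is a minimal sketch of why the per-class averaging matters. The helper name `balanced_accuracy` and the toy labels are made up for illustration and are not part of the K-49 code; unlike the loop above, this version also skips classes that never occur in the true labels. As far as I know, scikit-learn's `balanced_accuracy_score` computes the same quantity (the mean of per-class recall).

```python
import numpy as np


def balanced_accuracy(y_true, y_pred, n_classes):
    """Mean of per-class accuracies, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = []
    for cls in range(n_classes):
        mask = (y_true == cls)
        if mask.any():  # skip classes absent from y_true to avoid a NaN mean
            per_class.append((y_pred[mask] == cls).mean())
    return float(np.mean(per_class))


# Toy, imbalanced labels (hypothetical): class 0 dominates.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 0])

plain = float((y_true == y_pred).mean())                   # 9 of 10 correct
balanced = balanced_accuracy(y_true, y_pred, n_classes=2)  # mean of 1.0 and 0.5
print('Plain accuracy:', plain)
print('Balanced accuracy:', balanced)
```

On imbalanced data such as K-49, plain accuracy is dominated by the frequent classes, while the balanced score penalizes mistakes on rare classes just as much.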

  • Code for MNIST, similar to the code given in the previous issue (the .npz file is taken from Kaggle):
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
from sklearn.decomposition import PCA

# Taken from Kaggle
def load_data(path):
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
        return (x_train, y_train), (x_test, y_test)

# Load the data
(x_train, y_train), (x_test, y_test) = load_data('/content/mnist.npz')

x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)

pca = PCA(n_components=60, random_state=0)
x_train = pca.fit_transform(x_train)
x_test = pca.transform(x_test)

clf = KNeighborsClassifier(n_neighbors=4, weights='distance', n_jobs=-1)
clf.fit(x_train, y_train)
print('Test accuracy:', clf.score(x_test, y_test))

The result is Test accuracy: 0.9776
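Both scripts fit PCA on the training set only and then apply the same projection to the test set. A possible way to keep those two steps tied together is a scikit-learn pipeline. The sketch below is only an illustration under my own assumptions: it uses the small `load_digits` dataset bundled with scikit-learn (8x8 images) instead of MNIST or K-49, and `n_components=20` is an arbitrary choice for that smaller input, not a tuned value.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Stand-in data (assumption): 1797 digit images with 64 features each.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The pipeline fits PCA on the training data only, then reuses that
# projection whenever it predicts or scores on new data.
model = make_pipeline(
    PCA(n_components=20, random_state=0),
    KNeighborsClassifier(n_neighbors=4, weights='distance'),
)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print('Test accuracy:', score)
```

The same `model` object could be passed to cross-validation utilities without risk of fitting PCA on held-out data.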

I hope this will be a useful datapoint and milestone for somebody.

@mxbi mxbi closed this as completed in 12d650e Jan 13, 2020
mxbi commented Jan 13, 2020

I've updated the page, thank you!

@mxbi mxbi mentioned this issue Jan 13, 2020