File:The effect of z-score normalization on k-means clustering.svg
Summary
Description |
English: ```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

n_samples = 300  # points drawn per cluster
cluster_std = 0.15  # not used below; per-cluster scales are set explicitly

np.random.seed(0)
# Each entry is (cluster center, standard deviation).
cluster_centers_scales = (
    ([-1, -1], 0.2),
    ([1, -1], 0.5),
    ([1.5, 1], 0.2),
    ([-0.5, 1], 0.5),
)
data = np.concatenate([
    np.random.normal(loc=center, scale=scale, size=(n_samples, 2))
    for center, scale in cluster_centers_scales
])

kmeans = KMeans(n_clusters=4, random_state=0)

# Rescale the y-axis by 0.02 so the two features have very different scales.
data_stretched = data * np.array([1, 0.02])
labels_stretched = kmeans.fit_predict(data_stretched)

# Z-score normalize each feature, then cluster again.
scaler = StandardScaler()
data_normalized = scaler.fit_transform(data_stretched)
labels_normalized = kmeans.fit_predict(data_normalized)

# Plot both clusterings side by side.
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.scatter(data_stretched[:, 0], data_stretched[:, 1], c=labels_stretched, marker='+')
plt.title('Stretched Data')
plt.subplot(1, 2, 2)
plt.scatter(data_normalized[:, 0], data_normalized[:, 1], c=labels_normalized, marker='+')
plt.title('Z-score Normalized Data')
plt.savefig('z_score_normalization.svg')
plt.show()
``` |
Date | |
Source | Own work |
Author | Cosmia Nebula |
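The normalization step in the script above can be illustrated by hand. The following is a minimal sketch, not part of the original work: the helper name `z_score`, the sample size, and the seed are assumptions chosen for illustration. It applies the usual z-score formula (subtract the column mean, divide by the column's population standard deviation) to data whose second feature has been rescaled by 0.02, as in the script.

```python
import numpy as np

# Assumption: seed and sample size are arbitrary illustrative choices.
rng = np.random.default_rng(0)

# Two features on very different scales, mimicking the 0.02 rescaling
# of the y-axis in the script above.
data = rng.normal(loc=0.0, scale=1.0, size=(1000, 2)) * np.array([1.0, 0.02])


def z_score(x):
    """Standardize each column to zero mean and unit population variance.

    Hypothetical helper for illustration only.
    """
    return (x - x.mean(axis=0)) / x.std(axis=0)


normalized = z_score(data)

# Before normalization the second feature barely contributes to Euclidean
# distances; afterwards both features have a standard deviation of 1.
print("std before:", data.std(axis=0))
print("std after: ", normalized.std(axis=0))
```

Because k-means assigns points by Euclidean distance, a feature with a much smaller spread has little influence on the clustering until both features are brought to a comparable scale.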
Licensing
- You are free:
  - to share – to copy, distribute and transmit the work
  - to remix – to adapt the work
- Under the following conditions:
  - attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  - share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.