For segmentation, we need to group pixels that belong together. K-means clustering does this by iterating the following steps:

  1. Randomly initialize K cluster centers
  2. Assign each point to the closest center
  3. Compute the new center of each cluster as the mean of its assigned pixels
  4. Repeat steps 2 and 3 until convergence (assignments no longer change)
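
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a tuned implementation: the names `kmeans` and `sq_dist`, the `max_iters` cap, the seed, and the choice to initialize from K randomly sampled data points are all assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of d-dimensional tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    # Step 1: pick K of the data points as initial centers (an assumed init scheme).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to its closest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        # Step 4: stop once no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

# Toy "pixels" forming two well-separated groups.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(pixels, k=2)
```

For real images each point would typically be a color or color-plus-position vector per pixel, and a vectorized array library would be the more practical choice.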

Challenges

  • bad starting points lead to poor clusters
    • poor convergence rate or convergence to sub-optimal clustering
    • should try multiple starting points
  • assumes isotropic, convex clusters
  • sensitive to outliers
  • K is a hyperparameter
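
One way to act on the multiple-starting-points advice is to run the algorithm several times and keep the run with the lowest sum of squared distances. The sketch below assumes that selection criterion; `lloyd`, `sse`, `best_of_restarts`, and `n_restarts` are hypothetical names chosen for the example.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd(points, k, rng, max_iters=100):
    """One k-means run from a random start; returns (centers, labels)."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

def best_of_restarts(points, k, n_restarts=10, seed=0):
    """Run k-means n_restarts times and keep the lowest-SSE result."""
    rng = random.Random(seed)
    runs = (lloyd(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: sse(points, *run))
```

Restarts reduce the chance of ending in a poor local minimum but do not remove it, and the cost grows linearly with the number of restarts.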

Goal
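
The notes do not state the objective here. Assuming the standard k-means formulation, the goal is to choose centers and assignments that minimize the total squared distance from each point to its assigned center. The symbols below (x_i for the N points, c_j for the K centers, δ_ij for the assignment indicator) are notation introduced for this sketch.

```latex
\min_{c,\,\delta} \; \sum_{j=1}^{K} \sum_{i=1}^{N} \delta_{ij} \, \lVert x_i - c_j \rVert^2,
\qquad \delta_{ij} \in \{0, 1\}, \quad \sum_{j=1}^{K} \delta_{ij} = 1 \ \text{for every } i
```

Under this reading, the assignment step minimizes over δ with the centers fixed, and the mean-update step minimizes over the centers with δ fixed.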

Evaluating clusters

  • generative
    • how well are points reconstructed from clusters?
  • discriminative
    • how well do clusters correspond to labels?
    • unsupervised clustering doesn’t aim to be discriminative
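
The two views can be made concrete with one measure each. The sketch below assumes mean squared reconstruction error for the generative view and cluster purity for the discriminative view. Purity is one common choice, not something the notes prescribe, and the function names and toy data are illustrative.

```python
from collections import Counter

def reconstruction_error(points, centers, labels):
    """Generative view: mean squared distance from each point to its cluster center."""
    total = sum(sum((a - b) ** 2 for a, b in zip(p, centers[l]))
                for p, l in zip(points, labels))
    return total / len(points)

def purity(labels, true_labels):
    """Discriminative view: fraction of points carrying their cluster's majority label."""
    by_cluster = {}
    for l, t in zip(labels, true_labels):
        by_cluster.setdefault(l, Counter())[t] += 1
    return sum(c.most_common(1)[0][1] for c in by_cluster.values()) / len(labels)

# Toy data: four points, two clusters, made-up ground-truth labels.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers = [(0, 1), (10, 1)]
labels = [0, 0, 1, 1]
print(reconstruction_error(points, centers, labels))        # 1.0
print(purity(labels, ["sky", "sky", "grass", "sky"]))       # 0.75
```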

How to choose the number of clusters?

  • try different values of K on a validation set and compare performance
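
A minimal sketch of that comparison, assuming the performance measure is the mean squared distance from each held-out point to its nearest center. The centers for each K would come from clustering the training data. Here they are hand-written toy values, and `validation_error` and `errors_by_k` are hypothetical names.

```python
def validation_error(val_points, centers):
    """Mean squared distance from each held-out point to its nearest center."""
    return sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
               for p in val_points) / len(val_points)

def errors_by_k(centers_by_k, val_points):
    """Map each candidate K to its validation error."""
    return {k: validation_error(val_points, centers)
            for k, centers in sorted(centers_by_k.items())}

# Toy centers, standing in for the result of clustering training data at each K.
centers_by_k = {
    1: [(5.0, 1.0)],
    2: [(0.0, 1.0), (10.0, 1.0)],
    3: [(0.0, 1.0), (10.0, 0.0), (10.0, 2.0)],
}
val_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(errors_by_k(centers_by_k, val_points))  # {1: 26.0, 2: 1.0, 3: 0.5}
```

This error usually keeps shrinking as K grows, so rather than taking the minimum, one typically looks for the K after which the improvement levels off (here, K = 2).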

Pros and cons

  • pros
    • good representation of data
    • simple, fast, and easy
  • cons
    • need to choose K
    • sensitive to outliers
    • prone to local minima
    • all clusters have the same parameters (non-adaptive)
    • can be slow for large inputs: each iteration is O(KNd) for N d-dimensional pixels