DBSCAN (Define $\epsilon$ and MinPts )

Density-based spatial clustering of applications with noise (DBSCAN) is a clustering algorithm that can handle clusters with different densities. The basic idea is as follows the dense region is the cluster the sparce region is considered noise. Some of the important definitions are as follows:

  1. Core point: A point is a core point if it has at least a minimum number of points (MinPts) within its radius \(\epsilon\).

  2. Border point: A point that is not a core point but is within the radius \(\epsilon\) of a core point.

  3. Noise point: A point that is neither a core point nor a border point.

  4. Density edge: Edge between core points such that the distance between the two core points is less than \(\epsilon\).

  5. Density connected: A point \(p\) is density connected to a point \(q\) if there exists a path such that all the edges in the path are density edges.

Algorithm:

  1. Label all points as core, border or noise points and remove noise points.

  2. All the core points that are density connected form a cluster.

  3. Assign each border point to the closest core point’s cluster.

Important Points

  1. No controle over the number of clusters.
  2. Can take arbitrary shapes.
  3. Not sensitive to outliers.
  4. Varing density clusters can not be handled.
  5. High dimensional data can not be handled.(Curse of dimensionality)

results matching ""

    No results matching ""