:mod:`heat.cluster.kmedoids`
============================

.. py:module:: heat.cluster.kmedoids

.. autoapi-nested-parse::

   Module implementing the k-medoids algorithm



Module Contents
---------------

.. py:class:: KMedoids(n_clusters: int = 8, init: Union[str, heat.core.dndarray.DNDarray] = 'random', max_iter: int = 300, random_state: int = None)

   Bases: :class:`heat.cluster._kcluster._KCluster`

   K-medoids with the Manhattan distance as a fixed metric. Each new cluster center is computed as the median of the
   points assigned to that cluster and is then snapped to the nearest data point.
   This is not the original k-medoids implementation based on PAM, as proposed in [1].

   :param n_clusters: The number of clusters to form as well as the number of centroids to generate.
   :type n_clusters: int, optional, default: 8
   :param init: Method for initialization:

                - ``'k-medoids++'``: selects initial cluster centers for the clustering in a smart way to speed up convergence [2].
                - ``'random'``: chooses k observations (rows) at random from the data for the initial centroids.
                - DNDarray: gives the initial centers; should be of shape (n_clusters, n_features).
   :type init: str or DNDarray, default: ``'random'``
   :param max_iter: Maximum number of iterations of the algorithm for a single run.
   :type max_iter: int, default: 300
   :param random_state: Determines random number generation for centroid initialization.
   :type random_state: int

   .. rubric:: References

   [1] Kaufman, L. and Rousseeuw, P.J. (1987), Clustering by means of Medoids, in Statistical Data Analysis Based on the L1 Norm and Related Methods, edited by Y. Dodge, North-Holland, 405-416.

   .. role:: raw-html(raw)
       :format: html

   .. method:: _update_centroids(x: heat.core.dndarray.DNDarray, matching_centroids: heat.core.dndarray.DNDarray)

      Compute each new centroid ``ci`` as the sample closest to the median of the data points in ``x`` that are assigned to ``ci``.

      :param x: Input data
      :type x: DNDarray
      :param matching_centroids: Array filled with indices ``i`` indicating to which cluster ``ci`` each sample point in ``x`` is assigned
      :type matching_centroids: DNDarray


   .. method:: fit(x: heat.core.dndarray.DNDarray, oversampling: float = 2, iter_multiplier: float = 1)

      Computes the centroids of a k-medoids clustering.

      :param x: Training instances to cluster. Shape = (n_samples, n_features)
      :type x: DNDarray
      :param oversampling: Oversampling factor used in the k-means|| initialization of centroids
      :type oversampling: float
      :param iter_multiplier: Factor that increases the number of iterations used in the initialization of centroids
      :type iter_multiplier: float
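
The median-then-snap update described above can be illustrated with a minimal single-process NumPy sketch. This is not Heat's implementation and does not use ``DNDarray``. The helper names ``manhattan_distances``, ``update_medoids`` and ``kmedoids_sketch`` are hypothetical. Two behaviours are assumptions that the description above does not specify: the snap searches all samples rather than only cluster members, and an empty cluster keeps its previous center.

```python
import numpy as np


def manhattan_distances(x, centers):
    """Pairwise L1 distances with shape (n_samples, n_centers)."""
    return np.abs(x[:, None, :] - centers[None, :, :]).sum(axis=2)


def update_medoids(x, labels, centers):
    """Replace each center by the sample closest to its cluster median."""
    new_centers = centers.copy()
    for i in range(centers.shape[0]):
        members = x[labels == i]
        if members.shape[0] == 0:
            continue  # assumption: an empty cluster keeps its old center
        median = np.median(members, axis=0)
        # assumption: snap to the nearest sample among all rows of x
        nearest = manhattan_distances(x, median[None, :]).argmin()
        new_centers[i] = x[nearest]
    return new_centers


def kmedoids_sketch(x, init_centers, max_iter=300):
    """Alternate assignment and median-snap updates until centers stop moving."""
    x = np.asarray(x, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(max_iter):
        labels = manhattan_distances(x, centers).argmin(axis=1)
        new_centers = update_medoids(x, labels, centers)
        if np.array_equal(new_centers, centers):
            break
        centers = new_centers
    # final assignment against the returned centers
    labels = manhattan_distances(x, centers).argmin(axis=1)
    return centers, labels


# Two well-separated groups; initial centers are given explicitly as rows of the data.
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
medoids, assignment = kmedoids_sketch(data, data[[1, 4]])
```

Because every center is copied from a row of the input, the returned medoids are always actual data points, which is the property that distinguishes this scheme from a plain k-medians update.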