:mod:`heat.cluster.kmedians`
============================

.. py:module:: heat.cluster.kmedians

.. autoapi-nested-parse::

   Module implementing the k-medians algorithm.


Module Contents
---------------

.. py:class:: KMedians(n_clusters: int = 8, init: Union[str, heat.core.dndarray.DNDarray] = 'random', max_iter: int = 300, tol: float = 0.0001, random_state: int = None)

   Bases: :class:`heat.cluster._kcluster._KCluster`

   K-Medians clustering algorithm [1]. Uses the Manhattan (city-block, :math:`L_1`) metric for distance calculations.

   :param n_clusters: The number of clusters to form, which is also the number of centroids to generate.
   :type n_clusters: int, optional, default: 8
   :param init: Method for initialization:

                - 'k-medians++': selects initial cluster centers in a smart way to speed up convergence [2].
                - 'random': chooses k observations (rows) at random from the data as the initial centroids.
                - 'batchparallel': initializes using the batch-parallel algorithm (see BatchParallelKMedians for more information).
                - DNDarray: gives the initial centers; it should have shape (n_clusters, n_features).
   :type init: str or DNDarray, default: 'random'
   :param max_iter: Maximum number of iterations of the k-medians algorithm for a single run.
   :type max_iter: int, default: 300
   :param tol: Relative tolerance with regard to inertia to declare convergence.
   :type tol: float, default: 1e-4
   :param random_state: Determines random number generation for centroid initialization.
   :type random_state: int

   .. rubric:: References

   [1] Hakimi, S., and O. Kariv. "An algorithmic approach to network location problems II: The p-medians." SIAM Journal on Applied Mathematics 37.3 (1979): 539-560.

   .. attribute:: _p
      :annotation: = 1

   .. role:: raw-html(raw)
      :format: html

   .. method:: _update_centroids(x: heat.core.dndarray.DNDarray, matching_centroids: heat.core.dndarray.DNDarray)

      Compute the coordinates of each new centroid as the median of the data points in ``x`` that are assigned to it.

      :param x: Input data
      :type x: DNDarray
      :param matching_centroids: Array filled with indices ``i`` indicating to which cluster ``ci`` each sample point in ``x`` is assigned
      :type matching_centroids: DNDarray

   .. method:: fit(x: heat.core.dndarray.DNDarray, oversampling: float = 2, iter_multiplier: float = 1)

      Computes the centroids of a k-medians clustering.

      :param x: Training instances to cluster. Shape = (n_samples, n_features)
      :type x: DNDarray
      :param oversampling: Oversampling factor used in the k-means|| initialization of centroids
      :type oversampling: float
      :param iter_multiplier: Factor that increases the number of iterations used in the initialization of centroids
      :type iter_multiplier: float
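
The assignment and update steps described above can be illustrated with a minimal single-process sketch in plain NumPy. This is not Heat's implementation: the helper names ``assign_l1``, ``update_centroids`` and ``kmedians`` are hypothetical, the data are ordinary ``numpy.ndarray`` objects rather than ``DNDarray``, and convergence is checked on the total centroid shift instead of the relative inertia tolerance documented for ``tol``. Leaving the centroid of an empty cluster unchanged is likewise an assumption of this sketch.

```python
import numpy as np


def assign_l1(x, centroids):
    """Return, for each row of x, the index of the nearest centroid in L1 distance."""
    # Pairwise Manhattan distances, shape (n_samples, n_clusters)
    dist = np.abs(x[:, None, :] - centroids[None, :, :]).sum(axis=2)
    return dist.argmin(axis=1)


def update_centroids(x, matching_centroids, centroids):
    """Move each centroid to the feature-wise median of the points assigned to it."""
    new = centroids.copy()
    for i in range(centroids.shape[0]):
        members = x[matching_centroids == i]
        if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
            new[i] = np.median(members, axis=0)
    return new


def kmedians(x, init, max_iter=300, tol=1e-4):
    """Alternate L1 assignment and median update until the centroids stop moving."""
    centroids = np.asarray(init, dtype=float)
    for _ in range(max_iter):
        labels = assign_l1(x, centroids)
        new = update_centroids(x, labels, centroids)
        shift = np.abs(new - centroids).sum()  # simplified stopping criterion
        centroids = new
        if shift <= tol:
            break
    return centroids, assign_l1(x, centroids)


# Two well-separated groups of three points each
x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [10.0, 10.0], [11.0, 10.0], [10.0, 11.0]])
init = x[[0, 3]]  # explicit initial centers, shape (n_clusters, n_features)
centers, labels = kmedians(x, init)
print(centers)
print(labels)
```

Because each centroid is a feature-wise median rather than a mean, a single outlying point in a cluster moves the centroid far less than it would under k-means.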