 
sklearn.cluster.KMeans()
- Python for Integrated Circuits -
- An Online Book -



=================================================================================

The k-means problem is solved using either Lloyd’s or Elkan’s algorithm.

class sklearn.cluster.KMeans(n_clusters=8, *, init='k-means++', n_init=10, max_iter=300, tol=0.0001, verbose=0, random_state=None, copy_x=True, algorithm='lloyd')

Parameters:
         n_clusters : int, default=8
                The number of clusters to form as well as the number of centroids to generate.
         init : {'k-means++', 'random'}, callable or array-like of shape (n_clusters, n_features), default='k-means++'
             Method for initialization:
                 'k-means++' : selects initial cluster centroids using sampling based on an empirical probability distribution of the points' contribution to the overall inertia. This technique speeds up convergence, and is theoretically proven to be O(log k)-optimal. See the description of n_init for more details.
                 'random' : chooses n_clusters observations (rows) at random from the data for the initial centroids.
                 If an array is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.
                 If a callable is passed, it should take the arguments X, n_clusters and a random state and return an initialization.
         n_init : int, default=10
                 Number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of the n_init consecutive runs in terms of inertia.
         max_iter : int, default=300
                Maximum number of iterations of the k-means algorithm for a single run.
         tol : float, default=1e-4
                Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations to declare convergence.
         verbose : int, default=0
                Verbosity mode.
         random_state : int, RandomState instance or None, default=None
                Determines random number generation for centroid initialization. Use an int to make the randomness deterministic.
         copy_x : bool, default=True
                When pre-computing distances it is more numerically accurate to center the data first. If copy_x is True (default), the original data is not modified. If False, the original data is modified and put back before the function returns, but small numerical differences may be introduced by subtracting and then adding back the data mean. Note that if the original data is not C-contiguous, a copy will be made even if copy_x is False. If the original data is sparse, but not in CSR format, a copy will be made even if copy_x is False.
         algorithm : {"lloyd", "elkan", "auto", "full"}, default="lloyd"
                 K-means algorithm to use. The classical EM-style algorithm is "lloyd". The "elkan" variation can be more efficient on some datasets with well-defined clusters, by using the triangle inequality. However, it is more memory intensive due to the allocation of an extra array of shape (n_samples, n_clusters).
                 "auto" and "full" are deprecated and will be removed in scikit-learn 1.3; both are aliases for "lloyd".

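The sketch below is a minimal, illustrative usage example rather than part of the scikit-learn documentation: it fits KMeans on made-up 2-D data using the parameters documented above, and it assumes a scikit-learn version recent enough to accept algorithm='lloyd'.

# Minimal usage sketch (illustrative only): fit KMeans with the parameters
# documented above on a small synthetic data set.
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of 2-D points (made-up example data).
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.3, 7.7], [7.9, 8.4]])

km = KMeans(
    n_clusters=2,        # number of clusters / centroids to generate
    init='k-means++',    # probability-based seeding described above
    n_init=10,           # keep the best of 10 runs with different seeds
    max_iter=300,        # cap on iterations per run
    tol=1e-4,            # convergence tolerance on centroid movement
    random_state=0,      # deterministic initialization
    algorithm='lloyd',   # classical EM-style algorithm
)
km.fit(X)
print(km.predict([[0.9, 2.1], [8.1, 8.0]]))   # two points, one label per blob
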
Attributes:
         cluster_centers_ : ndarray of shape (n_clusters, n_features)
                Coordinates of cluster centers. If the algorithm stops before fully converging (see tol and max_iter), these will not be consistent with labels_.
         labels_ : ndarray of shape (n_samples,)
                Labels of each point.
         inertia_ : float
                Sum of squared distances of samples to their closest cluster center, weighted by the sample weights if provided.
         n_iter_ : int
                Number of iterations run.
         n_features_in_ : int
                Number of features seen during fit.
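Once fit() has been called, these attributes can be read directly from the estimator. The following self-contained sketch (again illustrative, with made-up data) prints each of them:

# Inspecting the fitted attributes documented above (illustrative sketch).
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.cluster_centers_)   # (n_clusters, n_features) centroid coordinates
print(km.labels_)            # cluster index assigned to each training sample
print(km.inertia_)           # sum of squared distances to closest centers
print(km.n_iter_)            # iterations used by the best of the n_init runs
print(km.n_features_in_)     # number of features seen during fit (here 2)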

=================================================================================