ee.Clusterer.wekaKMeans
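Clusters data using the k-means algorithm with either the Euclidean distance (default) or the Manhattan distance. With the Manhattan distance, centroids are computed as the component-wise median rather than the mean. For more information, see:
D. Arthur, S. Vassilvitskii: k-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 1027-1035, 2007.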
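
The centroid rule described above can be illustrated with a small, library-free sketch. This is not the Weka or Earth Engine implementation; the `centroid` helper and the sample points are hypothetical, and only the mean-versus-median behavior is taken from the description.

```python
from statistics import mean, median


def centroid(points, distance_function="Euclidean"):
    """Return the centroid of a non-empty list of equal-length points.

    Hypothetical helper: uses the component-wise median for "Manhattan"
    and the component-wise mean otherwise.
    """
    aggregate = median if distance_function == "Manhattan" else mean
    # zip(*points) groups the values of each component (column) together.
    return [aggregate(component) for component in zip(*points)]


# Made-up cluster members, chosen so that mean and median differ.
cluster = [(0.0, 0.0), (1.0, 2.0), (8.0, 4.0)]
euclidean_centroid = centroid(cluster)               # component-wise mean
manhattan_centroid = centroid(cluster, "Manhattan")  # component-wise median
```

The outlying first component (8.0) pulls the mean further than the median, which is one reason the median is the natural centroid when distances are measured with the Manhattan metric.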
| Usage | Returns |
|---|---|
| `ee.Clusterer.wekaKMeans(nClusters, init, canopies, maxCandidates, periodicPruning, minDensity, t1, t2, distanceFunction, maxIterations, preserveOrder, fast, seed)` | Clusterer |
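
A minimal sketch of how the clusterer might be used from the Earth Engine Python client, assuming the `earthengine-api` package is installed and `ee.Initialize()` has already been called. The helper name `cluster_image`, the option values, and the sampling settings (`scale`, `numPixels`) are illustrative assumptions, not recommendations from this page.

```python
# Keyword arguments follow the names in the usage signature; values are
# illustrative only.
KMEANS_OPTIONS = {
    "nClusters": 5,
    "init": 1,  # 1 = k-means++
    "distanceFunction": "Euclidean",
    "seed": 10,
}


def cluster_image(image, region, scale=30, num_pixels=5000):
    """Train a k-means clusterer on sampled pixels and apply it to `image`.

    Hypothetical helper. `image` is an ee.Image and `region` an ee.Geometry;
    an authenticated, initialized Earth Engine session is assumed.
    """
    import ee  # imported lazily so the module loads without the client

    # Sample pixels from the image to build the training FeatureCollection.
    training = image.sample(region=region, scale=scale, numPixels=num_pixels)
    # Construct the clusterer and train it on the sampled band values.
    clusterer = ee.Clusterer.wekaKMeans(**KMEANS_OPTIONS).train(training)
    # Assign every pixel to a cluster; the result is an ee.Image.
    return image.cluster(clusterer)
```

Only `nClusters` is required; the remaining arguments fall back to their defaults when omitted.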
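| Argument | Type | Details |
|---|---|---|
| `nClusters` | Integer | Number of clusters. |
| `init` | Integer, default: 0 | Initialization method to use. 0 = random, 1 = k-means++, 2 = canopy, 3 = farthest first. |
| `canopies` | Boolean, default: false | Use canopies to reduce the number of distance calculations. |
| `maxCandidates` | Integer, default: 100 | Maximum number of candidate canopies to retain in memory at any one time when using canopy clustering. The T2 distance, together with the characteristics of the data, determines how many candidate canopies are formed before periodic and final pruning are performed, which might result in excess memory consumption. This setting prevents large numbers of candidate canopies from consuming memory. |
| `periodicPruning` | Integer, default: 10000 | How often to prune low-density canopies when using canopy clustering. |
| `minDensity` | Integer, default: 2 | Minimum canopy density, when using canopy clustering, below which a canopy is pruned during periodic pruning. |
| `t1` | Float, default: -1.5 | The T1 distance to use when using canopy clustering. A value < 0 is taken as a positive multiplier for T2. |
| `t2` | Float, default: -1 | The T2 distance to use when using canopy clustering. Values < 0 cause a heuristic based on attribute standard deviation to be used. |
| `distanceFunction` | String, default: "Euclidean" | Distance function to use. Options are: Euclidean and Manhattan. |
| `maxIterations` | Integer, default: null | Maximum number of iterations. |
| `preserveOrder` | Boolean, default: false | Preserve order of instances. |
| `fast` | Boolean, default: false | Enables faster distance calculations, using cut-off values. Disables the calculation/output of squared errors/distances. |
| `seed` | Integer, default: 10 | The randomization seed. |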
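
The `t1` rule in the table (a negative value acts as a positive multiplier for T2) can be sketched as follows. The `resolve_t1` helper is hypothetical and assumes T2 has already been resolved to a positive distance; the standard-deviation heuristic used for a negative `t2` is not reproduced here because this page does not specify it.

```python
def resolve_t1(t1, t2):
    """Return the effective T1 distance for canopy clustering.

    Hypothetical helper. Assumes `t2` is already a positive distance.
    A negative `t1` is interpreted as a positive multiplier for T2.
    """
    if t2 <= 0:
        raise ValueError("t2 must be resolved to a positive distance first")
    return -t1 * t2 if t1 < 0 else t1


default_t1 = resolve_t1(-1.5, 2.0)  # default t1: 1.5 times T2
explicit_t1 = resolve_t1(4.0, 2.0)  # a non-negative t1 is used as given
```

With the default `t1` of -1.5, the effective T1 is therefore 1.5 times whatever T2 turns out to be.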
Last updated 2025-07-26 UTC.