Example 1 with DistanceType

Use of com.alibaba.alink.params.shared.clustering.HasKMeansDistanceType.DistanceType in project Alink by alibaba.

From the class Chap20, method c_3.

static void c_3() throws Exception {
    Stopwatch sw = new Stopwatch();
    sw.start();
    AlinkGlobalConfiguration.setPrintProcessInfo(true);
    AkSourceBatchOp source = new AkSourceBatchOp().setFilePath(Chap17.DATA_DIR + Chap17.VECTOR_FILE);
    KMeans kmeans = new KMeans().setVectorCol(Chap17.VECTOR_COL_NAME).setPredictionCol(Chap17.PREDICTION_COL_NAME);
    // 4-fold grid search over k in {2..6} and two distance types, scored by the Rand index (RI).
    GridSearchCV cv = new GridSearchCV()
        .setNumFolds(4)
        .setEstimator(kmeans)
        .setParamGrid(new ParamGrid()
            .addGrid(kmeans, KMeans.K, new Integer[] { 2, 3, 4, 5, 6 })
            .addGrid(kmeans, KMeans.DISTANCE_TYPE, new DistanceType[] { DistanceType.EUCLIDEAN, DistanceType.COSINE }))
        .setTuningEvaluator(new ClusterTuningEvaluator()
            .setVectorCol(Chap17.VECTOR_COL_NAME).setPredictionCol(Chap17.PREDICTION_COL_NAME)
            .setLabelCol(Chap17.LABEL_COL_NAME).setTuningClusterMetric(TuningClusterMetric.RI))
        .enableLazyPrintTrainInfo();
    GridSearchCVModel bestModel = cv.fit(source);
    // Apply the best model to the source data and print the clustering metrics lazily.
    bestModel.transform(source).link(new EvalClusterBatchOp()
        .setLabelCol(Chap17.LABEL_COL_NAME).setVectorCol(Chap17.VECTOR_COL_NAME)
        .setPredictionCol(Chap17.PREDICTION_COL_NAME).lazyPrintMetrics());
    BatchOperator.execute();
    sw.stop();
    System.out.println(sw.getElapsedTimeSpan());
}
Also used :
ParamGrid (com.alibaba.alink.pipeline.tuning.ParamGrid)
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp)
KMeans (com.alibaba.alink.pipeline.clustering.KMeans)
ClusterTuningEvaluator (com.alibaba.alink.pipeline.tuning.ClusterTuningEvaluator)
Stopwatch (com.alibaba.alink.common.utils.Stopwatch)
GridSearchCV (com.alibaba.alink.pipeline.tuning.GridSearchCV)
DistanceType (com.alibaba.alink.params.shared.clustering.HasKMeansDistanceType.DistanceType)
GridSearchCVModel (com.alibaba.alink.pipeline.tuning.GridSearchCVModel)
EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp)
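
The grid in c_3 compares two distance types (EUCLIDEAN and COSINE) and ranks candidates by the Rand index (RI). The sketch below is plain Java with no Alink dependency and only illustrates what those quantities usually mean. The class name DistanceAndRandIndexSketch and its method names are hypothetical, and cosine distance is assumed to be 1 minus cosine similarity; Alink's internal implementations may differ in detail.

```java
public class DistanceAndRandIndexSketch {

    /** Euclidean distance: square root of the summed squared differences. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Cosine distance, taken here as 1 - cosine similarity (an assumption). */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Rand index: fraction of point pairs on which labels and predictions agree. */
    static double randIndex(int[] labels, int[] predictions) {
        int agree = 0;
        int pairs = 0;
        for (int i = 0; i < labels.length; i++) {
            for (int j = i + 1; j < labels.length; j++) {
                boolean sameLabel = labels[i] == labels[j];
                boolean sameCluster = predictions[i] == predictions[j];
                if (sameLabel == sameCluster) {
                    agree++;
                }
                pairs++;
            }
        }
        return (double) agree / pairs;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 0.0};
        double[] b = {2.0, 0.0};   // same direction as a, different length
        double[] c = {0.0, 1.0};   // orthogonal to a

        System.out.println(euclidean(a, b)); // 1.0
        System.out.println(cosine(a, b));    // 0.0: direction is identical
        System.out.println(cosine(a, c));    // 1.0: vectors are orthogonal

        // Cluster ids are arbitrary, so a pure relabeling still scores 1.0.
        System.out.println(randIndex(new int[] {0, 0, 1, 1}, new int[] {1, 1, 0, 0})); // 1.0
        System.out.println(randIndex(new int[] {0, 0, 1, 1}, new int[] {0, 1, 1, 1})); // 0.5
    }
}
```

The first two points differ only in length, so Euclidean distance separates them while cosine distance does not. That difference is one reason a search over both distance types can pick different winners depending on whether vector magnitude carries meaning in the data.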
