Example 11 with EvalClusterBatchOp

Use of com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp in project Alink by alibaba.

From the class Chap20, method c_3.

static void c_3() throws Exception {
    Stopwatch sw = new Stopwatch();
    sw.start();
    AlinkGlobalConfiguration.setPrintProcessInfo(true);
    AkSourceBatchOp source = new AkSourceBatchOp().setFilePath(Chap17.DATA_DIR + Chap17.VECTOR_FILE);
    KMeans kmeans = new KMeans().setVectorCol(Chap17.VECTOR_COL_NAME).setPredictionCol(Chap17.PREDICTION_COL_NAME);
    // 4-fold grid search over k in {2..6} and two distance types (10 candidates), scored by TuningClusterMetric.RI.
    GridSearchCV cv = new GridSearchCV()
        .setNumFolds(4).setEstimator(kmeans)
        .setParamGrid(new ParamGrid()
            .addGrid(kmeans, KMeans.K, new Integer[] { 2, 3, 4, 5, 6 })
            .addGrid(kmeans, KMeans.DISTANCE_TYPE, new DistanceType[] { DistanceType.EUCLIDEAN, DistanceType.COSINE }))
        .setTuningEvaluator(new ClusterTuningEvaluator()
            .setVectorCol(Chap17.VECTOR_COL_NAME).setPredictionCol(Chap17.PREDICTION_COL_NAME)
            .setLabelCol(Chap17.LABEL_COL_NAME).setTuningClusterMetric(TuningClusterMetric.RI))
        .enableLazyPrintTrainInfo();
    GridSearchCVModel bestModel = cv.fit(source);
    bestModel.transform(source).link(new EvalClusterBatchOp()
        .setLabelCol(Chap17.LABEL_COL_NAME)
        .setVectorCol(Chap17.VECTOR_COL_NAME)
        .setPredictionCol(Chap17.PREDICTION_COL_NAME)
        .lazyPrintMetrics());
    BatchOperator.execute();
    sw.stop();
    System.out.println(sw.getElapsedTimeSpan());
}
Also used:
ParamGrid (com.alibaba.alink.pipeline.tuning.ParamGrid)
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp)
KMeans (com.alibaba.alink.pipeline.clustering.KMeans)
ClusterTuningEvaluator (com.alibaba.alink.pipeline.tuning.ClusterTuningEvaluator)
Stopwatch (com.alibaba.alink.common.utils.Stopwatch)
GridSearchCV (com.alibaba.alink.pipeline.tuning.GridSearchCV)
DistanceType (com.alibaba.alink.params.shared.clustering.HasKMeansDistanceType.DistanceType)
GridSearchCVModel (com.alibaba.alink.pipeline.tuning.GridSearchCVModel)
EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp)
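
The grid search in c_3 ranks candidates by TuningClusterMetric.RI, which is taken here to mean the Rand index: the fraction of sample pairs on which the predicted clustering and the ground-truth labels agree (both samples together, or both apart). The following is a minimal plain-Java sketch of that definition, not Alink's implementation; the class name RandIndexSketch and the method randIndex are hypothetical and exist only for illustration.

```java
public class RandIndexSketch {

    /**
     * Rand index of a clustering against reference labels (illustrative helper, not an Alink API).
     * Counts the pairs (i, j) that both assignments treat the same way and divides by all pairs.
     */
    public static double randIndex(int[] labels, int[] clusters) {
        if (labels.length != clusters.length || labels.length < 2) {
            throw new IllegalArgumentException("need two equal-length arrays with at least 2 samples");
        }
        long agree = 0;
        long pairs = 0;
        for (int i = 0; i < labels.length; i++) {
            for (int j = i + 1; j < labels.length; j++) {
                boolean sameLabel = labels[i] == labels[j];
                boolean sameCluster = clusters[i] == clusters[j];
                // A pair agrees when it is grouped together in both, or separated in both.
                if (sameLabel == sameCluster) {
                    agree++;
                }
                pairs++;
            }
        }
        return (double) agree / pairs;
    }

    public static void main(String[] args) {
        int[] labels = { 0, 0, 1, 1 };
        // Same partition with cluster ids swapped: every pair agrees.
        System.out.println(randIndex(labels, new int[] { 1, 1, 0, 0 }));
        // A partition that cuts across the labels: only 2 of 6 pairs agree.
        System.out.println(randIndex(labels, new int[] { 0, 1, 0, 1 }));
    }
}
```

Because the score depends only on which samples share a group, it is unaffected by how cluster ids are numbered, which is why it can compare KMeans output against a label column directly.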

Aggregations

EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp): 11
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp): 9
AkSinkBatchOp (com.alibaba.alink.operator.batch.sink.AkSinkBatchOp): 4
KMeans (com.alibaba.alink.pipeline.clustering.KMeans): 4
Stopwatch (com.alibaba.alink.common.utils.Stopwatch): 3
BisectingKMeans (com.alibaba.alink.pipeline.clustering.BisectingKMeans): 3
File (java.io.File): 3
KMeansPredictBatchOp (com.alibaba.alink.operator.batch.clustering.KMeansPredictBatchOp): 2
KMeansTrainBatchOp (com.alibaba.alink.operator.batch.clustering.KMeansTrainBatchOp): 2
VectorAssemblerBatchOp (com.alibaba.alink.operator.batch.dataproc.vector.VectorAssemblerBatchOp): 2
AkSinkStreamOp (com.alibaba.alink.operator.stream.sink.AkSinkStreamOp): 2
AkSourceStreamOp (com.alibaba.alink.operator.stream.source.AkSourceStreamOp): 2
GeoKMeans (com.alibaba.alink.pipeline.clustering.GeoKMeans): 2
GmmPredictBatchOp (com.alibaba.alink.operator.batch.clustering.GmmPredictBatchOp): 1
GmmTrainBatchOp (com.alibaba.alink.operator.batch.clustering.GmmTrainBatchOp): 1
LdaPredictBatchOp (com.alibaba.alink.operator.batch.clustering.LdaPredictBatchOp): 1
LdaTrainBatchOp (com.alibaba.alink.operator.batch.clustering.LdaTrainBatchOp): 1
PcaPredictBatchOp (com.alibaba.alink.operator.batch.feature.PcaPredictBatchOp): 1
PcaTrainBatchOp (com.alibaba.alink.operator.batch.feature.PcaTrainBatchOp): 1
SegmentBatchOp (com.alibaba.alink.operator.batch.nlp.SegmentBatchOp): 1