Example 1 with BisectingKMeans

Use of com.alibaba.alink.pipeline.clustering.BisectingKMeans in the Alink project by Alibaba.

Class Chap17, method c_4.

static void c_4() throws Exception {
    AkSourceBatchOp source = new AkSourceBatchOp().setFilePath(DATA_DIR + VECTOR_FILE);
    // Run 1: no distance type is set; fit, predict on the same data, then evaluate against the labels.
    new BisectingKMeans().setK(3).setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME)
        .enableLazyPrintModelInfo("BiSecting KMeans EUCLIDEAN")
        .fit(source).transform(source)
        .link(new EvalClusterBatchOp().setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME)
            .setLabelCol(LABEL_COL_NAME).lazyPrintMetrics("Bisecting KMeans EUCLIDEAN"));
    BatchOperator.execute();
    // Run 2: same setup with cosine distance for both the clustering and the evaluation.
    new BisectingKMeans().setDistanceType(DistanceType.COSINE).setK(3).setVectorCol(VECTOR_COL_NAME)
        .setPredictionCol(PREDICTION_COL_NAME).enableLazyPrintModelInfo("BiSecting KMeans COSINE")
        .fit(source).transform(source)
        .link(new EvalClusterBatchOp().setDistanceType("COSINE").setVectorCol(VECTOR_COL_NAME)
            .setPredictionCol(PREDICTION_COL_NAME).setLabelCol(LABEL_COL_NAME).lazyPrintMetrics("Bisecting KMeans COSINE"));
    BatchOperator.execute();
}
Also used: AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp), BisectingKMeans (com.alibaba.alink.pipeline.clustering.BisectingKMeans), EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp)

Example 2 with BisectingKMeans

Use of com.alibaba.alink.pipeline.clustering.BisectingKMeans in the Alink project by Alibaba.

Class Chap18, method c_1.

static void c_1() throws Exception {
    AkSourceBatchOp dense_source = new AkSourceBatchOp().setFilePath(DATA_DIR + DENSE_TRAIN_FILE);
    AkSourceBatchOp sparse_source = new AkSourceBatchOp().setFilePath(DATA_DIR + SPARSE_TRAIN_FILE);
    Stopwatch sw = new Stopwatch();
    // Candidate pipelines, each paired with the label used when printing its metrics.
    ArrayList<Tuple2<String, Pipeline>> pipelineList = new ArrayList<>();
    pipelineList.add(new Tuple2<>("KMeans EUCLIDEAN", new Pipeline().add(
        new KMeans().setK(10).setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME))));
    pipelineList.add(new Tuple2<>("KMeans COSINE", new Pipeline().add(
        new KMeans().setDistanceType(DistanceType.COSINE).setK(10).setVectorCol(VECTOR_COL_NAME)
            .setPredictionCol(PREDICTION_COL_NAME))));
    pipelineList.add(new Tuple2<>("BisectingKMeans", new Pipeline().add(
        new BisectingKMeans().setK(10).setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME))));
    for (Tuple2<String, Pipeline> pipelineTuple2 : pipelineList) {
        // Time fit, transform and evaluation on the dense data.
        sw.reset();
        sw.start();
        pipelineTuple2.f1.fit(dense_source).transform(dense_source)
            .link(new EvalClusterBatchOp().setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME)
                .setLabelCol(LABEL_COL_NAME).lazyPrintMetrics(pipelineTuple2.f0 + " DENSE"));
        BatchOperator.execute();
        sw.stop();
        System.out.println(sw.getElapsedTimeSpan());
        // Repeat the same measurement on the sparse data.
        sw.reset();
        sw.start();
        pipelineTuple2.f1.fit(sparse_source).transform(sparse_source)
            .link(new EvalClusterBatchOp().setVectorCol(VECTOR_COL_NAME).setPredictionCol(PREDICTION_COL_NAME)
                .setLabelCol(LABEL_COL_NAME).lazyPrintMetrics(pipelineTuple2.f0 + " SPARSE"));
        BatchOperator.execute();
        sw.stop();
        System.out.println(sw.getElapsedTimeSpan());
    }
}
Also used: AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp), BisectingKMeans (com.alibaba.alink.pipeline.clustering.BisectingKMeans), KMeans (com.alibaba.alink.pipeline.clustering.KMeans), Tuple2 (org.apache.flink.api.java.tuple.Tuple2), Stopwatch (com.alibaba.alink.common.utils.Stopwatch), ArrayList (java.util.ArrayList), EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp), Pipeline (com.alibaba.alink.pipeline.Pipeline)
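
Both examples treat BisectingKMeans as a black box configured through setK and setDistanceType. As a rough illustration of the general bisecting k-means idea, here is a minimal plain-Java sketch with no Alink dependency. It is not Alink's implementation: the class name BisectingKMeansSketch, the rule of always splitting the cluster with the largest sum of squared errors, the deterministic seeding, and the fixed 20 iterations are all assumptions made for this sketch, and it only covers Euclidean distance on a non-empty list of dense points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Alink.
class BisectingKMeansSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // Component-wise mean of a non-empty cluster.
    static double[] centroid(List<double[]> pts) {
        double[] c = new double[pts.get(0).length];
        for (double[] p : pts) {
            for (int i = 0; i < c.length; i++) {
                c[i] += p[i] / pts.size();
            }
        }
        return c;
    }

    // Sum of squared distances from each point to the cluster centroid.
    static double sse(List<double[]> pts) {
        double[] c = centroid(pts);
        double s = 0;
        for (double[] p : pts) {
            s += sqDist(p, c);
        }
        return s;
    }

    // Split one cluster in two with a few rounds of 2-means.
    // Assumed seeding: the first point and the point farthest from it.
    static List<List<double[]>> bisect(List<double[]> pts) {
        double[] c0 = pts.get(0);
        double[] c1 = pts.get(0);
        for (double[] p : pts) {
            if (sqDist(p, c0) > sqDist(c1, c0)) {
                c1 = p;
            }
        }
        List<double[]> left = new ArrayList<>();
        List<double[]> right = new ArrayList<>();
        for (int iter = 0; iter < 20; iter++) {
            List<double[]> l = new ArrayList<>();
            List<double[]> r = new ArrayList<>();
            for (double[] p : pts) {
                if (sqDist(p, c0) <= sqDist(p, c1)) {
                    l.add(p);
                } else {
                    r.add(p);
                }
            }
            if (l.isEmpty() || r.isEmpty()) {
                break; // keep the last split that had two non-empty halves
            }
            left = l;
            right = r;
            c0 = centroid(left);
            c1 = centroid(right);
        }
        List<List<double[]>> halves = new ArrayList<>();
        halves.add(left);
        halves.add(right);
        return halves;
    }

    // Start with one cluster and keep splitting the one with the largest SSE
    // until there are k clusters or nothing is left to split.
    static List<List<double[]>> cluster(List<double[]> points, int k) {
        List<List<double[]>> clusters = new ArrayList<>();
        clusters.add(new ArrayList<>(points));
        while (clusters.size() < k) {
            int worst = 0;
            for (int i = 1; i < clusters.size(); i++) {
                if (sse(clusters.get(i)) > sse(clusters.get(worst))) {
                    worst = i;
                }
            }
            if (sse(clusters.get(worst)) == 0) {
                break; // every remaining cluster holds identical points
            }
            clusters.addAll(bisect(clusters.remove(worst)));
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<double[]> pts = new ArrayList<>();
        pts.add(new double[] {0, 0});
        pts.add(new double[] {0, 1});
        pts.add(new double[] {10, 10});
        pts.add(new double[] {10, 11});
        pts.add(new double[] {50, 50});
        pts.add(new double[] {50, 51});
        for (List<double[]> c : cluster(pts, 3)) {
            System.out.println("cluster of size " + c.size());
        }
    }
}
```

The top-down splitting is what distinguishes this from plain k-means, which places all k centers at once. That difference is the reason Example 2 compares the two estimators on the same data.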

Aggregations

EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp): 2
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp): 2
BisectingKMeans (com.alibaba.alink.pipeline.clustering.BisectingKMeans): 2
Stopwatch (com.alibaba.alink.common.utils.Stopwatch): 1
Pipeline (com.alibaba.alink.pipeline.Pipeline): 1
KMeans (com.alibaba.alink.pipeline.clustering.KMeans): 1
ArrayList (java.util.ArrayList): 1
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 1