Example 26 with VectorAssembler

Use of com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler in project Alink by alibaba.

From the class Chap10, method c_3_1.

static void c_3_1() throws Exception {
    // Load the training and test sets.
    BatchOperator<?> train_data = new AkSourceBatchOp().setFilePath(DATA_DIR + TRAIN_FILE);
    BatchOperator<?> test_data = new AkSourceBatchOp().setFilePath(DATA_DIR + TEST_FILE);

    // One-hot encode the categorical columns, assemble all feature columns
    // into a single vector column, then train logistic regression on it.
    Pipeline pipeline = new Pipeline()
        .add(new OneHotEncoder()
            .setSelectedCols(CATEGORY_FEATURE_COL_NAMES)
            .setEncode(Encode.VECTOR))
        .add(new VectorAssembler()
            .setSelectedCols(FEATURE_COL_NAMES)
            .setOutputCol(VEC_COL_NAME))
        .add(new LogisticRegression()
            .setVectorCol(VEC_COL_NAME)
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionCol(PREDICTION_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME));

    // Fit on the training set, predict on the test set, and evaluate the
    // predictions as a binary classification with "2" as the positive label.
    pipeline.fit(train_data)
        .transform(test_data)
        .link(new EvalBinaryClassBatchOp()
            .setPositiveLabelValueString("2")
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .lazyPrintMetrics());

    // Trigger execution; the lazily registered metrics are printed here.
    BatchOperator.execute();
}
Also used:
OneHotEncoder (com.alibaba.alink.pipeline.feature.OneHotEncoder)
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp)
VectorAssembler (com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler)
LogisticRegression (com.alibaba.alink.pipeline.classification.LogisticRegression)
Pipeline (com.alibaba.alink.pipeline.Pipeline)
EvalBinaryClassBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp)
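
In the pipeline above, VectorAssembler sits between the one-hot encoder and the classifier and merges the selected feature columns into the single vector column that LogisticRegression reads. The following is a rough, library-free sketch of that assembling idea only. The names AssembleSketch and assemble are hypothetical and are not part of Alink's API, and the use of dense double arrays is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AssembleSketch {

    // Hypothetical helper (not Alink's API): concatenates the given column
    // values, in order, into one dense feature vector. A Number contributes
    // one entry; a double[] (e.g. a one-hot encoded column) contributes all
    // of its entries.
    static double[] assemble(Object... columns) {
        List<Double> out = new ArrayList<>();
        for (Object col : columns) {
            if (col instanceof Number) {
                out.add(((Number) col).doubleValue());
            } else if (col instanceof double[]) {
                for (double v : (double[]) col) {
                    out.add(v);
                }
            } else {
                throw new IllegalArgumentException("Unsupported column value: " + col);
            }
        }
        double[] result = new double[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up row: one one-hot encoded categorical column plus two numeric columns.
        double[] oneHot = {0.0, 1.0, 0.0};
        double[] features = assemble(oneHot, 35, 1200.5);
        System.out.println(Arrays.toString(features)); // prints [0.0, 1.0, 0.0, 35.0, 1200.5]
    }
}
```

The order of the arguments determines the order of the entries in the assembled vector, which is why the same column list has to be used for both training and prediction data.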

Aggregations

VectorAssembler (com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler): 26
Test (org.junit.Test): 16
Pipeline (com.alibaba.alink.pipeline.Pipeline): 11
MultilayerPerceptronClassifier (com.alibaba.alink.pipeline.classification.MultilayerPerceptronClassifier): 9
LogisticRegression (com.alibaba.alink.pipeline.classification.LogisticRegression): 8
BatchOperator (com.alibaba.alink.operator.batch.BatchOperator): 7
PipelineModel (com.alibaba.alink.pipeline.PipelineModel): 7
Row (org.apache.flink.types.Row): 7
FilePath (com.alibaba.alink.common.io.filesystem.FilePath): 4
EvalBinaryClassBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp): 4
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp): 4
OneHotEncoder (com.alibaba.alink.pipeline.feature.OneHotEncoder): 3
TableSchema (org.apache.flink.table.api.TableSchema): 3
DenseVector (com.alibaba.alink.common.linalg.DenseVector): 2
MemSourceBatchOp (com.alibaba.alink.operator.batch.source.MemSourceBatchOp): 2
MemSourceStreamOp (com.alibaba.alink.operator.stream.source.MemSourceStreamOp): 2
Lda (com.alibaba.alink.pipeline.clustering.Lda): 2
Binarizer (com.alibaba.alink.pipeline.feature.Binarizer): 2
QuantileDiscretizer (com.alibaba.alink.pipeline.feature.QuantileDiscretizer): 2
DocCountVectorizer (com.alibaba.alink.pipeline.nlp.DocCountVectorizer): 2