
Example 6 with QuantileDiscretizer

Use of com.alibaba.alink.pipeline.feature.QuantileDiscretizer in project Alink by alibaba.

From the class PipelineSaveAndLoadTest, method test.

@Test
public void test() throws Exception {
    String schemaStr = "sepal_length double, sepal_width double, petal_length double, petal_width double, category string";
    CsvSourceBatchOp source = new CsvSourceBatchOp().setSchemaStr(schemaStr).setFilePath("https://alink-test-data.oss-cn-hangzhou.aliyuncs.com/iris.csv");
    String modelFilename = "/tmp/model123";
    QuantileDiscretizer stage1 = new QuantileDiscretizer().setNumBuckets(2).setSelectedCols("sepal_length");
    Binarizer stage2 = new Binarizer().setSelectedCol("petal_width").setThreshold(1.);
    QuantileDiscretizer stage3 = new QuantileDiscretizer().setNumBuckets(4).setSelectedCols("petal_length");
    PipelineModel pipelineModel = new Pipeline(stage1, stage2, stage3).fit(source);
    // System.out.println(pipelineModel.transform(source).getSchema().toString());
    pipelineModel.save(new FilePath(modelFilename), true);
    BatchOperator.execute();
    LocalPredictor predictor = new LocalPredictor(modelFilename, schemaStr);
    Row res = predictor.map(Row.of(1.2, 3.4, 2.4, 3.6, "1"));
    // JUnit expects (expected, actual); the prediction row should keep all five columns.
    Assert.assertEquals(5, res.getArity());
}
Also used: FilePath (com.alibaba.alink.common.io.filesystem.FilePath), Row (org.apache.flink.types.Row), Binarizer (com.alibaba.alink.pipeline.feature.Binarizer), CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp), QuantileDiscretizer (com.alibaba.alink.pipeline.feature.QuantileDiscretizer), Test (org.junit.Test)
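
To make the effect of setNumBuckets concrete, here is a minimal plain-Java sketch of quantile bucketing. It is not Alink's implementation: the class QuantileBucketSketch, its methods, the nearest-rank rule for choosing cut points, the right-closed bucket convention, and the sample values are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; Alink's QuantileDiscretizer may pick cut points differently.
class QuantileBucketSketch {

    /** Returns numBuckets - 1 cut points taken at evenly spaced ranks of the sorted data. */
    public static double[] cutPoints(double[] values, int numBuckets) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] cuts = new double[numBuckets - 1];
        for (int i = 1; i < numBuckets; i++) {
            // Nearest-rank position of the i-th quantile boundary (assumed rule).
            int idx = (int) Math.ceil((double) i * sorted.length / numBuckets) - 1;
            cuts[i - 1] = sorted[Math.max(idx, 0)];
        }
        return cuts;
    }

    /** Returns the index of the first bucket whose upper cut point is >= value. */
    public static int bucketOf(double value, double[] cuts) {
        for (int b = 0; b < cuts.length; b++) {
            if (value <= cuts[b]) {
                return b;
            }
        }
        // Larger than every cut point: last bucket.
        return cuts.length;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for the sepal_length column.
        double[] sepalLength = {4.9, 5.1, 5.8, 6.3, 6.7, 7.0};
        double[] cuts = cutPoints(sepalLength, 2);
        System.out.println("cut points: " + Arrays.toString(cuts));
        for (double v : sepalLength) {
            System.out.println(v + " -> bucket " + bucketOf(v, cuts));
        }
    }
}
```

With two buckets the column is split at a single cut point near the median; with four buckets, as in stage3 of the test, three cut points are used.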

Example 7 with QuantileDiscretizer

Use of com.alibaba.alink.pipeline.feature.QuantileDiscretizer in project Alink by alibaba.

From the class PipelineTest, method test.

@Test
public void test() throws Exception {
    CsvSourceBatchOp source = new CsvSourceBatchOp().setSchemaStr("sepal_length double, sepal_width double, petal_length double, petal_width double, category string").setFilePath("https://alink-test-data.oss-cn-hangzhou.aliyuncs.com/iris.csv");
    String pipelineModelFilename = "/tmp/model123123123123.csv";
    // Fit the discretizer first so the pipeline model can be assembled from an already-trained stage.
    QuantileDiscretizerModel model1 = new QuantileDiscretizer().setNumBuckets(2).setSelectedCols("sepal_length").fit(source);
    // Binarizer is used here without a fit step, so the transformer itself serves as a stage.
    Binarizer model2 = new Binarizer().setSelectedCol("petal_width").setThreshold(1.);
    PipelineModel pipelineModel = new PipelineModel(model1, model2);
    pipelineModel.save(pipelineModelFilename, true);
    BatchOperator.execute();
    // Reload the saved model and apply it to the same source.
    pipelineModel = PipelineModel.load(pipelineModelFilename);
    BatchOperator<?> res = pipelineModel.transform(source);
    res.print();
}
Also used: QuantileDiscretizerModel (com.alibaba.alink.pipeline.feature.QuantileDiscretizerModel), Binarizer (com.alibaba.alink.pipeline.feature.Binarizer), CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp), QuantileDiscretizer (com.alibaba.alink.pipeline.feature.QuantileDiscretizer), Test (org.junit.Test)
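
The Binarizer stage in both tests applies a threshold of 1.0 to petal_width. A minimal plain-Java sketch of that idea follows. The class BinarizerSketch and the sample values are hypothetical, and the strict greater-than comparison is an assumption (it is the convention used by similar binarizers such as Spark ML's); check Alink's documentation for how it treats values equal to the threshold.

```java
// Hypothetical illustration only; not Alink's Binarizer implementation.
class BinarizerSketch {

    /** Returns 1.0 when value is strictly above the threshold, otherwise 0.0 (assumed convention). */
    public static double binarize(double value, double threshold) {
        return value > threshold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        double threshold = 1.0; // same value as setThreshold(1.) in the tests
        // Made-up sample values standing in for the petal_width column.
        double[] petalWidth = {0.2, 1.0, 1.8};
        for (double v : petalWidth) {
            System.out.println(v + " -> " + binarize(v, threshold));
        }
    }
}
```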

Aggregations

Binarizer (com.alibaba.alink.pipeline.feature.Binarizer): 7
QuantileDiscretizer (com.alibaba.alink.pipeline.feature.QuantileDiscretizer): 7
CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp): 5
Test (org.junit.Test): 5
QuantileDiscretizerModel (com.alibaba.alink.pipeline.feature.QuantileDiscretizerModel): 4
Lda (com.alibaba.alink.pipeline.clustering.Lda): 2
VectorAssembler (com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler): 2
FilePath (com.alibaba.alink.common.io.filesystem.FilePath): 1
QuantileDiscretizerTrainBatchOp (com.alibaba.alink.operator.batch.feature.QuantileDiscretizerTrainBatchOp): 1
AkSinkBatchOp (com.alibaba.alink.operator.batch.sink.AkSinkBatchOp): 1
CsvSinkBatchOp (com.alibaba.alink.operator.batch.sink.CsvSinkBatchOp): 1
AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp): 1
CsvSourceStreamOp (com.alibaba.alink.operator.stream.source.CsvSourceStreamOp): 1
GeneralizedLinearRegression (com.alibaba.alink.pipeline.regression.GeneralizedLinearRegression): 1
Select (com.alibaba.alink.pipeline.sql.Select): 1
Row (org.apache.flink.types.Row): 1