
Example 1 with MinMaxScaler

Use of com.alibaba.alink.pipeline.dataproc.MinMaxScaler in project Alink by alibaba.

From the class Chap07, method c_3_2.

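static void c_3_2() throws Exception {
    // Read the original data set from a CSV file using the given schema.
    CsvSourceBatchOp source = new CsvSourceBatchOp().setFilePath(DATA_DIR + ORIGIN_FILE).setSchemaStr(SCHEMA_STRING);
    source.lazyPrintStatistics("< Origin data >");
    // Fit the scaler on the feature columns, then apply it to the same data.
    MinMaxScaler scaler = new MinMaxScaler().setSelectedCols(FEATURE_COL_NAMES);
    scaler.fit(source).transform(source).lazyPrintStatistics("< after MinMax Scale >");
    // Run the batch job; the lazy statistics printouts above are produced here.
    BatchOperator.execute();
}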
Also used: MinMaxScaler (com.alibaba.alink.pipeline.dataproc.MinMaxScaler), VectorMinMaxScaler (com.alibaba.alink.pipeline.dataproc.vector.VectorMinMaxScaler), CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp)
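
As background for the fit and transform calls in this example, here is a minimal plain-Java sketch of min-max scaling on a single column. It assumes the common formula (x - min) / (max - min) with a target range of [0, 1], which is consistent with the values asserted in Example 2. The class MinMaxSketch and its methods are hypothetical and are not part of Alink; how Alink itself treats a constant column is not shown in the source, so mapping it to 0.0 is an assumption of this sketch.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Alink: min-max scaling of one numeric column.
class MinMaxSketch {
    final double min;
    final double max;

    private MinMaxSketch(double min, double max) {
        this.min = min;
        this.max = max;
    }

    // "fit": scan the column once and remember its minimum and maximum.
    static MinMaxSketch fit(double[] column) {
        if (column.length == 0) {
            throw new IllegalArgumentException("column must not be empty");
        }
        double min = column[0];
        double max = column[0];
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new MinMaxSketch(min, max);
    }

    // "transform": map each value onto [0, 1] with (x - min) / (max - min).
    double[] transform(double[] column) {
        double range = max - min;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // Assumption of this sketch: a constant column maps to 0.0.
            out[i] = range == 0.0 ? 0.0 : (column[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        // The non-null f0 values that appear in Example 2.
        double[] f0 = {-1.0, 1.0, 4.0};
        MinMaxSketch model = MinMaxSketch.fit(f0);
        System.out.println(Arrays.toString(model.transform(f0))); // prints [0.0, 0.4, 1.0]
    }
}
```

Keeping fit and transform as separate steps mirrors the pipeline API used above: the minimum and maximum are learned once and can then be applied to any data with the same columns.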

Example 2 with MinMaxScaler

Use of com.alibaba.alink.pipeline.dataproc.MinMaxScaler in project Alink by alibaba.

From the class MinMaxTest, method test.

@Test
public void test() throws Exception {
    BatchOperator batchData = new TableSourceBatchOp(GenerateData.getBatchTable());
    StreamOperator streamData = new TableSourceStreamOp(GenerateData.getStreamTable());
    // Operator API: train on the batch data, then predict on batch and stream inputs.
    MinMaxScalerTrainBatchOp op = new MinMaxScalerTrainBatchOp().setSelectedCols("f0", "f1").linkFrom(batchData);
    new MinMaxScalerPredictBatchOp().linkFrom(op, batchData).lazyCollect();
    new MinMaxScalerPredictStreamOp(op).linkFrom(streamData).print();
    // Pipeline API: fit a model that writes the scaled values to the new columns f0_1 and f1_1.
    MinMaxScalerModel model = new MinMaxScaler().setSelectedCols("f0", "f1").setOutputCols("f0_1", "f1_1").fit(batchData);
    List<Row> rows = model.transform(batchData).collect();
    // Sort by the first field, nulls first, so the assertions below see a fixed order.
    rows.sort(new Comparator<Row>() {

        @Override
        public int compare(Row o1, Row o2) {
            if (o1.getField(0) == null) {
                return -1;
            }
            if (o2.getField(0) == null) {
                return 1;
            }
            if ((double) o1.getField(0) > (double) o2.getField(0)) {
                return 1;
            }
            if ((double) o1.getField(0) < (double) o2.getField(0)) {
                return -1;
            }
            return 0;
        }
    });
    // JUnit's assertEquals takes the expected value first, then the actual value.
    assertEquals(Row.of(null, null, null, null), rows.get(0));
    assertEquals(Row.of(-1., -3., 0., 0.), rows.get(1));
    assertEquals(Row.of(1., 2., 0.4, 1.), rows.get(2));
    assertEquals(Row.of(4., 2., 1., 1.), rows.get(3));
    model.transform(streamData).print();
    StreamOperator.execute();
}
Also used: TableSourceBatchOp (com.alibaba.alink.operator.batch.source.TableSourceBatchOp), BatchOperator (com.alibaba.alink.operator.batch.BatchOperator), MinMaxScalerModel (com.alibaba.alink.pipeline.dataproc.MinMaxScalerModel), MinMaxScalerPredictStreamOp (com.alibaba.alink.operator.stream.dataproc.MinMaxScalerPredictStreamOp), MinMaxScaler (com.alibaba.alink.pipeline.dataproc.MinMaxScaler), TableSourceStreamOp (com.alibaba.alink.operator.stream.source.TableSourceStreamOp), Row (org.apache.flink.types.Row), StreamOperator (com.alibaba.alink.operator.stream.StreamOperator), Test (org.junit.Test)
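
The anonymous comparator in this test orders rows by their first field and places a row with a null first field ahead of the others. The same ordering can be written more compactly with Comparator.nullsFirst from the JDK. The sketch below uses plain Double[] arrays instead of Flink Row objects so that it runs without Alink or Flink on the classpath; the class and method names are illustrative only. One difference worth noting: nullsFirst treats two null keys as equal, whereas the comparator in the test returns -1 whenever the first row's key is null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: sorts rows by their first element, nulls first.
class NullsFirstSortSketch {
    // Compare rows by element 0; a null element 0 sorts before any number.
    static final Comparator<Double[]> BY_FIRST_FIELD =
        Comparator.comparing((Double[] row) -> row[0],
            Comparator.nullsFirst(Comparator.<Double>naturalOrder()));

    // Returns a sorted copy and leaves the input list untouched.
    static List<Double[]> sorted(List<Double[]> rows) {
        List<Double[]> copy = new ArrayList<>(rows);
        copy.sort(BY_FIRST_FIELD);
        return copy;
    }

    public static void main(String[] args) {
        // The f0 and f1 values asserted in the test, in shuffled order.
        List<Double[]> rows = Arrays.asList(
            new Double[] {4.0, 2.0},
            new Double[] {null, null},
            new Double[] {-1.0, -3.0},
            new Double[] {1.0, 2.0});
        for (Double[] row : sorted(rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Run as is, the sketch prints the all-null row first, followed by the rows starting with -1.0, 1.0 and 4.0, which is the order the assertions in the test rely on.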

Aggregations

MinMaxScaler (com.alibaba.alink.pipeline.dataproc.MinMaxScaler): 2
BatchOperator (com.alibaba.alink.operator.batch.BatchOperator): 1
CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp): 1
TableSourceBatchOp (com.alibaba.alink.operator.batch.source.TableSourceBatchOp): 1
StreamOperator (com.alibaba.alink.operator.stream.StreamOperator): 1
MinMaxScalerPredictStreamOp (com.alibaba.alink.operator.stream.dataproc.MinMaxScalerPredictStreamOp): 1
TableSourceStreamOp (com.alibaba.alink.operator.stream.source.TableSourceStreamOp): 1
MinMaxScalerModel (com.alibaba.alink.pipeline.dataproc.MinMaxScalerModel): 1
VectorMinMaxScaler (com.alibaba.alink.pipeline.dataproc.vector.VectorMinMaxScaler): 1
Row (org.apache.flink.types.Row): 1
Test (org.junit.Test): 1