Search in sources :

Example 31 with AkSourceBatchOp

use of com.alibaba.alink.operator.batch.source.AkSourceBatchOp in project Alink by alibaba.

the class Chap23 method c_3.

static void c_3() throws Exception {
    AkSourceBatchOp train_set = new AkSourceBatchOp().setFilePath(DATA_DIR + TRAIN_FILE);
    AkSourceBatchOp test_set = new AkSourceBatchOp().setFilePath(DATA_DIR + TEST_FILE);
    new Pipeline().add(new RegexTokenizer().setPattern("\\W+").setSelectedCol(TXT_COL_NAME)).add(new DocCountVectorizer().setFeatureType("WORD_COUNT").setSelectedCol(TXT_COL_NAME).setOutputCol(VECTOR_COL_NAME)).add(new NGram().setN(2).setSelectedCol(TXT_COL_NAME).setOutputCol("v_2").enableLazyPrintTransformData(1, "2-gram")).add(new DocCountVectorizer().setFeatureType("WORD_COUNT").setSelectedCol("v_2").setOutputCol("v_2")).add(new VectorAssembler().setSelectedCols(VECTOR_COL_NAME, "v_2").setOutputCol(VECTOR_COL_NAME)).add(new LogisticRegression().setMaxIter(30).setVectorCol(VECTOR_COL_NAME).setLabelCol(LABEL_COL_NAME).setPredictionCol(PREDICTION_COL_NAME).setPredictionDetailCol(PRED_DETAIL_COL_NAME)).fit(train_set).transform(test_set).link(new EvalBinaryClassBatchOp().setPositiveLabelValueString("pos").setLabelCol(LABEL_COL_NAME).setPredictionDetailCol(PRED_DETAIL_COL_NAME).lazyPrintMetrics("NGram 2"));
    BatchOperator.execute();
    new Pipeline().add(new RegexTokenizer().setPattern("\\W+").setSelectedCol(TXT_COL_NAME)).add(new DocCountVectorizer().setFeatureType("WORD_COUNT").setSelectedCol(TXT_COL_NAME).setOutputCol(VECTOR_COL_NAME)).add(new NGram().setN(2).setSelectedCol(TXT_COL_NAME).setOutputCol("v_2")).add(new DocCountVectorizer().setFeatureType("WORD_COUNT").setSelectedCol("v_2").setOutputCol("v_2")).add(new NGram().setN(3).setSelectedCol(TXT_COL_NAME).setOutputCol("v_3")).add(new DocCountVectorizer().setFeatureType("WORD_COUNT").setVocabSize(10000).setSelectedCol("v_3").setOutputCol("v_3")).add(new VectorAssembler().setSelectedCols(VECTOR_COL_NAME, "v_2", "v_3").setOutputCol(VECTOR_COL_NAME)).add(new LogisticRegression().setMaxIter(30).setVectorCol(VECTOR_COL_NAME).setLabelCol(LABEL_COL_NAME).setPredictionCol(PREDICTION_COL_NAME).setPredictionDetailCol(PRED_DETAIL_COL_NAME)).fit(train_set).transform(test_set).link(new EvalBinaryClassBatchOp().setPositiveLabelValueString("pos").setLabelCol(LABEL_COL_NAME).setPredictionDetailCol(PRED_DETAIL_COL_NAME).lazyPrintMetrics("NGram 2 and 3"));
    BatchOperator.execute();
}
Also used : AkSourceBatchOp(com.alibaba.alink.operator.batch.source.AkSourceBatchOp) VectorAssembler(com.alibaba.alink.pipeline.dataproc.vector.VectorAssembler) RegexTokenizer(com.alibaba.alink.pipeline.nlp.RegexTokenizer) NGram(com.alibaba.alink.pipeline.nlp.NGram) DocCountVectorizer(com.alibaba.alink.pipeline.nlp.DocCountVectorizer) LogisticRegression(com.alibaba.alink.pipeline.classification.LogisticRegression) Pipeline(com.alibaba.alink.pipeline.Pipeline) EvalBinaryClassBatchOp(com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp)

Example 32 with AkSourceBatchOp

use of com.alibaba.alink.operator.batch.source.AkSourceBatchOp in project Alink by alibaba.

the class Chap24 method c_6.

static void c_6() throws Exception {
    MemSourceBatchOp test_data = new MemSourceBatchOp(new Long[] { 50L }, ITEM_COL);
    new ItemCfSimilarItemsRecommender().setItemCol(ITEM_COL).setRecommCol(RECOMM_COL).setModelData(new AkSourceBatchOp().setFilePath(DATA_DIR + ITEMCF_MODEL_FILE)).transform(test_data).print();
    LocalPredictor recomm_predictor = new ItemCfSimilarItemsRecommender().setItemCol(ITEM_COL).setRecommCol(RECOMM_COL).setK(10).setModelData(new AkSourceBatchOp().setFilePath(DATA_DIR + ITEMCF_MODEL_FILE)).collectLocalPredictor("item_id long");
    LocalPredictor kv_predictor = new Lookup().setSelectedCols(ITEM_COL).setOutputCols("item_name").setModelData(getSourceItems()).setMapKeyCols("item_id").setMapValueCols("title").collectLocalPredictor("item_id long");
    MTable recommResult = (MTable) recomm_predictor.map(Row.of(50L)).getField(1);
    System.out.println(recommResult);
}
Also used : MemSourceBatchOp(com.alibaba.alink.operator.batch.source.MemSourceBatchOp) AkSourceBatchOp(com.alibaba.alink.operator.batch.source.AkSourceBatchOp) MTable(com.alibaba.alink.common.MTable) LocalPredictor(com.alibaba.alink.pipeline.LocalPredictor) Lookup(com.alibaba.alink.pipeline.dataproc.Lookup)

Example 33 with AkSourceBatchOp

use of com.alibaba.alink.operator.batch.source.AkSourceBatchOp in project Alink by alibaba.

the class Chap24 method c_7.

static void c_7() throws Exception {
    if (!new File(DATA_DIR + USERCF_MODEL_FILE).exists()) {
        getSourceRatings().link(new UserCfTrainBatchOp().setUserCol(USER_COL).setItemCol(ITEM_COL).setRateCol(RATING_COL)).link(new AkSinkBatchOp().setFilePath(DATA_DIR + USERCF_MODEL_FILE));
        BatchOperator.execute();
    }
    MemSourceBatchOp test_data = new MemSourceBatchOp(new Long[] { 50L }, ITEM_COL);
    new UserCfUsersPerItemRecommender().setItemCol(ITEM_COL).setRecommCol(RECOMM_COL).setModelData(new AkSourceBatchOp().setFilePath(DATA_DIR + USERCF_MODEL_FILE)).transform(test_data).print();
    getSourceRatings().filter("user_id IN (276,429,222,864,194,650,896,303,749,301) AND item_id=50").print();
    new UserCfUsersPerItemRecommender().setItemCol(ITEM_COL).setRecommCol(RECOMM_COL).setExcludeKnown(true).setModelData(new AkSourceBatchOp().setFilePath(DATA_DIR + USERCF_MODEL_FILE)).transform(test_data).print();
}
Also used : MemSourceBatchOp(com.alibaba.alink.operator.batch.source.MemSourceBatchOp) AkSourceBatchOp(com.alibaba.alink.operator.batch.source.AkSourceBatchOp) AkSinkBatchOp(com.alibaba.alink.operator.batch.sink.AkSinkBatchOp) File(java.io.File) UserCfTrainBatchOp(com.alibaba.alink.operator.batch.recommendation.UserCfTrainBatchOp)

Example 34 with AkSourceBatchOp

use of com.alibaba.alink.operator.batch.source.AkSourceBatchOp in project Alink by alibaba.

the class Chap25 method c_3.

static void c_3() throws Exception {
    Stopwatch sw = new Stopwatch();
    sw.start();
    AlinkGlobalConfiguration.setPrintProcessInfo(true);
    AkSourceBatchOp train_set = new AkSourceBatchOp().setFilePath(Chap16.DATA_DIR + Chap16.TRAIN_FILE);
    AkSourceBatchOp test_set = new AkSourceBatchOp().setFilePath(Chap16.DATA_DIR + Chap16.TEST_FILE);
    linearReg(train_set, test_set);
    dnnReg(train_set, test_set);
    sw.stop();
    System.out.println(sw.getElapsedTimeSpan());
}
Also used : AkSourceBatchOp(com.alibaba.alink.operator.batch.source.AkSourceBatchOp) Stopwatch(com.alibaba.alink.common.utils.Stopwatch)

Example 35 with AkSourceBatchOp

use of com.alibaba.alink.operator.batch.source.AkSourceBatchOp in project Alink by alibaba.

the class Chap25 method c_2.

static void c_2() throws Exception {
    Stopwatch sw = new Stopwatch();
    sw.start();
    AlinkGlobalConfiguration.setPrintProcessInfo(true);
    AkSourceBatchOp train_set = new AkSourceBatchOp().setFilePath(Chap13.DATA_DIR + Chap13.DENSE_TRAIN_FILE);
    AkSourceBatchOp test_set = new AkSourceBatchOp().setFilePath(Chap13.DATA_DIR + Chap13.DENSE_TEST_FILE);
    softmax(train_set, test_set);
    dnn(train_set, test_set);
    cnn(train_set, test_set);
    sw.stop();
    System.out.println(sw.getElapsedTimeSpan());
}
Also used : AkSourceBatchOp(com.alibaba.alink.operator.batch.source.AkSourceBatchOp) Stopwatch(com.alibaba.alink.common.utils.Stopwatch)

Aggregations

AkSourceBatchOp (com.alibaba.alink.operator.batch.source.AkSourceBatchOp)66 AkSinkBatchOp (com.alibaba.alink.operator.batch.sink.AkSinkBatchOp)20 EvalBinaryClassBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalBinaryClassBatchOp)18 File (java.io.File)16 EvalMultiClassBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalMultiClassBatchOp)10 Pipeline (com.alibaba.alink.pipeline.Pipeline)10 LogisticRegression (com.alibaba.alink.pipeline.classification.LogisticRegression)10 EvalClusterBatchOp (com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp)9 Stopwatch (com.alibaba.alink.common.utils.Stopwatch)8 MemSourceBatchOp (com.alibaba.alink.operator.batch.source.MemSourceBatchOp)7 Row (org.apache.flink.types.Row)6 Test (org.junit.Test)6 BatchOperator (com.alibaba.alink.operator.batch.BatchOperator)5 CsvSourceBatchOp (com.alibaba.alink.operator.batch.source.CsvSourceBatchOp)5 PipelineModel (com.alibaba.alink.pipeline.PipelineModel)5 ArrayList (java.util.ArrayList)4 PluginDownloader (com.alibaba.alink.common.io.plugin.PluginDownloader)3 RegisterKey (com.alibaba.alink.common.io.plugin.RegisterKey)3 LogisticRegressionPredictBatchOp (com.alibaba.alink.operator.batch.classification.LogisticRegressionPredictBatchOp)3 LogisticRegressionTrainBatchOp (com.alibaba.alink.operator.batch.classification.LogisticRegressionTrainBatchOp)3