Example 1 with TextValueInputFormat

Use of org.apache.flink.api.java.io.TextValueInputFormat in project flink by apache.

From the class ExecutionEnvironment, method readTextFileWithValue.

/**
 * Creates a {@link DataSet} that represents the Strings produced by reading the given file line
 * wise. This method is similar to {@link #readTextFile(String, String)}, but it produces a
 * DataSet with mutable {@link StringValue} objects, rather than Java Strings. StringValues can
 * be used to tune implementations to be less object and garbage collection heavy.
 *
 * <p>The {@link java.nio.charset.Charset} with the given name will be used to read the files.
 *
 * @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or
 *     "hdfs://host:port/file/path").
 * @param charsetName The name of the character set used to read the file.
 * @param skipInvalidLines A flag to indicate whether to skip lines that cannot be read with the
 *     given character set.
 * @return A DataSet that represents the data read from the given file as text lines.
 */
public DataSource<StringValue> readTextFileWithValue(
        String filePath, String charsetName, boolean skipInvalidLines) {
    Preconditions.checkNotNull(filePath, "The file path may not be null.");
    TextValueInputFormat format = new TextValueInputFormat(new Path(filePath));
    format.setCharsetName(charsetName);
    format.setSkipInvalidLines(skipInvalidLines);
    return new DataSource<>(
            this, format, new ValueTypeInfo<>(StringValue.class), Utils.getCallLocationName());
}
Also used: Path (org.apache.flink.core.fs.Path), TextValueInputFormat (org.apache.flink.api.java.io.TextValueInputFormat), StringValue (org.apache.flink.types.StringValue), DataSource (org.apache.flink.api.java.operators.DataSource)
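
The skipInvalidLines flag in the method above decides what happens to a line that cannot be decoded with the given character set. The following is a minimal, stdlib-only sketch of that idea and is not Flink's implementation: the class StrictLineDecoder and its method decodeLines are hypothetical names introduced here for illustration. It assumes that "invalid" means the bytes are malformed or unmappable for the charset, and that a failure should surface as an unchecked exception when skipping is disabled.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Flink): strict per-line decoding with a skip flag. */
final class StrictLineDecoder {

    private StrictLineDecoder() {}

    /**
     * Decodes each raw line with the named charset. Lines that cannot be decoded are
     * dropped when skipInvalidLines is true; otherwise the first bad line raises an
     * UncheckedIOException.
     */
    static List<String> decodeLines(
            List<byte[]> rawLines, String charsetName, boolean skipInvalidLines) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = Charset.forName(charsetName)
                .newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        List<String> result = new ArrayList<>();
        for (byte[] raw : rawLines) {
            try {
                // decode(ByteBuffer) resets the decoder first, so one instance
                // can be reused for every line.
                result.add(decoder.decode(ByteBuffer.wrap(raw)).toString());
            } catch (CharacterCodingException e) {
                if (!skipInvalidLines) {
                    throw new UncheckedIOException(e);
                }
                // Skipping is enabled: drop this line and continue with the next one.
            }
        }
        return result;
    }
}
```

Calling decodeLines with a list that contains one malformed UTF-8 line returns only the well-formed lines when the flag is true, and throws when it is false. Whether silently dropping lines is acceptable depends on the job; for audited data it may be safer to leave the flag off and fix the input encoding.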

Aggregations

TextValueInputFormat (org.apache.flink.api.java.io.TextValueInputFormat): 1
DataSource (org.apache.flink.api.java.operators.DataSource): 1
Path (org.apache.flink.core.fs.Path): 1
StringValue (org.apache.flink.types.StringValue): 1