use of org.apache.flink.api.java.io.TextValueInputFormat in project flink by apache.
the class ExecutionEnvironment method readTextFileWithValue.
/**
* Creates a {@link DataSet} that represents the Strings produced by reading the given file line
* wise. This method is similar to {@link #readTextFile(String, String)}, but it produces a
* DataSet with mutable {@link StringValue} objects, rather than Java Strings. StringValues can
* be used to tune implementations to be less object and garbage collection heavy.
*
* <p>The {@link java.nio.charset.Charset} with the given name will be used to read the files.
*
* @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or
* "hdfs://host:port/file/path").
* @param charsetName The name of the character set used to read the file.
* @param skipInvalidLines A flag to indicate whether to skip lines that cannot be read with the
* given character set.
* @return A DataSet that represents the data read from the given file as text lines.
*/
public DataSource<StringValue> readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines) {
Preconditions.checkNotNull(filePath, "The file path may not be null.");
TextValueInputFormat format = new TextValueInputFormat(new Path(filePath));
format.setCharsetName(charsetName);
format.setSkipInvalidLines(skipInvalidLines);
return new DataSource<>(this, format, new ValueTypeInfo<>(StringValue.class), Utils.getCallLocationName());
}
Aggregations