Example 16 with TextInputFormat

Use of org.apache.flink.api.java.io.TextInputFormat in project flink by apache.

From the class ExecutionEnvironment, method readTextFile.

/**
 * Creates a {@link DataSet} that represents the Strings produced by reading the given file line
 * wise. The {@link java.nio.charset.Charset} with the given name will be used to read the
 * files.
 *
 * @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or
 *     "hdfs://host:port/file/path").
 * @param charsetName The name of the character set used to read the file.
 * @return A {@link DataSet} that represents the data read from the given file as text lines.
 */
public DataSource<String> readTextFile(String filePath, String charsetName) {
    Preconditions.checkNotNull(filePath, "The file path may not be null.");
    TextInputFormat format = new TextInputFormat(new Path(filePath));
    format.setCharsetName(charsetName);
    return new DataSource<>(this, format, BasicTypeInfo.STRING_TYPE_INFO, Utils.getCallLocationName());
}
Also used: Path (org.apache.flink.core.fs.Path), TextInputFormat (org.apache.flink.api.java.io.TextInputFormat), DataSource (org.apache.flink.api.java.operators.DataSource)
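
The Javadoc above describes reading a file line-wise with a character set looked up by name. As a rough illustration of that behaviour only, here is a minimal plain-JDK sketch. It is not Flink's implementation and does not use TextInputFormat or DataSource. The class name LineWiseReader is hypothetical, and unlike the Flink method it assumes a local file system path rather than a URI and returns an in-memory List instead of a distributed data set.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helper, not part of Flink: mimics the "read text lines with a
// named charset" contract of readTextFile using only the JDK.
class LineWiseReader {

    // Assumption: filePath is a plain local path (not a "file://" or "hdfs://" URI).
    static List<String> readTextFile(String filePath, String charsetName) throws IOException {
        Objects.requireNonNull(filePath, "The file path may not be null.");
        // Resolve the charset by name, as the Javadoc above describes.
        Charset charset = Charset.forName(charsetName);
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(filePath), charset)) {
            String line;
            // readLine() returns null at end of file and strips the line terminator.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file, then read it back line by line.
        Path tmp = Files.createTempFile("lines", ".txt");
        Files.write(tmp, Arrays.asList("first line", "second line"), Charset.forName("UTF-8"));
        for (String line : readTextFile(tmp.toString(), "UTF-8")) {
            System.out.println(line);
        }
        Files.delete(tmp);
    }
}
```

In the Flink method itself the reading is deferred: the TextInputFormat is only configured with the path and charset name, and the returned DataSource reads the lines when the job runs.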

Aggregations

TextInputFormat (org.apache.flink.api.java.io.TextInputFormat): 16
Path (org.apache.flink.core.fs.Path): 15
Test (org.junit.Test): 13
TimestampedFileInputSplit (org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit): 9
OneShotLatch (org.apache.flink.core.testutils.OneShotLatch): 6
ContinuousFileMonitoringFunction (org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction): 6
Configuration (org.apache.flink.configuration.Configuration): 5
IOException (java.io.IOException): 4
HashSet (java.util.HashSet): 4
TreeSet (java.util.TreeSet): 4
StreamSource (org.apache.flink.streaming.api.operators.StreamSource): 4
AbstractStreamOperatorTestHarness (org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness): 4
FileNotFoundException (java.io.FileNotFoundException): 3
FileInputSplit (org.apache.flink.core.fs.FileInputSplit): 3
RunnableWithException (org.apache.flink.util.function.RunnableWithException): 3
File (java.io.File): 2
ArrayList (java.util.ArrayList): 2
HashMap (java.util.HashMap): 2
List (java.util.List): 2
OperatorSubtaskState (org.apache.flink.runtime.checkpoint.OperatorSubtaskState): 2