Example 1 with CsvInput

Use of org.neo4j.internal.batchimport.input.csv.CsvInput in project neo4j by neo4j.

Class ImportPanicIT, method shouldExitAndThrowExceptionOnPanic.

/**
 * There was a problem where some steps, in particular parallel CSV input parsing,
 * would hang the import entirely when they panicked.
 */
@Test
void shouldExitAndThrowExceptionOnPanic() throws Exception {
    try (JobScheduler jobScheduler = new ThreadPoolJobScheduler()) {
        BatchImporter importer = new ParallelBatchImporter(
                databaseLayout, testDirectory.getFileSystem(), PageCacheTracer.NULL, Configuration.DEFAULT,
                NullLogService.getInstance(), ExecutionMonitor.INVISIBLE, AdditionalInitialIds.EMPTY, Config.defaults(),
                StandardV3_4.RECORD_FORMATS, ImportLogic.NO_MONITOR, jobScheduler, Collector.EMPTY,
                LogFilesInitializer.NULL, IndexImporterFactory.EMPTY, EmptyMemoryTracker.INSTANCE);
        Iterable<DataFactory> nodeData = DataFactories.datas(
                DataFactories.data(InputEntityDecorators.NO_DECORATOR, fileAsCharReadable(nodeCsvFileWithBrokenEntries())));
        Input brokenCsvInput = new CsvInput(
                nodeData, DataFactories.defaultFormatNodeFileHeader(),
                DataFactories.datas(), DataFactories.defaultFormatRelationshipFileHeader(),
                IdType.ACTUAL, csvConfigurationWithLowBufferSize(), CsvInput.NO_MONITOR, INSTANCE);
        // The import must fail with an exception instead of hanging on the broken input.
        var e = assertThrows(InputException.class, () -> importer.doImport(brokenCsvInput));
        assertTrue(e.getCause() instanceof DataAfterQuoteException);
    }
}
Also used: JobScheduler(org.neo4j.scheduler.JobScheduler) ThreadPoolJobScheduler(org.neo4j.test.scheduler.ThreadPoolJobScheduler) CsvInput(org.neo4j.internal.batchimport.input.csv.CsvInput) Input(org.neo4j.internal.batchimport.input.Input) DataFactory(org.neo4j.internal.batchimport.input.csv.DataFactory) DataAfterQuoteException(org.neo4j.csv.reader.DataAfterQuoteException) Test(org.junit.jupiter.api.Test)
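
The test above expects a failure inside parallel CSV parsing to surface as an exception from doImport rather than hang the import. The following is a minimal sketch of that general idea using only java.util.concurrent. It is not how neo4j's batch importer is implemented, and every name in it (PanicPropagationSketch, BrokenEntryException, parseQuoted, parseAll) is hypothetical and introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PanicPropagationSketch {
    /** Hypothetical stand-in for a parse failure such as data after a closing quote. */
    static class BrokenEntryException extends RuntimeException {
        BrokenEntryException(String message) {
            super(message);
        }
    }

    /** Hypothetical parser: rejects a quoted field that has characters after its closing quote. */
    static String parseQuoted(String field) {
        int close = field.lastIndexOf('"');
        if (field.startsWith("\"") && close > 0 && close != field.length() - 1) {
            throw new BrokenEntryException("Data after closing quote: " + field);
        }
        return field.replace("\"", "");
    }

    /** Parses fields on worker threads and rethrows the first worker failure to the caller. */
    static List<String> parseAll(List<String> fields, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String field : fields) {
                Callable<String> task = () -> parseQuoted(field);
                futures.add(pool.submit(task));
            }
            List<String> parsed = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    parsed.add(future.get());
                } catch (ExecutionException e) {
                    // Surface the worker's failure instead of waiting on remaining work.
                    throw new IllegalStateException("Import panicked", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while parsing", e);
                }
            }
            return parsed;
        } finally {
            // Always stop the workers so a failure cannot leave threads running.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAll(List.of("\"a\"", "b"), 2));
        try {
            parseAll(List.of("\"a\"x"), 2);
        } catch (IllegalStateException e) {
            System.out.println("Failed with cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

The caller sees the original failure as the cause of the thrown exception, which mirrors the shape of the assertion on e.getCause() in the test.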

Example 2 with CsvInput

Use of org.neo4j.internal.batchimport.input.csv.CsvInput in project neo4j by neo4j.

Class CsvImporter, method doImport.

@Override
public void doImport() throws IOException {
    if (force) {
        fileSystem.deleteRecursively(databaseLayout.databaseDirectory());
        fileSystem.deleteRecursively(databaseLayout.getTransactionLogsDirectory());
    }
    try (OutputStream badOutput = fileSystem.openAsOutputStream(reportFile, false);
        Collector badCollector = getBadCollector(skipBadEntriesLogging, badOutput)) {
        // Extract the default time zone from the database configuration
        ZoneId dbTimeZone = databaseConfig.get(GraphDatabaseSettings.db_temporal_timezone);
        Supplier<ZoneId> defaultTimeZone = () -> dbTimeZone;
        final var nodeData = nodeData();
        final var relationshipsData = relationshipData();
        CsvInput input = new CsvInput(
                nodeData, defaultFormatNodeFileHeader(defaultTimeZone, normalizeTypes),
                relationshipsData, defaultFormatRelationshipFileHeader(defaultTimeZone, normalizeTypes),
                idType, csvConfig, new CsvInput.PrintingMonitor(stdOut), memoryTracker);
        doImport(input, badCollector);
    }
}
Also used: ZoneId(java.time.ZoneId) OutputStream(java.io.OutputStream) Collector(org.neo4j.internal.batchimport.input.Collector) Collectors.badCollector(org.neo4j.internal.batchimport.input.Collectors.badCollector) Collectors.silentBadCollector(org.neo4j.internal.batchimport.input.Collectors.silentBadCollector) BadCollector(org.neo4j.internal.batchimport.input.BadCollector) CsvInput(org.neo4j.internal.batchimport.input.csv.CsvInput)
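
The doImport method above opens the report stream and the bad-entry collector in a single try-with-resources statement. Java closes such resources in the reverse of their declaration order, so the collector, which is handed the stream, is closed before the stream itself. The sketch below shows that ordering with a hypothetical Named resource. The class and method names are assumptions for illustration and are not part of neo4j.

```java
import java.util.ArrayList;
import java.util.List;

class ResourceOrderSketch {
    /** Hypothetical resource that records when it is opened and closed. */
    static class Named implements AutoCloseable {
        private final String name;
        private final List<String> events;

        Named(String name, List<String> events) {
            this.name = name;
            this.events = events;
            events.add("open " + name);
        }

        @Override
        public void close() {
            events.add("close " + name);
        }
    }

    /** Opens two resources the way doImport does and returns the recorded event order. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        try (Named output = new Named("output", events);
             Named collector = new Named("collector", events)) {
            events.add("import");
        }
        return events;
    }

    public static void main(String[] args) {
        // Prints one event per line in the order it happened.
        run().forEach(System.out::println);
    }
}
```

Declaring the dependent resource second is what makes this safe: it gets a chance to flush into the stream before the stream is closed.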

Aggregations

CsvInput (org.neo4j.internal.batchimport.input.csv.CsvInput) 2
OutputStream (java.io.OutputStream) 1
ZoneId (java.time.ZoneId) 1
Test (org.junit.jupiter.api.Test) 1
DataAfterQuoteException (org.neo4j.csv.reader.DataAfterQuoteException) 1
BadCollector (org.neo4j.internal.batchimport.input.BadCollector) 1
Collector (org.neo4j.internal.batchimport.input.Collector) 1
Collectors.badCollector (org.neo4j.internal.batchimport.input.Collectors.badCollector) 1
Collectors.silentBadCollector (org.neo4j.internal.batchimport.input.Collectors.silentBadCollector) 1
Input (org.neo4j.internal.batchimport.input.Input) 1
DataFactory (org.neo4j.internal.batchimport.input.csv.DataFactory) 1
JobScheduler (org.neo4j.scheduler.JobScheduler) 1
ThreadPoolJobScheduler (org.neo4j.test.scheduler.ThreadPoolJobScheduler) 1