
Example 1 with KvEntryFileWriter

Use of com.baidu.hugegraph.computer.core.store.KvEntryFileWriter in project hugegraph-computer by hugegraph.

Class HgkvFileSorter, method mergeInputs.

@Override
public void mergeInputs(List<String> inputs, OuterSortFlusher flusher, List<String> outputs, boolean withSubKv) throws Exception {
    Function<String, EntryIterator> fileToInput;
    Function<String, KvEntryFileWriter> fileToWriter;
    if (withSubKv) {
        fileToInput = o -> new HgkvDir4SubKvReaderImpl(o).iterator();
    } else {
        fileToInput = o -> new HgkvDirReaderImpl(o).iterator();
    }
    fileToWriter = path -> new HgkvDirBuilderImpl(this.config, path);
    InputFilesSelector selector = new DisperseEvenlySelector();
    List<SelectedFiles> selectResult = selector.selectedByHgkvFile(inputs, outputs);
    this.sorter.mergeFile(selectResult, fileToInput, fileToWriter, flusher);
}
Also used : DisperseEvenlySelector(com.baidu.hugegraph.computer.core.store.file.select.DisperseEvenlySelector) HgkvDir4SubKvReaderImpl(com.baidu.hugegraph.computer.core.store.file.hgkvfile.reader.HgkvDir4SubKvReaderImpl) HgkvDirReaderImpl(com.baidu.hugegraph.computer.core.store.file.hgkvfile.reader.HgkvDirReaderImpl) SelectedFiles(com.baidu.hugegraph.computer.core.store.file.select.SelectedFiles) EntryIterator(com.baidu.hugegraph.computer.core.store.EntryIterator) KvEntryFileWriter(com.baidu.hugegraph.computer.core.store.KvEntryFileWriter) HgkvDirBuilderImpl(com.baidu.hugegraph.computer.core.store.file.hgkvfile.builder.HgkvDirBuilderImpl) InputFilesSelector(com.baidu.hugegraph.computer.core.store.file.select.InputFilesSelector)
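
The example above hands the inputs and outputs to a selector before merging. How DisperseEvenlySelector actually groups files is not shown in this listing; the sketch below is only an assumption about what "dispersing evenly" could mean: each input goes to the output whose running size total is currently smallest. The class DisperseSketch and the method disperse are hypothetical names and are not part of hugegraph-computer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DisperseSketch {

    // Hypothetical helper, not the hugegraph-computer API: assigns each input
    // size to the output bucket whose running total is currently the smallest.
    static List<List<Integer>> disperse(List<Integer> sizes, int outputs) {
        List<List<Integer>> buckets = new ArrayList<>();
        long[] totals = new long[outputs];
        for (int i = 0; i < outputs; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int size : sizes) {
            // Find the bucket with the smallest total (first one wins on ties)
            int min = 0;
            for (int i = 1; i < outputs; i++) {
                if (totals[i] < totals[min]) {
                    min = i;
                }
            }
            buckets.get(min).add(size);
            totals[min] += size;
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Four input sizes spread over two outputs
        System.out.println(disperse(Arrays.asList(200, 500, 20, 50), 2));
        // prints [[200, 20, 50], [500]]
    }
}
```

Every input lands in exactly one bucket, so the number of entries is preserved no matter how the inputs are grouped.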

Example 2 with KvEntryFileWriter

Use of com.baidu.hugegraph.computer.core.store.KvEntryFileWriter in project hugegraph-computer by hugegraph.

Class BufferFileSorter, method mergeInputs.

@Override
public void mergeInputs(List<String> inputs, OuterSortFlusher flusher, List<String> outputs, boolean withSubKv) throws Exception {
    Function<String, EntryIterator> fileToInput;
    Function<String, KvEntryFileWriter> fileToWriter;
    if (withSubKv) {
        fileToInput = o -> new BufferFileSubEntryReader(o).iterator();
    } else {
        fileToInput = o -> new BufferFileEntryReader(o).iterator();
    }
    fileToWriter = BufferFileEntryBuilder::new;
    InputFilesSelector selector = new DisperseEvenlySelector();
    List<SelectedFiles> selectResult = selector.selectedByBufferFile(inputs, outputs);
    this.sorter.mergeFile(selectResult, fileToInput, fileToWriter, flusher);
}
Also used : DisperseEvenlySelector(com.baidu.hugegraph.computer.core.store.file.select.DisperseEvenlySelector) BufferFileSubEntryReader(com.baidu.hugegraph.computer.core.store.file.bufferfile.BufferFileSubEntryReader) BufferFileEntryReader(com.baidu.hugegraph.computer.core.store.file.bufferfile.BufferFileEntryReader) SelectedFiles(com.baidu.hugegraph.computer.core.store.file.select.SelectedFiles) EntryIterator(com.baidu.hugegraph.computer.core.store.EntryIterator) BufferFileEntryBuilder(com.baidu.hugegraph.computer.core.store.file.bufferfile.BufferFileEntryBuilder) KvEntryFileWriter(com.baidu.hugegraph.computer.core.store.KvEntryFileWriter) InputFilesSelector(com.baidu.hugegraph.computer.core.store.file.select.InputFilesSelector)

Example 3 with KvEntryFileWriter

Use of com.baidu.hugegraph.computer.core.store.KvEntryFileWriter in project hugegraph-computer by hugegraph.

Class SortLargeDataTest, method testDiffNumEntriesFileMerge.

@Test
public void testDiffNumEntriesFileMerge() throws Exception {
    Config config = UnitTestBase.updateWithRequiredOptions(ComputerOptions.HGKV_MERGE_FILES_NUM, "3", ComputerOptions.TRANSPORT_RECV_FILE_MODE, "false");
    List<Integer> sizeList = ImmutableList.of(200, 500, 20, 50, 300, 250, 10, 33, 900, 89, 20);
    List<String> inputs = new ArrayList<>();
    for (int j = 0; j < sizeList.size(); j++) {
        String file = StoreTestUtil.availablePathById(j + 10);
        inputs.add(file);
        try (KvEntryFileWriter builder = new HgkvDirBuilderImpl(config, file)) {
            for (int i = 0; i < sizeList.get(j); i++) {
                byte[] keyBytes = StoreTestUtil.intToByteArray(i);
                byte[] valueBytes = StoreTestUtil.intToByteArray(1);
                Pointer key = new InlinePointer(keyBytes);
                Pointer value = new InlinePointer(valueBytes);
                KvEntry entry = new DefaultKvEntry(key, value);
                builder.write(entry);
            }
        }
    }
    List<String> outputs = ImmutableList.of(StoreTestUtil.availablePathById(0), StoreTestUtil.availablePathById(1), StoreTestUtil.availablePathById(2), StoreTestUtil.availablePathById(3));
    Sorter sorter = SorterTestUtil.createSorter(config);
    sorter.mergeInputs(inputs, new KvOuterSortFlusher(), outputs, false);
    int total = sizeList.stream().mapToInt(i -> i).sum();
    int mergeTotal = 0;
    for (String output : outputs) {
        mergeTotal += HgkvDirImpl.open(output).numEntries();
    }
    Assert.assertEquals(total, mergeTotal);
}
Also used : ComputerOptions(com.baidu.hugegraph.computer.core.config.ComputerOptions) BeforeClass(org.junit.BeforeClass) Random(java.util.Random) EntriesUtil(com.baidu.hugegraph.computer.core.store.entry.EntriesUtil) Pointer(com.baidu.hugegraph.computer.core.store.entry.Pointer) ArrayList(java.util.ArrayList) IntValueSumCombiner(com.baidu.hugegraph.computer.core.combiner.IntValueSumCombiner) IOFactory(com.baidu.hugegraph.computer.core.io.IOFactory) Lists(com.google.common.collect.Lists) ImmutableList(com.google.common.collect.ImmutableList) After(org.junit.After) StoreTestUtil(com.baidu.hugegraph.computer.core.store.StoreTestUtil) UnitTestBase(com.baidu.hugegraph.computer.suite.unit.UnitTestBase) Before(org.junit.Before) Logger(org.slf4j.Logger) OuterSortFlusher(com.baidu.hugegraph.computer.core.sort.flusher.OuterSortFlusher) Constants(com.baidu.hugegraph.computer.core.common.Constants) IOException(java.io.IOException) FileUtils(org.apache.commons.io.FileUtils) Test(org.junit.Test) HgkvDir(com.baidu.hugegraph.computer.core.store.file.hgkvfile.HgkvDir) StopWatch(org.apache.commons.lang3.time.StopWatch) CombineKvInnerSortFlusher(com.baidu.hugegraph.computer.core.sort.flusher.CombineKvInnerSortFlusher) HgkvDirImpl(com.baidu.hugegraph.computer.core.store.file.hgkvfile.HgkvDirImpl) DefaultKvEntry(com.baidu.hugegraph.computer.core.store.entry.DefaultKvEntry) KvEntry(com.baidu.hugegraph.computer.core.store.entry.KvEntry) File(java.io.File) Config(com.baidu.hugegraph.computer.core.config.Config) KvOuterSortFlusher(com.baidu.hugegraph.computer.core.sort.flusher.KvOuterSortFlusher) Bytes(com.baidu.hugegraph.util.Bytes) List(java.util.List) Log(com.baidu.hugegraph.util.Log) CombineKvOuterSortFlusher(com.baidu.hugegraph.computer.core.sort.flusher.CombineKvOuterSortFlusher) IntValue(com.baidu.hugegraph.computer.core.graph.value.IntValue) HgkvDirBuilderImpl(com.baidu.hugegraph.computer.core.store.file.hgkvfile.builder.HgkvDirBuilderImpl) 
PointerCombiner(com.baidu.hugegraph.computer.core.combiner.PointerCombiner) BytesInput(com.baidu.hugegraph.computer.core.io.BytesInput) BytesOutput(com.baidu.hugegraph.computer.core.io.BytesOutput) SorterTestUtil(com.baidu.hugegraph.computer.core.sort.SorterTestUtil) KvEntryFileWriter(com.baidu.hugegraph.computer.core.store.KvEntryFileWriter) Sorter(com.baidu.hugegraph.computer.core.sort.Sorter) Assert(com.baidu.hugegraph.testutil.Assert) PeekableIterator(com.baidu.hugegraph.computer.core.sort.flusher.PeekableIterator) RandomAccessInput(com.baidu.hugegraph.computer.core.io.RandomAccessInput) InlinePointer(com.baidu.hugegraph.computer.core.store.entry.InlinePointer) InnerSortFlusher(com.baidu.hugegraph.computer.core.sort.flusher.InnerSortFlusher)

Example 4 with KvEntryFileWriter

Use of com.baidu.hugegraph.computer.core.store.KvEntryFileWriter in project hugegraph-computer by hugegraph.

Class FileMergerImpl, method mergeInputs.

private void mergeInputs(List<String> inputs, Function<String, EntryIterator> inputToIter, OuterSortFlusher flusher, String output, Function<String, KvEntryFileWriter> fileToWriter) throws Exception {
    /*
     * File value format is different, upper layer is required to
     * provide the file reading mode
     */
    List<EntryIterator> entries = inputs.stream().map(inputToIter).collect(Collectors.toList());
    InputsSorter sorter = new InputsSorterImpl();
    // Merge inputs and write to output
    try (EntryIterator sortedKv = sorter.sort(entries);
        KvEntryFileWriter builder = fileToWriter.apply(output)) {
        flusher.flush(sortedKv, builder);
    }
}
Also used : InputsSorterImpl(com.baidu.hugegraph.computer.core.sort.sorter.InputsSorterImpl) EntryIterator(com.baidu.hugegraph.computer.core.store.EntryIterator) InputsSorter(com.baidu.hugegraph.computer.core.sort.sorter.InputsSorter) KvEntryFileWriter(com.baidu.hugegraph.computer.core.store.KvEntryFileWriter)
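
The method above merges several sorted inputs into one sorted stream and writes it through a writer created by a factory function. The internals of InputsSorterImpl are not shown in this listing; the sketch below assumes a heap-based k-way merge, which is one common way to do this and not necessarily what the project does. MergeSketch, ListWriter and Cursor are hypothetical stand-ins that work on plain integers in memory; they are not hugegraph-computer classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Function;

public class MergeSketch {

    // Hypothetical stand-in for KvEntryFileWriter: collects keys in memory
    static class ListWriter implements AutoCloseable {
        final List<Integer> written = new ArrayList<>();
        boolean closed = false;

        void write(int key) {
            this.written.add(key);
        }

        @Override
        public void close() {
            this.closed = true;
        }
    }

    // One cursor per sorted input, ordered in the heap by its current head
    private static final class Cursor {
        int head;
        final Iterator<Integer> rest;

        Cursor(Iterator<Integer> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    // Merges already-sorted inputs into the writer built by fileToWriter
    static ListWriter merge(List<List<Integer>> inputs, String output,
                            Function<String, ListWriter> fileToWriter) {
        PriorityQueue<Cursor> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.head, b.head));
        for (List<Integer> input : inputs) {
            Iterator<Integer> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it));
            }
        }
        ListWriter writer = fileToWriter.apply(output);
        // try-with-resources closes the writer even if writing fails
        try (ListWriter w = writer) {
            while (!heap.isEmpty()) {
                Cursor smallest = heap.poll();
                w.write(smallest.head);
                if (smallest.rest.hasNext()) {
                    smallest.head = smallest.rest.next();
                    heap.add(smallest);
                }
            }
        }
        return writer;
    }

    public static void main(String[] args) {
        List<List<Integer>> inputs = Arrays.asList(
                Arrays.asList(1, 4, 9), Arrays.asList(2, 3, 10));
        ListWriter result = merge(inputs, "out", path -> new ListWriter());
        System.out.println(result.written); // prints [1, 2, 3, 4, 9, 10]
    }
}
```

Passing the writer as a Function<String, ...> keeps the merge loop independent of the output format, which mirrors how the callers in Examples 1 and 2 supply different readers and writers to the same merge routine.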

Aggregations

KvEntryFileWriter (com.baidu.hugegraph.computer.core.store.KvEntryFileWriter) 4
EntryIterator (com.baidu.hugegraph.computer.core.store.EntryIterator) 3
HgkvDirBuilderImpl (com.baidu.hugegraph.computer.core.store.file.hgkvfile.builder.HgkvDirBuilderImpl) 2
DisperseEvenlySelector (com.baidu.hugegraph.computer.core.store.file.select.DisperseEvenlySelector) 2
InputFilesSelector (com.baidu.hugegraph.computer.core.store.file.select.InputFilesSelector) 2
SelectedFiles (com.baidu.hugegraph.computer.core.store.file.select.SelectedFiles) 2
IntValueSumCombiner (com.baidu.hugegraph.computer.core.combiner.IntValueSumCombiner) 1
PointerCombiner (com.baidu.hugegraph.computer.core.combiner.PointerCombiner) 1
Constants (com.baidu.hugegraph.computer.core.common.Constants) 1
ComputerOptions (com.baidu.hugegraph.computer.core.config.ComputerOptions) 1
Config (com.baidu.hugegraph.computer.core.config.Config) 1
IntValue (com.baidu.hugegraph.computer.core.graph.value.IntValue) 1
BytesInput (com.baidu.hugegraph.computer.core.io.BytesInput) 1
BytesOutput (com.baidu.hugegraph.computer.core.io.BytesOutput) 1
IOFactory (com.baidu.hugegraph.computer.core.io.IOFactory) 1
RandomAccessInput (com.baidu.hugegraph.computer.core.io.RandomAccessInput) 1
Sorter (com.baidu.hugegraph.computer.core.sort.Sorter) 1
SorterTestUtil (com.baidu.hugegraph.computer.core.sort.SorterTestUtil) 1
CombineKvInnerSortFlusher (com.baidu.hugegraph.computer.core.sort.flusher.CombineKvInnerSortFlusher) 1
CombineKvOuterSortFlusher (com.baidu.hugegraph.computer.core.sort.flusher.CombineKvOuterSortFlusher) 1