Example 16 with CheckpointV1

use of org.apache.samza.checkpoint.CheckpointV1 in project samza by apache.

the class AzureCheckpointManager method writeCheckpoint.

@Override
public void writeCheckpoint(TaskName taskName, Checkpoint checkpoint) {
    Preconditions.checkArgument(checkpoint instanceof CheckpointV1, "Only CheckpointV1 could be written to Azure");
    if (!taskNames.contains(taskName)) {
        throw new SamzaException("writing checkpoint of unregistered task");
    }
    TableBatchOperation batchOperation = new TableBatchOperation();
    Iterator<Map.Entry<SystemStreamPartition, String>> iterator = checkpoint.getOffsets().entrySet().iterator();
    while (iterator.hasNext()) {
        Map.Entry<SystemStreamPartition, String> entry = iterator.next();
        SystemStreamPartition ssp = entry.getKey();
        String offset = entry.getValue();
        String partitionKey = taskName.toString();
        checkValidKey(partitionKey, "Taskname");
        String rowKey = serializeSystemStreamPartition(ssp);
        checkValidKey(rowKey, "SystemStreamPartition");
        // Create table entity
        TaskCheckpointEntity taskCheckpoint = new TaskCheckpointEntity(partitionKey, rowKey, offset);
        // Add to batch operation
        batchOperation.insertOrReplace(taskCheckpoint);
        // Execute when batch reaches capacity or this is the last item
        if (batchOperation.size() >= MAX_WRITE_BATCH_SIZE || !iterator.hasNext()) {
            try {
                cloudTable.execute(batchOperation);
            } catch (StorageException e) {
                LOG.error("Executing batch failed for task: {}", taskName);
                throw new AzureException(e);
            }
            batchOperation.clear();
        }
    }
}
Also used : AzureException(org.apache.samza.AzureException) CheckpointV1(org.apache.samza.checkpoint.CheckpointV1) SamzaException(org.apache.samza.SamzaException) HashMap(java.util.HashMap) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) StorageException(com.microsoft.azure.storage.StorageException) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition)
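
The flush condition in writeCheckpoint (execute when the batch is full or the iterator is exhausted) can be tried in isolation. The following is a minimal sketch, not the Samza implementation: BatchFlushSketch, writeInBatches, and the limit of 3 are hypothetical names and values chosen for illustration, and a list of strings stands in for the Azure TableBatchOperation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of Samza or the Azure SDK.
class BatchFlushSketch {
    // Assumed limit; the real MAX_WRITE_BATCH_SIZE value is not shown in the listing.
    static final int MAX_BATCH_SIZE = 3;

    /** Groups entries into batches, flushing when a batch is full or the input is exhausted. */
    static List<List<String>> writeInBatches(Map<String, String> offsets) {
        List<List<String>> flushed = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> iterator = offsets.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<String, String> entry = iterator.next();
            batch.add(entry.getKey() + "=" + entry.getValue());
            // Same condition as the listing: full batch, or this was the last item.
            if (batch.size() >= MAX_BATCH_SIZE || !iterator.hasNext()) {
                // Copying the batch stands in for cloudTable.execute(batchOperation).
                flushed.add(new ArrayList<>(batch));
                batch.clear();
            }
        }
        return flushed;
    }

    public static void main(String[] args) {
        Map<String, String> offsets = new LinkedHashMap<>();
        for (int i = 0; i < 7; i++) {
            offsets.put("partition-" + i, String.valueOf(100 + i));
        }
        System.out.println(writeInBatches(offsets));
    }
}
```

With seven entries and an assumed limit of 3, the sketch flushes batches of 3, 3, and 1 entries. Checking !iterator.hasNext() inside the loop avoids a separate flush after the loop and means an empty input never triggers a flush.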

Example 17 with CheckpointV1

use of org.apache.samza.checkpoint.CheckpointV1 in project samza by apache.

the class ITestAzureCheckpointManager method testStoringAndReadingCheckpointsMultiTasks.

@Test
public void testStoringAndReadingCheckpointsMultiTasks() {
    Partition partition = new Partition(0);
    Partition partition1 = new Partition(1);
    TaskName taskName = new TaskName("taskName1");
    TaskName taskName1 = new TaskName("taskName2");
    SystemStreamPartition ssp = new SystemStreamPartition("Azure", "Stream", partition);
    SystemStreamPartition ssp1 = new SystemStreamPartition("Azure", "Stream", partition1);
    Map<SystemStreamPartition, String> sspMap = new HashMap<>();
    sspMap.put(ssp, "12345");
    sspMap.put(ssp1, "54321");
    Checkpoint cp1 = new CheckpointV1(sspMap);
    Map<SystemStreamPartition, String> sspMap2 = new HashMap<>();
    sspMap2.put(ssp, "12347");
    sspMap2.put(ssp1, "54323");
    Checkpoint cp2 = new CheckpointV1(sspMap2);
    checkpointManager.register(taskName);
    checkpointManager.register(taskName1);
    checkpointManager.writeCheckpoint(taskName, cp1);
    checkpointManager.writeCheckpoint(taskName1, cp2);
    Checkpoint readCp1 = checkpointManager.readLastCheckpoint(taskName);
    Assert.assertNotNull(readCp1);
    Assert.assertEquals(cp1, readCp1);
    Checkpoint readCp2 = checkpointManager.readLastCheckpoint(taskName1);
    Assert.assertNotNull(readCp2);
    Assert.assertEquals(cp2, readCp2);
    checkpointManager.writeCheckpoint(taskName, cp2);
    checkpointManager.writeCheckpoint(taskName1, cp1);
    readCp1 = checkpointManager.readLastCheckpoint(taskName1);
    Assert.assertEquals(cp1, readCp1);
    readCp2 = checkpointManager.readLastCheckpoint(taskName);
    Assert.assertEquals(cp2, readCp2);
}
Also used : Partition(org.apache.samza.Partition) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Checkpoint(org.apache.samza.checkpoint.Checkpoint) TaskName(org.apache.samza.container.TaskName) HashMap(java.util.HashMap) CheckpointV1(org.apache.samza.checkpoint.CheckpointV1)

Example 18 with CheckpointV1

use of org.apache.samza.checkpoint.CheckpointV1 in project samza by apache.

the class TestBlobStoreBackupManager method testInitWithInvalidCheckpoint.

@Test
public void testInitWithInvalidCheckpoint() {
    // init called with a null checkpoint
    blobStoreBackupManager.init(null);
    // verify delete snapshot index blob called from init 0 times because prevSnapshotMap returned from init is empty
    // in case of null checkpoint.
    verify(blobStoreUtil, times(0)).deleteSnapshotIndexBlob(anyString(), any(Metadata.class));
    when(blobStoreUtil.getStoreSnapshotIndexes(anyString(), anyString(), anyString(), any(Checkpoint.class), anySetOf(String.class))).thenCallRealMethod();
    // init called with Checkpoint V1 -> unsupported
    Checkpoint checkpoint = new CheckpointV1(new HashMap<>());
    try {
        blobStoreBackupManager.init(checkpoint);
    } catch (SamzaException exception) {
        Assert.fail("Checkpoint V1 is expected to only log warning.");
    }
}
Also used : Checkpoint(org.apache.samza.checkpoint.Checkpoint) CheckpointV1(org.apache.samza.checkpoint.CheckpointV1) SnapshotMetadata(org.apache.samza.storage.blobstore.index.SnapshotMetadata) SamzaException(org.apache.samza.SamzaException) Test(org.junit.Test)

Example 19 with CheckpointV1

use of org.apache.samza.checkpoint.CheckpointV1 in project samza by apache.

the class TestTaskStorageCommitManager method testWriteChangelogOffsetFilesWithEmptyChangelogTopic.

@Test
public void testWriteChangelogOffsetFilesWithEmptyChangelogTopic() throws IOException {
    Map<String, Map<SystemStreamPartition, String>> mockFileSystem = new HashMap<>();
    ContainerStorageManager containerStorageManager = mock(ContainerStorageManager.class);
    StorageEngine mockLPStore = mock(StorageEngine.class);
    StoreProperties lpStoreProps = mock(StoreProperties.class);
    when(mockLPStore.getStoreProperties()).thenReturn(lpStoreProps);
    when(lpStoreProps.isPersistedToDisk()).thenReturn(true);
    when(lpStoreProps.isDurableStore()).thenReturn(true);
    Path mockPath = mock(Path.class);
    when(mockLPStore.checkpoint(any())).thenReturn(Optional.of(mockPath));
    TaskInstanceMetrics metrics = mock(TaskInstanceMetrics.class);
    Timer checkpointTimer = mock(Timer.class);
    when(metrics.storeCheckpointNs()).thenReturn(checkpointTimer);
    java.util.Map<String, StorageEngine> taskStores = ImmutableMap.of("loggedPersistentStore", mockLPStore);
    Partition changelogPartition = new Partition(0);
    SystemStream changelogSystemStream = new SystemStream("changelogSystem", "changelogStream");
    SystemStreamPartition changelogSSP = new SystemStreamPartition(changelogSystemStream, changelogPartition);
    java.util.Map<String, SystemStream> storeChangelogsStreams = ImmutableMap.of("loggedPersistentStore", changelogSystemStream);
    StorageManagerUtil storageManagerUtil = mock(StorageManagerUtil.class);
    File tmpTestPath = new File("store-checkpoint-test");
    when(storageManagerUtil.getTaskStoreDir(eq(tmpTestPath), any(), any(), any())).thenReturn(tmpTestPath);
    TaskName taskName = new TaskName("task");
    when(containerStorageManager.getAllStores(taskName)).thenReturn(taskStores);
    TaskStorageCommitManager commitManager = spy(new TaskStorageCommitManager(taskName, Collections.emptyMap(), containerStorageManager, storeChangelogsStreams, changelogPartition, null, null, ForkJoinPool.commonPool(), storageManagerUtil, tmpTestPath, metrics));
    doAnswer(invocation -> {
        String storeName = invocation.getArgumentAt(0, String.class);
        String fileDir = invocation.getArgumentAt(3, File.class).getName();
        String mockKey = storeName + fileDir;
        SystemStreamPartition ssp = invocation.getArgumentAt(1, SystemStreamPartition.class);
        String offset = invocation.getArgumentAt(2, String.class);
        if (mockFileSystem.containsKey(mockKey)) {
            mockFileSystem.get(mockKey).put(ssp, offset);
        } else {
            Map<SystemStreamPartition, String> sspOffsets = new HashMap<>();
            sspOffsets.put(ssp, offset);
            mockFileSystem.put(mockKey, sspOffsets);
        }
        return null;
    }).when(commitManager).writeChangelogOffsetFile(any(), any(), any(), any());
    CheckpointId newCheckpointId = CheckpointId.create();
    String newestOffset = null;
    KafkaChangelogSSPOffset kafkaChangelogSSPOffset = new KafkaChangelogSSPOffset(newCheckpointId, newestOffset);
    java.util.Map<SystemStreamPartition, String> offsetsJava = ImmutableMap.of(changelogSSP, kafkaChangelogSSPOffset.toString());
    commitManager.init();
    // invoke persist to file system for a v1 checkpoint (a CheckpointV1 is passed below)
    commitManager.writeCheckpointToStoreDirectories(new CheckpointV1(offsetsJava));
    assertTrue(mockFileSystem.isEmpty());
    // verify that delete was called on current store dir offset file
    verify(storageManagerUtil, times(1)).deleteOffsetFile(eq(tmpTestPath));
}
Also used : HashMap(java.util.HashMap) CheckpointV1(org.apache.samza.checkpoint.CheckpointV1) KafkaChangelogSSPOffset(org.apache.samza.checkpoint.kafka.KafkaChangelogSSPOffset) Path(java.nio.file.Path) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Partition(org.apache.samza.Partition) SystemStream(org.apache.samza.system.SystemStream) TaskInstanceMetrics(org.apache.samza.container.TaskInstanceMetrics) Timer(org.apache.samza.metrics.Timer) TaskName(org.apache.samza.container.TaskName) CheckpointId(org.apache.samza.checkpoint.CheckpointId) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) File(java.io.File) Test(org.junit.Test)
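
The doAnswer body in this test accumulates offsets per key with an explicit containsKey branch. As a hedged alternative, the same accumulation can be written with Map.computeIfAbsent. The sketch below uses plain strings in place of SystemStreamPartition, and OffsetRecorderSketch and record are hypothetical names that do not appear in Samza.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the test's in-memory "file system"; not Samza code.
class OffsetRecorderSketch {
    private final Map<String, Map<String, String>> mockFileSystem = new HashMap<>();

    /** Records an offset under the given key, creating the inner map on first use. */
    void record(String mockKey, String ssp, String offset) {
        mockFileSystem.computeIfAbsent(mockKey, k -> new HashMap<>()).put(ssp, offset);
    }

    /** Exposes the recorded offsets for assertions. */
    Map<String, Map<String, String>> contents() {
        return mockFileSystem;
    }

    public static void main(String[] args) {
        OffsetRecorderSketch recorder = new OffsetRecorderSketch();
        recorder.record("loggedPersistentStore" + "store-checkpoint-test", "changelog-0", "1");
        recorder.record("loggedPersistentStore" + "store-checkpoint-test", "changelog-1", "5");
        System.out.println(recorder.contents());
    }
}
```

computeIfAbsent returns the existing inner map when the key is present and otherwise stores and returns a new one, so both branches of the original if/else collapse into a single call.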

Example 20 with CheckpointV1

use of org.apache.samza.checkpoint.CheckpointV1 in project samza by apache.

the class TestTaskStorageCommitManager method testWriteChangelogOffsetFilesV1.

@Test
public void testWriteChangelogOffsetFilesV1() throws IOException {
    Map<String, Map<SystemStreamPartition, String>> mockFileSystem = new HashMap<>();
    ContainerStorageManager containerStorageManager = mock(ContainerStorageManager.class);
    StorageEngine mockLPStore = mock(StorageEngine.class);
    StoreProperties lpStoreProps = mock(StoreProperties.class);
    when(mockLPStore.getStoreProperties()).thenReturn(lpStoreProps);
    when(lpStoreProps.isPersistedToDisk()).thenReturn(true);
    when(lpStoreProps.isDurableStore()).thenReturn(true);
    Path mockPath = mock(Path.class);
    when(mockLPStore.checkpoint(any())).thenReturn(Optional.of(mockPath));
    TaskInstanceMetrics metrics = mock(TaskInstanceMetrics.class);
    Timer checkpointTimer = mock(Timer.class);
    when(metrics.storeCheckpointNs()).thenReturn(checkpointTimer);
    java.util.Map<String, StorageEngine> taskStores = ImmutableMap.of("loggedPersistentStore", mockLPStore);
    Partition changelogPartition = new Partition(0);
    SystemStream changelogSystemStream = new SystemStream("changelogSystem", "changelogStream");
    SystemStreamPartition changelogSSP = new SystemStreamPartition(changelogSystemStream, changelogPartition);
    java.util.Map<String, SystemStream> storeChangelogsStreams = ImmutableMap.of("loggedPersistentStore", changelogSystemStream);
    StorageManagerUtil storageManagerUtil = mock(StorageManagerUtil.class);
    File tmpTestPath = new File("store-checkpoint-test");
    when(storageManagerUtil.getTaskStoreDir(eq(tmpTestPath), eq("loggedPersistentStore"), any(), any())).thenReturn(tmpTestPath);
    TaskName taskName = new TaskName("task");
    when(containerStorageManager.getAllStores(taskName)).thenReturn(taskStores);
    TaskStorageCommitManager commitManager = spy(new TaskStorageCommitManager(taskName, Collections.emptyMap(), containerStorageManager, storeChangelogsStreams, changelogPartition, null, null, ForkJoinPool.commonPool(), storageManagerUtil, tmpTestPath, metrics));
    when(storageManagerUtil.getStoreCheckpointDir(any(File.class), any(CheckpointId.class))).thenAnswer((Answer<String>) invocation -> {
        File file = invocation.getArgumentAt(0, File.class);
        CheckpointId checkpointId = invocation.getArgumentAt(1, CheckpointId.class);
        return file + "-" + checkpointId;
    });
    doAnswer(invocation -> {
        String fileDir = invocation.getArgumentAt(3, File.class).getName();
        SystemStreamPartition ssp = invocation.getArgumentAt(1, SystemStreamPartition.class);
        String offset = invocation.getArgumentAt(2, String.class);
        if (mockFileSystem.containsKey(fileDir)) {
            mockFileSystem.get(fileDir).put(ssp, offset);
        } else {
            Map<SystemStreamPartition, String> sspOffsets = new HashMap<>();
            sspOffsets.put(ssp, offset);
            mockFileSystem.put(fileDir, sspOffsets);
        }
        return null;
    }).when(commitManager).writeChangelogOffsetFile(any(), any(), any(), any());
    CheckpointId newCheckpointId = CheckpointId.create();
    String newestOffset = "1";
    KafkaChangelogSSPOffset kafkaChangelogSSPOffset = new KafkaChangelogSSPOffset(newCheckpointId, newestOffset);
    java.util.Map<SystemStreamPartition, String> offsetsJava = ImmutableMap.of(changelogSSP, kafkaChangelogSSPOffset.toString());
    commitManager.init();
    // invoke persist to file system for a v1 checkpoint (a CheckpointV1 is passed below)
    commitManager.writeCheckpointToStoreDirectories(new CheckpointV1(offsetsJava));
    assertEquals(2, mockFileSystem.size());
    // check if v2 offsets are written correctly
    String v2FilePath = storageManagerUtil.getStoreCheckpointDir(tmpTestPath, newCheckpointId);
    assertTrue(mockFileSystem.containsKey(v2FilePath));
    assertTrue(mockFileSystem.get(v2FilePath).containsKey(changelogSSP));
    assertEquals(1, mockFileSystem.get(v2FilePath).size());
    assertEquals(newestOffset, mockFileSystem.get(v2FilePath).get(changelogSSP));
    // check if v1 offsets are written correctly
    String v1FilePath = tmpTestPath.getPath();
    assertTrue(mockFileSystem.containsKey(v1FilePath));
    assertTrue(mockFileSystem.get(v1FilePath).containsKey(changelogSSP));
    assertEquals(1, mockFileSystem.get(v1FilePath).size());
    assertEquals(newestOffset, mockFileSystem.get(v1FilePath).get(changelogSSP));
}
Also used : CheckpointV2(org.apache.samza.checkpoint.CheckpointV2) HashMap(java.util.HashMap) CompletableFuture(java.util.concurrent.CompletableFuture) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Mockito.spy(org.mockito.Mockito.spy) CheckpointV1(org.apache.samza.checkpoint.CheckpointV1) Answer(org.mockito.stubbing.Answer) Mockito.doThrow(org.mockito.Mockito.doThrow) CheckpointManager(org.apache.samza.checkpoint.CheckpointManager) SystemStream(org.apache.samza.system.SystemStream) Map(java.util.Map) Mockito.doAnswer(org.mockito.Mockito.doAnswer) Assert.fail(org.junit.Assert.fail) Mockito.anyLong(org.mockito.Mockito.anyLong) Path(java.nio.file.Path) MapConfig(org.apache.samza.config.MapConfig) KafkaChangelogSSPOffset(org.apache.samza.checkpoint.kafka.KafkaChangelogSSPOffset) TaskInstanceMetrics(org.apache.samza.container.TaskInstanceMetrics) TaskName(org.apache.samza.container.TaskName) ImmutableMap(com.google.common.collect.ImmutableMap) Timer(org.apache.samza.metrics.Timer) Partition(org.apache.samza.Partition) Assert.assertTrue(org.junit.Assert.assertTrue) IOException(java.io.IOException) Checkpoint(org.apache.samza.checkpoint.Checkpoint) Test(org.junit.Test) Mockito.times(org.mockito.Mockito.times) Mockito.doNothing(org.mockito.Mockito.doNothing) Mockito.when(org.mockito.Mockito.when) File(java.io.File) SamzaException(org.apache.samza.SamzaException) CheckpointId(org.apache.samza.checkpoint.CheckpointId) Mockito.verify(org.mockito.Mockito.verify) Matchers.any(org.mockito.Matchers.any) TaskMode(org.apache.samza.job.model.TaskMode) Mockito.never(org.mockito.Mockito.never) FileFilter(java.io.FileFilter) Paths(java.nio.file.Paths) ForkJoinPool(java.util.concurrent.ForkJoinPool) Optional(java.util.Optional) Collections(java.util.Collections) Assert.assertEquals(org.junit.Assert.assertEquals) Mockito.eq(org.mockito.Mockito.eq) Mockito.mock(org.mockito.Mockito.mock)

Aggregations

CheckpointV1 (org.apache.samza.checkpoint.CheckpointV1): 22
HashMap (java.util.HashMap): 14
SystemStreamPartition (org.apache.samza.system.SystemStreamPartition): 13
Test (org.junit.Test): 13
Checkpoint (org.apache.samza.checkpoint.Checkpoint): 12
TaskName (org.apache.samza.container.TaskName): 12
Partition (org.apache.samza.Partition): 11
Map (java.util.Map): 10
ImmutableMap (com.google.common.collect.ImmutableMap): 9
SamzaException (org.apache.samza.SamzaException): 8
SystemStream (org.apache.samza.system.SystemStream): 8
KafkaChangelogSSPOffset (org.apache.samza.checkpoint.kafka.KafkaChangelogSSPOffset): 7
File (java.io.File): 6
CheckpointId (org.apache.samza.checkpoint.CheckpointId): 6
CheckpointManager (org.apache.samza.checkpoint.CheckpointManager): 6
CheckpointV2 (org.apache.samza.checkpoint.CheckpointV2): 6
MapConfig (org.apache.samza.config.MapConfig): 6
Collections (java.util.Collections): 5
CompletableFuture (java.util.concurrent.CompletableFuture): 5
TaskInstanceMetrics (org.apache.samza.container.TaskInstanceMetrics): 5