Example 1 with Shard

Use of com.amazonaws.services.kinesis.model.Shard in project storm by apache.

The class KinesisConnection, method getShardsForStream.

List<Shard> getShardsForStream(String stream) {
    DescribeStreamRequest describeStreamRequest = new DescribeStreamRequest();
    describeStreamRequest.setStreamName(stream);
    List<Shard> shards = new ArrayList<>();
    String exclusiveStartShardId = null;
    do {
        describeStreamRequest.setExclusiveStartShardId(exclusiveStartShardId);
        DescribeStreamResult describeStreamResult = kinesisClient.describeStream(describeStreamRequest);
        shards.addAll(describeStreamResult.getStreamDescription().getShards());
        if (describeStreamResult.getStreamDescription().getHasMoreShards() && shards.size() > 0) {
            exclusiveStartShardId = shards.get(shards.size() - 1).getShardId();
        } else {
            exclusiveStartShardId = null;
        }
    } while (exclusiveStartShardId != null);
    LOG.info("Number of shards for stream " + stream + " are " + shards.size());
    return shards;
}
Also used : ArrayList(java.util.ArrayList) DescribeStreamRequest(com.amazonaws.services.kinesis.model.DescribeStreamRequest) Shard(com.amazonaws.services.kinesis.model.Shard) DescribeStreamResult(com.amazonaws.services.kinesis.model.DescribeStreamResult)
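
The loop in Example 1 pages through the shard list by passing the last shard ID seen so far as the exclusive start of the next request. Below is a minimal sketch of that pattern with no AWS dependency. The `Page` class and the `describe` method are hypothetical stand-ins for `DescribeStreamResult` and `kinesisClient.describeStream`; they are not part of the AWS SDK or of Storm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardPagingSketch {

    /** Hypothetical stand-in for one page of a DescribeStream response. */
    static final class Page {
        final List<String> shardIds;
        final boolean hasMoreShards;

        Page(List<String> shardIds, boolean hasMoreShards) {
            this.shardIds = shardIds;
            this.hasMoreShards = hasMoreShards;
        }
    }

    /**
     * Hypothetical stand-in for the client call: returns at most pageSize
     * shard IDs that come strictly after exclusiveStart (null means "from the beginning").
     */
    static Page describe(List<String> allShardIds, String exclusiveStart, int pageSize) {
        int from = exclusiveStart == null ? 0 : allShardIds.indexOf(exclusiveStart) + 1;
        int to = Math.min(from + pageSize, allShardIds.size());
        return new Page(new ArrayList<>(allShardIds.subList(from, to)), to < allShardIds.size());
    }

    /** Collects every shard ID by following the exclusive-start cursor until no pages remain. */
    static List<String> listAll(List<String> allShardIds, int pageSize) {
        List<String> shards = new ArrayList<>();
        String exclusiveStart = null;
        do {
            Page page = describe(allShardIds, exclusiveStart, pageSize);
            shards.addAll(page.shardIds);
            // Continue only if the service reports more shards and we have a cursor to resume from.
            exclusiveStart = (page.hasMoreShards && !shards.isEmpty())
                    ? shards.get(shards.size() - 1)
                    : null;
        } while (exclusiveStart != null);
        return shards;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("shard-a", "shard-b", "shard-c", "shard-d", "shard-e");
        System.out.println(listAll(all, 2));
    }
}
```

The `!shards.isEmpty()` guard mirrors the `shards.size() > 0` check in the original: without a last shard ID there is nothing to resume from, so the loop stops rather than repeating the same request.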

Example 2 with Shard

Use of com.amazonaws.services.kinesis.model.Shard in project flink by apache.

The class KinesisDataFetcherTest, method testStreamToLastSeenShardStateIsCorrectlySetWhenNoNewShardsSinceRestoredCheckpoint.

@Test
public void testStreamToLastSeenShardStateIsCorrectlySetWhenNoNewShardsSinceRestoredCheckpoint() throws Exception {
    List<String> fakeStreams = new LinkedList<>();
    fakeStreams.add("fakeStream1");
    fakeStreams.add("fakeStream2");
    Map<KinesisStreamShard, String> restoredStateUnderTest = new HashMap<>();
    // fakeStream1 has 3 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(2))), UUID.randomUUID().toString());
    // fakeStream2 has 2 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    Map<String, Integer> streamToShardCount = new HashMap<>();
    // fakeStream1 will still have 3 shards after restore
    streamToShardCount.put("fakeStream1", 3);
    // fakeStream2 will still have 2 shards after restore
    streamToShardCount.put("fakeStream2", 2);
    HashMap<String, String> subscribedStreamsToLastSeenShardIdsUnderTest = KinesisDataFetcher.createInitialSubscribedStreamsToLastDiscoveredShardsState(fakeStreams);
    final TestableKinesisDataFetcher fetcher = new TestableKinesisDataFetcher(fakeStreams, new Properties(), 10, 2, new AtomicReference<Throwable>(), new LinkedList<KinesisStreamShardState>(), subscribedStreamsToLastSeenShardIdsUnderTest, FakeKinesisBehavioursFactory.nonReshardedStreamsBehaviour(streamToShardCount));
    for (Map.Entry<KinesisStreamShard, String> restoredState : restoredStateUnderTest.entrySet()) {
        fetcher.advanceLastDiscoveredShardOfStream(restoredState.getKey().getStreamName(), restoredState.getKey().getShard().getShardId());
        fetcher.registerNewSubscribedShardState(new KinesisStreamShardState(restoredState.getKey(), new SequenceNumber(restoredState.getValue())));
    }
    fetcher.setIsRestoringFromFailure(true);
    PowerMockito.whenNew(ShardConsumer.class).withAnyArguments().thenReturn(Mockito.mock(ShardConsumer.class));
    Thread runFetcherThread = new Thread(new Runnable() {

        @Override
        public void run() {
            try {
                fetcher.runFetcher();
            } catch (Exception e) {
                // exceptions from the fetcher thread are deliberately ignored in this test
            }
        }
    });
    runFetcherThread.start();
    // sleep a while before closing
    Thread.sleep(1000);
    fetcher.shutdownFetcher();
    // assert that the streams tracked in the state are identical to the subscribed streams
    Set<String> streamsInState = subscribedStreamsToLastSeenShardIdsUnderTest.keySet();
    assertTrue(streamsInState.size() == fakeStreams.size());
    assertTrue(streamsInState.containsAll(fakeStreams));
    // assert that the last seen shards in state is correctly set
    for (Map.Entry<String, String> streamToLastSeenShard : subscribedStreamsToLastSeenShardIdsUnderTest.entrySet()) {
        assertTrue(streamToLastSeenShard.getValue().equals(KinesisShardIdGenerator.generateFromShardOrder(streamToShardCount.get(streamToLastSeenShard.getKey()) - 1)));
    }
}
Also used : HashMap(java.util.HashMap) KinesisStreamShard(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShard) Properties(java.util.Properties) SequenceNumber(org.apache.flink.streaming.connectors.kinesis.model.SequenceNumber) TestableKinesisDataFetcher(org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher) LinkedList(java.util.LinkedList) KinesisStreamShardState(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShardState) Shard(com.amazonaws.services.kinesis.model.Shard) Map(java.util.Map) Test(org.junit.Test) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest)
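
The final assertion in this test expects the last seen shard of each stream to be the shard at order `count - 1`. The sketch below shows that relationship in isolation. The `shardIdFromOrder` helper is hypothetical and only assumes that shard IDs follow the zero-padded `shardId-000000000000` pattern; it is not claimed to be the implementation of `KinesisShardIdGenerator.generateFromShardOrder`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LastSeenShardSketch {

    /** Hypothetical helper: assumes IDs of the form "shardId-" plus a 12-digit zero-padded order. */
    static String shardIdFromOrder(int order) {
        return String.format("shardId-%012d", order);
    }

    /**
     * Maps each stream to the ID of its highest-ordered shard,
     * or to null when the stream has no shards at all.
     */
    static Map<String, String> expectedLastSeen(Map<String, Integer> streamToShardCount) {
        Map<String, String> lastSeen = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : streamToShardCount.entrySet()) {
            int count = entry.getValue();
            lastSeen.put(entry.getKey(), count > 0 ? shardIdFromOrder(count - 1) : null);
        }
        return lastSeen;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("fakeStream1", 3);
        counts.put("fakeStream2", 2);
        counts.put("fakeStream3", 0);
        System.out.println(expectedLastSeen(counts));
    }
}
```

Shard orders are zero-based, which is why a stream with 3 shards is expected to end at order 2, and why streams without shards keep a null entry.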

Example 3 with Shard

Use of com.amazonaws.services.kinesis.model.Shard in project flink by apache.

The class KinesisDataFetcherTest, method testStreamToLastSeenShardStateIsCorrectlySetWhenNoNewShardsSinceRestoredCheckpointAndSomeStreamsDoNotExist.

@Test
public void testStreamToLastSeenShardStateIsCorrectlySetWhenNoNewShardsSinceRestoredCheckpointAndSomeStreamsDoNotExist() throws Exception {
    List<String> fakeStreams = new LinkedList<>();
    fakeStreams.add("fakeStream1");
    fakeStreams.add("fakeStream2");
    // fakeStream3 will not have any shards
    fakeStreams.add("fakeStream3");
    // fakeStream4 will not have any shards
    fakeStreams.add("fakeStream4");
    Map<KinesisStreamShard, String> restoredStateUnderTest = new HashMap<>();
    // fakeStream1 has 3 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(2))), UUID.randomUUID().toString());
    // fakeStream2 has 2 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    Map<String, Integer> streamToShardCount = new HashMap<>();
    // fakeStream1 has fixed 3 shards
    streamToShardCount.put("fakeStream1", 3);
    // fakeStream2 has fixed 2 shards
    streamToShardCount.put("fakeStream2", 2);
    // no shards can be found for fakeStream3
    streamToShardCount.put("fakeStream3", 0);
    // no shards can be found for fakeStream4
    streamToShardCount.put("fakeStream4", 0);
    HashMap<String, String> subscribedStreamsToLastSeenShardIdsUnderTest = KinesisDataFetcher.createInitialSubscribedStreamsToLastDiscoveredShardsState(fakeStreams);
    // using a non-resharded streams kinesis behaviour to represent that Kinesis is not resharded AFTER the restore
    final TestableKinesisDataFetcher fetcher = new TestableKinesisDataFetcher(fakeStreams, new Properties(), 10, 2, new AtomicReference<Throwable>(), new LinkedList<KinesisStreamShardState>(), subscribedStreamsToLastSeenShardIdsUnderTest, FakeKinesisBehavioursFactory.nonReshardedStreamsBehaviour(streamToShardCount));
    for (Map.Entry<KinesisStreamShard, String> restoredState : restoredStateUnderTest.entrySet()) {
        fetcher.advanceLastDiscoveredShardOfStream(restoredState.getKey().getStreamName(), restoredState.getKey().getShard().getShardId());
        fetcher.registerNewSubscribedShardState(new KinesisStreamShardState(restoredState.getKey(), new SequenceNumber(restoredState.getValue())));
    }
    fetcher.setIsRestoringFromFailure(true);
    PowerMockito.whenNew(ShardConsumer.class).withAnyArguments().thenReturn(Mockito.mock(ShardConsumer.class));
    Thread runFetcherThread = new Thread(new Runnable() {

        @Override
        public void run() {
            try {
                fetcher.runFetcher();
            } catch (Exception e) {
                // exceptions from the fetcher thread are deliberately ignored in this test
            }
        }
    });
    runFetcherThread.start();
    // sleep a while before closing
    Thread.sleep(1000);
    fetcher.shutdownFetcher();
    // assert that the streams tracked in the state are identical to the subscribed streams
    Set<String> streamsInState = subscribedStreamsToLastSeenShardIdsUnderTest.keySet();
    assertTrue(streamsInState.size() == fakeStreams.size());
    assertTrue(streamsInState.containsAll(fakeStreams));
    // assert that the last seen shards in state is correctly set
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream1").equals(KinesisShardIdGenerator.generateFromShardOrder(2)));
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream2").equals(KinesisShardIdGenerator.generateFromShardOrder(1)));
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream3") == null);
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream4") == null);
}
Also used : HashMap(java.util.HashMap) KinesisStreamShard(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShard) Properties(java.util.Properties) SequenceNumber(org.apache.flink.streaming.connectors.kinesis.model.SequenceNumber) TestableKinesisDataFetcher(org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher) LinkedList(java.util.LinkedList) KinesisStreamShardState(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShardState) Shard(com.amazonaws.services.kinesis.model.Shard) Map(java.util.Map) Test(org.junit.Test) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest)

Example 4 with Shard

Use of com.amazonaws.services.kinesis.model.Shard in project flink by apache.

The class KinesisDataFetcherTest, method testStreamToLastSeenShardStateIsCorrectlySetWhenNewShardsFoundSinceRestoredCheckpointAndSomeStreamsDoNotExist.

@Test
public void testStreamToLastSeenShardStateIsCorrectlySetWhenNewShardsFoundSinceRestoredCheckpointAndSomeStreamsDoNotExist() throws Exception {
    List<String> fakeStreams = new LinkedList<>();
    fakeStreams.add("fakeStream1");
    fakeStreams.add("fakeStream2");
    // fakeStream3 will not have any shards
    fakeStreams.add("fakeStream3");
    // fakeStream4 will not have any shards
    fakeStreams.add("fakeStream4");
    Map<KinesisStreamShard, String> restoredStateUnderTest = new HashMap<>();
    // fakeStream1 has 3 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream1", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(2))), UUID.randomUUID().toString());
    // fakeStream2 has 2 shards before restore
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0))), UUID.randomUUID().toString());
    restoredStateUnderTest.put(new KinesisStreamShard("fakeStream2", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(1))), UUID.randomUUID().toString());
    Map<String, Integer> streamToShardCount = new HashMap<>();
    // fakeStream1 had 3 shards before & 1 new shard after restore
    streamToShardCount.put("fakeStream1", 3 + 1);
    // fakeStream2 had 2 shards before & 3 new shards after restore
    streamToShardCount.put("fakeStream2", 2 + 3);
    // no shards can be found for fakeStream3
    streamToShardCount.put("fakeStream3", 0);
    // no shards can be found for fakeStream4
    streamToShardCount.put("fakeStream4", 0);
    HashMap<String, String> subscribedStreamsToLastSeenShardIdsUnderTest = KinesisDataFetcher.createInitialSubscribedStreamsToLastDiscoveredShardsState(fakeStreams);
    // using a non-resharded streams kinesis behaviour to represent that Kinesis is not resharded AFTER the restore
    final TestableKinesisDataFetcher fetcher = new TestableKinesisDataFetcher(fakeStreams, new Properties(), 10, 2, new AtomicReference<Throwable>(), new LinkedList<KinesisStreamShardState>(), subscribedStreamsToLastSeenShardIdsUnderTest, FakeKinesisBehavioursFactory.nonReshardedStreamsBehaviour(streamToShardCount));
    for (Map.Entry<KinesisStreamShard, String> restoredState : restoredStateUnderTest.entrySet()) {
        fetcher.advanceLastDiscoveredShardOfStream(restoredState.getKey().getStreamName(), restoredState.getKey().getShard().getShardId());
        fetcher.registerNewSubscribedShardState(new KinesisStreamShardState(restoredState.getKey(), new SequenceNumber(restoredState.getValue())));
    }
    fetcher.setIsRestoringFromFailure(true);
    PowerMockito.whenNew(ShardConsumer.class).withAnyArguments().thenReturn(Mockito.mock(ShardConsumer.class));
    Thread runFetcherThread = new Thread(new Runnable() {

        @Override
        public void run() {
            try {
                fetcher.runFetcher();
            } catch (Exception e) {
                // exceptions from the fetcher thread are deliberately ignored in this test
            }
        }
    });
    runFetcherThread.start();
    // sleep a while before closing
    Thread.sleep(1000);
    fetcher.shutdownFetcher();
    // assert that the streams tracked in the state are identical to the subscribed streams
    Set<String> streamsInState = subscribedStreamsToLastSeenShardIdsUnderTest.keySet();
    assertTrue(streamsInState.size() == fakeStreams.size());
    assertTrue(streamsInState.containsAll(fakeStreams));
    // assert that the last seen shards in state is correctly set
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream1").equals(KinesisShardIdGenerator.generateFromShardOrder(3)));
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream2").equals(KinesisShardIdGenerator.generateFromShardOrder(4)));
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream3") == null);
    assertTrue(subscribedStreamsToLastSeenShardIdsUnderTest.get("fakeStream4") == null);
}
Also used : HashMap(java.util.HashMap) KinesisStreamShard(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShard) Properties(java.util.Properties) SequenceNumber(org.apache.flink.streaming.connectors.kinesis.model.SequenceNumber) TestableKinesisDataFetcher(org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher) LinkedList(java.util.LinkedList) KinesisStreamShardState(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShardState) Shard(com.amazonaws.services.kinesis.model.Shard) Map(java.util.Map) Test(org.junit.Test) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest)

Example 5 with Shard

Use of com.amazonaws.services.kinesis.model.Shard in project flink by apache.

The class ShardConsumerTest, method testCorrectNumOfCollectedRecordsAndUpdatedStateWithUnexpectedExpiredIterator.

@Test
public void testCorrectNumOfCollectedRecordsAndUpdatedStateWithUnexpectedExpiredIterator() {
    KinesisStreamShard fakeToBeConsumedShard = new KinesisStreamShard("fakeStream", new Shard().withShardId(KinesisShardIdGenerator.generateFromShardOrder(0)).withHashKeyRange(new HashKeyRange().withStartingHashKey("0").withEndingHashKey(new BigInteger(StringUtils.repeat("FF", 16), 16).toString())));
    LinkedList<KinesisStreamShardState> subscribedShardsStateUnderTest = new LinkedList<>();
    subscribedShardsStateUnderTest.add(new KinesisStreamShardState(fakeToBeConsumedShard, new SequenceNumber("fakeStartingState")));
    TestableKinesisDataFetcher fetcher = new TestableKinesisDataFetcher(Collections.singletonList("fakeStream"), new Properties(), 10, 2, new AtomicReference<Throwable>(), subscribedShardsStateUnderTest, KinesisDataFetcher.createInitialSubscribedStreamsToLastDiscoveredShardsState(Collections.singletonList("fakeStream")), Mockito.mock(KinesisProxyInterface.class));
    new ShardConsumer<>(fetcher, 0, subscribedShardsStateUnderTest.get(0).getKinesisStreamShard(), subscribedShardsStateUnderTest.get(0).getLastProcessedSequenceNum(),
        // serve a total of 1000 records over 9 getRecords() calls,
        // and the 7th getRecords() call will encounter an unexpected expired shard iterator
        FakeKinesisBehavioursFactory.totalNumOfRecordsAfterNumOfGetRecordsCallsWithUnexpectedExpiredIterator(1000, 9, 7)).run();
    assertTrue(fetcher.getNumOfElementsCollected() == 1000);
    assertTrue(subscribedShardsStateUnderTest.get(0).getLastProcessedSequenceNum().equals(SentinelSequenceNumber.SENTINEL_SHARD_ENDING_SEQUENCE_NUM.get()));
}
Also used : KinesisStreamShard(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShard) Properties(java.util.Properties) HashKeyRange(com.amazonaws.services.kinesis.model.HashKeyRange) LinkedList(java.util.LinkedList) SentinelSequenceNumber(org.apache.flink.streaming.connectors.kinesis.model.SentinelSequenceNumber) SequenceNumber(org.apache.flink.streaming.connectors.kinesis.model.SequenceNumber) BigInteger(java.math.BigInteger) KinesisStreamShardState(org.apache.flink.streaming.connectors.kinesis.model.KinesisStreamShardState) KinesisProxyInterface(org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxyInterface) Shard(com.amazonaws.services.kinesis.model.Shard) TestableKinesisDataFetcher(org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher) Test(org.junit.Test)
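
The fake shard in Example 5 is given a hash key range from "0" to the decimal form of sixteen "FF" bytes, which is the largest 128-bit unsigned value. The sketch below reproduces that computation with only the JDK, replacing `StringUtils.repeat` with a plain loop. The class and method names are illustrative and do not come from Flink.

```java
import java.math.BigInteger;

public class HashKeyRangeSketch {

    /** Builds the value of sixteen 0xFF bytes (32 hex 'F' digits), i.e. 128 one-bits. */
    static BigInteger maxHashKey() {
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 16; i++) {
            hex.append("FF");
        }
        return new BigInteger(hex.toString(), 16);
    }

    public static void main(String[] args) {
        BigInteger max = maxHashKey();
        // The same value expressed arithmetically: 2^128 - 1.
        BigInteger expected = BigInteger.ONE.shiftLeft(128).subtract(BigInteger.ONE);
        System.out.println(max);
        System.out.println(max.equals(expected)); // prints "true"
    }
}
```

A single shard covering 0 through 2^128 - 1 owns the whole hash key space, so every partition key would map to it.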
