Example 1 with MasterDescription

use of com.netflix.titus.api.supervisor.service.MasterDescription in project titus-control-plane by Netflix.

the class LeaderResource method getLeader.

@GET
public LeaderRepresentation getLeader() {
    MasterDescription masterDescription = masterMonitor.getLatestLeader();
    LeaderRepresentation.Builder builder = LeaderRepresentation.newBuilder()
            .withHostname(masterDescription.getHostname())
            .withHostIP(masterDescription.getHostIP())
            .withApiPort(masterDescription.getApiPort())
            .withApiStatusUri(masterDescription.getApiStatusUri())
            .withCreateTime(masterDescription.getCreateTime());
    return builder.build();
}
Also used : MasterDescription(com.netflix.titus.api.supervisor.service.MasterDescription) LeaderRepresentation(com.netflix.titus.api.endpoint.v2.rest.representation.LeaderRepresentation) GET(javax.ws.rs.GET)

Example 2 with MasterDescription

use of com.netflix.titus.api.supervisor.service.MasterDescription in project titus-control-plane by Netflix.

the class EmbeddedTitusMaster method boot.

public EmbeddedTitusMaster boot() {
    Stopwatch timer = Stopwatch.createStarted();
    logger.info("Starting Titus Master");
    Module embeddedKubeModule;
    if (embeddedKubeCluster == null) {
        embeddedKubeModule = new AbstractModule() {

            @Override
            protected void configure() {
            }
        };
    } else {
        embeddedKubeModule = new EmbeddedKubeModule(embeddedKubeCluster);
    }
    injector = InjectorBuilder.fromModules(Modules.override(new TitusRuntimeModule(false)).with(new AbstractModule() {

        @Override
        protected void configure() {
            bind(Archaius2ConfigurationLogger.class).asEagerSingleton();
            bind(Registry.class).toInstance(new DefaultRegistry());
        }
    }), embeddedKubeModule, Modules.override(new TitusMasterModule(enableREST, TitusMasterModule.Mode.EMBEDDED_KUBE)).with(new AbstractModule() {

        @Override
        protected void configure() {
            bind(InstanceCloudConnector.class).toInstance(new NoOpInstanceCloudConnector());
            bind(MasterDescription.class).toInstance(masterDescription);
            bind(MasterMonitor.class).to(LocalMasterMonitor.class);
            bind(AppScalePolicyStore.class).to(InMemoryPolicyStore.class);
            bind(LoadBalancerStore.class).to(InMemoryLoadBalancerStore.class);
            bind(LoadBalancerConnector.class).to(NoOpLoadBalancerConnector.class);
            bind(LoadBalancerJobValidator.class).to(NoOpLoadBalancerJobValidator.class);
        }

        @Provides
        @Singleton
        public JobStore getJobStore(TitusRuntime titusRuntime) {
            if (!cassandraJobStore) {
                return jobStore;
            }
            try {
                JobStore jobStore = EmbeddedCassandraStoreFactory.newBuilder().withTitusRuntime(titusRuntime).build().getJobStore();
                return jobStore;
            } catch (Throwable e) {
                e.printStackTrace();
                return null;
            }
        }
    }), newJettyModule(), new ArchaiusModule() {

        @Override
        protected void configureArchaius() {
            bindApplicationConfigurationOverride().toInstance(config);
        }
    }).createInjector();
    if (grpcPort <= 0) {
        grpcPort = getGrpcPort();
        config.setProperty("titus.master.grpcServer.port", "" + grpcPort);
    }
    injector.getInstance(ContainerEventBus.class).submitInOrder(new ContainerEventBus.ContainerStartedEvent());
    injector.getInstance(LeaderActivator.class).becomeLeader();
    injector.getInstance(AuditLogService.class).auditLogEvents().subscribe(auditLogs::add);
    if (enableREST) {
        // The Jetty API server runs on a separate thread, so it may not be ready yet.
        // There is no better way to detect readiness than to call it until it replies.
        getClient().findAllApplicationSLA().retryWhen(attempts -> {
            return attempts.zipWith(Observable.range(1, 5), (n, i) -> i).flatMap(i -> {
                return Observable.timer(i, TimeUnit.SECONDS);
            });
        }).timeout(30, TimeUnit.SECONDS).toBlocking().firstOrDefault(null);
    }
    logger.info("Embedded TitusMaster started in {}ms", timer.elapsed(TimeUnit.MILLISECONDS));
    return this;
}
Also used : Module(com.google.inject.Module) AuditLogEvent(com.netflix.titus.api.audit.model.AuditLogEvent) LoadBalancerConnector(com.netflix.titus.api.connector.cloud.LoadBalancerConnector) LocalMasterMonitor(com.netflix.titus.master.supervisor.service.leader.LocalMasterMonitor) InjectorBuilder(com.netflix.governator.InjectorBuilder) ManagedChannel(io.grpc.ManagedChannel) EmbeddedKubeCluster(com.netflix.titus.testkit.embedded.kube.EmbeddedKubeCluster) Key(com.google.inject.Key) LoggerFactory(org.slf4j.LoggerFactory) NoOpLoadBalancerConnector(com.netflix.titus.api.connector.cloud.noop.NoOpLoadBalancerConnector) MasterDescription(com.netflix.titus.api.supervisor.service.MasterDescription) NoOpLoadBalancerJobValidator(com.netflix.titus.api.loadbalancer.model.sanitizer.NoOpLoadBalancerJobValidator) TitusMaster(com.netflix.titus.master.TitusMaster) TitusRuntimeModule(com.netflix.titus.master.TitusRuntimeModule) JobStore(com.netflix.titus.api.jobmanager.store.JobStore) LoadBalancerStore(com.netflix.titus.api.loadbalancer.store.LoadBalancerStore) AppScalePolicyStore(com.netflix.titus.api.appscale.store.AppScalePolicyStore) InMemoryJobStore(com.netflix.titus.runtime.store.v3.memory.InMemoryJobStore) MasterMonitor(com.netflix.titus.api.supervisor.service.MasterMonitor) ContainerEventBus(com.netflix.titus.common.util.guice.ContainerEventBus) ReactorTitusMasterClient(com.netflix.titus.testkit.client.ReactorTitusMasterClient) SupervisorServiceGrpc(com.netflix.titus.grpc.protogen.SupervisorServiceGrpc) ArchaiusSystemDisruptionBudgetResolver(com.netflix.titus.master.eviction.service.quota.system.ArchaiusSystemDisruptionBudgetResolver) InMemoryPolicyStore(com.netflix.titus.runtime.store.v3.memory.InMemoryPolicyStore) List(java.util.List) MachineServiceGrpc(com.netflix.titus.grpc.protogen.v4.MachineServiceGrpc) LifecycleInjector(com.netflix.governator.LifecycleInjector) SchedulerServiceGrpc(com.netflix.titus.grpc.protogen.SchedulerServiceGrpc) 
EmbeddedJettyModule(com.netflix.titus.runtime.endpoint.common.rest.EmbeddedJettyModule) EmbeddedCassandraStoreFactory(com.netflix.titus.ext.cassandra.testkit.store.EmbeddedCassandraStoreFactory) SupervisorServiceBlockingStub(com.netflix.titus.grpc.protogen.SupervisorServiceGrpc.SupervisorServiceBlockingStub) JobManagementServiceStub(com.netflix.titus.grpc.protogen.JobManagementServiceGrpc.JobManagementServiceStub) JettyModule(com.netflix.governator.guice.jetty.JettyModule) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) LeaderActivator(com.netflix.titus.api.supervisor.service.LeaderActivator) Stopwatch(com.google.common.base.Stopwatch) Modules(com.google.inject.util.Modules) LoadBalancerJobValidator(com.netflix.titus.api.loadbalancer.model.sanitizer.LoadBalancerJobValidator) HealthStub(com.netflix.titus.grpc.protogen.HealthGrpc.HealthStub) NoOpInstanceCloudConnector(com.netflix.titus.api.connector.cloud.noop.NoOpInstanceCloudConnector) Singleton(javax.inject.Singleton) TitusMasterGrpcServer(com.netflix.titus.master.endpoint.grpc.TitusMasterGrpcServer) Observable(rx.Observable) JobActivityHistoryServiceGrpc(com.netflix.titus.grpc.protogen.JobActivityHistoryServiceGrpc) EvictionServiceGrpc(com.netflix.titus.grpc.protogen.EvictionServiceGrpc) HealthGrpc(com.netflix.titus.grpc.protogen.HealthGrpc) InstanceCloudConnector(com.netflix.titus.api.connector.cloud.InstanceCloudConnector) AutoScalingServiceGrpc(com.netflix.titus.grpc.protogen.AutoScalingServiceGrpc) TitusMasterModule(com.netflix.titus.master.TitusMasterModule) ArchaiusModule(com.netflix.archaius.guice.ArchaiusModule) EmbeddedKubeModule(com.netflix.titus.testkit.embedded.kube.EmbeddedKubeModule) Properties(java.util.Properties) Logger(org.slf4j.Logger) TestKitGrpcClientErrorUtils(com.netflix.titus.testkit.grpc.TestKitGrpcClientErrorUtils) DefaultSettableConfig(com.netflix.archaius.config.DefaultSettableConfig) JobManagementServiceGrpc(com.netflix.titus.grpc.protogen.JobManagementServiceGrpc) 
TitusMasterClient(com.netflix.titus.testkit.client.TitusMasterClient) Archaius2ConfigurationLogger(com.netflix.titus.common.util.archaius2.Archaius2ConfigurationLogger) TimeUnit(java.util.concurrent.TimeUnit) ManagedChannelBuilder(io.grpc.ManagedChannelBuilder) ObjectMappers(com.netflix.titus.api.json.ObjectMappers) Provides(com.google.inject.Provides) LoadBalancerServiceGrpc(com.netflix.titus.grpc.protogen.LoadBalancerServiceGrpc) DefaultRegistry(com.netflix.spectator.api.DefaultRegistry) JobManagementServiceBlockingStub(com.netflix.titus.grpc.protogen.JobManagementServiceGrpc.JobManagementServiceBlockingStub) InMemoryLoadBalancerStore(com.netflix.titus.runtime.store.v3.memory.InMemoryLoadBalancerStore) Registry(com.netflix.spectator.api.Registry) Preconditions(com.google.common.base.Preconditions) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) SystemDisruptionBudgetDescriptor(com.netflix.titus.master.eviction.service.quota.system.SystemDisruptionBudgetDescriptor) AbstractModule(com.google.inject.AbstractModule) AuditLogService(com.netflix.titus.api.audit.service.AuditLogService)
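
The comment in boot() describes calling the REST API until it replies, and the retryWhen chain waits 1, 2, ... 5 seconds between retries under an overall 30-second timeout. A rough plain-Java sketch of the same idea, without RxJava, might look like the following. ReadinessProbe and awaitReady are hypothetical names, not part of Titus, and the overall timeout is left out for brevity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Titus: retries a call with a linearly growing delay.
final class ReadinessProbe {

    /**
     * Calls probe until it completes without throwing.
     * After the i-th failure (i = 1..maxRetries) it sleeps i * delayUnitMillis and tries again.
     * Returns true if a call succeeded, false if the retries are used up or the thread is interrupted.
     */
    static boolean awaitReady(Callable<?> probe, int maxRetries, long delayUnitMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                probe.call();
                return true;
            } catch (Exception e) {
                if (attempt == maxRetries) {
                    return false; // no retries left
                }
                try {
                    TimeUnit.MILLISECONDS.sleep((attempt + 1) * delayUnitMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return false;
                }
            }
        }
        return false;
    }
}
```

With maxRetries = 5 and delayUnitMillis = 1000 the waits between attempts are 1 to 5 seconds, which mirrors the Observable.range(1, 5) and Observable.timer(i, TimeUnit.SECONDS) combination above.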

Example 3 with MasterDescription

use of com.netflix.titus.api.supervisor.service.MasterDescription in project titus-control-plane by Netflix.

the class ZookeeperMasterMonitorTest method testMonitorWorksForMultipleLeaderUpdates.

@Test(timeout = 30_000)
public void testMonitorWorksForMultipleLeaderUpdates() throws Exception {
    // Note: we intentionally do not set an initial master description, to make sure
    // that the monitor works properly even if it fails occasionally (in this case, it fails to deserialize
    // the master description at the very beginning).
    ExtTestSubscriber<MasterDescription> leaderSubscriber = new ExtTestSubscriber<>();
    masterMonitor.getLeaderObservable().filter(Objects::nonNull).subscribe(leaderSubscriber);
    for (int i = 0; i < 5; i++) {
        curator.setData().forPath(zkPaths.getLeaderAnnouncementPath(), ObjectMappers.defaultMapper().writeValueAsBytes(newMasterDescription(i)));
        // Try a few times, as we can get update for the same entity more than once.
        for (int j = 0; j < 3; j++) {
            MasterDescription newLeader = leaderSubscriber.takeNext(5, TimeUnit.SECONDS);
            if (newLeader != null && newLeader.getApiPort() == i) {
                return;
            }
        }
        fail("Did not receive TitusMaster update for iteration " + i);
    }
}
Also used : MasterDescription(com.netflix.titus.api.supervisor.service.MasterDescription) ZookeeperTestUtils.newMasterDescription(com.netflix.titus.ext.zookeeper.ZookeeperTestUtils.newMasterDescription) ExtTestSubscriber(com.netflix.titus.testkit.rx.ExtTestSubscriber) Test(org.junit.Test) IntegrationNotParallelizableTest(com.netflix.titus.testkit.junit.category.IntegrationNotParallelizableTest)
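
The inner loop in this test takes a bounded number of items from the subscriber and skips updates that do not match, since the same entity can be delivered more than once. A minimal sketch of that pattern, using a java.util.concurrent.BlockingQueue in place of ExtTestSubscriber, is shown below. UpdateAwaiter and awaitMatch are hypothetical names, not part of the Titus testkit.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical test helper, not part of the Titus testkit.
final class UpdateAwaiter {

    /**
     * Polls the queue up to maxAttempts times, waiting at most timeoutMillis per poll,
     * and returns true as soon as an item satisfies the predicate. Non-matching items
     * (for example repeated notifications of an older value) are consumed and skipped.
     */
    static <T> boolean awaitMatch(BlockingQueue<T> updates, Predicate<T> expected,
                                  int maxAttempts, long timeoutMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T next;
            try {
                next = updates.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            if (next != null && expected.test(next)) {
                return true;
            }
        }
        return false;
    }
}
```

A caller could invoke such a helper once per outer-loop iteration and fail only when it returns false. That way every one of the five updates is checked, whereas the return statement in the test above leaves the test method on the first match.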

Example 4 with MasterDescription

use of com.netflix.titus.api.supervisor.service.MasterDescription in project titus-control-plane by Netflix.

the class ZkLeaderVerificator method setupZKLeaderVerification.

@PostConstruct
void setupZKLeaderVerification() {
    final String myHostname = System.getenv("EC2_PUBLIC_HOSTNAME");
    final String myLocalIP = System.getenv("EC2_LOCAL_IPV4");
    if (myHostname == null || myHostname.isEmpty()) {
        logger.warn("Did not find public hostname variable, OK if not running cloud");
        return;
    }
    if (myLocalIP == null || myLocalIP.isEmpty()) {
        logger.warn("Did not find local IP variable, OK if not running cloud");
        return;
    }
    logger.info("Setting up ZK leader verification with myHostname=" + myHostname + ", localIP=" + myLocalIP);
    long delay = 20;
    final AtomicReference<MasterDescription> ref = new AtomicReference<>();
    masterMonitor.getLeaderObservable().doOnNext(ref::set).subscribe();
    final AtomicInteger falseCount = new AtomicInteger(0);
    final int MAX_FALSE_COUNTS = 10;
    new ScheduledThreadPoolExecutor(1).scheduleWithFixedDelay(new Runnable() {

        @Override
        public void run() {
            boolean foundFault = false;
            try {
                if (leaderActivator.isLeader()) {
                    logger.info("I'm leader, masterDescription=" + ref.get());
                    if (ref.get() != null && !myHostname.equals(ref.get().getHostname()) && !myLocalIP.equals(ref.get().getHostname())) {
                        foundFault = true;
                        logger.warn("ZK says leader is " + ref.get().getHostname() + ", not us (" + myHostname + ")");
                        if (falseCount.incrementAndGet() > MAX_FALSE_COUNTS) {
                            logger.error("Too many attempts failed to verify ZK leader status, exiting!");
                            SystemExt.forcedProcessExit(5);
                        }
                    }
                }
            } catch (Exception e) {
                logger.warn("Error verifying leader status: " + e.getMessage(), e);
            }
            if (!foundFault) {
                falseCount.set(0);
            }
        }
    }, delay, delay, TimeUnit.SECONDS);
}
Also used : MasterDescription(com.netflix.titus.api.supervisor.service.MasterDescription) AtomicInteger(java.util.concurrent.atomic.AtomicInteger) ScheduledThreadPoolExecutor(java.util.concurrent.ScheduledThreadPoolExecutor) AtomicReference(java.util.concurrent.atomic.AtomicReference) PostConstruct(javax.annotation.PostConstruct)
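
The scheduled task above tolerates transient disagreement with ZooKeeper: each mismatch increments falseCount, any clean check resets it, and the process exits only once the count exceeds MAX_FALSE_COUNTS. The counting rule on its own can be sketched as follows. MismatchCounter is a hypothetical class written for illustration, not part of Titus.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Titus: tracks consecutive leader mismatches.
final class MismatchCounter {
    private final int maxConsecutive;
    private final AtomicInteger consecutive = new AtomicInteger(0);

    MismatchCounter(int maxConsecutive) {
        this.maxConsecutive = maxConsecutive;
    }

    /**
     * Records the outcome of one check. A mismatch increments the counter, a match resets it.
     * Returns true when more than maxConsecutive mismatches have occurred in a row.
     */
    boolean record(boolean mismatch) {
        if (!mismatch) {
            consecutive.set(0);
            return false;
        }
        return consecutive.incrementAndGet() > maxConsecutive;
    }
}
```

With maxConsecutive = 10, as in MAX_FALSE_COUNTS, record returns true on the 11th consecutive mismatch, which corresponds to the point where the original code calls SystemExt.forcedProcessExit(5).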

Aggregations

MasterDescription (com.netflix.titus.api.supervisor.service.MasterDescription): 4
Preconditions (com.google.common.base.Preconditions): 1
Stopwatch (com.google.common.base.Stopwatch): 1
AbstractModule (com.google.inject.AbstractModule): 1
Key (com.google.inject.Key): 1
Module (com.google.inject.Module): 1
Provides (com.google.inject.Provides): 1
Modules (com.google.inject.util.Modules): 1
DefaultSettableConfig (com.netflix.archaius.config.DefaultSettableConfig): 1
ArchaiusModule (com.netflix.archaius.guice.ArchaiusModule): 1
InjectorBuilder (com.netflix.governator.InjectorBuilder): 1
LifecycleInjector (com.netflix.governator.LifecycleInjector): 1
JettyModule (com.netflix.governator.guice.jetty.JettyModule): 1
DefaultRegistry (com.netflix.spectator.api.DefaultRegistry): 1
Registry (com.netflix.spectator.api.Registry): 1
AppScalePolicyStore (com.netflix.titus.api.appscale.store.AppScalePolicyStore): 1
AuditLogEvent (com.netflix.titus.api.audit.model.AuditLogEvent): 1
AuditLogService (com.netflix.titus.api.audit.service.AuditLogService): 1
InstanceCloudConnector (com.netflix.titus.api.connector.cloud.InstanceCloudConnector): 1
LoadBalancerConnector (com.netflix.titus.api.connector.cloud.LoadBalancerConnector): 1