Example 6 with UncaughtExceptionHandler

use of io.cdap.cdap.common.logging.common.UncaughtExceptionHandler in project cdap by caskdata.

the class DaemonMain method doMain.

/**
 * The main method. It simply calls methods in the same sequence
 * as if the program were started by jsvc.
 */
protected void doMain(final String[] args) throws Exception {
    try {
    } catch (Throwable t) {
        LOG.error("Exception raised when calling init", t);
        try {
        } catch (Throwable t2) {
            LOG.error("Exception raised when calling destroy", t2);
        // Throw to terminate the main thread
        throw t;
    CountDownLatch shutdownLatch = new CountDownLatch(1);
    AtomicBoolean terminated = new AtomicBoolean();
    Runnable terminateRunnable = () -> {
        if (!terminated.compareAndSet(false, true)) {
        try {
            try {
            } finally {
                try {
                } finally {
        } catch (Throwable t) {
            LOG.error("Exception when shutting down: " + t.getMessage(), t);
    Runtime.getRuntime().addShutdownHook(new Thread(terminateRunnable));
    try {
    } catch (Throwable t) {
        // Throw to terminate the main thread
        LOG.error("Exception raised when calling start", t);
        throw t;
    // Set the uncaught exception handler only after startup, so that an exception thrown during
    // startup is logged as an error (the handler logs it at debug level)
    Thread.setDefaultUncaughtExceptionHandler(new UncaughtExceptionHandler());
Also used : AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) CountDownLatch(java.util.concurrent.CountDownLatch) UncaughtExceptionHandler(io.cdap.cdap.common.logging.common.UncaughtExceptionHandler)
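
The excerpt above guards its terminate logic with AtomicBoolean.compareAndSet so that the shutdown hook and any other caller cannot run it twice. The bodies of the try/finally blocks are elided in the excerpt, so the following is only a minimal sketch of that pattern, not the DaemonMain implementation. The class name IdempotentShutdownSketch, the stopLogic parameter, and the choice to count down the latch in a finally block are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentShutdownSketch {

    /**
     * Returns a Runnable that executes stopLogic at most once, no matter how often it is run.
     * stopLogic is a hypothetical stand-in for the elided stop/destroy calls.
     */
    public static Runnable newTerminateRunnable(Runnable stopLogic, CountDownLatch shutdownLatch) {
        AtomicBoolean terminated = new AtomicBoolean();
        return () -> {
            // Only the first caller flips false -> true; every later caller returns here.
            if (!terminated.compareAndSet(false, true)) {
                return;
            }
            try {
                stopLogic.run();
            } finally {
                // Assumption: the latch releases whoever is waiting for shutdown to finish.
                shutdownLatch.countDown();
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger stopCalls = new AtomicInteger();
        CountDownLatch shutdownLatch = new CountDownLatch(1);
        Runnable terminate = newTerminateRunnable(stopCalls::incrementAndGet, shutdownLatch);

        // Registered as a JVM shutdown hook, as in the excerpt.
        Runtime.getRuntime().addShutdownHook(new Thread(terminate));

        terminate.run();
        terminate.run(); // second call is a no-op
        shutdownLatch.await();
        System.out.println("stop logic ran " + stopCalls.get() + " time(s)");
    }
}
```

Because the guard is a single atomic compare-and-set, the shutdown hook that fires at JVM exit also finds the flag already set and does nothing.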

Example 7 with UncaughtExceptionHandler

use of io.cdap.cdap.common.logging.common.UncaughtExceptionHandler in project cdap by caskdata.

the class SparkContainerLauncher method launch.

/**
 * Launches the given main class. The main class will be loaded through the {@link SparkContainerClassLoader}.
 * @param mainClassName the main class to launch
 * @param args arguments for the main class
 * @param removeMainClass whether to remove the jar for the main class from the classloader
 * @param masterEnvName name of the MasterEnvironment used to submit the Spark job. This will be used to set up
 *   bindings for service discovery and other CDAP capabilities. If null, the default Hadoop implementations will
 *   be used.
 */
public static void launch(String mainClassName, String[] args, boolean removeMainClass, @Nullable String masterEnvName) throws Exception {
    Thread.setDefaultUncaughtExceptionHandler(new UncaughtExceptionHandler());
    ClassLoader systemClassLoader = ClassLoader.getSystemClassLoader();
    Set<URL> urls = ClassLoaders.getClassLoaderURLs(systemClassLoader, new LinkedHashSet<URL>());
    // method call from the container launch script.
    if (removeMainClass) {
        urls.remove(getURLByClass(systemClassLoader, mainClassName));
    // Remove the first scala from the set of classpath. This ensures the one from Spark is used for spark
    removeNonSparkJar(systemClassLoader, "scala.language", urls);
    // Remove the first jar containing LZ4BlockInputStream from the set of classpath.
    // The one from Kafka is not compatible with Spark
    removeNonSparkJar(systemClassLoader, "net.jpountz.lz4.LZ4BlockInputStream", urls);
    // First create a FilterClassLoader that only loads JVM and kafka classes from the system classloader
    // This is to isolate the scala library from children
    ClassLoader parentClassLoader = new FilterClassLoader(systemClassLoader, KAFKA_FILTER);
    boolean rewriteCheckpointTempFileName = Boolean.parseBoolean(System.getProperty(SparkRuntimeUtils.STREAMING_CHECKPOINT_REWRITE_ENABLED, "false"));
    // Creates the SparkRunnerClassLoader for class rewriting and it will be used for the rest of the execution.
    // Use the extension classloader as the parent instead of the system classloader because
    // Spark classes are in the system classloader which we want to rewrite.
    ClassLoader classLoader = new SparkContainerClassLoader(urls.toArray(new URL[0]), parentClassLoader, rewriteCheckpointTempFileName);
    // Sets the context classloader and launch the actual Spark main class.
    // Create SLF4J logger from the context classloader. It has to be created from that classloader in order
    // for logs in this class to be in the same context as the one used in Spark.
    Object logger = createLogger(classLoader);
    // Install the JUL to SLF4J Bridge
    try {
    } catch (Exception e) {
        // Log the error and continue
        log(logger, "warn", "Failed to invoke SLF4JBridgeHandler.install() required for jul-to-slf4j bridge", e);
    // Get the SparkRuntimeContext to initialize all necessary services and logging context
    // Need to do it using the SparkRunnerClassLoader through reflection.
    Class<?> sparkRuntimeContextProviderClass = classLoader.loadClass(SparkRuntimeContextProvider.class.getName());
    if (masterEnvName != null) {
        sparkRuntimeContextProviderClass.getMethod("setMasterEnvName", String.class).invoke(null, masterEnvName);
    Object sparkRuntimeContext = sparkRuntimeContextProviderClass.getMethod("get").invoke(null);
    if (sparkRuntimeContext instanceof Closeable) {
        System.setSecurityManager(new SparkRuntimeSecurityManager((Closeable) sparkRuntimeContext));
    try {
        // in the PythonRunner/PythonWorkerFactory via SparkClassRewriter.
        if (!isPySpark()) {
            // Invoke StandardOutErrorRedirector.redirectToLogger()
            classLoader.loadClass(StandardOutErrorRedirector.class.getName()).getDeclaredMethod("redirectToLogger", String.class).invoke(null, mainClassName);
        // which causes executor logs to attempt to write to the driver log directory
        if (System.getProperty("spark.executorEnv.CDAP_LOG_DIR") != null) {
            System.setProperty("spark.executorEnv.CDAP_LOG_DIR", "<LOG_DIR>");
        // Optionally starts Py4j Gateway server in the executor container
        Runnable stopGatewayServer = startGatewayServerIfNeeded(classLoader, logger);
        try {
            log(logger, "info", "Launch main class {}.main({})", mainClassName, Arrays.toString(args));
            classLoader.loadClass(mainClassName).getMethod("main", String[].class).invoke(null, new Object[] { args });
            log(logger, "info", "Main method returned {}", mainClassName);
        } finally {
    } catch (Throwable t) {
        // LOG the exception since this exception will be propagated back to JVM
        // and kill the main thread (hence the JVM process).
        // If we don't log it here as ERROR, it will be logged by UncaughtExceptionHandler as DEBUG level
        log(logger, "error", "Exception raised when calling {}.main(String[]) method", mainClassName, t);
        throw t;
    } finally {
        if (sparkRuntimeContext instanceof Closeable) {
            Closeables.closeQuietly((Closeable) sparkRuntimeContext);
Also used : FilterClassLoader(io.cdap.cdap.common.lang.FilterClassLoader) Closeable( URL( URISyntaxException( MalformedURLException( IOException( SLF4JBridgeHandler(org.slf4j.bridge.SLF4JBridgeHandler) SparkRuntimeContextProvider( SparkContainerClassLoader( StandardOutErrorRedirector(io.cdap.cdap.common.logging.StandardOutErrorRedirector) SparkContainerClassLoader( FilterClassLoader(io.cdap.cdap.common.lang.FilterClassLoader) UncaughtExceptionHandler(io.cdap.cdap.common.logging.common.UncaughtExceptionHandler)
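
The launch method above invokes the target's main method reflectively through a chosen ClassLoader. The core of that call can be sketched in isolation as follows. This is an illustration under stated assumptions rather than the SparkContainerLauncher code: the class ReflectiveMainLauncherSketch and its nested Target class are hypothetical, the ordinary application classloader stands in for SparkContainerClassLoader, and reflection failures are wrapped in an unchecked exception for brevity where the excerpt rethrows.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class ReflectiveMainLauncherSketch {

    /** Hypothetical target whose main method records the arguments it received. */
    public static class Target {
        public static volatile String lastArgs;

        public static void main(String[] args) {
            lastArgs = Arrays.toString(args);
        }
    }

    /** Loads mainClassName through classLoader and calls its static main(String[]). */
    public static void launch(ClassLoader classLoader, String mainClassName, String[] args) {
        try {
            Class<?> mainClass = classLoader.loadClass(mainClassName);
            Method main = mainClass.getMethod("main", String[].class);
            // Wrap args in an Object[] so the String[] is passed as ONE argument
            // instead of being spread across the varargs parameter of invoke.
            main.invoke(null, new Object[] { args });
        } catch (ReflectiveOperationException e) {
            // Simplification for this sketch only.
            throw new IllegalStateException("Failed to launch " + mainClassName, e);
        }
    }

    public static void main(String[] args) {
        launch(ReflectiveMainLauncherSketch.class.getClassLoader(),
               Target.class.getName(), new String[] { "a", "b" });
        System.out.println(Target.lastArgs); // prints "[a, b]"
    }
}
```

Passing `new Object[] { args }` matters: handing the String[] to invoke directly would make each element a separate argument and fail with an IllegalArgumentException for any array that is not exactly one element long.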

Example 8 with UncaughtExceptionHandler

use of io.cdap.cdap.common.logging.common.UncaughtExceptionHandler in project cdap by caskdata.

the class DefaultRuntimeJob method run.

public void run(RuntimeJobEnvironment runtimeJobEnv) throws Exception {
    // Setup process wide settings
    Thread.setDefaultUncaughtExceptionHandler(new UncaughtExceptionHandler());
    // Get Program Options
    ProgramOptions programOpts = readJsonFile(new File(DistributedProgramRunner.PROGRAM_OPTIONS_FILE_NAME), ProgramOptions.class);
    ProgramRunId programRunId = programOpts.getProgramId().run(ProgramRunners.getRunId(programOpts));
    ProgramId programId = programRunId.getParent();
    Arguments systemArgs = programOpts.getArguments();
    // Setup logging context for the program
    LoggingContextAccessor.setLoggingContext(LoggingContextHelper.getLoggingContextWithRunId(programRunId, systemArgs.asMap()));
    // Get the cluster launch type
    Cluster cluster = GSON.fromJson(systemArgs.getOption(ProgramOptionConstants.CLUSTER), Cluster.class);
    // Get App spec
    ApplicationSpecification appSpec = readJsonFile(new File(DistributedProgramRunner.APP_SPEC_FILE_NAME), ApplicationSpecification.class);
    ProgramDescriptor programDescriptor = new ProgramDescriptor(programId, appSpec);
    // Create injector and get program runner
    Injector injector = Guice.createInjector(createModules(runtimeJobEnv, createCConf(runtimeJobEnv, programOpts), programRunId, programOpts));
    CConfiguration cConf = injector.getInstance(CConfiguration.class);
    // Initialize log appender
    LogAppenderInitializer logAppenderInitializer = injector.getInstance(LogAppenderInitializer.class);
    SystemArguments.setLogLevel(programOpts.getUserArguments(), logAppenderInitializer);
    ProxySelector oldProxySelector = ProxySelector.getDefault();
    RuntimeMonitors.setupMonitoring(injector, programOpts);
    Deque<Service> coreServices = createCoreServices(injector, systemArgs, cluster);
    // regenerate app spec
    ConfiguratorFactory configuratorFactory = injector.getInstance(ConfiguratorFactory.class);
    try {
        Map<String, String> systemArguments = new HashMap<>(programOpts.getArguments().asMap());
        File pluginDir = new File(programOpts.getArguments().getOption(ProgramOptionConstants.PLUGIN_DIR, DistributedProgramRunner.PLUGIN_DIR));
        // create a directory to store plugin artifacts for the regeneration of app spec to fetch plugin artifacts
        if (!programOpts.getArguments().hasOption(ProgramOptionConstants.PLUGIN_DIR)) {
            systemArguments.put(ProgramOptionConstants.PLUGIN_DIR, DistributedProgramRunner.PLUGIN_DIR);
        // remember the file names in the artifact folder before app regeneration
        List<String> pluginFiles = DirUtils.listFiles(pluginDir, File::isFile).stream().map(File::getName).collect(Collectors.toList());
        ApplicationSpecification generatedAppSpec = regenerateAppSpec(systemArguments, programOpts.getUserArguments().asMap(), programId, appSpec, programDescriptor, configuratorFactory);
        appSpec = generatedAppSpec != null ? generatedAppSpec : appSpec;
        programDescriptor = new ProgramDescriptor(programDescriptor.getProgramId(), appSpec);
        List<String> pluginFilesAfter = DirUtils.listFiles(pluginDir, File::isFile).stream().map(File::getName).collect(Collectors.toList());
        if (pluginFilesAfter.isEmpty()) {
        // recreate it from the folders
        if (!pluginFiles.equals(pluginFilesAfter)) {
        // update program options
        programOpts = new SimpleProgramOptions(programOpts.getProgramId(), new BasicArguments(systemArguments), programOpts.getUserArguments(), programOpts.isDebug());
    } catch (Exception e) {
        LOG.warn("Failed to regenerate the app spec for program {}, using the existing app spec", programId, e);
    ProgramStateWriter programStateWriter = injector.getInstance(ProgramStateWriter.class);
    RuntimeClientService runtimeClientService = injector.getInstance(RuntimeClientService.class);
    CompletableFuture<ProgramController.State> programCompletion = new CompletableFuture<>();
    try {
        ProgramRunner programRunner = injector.getInstance(ProgramRunnerFactory.class).create(programId.getType());
        // Create and run the program. The program files should be present in current working directory.
        try (Program program = createProgram(cConf, programRunner, programDescriptor, programOpts)) {
            ProgramController controller =, programOpts);
            controller.addListener(new AbstractListener() {

                public void completed() {

                public void killed() {
                    // Write an extra state to make sure there is always a terminal state even
                    // if the program application run failed to write out the state.

                public void error(Throwable cause) {
                    // Write an extra state to make sure there is always a terminal state even
                    // if the program application run failed to write out the state.
                    programStateWriter.error(programRunId, cause);
            }, Threads.SAME_THREAD_EXECUTOR);
            if (stopRequested) {
            // Block on the completion
        } finally {
            if (programRunner instanceof Closeable) {
                Closeables.closeQuietly((Closeable) programRunner);
    } catch (Throwable t) {
        if (!programCompletion.isDone()) {
            // We log here so that the error is still sent back to the program log collection.
            // Only log if the program completion is not done.
            // Otherwise the program runner itself should have logged the error.
            LOG.error("Failed to execute program {}", programRunId, t);
            // If the program completion is not done, then this exception
            // is due to a systemic failure that prevented the program from running.
            // We write out an extra error state to make sure the program state transitions.
            programStateWriter.error(programRunId, t);
        throw t;
    } finally {
        stopCoreServices(coreServices, logAppenderInitializer);
Also used : ApplicationSpecification( ConfiguratorFactory( HashMap(java.util.HashMap) Closeable( ProgramRunnerFactory( DefaultProgramRunnerFactory( ProxySelector( LogAppenderInitializer(io.cdap.cdap.logging.appender.LogAppenderInitializer) CompletableFuture(java.util.concurrent.CompletableFuture) ProgramStateWriter( MessagingProgramStateWriter( Injector( AbstractListener( ProgramDescriptor( BasicArguments( UncaughtExceptionHandler(io.cdap.cdap.common.logging.common.UncaughtExceptionHandler) DistributedProgramRunner( DistributedMapReduceProgramRunner( DistributedWorkerProgramRunner( ProgramRunner( DistributedWorkflowProgramRunner( RuntimeClientService( ProgramController( Program( Arguments( SystemArguments( BasicArguments( Cluster(io.cdap.cdap.runtime.spi.provisioner.Cluster) RuntimeClientService( Service( ProfileMetricService(io.cdap.cdap.internal.profile.ProfileMetricService) LogAppenderLoaderService(io.cdap.cdap.logging.appender.loader.LogAppenderLoaderService) MessagingService(io.cdap.cdap.messaging.MessagingService) AbstractIdleService( MessagingHttpService(io.cdap.cdap.messaging.server.MessagingHttpService) MetricsCollectionService(io.cdap.cdap.api.metrics.MetricsCollectionService) ProgramId( CConfiguration(io.cdap.cdap.common.conf.CConfiguration) SimpleProgramOptions( ProgramOptions( IOException( ExecutionException(java.util.concurrent.ExecutionException) TimeoutException(java.util.concurrent.TimeoutException) ProgramRunId( SimpleProgramOptions( File(
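
The run method above creates a CompletableFuture named programCompletion, registers a listener with completed, killed and error callbacks, and later blocks on the completion. The callback bodies are mostly elided in the excerpt, so the sketch below only illustrates one plausible way such a listener can feed a future. The State enum, the Listener interface, and the assumption that each callback completes the future are hypothetical and are not taken from CDAP.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class CompletionListenerSketch {

    /** Hypothetical terminal states; the real code uses ProgramController.State. */
    public enum State { COMPLETED, KILLED }

    /** Hypothetical listener with the three callbacks visible in the excerpt. */
    public interface Listener {
        void completed();
        void killed();
        void error(Throwable cause);
    }

    /** Builds a listener that translates each callback into a terminal result on the future. */
    public static Listener listenerFor(CompletableFuture<State> completion) {
        return new Listener() {
            @Override public void completed() { completion.complete(State.COMPLETED); }
            @Override public void killed() { completion.complete(State.KILLED); }
            @Override public void error(Throwable cause) { completion.completeExceptionally(cause); }
        };
    }

    public static void main(String[] args) {
        CompletableFuture<State> ok = new CompletableFuture<>();
        listenerFor(ok).completed();
        System.out.println(ok.join()); // prints "COMPLETED"

        CompletableFuture<State> failed = new CompletableFuture<>();
        listenerFor(failed).error(new IllegalStateException("boom"));
        try {
            failed.join(); // blocks until a terminal result, then rethrows the failure
        } catch (CompletionException e) {
            System.out.println("failed: " + e.getCause().getMessage()); // prints "failed: boom"
        }
    }
}
```

With this shape, a check such as isDone() in an outer catch block can tell whether the listener already recorded a terminal result before a fallback error state is written.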

Example 9 with UncaughtExceptionHandler

use of io.cdap.cdap.common.logging.common.UncaughtExceptionHandler in project cdap by caskdata.

the class RemoteExecutionJobMain method main.

public static void main(String[] args) throws Exception {
    Thread.setDefaultUncaughtExceptionHandler(new UncaughtExceptionHandler());
    new RemoteExecutionJobMain().doMain(args);
}
Also used : UncaughtExceptionHandler(io.cdap.cdap.common.logging.common.UncaughtExceptionHandler)
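
All four examples install a process-wide handler with Thread.setDefaultUncaughtExceptionHandler. The source of CDAP's UncaughtExceptionHandler is not shown on this page, so the following is a rough sketch of how a default handler behaves using only the JDK. The names DefaultHandlerSketch, RecordingHandler and runAndCapture are hypothetical, and the handler here records and prints the exception rather than logging it through SLF4J.

```java
import java.util.concurrent.atomic.AtomicReference;

public class DefaultHandlerSketch {

    /** Hypothetical handler that remembers the last uncaught exception it was given. */
    public static class RecordingHandler implements Thread.UncaughtExceptionHandler {
        public final AtomicReference<Throwable> last = new AtomicReference<>();

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            last.set(e);
            System.err.println("Uncaught exception in thread " + t.getName() + ": " + e);
        }
    }

    /** Runs task on a new thread and returns whatever exception escaped it, or null. */
    public static Throwable runAndCapture(Runnable task) {
        RecordingHandler handler = new RecordingHandler();
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler(handler);
        try {
            Thread worker = new Thread(task, "worker");
            worker.start();
            // The handler runs on the dying thread before it terminates,
            // so its result is visible once join() returns.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Restore the previous handler so the sketch has no lasting side effect.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return handler.last.get();
    }

    public static void main(String[] args) {
        Throwable seen = runAndCapture(() -> { throw new IllegalStateException("boom"); });
        System.out.println("handler saw: " + seen.getMessage()); // prints "handler saw: boom"
    }
}
```

The default handler is consulted only when neither the thread nor its thread group supplies a handler of its own, which is why installing it once at the top of main covers threads created later.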

