Example 1 with SerializerInstance

Use of org.apache.spark.serializer.SerializerInstance in the deeplearning4j project by deeplearning4j.

Class SparkUtils, method checkKryoConfiguration.

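/**
 * Check the Spark configuration for incorrect Kryo configuration by serializing and
 * deserializing a test INDArray, failing early if the round trip does not work.
 *
 * @param javaSparkContext Spark context
 * @param log              Logger to log messages to (not used in the code shown)
 * @return True if ok (no Kryo, or correct Kryo setup)
 * @throws RuntimeException if the test INDArray cannot be serialized and deserialized correctly
 */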
public static boolean checkKryoConfiguration(JavaSparkContext javaSparkContext, Logger log) {
    //Check if kryo configuration is correct:
    String serializer = javaSparkContext.getConf().get("spark.serializer", null);
    if (serializer != null && serializer.equals("org.apache.spark.serializer.KryoSerializer")) {
        String kryoRegistrator = javaSparkContext.getConf().get("spark.kryo.registrator", null);
        if (kryoRegistrator == null || !kryoRegistrator.equals("org.nd4j.Nd4jRegistrator")) {
            //It's probably going to fail later due to Kryo failing on the INDArray deserialization (off-heap data)
            //But: the user might be using a custom Kryo registrator that can handle ND4J INDArrays, even if they
            // aren't using the official ND4J-provided one
            //Either way: let's test serialization of INDArrays now, and fail early if necessary
            SerializerInstance si;
            ByteBuffer bb;
            try {
                si = javaSparkContext.env().serializer().newInstance();
                bb = si.serialize(Nd4j.linspace(1, 5, 5), null);
            } catch (Exception e) {
                //Failed for some unknown reason during serialization - should never happen
                throw new RuntimeException(KRYO_EXCEPTION_MSG, e);
            }
            if (bb == null) {
                //Should probably never happen
                throw new RuntimeException(KRYO_EXCEPTION_MSG + "\n(Got: null ByteBuffer from Spark SerializerInstance)");
            } else {
                //Could serialize successfully, but still may not be able to deserialize if kryo config is wrong
                boolean equals;
                INDArray deserialized;
                try {
                    deserialized = si.deserialize(bb, null);
                    //Equals method may fail on malformed INDArrays, hence should be within the try-catch
                    equals = Nd4j.linspace(1, 5, 5).equals(deserialized);
                } catch (Exception e) {
                    throw new RuntimeException(KRYO_EXCEPTION_MSG, e);
                }
                if (!equals) {
                    throw new RuntimeException(KRYO_EXCEPTION_MSG + "\n(Error during deserialization: test array was not deserialized successfully)");
                }
                //Otherwise: serialization/deserialization was successful using Kryo
                return true;
            }
        }
    }
    return true;
}
Also used: INDArray (org.nd4j.linalg.api.ndarray.INDArray), SerializerInstance (org.apache.spark.serializer.SerializerInstance), ByteBuffer (java.nio.ByteBuffer)
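
The decision at the top of checkKryoConfiguration can be looked at separately from Spark. The following is a minimal sketch, not DL4J code: it assumes a plain Map<String, String> standing in for the Spark configuration, and the names KryoConfigSketch and needsRoundTripCheck are hypothetical. It only reports whether Kryo is enabled without the ND4J registrator, which is the case in which the method above goes on to test an INDArray round trip.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration only; not part of DL4J or Spark.
class KryoConfigSketch {
    static final String KRYO_SERIALIZER = "org.apache.spark.serializer.KryoSerializer";
    static final String ND4J_REGISTRATOR = "org.nd4j.Nd4jRegistrator";

    /**
     * Returns true when Kryo is the configured serializer but the ND4J
     * registrator is not, i.e. when a serialization round-trip test is warranted.
     * The Map is an assumed stand-in for the Spark configuration.
     */
    static boolean needsRoundTripCheck(Map<String, String> conf) {
        String serializer = conf.get("spark.serializer");
        if (!KRYO_SERIALIZER.equals(serializer)) {
            return false; // Kryo is not in use, so there is nothing to verify
        }
        String registrator = conf.get("spark.kryo.registrator");
        return !ND4J_REGISTRATOR.equals(registrator);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(needsRoundTripCheck(conf)); // false: no serializer set

        conf.put("spark.serializer", KRYO_SERIALIZER);
        System.out.println(needsRoundTripCheck(conf)); // true: Kryo without registrator

        conf.put("spark.kryo.registrator", ND4J_REGISTRATOR);
        System.out.println(needsRoundTripCheck(conf)); // false: Kryo with ND4J registrator
    }
}
```

Using equals on the constant rather than on the looked-up value keeps the check null-safe, so a missing key needs no separate null test.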

Example 2 with SerializerInstance

Use of org.apache.spark.serializer.SerializerInstance in the gatk project by broadinstitute.

Class SparkTestUtils, method roundTripInKryo.

/**
 * Takes an input object and returns the value of the object after it has been serialized and then deserialized in Kryo.
 * Requires the class of the input object as a parameter because it's not generally possible to get the class of a
 * generified method parameter with reflection.
 *
 * @param input instance of inputClazz. Never {@code null}
 * @param inputClazz class of the input, used to build the ClassTag for serialization and deserialization
 * @param conf Spark configuration to test
 * @param <T> type of the object to round-trip. Same as or a subclass of inputClazz
 * @return the instance obtained by serializing and then deserializing input. An exception is thrown if the round trip fails.
 */
public static <T> T roundTripInKryo(final T input, final Class<?> inputClazz, final SparkConf conf) {
    Utils.nonNull(input);
    final KryoSerializer kryoSerializer = new KryoSerializer(conf);
    final SerializerInstance sparkSerializer = kryoSerializer.newInstance();
    final ClassTag<T> tag = ClassTag$.MODULE$.apply(inputClazz);
    return sparkSerializer.deserialize(sparkSerializer.serialize(input, tag), tag);
}
Also used : SerializerInstance(org.apache.spark.serializer.SerializerInstance) KryoSerializer(org.apache.spark.serializer.KryoSerializer)
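
The round-trip pattern in roundTripInKryo does not depend on Kryo specifically. As a rough sketch that runs without Spark on the classpath, the following swaps in Java's built-in object serialization for Kryo. RoundTripSketch and roundTrip are hypothetical names, and taking a Class<T> (instead of a Class<?> plus a ClassTag) is a simplification made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration only; uses java.io serialization, not Kryo.
class RoundTripSketch {

    /** Serializes input to bytes, deserializes it again, and returns the copy. */
    static <T extends Serializable> T roundTrip(T input, Class<T> clazz) {
        if (input == null) {
            throw new IllegalArgumentException("input must not be null");
        }
        try {
            // Write the object into an in-memory buffer
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(input);
            }
            // Read it back and cast it to the expected class
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return clazz.cast(in.readObject());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String copy = roundTrip("spark", String.class);
        System.out.println(copy.equals("spark")); // true
    }
}
```

In a test, the returned copy would typically be compared with the original using equals, which is what the first example does with its test INDArray.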

Aggregations

SerializerInstance (org.apache.spark.serializer.SerializerInstance): 2
ByteBuffer (java.nio.ByteBuffer): 1
KryoSerializer (org.apache.spark.serializer.KryoSerializer): 1
INDArray (org.nd4j.linalg.api.ndarray.INDArray): 1