Example 1 with MmtfStructure

Use of org.rcsb.mmtf.dataholders.MmtfStructure in project mmtf-spark by sbl-sdsc.

From the class MmtfReader, method readMmtfFiles.

/**
 * Reads uncompressed and compressed MMTF files recursively from
 * a given directory.
 * This method reads files with the .mmtf or .mmtf.gz extension.
 *
 * @param path Path to MMTF files
 * @param sc Spark context
 * @return structure data as key/value pairs (file name without extension, decoded structure)
 */
public static JavaPairRDD<String, StructureDataInterface> readMmtfFiles(String path, JavaSparkContext sc) {
    return sc.parallelize(getFiles(path)).mapToPair(new PairFunction<File, String, StructureDataInterface>() {

        private static final long serialVersionUID = 9018971417443154996L;

        public Tuple2<String, StructureDataInterface> call(File f) throws Exception {
            String name = f.getName();
            boolean gzipped = name.endsWith(".mmtf.gz");
            if (!gzipped && !name.endsWith(".mmtf")) {
                // Not an MMTF file: return null so the filter below drops it.
                return null;
            }
            // try-with-resources closes the file even if decoding fails.
            try (InputStream raw = new FileInputStream(f);
                 InputStream in = gzipped ? new GZIPInputStream(raw) : raw) {
                MmtfStructure mmtf = new MessagePackSerialization().deserialize(in);
                // Key is the file name up to the ".mmtf" extension.
                return new Tuple2<String, StructureDataInterface>(name.substring(0, name.indexOf(".mmtf")), new GenericDecoder(mmtf));
            } catch (Exception e) {
                // Unreadable files are logged and skipped rather than failing the job.
                System.out.println(e);
                return null;
            }
        }
    }).filter(t -> t != null);
}
Also used: GZIPInputStream (java.util.zip.GZIPInputStream), FileInputStream (java.io.FileInputStream), ByteArrayInputStream (java.io.ByteArrayInputStream), InputStream (java.io.InputStream), StructureDataInterface (org.rcsb.mmtf.api.StructureDataInterface), GenericDecoder (org.rcsb.mmtf.decoder.GenericDecoder), ZipException (java.util.zip.ZipException), IOException (java.io.IOException), FileNotFoundException (java.io.FileNotFoundException), Tuple2 (scala.Tuple2), MessagePackSerialization (org.rcsb.mmtf.serialization.MessagePackSerialization), PairFunction (org.apache.spark.api.java.function.PairFunction), File (java.io.File), MmtfStructure (org.rcsb.mmtf.dataholders.MmtfStructure)
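
The file-handling part of readMmtfFiles (choosing by extension, deriving the key from the file name, and wrapping the stream in a GZIPInputStream for compressed files) can be sketched with the JDK alone. The class MmtfFiles and its methods below are hypothetical helpers written for illustration; they are not part of mmtf-spark or mmtf-java. The sketch also assumes the extension is stripped as a suffix, whereas the method above cuts at the first occurrence of ".mmtf".

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only; not part of mmtf-spark.
final class MmtfFiles {
    private MmtfFiles() {}

    /** Returns true if the file name ends with ".mmtf.gz". */
    static boolean isGzipped(String fileName) {
        return fileName.endsWith(".mmtf.gz");
    }

    /** Returns true if the file name ends with ".mmtf" or ".mmtf.gz". */
    static boolean isMmtf(String fileName) {
        return fileName.endsWith(".mmtf") || isGzipped(fileName);
    }

    /** Strips the MMTF extension, e.g. "1abc.mmtf.gz" becomes "1abc". */
    static String structureId(String fileName) {
        if (!isMmtf(fileName)) {
            throw new IllegalArgumentException("Not an MMTF file: " + fileName);
        }
        String suffix = isGzipped(fileName) ? ".mmtf.gz" : ".mmtf";
        return fileName.substring(0, fileName.length() - suffix.length());
    }

    /** Opens the file, transparently decompressing ".mmtf.gz" files. */
    static InputStream open(File f) throws IOException {
        InputStream raw = new FileInputStream(f);
        try {
            return isGzipped(f.getName()) ? new GZIPInputStream(raw) : raw;
        } catch (IOException e) {
            // The GZIP header could not be read: release the file handle.
            raw.close();
            throw e;
        }
    }
}
```

A caller would typically use open inside a try-with-resources statement and pass the resulting stream to a deserializer, using structureId(f.getName()) as the key of the pair.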
