Example 1 with ContainsSequenceRegex

Use of edu.sdsc.mmtf.spark.filters.ContainsSequenceRegex in project mmtf-spark by sbl-sdsc, in the main method of the class FilterBySequenceRegex.

/**
 * Counts the PDB entries whose sequence contains a zinc finger motif.
 *
 * @param args command-line arguments (not used)
 * @throws FileNotFoundException if the path to the reduced MMTF files cannot be resolved
 */
public static void main(String[] args) throws FileNotFoundException {
    String path = MmtfReader.getMmtfReducedPath();
    long start = System.nanoTime();
    SparkConf conf = new SparkConf().setMaster("local[*]").setAppName(FilterBySequenceRegex.class.getSimpleName());
    JavaSparkContext sc = new JavaSparkContext(conf);
    // read the PDB in MMTF format
    JavaPairRDD<String, StructureDataInterface> pdb = MmtfReader.readSequenceFile(path, sc);
    // keep only structures that contain a zinc finger motif
    pdb = pdb.filter(new ContainsSequenceRegex("C.{2,4}C.{12}H.{3,5}H"));
    System.out.println("Number of PDB entries containing a zinc finger motif: " + pdb.count());
    long end = System.nanoTime();
    // elapsed wall-clock time, converted from nanoseconds to seconds
    System.out.println("Time: " + (end - start) / 1E9 + " sec.");
    sc.close();
}
Also used: JavaSparkContext (org.apache.spark.api.java.JavaSparkContext), StructureDataInterface (org.rcsb.mmtf.api.StructureDataInterface), SparkConf (org.apache.spark.SparkConf), ContainsSequenceRegex (edu.sdsc.mmtf.spark.filters.ContainsSequenceRegex)
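
To see what the pattern "C.{2,4}C.{12}H.{3,5}H" accepts without a Spark cluster or the MMTF files, the following is a minimal sketch using only java.util.regex. The class ZincFingerRegexDemo, the method containsMotif and the sample sequences are hypothetical and are not part of mmtf-spark. The sketch assumes the filter looks for the motif anywhere inside a sequence (Matcher.find) rather than requiring the whole sequence to match; that behavior is an assumption here, not something this page confirms.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone demo; not part of the mmtf-spark project.
class ZincFingerRegexDemo {

    // Same pattern string as in the example: C, 2-4 residues, C,
    // exactly 12 residues, H, 3-5 residues, H.
    static final Pattern ZINC_FINGER = Pattern.compile("C.{2,4}C.{12}H.{3,5}H");

    // Assumption: the motif may occur anywhere in the sequence, so find()
    // is used instead of matches().
    static boolean containsMotif(String sequence) {
        return ZINC_FINGER.matcher(sequence).find();
    }

    public static void main(String[] args) {
        // Made-up one-letter amino acid sequences for illustration only.
        String[] sequences = {
            "AACPECGKSFSQKSNLQKHQRTHTG", // C-x2-C-x12-H-x3-H embedded in flanking residues
            "MKTAYIAKQRQISFVKSHFSRQ",    // no cysteine at all
            "CPECGKSFSH"                 // too short to satisfy the 12-residue spacer
        };
        for (String sequence : sequences) {
            System.out.println(sequence + " -> " + containsMotif(sequence));
        }
    }
}
```

Only the first sample sequence satisfies the spacing constraints between the two cysteines and the two histidines. The other two fail because they lack either the residues or the required spacer length.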