
Example 1 with DistanceMatrix

Use of org.baderlab.csplugins.brainlib.DistanceMatrix in the project EnrichmentMapApp by BaderLab.

The example is the cluster method of the class HierarchicalClusterTask.

public Map<Integer, RankValue> cluster(TaskMonitor tm) {
    if (tm == null)
        tm = new NullTaskMonitor();
    tm.setTitle("Hierarchical Cluster");
    tm.setStatusMessage("Loading expression data");
    List<double[]> clusteringExpressionSet = new ArrayList<>(genes.size());
    ArrayList<Integer> labels = new ArrayList<>(genes.size());
    List<String> names = new ArrayList<>(genes.size());
    List<EMDataSet> dataSets = map.getDataSetList();
    final int expressionCount = getTotalExpressionCount(dataSets);
    for (int geneId : genes) {
        // values all default to 0.0
        double[] vals = new double[expressionCount];
        int valsIndex = 0;
        boolean found = false;
        String name = null;
        for (EMDataSet dataSet : dataSets) {
            GeneExpressionMatrix expressionSets = dataSet.getExpressionSets();
            int numConditions = expressionSets.getNumConditions() - 2;
            GeneExpression geneExpression = expressionSets.getExpressionMatrix().get(geneId);
            if (geneExpression != null) {
                found = true;
                name = geneExpression.getName();
                double[] expression = geneExpression.getExpression();
                System.arraycopy(expression, 0, vals, valsIndex, expression.length);
            }
            valsIndex += numConditions;
        }
        if (found) {
            clusteringExpressionSet.add(vals);
            labels.add(geneId);
            names.add(name);
        }
    }
    tm.setStatusMessage("Calculating Distance");
    DistanceMatrix distanceMatrix = new DistanceMatrix(genes.size());
    distanceMatrix.calcDistances(clusteringExpressionSet, distanceMetric);
    distanceMatrix.setLabels(labels);
    tm.setStatusMessage("Clustering");
    AvgLinkHierarchicalClustering clusterResult = new AvgLinkHierarchicalClustering(distanceMatrix);
    // With more than 1000 genes, use Eisen ordering; otherwise use Bar-Joseph optimal leaf ordering.
    clusterResult.setOptimalLeafOrdering(genes.size() <= 1000);
    clusterResult.run();
    tm.setStatusMessage("Ranking");
    Map<Integer, RankValue> ranks = new HashMap<>();
    int[] order = clusterResult.getLeafOrder();
    for (int i = 0; i < order.length; i++) {
        Integer geneId = labels.get(order[i]);
        ranks.put(geneId, new RankValue(i + 1, null, false));
    }
    tm.setStatusMessage("");
    return ranks;
}
Also used : HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) AvgLinkHierarchicalClustering(org.baderlab.csplugins.brainlib.AvgLinkHierarchicalClustering) RankValue(org.baderlab.csplugins.enrichmentmap.view.heatmap.table.RankValue) GeneExpressionMatrix(org.baderlab.csplugins.enrichmentmap.model.GeneExpressionMatrix) EMDataSet(org.baderlab.csplugins.enrichmentmap.model.EMDataSet) DistanceMatrix(org.baderlab.csplugins.brainlib.DistanceMatrix) GeneExpression(org.baderlab.csplugins.enrichmentmap.model.GeneExpression) NullTaskMonitor(org.baderlab.csplugins.enrichmentmap.util.NullTaskMonitor)
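
The final "Ranking" step of the method above turns the leaf order from the clustering into 1-based ranks keyed by gene id. The following is a minimal, self-contained sketch of that mapping only. It assumes the leaf order is an array of indices into the label list, as the loop above suggests. The class name LeafOrderRanks, the method ranksFromLeafOrder, and the sample ids are hypothetical, and a plain Integer stands in for RankValue so the sketch does not depend on the EnrichmentMap or brainlib classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LeafOrderRanks {

    // Maps each label to its 1-based position in the leaf order.
    // order[i] is an index into labels, mirroring the ranking loop in the example.
    static Map<Integer, Integer> ranksFromLeafOrder(List<Integer> labels, int[] order) {
        Map<Integer, Integer> ranks = new HashMap<>();
        for (int i = 0; i < order.length; i++) {
            Integer geneId = labels.get(order[i]);
            ranks.put(geneId, i + 1);
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Hypothetical gene ids and a hypothetical leaf order over them.
        List<Integer> labels = Arrays.asList(101, 202, 303);
        int[] order = {2, 0, 1};

        Map<Integer, Integer> ranks = ranksFromLeafOrder(labels, order);

        // prints "1 2 3": gene 303 is first in the leaf order, then 101, then 202
        System.out.println(ranks.get(303) + " " + ranks.get(101) + " " + ranks.get(202));
    }
}
```

Because the ranks are keyed by gene id rather than by position, a caller can look up the display order of any gene without knowing how the clustering indexed its input rows.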
