Example 1 with ClockCache

Use of com.tencent.angel.psagent.clock.ClockCache in project angel by Tencent.

In the class PSAgent, method initAndStart.

public void initAndStart() throws Exception {
    // Get ps locations from master and put them to the location cache.
    locationManager = new PSAgentLocationManager(PSAgentContext.get());
    locationManager.setMasterLocation(masterLocation);
    // Build and initialize rpc client to master
    masterClient = new MasterClient();
    masterClient.init();
    // Build local location
    String localIp = NetUtils.getRealLocalIP();
    int port = NetUtils.chooseAListenPort(conf);
    location = new Location(localIp, port);
    // Initialize matrix meta information
    clockCache = new ClockCache();
    List<MatrixMeta> matrixMetas = masterClient.getMatrices();
    LOG.info("PSAgent got matrices from master, count=" + matrixMetas.size());
    this.matrixMetaManager = new PSAgentMatrixMetaManager(clockCache);
    matrixMetaManager.addMatrices(matrixMetas);
    Map<ParameterServerId, Location> psIdToLocMap = masterClient.getPSLocations();
    List<ParameterServerId> psIds = new ArrayList<>(psIdToLocMap.keySet());
    // Sort parameter server ids by index; comparingInt avoids the overflow risk of subtraction.
    psIds.sort(Comparator.comparingInt(ParameterServerId::getIndex));
    int size = psIds.size();
    locationManager.setPsIds(psIds.toArray(new ParameterServerId[0]));
    for (int i = 0; i < size; i++) {
        if (psIdToLocMap.containsKey(psIds.get(i))) {
            locationManager.setPsLocation(psIds.get(i), psIdToLocMap.get(psIds.get(i)));
        }
    }
    matrixTransClient = new MatrixTransportClient();
    matrixClientAdapter = new MatrixClientAdapter();
    opLogCache = new MatrixOpLogCache();
    matrixStorageManager = new MatrixStorageManager();
    matricesCache = new MatricesCache();
    int staleness = conf.getInt(AngelConf.ANGEL_STALENESS, AngelConf.DEFAULT_ANGEL_STALENESS);
    consistencyController = new ConsistencyController(staleness);
    consistencyController.init();
    psAgentInitFinishedFlag.set(true);
    // Start heartbeat thread if needed
    if (needHeartBeat) {
        startHeartbeatThread();
    }
    // Start all services
    matrixTransClient.start();
    matrixClientAdapter.start();
    clockCache.start();
    opLogCache.start();
    matricesCache.start();
}
Also used: MatrixClientAdapter (com.tencent.angel.psagent.matrix.transport.adapter.MatrixClientAdapter), ClockCache (com.tencent.angel.psagent.clock.ClockCache), MasterClient (com.tencent.angel.psagent.client.MasterClient), MatrixMeta (com.tencent.angel.ml.matrix.MatrixMeta), PSAgentMatrixMetaManager (com.tencent.angel.psagent.matrix.PSAgentMatrixMetaManager), MatricesCache (com.tencent.angel.psagent.matrix.cache.MatricesCache), ConsistencyController (com.tencent.angel.psagent.consistency.ConsistencyController), MatrixTransportClient (com.tencent.angel.psagent.matrix.transport.MatrixTransportClient), MatrixOpLogCache (com.tencent.angel.psagent.matrix.oplog.cache.MatrixOpLogCache), PSAgentLocationManager (com.tencent.angel.psagent.matrix.PSAgentLocationManager), MatrixStorageManager (com.tencent.angel.psagent.matrix.storage.MatrixStorageManager), ParameterServerId (com.tencent.angel.ps.ParameterServerId), Location (com.tencent.angel.common.location.Location)
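
initAndStart sorts the parameter server ids by their integer index before registering them. Two common ways to write such an ordering are a comparator that subtracts the indices and Comparator.comparingInt. The sketch below contrasts the two; ServerId and IndexOrdering are hypothetical stand-ins, not Angel classes, and the extreme index values are chosen only to show the overflow, which real server indices are unlikely to hit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class IndexOrdering {
    /** Hypothetical id type; stands in for any id that exposes an int index. */
    static final class ServerId {
        private final int index;
        ServerId(int index) { this.index = index; }
        int getIndex() { return index; }
    }

    /** Subtraction-based comparator: the difference can overflow for extreme values. */
    static final Comparator<ServerId> BY_SUBTRACTION =
        (a, b) -> a.getIndex() - b.getIndex();

    /** Overflow-safe equivalent built on Integer.compare semantics. */
    static final Comparator<ServerId> BY_INDEX =
        Comparator.comparingInt(ServerId::getIndex);

    /** Returns a copy sorted by index, leaving the input list untouched. */
    static List<ServerId> sortedByIndex(List<ServerId> ids) {
        List<ServerId> copy = new ArrayList<>(ids);
        copy.sort(BY_INDEX);
        return copy;
    }
}
```

For small non-negative indices both comparators agree; comparingInt simply removes the edge case and reads more directly.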

Example 2 with ClockCache

Use of com.tencent.angel.psagent.clock.ClockCache in project angel by Tencent.

In the class PSAgentTest, method testClockCache.

@Test
public void testClockCache() throws Exception {
    try {
        AngelApplicationMaster angelAppMaster = LocalClusterContext.get().getMaster().getAppMaster();
        assertTrue(angelAppMaster != null);
        AMTaskManager taskManager = angelAppMaster.getAppContext().getTaskManager();
        assertTrue(taskManager != null);
        WorkerManager workerManager = angelAppMaster.getAppContext().getWorkerManager();
        assertTrue(workerManager != null);
        Worker worker = LocalClusterContext.get().getWorker(worker0Attempt0Id).getWorker();
        assertTrue(worker != null);
        PSAgent psAgent = worker.getPSAgent();
        assertTrue(psAgent != null);
        ClockCache clockCache = psAgent.getClockCache();
        assertTrue(clockCache != null);
        int rowClock = clockCache.getClock(1, 0);
        // JUnit expects (expected, actual); a fresh clock should read 0.
        assertEquals(0, rowClock);
    } catch (Exception x) {
        LOG.error("run testClockCache failed ", x);
        throw x;
    }
}
Also used: WorkerManager (com.tencent.angel.master.worker.WorkerManager), AMTaskManager (com.tencent.angel.master.task.AMTaskManager), ClockCache (com.tencent.angel.psagent.clock.ClockCache), AngelApplicationMaster (com.tencent.angel.master.AngelApplicationMaster), Worker (com.tencent.angel.worker.Worker), Test (org.junit.Test)
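
The test above reads a clock for a (matrix, partition) pair and expects 0 before any update has been reported. The sketch below is a minimal in-memory table with that behavior. SimpleClockTable is hypothetical and is not Angel's ClockCache; the default of 0 and the rule that clocks never move backwards are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory clock table; not Angel's ClockCache. */
final class SimpleClockTable {
    // Key packs (matrixId, partitionId) into a single long.
    private final Map<Long, Integer> clocks = new ConcurrentHashMap<>();

    private static long key(int matrixId, int partitionId) {
        return ((long) matrixId << 32) | (partitionId & 0xFFFFFFFFL);
    }

    /** Keeps the larger of the stored and reported clock, so a clock never goes backwards. */
    void update(int matrixId, int partitionId, int clock) {
        clocks.merge(key(matrixId, partitionId), clock, Integer::max);
    }

    /** Returns the recorded clock, or 0 when nothing has been reported yet (assumed default). */
    int getClock(int matrixId, int partitionId) {
        return clocks.getOrDefault(key(matrixId, partitionId), 0);
    }
}
```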

Example 3 with ClockCache

Use of com.tencent.angel.psagent.clock.ClockCache in project angel by Tencent.

In the class TaskContext, method globalSync.

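/**
 * Global sync on a specific matrix: blocks until the cached clock of every
 * partition of the matrix has caught up with this task's clock for that matrix.
 *
 * @param matrixId the matrix id
 * @throws InterruptedException if the thread is interrupted while sleeping between checks
 */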
public void globalSync(int matrixId) throws InterruptedException {
    ClockCache clockCache = PSAgentContext.get().getClockCache();
    List<PartitionKey> pkeys = PSAgentContext.get().getMatrixMetaManager().getPartitions(matrixId);
    int syncTimeIntervalMS = PSAgentContext.get().getConf().getInt(AngelConf.ANGEL_PSAGENT_CACHE_SYNC_TIMEINTERVAL_MS, AngelConf.DEFAULT_ANGEL_PSAGENT_CACHE_SYNC_TIMEINTERVAL_MS);
    while (true) {
        boolean sync = true;
        for (PartitionKey pkey : pkeys) {
            if (clockCache.getClock(matrixId, pkey) < getMatrixClock(matrixId)) {
                sync = false;
                break;
            }
        }
        if (!sync) {
            Thread.sleep(syncTimeIntervalMS);
        } else {
            break;
        }
    }
}
Also used: ClockCache (com.tencent.angel.psagent.clock.ClockCache), PartitionKey (com.tencent.angel.PartitionKey)
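
globalSync polls the cached partition clocks and sleeps between checks until all of them reach the task's clock, with no upper bound on the wait. The sketch below shows the same polling pattern with an added timeout. ClockSync, awaitClock, and the clockOf lookup are hypothetical names introduced here; the timeout is an addition for illustration and is not part of the Angel method.

```java
import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import java.util.function.IntUnaryOperator;

final class ClockSync {
    private ClockSync() {}

    /**
     * Polls until every partition's clock is at least targetClock, or the timeout expires.
     *
     * @param partitionIds   partitions to check
     * @param clockOf        hypothetical lookup: partition id -> currently cached clock
     * @param targetClock    clock value every partition must reach
     * @param pollIntervalMs sleep time between checks
     * @param timeoutMs      maximum total wait
     * @return true if all partitions caught up, false if the timeout expired first
     */
    static boolean awaitClock(int[] partitionIds, IntUnaryOperator clockOf, int targetClock,
                              long pollIntervalMs, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            boolean synced = Arrays.stream(partitionIds)
                .allMatch(id -> clockOf.applyAsInt(id) >= targetClock);
            if (synced) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: at least one partition is still behind
            }
            Thread.sleep(pollIntervalMs);
        }
    }
}
```

Returning a boolean lets the caller decide whether a timeout is an error, instead of blocking forever when a partition never advances.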

Example 4 with ClockCache

Use of com.tencent.angel.psagent.clock.ClockCache in project angel by Tencent.

In the class TaskContext, method getPSMatrixClock.

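/**
 * Get the clock value of a matrix, defined as the minimum cached clock over all of its partitions.
 *
 * @param matrixId matrix id
 * @return the minimum partition clock, or Integer.MAX_VALUE if the matrix has no partitions
 */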
public int getPSMatrixClock(int matrixId) {
    ClockCache clockCache = PSAgentContext.get().getClockCache();
    List<PartitionKey> pkeys = PSAgentContext.get().getMatrixMetaManager().getPartitions(matrixId);
    int size = pkeys.size();
    int clock = Integer.MAX_VALUE;
    int partClock = 0;
    for (int i = 0; i < size; i++) {
        partClock = clockCache.getClock(matrixId, pkeys.get(i));
        if (partClock < clock) {
            clock = partClock;
        }
    }
    return clock;
}
Also used: ClockCache (com.tencent.angel.psagent.clock.ClockCache), PartitionKey (com.tencent.angel.PartitionKey)
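
getPSMatrixClock reduces the partition clocks to their minimum, and its loop yields Integer.MAX_VALUE when there are no partitions. A stream-based sketch of the same reduction is shown below; MatrixClocks and minClock are hypothetical names, and returning OptionalInt for the empty case is a design choice of this sketch rather than the behavior of the Angel method.

```java
import java.util.Collection;
import java.util.OptionalInt;

final class MatrixClocks {
    private MatrixClocks() {}

    /** Returns the smallest partition clock, or an empty result if there are no partitions. */
    static OptionalInt minClock(Collection<Integer> partitionClocks) {
        return partitionClocks.stream().mapToInt(Integer::intValue).min();
    }
}
```

A caller that wants the original sentinel can use minClock(clocks).orElse(Integer.MAX_VALUE).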
