
Example 6 with StateObservation

Use of core.game.StateObservation in project SimpleAsteroids by ljialin.

In the class SingleTreeNode, the method mctsSearch:

public void mctsSearch(ElapsedCpuTimer elapsedTimer) {
    double avgTimeTaken = 0;
    double acumTimeTaken = 0;
    long remaining = elapsedTimer.remainingTimeMillis();
    int numIters = 0;
    int remainingLimit = 10;
    // while(remaining > 2*avgTimeTaken && remaining > remainingLimit){
    while (numIters < Agent.MCTS_ITERATIONS) {
        StateObservation state = rootState.copy();
        ElapsedCpuTimer elapsedTimerIteration = new ElapsedCpuTimer();
        SingleTreeNode selected = treePolicy(state);   // selection + expansion
        double delta = selected.rollOut(state);        // simulation from the selected node
        backUp(selected, delta);                       // backpropagation of the rollout reward
        numIters++;
        acumTimeTaken += (elapsedTimerIteration.elapsedMillis());
        // System.out.println(elapsedTimerIteration.elapsedMillis() + " --> " + acumTimeTaken + " (" + remaining + ")");
        avgTimeTaken = acumTimeTaken / numIters;
        remaining = elapsedTimer.remainingTimeMillis();
    }
// System.out.println("Iterations: " + numIters);
}
Also used : StateObservation(core.game.StateObservation) ElapsedCpuTimer(tools.ElapsedCpuTimer)
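A minimal sketch of how an act method would typically drive this iteration-bounded search. It is not taken from the SimpleAsteroids source: the public rootState field, the mostVisitedAction() helper, and the static Agent.actions table are assumptions modelled on GVGAI's sampleMCTS controller, and the code only illustrates the call pattern.

import core.game.StateObservation;
import ontology.Types;
import tools.ElapsedCpuTimer;
import java.util.Random;

public class MctsUsageSketch {

    // Assumed: a SingleTreeNode exposing rootState, mctsSearch(...) and a
    // most-visited-child helper, as in GVGAI's sampleMCTS controller.
    private final SingleTreeNode root = new SingleTreeNode(new Random());

    public Types.ACTIONS act(StateObservation stateObs, ElapsedCpuTimer elapsedTimer) {
        root.rootState = stateObs;            // re-root the tree on the current observation (assumed field)
        root.mctsSearch(elapsedTimer);        // run the loop shown above until Agent.MCTS_ITERATIONS
        int best = root.mostVisitedAction();  // assumed helper: index of the most visited child
        return Agent.actions[best];           // assumed static action table, as in the sample controllers
    }
}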

Example 7 with StateObservation

Use of core.game.StateObservation in project SimpleAsteroids by ljialin.

In the class Agent, the method act:

public Types.ACTIONS act(StateObservation stateObs, ElapsedCpuTimer elapsedTimer) {
    // Set the state observation object as the new root of the tree.
    // we'll set up a game adapter and run the algorithm independently each
    // time, at least to begin with
    StateObservation obs = stateObs.copy();
    // Types.ACTIONS[] moveSeq = new Types.ACTIONS[maxRolloutLength];
    bestRollout = new Types.ACTIONS[maxNestingDepth][maxRolloutLength];
    lengthBestRollout = new int[maxNestingDepth];
    scoreBestRollout = new double[maxNestingDepth];
    // nested(obs, nestDepth, moveSeq, 0);
    double bestScore = Double.NEGATIVE_INFINITY;
    Types.ACTIONS bestAction = actions[0];
    for (int i = 0; i < num_actions; i++) {
        StateObservation state = obs.copy();
        Types.ACTIONS[] moveSeqCopy = new Types.ACTIONS[maxRolloutLength];
        int nActionsPlayed = 0;
        state.advance(actions[i]);
        moveSeqCopy[nActionsPlayed] = actions[i];
        nActionsPlayed++;
        nested(state, nestDepth, moveSeqCopy, nActionsPlayed);
        double score = state.getGameScore();
        if (score > bestScore) {
            bestScore = score;
            bestAction = actions[i];
        }
    }
    return bestAction;
}
Also used : StateObservation(core.game.StateObservation) Types(ontology.Types)
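The nested(state, nestDepth, moveSeqCopy, nActionsPlayed) call above is a Nested Monte Carlo Search step: at each nesting level, every action is tried and the recursion one level lower decides how the rest of the sequence is played out. The sketch below shows that idea only; it is not the SimpleAsteroids implementation, and the helper names and the fixed-horizon random rollout are assumptions for illustration.

import core.game.StateObservation;
import ontology.Types;
import java.util.ArrayList;
import java.util.Random;

public class NestedRolloutSketch {

    private final Random rnd = new Random();

    // Level 0: play uniformly random actions for a fixed horizon and score the result.
    double randomRollout(StateObservation state, int horizon) {
        for (int t = 0; t < horizon && !state.isGameOver(); t++) {
            ArrayList<Types.ACTIONS> acts = state.getAvailableActions();
            state.advance(acts.get(rnd.nextInt(acts.size())));
        }
        return state.getGameScore();
    }

    // Level d: try every available action, recurse one level lower, keep the best score found.
    double nested(StateObservation state, int level, int horizon) {
        if (level == 0 || horizon <= 0) {
            return randomRollout(state.copy(), horizon);
        }
        double best = Double.NEGATIVE_INFINITY;
        for (Types.ACTIONS a : state.getAvailableActions()) {
            StateObservation next = state.copy();
            next.advance(a);
            best = Math.max(best, nested(next, level - 1, horizon - 1));
        }
        return best;
    }
}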

Aggregations

StateObservation (core.game.StateObservation) 7
Types (ontology.Types) 4
ElapsedCpuTimer (tools.ElapsedCpuTimer) 4
Agent (controllers.singlePlayer.ea.Agent) 3
AbstractPlayer (core.player.AbstractPlayer) 3
EvoAlg (evodef.EvoAlg) 3
SimpleRMHC (ga.SimpleRMHC) 3
Random (java.util.Random) 3
NTupleBanditEA (ntuple.NTupleBanditEA) 3
ElapsedTimer (utilities.ElapsedTimer) 3
SimpleMaxGame (altgame.SimpleMaxGame) 2
BattleView (battle.BattleView) 1
SpaceBattleLinkState (gvglink.SpaceBattleLinkState) 1
SlidingMeanEDA (ntuple.SlidingMeanEDA) 1
LinePlot (plot.LinePlot) 1
GridModel (rl.grid.GridModel) 1
JEasyFrame (utilities.JEasyFrame) 1