
Example 1 with QLearningControl

Use of nars.rl.horde.QLearningControl in the project narchy by automenta.

The example is the main method of the class RLParkQLTest.

public static void main(String[] args) {
    // two discrete actions and two continuous state features
    Integer[] actions = new Integer[] { 0, 1 };
    int features = 2;
    TabularAction ta = new TabularAction(actions, 1, features);
    final double alpha = .1;   // learning rate
    final double gamma = .99;  // discount factor
    final double lambda = .3;  // eligibility-trace decay
    // GQ(lambda) value-function learner driven by the features above
    GQ gq = new GQ(alpha, 0.0, 1 - gamma, lambda, features);
    // greedy policy that selects the action with the highest learned value
    QLearningControl.Greedy acting = new QLearningControl.Greedy(gq, actions, ta);
    QLearningControl<Integer> q = new QLearningControl(acting, new QLearningControl.QLearning(actions, alpha, gamma, lambda, ta, gq.traces()));
    ArrayRealVector xt = null;
    int nextA = 0;
    double r = 0;
    for (int i = 0; i < 1000; i++) {
        // random two-dimensional observation
        double x1 = Math.random();
        double x2 = Math.random();
        System.out.println(Texts.n4(r) + " " + nextA);
        System.out.println(Arrays.toString(gq.traces().vect().toArray()));
        // one control step: previous state, previous action, new state, last reward; returns the next action
        nextA = q.step(xt, nextA, xt = new ArrayRealVector(new double[] { x1, x2 }), r);
        // toy reward: distance between the chosen action and the second feature, recentered around zero
        r = Math.abs(nextA - x2) - 0.5;
    }
}
Also used : TabularAction(nars.rl.horde.functions.TabularAction) ArrayRealVector(org.apache.commons.math3.linear.ArrayRealVector) GQ(nars.rl.horde.functions.GQ) QLearningControl(nars.rl.horde.QLearningControl)
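
For orientation, QLearningControl pairs a greedy acting policy with a Q-learning learner; the classical tabular form of that update is Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The sketch below is a minimal, self-contained tabular version of that rule in plain Java. It does not use the narchy/rlpark classes; the environment, the state discretization, the epsilon-greedy exploration, and names such as numStates, epsilon, and argmax are illustrative assumptions only.

import java.util.Random;

/** Minimal tabular Q-learning sketch (illustration only, not the narchy API). */
public class TabularQLearningSketch {

    public static void main(String[] args) {
        final int numStates = 2;    // hypothetical: two discretized states
        final int numActions = 2;   // matches the { 0, 1 } action set above
        final double alpha = 0.1;   // learning rate
        final double gamma = 0.99;  // discount factor
        final double epsilon = 0.1; // exploration rate for epsilon-greedy

        double[][] q = new double[numStates][numActions];
        Random rng = new Random();

        int state = 0;
        for (int i = 0; i < 1000; i++) {
            // epsilon-greedy action selection
            int action = rng.nextDouble() < epsilon
                    ? rng.nextInt(numActions)
                    : argmax(q[state]);

            // hypothetical environment: next state is random, reward favors action == state
            int nextState = rng.nextInt(numStates);
            double reward = (action == state) ? 1.0 : -0.5;

            // Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            double target = reward + gamma * q[nextState][argmax(q[nextState])];
            q[state][action] += alpha * (target - q[state][action]);

            state = nextState;
        }
        System.out.println(java.util.Arrays.deepToString(q));
    }

    private static int argmax(double[] values) {
        int best = 0;
        for (int a = 1; a < values.length; a++)
            if (values[a] > values[best]) best = a;
        return best;
    }
}

The GQ learner in the example above applies the same idea with linear function approximation and eligibility traces (the lambda parameter) rather than an explicit lookup table.
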

Aggregations

QLearningControl (nars.rl.horde.QLearningControl)1 GQ (nars.rl.horde.functions.GQ)1 TabularAction (nars.rl.horde.functions.TabularAction)1 ArrayRealVector (org.apache.commons.math3.linear.ArrayRealVector)1