Sub-taskforce 3: internal reward

Plan for the next couple of weeks

In addition to the earlier trials that used distance travelled as the internal reward for adapting the organism-mode controllers, VU will run trials with QI as the internal reward and, if time permits, with a combination of QI and distance. The methods will be compared in terms of the area of the arena explored as well as the distance travelled. The VU team will try combinations of sensory input (distance sensors, GPS measurements) and actuator positions as input for the QI calculations.
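As a rough illustration of how such a combined internal reward could be computed, the sketch below treats QI as an information-theoretic novelty term (here approximated as the Shannon entropy of discretized sensor readings) and mixes it with distance travelled. This is only an assumption about what the QI calculation might look like; the function names, the entropy-based stand-in for QI, and the mixing weights are all hypothetical and not taken from the project's actual code.

```python
import math
from collections import Counter

def entropy(samples, bins=10, lo=0.0, hi=1.0):
    # Shannon entropy (in bits) of scalar readings discretized into equal bins.
    # Stand-in for a QI term; the real QI calculation is not specified here.
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def internal_reward(distance, sensor_log, w_dist=0.5, w_qi=0.5):
    # Hypothetical combined reward: weighted sum of distance travelled
    # and the entropy of the logged sensor stream.
    return w_dist * distance + w_qi * entropy(sensor_log)
```

For example, a trial that covered 2.0 m while the sensor readings fell into two equally likely bins (1 bit of entropy) would score `internal_reward(2.0, [0.05, 0.15])` = 1.5 under the default weights.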

Christopher remarks:
"[I]f QI/curiosity works then I am fine with that choice because it can work on the robot without hacks. I see distance as the fallback plan since it's easy to implement on the simulator"

We are still discussing further alternatives related to weighted QI (for instance, evolving the weights).
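One way the weighted-QI idea could be set up is to compute a QI-style term per input channel (e.g. each distance sensor, GPS coordinate, or actuator position) and combine them with a weight vector that the evolutionary process can mutate. This is a sketch under the same assumption as above (entropy as a stand-in for QI); the per-channel decomposition and the weight vector are illustrative, not the agreed design.

```python
import math
from collections import Counter

def channel_entropy(samples, bins=10, lo=-1.0, hi=1.0):
    # Shannon entropy (in bits) of one input channel, discretized into bins.
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def weighted_qi(channels, weights):
    # Hypothetical weighted QI: per-channel entropies combined with a
    # weight vector that could be part of the evolved genome.
    return sum(w * channel_entropy(ch) for w, ch in zip(weights, channels))
```

Evolving `weights` alongside the controller would let selection decide which input channels' information content matters most for the task.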
