status 5 Dec 2011
VU is generating logs to analyse for preference-based learning.
Ran tests with QI as the internal reward (initially based on GPS only, now enhanced with accelerometer and joint force-feedback, though not separately tuned); this gives results comparable to using distance as the reward.