Internal reward (task 3) progress week 2

First, work is underway to test QI, rather than distance, as the internal reward for lifetime learning.

Secondly, we will try to generate robot traces (i.e., sensori-motor logs) labelled as showing more or less desirable behaviour, where 'desirable' might mean walking far or walking 'naturally'. These logs can then serve as a basis for preference-based learning, and so indirectly define a fitness function over desirable sensori-motor states.
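
As a rough illustration of how labelled traces could turn into a fitness function, the sketch below fits a linear score to pairwise preferences with a Bradley-Terry style model. Everything in it is an assumption made for the example rather than the project's chosen method: the two channels (forward velocity and body tilt), the per-channel mean as the trace feature, the synthetic data from the hypothetical `make_trace`, and the helper names `trace_features`, `score` and `fit_preferences`.

```python
import math
import random


def make_trace(rng, desirable, steps=50):
    """Synthetic stand-in for a robot log (assumption, not real data).

    Each step is [forward_velocity, body_tilt].
    """
    if desirable:
        return [[rng.uniform(0.5, 1.0), rng.uniform(0.0, 0.2)] for _ in range(steps)]
    return [[rng.uniform(0.0, 0.4), rng.uniform(0.3, 0.6)] for _ in range(steps)]


def trace_features(trace):
    """Summarise a trace as the mean of each channel (hypothetical feature choice)."""
    n = len(trace)
    dims = len(trace[0])
    return [sum(step[d] for step in trace) / n for d in range(dims)]


def score(weights, features):
    """Linear fitness estimate: higher means more desirable."""
    return sum(w * f for w, f in zip(weights, features))


def fit_preferences(pairs, dims, lr=0.1, epochs=200):
    """Fit weights so that preferred traces score higher than the others.

    pairs: list of (features_of_preferred_trace, features_of_other_trace).
    Model: P(preferred beats other) = sigmoid(score(preferred) - score(other)).
    """
    weights = [0.0] * dims
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - w for b, w in zip(better, worse)]
            p = 1.0 / (1.0 + math.exp(-score(weights, diff)))
            # Gradient ascent on the log-likelihood of the observed preference.
            for d in range(dims):
                weights[d] += lr * (1.0 - p) * diff[d]
    return weights


rng = random.Random(0)
pairs = [
    (trace_features(make_trace(rng, True)), trace_features(make_trace(rng, False)))
    for _ in range(20)
]
weights = fit_preferences(pairs, dims=2)
print("learned weights [velocity, tilt]:", weights)
```

The learned `score` can then rank unseen traces. With real logs, the labelled pairs would replace the synthetic ones, and the feature summary would likely need to be richer than a per-channel mean.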

