due date: Tue Apr 4 by 9:00 PM
email to: mcb419@gmail.com
subject: hw11
email contents:
1) jsbin.com link to your project code
2) answers to all the questions at the bottom of this page
This assignment combines elements of associative learning (associating pellet color with reward), estimating reward values using the delta rule, and implementing action policies based on estimated reward values. A single bot forages for RED, GREEN, and BLUE pellets, and each color has a different reward value. Your bot needs to learn the expected value of each color and use that information to implement an efficient foraging strategy. The objective is to collect as much energy as possible in a fixed time period (2000 ticks).
Pellets:
pellets - 10 each of red, green, and blue; randomly distributed; can be detected at a distance;
pellet values - pellet colors are randomly assigned to 3 categories: Best, Neutral, Worst
-- Best: 90% of pellets return a reward of +4, 10% return a reward of -4
-- Neutral: 50% return +4, 50% return -4
-- Worst: 10% return +4, 90% return -4
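To make the three categories concrete, the long-run average reward per pellet can be worked out from the probabilities listed above. The sketch below is only an illustration; the names `categories`, `expectedReward`, and `expected` are made up for this example and are not part of the project code.

```javascript
// Probability of a +4 reward for each category, taken from the list above.
const categories = {
  best: { pGood: 0.9 },
  neutral: { pGood: 0.5 },
  worst: { pGood: 0.1 },
};
const REWARD_GOOD = 4;
const REWARD_BAD = -4;

// Expected reward = p(+4) * 4 + p(-4) * (-4).
function expectedReward(pGood) {
  return pGood * REWARD_GOOD + (1 - pGood) * REWARD_BAD;
}

const expected = {};
for (const [name, c] of Object.entries(categories)) {
  expected[name] = expectedReward(c.pGood);
}

// Roughly 3.2 for best, 0 for neutral, -3.2 for worst
// (up to floating-point rounding).
console.log(expected);
```

These are the values a well-tuned estimate should hover around once the bot has sampled enough pellets of each color.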
Bot sensory inputs:
bot.sns.left/right = a 1-d array [snsR, snsG, snsB] returning the sensed intensity for each pellet color (Braitenberg-style);
bot.sns.collision = true when the bot hits a boundary; false otherwise
bot.sns.deltaEnergy = energy gained on previous time step
bot.sns.lastColorConsumed = a string ("red", "green", "blue") indicating the color of the last pellet consumed
Bot motor output:
bot.mtr.left/right = motor velocity (Braitenberg-style);
Controllers:
seekRed - seeks red pellets, ignores other colors
seekGreen - seeks green pellets, ignores other colors
seekBlue - seeks blue pellets, ignores other colors
seekAll - seeks all pellets by using sum of R,G,B sensors
seekUser - this is the controller that you will develop
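As a rough idea of what a Braitenberg-style seeking controller can look like, here is a minimal sketch using crossed excitatory connections (each sensor drives the opposite motor). It is not the provided controller code, which may be wired differently. Only the field names `sns.left`, `sns.right`, `mtr.left`, and `mtr.right` come from this page; `seekColor`, `seekSummed`, `baseSpeed`, and `gain` are hypothetical.

```javascript
// Steer toward a single color. colorIndex: 0 = red, 1 = green, 2 = blue.
// baseSpeed and gain are made-up tuning parameters for this sketch.
function seekColor(bot, colorIndex, baseSpeed = 0.5, gain = 1.0) {
  const left = bot.sns.left[colorIndex];
  const right = bot.sns.right[colorIndex];
  // Crossed wiring: a stronger signal on the left speeds up the RIGHT
  // motor, which turns the bot to the left, toward the source.
  bot.mtr.left = baseSpeed + gain * right;
  bot.mtr.right = baseSpeed + gain * left;
}

// Steer toward all colors by summing the R, G, B intensities per side.
function seekSummed(bot, baseSpeed = 0.5, gain = 1.0) {
  const sum = (arr) => arr.reduce((acc, x) => acc + x, 0);
  bot.mtr.left = baseSpeed + gain * sum(bot.sns.right);
  bot.mtr.right = baseSpeed + gain * sum(bot.sns.left);
}

// Usage with a plain object standing in for the real bot:
const demoBot = {
  sns: { left: [0.8, 0.0, 0.1], right: [0.2, 0.0, 0.1] },
  mtr: { left: 0, right: 0 },
};
seekColor(demoBot, 0); // red is stronger on the left, so the bot turns left
```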
First, run the provided controllers and understand how they work.
Next, using the seekAll controller for testing, modify the bot.prototype.updateEstimates method to update the bot's estimatedValue array as the bot consumes pellets. This array has three elements holding the estimated values of red, green, and blue, respectively. The values will be displayed automatically to the right of the canvas.
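One possible shape for the delta-rule update is sketched below as a standalone function, so that it can be tried outside the simulation. It moves the estimate for the consumed color a fraction `alpha` of the way toward the reward just received. Several things here are assumptions rather than facts about the project code: the learning rate `alpha` and its default, the `COLOR_INDEX` lookup, and the idea that `deltaEnergy` is nonzero only on the tick when a pellet is consumed. If the simulation changes energy on other ticks, the consumption check needs to be adapted.

```javascript
// Maps the color string to its slot in estimatedValue (red, green, blue).
const COLOR_INDEX = { red: 0, green: 1, blue: 2 };

// Delta rule: V[color] <- V[color] + alpha * (reward - V[color])
function updateEstimates(bot, alpha = 0.2) {
  const reward = bot.sns.deltaEnergy;
  // Assumption: zero energy change means no pellet was consumed this tick.
  if (reward === 0) return;
  const i = COLOR_INDEX[bot.sns.lastColorConsumed];
  if (i === undefined) return; // nothing consumed yet, or unknown color
  bot.estimatedValue[i] += alpha * (reward - bot.estimatedValue[i]);
}

// Usage with a plain object standing in for the real bot:
const learner = {
  sns: { deltaEnergy: 4, lastColorConsumed: "green" },
  estimatedValue: [0, 0, 0],
};
updateEstimates(learner, 0.5); // green estimate: 0 + 0.5 * (4 - 0) = 2
updateEstimates(learner, 0.5); // green estimate: 2 + 0.5 * (4 - 2) = 3
```

A larger `alpha` tracks recent rewards more closely but gives noisier estimates; a smaller one averages over more pellets but learns more slowly.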
Once your estimates are being computed correctly, write your own controller code, bot.prototype.seekUser, to use this information in a way that optimizes foraging performance. You should be able to reliably achieve scores over 200.
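One simple way to turn the estimates into a policy is to weight each color's sensor reading by its estimated value, ignoring colors that currently look unrewarding. The sketch below is one possible starting point under that assumption, not a reference solution, and it makes no claim about the score it reaches. The helper names `colorWeights` and `seekWeighted`, and the parameters `baseSpeed` and `gain`, are hypothetical.

```javascript
// Turn value estimates into non-negative sensor weights.
function colorWeights(estimatedValue) {
  const positive = estimatedValue.map((v) => Math.max(0, v));
  if (positive.some((w) => w > 0)) return positive;
  // No color looks rewarding yet: explore the color(s) tied for the
  // highest estimate so that learning can continue.
  const best = Math.max(...estimatedValue);
  return estimatedValue.map((v) => (v === best ? 1 : 0));
}

// Braitenberg-style steering on the value-weighted sensor signal.
function seekWeighted(bot, baseSpeed = 0.5, gain = 1.0) {
  const w = colorWeights(bot.estimatedValue);
  const weighted = (s) => s.reduce((acc, x, i) => acc + w[i] * x, 0);
  // Crossed wiring: each side's signal drives the opposite motor,
  // which turns the bot toward the stronger weighted signal.
  bot.mtr.left = baseSpeed + gain * weighted(bot.sns.right);
  bot.mtr.right = baseSpeed + gain * weighted(bot.sns.left);
}

// Usage with a plain object standing in for the real bot:
const forager = {
  sns: { left: [0, 1, 0], right: [1, 0, 0] },
  mtr: { left: 0, right: 0 },
  estimatedValue: [3, -3, 0],
};
seekWeighted(forager); // ignores green on the left, turns right toward red
```

Because a color whose estimate drops below zero is no longer approached, its estimate is no longer corrected either; whether occasional exploration is worth the cost is one of the design choices to weigh.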
Controller | Fitness mean (std dev) |
---|---|