Here is the portal for uploading your HW: Implementation of Q-learning.
A short rubric:
Total points: 20
- Correct initialization (a proper n*n Q-matrix, an R matrix or vector, etc., according to your implementation): 3 points
- Correct transition function or matrix that gives the next state from the current state and the action: 3 points
- Correct function or code block for choosing a random, valid action (or similar): 3 points
- Implement the episode iterations, calculate the Q value, and update the Q matrix correctly: 6 points
- Return the correct path to the goal state given the Q matrix: 5 points (this means you need to create a concrete gridworld using your implementation and find the solution)
P.S. You can set the learning rate alpha to 1 in the homework to use the simplest form of the Q update: Q(s,a) <- R(s,a) + gamma * max over a' of Q(s',a'), where s' is the next state reached from s by action a.
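As a rough illustration of how the rubric items fit together with alpha = 1, here is a minimal sketch. The 3x3 grid, the goal at state 8, the reward of 100, gamma = 0.8, and all function names are assumptions made for the example, not part of the assignment; it also uses a states-by-actions Q table rather than an n*n matrix, which is one possible reading of "according to your implementation".

```python
import random

# Hypothetical 3x3 gridworld: states 0..8 in row-major order, goal at state 8.
N_ROWS, N_COLS = 3, 3
N_STATES = N_ROWS * N_COLS
GOAL = 8
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.8  # assumed discount factor

def next_state(state, action):
    """Transition function: the state reached by `action`, or None if it leaves the grid."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < N_ROWS and 0 <= new_col < N_COLS:
        return new_row * N_COLS + new_col
    return None

def valid_actions(state):
    """Indices of the actions that keep the agent inside the grid."""
    return [a for a in range(len(ACTIONS)) if next_state(state, a) is not None]

def reward(state, action):
    """R(s,a): 100 for stepping into the goal, 0 otherwise (an assumed reward scheme)."""
    return 100.0 if next_state(state, action) == GOAL else 0.0

def train(episodes=1000, seed=0):
    rng = random.Random(seed)
    # Initialization: one row per state, one column per action, all zeros.
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        while state != GOAL:
            action = rng.choice(valid_actions(state))  # random valid action
            nxt = next_state(state, action)
            best_next = max(q[nxt][a] for a in valid_actions(nxt))
            # alpha = 1 update: Q(s,a) <- R(s,a) + gamma * max_a' Q(s',a')
            q[state][action] = reward(state, action) + GAMMA * best_next
            state = nxt
    return q

def greedy_path(q, start):
    """Follow the highest-valued valid action from `start` until the goal is reached."""
    path = [start]
    state = start
    while state != GOAL and len(path) <= N_STATES:  # length guard avoids endless loops
        action = max(valid_actions(state), key=lambda a: q[state][a])
        state = next_state(state, action)
        path.append(state)
    return path

q_table = train()
path = greedy_path(q_table, start=0)
print(path)
```

The printed list is the sequence of state indices from the start state to the goal; on this grid a shortest path from state 0 takes four moves.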
Extra Credit:
- Show the update of the Q matrix every N episodes (you choose N): 1 point
- Set alpha strictly between 0 and 1: 2 points
- Implement a simple GUI that shows the agent's movement or the change of policy: 2 points
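For the first two extra-credit items, one possible sketch is shown below. With alpha between 0 and 1 the standard update becomes Q(s,a) <- Q(s,a) + alpha * (R(s,a) + gamma * max over a' of Q(s',a') - Q(s,a)). The 1x5 corridor, the constants, and the names `step`, `train`, and `REPORT_EVERY` are hypothetical choices for the example; here a move into a wall leaves the agent in place, so every action counts as valid.

```python
import random

# Hypothetical 1x5 corridor: states 0..4, goal at state 4; actions 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
GAMMA, ALPHA = 0.8, 0.5   # ALPHA strictly between 0 and 1 (assumed values)
REPORT_EVERY = 50         # the "N" in "every N episodes"

def step(state, action):
    """Move left or right, staying in place at a wall; reward 100 on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (100.0 if nxt == GOAL else 0.0)

def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    snapshots = []
    for episode in range(1, episodes + 1):
        state = rng.randrange(GOAL)  # start anywhere except the goal
        while state != GOAL:
            action = rng.randrange(2)
            nxt, r = step(state, action)
            target = r + GAMMA * max(q[nxt])
            # Move Q(s,a) a fraction ALPHA of the way toward the target.
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
        if episode % REPORT_EVERY == 0:
            snapshots.append([row[:] for row in q])  # copy, so later updates do not alter it
            print(f"episode {episode}:", [[round(v, 1) for v in row] for row in q])
    return q, snapshots

q_table, snapshots = train()
```

Keeping copies of the table in `snapshots` also gives a GUI or plot something to replay later, if you attempt the third item.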
Hello, I am Ibad. I am a software engineer and have completed my MPhil in software engineering, so my core areas are mathematics and artificial intelligence. I can do this assignment in full; just let me know when you need it. I am waiting for your reply.