NPTEL An Introduction To Artificial Intelligence Assignment 11 Answer


 

NPTEL SWAYAM is a free online learning platform providing courses in various disciplines from top universities and institutions in India. It offers an interactive learning environment with engaging video lectures, quizzes, assignments, and discussion boards. Learners can access courses at their own pace and convenience without any registration or enrollment. Successful completion of courses leads to recognized certification, enhancing career prospects. The platform is accessible to all learners, regardless of age, background, or qualifications. With its high-quality educational resources and opportunities for career advancement, NPTEL SWAYAM is an excellent choice for anyone looking to expand their knowledge and skills.


ABOUT THE COURSE:
The course introduces a variety of concepts in the field of artificial intelligence. It discusses the philosophy of AI and how to model a new problem as an AI problem. It describes a variety of models, such as search, logic, Bayes nets, and MDPs, that can be used to model a new problem. It also teaches a first set of algorithms for solving each formulation. The course prepares a student to take a variety of focused, advanced courses in various subfields of AI.


CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
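
The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `certificate_status` and the choice to count unattempted assignments as 0 (by always dividing by 8) are assumptions, not something the course page states.

```python
def certificate_status(assignment_scores, exam_score):
    """Apply the stated certificate rules.

    assignment_scores: percentages (0-100) for the submitted assignments.
    exam_score: proctored exam percentage (0-100).
    Returns (assignment_part, exam_part, final_score, eligible).
    """
    best8 = sorted(assignment_scores, reverse=True)[:8]
    # Assumption: fewer than 8 submissions still divide by 8 (missing ones count as 0).
    assignment_part = 0.25 * sum(best8) / 8   # out of 25
    exam_part = 0.75 * exam_score             # out of 75
    final_score = assignment_part + exam_part
    # Both thresholds must hold; the final score alone is not enough.
    eligible = assignment_part >= 10 and exam_part >= 30
    return assignment_part, exam_part, final_score, eligible


# Hypothetical learner: weak assignments, decent exam.
print(certificate_status([30] * 12, 60))
```

In the hypothetical call, the final score is above 40 but the assignment component is below 10/25, so the learner is not eligible.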



1. Which of the following statements are true?
 
 
 
 

Answer:- A, B & D


2. Suppose you are performing model-based passive learning according to a given policy. Following this policy, you have reached State A a total of 100 times. State A has 4 possible transitions to next states: A, B, C, and D. The policy stipulates that you take the action a at this state. Taking action a, you end up in state A 61 times, state B 22 times and state C 17 times. Assuming add-one smoothing, what is the value of T(A, a, B)?

Round off the answer to the first 3 decimal places.


Answer:- 0.221
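
The add-one (Laplace) smoothing in this question can be checked with a short Python sketch. The helper name `smoothed_transition` is hypothetical; the counts come from the question, with state D given a count of 0 because it was never reached.

```python
def smoothed_transition(counts, next_state):
    """Add-one estimate of T(s, a, next_state) from observed outcome counts."""
    total = sum(counts.values())
    # Add 1 to the observed count and one pseudo-count per possible next state.
    return (counts[next_state] + 1) / (total + len(counts))


# Outcomes of taking action a in state A (100 visits, 4 possible next states).
counts_A_a = {"A": 61, "B": 22, "C": 17, "D": 0}
estimate = smoothed_transition(counts_A_a, "B")
print(round(estimate, 3))  # (22 + 1) / (100 + 4) = 0.221
```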


1 point
3. For the next three questions, consider the following trajectories obtained by running some simulations in an unknown environment following a given policy. The state space is {A, B, C} and the action space is {a, b}. Assume the discount factor is 0.5. Each sample is represented as (State, Action, Reward, Next state).
Run 1: (A, a, 0, B)
Run 2: (C, b, -1, A), (A, a, 0, B)
Run 3: (C, b, -1, B)
Run 4: (A, a, 0, B)
Run 5: (A, a, 0, C), (C, b, -1, B)
Using model-free passive learning, give an empirical estimate of V^π(A).

Round off the answer to the first 3 decimal places.


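Answer:- -0.125 (A is visited in Runs 1, 2, 4 and 5 with discounted returns 0, 0, 0 and 0 + 0.5 × (-1) = -0.5, which average to -0.5 / 4 = -0.125)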
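
A minimal Python sketch of the empirical estimate, assuming "model-free passive learning" here means direct evaluation (averaging the discounted return observed after each visit to a state). The function name `direct_evaluation` is hypothetical; the runs are copied from the question.

```python
GAMMA = 0.5

# Each sample is (state, action, reward, next_state).
RUNS = [
    [("A", "a", 0, "B")],
    [("C", "b", -1, "A"), ("A", "a", 0, "B")],
    [("C", "b", -1, "B")],
    [("A", "a", 0, "B")],
    [("A", "a", 0, "C"), ("C", "b", -1, "B")],
]


def direct_evaluation(runs, gamma):
    """Average the discounted return seen after every visit to each state."""
    returns = {}
    for run in runs:
        g = 0.0
        # Walk each run backwards so the return accumulates in one pass.
        for state, _action, reward, _next_state in reversed(run):
            g = reward + gamma * g
            returns.setdefault(state, []).append(g)
    return {state: sum(gs) / len(gs) for state, gs in returns.items()}


values = direct_evaluation(RUNS, GAMMA)
print(values["A"])
```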

1 point
4. Assume that the above samples are fed sequentially to a Temporal Difference learner. Assume all values of states are initialised to 0 and alpha is kept constant at 0.5. What will be the learned value of V^π(A)?

Round off the answer to the first 2 decimal places.


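Answer:- -0.19 (V(A) stays at 0 until the sample (A, a, 0, C) arrives, when V(C) = -0.75; the update then gives V(A) = 0 + 0.5 × (0 + 0.5 × (-0.75) - 0) = -0.1875)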
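
The TD updates can be replayed with a short Python sketch. It assumes the samples are processed in the order the runs list them and that a state with no update yet (such as B) keeps its initial value of 0. The function name `td0` is hypothetical.

```python
# Samples flattened in run order: (state, action, reward, next_state).
SAMPLES = [
    ("A", "a", 0, "B"),
    ("C", "b", -1, "A"), ("A", "a", 0, "B"),
    ("C", "b", -1, "B"),
    ("A", "a", 0, "B"),
    ("A", "a", 0, "C"), ("C", "b", -1, "B"),
]


def td0(samples, alpha=0.5, gamma=0.5):
    """Apply V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)) per sample."""
    values = {}
    for state, _action, reward, next_state in samples:
        old = values.get(state, 0.0)  # unseen states start at 0
        target = reward + gamma * values.get(next_state, 0.0)
        values[state] = old + alpha * (target - old)
    return values


td_values = td0(SAMPLES)
print(td_values)
```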

1 point
5. Assume that the above samples are fed to a Q-learner. What is the value of Q(A,a)? Assume that all Q-values are initialized as 0. The discount factor is 0.5 and the learning rate is also 0.5.


Answer:- 0
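
A sketch of the Q-learning updates in Python, again assuming the samples are replayed in run order and that any (state, action) pair not yet updated keeps its initial value of 0. The function name `q_learn` is hypothetical.

```python
ACTIONS = ("a", "b")

# Samples flattened in run order: (state, action, reward, next_state).
SAMPLES = [
    ("A", "a", 0, "B"),
    ("C", "b", -1, "A"), ("A", "a", 0, "B"),
    ("C", "b", -1, "B"),
    ("A", "a", 0, "B"),
    ("A", "a", 0, "C"), ("C", "b", -1, "B"),
]


def q_learn(samples, actions, alpha=0.5, gamma=0.5):
    """Apply Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    q = {}
    for state, action, reward, next_state in samples:
        # Untouched entries default to the initial value 0.
        best_next = max(q.get((next_state, a2), 0.0) for a2 in actions)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q


q_values = q_learn(SAMPLES, ACTIONS)
print(q_values[("A", "a")])  # 0.0
```

Q(A, a) never moves because every target it sees is 0: the reward is 0, and the best next-state value is 0 (B is never updated, and at C the untouched action a still has value 0).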

1 point
6. Suppose we compute the optimal policy given the current Q-values. What is the action under optimal policy at state C?
Type a or b.


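Answer:- a (Q(C, a) is never updated and stays at its initial value 0, while the samples drive Q(C, b) down to -0.875, so the greedy action at C is a)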
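
Greedy action selection over a Q-table can be sketched as follows. The table below is an assumption: it holds the values obtained by replaying the samples with learning rate 0.5 and discount factor 0.5, and any missing entry is treated as the initial value 0. The helper name `greedy_action` is hypothetical.

```python
def greedy_action(q, state, actions, default=0.0):
    """Return the action with the highest Q-value in the given state."""
    return max(actions, key=lambda action: q.get((state, action), default))


# Assumed Q-values after the updates; Q(C, a) was never updated, so it is absent.
Q_TABLE = {("A", "a"): 0.0, ("C", "b"): -0.875}

best_at_C = greedy_action(Q_TABLE, "C", ("a", "b"))
print(best_at_C)
```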

1 point
7. Which of the following is correct regarding Boltzmann exploration?
 
 
 
 
1 point
8. Which of the following is required for the convergence of Q-learning to the optimal Q-values?
 
 
 
 
1 point
9. Which of the following statements are correct?
 
 
 
 
1 point
10. Which of the following statement(s) is/are correct for Model-based and Model-free reinforcement learning methods?
 
 
 
 