Please attach this sheet to the front of your assignment.
• This assignment contributes 50% towards your final grade. The total mark is 100.
• This assignment has two parts. The first part is base problems. The second part has a choice of two
options A and B.
• You can team up with another student. If you work as a team, both students should sign this
page, but only one submission is needed.
• Submit an electronic copy through Canvas before 11:59 p.m. on Monday, 6 June 2022. The submission
requirement is specified at the end of each option.
The School of Computer and Mathematical Sciences regards any act of cheating including plagiarism, unauthorised collaboration and theft of another student’s work most seriously. Any such act will result in a mark
of zero being given for this part of the assessment and may lead to disciplinary action.
Here is a quote from Richard Feynman, a well known scientist, “The first principle is that you must not
fool yourself and you are the easiest person to fool.”
Please sign to signify that you understand what this means, and that the assignment is your own work.
Base Problems – 40 Marks
This is the set of base problems that are shared by all options.
Question 1. Normal Form Game 10 Marks
Here is a simplified version of the game “Win an additional mark if U can!”.
• There are two players.
• Each player names an integer between 1 and 4.
• The player who names the integer closest to two thirds of the average integer gets a reward of 10, the
other players get nothing.
• If there is a tie (i.e., choosing the same number), each player gets reward of 5.
(a) Represent this game in Normal Form. (4 marks)
(b) Answer the following questions (3 marks)
• When player 2 chooses 4, what are the best responses for player 1?
• When player 1 chooses 3, what are the best responses for player 2?
• When player 2 chooses 2, what are the best responses for player 1?
• When player 1 chooses 1, what are the best responses for player 2?
• For player 1, is the strategy of choosing 4 strictly or very weakly dominated by another strategy? If
so, which ones?
• For player 2, is the strategy of choosing 1 strictly or very weakly dominated by another strategy? If
so, which ones?
(c) What is the Nash equilibrium of this game? (3 marks)
Find this out by applying the concept of dominated strategies to rule out a succession of inferior strategies
until only one choice remains.
Question 2. Tic-Tac-Toe 10 Marks.
This problem exercises the basic concepts of game playing, using Tic-Tac-Toe (Naughts and Crosses) as
an example, so that you have a better understanding on the main method in Deep Blue beating the best human
Tic-tac-toe has two players, X and O, who take turns marking the spaces in a 3×3 grid. X player starts
first. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins
(a) Estimate the number of possible game plays of Tic-Tac-Toe. One game play is a sequence of moves from
initial state till a terminal state. (2 marks)
(b) Show the whole game tree starting from an empty board down to depth 2 (i.e. one X and one O on the
board), taking symmetry into account. (2 marks)
By symmetry, we mean that if you turn the whole board around (without moving the relative position of
the pieces), you’ll get the same configuration. For example, the following 4 states are the same under
They are different from the following state no matter how hard you turn
(c) Mark on your tree the evaluations of all the positions at depth 2. The evaluation function is given as
follows. (2 marks)
We define Xn as the number of rows, columns or diagonals with exactly n X’s and no O’s. Similarly,
On is the number of rows, columns or diagonals with just n O’s. The utility function assigns +10 to any
position with X3 >= 1 and -10 to any position with O3 >= 1. All other terminal positions have utility
0. For the nonterminal positions, we use a linear evaluation function defined as
Eval(s) = 3X2(s) + X1(s) − (3O2(s) + O1(s))
(d) Using the minimax algorithm, mark on your tree the backed-up values for the positions at depths 1 and
0, and use those values to choose the best starting move. (2 marks)
(e) Circle the nodes at depth 2 that would not be evaluated if alpha-beta pruning were applied, assuming the
nodes are generated in the optimal order for alpha-beta pruning (i.e., in that order, you can prune the
most numbers of nodes). (2 marks)
Question 3. Extensive Form Game 10 Marks
Consider a variant of the Take-away game discussed in the lecture:
• There is a pile of 4 chips on the table.
• Two players take turns to remove 1 or 2 chips from the table, with player 1 starting first.
• The player removing the last chip(s) wins the game, and get a reward of 1; and the opponent gets a
reward of -1.
(a) Represent this game in Extensive Form. (2 marks)
(b) List all the pure strategies of the two players. (2 marks)
Note that you may first find the number of decision nodes for each player, and label the nodes in an
(c) Represent this game in normal form. (2 marks)
(Each row or column is a pure strategy for the respective player).
(d) Find all of the pure strategy Nash Equilibria of this game. (2 marks)
Indicate if an equilibrium is subgame perfect.
(e) Find the solution using the backward induction method. Does any player have a winning strategy? If so,
list the winning strategy. (2 marks)
Question 4. First-order Logic and Game Description Language (GDL) 10 Marks
Given the following information in natural language:
1 Farmers like engineers.
2 Hunters like farmers.
3 Alice likes Jack.
4 Jack is a hunter.
5 David is a farmer.
6 Julie is an engineer.
7 If A likes B, and B likes C, then A likes C.
(a) Explain the key difference between the Imperative and Declarative programming languages, and give an
example of each category.
(b) Use the predicates likes(,), framer(), engineer(), hunter() and constant symbols alice, jack, david, julie
to translate the above information to GDL rules. The rules should be ordered exactly according to the
order of the above sentences. (3 marks)
The translation of the first rule is given:
1 likes(X,Y) <= farmer(X) /\ engineer(Y) (c) Use first-order inference to show responses to the following query. Let KB be the knowledge base containing the 7 rules in (b). Find all the answers for the following query: KB| = likes(alice, X) Ask for every solution with derivation. (6 marks) Here is the format of a derivation: likes(alice,X) → state which rule is used in unification and its mgu θ. New query resulting from doing substitution using θ. . . . repeat the above procedure (resolution) until an empty query (which is a success) or failure ... 4 Option A: 60 Marks - Knowledge and Rational Agents The aim of this assignment is for you to gain a better understanding of knowledge and rationality in building intelligent agents. During week 8, all students were given the opportunity to play a game named “Win an additional mark if U can!”. Each student has at least two opportunities to play this game one is during the lecture and one is during the workshop. All the participants were presented an online form like this (note that during the workshop plays, there is also a field indicating how many times a player had played this game or similar variants). We call this variant V100 as each player has 100 options. In the subsequent week, we discussed the week 8’s results during the lecture and the students were offered two more chance to play similar variants. The first was played during lecture 9 and the form is as follows. We call this variant V10. The second was played after the lecture and the link was announced on Canvas and the students were given the opportunity to think for 24 hours before submitting an answer, which is notably longer than the previous iterations. We call this variant V4. These are all variants of Keynes’ Beauty Contest Game discussed during a lecture. Your tasks are as follows. 5 Task 1 (50 Marks) The game and its variants have been played for a few times. You’ll look at the sessions of this year and 2021. Here are more details. The data files are available under the assignment folder. • Year 2022: 6 sessions as described above. The data file is 2022data.xlsx. The name and student IDs were replaced by pseudo IDs of letters followed by a number. • Year 2021: 4 sessions as described below. There are 3 workshops (A, B, and C, in chronicle order). In week 1, the participations were all online. The participants in B and C were informed the winners’ choices of earlier session A. Both A and C happened during the workshop, while B was given longer time by receiving the game by email. In week 3, we had a discussion of the results of week 1 and the theoretical solution assuming perfect rationality as a common knowledge, then we played the game again. The participants were from the physical classroom and online. We’ve made data files of all results by replacing the name and student IDs with pseudo IDs of letters followed by a number: 2021week1data.csv is for week 1; 2021week3data.csv is for week 3. Your task is to study these two years’ data files and write a report of 4 pages maximum. Your report shall address four aspects listed below. (a) Rationality and Knowledge (10 marks) Discuss what is rationality and why a player’s knowledge about other players’ rationality is important. Discuss if the winners win by rational thinking, pure luck, or a combination of both. You shall utilise the data files to support your argument. Note that there are 10 sessions in total, and 3 different game variants. In later sessions, a player’s previous experience is recorded. (b) Nash equilibrium (15 marks) Find the pure strategy Nash equilibrium of the three variants (V100, V10 and V4), assuming the perfect rationality of all players. In reality, not all players are rational. Discuss the relationship between the winning numbers and Nash equilibrium across different games, with regards to the players’ rationality. A comparison of the three variants can be made by scaling the options. E.g., the 100 options in V100 can be scalded down to 4 options by a mapping, so it can be more comparable to V4. (c) Player Classification (15 marks) For 2022 sessions, give a classification of players in terms of different levels of rationality and their knowledge about other’s rationality. A good classification scheme shall cover at least 3 classes. For each class, you give a clear criteria, so that each row in the data file can be classified into one of the classes. You need to show this as part of your answer. You can use IDs to refer to the players. To support your scheme, you shall utilise the reasons given by the players (but be minded that the players do not necessarily articulate their reasons very clearly). (d) 2022 VS 2021 (10 marks) Discuss the similarity and difference between 2022 and 2021’s results. Note that the difference in game settings should also be taken into consideration. Task 2 (10 Marks) Using the knowledge in game theory to design a normal form game that can be played by multiple people (at least 3 players). The game should be designed in a way that the outcome is decided by the players collectively, not by a single player. Write a report to at least discuss the following points. • The game design: players, actions, rewards. And the game theoretical analysis, e.g. is there a Nash equilibrium assuming perfect rationality of the players? • The experimental results. 6 Submission You can team up with a classmate to work on this assignment. One submission per team. You are required to provide: (1) Your solution to Base Problems, and (2) Your report for Task 1 (Limited to 4 A4 pages; clearly indicate (a)-(d)), (3) Your report for Task 2 (Limited to 2 A4 pages). We prefer that (1), (2) and (3) are included in a single PDF file. If you have extra data, your submission should including everything in a zip file. The file name follows the following format: AI2022A2A-XY.zip, where XY is your student ID(s). The submission is via Canvas. 7 Option B: 60 Marks - Game Playing Agent The aim of this assignment is for you to better understand computer game playing by implementing a special game playing agent with the minimax algorithm and evaluation function. Task 1 (50 Marks) Consider the Take-away game discussed in the lecture: • There is a pile of X chips on the table. • Two players take turns to remove 1, 2, or 3 chips from the table, with player 1 starting first. • The player removing the last chip(s) wins the game. In this task, you are asked to develop a game playing agent which can play the Tic-Tac-Toe and the Take-away games against a human player. In particular, you will need to: (a) Implement the minimax algorithm in two versions: (30 marks) • (i) complete tree search. You may use Depth-first search (DFS). • (ii) depth limited tree search and an evaluation function. For depth limited tree search, you just need to introduce a depth variable and when it decreases to 0, either the evaluation value or the terminal value is returned. For Tic-Tac-Toe game, the evaluation function is outlined in the Base Problems part. For Take-away game, you need to come up with your own evaluation function. A Pseudocode can be found at https://en.wikipedia.org/wiki/Minimax Report on the details of your program design (data structures, and functions). Let two versions compete against each other and report on the performance difference, e.g., win-draw-lose rate, and resource consumption. (b) Implement an intuitive user interface (10 marks) This allows the human player to know the game state/actions to play. In the start of games, the human player should be able to choose who plays first and which version of the minimax algorithm to use (in the case of depth limited, allowing to set the search depth). Note that for the Take-away game, when the number of chips is large, it might not be practical to display chips visually. In that case, you can use numbers. Report on the program design. (c) Scalability Study (10 marks) Your agent is resource bounded, so the response of the player will be slower if the game gets larger. Design experiments to study the scalability of your game playing agent by using the Take-away game. Specifically, here are some important aspects to consider: • Varying the value of X. E.g., when you have a large value, say X=1000, the game tree for Take-away will be super large, more than 3334 leaves (the branching factor b=3 (three actions), and the depth of leaves ranges from 334 - 1000). So one question on scalability will be: what is the maximal value of X that your agent (running in your hardware) can handle (producing a response in a reasonable amount of time, say 100 seconds?) • Extending the number of chips the players are allowed to take away. E.g., you could add two more actions: taking away 4 chips, taking away 5 chips. This will increase the branching factor to 5. 8 • Varying the limit of search depth. Report on your experiment design and results. Your hardware and software configurations should also be noted as they are very relevant to the results. Task 2 (10 Marks) Implement a game player for a game of your own interest. Here are the requirements. • Describe the game clearly, including the players, actions, game dynamics, terminal conditions and goals. • Your player should use the minimax algorithm in Task 1, and the alpha-beta pruning algorithm. A Pseudocode can be found at https://en.wikipedia.org/wiki/Alphabeta_pruning • Let two versions compete against each other and report on the performance difference, e.g., win-drawlose rate, and resource consumption. It is required that the game has enough complexity so that you are able to see the difference. Submission You can team up with a classmate to work on this assignment. One submission per team. You are required to provide: (1) Your solution to Base Problems. (2) A report for Task 1. (Limited to 4 A4 pages) (3) A report for Task 2. (Limited to 2 A4 pages) (4) The source code and the compiled program of your game playing agent (with notes on what library it may depends on and instructions on how to run it). There is no restriction on your programming language. Note that the key codes of the game algorithms have to be written by yourselves. We prefer that (1), (2) and (3) are included in a single PDF file. Your submission should including everything (report and source code) in a zip file. The file name follows the following format: AI2022A2BXY.zip, where XY is your student ID(s). The submission is via Canvas. 9