Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.


Commits on Source (46)
with 1357 additions and 412 deletions
......@@ -6,14 +6,80 @@ The package has been written to be used in the master course of AI taught at the
## Install
You can download the source code from <> and use `pip install .` or installing directly from the repository using
You can download the source code from <> and use `pip install .`, or install directly from the repository using
pip install
pip install git+
## Usage
To write your own player you should create a subclass of `Player` (defined in []( and then use an instance as a parameter of the `run_episode` method of `GridWorld` class (defined in []( Instances of the `Player` subclasses should be created using its `player` class method, where you can decide whether to give the player access to the underlying environment or to rely on the state information provided by the `play` method.
To write your own player you should create a subclass of `OnlinePlayer` or `OfflinePlayer` (defined in []( and then use an instance as a parameter of the `run_episode` function (defined in [](
Examples of the usage of the package can be found in the implementation of two players `RandomPlayer` and `UserPlayer` from [](, and in the file [``]( in the `examples` directory of the repository.
\ No newline at end of file
Examples of the usage of the package can be found in the implementation of two players `RandomPlayer` and `UserPlayer` in [](, and in the files [``](, [``]( in the [`examples`]( directory of the repository.
Your player can also be run with the `gridrunner` script (in the repository it is the [``]( file), which becomes available once the package is installed (alternatively, it can be executed with `python -m wumpus.cli`):
``` bash
$ gridrunner --help
usage: gridrunner [-h] [--name NAME] [--path PATH] --entry ENTRY
[--world {EaterWorld,WumpusWorld}] [--horizon HORIZON]
[--noshow] [--out OUT] [--version] [--log LOG]
[infiles [infiles ...]]
Run episodes on worlds using the specified player.
positional arguments:
infiles world description JSON files, they must be compatible
with the world type (see --world option). (default:
optional arguments:
-h, --help show this help message and exit
--name NAME, -n NAME name of the player, default to the name of the player
class (default: None)
--path PATH, -p PATH path of the player library, it's prepended to the
sys.path variable (default: .)
--entry ENTRY, -e ENTRY
object reference for a Player subclass in the form
'importable.module:object.attr'. See
points/#data-model> for details. (default: None)
--world {EaterWorld,WumpusWorld}, -w {EaterWorld,WumpusWorld}
class name of the world (default: EaterWorld)
--horizon HORIZON, -z HORIZON
maximum number of steps (default: 20)
--noshow prevent the printing the world at each step (default:
--out OUT, -o OUT write output to file (default: <_io.TextIOWrapper
name='<stdout>' mode='w' encoding='UTF-8'>)
--version show program's version number and exit
--log LOG, -l LOG write the log of the games to file (JSON) (default:
```
For example:
``` bash
$ gridrunner --world EaterWorld --entry wumpus:RandomPlayer --noshow --horizon 5 ./examples/eater-world.json
Step 0: agent Eater_c881e1b0 executing N -> reward -1
Step 1: agent Eater_c881e1b0 executing W -> reward -1
Step 2: agent Eater_c881e1b0 executing E -> reward -1
Step 3: agent Eater_c881e1b0 executing W -> reward -1
Step 4: agent Eater_c881e1b0 executing E -> reward -1
Episode terminated by maximum number of steps (5).
Episode terminated with a reward of -5 for agent Eater_c881e1b0
```
You can also use a player defined in a script; e.g., if the player class `GooPlayer` is defined in the `` you can use the `eater_usage:GooPlayer` entry. Remember Python rules for finding modules, where the current directory is added to the search path. If the script is in a different directory, you can use the `--path` argument to tell the script where to find it:
gridrunner --world EaterWorld --entry eater_usage:GooPlayer --path examples --noshow --horizon 5 ./examples/eater-world.json
\ No newline at end of file
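The `--entry` value uses the `importable.module:object.attr` form described in the help text, and `--path` is prepended to `sys.path`. A minimal sketch of how such a reference can be resolved with the standard library; the helper name `resolve_entry` is hypothetical and this is not the package's own `get_player_class`:

```python
import importlib
import sys
from functools import reduce
from typing import Any, Optional


def resolve_entry(ref: str, path: Optional[str] = None) -> Any:
    """Resolve 'importable.module:object.attr' to the object it names (hypothetical helper)."""
    module_name, sep, attr_path = ref.partition(':')
    if not sep or not module_name or not attr_path:
        raise ValueError('expected module:object.attr, got {!r}'.format(ref))
    if path is not None and path not in sys.path:
        # same idea as the documented --path option: search this directory first
        sys.path.insert(0, path)
    module = importlib.import_module(module_name)
    # follow the dotted attribute path starting from the imported module
    return reduce(getattr, attr_path.split('.'), module)


# Usage with standard-library names only
print(resolve_entry('collections:OrderedDict').__name__)
print(resolve_entry('json:dumps')([1, 2]))
```

With the package installed, a reference such as `wumpus:RandomPlayer` would presumably resolve the same way, since `RandomPlayer` is exported by the `wumpus` module.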
{"size": [5, 5], "block": [[1, 3]], "food": [[3, 0], [0, 0]], "eater": [1, 0]}
#!/usr/bin/env python
Examples of the use of the EaterWorld class
import random
from typing import Iterable
import sys
from wumpus import OfflinePlayer, run_episode, Eater, EaterWorld, Food
class GooPlayer(OfflinePlayer):
Offline player demonstrating the use of the start episode method to inspect the world.
def _say(self, text: str):
print( + ' says: ' + text)
def start_episode(self, world: EaterWorld) -> Iterable[Eater.Actions]:
Print the description of the world before starting.
# keep track of the reward
self.reward = 0
self._say('Episode starting for player {}'.format(
# inspect the objects in the world
food_locations = []
eater_location = None
for o in world.objects:
if isinstance(o, Eater):
eater_location = (o.location.x, o.location.y)
all_actions = list(o.Actions)
elif isinstance(o, Food):
food_locations.append((o.location.x, o.location.y))
# get the list of blocks
block_locations = sorted((bl.x, bl.y) for bl in world.blocks)
# Print the description of the world
self._say('World size: {}x{}'.format(world.size.x, world.size.y))
self._say('Eater agent in {}'.format(eater_location))
self._say('Food in {}'.format(sorted(food_locations)))
self._say('Blocks in {}'.format(block_locations))
self._say('Available actions: {}'.format({ a.value for a in all_actions}))
# creates an iterator that returns a sequence of random actions
def random_actions():
# prevent unbounded iterations
for _ in range(10000):
yield all_actions[random.randint(0, len(all_actions) - 1)]
return random_actions()
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed."""
self._say('Episode completed, my reward is {}'.format(outcome))
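The offline player above returns a bounded iterator of random actions from `start_episode`. The same idea in isolation, as a sketch: the `Actions` enum below is a stand-in (not the package's `Eater.Actions`), and `random.choice` replaces the manual `randint` indexing.

```python
import random
from enum import Enum
from typing import Iterable, Iterator, Optional


class Actions(Enum):
    """Stand-in action set, for illustration only."""
    N = 'N'
    S = 'S'
    E = 'E'
    W = 'W'


def random_actions(actions: Iterable[Actions], limit: int = 10000,
                   rng: Optional[random.Random] = None) -> Iterator[Actions]:
    """Yield at most `limit` actions, each chosen uniformly at random."""
    rng = rng or random.Random()
    choices = list(actions)
    # the explicit bound prevents an unbounded plan
    for _ in range(limit):
        yield rng.choice(choices)


plan = random_actions(Actions, limit=5, rng=random.Random(0))
print([a.value for a in plan])
```

Passing a seeded `random.Random` makes the plan reproducible, which helps when comparing runs.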
MAP_STR = """
# # # #
# # # #
# # #
# # # #
### ##
def main(*args):
Play a random EaterWorld episode using the default player
player_class = GooPlayer
world = EaterWorld.random(map_desc=MAP_STR)
player = player_class('Hungry.Monkey')
run_episode(world, player, horizon=20)
return 0
if __name__ == "__main__":
name: wumpus
- conda-forge
- nodefaults
- python>=3.8
- gym<=0.22
- pip
- pip:
- wumpus @
\ No newline at end of file
#!/usr/bin/env python
import argparse
import json
# Register the Wumpus world environment
import gym_wumpus
from gym_wumpus.envs import WumpusEnv
from gym import envs, error, make
def run_episode(env: WumpusEnv):
obs = env.reset()
total_reward = 0
print('Observation: {}'.format(env.space_to_percept(obs)))
for step in range(1000):
action = env.action_space.sample() # take a random action
obs, reward, done, info = env.step(action)
print('{} -> {}, [{}], {}{}'.format(env.space_to_action(action), reward, env.space_to_percept(obs), 'done ' if done else '', info))
total_reward += reward
if done:
print('Reward {} after {} steps'.format(total_reward, step))
def main():
default_env = 'wumpus-random-v0'
parser = argparse.ArgumentParser()
parser.add_argument('id', nargs='?', default=default_env, help='Environment name')
parser.add_argument('--list', help='Show available environments', action="store_true", default=False)
parser.add_argument('--file', type=argparse.FileType('r'), help='Read the JSON description of the world from the file')
args = parser.parse_args()
if args.list:
print([ for e in envs.registry.all() if str('wumpus') >= 0])
if args.file is not None:
env = make('wumpus-custom-v0', desc=json.load(args.file))
env = make(
except error.Error as e:
print('Bad Gym environment {}, using {}. Error is {}'.format(, default_env, e))
env = make(default_env)
if hasattr(env, 'space_to_percept'):
print('Environment {} is not a Wumpus world.'.format(env))
if __name__ == "__main__":
#!/usr/bin/env python
# Examples demonstrating the use of the Wumpus package
import json
import random
import wumpus as wws
class MyPlayer(wws.UserPlayer):
"""Player demonstrating the use of the start episode method to inspect the world."""
def start_episode(self):
"""Print the description of the world and the agent before starting (if known)."""
if is not None:
# I know the world
world_info = {k: [] for k in ('Hunter', 'Pits', 'Wumpus', 'Gold', 'Exits')}
world_info['Size'] = (,
world_info['Blocks'] = [(c.x, c.y) for c in]
for obj in
if isinstance(obj, wws.Hunter):
world_info['Hunter'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Pit):
world_info['Pits'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Wumpus):
world_info['Wumpus'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Exit):
world_info['Exits'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Gold):
world_info['Gold'].append((obj.location.x, obj.location.y))
print('World details:')
for k in ('Size', 'Pits', 'Wumpus', 'Gold', 'Exits', 'Blocks'):
print(' {}: {}'.format(k, world_info.get(k, None)))
if self.agent is not None and isinstance(self.agent, wws.Hunter):
print('Controlling hunter in position ({}, {}) with direction {}'.format(self.agent.location.x, self.agent.location.y, self.agent.orientation))
def play_classic(size: int = 0):
"""Play the classic version of the wumpus."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# get the hunter agent
hunter = next(iter(o for o in world.objects if isinstance(o, wws.Hunter)), None)
# Run a player without any knowledge about the world
world.run_episode(hunter, wws.UserPlayer.player())
def play_classic_informed(size: int = 0):
"""Play the classic version of the wumpus with a player knowing the world and the agent."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# get the hunter agent
hunter = next(iter(o for o in world.objects if isinstance(o, wws.Hunter)), None)
# Run a player with knowledge about the world
world.run_episode(hunter, MyPlayer.player(world=world, agent=hunter))
"id": "simple wumpus world",
"size": [7, 7],
"hunters": [[0, 0]],
"pits": [[4, 0], [3, 1], [2, 2], [6, 2], [4, 4], [3, 5], [4, 6], [5, 6]],
"wumpuses": [[1, 2]],
"exits": [[0, 0]],
"golds": [[6, 3]],
"blocks": []
def play_fixed(world_json: str = WUMPUS_WORLD):
"""Play on a given world described in JSON format."""
# create the world
world = wws.WumpusWorld.from_JSON(json.loads(world_json))
# get the hunter agent
hunter = next(iter(o for o in world.objects if isinstance(o, wws.Hunter)), None)
# Run a player with knowledge about the world
world.run_episode(hunter, MyPlayer.player(world=world, agent=hunter))
EXAMPLES = (play_classic, play_classic_informed, play_fixed)
def main(*args):
# Randomly play one of the examples
ex = random.choice(EXAMPLES)
print('Example {}:'.format(ex.__name__))
print(' ' + ex.__doc__)
if __name__ == "__main__":
"id": "wumpus-v0",
"size": [6, 6],
"hunters": [[0, 0, "N"]],
"pits": [[0, 5], [5, 1], [3, 1], [3, 3], [2, 2], [4, 3], [3, 5]],
"wumpuses": [[5, 4]],
"golds": [[5, 3]]
\ No newline at end of file
#!/usr/bin/env python
# Examples demonstrating the use of the Wumpus package
import argparse
import random
import sys
from typing import Iterable
import wumpus as wws
class GooPlayer(wws.OfflinePlayer):
"""Offline player demonstrating the use of the start episode method to inspect the world."""
def start_episode(self, world: wws.WumpusWorld) -> Iterable[wws.Hunter.Actions]:
"""Print the description of the world before starting."""
world_info = {k: [] for k in ('Hunter', 'Pits', 'Wumpus', 'Gold', 'Exits')}
world_info['Size'] = (world.size.x, world.size.y)
world_info['Blocks'] = [(c.x, c.y) for c in world.blocks]
for obj in world.objects:
if isinstance(obj, wws.Hunter):
world_info['Hunter'].append((obj.location.x, obj.location.y))
all_actions = list(obj.Actions)
elif isinstance(obj, wws.Pit):
world_info['Pits'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Wumpus):
world_info['Wumpus'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Exit):
world_info['Exits'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Gold):
world_info['Gold'].append((obj.location.x, obj.location.y))
print('World details:')
for k in ('Size', 'Pits', 'Wumpus', 'Gold', 'Exits', 'Blocks'):
print(' {}: {}'.format(k, world_info.get(k, None)))
# creates an iterator that returns a sequence of random actions
def random_actions():
# prevent unbounded iterations
for _ in range(10000):
yield all_actions[random.randint(0, len(all_actions) - 1)]
return random_actions()
def classic(size: int = 0):
"""Play the classic version of the wumpus."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player without any knowledge about the world
wws.run_episode(world, wws.UserPlayer())
def classic_offline(size: int = 0):
"""Play the classic version of the wumpus with a player knowing the world and the agent."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player with knowledge about the world
wws.run_episode(world, GooPlayer())
"id": "simple wumpus world",
"size": [7, 7],
"hunters": [[0, 0]],
"pits": [[4, 0], [3, 1], [2, 2], [6, 2], [4, 4], [3, 5], [4, 6], [5, 6]],
"wumpuses": [[1, 2]],
"exits": [[0, 0]],
"golds": [[6, 3], [3, 3]],
"blocks": []
def fixed_offline(world_json: str = WUMPUS_WORLD):
"""Play on a given world described in JSON format."""
# create the world
world = wws.WumpusWorld.from_JSON(world_json)
# Run a player with knowledge about the world
wws.run_episode(world, GooPlayer())
def real_deal(size: int = 0):
"""Play the classic version of the wumpus without being able to see the actual layout, that is as the actual software agent will do."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player without any knowledge about the world
wws.run_episode(world, wws.UserPlayer(), show=False)
EXAMPLES = (classic, classic_offline, fixed_offline, real_deal)
def main(*cargs):
"""Demonstrate the use of the wumpus API on selected worlds"""
ex_names = {ex.__name__.lower(): ex for ex in EXAMPLES}
parser = argparse.ArgumentParser(description=main.__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('example', nargs='?', help='select one of the available examples', choices=list(ex_names.keys()))
args = parser.parse_args(cargs)
if args.example:
ex = ex_names[args.example.lower()]
# Randomly play one of the examples
ex = random.choice(EXAMPLES)
print('Example {}:'.format(ex.__name__))
print(' ' + ex.__doc__)
return 0
if __name__ == "__main__":
from gym.envs.registration import register
except ImportError as e:
raise Exception('OpenAI gym not installed, you should install wumpus package with [gym] feature.') from e
'desc': {
"id": "wumpus-v0",
"size": [6, 6],
"hunters": [[0, 0, "N"]],
"pits": [[0, 5], [5, 1], [3, 1], [3, 3], [2, 2], [4, 3], [3, 5]],
"wumpuses": [[5, 4]],
"golds": [[5, 3]]
from gym_wumpus.envs.wumpus_env import WumpusEnv, wumpusenv_classic, wumpusenv_from_dict
import copy
from dataclasses import dataclass
import dataclasses
from typing import Dict, Tuple, Union
import gym
from gym import error, spaces, utils
from gym.utils import seeding
from wumpus import WumpusWorld, Hunter
class WumpusEnv(gym.Env):
metadata = {'render.modes': ['human', 'ansi']}
actions = tuple(Hunter.Actions)
def __init__(self, world: WumpusWorld = None):
"""Initialise a new Gym environment with an instance of the Wumpus world
Keyword Arguments:
world {WumpusWorld} -- Wumpus world, if None a random one will be created (default: {None})
self.__initial_world: WumpusWorld = world if isinstance(world, WumpusWorld) else WumpusWorld.classic()
# make sure there's a hunter
next(iter(o for o in self.__initial_world.objects if isinstance(o, Hunter)))
except StopIteration:
raise ValueError('Missing hunter in the Wumpus environment {}'.format(self.__initial_world))
self.__world: WumpusWorld = None
self.__agent: Hunter = None
self.__done: bool = False
# Actions are discrete integer values
self.action_space = spaces.Discrete(len(self.actions))
self.observation_space = spaces.Dict({ spaces.Discrete(2) for f in dataclasses.fields(Hunter.Percept)})
def _percept_to_space(self) -> Dict[str, int]:
percept = self.__agent.percept()
return {k: (1 if v else 0) for k, v in dataclasses.asdict(percept).items()}
def space_to_percept(cls, obs: Dict[str, int]) -> Hunter.Percept:
"""Converts the Gym ` <>` to the Wumpus environment percept
obs {Dict[str, int]} -- an observation space for the environment, see ``WumpusEnv.observation_space`` property for details
Hunter.Percept -- named tuple corresponding to the observation
return Hunter.Percept(**{k: v > 0 for k, v in obs.items()})
def space_to_action(cls, action: int) -> Hunter.Actions:
"""Converts the integer in the environment action space to the corresponding action in the ``Hunter`` agent
action {int} -- an integer from zero to the number of actions minus one, see ``WumpusEnv.action_space`` property for details
Hunter.Actions -- the corresponding action
return cls.actions[action]
def step(self, action: Union[int, Hunter.Actions]) -> Tuple[Dict[str, int], float, bool, Dict]:
"""OpenAI Gym ``step`` method, see `Gym documentation <>` for details
action {Union[int, Hunter.Actions]} -- the method accepts both an action space value (integer) or a ``Hunter.Actions`` object
Tuple[Dict[str, int], float, bool, Dict] -- see ``gym.step`` method for details
if self.__world is None:
if self.__done:
# episode ended, undefined result
return (None, 0.0, True, {})
info = {'alive': True}
reward =[action] if isinstance(action, int) else action)
if self.__agent.success():
self.__done = True
info['success'] = True
if not self.__agent.isAlive:
self.__done = True
info['alive'] = False
return (self._percept_to_space(), float(reward), self.__done, info)
def reset(self) -> Hunter.Percept:
self.__world = copy.deepcopy(self.__initial_world)
self.__agent = next(iter(o for o in self.__world.objects if isinstance(o, Hunter)), None)
self.__done = False
return self._percept_to_space()
def render(self, mode='human'):
if self.__world is None:
if mode == 'human':
elif mode == 'ansi':
return str(self.__world)
def close(self):
self.__world = None
self.__agent = None
self.__done = False
def from_dict(cls, desc: Dict):
"""Creates a new WumpusEnv object from a dictionary object with the world description.
desc {Dict} -- the description of the Wumpus World, see ``WumpusWorld.from_JSON`` method for details
WumpusEnv -- a new object
world = WumpusWorld.from_JSON(desc)
return cls(world=world)
def classic(cls, size: int = 4, seed=None):
"""Creates a new WumpusEnv object with pits and Wumpus positions randomly generated.
Keyword Arguments:
size {int} -- the size of the grid (default: {4})
seed {int} -- seed to initialise the random generator (default: {None})
WumpusEnv -- a new object
world = WumpusWorld.classic(size=size, seed=seed)
return cls(world=world)
def wumpusenv_from_dict(desc: Dict) -> WumpusEnv:
"""Creates a new WumpusEnv object from a dictionary object with the world description.
desc {Dict} -- the description of the Wumpus World, see ``WumpusWorld.from_JSON`` method for details
WumpusEnv -- a new object
return WumpusEnv.from_dict(desc=desc)
def wumpusenv_classic(size: int = 4, seed=None) -> WumpusEnv:
"""Creates a new WumpusEnv object with pits and Wumpus positions randomly generated.
Keyword Arguments:
size {int} -- the size of the grid (default: {4})
seed {int} -- seed to initialise the random generator (default: {None})
WumpusEnv -- a new object
return WumpusEnv.classic(size=size, seed=seed)
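`WumpusEnv` encodes a percept as a dictionary of 0/1 values (`_percept_to_space`) and decodes it with `space_to_percept`. A self-contained sketch of that round trip; the `Percept` dataclass and its field names here are illustrative assumptions, not the actual `Hunter.Percept` definition:

```python
import dataclasses
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Percept:
    """Stand-in percept; the field names are assumptions for illustration."""
    stench: bool = False
    breeze: bool = False
    glitter: bool = False


def percept_to_space(percept: Percept) -> Dict[str, int]:
    """Encode each boolean field as 0 or 1, keyed by field name."""
    return {k: (1 if v else 0) for k, v in dataclasses.asdict(percept).items()}


def space_to_percept(obs: Dict[str, int]) -> Percept:
    """Decode a 0/1 observation dictionary back into a percept."""
    return Percept(**{k: v > 0 for k, v in obs.items()})


obs = percept_to_space(Percept(breeze=True))
print(obs)
print(space_to_percept(obs))
```

Keying the observation by field name keeps the encoding aligned with the dataclass fields without a separate index table.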
name = wumpus
version = attr:wumpus.__version__
description = This package implements a Python version of the Hunt the Wumpus game as described in the book Artificial Intelligence: A Modern Approach by Russell and Norvig.
long_description = file:
license = MIT
author = Sergio Tessaris
author_email =
zip_safe = False
include_package_data = True
packages = find:
python_requires = >=3.8
install_requires =
gym < 0.24
console_scripts =
gridrunner = wumpus.cli:main
from setuptools import setup, find_packages
import re
import codecs
import os
import subprocess
#!/usr/bin/env python
import setuptools
# Hack to allow non-normalised versions
# see <>
from setuptools.extern.packaging import version
version.Version = version.LegacyVersion
# see <> and
# <>
def find_version(*pkg_path):
pkg_dir = os.path.join(os.path.abspath(os.path.dirname(__file__)), *pkg_path)
version_file =, ''), 'r').read()
version_match ="^__version__ = ['\"]([^'\"]*)['\"]",
version_file, re.M)
_git_revision_ = None
_git_revision_ = subprocess.check_output(['git', 'describe', '--always', '--dirty'], encoding='utf-8').strip()
except subprocess.CalledProcessError:
if version_match:
return + ('' if _git_revision_ is None else '+' + _git_revision_)
elif _git_revision_:
return _git_revision_
raise RuntimeError("Unable to find version string.")
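`find_version` combines a regular-expression match on `__version__` with an optional `git describe` suffix. A sketch of the two pure steps, without the file reading or the `git` call; the helper names `extract_version` and `compose_version` are hypothetical:

```python
import re
from typing import Optional

# same pattern as in the diff: a line of the form __version__ = '...'
VERSION_RE = re.compile(r"^__version__ = ['\"]([^'\"]*)['\"]", re.M)


def extract_version(source: str) -> Optional[str]:
    """Return the version string declared in the module source, if any."""
    match = VERSION_RE.search(source)
    return match.group(1) if match else None


def compose_version(base: str, revision: Optional[str]) -> str:
    """Append the revision as a local version label when it is known."""
    return base + ('' if revision is None else '+' + revision)


sample = "from .runner import run_episode\n__version__ = '1.1.0'\n"
print(compose_version(extract_version(sample), None))
print(compose_version(extract_version(sample), 'abc1234-dirty'))
```

The revision string `abc1234-dirty` is made up for the example; the real value would come from `git describe --always --dirty`.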
def long_description_md(fname=''):
this_directory = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(this_directory, fname), encoding='utf-8') as f:
long_description =
return long_description
description='Wumpus world simulator',
author='Sergio Tessaris',
exclude_package_data={'': ['.gitignore']},
if __name__ == "__main__":
\ No newline at end of file
from wumpus.wumpus import WumpusWorld, Hunter, Wumpus, Pit, Gold, Exit
from wumpus.gridworld import Player, UserPlayer, RandomPlayer
from .gridworld import Agent, Percept, Coordinate, coord, GridWorld, EaterWorld, GridWorldException, Eater, Food
from .player import OfflinePlayer, OnlinePlayer, UserPlayer, RandomPlayer
from .wumpus import WumpusWorld, Hunter, Wumpus, Pit, Gold, Exit
from .runner import run_episode
__version__ = '0.1.0'
\ No newline at end of file
__version__ = '1.1.0'
#!/usr/bin/env python
Command line interface
import argparse
import io
import json
import os
import sys
from . import __version__
from .gridworld import GridWorld
from .runner import get_subclasses, check_entrypoint, get_player_class, get_world_class, run_episode, worlds
def gridrunner(*args):
Run episodes on worlds using the specified player.
world_classes = sorted(get_subclasses(GridWorld), key=lambda c: c.__name__)
parser = argparse.ArgumentParser(description=gridrunner.__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('infiles', type=argparse.FileType('r'), nargs='*', help='world description JSON files, they must be compatible with the world type (see --world option).')
parser.add_argument('--name', '-n', type=str, help='name of the player, default to the name of the player class')
parser.add_argument('--path', '-p', type=str, default='.', help="path of the player library, it's prepended to the sys.path variable")
parser.add_argument('--entry', '-e', type=check_entrypoint, required=True, help="object reference for a Player subclass in the form 'importable.module:object.attr'. See <> for details.")
parser.add_argument('--world', '-w', type=str, default=world_classes[0].__name__, choices=[c.__name__ for c in world_classes], help='class name of the world')
parser.add_argument('--horizon', '-z', type=int, default=20, help='maximum number of steps')
parser.add_argument('--noshow', action='store_false', help="prevent the printing the world at each step")
parser.add_argument('--out', '-o', type=argparse.FileType('w'), default=sys.stdout, help="write output to file")
parser.add_argument('--version', action='version', version='%(prog)s ' + __version__)
parser.add_argument('--log', '-l', type=argparse.FileType('w'), help="write the log of the games to file (JSON)")
args_dict = vars(parser.parse_args(args))
name = args_dict['name']
path = os.path.abspath(args_dict['path']) if args_dict['path'] != '.' else os.getcwd()
obj_ref = args_dict['entry']
world_type = args_dict['world']
horizon = args_dict['horizon']
show = args_dict['noshow']
outf: io.TextIOBase = args_dict['out']
game_log: io.TextIOBase = args_dict['log']
player_class = get_player_class(obj_ref, path=path)
world_class = get_world_class(world_type)
if name is None:
name = player_class.__name__
player = player_class(name=name)
if game_log is not None:
print('[', file=game_log)
if len(args_dict['infiles']) > 0:
morelogs = False
for world in worlds(args_dict['infiles'], world_class):
glog = run_episode(world, player, horizon=horizon, show=show, outf=outf)
if game_log is not None:
if morelogs:
print(',', file=game_log)
morelogs = True
json.dump(glog, game_log)
world = world_class.random()
# show world definition
print('-' * 10 + ' Playing on world: ' + '-' * 10, file=outf)
print('\n' + '-' * 40, file=outf)
glog = run_episode(world, player, horizon=horizon, show=show, outf=outf)
if game_log is not None:
json.dump(glog, game_log)
if game_log is not None:
print(']', file=game_log)
return 0
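The `--log` handling above writes the opening bracket, the episode logs separated by commas, and the closing bracket, so the file ends up as a single JSON array. The same pattern as a standalone sketch; `dump_logs` is a hypothetical helper and the log entries are invented:

```python
import io
import json
from typing import Any, Iterable, TextIO


def dump_logs(logs: Iterable[Any], fp: TextIO) -> None:
    """Write the entries as one JSON array, one element at a time."""
    fp.write('[\n')
    for index, entry in enumerate(logs):
        if index > 0:
            # separator goes before every entry except the first
            fp.write(',\n')
        json.dump(entry, fp)
    fp.write('\n]\n')


buffer = io.StringIO()
dump_logs(({'episode': n, 'reward': -n} for n in range(3)), buffer)
print(json.loads(buffer.getvalue()))
```

Writing entries one by one means episodes do not have to be held in memory together, at the cost of an invalid file if the run is interrupted before the closing bracket.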
def main():
if __name__ == "__main__":
......@@ -4,10 +4,14 @@ This file implements the basics of a rectangular grid world, and the objects
that might populate it. Including a simple agent that can move in four directions.
import collections
from dataclasses import dataclass
from enum import Enum
from inspect import cleandoc
import io
import json
import random
import textwrap
import sys
from typing import Set, NamedTuple, Iterable, Dict, Union, Any
......@@ -24,11 +28,34 @@ class Coordinate(NamedTuple):
y: int
def coord(x: int, y: int) -> Coordinate:
"""Return a Coordinates named tuple, first argument is horizontal and second vertical."""
def coord(x: int, y: int, *args) -> Coordinate:
Return a Coordinates named tuple, first argument is horizontal and second vertical.
Ignores additional arguments.
return Coordinate(x=x, y=y)
# Wide characters conversion
# see <>
_TO_WIDE_TABLE = dict((i, chr(i + 0xfee0)) for i in range(0x21, 0x7f))
_TO_WIDE_TABLE.update({0x20: u'\u3000', 0x2D: u'\u2212'}) # space and minus
def ascii_to_wide(in_ascii: str) -> str:
"""Converts ASCII characters in the input string into their wide versions.
in_ascii (str): input string
str: converted string
return in_ascii.translate(_TO_WIDE_TABLE)
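The offset `0xfee0` maps printable ASCII (`0x21` to `0x7e`) onto the Unicode fullwidth forms (`U+FF01` to `U+FF5E`); space and minus get explicit replacements. A short standalone sketch of the same table, under a different name (`to_wide`) so it does not depend on the package:

```python
# printable ASCII shifted into the fullwidth block
_WIDE = {i: chr(i + 0xFEE0) for i in range(0x21, 0x7F)}
# space becomes ideographic space, hyphen becomes the minus sign
_WIDE.update({0x20: '\u3000', 0x2D: '\u2212'})


def to_wide(text: str) -> str:
    """Replace ASCII characters with wide counterparts; other characters pass through."""
    return text.translate(_WIDE)


for row in ('#.W', '.#.', 'H-G'):
    print(to_wide(row))
```

Because `str.translate` maps one code point to one code point, the converted string has the same length as the input.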
class WorldObject(object):
"""An object, agent, or any other element that might be placed in a GridWorld."""
def __init__(self):
......@@ -57,15 +84,33 @@ class WorldObject(object):
raise NotImplementedError
class Actions(Enum):
"""The actions that an agent can perform."""
class Percept(object):
"""Agent perception of the environment."""
class Agent(WorldObject):
"""Is a special kind of world object that performs actions."""
class Actions(Enum):
"""The actions that the agent can perform."""
class Actions(Actions):
The actions of this agent.
class Percept(Percept):
The perception of this agent.
def actions(cls) -> Iterable['Agent.Actions']:
def actions(cls) -> Iterable[Actions]:
"""Return the actions that the agent can execute as an iterable object."""
return cls.Actions
......@@ -79,7 +124,7 @@ class Agent(WorldObject):
"""Return true if the agent can still execute actions."""
return True
def percept(self) -> Any:
def percept(self) -> 'Agent.Percept':
"""Return the perception of the environment. None by default."""
return None
......@@ -87,6 +132,11 @@ class Agent(WorldObject):
"""Execute an action and return the reward of the action."""
raise NotImplementedError
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
# I don't know how to die
return 0
def on_done(self):
"""Called when the episode terminates."""
......@@ -117,13 +167,38 @@ class GridWorld(object):
self._size = size
self._blocks = set(blocks)
self._objects: Dict[WorldObject, Coordinate] = {}
self._location: Dict[Coordinate, Iterable[Coordinate]] = {}
self._location: Dict[Coordinate, Iterable[WorldObject]] = {}
def random(cls, map_desc: str=None, size: Coordinate=None, blocks: Iterable[Coordinate]=None, **kwargs):
"""Create a new world from a map description or from the given size and block positions.
map_desc (str, optional): map of the world. Defaults to None.
size (Coordinate, optional): size of the world. Defaults to None.
blocks (Iterable[Coordinate], optional): location of the blocks. Defaults to None (random placement).
if map_desc is not None:
return cls.from_string(map_desc)
if size is None:
new_size = random.randint(4, 8)
size = coord(new_size, new_size)
if blocks is None:
# randomly place blocks in the world
blocks = set()
occupy = int(random.random() * 0.1 * size.x * size.y)
while len(blocks) < occupy:
blocks.add(coord(random.randint(0, size.x - 1), random.randint(0, size.y - 1)))
return cls(size, blocks)
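When no block positions are given, `random` draws a block count strictly below 10% of the cells and keeps sampling coordinates until that many distinct ones are found. A standalone sketch of that step; `Coordinate` is redefined locally and `random_blocks` is a hypothetical name:

```python
import random
from typing import NamedTuple, Optional, Set


class Coordinate(NamedTuple):
    x: int
    y: int


def random_blocks(size: Coordinate, max_fraction: float = 0.1,
                  rng: Optional[random.Random] = None) -> Set[Coordinate]:
    """Pick a random number of distinct cells, below max_fraction of the grid."""
    rng = rng or random.Random()
    target = int(rng.random() * max_fraction * size.x * size.y)
    blocks: Set[Coordinate] = set()
    # a set discards repeated coordinates, so loop until enough distinct cells exist
    while len(blocks) < target:
        blocks.add(Coordinate(rng.randint(0, size.x - 1), rng.randint(0, size.y - 1)))
    return blocks


print(sorted(random_blocks(Coordinate(6, 6), rng=random.Random(42))))
```

The loop terminates because the target never exceeds the number of cells as long as `max_fraction` is at most 1.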
def from_string(cls, world_desc: str) -> 'GridWorld':
"""Create a new grid world from a string describing the layout.
Create a new grid world from a string describing the layout.
Each line corresponds to a different row, and #s represent the position of a block, while any othe character is interpreted as an empty square. The size of the world is the number of lines (height) and the size of the longest line (width). E.g.:
Each line corresponds to a different row, and #s represent the position of a block, while any other character is interpreted as an empty square. The size of the world is the number of lines (height) and the size of the longest line (width). E.g.:
......@@ -136,9 +211,78 @@ class GridWorld(object):
size = Coordinate(x=max(len(r) for r in rows), y=len(rows))
return cls(size, blocks=cls.find_coordinates(BLOCK_STR, rows))
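A sketch of the map parsing described in the `from_string` docstring: width is the longest line, height is the number of lines, and `#` marks a block. It assumes the first line of the map is the top row (consistent with `to_dict`, which writes rows from `size.y - 1` down to 0); `parse_map` is a hypothetical helper, not the package's `find_coordinates`.

```python
from typing import NamedTuple, Set, Tuple


class Coordinate(NamedTuple):
    x: int
    y: int


def parse_map(world_desc: str, block_char: str = '#') -> Tuple[Coordinate, Set[Coordinate]]:
    """Return the world size and the block coordinates described by the map."""
    rows = world_desc.splitlines()
    size = Coordinate(x=max(len(r) for r in rows), y=len(rows))
    blocks: Set[Coordinate] = set()
    for line_no, row in enumerate(rows):
        # assumption: the first line is the top row, i.e. the highest y
        y = size.y - 1 - line_no
        for x, char in enumerate(row):
            if char == block_char:
                blocks.add(Coordinate(x, y))
    return size, blocks


size, blocks = parse_map('#..\n...\n..#')
print(size, sorted(blocks))
```

Short lines are simply treated as having empty squares on the right, since only positions holding `#` produce a block.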
def from_dict(cls, desc: Dict[str, Any]):
Create a new grid world from a dictionary describing the layout.
- map: contains a string or an array of strings with the description of the map (see `from_string` method for details)
- size: single or pair of integers specifying the size of the world (single for square). Mandatory if 'map' is missing.
- block: list of pairs specifying the coordinates of blocks.
if 'map' in desc:
map_str = desc['map'] if isinstance(desc['map'], str) else "\n".join(str(line) for line in desc['map'])
world = cls.from_string(map_str)
if 'size' not in desc:
raise GridWorldException('Missing world size!')
size = Coordinate(x=desc['size'], y=desc['size']) if isinstance(desc['size'], int) else Coordinate(x=desc['size'][0], y=desc['size'][1])
world = cls(size, [])
for pos in desc.get('block', []):
world.addBlock(Coordinate(x=pos[0], y=pos[1]))
return world
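A sketch of the `size`/`block` branch of `from_dict`, showing how a single integer or a pair is normalised into a size and how block pairs become coordinates. It covers only that branch (no `map` key) and returns plain values instead of building a world; `layout_from_dict` is a hypothetical name.

```python
from typing import Any, Dict, NamedTuple, Set, Tuple


class Coordinate(NamedTuple):
    x: int
    y: int


def layout_from_dict(desc: Dict[str, Any]) -> Tuple[Coordinate, Set[Coordinate]]:
    """Extract size and block coordinates from a description without a 'map' key."""
    if 'size' not in desc:
        raise ValueError('Missing world size!')
    raw = desc['size']
    # a single integer means a square world
    size = Coordinate(raw, raw) if isinstance(raw, int) else Coordinate(raw[0], raw[1])
    blocks = {Coordinate(x=pos[0], y=pos[1]) for pos in desc.get('block', [])}
    return size, blocks


# description taken from the eater-world example in this diff
desc = {"size": [5, 5], "block": [[1, 3]], "food": [[3, 0], [0, 0]], "eater": [1, 0]}
print(layout_from_dict(desc))
```

Keys such as `food` and `eater` are ignored here; in the package they are presumably handled by the world subclass.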
def to_dict(self) -> Dict[str, Any]:
Convert a world into its dictionary description.
desc_dict = {}
size = self.size
blocks = list(self.blocks)
if len(blocks) < (0.1 * size.x * size.y):
desc_dict['size'] = tuple((size.x, size.y))
if len(blocks) > 0:
desc_dict['block'] = [tuple((pos.x, pos.y)) for pos in blocks]
map_rows = []
for y in range(size.y - 1, -1, -1):
row = ''.join('#' if self.isBlock(coord(x, y)) else '.' for x in range(0, size.x))
desc_dict['map'] = '\n'.join(map_rows)
return desc_dict
def from_JSON(cls, json_desc):
Create a new grid world from a JSON string or document object describing the layout. See method `from_dict` for details on the required keys and their values. For backward compatibility it accepts also a dictionary, which is passed directly to `from_dict` method.
if isinstance(json_desc,
dict_desc = json_desc
elif isinstance(json_desc, str):
dict_desc = json.loads(json_desc)
dict_desc = json.load(json_desc)
return cls.from_dict(dict_desc)
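According to its docstring, `from_JSON` accepts a JSON string, a document object, or (for backward compatibility) a dictionary. A sketch of that dispatch using only `json`; `load_description` is a hypothetical helper and treating any non-`dict`, non-`str` input as a readable file object is an assumption:

```python
import io
import json
from typing import IO, Any, Dict, Union


def load_description(json_desc: Union[Dict[str, Any], str, IO[str]]) -> Dict[str, Any]:
    """Normalise a dictionary, JSON string, or readable file object to a dictionary."""
    if isinstance(json_desc, dict):
        # already parsed: pass through unchanged
        return json_desc
    if isinstance(json_desc, str):
        return json.loads(json_desc)
    # assumption: anything else supports .read(), as json.load requires
    return json.load(json_desc)


text = '{"size": [7, 7], "blocks": []}'
print(load_description(text))
print(load_description(io.StringIO(text)))
print(load_description({"size": [7, 7], "blocks": []}))
```

All three calls yield the same dictionary, which can then be handed to a `from_dict`-style constructor.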
def to_JSON(self, fp):
Serialize a world as a JSON formatted stream to fp (a .write()-supporting file-like object, see json.dump for details).
json.dump(self.to_dict(), fp)
def to_JSONs(self) -> str:
Return a serialization of the world as a JSON formatted string.
return json.dumps(self.to_dict())
def find_coordinates(items: str, world_desc: Union[str, Iterable[str]]):
"""Return all the coordinates in which any of the characters appears in the world description. The decription can be a multiline string or the list of lines."""
"""Return all the coordinates in which any of the characters appears in the world description. The description can be a multiline string or the list of lines."""
coordinates = []
rows = world_desc.splitlines() if isinstance(world_desc, str) else world_desc
y = -1
......@@ -159,6 +303,16 @@ class GridWorld(object):
"""Return the set of coordinates where blocks are placed."""
return self._blocks
def object_locations(self) -> Dict[WorldObject, Coordinate]:
"""Return a dictionary associating objects to their coordinate."""
return self._objects
def location_objects(self) -> Dict[Coordinate, Iterable[WorldObject]]:
"""Return a dictionary associating locations to the objects at the given coordinate."""
return self._location
def isBlock(self, pos: Coordinate) -> bool:
"""Return true if in the coordinate there's a block."""
return pos in self.blocks
......@@ -169,24 +323,24 @@ class GridWorld(object):
def objects_at(self, pos: Coordinate) -> Iterable[WorldObject]:
"""Return an iterable over the objects at the given coordinate."""
return self._location.get(pos, [])
return self.location_objects.get(pos, [])
def location_of(self, obj: WorldObject) -> Coordinate:
"""Return the coordinate of the object within the world, or none if it's not in it."""
return self._objects.get(obj, None)
return self.object_locations.get(obj, None)
def empty_cells(self, count_objects=False) -> Iterable[Coordinate]:
"""Return an iterable object over the cells without blocks. If count_objects is not False then also other objects are taken into account."""
all_cells = set([coord(x, y) for x in range(0, self.size.x) for y in range(0, self.size.y)])
if count_objects:
return all_cells
def objects(self) -> Iterable[WorldObject]:
"""Return an iterable over the objects within the world."""
return self._objects.keys()
return self.object_locations.keys()
def removeBlock(self, pos: Coordinate):
......@@ -199,17 +353,17 @@ class GridWorld(object):
raise OutOfBounds('Placing {} outside the world at {}'.format(obj, pos))
if self.isBlock(pos):
raise Collision('Placing {} inside a block {}'.format(obj, pos))
if pos in self._location:
if pos in self.location_objects:
self._location[pos] = [obj]
self._objects[obj] = pos
self.location_objects[pos] = [obj]
self.object_locations[obj] = pos
def removeObject(self, obj: WorldObject):
del self._objects[obj]
del self.object_locations[obj]
except KeyError:
......@@ -221,169 +375,33 @@ class GridWorld(object):
raise OutOfBounds('Moving {} outside the world at {}'.format(obj, pos))
if self.isBlock(pos):
raise Collision('Moving {} inside a block {}'.format(obj, pos))
old_pos = self._objects.get(obj, None)
if old_pos in self._location:
if pos in self._location:
old_pos = self.object_locations.get(obj, None)
if old_pos in self.location_objects:
if pos in self.location_objects:
self._location[pos] = [obj]
self._objects[obj] = pos
self.location_objects[pos] = [obj]
self.object_locations[obj] = pos
def __str__(self):
BLANK = '.'.rjust(CELL_WIDTH)
maze_strs = [[BLANK for j in range(self.size.x)] for i in range(self.size.y)]
for pos in self.blocks:
maze_strs[pos.y][pos.x] = BLOCK
for obj, pos in self._objects.items():
for obj, pos in self.object_locations.items():
maze_strs[pos.y][pos.x] = obj.charSymbol().ljust(CELL_WIDTH)
top_frame = '' + '' * CELL_WIDTH * self.size.x + '' + '\n'
bottom_frame = '\n' + '' + '' * CELL_WIDTH * self.size.x + ''
top_frame = '' + '' * CELL_WIDTH * self.size.x + '' + '\n'
bottom_frame = '\n' + '' + '' * CELL_WIDTH * self.size.x + ''
side_frame = ''
return top_frame + "\n".join(reversed([side_frame + ''.join(maze_strs[i]) + side_frame for i in range(self.size.y)])) + bottom_frame
def run_episode(self, agent: Agent, player: 'Player', horizon: int = 0, show=True):
"""Run an episode on the world using the player to control the agent. The horizon specifies the maximun number of steps, 0 or None means no limit. If show is true then the world is printed ad each iteration before the player's turn.
Raise the exception GridWorldException is the agent is not in the world."""
if agent not in self.objects:
raise GridWorldException('Missing agent {}, cannot run the episode'.format(agent))
# inform the player of the start of the episode
step = 0
while not horizon or step < horizon:
if agent.success():
print('The agent {} succeded!'.format(
if not agent.isAlive:
print('The agent {} died!'.format(
if show:
action =, agent.percept(), agent.actions())
if action is None:
print('Episode terminated by the player {}.'.format(
print('Step {}: agent {} executing {}'.format(step,,
reward =, reward, agent.percept())
step += 1
print('Episode terminated by maximun number of steps ({}).'.format(horizon))
print('Episode terminated with a reward of {} for agent {}'.format(agent.reward,
class Player(object):
"""A player for a given agent. It implements the play method which should
return one of the actions for the agent or None to give up.
def start_episode(self):
"""Method called at the beginning of the episode."""
def end_episode(self):
"""Method called at the when an episode is completed."""
def play(self, turn: int, state, actions: Iterable[Agent.Actions]) -> Agent.Actions:
"""Given a turn (integer) and a percept, which might differ according to the specific problem, returns an action, among the given list of possible actions, to play at the given turn or None to stop the episode."""
raise NotImplementedError
def feedback(self, action: Agent.Actions, reward: int, state):
"""Receive in input the reward of the last action and the resulting state. The function is called right after the execution of the action."""
def name(self) -> str:
"""Return the name of the player or a default value based on its type and hash."""
return self._name
except AttributeError:
return object_id(self)
def world(self) -> GridWorld:
"""Return the world the player is playing on. If it's not known then None is returned."""
return self._world
except AttributeError:
return None
def agent(self) -> Agent:
"""Return the agent the player is controlling. If it's not known then None is returned."""
return self._agent
except AttributeError:
return None
def player(cls, name: str = None, world: GridWorld = None, agent: Agent = None, **args) -> 'Player':
"""Create a new player with a name, world is playing on and the agent is controlling. Additional args are passed to the default object constructor."""
ply = cls(**args)
if name is not None:
ply._name = name
if world is not None:
ply._world = world
if agent is not None:
ply._agent = agent
return ply
# Examples of the use of the API
# Trivial players
class RandomPlayer(Player):
"""This player selects randomly one of the available actions."""
def play(self, turn: int, state, actions: Iterable[Agent.Actions]) -> Agent.Actions:
actions_lst = list(actions)
return actions_lst[random.randint(0, len(actions) - 1)]
def feedback(self, action: Agent.Actions, reward: int, state):
print('{}: action {} reward is {}'.format(,, reward))
class UserPlayer(Player):
"""This player asks the user for the next move, if it's not ambiguous it accepts also commands initials and ignores the case."""
def play(self, turn: int, state, actions: Iterable[Agent.Actions]) -> Agent.Actions:
actions_dict = { a for a in actions}
print('{} percept:'.format(
print(textwrap.indent(str(state), ' '))
while True:
answer = input('{}: select an action {} and press enter, or empty to stop: '.format(, list(actions_dict.keys()))).strip()
if len(answer) < 1:
return None
elif answer in actions_dict:
return actions_dict[answer]
options = [k for k in actions_dict.keys() if k.lower().startswith(answer.lower())]
if len(options) == 1:
return actions_dict[options[0]]
print('Canot understand <{}>'.format(answer))
def feedback(self, action: Agent.Actions, reward: int, state):
print('{}: action {} reward is {}'.format(,, reward))
return ascii_to_wide(top_frame + "\n".join(reversed([side_frame + ''.join(maze_strs[i]) + side_frame for i in range(self.size.y)])) + bottom_frame)
......@@ -392,25 +410,40 @@ class UserPlayer(Player):
class Food(WorldObject):
"""Food in the EaterWorld, it can be consumed by the Eater agent."""
def charSymbol(self):
return '🍌'
class SimpleEater(Agent):
class Actions(Agent.Actions):
class Eater(Agent):
"""An agent that moves in the EaterWorld. It can move in 4 directions (Eater.Actions) and consumes Food objects that are in the cells where it moves. It sees its position and smells whether there's still food in the world (Eater.Percept). Its goal is to consume all the food in the environment."""
class Actions(Actions):
"""Eater actions for each direction in which the agent can move (N, S, E, W)"""
N = (0, 1)
S = (0, -1)
E = (1, 0)
W = (-1, 0)
class Percept(Percept):
"""Eater agent perception: the current position and whether there's more food."""
position: Coordinate
more_food: bool
def __init__(self):
self._foodcount = 0
self._reward = 0
self.FOOD_BONUS = 10
self._alive = True
def charSymbol(self):
return '🐒'
def isAlive(self):
"""Return true if the agent can still execute actions."""
return self._alive
def reward(self) -> int:
"""The current accumulated reward"""
......@@ -435,8 +468,17 @@ class SimpleEater(Agent):
self._reward += cost
return cost
def on_done(self):
print('{} agent: Got food {} times.'.format(type(self).__name__, self._foodcount))
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
self._alive = False
# no penalty for suicide
return 0
def percept(self) -> 'Eater.Percept':
return self.Percept(
more_food=any(isinstance(o, Food) for o in
def success(self) -> bool:
"""Return true once all the food has been consumed."""
......@@ -444,26 +486,97 @@ class SimpleEater(Agent):
return len(food) == 0
def eaterWorld(map_desc: str, foods: Iterable[Coordinate] = [], food_amount: float = .1) -> GridWorld:
"""Create a new world using the decription and placing food in the given list foods or randomly placing a food_amount (if greater than zero) or the percentage of free cells otherwise number foods."""
world = GridWorld.from_string(map_desc)
class EaterWorld(GridWorld):
"""A GridWorld which contains Food and an Eater agent that can move within the world and eat the food when it moves into a cell that contains some food.
def random(cls, map_desc: str=None, size: Coordinate=None, blocks: Iterable[Coordinate]=[], food_amount: float=.1, **kwargs) -> 'EaterWorld':
"""Create a new world from the map description and randomly place food until the given percentage of the free space is filled. If the food amount is greater than or equal to 1 then it's interpreted as the number of food objects to include.
map_desc (str, optional): map of the world. Defaults to None.
size (Coordinate, optional): size of the world. Defaults to None.
blocks (Iterable[Coordinate], optional): location of the blocks. Defaults to [].
food_amount (float, optional): the amount of food to add. Default 10% of the available cells.
GridWorldException: if the world cannot be created
EaterWorld: a new random world
world = super().random(map_desc=map_desc, size=size, blocks=blocks, **kwargs)
if len(foods) > 0:
for pos in foods:
if not world.isBlock(pos):
world.addObject(Food(), pos)
free_cells = list(world.empty_cells())
if len(free_cells) < 1:
raise GridWorldException('No space for placing food and agent in the world')
world.addObject(Eater(), free_cells.pop())
food_count = int(food_amount) if food_amount >= 1 else int(len(free_cells) * food_amount)
for i in range(food_count):
world.addObject(Food(), free_cells.pop())
return world
return world
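The `food_amount` parameter has two readings: values below 1 are a fraction of the free cells, values of 1 or more are an absolute count. A small standalone sketch of that rule follows; `food_count` is a hypothetical helper name that isolates the expression shown in the diff.

```python
def food_count(food_amount: float, free_cells: int) -> int:
    """Hypothetical helper mirroring the expression used when placing food.

    Values >= 1 are absolute counts; smaller values are a fraction of the
    free cells, truncated towards zero.
    """
    return int(food_amount) if food_amount >= 1 else int(free_cells * food_amount)


print(food_count(0.25, 40))  # fraction of the 40 free cells
print(food_count(3, 40))     # absolute number of food objects
print(food_count(0.1, 7))    # small worlds can end up with no food
```

Because the fractional case truncates, a small world combined with a small fraction may receive no food at all, which is worth keeping in mind when writing test maps.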
def from_dict(cls, desc: Dict[str, Any]):
Create a new grid world from a dictionary describing the layout.
Keys (include keys of `GridWorld`):
- eater: agent position (random if missing)
- food: list of food positions
world = super().from_dict(desc)
def simpleEaterTest(player_class=RandomPlayer, horizon=20):
MAP_STR = """
for pos in [coord(*loc) for loc in desc.get('food', [])]:
if not world.isBlock(pos):
world.addObject(Food(), pos)
if 'eater' in desc:
return world
def to_dict(self) -> Dict[str, Any]:
Convert a world into its dictionary description.
desc_dict = super().to_dict()
desc_dict['food'] = []
for o in self.objects:
if isinstance(o, Eater):
desc_dict['eater'] = (o.location.x, o.location.y)
elif isinstance(o, Food):
desc_dict['food'].append(tuple((o.location.x, o.location.y)))
return desc_dict
def add_eater(self, location: Coordinate = None):
Add an Eater agent at the given coordinates or a random place if the location is not provided.
Raise exceptions if the agent cannot be placed in the given position or there's no space.
agent = Eater()
if location is not None:
self.addObject(agent, location)
free_cells = list(self.empty_cells(count_objects=True))
if len(free_cells) < 1:
raise GridWorldException('No space for placing the agent in the world')
self.addObject(agent, free_cells.pop())
MAP_STR = """
# # # #
# # #
......@@ -481,24 +594,3 @@ def simpleEaterTest(player_class=RandomPlayer, horizon=20):
# # # #
world = eaterWorld(MAP_STR, food_amount=0.1)
free_cells = list(world.empty_cells(count_objects=True))
# place the agent in a random empty place
agent = SimpleEater()
world.addObject(agent, free_cells.pop())
player = player_class.player()
world.run_episode(agent, player, horizon=horizon)
# Testing the API
if __name__ == "__main__":
import random
import textwrap
from typing import Iterable, Union
from .gridworld import Agent, Percept, object_id, GridWorld
class OnlinePlayer:
"""A player for a given agent. It implements the play method which should
return one of the actions for the agent or None to give up.
def __init__(self, name: str = None):
Initialise the name of the player if provided.
self._name = str(name) if name is not None else object_id(self)
def name(self) -> str:
"""The name of the player or a default value based on its type and hash."""
return self._name
def start_episode(self):
"""Method called at the beginning of the episode."""
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed, with the outcome of the game and whether the agent was still alive and successful.
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: Union[int, None]) -> Agent.Actions:
"""Given a percept, which might differ according to the specific problem, and the list of valid actions, returns an action to play at the given turn or None to stop the episode. The reward is the one obtained in the previous action, on the first turn its value is None."""
raise NotImplementedError
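To illustrate the `play` contract described in the docstring, here is a minimal sketch of an online player. It does not import the package: `OnlinePlayerStub` is a stand-in for `OnlinePlayer`, and `Moves` is a hypothetical action set loosely modelled on `Eater.Actions`.

```python
from enum import Enum
from typing import Iterable, Optional


class OnlinePlayerStub:
    """Stand-in for OnlinePlayer so the sketch runs without the package."""

    def play(self, percept, actions: Iterable[Enum], reward: Optional[int]):
        raise NotImplementedError


class Moves(Enum):
    """Hypothetical action set, for illustration only."""
    N = (0, 1)
    S = (0, -1)


class FirstActionPlayer(OnlinePlayerStub):
    """Plays the first valid action and gives up after max_turns turns."""

    def __init__(self, max_turns: int = 3):
        self.max_turns = max_turns
        self.turn = 0
        self.total_reward = 0

    def play(self, percept, actions, reward):
        # reward is None on the first turn, then the reward of the last action
        if reward is not None:
            self.total_reward += reward
        if self.turn >= self.max_turns:
            return None  # returning None stops the episode
        self.turn += 1
        return next(iter(actions), None)


player = FirstActionPlayer(max_turns=2)
print(player.play(None, list(Moves), None))
```

A real player would use the percept to choose among the actions; the point here is only the handling of the `None` reward on the first turn and of the `None` return value.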
class OfflinePlayer:
A player that receives the configuration of the world at the beginning of the episode and returns the sequence of actions to play.
def __init__(self, name: str = None):
Initialise the name of the player if provided.
self._name = str(name) if name is not None else object_id(self)
def name(self) -> str:
"""The name of the player or a default value based on its type and hash."""
return self._name
def start_episode(self, word: GridWorld) -> Iterable[Agent.Actions]:
"""Method called at the beginning of the episode. Receives the current world status."""
raise NotImplementedError
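An offline player returns its whole plan from `start_episode`, and the runner then consumes it one action at a time. The sketch below is standalone and makes some assumptions: `OfflinePlayerStub` stands in for `OfflinePlayer`, actions are plain strings, and the consuming loop imitates the `next(plan, None)` call used by `run_episode`.

```python
from typing import Iterable, List


class OfflinePlayerStub:
    """Stand-in for OfflinePlayer so the sketch runs without the package."""

    def start_episode(self, world) -> Iterable[str]:
        raise NotImplementedError


class FixedPlanPlayer(OfflinePlayerStub):
    """Hypothetical player that ignores the world and replays a fixed plan."""

    def __init__(self, plan: List[str]):
        self._plan = list(plan)

    def start_episode(self, world) -> Iterable[str]:
        # a real player would inspect `world` here to compute the plan
        return list(self._plan)


# Consume the plan the way the runner does: one action per step,
# with None signalling that the plan is exhausted.
plan = iter(FixedPlanPlayer(['N', 'E']).start_episode(world=None))
executed = []
action = next(plan, None)
while action is not None:
    executed.append(action)
    action = next(plan, None)
print(executed)
```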
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed, with the outcome of the game and whether the agent was still alive and successful.
# Examples of the use of the API
# Trivial players
class RandomPlayer(OnlinePlayer):
"""This player selects randomly one of the available actions."""
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: int) -> Agent.Actions:
actions_lst = list(actions)
return actions_lst[random.randint(0, len(actions_lst) - 1)]
class UserPlayer(OnlinePlayer):
"""This player asks the user for the next move. It also accepts unambiguous command initials and ignores case."""
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: int) -> Agent.Actions:
actions_dict = { a for a in actions}
print('{} percept:'.format(
print(textwrap.indent(str(percept), ' '))
while True:
answer = input('{}: select an action {} and press enter, or empty to stop: '.format(, list(actions_dict.keys()))).strip()
if len(answer) < 1:
return None
elif answer in actions_dict:
return actions_dict[answer]
options = [k for k in actions_dict.keys() if k.lower().startswith(answer.lower())]
if len(options) == 1:
return actions_dict[options[0]]
print('Cannot understand <{}>'.format(answer))
#!/usr/bin/env python
Functions to run the players
import argparse
import copy
import importlib
import inspect
import io
import json
import os
import re
import sys
from typing import Any, Dict, Iterator, Type, Union
from .gridworld import Agent, GridWorld, GridWorldException, EaterWorld
from .player import OnlinePlayer, OfflinePlayer
from .wumpus import WumpusWorld
def get_subclasses(cls):
for subclass in cls.__subclasses__():
yield from get_subclasses(subclass)
yield subclass
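The generator above walks the class hierarchy depth first and yields every descendant before its parent. A standalone sketch with a hypothetical four-class hierarchy shows the effect:

```python
def get_subclasses(cls):
    """Yield all direct and indirect subclasses, descendants first."""
    for subclass in cls.__subclasses__():
        yield from get_subclasses(subclass)
        yield subclass


# Hypothetical hierarchy, for illustration only
class Base:
    pass


class Child(Base):
    pass


class GrandChild(Child):
    pass


class Sibling(Base):
    pass


names = [c.__name__ for c in get_subclasses(Base)]
print(names)
```

`GrandChild` is reported before `Child` because the recursive `yield from` runs before the subclass itself is yielded.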
def check_entrypoint(arg_value: str, pattern: re.Pattern = re.compile(r"^[\w.-]+:[\w.-]+$")) -> str:
"""Checks that the argument is a valid object reference specification in the form 'importable.module:object.attr'. See <> for details.
arg_value (str): object reference
pattern (re.Pattern, optional): regular expression defining the required reference. Defaults to '^[\w.-]+:[\w.-]+$'.
argparse.ArgumentTypeError: if the argument doesn't comply with the pattern
str: the given argument if it complies with the pattern
if not pattern.match(arg_value):
raise argparse.ArgumentTypeError("the argument doesn't comply with {}".format(pattern.pattern))
return arg_value
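The default pattern requires exactly the `module:object` shape, with word characters, dots and hyphens on both sides of the colon. A quick standalone check of what it accepts (the candidate strings are made-up examples):

```python
import re

# Same regular expression as the default value of `pattern`
ENTRYPOINT = re.compile(r"^[\w.-]+:[\w.-]+$")

candidates = ['mypkg.players:MyPlayer', 'mypkg.players', 'bad ref:Player']
results = {c: bool(ENTRYPOINT.match(c)) for c in candidates}
for candidate, ok in results.items():
    print(candidate, ok)
```

A reference without a colon, or one containing whitespace, is rejected before any import is attempted.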
def get_player_class(object_ref: str, path: os.PathLike = None) -> Union[Type[OnlinePlayer], Type[OfflinePlayer]]:
if path is not None and path not in sys.path:
if not os.path.isdir(path):
raise FileNotFoundError('Directory <{}> not found'.format(path))
sys.path.insert(0, path)
# see <>
modname, qualname_separator, qualname = object_ref.partition(':')
obj = importlib.import_module(modname)
except ModuleNotFoundError as e:
raise ImportError(f'Cannot find entrypoint {object_ref}: {e}')
if qualname_separator:
for attr in qualname.split('.'):
obj = getattr(obj, attr)
except AttributeError as e:
raise ImportError(f'Cannot find entrypoint {object_ref}: {e}')
player_class = obj
if not inspect.isclass(player_class) or not issubclass(player_class, (OnlinePlayer, OfflinePlayer)):
raise RuntimeError(f'{player_class} is not a subclass of OnlinePlayer or OfflinePlayer')
return player_class
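The core of the lookup is the standard way of resolving a `module:attr` reference: import the module, then follow the dotted attribute path. The sketch below isolates that step under the name `resolve` (a hypothetical helper) and tries it on standard-library objects instead of player classes; it omits the `sys.path` handling and the subclass check.

```python
import importlib
import json
import os.path


def resolve(object_ref: str):
    """Hypothetical helper: resolve 'importable.module:object.attr'."""
    modname, qualname_separator, qualname = object_ref.partition(':')
    obj = importlib.import_module(modname)
    if qualname_separator:
        # walk the dotted path, one attribute at a time
        for attr in qualname.split('.'):
            obj = getattr(obj, attr)
    return obj


print(resolve('json:JSONDecoder'))
print(resolve('os.path:join'))
```

Without a colon the function returns the module itself, which is why the original code follows the lookup with an explicit `inspect.isclass` and `issubclass` check.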
def get_world_class(name: str) -> Type[GridWorld]:
"""Return the class of the world corresponding to the given name.
name (str): name of the GridWorld subclass
Type[GridWorld]: GridWorld subclass
world_class = globals().get(name, None)
if world_class is None:
raise NotImplementedError('GridWorld subclass {} is not available'.format(name))
if not issubclass(world_class, GridWorld):
raise NotImplementedError('Class {} is not a subclass of GridWorld'.format(name))
return world_class
def worlds(files, world_class: Type[GridWorld]) -> Iterator[GridWorld]:
for fd in files:
world_defs = json.load(fd)
if isinstance(world_defs, dict):
yield world_class.from_dict(world_defs)
elif isinstance(world_defs, list):
for wd in world_defs:
yield world_class.from_dict(wd)
except Exception as e:
print('Skipping world {}: {}'.format(wd, e))
except Exception as e:
print('Skipping {}: {}'.format(fd, e))
def run_episode(world: GridWorld, player: Union[OnlinePlayer, OfflinePlayer], agent: Agent = None, horizon: int = 0, show=True, outf: io.TextIOBase = None) -> Dict[str, Any]:
"""Run an episode on the world using the player to control the agent. The horizon specifies the maximum number of steps, 0 or None means no limit. If show is true then the world is printed at each iteration before the player's turn.
Raise the exception GridWorldException if the agent is not in the world.
world (GridWorld): the world in which the episode is run
player (OnlinePlayer or OfflinePlayer): the player
agent (Agent, optional): the agent controlled by the player. Defaults to first agent in the world
horizon (int, optional): stop after this number of steps, 0 for no limit. Defaults to 0.
show (bool, optional): whether to show the environment before a step. Defaults to True.
outf (TextIOBase, optional): writes output to the given stream. Defaults to stdout.
dictionary (JSON encodable) with the log of the game
GridWorldException: if there are problems with the world (e.g. there's no agent)
if outf is None:
outf = sys.stdout
if agent is None:
agent = next(o for o in world.objects if isinstance(o, Agent))
except StopIteration:
raise GridWorldException(f'No agent in this {world}')
elif agent not in world.objects:
raise GridWorldException('Missing agent {}, cannot run the episode'.format(agent))
game_log = {
'world': world.to_dict(),
'actions': [],
'exceptions': [],
'maxsteps': False
# inform the player of the start of the episode
if isinstance(player, OfflinePlayer):
plan = iter(player.start_episode(copy.deepcopy(world)))
except Exception as e:
plan = iter([])
print(f'Exception in Player.start_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.start_episode: {e}')
except Exception as e:
print(f'Exception in Player.start_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.start_episode: {e}')
step = 0
reward = None
while not horizon or step < horizon:
if agent.success():
print('The agent {} succeeded!'.format(, file=outf)
if not agent.isAlive:
print('The agent {} died!'.format(, file=outf)
if show:
print(world, file=outf)
if isinstance(player, OfflinePlayer):
action = next(plan, None)
action =, agent.actions(), reward)
except Exception as e:
action = None
print(f'Exception in {e}', file=outf)
game_log['exceptions'].append(f' {e}')
if action is None:
print('Episode terminated by the player {}.'.format(, file=outf)
reward =
print('Step {}: agent {} executing {} -> reward {}'.format(step,,, reward), file=outf)
step += 1
print('Episode terminated by maximum number of steps ({}).'.format(horizon), file=outf)
game_log['maxsteps'] = True
player.end_episode(agent.reward, agent.isAlive, agent.success())
except Exception as e:
print(f'Exception in Player.end_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.end_episode: {e}')
game_log['reward'] = agent.reward
game_log['alive'] = agent.isAlive
game_log['success'] = agent.success()
print(world, file=outf)
print('Episode terminated with a reward of {} for agent {}'.format(agent.reward,, file=outf)
return game_log
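The horizon logic can be read in isolation: a horizon of 0 (or None) disables the step limit, and `maxsteps` is set only when the loop ends because the limit was reached. The following standalone sketch assumes the runner uses a `while`/`else` construct for this; `run_loop` and the string actions are hypothetical simplifications, with no world, agent or rewards involved.

```python
from typing import Any, Callable, Dict, Optional


def run_loop(policy: Callable[[int], Optional[str]], horizon: int = 0) -> Dict[str, Any]:
    """Hypothetical reduction of the episode loop to its control flow."""
    log: Dict[str, Any] = {'actions': [], 'maxsteps': False}
    step = 0
    while not horizon or step < horizon:
        action = policy(step)
        if action is None:
            break  # the player gave up
        log['actions'].append(action)
        step += 1
    else:
        # reached only when the loop condition fails, i.e. the horizon is hit
        log['maxsteps'] = True
    return log


print(run_loop(lambda step: 'N', horizon=3))
print(run_loop(lambda step: 'N' if step < 2 else None))
```

With `horizon=0` the condition `not horizon` is always true, so the only way out is the player returning `None` (or, in the real runner, the agent succeeding or dying).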
"$schema": "",
"definitions": {
"coord": {
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{"type": "integer"},
{"type": "integer"}
"hunter": {
"type": "array",
"minItems": 2,
"items": [
{"type": "integer"},
{"type": "integer"},
{"type": "string", "enum": ["N", "S", "E", "W"]}
"type": "object",
"properties": {
"id": {"type": "string"},
"size": {"$ref": "#/definitions/coord"},
"hunters": {"type": "array", "items": {"$ref": "#/definitions/hunter"}},
"pits": {"type": "array", "items": {"$ref": "#/definitions/coord"}},
"wumpuses": {"type": "array", "items": {"$ref": "#/definitions/coord"}},
"exits": {"type": "array", "items": {"$ref": "#/definitions/coord"}},
"golds": {"type": "array", "items": {"$ref": "#/definitions/coord"}},
"blocks": {"type": "array", "items": {"$ref": "#/definitions/coord"}}
"required": ["size"]
\ No newline at end of file
from dataclasses import dataclass
from enum import Enum
import json
import random
from typing import Iterable, NamedTuple, Dict, Sequence, Tuple
from typing import Any, Iterable, Dict, Sequence, Tuple, Union
from .gridworld import Agent, WorldObject, GridWorld, Coordinate, coord, GridWorldException, UserPlayer
from .gridworld import Actions, Agent, WorldObject, GridWorld, Coordinate, coord, GridWorldException, Percept
class WumpusWorldObject(WorldObject):
......@@ -39,7 +38,7 @@ class Exit(WumpusWorldObject):
class Hunter(Agent):
class Actions(Agent.Actions):
class Actions(Actions):
MOVE = 0
LEFT = 2
......@@ -53,7 +52,8 @@ class Hunter(Agent):
E = (1, 0)
W = (-1, 0)
class Percept(NamedTuple):
class Percept(Percept):
stench: bool
breeze: bool
bump: bool
......@@ -91,7 +91,7 @@ class Hunter(Agent):
self._arrow: bool = True
self._done: bool = False
self._bump: bool = False
self._has_gold: bool = False
self._has_gold: int = 0
def world(self) -> 'WumpusWorld':
......@@ -101,7 +101,7 @@ class Hunter(Agent):
"""Return true once the goal of the agent has been achieved."""
return self._done
def percept(self):
def percept(self) -> 'Hunter.Percept':
return self.Percept(
......@@ -165,7 +165,7 @@ class Hunter(Agent):
def glitter(self):
return is not None
def bump(self):
......@@ -212,29 +212,35 @@ class Hunter(Agent):
reward -= 1
elif action == self.Actions.GRAB:
reward -= 1
for gold in
self._has_gold = True
gold =
if gold:
self._has_gold += 1
elif action == self.Actions.CLIMB:
reward -= 1
self._done = True
if self._has_gold:
reward += 1000
reward += 1000 * self._has_gold
raise ValueError('Unrecognised action {}'.format(action))
self._reward += reward
return reward
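With `_has_gold` changed from a flag to a counter, the bonus for climbing out scales with the number of gold ingots collected. A worked sketch of the arithmetic for the `CLIMB` action follows, assuming the per-action reward starts from 0 before the one-point cost is subtracted; `climb_reward` is a hypothetical helper, not part of the package.

```python
def climb_reward(gold_count: int) -> int:
    """Hypothetical helper: reward of a CLIMB action given the gold carried."""
    reward = 0   # assumption: the reward of a single action starts at 0
    reward -= 1  # climbing costs one point, like the other actions shown
    if gold_count:
        reward += 1000 * gold_count
    return reward


for gold in (0, 1, 2):
    print(gold, climb_reward(gold))
```

Under the previous boolean version, two ingots would have paid the same as one.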
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
self._alive = False
reward = -1000
self._reward += reward
return reward
class WumpusWorld(GridWorld):
def classic(cls, size: int = 4, seed=None, pitProb: float = .2):
def classic(cls, size: Union[int, Coordinate] = 4, seed=None, pitProb: float = .2, **kwargs):
"""Create a classic wumpus world problem of the given size. The agent is placed in (0,0) facing north and there's exactly one wumpus and a gold ingot. The seed is used to initialise the random number generation and pits are placed with pitProb probability."""
world = cls(coord(size, size), [])
world = cls(size if isinstance(size, Coordinate) else coord(size, size), [])
agentPos = coord(0, 0)
......@@ -246,14 +252,18 @@ class WumpusWorld(GridWorld):
return world
def from_JSON(cls, json_obj: Dict):
def random(cls, size: Union[int, Coordinate] = 4, seed=None, pitProb: float = .2, **kwargs):
return cls.classic(size=size, seed=seed, pitProb=pitProb, **kwargs)
def from_dict(cls, desc: Dict[str, Any]):
"""Create a wumpus world from a JSON object"""
def getCoord(lst: Sequence[int]) -> Coordinate:
return coord(lst[0], lst[1])
def coordLst(key: str) -> Iterable[Coordinate]:
data = json_obj.get(key, [])
data = desc.get(key, [])
if len(data) < 1:
return []
elif isinstance(data[0], Sequence):
......@@ -262,9 +272,9 @@ class WumpusWorld(GridWorld):
return [getCoord(data)]
size = coordLst('size')[0]
size = getCoord(desc.get('size', [8, 8]))
blocks = coordLst('blocks')
hunters = coordLst('hunters')
hunters = desc.get('hunters', [])
pits = coordLst('pits')
wumpuses = coordLst('wumpuses')
exits = coordLst('exits')
......@@ -272,10 +282,15 @@ class WumpusWorld(GridWorld):
world = cls(size, blocks)
for pos in hunters:
world.addObject(Hunter(orientation=Hunter.Orientation.N), pos)
for hunter in (hunters if (len(hunters) == 0) or isinstance(hunters[0], Sequence) else [hunters]):
pos = getCoord(hunter)
orientation = Hunter.Orientation[str(hunter[2]).upper()]
except Exception:
orientation = Hunter.Orientation.N
world.addObject(Hunter(orientation=orientation), pos)
for pos in (exits or hunters or [coord(0, 0)]):
for pos in (exits or [coord(0, 0)]):
world.addObject(Exit(), pos)
for pos in wumpuses:
......@@ -338,11 +353,11 @@ class WumpusWorld(GridWorld):
"""Return a gold if it's at the coordinate or None."""
return next(iter(self._objAt(Gold, pos)), None)
def isExit(self, pos: Coordinate) -> Iterable[Exit]:
def isExit(self, pos: Coordinate) -> Exit:
"""Return an exit if it's at the coordinate or None."""
return next(iter(self._objAt(Exit, pos)), None)
def to_JSON(self) -> Dict:
def to_dict(self) -> Dict[str, Any]:
"""Return a JSON serialisable object with the description of the world."""
def coord_tuple(c: Coordinate) -> Tuple[int]:
......@@ -361,7 +376,7 @@ class WumpusWorld(GridWorld):
golds = []
for obj in self.objects:
if isinstance(obj, Hunter):
hunters.append(coord_tuple(obj.location) + (,))
elif isinstance(obj, Pit):
elif isinstance(obj, Wumpus):
......@@ -435,22 +450,3 @@ class WumpusWorld(GridWorld):
return '\n'.join(lines)
if __name__ == "__main__":
WORLD_MAP = '\n'.join([
# world = WumpusWorld.randomWorld(size=7, blockProb=0.1, world_desc=WORLD_MAP)
world = WumpusWorld.classic(size=7)
hunter = next(iter(o for o in world.objects if isinstance(o, Hunter)), None)
world.run_episode(hunter, UserPlayer.player())
JSON_STRING = '{"size": [7, 7], "hunters": [[0, 0]], "pits": [[4, 0], [3, 1], [2, 2], [6, 2], [4, 4], [3, 5], [4, 6], [5, 6]], "wumpuses": [[1, 2]], "exits": [[0, 0]], "golds": [[6, 3]]}'
world = WumpusWorld.from_JSON(json.loads(JSON_STRING))