Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (21)
......@@ -9,22 +9,22 @@ The package has been written to be used in the master course of AI taught at the
You can download the source code from <https://gitlab.inf.unibz.it/tessaris/wumpus> and use `pip install .`, or install directly from the repository using
```
pip install https://gitlab.inf.unibz.it/tessaris/wumpus/-/archive/master/wumpus-master.tar.gz
pip install git+https://gitlab.inf.unibz.it/tessaris/wumpus.git@master
```
## Usage
To write your own player you should create a subclass of `Player` (defined in [gridworld.py](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/wumpus/gridworld.py)) and then use an instance as a parameter of the `run_episode` method of `GridWorld` class (defined in [gridworld.py](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/wumpus/gridworld.py)).
To write your own player you should create a subclass of `OnlinePlayer` or `OfflinePlayer` (defined in [player.py](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/wumpus/player.py)) and then use an instance as a parameter of the `run_episode` function (defined in [runner.py](https://gitlab.inf.unibz.it/tessaris/wumpus/-/blob/master/wumpus/runner.py)).
Examples of the usage of the package can be found in the implementation of two players `RandomPlayer` and `UserPlayer` from [gridworld.py](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/wumpus/gridworld.py), and in the files [`wumpus-usage.py`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples/wumpus-usage.py), [`eater-usage.py`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples/eater-usage.py) in the [`examples`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples) directory of the repository.
Examples of the usage of the package can be found in the implementation of two players `RandomPlayer` and `UserPlayer` in [player.py](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/wumpus/player.py), and in the files [`wumpus_usage.py`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples/wumpus_usage.py), [`eater_usage.py`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples/eater_usage.py) in the [`examples`](https://gitlab.inf.unibz.it/tessaris/wumpus/blob/master/examples) directory of the repository.
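As a minimal sketch of the subclassing step described above, the snippet below defines an online player with the `play(percept, actions, reward)` signature that `player.py` introduces. To keep it runnable without the package installed, `OnlinePlayer` here is a stand-in base class, and `Action` and `CountingRandomPlayer` are hypothetical names used only for illustration.

```python
import random
from enum import Enum
from typing import Iterable, Optional


class OnlinePlayer:
    """Stand-in for wumpus.OnlinePlayer (assumption: same method names)."""

    def __init__(self, name: Optional[str] = None):
        self.name = name if name is not None else type(self).__name__

    def start_episode(self):
        pass

    def end_episode(self, outcome: int, alive: bool, success: bool):
        pass

    def play(self, percept, actions: Iterable, reward: Optional[int]):
        raise NotImplementedError


class Action(Enum):
    """Hypothetical action set, mirroring the four Eater moves."""
    N = (0, 1)
    S = (0, -1)
    E = (1, 0)
    W = (-1, 0)


class CountingRandomPlayer(OnlinePlayer):
    """Pick a random valid action and track the accumulated reward."""

    def start_episode(self):
        self.total = 0

    def play(self, percept, actions, reward):
        # reward is None on the first turn, afterwards it is the
        # reward obtained by the previous action
        if reward is not None:
            self.total += reward
        return random.choice(list(actions))


# Drive the player by hand for two turns (no world involved).
player = CountingRandomPlayer('Hungry.Monkey')
player.start_episode()
first = player.play(percept=None, actions=Action, reward=None)
second = player.play(percept=None, actions=Action, reward=-1)
```

With the package installed, the stand-in class would be replaced by `from wumpus import OnlinePlayer, run_episode`, and the player would be passed to `run_episode(world, player)` as the example scripts do.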
Your player can also be run with the `gridrunner` script (the `runner.py` file in the repository), which is available once the package is installed; alternatively, run it with `python -m wumpus.runner`:
Your player can also be run with the `gridrunner` script (the [`cli.py`](https://gitlab.inf.unibz.it/tessaris/wumpus/-/blob/master/wumpus/cli.py) file in the repository), which is available once the package is installed; alternatively, run it with `python -m wumpus.cli`:
```
``` bash
$ gridrunner --help
usage: gridrunner [-h] [--name NAME] [--path PATH] --entry ENTRY
[--world {EaterWorld,WumpusWorld}] [--horizon HORIZON]
[--noshow] [--out OUT] [--version]
[--noshow] [--out OUT] [--version] [--log LOG]
[infiles [infiles ...]]
Run episodes on worlds using the specified player.
......@@ -54,31 +54,32 @@ optional arguments:
--out OUT, -o OUT write output to file (default: <_io.TextIOWrapper
name='<stdout>' mode='w' encoding='UTF-8'>)
--version show program's version number and exit
--log LOG, -l LOG write the log of the games to file (JSON) (default:
None)
```
For example:
``` bash
$ gridrunner --world EaterWorld --entry wumpus:RandomPlayer --noshow --horizon 5 --path examples ./examples/eater-world.json
┌────────┐
│.....│
│.██...│
│.....│
│.....│
│🍌🐒.🍌.│
└────────┘
Step 0: agent Eater_df9919cd executing W -> reward 9
Step 1: agent Eater_df9919cd executing S -> reward -1
Step 2: agent Eater_df9919cd executing N -> reward -1
Step 3: agent Eater_df9919cd executing W -> reward -1
Step 4: agent Eater_df9919cd executing N -> reward -1
$ gridrunner --world EaterWorld --entry wumpus:RandomPlayer --noshow --horizon 5 ./examples/eater-world.json
Step 0: agent Eater_c881e1b0 executing N -> reward -1
Step 1: agent Eater_c881e1b0 executing W -> reward -1
Step 2: agent Eater_c881e1b0 executing E -> reward -1
Step 3: agent Eater_c881e1b0 executing W -> reward -1
Step 4: agent Eater_c881e1b0 executing E -> reward -1
Episode terminated by maximum number of steps (5).
┌─────────┐
┌─────────
│.....│
│.██...│
│🐒....│
│.....│
│...🍌.│
└─────────┘
Episode terminated with a reward of 5 for agent Eater_df9919cd
│.🐒...│
│🍌..🍌.│
└──────────┘
Episode terminated with a reward of -5 for agent Eater_c881e1b0
```
You can also use a player defined in a script; e.g., if the player class `GooPlayer` is defined in `eater_usage.py`, you can use the `eater_usage:GooPlayer` entry. The usual Python rules for finding modules apply, with the current directory added to the search path. If the script is in a different directory, use the `--path` argument to tell `gridrunner` where to find it:
```bash
gridrunner --world EaterWorld --entry eater_usage:GooPlayer --path examples --noshow --horizon 5 ./examples/eater-world.json
```
\ No newline at end of file
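The `module:object` entry format and the `--path` option can be illustrated with a short sketch of how such a reference might be resolved. This is not the package's own `get_player_class`; `load_object` is a hypothetical helper that only assumes the `importable.module:object.attr` form mentioned in the `--entry` help text.

```python
import importlib
import sys
from functools import reduce
from typing import Optional


def load_object(obj_ref: str, path: Optional[str] = None):
    """Resolve 'importable.module:object.attr' to a Python object."""
    module_name, _, attrs = obj_ref.partition(':')
    if path is not None and path not in sys.path:
        # prepend, so that modules found in `path` take precedence
        sys.path.insert(0, path)
    module = importlib.import_module(module_name)
    if not attrs:
        return module
    # walk the dotted attribute chain that follows the colon
    return reduce(getattr, attrs.split('.'), module)


# Standard-library references, used here only to exercise the helper.
ordered_dict = load_object('collections:OrderedDict')
join = load_object('os:path.join')
```

Under this scheme, `eater_usage:GooPlayer` with `--path examples` would amount to prepending `examples` to `sys.path`, importing `eater_usage`, and reading its `GooPlayer` attribute.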
......@@ -8,19 +8,18 @@ import random
from typing import Iterable
import sys
from wumpus import InformedPlayer
from wumpus.gridworld import Eater, EaterWorld, Food
from wumpus import OfflinePlayer, run_episode, Eater, EaterWorld, Food
class MyPlayer(InformedPlayer):
class GooPlayer(OfflinePlayer):
"""
Informed player demonstrating the use of the start episode method to inspect the world.
Offline player demonstrating the use of the start episode method to inspect the world.
"""
def _say(self, text: str):
print(self.name + ' says: ' + text)
def start_episode(self, world: EaterWorld):
def start_episode(self, world: EaterWorld) -> Iterable[Eater.Actions]:
"""
Print the description of the world before starting.
"""
......@@ -35,6 +34,7 @@ class MyPlayer(InformedPlayer):
for o in world.objects:
if isinstance(o, Eater):
eater_location = (o.location.x, o.location.y)
all_actions = list(o.Actions)
elif isinstance(o, Food):
food_locations.append((o.location.x, o.location.y))
......@@ -47,23 +47,20 @@ class MyPlayer(InformedPlayer):
self._say('Food in {}'.format(sorted(food_locations)))
self._say('Blocks in {}'.format(block_locations))
self._say('Available actions: {}'.format({a.name: a.value for a in Eater.Actions}))
self._say('Available actions: {}'.format({a.name: a.value for a in all_actions}))
def end_episode(self):
"""Method called when an episode is completed."""
self._say('Episode completed, my reward is {}'.format(self.reward))
# creates an iterator that returns a sequence of random actions
def random_actions():
# prevent unbounded iterations
for _ in range(10000):
yield all_actions[random.randint(0, len(all_actions) - 1)]
return random_actions()
# random player
def play(self, turn: int, percept: Eater.Percept, actions: Iterable[Eater.Actions]) -> Eater.Actions:
actions_lst = list(actions)
next_move = actions_lst[random.randint(0, len(actions_lst) - 1)]
self._say('I see {}, my next move is {}'.format(percept, next_move.name))
return next_move
def feedback(self, action: Eater.Actions, reward: int, percept: Eater.Percept):
"""Receive in input the reward of the last action and the resulting state. The function is called right after the execution of the action."""
self._say('Moved to {} with reward {}'.format(percept.position, reward))
self.reward += reward
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed."""
self._say('Episode completed, my reward is {}'.format(outcome))
MAP_STR = """
......@@ -81,12 +78,12 @@ def main(*args):
Play a random EaterWorld episode using the default player
"""
player_class = MyPlayer
player_class = GooPlayer
world = EaterWorld.random(MAP_STR)
world = EaterWorld.random(map_desc=MAP_STR)
player = player_class('Hungry.Monkey')
world.run_episode(player, horizon=20)
run_episode(world, player, horizon=20)
return 0
......
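The offline player above returns its whole plan from `start_episode` as a bounded iterator of random actions. A minimal standalone sketch of the same idea follows, using `itertools.islice` instead of a `for` loop over `range`; the string actions and the `limit` parameter are assumptions made for the illustration.

```python
import itertools
import random
from typing import Iterator, Sequence, TypeVar

A = TypeVar('A')


def random_actions(all_actions: Sequence[A], limit: int = 10000) -> Iterator[A]:
    """Yield at most `limit` uniformly random actions."""
    # conceptually endless stream of random choices ...
    endless = (random.choice(all_actions) for _ in itertools.count())
    # ... cut off after `limit` items to prevent unbounded iteration
    return itertools.islice(endless, limit)


plan = list(random_actions(['N', 'S', 'E', 'W'], limit=5))
```

The cap matters because the consumer of the plan may keep pulling actions until the episode ends; a finite iterator guarantees termination even without a horizon.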
name: wumpus
channels:
- conda-forge
- nodefaults
dependencies:
- python>=3.6
- gym
- python>=3.8
- gym<=0.22
- pip
- pip:
- wumpus[gym] @ git+https://gitlab.inf.unibz.it/tessaris/wumpus.git
\ No newline at end of file
- wumpus @ https://gitlab.inf.unibz.it/tessaris/wumpus/-/archive/dev/wumpus-dev.zip
\ No newline at end of file
......@@ -2,16 +2,18 @@
# Examples demonstrating the use of the Wumpus package
import argparse
import random
import sys
from typing import Iterable
import wumpus as wws
class MyPlayer(wws.InformedPlayer, wws.UserPlayer):
"""Informed player demonstrating the use of the start episode method to inspect the world."""
class GooPlayer(wws.OfflinePlayer):
"""Offline player demonstrating the use of the start episode method to inspect the world."""
def start_episode(self, world: wws.WumpusWorld):
def start_episode(self, world: wws.WumpusWorld) -> Iterable[wws.Hunter.Actions]:
"""Print the description of the world before starting."""
world_info = {k: [] for k in ('Hunter', 'Pits', 'Wumpus', 'Gold', 'Exits')}
......@@ -21,6 +23,7 @@ class MyPlayer(wws.InformedPlayer, wws.UserPlayer):
for obj in world.objects:
if isinstance(obj, wws.Hunter):
world_info['Hunter'].append((obj.location.x, obj.location.y))
all_actions = list(obj.Actions)
elif isinstance(obj, wws.Pit):
world_info['Pits'].append((obj.location.x, obj.location.y))
elif isinstance(obj, wws.Wumpus):
......@@ -34,23 +37,31 @@ class MyPlayer(wws.InformedPlayer, wws.UserPlayer):
for k in ('Size', 'Pits', 'Wumpus', 'Gold', 'Exits', 'Blocks'):
print(' {}: {}'.format(k, world_info.get(k, None)))
# creates an iterator that returns a sequence of random actions
def random_actions():
# prevent unbounded iterations
for _ in range(10000):
yield all_actions[random.randint(0, len(all_actions) - 1)]
def play_classic(size: int = 0):
return random_actions()
def classic(size: int = 0):
"""Play the classic version of the wumpus."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player without any knowledge about the world
world.run_episode(wws.UserPlayer())
wws.run_episode(world, wws.UserPlayer())
def play_classic_informed(size: int = 0):
def classic_offline(size: int = 0):
"""Play the classic version of the wumpus with a player knowing the world and the agent."""
# create the world
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player with knowledge about the world
world.run_episode(MyPlayer())
wws.run_episode(world, GooPlayer())
WUMPUS_WORLD = '''
......@@ -67,13 +78,13 @@ WUMPUS_WORLD = '''
'''
def play_fixed_informed(world_json: str = WUMPUS_WORLD):
def fixed_offline(world_json: str = WUMPUS_WORLD):
"""Play on a given world described in JSON format."""
# create the world
world = wws.WumpusWorld.from_JSON(world_json)
# Run a player with knowledge about the world
world.run_episode(MyPlayer())
wws.run_episode(world, GooPlayer())
def real_deal(size: int = 0):
......@@ -82,22 +93,21 @@ def real_deal(size: int = 0):
world = wws.WumpusWorld.classic(size=size if size > 3 else random.randint(4, 8))
# Run a player without any knowledge about the world
world.run_episode(wws.UserPlayer(), show=False)
wws.run_episode(world, wws.UserPlayer(), show=False)
EXAMPLES = (play_classic, play_classic_informed, play_fixed_informed, real_deal)
EXAMPLES = (classic, classic_offline, fixed_offline, real_deal)
def main(*args):
def main(*cargs):
"""Demonstrate the use of the wumpus API on selected worlds"""
ex_names = {ex.__name__.lower(): ex for ex in EXAMPLES}
ex = None
if len(args) > 0:
ex_name = args[0]
if ex_name.lower() in ex_names:
ex = ex_names[ex_name.lower()]
else:
print('Example {} not among the available {}'.format(ex_name, list(ex_names.keys())))
return -1
parser = argparse.ArgumentParser(description=main.__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('example', nargs='?', help='select one of the available examples', choices=list(ex_names.keys()))
args = parser.parse_args(cargs)
if args.example:
ex = ex_names[args.example.lower()]
else:
# Randomly play one of the examples
ex = random.choice(EXAMPLES)
......
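The switch from manual argument checks to `argparse` relies on an optional positional argument (`nargs='?'`) restricted by `choices`. The sketch below isolates that pattern; the two example functions and `pick_example` are placeholders, not the ones in `wumpus_usage.py`.

```python
import argparse


def classic():
    return 'classic'


def real_deal():
    return 'real_deal'


EXAMPLES = (classic, real_deal)


def pick_example(*cargs):
    """Return the example selected on the command line, or None."""
    ex_names = {ex.__name__.lower(): ex for ex in EXAMPLES}
    parser = argparse.ArgumentParser(description='select an example')
    # nargs='?' makes the positional optional; when it is omitted the
    # attribute is None, and invalid names are rejected by `choices`
    parser.add_argument('example', nargs='?', choices=list(ex_names))
    args = parser.parse_args(cargs)
    return ex_names[args.example] if args.example else None
```

Compared with the removed hand-written check, an unknown name now makes `argparse` print a usage message listing the valid choices and exit, rather than returning `-1`.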
import copy
from dataclasses import dataclass
import dataclasses
from typing import Dict, Tuple, Union
import gym
......@@ -31,11 +33,11 @@ class WumpusEnv(gym.Env):
# Actions are discrete integer values
self.action_space = spaces.Discrete(len(self.actions))
self.observation_space = spaces.Dict({k: spaces.Discrete(2) for k in Hunter.Percept._fields})
self.observation_space = spaces.Dict({f.name: spaces.Discrete(2) for f in dataclasses.fields(Hunter.Percept)})
def _percept_to_space(self) -> Dict[str, int]:
percept = self.__agent.percept()
return {k: (1 if getattr(percept, k) else 0) for k in percept._fields}
return {k: (1 if v else 0) for k, v in dataclasses.asdict(percept).items()}
@classmethod
def space_to_percept(cls, obs: Dict[str, int]) -> Hunter.Percept:
......
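This hunk follows from `Percept` becoming a frozen dataclass instead of a `NamedTuple`, so `_fields` is replaced by `dataclasses.fields` and `dataclasses.asdict`. A small sketch of the conversion, assuming a hypothetical percept with three boolean sensors (the real `Hunter.Percept` fields are not shown in this diff):

```python
import dataclasses
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Percept:
    """Hypothetical percept with boolean sensors."""
    stench: bool = False
    breeze: bool = False
    glitter: bool = False


def field_names(percept_class) -> List[str]:
    # dataclass counterpart of NamedTuple._fields
    return [f.name for f in dataclasses.fields(percept_class)]


def percept_to_obs(percept) -> Dict[str, int]:
    # map every field to 0/1, as a Discrete(2) observation would expect
    return {k: (1 if v else 0) for k, v in dataclasses.asdict(percept).items()}


obs = percept_to_obs(Percept(breeze=True))
```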
[metadata]
name = wumpus
version = attr:wumpus.__version__
description = This package implements a Python version of the Hunt the Wumpus game as described in the book Artificial Intelligence: A Modern Approach by Russell and Norvig.
long_description = file: README.md
license = MIT
author = Sergio Tessaris
author_email = tessaris@inf.unibz.it
[options]
zip_safe = False
include_package_data = True
packages = find:
python_requires = >=3.8
install_requires =
gym < 0.24
[options.entry_points]
console_scripts =
gridrunner = wumpus.cli:main
from setuptools import setup, find_packages
import re
import codecs
import os
import subprocess
#!/usr/bin/env python
import setuptools
# Hack to allow non-normalised versions
# see <https://github.com/pypa/setuptools/issues/308>
from setuptools.extern.packaging import version
version.Version = version.LegacyVersion
_INCLUDE_GIT_REV_ = False
# see <https://stackoverflow.com/a/39671214> and
# <https://packaging.python.org/guides/single-sourcing-package-version>
def find_version(*pkg_path):
pkg_dir = os.path.join(os.path.abspath(os.path.dirname(__file__)), *pkg_path)
version_file = codecs.open(os.path.join(pkg_dir, '__init__.py'), 'r').read()
version_match = re.search(r"^__version__ = ['\"]([^'\"]*)['\"]",
version_file, re.M)
_git_revision_ = None
if _INCLUDE_GIT_REV_:
try:
_git_revision_ = subprocess.check_output(['git', 'describe', '--always', '--dirty'], encoding='utf-8').strip()
except subprocess.CalledProcessError:
pass
if version_match:
return version_match.group(1) + ('' if _git_revision_ is None else '+' + _git_revision_)
elif _git_revision_:
return _git_revision_
raise RuntimeError("Unable to find version string.")
def long_description_md(fname='README.md'):
this_directory = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(this_directory, fname), encoding='utf-8') as f:
long_description = f.read()
return long_description
setup(
name='wumpus',
version=find_version('wumpus'),
description='Wumpus world simulator',
long_description=long_description_md(),
long_description_content_type='text/markdown',
author='Sergio Tessaris',
author_email='tessaris@inf.unibz.it',
packages=find_packages(),
include_package_data=True,
entry_points= {
'console_scripts': ['gridrunner=wumpus.runner:main']
},
install_requires=[
],
extras_require={
'gym': ['gym']
},
exclude_package_data={'': ['.gitignore']},
)
if __name__ == "__main__":
setuptools.setup()
\ No newline at end of file
from wumpus.wumpus import WumpusWorld, Hunter, Wumpus, Pit, Gold, Exit
from wumpus.gridworld import Agent, Percept, InformedPlayer, UninformedPlayer, UserPlayer, RandomPlayer, Coordinate, coord
from .gridworld import Agent, Percept, Coordinate, coord, GridWorld, EaterWorld, GridWorldException, Eater, Food
from .player import OfflinePlayer, OnlinePlayer, UserPlayer, RandomPlayer
from .wumpus import WumpusWorld, Hunter, Wumpus, Pit, Gold, Exit
from .runner import run_episode
__version__ = '0.3.1'
__version__ = '1.1.0'
#!/usr/bin/env python
"""
Command line interface
"""
import argparse
import io
import json
import os
import sys
from . import __version__
from .gridworld import GridWorld
from .runner import get_subclasses, check_entrypoint, get_player_class, get_world_class, run_episode, worlds
def gridrunner(*args):
"""
Run episodes on worlds using the specified player.
"""
world_classes = sorted(get_subclasses(GridWorld), key=lambda c: c.__name__)
parser = argparse.ArgumentParser(description=gridrunner.__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('infiles', type=argparse.FileType('r'), nargs='*', help='world description JSON files, they must be compatible with the world type (see --world option).')
parser.add_argument('--name', '-n', type=str, help='name of the player, default to the name of the player class')
parser.add_argument('--path', '-p', type=str, default='.', help="path of the player library, it's prepended to the sys.path variable")
parser.add_argument('--entry', '-e', type=check_entrypoint, required=True, help="object reference for a Player subclass in the form 'importable.module:object.attr'. See <https://packaging.python.org/specifications/entry-points/#data-model> for details.")
parser.add_argument('--world', '-w', type=str, default=world_classes[0].__name__, choices=[c.__name__ for c in world_classes], help='class name of the world')
parser.add_argument('--horizon', '-z', type=int, default=20, help='maximum number of steps')
parser.add_argument('--noshow', action='store_false', help="prevent printing the world at each step")
parser.add_argument('--out', '-o', type=argparse.FileType('w'), default=sys.stdout, help="write output to file")
parser.add_argument('--version', action='version', version='%(prog)s ' + __version__)
parser.add_argument('--log', '-l', type=argparse.FileType('w'), help="write the log of the games to file (JSON)")
args_dict = vars(parser.parse_args(args))
name = args_dict['name']
path = os.path.abspath(args_dict['path']) if args_dict['path'] != '.' else os.getcwd()
obj_ref = args_dict['entry']
world_type = args_dict['world']
horizon = args_dict['horizon']
show = args_dict['noshow']
outf: io.TextIOBase = args_dict['out']
game_log: io.TextIOBase = args_dict['log']
player_class = get_player_class(obj_ref, path=path)
world_class = get_world_class(world_type)
if name is None:
name = player_class.__name__
player = player_class(name=name)
if game_log is not None:
print('[', file=game_log)
if len(args_dict['infiles']) > 0:
morelogs = False
for world in worlds(args_dict['infiles'], world_class):
glog = run_episode(world, player, horizon=horizon, show=show, outf=outf)
if game_log is not None:
if morelogs:
print(',', file=game_log)
else:
morelogs = True
json.dump(glog, game_log)
else:
world = world_class.random()
# show world definition
print('-' * 10 + ' Playing on world: ' + '-' * 10, file=outf)
world.to_JSON(outf)
print('\n' + '-' * 40, file=outf)
glog = run_episode(world, player, horizon=horizon, show=show, outf=outf)
if game_log is not None:
json.dump(glog, game_log)
if game_log is not None:
print(']', file=game_log)
return 0
def main():
sys.exit(gridrunner(*sys.argv[1:]))
if __name__ == "__main__":
main()
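The `--log` handling in `gridrunner` writes a JSON array incrementally: an opening bracket, the per-episode logs separated by commas, and a closing bracket. The same technique in isolation might look like the sketch below; `dump_logs` and the log entries are hypothetical, since the structure returned by `run_episode` is not shown in this diff.

```python
import io
import json
from typing import Iterable, TextIO


def dump_logs(logs: Iterable[dict], out: TextIO) -> None:
    """Stream `logs` to `out` as a single JSON array."""
    print('[', file=out)
    first = True
    for entry in logs:
        if not first:
            # separator goes before every entry except the first
            print(',', file=out)
        first = False
        json.dump(entry, out)
    print(']', file=out)


buffer = io.StringIO()
dump_logs([{'episode': 1, 'reward': -5}, {'episode': 2, 'reward': 9}], buffer)
parsed = json.loads(buffer.getvalue())
```

Writing the separator before each entry rather than after it avoids a trailing comma, which JSON does not allow, and keeps the output valid even when there are no episodes.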
......@@ -5,13 +5,12 @@ that might populate it. Including a simple agent that can move in four direction
"""
import collections
import copy
from dataclasses import dataclass
from enum import Enum
from inspect import cleandoc
import io
import json
import random
import textwrap
import sys
from typing import Set, NamedTuple, Iterable, Dict, Union, Any
......@@ -133,6 +132,11 @@ class Agent(WorldObject):
"""Execute an action and return the reward of the action."""
raise NotImplementedError
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
# I don't know how to die
return 0
def on_done(self):
"""Called when the episode terminates."""
pass
......@@ -166,37 +170,28 @@ class GridWorld(object):
self._location: Dict[Coordinate, Iterable[WorldObject]] = {}
@classmethod
def random(cls, **kwargs):
def random(cls, map_desc: str=None, size: Coordinate=None, blocks: Iterable[Coordinate]=None, **kwargs):
"""Create a new world from a map description or from the given size and block positions.
Args:
map_desc (str, optional): map of the world. Defaults to None.
size (Coordinate, optional): size of the world. Defaults to None.
blocks (Iterable[Coordinate], optional): location of the blocks. Defaults to [].
blocks (Iterable[Coordinate], optional): location of the blocks. Defaults to None (random placement).
"""
map_desc: str = kwargs.get('map_desc', None)
size: Coordinate = kwargs.get('size', None)
blocks: Iterable[Coordinate] = kwargs.get('blocks', [])
if map_desc is not None:
return cls.from_string(map_desc)
if size is None:
new_size = random.randint(4, 8)
size = coord(new_size, new_size)
world = cls(size, blocks)
if len(blocks) == 0:
if blocks is None:
# randomly place blocks in the world
new_blocks = set()
blocks = set()
occupy = int(random.random() * 0.1 * size.x * size.y)
while len(new_blocks) < occupy:
new_blocks.add(coord(random.randint(0, size.x - 1), random.randint(0, size.y - 1)))
for pos in new_blocks:
world.addBlock(pos)
return world
while len(blocks) < occupy:
blocks.add(coord(random.randint(0, size.x - 1), random.randint(0, size.y - 1)))
return cls(size, blocks)
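The random placement in `GridWorld.random` draws coordinates until a randomly chosen number of distinct cells, at most 10% of the grid, is occupied. A standalone sketch of that loop, using plain tuples in place of `Coordinate` and a hypothetical `max_fraction` parameter:

```python
import random
from typing import Set, Tuple


def random_blocks(width: int, height: int, max_fraction: float = 0.1) -> Set[Tuple[int, int]]:
    """Pick a random set of distinct cells covering up to `max_fraction` of the grid."""
    # random.random() is in [0, 1), so occupy stays below max_fraction * area
    occupy = int(random.random() * max_fraction * width * height)
    blocks: Set[Tuple[int, int]] = set()
    # a set discards duplicates, so loop until enough distinct cells exist
    while len(blocks) < occupy:
        blocks.add((random.randint(0, width - 1), random.randint(0, height - 1)))
    return blocks


random.seed(0)
blocks = random_blocks(8, 8)
```

Because `occupy` is always well below the number of cells, the loop is guaranteed to terminate.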
@classmethod
def from_string(cls, world_desc: str) -> 'GridWorld':
......@@ -408,170 +403,6 @@ class GridWorld(object):
return ascii_to_wide(top_frame + "\n".join(reversed([side_frame + ''.join(maze_strs[i]) + side_frame for i in range(self.size.y)])) + bottom_frame)
def run_episode(self, player: 'Player', horizon: int = 0, show=True, outf: io.TextIOWrapper = sys.stdout):
"""Run an episode on the world using the player to control the agent. The horizon specifies the maximum number of steps, 0 or None means no limit. If show is true then the world is printed at each iteration before the player's turn.
Raise the exception GridWorldException if the agent is not in the world."""
# get the first agent
agent = next(iter(o for o in self.objects if isinstance(o, Agent)), None)
if agent is None:
raise GridWorldException('No agent in this world')
run_episode(self, agent, player, horizon=horizon, show=show, outf=outf)
class Player(object):
"""A player for a given agent. It implements the play method which should
return one of the actions for the agent or None to give up.
"""
def __init__(self, name: str = None):
"""
Initialise the name of the agent if provided.
"""
if name is not None:
self.name = str(name)
def start_episode(self):
"""Method called at the beginning of the episode."""
pass
def end_episode(self):
"""Method called when an episode is completed."""
pass
def play(self, turn: int, percept: Percept, actions: Iterable[Agent.Actions]) -> Agent.Actions:
"""Given a turn (integer) and a percept, which might differ according to the specific problem, returns an action, among the given list of possible actions, to play at the given turn or None to stop the episode."""
raise NotImplementedError
def feedback(self, action: Agent.Actions, reward: int, percept: Percept):
"""Receive in input the reward of the last action and the resulting state. The function is called right after the execution of the action."""
pass
@property
def name(self) -> str:
"""Return the name of the player or a default value based on its type and hash."""
try:
return self._name
except AttributeError:
return object_id(self)
@name.setter
def name(self, value: str):
"""Set the name of the player.
Args:
value (str): the name
"""
self._name = value
class InformedPlayer(Player):
"""
A player that receives the configuration of the world at the beginning of the episode.
"""
def start_episode(self, word: GridWorld):
"""Method called at the beginning of the episode. Receives the current world status."""
pass
class UninformedPlayer(Player):
"""
A player that doesn't know what is the configuration of the environment.
"""
def start_episode(self):
"""Method called at the beginning of the episode. Receives the current world status."""
pass
def run_episode(world: GridWorld, agent: Agent, player: Player, horizon: int = 0, show=True, outf: io.TextIOWrapper = sys.stdout):
"""Run an episode on the world using the player to control the agent. The horizon specifies the maximum number of steps, 0 or None means no limit. If show is true then the world is printed at each iteration before the player's turn.
Raise the exception GridWorldException if the agent is not in the world.
Args:
world (GridWorld): the world in which the episode is run
agent (Agent): the agent controlled by the player
player (Player): the player
horizon (int, optional): stop after this number of steps, 0 for no limit. Defaults to 0.
show (bool, optional): whether to show the environment before a step. Defaults to True.
outf (io.TextIOWrapper, optional): writes output to the given stream. Defaults to sys.stdout.
Raises:
GridWorldException: if the agent is not in the world
"""
if agent not in world.objects:
raise GridWorldException('Missing agent {}, cannot run the episode'.format(agent))
# inform the player of the start of the episode
if isinstance(player, InformedPlayer):
player.start_episode(copy.deepcopy(world))
else:
player.start_episode()
step = 0
while not horizon or step < horizon:
if agent.success():
print('The agent {} succeeded!'.format(agent.name), file=outf)
break
if not agent.isAlive:
print('The agent {} died!'.format(agent.name), file=outf)
break
if show or step < 1:
print(world, file=outf)
action = player.play(step, agent.percept(), agent.actions())
if action is None:
print('Episode terminated by the player {}.'.format(player.name), file=outf)
break
reward = agent.do(action)
print('Step {}: agent {} executing {} -> reward {}'.format(step, agent.name, action.name, reward), file=outf)
player.feedback(action, reward, agent.percept())
step += 1
else:
print('Episode terminated by maximum number of steps ({}).'.format(horizon), file=outf)
player.end_episode()
print(world, file=outf)
print('Episode terminated with a reward of {} for agent {}'.format(agent.reward, agent.name), file=outf)
############################################################################
#
# Examples of the use of the API
#######################################
# Trivial players
#
class RandomPlayer(Player):
"""This player selects randomly one of the available actions."""
def play(self, turn: int, percept: Percept, actions: Iterable[Agent.Actions]) -> Agent.Actions:
actions_lst = list(actions)
return actions_lst[random.randint(0, len(actions_lst) - 1)]
class UserPlayer(Player):
"""This player asks the user for the next move; it also accepts unambiguous command initials and ignores case."""
def play(self, turn: int, percept: Percept, actions: Iterable[Agent.Actions]) -> Agent.Actions:
actions_dict = {a.name: a for a in actions}
print('{} percept:'.format(self.name))
print(textwrap.indent(str(percept), ' '))
while True:
answer = input('{}: select an action {} and press enter, or empty to stop: '.format(self.name, list(actions_dict.keys()))).strip()
if len(answer) < 1:
return None
elif answer in actions_dict:
return actions_dict[answer]
else:
options = [k for k in actions_dict.keys() if k.lower().startswith(answer.lower())]
if len(options) == 1:
return actions_dict[options[0]]
else:
print('Cannot understand <{}>'.format(answer))
#######################################
# Simple agent moving in four directions that eats on the way.
......@@ -579,18 +410,22 @@ class UserPlayer(Player):
class Food(WorldObject):
"""Food in the EaterWorld, it can be consumed by the Eater agent."""
def charSymbol(self):
return '🍌'
class Eater(Agent):
"""An agent that moves in the EaterWorld. It can move in 4 directions (Eater.Actions) and consumes Food objects that are in the cells where it moves. It sees its position and smells whether there's still food in the world (Eater.Percept). Its goal is to consume all the food in the environment."""
class Actions(Actions):
"""Eater actions for each direction in which the agent can move (N, S, E, W)"""
N = (0, 1)
S = (0, -1)
E = (1, 0)
W = (-1, 0)
class Percept(Percept, NamedTuple):
@dataclass(frozen=True)
class Percept(Percept):
"""Eater agent perception: the current position and whether there's more food."""
position: Coordinate
more_food: bool
......@@ -599,10 +434,16 @@ class Eater(Agent):
self._foodcount = 0
self._reward = 0
self.FOOD_BONUS = 10
self._alive = True
def charSymbol(self):
return '🐒'
@property
def isAlive(self):
"""Return true if the agent can still execute actions."""
return self._alive
@property
def reward(self) -> int:
"""The current accumulated reward"""
......@@ -627,6 +468,12 @@ class Eater(Agent):
self._reward += cost
return cost
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
self._alive = False
# no penalty for suicide
return 0
def percept(self) -> 'Eater.Percept':
return self.Percept(
position=self.location,
......@@ -640,8 +487,10 @@ class Eater(Agent):
class EaterWorld(GridWorld):
"""A GridWorld which contains Food and an Eater agent that can move within the world and eat the food when it moves into a cell that contains some food.
"""
@classmethod
def random(cls, **kwargs) -> 'EaterWorld':
def random(cls, map_desc: str=None, size: Coordinate=None, blocks: Iterable[Coordinate]=[], food_amount: float=.1, **kwargs) -> 'EaterWorld':
"""Create a new world from the map description and randomly place food until the given percentage of the free space is filled. If the food amount is greater than or equal to 1 then it's interpreted as the number of food objects to include.
Args:
......@@ -657,9 +506,8 @@ class EaterWorld(GridWorld):
Returns:
EaterWorld: a new random world
"""
food_amount: float = kwargs.get('food_amount', .1)
world = super().random(**kwargs)
world = super().random(map_desc=map_desc, size=size, blocks=blocks, **kwargs)
free_cells = list(world.empty_cells())
random.shuffle(free_cells)
......@@ -746,27 +594,3 @@ MAP_STR = """
# # # #
################
"""
def simpleEaterTest(player_class=RandomPlayer, horizon=20):
world = EaterWorld.random(MAP_STR, food_amount=0.1)
player = player_class()
world.run_episode(player, horizon=horizon)
############################################################################
#
# Testing the API
if __name__ == "__main__":
simpleEaterTest()
simpleEaterTest(player_class=UserPlayer)
world = EaterWorld.random(MAP_STR)
world_copy = EaterWorld.from_dict(world.to_dict())
print(world)
print(world_copy)
assert str(world) == str(world_copy)
print(world_copy.to_JSONs())
print(EaterWorld.from_JSON('{"size": [5, 5], "block": [[1, 3]], "food": [[3, 0], [0, 0]], "eater": [1, 0]}'))
import random
import textwrap
from typing import Iterable, Union
from .gridworld import Agent, Percept, object_id, GridWorld
class OnlinePlayer:
"""A player for a given agent. It implements the play method which should
return one of the actions for the agent or None to give up.
"""
def __init__(self, name: str = None):
"""
Initialise the name of the player if provided.
"""
self._name = str(name) if name is not None else object_id(self)
@property
def name(self) -> str:
"""The name of the player or a default value based on its type and hash."""
return self._name
def start_episode(self):
"""Method called at the beginning of the episode."""
pass
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed, with the outcome of the game and whether the agent was still alive and successful.
"""
pass
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: Union[int, None]) -> Agent.Actions:
"""Given a percept, which might differ according to the specific problem, and the list of valid actions, returns an action to play at the given turn or None to stop the episode. The reward is the one obtained from the previous action; on the first turn its value is None."""
raise NotImplementedError
class OfflinePlayer:
"""
A player that receives the configuration of the world at the beginning of the episode and returns the sequence of actions to play.
"""
def __init__(self, name: str = None):
"""
Initialise the name of the player if provided.
"""
self._name = str(name) if name is not None else object_id(self)
@property
def name(self) -> str:
"""The name of the player or a default value based on its type and hash."""
return self._name
def start_episode(self, world: GridWorld) -> Iterable[Agent.Actions]:
"""Method called at the beginning of the episode. Receives the current world status."""
raise NotImplementedError
def end_episode(self, outcome: int, alive: bool, success: bool):
"""Method called when an episode is completed, with the outcome of the game and whether the agent was still alive and successful.
"""
pass
############################################################################
#
# Examples of the use of the API
#######################################
# Trivial players
#
class RandomPlayer(OnlinePlayer):
"""This player selects randomly one of the available actions."""
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: int) -> Agent.Actions:
actions_lst = list(actions)
# use the materialised list: a generic Iterable may not support len()
return random.choice(actions_lst)
class UserPlayer(OnlinePlayer):
"""This player asks the user for the next move. Besides full action names, it accepts any unambiguous prefix of a name, ignoring case."""
def play(self, percept: Percept, actions: Iterable[Agent.Actions], reward: int) -> Agent.Actions:
actions_dict = {a.name: a for a in actions}
print('{} percept:'.format(self.name))
print(textwrap.indent(str(percept), ' '))
while True:
answer = input('{}: select an action {} and press enter, or empty to stop: '.format(self.name, list(actions_dict.keys()))).strip()
if len(answer) < 1:
return None
elif answer in actions_dict:
return actions_dict[answer]
else:
options = [k for k in actions_dict.keys() if k.lower().startswith(answer.lower())]
if len(options) == 1:
return actions_dict[options[0]]
else:
print('Cannot understand <{}>'.format(answer))
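The prefix matching used by `UserPlayer.play` can be tried in isolation. The following is a minimal sketch, not part of the package: the `Action` enum and the `resolve_action` helper are hypothetical stand-ins for `Agent.Actions` and for the body of the input loop. Unlike `UserPlayer`, it leaves the "empty answer stops the episode" case to the caller and simply returns `None` whenever there is no unique match.

```python
from enum import Enum
from typing import Iterable, Optional


class Action(Enum):
    """Hypothetical stand-in for Agent.Actions (names chosen for the demo)."""
    MOVE = 0
    LEFT = 1
    RIGHT = 2
    SHOOT = 3
    SCAN = 4


def resolve_action(answer: str, actions: Iterable[Action]) -> Optional[Action]:
    """Return the action named by answer, or None if there is no unique match."""
    actions_dict = {a.name: a for a in actions}
    answer = answer.strip()
    # an exact (case-sensitive) name always wins
    if answer in actions_dict:
        return actions_dict[answer]
    # otherwise accept a case-insensitive prefix, but only if it is unambiguous
    options = [k for k in actions_dict if k.lower().startswith(answer.lower())]
    return actions_dict[options[0]] if len(options) == 1 else None


if __name__ == "__main__":
    for text in ("m", "sh", "s", "jump"):
        print(repr(text), "->", resolve_action(text, Action))
```

In this sketch `"s"` is rejected because both `SHOOT` and `SCAN` start with it, which is the situation in which `UserPlayer` prints its "Cannot understand" message and asks again.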
#!/usr/bin/env python
"""
Functions to test the players
Functions to run the players
"""
import argparse
import copy
import importlib
import inspect
import io
import json
import os
import re
import sys
from typing import ClassVar, Iterator
from typing import Any, Dict, Iterator, Type, Union
from . import __version__
from .gridworld import Agent, GridWorld, GridWorldException, EaterWorld
from .player import OnlinePlayer, OfflinePlayer
from .wumpus import WumpusWorld
from .gridworld import Player, EaterWorld, GridWorld
def get_subclasses(cls):
......@@ -41,7 +44,7 @@ def check_entrypoint(arg_value: str, pattern: re.Pattern = re.compile(r"^[\w.-]+
return arg_value
def get_player_class(object_ref: str, path: os.PathLike = None) -> ClassVar[Player]:
def get_player_class(object_ref: str, path: os.PathLike = None) -> Union[Type[OnlinePlayer], Type[OfflinePlayer]]:
if path is not None and path not in sys.path:
if not os.path.isdir(path):
raise FileNotFoundError('Directory <{}> not found'.format(path))
......@@ -49,28 +52,31 @@ def get_player_class(object_ref: str, path: os.PathLike = None) -> ClassVar[Play
# see <https://packaging.python.org/specifications/entry-points/#data-model>
modname, qualname_separator, qualname = object_ref.partition(':')
obj = importlib.import_module(modname)
try:
obj = importlib.import_module(modname)
except ModuleNotFoundError as e:
raise ImportError(f'Cannot find entrypoint {object_ref}: {e}')
if qualname_separator:
for attr in qualname.split('.'):
try:
obj = getattr(obj, attr)
except AttributeError as e:
raise ImportError('Cannot import {} object: {}'.format(object_ref, e))
raise ImportError(f'Cannot find entrypoint {object_ref}: {e}')
player_class = obj
if not issubclass(player_class, Player):
raise NotImplementedError('class {} is not a subclass of Player'.format(player_class))
if not inspect.isclass(player_class) or not issubclass(player_class, (OnlinePlayer, OfflinePlayer)):
raise RuntimeError(f'{player_class} is not a subclass of OnlinePlayer or OfflinePlayer')
return player_class
def get_world_class(name: str) -> ClassVar[GridWorld]:
def get_world_class(name: str) -> Type[GridWorld]:
"""Return the class of the world corresponding to the given name.
Args:
name (str): name of the GridWorld subclass
Returns:
ClassVar[GridWorld]: GridWorld subclass
Type[GridWorld]: GridWorld subclass
"""
world_class = globals().get(name, None)
if world_class is None:
......@@ -81,70 +87,122 @@ def get_world_class(name: str) -> ClassVar[GridWorld]:
return world_class
def worlds(files, world_class: ClassVar[GridWorld]) -> Iterator[GridWorld]:
def worlds(files, world_class: Type[GridWorld]) -> Iterator[GridWorld]:
for fd in files:
try:
world = world_class.from_JSON(fd)
yield world
world_defs = json.load(fd)
if isinstance(world_defs, dict):
yield world_class.from_dict(world_defs)
elif isinstance(world_defs, list):
for wd in world_defs:
try:
yield world_class.from_dict(wd)
except Exception as e:
print('Skipping world {}: {}'.format(wd, e))
continue
except Exception as e:
print('Skipping {}: {}'.format(fd, e))
continue
def play_episode(world: GridWorld, player_class: ClassVar[Player], player_name: str = None, horizon: int = 100, show: bool = True, outf: io.TextIOWrapper = sys.stdout):
world.run_episode(player_class(name=player_name), horizon=horizon, show=show, outf=outf)
def run_episode(world: GridWorld, player: Union[OnlinePlayer, OfflinePlayer], agent: Agent = None, horizon: int = 0, show=True, outf: io.TextIOBase = None) -> Dict[str, Any]:
"""Run an episode on the world using the player to control the agent. The horizon specifies the maximum number of steps, 0 or None means no limit. If show is true then the world is printed at each iteration before the player's turn.
Raise the exception GridWorldException if the agent is not in the world.
def gridrunner(*args):
"""
Run episodes on worlds using the specified player.
"""
Args:
world (GridWorld): the world in which the episode is run
player (OnlinePlayer or OfflinePlayer): the player controlling the agent
agent (Agent, optional): the agent controlled by the player. Defaults to first agent in the world
horizon (int, optional): stop after this number of steps, 0 for no limit. Defaults to 0.
show (bool, optional): whether to show the environment before a step. Defaults to True.
outf (TextIOBase, optional): writes output to the given stream. Defaults to stdout.
world_classes = sorted(get_subclasses(GridWorld), key=lambda c: c.__name__)
parser = argparse.ArgumentParser(description=gridrunner.__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('infiles', type=argparse.FileType('r'), nargs='*', help='world description JSON files, they must be compatible with the world type (see --world option).')
parser.add_argument('--name', '-n', type=str, help='name of the player, default to the name of the player class')
parser.add_argument('--path', '-p', type=str, default='.', help="path of the player library, it's prepended to the sys.path variable")
parser.add_argument('--entry', '-e', type=check_entrypoint, required=True, help="object reference for a Player subclass in the form 'importable.module:object.attr'. See <https://packaging.python.org/specifications/entry-points/#data-model> for details.")
parser.add_argument('--world', '-w', type=str, default=world_classes[0].__name__, choices=[c.__name__ for c in world_classes], help='class name of the world')
parser.add_argument('--horizon', '-z', type=int, default=20, help='maximum number of steps')
parser.add_argument('--noshow', action='store_false', help="prevent printing the world at each step")
parser.add_argument('--out', '-o', type=argparse.FileType('w'), default=sys.stdout, help="write output to file")
parser.add_argument('--version', action='version', version='%(prog)s ' + __version__)
args_dict = vars(parser.parse_args(args))
name = args_dict['name']
path = os.path.abspath(args_dict['path']) if args_dict['path'] != '.' else os.getcwd()
obj_ref = args_dict['entry']
world_type = args_dict['world']
horizon = args_dict['horizon']
show = args_dict['noshow']
outf = args_dict['out']
player_class = get_player_class(obj_ref, path=path)
world_class = get_world_class(world_type)
if name is None:
name = player_class.__name__
if len(args_dict['infiles']) > 0:
for world in worlds(args_dict['infiles'], world_class):
play_episode(world, player_class, player_name=name, horizon=horizon, show=show, outf=outf)
else:
world = world_class.random()
# show world definition
print('-' * 10 + ' Playing on world: ' + '-' * 10, file=outf)
world.to_JSON(outf)
print('\n' + '-' * 40, file=outf)
play_episode(world, player_class, player_name=name, horizon=horizon, show=show, outf=outf)
Returns:
dictionary (JSON encodable) with the log of the game
return 0
Raises:
GridWorldException: if there are problems with the world (e.g. there's no agent)
"""
if outf is None:
outf = sys.stdout
if agent is None:
try:
agent = next(o for o in world.objects if isinstance(o, Agent))
except StopIteration:
raise GridWorldException(f'No agent in this {world}')
elif agent not in world.objects:
raise GridWorldException('Missing agent {}, cannot run the episode'.format(agent))
game_log = {
'world': world.to_dict(),
'agent': agent.name,
'player': player.name,
'actions': [],
'exceptions': [],
'maxsteps': False
}
# inform the player of the start of the episode
if isinstance(player, OfflinePlayer):
try:
plan = iter(player.start_episode(copy.deepcopy(world)))
except Exception as e:
plan = iter([])
print(f'Exception in Player.start_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.start_episode: {e}')
else:
try:
player.start_episode()
except Exception as e:
print(f'Exception in Player.start_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.start_episode: {e}')
step = 0
reward = None
while not horizon or step < horizon:
if agent.success():
print('The agent {} succeeded!'.format(agent.name), file=outf)
break
if not agent.isAlive:
print('The agent {} died!'.format(agent.name), file=outf)
break
if show:
print(world, file=outf)
if isinstance(player, OfflinePlayer):
action = next(plan, None)
else:
try:
action = player.play(agent.percept(), agent.actions(), reward)
except Exception as e:
action = None
print(f'Exception in Player.play: {e}', file=outf)
game_log['exceptions'].append(f'Player.play: {e}')
if action is None:
game_log['actions'].append(action)
agent.suicide()
print('Episode terminated by the player {}.'.format(player.name), file=outf)
break
game_log['actions'].append(action.name)
reward = agent.do(action)
print('Step {}: agent {} executing {} -> reward {}'.format(step, agent.name, action.name, reward), file=outf)
step += 1
else:
print('Episode terminated by maximum number of steps ({}).'.format(horizon), file=outf)
game_log['maxsteps'] = True
try:
player.end_episode(agent.reward, agent.isAlive, agent.success())
except Exception as e:
print(f'Exception in Player.end_episode: {e}', file=outf)
game_log['exceptions'].append(f'Player.end_episode: {e}')
def main():
sys.exit(gridrunner(*sys.argv[1:]))
game_log['reward'] = agent.reward
game_log['alive'] = agent.isAlive
game_log['success'] = agent.success()
print(world, file=outf)
print('Episode terminated with a reward of {} for agent {}'.format(agent.reward, agent.name), file=outf)
if __name__ == "__main__":
main()
return game_log
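The main loop of `run_episode` relies on Python's `while ... else`: the `else` branch runs only when the loop condition becomes false, never after a `break`, which is how the `maxsteps` flag is set only when the horizon is exhausted. The sketch below reproduces just that control flow under simplifying assumptions; `run_steps` and its `policy` callable are hypothetical names, and there is no world, agent or reward.

```python
from typing import Callable, List, Optional, Tuple


def run_steps(policy: Callable[[int], Optional[str]],
              horizon: int = 0) -> Tuple[List[Optional[str]], bool]:
    """Call policy(step) until it returns None or the horizon is reached.

    Returns the recorded actions and whether the step limit ended the run.
    A horizon of 0 means no limit, as in run_episode.
    """
    actions: List[Optional[str]] = []
    maxsteps = False
    step = 0
    while not horizon or step < horizon:
        action = policy(step)
        if action is None:
            # the player gave up: record it and leave the loop early
            actions.append(None)
            break
        actions.append(action)
        step += 1
    else:
        # reached only when the loop condition failed, i.e. no break occurred
        maxsteps = True
    return actions, maxsteps


if __name__ == "__main__":
    print(run_steps(lambda s: "MOVE", horizon=3))
    print(run_steps(lambda s: "MOVE" if s < 2 else None, horizon=10))
```

With `horizon=0` the condition never becomes false, so the loop can only end through the `break`; a policy that never returns `None` would then run forever.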
from dataclasses import dataclass
from enum import Enum
import json
import random
import sys
from typing import Any, Iterable, NamedTuple, Dict, Sequence, Tuple
from typing import Any, Iterable, Dict, Sequence, Tuple, Union
from .gridworld import Actions, Agent, WorldObject, GridWorld, Coordinate, coord, GridWorldException, Percept, UserPlayer
from .gridworld import Actions, Agent, WorldObject, GridWorld, Coordinate, coord, GridWorldException, Percept
class WumpusWorldObject(WorldObject):
......@@ -54,7 +52,8 @@ class Hunter(Agent):
E = (1, 0)
W = (-1, 0)
class Percept(Percept, NamedTuple):
@dataclass(frozen=True)
class Percept(Percept):
stench: bool
breeze: bool
bump: bool
......@@ -228,14 +227,20 @@ class Hunter(Agent):
self._reward += reward
return reward
def suicide(self) -> int:
"""Kill the agent, returns the outcome of the action."""
self._alive = False
reward = -1000
self._reward += reward
return reward
class WumpusWorld(GridWorld):
@classmethod
def classic(cls, size: int = 4, seed=None, pitProb: float = .2):
def classic(cls, size: Union[int, Coordinate] = 4, seed=None, pitProb: float = .2, **kwargs):
"""Create a classic wumpus world problem of the given size. The agent is placed in (0,0) facing north and there's exactly one wumpus and a gold ingot. The seed is used to initialise the random number generation and pits are placed with pitProb probability."""
world = cls(coord(size, size), [])
world = cls(size if isinstance(size, Coordinate) else coord(size, size), [])
agentPos = coord(0, 0)
......@@ -247,12 +252,8 @@ class WumpusWorld(GridWorld):
return world
@classmethod
def random(cls, **kwargs):
size = kwargs.get('size', 4)
seed = kwargs.get('seed', None)
pitProb = kwargs.get('pitProb', .2)
return cls.classic(size=size, seed=seed, pitProb=pitProb)
def random(cls, size: Union[int, Coordinate] = 4, seed=None, pitProb: float = .2, **kwargs):
return cls.classic(size=size, seed=seed, pitProb=pitProb, **kwargs)
@classmethod
def from_dict(cls, desc: Dict[str, Any]):
......@@ -271,7 +272,7 @@ class WumpusWorld(GridWorld):
else:
return [getCoord(data)]
size = coordLst('size')[0]
size = getCoord(desc.get('size', [8, 8]))
blocks = coordLst('blocks')
hunters = desc.get('hunters', [])
pits = coordLst('pits')
......@@ -449,23 +450,3 @@ class WumpusWorld(GridWorld):
lines.append(bottom_line())
return '\n'.join(lines)
if __name__ == "__main__":
WORLD_MAP = '\n'.join([
'....#',
'.....',
'.....',
'.....',
'.....',
])
# world = WumpusWorld.randomWorld(size=7, blockProb=0.1, world_desc=WORLD_MAP)
world = WumpusWorld.classic(size=7)
world.run_episode(UserPlayer())
world.to_JSON(sys.stdout)
print()
JSON_STRING = '{"size": [7, 7], "hunters": [[0, 0, "E"]], "pits": [[4, 0], [3, 1], [2, 2], [6, 2], [4, 4], [3, 5], [4, 6], [5, 6]], "wumpuses": [[1, 2]], "exits": [[0, 0]], "golds": [[6, 3]]}'
world = WumpusWorld.from_JSON(json.loads(JSON_STRING))
print(world)
print(world.to_JSONs())
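The change of `Hunter.Percept` from a `NamedTuple` to a frozen dataclass keeps percepts read-only and comparable by value. A minimal sketch of that behaviour follows; `DemoPercept` is a hypothetical class carrying only the three fields visible in the hunk above, not the package's full percept.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class DemoPercept:
    """Hypothetical, reduced percept: only the fields shown in the diff."""
    stench: bool
    breeze: bool
    bump: bool


percept = DemoPercept(stench=False, breeze=True, bump=False)

# frozen dataclasses reject attribute assignment
try:
    percept.breeze = False
    modified = True
except FrozenInstanceError:
    modified = False

# equality is by field values, and frozen instances are hashable
same = percept == DemoPercept(False, True, False)
seen = {percept}

if __name__ == "__main__":
    print(percept, modified, same, len(seen))
```

One practical difference from a `NamedTuple` is that a dataclass instance is not a tuple, so code that unpacked or indexed a percept positionally would need to switch to attribute access.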