games: add pursuit-evasion game (2-player zero-sum, continuous 2D space) #1553

Open
aeyjeyaryan wants to merge 1 commit into google-deepmind:master from aeyjeyaryan:add-pursuit-evasion-game

@aeyjeyaryan

This PR adds a new Python game, python_pursuit_evasion, to address issue #843 (Call for New Games).

Game overview

A 2-player zero-sum pursuit-evasion game in bounded 2D continuous space:

  • Player 0 (Pursuer): chooses one of 9 discrete actions each turn (8 compass directions + stay). Every non-stay move has unit length.
  • Player 1 (Evader): moves automatically according to a parameterized strategy. Its only legal action is a single no-op.
  • Capture condition: Euclidean distance between pursuer and evader < capture_radius (default 1.0).
  • Reward: +1 for the pursuer if capture occurs within max_steps rounds, −1 otherwise. The evader receives the negated value (zero-sum).
  • Game parameters: grid_size (default 10.0), max_steps (default 50), capture_radius (default 1.0), evader_strategy (0–3).
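The movement, capture, and reward rules above can be sketched in plain Python. This is only an illustration, not the PR's code: the action ordering (0 = stay, then counter-clockwise from East), the helper names, and the clamping of positions to the arena bounds are all assumptions.

```python
import math

# Assumed action ordering (not confirmed by the PR): 0 = stay,
# 1..8 = E, NE, N, NW, W, SW, S, SE.
COMPASS_ANGLES_DEG = (0, 45, 90, 135, 180, 225, 270, 315)


def action_to_delta(action):
    """Return the (dx, dy) displacement for a pursuer action in 0..8."""
    if not 0 <= action <= 8:
        raise ValueError(f"action must be in 0..8, got {action}")
    if action == 0:
        return (0.0, 0.0)  # stay
    theta = math.radians(COMPASS_ANGLES_DEG[action - 1])
    # Unit-length step, so diagonal moves are not longer than axis moves.
    return (math.cos(theta), math.sin(theta))


def apply_move(pos, delta, grid_size=10.0):
    """Apply a displacement and clamp to [0, grid_size] (clamping is assumed)."""
    x = min(max(pos[0] + delta[0], 0.0), grid_size)
    y = min(max(pos[1] + delta[1], 0.0), grid_size)
    return (x, y)


def is_captured(pursuer, evader, capture_radius=1.0):
    """Capture requires a distance strictly below capture_radius."""
    distance = math.hypot(pursuer[0] - evader[0], pursuer[1] - evader[1])
    return distance < capture_radius


def terminal_returns(captured):
    """Zero-sum returns as [pursuer, evader]."""
    return [1.0, -1.0] if captured else [-1.0, 1.0]
```

Note that with a strict inequality, a pursuer exactly capture_radius away from the evader has not captured it.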

Evader strategies

| ID | Name | Behavior |
| --- | --- | --- |
| 0 | Random | Moves in a uniformly random direction each turn |
| 1 | Constant-velocity | Moves East every turn |
| 2 | Zigzag | Alternates between NE and SE |
| 3 | Adaptive | Moves directly away from the pursuer (unit vector of evader − pursuer) |
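As a rough illustration of the four strategies, the function below returns a unit direction for the evader. It is a sketch under several assumptions that the PR description does not confirm: the name `evader_direction` is hypothetical, the random strategy samples a continuous angle, the zigzag starts with NE on even steps, +y is treated as North, and the adaptive evader stays put when both players occupy the same point.

```python
import math
import random


def evader_direction(strategy, step, pursuer, evader, rng=random):
    """Return a (dx, dy) direction for the evader (hypothetical helper)."""
    if strategy == 0:
        # Random: assumed to be a uniformly sampled continuous angle.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        return (math.cos(theta), math.sin(theta))
    if strategy == 1:
        # Constant-velocity: always East.
        return (1.0, 0.0)
    if strategy == 2:
        # Zigzag: NE on even steps, SE on odd steps (starting phase assumed).
        s = math.sqrt(0.5)
        return (s, s) if step % 2 == 0 else (s, -s)
    if strategy == 3:
        # Adaptive: unit vector pointing from the pursuer to the evader.
        dx = evader[0] - pursuer[0]
        dy = evader[1] - pursuer[1]
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            return (0.0, 0.0)  # degenerate case, handling assumed
        return (dx / norm, dy / norm)
    raise ValueError(f"evader_strategy must be in 0..3, got {strategy}")
```

Passing a seeded `random.Random` instance as `rng` keeps the random strategy reproducible, which is one way a deterministic-replay test could be supported.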

Implementation details

  • Follows the standard OpenSpiel Python game pattern (pyspiel.Game, pyspiel.State, observer class, registration).
  • Observation/information-state tensor: 5-element normalized vector [pursuer_x/grid_size, pursuer_y/grid_size, evader_x/grid_size, evader_y/grid_size, step/max_steps].
  • Sequential turn structure: pursuer moves → evader moves → capture check → repeat.
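The 5-element observation vector described above could be computed along these lines. This is a minimal sketch rather than the PR's observer class: the function name and argument layout are hypothetical, and only the normalization itself comes from the description.

```python
def observation_tensor(pursuer, evader, step, grid_size=10.0, max_steps=50):
    """Return [px, py, ex, ey, t], each normalized by grid_size or max_steps."""
    return [
        pursuer[0] / grid_size,
        pursuer[1] / grid_size,
        evader[0] / grid_size,
        evader[1] / grid_size,
        step / max_steps,
    ]
```

Every entry falls in [0, 1] as long as positions stay within [0, grid_size] and step does not exceed max_steps.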

Tests

12 tests covering API conformance (pyspiel.random_sim_test), all 4 evader strategies, both terminal conditions, legal action counts, zero-sum property, observation tensor shape/range, and deterministic replay.

Additional context

This game accompanies a research paper comparing NEAT and PPO under non-stationary opponent strategies (IEEE Access, under submission). The adaptive evader strategy shifts the evasion policy during an episode, creating a non-stationary environment where population-based methods (NEAT) outperform gradient-based methods (PPO). The game is designed for use with OpenSpiel's Python algorithms and can serve as a benchmark for multi-agent RL under strategy distribution shifts.

@google-cla

google-cla Bot commented Jun 9, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
