Issues using MaxUCB criterion #38

@JvThunder

I tried to use POMCPOWSolver(criterion=MaxUCB(1.0)) in my project and got an error.
To isolate the problem, I reproduced it in a very simple environment with simple transitions, as in the following code:

from julia.POMDPs import solve, simulate
from julia.POMDPTools import Deterministic, HistoryRecorder, RandomPolicy
from julia.POMCPOW import POMCPOWSolver, MaxUCB
from julia.CommonRLSpaces import Box
from quickpomdps import QuickPOMDP

def transition(state, action):
    return Deterministic([state[0] + 1])

def observation(state, action, next_state):
    return Deterministic(next_state)

def reward(state, action, next_state):
    return 1

def terminal(state):
    return (state[0] >= 2)

pomdp = QuickPOMDP(
    states = Box([0], [3]),
    actions = Box([0], [1]),
    observations = Box([0], [3]),
    discount = 0.9,
    isterminal = terminal,
    transition = transition,
    observation = observation,
    reward = reward,
    initialstate = Deterministic([1])
)

# This works:
# solver = POMCPOWSolver(max_time = 1, tree_queries = 15)
# This fails with "MethodError: no method matching insert!":
solver = POMCPOWSolver(criterion=MaxUCB(1.0))

policy = solve(solver, pomdp)
hr = HistoryRecorder(max_steps=2)
hist = simulate(hr, pomdp, policy)
rhist = simulate(hr, pomdp, RandomPolicy(pomdp))

# Print the state, action, and reward recorded at each step.
for it, step in enumerate(hist):
    print(f"____step:{it}____")
    print("State: ", step.s)
    print("Action: ", step.a)
    print("Reward: ", step.r)
    print("__________________")

Note that I am running this with python-jl. POMCPOWSolver(max_time = 1, tree_queries = 15) works fine, so I suspect the problem is in MaxUCB. The error I get is:

Traceback (most recent call last):
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 308, in main
    python(**vars(ns))
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 59, in python
    scope = runpy.run_path(script, run_name="__main__")
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "test.py", line 60, in <module>
    hist = simulate(hr, pomdp, policy)
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching insert!(::POMCPOW.CategoricalVector{Tuple{StaticArraysCore.SVector{1, Float64}, Float64}}, ::Tuple{Vector{Int64}, Float64}, ::Float64)

Closest candidates are:
  insert!(!Matched::DataStructures.SortedMultiDict{K, D, Ord}, ::Any, ::Any) where {K, D, Ord<:Base.Order.Ordering}
   @ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/sorted_multi_dict.jl:167
  insert!(::POMCPOW.CategoricalVector{T}, !Matched::T, ::Float64) where T
   @ POMCPOW ~/.julia/packages/POMCPOW/f6XAQ/src/categorical_vector.jl:12
  insert!(!Matched::DataStructures.BalancedTree23{K, D, Ord}, ::Any, ::Any, !Matched::Bool) where {K, D, Ord<:Base.Order.Ordering}
   @ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/balanced_tree.jl:358
  ...

Can you please tell me how to make this work with MaxUCB?
