I was trying to use POMCPOWSolver(criterion=MaxUCB(1.0)) for my project, but I got an error.
Then, I tried a very simple environment with simple transitions as in the following code:
from julia.POMDPs import solve, simulate
from julia.POMDPTools import Deterministic, HistoryRecorder, RandomPolicy
from julia.POMCPOW import POMCPOWSolver, MaxUCB
from julia.CommonRLSpaces import Box
from quickpomdps import QuickPOMDP
def transition(state, action):
    return Deterministic([state[0] + 1])

def observation(state, action, next_state):
    return Deterministic(next_state)

def reward(state, action, next_state):
    return 1

def terminal(state):
    return (state[0] >= 2)
pomdp = QuickPOMDP(
    states = Box([0], [3]),
    actions = Box([0], [1]),
    observations = Box([0], [3]),
    discount = 0.9,
    isterminal = terminal,
    transition = transition,
    observation = observation,
    reward = reward,
    initialstate = Deterministic([1])
)
# This works well:
# solver = POMCPOWSolver(max_time = 1, tree_queries = 15)
# TODO: this is not working; it raises "MethodError: no method matching insert!"
solver = POMCPOWSolver(criterion=MaxUCB(1.0))
policy = solve(solver, pomdp)
hr = HistoryRecorder(max_steps=2)
hist = simulate(hr, pomdp, policy)
rhist = simulate(hr, pomdp, RandomPolicy(pomdp))
it = 0
for step in hist:
    print(f"____step:{it}____")
    print("State: ", step.s)
    print("Action: ", step.a)
    print("Reward: ", step.r)
    print("__________________")
    it += 1
Note that I am running this with python-jl. POMCPOWSolver(max_time = 1, tree_queries = 15) works fine, so I suspect the issue is related to MaxUCB. The error I get is:
Traceback (most recent call last):
File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 308, in main
python(**vars(ns))
File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 59, in python
scope = runpy.run_path(script, run_name="__main__")
File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "test.py", line 60, in <module>
hist = simulate(hr, pomdp, policy)
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching insert!(::POMCPOW.CategoricalVector{Tuple{StaticArraysCore.SVector{1, Float64}, Float64}}, ::Tuple{Vector{Int64}, Float64}, ::Float64)
Closest candidates are:
insert!(!Matched::DataStructures.SortedMultiDict{K, D, Ord}, ::Any, ::Any) where {K, D, Ord<:Base.Order.Ordering}
@ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/sorted_multi_dict.jl:167
insert!(::POMCPOW.CategoricalVector{T}, !Matched::T, ::Float64) where T
@ POMCPOW ~/.julia/packages/POMCPOW/f6XAQ/src/categorical_vector.jl:12
insert!(!Matched::DataStructures.BalancedTree23{K, D, Ord}, ::Any, ::Any, !Matched::Bool) where {K, D, Ord<:Base.Order.Ordering}
@ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/balanced_tree.jl:358
...
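Reading the message, the `insert!` method that POMCPOW defines expects an element of the vector's own type `T`, while the call passed a tuple with a different first component. The following is a minimal sketch, in plain Python with no Julia required, that compares the two type strings copied from the error above. `split_top_level` is a hypothetical helper written only for this illustration; it is not part of any library.

```python
# Type strings copied verbatim from the MethodError above.
# expected_elem is T in CategoricalVector{T}; received_elem is the
# second argument that was actually passed to insert!.
expected_elem = "Tuple{StaticArraysCore.SVector{1, Float64}, Float64}"
received_elem = "Tuple{Vector{Int64}, Float64}"


def split_top_level(type_str):
    """Split the parameters of a Julia-style `Name{A, B}` string at top-level commas.

    Hypothetical helper for illustration only.
    """
    inner = type_str[type_str.index("{") + 1 : type_str.rindex("}")]
    parts, depth, current = [], 0, ""
    for ch in inner:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside any nested braces separates two parameters.
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return parts


expected_parts = split_top_level(expected_elem)
received_parts = split_top_level(received_elem)

# Keep only the positions where the two tuples disagree.
mismatches = [(e, r) for e, r in zip(expected_parts, received_parts) if e != r]
for e, r in mismatches:
    print(f"expected {e!r} but got {r!r}")
```

The only difference is the state component: the tree stores `SVector{1, Float64}` states, but a `Vector{Int64}` was passed. This suggests that a plain Python list of integers somewhere in the model is reaching Julia as `Vector{Int64}`; that is only my reading of the message, not a confirmed cause.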
Can you please tell me how to make this work with MaxUCB?