Thank you for sharing this project.
The current call_model implementation is tightly coupled to the Qwen-style API (sending prompts and reading from completions). I would like to run the same evaluation pipeline using an OpenAI-compatible LLM API (e.g. OpenAI ChatCompletion / DeepSeek via DashScope compatible mode). how should I adapt the code to use an OpenAI-format LLM endpoint for evaluation?
Thanks!
Thank you for sharing this project.
The current call_model implementation is tightly coupled to the Qwen-style API (sending prompts and reading from completions). I would like to run the same evaluation pipeline using an OpenAI-compatible LLM API (e.g. OpenAI ChatCompletion / DeepSeek via DashScope compatible mode). how should I adapt the code to use an OpenAI-format LLM endpoint for evaluation?
Thanks!