Hello authors,
Thank you for releasing the WebAgent / WebSailor project and the SailorFog-QA dataset.
I have been reading the paper and exploring the repository, but I am a bit confused about the data generation pipeline.
I can find the SailorFog-QA dataset (e.g., sailorfog-QA.jsonl); however, I cannot locate:
- the runnable pipeline / scripts to generate SailorFog-QA from raw documents or web data, or
- an entry point or instructions to reproduce the SailorFog-QA dataset generation process.
Could you please clarify:
- Is the SailorFog-QA dataset generated by an internal pipeline that is not fully released?
- If the pipeline is available, could you point to the exact directory / scripts / command to run it?
- If it is not released, are there any plans to open-source the data generation pipeline, or provide a simplified version?
This would be extremely helpful for researchers who want to reproduce SailorFog-QA or extend the approach to build their own datasets.
Thank you very much for your time and for the great work!