
Performance

ddayto edited this page Jan 27, 2025 · 2 revisions

Objective: Find the dataset that best fits the task.

Task: Fine-tune a transformer-based model for entity extraction and query restructuring on book and literary queries.

Book Datasets

  • Open Library Dataset: Contains bibliographic data with book titles, authors, genres, and descriptions.
  • Goodreads Data: Provides book reviews, tags (genres), and user-generated lists.
  • Google Books API: Provides book metadata and summaries.
  • COCO Dataset (Microsoft)
  • Wonderbk Dataset (https://www.kaggle.com/datasets/elvinrustam/books-dataset)
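
As a rough illustration of how a bibliographic record from a dataset like those above might feed the entity-extraction task, the sketch below labels a free-text book query with BIO tags by matching the record's title and author against the query tokens. The field names ("title", "author"), the tag names (TITLE, AUTHOR), and the helper functions are all assumptions made for this example; none of the listed datasets is confirmed to use this schema, and a real pipeline would likely need fuzzier matching.

```python
import re


def tokenize(text):
    # Split into word tokens and standalone punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def find_span(tokens, phrase_tokens):
    # Return the start index of the first case-insensitive match, or -1.
    n = len(phrase_tokens)
    if n == 0:
        return -1
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in phrase_tokens]
    for i in range(len(tokens) - n + 1):
        if lowered[i:i + n] == target:
            return i
    return -1


def bio_tag(query, record):
    # Hypothetical helper: label query tokens with B-/I-/O tags using
    # exact matches against the record's (assumed) title and author fields.
    tokens = tokenize(query)
    tags = ["O"] * len(tokens)
    for label, field in (("TITLE", "title"), ("AUTHOR", "author")):
        phrase = tokenize(record.get(field, ""))
        start = find_span(tokens, phrase)
        if start == -1:
            continue
        end = start + len(phrase)
        # Skip a match that overlaps tokens already tagged.
        if any(t != "O" for t in tags[start:end]):
            continue
        tags[start] = "B-" + label
        for j in range(start + 1, end):
            tags[j] = "I-" + label
    return list(zip(tokens, tags))


if __name__ == "__main__":
    record = {"title": "Dune", "author": "Frank Herbert"}
    for token, tag in bio_tag("books like Dune by Frank Herbert", record):
        print(token, tag)
```

Token-level pairs of this shape are one common input format for fine-tuning a transformer on token classification; whether exact matching is adequate depends on how closely real user queries mirror the catalogue text.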
