Test NVTabular, Petastorm, and Huggingface Datasets for parquet data loading #88

@zhujiem

Huggingface Datasets:

 dataset = load_dataset("parquet", data_files={split: data_blocks}, split=split)
 super().__init__(dataset=dataset, num_workers=8, batch_size=self.batch_size)
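
For anyone who wants to try the Huggingface Datasets path outside the class, here is a minimal standalone sketch. It assumes the `super().__init__` call above targets `torch.utils.data.DataLoader`, which the snippet does not confirm. The names `build_data_files` and `make_parquet_loader`, and the default `batch_size`, are hypothetical and only for illustration. `datasets` and `torch` are imported lazily inside the loader function, so both need to be installed before calling it.

```python
from typing import Dict, List, Sequence


def build_data_files(split: str, data_blocks: Sequence[str]) -> Dict[str, List[str]]:
    """Map a single split name to its parquet file paths (hypothetical helper).

    This is the {split: [paths]} shape passed as `data_files` in the snippet above.
    """
    if not data_blocks:
        raise ValueError("data_blocks must contain at least one parquet path")
    return {split: list(data_blocks)}


def make_parquet_loader(data_blocks, split="train", batch_size=1024, num_workers=8):
    """Load parquet blocks with Huggingface Datasets and wrap them in a DataLoader.

    Assumption: a plain torch DataLoader stands in for the parent class
    used in the original snippet.
    """
    # Lazy imports: these packages are only needed when a loader is built.
    from datasets import load_dataset
    from torch.utils.data import DataLoader

    dataset = load_dataset(
        "parquet",
        data_files=build_data_files(split, data_blocks),
        split=split,
    )
    # Ask the dataset to return torch tensors for numeric columns.
    dataset = dataset.with_format("torch")
    return DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
```

Usage would look like `loader = make_parquet_loader(["part_0.parquet", "part_1.parquet"])` followed by `for batch in loader: ...`, where the file names are placeholders for your own parquet blocks.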
