Vendor Agnostic Models #16

@chauhankaranraj

Description

Feedback no. 3

One salient feature of the Backblaze dataset is that the distribution of vendors in the data is neither uniform nor exhaustive. For example, Seagate comprises ~70% of the data, HGST comprises ~15%, and Intel drive data is absent. Our initial assumption was also that SMART metrics may behave differently for different vendors. Therefore, in the current forecasting notebook, models are trained vendor-wise. However, the distribution of vendors across Ceph users is likely different, and we want to support all of those vendors.

As a data scientist, I want to explore how "transferable" forecasting models are across vendors. That is, how is performance affected when a model is trained on data from one vendor and evaluated on data from another?

Acceptance criteria:

  • EDA notebook comparing model performance on data from the vendor it's trained on and data from other vendors
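
A rough sketch of what the comparison in such a notebook could look like: train one model per vendor, then score each model on the held-out data of every vendor, giving a train-vendor by eval-vendor matrix. Everything here is an assumption for illustration only: the vendor names, the toy values, the single-feature threshold "model", and the helper names `fit_threshold`, `accuracy`, and `cross_vendor_scores` are hypothetical and are not taken from the existing forecasting notebook.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Split = Tuple[List[float], List[int]]      # (feature values, failure labels)
VendorData = Dict[str, Dict[str, Split]]   # vendor -> {"train": ..., "test": ...}


def fit_threshold(X: List[float], y: List[int]) -> float:
    """Toy stand-in for a real model: midpoint between the class means."""
    healthy = [x for x, label in zip(X, y) if label == 0]
    failing = [x for x, label in zip(X, y) if label == 1]
    return (mean(healthy) + mean(failing)) / 2


def accuracy(threshold: float, X: List[float], y: List[int]) -> float:
    """Fraction of drives whose predicted label matches the true label."""
    predictions = [1 if x > threshold else 0 for x in X]
    return sum(p == label for p, label in zip(predictions, y)) / len(y)


def cross_vendor_scores(
    data: VendorData,
    fit: Callable[[List[float], List[int]], float],
    score: Callable[[float, List[float], List[int]], float],
) -> Dict[str, Dict[str, float]]:
    """Return scores[train_vendor][eval_vendor], always scoring on test splits."""
    scores: Dict[str, Dict[str, float]] = {}
    for train_vendor, splits in data.items():
        model = fit(*splits["train"])
        scores[train_vendor] = {
            eval_vendor: score(model, *eval_splits["test"])
            for eval_vendor, eval_splits in data.items()
        }
    return scores


# Made-up toy data: one SMART-like value per drive, on different scales per vendor.
toy_data: VendorData = {
    "vendor_a": {
        "train": ([1, 2, 8, 9], [0, 0, 1, 1]),
        "test": ([0, 3, 7, 10], [0, 0, 1, 1]),
    },
    "vendor_b": {
        "train": ([10, 12, 30, 34], [0, 0, 1, 1]),
        "test": ([11, 13, 29, 31], [0, 0, 1, 1]),
    },
}

scores = cross_vendor_scores(toy_data, fit_threshold, accuracy)
for train_vendor, row in scores.items():
    for eval_vendor, value in row.items():
        print(f"trained on {train_vendor}, evaluated on {eval_vendor}: {value:.2f}")
```

On this toy data the diagonal entries (same vendor) come out at 1.0 and the off-diagonal entries at 0.5, which is the kind of gap the notebook would look for on the real data. Passing `fit` and `score` as callables keeps the loop independent of the model, so the same structure should work with whatever forecasting model and metric the notebook ends up using.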
