
The Impact of Relative Positional Embeddings #41

Description

@tldrafael

Hey, first off, thanks for sharing this work!

I noticed that you're actively using Relative Positional Embeddings (RPE) for all attention operations here and here.

Since RPE is intended to emulate the shift-invariance property of convolutional layers, I wonder what the impact of this choice was. Looking at your arXiv paper, I didn't see any explicit mention of this choice or an ablation study.
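To make the shift-invariance point concrete, here is a minimal 1D sketch of what I mean. It is not taken from this repo: `relative_bias` and `table` are hypothetical names, and the table is filled with random values rather than learned ones. The only assumption is that the bias added to the attention logits is looked up by the offset `i - j`.

```python
import random

def relative_bias(n, table):
    """Build an n x n attention bias where entry (i, j) depends only on i - j."""
    # table holds 2n - 1 entries, one per possible offset in [-(n - 1), n - 1].
    return [[table[i - j + n - 1] for j in range(n)] for i in range(n)]

random.seed(0)
n = 6
# Hypothetical stand-in for learned parameters (random here, for illustration only).
table = [random.random() for _ in range(2 * n - 1)]
bias = relative_bias(n, table)

# Shifting the query and key positions by the same amount leaves the bias unchanged,
# which is the property that mirrors a convolution's shift-invariance.
shift = 2
shift_invariant = all(
    bias[i][j] == bias[i + shift][j + shift]
    for i in range(n - shift)
    for j in range(n - shift)
)
print(shift_invariant)
```

An absolute positional embedding would not generally pass this check, since its contribution depends on `i` and `j` individually rather than on their difference.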

If you could share why RPE was important and whether it is actually needed, I'd appreciate it. Thanks!
