Skip to content

Update vit.py to include LayerNorm in the MLP head which is missing#317

Open
vivekh2000 wants to merge 1 commit into
lucidrains:mainfrom
vivekh2000:patch-1
Open

Update vit.py to include LayerNorm in the MLP head which is missing#317
vivekh2000 wants to merge 1 commit into
lucidrains:mainfrom
vivekh2000:patch-1

Conversation

@vivekh2000

Copy link
Copy Markdown

I have included the LayerNorm layer in the classification head to identify it with the original implementation. It will be easy to compare now.
Also, an isolated LayerNorm was included, which I have removed and inserted in the MLP head, again to make it more intuitive. For reference, you have included LayerNorm in the self.ditill_mlp attribute of the class DistillWrapper in distill.py. I hope these changes are acceptable.

Thank you for the excellent code base.

I have included the LayerNorm layer in the classification head, which needed to be added to the implementation. Also, an unnecessary layer norm was included, which I have removed.
For reference, you have included LayerNorm in the ditill_mlp attribute of the class DistillWrapper in distill.py. Thank you.
@lucidrains

Copy link
Copy Markdown
Owner

@vivekh2000 hey Vivek, thanks for this PR

so i think the layernorm should be kept at its current position, as most transformers these days end with a final norm

will definitely remove the layernorm from the distill wrapper though (or make it an option, in the case that some of the variants do not have the final norm)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants