This repository was archived by the owner on Apr 11, 2021. It is now read-only.

Example script for AttentionLSTM #36

@Spider101

Description


I am having a bit of trouble understanding how to incorporate the AttentionLSTM layer into my code. In your blog post you say that "The attentional component can be tacked onto the LSTM code that already exists." But unlike a standard LSTM, this custom layer requires a second parameter, the attention vector. So I tried the following code to build my model:

seq_len, input_dims, output_dims = 200, 4096, 512
input_seq = Input(shape=(seq_len, input_dims,), dtype='float32')
attn = AttentionLSTM(output_dims, input_seq)(input_seq)
model = Model(input=input_seq, output=attn)

However, I get the following error: ValueError: Dimensions 4096 and 200 are not compatible.

My main trouble is understanding what the attention vector passed in should be, according to your class specification. I know conceptually, from the Show, Attend and Tell paper, that the attention vector should be each of the 1x4096 feature vectors, but I can't figure out how to pass that into the AttentionLSTM layer.
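For reference, here is my (possibly wrong) mental model of the attention step in plain numpy, using a simplified dot-product score in place of the paper's MLP scorer. All the names here are my own placeholders, not from your code:

```python
import numpy as np

# Hypothetical shapes matching my setup: 200 timesteps of 4096-d features.
seq_len, input_dims = 200, 4096

features = np.random.randn(seq_len, input_dims)  # the "annotation" vectors a_i
state = np.random.randn(input_dims)              # a query/state vector of the same width

# Soft attention as I understand it from Show, Attend and Tell:
# score each annotation against the state, softmax the scores,
# and take the weighted sum of the annotations as the context vector.
scores = features @ state                        # shape (200,): one score per timestep
weights = np.exp(scores - scores.max())
weights /= weights.sum()                         # attention weights sum to 1
context = weights @ features                     # shape (4096,): the context vector
```

If that mental model is right, I just can't see where this per-step context computation plugs into the AttentionLSTM layer's arguments.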

It would be very helpful if you could provide a gist or example script demonstrating how to use the AttentionLSTM layer, just like you did with the different RNNs in your blog post!
