Mismatch between pretrained weights and imdb data?

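First, I ran `./download.sh` and `wget http://sato-motoki.com/research/vat/imdb_pretrained_lm_ijcai.model`. I then ran the iVAT training command from README.md; the output is below.

Loading the pretrained weights fails with a shape mismatch: `vocab_inv` has 87318 entries, but the pretrained model contains an array with only 86935 rows. It seems the vocabulary built from the current data is larger than the max_vocab in effect when the pretrained model was made.
What is the best way to fix this?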
Thanks!

```
train_set:71246
avg word number:244.2789911012548
vocab:87318
avg word number (train_x): 243.84721829991528
avg word number (dev_x):241.3660095897709
avg word number (test_x):236.99672
lm_words_num:17397769
train_vocab_size: 67054
vocab_inv: 87318
Traceback (most recent call last):
  File "train.py", line 427, in <module>
    main()
  File "train.py", line 181, in main
    serializers.load_npz(args.pretrained_model, pretrain_model)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializers/npz.py", line 190, in load_npz
    d.load(obj)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializer.py", line 83, in load
    obj.serialize(self)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/link.py", line 1001, in serialize
    d[name].serialize(serializer[name])
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/link.py", line 651, in serialize
    data = serializer(name, param.data)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializers/npz.py", line 150, in __call__
    numpy.copyto(value, dataset)
ValueError: could not broadcast input array from shape (86935,256) into shape (87318,256)

```
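
In case it helps with debugging: a minimal sketch for confirming which arrays in the pretrained file are sized by the old vocabulary. It assumes the `.model` file is a plain NumPy npz archive (which `serializers.load_npz` implies) and uses only NumPy. The helper names `list_param_shapes` and `params_with_rows` are my own, not part of the repo or of Chainer.

```python
import numpy as np


def list_param_shapes(npz_path):
    """Return {array name: shape} for every array stored in an npz archive."""
    # np.load detects the zip format from the file contents, so the
    # ".model" extension should not matter.
    with np.load(npz_path) as archive:
        return {name: archive[name].shape for name in archive.files}


def params_with_rows(shapes, rows):
    """Return the sorted names of arrays whose first dimension equals `rows`."""
    return sorted(
        name
        for name, shape in shapes.items()
        if len(shape) >= 1 and shape[0] == rows
    )
```

Calling `params_with_rows(list_param_shapes("imdb_pretrained_lm_ijcai.model"), 86935)` should list the arrays that were saved with the 86935-word vocabulary. These are the ones that cannot be copied into a model built for 87318 words.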
