train_set:71246
avg word number:244.2789911012548
vocab:87318
avg word number (train_x): 243.84721829991528
avg word number (dev_x):241.3660095897709
avg word number (test_x):236.99672
lm_words_num:17397769
train_vocab_size: 67054
vocab_inv: 87318
Traceback (most recent call last):
  File "train.py", line 427, in <module>
    main()
  File "train.py", line 181, in main
    serializers.load_npz(args.pretrained_model, pretrain_model)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializers/npz.py", line 190, in load_npz
    d.load(obj)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializer.py", line 83, in load
    obj.serialize(self)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/link.py", line 1001, in serialize
    d[name].serialize(serializer[name])
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/link.py", line 651, in serialize
    data = serializer(name, param.data)
  File "/data/anaconda/envs/tf17py3/lib/python3.6/site-packages/chainer/serializers/npz.py", line 150, in __call__
    numpy.copyto(value, dataset)
ValueError: could not broadcast input array from shape (86935,256) into shape (87318,256)
First, I ran `./download.sh` and `wget http://sato-motoki.com/research/vat/imdb_pretrained_lm_ijcai.model`, followed by the iVat train command in README.md. I've attached the output. It seems like `vocab_inv` is larger than the max_vocab at the time the pretrained model was made.
What is the best way to fix this?
Thanks!
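
For reference, here is a minimal sketch of how the shapes stored in the pretrained file could be checked against the vocabulary size from the log. It assumes the model file is a standard NumPy `.npz` archive, which is what `serializers.load_npz` reads. The helper names `list_param_shapes` and `find_shape_mismatches` are hypothetical and not part of this repository or of Chainer.

```python
import numpy as np


def list_param_shapes(npz_path):
    """Return {parameter name: shape} for every array stored in an .npz file."""
    with np.load(npz_path) as archive:
        return {name: archive[name].shape for name in archive.files}


def find_shape_mismatches(saved_shapes, expected_shapes):
    """Return {name: (saved shape, expected shape)} where the two differ.

    Only names present in both dictionaries are compared.
    """
    return {
        name: (saved_shapes[name], expected)
        for name, expected in expected_shapes.items()
        if name in saved_shapes and saved_shapes[name] != expected
    }
```

Calling `list_param_shapes("imdb_pretrained_lm_ijcai.model")` and looking for an entry whose first dimension is 86935 would show which parameter was saved with the smaller vocabulary, matching the `(86935,256)` versus `(87318,256)` shapes in the error above.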