I don't have a graphics card, so I rented a GPU on Google Colab and ran main.py there. It printed the output below, and something did get created in the checkpoint folder.
2023-04-30 14:39:12.103381: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Some weights of the model checkpoint at monologg/distilkobert were not used when initializing DistilBertModel: ['vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_transform.weight', 'vocab_layer_norm.weight', 'vocab_projector.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'KoBertTokenizer'.
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:561: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:561: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
2023-04-30 14:39:28;[ INFO];----------------------------------------
2023-04-30 14:39:28;[ INFO];rand_seed: 42
2023-04-30 14:39:28;[ INFO];max_grad_norm: 1.0
2023-04-30 14:39:28;[ INFO];lr: 0.0001
2023-04-30 14:39:28;[ INFO];multigpu: False
2023-04-30 14:39:28;[ INFO];context_max_length: 64
2023-04-30 14:39:28;[ INFO];gloss_max_length: 64
2023-04-30 14:39:28;[ INFO];epochs: 3
2023-04-30 14:39:28;[ INFO];context_bsz: 4
2023-04-30 14:39:28;[ INFO];gloss_bsz: 16
2023-04-30 14:39:28;[ INFO];encoder_name: distilkobert
2023-04-30 14:39:28;[ INFO];checkpoint: checkpoint/distilkobert_202304301439
2023-04-30 14:39:28;[ INFO];checkpoint_count: 0
2023-04-30 14:39:28;[ INFO];----------------------------------------
2023-04-30 14:39:28;[ INFO];Creating a new directory for checkpoint/distilkobert_202304301439
/usr/local/lib/python3.10/dist-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
The number of iteration for each epoch is 10
2023-04-30 14:39:28;[ INFO];Epoch 1 initialized.
10it [00:20, 2.08s/it]
2023-04-30 14:39:52;[ INFO];Epoch: 01 | Epoch Time: 0m 20s
2023-04-30 14:39:52;[ INFO]; Train Loss: 1594.614
2023-04-30 14:39:52;[ INFO]; Eval. Acc: 58.28%
2023-04-30 14:39:52;[ INFO]; Eval. F1 : 58.53%
2023-04-30 14:39:53;[ INFO];Checkpoint saved at checkpoint/distilkobert_202304301439/saved_checkpoint_0
2023-04-30 14:39:53;[ INFO];Epoch 2 initialized.
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:561: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
10it [00:21, 2.11s/it]
2023-04-30 14:40:18;[ INFO];Epoch: 02 | Epoch Time: 0m 21s
2023-04-30 14:40:18;[ INFO]; Train Loss: 394.816
2023-04-30 14:40:18;[ INFO]; Eval. Acc: 59.27%
2023-04-30 14:40:18;[ INFO]; Eval. F1 : 59.22%
2023-04-30 14:40:18;[ INFO];Checkpoint saved at checkpoint/distilkobert_202304301439/saved_checkpoint_1
2023-04-30 14:40:18;[ INFO];Epoch 3 initialized.
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:561: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
10it [00:21, 2.14s/it]
2023-04-30 14:40:43;[ INFO];Epoch: 03 | Epoch Time: 0m 21s
2023-04-30 14:40:43;[ INFO]; Train Loss: 223.517
2023-04-30 14:40:43;[ INFO]; Eval. Acc: 60.26%
2023-04-30 14:40:43;[ INFO]; Eval. F1 : 60.57%
2023-04-30 14:40:44;[ INFO];Checkpoint saved at checkpoint/distilkobert_202304301439/saved_checkpoint_2
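The default in eval.py is distilkobert_202011201741, so I replaced it with the date of the checkpoint I just created, and the script did start running. However, the final file that eval.py expects (model_fname = 'saved_checkpoint_fin') was never created, and I suspect that is why I get the error below.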
Some weights of the model checkpoint at monologg/distilkobert were not used when initializing DistilBertModel: ['vocab_projector.weight', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_transform.weight', 'vocab_transform.bias', 'vocab_layer_norm.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'KoBertTokenizer'.
Traceback (most recent call last):
File "/content/drive/MyDrive/Colab Notebooks/eval.py", line 94, in <module>
model = torch.load(f"checkpoint/{args.model_date}/{model_fname}")
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 271, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 252, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoint/distilkobert_202304301336/saved_checkpoint_fin'
I don't really understand what main.py does, so looking at the checkpoint files doesn't tell me anything :(
How can I fix this? Any help would be appreciated.
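For reference, here is a minimal sketch of one possible workaround, not a confirmed fix. It assumes the training run only writes the numbered files seen in the log (saved_checkpoint_0, saved_checkpoint_1, saved_checkpoint_2) and that the highest-numbered one can stand in for saved_checkpoint_fin. The helper name resolve_checkpoint is hypothetical and is not part of the repository.

```python
import re
from pathlib import Path


def resolve_checkpoint(ckpt_dir, preferred="saved_checkpoint_fin"):
    """Pick the checkpoint file to load from ckpt_dir.

    Hypothetical helper: returns `preferred` if it exists, otherwise the
    highest-numbered saved_checkpoint_<N> file found in the directory.
    """
    ckpt_dir = Path(ckpt_dir)
    if not ckpt_dir.is_dir():
        raise FileNotFoundError(f"No checkpoint directory: {ckpt_dir}")

    # Use the final checkpoint when the training script produced one.
    final = ckpt_dir / preferred
    if final.exists():
        return final

    # Otherwise collect the per-epoch checkpoints and their epoch index.
    numbered = []
    for path in ckpt_dir.iterdir():
        match = re.fullmatch(r"saved_checkpoint_(\d+)", path.name)
        if match:
            numbered.append((int(match.group(1)), path))

    if not numbered:
        raise FileNotFoundError(f"No saved_checkpoint_* files in {ckpt_dir}")

    # Compare numerically so that 10 sorts after 2.
    return max(numbered, key=lambda item: item[0])[1]
```

In eval.py this could replace the hard-coded file name, for example `model = torch.load(resolve_checkpoint(f"checkpoint/{args.model_date}"))`. One more thing worth checking: the traceback refers to checkpoint/distilkobert_202304301336, while the training log above created checkpoint/distilkobert_202304301439, so the model date passed to eval.py may also need to match the directory that actually exists.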