RuntimeError: Tracer cannot infer type of Seq2SeqLMOutput #1246

@ling976

Hello, I ran into an error while tracing a model with torch.jit.trace.

The error message is:

RuntimeError: Tracer cannot infer type of Seq2SeqLMOutput

Here is my code:

 import torch
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

 # Load the tokenizer and model from a local directory.
 tokenizer = AutoTokenizer.from_pretrained('./outputs/model_files/')
 model = AutoModelForSeq2SeqLM.from_pretrained('./outputs/model_files/')

 device = torch.device("cpu")
 model.to(device)
 model.eval()

 # Build a fixed-length example input for tracing.
 sample_sentence = "generate some numbers"
 encoding = tokenizer(sample_sentence,
                      padding="max_length",
                      max_length=5,
                      return_tensors="pt",
                      return_attention_mask=True,
                      truncation=True)
 input_ids = encoding.input_ids
 attention_mask = encoding.attention_mask
 # A single decoder start token serves as the initial decoder input.
 decoder_input_ids = torch.ones(1, 1, dtype=torch.int32) * model.config.decoder_start_token_id

 # This call raises the RuntimeError.
 traced_model = torch.jit.trace(model, (input_ids, attention_mask, decoder_input_ids), strict=False)
 traced_model.save("./model.pt")

The full error output is:

 D:\Program Files\Python310\lib\site-packages\transformers\modeling_utils.py:701: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
   if causal_mask.shape[1] < attention_mask.shape[1]:
 Traceback (most recent call last):
   File "E:\Python\project\Chinese_Chat_T5_Base-main\convertModel.py", line 37, in <module>
     traced_model = torch.jit.trace(model, (input_ids,attention_mask,decoder_input_ids),strict=False)
   File "D:\Program Files\Python310\lib\site-packages\torch\jit\_trace.py", line 759, in trace
     return trace_module(
   File "D:\Program Files\Python310\lib\site-packages\torch\jit\_trace.py", line 976, in trace_module
     module._c._create_method_from_trace(
RuntimeError: Tracer cannot infer type of Seq2SeqLMOutput(loss=None, logits=tensor([[[-8.0331, -0.6127,  1.7029,  ..., -6.0205, -4.9355, -7.5521]]],
   grad_fn=<UnsafeViewBackward0>), past_key_values=((tensor([[[[-4.1845e-01, -3.1748e+00,  3.5584e-01,  1.3317e-01, -4.8382e-01,
        4.9041e-01,  1.2883e+00,  5.5251e-01,  2.3777e+00,  3.6629e-01,
       -2.3793e-01,  1.6337e+00,  9.4133e-01, -1.0904e+00, -2.8644e+00,
       -5.2565e-02,  2.9996e-01, -4.1858e-01, -7.8744e-01, -1.7734e+00,
       -1.0728e+00,  5.5014e-01, -1.5405e+00,  2.7343e+00,  3.5340e+00,
       -1.5999e-02, -7.7990e-01,  4.5489e-01, -2.4964e-01, -2.9343e-01,
        7.0564e-01,  9.1929e-01,  3.4561e+00, -6.6381e-01,  8.5702e-01,
        6.3156e-01, -7.5711e-01,  1.6548e+00, -8.5602e-01, -9.3094e-01,
        9.1188e-02, -8.6472e-01,  6.4054e-01,  4.7034e-01,  3.4763e+00,
       -1.0079e+00,  1.2279e-01,  1.5227e+00,  1.6583e-01,  9.4017e-01,
        1.5735e+00,  3.4655e-01, -8.0972e-01,  9.2279e-01,  3.1652e-01,
       -2.3178e+00,  5.2484e-02,  4.8382e-01, -1.7146e-01,  2.4539e+00,

.......

     [-2.7458e-03, -4.8062e-02, -5.2608e-02,  ..., -4.8220e-03,
        5.0419e-02,  2.8005e-03]]]], grad_fn=<TransposeBackward0>))), decoder_hidden_states=None, decoder_attentions=None, cross_attentions=None, encoder_last_hidden_state=tensor([[[-0.0070,  0.1318, -0.0300,  ...,  0.0244, -0.0696,  0.0580],
     [-0.0274,  0.0240, -0.0552,  ..., -0.0846, -0.0992,  0.0408],
     [-0.0647,  0.0068, -0.0779,  ...,  0.0064,  0.0316,  0.0111],
     [-0.0445, -0.0067, -0.0273,  ...,  0.0320,  0.0382,  0.0814],
     [-0.0006,  0.0002,  0.0010,  ..., -0.0002,  0.0009, -0.0009]]],
   grad_fn=<MulBackward0>), encoder_hidden_states=None, encoder_attentions=None)
  :Dictionary inputs to traced functions must have consistent type. Found Tensor and Tuple[Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor], Tuple[Tensor, Tensor, Tensor, Tensor]]

The original model is available at: https://huggingface.co/mxmax/Chinese_Chat_T5_Base
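
The last line of the error suggests that the tracer rejects the output because it is a dict-like Seq2SeqLMOutput whose values mix a plain Tensor (logits) with nested tuples (past_key_values). One possible workaround, shown here only as a minimal sketch, is to trace a thin wrapper that returns the logits tensor alone. The names LogitsOnly and ToySeq2Seq are hypothetical. ToySeq2Seq is a stand-in that makes the sketch runnable without the checkpoint, and it assumes the real model's forward accepts use_cache and return_dict keyword arguments. This has not been tested against the model in this issue.

```python
import torch
from torch import nn


class ToySeq2Seq(nn.Module):
    """Hypothetical stand-in for the real model: returns a mixed-type dict by default."""

    def __init__(self, vocab: int = 16, dim: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, vocab)

    def forward(self, input_ids, attention_mask, decoder_input_ids,
                use_cache=True, return_dict=True):
        # Mask out padded encoder positions, then pool into one context vector.
        enc = self.embed(input_ids) * attention_mask.unsqueeze(-1).to(torch.float32)
        dec = self.embed(decoder_input_ids) + enc.mean(dim=1, keepdim=True)
        logits = self.proj(dec)
        past = ((enc, enc),) if use_cache else None
        if return_dict:
            # Tensor next to nested tuples: the shape of output the tracer complained about.
            return {"logits": logits, "past_key_values": past}
        return (logits,) if past is None else (logits, past)


class LogitsOnly(nn.Module):
    """Hypothetical wrapper: exposes only the logits so the traced output is one Tensor."""

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask, decoder_input_ids):
        outputs = self.model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_input_ids=decoder_input_ids,
            use_cache=False,    # assumption: drops past_key_values from the output
            return_dict=False,  # assumption: yields a plain tuple instead of a dict
        )
        return outputs[0]


torch.manual_seed(0)
wrapper = LogitsOnly(ToySeq2Seq()).eval()

# Example inputs with the same shapes as in the issue: (1, 5), (1, 5), (1, 1).
input_ids = torch.randint(0, 16, (1, 5), dtype=torch.long)
attention_mask = torch.ones(1, 5, dtype=torch.long)
decoder_input_ids = torch.zeros(1, 1, dtype=torch.long)

with torch.no_grad():
    eager_out = wrapper(input_ids, attention_mask, decoder_input_ids)
    traced = torch.jit.trace(wrapper, (input_ids, attention_mask, decoder_input_ids))
    traced_out = traced(input_ids, attention_mask, decoder_input_ids)
```

To try this on the real model, LogitsOnly(model) would replace model in the torch.jit.trace call. The Transformers documentation also describes a torchscript=True argument to from_pretrained for TorchScript export, which may be worth trying as an alternative.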
