Questions about neuron enhancement

@zhaoyiran924, thanks for sharing your great work! I have a question about the implementation of `train_neuron.py` and its data format.

### Current Implementation
From the paper, my understanding was that the neurons are trained on raw Wikipedia passages with a next-token prediction objective.
However, the code currently seems to format each training example as a question-answer pair:

```python
def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example['original_question'])):
        text = f"{example['original_question'][i]}. {example['response'][i]}"
        output_texts.append(text)
    return output_texts
```

### Questions

1. Could you please confirm the format in which the Wikipedia data is passed to training,
or share an example of how the Wikipedia documents are preprocessed to produce the 'original_question' and 'response' fields?

2. Would it make more sense to use a simpler format for plain text documents, like:
```python
def formatting_prompts_func(example):
    return example['text']
```
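
To make the difference concrete, here is a minimal sketch of what each format would yield on a toy batch. It assumes the formatting function receives a batched dict of columns, which is what the index loop in the current code suggests. The names `format_qa` and `format_plain` and the sample values are made up for illustration and are not taken from the repository or its dataset.

```python
def format_qa(batch):
    # Mirrors the current behavior: question and response joined by ". "
    return [
        f"{question}. {response}"
        for question, response in zip(batch["original_question"], batch["response"])
    ]


def format_plain(batch):
    # Hypothetical alternative: pass raw passages through unchanged
    return list(batch["text"])


# Toy batches with made-up values, only to show the resulting strings
qa_batch = {
    "original_question": ["What is the capital of France"],
    "response": ["Paris"],
}
plain_batch = {"text": ["Paris is the capital of France."]}

qa_texts = format_qa(qa_batch)
plain_texts = format_plain(plain_batch)
print(qa_texts)
print(plain_texts)
```

With the question-answer format, every training string depends on both fields being present, so raw Wikipedia passages would need to be converted into that shape first.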

Thanks for your help.

