❓ Questions & Help
Details
Hello everyone! In my sequential recommendation dataset, every item comes annotated with a list of categories (potentially with repeated values). Here is a representative example:
data = [
    {"session_id": 1, "item_id-list": [101, 102, 103], "categories-list": [["A", "B"], ["C", "D"], ["E"]]},
    {"session_id": 2, "item_id-list": [201, 202], "categories-list": [["A"], ["F", "F"]]},
]
Is it possible to categorify the categories present above in a nested way so that:
- the lists [[A,B], [C,D], ..], .. do not become separate tokens but remain lists of categorified elements (e.g. [[1,2], [3,4], [6]] and [[1], [5,5]])
- we can then feed those into EmbeddingBag downstream?
I've tried supplying the Dataset constructor with an appropriate schema, but unfortunately that failed. I could also try flattening the lists, categorifying the flat values, and fusing them back into the nested shape, but this looks like an inefficient and bad idea.
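For reference, the flatten-and-regroup fallback could look roughly like the sketch below. It is plain Python, not anything from the library: the helper names `categorify_nested` and `to_bag_inputs` are hypothetical, ids are assigned in first-seen order starting at 1 (so they need not match the illustrative ids in the question), and the flat values plus offsets layout is my assumption about what would be passed to `torch.nn.EmbeddingBag` as its 1D `input` and `offsets`.

```python
def categorify_nested(sessions):
    """Map category labels to integer ids while keeping the per-item nesting.

    Hypothetical helper: ids start at 1 so that 0 can stay reserved for padding.
    """
    vocab = {}
    encoded = []
    for session in sessions:
        encoded.append([
            # setdefault returns the existing id, or stores and returns a new one
            [vocab.setdefault(label, len(vocab) + 1) for label in item_categories]
            for item_categories in session
        ])
    return encoded, vocab


def to_bag_inputs(session):
    """Flatten one encoded session into (values, offsets), one bag per item."""
    values, offsets = [], []
    for item_categories in session:
        offsets.append(len(values))  # index where this item's bag starts
        values.extend(item_categories)
    return values, offsets


categories = [
    [["A", "B"], ["C", "D"], ["E"]],  # session 1
    [["A"], ["F", "F"]],              # session 2
]
encoded, vocab = categorify_nested(categories)
values, offsets = to_bag_inputs(encoded[0])
```

With this layout each item of a session becomes one bag, so repeated categories such as `F, F` are simply repeated ids inside the same bag.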