Can this work with clusters made by top2vec? #20

@behrica

Thanks for your interesting package.

Do you think Clustergram could work with top2vec?
https://github.com/ddangelov/Top2Vec

I saw that there is the option to create a clustergram from a DataFrame.

In top2vec, each "document" to cluster is represented as an embedding of a fixed dimension, for example 256.

So I could indeed generate a data frame, like this:

x0    x1    ...   x255   topic
0.5   0.2   ...   -0.2   2
0.7   0.2   ...   -0.1   2
0.5   0.2   ...   -0.2   3
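
A minimal sketch of how such a data frame could be assembled, assuming the document embeddings are available as a NumPy array and the topic assignments as integer labels. The arrays here are random stand-ins, not real top2vec output, and the names `doc_vectors` and `labels_by_k` are hypothetical. The commented-out call at the end reflects my understanding that `Clustergram.from_data` takes the data plus a DataFrame of labels with one column per number of clusters; please treat that as an assumption.

```python
import numpy as np
import pandas as pd

# Random stand-in for the document embeddings (hypothetical, not top2vec output).
rng = np.random.default_rng(42)
n_docs, dim = 6, 256
doc_vectors = rng.normal(size=(n_docs, dim))

# Hypothetical topic assignments for several cluster counts k.
# Each key is a number of clusters, each value holds one label per document.
labels_by_k = {
    2: [0, 0, 1, 1, 0, 1],
    3: [0, 0, 1, 1, 2, 1],
    4: [2, 2, 3, 3, 0, 1],
}
labels = pd.DataFrame(labels_by_k)

# One row per document: columns x0 ... x255 plus the topic for k = 4.
columns = [f"x{i}" for i in range(dim)]
df = pd.DataFrame(doc_vectors, columns=columns)
df["topic"] = labels[4]

# Feature columns only, without the topic label.
data = df[columns]

# Assumed usage, not executed here:
# from clustergram import Clustergram
# cgram = Clustergram.from_data(data, labels, method="mean")
# cgram.plot()
```

A single `topic` column only describes one clustering, so a clustergram would presumably need one such column for each number of topics being compared.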

Does Clustergram assume anything about the rows of this data frame?
I saw that the from_data method takes either "mean" or "median" as the method for calculating the cluster centers.

With word vectors, we typically use cosine distance to measure the distance between vectors. Does this have any influence?
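
One fact that may bear on the cosine question: for vectors scaled to unit length, the squared Euclidean distance equals twice the cosine distance, so both give the same ordering of neighbours. The sketch below checks this with NumPy on two random 256-dimensional vectors. The vectors and the `l2_normalize` helper are made up for illustration, and whether normalizing the embeddings beforehand is the right choice for Clustergram is an assumption, not something confirmed here.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    return v / np.linalg.norm(v)

# Two random vectors standing in for document embeddings (hypothetical data).
rng = np.random.default_rng(0)
a = rng.normal(size=256)
b = rng.normal(size=256)

a_n, b_n = l2_normalize(a), l2_normalize(b)

# Cosine distance is 1 minus the cosine similarity.
cosine_distance = 1.0 - float(a_n @ b_n)

# Squared Euclidean distance between the normalized vectors.
squared_euclidean = float(np.sum((a_n - b_n) ** 2))

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b) = 2 * cosine_distance
print(squared_euclidean, 2 * cosine_distance)
```

Note that the mean of several unit vectors is generally not unit length itself, so centers computed with "mean" would need to be renormalized if they are to be compared by cosine distance.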

top2vec also calculates the "topic vectors" as the mean of the "document vectors", I believe.
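
To make the "mean of the document vectors" idea concrete, here is a small pandas sketch that computes one center per topic from the three example rows above, keeping only the columns that are shown (x0, x1, x255). It assumes that a "mean" or "median" center is simply the per-topic average or median of the rows; I have not confirmed that either Clustergram or top2vec computes it exactly this way.

```python
import pandas as pd

# The three example rows, restricted to the columns that are shown.
df = pd.DataFrame(
    {
        "x0": [0.5, 0.7, 0.5],
        "x1": [0.2, 0.2, 0.2],
        "x255": [-0.2, -0.1, -0.2],
        "topic": [2, 2, 3],
    }
)

# One row per topic: the mean (or median) of that topic's document vectors.
mean_centers = df.groupby("topic").mean()
median_centers = df.groupby("topic").median()

print(mean_centers)
```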
