Make the posterior() stats available 

The originally wrapped package _topicmodels_ allowed a more fine-grained exploration of the topics in each document via `topicmodels::posterior(my_lda)$topics`. Could this be made available for the result of `seededlda::textmodel_lda()`?

Given the probabilistic nature of topic-document associations, it would be nice to make students and the public aware that a given topic is only the most prominent one in a given text, not the only one.

Example:

```{r}
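# Packages used below; `my_dfm` is assumed to be an existing quanteda dfm
library(quanteda)
library(stringr)
library(tidyr)
library(ggplot2)

lda_model2 <- topicmodels::LDA(convert(my_dfm, to = "topicmodels"), k = 6)
# Document-by-topic matrix of posterior probabilities (each row sums to 1)
doc_topics <- topicmodels::posterior(lda_model2)$topics
# data.frame() prefixes the numeric topic names with "X" (X1 ... X6)
df <- data.frame(doc_id = str_replace(row.names(doc_topics), fixed(".txt"), ""), doc_topics)
df_long <- pivot_longer(df, cols = starts_with("X"), names_to = "topic", values_to = "importance")
# One stacked bar per document, split by topic share
ggplot(df_long, aes(x = importance, y = doc_id, fill = factor(topic))) +
	geom_col() +
	labs(x = "Topic Importance", y = "Document ID", fill = "Topic") +
	theme_minimal() +
	theme(axis.text.y = element_text(angle = 0, hjust = 1))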
```

![mytextsplot2](https://github.com/koheiw/seededlda/assets/7204421/24b1fdfb-fd94-4f65-b8bb-a0ced804e012)





