LDA with topicmodels: how can I see which topics different documents belong to?

How about this, using the built-in AssociatedPress dataset? It shows, for each document, the topic it most probably belongs to.

    library(topicmodels)
    data("AssociatedPress", package = "topicmodels")

    k <- 5 # set number of topics

    # generate model
    lda <- LDA(AssociatedPress[1:20, ], k = k, control = list(alpha = 0.1))
    # now we have a topic model with 20 docs and five topics

    # make a data frame with topics as cols, docs as rows and
    # cell values as posterior topic distribution for each document
    gammaDF <- as.data.frame(lda@gamma)
    names(gammaDF) <- c(1:k)

    # inspect...
    gammaDF
    #>               1            2            3            4            5
    #> 1  8.979807e-05 8.979807e-05 9.996408e-01 8.979807e-05 8.979807e-05
    #> 2  8.714836e-05 8.714836e-05 8.714836e-05 8.714836e-05 9.996514e-01
    #> 3  9.261396e-05 9.996295e-01 9.261396e-05 9.261396e-05 9.261396e-05
    #> 4  9.995437e-01 1.140774e-04 1.140774e-04 1.140774e-04 1.140774e-04
    #> 5  3.573528e-04 3.573528e-04 9.985706e-01 3.573528e-04 3.573528e-04
    #> 6  5.610659e-05 5.610659e-05 5.610659e-05 5.610659e-05 9.997756e-01
    #> 7  9.994345e-01 1.413820e-04 1.413820e-04 1.413820e-04 1.413820e-04
    #> 8  4.286702e-04 4.286702e-04 4.286702e-04 9.982853e-01 4.286702e-04
    #> 9  3.319338e-03 3.319338e-03 9.867226e-01 3.319338e-03 3.319338e-03
    #> 10 2.034781e-04 2.034781e-04 9.991861e-01 2.034781e-04 2.034781e-04
    #> 11 4.810342e-04 9.980759e-01 4.810342e-04 4.810342e-04 4.810342e-04
    #> 12 2.651256e-04 9.989395e-01 2.651256e-04 2.651256e-04 2.651256e-04
    #> 13 1.430945e-04 1.430945e-04 1.430945e-04 9.994276e-01 1.430945e-04
    #> 14 8.402940e-04 8.402940e-04 8.402940e-04 9.966388e-01 8.402940e-04
    #> 15 8.404830e-05 9.996638e-01 8.404830e-05 8.404830e-05 8.404830e-05
    #> 16 1.903630e-04 9.992385e-01 1.903630e-04 1.903630e-04 1.903630e-04
    #> 17 1.297372e-04 1.297372e-04 9.994811e-01 1.297372e-04 1.297372e-04
    #> 18 6.906241e-05 6.906241e-05 6.906241e-05 9.997238e-01 6.906241e-05
    #> 19 1.242780e-04 1.242780e-04 1.242780e-04 1.242780e-04 9.995029e-01
    #> 20 9.997361e-01 6.597684e-05 6.597684e-05 6.597684e-05 6.597684e-05

    # Now for each doc, find just the top-ranked topic
    # (which.max returns a single index even if two topics tie)
    toptopics <- as.data.frame(cbind(
      document = row.names(gammaDF),
      topic = apply(gammaDF, 1, function(x) names(gammaDF)[which.max(x)])))

    # inspect...
    # (this listing evidently comes from a different run than the table above,
    # so the topic numbers do not line up; LDA fits vary between runs unless
    # a seed is fixed)
    toptopics
    #>    document topic
    #> 1         1     2
    #> 2         2     5
    #> 3         3     1
    #> 4         4     4
    #> 5         5     4
    #> 6         6     5
    #> 7         7     2
    #> 8         8     4
    #> 9         9     1
    #> 10       10     2
    #> 11       11     3
    #> 12       12     1
    #> 13       13     1
    #> 14       14     2
    #> 15       15     1
    #> 16       16     4
    #> 17       17     4
    #> 18       18     3
    #> 19       19     4
    #> 20       20     3
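The final step (picking the top-ranked topic per document) works on any document-by-topic probability matrix, so here is a minimal base-R sketch of just that part. The matrix `gamma` is made-up toy data standing in for `lda@gamma`, and the names `top_topic` and `top_prob` are only illustrative; none of this comes from the original answer.

```r
# Toy document-by-topic matrix (made-up values standing in for lda@gamma);
# each row is one document and sums to 1.
gamma <- matrix(c(0.10, 0.80, 0.10,
                  0.70, 0.20, 0.10,
                  0.05, 0.05, 0.90),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("doc", 1:3), as.character(1:3)))

# Column index of the largest value in each row.
# which.max returns the first maximum, so ties yield a single topic.
top_topic <- unname(apply(gamma, 1, which.max))

# Probability of that top topic, handy for spotting weakly assigned documents.
top_prob <- unname(apply(gamma, 1, max))

toptopics <- data.frame(document = rownames(gamma),
                        topic = top_topic,
                        probability = top_prob)
print(toptopics)
```

With a fitted model, the topicmodels package also offers `topics(lda)` for the most likely topic per document and `posterior(lda)$topics` for the full posterior matrix, which may be tidier than reading the `@gamma` slot directly.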

Is that what you want to do?
Hat tip to this answer: https://stat.ethz.ch/pipermail/r-help/2010-August/247706.html
