Here is how to do it with a built-in dataset. This shows, for each document, which topic it most probably belongs to.
    library(topicmodels)
    data("AssociatedPress", package = "topicmodels")

    k <- 5 # set number of topics

    # generate model
    lda <- LDA(AssociatedPress[1:20,], control = list(alpha = 0.1), k)

    # now we have a topic model with 20 docs and five topics
    # make a data frame with topics as cols, docs as rows and
    # cell values as posterior topic distribution for each document
    gammaDF <- as.data.frame(lda@gamma)
    names(gammaDF) <- c(1:k)

    # inspect...
    gammaDF
                  1            2            3            4            5
    1  8.979807e-05 8.979807e-05 9.996408e-01 8.979807e-05 8.979807e-05
    2  8.714836e-05 8.714836e-05 8.714836e-05 8.714836e-05 9.996514e-01
    3  9.261396e-05 9.996295e-01 9.261396e-05 9.261396e-05 9.261396e-05
    4  9.995437e-01 1.140774e-04 1.140774e-04 1.140774e-04 1.140774e-04
    5  3.573528e-04 3.573528e-04 9.985706e-01 3.573528e-04 3.573528e-04
    6  5.610659e-05 5.610659e-05 5.610659e-05 5.610659e-05 9.997756e-01
    7  9.994345e-01 1.413820e-04 1.413820e-04 1.413820e-04 1.413820e-04
    8  4.286702e-04 4.286702e-04 4.286702e-04 9.982853e-01 4.286702e-04
    9  3.319338e-03 3.319338e-03 9.867226e-01 3.319338e-03 3.319338e-03
    10 2.034781e-04 2.034781e-04 9.991861e-01 2.034781e-04 2.034781e-04
    11 4.810342e-04 9.980759e-01 4.810342e-04 4.810342e-04 4.810342e-04
    12 2.651256e-04 9.989395e-01 2.651256e-04 2.651256e-04 2.651256e-04
    13 1.430945e-04 1.430945e-04 1.430945e-04 9.994276e-01 1.430945e-04
    14 8.402940e-04 8.402940e-04 8.402940e-04 9.966388e-01 8.402940e-04
    15 8.404830e-05 9.996638e-01 8.404830e-05 8.404830e-05 8.404830e-05
    16 1.903630e-04 9.992385e-01 1.903630e-04 1.903630e-04 1.903630e-04
    17 1.297372e-04 1.297372e-04 9.994811e-01 1.297372e-04 1.297372e-04
    18 6.906241e-05 6.906241e-05 6.906241e-05 9.997238e-01 6.906241e-05
    19 1.242780e-04 1.242780e-04 1.242780e-04 1.242780e-04 9.995029e-01
    20 9.997361e-01 6.597684e-05 6.597684e-05 6.597684e-05 6.597684e-05

    # Now for each doc, find just the top-ranked topic
    toptopics <- as.data.frame(cbind(document = row.names(gammaDF),
      topic = apply(gammaDF, 1, function(x) names(gammaDF)[which(x == max(x))])))

    # inspect...
    # (topic numbering can differ between LDA runs, which would explain
    # why this output does not line up with the table above)
    toptopics
       document topic
    1         1     2
    2         2     5
    3         3     1
    4         4     4
    5         5     4
    6         6     5
    7         7     2
    8         8     4
    9         9     1
    10       10     2
    11       11     3
    12       12     1
    13       13     1
    14       14     2
    15       15     1
    16       16     4
    17       17     4
    18       18     3
    19       19     4
    20       20     3

Is that what you want to do?
Hat tip to this answer: https://stat.ethz.ch/pipermail/r-help/2010-August/247706.html
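
As a possible variation on the top-topic step, here is a minimal base-R sketch that uses `which.max()` and also keeps the winning probability. It does not fit a model: the small `gamma` matrix is hand-copied from the first three rows of the `gammaDF` output above, and the `prob` column is an addition for illustration, not part of the original approach.

```r
# Posterior topic probabilities for the first three documents,
# copied by hand from the gammaDF output (rows = docs, cols = topics).
gamma <- matrix(
  c(8.979807e-05, 8.979807e-05, 9.996408e-01, 8.979807e-05, 8.979807e-05,
    8.714836e-05, 8.714836e-05, 8.714836e-05, 8.714836e-05, 9.996514e-01,
    9.261396e-05, 9.996295e-01, 9.261396e-05, 9.261396e-05, 9.261396e-05),
  nrow = 3, byrow = TRUE,
  dimnames = list(as.character(1:3), as.character(1:5))
)

# which.max() returns the index of the first maximum in each row, so a tie
# cannot yield more than one topic per document, which is a risk with
# which(x == max(x)).
toptopics <- data.frame(
  document = rownames(gamma),
  topic    = apply(gamma, 1, which.max),  # column index of the top topic
  prob     = apply(gamma, 1, max)         # its posterior probability
)

toptopics
```

For these three rows the top topics come out as 3, 5 and 2, matching the largest entry in each row of the table. With a fitted model, I believe `topics(lda)` from topicmodels gives the same per-document answer directly, but check the package documentation for the exact return type.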



