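Statistics | 1 - Visualizing One-Way ANOVA: Bar Chart + Standard Deviation + Multiple-Comparison Results

This post shows how to visualize the results of a one-way ANOVA. The goal is a figure like the following:

Figure: group means + standard deviation

1. Data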

library(agricolae)
data(sweetpotato)
head(sweetpotato)
str(sweetpotato)

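Experiment description:

The data come from an experiment carried out in Tacna province in southern Peru, which studied the effect of two viruses (Spfmv and Spcsv). The treatments were: CC (Spcsv) = sweet potato chlorotic dwarf, FF (Spfmv) = feathery mottle, FC (Spfmv y Spcsv) = viral complex, and OO (witness) = healthy plants, i.e. the control. Each plot was planted with 50 sweet potato plants, and there were 12 plots in total. Each treatment was replicated 3 times, and total weight (kg) was assessed at the end of the experiment. The virus was transmitted through the cuttings, which were then planted in the field.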
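The layout described above (4 treatments x 3 replicates = 12 plots) can be checked directly. The sketch below uses base R only. Instead of loading agricolae, it rebuilds the data frame from the Min, Q50 and Max columns of the `$means` table printed in section 3 (with three plots per treatment, those three columns are the three observations). The name `sp` is hypothetical, and the row order within each treatment is an assumption that may differ from the packaged `sweetpotato` data.

```r
# Rebuilt from the Min / Q50 / Max columns of the $means table in section 3.
# Assumption: row order within a treatment may differ from agricolae's data.
sp <- data.frame(
  virus = factor(rep(c("cc", "fc", "ff", "oo"), each = 3)),
  yield = c(21.7, 23.0, 28.5,   # cc
            10.6, 13.1, 14.9,   # fc
            28.0, 39.2, 41.8,   # ff
            32.1, 38.2, 40.4)   # oo
)

reps      <- table(sp$virus)                   # plots per treatment
grp_means <- tapply(sp$yield, sp$virus, mean)  # treatment means

print(reps)
print(round(grp_means, 2))
```

A balanced design (the same number of replicates in every treatment) is what lets a single LSD value apply to every pairwise comparison later on.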

2. Analysis of variance

mod1 = aov(yield~virus,data=sweetpotato)
summary(mod1)

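The ANOVA shows highly significant differences among the virus treatments (F = 17.34 on 3 and 8 d.f., p < 0.001), so multiple comparisons are justified.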
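To make the F test less of a black box, the sums of squares can be computed by hand. This is a minimal base-R sketch, not a replacement for `aov()`. The data are rebuilt from the `$means` table in section 3 (an assumption about within-treatment order, which does not affect the result), and all variable names are illustrative.

```r
# Data rebuilt from the $means table (Min / Q50 / Max of each treatment).
virus <- factor(rep(c("cc", "fc", "ff", "oo"), each = 3))
yield <- c(21.7, 23.0, 28.5, 10.6, 13.1, 14.9,
           28.0, 39.2, 41.8, 32.1, 38.2, 40.4)

k <- nlevels(virus)   # number of treatments
n <- length(yield)    # number of plots

grp_mean <- tapply(yield, virus, mean)
grp_n    <- tapply(yield, virus, length)

# Treatment (between-group) and residual (within-group) sums of squares.
ss_between <- sum(grp_n * (grp_mean - mean(yield))^2)
ss_within  <- sum((yield - ave(yield, virus))^2)

ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (n - k)

f_stat  <- ms_between / ms_within
p_value <- pf(f_stat, k - 1, n - k, lower.tail = FALSE)

print(c(ss_between = ss_between, ss_within = ss_within, F = f_stat, p = p_value))
```

The two sums of squares should agree with the s.s. column of the ANOVA table in section 6.2 (1170.21 and 179.91).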

3. Multiple comparisons

re = LSD.test(mod1,"virus")
re

Result:

$statistics
   MSerror Df   Mean      CV  t.value      LSD
  22.48917  8 27.625 17.1666 2.306004 8.928965

$parameters
        test p.ajusted name.t ntr alpha
  Fisher-LSD      none  virus   4  0.05

$means
      yield      std r       LCL      UCL  Min  Max   Q25  Q50   Q75
cc 24.40000 3.609709 3 18.086268 30.71373 21.7 28.5 22.35 23.0 25.75
fc 12.86667 2.159475 3  6.552935 19.18040 10.6 14.9 11.85 13.1 14.00
ff 36.33333 7.333030 3 30.019601 42.64707 28.0 41.8 33.60 39.2 40.50
oo 36.90000 4.300000 3 30.586268 43.21373 32.1 40.4 35.15 38.2 39.30

$comparison
NULL

$groups
      yield groups
oo 36.90000      a
ff 36.33333      a
cc 24.40000      b
fc 12.86667      c

attr(,"class")
[1] "group"
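The LSD value and the grouping letters in this output follow from a short calculation. For equal replication, Fisher's LSD is the two-sided critical t value times the standard error of a difference, sqrt(2 * MSerror / r). The sketch below recomputes it in base R from numbers copied out of the output above; the variable names are illustrative.

```r
# Values copied from the LSD.test output above.
ms_error <- 22.48917   # $statistics MSerror
df_error <- 8          # $statistics Df
r        <- 3          # replicates per treatment ($means column r)
alpha    <- 0.05       # $parameters alpha
means <- c(oo = 36.90000, ff = 36.33333, cc = 24.40000, fc = 12.86667)

t_crit <- qt(1 - alpha / 2, df_error)  # two-sided critical t
sed    <- sqrt(2 * ms_error / r)       # standard error of a difference
lsd    <- t_crit * sed

# Absolute pairwise differences; TRUE marks pairs that differ at the 5% level.
diffs        <- abs(outer(means, means, "-"))
signif_pairs <- diffs > lsd

print(round(c(t.value = t_crit, LSD = lsd), 4))
print(signif_pairs)
```

Only the oo/ff pair falls below the LSD, which is why those two treatments share the letter a while cc and fc each get their own letter.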

4. Visualizing the multiple comparisons

re1 = re$groups
re1

# Compute the standard deviation (not the standard error) of yield for each virus treatment

xx = aggregate(yield ~ virus, sweetpotato,sd)
names(xx) = c("virus","sd")
xx
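The error bars in this post are based on the per-treatment standard deviation. If the standard error of the mean is wanted instead, it is sd / sqrt(n). The following base-R sketch shows both side by side; the data are rebuilt from the `$means` table in section 3 (an assumption, as is the name `sp`), and the `n` and `se` columns are additions for illustration.

```r
# Data rebuilt from the $means table (Min / Q50 / Max of each treatment).
sp <- data.frame(
  virus = factor(rep(c("cc", "fc", "ff", "oo"), each = 3)),
  yield = c(21.7, 23.0, 28.5, 10.6, 13.1, 14.9,
            28.0, 39.2, 41.8, 32.1, 38.2, 40.4)
)

# Per-treatment standard deviation (what the error bars in this post use).
xx <- aggregate(yield ~ virus, sp, sd)
names(xx) <- c("virus", "sd")

# Replicates per treatment, then the standard error of each mean.
xx$n  <- aggregate(yield ~ virus, sp, length)$yield
xx$se <- xx$sd / sqrt(xx$n)

print(xx)
```

With three replicates, each standard error is the standard deviation divided by about 1.73, so SE-based error bars would be noticeably shorter than the SD-based ones.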


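library(dplyr)  # provides %>%, mutate() and inner_join()

# Turn the row names into a virus column, then attach the standard deviations
re2 = re1 %>% mutate(virus = rownames(re1)) %>% inner_join(.,xx,by="virus")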
re2

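# Plot
## Draw the bar chart: error bars are mean +/- 1 SD, letters are the LSD groups
library(ggplot2)
re2 %>% ggplot(aes(virus, yield)) +
  geom_col(aes(fill = virus), width = .4) +
  # linewidth needs ggplot2 >= 3.4.0; use size = .5 on older versions
  geom_errorbar(aes(ymax = yield + sd, ymin = yield - sd), width = .1, linewidth = .5) +
  geom_text(aes(label = groups, y = yield + sd + 1.5)) +
  theme(panel.grid = element_blank(), panel.background = element_rect(color = "black", fill = "transparent"))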
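For readers who do not have ggplot2 available, roughly the same figure can be drawn with base graphics. This is only a sketch of an alternative, not the method used in this post: the means, letters and standard deviations are typed in from the LSD.test output in section 3, and the styling is deliberately minimal.

```r
# Means, grouping letters and standard deviations copied from the LSD.test output.
re2 <- data.frame(
  virus  = c("oo", "ff", "cc", "fc"),
  yield  = c(36.90000, 36.33333, 24.40000, 12.86667),
  groups = c("a", "a", "b", "c"),
  sd     = c(4.300000, 7.333030, 3.609709, 2.159475)
)
re2 <- re2[order(re2$virus), ]  # alphabetical, like ggplot2's default axis order

top <- re2$yield + re2$sd       # upper end of each error bar

# barplot() returns the x positions of the bar midpoints.
mid <- barplot(re2$yield, names.arg = re2$virus,
               ylim = c(0, max(top) + 5),
               xlab = "virus", ylab = "yield")

# Error bars: mean +/- 1 SD, drawn as flat-headed arrows at both ends.
arrows(mid, re2$yield - re2$sd, mid, top, angle = 90, code = 3, length = 0.05)

# Grouping letters just above the error bars.
text(mid, top + 1.5, labels = re2$groups)
```

Reusing the midpoints returned by `barplot()` keeps the error bars and letters centered on their bars.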

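5. Complete code

library(agricolae)
library(dplyr)    # %>%, mutate(), inner_join()
library(ggplot2)

data(sweetpotato)
mod1 = aov(yield~virus,data=sweetpotato)
summary(mod1)

re = LSD.test(mod1,"virus")
re

re1 = re$groups
re1

# Compute the standard deviation (not the standard error) of yield for each virus treatment
xx = aggregate(yield ~ virus, sweetpotato,sd)
names(xx) = c("virus","sd")
xx

# Turn the row names into a virus column, then attach the standard deviations
re2 = re1 %>% mutate(virus = rownames(re1)) %>% inner_join(.,xx,by="virus")
re2

# Plot
## Draw the bar chart: error bars are mean +/- 1 SD, letters are the LSD groups
re2 %>% ggplot(aes(virus, yield)) +
  geom_col(aes(fill = virus), width = .4) +
  # linewidth needs ggplot2 >= 3.4.0; use size = .5 on older versions
  geom_errorbar(aes(ymax = yield + sd, ymin = yield - sd), width = .1, linewidth = .5) +
  geom_text(aes(label = groups, y = yield + sd + 1.5)) +
  theme(panel.grid = element_blank(), panel.background = element_rect(color = "black", fill = "transparent"))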

6. Finding R too hard? Try Genstat

6.1 Importing the data

6.2 Choosing the ANOVA model

Result:

Analysis of variance
 
Variate: yield
 
Source of variation	d.f.	s.s.	m.s.	v.r.	F pr.
virus	3	 1170.21	 390.07	 17.34	<.001
Residual	8	 179.91	 22.49	 	 
Total	11	 1350.12	 	 	 
 
 
Message: the following units have large residuals.
 
*units* 9	   -8.3	   s.e. 3.9
 
 
Tables of means
 
Variate: yield
 
Grand mean  27.6 
 
	virus	 cc	 fc	 ff	 oo
		 24.4	 12.9	 36.3	 36.9
 
 
Standard errors of differences of means
 
Table	virus	 
rep.	 3	 
d.f.	 8	 
s.e.d.	 3.87	 
 
 
 
Least significant differences of means (5% level)
 
Table	virus	 
rep.	 3	 
d.f.	 8	 
l.s.d.	 8.93
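The Genstat figures above can be cross-checked with a few lines of base R. The s.e.d. is sqrt(2 * residual m.s. / rep.) and the l.s.d. is that value times the two-sided 5% critical t. The large-residual message can be traced to the lowest ff plot (28.0, the Min for ff in the R `$means` table). The formula used for the standard error of a residual is an assumption about what Genstat reports, chosen because it is the standard result for a balanced one-way design.

```r
# Values copied from the Genstat output above.
resid_ms <- 22.49   # Residual m.s.
resid_df <- 8       # Residual d.f.
r        <- 3       # rep.

sed <- sqrt(2 * resid_ms / r)        # standard error of a difference
lsd <- qt(0.975, resid_df) * sed     # least significant difference, 5% level

# Largest residual: lowest ff plot minus the ff mean (from the R $means table).
resid_ff_min <- 28.0 - 36.33

# Assumed formula: s.e. of a residual in a balanced one-way design.
resid_se <- sqrt(resid_ms * (1 - 1 / r))

print(round(c(s.e.d. = sed, l.s.d. = lsd,
              residual = resid_ff_min, residual.se = resid_se), 2))
```

The rounded values line up with the 3.87, 8.93, -8.3 and 3.9 reported by Genstat, and the l.s.d. matches the LSD from `LSD.test` in R.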

6.3 Multiple comparisons

Fisher's protected least significant difference test
 
 
virus
 
 
		Mean	 
	oo	 36.90	 a
	ff	 36.33	 a
	cc	 24.40	 b
	fc	 12.87	 c

6.4 Visualizing the results

Result:

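Follow my WeChat public account: 育种数据分析之放飞自我. It mainly shares material on R, Python, breeding data analysis, biostatistics, quantitative genetics, mixed linear models, GWAS and GS.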
