Among quantitative tests for two unpaired groups, the best known is probably Student's t-test.
I previously wrote an article about Student's t-test.
However, Student's t-test assumes equal variances, so the standard advice is to use Welch's t-test, which also copes with unequal variances.
Both of these tests assume normally distributed data, though, so when normality cannot be assumed the Mann-Whitney U test is widely used instead.
The Mann-Whitney U test is well known as a nonparametric test that does not assume normality, but it is less well known that it does not work well when the variances are unequal.
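Today I introduce the Brunner-Munzel test, the "strongest" test, which solves all of these problems by assuming neither normality nor equal variances, and I investigate its accuracy (how closely its type I error rate matches the nominal level).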
The Brunner-Munzel test
Professor Okumura's website covers the theoretical side of the Brunner-Munzel test in detail.
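In short, the Mann-Whitney U test assumes that the two groups have distributions of the same shape, whereas the Brunner-Munzel test does not assume identical distributions. It tests the null hypothesis that, when one value is drawn from each group, either one is equally likely to be the larger.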
To run the Brunner-Munzel test in R, use the brunner.munzel.test() function from the lawstat package.
    library(lawstat)
    x = c(1,2,1,1,1,1,1,1,1,1,2,4,1,1)
    y = c(3,3,4,3,1,2,3,1,1,5,4)
    brunner.munzel.test(x,y)

    ## Brunner-Munzel Test
    ##
    ## data: x and y
    ## Brunner-Munzel Test Statistic = 3.138, df = 17.68, p-value = 0.005786
    ## 95 percent confidence interval:
    ## 0.5952 0.9827
    ## sample estimates:
    ## P(X<Y)+.5*P(X=Y)
    ## 0.789
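The sample estimate at the bottom of that output, P(X<Y)+.5*P(X=Y), can be reproduced by hand in base R. This is only a sketch: it compares every pair of values directly, which is not necessarily how lawstat computes the estimate internally, and the names `less`, `tie` and `p_hat` are made up for illustration.

```r
x <- c(1,2,1,1,1,1,1,1,1,1,2,4,1,1)
y <- c(3,3,4,3,1,2,3,1,1,5,4)

# Compare every x with every y (each result is a 14 x 11 logical matrix)
less <- outer(x, y, "<")   # TRUE where x[i] <  y[j]
tie  <- outer(x, y, "==")  # TRUE where x[i] == y[j]

# Estimate of P(X<Y) + 0.5 * P(X=Y): share of "wins" plus half the share of ties
p_hat <- mean(less) + 0.5 * mean(tie)
round(p_hat, 3)  # 0.789, the same value as the sample estimate above
```

Under the null hypothesis this quantity is 0.5, so a value near 0.79 says that a value drawn from y tends to be larger than one drawn from x.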
In this article I investigate how accurate the Brunner-Munzel test is when the variances are unequal.
For comparison, I also investigate the accuracy of Student's t-test, Welch's t-test, and the Mann-Whitney U test.
The assumptions of each test are as follows.
| Method | Normality | Equal variances |
|---|---|---|
| Student's t | required | required |
| Welch's t | required | not required |
| Mann-Whitney | not required | required |
| Brunner-Munzel | not required | not required |
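1. Normal distribution
First, I investigate the accuracy for normal distributions. A normal distribution with standard deviation 1 is tested against normal distributions whose standard deviation is varied over 1/4, 1/2, 3/4, 1, 4/3, 2, 4. Each test is repeated many times to estimate the type I error rate (alpha error rate).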
    library(pforeach)
    library(lawstat)
    library(tidyr)
    library(dplyr)
    library(ggplot2)

    sigma_ratio <- c("1/4", "1/2", "3/4", "1", "4/3", "2", "4")
    sigma1 <- c(1/4, 1/2, 3/4, 1, 4/3, 2, 4)
    sigma2 <- 1
    iter_num <- 2000

    # For each standard deviation, repeat the four tests and record rejections at the 5% level
    result <- pforeach(sigma=sigma1, .c=rbind)({
      res <- npforeach(i=seq_len(iter_num), .c=rbind)({
        # Both groups have mean 5, so every rejection is a type I error
        x1 <- rnorm(15, mean = 5, sd = sigma)
        x2 <- rnorm(45, mean = 5, sd = sigma2)
        student <- t.test(x1, x2, var.equal = TRUE)$p.value <= 0.05
        welch <- t.test(x1, x2, var.equal = FALSE)$p.value <= 0.05
        MW <- wilcox.test(x1,x2)$p.value <= 0.05
        BM <- brunner.munzel.test(x1, x2)$p.value <= 0.05
        data.frame(`Student's t`=student, `Welch's t`=welch, `Mann-Whitney`=MW, `Brunner-Munzel`=BM, check.names = FALSE)
      })
      colSums(res)/iter_num
    })

    # Reshape to long format and plot the error rate per test method
    data <- result %>%
      data.frame(check.names = FALSE) %>%
      cbind(sigma=factor(sigma_ratio, levels=sigma_ratio)) %>%
      gather(`Test Method`, `Alpha Error Rate`, 1:4)
    ggplot(data, aes(x=sigma, y=`Alpha Error Rate`, group=`Test Method`, color=`Test Method`)) +
      geom_line() + geom_point() + ggtitle("Normal Dist.")
Because the tests are run at a significance level of 0.05, the alpha error rate should be 0.05 in every case. When the standard deviations differ, however, Student and Mann-Whitney deviate far from 0.05, so their accuracy is very poor.
In contrast, Welch and Brunner-Munzel keep the alpha error rate at 0.05 even when the standard deviations differ, so their accuracy is very good.
Well, this much is what theory predicts.
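The simulation above relies on pforeach and lawstat. For readers who just want to see the idea, here is a minimal base R sketch of the same Monte Carlo estimate, limited to the three tests that ship with R. The function name `alpha_rate`, the seed, and the reduced iteration count of 500 are my own assumptions for illustration, not part of the original simulation.

```r
set.seed(1)  # assumed seed, only so that reruns give the same numbers

# Estimate the type I error rate when both groups share the same mean (5)
# but may have different standard deviations.
alpha_rate <- function(sd1, sd2 = 1, n1 = 15, n2 = 45, iter = 500, level = 0.05) {
  rejected <- replicate(iter, {
    x1 <- rnorm(n1, mean = 5, sd = sd1)
    x2 <- rnorm(n2, mean = 5, sd = sd2)
    c(student = t.test(x1, x2, var.equal = TRUE)$p.value <= level,
      welch   = t.test(x1, x2, var.equal = FALSE)$p.value <= level,
      mw      = wilcox.test(x1, x2)$p.value <= level)
  })
  rowMeans(rejected)  # share of rejections per test
}

# Rows are tests, columns are the standard deviation of the first group
rates <- sapply(c(0.25, 1, 4), alpha_rate)
colnames(rates) <- c("1/4", "1", "4")
print(rates)
```

With only 500 iterations the estimates are noisy, so treat the output as a rough picture rather than a replacement for the full run.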
2. Continuous uniform distribution
So what happens when the normality assumption is broken?
Next, I run the same simulation as for the normal distribution, this time on continuous uniform distributions.
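That is, a continuous uniform distribution with standard deviation 1 is tested against continuous uniform distributions whose standard deviation is varied over 1/4, 1/2, 3/4, 1, 4/3, 2, 4, and each test is repeated to estimate the type I error rate.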
    library(pforeach)
    library(lawstat)
    library(tidyr)
    library(dplyr)
    library(ggplot2)

    sigma_ratio <- c("1/4", "1/2", "3/4", "1", "4/3", "2", "4")
    sigma1 <- c(1/4, 1/2, 3/4, 1, 4/3, 2, 4)
    # Uniform(a, b) with mean 5 and standard deviation sigma has a, b = 5 -/+ sqrt(3) * sigma
    a1 <- 5 - sqrt(3) * sigma1
    b1 <- 5 + sqrt(3) * sigma1
    a2 <- 5 - sqrt(3)
    b2 <- 5 + sqrt(3)
    iter_num <- 2000

    result <- pforeach(a=a1, b=b1, .c=rbind)({
      res <- npforeach(i=seq_len(iter_num), .c=rbind)({
        x1 <- runif(15, min = a, max = b)
        x2 <- runif(45, min = a2, max = b2)
        student <- t.test(x1, x2, var.equal = TRUE)$p.value <= 0.05
        welch <- t.test(x1, x2, var.equal = FALSE)$p.value <= 0.05
        MW <- wilcox.test(x1,x2)$p.value <= 0.05
        BM <- brunner.munzel.test(x1, x2)$p.value <= 0.05
        data.frame(`Student's t`=student, `Welch's t`=welch, `Mann-Whitney`=MW, `Brunner-Munzel`=BM, check.names = FALSE)
      })
      colSums(res)/iter_num
    })

    data <- result %>%
      data.frame(check.names = FALSE) %>%
      cbind(sigma=factor(sigma_ratio, levels=sigma_ratio)) %>%
      gather(`Test Method`, `Alpha Error Rate`, 1:4)
    ggplot(data, aes(x=sigma, y=`Alpha Error Rate`, group=`Test Method`, color=`Test Method`)) +
      geom_line() + geom_point() + ggtitle("Uniform Dist.")
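For reference, the sqrt(3) in the endpoints of that code comes from the variance of the continuous uniform distribution. Writing m for the mean and σ for the target standard deviation (my notation, not the original article's):

$$
X \sim U(a, b): \quad \mathrm{E}[X] = \frac{a + b}{2}, \qquad \mathrm{Var}[X] = \frac{(b - a)^2}{12}
$$

$$
a = m - \sqrt{3}\,\sigma,\; b = m + \sqrt{3}\,\sigma
\;\Longrightarrow\;
\mathrm{E}[X] = m, \qquad \mathrm{Var}[X] = \frac{(2\sqrt{3}\,\sigma)^2}{12} = \sigma^2
$$

With m = 5 this gives exactly the a1, b1, a2 and b2 used in the simulation.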
The result was the same as for the normal distribution.
When the standard deviations differ, Student and Mann-Whitney are very inaccurate, while Welch and Brunner-Munzel stay accurate.
In fact, Welch's t-test is known to remain accurate even under a moderate degree of distortion in the distribution.*1
For that reason, some people argue that Welch should be preferred to Mann-Whitney even when normality cannot be assumed.
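3. Log-normal distribution (equal means)
The two distributions above are symmetric, so as a check on skewed distributions I consider the log-normal distribution.
For a log-normal distribution, the mean and the median differ.
First, for log-normal distributions with the same mean, I estimate the type I error rate when the standard deviation of one group is 1/4, 1/2, 3/4, 1, 4/3, 2, 4 times that of the other.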
    library(pforeach)
    library(lawstat)
    library(tidyr)
    library(dplyr)
    library(ggplot2)

    sigma_ratio <- c("1/4", "1/2", "3/4", "1", "4/3", "2", "4")
    # Target standard deviations of the two groups (their ratio matches sigma_ratio)
    sigma1 <- c(1, 2, 3, 4, 4, 4, 4)
    sigma2 <- c(4, 4, 4, 4, 3, 2, 1)
    # Convert target mean 5 and standard deviation sigma into sdlog and meanlog
    sigmas1 <- sqrt(log((sigma1^2)/(5^2) + 1))
    sigmas2 <- sqrt(log((sigma2^2)/(5^2) + 1))
    mus1 <- log(5) - (sigmas1^2)/2
    mus2 <- log(5) - (sigmas2^2)/2
    iter_num <- 2000

    result <- pforeach(mu1=mus1, sigma1=sigmas1, mu2=mus2, sigma2=sigmas2, .c=rbind)({
      res <- npforeach(i=seq_len(iter_num), .c=rbind)({
        x1 <- rlnorm(15, meanlog = mu1, sdlog = sigma1)
        x2 <- rlnorm(45, meanlog = mu2, sdlog = sigma2)
        student <- t.test(x1, x2, var.equal = TRUE)$p.value <= 0.05
        welch <- t.test(x1, x2, var.equal = FALSE)$p.value <= 0.05
        MW <- wilcox.test(x1,x2)$p.value <= 0.05
        BM <- brunner.munzel.test(x1, x2)$p.value <= 0.05
        data.frame(`Student's t`=student, `Welch's t`=welch, `Mann-Whitney`=MW, `Brunner-Munzel`=BM, check.names = FALSE)
      })
      colSums(res)/iter_num
    })

    data <- result %>%
      data.frame(check.names = FALSE) %>%
      cbind(sigma=factor(sigma_ratio, levels=sigma_ratio)) %>%
      gather(`Test Method`, `Alpha Error Rate`, 1:4)
    ggplot(data, aes(x=sigma, y=`Alpha Error Rate`, group=`Test Method`, color=`Test Method`)) +
      geom_line() + geom_point() + ggtitle("Log Normal Dist.(Mean)")
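The conversion from a target mean and standard deviation to meanlog and sdlog in that code follows from the moments of the log-normal distribution. Here μ and s stand for meanlog and sdlog, and m and σ for the target mean and standard deviation (again my notation):

$$
\mathrm{E}[X] = e^{\mu + s^2/2}, \qquad
\mathrm{Var}[X] = \left(e^{s^2} - 1\right) e^{2\mu + s^2} = \left(e^{s^2} - 1\right) \mathrm{E}[X]^2
$$

$$
\mathrm{E}[X] = m,\; \mathrm{Var}[X] = \sigma^2
\;\Longrightarrow\;
s^2 = \log\!\left(1 + \frac{\sigma^2}{m^2}\right), \qquad
\mu = \log m - \frac{s^2}{2}
$$

Setting m = 5 gives the sigmas1, sigmas2, mus1 and mus2 expressions used in the simulation. Note that the median, e^μ, then differs between the two groups whenever their standard deviations differ.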
Student and Mann-Whitney are inaccurate as before, but the Brunner-Munzel test is now inaccurate as well.
Welch holds up better, but its accuracy is still not very good.
4. Log-normal distribution (equal medians)
Next, for log-normal distributions with the same median, I again estimate the type I error rate when the standard deviation of one group is 1/4, 1/2, 3/4, 1, 4/3, 2, 4 times that of the other.
    library(pforeach)
    library(lawstat)
    library(tidyr)
    library(dplyr)
    library(ggplot2)

    sigma_ratio <- c("1/4", "1/2", "3/4", "1", "4/3", "2", "4")
    sigma1 <- c(1, 2, 3, 4, 4, 4, 4)
    sigma2 <- c(4, 4, 4, 4, 3, 2, 1)
    # Convert target median 5 and standard deviation sigma into sdlog; meanlog is log(median)
    sigmas1 <- sqrt(log((1 + sqrt((4 * sigma1^2)/25 + 1))/2))
    sigmas2 <- sqrt(log((1 + sqrt((4 * sigma2^2)/25 + 1))/2))
    mu1 <- log(5)
    mu2 <- log(5)
    iter_num <- 2000

    result <- pforeach(sigma1=sigmas1, sigma2=sigmas2, .c=rbind)({
      res <- npforeach(i=seq_len(iter_num), .c=rbind)({
        x1 <- rlnorm(15, meanlog = mu1, sdlog = sigma1)
        x2 <- rlnorm(45, meanlog = mu2, sdlog = sigma2)
        student <- t.test(x1, x2, var.equal = TRUE)$p.value <= 0.05
        welch <- t.test(x1, x2, var.equal = FALSE)$p.value <= 0.05
        MW <- wilcox.test(x1,x2)$p.value <= 0.05
        BM <- brunner.munzel.test(x1, x2)$p.value <= 0.05
        data.frame(`Student's t`=student, `Welch's t`=welch, `Mann-Whitney`=MW, `Brunner-Munzel`=BM, check.names = FALSE)
      })
      colSums(res)/iter_num
    })

    data <- result %>%
      data.frame(check.names = FALSE) %>%
      cbind(sigma=factor(sigma_ratio, levels=sigma_ratio)) %>%
      gather(`Test Method`, `Alpha Error Rate`, 1:4)
    ggplot(data, aes(x=sigma, y=`Alpha Error Rate`, group=`Test Method`, color=`Test Method`)) +
      geom_line() + geom_point() + ggtitle("Log Normal Dist.(Median)")
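The sdlog formula in that code is less obvious than the equal-mean one, so here is a small base R check that it really yields median 5 and the requested standard deviation. It only uses the closed-form moments of the log-normal distribution; the variable names `target_sd`, `sdlog`, `meanlog` and `sd_ln` are made up for this sketch.

```r
target_sd <- c(1, 2, 3, 4)

# sdlog from the formula used in the simulation (median fixed at 5)
sdlog <- sqrt(log((1 + sqrt((4 * target_sd^2) / 25 + 1)) / 2))
meanlog <- log(5)

# Closed-form standard deviation of a log-normal distribution
sd_ln <- sqrt((exp(sdlog^2) - 1) * exp(2 * meanlog + sdlog^2))

all.equal(sd_ln, target_sd)                         # TRUE
all.equal(qlnorm(0.5, meanlog, sdlog), rep(5, 4))   # TRUE: the median is 5
```

The formula is the positive root of u^2 - u - sigma^2/25 = 0 with u = exp(sdlog^2), which is where the inner square root comes from.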
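This time Welch is inaccurate, while the Brunner-Munzel test stays accurate.
I think this is because the Brunner-Munzel test tests the null hypothesis that "when one value is drawn from each group, either one is equally likely to be the larger", which makes it a test of the median rather than of the mean.*2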
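One way to look at the difference between sections 3 and 4 is to compute P(X<Y) directly. For two independent log-normal variables, log(Y) - log(X) is normal, so the probability has a closed form. The sketch below is my own illustration under that assumption, and the helper name `p_less` is hypothetical.

```r
sd1 <- c(1, 2, 3, 4, 4, 4, 4)
sd2 <- c(4, 4, 4, 4, 3, 2, 1)

# P(X < Y) for independent log-normals: log(Y) - log(X) is normal with
# mean mu2 - mu1 and variance s1^2 + s2^2
p_less <- function(mu1, s1, mu2, s2) pnorm((mu2 - mu1) / sqrt(s1^2 + s2^2))

# Section 3: equal means (5), so meanlog differs between the groups
s1 <- sqrt(log(sd1^2 / 25 + 1))
s2 <- sqrt(log(sd2^2 / 25 + 1))
p_mean <- p_less(log(5) - s1^2 / 2, s1, log(5) - s2^2 / 2, s2)

# Section 4: equal medians (5), so meanlog is log(5) in both groups
t1 <- sqrt(log((1 + sqrt(4 * sd1^2 / 25 + 1)) / 2))
t2 <- sqrt(log((1 + sqrt(4 * sd2^2 / 25 + 1)) / 2))
p_median <- p_less(log(5), t1, log(5), t2)

round(rbind(p_mean, p_median), 3)
```

In the equal-median setting P(X<Y) is exactly 0.5 for every ratio, so the Brunner-Munzel null hypothesis holds. In the equal-mean setting it is 0.5 only when the standard deviations are equal, so there the Brunner-Munzel test is rejecting a null hypothesis that is not actually true.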
Summary
I investigated the accuracy of the following four tests under unequal variances:
- Student's t-test
- Welch's t-test
- Mann-Whitney U test
- Brunner-Munzel test
As theory predicts, Student's t-test and the Mann-Whitney U test became inaccurate under unequal variances.
Welch's t-test and the Brunner-Munzel test stayed sufficiently accurate for distributions without much skew.
Moreover, even for skewed distributions, the Brunner-Munzel test stayed sufficiently accurate when used as a test of the median.
Welch's t-test was not very accurate for skewed distributions.
As a test of the mean on skewed distributions, the Brunner-Munzel test was less accurate than Welch.
But how often does one really need to test the mean of a skewed distribution?
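My original conclusion here was that the Brunner-Munzel test is therefore the strongest test in almost all cases. I have since revised it (see the addendum below):
A better conclusion is probably to use Welch for testing means and Brunner-Munzel for testing medians.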
It does no harm to keep the Brunner-Munzel test in mind.
Despite being such a strong test, the Brunner-Munzel test does not seem to be well known.
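@hoxo_m If I remember correctly, the story was that its creator deliberately submitted it to an obscure journal they were involved with, in order to raise that journal's profile, and that is why it is not well known.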
— motivic (@motivic_) January 23, 2015
So, let's make more use of the Brunner-Munzel test, a minor but strongest statistical test!
That's all.
Addendum
Professor Shimizu, well known for the free statistical analysis program HAD, kindly gave me valuable feedback.
Welch's test is a test of the difference in means, so I think it is meaningless to evaluate its accuracy on data with equal medians... With these results, the conclusion should be Welch for means and Brunner-Munzel for medians. I think the final conclusion is a bit extreme. RT
— Hiroshi Shimizu (@simizu706) February 17, 2015
Hmm, that's true. I exaggerated a bit.
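It seems best to use Welch when the interest is in the mean and Brunner-Munzel when it is in the median.
In addition, the Brunner-Munzel test has the problem that it does not work well with small samples, and methods such as the permuted Brunner-Munzel test*3 have been proposed.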
Addendum 2
The free statistical analysis program HAD can now run the Brunner-Munzel test! That was fast!
I have uploaded HAD13.100. It can now run the Brunner-Munzel test.
http://t.co/bX48q5ZqJh
— Hiroshi Shimizu (@simizu706) February 18, 2015
References
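Brunner-Munzel検定 (The Brunner-Munzel test)
マン・ホイットニーのU検定と不等分散時における代表値の検定法 (The Mann-Whitney U test and tests of central tendency under unequal variances)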
http://www.researchgate.net/publication/226351592_The_two-sample_t_test_pre-testing_its_assumptions_does_not_pay_off
RPubs - 対数正規分布から、平均値(中央値)と標準偏差を指定して乱数を生成させる (Generating log-normal random numbers with a specified mean (or median) and standard deviation)
*1:http://www.researchgate.net/publication/226351592_The_two-sample_t_test_pre-testing_its_assumptions_does_not_pay_off
*2: Calling it a test of the median is apparently incorrect. https://twitter.com/simizu706/status/567970403936649216



