é層ãã¤ãºã¨ç¶æ 空éã¢ãã«ã§åºåå¹æãæ¨å®ããã
èæ¯
ããã¾ã§Marketing Mix Modelingï¼MMMï¼ã«ãããAdStockå¹æã®æ¨å®ã«ã¤ãã¦è²ã ã¨è¨äºãæ¸ãã¦ãã¾ãããããã®ä»ã«ã試ãããã¨æã£ã¦ããã¢ãã«ãããã¤ãããã¾ãããã®ä¸ã¤ãé層ãã¤ãºã¢ãã«ã¨ç¶æ 空éã¢ãã«ãåæã«åãæ±ããã®ã§ãã
ä¾ãã°ãå°åå¥ã®å£²ä¸æ¨ç§»ã®ãã¼ã¿ããæå ã«ããã¨èãã¦ã¿ã¾ããããå°åã§ã¯ãªã人ãååã§ãæ§ãã¾ããããããè¦å ã®åæ°´æºãããããæç³»åãã¼ã¿ãæã£ã¦ããç¶æ³ï¼ããããããã«ãã¼ã¿ï¼ã§ãã²ã¨ã¾ãããã§ã¯å°åã¨ãã¾ãããã®ãããªãã¼ã¿ã¯ããããä¼ç¤¾ã§ä¿æãã¦ãããã¨ã§ãããã ä»ãåå°åã«ã¤ãã¦MMMã«ããåºåå¹æãæ¨å®ãããã¨ãèããã¨ããã©ã®ãããªã¢ããªã³ã°ãå¯è½ã§ããããï¼
ã·ã³ãã«ã«èããã°ãå°åãã¨ã«ä¸ã¤ãã¤ã¢ãã«ãä½ãã¨ããæ¹æ³ãæãããã¾ããä¾ãã°å°åã®æ°ã2ã¤3ã¤ãããªãã£ãããã¢ãã«ã®ä½æã«æéãããããã¨ãå¯è½ã§ããã°ãã®æ¹æ³ã¯æå¹ããããã¾ããããããåé¡ãããã¾ããå°åãã¨ã«ç¬ç«ãã¦ã¢ãã«ã使ããã¨ãããã¨ã¯ãåå°åã®ãã©ã¡ã¼ã¿ã¯äºãã«ç¬ç«ã§ããã¨ã®ä»®å®ãç½®ããã¨ã«ãªãã®ã§ãããããã¯äºå®ã§ããããï¼
確ãã«TVãæ°èã«ã¯å°æ¹æ¬ããã¼ã«ã«å±ããããå°åéå®ã®ã³ã³ãã³ããæä¾ãããå ´åãããã§ãããããããã©ã¡ããã¨è¨ãã°å ¨å½ã§åãã³ã³ãã³ããæä¾ãããå²åã®æ¹ã大ããã§ããããã購買è¡åã«å°åã«ããå·®ãããã»ã©èªããããã¨ã¯ãçµé¨çã«ãæãã¾ãããããä»®ã«å°åã«ããç°è³ªæ§ã¨ãããã®ãããã»ã©å¼·ããªãã®ã§ããã°ãããããããã£ãæ å ±ãç©æ¥µçã«æ´»ãããæ¹ãè¯ãã®ã§ã¯ãªãã§ããããï¼
ç®ç
ãã®ãããªæã«ããã©ã¡ã¼ã¿éã«ç·©ãããªç¸ãããããæ¹æ³ã¨ãã¦é層ãã¤ãºã¢ãã«ã¨ããææ³ãããã¾ãã詳ããã¯æ¸ç±1ãåèã«ãã¦é ãã¨ãã¦ãããã§ã¯ãåå°åã®åºå广ã表ããã©ã¡ã¼ã¿ãçæããåå¸ãä»®å®ãããæ¹æ³ã¨èª¬æãã¦ããã¾ãããã®åå¸ã®å¹
ï¼ãã©ã¤ãã忣ï¼ãååã«å¤§ãããã°åºå广ã¯å°åã«ãã£ã¦æ§ã
ãªå¤ãåãå¾ã¾ãããéã«åå¸ã®å¹
ãæ¥µç«¯ã«çããã°å°åãã¨ã®åºå广ã«å·®ã¯ãªããã¨ãæå³ãã¾ãã
ãã®é層æ§é ãæã¤ã¢ãã«ããæç³»åãã¼ã¿ã®åæã§ã馴æã¿ã®ç¶æ 空éã¢ãã«ã¨åããã¦åãæ±ãããã¨ããã®ãä»åã®è©¦ã¿ã§ããå ·ä½çã«ã¯ã以ä¸ã®ãããªã¢ãã«ãæ³å®ãã¦ãã¾ãã
æ·»åã®Aããã³tã¯ããããå°åï¼Areaï¼ã¨æç¹ï¼timeï¼ãæå³ãã¦ãããã¯ãããå°å
Aã®tæç¹ã«ããã観測å¤ãã¨ãªãã¾ãã
ãã¤ã³ãã¯4è¡ç®ã§ãå°åãã¨ã®åºå广ã«å¯¾ãã¦ããã®çæå
ã¨ãªãåå¸ï¼ããã§ã¯æ£è¦åå¸ï¼ãä»®å®ãã¦ãã¾ãããã
ã大ãããã°
ã¯æ§ã
ãªå¤ãåãå¾ã¾ãããå¤ãå°ãããã°é常ã«ä¼¼éã£ãå¤ã«ãªããã¨ãçè§£ã§ããã¨æãã¾ãã
ã©ã¤ãã©ãªã®èªã¿è¾¼ã¿
ä»åã®åæã§ã¯{rstan}ã使ç¨ãã¾ããé層ãã¤ãº + ç¶æ
空éã¢ãã«ã¨ãã£ãé常ã«è¤éãªã¢ãã«ã表ç¾ã§ããã®ãå¼·ã¿ã§ãã
library(tidyverse) library(ggplot2) library(rstan) options(mc.cores = parallel::detectCores()) rstan_options(auto_write = TRUE)
ã·ãã¥ã¬ã¼ã·ã§ã³ãã¼ã¿ã®çæ
åæç¨ã®ãã¼ã¿ãçæãã¾ãããæ¤è¨¼ã®ãã¤ã³ãã¨ãã¦ã¯ä»¥ä¸ã®ããã«ãªãã¾ãï¼
- 5å°åï¼
num_regionï¼ãã4å¹´åï¼num_yearï¼ã®ææ¬¡ï¼num_monthï¼ãã¼ã¿ãåéããåè¨240ç¹ã®è¦³æ¸¬å¤ãå¾ãã¨æ³å® - åå¸°ä¿æ°ï¼åºå广ï¼ã¯å¹³å
0.5ã忣0.01ã®æ£è¦åå¸ã«ãããã£ã¦çºç - ç¶æ 夿°ã®ãã©ã¡ã¼ã¿ã¯åå°åã§å ±é
## ã·ã¼ãã®åºå® set.seed(123) ## 観測å¤ã®æ° num_region <- 5 num_month <- 12 num_year <- 4 data_length <- num_month * num_year ## åå¸°ä¿æ° mu_beta <- 0.5 var_beta <- 0.01 beta_ad <- rnorm(num_region, mu_beta, sqrt(var_beta)) beta_ad_all <- rep(beta_ad, each = data_length) ## åå¸ã®è¨å® # ç¶æ 夿°ã®åæå¤ state_t0 <- 3 var_state_t0 <- 1 # ç¶æ 夿°ã®åæ£ var_state <- 0.01 # 誤差é var_error <- 0.01 ## 説æå¤æ°X scale_x <- 5 zero_per <- 0.2 X <- ifelse(runif(num_region * data_length) > zero_per, 1, 0) * rpois(num_region * data_length, scale_x)
ä¸è¨ã®æ¡ä»¶ã«å¾ããã¼ã¿ãçºçããã¾ããç¶æ
夿°ãé¤ãã°å
ã«ç¤ºããã¢ãã«ã®éãç°¡åãªç·å½¢ã¢ãã«ã§ããArea_IDã¯ãå¾ã»ã©Stanã§ãã£ããã£ã³ã°ãè¡ãããã«å¿
è¦ã¨ãªã夿°ã§ãYMã¯æç»ç¨ã§ãã
## ç¶æ 夿° State <- matrix(0, nrow = num_region * data_length) for (i in 1:num_region) { for (j in 1:data_length) { if (j == 1) { ## 1, 49, 97, 145, 193è¡ç®ãåå°åã®å é ã¨ãªã State[(i-1) * data_length + j] <- rnorm(1, state_t0, sqrt(var_state_t0)) } else { State[(i-1) * data_length + j] <- State[(i-1) * data_length + j - 1] + rnorm(1, 0, sqrt(var_state)) } } } ## ç®ç夿° error <- rnorm(num_region * data_length, 0, sqrt(var_error)) Y <- State + X * beta_ad_all + error ## å°åãã¨ã®ID Area_ID <- 1:num_region DF <- data.frame( "Area_ID" = rep(Area_ID, each = data_length), "YM" = rep(1:data_length, time = num_region), "Y" = Y, "X" = X, "True_s" = State, "True_e" = error)
ããã§è¦³æ¸¬å¤ã¨ç¶æ 夿°ãåå°åã§ã©ã®ããã«æ¨ç§»ãã¦ããã確èªãã¦ããã¾ãã
DF %>% gather("Var", "Val", -c(Area_ID, YM)) %>% filter(Var %in% c("Y", "True_s")) %>% ggplot(., aes(x = YM, y = Val, color = Var)) + geom_line() + facet_wrap(~Area_ID)

ã¾ãããã®æç¹ã§å°åãã¾ã¨ãã¦lmã§æ¨å®ãã$\beta$ãã©ã®ããã«ãªããã確èªãã¦ããã¾ãããã
> coef(lm(Y ~ X, data = DF)) (Intercept) X 3.8088184 0.5003114
ã¾ãå°åãã¨ã«ãã¼ã¿ãåå²ããå ´åã試ãã¦ã¿ã¾ãã
results <- list() for (i in Area_ID) { results[[as.character(i)]] <- coef(lm(Y ~ X, data = DF, subset = Area_ID == i)) }
> print(cbind(do.call("rbind", results), beta_ad)) (Intercept) X beta_ad 1 5.066906 0.4292258 0.4439524 2 2.700195 0.4837768 0.4769823 3 3.764447 0.6628493 0.6558708 4 2.953617 0.4988667 0.5070508 5 4.051693 0.5314325 0.5129288
ããããè¿ãå¤ãæ¨å®ããã¦ãã¾ããï¼æ±
Stanã«ãããã£ããã£ã³ã°
æ°ãåãç´ãã¦ãStanãç¨ãããã£ããã£ã³ã°ãå®è¡ãã¦ã¿ã¾ããStanã«æ¸¡ããã¼ã¿ã¯ä»¥ä¸ã®éãã§ãã
dat_Stan <- list(N = data_length, K = num_region, Y = DF$Y, X = DF$X, Area_ID = DF$Area_ID)
ã¾ãStanã®ã¹ã¯ãªããã¯ä»¥ä¸ã®ããã«ãªãã¾ãã
data { int N; // å°åãã¨ã®è¦³æ¸¬å¤ã®æ°ï¼data_lengthï¼ int K; // å°åã®æ°ï¼num_regionï¼ vector[N*K] Y; // 観測å¤ã®ãã¯ãã« vector[N*K] X; // 説æå¤æ°ã®ãã¯ãã« int<lower=1, upper=K> Area_ID[N*K]; } parameters { real state_t0; // ç¶æ 夿°ã®0æç®ã®å¹³å vector[N*K] state; // ç¶æ 夿°ã®ãã¯ãã« real beta_0; // åå¸°ä¿æ°ã®äºååå¸ã®å¹³å vector[K] beta; // å°åãã¨ã®åå¸°ä¿æ°ã®ãã¯ãã« real<lower=0> var_state_t0; // ç¶æ 夿°ã®0æç®ã®åæ£ real<lower=0> var_state; // ç¶æ 夿°ã®åæ£ real<lower=0> var_beta_0; // åå¸°ä¿æ°ã®äºååå¸ã®åæ£ real<lower=0> var_error; // èª¤å·®åæ£ } model { // ç¶æ 夿°ããµã³ããªã³ã° for (k in 1:K) { // 1æç®ã®å¤ã¯0æç®ã®åå¸ãããµã³ãã«ãã state[1 + (k-1)*N] ~ normal(state_t0, var_state_t0); // 2æç®ä»¥éã¯åæã®å¤ãå¹³åã¨ããåå¸ãããµã³ãã«ãã for(i in 2:N) { state[i + (k-1)*N] ~ normal(state[i-1 + (k-1)*N], var_state); } } // åå¸°ä¿æ°ããµã³ããªã³ã° for (k in 1:K) { beta[k] ~ normal(beta_0, var_beta_0); } // Yããµã³ããªã³ã° for(i in 1:(N*K)) { Y[i] ~ normal(state[i] + beta[Area_ID[i]] * X[i], var_error); } }
ãã®ã¹ã¯ãªãããHB_SSM_Sim.stanã¨ãã¦ä¿åãããã£ããã£ã³ã°ãè¡ãã¾ãã
fit_01 <- stan(file = '/YourDirectory/HB_SSM_Sim.stan', data = dat_Stan, iter = 1000, chains = 4, seed = 123)
çµæã®ç¢ºèª
ãã£ããã£ã³ã°ãçµãã£ãã®ã§ãçµæãè¦ã¦ã¿ã¾ããããfit_01ãããµã³ãã«ãæ½åºãã¦å å·¥ãã¾ãã
## ãµã³ãã«ãæ½åºãã res_01 <- rstan::extract(fit_01) ## 該å½ãããã©ã¡ã¼ã¿ãåãåºã ests <- summary(fit_01)$summary t0_rows <- rownames(ests)[grep("state_t0", rownames(ests))] state_rows <- rownames(ests)[grep("state\\[", rownames(ests))] b0_rows <- rownames(ests)[grep("beta_0", rownames(ests))] beta_rows <- rownames(ests)[grep("beta\\[", rownames(ests))] ## ç¶æ 夿° state_par <- ests %>% data.frame %>% select(mean) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% state_rows) %>% mutate("Area" = DF$Area_ID) state_cmp <- data.frame( True = State, Est = state_par$mean, Area = as.factor(state_par$Area), YM = DF$YM )
ã¾ãã¯ç¶æ 夿°ã®æ¨å®çµæãè¦ã¦ã¿ã¾ããããå®éã®å¤ã¨æ¨å®å¤ã並ã¹ã¦ã¿ã¾ãã ï¼ãªã以éã®ä½æ¥ã®åã«ãµã³ãã«ã®traceããããã¨ãã¹ãã°ã©ã ã確èªããåæãã¦ããã¨å¤æãã¦ãã¾ãï¼
state_cmp %>% gather("Var", "Val", -c(Area, YM)) %>% ggplot(., aes(x = YM, y = Val, colour = Var)) + geom_line() + facet_wrap(~Area)

ãããããªã精度è¯ãæ¨å®ã§ãã¦ããããã§ããï¼ç¶æ 夿°ã®åæå¤ãåæ£ã®æ¨å®å¤ã¯ã©ãã§ããããï¼
> ests %>% data.frame %>% select(mean) %>% rename("Estimated" = mean) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% t0_rows) %>% mutate("True" = c(state_t0, var_state_t0)) %>% mutate("Simulated" = c( mean(DF[c(1, 49, 97, 145, 193), "True_s"]), var(DF[c(1, 49, 97, 145, 193), "True_s"]) )) %>% select(Par, True, Simulated, Estimated) Par True Simulated Estimated 1 state_t0 3 3.7429556 3.731479 2 var_state_t0 1 0.8521877 1.328192
ç¶æ
夿°åæå¤ã®å¹³åããã³åæ£ã®çã®å¤ããããã3ããã³1ã§ãã£ãã®ã«å¯¾ããçæããããã¼ã¿ã§ã¯3.7ããã³0.85ã§ãããæ¨å®ãããå¤ã¯3.7ããã³1.3ã§ã忣ãããéå¤§ã«æ¨å®ããã¦ããããã§ãããã ã以ä¸ã«ç¤ºãããã«ãæ¨å®å¤ã®95%ä¿¡ç¨åºéãçµæ§åºããããé¸è±ãã¦ããã¨ã¾ã§ã¯è¨ããªãããã§ãã
> ests %>% data.frame %>% select(X2.5., X97.5.) %>% rename("Lower95" = X2.5., "Upeer95" = X97.5.) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% t0_rows) %>% select(Par, everything()) Par Lower95 Upeer95 1 state_t0 2.3975622 5.027106 2 var_state_t0 0.5549254 3.525860
ç¶ãã¦åå¸°ä¿æ°ã¯ã©ãã§ããããã
beta_par <- ests %>% data.frame %>% select(mean) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% beta_rows) beta_cmp <- data.frame( True = beta_ad, Est = beta_par$mean ) ggplot(beta_cmp, aes(x = True, y = Est)) + geom_point() + coord_fixed()

ãã¡ãããäºåã«è¨å®ããåå¸°ä¿æ°ã¨æ¨å®å¤ãä¼¼éã£ã¦ããããã§ããæ°å¤ã確èªãã¦ããè¯ãç²¾åº¦ã§æ¨å®ã§ãã¦ãã¾ãã
ests %>% data.frame %>% select(mean) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% beta_rows) %>% bind_cols("True" = beta_ad) %>% select(Par, True, mean) Par True mean 1 beta[1] 0.4439524 0.4520343 2 beta[2] 0.4769823 0.4825661 3 beta[3] 0.6558708 0.6604340 4 beta[4] 0.5070508 0.5001494 5 beta[5] 0.5129288 0.5355829
æå¾ã«ãåå¸°ä¿æ°ã®äºååå¸ã«ã¤ãã¦ãè¦ã¦ããã¾ãããã
> ests %>% data.frame %>% select(mean) %>% mutate("Par" = rownames(ests)) %>% filter(Par %in% b0_rows) %>% bind_cols("True" = c(mu_beta, var_beta)) %>% select(Par, True, mean) Par True mean 1 beta_0 0.50 0.5184981 2 var_beta_0 0.01 0.1385978
åå¸°ä¿æ°ã®äºååå¸ã®åæ£ã¯ã¡ãã£ã¨å¤§ããæ¨å®ããã¦ããããã§ãããå¹³åã¯è¿ããå¤ã¨ãªã£ã¦ããããã§ãã
çµããã«
以ä¸ããé層ãã¤ãºã¨ç¶æ 空éã¢ãã«ãåããã¦åãæ±ããããã¨ãã試ã¿ã§ããããçµæã¨ãã¦ã¯æ¦ãæºè¶³ã®è¡ããã®ã«ãªã£ãã¨æãã¾ããåºç¤ã«æ¸ããéããé層ãã¤ãº + ç¶æ 空éã¢ãã«ã®ãããªé常ã«è¤éãªã¢ãã«ã§ãã£ã¦ããstanã使ãã°é常ã«ç°¡åã«æ¨å®ãå¯è½ã§ãã
ããã¾ã§åºåå¹æã®æ¨å®ã«é¢ãã¦ããã¤ãè¨äºãæ¸ãã¦ãã¾ããããå®ã¯ã´ã¼ã«ã¨ãªãã¢ãã«ã¨ãã¦ã¯ãããæ³å®ãã¦ãã¾ããããã¨ã¯ãã®ã¢ãã«ã«ãããã¾ã§ç´¹ä»ãã¦ãããããªAdStockå¹æã®æ¨å®ãçµã¿è¾¼ãã°è©¦ãã¦ã¿ããã£ãã¢ãã«ã¯ä¸éãå®äºãããã¨ã«ãªãã¾ãã
ãããã«ã¤ãã¦ã追ã£ã¦ç´¹ä»ãããã¨æãã¾ãã
-
ä¼åºå çã®ãã¤ãºã¢ããªã³ã°ã®ä¸çãä¹ ä¿å çã®ç·æ¬ãæ¾æµ¦ããã®ã¢ãã«æ¬ãªã©è¯æ¸ã¯ããããããã¾ãã↩