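Thinking about how many children a group needs in order to thrive
Note: this post contains spoilers for the manga Dr.STONE.
This is out of the blue, but I like the manga Dr.STONE.
I don't know much about physics or chemistry, so I can't claim to follow the content as I read, but watching the level of civilization rise little by little in a petrified world makes me think that science is slow, steady work and still a lot of fun.
One point did bother me while reading, though: after Byakuya and the others returned to Earth from space, Ishigami Village was born from just three couples.
Of course this is possible if each couple has a great many children, but I doubt that modern people could have 10 or 20 children each. And since there are only three couples, with few children everyone would quickly become a relative of everyone else, and it seems no new couples could be formed. Thinking along those lines, I started to wonder how harsh the conditions really were.
So in this post I would like to work out roughly how many children Byakuya and the others needed to have for the group to thrive in the long run.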
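Approach
To think this through, I run a simulation under the following conditions:
- Start with 3 men and 3 women (the first generation)
- Within a generation, one man and one woman form a couple
- If the numbers of men and women differ, the people left over leave no children
- Each couple has np children; np is the same for every couple in every generation
- The sex of each child is decided at random with a 1:1 male-to-female ratio
- Couples are monogamous and have no children with anyone other than their partner
- No couples are formed across generations
The conditions are on the strict side to keep the simulation simple. As they stand, however, couples between close relatives could form, so I add one more condition:
- A couple is not formed if their child's inbreeding coefficient would reach a given value or more
The inbreeding coefficient is a term used in fields such as population genetics and expresses the degree of inbreeding. In Japan, marriage between first cousins is legally allowed, and the inbreeding coefficient of a child of first cousins is 6.25%. Because the first generation is so small, I expect that making this limit too strict will quickly make it impossible to form couples.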
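As a quick check of these percentages, the inbreeding coefficient of a child can be sketched with Wright's path-counting formula, under the assumption that the common ancestors are not inbred themselves. The function `inbreeding_coef` and its arguments are made up for this illustration and are not part of the simulation program.

```r
# Hypothetical helper (not from the original program).
# n1, n2  : generations from each parent up to a common ancestor
# n_common: number of common ancestors reached through such paths
# Assumes the common ancestors themselves are not inbred.
inbreeding_coef <- function(n1, n2, n_common) {
  n_common * 0.5^(n1 + n2 + 1)
}

cousins     <- inbreeding_coef(2, 2, 2)  # first cousins share two grandparents
uncle_niece <- inbreeding_coef(1, 2, 2)  # uncle-niece or aunt-nephew
siblings    <- inbreeding_coef(1, 1, 2)  # full siblings share both parents
print(c(cousins = cousins, uncle_niece = uncle_niece, siblings = siblings))
```

First cousins give 0.0625, which matches the 6.25% quoted for cousin marriage.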
The program I wrote to run this simulation is as follows:

    generate_population <- function(np,               # number of children per couple
                                    n0 = 6,           # size of the first generation
                                    num_column = 6,   # number of columns in the results table
                                    G = 5,            # number of generations to simulate
                                    random_seed = 42, # seed for reproducibility
                                    rel_lim = 0.90    # relationship coefficient (not the inbreeding coefficient)
    ) {
      set.seed(random_seed) # random numbers are used to decide each child's sex
      pairs <- c()          # table that records the couples

      ### Table holding one record per person (ID, generation, sex, inbreeding coefficient, father, mother)
      res_mat <- matrix(0, n0, num_column)
      colnames(res_mat) <- c("ID", "Gen", "Sex", "Inbred", "Father", "Mother")

      ### Generate the first generation
      res_mat[, "ID"] <- 1:n0
      res_mat[, "Gen"] <- 1
      res_mat[, "Sex"] <- rep(c(1, 0), length.out = n0)

      ### Generate the following generations
      for(g in 1:G) {
        ### Evaluate the relationships within the current generation
        rel_mat <- create_rel_mat(res_mat)
        ### Decide which pairs become couples
        tmp_pairs <- make_pairs(g, res_mat, rel_mat, rel_lim)
        pairs <- rbind(pairs, tmp_pairs)
        ### Generate np children per couple
        tmp_res_mat <- generate_progenies(tmp_pairs, res_mat, g, np, num_column)
        ### Compute each child's inbreeding coefficient
        ### (half the relationship coefficient between the parents)
        tmp_res_mat <- calculate_inbred_coef(tmp_res_mat, rel_mat)
        res_mat <- rbind(res_mat, tmp_res_mat)
        # sprintf("The %d generation generated", g+1)
        print(g)
      }
      return(res_mat)
    }
Trial run
I will leave the explanation of the functions for later and simply run it first. Running the program gives the following result.

    source("/YourDirectory/my_functions.r") # file that defines the functions
    res_mat <- generate_population(np = 3, G = 15)

    > calc_num_pop(res_mat)
       Gen Population
    1    1          6
    2    2          9
    3    3          6
    4    4          9
    5    5          9
    6    6         12
    7    7         15
    8    8         18
    9    9         24
    10  10         33
    11  11         30
    12  12         39
    13  13         54
    14  14         60
    15  15         60
    16  16         87

    plot(calc_num_pop(res_mat), type = "l", xlab = "Generation g", ylab = "Population size")

The table and the graph summarize the number of people in each generation when the simulation is run under the conditions above.
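The helper `calc_num_pop` is not shown in this post. A minimal sketch of what it might do, assuming it only counts the rows of `res_mat` per generation, could look like the following; the name `calc_num_pop_sketch` and the toy matrix are made up for illustration.

```r
# Hypothetical re-implementation; the author's calc_num_pop is not shown.
# Assumes res_mat has a "Gen" column and one row per person.
calc_num_pop_sketch <- function(res_mat) {
  counts <- table(res_mat[, "Gen"])
  data.frame(Gen = as.integer(names(counts)),
             Population = as.integer(counts))
}

# Toy input: two people in generation 1, three in generation 2
toy <- cbind(ID = 1:5, Gen = c(1, 1, 2, 2, 2))
pop <- calc_num_pop_sketch(toy)
print(pop)
```

Returning a two-column data frame keeps the result usable with `plot(..., type = "l")`, as in the call above.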
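The population hardly grows up to the fifth generation, but from the sixth generation it gradually starts to increase. Three children per couple may seem a little high, but the total fertility rate during the first baby boom around 1949 (Showa 24) was apparently 4.32 *1, so it is not an impossible figure. If anything, since the simulation ignores deaths entirely, even three may be doubtful unless people have more children than that. Still, at this rate Ishigami Village looks set to be born without trouble. And they all lived happily ever after…
…is what I would like to say, but before drawing conclusions let's look at the inbreeding coefficients as well.

    > summary(res_mat[, "Inbred"])
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.0000  0.2956  0.3209  0.3051  0.3377  0.4053

The inbreeding coefficient turns out to be 30% on average and as high as 40% at most. An inbreeding coefficient of 25% corresponds to a parent-child couple, so this is a problem.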
Simulation
So here is the main topic. By changing rel_lim, which is passed to the simulation as a parameter, let's see whether the group can thrive and Ishigami Village can be born safely while keeping inbreeding from rising.
First, what happens if we allow couples up to first cousins, as Japanese law does (that is, an inbreeding coefficient of 6.25%, or a relationship coefficient of 12.5% between the parents)?

    > res_mat <- generate_population(np = 3, G = 15, rel_lim = 0.125)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

Partway through, no couples can be formed any more… This might depend on the seed, so let's try a few.

    > generate_population(np = 3, G = 15, rel_lim = 0.125, random_seed = 1)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

    > generate_population(np = 3, G = 15, rel_lim = 0.125, random_seed = 2)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

    > generate_population(np = 3, G = 15, rel_lim = 0.125, random_seed = 3)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

…It doesn't seem to work….
Incidentally, even without any limit on inbreeding (allowing couples between any relatives within a generation), the group can die out because of an imbalance between men and women. For example, as shown below, even with rel_lim set to 1 the simulation stops partway for some seeds.

    > res_mat <- generate_population(np = 3, G = 15, rel_lim = 1.0, random_seed = 5)
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

So Byakuya and the others were in a harsh situation from the start.
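The effect of the sex imbalance alone can be sketched with a little arithmetic. The code below ignores the relatedness limit entirely and only assumes the pairing rule used in the simulation (one man with one woman, sex decided 1:1 at random); the function name `expected_couples` is made up for this illustration.

```r
# Expected number of couples that n children can form when each child is
# male with probability 0.5 and every couple needs one man and one woman.
expected_couples <- function(n) {
  males <- 0:n
  sum(pmin(males, n - males) * dbinom(males, size = n, prob = 0.5))
}

e9  <- expected_couples(3 * 3)  # 3 couples with 3 children each
e15 <- expected_couples(3 * 5)  # 3 couples with 5 children each
print(c(np3 = e9, np5 = e15))
```

With np = 3, three couples are expected to yield only about 3.27 couples in the next generation, so under these assumptions the population barely grows at first and a run of unlucky draws can end it.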
Conversely, let's consider the np that is needed when the inbreeding condition is kept. I increase it one by one starting from 4:

    > res_mat <- generate_population(np = 4, G =15, rel_lim = 0.125)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

~~ rest omitted ~~

    > res_mat <- generate_population(np = 8, G =15, rel_lim = 0.125)
    [1] 1
    [1] 2
    [1] 3
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

When I raised np to 9 the computation would not finish, so I gave up. This condition (couples in line with Japanese law) appears to be quite strict. Having nine children is rather difficult in the first place.
Let's relax the inbreeding condition a little and allow up to 12.5%. This 12.5% corresponds to a couple of uncle and niece, or aunt and nephew.

    > res_mat <- generate_population(np = 4, G = 15, rel_lim = 0.250)
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    Error in make_pairs(g, res_mat, rel_mat, rel_lim) : Couldn't make any couple!

Four is not enough. How about five?

    > res_mat <- generate_population(np = 5, G = 15, rel_lim = 0.25)
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    [1] 6
    [1] 7

This time the computation did not finish…. I waited a whole night….
However, I did confirm that it reaches the eighth generation, so the chances look quite good. In other words, if
- every couple keeps having five children, and
- some inbreeding is tolerated (couples are formed even between fairly close relatives),
then Ishigami Village can apparently be born safely. Well done, Byakuya! *2
Let's also check what the inbreeding coefficients look like in this case.

    ### Stop after the seventh generation
    res_mat <- generate_population(np = 5, G = 7, rel_lim = 0.25)

    > calc_inb(res_mat)
      Gen Inbred_Coef
    1   1   0.0000000
    2   2   0.0000000
    3   3   0.0000000
    4   4   0.0625000
    5   5   0.0937500
    6   6   0.1132812
    7   7   0.1250000
    8   8   0.1191406

The mean inbreeding coefficient reaches the upper limit of 12.5% in the seventh generation, but decreases slightly in the eighth.

    > calc_num_pop(res_mat)
      Gen Population
    1   1          6
    2   2         15
    3   3         25
    4   4         50
    5   5        115
    6   6        245
    7   7        505
    8   8       1195

Looking at the number of people per generation, the population grows substantially by the eighth generation, so from here on we can expect couples that do not raise inbreeding to be formed steadily.
If I could write the program a little better I could follow what happens after this point too, but I can only curse my lack of implementation skills….
Closing
So I tried to verify whether a group could really thrive with the six people who landed on Earth as its founders, and the result suggests that it is hard but not impossible.
Next time I would like to explain the functions used in this verification in detail.
*1:https://www8.cao.go.jp/shoushi/shoushika/meeting/taikou_4th/k_1/pdf/ref1.pdf
*2: In the manga the members other than Byakuya appear to have died fairly soon, so in reality five children each would probably have been difficult as well.
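Understanding glmnet a little better ⑤
Continuing from the previous article, this post covers elnet1. The previous article is here.
- Loop ③ (estimating the regression coefficients)
- Loop ⑥ (estimating the regression coefficients, again)
- Loop ⑧ (updating the regression coefficients, yet again)
- Loop ⑨ (counting the variables with estimated coefficients)
- Closing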
Loop ③ (estimating the regression coefficients)
As we have seen so far, loops ① and ② update alm, that is lambda, and compute the penalty by multiplying it with alpha (alf) and penalty.factor (vp).
Loop ③ uses this penalty to update the regression coefficients.
So I think it is fair to call this loop the main processing in glmnet.
Loop ③ is a loop over ni. Here ni is the number of explanatory variables, and the loop visits each explanatory variable with k as the index.
First, ju was a 1/0 vector indicating whether the values within each explanatory variable vary. If there is no variation, that is, if all values are the same (ju(k) == 0), loop ③ is skipped for that variable (the goto target is 10371, which is also the label that ends the loop).

    do 10371 k=1,ni
    if(ju(k).eq.0) goto 10371
Next, the value of the k-th variable is taken from a and stored in ak. As traced in the previous article, this a (or ao) is what is finally returned as the regression coefficients.
Since a was initialized with a = 0.0 in the preprocessing, ak is also 0 on the first pass, but from the second pass of loop ① onward it holds the shrunken regression coefficient.

    ak=a(k) ! assign the k-th value of a to ak

Next u and v are computed. As briefly introduced in the previous article, they are used to update the regression coefficient a in the next block.
u is computed by adding ak*xv(k) to g(k). Here g(k) was defined in standard as g(j)=dot_product(y,x(:,j)), that is, the inner product of y and x (both y and x are standardized). Without a penalty, this covariance should be the OLS regression coefficient (because of the standardization the standard deviation of x is 1).
To this g we add ak weighted by xv. Here xv was the sum of squares of x multiplied by the weights. On the first pass, however, ak=0, so g is used as it is.
v is the absolute value of u defined this way, minus the penalty.

    u=g(k)+ak*xv(k)
    v=abs(u)-vp(k)*ab
Then, if v is greater than 0 (if the OLS regression coefficient is larger than the penalty), the code
- compares cl(2,k) with sign(v,u)/(xv(k)+vp(k)*dem) and takes the smaller of the two, and
- compares that with cl(1,k) and takes the larger of the two,
and stores the result as the new a.
Here cl is defined in glmnet.r as cl = rbind(lower.limits, upper.limits), so we can see that the estimate is being kept between the lower and upper limits. If v is 0 or less, the result is 0.

    ! update a(k)
    a(k)=0.0
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem)))

This is the processing that updates the regression coefficients. It looks quite simple, but it is extremely important for understanding glmnet, so I will explain it a little further.
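As a premise, Lasso (as opposed to Elastic Net) estimates its solution using a mapping called the soft-thresholding operator.
For a constant $\lambda$ and a value $x$, the soft-thresholding operator returns $x - \mathrm{sign}(x)\lambda$ if the absolute value of $x$ is larger than $\lambda$, and 0 otherwise:
$$S_{\lambda}(x) = \mathrm{sign}(x)\,(|x| - \lambda)_{+}$$
In other words, if (the absolute value of) the estimated regression coefficient is smaller than the penalty it is rounded to 0, and even if it is larger the coefficient is shrunk by the amount of the penalty. Lasso is generally known as a method that shrinks the regression coefficients of variables with small effects to 0, and the implementation uses exactly this kind of soft-thresholding operator. Seeing this, the meaning of the phrase "Lasso can estimate sparse solutions" becomes clear, I think: the estimates do not just happen to come out as 0, they are explicitly set to 0.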
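The soft-thresholding operator described above is short enough to sketch directly in R. The function name `soft_threshold` is chosen for this illustration and does not appear in glmnet.

```r
# Soft-thresholding: shrink x toward 0 by lambda, and return exactly 0
# whenever |x| does not exceed lambda.
soft_threshold <- function(x, lambda) {
  sign(x) * pmax(abs(x) - lambda, 0)
}

est <- soft_threshold(c(-3, -0.5, 0.2, 2), lambda = 1)
print(est)
```

The two middle values, whose absolute value is below the penalty, are set to exactly 0, while the outer two are pulled toward 0 by 1.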
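A slight digression: when reading reference books on Lasso and Ridge, you often see a graph like the following offered as the "geometric explanation":
(Figure: the usual pair of plots showing the Lasso constraint as a diamond on the left and the Ridge constraint as a circle on the right, together with the elliptical contours around the OLS estimate.)
Every time I saw this graph I was left unconvinced. Looking at the Lasso side (the left of the graph), depending on the coordinates of the OLS estimate (the position of the × mark) and on how the ellipses spread, it is perfectly possible for the ellipse to touch an edge of the diamond rather than a vertex. At the very least, I felt that the claim "Lasso tends to touch a vertex of the diamond (and therefore tends to estimate a solution of 0)" is neither obvious nor intuitive from this graph alone.
While I was stewing over this, I was reading "Sparse Estimation: 100 Exercises with R" from the "100 Exercises in the Mathematics of Machine Learning" series, and sure enough the same kind of graph appeared. On the next page, however, there was a graph like this:
(Figure: the plane divided into white and green regions according to where the OLS estimate lies relative to the diamond.)
This is exactly it. In this graph, if the OLS estimate lies in the white region, the ellipse touches an edge rather than a vertex. If it is shifted a little and lies in the green region, the ellipse touches a vertex of the diamond, which means the solution for whichever variable is less important is estimated as 0.
The "geometric explanation" in the first graph appears in a great many books and articles, but I think explaining it together with the second graph would deepen the understanding. End of digression.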
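Now, in the block above, if the regression coefficient is larger than the penalty and within the lower and upper limits, sign(v,u)/(xv(k)+vp(k)*dem) becomes the new a.
In the explanation of the soft-thresholding operator, "the regression coefficient minus the penalty" (that is, v) was the Lasso estimate, but here it is additionally divided by xv(k)+vp(k)*dem.
This is because the estimate sought here is not Lasso but Elastic Net. The textbook introduced in the first article (P36) gives the Elastic Net estimator as
$$\hat{\beta} = \frac{S_{\lambda_{1}}(\hat{\beta}^{\mathrm{OLS}})}{1 + \lambda_{2}}$$
where $S_{\lambda_{1}}$ is the soft-thresholding operator. Recalling that dem was defined as alm*(1-bta), this is the penalty for the Ridge (L2) part and corresponds to $\lambda_{2}$ in the formula above.
As for xv, it is the sum of squares of X divided by the variance, plus 1. I did not understand what it means when I introduced it earlier either, but computing it on sample data gives roughly 1, so I assume that is the kind of value it is meant to be (a rough guess).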
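To make the update concrete, here is a small R sketch of the single-coefficient step, written to mirror the Fortran expressions quoted above. It is an illustration under simplifying assumptions, not glmnet's code: the function name `update_coef` and the toy inputs are made up, and `ab` and `dem` are derived from `alm` and `bta` as described in the text.

```r
# One coordinate update, mirroring
#   a(k) = max(cl(1,k), min(cl(2,k), sign(v,u) / (xv(k) + vp(k)*dem)))
update_coef <- function(g_k, ak, xv_k, vp_k, alm, bta,
                        lower = -Inf, upper = Inf) {
  ab  <- alm * bta         # L1 part of the penalty
  dem <- alm * (1 - bta)   # L2 part of the penalty
  u <- g_k + ak * xv_k
  v <- abs(u) - vp_k * ab
  if (v <= 0) return(0)    # thresholded to exactly zero
  # v > 0 implies u != 0, so sign(u) * v equals Fortran's sign(v, u)
  max(lower, min(upper, sign(u) * v / (xv_k + vp_k * dem)))
}

lasso_step <- update_coef(g_k = 0.5, ak = 0, xv_k = 1, vp_k = 1, alm = 0.2, bta = 1)
enet_step  <- update_coef(g_k = 0.5, ak = 0, xv_k = 1, vp_k = 1, alm = 0.2, bta = 0.5)
zero_step  <- update_coef(g_k = 0.1, ak = 0, xv_k = 1, vp_k = 1, alm = 0.2, bta = 1)
print(c(lasso = lasso_step, enet = enet_step, zero = zero_step))
```

With bta = 1 the divisor is just xv(k) and the step is pure soft-thresholding; with bta = 0.5 the L1 threshold is halved and the result is additionally divided by a number larger than 1.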
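As for the remaining steps, if a(k) was not changed by the update above, the code leaves this iteration and moves on to the next variable (the goto target 10371 was the end of loop ③).
If mm is not 0, it jumps to 10391 (just past loop ④), so the processing from here up to and including loop ④, introduced next, is skipped.
Note that mm is 0 on the first pass of loop ①, so on the first pass this processing is always performed.
Also, nx is the upper limit on the number of variables allowed to be non-zero, so when the number of estimated parameters exceeds it the code leaves the third loop.

    if(a(k).eq.ak) goto 10371
    if(mm(k) .ne. 0) goto 10391
    nin=nin+1
    if(nin.gt.nx)goto 10372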
Loop ④ (computing the variance-covariance matrix)
Next is loop ④.
Here too the loop runs over the explanatory variables (ni), but this time j is used as the index, and something like a variance-covariance matrix is computed and stored in c.
Here c is a matrix of size ni*nx.
The loop is short, so let's look at it all at once.
First, ju is used to check whether the variable has any variation, and if not the code skips to the next variable.
Next mm is checked: if mm(j) is not 0, the already computed entry c(k,mm(j)) is copied into c(j,nin) and the code skips to the next variable (mm receives nin in the processing that follows, and nin in turn is incremented based on mm; the two are so intertwined that I could not quite work out what is going on).
Then j and k are compared: if they are the same (the same variable) xv is stored in c, and otherwise the inner product of columns j and k of x is stored in c. Since xv is the sum of squares of x that appeared earlier, c holds something like a variance-covariance matrix (it is not square, so strictly speaking it cannot be called one).

    do 10401 j=1,ni
    ! skip the rest if the variable has no variation
    if(ju(j).eq.0)goto 10401
    ! if mm(j) is not 0 (the parameter is non-zero), run the next block and skip to the next variable
    if(mm(j) .eq. 0)goto 10421
    c(j,nin)=c(k,mm(j))
    goto 10401
    10421 continue
    if(j .ne. k)goto 10441 ! jump to 10441 if the variables differ
    c(j,nin)=xv(j) ! same variable: this branch
    goto 10401
    10441 continue
    c(j,nin)=dot_product(x(:,j),x(:,k)) ! different variables: inner product of j and k
    10401 continue ! end of the 4th loop
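What loop ④ fills in can be sketched in R as one column of a cross-product matrix. The sketch assumes unit observation weights, so that xv is simply the column sum of squares; the toy data and variable names are made up for illustration, and the shortcut through mm is left out.

```r
set.seed(1)
x  <- scale(matrix(rnorm(40), nrow = 10, ncol = 4)) / sqrt(10)  # toy predictors
k  <- 2                 # the variable that has just become active
xv <- colSums(x^2)      # stands in for the Fortran xv (unit weights assumed)

# Loop 4 for the newly active variable k:
# the diagonal entry comes from xv, all others from an inner product with x[, k]
c_col <- sapply(seq_len(ncol(x)), function(j) {
  if (j == k) xv[j] else sum(x[, j] * x[, k])
})
print(c_col)
```

The resulting vector is the k-th column of `crossprod(x)`, which is why c has ni rows but only as many columns as there are active variables.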
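After loop ④ there is a little more processing.
nin is assigned to mm. ia receives k; this k is the index of loop ③, and since loop ③ skips loop ④ when there was no update, ia holds the indices of the variables whose parameters were updated.
After that, the change in the estimated regression coefficient is evaluated and rsq is updated. Although I have been reading rsq as a residual sum of squares, it starts at 0 and only grows, so it is better read as the explained share of the variance (R²).
Here g(k) is the regression coefficient before shrinkage (the inner product of y and x(k)). The increment added to rsq is del*(2.0*g(k)-del*xv(k)), that is, twice g(k) minus del times the weight-adjusted sum of squares of x, all multiplied by del.

    continue
    ! put nin into mm
    mm(k)=nin
    ! store k in ia
    ia(nin)=k
    10391 continue
    ! take the change in a(k); a(k) and ak are the estimated regression coefficients
    del=a(k)-ak
    ! update rsq
    rsq=rsq+del*(2.0*g(k)-del*xv(k))
    dlx=max(xv(k)*del**2,dlx)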
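The increment `del*(2.0*g(k)-del*xv(k))` can be checked numerically. The sketch below assumes that g(k) holds the inner product of x_k with the current residual; all names and the toy data are made up for illustration.

```r
set.seed(2)
n    <- 50
x_k  <- rnorm(n)
r    <- rnorm(n)            # current residual vector
g_k  <- sum(x_k * r)        # inner product of x_k and the residual
xv_k <- sum(x_k^2)          # sum of squares of x_k
del  <- 0.3                 # change in the k-th coefficient

rss_before <- sum(r^2)
rss_after  <- sum((r - del * x_k)^2)   # residual after moving the coefficient by del
gain <- del * (2 * g_k - del * xv_k)   # the term added to rsq
print(c(rss_drop = rss_before - rss_after, gain = gain))
```

Under that assumption the term is exactly the amount by which the residual sum of squares drops, which is consistent with rsq accumulating explained variance.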
Loop ⑤ (updating the regression coefficients)
Next comes loop ⑤. It is over in an instant: using the del just computed, it updates g(j), that is, the regression coefficients before shrinkage. Note that k is the index of loop ③ and is fixed inside this loop, so the shrinkage of each variable's coefficient makes use of its covariance with another variable.
A large covariance means the two variables are correlated, and if the correlation is positive it works to make the regression coefficient smaller.

    ! the loop again runs over the explanatory variables
    do 10451 j=1,ni ! the index j is used once more
    if(ju(j).ne.0) g(j)=g(j)-c(j,mm(k))*del
    10451 continue ! end of the 5th loop
    continue
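Loop ⑤ can be read as keeping g equal to the inner products of the predictors with the current residual without recomputing them from scratch. A minimal R sketch, again under the assumption that g holds those inner products (names and data are made up):

```r
set.seed(3)
x   <- matrix(rnorm(60), nrow = 20, ncol = 3)
r   <- rnorm(20)            # current residual
k   <- 1                    # coefficient that was just changed
del <- 0.25                 # size of the change

g     <- drop(crossprod(x, r))           # g(j) = <x_j, r>
cov_k <- drop(crossprod(x, x[, k]))      # plays the role of c(., mm(k))

g_updated    <- g - cov_k * del                       # what loop 5 does
g_recomputed <- drop(crossprod(x, r - del * x[, k]))  # recomputing from the new residual
print(rbind(g_updated, g_recomputed))
```

Both rows agree, so the cheap update through the stored covariances gives the same result as recomputing every inner product.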
Right after loop ⑤, loop ③ ends as well.

    10371 continue ! end of the 3rd loop

Next, the following block decides whether to stop. A jump to 10352 ends the coordinate updates for the current lambda: a few more steps store the results, after which the code either moves on to the next lambda or returns. In other words, when dlx is smaller than thr, or nin is larger than nx, the updates for this lambda are finished.
Otherwise the processing continues a little longer.

    10372 continue
    if(dlx.lt.thr)goto 10352
    if(nin.gt.nx)goto 10352
    if(nlp .le. maxit)goto 10471
    jerr=-m
    return
    10471 continue
    10360 continue
    iz=1
    da(1:nin)=a(ia(1:nin))
    continue
    10481 continue
    nlp=nlp+1
    dlx=0.0
Loop ⑥ (estimating the regression coefficients, again)
Next is loop ⑥. As shown below, this loop does almost the same thing as loop ③.

    ! 3rd loop (partly omitted)
    do 10371 k=1,ni
    if(ju(k).eq.0)goto 10371
    ak=a(k)
    u=g(k)+ak*xv(k)
    v=abs(u)-vp(k)*ab
    a(k)=0.0
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem)))
    if(a(k).eq.ak)goto 10371
    if(mm(k) .ne. 0)goto 10391
    nin=nin+1
    if(nin.gt.nx)goto 10372
    continue
    mm(k)=nin
    ia(nin)=k
    10391 continue
    del=a(k)-ak
    rsq=rsq+del*(2.0*g(k)-del*xv(k))
    dlx=max(xv(k)*del**2,dlx)
    do 10451 j=1,ni
    if(ju(j).ne.0) g(j)=g(j)-c(j,mm(k))*del

    ! 6th loop
    do 10491 l=1,nin
    k=ia(l)
    ak=a(k)
    u=g(k)+ak*xv(k)
    v=abs(u)-vp(k)*ab
    a(k)=0.0
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem)))
    if(a(k).eq.ak)goto 10491
    del=a(k)-ak
    rsq=rsq+del*(2.0*g(k)-del*xv(k))
    dlx=max(xv(k)*del**2,dlx)
    do 10501 j=1,nin
    g(ia(j))=g(ia(j))-c(ia(j),mm(k))*del

The difference is that the loop runs over nin instead of ni, but the processing is largely the same, so I will skip the explanation.

    do 10491 l=1,nin
    k=ia(l) ! take out k (ia holds the columns of the variables with non-zero estimated parameters)
    ak=a(k) ! take out a
    u=g(k)+ak*xv(k)
    v=abs(u)-vp(k)*ab
    a(k)=0.0
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem)))
    if(a(k).eq.ak)goto 10491
    del=a(k)-ak
    rsq=rsq+del*(2.0*g(k)-del*xv(k))
    dlx=max(xv(k)*del**2,dlx)
Loop ⑦ (updating the regression coefficients, again)
Loop ⑦ likewise performs the same processing as loop ⑤, but over nin.

    do 10501 j=1,nin
    g(ia(j))=g(ia(j))-c(ia(j),mm(k))*del
    10501 continue ! end of the 7th loop

And loop ⑥ ends too.

    continue
    10491 continue ! end of the 6th loop

Here a termination check is made.
nlp appears to be an iteration counter, and as long as it has not exceeded a certain number the code is sent back to 10481.
This 10481 is just before loop ⑥, so the flow is that loop ⑥ is run again unless dlx has become small enough.

    continue
    if(dlx.lt.thr)goto 10482
    if(nlp .le. maxit)goto 10521
    jerr=-m
    return
    10521 continue
    goto 10481 ! go back to just before loop 6
    10482 continue
    da(1:nin)=a(ia(1:nin))-da(1:nin)
Loop ⑧ (updating the regression coefficients, yet again)
This is loop ⑧. Once again the regression coefficients are updated over ni rather than nin.
In the block just above, da was updated by subtracting the value of da from the value of a, but a little further up a had been passed to da.
So the order is: set da <- a, then update a, and finally redefine da as the difference between the updated a and da (that is, a before the update).
The inner product of this updated da and the variance-covariance matrix is subtracted from the regression coefficients, so what happens here is the same as the update of the regression coefficients in loop ⑤.

    do 10531 j=1,ni
    if(mm(j).ne.0)goto 10531
    if(ju(j).ne.0) g(j)=g(j)-dot_product(da(1:nin),c(j,1:nin))
    10531 continue ! end of the 8th loop
After loop ⑧ it is a straight run to the end… or so I would like to say, but shockingly the code is sent back to 10351, that is, to the start of loop ③. Good grief.
Right at the start of loop ③ there is a check on iz*jz that changes the flow: if both are 1 the code jumps to the point where loop ③ has finished, but since jz is set to 0 here, loop ③ is simply run again.
Moreover, the only chance for jz to be set back to 1 is at a stage before loop ③, so once the code has entered this path it always has to start over from loop ③.

    continue
    jz=0
    goto 10351 ! what!?
If the goto above is safely avoided, the code enters the final processing.
The following stores the necessary variables.

    10352 continue
    if(nin .le. nx)goto 10551
    ! we get here when nin exceeds nx
    jerr=-10000-m
    goto 10282 ! update jerr and leave elnet1
    10551 continue
    if(nin.gt.0) ao(1:nin,m)=a(ia(1:nin))
    kin(m)=nin ! store nin of the m-th loop in kin[m]
    rsqo(m)=rsq ! store rsq of the m-th loop in rsqo[m]
    almo(m)=alm ! store alm of the m-th loop in almo[m]
    lmu=m
    if(m.lt.mnl)goto 10281
    if(flmin.ge.1.0)goto 10281
    me=0
Loop ⑨ (counting the variables with estimated coefficients)
The following checks the regression coefficients estimated by the loops of elnet1 so far and counts the variables whose coefficient is not 0.0. As a reminder, j is the index of the variable and m is the index of lambda.

    ! 9th loop
    do 10561 j=1,nin
    if(ao(j,m).ne.0.0) me=me+1
    10561 continue ! end of the 9th loop

Finally me, rsq and rsq0 are checked, and if there is no problem the code moves on to the next lambda.

    continue
    if(me.gt.ne)goto 10282
    if(rsq-rsq0.lt.sml*rsq)goto 10282
    if(rsq.gt.rsqmax)goto 10282
    10281 continue ! end of the 1st loop
    10282 continue
    deallocate(a,mm,c,da)
    return
    end
Closing
That is the end of elnet1.
It took quite a while, but I managed to follow the main processing of {glmnet} to the end (skipping some parts I did not understand along the way).
I think the most important point of this investigation was confirming that in Lasso an estimated regression coefficient is rounded to 0 if it is smaller than the penalty. The statement "Lasso can perform variable selection by estimating unnecessary variables as 0" is not wrong, but rather than the coefficients being estimated as 0, it is more accurate to say that they are explicitly set to 0. So "it can select variables" would more properly be phrased as "it selects variables by ignoring those with small effects".
It reminded me once again how valuable it is to understand the key points of a model like this by following the source code.
See you next time.
Understanding glmnet a little better ④
Continuing from the previous article, this post covers elnet1. Past articles are here.
ushi-goroshi.hatenablog.com ushi-goroshi.hatenablog.com ushi-goroshi.hatenablog.com
Implementation of elnet1
As mentioned at the end of the previous article, elnet1 itself is a subroutine of only about 180 lines, which is not that large, but it contains many intricate loops.
Specifically, as shown below, nine loops (do statements, since this is Fortran) are nested, and the code moves back and forth between them with goto (I write it in R to make it easier to follow, but the indices are kept the same).

    # Loop 1
    for (m in 1:nlam) {
      # Loop 2
      for (j in 1:ni) {
      }
      # Loop 3
      for (k in 1:ni) {
        # Loop 4
        for (j in 1:ni) {
        }
        # Loop 5
        for (j in 1:ni) {
        }
      }
      # Loop 6
      for (l in 1:nin) {
        # Loop 7
        for (j in 1:nin) {
        }
      }
      # Loop 8
      for (j in 1:ni) {
      }
      # Loop 9
      for (j in 1:nin) {
      }
    }
Preprocessing
As usual it begins with the variable declarations, but in addition there is a step that fetches the initial parameters.

    subroutine elnet1(beta,ni,ju,vp,cl,g,no,ne,nx,x,nlam,flmin,ulam,thr,maxit,xv, lmu,ao,ia,kin,rsqo,almo,nlp,jerr)
    implicit double precision(a-h,o-z)
    double precision vp(ni),g(ni),x(no,ni),ulam(nlam),ao(nx,nlam)
    double precision rsqo(nlam),almo(nlam),xv(ni)
    double precision cl(2,ni)
    integer ju(ni),ia(nx),kin(nlam)
    double precision, dimension (:), allocatable :: a,da
    integer, dimension (:), allocatable :: mm
    double precision, dimension (:,:), allocatable :: c
    allocate(c(1:ni,1:nx),stat=jerr)
    if(jerr.ne.0) return;
    ! fetch the initial parameters
    call get_int_parms(sml,eps,big,mnlam,rsqmax,pmin,exmx,itrace)
    ! allocate a, mm, da
    allocate(a(1:ni),stat=jerr) ! a is a vector with one element per explanatory variable
    if(jerr.ne.0) return
    allocate(mm(1:ni),stat=jerr) ! mm is a vector with one element per explanatory variable
    if(jerr.ne.0) return
    allocate(da(1:ni),stat=jerr)
    if(jerr.ne.0) return

get_int_parms is not very large, so let's look at all of it.
It is the following subroutine:

    subroutine get_int_parms(sml,eps,big,mnlam,rsqmax,pmin,exmx,itrace)
    implicit double precision(a-h,o-z)
    data sml0,eps0,big0,mnlam0,rsqmax0,pmin0,exmx0,itrace0 /1.0d-5,1.0d-6,9.9d35,5,0.999,1.0d-9,250.0,0/
    sml=sml0
    eps=eps0
    big=big0
    mnlam=mnlam0
    rsqmax=rsqmax0
    pmin=pmin0
    exmx=exmx0
    itrace=itrace0
    return
    entry chg_fract_dev(arg)
    sml0=arg
    return
    entry chg_dev_max(arg)
    rsqmax0=arg
    return
    entry chg_min_flmin(arg)
    eps0=arg
    return
    entry chg_big(arg)
    big0=arg
    return
    entry chg_min_lambdas(irg)
    mnlam0=irg
    return
    entry chg_min_null_prob(arg)
    pmin0=arg
    return
    entry chg_max_exp(arg)
    exmx0=arg
    return
    entry chg_itrace(irg)
    itrace0=irg
    return
    end
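The data statement on the third line from the top seems to be the Fortran notation for giving variables initial values: the values enclosed in / are assigned as initial values to the variables listed after data.
So sml0 receives 1.0d-5 and eps0 receives 1.0d-6.
Here d denotes exponent notation in double precision. The entry statements from line 13 onward define additional entry points into the subroutine, each of which overwrites one of the stored defaults with a given value (I am still not entirely sure how entry is meant to be used…).
Next, values are assigned to several variables.
First bta: the beta assigned to it was originally passed in as parm, which elnet.r passes as parm = alpha. This alpha in turn is defined in glmnet.r and is the parameter that decides how the penalty is split between L1 and L2:
$$\frac{1 - \alpha}{2} \|\beta\|_{2}^{2} + \alpha \|\beta\|_{1}$$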
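As a small illustration of how alpha splits the penalty, the expression above can be written as an R function. The function name `elnet_penalty` is made up, and the overall multiplier `lambda` is an addition for this sketch that does not appear in the expression itself.

```r
# Elastic net penalty: alpha = 1 gives the pure L1 (Lasso) penalty,
# alpha = 0 gives the pure L2 (Ridge) penalty.
elnet_penalty <- function(beta, lambda, alpha) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta    <- c(1, -2, 0)
p_lasso <- elnet_penalty(beta, lambda = 1, alpha = 1)
p_ridge <- elnet_penalty(beta, lambda = 1, alpha = 0)
p_mix   <- elnet_penalty(beta, lambda = 1, alpha = 0.5)
print(c(lasso = p_lasso, ridge = p_ridge, mix = p_mix))
```

For any alpha between 0 and 1 the penalty is the weighted average of the two extremes, which is the split that bta and omb = 1 - bta carry into the Fortran code.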
    bta=beta

bta subtracted from 1 is stored as omb, but omb is used only to define dem by multiplying it with alm, which is defined just below (that is, dem = alm * omb).
alm in turn is updated inside the loops and is finally multiplied by bta to give ab, which is used to shrink the regression coefficients.
The alf that follows is used to update alm, so these variables are updated inside the loops and used to shrink the regression coefficients (there are others as well).

    omb=1.0-bta
    alm=0.0
    alf=1.0
The following block defines eqs and alf, but it is skipped if flmin is 1.0 or more.
flmin was the variable that, in glmnet.r, is set to 1 if the penalty lambda is specified and to lambda.min.ratio if it is not.
By default lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04), so it holds a value smaller than 1.
The block below therefore means "define alf when lambda is not specified" (eqs appears nowhere else).
In that case the larger of eps and flmin (here equal to lambda.min.ratio) is defined as eqs, where eps received eps0 (the small number 1.0d-6) from get_int_parms.
lambda.min.ratio, as mentioned above, defaults to ifelse(nobs < nvars, 0.01, 1e-04), so it is somewhat larger than eps.
Consequently eqs is 0.01 or 1e-04, and alf is that value raised to the power 1/(nlam-1).

    if(flmin .ge. 1.0)goto 10271
    eqs=max(eps,flmin)
    alf=eqs**(1.0/(nlam-1)) ! define alf as eqs to the power (1/(nlam-1))
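If flmin is 1 or more (lambda is specified), the block above is skipped and the code arrives here. I first took rsq to be a residual sum of squares, but since it is initialized to 0 and later compared against rsqmax (0.999), it is better read as the explained share of the variance (R²).
The a that follows plays an important role in elnet1, so let's look at it carefully.
In fact this a is the variable that stores the (shrunken) regression coefficients.

    10271 continue
    ! initialize the parameters
    rsq=0.0 ! explained share of variance (R^2)
    a=0.0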
To see what becomes of this a, let's jump ahead and look at later processing.
Around line 70 of elnet1 there is the following code:

    ak=a(k)
    u=g(k)+ak*xv(k)
    v=abs(u)-vp(k)*ab
    a(k)=0.0
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem)))
    if(a(k).eq.ak)goto 10371

The k-th value of a is passed to a variable called ak, then u and v are defined, the k-th value of a is reset to 0, and it is then updated again with reference to various values (u and v are examined later).
In the end a is assigned to a variable called ao as follows (line 154):

    if(nin.gt.0) ao(1:nin,m)=a(ia(1:nin))
When elnet1 is called from elnetu, this ao is passed as the argument ca.

    ! variables received by elnet1
    ! ao comes right after lmu
    subroutine elnet1(beta,ni,ju,vp,cl,g,no,ne,nx,x,nlam,flmin,ulam,thr,maxit,xv, lmu,ao,ia,kin,rsqo,almo,nlp,jerr)
    ! arguments used when elnetu calls elnet1
    ! here ca comes right after lmu
    call elnet1(parm,ni,ju,vp,cl,g,no,ne,nx,x,nlam,flmin,vlam,thr,maxit,xv, lmu,ca,ia,nin,rsq,alm,nlp,jerr)

This ca was the variable defined when elnet.r makes the call .Fortran("elnet", ...):

    else .Fortran("elnet", ka, parm = alpha, nobs, nvars, as.double(x),
        y, weights, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh,
        isd, intr, maxit, lmu = integer(1), a0 = double(nlam),
        # ca is defined here
        ca = double(nx * nlam),
        ia = integer(nx), nin = integer(nlam), rsq = double(nlam),
        alm = double(nlam), nlp = integer(1), jerr = integer(1),
        PACKAGE = "glmnet")

Here nx is the maximum number of variables that can enter the model (at most the number of explanatory variables) and nlam is the number of penalty values lambda, so a vector of length nx × nlam is defined (and this is what is evaluated and filled in as ao inside elnet1).
In the later processing of elnet.r, ca is extracted at the following point:

    outlist = getcoef(fit, nvars, nx, vnames)

glmnet:::getcoef is shown below; it stores the ca of the object returned as fit into beta as it is (if ninmax is 0 it returns a vector of zeros).

    # glmnet:::getcoef
    function (fit, nvars, nx, vnames)
    {
        # omitted up to here
        nin = fit$nin[seq(lmu)]
        ninmax = max(nin)
        # omitted up to here
        if (ninmax > 0) {
            # ca is extracted here
            ca = matrix(fit$ca[seq(nx * lmu)], nx, lmu)[seq(ninmax), , drop = FALSE]
            df = apply(abs(ca) > 0, 2, sum)
            ja = fit$ia[seq(ninmax)]
            oja = order(ja)
            ja = rep(ja[oja], lmu)
            ia = cumsum(c(1, rep(ninmax, lmu)))
            # stored in beta
            beta = drop0(new("dgCMatrix", Dim = dd, Dimnames = list(vnames, stepnames),
                x = as.vector(ca[oja, ]), p = as.integer(ia - 1), i = as.integer(ja - 1)))
        }
        else {
            beta = zeromat(nvars, lmu, vnames, stepnames)
            df = rep(0, lmu)
        }
        # also omitted
        list(a0 = a0, beta = beta, df = df, dim = dd, lambda = lam)
    }
The return value of glmnet is this with a few more pieces of information added. I hope the flow is clear: the a evaluated in elnet1 is stored in ao, passed back to elnet as ca, and extracted and stored in beta in elnet.r.
Now that the important variable has been explained, I will describe the variables initialized in the following block when they come up, and move straight on.

    mm=0
    nlp=0
    nin=nlp
    iz=0
    mnl=min(mnlam,nlam)
Loop ① (updating alm)
The initialization of the necessary variables is now complete, so from here the loops begin.
The outermost loop runs over the number of lambda values (nlam); the default of nlam is 100 (glmnet.r).
What follows presumably updates alm, but the value assigned to alm depends on whether lambda is specified and on the iteration number.
First the processing branches on whether lambda is specified. The block below is skipped when flmin is smaller than 1.0, which, as mentioned earlier, corresponds to the case where lambda is not specified in glmnet.r.
When lambda is specified, alm is updated as alm = ulam(m) and the code skips to 10291. This 10291 is located below the second loop, so it is a fairly large skip.
When lambda is specified, ulam holds the lambda values in descending order, so on the first pass it contains the largest lambda.

    do 10281 m=1,nlam ! nlambda, so loop once per lambda
    if(itrace.ne.0) call setpb(m-1) ! progress bar
    if(flmin .lt. 1.0)goto 10301
    alm=ulam(m) ! if flmin is 1.0 or more, set alm = ulam(m)
    goto 10291
If lambda is not specified the code enters the following processing, where the value assigned to alm depends on the iteration number.
Specifically, on the first pass the extremely large value big (9.9d35) is assigned, on the second pass 0.0, and from the third pass onward the previous value multiplied by alf.

    10301 if(m .le. 2)goto 10311 ! passes 1 and 2 skip this
    alm=alm*alf ! from pass 3 on, multiply alm by alf
    goto 10291
    10311 if(m .ne. 1)goto 10321 ! pass 2 skips this
    alm=big ! on pass 1 set alm = big (9.9d35)
    goto 10331
    10321 continue
    alm=0.0 ! on pass 2 set alm to 0 for now

As explained earlier, alf is defined as eqs^(1.0/(nlam-1)). With eqs equal to 0.01 or 1e-4 and nlambda set to 10, it takes the following values:

    0.01^(1/(10-1))
    # [1] 0.5994843
    1e-4^(1.0/(10-1))
    # [1] 0.3593814

So alm gets smaller and smaller in absolute value.
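Repeatedly multiplying by alf produces a geometrically decreasing sequence. The sketch below shows this in R; the starting value `lambda_max` is an arbitrary number chosen for illustration, and `ratio` plays the role of eqs.

```r
nlam       <- 10
ratio      <- 0.01                    # plays the role of eqs
alf        <- ratio^(1 / (nlam - 1))
lambda_max <- 2                       # arbitrary starting value for illustration

# Multiply by alf at every step, as alm = alm * alf does from the third pass on
lambdas <- lambda_max * alf^(0:(nlam - 1))
print(round(lambdas, 4))
```

After nlam - 1 multiplications the sequence has fallen to ratio times its starting value, and the values are equally spaced on the log scale.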
Loop ② (defining the penalty)
Next we enter the second loop… although the second loop is over in an instant.
The alm updated a moment ago is compared with the inner product for each variable, and the larger value is kept.
So this is a loop over the variables.
First ju and vp: as confirmed in the previous article, ju is the result of chkvars checking whether the contents of each variable are all identical.
If the contents of a variable are all the same, ju is 0, so the code skips to the next variable here.
Next vp: as confirmed in the first article, it is defined in glmnet.r as a vector holding the penalty weight for each variable (default 1) (vp = as.double(penalty.factor)).
If no penalty is applied it is 0, and the variable is skipped.
If the variable has variation and is penalized, alm is updated again here.
The g that appears here was defined in standard as the variable that stores the inner product (covariance) of y and x.
It is divided by the size of the penalty weight, so the logic seems to be that making penalty.factor smaller (a smaller denominator) makes the scaled covariance larger and the variable more likely to remain.
Incidentally, on the first pass of loop ① alm is set to 9.9d35 and the code jumps straight to 10331, which sits after this loop, so that value is kept as it is. On the second pass alm is 0.0, so the value on the covariance side always becomes alm.

    ! 2nd loop
    ! update alm
    do 10341 j=1,ni ! ni is the number of variables
    if(ju(j).eq.0) goto 10341
    if(vp(j).le.0.0) goto 10341
    alm=max(alm,abs(g(j))/vp(j))
    10341 continue ! end of the 2nd loop
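After alm has been updated across the variables by the processing above, it is updated once more below.
Here alm is divided by the larger of bta (alpha; the parameter that splits the weight between L1 and L2) and 0.001, and multiplied by alf.
Just to confirm, written as a formula this is:
$$\mathrm{alm} \leftarrow \mathrm{alf} \times \frac{\mathrm{alm}}{\max(\mathrm{bta},\ 0.001)}$$
What on earth is this doing…?

    continue
    alm=alf*alm/max(bta,1.0d-3)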
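One possible reading, offered here as a sketch rather than as the authors' stated intent: before the factor alf is applied, the value is the smallest penalty at which every coefficient is thresholded to zero, so the path starts one step below the all-zero solution. The R code below illustrates this with made-up numbers, assuming g holds the inner products while all coefficients are still 0.

```r
g   <- c(0.30, -0.80, 0.10)  # toy inner products of y with each x
vp  <- c(1, 1, 1)            # penalty weights
ju  <- c(1, 1, 1)            # all variables have variation
bta <- 0.5                   # alpha
alf <- 0.01^(1 / 9)

# Loop 2: largest scaled inner product
alm <- 0
for (j in seq_along(g)) {
  if (ju[j] == 0 || vp[j] <= 0) next
  alm <- max(alm, abs(g[j]) / vp[j])
}
alm_loop2 <- alm

# Without the factor alf, the L1 threshold ab = alm * bta equals max|g|/vp,
# so every v = |g_j| - vp_j * ab is <= 0 and all coefficients would be zero.
ab_no_alf <- alm_loop2 / max(bta, 1e-3) * bta
v_no_alf  <- abs(g) - vp * ab_no_alf

# The actual update then steps one factor alf below that value
alm <- alf * alm_loop2 / max(bta, 1e-3)
print(c(alm_loop2 = alm_loop2, alm = alm))
```

Under this reading, dividing by bta converts the threshold on the L1 part back into a value of lambda, and the floor of 0.001 only guards against division by zero when alpha is 0.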
Next, a few variables are updated.
dem is defined as alm * omb, where omb was (1-bta). ab is alm multiplied by bta. These are therefore lambda×(1-alpha) and lambda×alpha respectively, so dem and ab express the effective size of the penalty.

    10331 continue
    10291 continue
    dem=alm*omb ! dem = alm * (1-bta)
    ab=alm*bta ! ab = alm * bta
Let's look a little ahead to see how these are used.

    ! ab
    u=g(k)+ak*xv(k) ! L69 (inside loop 3), L119 (inside loop 6)
    v=abs(u)-vp(k)*ab ! L70, L120 (same as above for both)
    ! dem
    a(k)=0.0 ! L71, L121
    if(v.gt.0.0) a(k)=max(cl(1,k),min(cl(2,k),sign(v,u)/(xv(k)+vp(k)*dem))) ! L72, L122

Both are multiplied by vp: ab is subtracted from abs(u), while dem is added to xv(k), after which sign(v,u) is divided by the sum and the max/min with cl is taken.
vp defines the weight of the penalty, so it decides whether the size of the penalty determined by alpha and lambda is used as it is or weakened.
For dem, the result of the calculation is stored in a. As mentioned earlier, a is the variable that stores the regression coefficients, so if sign(v,u)/(xv(k)+vp(k)*dem) is larger than cl(1,k), a, that is the regression coefficient, is updated.
v is used as the criterion for performing that calculation, and ab is used to compute v.
What u and v actually are is a matter for the next loop, so please bear with me for a moment.
Of the remaining variables, rsq0 keeps the previous value of rsq. jz is used in combination with iz, but I could not quite understand this conditional branch, so I will skip it.
For what it is worth, iz becomes 1 partway through loop ① (at the point where loop ③ has finished), so iz * jz can be 0 almost only when jz is 0.
nlp is used as the iteration counter, and dlx looks at the difference in the regression coefficients before and after an update.
Both are used as criteria for leaving the loops.

    rsq0=rsq
    jz=1
    continue
    10351 continue
    if(iz*jz.ne.0) goto 10360 ! iz = 0, jz = 1
    nlp=nlp+1
    dlx=0.0

This has become rather long, so I will stop here for now. Next time I will start from loop ③.
Understanding glmnet a little better ③
In the previous article we confirmed that the R function elnet calls a Fortran subroutine also named elnet (confusing, isn't it), and that depending on the value of type.gaussian (covariance or naive) either elnetu or elnetn is then called.
This time we look inside elnetu.
Past articles are here.
Implementation of elnetu
Let's get straight to elnetu. Like elnet, elnetu is not very large, so I will go directly into its contents. The processing appears to follow these steps:
- Preprocessing
- Standardization
- Fitting
- Postprocessing
First the preprocessing: after the memory allocation, a subroutine called chkvars is called.

    subroutine elnetu(parm,no,ni,x,y,w,jd,vp,cl,ne,nx,nlam, flmin,ulam,thr,isd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr)
    implicit double precision(a-h,o-z)
    double precision x(no,ni),y(no),w(no),vp(ni),ulam(nlam),cl(2,ni)
    double precision ca(nx,nlam),a0(nlam),rsq(nlam),alm(nlam)
    integer jd(*),ia(nx),nin(nlam)
    double precision, dimension (:), allocatable :: xm,xs,g,xv,vlam
    integer, dimension (:), allocatable :: ju
    allocate(g(1:ni),stat=jerr)
    if(jerr.ne.0) return
    allocate(xm(1:ni),stat=jerr)
    if(jerr.ne.0) return
    allocate(xs(1:ni),stat=jerr)
    if(jerr.ne.0) return
    allocate(ju(1:ni),stat=jerr)
    if(jerr.ne.0) return
    allocate(xv(1:ni),stat=jerr)
    if(jerr.ne.0) return
    allocate(vlam(1:nlam),stat=jerr)
    if(jerr.ne.0) return
    ! 1. preprocessing
    call chkvars(no,ni,x,ju)
    if(jd(1).gt.0) ju(jd(2:(jd(1)+1)))=0
    if(maxval(ju) .gt. 0)goto 10071
    jerr=7777
    return
    10071 continue
For each variable in x, chkvars checks whether any value from the second row onward differs from the value in the first row, and stores the result in ju.
subroutine chkvars(no,ni,x,ju) implicit double precision(a-h,o-z) double precision x(no,ni) integer ju(ni) ! ããããå夿°ã®ãã§ãã¯ãéå§ do 11061 j=1,ni ju(j)=0 t=x(1,j) ! 1è¡ç®ã®å¤ãåå¾ ! ãããã2è¡ç®ã®å¤ã確èªãã do 11071 i=2,no ! t 㯠x(1, j) ãªã®ã§ãå夿° j ã«ã¤ã㦠1 è¡ç®ã®å¤ã¨çãããã確èªãã¦ãã if(x(i,j).eq.t) goto 11071 ! çãããã°æ¬¡ã®è¡ã¸ ju(j)=1 ! çãããªãæ°å¤ãããã° ju ã 1 ã«ãã¦æ¬¡ã®å¤æ°ã¸ goto 11072 11071 continue 11072 continue 11061 continue continue return end
If no value differs, every value is the same, so there is no point in, say, estimating a regression coefficient for that variable.
Later steps refer to this ju in many places to decide whether to skip a variable.
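As a rough illustration, the same check can be sketched in R. This is not part of glmnet: chkvars_r is a hypothetical helper name, and the toy matrix is made up for the example.

```r
# Hypothetical R analogue of chkvars: flag columns that contain
# at least one value different from the value in their first row.
chkvars_r <- function(x) {
  as.integer(apply(x, 2, function(col) any(col[-1] != col[1])))
}

x <- cbind(a = c(1, 1, 1),   # constant column, should be flagged 0
           b = c(1, 2, 3),   # varying column, should be flagged 1
           c = c(0, 0, 5))   # varying column, should be flagged 1
ju <- chkvars_r(x)
ju
```

A column flagged 0 carries no information for a regression, which is why the later loops skip it.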
Next, standardization is performed by calling a subroutine named standard.
! 2. Standardization
call standard(no,ni,x,y,w,isd,intr,ju,g,xm,xs,ym,ys,xv,jerr)
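The standard subroutine looks fairly large, but it splits the processing by whether there is an intercept, so parts of it are duplicated.
The processing consists of:
- Transforming the weights
- Updating y and x
- Computing the inner product (covariance) of y and x
Looking first at the weight transformation, the weights w are converted to "weight per total weight" (each weight divided by the sum of the weights), and the square root of that is defined as v.
From the next line on, as mentioned above, the processing branches on whether there is an intercept.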
subroutine standard(no,ni,x,y,w,isd,intr,ju,g,xm,xs,ym,ys,xv,jerr) implicit double precision(a-h,o-z) double precision x(no,ni),y(no),w(no),g(ni),xm(ni),xs(ni),xv(ni) integer ju(ni) double precision, dimension (:), allocatable :: v allocate(v(1:no),stat=jerr) if(jerr.ne.0) return ! 1. éã¿ã®å¤æ w=w/sum(w) v=sqrt(w) ! intr 㯠intercept ãªã®ã§åçã 0 ã§ãããã§å¤å® ! åçã 0 ã§ãªãå ´å㯠10141 ã«é£ã°ããã if(intr .ne. 0) goto 10141
The subsequent steps multiply y and x by this v. In the simple case where all observations have equal weight, w contains $1/n$ and v contains its square root.
For example, with 100 observations, $w = 1/100 = 0.01$ and $v = \sqrt{1/100} = 0.1$.
So what is done with w and v? For y:
- y multiplied by v becomes the new y
- Subtract the square of the inner product of v and y from the inner product of this y with itself (its sum of squares), and take the square root (ys)
- Divide y by ys
! 2. Update y and x
! In this section y and x are each adjusted in various ways using the observation weights
! First, y
ym = 0.0
y = v*y
ys = sqrt(dot_product(y,y)-dot_product(v,y)**2)
y = y/ys
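That description alone probably does not convey much, so let's tidy up the expressions a little.
Writing the original y and w as $y0$ and $w0$, we have
$$ y = v \, y0 = \sqrt{w} \, y0 = \sqrt{\frac{w0}{\sum w0}} \, y0 $$
The square of ys (before the square root is taken), $(ys)^{2}$, can be written as
$$ (ys)^{2} = \sum y^{2} - \left( \sum v \, y \right)^{2} = \sum w \, (y0)^{2} - \left( \sum w \, y0 \right)^{2} $$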
Recall that $w$ is the observation weight $w0$ divided by its total (in the simple case, $\frac{1}{n}$), so the sum of anything multiplied by it is a weighted mean.
The first term on the right-hand side is therefore the weighted mean of the square of the original y ($y0$), and the second term is the square of its weighted mean.
The mean of squares minus the squared mean is the variance, so ys, its square root, is the weighted standard deviation of $y0$
($w$ and $w0$ should really carry a subscript $i$, but Hatena Blog's LaTeX breaks for some reason, so I have omitted it).
Let's actually compute this with sample data. First, using simple data like the following, we confirm that the mean of squares minus the squared mean equals the variance.
# Some arbitrary data
a <- c(5, 5, 6, 7, 9)
# The usual variance calculation
mean((a - mean(a))^2)
# Mean of squares minus squared mean
mean(a^2) - mean(a)^2
# Using R's var
var(a) * 4 / 5
[1] 2.24 [1] 2.24 [1] 2.24
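In the example above, all three return the same value.
Note that var returns the unbiased variance (divisor n - 1), so I rescaled it by 4/5 to get the divide-by-n variance.
Next, let's see whether following the earlier computation also yields the same variance and standard deviation.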
set.seed(123)
n <- 10
y0 <- rnorm(n)
w0 <- rep(1, n)
w <- w0/sum(w0)
v <- sqrt(w)
y <- v*y0
ys <- sqrt(y %*% y - (v %*% y)^2)
y_new <- y/ys[1]
> mean((y0 - mean(y0))^2) # the usual variance calculation
[1] 0.8187336
> mean(y0^2) - mean(y0)^2 # mean of squares minus squared mean
[1] 0.8187336
> var(y0) * (n-1) / n # using R's var
[1] 0.8187336
> (ys^2)[1] # the computed value
[1] 0.8187336
This confirms that $(ys)^{2}$ is the variance of $y0$.
So the processing above uses w and v to compute the weighted standard deviation of the original y, and divides the weighted y by that value.
The subroutine is named standard, so naturally it is standardizing.
x is handled in essentially the same way: v is used to compute the weighted standard deviation and standardize.
At the end, however, 1 plus (squared weighted mean / variance) is stored in xv, which seems to be treated as the variance of x, and I did not really understand this part.
Incidentally, as explained earlier, ju indicates whether each variable has differing values (any variation), and we can see that when there is none the loop moves straight on to the next variable.
! x do 10151 j=1,ni ! ni 㯠nvars if(ju(j).eq.0)goto 10151 xm(j) = 0.0 x(:,j) = v*x(:,j) ! x ã«ãéã¿ãä¹ãã xv(j) = dot_product(x(:,j),x(:,j)) ! x ã®äºä¹ã®éã¿ä»ãå¹³å ! isd ã¯æ¨æºåãããã®æå®ã§ãæ¨æºåããå ´å㯠1 ãå ¥ã£ã¦ãã 10171 ã«é£ã°ãããªã if(isd .eq. 0) goto 10171 xbq = dot_product(v, x(:,j))**2 ! x ã®éã¿ä»ãå¹³åã®äºä¹ vc = xv(j)-xbq ! éã¿ä»ã忣 xs(j) = sqrt(vc) ! éã¿ä»ãæ¨æºåå·®ã ys ã¨å¯¾å¿ãã¦ããã x(:,j) = x(:,j)/xs(j) ! æ¨æºåå·®ã§å²ã£ã¦æ¨æºåã y/ys ã¨å¯¾å¿ãã¦ããã ! ããã¯ããããããªã xv(j) = 1.0 + xbq/vc ! éã¿ä»ãå¹³åã®äºä¹ / 忣 ã« 1 ãå ç® goto 10181 10171 continue xs(j)=1.0 10181 continue continue 10151 continue continue goto 10191
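One possible reading of the xv line in the no-intercept branch, offered only as a sketch: since x is divided by xs without being centered, its sum of squares after scaling is (vc + xbq) / vc, which is exactly 1 + xbq/vc. The R code below mirrors the Fortran steps on made-up data (x0 and the equal weights are assumptions for the example) and compares the stored value with the recomputed sum of squares.

```r
set.seed(1)
n  <- 20
x0 <- rnorm(n, mean = 3)   # toy column with a nonzero mean
w  <- rep(1 / n, n)        # equal weights that sum to 1
v  <- sqrt(w)

x   <- v * x0
xv  <- sum(x^2)            # weighted mean of squares
xbq <- sum(v * x)^2        # squared weighted mean
vc  <- xv - xbq            # weighted variance
x   <- x / sqrt(vc)        # scaled by the standard deviation, not centered

stored     <- 1 + xbq / vc # what the Fortran code puts in xv(j)
recomputed <- sum(x^2)     # sum of squares of the final column
c(stored = stored, recomputed = recomputed)
```

Under this reading, xv(j) always holds the sum of squares of the final column, which is also consistent with xv = 1.0 in the branch with an intercept, where the column is centered before scaling.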
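When an intercept is fitted (intr is not 0) we come here (this is the usual case), but the processing is much the same as above.
The difference is that the weighted mean is subtracted from both y and x before their values are updated.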
! åçã 0 ã§ãªãå ´åããã«æ¥ã ! åºæ¬ã¯ãã£ã¡ 10141 continue ! x do 10201 j=1,ni if(ju(j).eq.0)goto 10201 xm(j) = dot_product(w,x(:,j)) ! x ã®éã¿ä»ãå¹³å x(:,j) = v*(x(:,j)-xm(j)) ! éã¿ä»ãå¹³åãå¼ãã¦ããéã¿ãä¹ãã xv(j) = dot_product(x(:,j),x(:,j)) ! äºä¹ã®éã¿ä»ãå¹³å if(isd.gt.0) xs(j) = sqrt(xv(j)) ! éã¿ä»ãæ¨æºåå·® 10201 continue continue if(isd .ne. 0)goto 10221 xs = 1.0 goto 10231 10221 continue do 10241 j=1,ni if(ju(j).eq.0)goto 10241 x(:,j) = x(:,j)/xs(j) ! æ¨æºåã¯ããã§å®è¡ 10241 continue continue xv=1.0 10231 continue continue ym = dot_product(w,y) ! y ã®éã¿ä»ãå¹³å y = v*(y-ym) ! y ããéã¿ä»ãå¹³åãå¼ãããã®ã«éã¿ãä¹ãã ys = sqrt(dot_product(y,y)) ! äºä¹åï¼åæ£ï¼ã®å¹³æ¹æ ¹ï¼SDï¼ y = y/ys ! æ¨æºå
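The next step is common to both branches: the inner product of y and x is computed and stored in g.
It looks like a plain inner product of y and x, but at this point y is
$$ y = \frac{v \, (y0 - ym)}{ys} $$
and, writing the original x as $x0$, x is
$$ x = \frac{v \, (x0 - xm)}{xs} $$
so the inner product is the weighted covariance divided by the product of the two standard deviations, in other words the weighted correlation coefficient.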
! 3. å ç©ï¼éã¿ä»ãç¸é¢ä¿æ°ï¼ãæ ¼ç´ 10191 continue continue g = 0.0 do 10251 j=1,ni ! j çªç®ã®å¤æ°ã«ãã©ããããããªã g ã« y 㨠x ã®å ç©ï¼å ±åæ£ï¼ãæ ¼ç´ãã ! ãã ããã®æç¹ã§ã® y 㨠x ã¯ããããæ¨æºåå·®ã§é¤ãããã®ã¨ãªã£ã¦ãã if(ju(j).ne.0) g(j) = dot_product(y, x(:,j)) 10251 continue continue deallocate(v) return end
Let's check this with the earlier sample data.
In the simple case where all weights are equal, we can confirm that the inner product of the updated y and x is the correlation coefficient.
set.seed(123) n <- 10 y0 <- rnorm(n) x0 <- rnorm(n) w0 <- rep(1, n) w <- w0/sum(w0) v <- sqrt(w) y <- v*(y0 - (w %*% y0)[1]) ys <- sqrt(y %*% y) y_new <- y/ys[1] x <- v*(x0 - (w %*% x0)[1]) xs <- sqrt(x %*% x) x_new <- x/xs[1]
> (y_new %*% x_new)[1] # inner product
[1] 0.5776151
> cor(y_new, x_new) # correlation of the updated y and x
[1] 0.5776151
> cor(y0, x0) # correlation of the original values
[1] 0.5776151
When the weights differ across observations, on the other hand, the values are close but do not match exactly (why is that, though? I feel like they should match).
set.seed(123) n <- 10 y0 <- rnorm(n) x0 <- rnorm(n) w0 <- rep(1, n) - 0.5 * ifelse(runif(n) > 0.8, 1, 0) # ä¸é¨ã®ãã¼ã¿ã«å¯¾ãã¦éã¿ãå°ãããã¦ãã w <- w0/sum(w0) v <- sqrt(w) y <- v*(y0 - (w %*% y0)[1]) ys <- sqrt(y %*% y) y_new <- y/ys[1] x <- v*(x0 - (w %*% x0)[1]) xs <- sqrt(x %*% x) x_new <- x/xs[1]
> (y_new %*% x_new)[1] [1] 0.5687947 > cor(y_new, x_new) [1] 0.5687133 > cor(y0, x0) [1] 0.5776151
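A possible explanation for the small mismatch: cor() always computes an unweighted correlation, so neither cor(y_new, x_new) nor cor(y0, x0) uses w in the way the inner product does. As a sketch, the inner product can instead be compared with the weighted correlation returned by cov.wt from the stats package, using the same data as above.

```r
set.seed(123)
n  <- 10
y0 <- rnorm(n)
x0 <- rnorm(n)
w0 <- rep(1, n) - 0.5 * ifelse(runif(n) > 0.8, 1, 0)
w  <- w0 / sum(w0)
v  <- sqrt(w)

# Same standardization as in the Fortran code (branch with an intercept)
y_new <- v * (y0 - sum(w * y0))
y_new <- y_new / sqrt(sum(y_new^2))
x_new <- v * (x0 - sum(w * x0))
x_new <- x_new / sqrt(sum(x_new^2))

inner        <- sum(y_new * x_new)
weighted_cor <- cov.wt(cbind(y0, x0), wt = w, cor = TRUE)$cor[1, 2]
all.equal(inner, weighted_cor)
```

If this reading is right, the inner product stored in g is the weighted correlation itself, and the unweighted cor() values are simply a different quantity.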
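Incidentally, if the inner product of the weight-adjusted y and x approximates (equals?) the correlation coefficient, couldn't we use it to assess how much each individual pair of data points contributes to the correlation?
Looking at the elementwise products of each pair rather than their sum, the 6th and 8th values stand out as large.
The weighted correlation for this data was about 0.568, so these two observations seem to have a large influence.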
> cbind(1:n, y_new * x_new) [,1] [,2] [1,] 1 -0.0744142551 [2,] 2 -0.0043928887 [3,] 3 0.0261049036 [4,] 4 0.0004833048 [5,] 5 -0.0025033504 [6,] 6 0.2852904868 [7,] 7 0.0104270429 [8,] 8 0.3466035906 [9,] 9 -0.0413263433 [10,] 10 0.0225221645
Looking at the actual data, observations 6 and 8 do seem to be more strongly related than the others.
> cbind(y_new, x_new) y_new x_new [1,] -0.233569767 0.31859541 [2,] -0.117117800 0.03750829 [3,] 0.513582873 0.05082900 [4,] -0.011106121 -0.04351698 [5,] 0.009617489 -0.26029149 [6,] 0.568708951 0.50164585 [7,] 0.126538479 0.08240215 [8,] -0.481982832 -0.71912020 [9,] -0.278126098 0.14858851 [10,] -0.136535493 -0.16495465
Coloring observations 6 and 8 differently makes this easy to see.
cols <- c(1, 1, 1, 1, 1, 3, 1, 3, 1, 1) + 1
plot(y ~ x, col = cols, pch = 16)

That completes the standardization of y and x. Back in elnetu after returning from standard, the upper and lower limits on the regression coefficients are standardized as well.
In addition, vlam is updated when flmin is 1 or greater. flmin is set to 1 when lambda is specified, and is otherwise expected to be a real number in $[0, 1)$.
So when lambda is specified (= when flmin is 1), vlam is adjusted by the weighted standard deviation of y.
This vlam is what is passed as ulam in the subsequent step (fitting). ulam itself was double(1), a single 0, when lambda is not specified, and the specified values in descending order otherwise.
In short, the scale of lambda appears to have been standardized too.
! jerr ã« 0 ã§ãªãå¤ãå ¥ã£ã¦ãã㨠return if(jerr.ne.0) return ! cl 㯠glmnet ã§ cl = rbind(lower.limits, upper.limits) ã¨å®ç¾©ããã ! åå¸°ä¿æ°ã®ä¸éã»å æ¸ cl=cl/ys ! æ¨æºåã®æå®ã 0 ã§ããã°ä»¥ä¸ã¯ã¹ããã if(isd .le. 0) goto 10091 ! 説æå¤æ°ãã¨ã«æ¨æºåå·®ãä¹ãã do 10101 j=1,ni cl(:,j)=cl(:,j)*xs(j) 10101 continue continue 10091 continue ! flmin 㯠glmnet ã®ãªãã§ flmin = as.double(lambda.min.ratio) ã§å®ç¾©ããã ! ããã§ lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04) if(flmin.ge.1.0) vlam=ulam/ys
Now for the fitting.
The elnet1 called here is the real core of {glmnet}, and the regression coefficients are computed in it.
Hardly any subroutines are called inside it: only one that fetches the initial parameters and one that displays the progress bar.
Phew, we have finally arrived. This one was long too.
! 3. Fitting
! Call elnet1, the core routine
call elnet1(parm,ni,ju,vp,cl,g,no,ne,nx,x,nlam,flmin,vlam,thr,maxit,xv, lmu,ca,ia,nin,rsq,alm,nlp,jerr)
This subroutine is moderate in size (about 180 lines), but its loops are intricate and walking through it will take a while, so that is all for this time. See you next time.
Understanding glmnet a little better ②
In the previous article we looked inside glmnet and found that it calls different functions depending on the family argument.
This time we will look at elnet, the function used when gaussian is specified.
The previous article is here.
Implementation of elnet
Let's get straight to the function named elnet.
Incidentally, typing elnet at the console does not display it. That is not because it is written in C or Fortran; it is simply not exported from glmnet, so you can view its body with glmnet:::elnet.
The function is not very long, so we will go directly to its contents. Like many other functions, elnet begins by receiving and checking its parameters.
The block below receives the number of iterations ( maxit ) and the observation weights ( weights ), then sets the value stored in a parameter named ka according to type.gaussian.
function (x, is.sparse, ix, jx, y, weights, offset, type.gaussian = c("covariance", "naive"), alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, vnames, maxit) { # 1. ãã©ã¡ã¼ã¿ã®åãåã ### maxit maxit = as.integer(maxit) ### weights weights = as.double(weights) ### type.gaussian type.gaussian = match.arg(type.gaussian) ka = as.integer(switch(type.gaussian, covariance = 1, naive = 2, ))
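The mapping from type.gaussian to ka in the block above can be tried in isolation. The following is a minimal sketch: ka_for is a hypothetical helper that copies only the match.arg and switch steps, not anything exported by glmnet.

```r
# Hypothetical helper reproducing only the mapping step of glmnet:::elnet.
ka_for <- function(type.gaussian = c("covariance", "naive")) {
  type.gaussian <- match.arg(type.gaussian)
  as.integer(switch(type.gaussian, covariance = 1, naive = 2))
}

ka_cov   <- ka_for("covariance")  # 1L, which leads to elnetu
ka_naive <- ka_for("naive")       # 2L, which leads to elnetn
c(ka_cov, ka_naive)
```

The integer conversion matters because the value is handed to Fortran through .Fortran, which expects the storage mode to match the declared type.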
ka later determines which of the two subroutines, elnetu or elnetn, is called, so the subroutine is switched according to type.gaussian.
Below, y and offset (if present) are converted to double.
The weighted mean of y is then used to compute the Null Deviance (the deviance of the null model).
### y ã® storage.mode storage.mode(y) = "double" ### offset if (is.null(offset)) { is.offset = FALSE } else { storage.mode(offset) = "double" is.offset = TRUE y = y - offset } ### éã¿ä»ãå¹³å ybar = weighted.mean(y, weights) ### Null Devianceï¼å¸°ç¡ã¢ãã«ã®æ®å·®é¸è±åº¦ï¼ nulldev = sum(weights * (y - ybar)^2) if (nulldev == 0) stop("y is constant; gaussian glmnet fails at standardization step")
The next block goes straight into fitting.
Whether is.sparse is set determines whether spelnet or elnet is called. In terms of arguments, spelnet does not wrap x in as.double, and adds ix and jx (both are numbers used to locate the nonzero elements of the sparse matrix).
# 2. ãã£ããã£ã³ã° ## çè¡åã§ãããã§é¢æ°ãå¤ãã fit = if (is.sparse) .Fortran("spelnet", ka, parm = alpha, nobs, nvars, x, # çè¡åã§ããå ´åã以ä¸ã® ix, jx ã弿°ã¨ãã¦è¿½å ããã # ix, jx ã¯çè¡åã«ãããéã¼ãã®è¦ç´ ã®ç´¯ç©åæ°ã¨è¡çªå· ix, jx, y, weights, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, maxit, lmu = integer(1), a0 = double(nlam), ca = double(nx * nlam), ia = integer(nx), nin = integer(nlam), rsq = double(nlam), alm = double(nlam), nlp = integer(1), jerr = integer(1), PACKAGE = "glmnet") else .Fortran("elnet", ka, parm = alpha, nobs, nvars, as.double(x), y, weights, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, maxit, lmu = integer(1), a0 = double(nlam), ca = double(nx * nlam), ia = integer(nx), nin = integer(nlam), rsq = double(nlam), alm = double(nlam), nlp = integer(1), jerr = integer(1), PACKAGE = "glmnet") # nx 㯠éã¼ãã®å¤æ°ã®åæ° # nlam ã¯æ¤è¨¼ãã lambda ã®åæ° # ãªã®ã§ ca ã¯å¤æ°ã®æ° * lambda ã®æ°
After the call returns, errors are checked and the required parameters are retrieved.
# 3. å¾å¦ç ## ã¨ã©ã¼ãã§ã㯠if (fit$jerr != 0) { errmsg = jerr(fit$jerr, maxit, pmax = nx, family = "gaussian") if (errmsg$fatal) stop(errmsg$msg, call. = FALSE) else warning(errmsg$msg, call. = FALSE) } ## ãã©ã¡ã¼ã¿ï¼åçãåå¸°ä¿æ°ãèªç±åº¦ã次å ãlambdaï¼ãåã£ã¦ãã outlist = getcoef(fit, nvars, nx, vnames) ## ãã©ã¡ã¼ã¿ï¼xxxxxxxxxxï¼ãåã£ã¦ã㦠outlist ã«çµåãã dev = fit$rsq[seq(fit$lmu)] outlist = c(outlist, list(dev.ratio = dev, nulldev = nulldev, npasses = fit$nlp, jerr = fit$jerr, offset = is.offset)) ## elnet ã¯ã©ã¹ãä»ä¸ãã class(outlist) = "elnet" outlist }
Next, let's look inside elnet, the core of elnet (confusing, I know).
Implementation of elnet (second time)
In the fitting section above, elnet was called as .Fortran("elnet"). Just as we saw with glm and GAM, glmnet also ends up in Fortran.
That said, the routine is still not large at this point: as shown below, it is about 30 lines (excluding comments).
subroutine elnet(ka,parm,no,ni,x,y,w,jd,vp,cl,ne,nx,nlam, flmin,u *lam,thr,isd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr) implicit double precision(a-h,o-z) double precision x(no,ni),y(no),w(no),vp(ni),ca(nx,nlam),cl(2,ni) double precision ulam(nlam),a0(nlam),rsq(nlam),alm(nlam) integer jd(*),ia(nx),nin(nlam) double precision, dimension (:), allocatable :: vq; ! vp ã 0.0 ã ã£ãå ´åã«ã¯ jerr = 100000 ã¨ã㦠return ãã¦ãã¾ã if(maxval(vp) .gt. 0.0)goto 10021 jerr=10000 return 10021 continue allocate(vq(1:ni),stat=jerr) ! ããã§ã jerr ã« 0 以å¤ã®æ°å¤ãå ¥ã£ã¦ããã return ãã¦ãã¾ã if(jerr.ne.0) return ! vp ã®å¤ã«ãã£ã¦ vq ãçæ ! ããã©ã«ã㯠1 ! ni 㯠nvars ã§å¤æ°ã®æ°ãªã®ã§ã vq ã«ã¯ããã©ã«ãã§ã¯å¤æ°ã®æ°ãå ¥ã ! ã§ããªãã§ sum(vq) ãªãã ã vq=max(0d0,vp) vq=vq*ni/sum(vq) ! elnetu ã elnetn ã®ã©ã¡ããå¼ã¶ã㯠ka .ne. 1 ã§ãããã§å¤æãã¦ãã ! 1 ã§ãªããã° elnetn ã 1 ãªã elnetu if(ka .ne. 1)goto 10041 call elnetu (parm,no,ni,x,y,w,jd,vq,cl,ne,nx,nlam,flmin,ulam,thr, *isd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr) goto 10051 10041 continue call elnetn (parm,no,ni,x,y,w,jd,vq,cl,ne,nx,nlam,flmin,ulam,thr,i *sd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr) 10051 continue continue deallocate(vq) return end
goto is used a lot..
Below the variable declarations, the points of interest are probably the behavior when vp is 0 and the place where elnetu is called.
As confirmed in the previous article, vp is defined inside glmnet as vp = as.double(penalty.factor).
penalty.factor is 1 by default, so normally goto 10021 jumps past this check.
The check is only triggered when penalty.factor is explicitly set to 0 for every variable (maxval(vp) must not exceed 0).
! vp ã¯å夿°ã«å¯¾ããç½°åã®éã¿ï¼ããã©ã«ã㯠1ï¼ ãå ¥ã£ããã¯ãã« ! vp = as.double(penalty.factor) ! jerr ã®æ°å¤ã§å¾ç¶ã®å¦çã§åºåããã¨ã©ã¼ã¡ãã»ã¼ã¸ã決ã¾ã if(maxval(vp) .gt. 0.0)goto 10021 jerr=10000 return 10021 continue allocate(vq(1:ni),stat=jerr)
So what happens when penalty.factor is set to 0? jerr is set to 10000 and the subroutine returns.
This jerr is what gets converted into an error message in the postprocessing we saw earlier, via errmsg = jerr(fit$jerr, maxit, pmax = nx, family = "gaussian").
The function named jerr is defined in glmnet, so typing
> glmnet:::jerr function (n, maxit, pmax, family) { if (n == 0) list(n = 0, fatal = FALSE, msg = "") else { errlist = switch(family, gaussian = jerr.elnet(n, maxit, pmax), binomial = jerr.lognet(n, maxit, pmax), multinomial = jerr.lognet(n, maxit, pmax), poisson = jerr.fishnet(n, maxit, pmax), cox = jerr.coxnet(n, maxit, pmax), mrelnet = jerr.mrelnet(n, maxit, pmax)) names(errlist) = c("n", "fatal", "msg") errlist$msg = paste("from glmnet Fortran code (error code ", n, "); ", errlist$msg, sep = "") errlist } }
shows its definition.
Looking at the function, errlist calls yet another function through switch(family, ~) and stores its result.
Checking jerr.elnet as well, we find
> glmnet:::jerr.elnet function (n, maxit, pmax) { if (n > 0) { if (n < 7777) msg = "Memory allocation error; contact package maintainer" else if (n == 7777) msg = "All used predictors have zero variance" else if (n == 10000) msg = "All penalty factors are <= 0" else msg = "Unknown error" list(n = n, fatal = TRUE, msg = msg) } else if (n < 0) { if (n > -10000) msg = paste("Convergence for ", -n, "th lambda value not reached after maxit=", maxit, " iterations; solutions for larger lambdas returned", sep = "") if (n < -10000) msg = paste("Number of nonzero coefficients along the path exceeds pmax=", pmax, " at ", -n - 10000, "th lambda value; solutions for larger lambdas returned", sep = "") list(n = n, fatal = FALSE, msg = msg) } }
the line else if (n == 10000) msg = "All penalty factors are <= 0", which tells us that all the penalty factors are 0 or less.
Moving on to the call to elnetu, whether elnetu or elnetn is called is determined by ka.
As touched on earlier, ka is defined by ka = as.integer(switch(type.gaussian, covariance = 1, naive = 2, )).
type.gaussian in turn is an argument of glmnet, defined as type.gaussian = ifelse(nvars < 500, "covariance", "naive").
If the number of variables is less than 500, it is covariance and ka is passed as 1, so the condition if(ka .ne. 1) does not hold and elnetu is called.
! elnetu ã elnetn ã®ã©ã¡ããå¼ã¶ã㯠ka .ne. 1 ã§ãããã§å¤æãã¦ãã ! 1 ã§ãªããã° elnetn ã 1 ãªã elnetu ! ka 㯠elnet ã®ç¬¬ä¸å¼æ° ! ka = as.integer(switch(type.gaussian, covariance = 1, naive = 2, )) ! ãã® covariance / naive ã¯å¤æ°ã®æ°ã§æ±ºã¾ã ! type.gaussian = ifelse(nvars < 500, "covariance", "naive") if(ka .ne. 1)goto 10041 call elnetu (parm,no,ni,x,y,w,jd,vq,cl,ne,nx,nlam,flmin,ulam,thr, *isd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr) goto 10051 10041 continue call elnetn (parm,no,ni,x,y,w,jd,vq,cl,ne,nx,nlam,flmin,ulam,thr,i *sd,intr,maxit, lmu,a0,ca,ia,nin,rsq,alm,nlp,jerr)
Next time we will look at this elnetu.
Understanding glmnet a little better ①
It has been a while since my last update (I always say that).
Background
I was reading 「スパース回帰分析とパターン認識」 (Sparse Regression Analysis and Pattern Recognition) from the データサイエンス入門シリーズ (Introduction to Data Science series) and found it so interesting that, as usual, I decided to look inside glmnet.
I have little experience using Lasso/Ridge at work, so my understanding may be wrong. Please keep that in mind.

スパース回帰分析とパターン認識 (データサイエンス入門シリーズ)
- Authors: 梅津 佑太, 西井 龍映, 上田 勇祐
- Release date: 2020/02/28
- Format: Paperback (softcover)
This is the book. It is a good one.
Results of running glmnet
As with GAM last time, let's first check what kind of results glmnet gives. I will run Code 1.2 on p. 12 of 「スパース回帰分析とパターン認識」 (hereafter, the textbook), slightly modified.
The code can be downloaded from here.
The environment is as follows.
> sessionInfo() R version 3.6.0 (2019-04-26) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Mojave 10.14.6 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib locale: [1] ja_JP.UTF-8/ja_JP.UTF-8/ja_JP.UTF-8/C/ja_JP.UTF-8/ja_JP.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_3.6.0 tools_3.6.0 grid_3.6.0 lattice_0.20-38
library(glmnet) library(plotmo) x <- scale(LifeCycleSavings[, 2:5]) y <- LifeCycleSavings[, 1] - mean(LifeCycleSavings[, 1]) lasso <- glmnet(x, y, family = "gaussian", alpha = 1) # alpha = 1 ã§ lasso ridge <- glmnet(x, y, family = "gaussian", alpha = 0) # alpha = 0 ã§ ridge ## directoryã¯é©å½ã«æå® png("./Image/glmnet_dive_01_01.png", width = 600, height = 400) plot_glmnet(lasso, xvar = "lambda", label = TRUE) dev.off() png("./Image/glmnet_dive_01_02.png", width = 600, height = 400) plot_glmnet(ridge, xvar = "lambda", label = TRUE) dev.off()


For the interpretation of the results, please see the textbook. In brief, glmnet adds a penalty on the size of the regression coefficients to the objective function, so that fitting shrinks the coefficients toward 0.
As in the graphs, varying the strength of the penalty lets us see how the coefficient of each variable changes.
In these graphs the penalty gets stronger from left to right, and for both Lasso and Ridge the coefficients become smaller (shrink) toward 0 as it does.
With Lasso the coefficients reach 0 one after another, whereas with Ridge the coefficients remain nonzero, if tiny, to the very end (the Degrees of Freedom at the top of the graph stays at 4). Methods such as Lasso that can estimate some coefficients as exactly 0 are called sparse estimation.
Implementation of glmnet
Now let's look at how the glmnet function is implemented.
As always, we first take in the whole thing to get an overview.
function (x, y, family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"), weights, offset = NULL, alpha = 1, nlambda = 100, lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04), lambda = NULL, standardize = TRUE, intercept = TRUE, thresh = 1e-07, dfmax = nvars + 1, pmax = min(dfmax * 2 + 20, nvars), exclude, penalty.factor = rep(1, nvars), lower.limits = -Inf, upper.limits = Inf, maxit = 1e+05, type.gaussian = ifelse(nvars < 500, "covariance", "naive"), type.logistic = c("Newton", "modified.Newton"), standardize.response = FALSE, type.multinomial = c("ungrouped", "grouped"), relax = FALSE, trace.it = 0, ...) { ### 1. ãã©ã¡ã¼ã¿ã®è¨å®ãåå¦çãã¨ã©ã¼ãã§ã㯠family = match.arg(family) if (alpha > 1) { warning("alpha >1; set to 1") alpha = 1 } if (alpha < 0) { warning("alpha<0; set to 0") alpha = 0 } alpha = as.double(alpha) this.call = match.call() nlam = as.integer(nlambda) y = drop(y) np = dim(x) if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns") nobs = as.integer(np[1]) if (missing(weights)) weights = rep(1, nobs) else if (length(weights) != nobs) stop(paste("number of elements in weights (", length(weights), ") not equal to the number of rows of x (", nobs, ")", sep = "")) nvars = as.integer(np[2]) dimy = dim(y) nrowy = ifelse(is.null(dimy), length(y), dimy[1]) if (nrowy != nobs) stop(paste("number of observations in y (", nrowy, ") not equal to the number of rows of x (", nobs, ")", sep = "")) vnames = colnames(x) if (is.null(vnames)) vnames = paste("V", seq(nvars), sep = "") ne = as.integer(dfmax) nx = as.integer(pmax) if (missing(exclude)) exclude = integer(0) if (any(penalty.factor == Inf)) { exclude = c(exclude, seq(nvars)[penalty.factor == Inf]) exclude = sort(unique(exclude)) } if (length(exclude) > 0) { jd = match(exclude, seq(nvars), 0) if (!all(jd > 0)) stop("Some excluded variables out of range") penalty.factor[jd] = 1 jd = as.integer(c(length(jd), jd)) } else jd = as.integer(0) vp = 
as.double(penalty.factor) internal.parms = glmnet.control() if (internal.parms$itrace) trace.it = 1 else { if (trace.it) { glmnet.control(itrace = 1) on.exit(glmnet.control(itrace = 0)) } } if (any(lower.limits > 0)) { stop("Lower limits should be non-positive") } if (any(upper.limits < 0)) { stop("Upper limits should be non-negative") } lower.limits[lower.limits == -Inf] = -internal.parms$big upper.limits[upper.limits == Inf] = internal.parms$big if (length(lower.limits) < nvars) { if (length(lower.limits) == 1) lower.limits = rep(lower.limits, nvars) else stop("Require length 1 or nvars lower.limits") } else lower.limits = lower.limits[seq(nvars)] if (length(upper.limits) < nvars) { if (length(upper.limits) == 1) upper.limits = rep(upper.limits, nvars) else stop("Require length 1 or nvars upper.limits") } else upper.limits = upper.limits[seq(nvars)] cl = rbind(lower.limits, upper.limits) if (any(cl == 0)) { fdev = glmnet.control()$fdev if (fdev != 0) { glmnet.control(fdev = 0) on.exit(glmnet.control(fdev = fdev)) } } storage.mode(cl) = "double" isd = as.integer(standardize) intr = as.integer(intercept) if (!missing(intercept) && family == "cox") warning("Cox model has no intercept") jsd = as.integer(standardize.response) thresh = as.double(thresh) if (is.null(lambda)) { if (lambda.min.ratio >= 1) stop("lambda.min.ratio should be less than 1") flmin = as.double(lambda.min.ratio) ulam = double(1) } else { flmin = as.double(1) if (any(lambda < 0)) stop("lambdas should be non-negative") ulam = as.double(rev(sort(lambda))) nlam = as.integer(length(lambda)) } is.sparse = FALSE ix = jx = NULL if (inherits(x, "sparseMatrix")) { is.sparse = TRUE x = as(x, "CsparseMatrix") x = as(x, "dgCMatrix") ix = as.integer(x@p + 1) jx = as.integer(x@i + 1) x = as.double(x@x) } if (trace.it) { if (relax) cat("Training Fit\n") pb <- createPB(min = 0, max = nlam, initial = 0, style = 3) } kopt = switch(match.arg(type.logistic), Newton = 0, modified.Newton = 1) if (family == 
"multinomial") { type.multinomial = match.arg(type.multinomial) if (type.multinomial == "grouped") kopt = 2 } kopt = as.integer(kopt) ### 2. ãã£ããã£ã³ã° fit = switch(family, gaussian = elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, vnames, maxit), poisson = fishnet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, vnames, maxit), binomial = lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, vnames, maxit, kopt, family), multinomial = lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, vnames, maxit, kopt, family), cox = coxnet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, vnames, maxit), mgaussian = mrelnet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, jsd, intr, vnames, maxit)) if (trace.it) { utils::setTxtProgressBar(pb, nlam) close(pb) } ### 3. å¾å¦ç if (is.null(lambda)) fit$lambda = fix.lam(fit$lambda) fit$call = this.call fit$nobs = nobs class(fit) = c(class(fit), "glmnet") if (relax) relax.glmnet(fit, x = x, y = y, weights = weights, offset = offset, lower.limits = lower.limits, upper.limits = upper.limits, check.args = FALSE, ...) else fit }
In glmnet, processing proceeds through the following steps:
- Parameter setup, preprocessing, and error checking
- Fitting
- Postprocessing
This is the same as the glm and gam we looked at in the past.
Let's go through each step in detail.
1. Parameter setup, preprocessing, and error checking
We start with the part that sets parameters and preprocesses. First, it checks that the specified family is valid.
## Check that the specified family is an allowed argument
family = match.arg(family)
The families available in glmnet differ from those in glm: Gamma / inverse.gaussian / quasi- are not available, but multinomial / cox / mgaussian are.
Here multinomial appears to mean the multinomial distribution and mgaussian the multivariate normal distribution.
The family check uses the match.arg function.
Its behavior is a little hard to grasp, but this blog is a helpful reference.
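As a small sketch of how match.arg behaves, the function below uses the same family choices as glmnet. pick_family is a hypothetical name introduced only for this illustration.

```r
# Hypothetical function with the same family choices as glmnet.
pick_family <- function(family = c("gaussian", "binomial", "poisson",
                                   "multinomial", "cox", "mgaussian")) {
  match.arg(family)
}

default_family <- pick_family()        # nothing given: the first choice is used
partial_family <- pick_family("bin")   # an unambiguous prefix is completed
bad_family <- tryCatch(pick_family("Gamma"),   # not among the choices
                       error = function(e) "error")
c(default_family, partial_family, bad_family)
```

This is why glmnet(x, y) without a family argument fits a gaussian model, and why a family that glm accepts, such as Gamma, is rejected at this first line.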
Next, alpha is checked:
## alpha ### Lassoã¨Ridgeããããã«å¯¾ããããã«ãã£ã®é åãæ±ºãããã©ã¡ã¼ã¿ ### glmnetã«ãããç½°åé ã¯ä»¥ä¸ã§å®ç¾© ### alphaã¯0~1ã§ã1ãªãLassoã0ãªãRidgeã«å¯¾å¿ if (alpha > 1) { warning("alpha >1; set to 1") alpha = 1 } if (alpha < 0) { warning("alpha<0; set to 0") alpha = 0 } alpha = as.double(alpha)
In glmnet, alpha controls the proportion of the penalty placed on the L1 and L2 norms of the regression coefficients.
More specifically, glmnet defines the penalty term as follows (from p. 19 of https://cran.r-project.org/web/packages/glmnet/glmnet.pdf):
$$ (1-\alpha)\frac{1}{2}||\beta||^2_2 + \alpha||\beta||_1 $$
The code at the top used alpha = 1 or alpha = 0. From the expression above, with alpha = 1 the penalty on the L2 norm vanishes and only the L1 norm remains (Lasso); conversely, with alpha = 0 the penalty on the L1 norm vanishes and the L2 norm remains (Ridge).
Setting alpha in (0, 1) blends the two in the corresponding proportions.
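To make the role of alpha concrete, here is a minimal sketch that evaluates the penalty term for a made-up coefficient vector. enet_penalty and beta are names chosen for this example, and the multiplication by lambda is left out.

```r
# Elastic-net penalty as written in the glmnet documentation:
# (1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1
enet_penalty <- function(beta, alpha) {
  (1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))
}

beta <- c(1, -2, 0.5)
lasso_pen <- enet_penalty(beta, alpha = 1)    # L1 norm only
ridge_pen <- enet_penalty(beta, alpha = 0)    # half the squared L2 norm
mixed_pen <- enet_penalty(beta, alpha = 0.5)  # equal blend of the two
c(lasso = lasso_pen, ridge = ridge_pen, mixed = mixed_pen)
```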
I could not work out why the penalty on the L2 norm is multiplied by 1/2.
The paper cited in the glmnet help already defines it as $(1-\alpha)1/2||\beta||^2_2$.
scikit-learn likewise appears to multiply the L2 norm by 0.5 (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html).
If anyone knows the reason, please tell me.
Next, match.call() is used to put the argument specification into its full form:
## match.call
this.call = match.call()
That probably does not make much sense on its own, so let's check with the following example:
myfun <- function(abc, def, ghi) { return(abc + 2*def + 3*ghi) }
As above, we define a function that takes abc, def, and ghi as arguments.
In R, when no argument names are given, the values are assigned in order:
> myfun(1, 2, 3) [1] 14
When only some arguments are named, the named ones are assigned as specified and the rest appear to be assigned in order.
> myfun(def = 3, 4, 5) [1] 25
Incidentally, an argument name can be abbreviated as long as it is uniquely determined:
> myfun(d = 3, 4, 5)
[1] 25
On the other hand, in a call like the following there are two arguments starting with g, so the name is not uniquely determined and an error results.
> myfun2 <- function(abc, def, ghi, gjk) {
+ return(abc + 2*def + 3*ghi + 4*gjk)
+ }
> myfun2(g = 3, 4, 5, 6)
Error in myfun2(g = 3, 4, 5, 6) : argument 1 matches multiple formal arguments
So what do we get when match.call is applied to a function call?
> match.call(myfun, call("myfun", 1, def = 3, ghi = 5)) myfun(abc = 1, def = 3, ghi = 5)
As shown, we can obtain what was assigned to each argument. Handy.
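Inside glmnet, match.call() is called with no arguments from within the function itself. A minimal sketch of that usage follows; myfun3 is a hypothetical function written only for this illustration.

```r
# Hypothetical function that records its own call the way glmnet does.
myfun3 <- function(abc, def, ghi) {
  this.call <- match.call()   # the call with every argument name spelled out
  list(value = abc + 2 * def + 3 * ghi, call = this.call)
}

res <- myfun3(1, d = 3, 5)    # d is completed to def, the rest go in order
res$call
res$value
```

glmnet stores this object as fit$call, so the printed fit shows the call with full argument names regardless of how the user abbreviated them.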
Next comes nlambda.
This specifies not $\lambda$ (the strength of the penalty) itself but the number of $\lambda$ values to evaluate (number of lambda); the default is 100.
## nlambda
nlam = as.integer(nlambda)
From here on, y, x, and weights are checked:
## drop y = drop(y) ## x ### x ã¯ï¼å以䏿ã¤å¿ è¦ãããã®ã§ãåå帰ã¯ã§ããªãæ§å np = dim(x) if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns") ### x ã®ã¬ã³ã¼ãæ° nobs = as.integer(np[1]) ### weights ### æªå ¥åã®ã¨ã㯠1 ãä¸ããweights 㨠nobs ãä¸è´ããªãã¨ãã¯ã¨ã©ã¼ if (missing(weights)) weights = rep(1, nobs) else if (length(weights) != nobs) stop(paste("number of elements in weights (", length(weights), ") not equal to the number of rows of x (", nobs, ")", sep = "")) ### 夿°ã®æ° nvars = as.integer(np[2]) ## y dimy = dim(y) ### y ã®ã¬ã³ã¼ãæ° nrowy = ifelse(is.null(dimy), length(y), dimy[1]) ### y 㨠x ã§ã¬ã³ã¼ãæ°ãåããªãã¨ãã¯ã¨ã©ã¼ if (nrowy != nobs) stop(paste("number of observations in y (", nrowy, ") not equal to the number of rows of x (", nobs, ")", sep = "")) ## 夿°å vnames = colnames(x) if (is.null(vnames)) vnames = paste("V", seq(nvars), sep = "")
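To see what the drop(y) call at the top of the block above does, here is a short sketch with made-up matrices: a one-column matrix loses its dimension attribute, while a two-column matrix is left alone.

```r
y_mat <- matrix(1:5, ncol = 1)         # 5 x 1 matrix, e.g. a response passed as a column
y_vec <- drop(y_mat)                   # the dimension of length 1 is removed
y_two <- drop(matrix(1:6, ncol = 2))   # 3 x 2 matrix: nothing to drop

dim(y_mat)
dim(y_vec)
dim(y_two)
```

This explains why the later check uses ifelse(is.null(dimy), length(y), dimy[1]): after drop, a single response is a plain vector, while a multi-column response (as used by mgaussian) keeps its dimensions.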
As for drop applied to y, this function removes redundant dimensions, those whose length is 1.
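Next, an error is returned if the number of rows of x does not match weights or y.
Below, the variables to include in the model and the variables allowed to be nonzero are specified
(my understanding of nx (= pmax) is a bit shaky, so I have also included the description from help):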
## èªç±åº¦ ### ã¢ãã«ã«å«ã¾ãã夿°ã®ä¸éãæå® ### dfmax = nvars + 1 ne = as.integer(dfmax) ### éã¼ãã¨ãã夿°ã®æ°ã®ä¸é(?) ### Limit the maximum number of variables ever to be nonzero ### pmax = min(dfmax * 2 + 20, nvars) nx = as.integer(pmax) ### é¤å¤å¯¾è±¡ã¨ãªã夿°ã®æå® if (missing(exclude)) exclude = integer(0)
Next, penalty.factor is specified to give each variable a different penalty.
This value multiplies lambda, so setting penalty.factor = 0 for a particular variable, for example, leaves it unpenalized (as a result it is always included in the model):
## 夿°ãã¨ã«ç°ãªãããã«ãã£ãä¸ãã ### ããã©ã«ã㯠1 ãå ¥ã ### Inf ãæå®ããã¦ãã夿°ã¯ exclude ã¨ãã¦æ±ããã if (any(penalty.factor == Inf)) { exclude = c(exclude, seq(nvars)[penalty.factor == Inf]) exclude = sort(unique(exclude)) } if (length(exclude) > 0) { jd = match(exclude, seq(nvars), 0) if (!all(jd > 0)) stop("Some excluded variables out of range") penalty.factor[jd] = 1 jd = as.integer(c(length(jd), jd)) } else jd = as.integer(0) vp = as.double(penalty.factor)
Since we are here, let's actually try it.
Taking the code from the top, we set lambda to some arbitrary value as follows.
x <- scale(LifeCycleSavings[, 2:5]) y <- LifeCycleSavings[, 1] - mean(LifeCycleSavings[, 1])
> coef(glmnet(x, y, family = "gaussian", alpha = 1, lambda = 0.3)) 5 x 1 sparse Matrix of class "dgCMatrix" s0 (Intercept) 1.182354e-15 pop15 -1.691002e+00 pop75 . dpi . ddpi 9.816514e-01
Here the 2nd and 3rd variables, pop75 and dpi, were estimated as 0.
If we now set penalty.factor to 0 for these variables,
> coef(glmnet(x, y, family = "gaussian", alpha = 1, lambda = 0.3, + penalty.factor = c(1, 0, 0, 1))) 5 x 1 sparse Matrix of class "dgCMatrix" s0 (Intercept) 9.523943e-16 pop15 -7.827680e-01 pop75 8.127991e-01 dpi -1.560908e-01 ddpi 6.812498e-01
they are now properly estimated.
Conversely, if we increase penalty.factor for pop15,
> coef(glmnet(x, y, family = "gaussian", alpha = 1, lambda = 0.3, + penalty.factor = c(2, 0, 0, 1))) 5 x 1 sparse Matrix of class "dgCMatrix" s0 (Intercept) 7.266786e-16 pop15 . pop75 1.374655e+00 dpi 2.586151e-02 ddpi 9.300500e-01
As shown, pop15 is now dropped from the model.
Furthermore, with penalty.factor = Inf the variable is treated as exclude.
Next, the parameters held by glmnet.control are picked up.

    ## Parameters held internally as defaults
    internal.parms = glmnet.control()
    ### Show a progress bar?
    if (internal.parms$itrace) trace.it = 1
    else {
      if (trace.it) {
        glmnet.control(itrace = 1)
        on.exit(glmnet.control(itrace = 0))
      }
    }
Next, the upper and lower limits on the regression coefficients are set. Note that the lower limit can only be non-positive and the upper limit can only be non-negative.

    ## Upper and lower limits
    ### Only non-positive values are allowed for lower.limits
    if (any(lower.limits > 0)) {
      stop("Lower limits should be non-positive")
    }
    ### Conversely, only non-negative values are allowed for upper.limits
    if (any(upper.limits < 0)) {
      stop("Upper limits should be non-negative")
    }
    ### Entries left at Inf (the default) are replaced by a specific value (9.9e35)
    lower.limits[lower.limits == -Inf] = -internal.parms$big
    upper.limits[upper.limits == Inf] = internal.parms$big
    ### Check consistency with nvars
    if (length(lower.limits) < nvars) {
      ### A scalar lower.limits is applied to all nvars variables
      if (length(lower.limits) == 1) lower.limits = rep(lower.limits, nvars)
      else stop("Require length 1 or nvars lower.limits")
    }
    ### If lower.limits is longer than nvars, the leading entries are used
    else lower.limits = lower.limits[seq(nvars)]
    ### Check consistency with nvars (same as for lower.limits)
    if (length(upper.limits) < nvars) {
      if (length(upper.limits) == 1) upper.limits = rep(upper.limits, nvars)
      else stop("Require length 1 or nvars upper.limits")
    }
    else upper.limits = upper.limits[seq(nvars)]
    ### Upper and lower limits
    ### coefficient limit?
    cl = rbind(lower.limits, upper.limits)
    ### When lower or upper contains 0
    ### A guard against errors such as division by zero?
    if (any(cl == 0)) {
      ### fdev is the minimum (fractional) change in deviance
      ### minimum fractional change in deviance for stopping path; factory default = 1.0e-5
      fdev = glmnet.control()$fdev
      if (fdev != 0) {
        glmnet.control(fdev = 0)
        on.exit(glmnet.control(fdev = fdev)) # runs when the function exits
      }
    }
    storage.mode(cl) = "double"
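The recycling and truncation of the limits may be easier to see on a small input. The following is a minimal base R sketch; `normalize_limit` is a hypothetical helper (not a glmnet function), and the value 9.9e35 is taken from the comment in the excerpt rather than read from `glmnet.control()`.

```r
# Hypothetical helper (not part of glmnet): recycle or truncate a limit vector
# to length nvars and replace infinite entries by a large finite value.
normalize_limit <- function(limit, nvars, big = 9.9e35) {
  limit[limit == -Inf] <- -big
  limit[limit == Inf] <- big
  if (length(limit) < nvars) {
    if (length(limit) == 1) limit <- rep(limit, nvars)
    else stop("Require length 1 or nvars limits")
  } else {
    limit <- limit[seq(nvars)]   # keep only the leading nvars entries
  }
  limit
}

nvars <- 4
cl <- rbind(lower.limits = normalize_limit(-Inf, nvars),
            upper.limits = normalize_limit(c(1, 2, Inf, 4, 5), nvars))
storage.mode(cl) <- "double"
cl
```

The scalar lower limit is expanded to all four variables, while the five-element upper limit is cut down to its first four entries.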
These are the settings for standardization and the intercept. The standardization itself is carried out inside the functions called later, so only the flags are set here.

    ## Standardization
    ### standardize and intercept default to TRUE, so these become 1
    isd = as.integer(standardize)
    intr = as.integer(intercept)
    ### Warning for Cox regression
    if (!missing(intercept) && family == "cox")
      warning("Cox model has no intercept")
    ### standardize.response specifies whether the response is standardized when family = "mgaussian"
    jsd = as.integer(standardize.response)

The threshold used to judge convergence is set.

    ## Convergence
    ### Convergence threshold for coordinate descent
    thresh = as.double(thresh)
Next come the settings related to lambda. I could not fully understand how flmin and ulam are used, so their explanation is omitted.
As the help also says, lambda is normally given as a vector of candidate values rather than a single value.

Avoid supplying a single value for lambda (for predictions after CV use predict() instead).

    ## lambda
    ### Size of the penalty
    ### If lambda is not given, flmin is set to lambda.min.ratio and ulam to double(1) (a length-1 zero vector)
    ### lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04)
    if (is.null(lambda)) {
      if (lambda.min.ratio >= 1) stop("lambda.min.ratio should be less than 1")
      flmin = as.double(lambda.min.ratio)
      ulam = double(1)
    }
    ### If lambda is given, flmin (lower limit?) is set to 1 and ulam (upper limit?) to lambda in decreasing order
    else {
      flmin = as.double(1)
      if (any(lambda < 0)) stop("lambdas should be non-negative")
      ulam = as.double(rev(sort(lambda)))
      nlam = as.integer(length(lambda))
    }
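To make the two branches concrete, here is a minimal base R sketch. `prepare_lambda` is a hypothetical helper (not a glmnet function), and the default `nlambda = 100` is an assumption used only so that both branches return the same fields.

```r
# Hypothetical helper (not part of glmnet): mirrors the excerpt's lambda handling.
prepare_lambda <- function(lambda = NULL, lambda.min.ratio = 1e-04, nlambda = 100) {
  if (is.null(lambda)) {
    if (lambda.min.ratio >= 1) stop("lambda.min.ratio should be less than 1")
    # No user-supplied lambda: ulam is only a length-1 placeholder
    list(flmin = as.double(lambda.min.ratio), ulam = double(1),
         nlam = as.integer(nlambda))
  } else {
    if (any(lambda < 0)) stop("lambdas should be non-negative")
    # User-supplied lambda: sorted in decreasing order, flmin fixed at 1
    list(flmin = as.double(1), ulam = as.double(rev(sort(lambda))),
         nlam = as.integer(length(lambda)))
  }
}

auto <- prepare_lambda()
user <- prepare_lambda(lambda = c(0.1, 1, 0.5))
auto$ulam
user$ulam
```

In the user-supplied case the values are reordered from largest to smallest, regardless of the order in which they were passed.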
Next is the handling of sparse matrices.
If the input X is a sparse matrix, it is converted to the dgCMatrix format.
Here dgCMatrix is a sparse matrix format that is compressed along the column direction.

    ## sparse matrix
    ### If x is a Matrix::sparseMatrix, convert it to Matrix::dgCMatrix
    ### dgCMatrix: compressed storage of a sparse matrix in column order (CSC format)
    is.sparse = FALSE
    ix = jx = NULL
    if (inherits(x, "sparseMatrix")) {
      is.sparse = TRUE
      x = as(x, "CsparseMatrix")
      x = as(x, "dgCMatrix")
      ### x@p holds the cumulative counts of non-zero values per column (length: number of columns + 1)
      ### diff(x@p + 1) gives the number of non-zero values in each column
      ix = as.integer(x@p + 1)
      ### x@i holds the row numbers of the non-zero values (so length(x@i) equals the number of non-zero values)
      ### It is 0-indexed, so +1 is presumably added to match R's convention
      jx = as.integer(x@i + 1)
      ### x@x is the vector of the non-zero values themselves
      x = as.double(x@x)
    }
Since we are here, let's also look at how the values of a sparse matrix are stored. Create a sparse matrix as follows:

    set.seed(1234)
    i <- c(1, 5, 18)
    j <- c(4, 13, 19)
    n <- rnorm(3)
    m <- matrix(0, 20, 20)
    for (k in 1:length(n)) {
      m[i[k], j[k]] <- n[k]
    }
    s_m <- as(m, "dgCMatrix")

Here s_m is the matrix m handled as a sparse matrix.
Checking with str() shows that s_m stores the following slots:

- @ i: row numbers of the non-zero elements (note that they are 0-indexed)
- @ p: cumulative counts of the non-zero elements in each column
- @ Dim: dimensions of the matrix
- @ Dimnames: names of each dimension of the matrix
- @ x: values of the non-zero elements
- @ factors: ? (I could not work this one out)

    > str(s_m)
    Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      ..@ i       : int [1:3] 0 4 17
      ..@ p       : int [1:21] 0 0 0 0 1 1 1 1 1 1 ...
      ..@ Dim     : int [1:2] 20 20
      ..@ Dimnames:List of 2
      .. ..$ : NULL
      .. ..$ : NULL
      ..@ x       : num [1:3] -1.207 0.277 1.084
      ..@ factors : list()
@ i holds the row number of each non-zero element, so it corresponds to the row indices i used when building the matrix m. Because it is 0-indexed, each number is smaller by one.

    > print(i - 1)
    [1]  0  4 17
    > print(s_m@i)
    [1]  0  4 17
The slot that is a little harder to read is @ p. It stores the cumulative number of non-zero elements per column, one entry per column (a leading 0 is added, so its length is the number of columns + 1).
In this example the matrix has 20 columns, so the length is 21.

    > length(s_m@p)
    [1] 21

Because this vector holds cumulative counts of non-zero elements, taking differences reveals which columns of the original matrix contain non-zero elements.

    > diff(s_m@p)
     [1] 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0

Compare this with j, which specified the column numbers:

    > which(diff(s_m@p) == 1)
    [1]  4 13 19
    > j
    [1]  4 13 19

They match.
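The relationship between @ i, @ p and the original (row, column) positions can also be checked without the Matrix package. The sketch below writes the slot contents out by hand from the str() output above (the variable names `i_slot` and `p_slot` are only illustrative) and recovers the column of each stored value from the cumulative counts.

```r
# Slot contents written out by hand from the str(s_m) output,
# so that the example needs only base R.
i_slot <- c(0L, 4L, 17L)                 # 0-based row indices (s_m@i)
nnz_per_col <- integer(20)
nnz_per_col[c(4, 13, 19)] <- 1L          # one non-zero in columns 4, 13 and 19
p_slot <- c(0L, cumsum(nnz_per_col))     # cumulative counts (s_m@p), length 21

rows <- i_slot + 1L                      # back to R's 1-based row numbers
# Each column index is repeated once per non-zero element it contains
cols <- rep(seq_len(length(p_slot) - 1L), diff(p_slot))
cbind(row = rows, col = cols)
```

The recovered pairs are the same (i, j) positions that were used to fill m.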
In the processing that follows, ix is assigned the cumulative number of non-zero elements per column (+1), and jx is assigned the row numbers.
x is assigned the non-zero values of the original sparse matrix as a plain vector, so when the matrix of explanatory variables is sparse, it is handled as a vector rather than a matrix from this point on.
Next is the progress bar setting (so there is one).

    ## Progress bar
    if (trace.it) {
      if (relax) cat("Training Fit\n")
      pb <- createPB(min = 0, max = nlam, initial = 0, style = 3)
    }
Finally, the optimization method is specified.
When family is binomial or multinomial, the glmnet arguments type.logistic and type.multinomial are evaluated, and the function called (in a later step) changes accordingly.
Specifically, this determines which of lognet2n, lognetn and multlognetn is chosen.
I plan to cover this on another occasion.

    ## Optimization method (for logistic and multinomial logistic)
    ### type.logistic = c("Newton", "modified.Newton")
    ### Returns 0 if Newton is specified, 1 if modified.Newton is specified
    ### If "Newton" then the exact hessian is used (default), while "modified.Newton" uses an upper-bound on the hessian, and can be faster.
    kopt = switch(match.arg(type.logistic), Newton = 0, modified.Newton = 1)
    ### type.multinomial = c("ungrouped", "grouped")
    ### For multinomial logistic with grouped, kopt becomes 2
    ### If "grouped" then a grouped lasso penalty is used on the multinomial coefficients for a variable. This ensures they are all in or out together.
    ### The default is "ungrouped"
    if (family == "multinomial") {
      type.multinomial = match.arg(type.multinomial)
      if (type.multinomial == "grouped") kopt = 2
    }
    kopt = as.integer(kopt)
match.arg was used near the beginning to check family and is used again here, so let's confirm how it behaves:

    ### Define a function that takes type.logistic as an argument
    myfun <- function(a = "aaa", type.logistic = c("Newton", "modified.Newton")) {
      ### Check the argument of the calling function and assign 0 for Newton, 1 for modified.Newton
      kopt <- switch(match.arg(type.logistic), Newton = 0, modified.Newton = 1)
      kopt
    }

With the function above defined, calling it as follows returns 0, 0 and 1 respectively.

    > myfun()
    [1] 0
    > myfun(type.logistic = "Newton")
    [1] 0
    > myfun(type.logistic = "modified.Newton")
    [1] 1
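Two further properties of match.arg are worth a quick check: it accepts unambiguous partial matches, and it raises an error for values outside the candidate list. The sketch below redefines a shortened version of the `myfun` above and wraps the invalid call in `tryCatch` so that the example runs to completion.

```r
# Shortened version of the myfun defined above
myfun <- function(type.logistic = c("Newton", "modified.Newton")) {
  switch(match.arg(type.logistic), Newton = 0, modified.Newton = 1)
}

# An unambiguous prefix is matched to "modified.Newton"
partial <- myfun(type.logistic = "mod")

# A value that is not among the candidates raises an error
invalid <- tryCatch(myfun(type.logistic = "BFGS"),
                    error = function(e) "error")
```

This is why a misspelled type.logistic fails early in glmnet() instead of silently falling back to the default.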
2. Fitting
With the parameter setup and preprocessing finished, the next step is fitting.
That said, all that happens here is that the function to call is switched according to family, so let's skip the details for now.

    # Fitting
    ## Switch the function called afterwards according to family
    fit = switch(family,
      ### elnet for gaussian
      gaussian = elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian,
                       alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                       thresh, isd, intr, vnames, maxit),
      ### fishnet for poisson
      poisson = fishnet(x, is.sparse, ix, jx, y, weights, offset,
                        alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                        thresh, isd, intr, vnames, maxit),
      ### lognet for binomial
      binomial = lognet(x, is.sparse, ix, jx, y, weights, offset,
                        alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                        thresh, isd, intr, vnames, maxit, kopt, family),
      ### lognet for multinomial as well
      multinomial = lognet(x, is.sparse, ix, jx, y, weights, offset,
                           alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                           thresh, isd, intr, vnames, maxit, kopt, family),
      ### coxnet for cox
      cox = coxnet(x, is.sparse, ix, jx, y, weights, offset,
                   alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                   thresh, isd, vnames, maxit),
      ### mrelnet for mgaussian
      mgaussian = mrelnet(x, is.sparse, ix, jx, y, weights, offset,
                          alpha, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
                          thresh, isd, jsd, intr, vnames, maxit))
    ## Progress bar
    if (trace.it) {
      utils::setTxtProgressBar(pb, nlam)
      close(pb)
    }
Comparing the arguments passed to each of these functions gives the following table (I could not fully work out some of them; for those, the description only restates how the value is set in the code above):

| Argument | Description | elnet | fishnet | lognet | coxnet | mrelnet |
|---|---|---|---|---|---|---|
| x | Matrix of explanatory variables | ○ | ○ | ○ | ○ | ○ |
| is.sparse | Whether x is a sparse matrix | ○ | ○ | ○ | ○ | ○ |
| ix | Cumulative count of non-zero elements in the sparse matrix | ○ | ○ | ○ | ○ | ○ |
| jx | Row numbers of non-zero elements in the sparse matrix | ○ | ○ | ○ | ○ | ○ |
| y | Matrix of the response variable | ○ | ○ | ○ | ○ | ○ |
| weights | Observation weights | ○ | ○ | ○ | ○ | ○ |
| offset | Offset | ○ | ○ | ○ | ○ | ○ |
| type.gaussian | 1: covariance, 2: naïve | ○ | - | - | - | - |
| alpha | Parameter balancing the L1 and L2 penalties | ○ | ○ | ○ | ○ | ○ |
| nobs | Number of records | ○ | ○ | ○ | ○ | ○ |
| nvars | Number of explanatory variables | ○ | ○ | ○ | ○ | ○ |
| jd | ? (built from exclude: number of excluded variables followed by their indices) | ○ | ○ | ○ | ○ | ○ |
| vp | Penalty weight for each variable (penalty.factor) | ○ | ○ | ○ | ○ | ○ |
| cl | ? (rbind of lower.limits and upper.limits) | ○ | ○ | ○ | ○ | ○ |
| ne | Upper limit on the number of variables in the model; ne = dfmax = nvars + 1 | ○ | ○ | ○ | ○ | ○ |
| nx | Upper limit on the number of variables that become nonzero? | ○ | ○ | ○ | ○ | ○ |
| nlam | Number of lambda values | ○ | ○ | ○ | ○ | ○ |
| flmin | ? (lambda.min.ratio, or 1 when lambda is given) | ○ | ○ | ○ | ○ | ○ |
| ulam | ? (the given lambda values in decreasing order) | ○ | ○ | ○ | ○ | ○ |
| thresh | Convergence threshold | ○ | ○ | ○ | ○ | ○ |
| isd | Whether to standardize | ○ | ○ | ○ | ○ | ○ |
| jsd | ? (integer flag from standardize.response) | - | - | - | - | ○ |
| intr | Whether to include an intercept | ○ | ○ | ○ | - | ○ |
| vnames | Variable names | ○ | ○ | ○ | ○ | ○ |
| maxit | Upper limit on the number of iterations | ○ | ○ | ○ | ○ | ○ |
| kopt | Optimization method | - | - | ○ | - | - |
| family | family | - | - | ○ | - | - |
3. Post-processing
The last step is post-processing.

    # Post-processing
    ## If lambda was not specified and fit$lambda holds three or more values, replace the first one
    ## glmnet::fix.lam
    ## function (lam) {
    ##   if (length(lam) > 2) {
    ##     llam = log(lam)
    ##     lam[1] = exp(2 * llam[2] - llam[3])
    ##   }
    ##   lam
    ## }
    if (is.null(lambda)) fit$lambda = fix.lam(fit$lambda)
    ## call
    fit$call = this.call
    ## Number of records
    fit$nobs = nobs
    ## Add glmnet to the class
    class(fit) = c(class(fit), "glmnet")
    # Return
    ## If relax is TRUE, refit the model without penalty for each active set in the solution path
    ## If TRUE then for each active set in the path of solutions, the model is refit without any regularization. See details for more information.
    ## This argument is new, and users may experience convergence issues with small datasets, especially with non-gaussian families.
    ## Limiting the value of 'maxp' can alleviate these issues in some cases.
    if (relax)
      relax.glmnet(fit, x = x, y = y, weights = weights, offset = offset,
                   lower.limits = lower.limits, upper.limits = upper.limits,
                   check.args = FALSE, ...)
    else fit
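The effect of fix.lam is easy to verify by hand. The sketch below copies the definition quoted in the comments above and applies it to a made-up lambda sequence; the first value 9.9e35 is an assumption chosen for illustration, standing in for a very large placeholder at the head of the path.

```r
# Copied from the commented-out definition in the excerpt
fix.lam <- function(lam) {
  if (length(lam) > 2) {
    llam <- log(lam)
    # Extrapolate the first value log-linearly from the second and third
    lam[1] <- exp(2 * llam[2] - llam[3])
  }
  lam
}

lam <- c(9.9e35, 2, 1)      # made-up sequence; the first entry is a placeholder
fixed <- fix.lam(lam)       # first entry becomes exp(2 * log(2) - log(1)) = 4
short <- fix.lam(c(5, 1))   # fewer than three values: returned unchanged
```

In other words, the first lambda is replaced so that the first three values are equally spaced on the log scale.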
The step that stands out in this post-processing is the relax part.
According to the help, relax is described as follows:

If relax=TRUE a duplicate sequence of models is produced, where each active set in the elastic-net path is refit without regularization. The result of this is a matching "glmnet" object which is stored on the original object in a component named "relaxed", and is part of the glmnet output.

So it appears to be an option that takes the variables selected by glmnet and refits the model without the penalty.
It is quickest to just try it, so run the following:

    lasso_02 <- glmnet(x, y, family = "gaussian", relax = T)
Compared with the earlier result (lasso), a component named lasso_02$relaxed has been added, and its structure is almost the same as that of lasso.

    > str(lasso)
    List of 12
     $ a0       : Named num [1:68] 6.11e-16 6.71e-16 7.26e-16 7.76e-16 8.22e-16 ...
      ..- attr(*, "names")= chr [1:68] "s0" "s1" "s2" "s3" ...
     $ beta     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      .. ..@ i       : int [1:216] 0 0 0 0 0 3 0 3 0 3 ...
      .. ..@ p       : int [1:69] 0 0 1 2 3 4 6 8 10 12 ...
      .. ..@ Dim     : int [1:2] 4 68
      .. ..@ Dimnames:List of 2
      .. .. ..$ : chr [1:4] "pop15" "pop75" "dpi" "ddpi"
      .. .. ..$ : chr [1:68] "s0" "s1" "s2" "s3" ...
      .. ..@ x       : num [1:216] -0.181 -0.347 -0.497 -0.634 -0.757 ...
      .. ..@ factors : list()
     $ df       : int [1:68] 0 1 1 1 1 2 2 2 2 2 ...
     $ dim      : int [1:2] 4 68
     $ lambda   : num [1:68] 2.02 1.84 1.68 1.53 1.39 ...
     $ dev.ratio: num [1:68] 0 0.0352 0.0645 0.0888 0.1089 ...
     $ nulldev  : num 984
     $ npasses  : int 562
     $ jerr     : int 0
     $ offset   : logi FALSE
     $ call     : language glmnet(x = x, y = y, family = "gaussian", alpha = 1)
     $ nobs     : int 50
     - attr(*, "class")= chr [1:2] "elnet" "glmnet"
    > str(lasso_02$relaxed)
    List of 12
     $ a0       : Named num [1:68] 6.11e-16 1.29e-15 1.29e-15 1.29e-15 1.29e-15 ...
      ..- attr(*, "names")= chr [1:68] "s0" "s1" "s2" "s3" ...
     $ beta     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      .. ..@ i       : int [1:216] 0 0 0 0 0 3 0 3 0 3 ...
      .. ..@ p       : int [1:69] 0 0 1 2 3 4 6 8 10 12 ...
      .. ..@ Dim     : int [1:2] 4 68
      .. ..@ Dimnames:List of 2
      .. .. ..$ : chr [1:4] "pop15" "pop75" "dpi" "ddpi"
      .. .. ..$ : chr [1:68] "s0" "s1" "s2" "s3" ...
      .. ..@ x       : num [1:216] -2.04 -2.04 -2.04 -2.04 -1.98 ...
      .. ..@ factors : list()
     $ df       : int [1:68] 0 1 1 1 1 2 2 2 2 2 ...
     $ dim      : int [1:2] 4 68
     $ lambda   : num [1:68] 2.02 1.84 1.68 1.53 1.39 ...
     $ dev.ratio: num [1:68] 0 0.208 0.208 0.208 0.208 ...
     $ nulldev  : num 984
     $ npasses  : int 562
     $ jerr     : int 0
     $ offset   : logi FALSE
     $ call     : language glmnet(x = x, y = y, family = "gaussian", relax = T)
     $ nobs     : int 50
     - attr(*, "class")= chr [1:2] "elnet" "glmnet"
Looking a little closer at lasso_02$relaxed, beta for example contains the following values.

    > lasso_02$relaxed$beta[, 1:6]
    4 x 6 sparse Matrix of class "dgCMatrix"
          s0        s1        s2        s3        s4        s5
    pop15  . -2.040996 -2.040996 -2.040996 -2.040996 -1.980216
    pop75  .  .         .         .         .         .
    dpi    .  .         .         .         .         .
    ddpi   .  .         .         .         .         1.270865

These are the regression coefficients obtained by fitting an ordinary linear regression to the variables that were selected at each step as the penalty weight is gradually changed.
For example, lasso_02$relaxed$beta[, 6] contains the regression coefficients of pop15 and ddpi, the two variables selected at that step.
Let's check whether they actually match the result of lm:

    > coef(lm(y ~ x[, c(1, 4)]))
          (Intercept) x[, c(1, 4)]pop15  x[, c(1, 4)]ddpi
         1.364331e-15     -1.980216e+00      1.270865e+00

They match.
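By the way, the value of lasso_02$relaxed$a0, which holds the intercept estimate, appears to differ slightly:

    > lasso_02$relaxed$a0[6]
             s5
    1.28119e-15

I wonder why.
I suspected a difference in standardization, but that does not seem to be the case, and I could not pin down the reason. Note, though, that both values are on the order of 1e-15, that is, zero up to floating-point rounding (y is centered and x is scaled), so the discrepancy is most likely numerical noise rather than a genuine difference.

    lasso_03 <- glmnet(x, y, family = "gaussian", relax = T, standardize = F)

    > lasso_03$relaxed$a0[6]
             s5
    1.28119e-15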
That is all for the implementation of glmnet().
Next time, let's take a closer look at elnet, which is called in the fitting step.
Unlike the case of gam, installing the glmnet library did not bring the source code along with it, so I obtained the Fortran source code separately with the help of an external reference.
See you next time.
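Notes from the Institute of Statistical Mathematics open lecture "Understanding the Philosophy of Statistics" (統計の哲学を理解するために)
I attended the Institute of Statistical Mathematics open lecture "Understanding the Philosophy of Statistics", held on 1/31, so I am sharing my notes. Overall, the lecture was organized as an introduction to the critical perspective on frequentism and Bayesianism as seen by Elliott Sober, a likelihoodist, and it brought into relief the questions that each position is trying to answer.
The lecture is based on the book 科学と証拠 (Science and Evidence). It is a very interesting book.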

科学と証拠―統計の哲学 入門― (Science and Evidence: An Introduction to the Philosophy of Statistics)
- Author: Elliott Sober
- Publisher: 名古屋大学出版会 (The University of Nagoya Press)
- Release date: 2012/10/20
- Format: Book
At the very end, although time was short and the treatment was rushed, Deborah Mayo's "error statistics" was also briefly introduced (according to Prof. Matsuo, these were "the first materials of their kind in the world").

Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars
- Author: Deborah G. Mayo
- Publisher: Cambridge University Press
- Release date: 2018/09/20
- Format: Paperback
Date and time
2020/1/31 10:00 @ Tachikawa
Lecturer
Masahiro Matsuo (松王政浩), Hokkaido University
Elliott Sober
- A leading American philosopher of science
- Supports likelihoodism + AIC
- Books
  - 科学と証拠 (Science and Evidence)
  - A translation of OCKHAM'S RAZORS is also planned
The statistics wars are not over
- Statistical Inference as Severe Testing
  - Deborah Mayo
    - Frequentism
- The significance-testing debate; putting a stop to recipe-style statistics
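Philosophy of statistics
- The totality of the debates over the foundations of statistics
- Philosophy always consists of disputes and never arrives at a conclusion
  - Bayesianism
    - Degree of belief
      - Subjective
      - Objective
  - Frequentism
    - Neyman-Pearson
      - A view premised on an infinite number of repeated samples
    - Fisher
      - Relative frequency, fiducial probability
- No single work presents a unified view, so each school has to be studied separately
  - Bayes
    - Hard to name a standard reference
    - The decision-centered line and the line centered on confirmation of scientific hypotheses (philosophy of science) are the sources of Bayesian statistics
    - Berger & Wolpert (1988), Gelman
  - Frequentism
    - Mayo's work is comprehensive but hard to follow
  - Likelihoodism
    - Compact and easy to understand
    - Sober and Royall are brief and their points are clear ← exclusivism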
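The three schools
- Royall's three questions
  - What does the evidence tell us? ← Likelihoodism
    - Static presentism, externalism
    - Makes clear the information about the hypotheses (the evidence) that the data directly show
    - Once the data are obtained, the possibility of other data is abstracted away
  - What should we believe given the evidence? ← Bayesianism
    - Dynamic presentism, internalism
    - The possibility of other data is abstracted away entirely
    - Makes clear how the data change the information about the (possible) hypotheses
  - What should we do given the evidence? ← Frequentism
    - Counterfactualism, conventionalism
    - Judges a hypothesis by considering not only what actually occurred but also everything that could have occurred
    - A low probability is taken as a sign for rejection. Applying the same rule repeatedly keeps the chance of error low, so the rule is applied this time as well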
On the likelihood "principle"
- The two principles at the core of Sober's argument
  - The modest principle
    - If learning that E is true justifies rejecting proposition P, and the rejection of P is justified only once that information is obtained, then E must be regarded as evidence against P
  - The principle of total evidence
    - All the data obtained from an experiment must be used in assessing the evidence
- Differs from the law of likelihood
- A rule shared by both likelihoodism and Bayesianism
- The likelihood principle (LP)
  - When x is observed, all information about the experiment that is relevant to inference (decision) about θ is contained in the likelihood function for the observed x. Furthermore, if two likelihood functions are proportional to each other as functions of θ, they contain the same information about θ
  - Originates with Fisher
  - The camps divide according to whether they satisfy it
    - Satisfy
      - Likelihoodism, Bayesianism
    - Do not satisfy
      - Frequentism
  - Still a source of controversy
- If LP is the fundamental principle of parameter estimation,
  - when P(x' | θi) = kP(x | θi) holds, Ev(E, x) = Ev(E', x') should follow. Ev is the evidence, E is the experiment
  - If this does not hold, an inference that leads to such a result is inappropriate
- Proved by Birnbaum
  - Whether the proof is valid is still unsettled
- Going further back
  - The sufficiency principle
  - The conditionality principle
  - These two principles are ones that any statistician should be able to accept
  - Birnbaum "proved" the equivalence of these two principles and the likelihood principle → if you accept the two principles, you must accept the likelihood principle as well
- Birnbaum's motivation
  - "Evidence derived statistically" becomes "evidence in an experiment"
    - (E, x) and Ev(E, x) are distinguished
      - (E, x)
        - A description of the parameter space Ω
        - A description of the sample space of the possible outcomes x of E
      - Ev(E, x): experimental evidence
        - How should it be evaluated?
  - Key point of the analysis
    - Under what condition can two instances of statistical evidence (E, x) and (E', y) be said to be equal in every relevant respect?
      - When the statistical evidence (E, x) and (E', y) are equal in every relevant and important respect, Ev(E, x) = Ev(E', y)
- An important consequence of the likelihood principle
  - Irrelevance of the sample space (values of the random variable that were possible but not actually obtained)
    - An important point
    - Denies the importance of counterfactuals
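The condition P(x' | θi) = kP(x | θi) in the notes above can be illustrated with a standard textbook example that was not part of the lecture: the same 9 heads and 3 tails obtained either from a fixed number of 12 tosses (E) or from tossing until the third tail (E'). The sketch below, in base R, shows that the two likelihood functions are proportional, while the one-sided p-values under θ = 0.5, which depend on the sample space, differ. The grid `theta` and the variable names are chosen only for illustration.

```r
theta <- seq(0.1, 0.9, by = 0.1)   # candidate probabilities of heads

# E : toss 12 times and observe 9 heads (binomial)
lik_binom <- dbinom(9, size = 12, prob = theta)
# E': toss until the 3rd tail and observe 9 heads before it (negative binomial;
#     dnbinom counts "failures" before `size` successes, so a tail is the success)
lik_nbinom <- dnbinom(9, size = 3, prob = 1 - theta)

# The ratio does not depend on theta: the likelihoods are proportional
ratio <- lik_binom / lik_nbinom

# One-sided p-values for "9 or more heads" under theta = 0.5
p_binom  <- pbinom(8, size = 12, prob = 0.5, lower.tail = FALSE)
p_nbinom <- pnbinom(8, size = 3, prob = 0.5, lower.tail = FALSE)
```

Under the likelihood principle the two experiments carry the same evidence about θ, whereas a significance test that accounts for unobserved outcomes treats them differently; this is the sense in which the sample space is "irrelevant" for one camp and essential for the other.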
Against Bayesianism
- de Finetti
  - The relationship between probability, betting and belief
    - "The conditions under which a rational bet holds" are the same as the three rules of probability
  - D. Gillies, 『確率の哲学理論』 (Philosophical Theories of Probability)
- Savage
  - Inductive inference and inductive behavior
  - The latter is more important
    - Inference
      - One's opinion changes
    - Behavior
      - Choose the option with the highest expected utility, using the distribution and the economic facts about the final actions
- Subjective vs objective
  - Rationality of belief change
  - Foundationalism
  - Objective prior probability
    - Jeffreys's noninformative prior, Jaynes's maximum entropy
  - Philosophical Bayesianism
    - Carnap's inductive logic
      - An extension of Laplace's inductive inference (the λ-continuum)
        - λ = 0: frequency
        - λ = k: Laplace
        - λ = ∞: logical theory
      - Bayesian confirmation, degree of confirmation, etc.
  - The shift toward practical use
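What is likelihoodism
- The gremlin hypothesis
  - In philosophy, "explanatory power" and "probability" are often tied together
- Limits of likelihoodism
  - Likelihood also depends on the model, so arbitrariness creeps in. In that case, can likelihoodists really criticize Bayesianism?
  - Sober's position is that objectivity can be secured by using what is called the probability of misleading evidence
- How widely is Sober's law of AIC actually used?
  - Not much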
Against frequentism
- Fisher's significance testing
  - A test of a single hypothesis
  - The null hypothesis is never proven to be correct
  - Criticism of probabilistic modus tollens
  - Inductive logic
- Neyman-Pearson
  - Rules for choosing actions
    - Inductive behavior
