
I'm Takahashi from the AI and Machine Learning team. This post is day 6 of the AI and Machine Learning team's blog relay.
Over the past six months or so I have switched AI coding tools from Copilot → Cline → Claude Code. With a little ingenuity Claude Code shows strong implementation ability. Custom Slash Commands are especially handy, and I have been experimenting to see how far coding can be automated.
At the core of Claude Code is a large language model (LLM). The idea behind the pretraining that underlies it is to have the model guess the next character (token) from the context so far. Through this guessing the model acquires the grammatical structure of language as well as knowledge, and with further training such as RLHF it becomes able to generate the responses people want.
This next-character prediction framework is actually old: it was proposed in 1951 by Claude Shannon, known as the father of information theory. His 1951 paper "Prediction and Entropy of Printed English" is where a language model (or its equivalent) first appears. This post introduces the language-model idea proposed in that paper and the experiment Shannon carried out.
Incidentally, it is no coincidence that Claude Code shares his name: the service name is taken from Shannon's.*1
Why Shannon Came Up with a Language Model
Why did Shannon think about language models at all? The background is "entropy", the concept at the foundation of information theory. Shannon introduced entropy in his 1948 paper "A Mathematical Theory of Communication" and proposed it as a way to measure the amount of information quantitatively.
Shannon then set out to use entropy to measure, concretely, how much information the English language carries. The language model was the tool he devised for that purpose.
Defining the Entropy of a Language
In what follows, the symbols are restricted to 27 characters: the 26 letters of the alphabet plus the space.
First, letter frequencies differ: E and T appear often, while Q and Z do not. The entropy is therefore lower than it would be if all 27 characters appeared equally often. The paper reports 4.03 bits when computed from single-letter frequencies (the entropy when all 27 characters are equally likely is about 4.75 bits).
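As a small sanity check on those figures, here is a minimal sketch of the entropy calculation. The function name `entropy_bits` and the four-symbol skewed distribution are my own illustrative choices, not something from Shannon's paper.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: -sum(p * log2(p)), skipping zero-probability symbols."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 27 symbols (26 letters + space), all equally likely.
uniform = [1 / 27] * 27
print(f"uniform over 27 symbols: {entropy_bits(uniform):.2f} bits")  # log2(27), about 4.75

# A made-up skewed distribution over 4 symbols, only to show that
# uneven frequencies push the entropy below the uniform value.
skewed = [0.5, 0.25, 0.125, 0.125]
print(f"skewed: {entropy_bits(skewed):.2f} bits, uniform over 4: {math.log2(4):.2f} bits")
```

Feeding real English letter frequencies into the same function is how a figure such as the 4.03 bits above would be obtained.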
次ã«ã2æåã®çµã¿åããï¼2-gramï¼ãèãã¦ã¿ã¾ããããä¾ãã°ã"TH"ã"HE"ã¯é »åºãã¾ããã"XZ"ã"QJ"ã¯ã»ã¨ãã©åºç¾ãã¾ããããã®ããã«ãæåã®çµã¿åãããèæ ®ããã¨ã¨ã³ãããã¼ã¯æ¸å°ãã¾ãã2-gramã®é »åº¦æ å ±ããã¨ã«ããã¨3.32ãããã«ãªãã¾ãã
æåã®çµã¿åãããä¸è¬åãã¦N-gramãèããã¨ãN-1æåã®æèãä¸ããããã¨ãã«ãNæåç®ãå¾ãåå¸ãåããã°N-gramã®ã¨ã³ãããã¼ãåããã¾ãã ããã¦ãNã大ãããã¦ããã¨ããpre-ããã-tionãããªã©ã®é »åºããæ¥å°¾è¾ãæ¥é è¾ãç¾ããæ´ã«åèªã®é »åº¦ãææ³çãªæ§é ãããã«ã¯æç« å ¨ä½ã®æ§æãªã©ãããé·è·é¢ã®ãã¿ã¼ã³ãè¦ãã¦ãã¾ãã
Shannonã¯ããããè¨èªã®ãã¿ã¼ã³ããã¹ã¦å å ãããNãç¡é大ã®ã¨ãã®N-gramã¨ã³ãããã¼ãè¨èªã®ã¨ã³ãããã¼ã¨å®ç¾©ãã¾ãããããã¦ãN-1æåã®æèãä¸ããããã¨ãã«ãNæåç®ãå¾ãåå¸ãããè¨èªã¢ãã«ã®å®ç¾©ã«ãªãã¾ãã
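To make the N-gram entropy concrete, the sketch below estimates it from raw counts as the average of -log2 p(next character | previous N-1 characters). The function name `ngram_entropy` and the sample string are hypothetical, and the tiny sample is there only to show the mechanics, not to reproduce the paper's numbers.

```python
import math
from collections import Counter

def ngram_entropy(text, n):
    """Estimate the N-gram entropy -sum p(context, c) * log2 p(c | context) from counts."""
    ngrams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    # Count each (n-1)-character context by summing over the n-grams that start with it.
    contexts = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count
    total = sum(ngrams.values())
    return -sum(
        (count / total) * math.log2(count / contexts[gram[:-1]])
        for gram, count in ngrams.items()
    )

# Toy sample text (an assumption for illustration only).
sample = "the theory of the thing is that the other theme thrives there "
for n in (1, 2, 3):
    print(n, round(ngram_entropy(sample, n), 2))
```

With a sample this small the estimate for larger N falls mostly because each context has been seen only a few times, so count-based estimates need far more text than this to be meaningful.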
Shannon's Experiment
It would be nice to compute the N-gram entropy from frequency data for large N, just as we did for 2-grams, but in practice this is infeasible. The number of possible N-grams grows exponentially in N: with N = 100, for example, there are 27^100 (≈ 1.4×10^143) possible 100-grams. That is vastly larger than even the large-scale corpora used for LLM pretraining (Common Corpus, on the order of 5×10^11 words).
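The size of that gap is easy to check with exact integer arithmetic. This is just the arithmetic from the paragraph above; the variable names are mine, and the corpus size is the rough figure quoted in the text.

```python
n_symbols, length = 27, 100
possible_ngrams = n_symbols ** length       # Python integers are arbitrary precision

print(len(str(possible_ngrams)))            # number of decimal digits: 144
print(f"{possible_ngrams:.1e}")             # 1.4e+143

corpus_words = 5 * 10**11                   # rough corpus size quoted above
print(f"{possible_ngrams // corpus_words:.1e} possible 100-grams per corpus word")
```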
So Shannon ran the following experiment to estimate the N-gram entropy for large N. As the "language model" that predicts the next character, he used not a computer but a human. The subject was his wife, Betty Shannon.
The experiment proceeded as follows.
1. Prepare 100 passages of 100 characters each, excerpted from the biography Jefferson the Virginian.
2. Have the subject guess each passage one character at a time, asking in order "Is it E?", "Is it T?", and so on.
3. Record how many guesses step 2 took to reach the correct character, and continue through the 100th character.
4. Repeat steps 2 and 3 until all 100 passages are done.
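The steps above can be mimicked in code by swapping the human subject for a very crude predictor. In this sketch the "subject" is a hypothetical bigram ranker that sees only the previous character, which is far weaker than a person who reads the whole preceding passage; the function names and the training text are my own assumptions.

```python
import string
from collections import Counter, defaultdict

ALPHABET = string.ascii_lowercase + " "   # the 27 symbols

def train_bigram_ranker(text):
    """For each previous character, rank next characters from most to least frequent."""
    follow = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        follow[prev][nxt] += 1

    def ranking(prev):
        seen = [c for c, _ in follow[prev].most_common()]
        # Characters never seen after `prev` are guessed last, in alphabet order.
        return seen + [c for c in ALPHABET if c not in seen]

    return ranking

def play_guessing_game(ranking, passage):
    """For each character after the first, return how many guesses were needed."""
    return [ranking(prev).index(actual) + 1
            for prev, actual in zip(passage, passage[1:])]

# Toy training text and passage (assumptions for illustration only).
training_text = "the quick brown fox jumps over the lazy dog and then the fox rests "
ranking = train_bigram_ranker(training_text)
guess_counts = Counter(play_guessing_game(ranking, "the fox and the dog "))
print(sorted(guess_counts.items()))       # (number of guesses, how often)
```

The resulting tally of "solved on guess 1, guess 2, ..." has the same shape as the data Shannon recorded for each context length.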
The data obtained from this experiment is shown in the following table.

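The table gives the distribution of the number of guesses needed to predict the next character for each context length. The N=100 case is the number of guesses when 99 characters of context are given (correct on the first guess 80 times, on the second guess 7 times, and so on).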
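Computing the Entropy
Let's actually compute the entropy. Plugging the distribution of guess counts at N=100 into the definition of entropy, -Σ q · log2(q)*2, gives about 1.3 bits. This figure of 1.3 bits has been confirmed to be close to the values from later cognitive experiments, for example one in which subjects predicted the probability distribution of the next character*3 and, if I may plug my own work, a large-scale cognitive experiment using crowdsourcing*4.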
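The calculation can be sketched as follows. The upper bound is the entropy of the guess-count distribution described above. The lower bound, Σ i · (q_i − q_{i+1}) · log2(i), is the companion bound from Shannon's 1951 paper and assumes the q_i are non-increasing in i. The function name and the toy counts are my own; the counts are not the values from Shannon's table.

```python
import math

def guess_entropy_bounds(guess_counts):
    """guess_counts[i] = how often the correct character was found on guess i + 1.

    Returns (lower, upper) bounds in bits per character.
    The lower bound assumes the frequencies are non-increasing.
    """
    total = sum(guess_counts)
    q = [c / total for c in guess_counts]
    upper = -sum(p * math.log2(p) for p in q if p > 0)
    q_next = q[1:] + [0.0]
    lower = sum(
        i * (qi - qn) * math.log2(i)
        for i, (qi, qn) in enumerate(zip(q, q_next), start=1)
    )
    return lower, upper

# Made-up counts for illustration only (NOT Shannon's data).
lower, upper = guess_entropy_bounds([70, 15, 10, 5])
print(f"lower = {lower:.2f} bits, upper = {upper:.2f} bits")
```

Two sanity checks: a subject who is always right on the first guess gives 0 bits for both bounds, and a subject whose guesses are uniformly spread over all 27 symbols gives log2(27) for both.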
Given that the 2-gram figure was 3.32 bits, we can see that natural language becomes far more predictable when a long context is available.
Are LLMs Better Language Models than Shannon's Wife?
So far we have seen that natural language is redundant, and that the entropy keeps dropping as more context is provided. So, at the task of predicting the next character from a long context, which is better: a human or a model?
Shannon's experiment derived the entropy from the distribution of the number of guesses required. For N-gram models and deep language models, the entropy can instead be computed from the prediction probabilities the model outputs.
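For a model, "computing the entropy from prediction probabilities" usually means averaging -log2 of the probability the model assigned to the character that actually came next. Here is a minimal sketch, assuming a hypothetical `predict(context)` callable that returns a dict of next-character probabilities; the uniform stand-in model exists only to make the example runnable.

```python
import math

ALPHABET = "abcdefghijklmnopqrstuvwxyz "   # the 27 symbols

def uniform_predict(context):
    """Hypothetical stand-in model: ignores the context, all 27 symbols equally likely."""
    return {c: 1 / len(ALPHABET) for c in ALPHABET}

def bits_per_character(predict, text):
    """Average of -log2 p(actual character | preceding text) over the text."""
    total_bits = 0.0
    for i, actual in enumerate(text):
        probs = predict(text[:i])
        total_bits += -math.log2(probs[actual])
    return total_bits / len(text)

print(round(bits_per_character(uniform_predict, "hello world"), 2))  # log2(27), about 4.75
```

A better model puts more probability on the character that actually follows, which lowers this average. Strictly speaking the quantity is a cross-entropy, so it estimates the entropy of the text from above.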
Lining up the results for each model alongside Shannon's cognitive experiment gives the following.
Table. Bits per character for models and a human (all model figures were measured on the text8 dataset)
| Model | Bits per character |
|---|---|
| N-gram model | 1.58 |
| Simple RNNs *5 | 1.46 |
| Human (Shannon's wife) | 1.31 |
| LSTM *6 | 1.27 |
| GPT-2 *7 | 0.98 |
| LLMZip *8 | 0.71 |
N-gram models and Simple Recurrent Neural Networks (RNNs), which have no gating mechanism, remained at a higher bit count than Shannon's wife. Then, in the mid-2010s, gated models such as the LSTM matured, and several results below 1.3 bits were reported.
The dramatic breakthrough, however, came with the Transformer. The 2019 GPT-2 paper, one step before today's LLMs, reported 0.98 bits on text8, a dataset of Wikipedia text.
Then in 2023, LLMZip proposed compressing text data using the predictive distribution of Llama-7B and, although some questions remain*9, reported a remarkable 0.71 bits.
The pretrained models behind current systems such as Gemini2.5-pro, Grok3, and OpenAI-o3 presumably perform even better, and it would be interesting to know how many bits they have reached. Unfortunately, the performance of the language model itself is not what matters for an LLM service, so it is rarely reported.
Conclusion
More than 70 years after Shannon's experiment, LLMs trained on the very task he devised for estimating entropy have arrived, and Claude Code, named after Shannon, has been born. An experiment from the dawn of information science has come back to life in a new form, a reminder that science is built on accumulated history.
We're Hiring!
The M3 AI and Machine Learning team is looking for geeky, technology-minded members. In addition to hiring both new graduates and experienced engineers, we are always open to casual interviews and internship applications.
*1:Interview with Anthropic co-founder Dario Amodei
*2:Strictly speaking, this is an upper bound on the entropy
*3:A Convergent Gambling Estimate of the Entropy of English
*4:Entropy Rate Estimation for English via a Large Cognitive Experiment Using Mechanical Turk
*5:Tomas Mikolov et al. Subword Language Modeling with Neural Networks. The N-gram figure is also taken from the results reported in this paper
*6:Ben Krause et al. Multiplicative LSTM for sequence modelling
*7:Alec Radford et al. Language Models are Unsupervised Multitask Learners
*8:Chandra Shekhara Kaushik Valmeekam et al. LLMZip: Lossless Text Compression using Large Language Models
*9:Given that Wikipedia data is included in Llama-7B's training data, it remains doubtful whether this really evaluates an upper bound on the entropy