Using a large language model requires not only the model itself but also a library for working with it. In most cases the library of choice is Transformers, but a research team working on large language models has announced that "vLLM", a new library built around a mechanism called "PagedAttention", can raise throughput by up to 24 times.

vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention
https://vllm.ai/

Existing libraries for running large language models include Hugging Face's Transformers (HF) and Text Generation Inference (TGI), which is aimed at production environments. The newly released vLLM joins their ranks.

The figure below, NVIDIA