🧠 Are there any theories for introducing reward systems, such as the dopamine or opioid systems, into artificial intelligence?
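There have been attempts in artificial intelligence to imitate mechanisms that resemble the dopamine and opioid reward systems. These are studied mainly in the field of reinforcement learning. In reinforcement learning, an agent selects actions within an environment so as to maximize reward (positive feedback). Some models are inspired by the role dopamine plays in the human brain, and they are expected to help the agent learn more effectively.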
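For example, the concept of reward prediction error, which is associated with dopamine, plays an important role in reinforcement learning algorithms. Learning proceeds from the difference between the predicted reward and the reward actually received, an idea grounded in neuroscience.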
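To make the idea of a reward prediction error concrete, here is a minimal Python sketch. It assumes the simplest possible setting: a single scalar prediction that is nudged toward each observed reward. The function name `prediction_error_updates` and the `learning_rate` parameter are illustrative choices, not something taken from this conversation or from any particular library.

```python
def prediction_error_updates(rewards, learning_rate=0.5):
    """Track a reward prediction and the prediction error on each trial.

    Illustrative sketch only: one scalar prediction, no states or actions.
    """
    prediction = 0.0
    errors = []
    for reward in rewards:
        error = reward - prediction          # actual reward minus predicted reward
        prediction += learning_rate * error  # move the prediction toward the outcome
        errors.append(error)
    return prediction, errors


# The same reward of 1.0 is delivered on five consecutive trials.
prediction, errors = prediction_error_updates([1.0] * 5)
print(errors)      # the error shrinks on every trial
print(prediction)  # the prediction approaches 1.0
```

The error is large while the reward is still unexpected and shrinks as the prediction improves, which is the qualitative pattern described above.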
In addition, some research attempts to program artificial "emotions" and "desires" in order to imitate the kind of pleasure produced by the opioid reward system, but this work is still at a basic research stage.
Approaches like these could broaden the range of applications, for example AI that makes more human-like decisions and robots capable of natural conversation.
Was reinforcement learning devised with the opioid reward system in mind, or does it merely resemble it?
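The concept of reinforcement learning was not devised with the opioid reward system directly in mind, but it was inspired by the principles of biological reward systems. Specifically, reinforcement learning grew out of attempts to model animal learning behavior. It rests on the biological process in which an animal receives a reward (such as food or pleasure) as the result of an action, and that action is thereby reinforced.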
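The theory partly resembles the function of the dopamine system, which is involved in transmitting reward-related signals in the brain. In particular, the neuroscientific finding that dopamine functions as a reward prediction error signal corresponds closely to the prediction error used in reinforcement learning algorithms, and the two lines of work have informed each other.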
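As a hedged illustration of that correspondence, the prediction error in temporal difference learning is usually written as below. The symbols are standard textbook notation assumed here, not taken from this conversation: $V$ is the learned value estimate, $r_{t+1}$ the reward received after time step $t$, $\gamma$ a discount factor, and $\alpha$ a learning rate.

```latex
\delta_t = r_{t+1} + \gamma\, V(s_{t+1}) - V(s_t),
\qquad
V(s_t) \leftarrow V(s_t) + \alpha\, \delta_t
```

A positive $\delta_t$ means the outcome was better than predicted, a negative one means it was worse, and $\delta_t = 0$ means the prediction was exactly right.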
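However, reinforcement learning developed as a general theory of reward-based learning rather than as a model of the opioid reward system. The opioid reward system is involved in regulating internal states such as pleasure and pain relief, and AI research that imitates it directly is still at an early stage.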
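In short, reinforcement learning takes inspiration from how biological reward systems work, but its development has been driven mainly by computer science and engineering approaches, so it is fair to say that it was not designed with the opioid reward system in mind.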
Please summarize, in chronological order, how reinforcement learning came to be devised.
Reinforcement learning has evolved under the influence of several fields. The main developments are summarized below in chronological order.
1950s
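Late 1950s: Donald Hebb's theory and psychologists' theories of animal learning laid the groundwork, and reward-based learning theories began to attract attention. The theories of this period strongly influenced the later concept of reinforcement learning.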
1970s
Late 1970s onward: Richard S. Sutton began developing the ideas behind temporal difference (TD) learning, which he formalized in a 1988 paper. TD learning predicts reward by updating an earlier prediction using a later one.
1980s
Early 1980s: Andrew Barto and Paul Werbos developed early forms of reinforcement learning algorithms based on ideas from neuroscience and operant conditioning.
1989: Christopher Watkins proposed Q-learning. It learns from the rewards obtained from the environment in order to select optimal actions, and it later became a mainstream algorithm in reinforcement learning research.
1990s
1994: Rummery and Niranjan introduced an on-policy counterpart to Q-learning built on TD learning. Sutton gave it the name SARSA in 1996.
Late 1990s: Reinforcement learning began to establish itself as a major research topic in computer science, particularly in machine learning and artificial intelligence research.
2000s onward
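2000s: Advances in algorithms and computing power enabled applications in more complex environments.
2010s: DeepMind's Deep Q-Network (DQN), published in 2013 and 2015, reached human-level or better performance on many Atari video games. Reinforcement learning was also applied in a wide range of areas, including game playing, autonomous driving, robotics, and optimization problems.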
In this way, reinforcement learning grew out of concepts from psychology, neuroscience, and operant conditioning into an important field of computer science.
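The timeline names Q-learning and SARSA without showing how they differ, so here is a minimal tabular sketch of a single update step for each. It assumes a value table stored as nested dictionaries. The function names, the state and action labels, and the default `alpha` and `gamma` values are all illustrative assumptions, not taken from this conversation or from any library.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best-valued action in the next state."""
    target = reward + gamma * max(q[next_state].values())
    q[state][action] += alpha * (target - q[state][action])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (target - q[state][action])


def make_table():
    """Hypothetical two-state value table used only for this illustration."""
    return {
        "s0": {"left": 0.0, "right": 0.0},
        "s1": {"left": 1.0, "right": 2.0},
    }


# Same transition for both methods: take "right" in s0, get reward 0, land in s1.
q_a = make_table()
q_learning_update(q_a, "s0", "right", reward=0.0, next_state="s1")

q_b = make_table()
sarsa_update(q_b, "s0", "right", reward=0.0, next_state="s1", next_action="left")

print(q_a["s0"]["right"])  # uses the value of the best next action ("right")
print(q_b["s0"]["right"])  # uses the value of the chosen next action ("left")
```

With these made-up numbers the Q-learning estimate ends up larger than the SARSA estimate, because Q-learning assumes the best next action will be taken while SARSA uses the one that actually was.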
Please go ahead.