May 4, 2014
Ruby HTML/Web Scraping Libraries Roundup, 2014 Edition
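This article gives an overview of the Ruby libraries available for HTML/Web scraping as of May 2014, together with what has changed since the previous roundup (2010). For each library it briefly covers the main features, where to get it, the home page, how to install it, and the license.
The five libraries covered this time are nokogiri, Mechanize, Selenium-webdriver, Capybara-webkit, and poltergeist. Outside Ruby, PhantomJS, Yahoo!Pipes, ScraperWiki, and kimono are also mentioned.
After the list comes a short comparison and evaluation, with recommendations on which tool suits which use.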
Background
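The 2010 article, "Ruby HTML/Web Scraping Libraries Roundup", covered the following 6 libraries: Hpricot (version 0.8.2 at the time of writing; likewise below), Mechanize (1.0.0), ScrAPI toolkit (1.2.0), ScRUBYt! (0.4.06), nokogiri (1.4.3.1).
Of these, it turns out that:
- Hpricot: as of May 2014, development was stopped 2 years ago ("Hpricot is over").
- ScrAPI toolkit: 2.0.0 was released 3 years ago and supports the Ruby 1.9 series. By now, however, the Ruby 2.0 series can arguably be called mainstream.
- ScRUBYt!: assumes Hpricot, Mechanize, and the Ruby 1.8 series, and many of its URLs are dead links.
These libraries are therefore left out of this article.
The rest of the article summarizes the state of the latest Ruby HTML/Web scraping libraries as of May 2014, noting what has changed since the previous article.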
nokogiri (1.6.1; 1.6.2.rc2)
nokogiri is a parser. Since the previous article, its maintainer and hosting location appear to have changed.
- Description
  Nokogiri is an HTML, XML, SAX, and Reader parser. Among Nokogiri's many features is the ability to search documents via XPath or CSS3 selectors. (README.rdoc)
- Update history 2008-2014.
- RubyGems.org https://rubygems.org/gems/nokogiri
- Install
  [sudo] gem install nokogiri
- Source code https://github.com/sparklemotion/nokogiri
- License The MIT License
Mechanize (2.7.3; 2.7.4.beta2)
Mechanize is a headless browser simulator. It cannot, however, interpret JavaScript. Since version 1.0.0 it has used nokogiri instead of Hpricot. The code now appears to be maintained by the same people as nokogiri.
- Description
  The Mechanize library is used for automating interaction with websites. Mechanize automatically stores and sends cookies, follows redirects, and can follow links and submit forms. Form fields can be populated and submitted. Mechanize also keeps track of the sites that you have visited as a history. (README.rdoc)
- Update history 2005-2014.
- RubyGems.org https://rubygems.org/gems/mechanize
- Install
  [sudo] gem install mechanize
- Source code https://github.com/sparklemotion/mechanize
- License The MIT License
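selenium-webdriver (Selenium Client & WebDriver) (2.41.0)
Selenium should have been included in the previous article. Rather than a scraper, Selenium is a device (browser) driver with an accompanying API library for automating website tests, but it can also be used for scraping.
Selenium takes the approach of controlling a browser directly. It can nevertheless also run headless (literally "without a head"; here it means without displaying a screen). The major browsers, Firefox, Chrome, and IE, are supported.
Besides Ruby, the client languages for operating the driver include Java, C#, Python, and JavaScript (node.js).
Using it together with Selenium IDE (a browser plugin that records your actions and emits code in the form of a test case) makes it easy to produce the code that locates elements, and is recommended (it feels like recording a macro in Excel).
- Description
  Selenium automates browsers. That's it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. (from the home page)
- Update history 2009-2014
- URL http://docs.seleniumhq.org/ (a Japanese translation by Infoscience Corporation is also available)
- RubyGems.org https://rubygems.org/gems/selenium-webdriver
- Install
  gem install selenium-webdriver, or download
- Source code http://code.google.com/p/selenium/
- License Apache License 2.0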
Capybara (2.2.1), capybara-webkit (1.1.1)
Capybara was apparently first used for Rails integration tests.
Used together with Cucumber and capybara-webkit, it can reportedly also test Rails sites and sites that use JavaScript (note that capybara-webkit also requires qt to be installed).
- Description
  Capybara is an integration testing tool for rack based web applications. It simulates how a user would interact with a website. (from RubyGems.org)
- Update history 2009-2014
- RubyGems.org https://rubygems.org/gems/capybara | https://rubygems.org/gems/capybara-webkit
- Install
  [sudo] gem install capybara
- Source code https://github.com/jnicklas/capybara
- License MIT
poltergeist (1.4.0)
Poltergeist is a wrapper around PhantomJS (described below) and is used in combination with Capybara.
Because it does not need qt, it may be preferable to capybara-webkit.
- Description
  Poltergeist is a driver for Capybara that allows you to run your tests on a headless WebKit browser, provided by PhantomJS. (from RubyGems.org)
- Update history 2013-2013
- RubyGems.org https://rubygems.org/gems/poltergeist
- Install
  [sudo] gem install poltergeist
- Source code https://github.com/teampoltergeist/poltergeist
- License MIT
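Non-Ruby tools and Web services
PhantomJS (v1.9)
PhantomJS resembles Mechanize, but it handles JavaScript directly and comes with many more features. "Headless" means a browser that does not display a screen. WebKit is the HTML rendering engine used by Safari, the standard browser on Mac OS X, and originally by Google's Chrome browser (which moved to Blink, a WebKit fork, in 2013).
The listed use cases are testing, automation of operations, screen capture, and network monitoring. Because it does not use GUI libraries such as X11 or qt, it is said to work even on a rented server.
- Description
  PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG. (from the home page)
- Update history 2010-2014
- URL http://phantomjs.org/
- License BSD
- Source code https://github.com/ariya/phantomjs/
- Install
  On a Mac: brew install phantomjs. Alternatively, download and unpack the archive.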
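Web services
Yahoo!Pipes (2007-) is a service from US Yahoo!. It lets you manipulate HTML elements through a GUI. It is, however, somewhat sluggish.
ScraperWiki is a site where you can share and edit scraper code and store the resulting data.
ScraperWiki traces its origins to 2003, when one of its founders scraped the UK Parliament's website to find out which members had voted for or against which bills.
It sounds rather impressive.
A service called kimono, which scrapes a website you specify and turns it into an API, also turned up.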
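Comparison and evaluation
For quick and easy scraping, plain nokogiri or Mechanize is probably the best choice.
However, these may not work well on sites that use JavaScript (especially sites built with Microsoft's .asp). Some sites also apply a kind of protection based on cookies (this is what made the author give up on Mechanize).
When that happens, the only option is a tool that fully emulates the behavior of a browser.
That means Selenium or, if the job is expected to run on a remote host, the capybara+poltergeist combination, which has no dependency on X (or rather, on a window system).
If Ruby is not a requirement, PhantomJS looks quite promising.
If you would rather not program at all, trying a web-based scraping service is also an option. In terms of speed, though, they are probably slow (an impression based on trying Yahoo!Pipes only).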
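The rules of thumb above can be condensed into a small helper. This is only one possible reading of the recommendations; the method name recommend_tool and its keyword arguments are invented for this sketch.

```ruby
# One reading of the recommendations above, as a decision helper.
# All names here are hypothetical and exist only for illustration.
def recommend_tool(write_code: true, use_ruby: true,
                   needs_javascript: false, headless_host: false)
  # No programming wanted: fall back to a web-based scraping service.
  return "web scraping service" unless write_code
  # Ruby is not a requirement: PhantomJS looks promising.
  return "PhantomJS" unless use_ruby
  # Static pages: plain parsing or a simulated browser is enough.
  return "nokogiri or Mechanize" unless needs_javascript
  # JavaScript-heavy pages need a real browser engine; on a remote host
  # without a window system, prefer the combination that does not need X.
  headless_host ? "capybara + poltergeist" : "Selenium"
end

puts recommend_tool
puts recommend_tool(needs_javascript: true, headless_host: true)
```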
Note: the author has not yet tried Capybara or PhantomJS.

