
Ruby library: Nokogiri

程序员文章站 (Programmer Article Station), 2022-07-14 14:10:36
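Introduction:
A new Ruby library for parsing HTML and XML. Parsed documents can be queried with CSS selectors or XPath expressions.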

Installation (Debian/Ubuntu):
sudo apt-get install libxml2-dev libxslt1-dev
sudo gem install nokogiri


Video:
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri

Source code:
http://github.com/tenderlove/nokogiri/

Demo (nokogiri_google.rb):
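require 'nokogiri'
require 'open-uri'

# Fetch the search results page and parse it as HTML.
# URI.open is used because Kernel#open no longer opens URLs in Ruby 3.0+.
url = 'http://www.google.cn/search?q=tenderlove'
doc = Nokogiri::HTML(URI.open(url))

# The selectors below reflect the page markup at the time the demo was
# written and may no longer match the live page.

# Select the result links with a CSS selector.
doc.css('h3.r a.l').each do |link|
  puts link.content
end
puts '--------------------------------------------------'

# Select the same links with an XPath expression.
doc.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end
puts '--------------------------------------------------'

# search accepts CSS selectors and XPath expressions in the same call.
doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
  puts link.content
end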