Encoding and escaping for URLs, HTML, and JavaScript

程序员文章站 2024-02-22 16:53:40

Lately I keep getting confused by the various encode/escape routines for URLs, HTML, and JavaScript.

Here are notes on the relevant methods in Ruby and JavaScript:

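1. CGI.escape/CGI.unescape: URL encoding and decoding
See cgi.rb in the Ruby standard library (the source quoted here is the Ruby 1.8 version). Note that a space is encoded as + rather than %20.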

  # URL-encode a string.
  #   url_encoded_string = CGI::escape("'Stop!' said Fred")
  #      # => "%27Stop%21%27+said+Fred"
  def CGI::escape(string)
    string.gsub(/([^ a-zA-Z0-9_.-]+)/n) do
      '%' + $1.unpack('H2' * $1.size).join('%').upcase
    end.tr(' ', '+')
  end


  # URL-decode a string.
  #   string = CGI::unescape("%27Stop%21%27+said+Fred")
  #      # => "'Stop!' said Fred"
  def CGI::unescape(string)
    string.tr('+', ' ').gsub(/((?:%[0-9a-fA-F]{2})+)/n) do
      [$1.delete('%')].pack('H*')
    end
  end
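
To see the two methods working together, here is a minimal round-trip sketch. It assumes a current Ruby where `require 'cgi'` still provides CGI.escape and CGI.unescape; the sample strings other than the one from the library comment are my own.

```ruby
require 'cgi'

original = "'Stop!' said Fred"
encoded  = CGI.escape(original)
decoded  = CGI.unescape(encoded)

puts encoded   # => %27Stop%21%27+said+Fred
puts decoded   # => 'Stop!' said Fred

# A literal "+" is percent-encoded, so it cannot be mistaken for an encoded space.
puts CGI.escape("1+1 = 2")   # => 1%2B1+%3D+2

# Non-ASCII text is encoded byte by byte (this source file is UTF-8).
puts CGI.escape("café")      # => caf%C3%A9
```

Because the output uses + for spaces, it suits query-string values (form encoding) better than path segments.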

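2. CGI.escapeHTML/CGI.unescapeHTML: HTML escaping and unescaping
See cgi.rb in the Ruby standard library. The version quoted here escapes only &, ", < and >; current Ruby versions also escape ' as &#39;.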

  # Escape special characters in HTML, namely &\"<>
  #   CGI::escapeHTML('Usage: foo "bar" <baz>')
  #      # => "Usage: foo &quot;bar&quot; &lt;baz&gt;"
  def CGI::escapeHTML(string)
    string.gsub(/&/n, '&amp;').gsub(/\"/n, '&quot;').gsub(/>/n, '&gt;').gsub(/</n, '&lt;')
  end


  # Unescape a string that has been HTML-escaped
  #   CGI::unescapeHTML("Usage: foo &quot;bar&quot; &lt;baz&gt;")
  #      # => "Usage: foo \"bar\" <baz>"
  def CGI::unescapeHTML(string)
    string.gsub(/&(amp|quot|gt|lt|\#[0-9]+|\#x[0-9A-Fa-f]+);/n) do
      match = $1.dup
      case match
      when 'amp'                 then '&'
      when 'quot'                then '"'
      when 'gt'                  then '>'
      when 'lt'                  then '<'
      when /\A#0*(\d+)\z/n       then
        if Integer($1) < 256
          Integer($1).chr
        else
          if Integer($1) < 65536 and ($KCODE[0] == ?u or $KCODE[0] == ?U)
            [Integer($1)].pack("U")
          else
            "&##{$1};"
          end
        end
      when /\A#x([0-9a-f]+)\z/ni then
        if $1.hex < 256
          $1.hex.chr
        else
          if $1.hex < 65536 and ($KCODE[0] == ?u or $KCODE[0] == ?U)
            [$1.hex].pack("U")
          else
            "&#x#{$1};"
          end
        end
      else
        "&#{match};"
      end
    end
  end
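
A short sketch of how the pair behaves in practice, assuming a current Ruby with `require 'cgi'`. The inputs after the first one are my own examples, chosen to show numeric character references and the single-pass behaviour of unescapeHTML.

```ruby
require 'cgi'

raw     = 'Usage: foo "bar" <baz>'
escaped = CGI.escapeHTML(raw)

puts escaped                           # => Usage: foo &quot;bar&quot; &lt;baz&gt;
puts CGI.unescapeHTML(escaped) == raw  # => true

# Decimal and hexadecimal character references are decoded as well.
puts CGI.unescapeHTML("&#65;&#x42;")   # => AB

# Unescaping is a single pass: "&amp;lt;" becomes "&lt;", not "<".
puts CGI.unescapeHTML("&amp;lt;")      # => &lt;

# Named entities outside amp/quot/gt/lt (and apos on current Ruby) are left as they are.
puts CGI.unescapeHTML("&nbsp;")        # => &nbsp;
```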

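3. html_escape/h for HTML escaping, url_encode/u for URL encoding
See erb.rb in the Ruby standard library. Unlike CGI.escape, url_encode writes a space as %20.
ActionView::Base includes ERB::Util, so h and u can be called directly in html.erb templates.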

  def html_escape(s)
    s.to_s.gsub(/&/, "&amp;").gsub(/\"/, "&quot;").gsub(/>/, "&gt;").gsub(/</, "&lt;")
  end
  alias h html_escape

  def url_encode(s)
    s.to_s.gsub(/[^a-zA-Z0-9_\-.]/n){ sprintf("%%%02X", $&.unpack("C")[0]) }
  end
  alias u url_encode
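
Outside Rails the same effect can be had by including ERB::Util yourself. The sketch below is only an illustration: `LinkView`, its template, and the sample values are hypothetical names of mine, standing in for ActionView::Base and a real html.erb file.

```ruby
require 'erb'
require 'cgi'

# Hypothetical view class standing in for ActionView::Base: including
# ERB::Util is what makes h and u callable inside the template.
class LinkView
  include ERB::Util

  TEMPLATE = ERB.new('<a href="/search?q=<%= u(@query) %>"><%= h(@label) %></a>')

  def initialize(query, label)
    @query = query
    @label = label
  end

  # Evaluate the template with this object's binding so h and u resolve here.
  def render
    TEMPLATE.result(binding)
  end
end

html = LinkView.new("haha hehe&x=1", "<b>Tom & Jerry</b>").render
puts html
# => <a href="/search?q=haha%20hehe%26x%3D1">&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;</a>

# url_encode writes a space as %20, while CGI.escape writes it as +.
puts ERB::Util.url_encode("haha hehe")   # => haha%20hehe
puts CGI.escape("haha hehe")             # => haha+hehe
```

The query value goes through u because it sits inside a URL, and the link text goes through h because it sits inside HTML; mixing the two up is exactly the confusion this post is about.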

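4. escape/unescape and encodeURI/decodeURI in JavaScript
Despite the name, escape/unescape do not escape HTML: they percent-encode a string, and they are deprecated. encodeURI/decodeURI encode and decode a complete URL and leave reserved characters such as :, /, ? and & untouched; for a single query parameter or path segment, use encodeURIComponent/decodeURIComponent instead.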

var a = "<span>afd adf &&& <<< >>></span>";
var b = escape(a);
// => "%3Cspan%3Eafd%20adf%20%26%26%26%20%3C%3C%3C%20%3E%3E%3E%3C/span%3E"
var c = unescape(b);
// => "<span>afd adf &&& <<< >>></span>"

var url = "http://www.test.com/haha hehe/";
var encoded = encodeURI(url);
// => "http://www.test.com/haha%20hehe/"
var decoded = decodeURI(encoded);
// => "http://www.test.com/haha hehe/"

5. text(str) and html(str) in jQuery
text(str) sets the text content of an element: any HTML tags in str, including <script>, are escaped and shown as literal text.
html(str) sets the HTML content of an element: any tags in str are parsed as markup, and <script> elements are executed.

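6. JavaScript h and uh helpers for escaping and unescaping HTML
These helpers guard against problems with &, <, > and " when HTML fragments are built by string concatenation in JavaScript. The single quote is not escaped, so wrap attribute values in double quotes.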

function escapeHTML(str) {
  str = String(str).replace(/&/g, '&amp;').
    replace(/>/g, '&gt;').
    replace(/</g, '&lt;').
    replace(/"/g, '&quot;');
  return str;
}
function unescapeHTML(str) {
  str = String(str).replace(/&gt;/g, '>').
    replace(/&lt;/g, '<').
    replace(/&quot;/g, '"').
    replace(/&amp;/g, '&');
  return str;
}
var h = escapeHTML;
var uh = unescapeHTML;

With these helpers in hand, most everyday escaping and encoding needs for URLs, HTML, and JavaScript are covered.
