
PowerShell Tips: Implementing File Download (wget-like)

程序员文章站 2022-06-17 22:49:30

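Readers familiar with Linux will probably remember downloading files with wget, a very capable tool. In the .NET world, most people download files with System.Net.WebClient. That class does the job, but it has a shortcoming: given a download link such as …/scripts/?dl=417, it cannot work out the real file name, so the downloaded file usually ends up with an odd name like dl=417. The actual file name is carried in the HTTP headers of the response to that link. Microsoft also provides classes that avoid this shortcoming, System.Net.HttpWebRequest and HttpWebResponse, and this article uses them to implement a PowerShell version of wget.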
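To make the header-based file name concrete, here is a minimal sketch of pulling the name out of a Content-Disposition value. The function name `Get-DispositionFileName` and the sample header strings are assumptions made for illustration; they are not part of the script in this article. The sketch handles only the plain `filename=` form and ignores the extended `filename*=` form.

```powershell
# Hypothetical helper (not part of the original script): extract the file name
# from a Content-Disposition header value such as
#   attachment; filename="setup.exe"
function Get-DispositionFileName {
    param([string]$HeaderValue)

    # Capture the text after "filename=", skipping optional quotes,
    # and stop at the next quote or semicolon
    $match = [regex]::Match($HeaderValue, '(?i)filename="?([^";]+)"?')
    if ($match.Success) {
        return $match.Groups[1].Value.Trim()
    }
    return $null
}

# Sample header values (made up for this example)
Get-DispositionFileName 'attachment; filename="setup-couchdb-1.4.0_r16b01.exe"'
Get-DispositionFileName 'inline'
```

When the header carries no file name, the helper returns nothing and the caller has to fall back to some other source, such as the URL.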

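The code is not complicated. It creates an HttpWebRequest object and sets its UserAgent and CookieContainer, so that servers with hotlink protection do not refuse the download. It then calls the HttpWebRequest object's GetResponse() method and reads the target file's size and name from the HTTP headers, which lets it report progress while the file downloads. When the download finishes, it lists the downloaded file in the current directory. Readers with questions are welcome to leave a comment. The code follows: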


 ===== File name: get-webfile.ps1 =====
function get-webfile {
<#
.synopsis
   downloads a file or page from the web
.notes
   author: fuhj (powershell#live.cn, http://fuhaijun.com)
.example
   get-webfile http://mirrors.cnnic.cn/apache/couchdb/binary/win/1.4.0/setup-couchdb-1.4.0_r16b01.exe
   downloads the latest version of this file to the current directory
#>

[cmdletbinding(defaultparametersetname="nocredentials")]
   param(
      #  the url of the file/page to download
      [parameter(mandatory=$true,position=0)]
      [system.uri][alias("url")]$uri # = (read-host "the url to download")
   ,
      #  a path to save the downloaded content.
      [string]$filename
   ,
      #  leave the file unblocked instead of blocked (declared, but not used anywhere in this listing)
      [switch]$unblocked
   ,
      #  rather than saving the downloaded content to a file, output it. 
      #  this is for text documents like web pages and rss feeds, and allows you to avoid temporarily caching the text in a file.
      [switch]$passthru
   ,
      #  suppresses the write-progress during download
      [switch]$quiet
   ,
      #  the name of a variable to store the session (cookies) in
      [string]$sessionvariablename
   ,
      #  text to include at the front of the useragent string
      [string]$useragent = "powershellwget/1.0"
   )

   write-verbose "downloading '$uri'"
   $eap,$erroractionpreference = $erroractionpreference, "stop"
   $request = [system.net.httpwebrequest]::create($uri);
   $erroractionpreference = $eap
   $request.useragent = $(
         "{0} (powershell {1}; .net clr {2}; {3}; )" -f $useragent,
         $(if($host.version){$host.version}else{"1.0"}),
         [environment]::version,
         [environment]::osversion.tostring().replace("microsoft windows ", "win")
      )

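   # reuse the caller's cookie container if the session variable already holds one;
   # -valueonly returns the stored object rather than the psvariable wrapper
   $cookies = $null
   if($sessionvariablename) {
      $cookies = get-variable $sessionvariablename -scope 1 -valueonly -erroraction silentlycontinue
   }
   if($cookies -isnot [system.net.cookiecontainer]) {
      $cookies = new-object system.net.cookiecontainer
   }
   $request.cookiecontainer = $cookies
   if($sessionvariablename) {
      set-variable $sessionvariablename -scope 1 -value $cookies
   }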

   try {
      $res = $request.getresponse();
   } catch [system.net.webexception] {
      write-error $_.exception -category resourceunavailable
      return
   } catch {
      write-error $_.exception -category notimplemented
      return
   }

   if((test-path variable:res) -and $res.statuscode -eq 200) {
      if($filename -and !(split-path $filename)) {
         $filename = join-path (convert-path (get-location -psprovider "filesystem")) $filename
      }
      elseif((!$passthru -and !$filename) -or ($filename -and (test-path -pathtype "container" $filename)))
      {
         [string]$filename = ([regex]'(?i)filename=(.*)$').match( $res.headers["content-disposition"] ).groups[1].value
         $filename = $filename.trim("\/""'")

         $ofs = ""
         $filename = [regex]::replace($filename, "[$([regex]::escape(""$([system.io.path]::getinvalidpathchars())$([io.path]::altdirectoryseparatorchar)$([io.path]::directoryseparatorchar)""))]", "_")
         $ofs = " "

         if(!$filename) {
            $filename = $res.responseuri.segments[-1]
            $filename = $filename.trim("\/")
            if(!$filename) {
               $filename = read-host "please provide a file name"
            }
            $filename = $filename.trim("\/")
            if(!([io.fileinfo]$filename).extension) {
               $filename = $filename + "." + $res.contenttype.split(";")[0].split("/")[1]
            }
         }
         $filename = join-path (convert-path (get-location -psprovider "filesystem")) $filename
      }
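      if($passthru) {
         # fall back to utf-8 when the response does not declare a character set
         $charset = $res.characterset
         if(!$charset) { $charset = "utf-8" }
         $encoding = [system.text.encoding]::getencoding( $charset )
         [string]$output = ""
      }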

      # contentlength is a 64-bit value; [int] would overflow for files larger than 2 gb
      [long]$goal = $res.contentlength
      $reader = $res.getresponsestream()
      if($filename) {
         try {
            $writer = new-object system.io.filestream $filename, "create"
         } catch {
            write-error $_.exception -category writeerror
            return
         }
      }
      [byte[]]$buffer = new-object byte[] 4096
      # running byte total is 64-bit for the same reason as $goal
      [long]$total = 0
      [int]$count = 0
      do
      {
         $count = $reader.read($buffer, 0, $buffer.length);
         if($filename) {
            $writer.write($buffer, 0, $count);
         }
         if($passthru){
            $output += $encoding.getstring($buffer,0,$count)
         } elseif(!$quiet) {
            $total += $count
            if($goal -gt 0) {
               write-progress "downloading $uri" "saving $total of $goal" -id 0 -percentcomplete (($total/$goal)*100)
            } else {
               write-progress "downloading $uri" "saving $total bytes..." -id 0
            }
         }
      } while ($count -gt 0)

      $reader.close()
      if($filename) {
         $writer.flush()
         $writer.close()
      }
      if($passthru){
         $output
      }
   }
   if(test-path variable:res) { $res.close(); }
   if($filename) {
      get-childitem $filename
   }
}
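
When the response has no usable Content-Disposition header, the listing falls back to the last segment of the response URI and, if that segment has no extension, borrows one from the Content-Type. The sketch below isolates that fallback so it can be tried without a network call. The function name `Get-FallbackFileName` and the example.com URLs are assumptions for illustration, not part of the original script.

```powershell
# Hypothetical helper: derive a file name from the response URI and,
# if the name has no extension, from the Content-Type header value
function Get-FallbackFileName {
    param(
        [System.Uri]$ResponseUri,
        [string]$ContentType
    )

    # Last path segment, e.g. "setup.exe" or "latest"
    $name = $ResponseUri.Segments[-1].Trim('\/')
    if (-not $name) { return $null }   # URI ends in "/": nothing to use

    if (-not [System.IO.Path]::GetExtension($name)) {
        # "application/xml; charset=utf-8" -> "xml"
        $subtype = $ContentType.Split(';')[0].Split('/')[1]
        if ($subtype) { $name = "$name.$subtype" }
    }
    return $name
}

# Placeholder URLs, used only to exercise the helper
Get-FallbackFileName ([uri]'http://example.com/files/setup.exe') 'application/octet-stream'
Get-FallbackFileName ([uri]'http://example.com/feeds/latest') 'application/xml; charset=utf-8'
```

A Content-Type subtype is not always a sensible extension (for example `octet-stream`), which is one reason the header-supplied name is preferred when it exists.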

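Call it as follows:
get-webfile http://mirrors.cnnic.cn/apache/couchdb/binary/win/1.4.0/setup-couchdb-1.4.0_r16b01.exe
This downloads the Windows installer for CouchDB 1.4.0, the latest release when this was written.
The result is shown in the figure below: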

[Figure: PowerShell Tips: Implementing File Download (wget-like)]

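While the file downloads, the script shows the number of bytes downloaded so far and the total file size, along with a progress bar, which makes it look a good deal like wget. Once the download completes, it displays information about the downloaded file.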
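
The progress display comes from a plain read/write loop that tracks a running byte count against the expected total. The following sketch reproduces that loop with an in-memory stream standing in for the network response, so it runs offline. The 10,000-byte payload and the activity text are arbitrary choices for this example.

```powershell
# Stand-in for the response stream: 10,000 bytes held in memory
$bytes  = New-Object byte[] 10000
$source = New-Object System.IO.MemoryStream -ArgumentList (,$bytes)
$target = New-Object System.IO.MemoryStream

[long]$goal  = $source.Length    # plays the role of Content-Length
[long]$total = 0
$buffer = New-Object byte[] 4096

do {
    # Read returns 0 once the stream is exhausted
    $count = $source.Read($buffer, 0, $buffer.Length)
    if ($count -gt 0) {
        $target.Write($buffer, 0, $count)
        $total += $count
        $percent = [int](($total / $goal) * 100)
        Write-Progress -Activity 'Copying' -Status "saved $total of $goal bytes" -PercentComplete $percent
    }
} while ($count -gt 0)

# Remove the progress bar once the copy is done
Write-Progress -Activity 'Copying' -Completed
"copied $total bytes"
```

Using 64-bit counters for the total and the goal keeps the percentage correct for files larger than 2 GB.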
