How to extract all URLs from a string in PHP

程序员文章站 2022-06-08 21:17:48


The details are as follows:

// $html        = the HTML of the page
// $current_url = the full URL that the HTML came from (only needed for $repath)
// $repath      = when true, converts ../, / and // URLs to full valid URLs

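function pageLinks($html, $current_url = "", $repath = false){
    // The pattern on this line was truncated in the source. This reconstruction
    // assumes it matched <a ... href="..."> with the URL in capture group 2 and
    // skipped javascript: and # links.
    preg_match_all('/<a\s[^>]*?href=("|\')(?!javascript:|#)(.+?)\1/i', $html, $matches);
    $links = array();
    if(isset($matches[2])){
        $links = $matches[2];
    }
    if($repath && count($links) > 0 && strlen($current_url) > 0){
        $base = parse_url($current_url);
        $root = $base["scheme"] . "://" . $base["host"];
        // Directory of the current page, without a trailing slash. pathinfo() on
        // the full URL returns "http:" for a bare host such as http://www.jb51.net.
        $path = isset($base["path"]) ? $base["path"] : "/";
        $dir_path = substr($path, 0, (int) strrpos($path, "/"));
        $dir = $root . $dir_path;
        $split_path = explode("/", $dir_path);
        foreach($links as $k => $link){
            $url = $link; // default: leave unrecognized links unchanged
            if(preg_match("/^\.\.\//", $link)){
                // Work on a copy so one link's ../ does not affect the next link
                $segments = $split_path;
                $total = substr_count($link, "../");
                for($i = 0; $i < $total && count($segments) > 1; $i++){
                    array_pop($segments);
                }
                $url = $root . implode("/", $segments) . "/" . str_replace("../", "", $link);
            }elseif(preg_match("/^\/\//", $link)){
                // Protocol-relative URL: reuse the scheme of the current page
                $url = $base["scheme"] . ":" . $link;
            }elseif(preg_match("/^\//", $link)){
                // Root-relative URL
                $url = $root . $link;
            }elseif(preg_match("/^\.\//", $link)){
                // ./file is relative to the current directory
                $url = $dir . "/" . substr($link, 2);
            }elseif(preg_match("/^[a-zA-Z0-9]/", $link)){
                if(preg_match("/^[a-z][a-z0-9+.\-]*:/i", $link)){
                    // Already absolute (http:, https:, mailto:, ...)
                    $url = $link;
                }else{
                    $url = $dir . "/" . $link;
                }
            }
            $links[$k] = $url;
        }
    }
    return $links;
}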

header("content-type: text/plain");
$url = "http://www.jb51.net";
$html = file_get_contents($url);

// Gets links from the page:
print_r(pageLinks($html));

// Gets links from the page and formats them to a full valid url:
print_r(pageLinks($html, $url, true));
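
Regex matching of href attributes can miss or mis-parse anchors in irregular markup. As a possible alternative, the sketch below collects the same raw href values with PHP's DOMDocument. It is not part of the original article: the function name pageLinksDom and the sample markup are hypothetical, and it assumes the DOM extension is enabled. It only extracts links and does not rewrite relative URLs.

```php
<?php
// Hypothetical helper (not from the original article): collect href values
// from <a> elements with DOMDocument instead of a regular expression.
function pageLinksDom($html){
    $links = array();
    if($html === ""){
        return $links; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Collect libxml warnings internally so malformed markup does not emit notices
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    foreach($doc->getElementsByTagName("a") as $anchor){
        $href = trim($anchor->getAttribute("href"));
        // Skip empty hrefs, in-page anchors and javascript: links
        if($href === "" || $href[0] === "#" || stripos($href, "javascript:") === 0){
            continue;
        }
        $links[] = $href;
    }
    return $links;
}

// Sample markup made up for illustration
$sample = '<p><a href="/about.html">About</a> <a href="#top">Top</a> '
        . '<a class="x" href=\'../news/1.html\'>News</a></p>';
print_r(pageLinksDom($sample));
```

With the sample markup, only /about.html and ../news/1.html are returned; the #top anchor is skipped.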
