标签 php 下的文章

 1 <?php
 2 $url = $_GET['url'];
 3 //获取视频url
 4 $url = get_redirect_url($url);
 5 //获取视频ID
 6 $str = dirname($url);
 7 $id = substr($str,strripos($str,'video')+6);
 8 //调用抖音官方API
 9 $str = file_get_contents('https://www.douyin.com/web/api/v2/aweme/iteminfo/?item_ids='.$id);
10 //将返回的json数据转为数组
11 $data = json_decode($str,true);
12 //获取有水印的视频地址
13 $url = $data['item_list'][0]['video']['play_addr']['url_list'][0];
14 //将playvm替换为play,从而获取无水印的视频地址
15 $url = str_replace('playwm','play',$url);
16 //获取重定向后的真实地址
17 $video_url = get_redirect_url($url);
18 echo "<a href='$video_url' target='_blank'>$video_url</a>";
19 
20 function get_redirect_url($url) {
21 $ch = curl_init();
22 curl_setopt($ch, CURLOPT_URL, $url);
23 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
24 curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
25 curl_setopt($ch, CURLOPT_HTTPHEADER, array(
26 'Accept: */*',
27 'Accept-Encoding: gzip',
28 'Connection: Keep-Alive',
29 'User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)'
30 ));
31 curl_setopt($ch, CURLOPT_HEADER, true);
32 curl_setopt($ch, CURLOPT_NOBODY, 1);
33 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
34 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
35 $ret = curl_exec($ch);
36 curl_close($ch);
37 preg_match("/Location: (.*?)\r\n/iU",$ret,$location);
38 return $location[1];
39 }

原理可参考:https://baijiahao.baidu.com/s?id=1710928885019968429&wfr=spider&for=pc 

函数可参考:https://blog.csdn.net/weixin_29924799/article/details/116287105

函数说明

可以直接获取网址重定向 (302,301) 之后的地址

函数源码

function get_location($url,$ua=0){

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);

$httpheader[] = "Accept:*/*";

$httpheader[] = "Accept-Encoding:gzip,deflate,sdch";

$httpheader[] = "Accept-Language:zh-CN,zh;q=0.8";

$httpheader[] = "Connection:close";

curl_setopt($ch, CURLOPT_HTTPHEADER, $httpheader);

curl_setopt($ch, CURLOPT_HEADER, true);

if ($ua) {

curl_setopt($ch, CURLOPT_USERAGENT, $ua);

} else {

curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Linux; U; Android 4.0.4; es-mx; HTC_One_X Build/IMM76D) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0");

}

curl_setopt($ch, CURLOPT_NOBODY, 1);

curl_setopt($ch, CURLOPT_ENCODING, "gzip");

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$ret = curl_exec($ch);

curl_close($ch);

preg_match("/Location: (.*?)\r\n/iU",$ret,$location);

return $location[1];

}

使用示例

//使用默认ua

echo get_location('http://example.com');

//使用自定义ua

$ua = 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_3 like Mac OS X) AppleWebKit/604.3.5 (KHTML, like Gecko) Version/13.0 MQQBrowser/9.0.0 Mobile/15B87 Safari/604.1 MttCustomUA/2 QBWebViewType/1 WKType/1';

echo get_location('http://example.com',$ua);

 

转载:https://blog.csdn.net/weixin_29924799/article/details/116287105

1. 爬取页面数据

$url = "http://www.zongscan.com/demo333/178.html";
$request = new GuzzleRequest('GET', $url);
$client = new \GuzzleHttp\Client();
$response = $client->send($request, ['timeout' => 5]);


2. 获取爬虫结果

$content = $response->getBody()->getContents();

 

3. 将结果转换为数组

$data = json_decode($content,true);

 

参考:https://www.zongscan.com/demo333/187.html