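When scraping data, sending too many requests from one address gets the IP banned, so the requests are routed through a SOCKS5 proxy instead.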
#!/usr/bin/python
# coding=utf-8
# SOCKS support in requests needs PySocks: pip install requests[socks]
import requests

headers = {
    'Host': 'www.rootop.org',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.146 Safari/537.36',
}

# SOCKS5 without authentication
#proxies = {'https': 'socks5://61.97.x.x:20115', 'http': 'socks5://61.97.x.x:20115'}

# SOCKS5 with username/password authentication
proxies = {
    'https': 'socks5://user111:pass111@61.97.x.x:20115',
    'http': 'socks5://user111:pass111@61.97.x.x:20115',
}

page = requests.get("https://www.rootop.org/pages/4927.html", headers=headers, proxies=proxies)
print(page.text)
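The proxies dict above repeats the same URL twice and embeds the credentials by hand. A minimal sketch of a helper that builds it is shown here. The name build_socks5_proxies and its parameters are hypothetical, not part of requests. It percent-encodes the username and password so that characters such as '@' or ':' do not break the URL, and it can switch to the socks5h scheme, which asks the proxy to resolve hostnames instead of resolving them locally.

```python
from urllib.parse import quote


def build_socks5_proxies(host, port, user=None, password=None, remote_dns=False):
    """Return a requests-style proxies dict for a SOCKS5 proxy (hypothetical helper)."""
    # socks5h lets the proxy resolve hostnames; plain socks5 resolves them locally.
    scheme = "socks5h" if remote_dns else "socks5"
    auth = ""
    if user is not None:
        # Percent-encode credentials so '@' or ':' inside them cannot break the URL.
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = f"{scheme}://{auth}{host}:{port}"
    # The same proxy URL is used for both plain HTTP and HTTPS targets.
    return {"http": url, "https": url}


# Same masked address and sample credentials as in the script above.
proxies = build_socks5_proxies("61.97.x.x", 20115, "user111", "pass111")
print(proxies["https"])  # socks5://user111:pass111@61.97.x.x:20115
```

The returned dict can be passed straight to requests.get(..., proxies=proxies) in place of the hand-written one.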
The nginx access log shows that the source IP is now the proxy server's IP.
{ "remote_addr":"61.97.x.x", "time_local":"31/Mar/2021:09:54:02 +0800", "method":"GET","uri":"/index.php", "server_protocol":"HTTP/1.1", "request":"GET /pages/4927.html HTTP/1.1", "user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.146 Safari/537.36" }
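The same check can be done in code rather than by eye. This is a rough sketch that assumes the access log is written as one JSON object per line, as in the entry above. The function name came_through_proxy is made up for illustration, and the sample line is trimmed to a few of the fields shown.

```python
import json

# Sample entry based on the log above, trimmed to a few fields.
log_line = '{"remote_addr":"61.97.x.x","method":"GET","request":"GET /pages/4927.html HTTP/1.1"}'
proxy_ip = "61.97.x.x"  # proxy address, masked as in the article


def came_through_proxy(line, expected_ip):
    """Return True if the log entry's remote_addr equals the proxy IP."""
    entry = json.loads(line)
    return entry.get("remote_addr") == expected_ip


print(came_through_proxy(log_line, proxy_ip))  # True
```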
Original article; please credit the source when reposting. Permalink: https://www.rootop.org/pages/4935.html