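Preface
Hello everyone~ This is Xixi, who loves looking at pretty girls.
Empowered by technology: using tech to improve each person's unique sense of happiness.
On 快, users can record moments of their lives with photos and short videos, and can interact with their fans in real time through live streams.
Content on 快 covers every aspect of life, and its users are spread all across the country.
Here, people can find content they like, find people who interest them, see a more real and interesting world, and let the world discover their own real and interesting selves.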
Key points:
Development environment:
Code implementation:
1. Send the request
2. Get the data
3. Parse the data
4. Save the data
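Steps 2 and 3 can be tried offline before touching the network. The sketch below is only an illustration under stated assumptions: `SAMPLE_BODY` is a hand-written stand-in for the server's reply (the real response carries many more fields), the example.com URL is a placeholder, and `get_data` and `parse_data` are hypothetical helper names. The key names (`data`, `visionSearchPhoto`, `feeds`, `photo`, `caption`, `photoUrl`) are the ones used in the scraping code later in this post.

```python
import json

# Hand-written stand-in for the response body (an assumption for this sketch).
SAMPLE_BODY = """
{"data": {"visionSearchPhoto": {"feeds": [
  {"photo": {"caption": "demo clip", "photoUrl": "https://example.com/a.mp4"}},
  {"photo": {"caption": "no url here"}}
]}}}
"""


def get_data(body_text):
    """Step 2: turn the response text into a dictionary (roughly what response.json() does)."""
    return json.loads(body_text)


def parse_data(json_data):
    """Step 3: collect (caption, photoUrl) pairs, skipping incomplete entries."""
    search = (json_data.get("data") or {}).get("visionSearchPhoto") or {}
    videos = []
    for feed in search.get("feeds") or []:
        photo = feed.get("photo") or {}
        if photo.get("photoUrl"):
            videos.append((photo.get("caption", ""), photo["photoUrl"]))
    return videos


videos = parse_data(get_data(SAMPLE_BODY))
print(videos)  # prints [('demo clip', 'https://example.com/a.mp4')]
```

Using `.get(...)` with a fallback instead of direct indexing means a reply without a `feeds` list (for example, when the cookie has expired) gives an empty result rather than a `KeyError`.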
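Video scraping code
I removed the site name from the URLs; take a look at what the real link looks like and add it back yourself.
I'll post exactly which site is being scraped in the comments~ keep an eye out.
If you really can't get it working, or you're feeling just a little lazy, you can also message me privately and I'll send you the complete source code~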
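# Import modules
import requests  # third-party module, used to send requests
import re  # standard-library module, used for regular expressions
# Disguise the script as a browser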
headers = {
    'content-type': 'application/json',
    'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_d3f9d8c2cbebafd126b80eb0b1c13360; client_key=65890b29; didv=1658130458000; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABymzXlGDinYWz3v5NKZWKq6Ld14uOvyRNPT3Gi7uJwI8CE4aatjowKRbPtRt5YIE3s2otZdFEzL7kvW1PQuijqUT_qUe4-u0FlfN1S49mhR4QRc9YKQNObXAPYzZRWIRcrSvdohIwUW8TBTSWLUtMlMh2He2FyvNMR-JfhUHaK-YSkwqXKUj-N-zlHTCPp0z0y6cSgrR9RIdlXqIJFifSbxoSsguEA2pmac6i3oLJsA9rNwKEIiB86mXKYIgbGBbtkVuyoy8TCIwZ2uckiTnfAGZiyV9imCgFMAE; kuaishou.server.web_ph=7353170c91b8f7f05c250730c2faea5355e1',
    'Host': 'www..com',
    'Origin': 'https://www..com',
    'Referer': 'https://www..com/search/video?searchKey=%E6%B3%B3%E8%A3%85%E5%B0%8F%E5%A7%90%E5%A7%90',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
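The cookie in the headers above belongs to one logged-in browser session and will stop working once that session expires, so you will need to paste in your own. One possible way to avoid editing the script every time is to read the cookie from an environment variable. This is only a sketch: `build_headers` and the variable name `KS_COOKIE` are hypothetical names chosen for illustration, and the `Host`, `Origin` and `Referer` entries are left out here because the site name has been removed from them.

```python
import os

# The User-Agent string is copied from the headers above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
)


def build_headers(cookie=None):
    """Return request headers, taking the cookie from the environment if none is given."""
    if cookie is None:
        # KS_COOKIE is a hypothetical variable name used only in this sketch.
        cookie = os.environ.get("KS_COOKIE", "")
    if not cookie:
        raise ValueError("no cookie supplied; copy one from your browser's developer tools")
    return {
        "content-type": "application/json",
        "Cookie": cookie,
        "User-Agent": USER_AGENT,
    }
```

With this in place, `headers = build_headers()` could stand in for the hard-coded dictionary once the three site-specific entries are added back.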
for page in range(1, 11):
    # Only a POST request carries this JSON payload
    json = {
        'operationName': "visionSearchPhoto",
        'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n searchSessionId\n pcursor\n aladdinBanner {\n imgUrl\n link\n __typename\n }\n __typename\n }\n}\n",
        'variables': {'keyword': "泳装小姐姐", 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTg5MjM5NDExODBf5rOz6KOF5bCP5aeQ5aeQXzE4NzQ"}
    }
    url = 'https://www..com/graphql'
    # 1. Send the request
    response = requests.post(url=url, headers=headers, json=json)
    # <Response [200]>: the request succeeded
    # <Response [404]>: the server could not find the resource you asked for
    # Whether the server actually gives you the data is a separate matter
    # 2. Get the data
    # .text: the response body as a string
    # .json(): the response body as a dictionary
    json_data = response.json()
    # 3. Parse the data
    # xpath, css: can only extract data found in the page's HTML source
    # re: works even when xpath, css and json all fail, but is more complex
    # json: can only extract from {"": ""} and ["", ""] structures
    feeds = json_data['data']['visionSearchPhoto']['feeds']
    for feed in feeds:
        photoUrl = feed['photo']['photoUrl']
        caption = feed['photo']['caption']
        print(caption, photoUrl)
        # Remove characters that are not allowed in file names
        # (raw string, so the backslash itself is matched as well)
        caption = re.sub(r'[\\/:*?"<>|\n]', '', caption)
        # 4. Save the video
        # In most cases, video, image and audio links can be fetched directly with GET
        # .content: the response body as raw binary data
        # Note: the video/ folder must already exist
        video_data = requests.get(photoUrl).content
        with open(f'video/{caption}.mp4', mode='wb') as f:
            f.write(video_data)
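Saving by caption has a few rough edges: the `video` folder has to exist, an empty caption gives a file named `.mp4`, and two videos with the same caption overwrite each other. The sketch below is one possible way to handle these, not part of the original script. `safe_filename` and `video_path` are hypothetical helper names, and the 80-character limit and the `untitled` fallback are arbitrary choices made for illustration.

```python
import os
import re


def safe_filename(caption, fallback="untitled", max_len=80):
    """Strip characters that are not allowed in Windows file names."""
    name = re.sub(r'[\\/:*?"<>|\r\n]', '', caption).strip()[:max_len]
    # An empty caption would otherwise produce the file name ".mp4".
    return name or fallback


def video_path(caption, folder="video"):
    """Create the folder if needed and return a path that does not exist yet."""
    os.makedirs(folder, exist_ok=True)
    base = safe_filename(caption)
    path = os.path.join(folder, f"{base}.mp4")
    counter = 1
    # Two videos can share a caption; add a numeric suffix instead of overwriting.
    while os.path.exists(path):
        path = os.path.join(folder, f"{base}_{counter}.mp4")
        counter += 1
    return path
```

In the loop above, `open(video_path(caption), mode='wb')` could then take the place of the f-string path.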