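Web Scraping Study Notes 3 (Scraping Images)
Scraping images works almost the same way as scraping text. The key step is finding each image's download address.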
link = 'picbian/4kfengjing/'
link_add = []
link_add.append(link)
for i in range(2, 11):
    link_add.append(link + 'index_' + str(i) + '.html')
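The page-list construction above can be wrapped in a small helper. This is only a minimal sketch: the function name `build_page_urls` and its parameters are hypothetical, and it assumes, as the loop above does, that page 1 is the bare directory address and later pages are named `index_<n>.html`.

```python
def build_page_urls(base, last_page):
    """Return listing-page URLs for pages 1..last_page.

    Assumes page 1 is the bare directory URL and page n (n >= 2)
    is base + 'index_<n>.html', as in the loop above.
    """
    urls = [base]
    urls.extend(f"{base}index_{i}.html" for i in range(2, last_page + 1))
    return urls


# 'picbian/4kfengjing/' is the base string used in this post.
pages = build_page_urls('picbian/4kfengjing/', 10)
print(len(pages))
print(pages[1])
```

Keeping the page count as a parameter avoids having to keep `range(2, 11)` and the later `range(10)` in sync by hand.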
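Inspecting each page of 4K landscape images shows that the images all sit in <img> tags under the <div class="slist"> tag, so a single line of code is enough to find the image addresses.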
pic_list = soup.find('div', class_='slist').find_all('img')
for pic in pic_list:
    pic_url = 'picbian' + pic['src']
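Plain string concatenation works as long as every `src` is a site-relative path. As a hedged alternative, the standard library's `urllib.parse.urljoin` resolves a `src` against the page address and also copes with values that are already absolute. The page URL `https://example.com/4kfengjing/index_2.html` and the sample paths below are placeholders made up for illustration, not the real site's.

```python
from urllib.parse import urljoin

# Placeholder page URL; substitute the real listing-page address.
page_url = 'https://example.com/4kfengjing/index_2.html'

# Sample src values of the kinds an <img> tag may carry (made up for illustration).
srcs = [
    '/uploads/allimg/sample-1.jpg',              # site-relative path
    'thumbs/sample-2.jpg',                       # relative to the current page
    'https://cdn.example.com/img/sample-3.jpg',  # already absolute
]

pic_urls = [urljoin(page_url, src) for src in srcs]
for url in pic_urls:
    print(url)
```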
Each image needs a file name; the simplest way is to slice it directly out of the image address.
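Slicing a fixed number of characters (the full code takes the last 18) relies on every file name having the same length. A more tolerant option, sketched here with the standard library only, is to take the last path segment of the address. The helper name `name_from_url` and the sample addresses are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def name_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


# Made-up addresses, for illustration only.
print(name_from_url('/uploads/allimg/sample-picture.jpg'))
print(name_from_url('https://example.com/img/a.jpg?v=2'))
```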
The complete code is as follows:
import requests
from bs4 import BeautifulSoup

header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'}
link = 'picbian/4kfengjing/'
# Build the list of listing pages: page 1 plus index_2.html ... index_10.html
link_add = []
link_add.append(link)
for i in range(2, 11):
    link_add.append(link + 'index_' + str(i) + '.html')
for n in range(10):
    # Fetch and parse one listing page
    r = requests.get(link_add[n], headers=header, timeout=10)
    soup = BeautifulSoup(r.text, 'lxml')
    pic_list = soup.find('div', class_='slist').find_all('img')
    for pic in pic_list:
        pic_url = 'picbian' + pic['src']
        pic_name = pic['src'][-18:]  # last 18 characters of the path serve as the file name
        picture = requests.get(pic_url).content
        # The with block closes the file automatically
        with open('d:\\picture\\' + pic_name, 'wb') as f:
            f.write(picture)
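Note that `open()` raises `FileNotFoundError` if the `d:\picture` folder does not exist yet. One possible way to handle that, shown as a sketch rather than part of the original script, is a small helper that creates the folder before writing. The name `save_image` is hypothetical.

```python
from pathlib import Path


def save_image(data, directory, name):
    """Write image bytes to directory/name, creating the directory if needed."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    target = target_dir / name
    target.write_bytes(data)
    return target
```

Inside the download loop this would be called as `save_image(picture, 'd:\\picture', pic_name)` in place of the `with open(...)` block.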