Python crawlers: using Scrapy's image pipeline

Python. Updated: 2026-04-01 18:10:38. Published: 1638 days ago.

Reference: the official Scrapy 2.5.0 documentation, "Downloading and processing files and images".

Usage

Spider code (the file under the spiders directory)

import scrapy


class ZolSpider(scrapy.Spider):
    name = 'zol'
    allowed_domains = ['zol.com.cn']
    start_urls = ['https://desk.zol.com.cn/bizhi/9691_117173_2.html']

    def parse(self, response):
        # ImagesPipeline reads the list of URLs to download from the 'image_urls' field
        img_urls = response.xpath('//img[@id="bigImg"]/@src').getall()
        yield {
            'image_urls': img_urls,
        }
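The pipeline requests each entry of image_urls as-is, so the entries need to be absolute URLs. If a page's src attributes turn out to be relative or protocol-relative, they can be resolved against the page URL first. The sketch below uses only the standard library; the src values are hypothetical and are not taken from the ZOL page. Inside a spider, response.urljoin(src) does the equivalent job.

```python
from urllib.parse import urljoin

# Page URL taken from start_urls in the spider above.
page_url = "https://desk.zol.com.cn/bizhi/9691_117173_2.html"

# Hypothetical src values for illustration; real values on the page may differ.
raw_srcs = [
    "//img.example.com/wallpapers/a.jpg",  # protocol-relative
    "/static/b.jpg",                       # relative to the site root
    "https://img.example.com/c.jpg",       # already absolute
]

# Resolve every src against the page URL so each one is downloadable.
image_urls = [urljoin(page_url, src) for src in raw_srcs]

for url in image_urls:
    print(url)
```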

settings.py

ITEM_PIPELINES = {
    # 'image.pipelines.ImagePipeline': 300,
    'scrapy.pipelines.images.ImagesPipeline': 200,
}

# Directory where downloaded images are stored
IMAGES_STORE = r'D:\python_reptile\scrapy中imagepipeline的使用\image\img'
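With the built-in ImagesPipeline, each image is converted to JPEG and saved under IMAGES_STORE in a full/ subdirectory, named after the SHA-1 hash of its URL. The following standard-library sketch reproduces that default naming so you can predict where a given URL ends up. The helper name default_image_path and the example URL are illustrative assumptions, not part of Scrapy.

```python
import hashlib
import posixpath


def default_image_path(url: str) -> str:
    """Return the path, relative to IMAGES_STORE, that the default
    ImagesPipeline naming scheme produces for an image URL."""
    # The file name is the SHA-1 hex digest of the URL, with a .jpg extension.
    image_guid = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return posixpath.join("full", image_guid + ".jpg")


# Hypothetical image URL, used only for illustration.
path = default_image_path("https://img.example.com/wallpapers/a.png")
print(path)
```

The same URL always maps to the same file name, and the extension is .jpg regardless of the source format.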

Custom image pipeline

pipelines.py

Subclass ImagesPipeline

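from scrapy import Request
from scrapy.pipelines.images import ImagesPipeline


class ImagePipeline(ImagesPipeline):
    def get_media_requests(self, item, info):
        # 'image_urls' is the field yielded by the spider; issue one request per URL
        return [Request(url) for url in item.get('image_urls', [])]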

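Note: the key read in get_media_requests must match the field name the spider yields (image_urls in the spider above).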
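For the custom class to run, it also has to be registered in settings.py. A minimal sketch, assuming the project package is named image as the commented-out entry in the settings above suggests:

```python
# settings.py (sketch): register the custom pipeline in place of the built-in one.
# 'image' is assumed to be the Scrapy project package name.
ITEM_PIPELINES = {
    'image.pipelines.ImagePipeline': 300,
}
```

Registering only one of the two pipelines keeps each item from being processed by both. IMAGES_STORE is still required, because the subclass inherits its storage behavior from ImagesPipeline.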

When reposting, please credit the source: www.mshxw.com

Article URL: https://www.mshxw.com/it/293944.html
