Downloading images automatically with Scrapy
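By default, the images pipeline reads URLs from an Item field named image_urls,
i.e. image_urls = Field()
item['image_urls'] must be a list.
Assigning a plain string, as in item['image_urls'] = "http://some.jpg", does not work.
It fails with the following error: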
Traceback (most recent call last):
  File "D:\Python27\lib\site-packages\scrapy\middleware.py", line 62, in _process_chain
    return process_chain(self.methods[methodname], obj, *args)
  File "D:\Python27\lib\site-packages\scrapy\utils\defer.py", line 65, in process_chain
    d.callback(input)
  File "D:\Python27\lib\site-packages\twisted\internet\defer.py", line 382, in callback
    self._startRunCallbacks(result)
  File "D:\Python27\lib\site-packages\twisted\internet\defer.py", line 490, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "D:\Python27\lib\site-packages\twisted\internet\defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\Python27\lib\site-packages\scrapy\contrib\pipeline\media.py", line 40, in process_item
    requests = arg_to_iter(self.get_media_requests(item, info))
  File "D:\Python27\lib\site-packages\scrapy\contrib\pipeline\images.py", line 104, in get_media_requests
    return [Request(x) for x in item.get(self.IMAGES_URLS_FIELD, [])]
  File "D:\Python27\lib\site-packages\scrapy\http\request\__init__.py", line 26, in __init__
    self._set_url(url)
  File "D:\Python27\lib\site-packages\scrapy\http\request\__init__.py", line 57, in _set_url
    self._set_url(url.encode(self.encoding))
  File "D:\Python27\lib\site-packages\scrapy\http\request\__init__.py", line 61, in _set_url
    raise ValueError('Missing scheme in request url: %s' % self._url)
exceptions.ValueError: Missing scheme in request url: h
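The pipeline iterates over the field, so a string is consumed one character at a time and the first "URL" it tries to request is just "h". The value must be a list: item['image_urls'] = ["http://some.jpg"]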
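The failure can be reproduced without Scrapy. The sketch below is only an illustration written for Python 3: a plain dict stands in for the Item, and the helper names urls_requested and missing_scheme are made up for this example and are not part of Scrapy. It mimics the list comprehension shown in the traceback, which builds one request per element of image_urls.

```python
from urllib.parse import urlparse


def urls_requested(item):
    """Mimic the pipeline loop from the traceback: one entry per element."""
    return [x for x in item.get("image_urls", [])]


def missing_scheme(url):
    """Return True when the URL has no scheme such as http."""
    return urlparse(url).scheme == ""


# A plain dict stands in for the Scrapy Item (assumption for this sketch).
bad = {"image_urls": "http://some.jpg"}      # string: iterated char by char
good = {"image_urls": ["http://some.jpg"]}   # list: iterated URL by URL

bad_urls = urls_requested(bad)
good_urls = urls_requested(good)

print(bad_urls[0], missing_scheme(bad_urls[0]))
print(good_urls[0], missing_scheme(good_urls[0]))
```

With the string, the first element is the single character "h", which has no scheme, and this matches the "Missing scheme in request url: h" message. With the list, the full URL is passed through intact.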