Debugging a crawl with the Scrapy shell
Reference: Scrapy shell — Scrapy 2.6.2 documentation
Use the scrapy.shell.inspect_response function to debug in the middle of a crawl:
Example: invoking the shell from inside a spider
import scrapy


class MySpider(scrapy.Spider):
    name = "myspider"
    start_urls = [
        "http://example.com",
        "http://example.org",
        "http://example.net",
    ]

    def parse(self, response):
        # We want to inspect one specific response.
        if ".org" in response.url:
            from scrapy.shell import inspect_response  # import inspect_response here
            inspect_response(response, self)  # drop into the shell
        # Rest of parsing code.
When the spider runs, it drops into a shell session like this:
2014-01-23 17:48:31-0400 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://example.com> (referer: None)
2014-01-23 17:48:31-0400 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://example.org> (referer: None)
[s] Available Scrapy objects:
[s] crawler <scrapy.crawler.Crawler object at 0x1e16b50>
...
>>> response.url
'http://example.org'
At this point, check whether the extraction you expect actually works:
>>> response.xpath('//h1[@class="fn"]')
[]
It does not; open the response in a browser to inspect it:
>>> view(response)
True
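Before resorting to the browser, a failing selector can also be reproduced offline against a saved snippet of the page. A minimal sketch using only the standard library (Scrapy's own selectors are lxml-based and support full XPath, whereas ElementTree below supports only a subset; the HTML snippet and the "title" class are hypothetical stand-ins):

```python
import xml.etree.ElementTree as ET

# A stand-in for the page body fetched during the crawl.
html = "<html><body><h1 class='title'>Example Domain</h1></body></html>"
root = ET.fromstring(html)

# The selector that returned [] in the shell: no <h1 class="fn"> exists here.
print(root.findall(".//h1[@class='fn']"))       # []

# Inspecting the markup shows the class is actually "title".
print(root.find(".//h1[@class='title']").text)  # Example Domain
```

Comparing the selector against the markup this way quickly shows whether the class name differs from what the spider assumed, or whether the element is simply absent from the HTML Scrapy received (which view(response) will also reveal).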
Finally, press Ctrl-D (Ctrl-Z on Windows) to exit the shell and let the crawl continue:
>>> ^D
2014-01-23 17:50:03-0400 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://example.net> (referer: None)
...
