This middleware adds several settings that configure how Scrapy works with Crawlera.


Default: None

Unique Crawlera API Key provided for authentication.


Default: ''

Crawlera instance URL. It varies depending on whether you acquired a private or dedicated instance. If Crawlera didn’t provide you with a private instance URL, you don’t need to specify it.


Default: 400

Number of consecutive bans from Crawlera necessary to stop the spider.


Default: 190

Timeout for processing Crawlera requests. It overrides Scrapy’s DOWNLOAD_TIMEOUT.


Default: False

If False, the middleware sets Scrapy’s DOWNLOAD_DELAY to 0, making the spider crawl faster. If True, it respects the DOWNLOAD_DELAY configured in Scrapy.


Default: {}

Default headers added only to Crawlera requests. Headers defined in DEFAULT_REQUEST_HEADERS take precedence as long as the CrawleraMiddleware is placed after the DefaultHeadersMiddleware. Headers set on individual requests take precedence over both settings.

  • This is the default behavior: the default priority of DefaultHeadersMiddleware is 400, and we recommend a priority of 610 for CrawleraMiddleware.
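
The precedence described above can be illustrated with a small standalone sketch. It is not the middleware's actual code: `resolve_headers` is a hypothetical helper, the header names are made up, and plain dicts are used, so the case-insensitive header matching that Scrapy performs is ignored. It assumes CrawleraMiddleware runs after DefaultHeadersMiddleware.

```python
def resolve_headers(request_headers, default_request_headers, crawlera_default_headers):
    """Merge three header sources, lowest precedence first.

    Hypothetical helper for illustration only; it mimics the documented
    ordering, not the real middleware implementation.
    """
    merged = dict(crawlera_default_headers)   # lowest precedence
    merged.update(default_request_headers)    # DEFAULT_REQUEST_HEADERS wins over Crawlera defaults
    merged.update(request_headers)            # headers set on the request win over both
    return merged


# Made-up header names, chosen only to show which source wins.
resolved = resolve_headers(
    request_headers={"X-Example-A": "request"},
    default_request_headers={"X-Example-A": "default", "X-Example-B": "default"},
    crawlera_default_headers={
        "X-Example-A": "crawlera",
        "X-Example-B": "crawlera",
        "X-Example-C": "crawlera",
    },
)
print(resolved)
```

Each header ends up with the value from the highest-precedence source that defines it, so a Crawlera default header only takes effect when neither the request nor DEFAULT_REQUEST_HEADERS sets it.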


Default: 15

Step size used for calculating exponential backoff according to the formula: random.uniform(0, min(max, step * 2 ** attempt)).


Default: 180

Max value for exponential backoff, as shown in the formula above.
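
To see how the step and max values interact, the formula can be evaluated directly. The following is a minimal sketch rather than the middleware's own code: the function names `backoff_ceiling` and `backoff_delay` are hypothetical, and the defaults are the documented values 15 and 180.

```python
import random


def backoff_ceiling(attempt, step=15, max_delay=180):
    """Upper bound of the random delay for a given attempt number."""
    return min(max_delay, step * 2 ** attempt)


def backoff_delay(attempt, step=15, max_delay=180):
    """Random delay drawn according to the documented formula."""
    return random.uniform(0, backoff_ceiling(attempt, step, max_delay))


# The ceiling doubles on each attempt until it reaches the max value.
for attempt in range(5):
    print(attempt, backoff_ceiling(attempt), round(backoff_delay(attempt), 2))
```

With the defaults, the ceiling grows as 15, 30, 60, 120 and is then capped at 180 from the fifth attempt onward. The actual delay is drawn uniformly between 0 and that ceiling, so consecutive retries are spread out rather than synchronized.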