external RequestQueue
Properties
external assumedHandledCount
external assumedTotalCount
external client
external clientKey
external readonly config
external id
external inProgress
external internalTimeoutMillis
external lastActivity
external log
external optional name
external optional queryQueueHeadPromise
external recentlyHandled
external requestsCache
external timeoutSecs
Methods
external addRequest
Adds a request to the queue.
If a request with the same uniqueKey property is already present in the queue, it will not be updated. You can find out whether this happened from the resulting QueueOperationInfo object.
To add multiple requests to the queue by extracting links from a webpage, see the enqueueLinks helper function.
Parameters
external requestLike: Request<Dictionary<any>> | RequestOptions<Dictionary<any>>
Request object or plain object with request data. Note that the function sets the uniqueKey and id fields on the passed Request.
external optional options: RequestQueueOperationOptions
Request queue operation options.
Returns Promise<QueueOperationInfo>
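The deduplication rule above can be illustrated with a minimal in-memory sketch. It is not the library's implementation: the class InMemoryQueueSketch, the result field names, and the use of the raw URL as the default uniqueKey are all assumptions made for illustration, and the model is synchronous whereas the real method returns a promise.

```typescript
// Hypothetical in-memory model of uniqueKey deduplication.
// All names here are illustrative assumptions, not the library's API.
interface OperationInfoSketch {
    requestId: string;
    wasAlreadyPresent: boolean;
    wasAlreadyHandled: boolean;
}

interface RequestLikeSketch {
    url: string;
    uniqueKey?: string;
    userData?: Record<string, unknown>;
}

interface StoredEntry {
    id: string;
    request: RequestLikeSketch;
    handled: boolean;
}

class InMemoryQueueSketch {
    private readonly byUniqueKey = new Map<string, StoredEntry>();
    private nextId = 1;

    addRequest(requestLike: RequestLikeSketch): OperationInfoSketch {
        // Assumption: fall back to the raw URL when no uniqueKey is given.
        const uniqueKey = requestLike.uniqueKey ?? requestLike.url;
        const existing = this.byUniqueKey.get(uniqueKey);
        if (existing) {
            // Same uniqueKey: the stored request is left untouched.
            return {
                requestId: existing.id,
                wasAlreadyPresent: true,
                wasAlreadyHandled: existing.handled,
            };
        }
        const id = `req-${this.nextId++}`;
        this.byUniqueKey.set(uniqueKey, {
            id,
            request: { ...requestLike, uniqueKey },
            handled: false,
        });
        return { requestId: id, wasAlreadyPresent: false, wasAlreadyHandled: false };
    }

    size(): number {
        return this.byUniqueKey.size;
    }
}

const dedupQueue = new InMemoryQueueSketch();
const firstAdd = dedupQueue.addRequest({ url: 'https://example.com/a' });
// Same URL, so the same default uniqueKey: reported as already present.
const duplicateAdd = dedupQueue.addRequest({ url: 'https://example.com/a', userData: { retry: 1 } });
// Same URL but an explicit, different uniqueKey: stored as a new request.
const forcedAdd = dedupQueue.addRequest({ url: 'https://example.com/a', uniqueKey: 'a-second-visit' });
```

In this sketch the second call returns the ID of the first request and changes nothing, while the third call adds a second entry for the same URL because its uniqueKey differs.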
external addRequests
Adds requests to the queue in batches of 25.
If a request that is passed in is already present because its uniqueKey property is the same, it will not be updated. You can find out whether this happened by finding the request in the resulting BatchAddRequestsResult object.
Parameters
external requestsLike: (Request<Dictionary<any>> | RequestOptions<Dictionary<any>>)[]
Request objects or plain objects with request data. Note that the function sets the uniqueKey and id fields on the passed requests if they are missing.
external optional options: RequestQueueOperationOptions
Request queue operation options.
Returns Promise<BatchAddRequestsResult>
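To make the batch size concrete, the following sketch splits a list of request-like objects into groups of 25. The helper chunkSketch is hypothetical and only shows the grouping; how the library batches requests internally is not described here.

```typescript
// Hypothetical helper: split a list into consecutive batches of a fixed size.
function chunkSketch<T>(items: readonly T[], size: number): T[][] {
    if (!Number.isInteger(size) || size <= 0) {
        throw new RangeError('size must be a positive integer');
    }
    const result: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        result.push(items.slice(i, i + size));
    }
    return result;
}

// 60 illustrative request-like objects (the URLs are placeholders).
const manyRequests = Array.from({ length: 60 }, (_, i) => ({
    url: `https://example.com/page/${i}`,
}));

const requestBatches = chunkSketch(manyRequests, 25);
const batchSizes = requestBatches.map((batch) => batch.length);
console.log(batchSizes); // [ 25, 25, 10 ]
```

With 60 requests, this grouping yields two full batches and a final batch of 10.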
external drop
Removes the queue either from the Apify Cloud storage or from the local database, depending on the mode of operation.
Returns Promise<void>
external fetchNextRequest
Returns the next request in the queue to be processed, or null if there are no more pending requests.
Once you have successfully processed the request, call RequestQueue.markRequestHandled to mark it as handled in the queue. If processing failed, call RequestQueue.reclaimRequest instead, so that the queue gives the request to another consumer in a later call to fetchNextRequest.
Note that a null return value doesn't mean that queue processing has finished; it means there are currently no pending requests. To check whether all requests in the queue were finished, use RequestQueue.isFinished instead.
Type parameters
- T: Dictionary<any> = Dictionary<any>
Returns Promise<null | Request<T>>
Returns the request object, or null if there are no more pending requests.
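The fetch, handle, and reclaim cycle described for fetchNextRequest can be sketched with a small in-memory stand-in. TinyQueueSketch, its retryCount field, and the handlePage function are assumptions made for illustration; the stand-in is synchronous and single-consumer, whereas the real methods return promises and the queue may be shared.

```typescript
// Hypothetical single-consumer model of the fetch / mark handled / reclaim cycle.
interface WorkItemSketch {
    url: string;
    retryCount: number;
}

class TinyQueueSketch {
    private pending: WorkItemSketch[] = [];
    private inProgress = new Set<WorkItemSketch>();
    private handled: WorkItemSketch[] = [];

    add(url: string): void {
        this.pending.push({ url, retryCount: 0 });
    }

    // Returns the next pending item, or null when nothing is pending right now.
    fetchNextRequest(): WorkItemSketch | null {
        const next = this.pending.shift() ?? null;
        if (next !== null) this.inProgress.add(next);
        return next;
    }

    // Handled items are never returned by fetchNextRequest again.
    markRequestHandled(item: WorkItemSketch): void {
        this.inProgress.delete(item);
        this.handled.push(item);
    }

    // Failed items go back to the pending list with their updated state.
    reclaimRequest(item: WorkItemSketch): void {
        this.inProgress.delete(item);
        this.pending.push(item);
    }

    isFinished(): boolean {
        return this.pending.length === 0 && this.inProgress.size === 0;
    }

    handledUrls(): string[] {
        return this.handled.map((item) => item.url);
    }
}

// Hypothetical handler: the "flaky" URL fails on its first attempt only.
function handlePage(item: WorkItemSketch): void {
    if (item.url.endsWith('/flaky') && item.retryCount === 0) {
        throw new Error('temporary failure');
    }
}

const workQueue = new TinyQueueSketch();
workQueue.add('https://example.com/flaky');
workQueue.add('https://example.com/ok');

let current: WorkItemSketch | null;
while ((current = workQueue.fetchNextRequest()) !== null) {
    try {
        handlePage(current);
        workQueue.markRequestHandled(current);
    } catch {
        // Record the retry on the request itself, then give it back to the queue.
        current.retryCount += 1;
        workQueue.reclaimRequest(current);
    }
}
```

With a single synchronous consumer, a null result here does coincide with the queue being finished. With several consumers that is not guaranteed, which is why the text above points to RequestQueue.isFinished.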
external getInfo
Returns an object containing general information about the request queue.
The function returns the same object as the Apify API Client's getQueue function, which in turn calls the Get request queue API endpoint.
Example:
    {
        id: "WkzbQMuFYuamGv3YF",
        name: "my-queue",
        userId: "wRsJZtadYvn4mBZmm",
        createdAt: new Date("2015-12-12T07:34:14.202Z"),
        modifiedAt: new Date("2015-12-13T08:36:13.202Z"),
        accessedAt: new Date("2015-12-14T08:36:13.202Z"),
        totalRequestCount: 25,
        handledRequestCount: 5,
        pendingRequestCount: 20,
    }
Returns Promise<undefined | RequestQueueInfo>
external getRequest
Gets the request with the specified ID from the queue.
Type parameters
- T: Dictionary<any> = Dictionary<any>
Parameters
external id: string
ID of the request.
Returns Promise<null | Request<T>>
Returns the request object, or null if it was not found.
external handledCount
Returns the number of handled requests.
This function is just a convenient shortcut for:
    const { handledRequestCount } = await queue.getInfo();
Returns Promise<number>
external isEmpty
Resolves to true if the next call to RequestQueue.fetchNextRequest would return null, otherwise it resolves to false. Note that even if the queue is empty, there might be some pending requests currently being processed. If you need to ensure that there is no activity in the queue, use RequestQueue.isFinished.
Returns Promise<boolean>
external isFinished
Resolves to true if all requests were already handled and there are no more left. Due to the nature of distributed storage used by the queue, the function might occasionally return a false negative, but it will never return a false positive.
Returns Promise<boolean>
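The difference between isEmpty and isFinished can be stated as two predicates over a simplified queue state. The QueueStateSketch shape and both function names are hypothetical. The sketch also ignores the distributed-storage effects mentioned above, so it never produces the occasional false negative.

```typescript
// Hypothetical, simplified queue state: counts of requests in each phase.
interface QueueStateSketch {
    pendingCount: number;    // waiting to be fetched
    inProgressCount: number; // fetched, but not yet handled or reclaimed
}

// "Empty": the next fetch would return null because nothing is waiting.
function isEmptySketch(state: QueueStateSketch): boolean {
    return state.pendingCount === 0;
}

// "Finished": nothing is waiting and nothing is still being processed.
function isFinishedSketch(state: QueueStateSketch): boolean {
    return state.pendingCount === 0 && state.inProgressCount === 0;
}

// A consumer has fetched the last request but has not finished handling it.
const lastRequestInFlight: QueueStateSketch = { pendingCount: 0, inProgressCount: 1 };
const emptyNow = isEmptySketch(lastRequestInFlight);
const finishedNow = isFinishedSketch(lastRequestInFlight);
console.log(emptyNow, finishedNow); // true false
```

A state that is empty but not finished is exactly the case where a failed in-progress request may still be reclaimed and become pending again.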
external markRequestHandled
Marks a request that was previously returned by the RequestQueue.fetchNextRequest function as handled after successful processing. Handled requests will never again be returned by the fetchNextRequest function.
Parameters
external request: Request<Dictionary<any>>
Returns Promise<null | QueueOperationInfo>
external reclaimRequest
Reclaims a failed request back to the queue, so that it can be returned for processing again by a later call to RequestQueue.fetchNextRequest. The request record in the queue is updated using the provided request parameter. For example, this lets you store the number of retries or error messages for the request.
Parameters
external request: Request<Dictionary<any>>
external optional options: RequestQueueOperationOptions
Returns Promise<null | QueueOperationInfo>
static external open
Opens a request queue and returns a promise resolving to an instance of the RequestQueue class.
RequestQueue represents a queue of URLs to crawl, which is stored either on the local filesystem or in the cloud. The queue is used for deep crawling of websites, where you start with several URLs and then recursively follow links to other pages. The data structure supports both breadth-first and depth-first crawling orders.
For more details and code examples, see the RequestQueue class.
Parameters
external optional queueIdOrName: null | string
ID or name of the request queue to be opened. If null or undefined, the function returns the default request queue associated with the crawler run.
external optional options: StorageManagerOptions
Open Request Queue options.
Returns Promise<RequestQueue>
Represents a queue of URLs to crawl, which is used for deep crawling of websites where you start with several URLs and then recursively follow links to other pages. The data structure supports both breadth-first and depth-first crawling orders.
Each URL is represented using an instance of the Request class. The queue can only contain unique URLs. More precisely, it can only contain Request instances with distinct uniqueKey properties. By default, uniqueKey is generated from the URL, but it can also be overridden. To add a single URL multiple times to the queue, the corresponding Request objects need to have different uniqueKey properties.
Do not instantiate this class directly; use the RequestQueue.open function instead.
RequestQueue is used by BasicCrawler, CheerioCrawler, PuppeteerCrawler and PlaywrightCrawler as a source of URLs to crawl. Unlike RequestList, RequestQueue supports dynamic adding and removing of requests. On the other hand, the queue is not optimized for operations that add or remove a large number of URLs in a batch.
RequestQueue stores its data either on local disk or in the Apify Cloud, depending on whether the APIFY_LOCAL_STORAGE_DIR or APIFY_TOKEN environment variable is set.
If the APIFY_LOCAL_STORAGE_DIR environment variable is set, the queue data is stored in that directory in an SQLite database file.
If the APIFY_TOKEN environment variable is set but APIFY_LOCAL_STORAGE_DIR is not, the data is stored in the Apify Request Queue cloud storage. Note that you can also force usage of the cloud storage by passing the forceCloud option to the RequestQueue.open function, even if the APIFY_LOCAL_STORAGE_DIR variable is set.
Example usage: