site crawler
1.1. Introduction
1.1.1. What does it do?
Uses a cron script to traverse a queue of actions. An action is typically a request for a URL, but it can also be a function call. The name "crawler" refers to the first kind of action: a URL in the queue is requested.
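The queue traversal described above can be sketched as follows. This is only an illustration of the idea, not the extension's actual code: the names `QueueEntry`, `process_queue` and `fake_fetch` are hypothetical, and the real queue is processed by the extension's own cron script rather than by an in-memory list.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Optional


@dataclass
class QueueEntry:
    """One queued action: either a URL to request or a function to call."""
    url: Optional[str] = None
    callback: Optional[Callable[[], str]] = None


def process_queue(queue: Deque[QueueEntry],
                  fetch: Callable[[str], str],
                  limit: int = 10) -> List[str]:
    """Process at most `limit` entries, as a single cron run might do."""
    results: List[str] = []
    for _ in range(min(limit, len(queue))):
        entry = queue.popleft()
        if entry.url is not None:
            # The "crawler" case: request the queued URL.
            results.append(fetch(entry.url))
        elif entry.callback is not None:
            # The alternative case: call a function instead of a URL.
            results.append(entry.callback())
    return results


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP request so the sketch runs offline."""
    return f"requested {url}"


queue: Deque[QueueEntry] = deque([
    QueueEntry(url="https://example.org/page-1"),
    QueueEntry(callback=lambda: "callback ran"),
    QueueEntry(url="https://example.org/page-2"),
])

# One simulated cron run that handles two entries and leaves one queued.
first_run = process_queue(queue, fake_fetch, limit=2)
```

Capping each run with `limit` is an assumption made for the sketch; it reflects the general idea that a cron job works through the queue in batches and leaves the rest for the next run.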
Features an API that other extensions can plug into. One example is "indexed_search", which uses the crawler to index the content defined by its Indexing Configurations.
The URL requests are specifically designed to call TYPO3 frontends with special processing instructions. Each GET request sends a TYPO3-specific header that identifies a special action. For instance, the requested action could be to publish the URL to a static file, to index its content, or to re-cache the page. These processing instructions are also defined by third-party extensions (indexed search is one of them). In this way a processing instruction can tell the frontend to perform an action (such as indexing or publishing) that cannot be triggered by an ordinary request from outside.
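The header-driven mechanism described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the header name `X-Example-Crawler`, the instruction keys `index`, `publish` and `recache`, and the functions `build_request_headers` and `handle_frontend_request` are hypothetical and do not reproduce the extension's real header format or API.

```python
from typing import Callable, Dict, List

# Hypothetical header name; the real TYPO3-specific header is not shown here.
CRAWLER_HEADER = "X-Example-Crawler"

# Registry of processing instructions. In the real system these would be
# contributed by third-party extensions; the keys here are made up.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "index": lambda url: f"indexed {url}",
    "publish": lambda url: f"published {url} to a static file",
    "recache": lambda url: f"re-cached {url}",
}


def build_request_headers(instructions: List[str]) -> Dict[str, str]:
    """Crawler side: encode the requested instructions in one header."""
    return {CRAWLER_HEADER: ",".join(instructions)}


def handle_frontend_request(url: str, headers: Dict[str, str]) -> List[str]:
    """Frontend side: run the instructions named in the header, if any."""
    raw = headers.get(CRAWLER_HEADER, "")
    if not raw:
        # An ordinary request from outside: just render the page.
        return [f"rendered {url}"]
    results: List[str] = []
    for name in raw.split(","):
        handler = HANDLERS.get(name)
        if handler is None:
            results.append(f"ignored unknown instruction: {name}")
        else:
            results.append(handler(url))
    return results


page = "https://example.org/page-1"
crawler_result = handle_frontend_request(
    page, build_request_headers(["index", "publish"])
)
outside_result = handle_frontend_request(page, {})
```

The point of the sketch is the contrast between the two calls: without the header the frontend only renders the page, while the crawler's request carries instructions that trigger additional actions inside the frontend.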