site crawler

1.3.3. Explanation of processing instructions

Before we look at how the queue is processed, we need to look at processing instructions. When the crawler requests one of these URLs from the TYPO3 frontend, it can add a TYPO3-specific request header that asks the frontend to do something special. For instance, this header can ask the frontend to re-index the page, to re-cache the page, or to process the request with certain frontend usergroups initialized.
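The idea of attaching processing instructions to a request can be sketched as follows. This is only an illustration in Python, not the crawler's actual code: the header name, the comma-separated encoding and the helper names are assumptions made for this example, and only the instruction key "tx_indexedsearch_reindex" comes from this manual.

```python
from dataclasses import dataclass, field

# Hypothetical header name, chosen for this sketch only.
INSTRUCTION_HEADER = "X-Example-Crawler-Instructions"


@dataclass
class CrawlRequest:
    """A URL plus the request headers the crawler would send with it."""
    url: str
    headers: dict = field(default_factory=dict)


def build_crawl_request(url, instructions):
    """Attach the selected processing instructions to a request as one header."""
    request = CrawlRequest(url=url)
    if instructions:
        # Comma-separated encoding is an assumption of this sketch.
        request.headers[INSTRUCTION_HEADER] = ",".join(instructions)
    return request


request = build_crawl_request(
    "https://example.org/index.php?id=12",
    ["tx_indexedsearch_reindex"],
)
print(request.headers)
```

A request built without any processing instructions carries no extra header, so the frontend would treat it like an ordinary page request.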

If you look at the configuration code, you can see how each set is assigned processing instructions. When you submit URLs, you must select which processing instructions to send in the request.


The available processing instructions are defined by third-party extensions using an API in the crawler extension. In this case the extensions "indexed_search" and "cachemgm" are installed, and both provide processing instructions. If you select "Re-indexing", all configuration sets that have this processing instruction are used to generate URLs, and each of those URLs passes the processing instruction to the frontend. In the frontend, hooks take care of the processing that the instruction calls for. In the case of "tx_indexedsearch_reindex", the hook asks to have the pages re-indexed.

The same applies to "Re-cache pages": this re-generates the cached version of a page.
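The two steps described above, selecting configuration sets by processing instruction and letting frontend hooks act on the instruction, can be sketched like this. It is a simplified Python model, not the crawler API: the set names, the "example_recache" key and all function names are hypothetical, and only "tx_indexedsearch_reindex" is taken from this manual.

```python
# Configuration sets mapped to their processing instructions.
# Set names and the "example_recache" key are hypothetical.
CONFIGURATION_SETS = {
    "language": ["tx_indexedsearch_reindex"],
    "tt_news": ["tx_indexedsearch_reindex", "example_recache"],
    "print": ["example_recache"],
}

# Registry filled by extensions: instruction key -> hook function.
HOOKS = {}


def register_hook(instruction, handler):
    """Let an extension announce a hook for one processing instruction."""
    HOOKS[instruction] = handler


def sets_for(instruction):
    """Return the configuration sets that carry the selected instruction."""
    return [
        name
        for name, instructions in CONFIGURATION_SETS.items()
        if instruction in instructions
    ]


def process_request(page_id, instructions):
    """Frontend side: run the hook registered for each instruction received."""
    results = []
    for instruction in instructions:
        handler = HOOKS.get(instruction)
        if handler is not None:  # instructions without a hook are ignored
            results.append(handler(page_id))
    return results


register_hook("tx_indexedsearch_reindex", lambda page_id: f"re-indexed page {page_id}")
register_hook("example_recache", lambda page_id: f"re-cached page {page_id}")

print(sets_for("tx_indexedsearch_reindex"))  # ['language', 'tt_news']
print(process_request(12, ["tx_indexedsearch_reindex"]))  # ['re-indexed page 12']
```

Keeping the hooks in a registry keyed by instruction mirrors the point made above: the crawler itself does not need to know what an instruction does, only which extension registered a hook for it.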
