The Searchmetrics Suite offers a technical analysis of your website. This helps you identify optimization potential as well as quickly detect and minimize risks. To run a technical analysis, a crawl must first be created in the Site Experience (Site Experience > Crawl Overview > Create Crawl).
- Things that should be considered when creating a crawl
- Crawl Set-Up
Things that should be considered when creating a crawl:
- Make sure that the stored URL does not block the crawler. Ideally, whitelist the Searchmetrics Bot.
- If a start page is defined, it must contain either "https://" or "http://".
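Whitelisting is usually done in the robots.txt of your website. As a sketch only (the exact user-agent token of the Searchmetrics Bot is an assumption here; verify the exact string in your server logs or the official documentation), such a robots.txt could look like this:

```
# Hypothetical robots.txt allowing the Searchmetrics crawler.
# The token "SearchmetricsBot" is an assumption - verify the exact name.
User-agent: SearchmetricsBot
Allow: /

# All other crawlers keep the existing restrictions
User-agent: *
Disallow: /internal/
```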
First, select the project for which the analysis should be run. Then name the crawl and choose a search engine.
Caution: A crawl is always started within a specific project, so you can only choose between the search engines defined in that project. If you want to add more search engines to the project, look here.
The project URL is automatically set as the starting page of the crawl but can still be edited in the second step (Crawler). In the next step, the maximum level of pages must be selected.
After all settings have been made, the estimated time as well as the estimated number of page credits for the crawl are shown in the window on the right side.
If everything goes as planned, the estimated duration is one hour and about 3,000 page credits will be charged.
If the desired start page does not correspond to the project domain, it can be entered in the Crawler start page field. For example, if www.amazon.com/en/ is set as the start page, only the pages in the /en/ subdirectory of www.amazon.com will be crawled.
Important: The URL must be entered with the protocol (https:// or http://)!
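A quick way to see whether a start page satisfies this requirement is to check that the URL carries an http or https scheme. A minimal sketch in Python:

```python
from urllib.parse import urlparse

def has_protocol(url: str) -> bool:
    """Check that a crawl start page includes http:// or https://."""
    return urlparse(url).scheme in ("http", "https")

print(has_protocol("https://www.amazon.com/en/"))  # True
print(has_protocol("www.amazon.com/en/"))          # False
```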
If only pages below this start page should be crawled, check the corresponding box.
The Searchmetrics Bot is set as the user agent by default, but it can be changed. For example, if the Google Bot is selected, the crawler identifies itself as the Google Bot and crawls the content made available to it. You should also set the region of the crawler and the maximum level (layers of the website) that should be crawled. The maximum crawl speed determines how many URLs per second are requested from your website. Please note that a higher number can increase your server load.
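The crawl speed setting also translates directly into a lower bound on the crawl duration. As a rough back-of-the-envelope illustration (the numbers are assumptions for illustration, not product limits):

```python
def estimated_duration_seconds(num_urls: int, urls_per_second: float) -> float:
    """Lower bound for the crawl duration given a maximum crawl speed."""
    return num_urls / urls_per_second

# Example: 3,000 pages at 1 URL per second take at least 50 minutes
print(estimated_duration_seconds(3000, 1) / 60)  # 50.0
```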
If the crawl result should be compared with a previous one, the respective result can be selected in the bottom bar. When crawl results are compared, a trend becomes visible for the issues (overview page). It indicates the changes since the last crawl, which gives a good overview of the effect of URL improvements in particular.
In this step additional settings can be made.
If a static IP should be used to bypass possible access restrictions, this can be enabled with a checkmark. If, for example, a test environment that only grants access with corresponding credentials is to be crawled, these can be entered in the corresponding fields.
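If the test environment uses HTTP basic authentication, credentials like these end up in an Authorization header on every request. A minimal sketch of how such a header value is built (user and password are placeholders):

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```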
Furthermore, in this step individual URL parameters can be removed or excluded from the crawl. Following the same principle, entire URLs can also be excluded.
If, for example, you want to exclude all pages in the /de/ subdirectory, simply enter the corresponding pattern into the field.
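To illustrate the effect of such an exclusion rule (the Suite's own pattern syntax may differ; this is a generic sketch using a simple substring match):

```python
def is_excluded(url: str, patterns: list[str]) -> bool:
    """Return True if the URL matches any exclusion pattern (substring match)."""
    return any(p in url for p in patterns)

urls = [
    "https://www.example.com/en/product",
    "https://www.example.com/de/product",
]
# With "/de/" excluded, only the /en/ URL remains in the crawl
crawled = [u for u in urls if not is_excluded(u, ["/de/"])]
print(crawled)  # ['https://www.example.com/en/product']
```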
In the last step, custom headers can be defined that the crawler should send with each request to a URL.
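A custom header is simply an extra name/value pair sent along with every request. As a sketch, this is what a request carrying such a header could look like in Python (the header name X-Crawl-Token and its value are made-up examples, not anything the Suite requires):

```python
import urllib.request

# Hypothetical custom header; name and value are illustrative only.
# The Request object is built but not sent, so no network access happens.
req = urllib.request.Request(
    "https://www.example.com/",
    headers={"X-Crawl-Token": "staging-access"},
)
# urllib stores header names in capitalized form internally
print(req.get_header("X-crawl-token"))  # staging-access
```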
Saved URL Groups
If saved URL groups should be included in the crawl, they can easily be selected and added in the last tab.
After all settings have been made, the crawl can be started via Start Crawl (button on the right).
Note: A crawl costs Page Credits. The estimated number is displayed before the start, but may differ from the number actually used. By clicking on "Start Crawl" you agree to use your existing Page Credits.
All completed crawls are listed in the Crawl Overview page of the corresponding project.