Use-Case Specification: Web Crawler
1. Use-Case: Web Crawler
1.1 Brief Description
The web crawler is a central component of our project. This use-case specification describes the component's main task: crawling predefined web pages for current prices and saving them to a database.
2. Flow of Events
Activity Diagram
At startup, the crawler process reads its configuration file and the list of products to be crawled from the database. If either is invalid or unavailable, the process terminates.
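The startup check above can be sketched as follows. This is a minimal Python sketch, assuming a JSON configuration file and a SQLite products table; the file names, table name, and columns are illustrative, not part of the specification.

```python
import json
import sqlite3

CONFIG_PATH = "crawler.json"  # hypothetical config file name
DB_PATH = "prices.db"         # hypothetical database file

def load_config(path):
    """Read the crawler configuration; return None if missing or invalid."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        return None

def load_products(db_path):
    """Read the products to be crawled from the database; return None on failure."""
    try:
        with sqlite3.connect(db_path) as conn:
            return conn.execute("SELECT id, name FROM products").fetchall()
    except sqlite3.Error:
        return None

config = load_config(CONFIG_PATH)
products = load_products(DB_PATH)
# In the real crawler, startup_ok == False means the process terminates.
startup_ok = config is not None and products is not None
```

Both loaders return None instead of raising, so the caller can apply the single termination rule from the flow of events: exit if either input is invalid or unavailable.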
If both are valid, the load balancer distributes the crawling tasks across all registered crawler instances. Each instance iterates over its assigned list of products; for each product, it iterates over all vendors that offer the product and fetches the current price. Once all prices for a product have been fetched, the price entries are sent to the database. When all products have been crawled, the run is complete, and the crawler can optionally report its execution status to an HTTP endpoint.
3. Special Requirements
N/A
4. Preconditions
4.1 The database must accept connections
4.2 A configuration file must be in place, or the corresponding environment variables must be set
5. Postconditions
N/A
6. Function Points
95