Betterzon/Crawler/unused/scrapy/crawler/items.py at 6e0f1e7659aee332f156beb979a8a4d25a40b0f7 - Betterzon - Pluto Development Git

Paddy/Betterzon

mirror of https://github.com/Mueller-Patrick/Betterzon.git synced 2026-05-03 10:30:11 +00:00

henningxtro 26ba21156a BETTERZON-58 (#53)

* BETTERZON-58: Basic Functionality with scrapy

* Added independent crawler function, yielding price

* moved logic to amazon.py

* .

* moved scrapy files to unused folder

* Added basic amazon crawler using beautifulsoup4

* Connected Api to Crawler

* Fixed string concatenation for sql statement in getProductLinksForProduct

* BETTERZON-58: Fixing SQL insert

* BETTERZON-58: Adding access key verification

* BETTERZON-58: Fixing API endpoint of the crawler
- The list of products in the API request was treated as a string, so only the first product was crawled

* Added another selector for price on amazon (does not work for books)

Co-authored-by: root <root@DESKTOP-ARBPL82.localdomain>
Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

2021-05-19 00:46:14 +02:00

13 lines

263 B

Python

# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html

import scrapy


class CrawlerItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    pass
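
The template above declares no fields yet. As a rough illustration of what a filled-in item for a price crawler might look like, the sketch below uses a standard-library dataclass instead of `scrapy.Item` (Scrapy 2.2 and later also accept dataclass objects as items), so it runs without Scrapy installed. The class name `ProductPriceItem` and its fields `product_url`, `price`, and `currency` are hypothetical and do not come from the repository.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ProductPriceItem:
    """Hypothetical item for one crawled price; all field names are assumptions."""

    product_url: str                # page the price was read from
    price: Optional[float] = None   # None when no price selector matched
    currency: str = "EUR"           # assumed default, not taken from the repo


# Build an item and convert it to a plain dict, e.g. for a database insert.
item = ProductPriceItem(product_url="https://example.com/product/1", price=19.99)
row = asdict(item)
```

Keeping `price` optional lets a crawler still emit an item when no selector matches, which the last commit bullet suggests can happen for some product types.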