Betterzon/Crawler/unused/scrapy/crawler/middlewares.py
Patrick 6e8c52857f
Master deployment (#67)
* BETTERZON-58: Basic Functionality with scrapy (#33)

* BETTERZON-73: Adding API endpoint that returns the lowest non-amazon prices for a given list of product ids (#32)

* BETTERZON-75: User registration API endpoint (#34)

* BETTERZON-75: Adding backend functions to enable user registration

* BETTERZON-75: Adding regex to check email and username

* BETTERZON-83: FE unit testing (#35)

* BETTERZON-83: Making pre-generated unit tests work

* BETTERZON-83: Writing unit tests for angular to improve code coverage

* BETTERZON-79: Adding API endpoint for logging in (#36)

* BETTERZON-84: Adding service method to check if a session is valid (#37)

* BETTERZON-77: Changing error behavior as the previous behavior cloud have opened up security vulnerabilities (#38)

* BETTERZON-76: Adding method descriptions for backend service methods (#40)

* Adding Codacy code quality badge to README

* BETTERZON-89: Refactoring / Reformatting and adding unit tests (#41)

* BETTERZON-90: Adding API endpoint for creating price alarms (#42)

* BETTERZON-91: Adding API endpoint to GET all price alarms for the currently logged in user (#43)

* BETTERZON-92: Adding API endpoint to edit (update) price alarms (#44)

* BETTERZON-99: Adding some basic cucumber tests (#45)

* BETTERZON-100: Switching to cookies for session management (#46)

* BETTERZON-100: Switching session handling to cookies

* BETTERZON-100: Some code reformatting

* BETTERZON-100: Some more code reformatting

* BETTERZON-93: Adding API endpoint to get managed shops (#47)

* BETTERZON-94: Adding API endpoint to deactivate price listings as a vendor manager (#48)

* BETTERZON-97: Adding API endpoint to get all products listed by a specific vendor (#50)

* BETTERZON-98: Adding API endpoint for adding price entries as a registered vendor manager (#51)

* BETTERZON-95: Adding API endpoint for getting, inserting and updating contact persons (#52)

* BETTERZON-58 (#53)

* BETTERZON-58: Basic Functionality with scrapy

* Added independent crawler function, yielding price

* moved logic to amazon.py

* .

* moved scrapy files to unused folder

* Added basic amazon crawler using beautifulsoup4

* Connected Api to Crawler

* Fixed string concatenation for sql statement in getProductLinksForProduct

* BETTERZON-58: Fixing SQL insert

* BETTERZON-58: Adding access key verification

* BETTERZON-58: Fixing API endpoint of the crawler
- The list of products in the API request was treated like a string and henceforth, only the first product has been crawled

* Added another selector for price on amazon (does not work for books)

Co-authored-by: root <root@DESKTOP-ARBPL82.localdomain>
Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTERZON-96: Adding API endpoint for delisting a whole vendor (#54)

* BETTERZON-101: Adding service functions for pricealarms api (#55)

- Not properly tested though as login functionality is required to test but not yet implemented

* BETTERZON-110: Refactoring, reformatting and commenting api service (#56)

* BETTERZON-107: Refactoring code with Proxy as design pattern (#49)

* BETTERZON-78 (#39)

* BETTERZON-31, dependencies.

* BETTERZON-31: Fixing dependencies

* BETTERZON-31,
BETTERZON-50

info popover and footer had been changed.

* BETTERZON-74

simple top-bar has been created.

* WIP: creating footer using grid.

* BETTERZON-78 adding bottom bar and top bar

* Adding cookieconsent as dependency again since it was removed by a merge

* Adding cookieconsent as dependency again since it was removed by a merge

* Apply suggestions from code review

Switching from single to double quotes

* BETTERZON-78 - grid added, structured as in Adobe XD mockup

Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTERZON-109 (#57)

* BETTERZON-31, dependencies.

* BETTERZON-31: Fixing dependencies

* BETTERZON-31,
BETTERZON-50

info popover and footer had been changed.

* BETTERZON-74

simple top-bar has been created.

* WIP: creating footer using grid.

* BETTERZON-78 adding bottom bar and top bar

* Adding cookieconsent as dependency again since it was removed by a merge

* Adding cookieconsent as dependency again since it was removed by a merge

* Apply suggestions from code review

Switching from single to double quotes

* BETTERZON-78 - grid added, structured as in Adobe XD mockup

* wip: component rewritten, simple grid applied.

* wip: new component created and added to the app.module.ts. Added a minimal grid layout.

* wip: all components were wrapped now. Grid structure has been applied to the main wrapper-class "container".

* wip: component created and added to the app.module.ts

Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTERZON-108 (#58)

* BETTERZON-31, dependencies.

* BETTERZON-31: Fixing dependencies

* BETTERZON-31,
BETTERZON-50

info popover and footer had been changed.

* BETTERZON-74

simple top-bar has been created.

* WIP: creating footer using grid.

* BETTERZON-78 adding bottom bar and top bar

* Adding cookieconsent as dependency again since it was removed by a merge

* Adding cookieconsent as dependency again since it was removed by a merge

* Apply suggestions from code review

Switching from single to double quotes

* BETTERZON-78 - grid added, structured as in Adobe XD mockup

* wip: component rewritten, simple grid applied.

* wip: new component created and added to the app.module.ts. Added a minimal grid layout.

* wip: all components were wrapped now. Grid structure has been applied to the main wrapper-class "container".

Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTERZON-106 (#59)

* BETTERZON-31, dependencies.

* BETTERZON-31: Fixing dependencies

* BETTERZON-31,
BETTERZON-50

info popover and footer had been changed.

* BETTERZON-74

simple top-bar has been created.

* WIP: creating footer using grid.

* BETTERZON-78 adding bottom bar and top bar

* Adding cookieconsent as dependency again since it was removed by a merge

* Adding cookieconsent as dependency again since it was removed by a merge

* Apply suggestions from code review

Switching from single to double quotes

* BETTERZON-78 - grid added, structured as in Adobe XD mockup

* wip: component rewritten, simple grid applied.

* wip: new component created and added to the app.module.ts. Added a minimal grid layout.

Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTEZON-102 (#60)

* BETTERZON-31, dependencies.

* BETTERZON-31: Fixing dependencies

* BETTERZON-31,
BETTERZON-50

info popover and footer had been changed.

* BETTERZON-74

simple top-bar has been created.

* WIP: creating footer using grid.

* BETTERZON-78 adding bottom bar and top bar

* Adding cookieconsent as dependency again since it was removed by a merge

* Adding cookieconsent as dependency again since it was removed by a merge

* Apply suggestions from code review

Switching from single to double quotes

* BETTERZON-78 - grid added, structured as in Adobe XD mockup

* wip: component rewritten, simple grid applied.

Co-authored-by: Patrick Müller <patrick@mueller-patrick.tech>
Co-authored-by: Patrick <50352812+Mueller-Patrick@users.noreply.github.com>

* BETTERZON-113, BETTERZON-114, BETTERZON-115: Adding API endpoint for favorite shops (#61)

* BETTERZON-116: Adding API endpoint for searching a new product (#62)

* BETTERZON-117: Adding API endpoint for getting the latest crawling status (#63)

* BETTERZON-111: Adding service functions for login and registration (#64)

* BETTERZON-112: Adding service functions for managing vendor shops (#65)

* BETTERZON-118: Adding service functions for managing favorite shops (#66)

Co-authored-by: henningxtro <sextro.henning@student.dhbw-karlsruhe.de>
Co-authored-by: root <root@DESKTOP-ARBPL82.localdomain>
Co-authored-by: Reboooooorn <61185041+Reboooooorn@users.noreply.github.com>
2021-05-29 10:58:27 +02:00

104 lines
3.6 KiB
Python
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
from scrapy import signals
# useful for handling different item types with a single interface
from itemadapter import is_item, ItemAdapter
class CrawlerSpiderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the spider middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_spider_input(self, response, spider):
# Called for each response that goes through the spider
# middleware and into the spider.
# Should return None or raise an exception.
return None
def process_spider_output(self, response, result, spider):
# Called with the results returned from the Spider, after
# it has processed the response.
# Must return an iterable of Request, or item objects.
for i in result:
yield i
def process_spider_exception(self, response, exception, spider):
# Called when a spider or process_spider_input() method
# (from other spider middleware) raises an exception.
# Should return either None or an iterable of Request or item objects.
pass
def process_start_requests(self, start_requests, spider):
# Called with the start requests of the spider, and works
# similarly to the process_spider_output() method, except
# that it doesnt have a response associated.
# Must return only requests (not items).
for r in start_requests:
yield r
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
class CrawlerDownloaderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_request(self, request, spider):
# Called for each request that goes through the downloader
# middleware.
# Must either:
# - return None: continue processing this request
# - or return a Response object
# - or return a Request object
# - or raise IgnoreRequest: process_exception() methods of
# installed downloader middleware will be called
return None
def process_response(self, request, response, spider):
# Called with the response returned from the downloader.
# Must either;
# - return a Response object
# - return a Request object
# - or raise IgnoreRequest
return response
def process_exception(self, request, exception, spider):
# Called when a download handler or a process_request()
# (from other downloader middleware) raises an exception.
# Must either:
# - return None: continue processing this exception
# - return a Response object: stops process_exception() chain
# - return a Request object: stops process_exception() chain
pass
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)