thatscraper
Package Contents
Classes
Crawler | A selenium.webdriver adapter.
Key | Keys to use in send functions.
Attributes
- class thatscraper.Crawler(browser: str = 'firefox', headless: bool = False, quit_on_failure: bool = True, **kwargs)
A selenium.webdriver adapter.
An instance of the Crawler class can perform a series of automated actions on web pages. Designed to handle sites with heavy use of JavaScript.
- property driver
The underlying Selenium webdriver.
- property logger
logger
- quitdriver() Callable
Safely quits the webdriver to avoid memory leaks.
- __download_dir(path)
- goto(url: str)
Opens the window at url.
- half_left_window()
Resizes the window and shifts it to the left.
- half_right_window()
Resizes the window and shifts it to the right.
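As a rough illustration of what placing a window on one half of the screen involves, the sketch below computes the position and size for each half. The names `Rect` and `half_window_rect` and the 1920x1080 screen size are assumptions made for this example; how Crawler computes these values internally is not documented here.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Window position (x, y) and size (width, height) in pixels."""
    x: int
    y: int
    width: int
    height: int


def half_window_rect(screen_width: int, screen_height: int, side: str) -> Rect:
    """Return the rectangle covering the left or right half of the screen.

    Hypothetical helper, not part of thatscraper.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    half = screen_width // 2
    # The left half starts at x=0, the right half starts at the midpoint.
    x = 0 if side == "left" else half
    return Rect(x=x, y=0, width=half, height=screen_height)


left = half_window_rect(1920, 1080, "left")
right = half_window_rect(1920, 1080, "right")
```

With Selenium, a rectangle like this could be applied through `driver.set_window_rect(x, y, width, height)`.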
- element(value: str, by_attribute: str = 'id', expected_condition=EC.presence_of_element_located) list
Selects an element from the current page by the attribute type given in ‘by_attribute’ and its ‘value’. See thatscraper.ATTR_SELECTOR for a list of attribute types.
If the element is not available yet, selection is retried every ‘step’ seconds until the total ‘timeout’ (in seconds) is exceeded.
- Parameters:
value (str) – attribute’s value
by_attribute (str, optional) – attribute type, by default “id”
- Returns:
Element retrieved.
- Return type:
WebElement
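The retry behavior mentioned in the description of element (one attempt every ‘step’ seconds until ‘timeout’ is exceeded) can be sketched in plain Python. This is a minimal illustration and not thatscraper's implementation; `poll_until` and `flaky_lookup` are hypothetical names. In Selenium itself, this role is played by WebDriverWait combined with an expected condition.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(find: Callable[[], Optional[T]], step: float = 0.5,
               timeout: float = 10.0) -> T:
    """Call `find` every `step` seconds until it returns a non-None value.

    Raises TimeoutError once `timeout` seconds have elapsed.
    Hypothetical helper, not part of thatscraper.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(step)


# Example: a lookup that only succeeds on the third attempt.
attempts = []


def flaky_lookup():
    attempts.append(1)
    return "element" if len(attempts) >= 3 else None


found = poll_until(flaky_lookup, step=0.01, timeout=1.0)
```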
- element_id(value: str, expected_condition=EC.presence_of_element_located) WebElement
Retrieves an element from the current page by its id value.
- Parameters:
value (str) – id’s value.
- Returns:
Element retrieved.
- Return type:
WebElement
- elements(value: str, by_attribute: str = 'id', expected_condition=EC.presence_of_all_elements_located) WebElements
Selects elements from the current page by the attribute type given in ‘by_attribute’ and its ‘value’. See thatscraper.ATTR_SELECTOR for a list of attribute names.
If the elements are not available yet, selection is retried every ‘step’ seconds until the total ‘timeout’ (in seconds) is exceeded.
- Parameters:
value (str) – attribute’s value
by_attribute (str, optional) – attribute type, by default “id”
- Returns:
List with all elements selected.
- Return type:
WebElements
- child_of(element: WebDriver, value: str, by_attribute: str = 'id') WebElement
Selects child of element.
- Parameters:
element (WebDriver) – parent
value (str) – Child’s attribute’s value.
by_attribute (str, optional) – Attribute.
- Returns:
Child element.
- Return type:
WebElement
- children_of(element, value, by_attribute='id', expected_condition=EC.presence_of_all_elements_located) WebElements
Selects children of element.
- Parameters:
element (WebDriver) – parent
value (str) – Children’s attribute’s value.
by_attribute (str, optional) – Attribute.
- Returns:
Child elements.
- Return type:
list[WebElement] = WebElements
- click_element(element: WebElement) WebElement
Clicks a selected element.
- Parameters:
element (WebElement) – Clickable (previously selected) element. If element is not clickable, selenium raises InvalidSelectorException.
- Returns:
Clicked element (selenium web element).
- Return type:
WebElement
- click(value: str, by_attribute: str = 'id') WebElement
Clicks on the element selected by ‘value’ and ‘by_attribute’.
- click_id(id_value) WebElement
Clicks on an element selected by its id.
- send_to_element(element: WebElement, key, enter=False)
Similar to Crawler.send.
Sends ‘key’ to the WebElement ‘element’.
- Parameters:
element (WebElement) – Valid WebElement from selenium.
key (valid Selenium key or text) – Key or text to send.
- Returns:
Element which key was sent to.
- Return type:
WebElement
- send(key, value: str, by_attribute='id', enter=False)
Sends a valid ‘key’ to the element selected by ‘by_attribute’ and its corresponding ‘value’.
- Parameters:
key (valid Selenium key or text) – Key or text to send.
value (str) – attribute’s value
by_attribute (str, optional) – attribute type, by default ‘id’
step (float, optional) – timeout step, by default 0.5
timeout (int, optional) – timeout until throw error, by default 10
- Returns:
Element which key was sent to.
- Return type:
WebElement
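A short usage sketch for send, assuming a search box that can be selected by its name attribute and that ‘name’ is one of the attribute types listed in ATTR_SELECTOR. The field name `q`, the helper `search_for` and the stand-in class are hypothetical. The helper accepts any object with the documented send signature, so it can be exercised without a browser.

```python
def search_for(crawler, text: str) -> None:
    """Type `text` into a search box and submit it with Enter.

    `crawler` is expected to be a thatscraper.Crawler (or any object with
    the same `send` signature). The field name "q" is an assumption.
    """
    # Select <input name="q">, type the text, then press Enter.
    crawler.send(text, "q", by_attribute="name", enter=True)


class RecordingCrawler:
    """Stand-in that records calls instead of driving a browser."""

    def __init__(self):
        self.calls = []

    def send(self, key, value, by_attribute="id", enter=False):
        self.calls.append((key, value, by_attribute, enter))


fake = RecordingCrawler()
search_for(fake, "selenium")
```

With a real Crawler instance, the same call would be `search_for(crawler, "selenium")` after `crawler.goto(url)`.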
- esc()
- arrow_down_element(element, n_times: int = 1, enter=False)
Presses keyboard arrow down n_times at element.
- Parameters:
element (WebElement) – Valid WebElement from selenium
n_times (int, optional) – Number of times pressing down key, by default 1
- arrow_down(value: str, by_attribute='id', n_times: int = 1, enter=False)
Selects an element by the given ‘by_attribute’ selector and corresponding value, then sends keyboard arrow down n_times.
- Parameters:
value (str) – value of the selected attribute
by_attribute (str, optional) – attribute, by default “id”
step (float, optional) – timeout step, by default 0.5
timeout (int, optional) – timeout, by default 10
n_times (int, optional) – number of times to press arrow down, by default 1
enter (bool, optional) – If True, ‘enter’ key is sent to element, by default False
- arrow_up_element(element, n_times: int = 1, enter=False)
Presses keyboard arrow up n_times at element.
- Parameters:
element (WebElement) – Valid WebElement from selenium
n_times (int, optional) – Number of times pressing up key, by default 1
- arrow_up(value: str, by_attribute='id', n_times: int = 1, enter=False)
Selects an element by the given ‘by_attribute’ selector and corresponding value, then sends keyboard arrow up n_times.
- Parameters:
value (str) – value of the selected attribute
by_attribute (str, optional) – attribute, by default “id”
step (float, optional) – timeout step, by default 0.5
timeout (int, optional) – timeout, by default 10
n_times (int, optional) – number of times to press arrow up, by default 1
enter (bool, optional) – If True, ‘enter’ key is sent to element, by default False
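The arrow methods combine a repeat count with an optional Enter key. The sketch below shows one way such a key sequence could be assembled; it is an illustration under assumptions, not the library's code. The helper `arrow_sequence` is hypothetical, and the three constants are the code points that selenium.webdriver.common.keys.Keys uses for ARROW_UP, ARROW_DOWN and ENTER.

```python
# Code points WebDriver uses for special keys (same values as
# Keys.ARROW_UP, Keys.ARROW_DOWN and Keys.ENTER in Selenium).
ARROW_UP = "\ue013"
ARROW_DOWN = "\ue015"
ENTER = "\ue007"


def arrow_sequence(direction: str, n_times: int = 1, enter: bool = False) -> list:
    """Build the list of keys for pressing an arrow key `n_times`.

    Hypothetical helper, not part of thatscraper.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    if n_times < 0:
        raise ValueError("n_times must not be negative")
    key = ARROW_UP if direction == "up" else ARROW_DOWN
    keys = [key] * n_times
    if enter:
        # Confirm the selection after moving.
        keys.append(ENTER)
    return keys


sequence = arrow_sequence("down", n_times=3, enter=True)
```

With Selenium, such a sequence could be delivered through `element.send_keys(*sequence)`.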
- items_of(parent: WebElement, click=True) WebElements
Selects li elements nested within ‘parent’. Syntax:
```html
<parent>
  <ul>
    <li></li> <li></li> … <li></li>
  </ul>
</parent>
```
- Parameters:
parent (WebElement) – parent of ul element.
step (float, optional) – Time between trial calls, by default 0.5
timeout (int, optional) – Total timeout, by default 10
click (bool, optional) – Whether to click parent before and after, by default True
- Returns:
List of li elements.
- Return type:
WebElements
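A small usage sketch for items_of that collects the visible text of each list item. The helper `item_texts` and the stand-in classes are hypothetical and exist only so the example runs without a browser; with a real Crawler, `parent` would be a WebElement selected beforehand.

```python
def item_texts(crawler, parent) -> list:
    """Return the stripped text of every <li> found under `parent`.

    `crawler` is expected to be a thatscraper.Crawler; click=False asks
    items_of not to click the parent while collecting.
    """
    items = crawler.items_of(parent, click=False)
    # WebElement.text holds the element's rendered text.
    return [item.text.strip() for item in items]


# Stand-ins so the sketch runs without a browser.
class FakeItem:
    def __init__(self, text):
        self.text = text


class FakeCrawler:
    def items_of(self, parent, click=True):
        return [FakeItem(" Home "), FakeItem("About"), FakeItem("Contact ")]


texts = item_texts(FakeCrawler(), parent=None)
```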
- run_script(script: str)
Executes JavaScript code given as a string.
When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.
- Parameters:
script (str) – Javascript code.
- Returns:
Whatever JavaScript code returns.
- Return type:
unknown
- query_selector(selector: str)
Runs document.querySelector() with the given selector.
- value_to_selector(selector: str, value: str)
Assigns ‘value’ to the value attribute of the first element found with ‘selector’.
When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.
- Parameters:
selector (str) – Element selector.
value (str) – Element’s value. Equivalent to document.querySelector(selector).value=value in JavaScript.
- Returns:
Whatever JavaScript returns.
- Return type:
unknown
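The JavaScript equivalent given for value_to_selector can be built as a string in Python. The sketch below is one way to do that safely, assuming the script is assembled by hand; `value_assignment_script` is a hypothetical helper and not part of thatscraper. It uses json.dumps so that quotes inside the selector or value are escaped instead of breaking the script.

```python
import json


def value_assignment_script(selector: str, value: str) -> str:
    """Build JavaScript assigning `value` to the first match of `selector`.

    json.dumps produces double-quoted, escaped string literals that are
    also valid JavaScript string literals.
    Hypothetical helper, not part of thatscraper.
    """
    return (
        f"document.querySelector({json.dumps(selector)})"
        f".value = {json.dumps(value)};"
    )


script = value_assignment_script("input[name='user']", 'ana "the dev"')
```

A string built this way could then be passed to run_script.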
- to_selector(selector: str, attribute: str, value: str)
Assigns ‘value’ to ‘attribute’ of the first element found with ‘selector’.
When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.
- Parameters:
selector (str) – Element selector.
attribute (str) – Attribute to assign to.
value (str) – Value to assign. When ‘attribute’ is ‘value’, equivalent to document.querySelector(selector).value=value in JavaScript.
- Returns:
Whatever JavaScript returns.
- Return type:
unknown
- scroll_page()
Scrolls the page by 1 vh.
- google(query)
Selects anchors from the Google search page for ‘query’.
- source() str
Returns the page source.
- close() None
Closes the current window.
- quit(clean=False)
Quits the driver and closes every associated window.
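Since an unclosed driver keeps browser processes alive, it can help to guarantee that quit runs even when scraping fails. The sketch below wraps a crawler in a context manager for that purpose; `quitting` and `FakeCrawler` are hypothetical names, and the stand-in replaces a real Crawler only so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def quitting(crawler):
    """Yield `crawler` and always call its quit() afterwards.

    Hypothetical helper, not part of thatscraper.
    """
    try:
        yield crawler
    finally:
        # Runs on normal exit and when the body raises.
        crawler.quit()


class FakeCrawler:
    """Stand-in used to show the behavior without opening a browser."""

    def __init__(self):
        self.quit_called = False

    def quit(self, clean=False):
        self.quit_called = True


fake = FakeCrawler()
try:
    with quitting(fake) as crawler:
        raise RuntimeError("scraping failed")
except RuntimeError:
    pass
```

After the block finishes, `fake.quit_called` is set even though the body raised.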
- class thatscraper.Key
Bases: selenium.webdriver.common.keys.Keys
Keys to use in send functions.
- enter
- esc
- delete
- down
- up
- tab
- backspace
- thatscraper.ATTR_SELECTOR