thatscraper.browser

thatscraper’s module to handle actions via Selenium’s webdriver.

Module Contents

Classes

Key

Keys to use in send functions.

Crawler

A selenium.webdriver adapter.

Functions

document_query_selector(selector)

String to parse JavaScript code.

Attributes

OPERATING_SYSTEM

FIREFOX_OPTIONS

WebElement

WebElements

WebDriver

Number

ATTR_SELECTOR

webdrivers

thatscraper.browser.OPERATING_SYSTEM
thatscraper.browser.FIREFOX_OPTIONS
thatscraper.browser.WebElement
thatscraper.browser.WebElements
thatscraper.browser.WebDriver
thatscraper.browser.Number
class thatscraper.browser.Key

Bases: selenium.webdriver.common.keys.Keys

Keys to use in send functions.

enter
esc
delete
down
up
tab
backspace
thatscraper.browser.document_query_selector(selector: str)

String to parse JavaScript code.
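The docstring is terse, so the following is only a guess at the idea: a helper that turns a CSS selector into a JavaScript expression string that can be executed in the page later. The name `build_query_selector_js` and the exact output format are assumptions made for illustration, not the library's actual implementation.

```python
import json


def build_query_selector_js(selector: str) -> str:
    """Return a JavaScript expression selecting the first match of `selector`.

    Hypothetical stand-in for document_query_selector; the real return
    value may be formatted differently.
    """
    # json.dumps yields a double-quoted, escaped string literal that is also
    # valid JavaScript, so quotes inside the selector cannot break the code.
    return f"document.querySelector({json.dumps(selector)})"


print(build_query_selector_js("input[name='q']"))
```

Quoting the selector through `json.dumps` rather than plain string concatenation is a design choice of this sketch; it keeps selectors that contain quotes from producing malformed JavaScript.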

thatscraper.browser.ATTR_SELECTOR
thatscraper.browser.webdrivers
class thatscraper.browser.Crawler(browser: str = 'firefox', headless: bool = False, quit_on_failure: bool = True, **kwargs)

A selenium.webdriver adapter.

An instance of the Crawler class can perform a series of automated actions on webpages. It is designed to handle sites with heavy use of JavaScript.

property driver

Selenium webdriver.

property logger

logger

quitdriver() Callable

Safely quit the webdriver to avoid memory leaks.

__download_dir(path)
goto(url: str)

Open window at url.

half_left_window()

half_left_window

Resizes and shifts the window to the left.

half_right_window()

half_right_window

Resizes and shifts the window to the right.

element(value: str, by_attribute: str = 'id', expected_condition=EC.presence_of_element_located) list

element method.

Selects an element from the current page by the type of attribute given in ‘by_attribute’ and its ‘value’. See thatscraper.ATTR_SELECTOR for a list of attribute types.

If the element is not available yet, there will be an attempt every ‘step’ seconds until the total time ‘timeout’ (in seconds) is exceeded.

Parameters:
  • value (str) – attribute’s value

  • by_attribute (str, optional) – attribute type, by default “id”

Returns:

Element retrieved.

Return type:

WebElement
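The retry behaviour described above (one attempt every ‘step’ seconds until ‘timeout’ is exceeded) can be pictured as a plain polling loop. This is a minimal sketch of the idea only: `poll_until` and `fake_lookup` are hypothetical names, and the library itself presumably delegates the waiting to Selenium through `expected_condition`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(lookup: Callable[[], Optional[T]],
               step: float = 0.5,
               timeout: float = 10) -> T:
    """Call `lookup` every `step` seconds until it returns a value
    or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        result = lookup()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(step)


# Simulated page: the element only "appears" on the third attempt.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "<element>" if attempts["count"] >= 3 else None


found = poll_until(fake_lookup, step=0.01, timeout=1)
print(found, "after", attempts["count"], "attempts")
```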

element_id(value: str, expected_condition=EC.presence_of_element_located) WebElement

element_id

Retrieve an element from the current page by its id value.

Parameters:

value (str) – id’s value.

Returns:

Element retrieved.

Return type:

WebElement

elements(value: str, by_attribute: str = 'id', expected_condition=EC.presence_of_all_elements_located) WebElements

elements

Selects elements from the current page by the type of attribute given in ‘by_attribute’ and its ‘value’. See thatscraper.ATTR_SELECTOR for a list of attribute names.

If the elements are not available yet, there will be an attempt every ‘step’ seconds until the total time ‘timeout’ (in seconds) is exceeded.

Parameters:
  • value (str) – attribute’s value

  • by_attribute (str, optional) – attribute type, by default “id”

Returns:

List with all elements selected.

Return type:

WebElements

child_of(element: WebDriver, value: str, by_attribute: str = 'id') WebElement

Selects child of element.

Parameters:
  • element (WebDriver) – parent

  • value (str) – Child’s attribute’s value.

  • by_attribute (str, optional) – Attribute.

Returns:

Child element.

Return type:

WebElement

children_of(element, value, by_attribute='id', expected_condition=EC.presence_of_all_elements_located) WebElements

Selects children of element.

Parameters:
  • element (WebDriver) – parent

  • value (str) – Children’s attribute’s value.

  • by_attribute (str, optional) – Attribute.

Returns:

Child elements.

Return type:

list[WebElement] = WebElements

click_element(element: WebElement) WebElement

click_element

Click a selected element.

Parameters:

element (WebElement) – Clickable (previously selected) element. If element is not clickable, selenium raises InvalidSelectorException.

Returns:

Clicked element (selenium web element).

Return type:

WebElement

click(value: str, by_attribute: str = 'id') WebElement

Click on element.

click_id(id_value) WebElement

Click element by id.

send_to_element(element: WebElement, key, enter=False)

send_to_element, similar to Crawler.send

Send ‘key’ to WebElement ‘element’.

Parameters:
  • element (WebElement) – Valid WebElement from selenium.

  • key (Valid Selenium key or text.) –

Returns:

Element which key was sent to.

Return type:

WebElement

send(key, value: str, by_attribute='id', enter=False)

send

Send a valid ‘key’ to the element selected by ‘by_attribute’ and the corresponding ‘value’.

Parameters:
  • key (Valid Selenium key or text.) –

  • value (str) – attribute’s value

  • by_attribute (str, optional) – attribute type, by default ‘id’

  • step (float, optional) – timeout step, by default 0.5

  • timeout (int, optional) – timeout until throw error, by default 10

Returns:

Element which key was sent to.

Return type:

WebElement
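As a usage sketch for the signature above, the snippet below types a query into a search box and presses enter. `RecordingCrawler` is a hypothetical stand-in that only records calls, so the example runs without a browser; with the real library one would pass a Crawler instance instead. The URL and the element name `q` are made up, and `"name"` is assumed to be an accepted `by_attribute` value (check ATTR_SELECTOR).

```python
class RecordingCrawler:
    """Hypothetical stub mirroring the documented goto/send signatures."""

    def __init__(self):
        self.calls = []

    def goto(self, url: str):
        self.calls.append(("goto", url))

    def send(self, key, value: str, by_attribute="id", enter=False):
        self.calls.append(("send", key, value, by_attribute, enter))


def search(crawler, query: str):
    """Open a (made-up) search page and submit `query`."""
    crawler.goto("https://example.com/search")
    # `key` is the text to type; `value` and `by_attribute` locate the target.
    crawler.send(query, "q", by_attribute="name", enter=True)


crawler = RecordingCrawler()
search(crawler, "selenium")
for call in crawler.calls:
    print(call)
```

Note the argument order: the text to send comes first, and the attribute value that locates the element comes second.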

esc()
arrow_down_element(element, n_times: int = 1, enter=False)

arrow_down_element

Press keyboard arrow down n_times at element.

Parameters:
  • element (WebElement) – Valid WebElement from selenium

  • n_times (int, optional) – Number of times pressing down key, by default 1
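One way to picture what pressing arrow down n_times amounts to at the Selenium level: the key is sent repeatedly, optionally followed by enter. This is a sketch under assumptions, not the library's code. `FakeElement` is a hypothetical stand-in for a WebElement (only `send_keys` is mimicked), and the two constants carry the code points that Selenium's `Keys.ARROW_DOWN` and `Keys.ENTER` use.

```python
ARROW_DOWN = "\ue015"  # same value as selenium's Keys.ARROW_DOWN
ENTER = "\ue007"       # same value as selenium's Keys.ENTER


class FakeElement:
    """Hypothetical stand-in for a WebElement; records the keys it receives."""

    def __init__(self):
        self.received = []

    def send_keys(self, *keys):
        self.received.extend(keys)


def press_arrow_down(element, n_times: int = 1, enter: bool = False):
    """Send arrow down `n_times`, then enter if requested."""
    for _ in range(n_times):
        element.send_keys(ARROW_DOWN)
    if enter:
        element.send_keys(ENTER)
    return element


dropdown = press_arrow_down(FakeElement(), n_times=3, enter=True)
print(len(dropdown.received), "keys sent")
```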

arrow_down(value: str, by_attribute='id', n_times: int = 1, enter=False)

arrow_down

Select element by the given ‘by_attribute’ and corresponding value, then send keyboard arrow down n_times.

Parameters:
  • value (str) – value of the selected attributes

  • by_attribute (str, optional) – attribute, by default “id”

  • step (float, optional) – timeout step, by default 0.5

  • timeout (int, optional) – timeout, by default 10

  • n_times (int, optional) – times of pressing arrow down, by default 1

  • enter (bool, optional) – If True, ‘enter’ key is sent to element, by default False

arrow_up_element(element, n_times: int = 1, enter=False)

arrow_up_element

Press keyboard arrow up n_times at element.

Parameters:
  • element (WebElement) – Valid WebElement from selenium

  • n_times (int, optional) – Number of times pressing up key, by default 1

arrow_up(value: str, by_attribute='id', n_times: int = 1, enter=False)

arrow_up

Select element by the given ‘by_attribute’ and corresponding value, then send keyboard arrow up n_times.

Parameters:
  • value (str) – value of the selected attributes

  • by_attribute (str, optional) – attribute, by default “id”

  • step (float, optional) – timeout step, by default 0.5

  • timeout (int, optional) – timeout, by default 10

  • n_times (int, optional) – times of pressing arrow up, by default 1

  • enter (bool, optional) – If True, ‘enter’ key is sent to element, by default False

items_of(parent: WebElement, click=True) WebElements

items_of

Select li elements nested within ‘parent’. Syntax:

```html
<parent>
  <ul>
    <li></li> <li></li> … <li></li>
  </ul>
</parent>
```

Parameters:
  • parent (WebElement) – parent of ul element.

  • step (float, optional) – Time between trial calls, by default 0.5

  • timeout (int, optional) – Total timeout, by default 10

  • click (bool, optional) – Whether to click parent before and after, by default True

Returns:

List of li elements.

Return type:

WebElements
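To make the expected markup concrete, the sketch below parses a small HTML fragment of that shape with the standard library and collects the text of each li. It only illustrates which nodes count as items; the real method works on live WebElements, and the sample markup and the `ListItemCollector` class are made up for this example.

```python
from html.parser import HTMLParser

SAMPLE = """
<div id="menu">
  <ul>
    <li>Apple</li> <li>Banana</li> <li>Cherry</li>
  </ul>
</div>
"""


class ListItemCollector(HTMLParser):
    """Collect the text content of every <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._inside_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._inside_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside_li = False

    def handle_data(self, data):
        # Only text that sits inside an <li> belongs to an item.
        if self._inside_li:
            self.items[-1] += data


collector = ListItemCollector()
collector.feed(SAMPLE)
print(collector.items)
```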

run_script(script: str)

run_script

Execute Javascript code given a string.

When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.

Parameters:

script (str) – Javascript code.

Returns:

Whatever JavaScript code returns.

Return type:

unknown

query_selector(selector: str)

Run document.querySelector().

value_to_selector(selector: str, value: str)

value_to_selector

Assign ‘value’ to the value attribute of the first element found with ‘selector’.

When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.

Parameters:
  • selector (str) – Element selector.

  • value (str) –

    Element’s value. Equivalent to:

    document.querySelector(selector).value=value

    in JavaScript.

Returns:

Whatever JavaScript returns.

Return type:

unknown
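The JavaScript equivalence stated above can be made concrete by building the script string in Python. This is a sketch of the idea, assuming the resulting string would then be handed to run_script; `build_value_assignment` is a hypothetical helper and not part of the library.

```python
import json


def build_value_assignment(selector: str, value: str) -> str:
    """Build JavaScript that sets `.value` on the first element
    matching `selector`."""
    # json.dumps escapes quotes and backslashes, producing string literals
    # that are valid JavaScript.
    return (
        f"document.querySelector({json.dumps(selector)})"
        f".value={json.dumps(value)}"
    )


script = build_value_assignment("#username", "alice")
print(script)
```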

to_selector(selector: str, attribute: str, value: str)

to_selector

Assign ‘value’ to ‘attribute’ of the first element found with ‘selector’.

When interacting with login or registration forms, prefer this method over Crawler.send or Crawler.send_to_element.

Parameters:
  • selector (str) – Element selector.

  • attribute (str) – Name of the attribute to assign.

  • value (str) –

    Element’s value. Equivalent to:

    document.querySelector(selector).value=value

    in JavaScript.

Returns:

Whatever JavaScript returns.

Return type:

unknown

scroll_page()

Scroll the page by 1 vh.

google(query)

Select anchors from the Google search page.

source() str

Page source.

close() None

Closes the current window.

quit(clean=False)

Quits the driver and closes every associated window.
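Since a driver that is never quit keeps its browser process alive, it can be convenient to tie quit() to a `with` block. The following is a sketch, not something the library is known to provide: `crawling` is a hypothetical helper, and `StubCrawler` stands in for a real Crawler so the snippet runs without a browser.

```python
from contextlib import contextmanager


class StubCrawler:
    """Hypothetical stand-in exposing only the documented quit() method."""

    def __init__(self):
        self.quit_called = False

    def quit(self, clean=False):
        self.quit_called = True


@contextmanager
def crawling(crawler):
    """Yield `crawler` and always quit it afterwards, even on errors."""
    try:
        yield crawler
    finally:
        crawler.quit()


stub = StubCrawler()
try:
    with crawling(stub) as active:
        # Simulate a failure in the middle of a scraping session.
        raise RuntimeError("scraping failed")
except RuntimeError:
    pass
print(stub.quit_called)
```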