main

Entry point for asrch

Main file for asrch. It handles CLI argument parsing and command dispatch for all modules.

asrch.__main__.browse(url: str = <typer.models.ArgumentInfo object>, proxy: str = <typer.models.OptionInfo object>, log: bool = <typer.models.OptionInfo object>, nodriver: bool = <typer.models.OptionInfo object>, parser: str = <typer.models.OptionInfo object>, images: bool = <typer.models.OptionInfo object>, debug: bool = <typer.models.OptionInfo object>)

Browse the web.

Parameters:

url (str) – The URL to start browsing from (default is “https://asrch.bitbucket.io”).
proxy (str, optional) – Proxy to send requests through, in <ip:port> form.
log (bool, optional) – Suppress all logs (for emacs mode).
nodriver (bool, optional) – Use requests and BS4 instead of selenium (faster but more detectable).
parser (str, optional) – HTML parser to use for nodriver. Choices: html.parser, lxml (default is html.parser).

Returns:

None

Return type:

None

Raises:

ValueError – If an invalid parser option is provided.

Example:

To browse using default settings:

>>> browse()

To browse using a proxy and suppress logs:

>>> browse(proxy="127.0.0.1:8080", log=True)

To browse without using a webdriver and specify a parser:

>>> browse(nodriver=True, parser="lxml")
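
The ValueError listed under Raises can be pictured as a small guard that runs before any page is fetched. The sketch below is illustrative only: the helper name validate_parser and the ALLOWED_PARSERS tuple are hypothetical and not taken from asrch. Only the two parser choices come from the parameter list above.

```python
# Hypothetical guard; asrch's actual validation code may look different.
ALLOWED_PARSERS = ("html.parser", "lxml")  # choices listed for the parser option


def validate_parser(parser: str = "html.parser") -> str:
    """Return the parser name unchanged, or raise ValueError if unsupported."""
    if parser not in ALLOWED_PARSERS:
        choices = ", ".join(ALLOWED_PARSERS)
        raise ValueError(f"invalid parser {parser!r}; choose one of: {choices}")
    return parser
```

Checking the name up front means a typo in the parser option fails immediately rather than after a request has already been sent.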

asrch.__main__.ccache()

Clear the browser cache.

asrch.__main__.conf()

Output the current config.

asrch.__main__.create_workspace(name: str, current_datetime: str, config: str)

asrch.__main__.ensure_ws_doesnt_exist(ws_name: str) → bool

asrch.__main__.find(url: typing.Annotated[str, <typer.models.OptionInfo object>], element: typing.Annotated[str, <typer.models.OptionInfo object>], proxy: typing.Annotated[str | None, <typer.models.OptionInfo object>] = '', header: typing.Annotated[bool, <typer.models.OptionInfo object>] = False, log: typing.Annotated[bool, <typer.models.OptionInfo object>] = False, locator: typing.Annotated[str, <typer.models.OptionInfo object>] = 'tag_name')

Find an element on a web page.

Parameters:

url (str) – URL to retrieve.
element (str) – Element to return.
proxy (Optional[str]) – Proxy to send the request through, in <ip:port> form. Default: “”.
header (bool) – Show browser header. Default: False.
log (bool) – Suppress all logs. Default: False.
locator (str) – Locator used to find the element. Default: “tag_name”.

Returns:

None

Return type:

None
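
The default locator “tag_name” resembles the name of a Selenium locator strategy. One plausible way to turn such a name into the string Selenium expects is a lookup table, sketched below. The table and the resolve_locator helper are assumptions made for illustration, since this page does not say which locator names asrch accepts or how it resolves them. The right-hand values are the strings behind the constants in selenium.webdriver.common.by.By.

```python
# Hypothetical mapping from CLI-style locator names to Selenium strategy strings.
# asrch may accept a different set of names or resolve them another way.
LOCATOR_STRATEGIES = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
    "link_text": "link text",
    "partial_link_text": "partial link text",
}


def resolve_locator(locator: str = "tag_name") -> str:
    """Translate a locator name into the strategy string Selenium uses."""
    try:
        return LOCATOR_STRATEGIES[locator.lower()]
    except KeyError:
        known = ", ".join(sorted(LOCATOR_STRATEGIES))
        raise ValueError(f"unknown locator {locator!r}; expected one of: {known}") from None
```

The returned string could then be passed as the first argument to a Selenium driver's find_elements call, with the element value as the second.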

asrch.__main__.get_ws()

asrch.__main__.open_(mode: asrch.utils.constants.OpenModes = <typer.models.ArgumentInfo object>, url: str = <typer.models.ArgumentInfo object>, proxy: str = <typer.models.OptionInfo object>, header: bool = <typer.models.OptionInfo object>, browse: bool = <typer.models.OptionInfo object>, pager: bool = <typer.models.OptionInfo object>, download: bool = <typer.models.OptionInfo object>, log: bool = <typer.models.OptionInfo object>, nodriver: bool = <typer.models.OptionInfo object>, parser: str = <typer.models.OptionInfo object>, silent: bool = <typer.models.OptionInfo object>, inspect_mode: str = <typer.models.OptionInfo object>)

Open a URL and perform various operations.

Parameters:

mode (OpenModes) – Mode for the command.
url (str) – URL to retrieve.
proxy (str, optional) – Proxy to send the request through, in <ip:port> form.
header (bool, optional) – Show browser header.
browse (bool, optional) – Enable browsing (using keyboard inputs to open URLs).
pager (bool, optional) – Output [JS] in pager.
download (bool, optional) – Download all scraped images.
log (bool, optional) – Suppress all logs (for emacs mode).
nodriver (bool, optional) – Use requests and BS4 instead of selenium (faster but more detectable).
parser (str, optional) – HTML parser to use for nodriver. Choices: html.parser, lxml.
silent (bool, optional) – Output to text file instead of terminal (defaults to workspace folder).
inspect_mode (str, optional) – What part of the page you would like to inspect.

Note:

The log option is intended for emacs mode. It can be ignored in normal CLI mode, but it works there if needed.
inspect_mode with -M js can produce a large output (~5k line history record).
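
The proxy option takes a single <ip:port> string. As a rough sketch of how such a value might be handled in nodriver mode, the code below splits it into host and port and builds the kind of mapping that the requests library accepts through its proxies argument. Both helper names are hypothetical, the sketch assumes a plain HTTP proxy, and asrch's own handling may differ.

```python
from __future__ import annotations


def parse_proxy(proxy: str) -> tuple[str, int]:
    """Split an '<ip:port>' string into (host, port), validating the port."""
    host, sep, port_text = proxy.rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"proxy must look like <ip:port>, got {proxy!r}")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError(f"proxy port out of range: {port}")
    return host, port


def to_requests_proxies(proxy: str) -> dict[str, str]:
    """Build a requests-style proxies mapping; an empty string means no proxy."""
    if not proxy:
        return {}
    host, port = parse_proxy(proxy)
    address = f"http://{host}:{port}"  # assumption: plain HTTP proxy
    return {"http": address, "https": address}
```

An empty mapping lets the caller pass the result straight through whether or not a proxy was supplied.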

asrch.__main__.search(query: str = <typer.models.ArgumentInfo object>, proxy: str = <typer.models.OptionInfo object>, browse: bool = <typer.models.OptionInfo object>, header: bool = <typer.models.OptionInfo object>, log: bool = <typer.models.OptionInfo object>)

Perform a search for the given query.

Parameters:

header (bool) – Show browser header.
proxy (str) – Proxy to send the request through, in <IP:port> form [optional].
log (bool) – Toggle the visibility of logging messages. Default: False. This flag is intended for the emacs plugin rather than normal CLI mode, but it can be used there if needed.

asrch.__main__.workspace(initialize: bool = <typer.models.OptionInfo object>, create: bool = <typer.models.OptionInfo object>, config: str = <typer.models.OptionInfo object>, name: str = <typer.models.OptionInfo object>, delete: bool = <typer.models.OptionInfo object>)

Perform operations related to workspaces. Depending on the options provided, this function can initialize, create, delete, or perform other actions related to workspaces.

Parameters:

initialize (bool) – Flag to initialize workspace.
create (bool) – Flag to create workspace.
config (str) – The name of the workspace to act on.
name (str) – The name of the workspace to act on.
delete (bool) – Flag to delete workspace.
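
Because initialize, create, and delete are independent boolean flags, "depending on the options provided" can be pictured as a helper that reports which actions were requested. The function below is purely hypothetical. Whether asrch allows the flags to be combined, and in what order it would apply them, is not stated on this page.

```python
from __future__ import annotations


def requested_actions(
    initialize: bool = False, create: bool = False, delete: bool = False
) -> list[str]:
    """List the workspace actions whose flags are set, in declaration order."""
    flags = {"initialize": initialize, "create": create, "delete": delete}
    return [action for action, enabled in flags.items() if enabled]
```

A caller could use the resulting list to reject an empty or conflicting request before touching any workspace files.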
