Upgrade Web_Search to use DDGS instead of web scraping #7468

@Aztexx

The current web_search and fetch_webpage tools work fantastic, but I am running into issues with the web search function specifically. It returns zero results, either because my public IP is being rate-limited or because the hardcoded regex no longer matches. fetch_webpage works great when pointed at a specific website, but without a working web search it is relatively useless.
(This applies only when DuckDuckGo is used as the search engine.)

I recommend implementing DDGS (formerly duckduckgo-search):
https://pypi.org/project/ddgs
It supports multiple search engines, with fallbacks built in.

This would keep the existing trafilatura content extraction logic, while making web_search more reliable and still preserving privacy.

The main change to modules/web_search.py would be to replace the fragile requests + regex block with a call to DDGS (imported from the ddgs package).

Something like this:
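from ddgs import DDGS

def perform_web_search(query, num_pages=3, max_workers=5, timeout=10, fetch_content=True):
    try:
        with DDGS() as ddgs:
            # Query is passed positionally so the call does not depend on the parameter name
            search_results_gen = ddgs.text(query, max_results=num_pages)
            search_data = list(search_results_gen)
        # ... (rest of the function unchanged)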
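
To make the idea a bit more concrete, here is a minimal, self-contained sketch of just the link-gathering step. It is not the actual modules/web_search.py code. The names `search_links`, `ddgs_text_search`, and the `search_fn` parameter are hypothetical, and the result keys (`title`, `href`, `body`) are assumed from what ddgs documents for text search, so they should be checked against the installed version. The content fetching with trafilatura would stay as it is today.

```python
from typing import Callable, Dict, List, Optional

# A search backend takes (query, max_results) and returns a list of result dicts.
SearchFn = Callable[[str, int], List[Dict[str, str]]]


def ddgs_text_search(query: str, max_results: int) -> List[Dict[str, str]]:
    """Run a text search through DDGS (hypothetical helper).

    The import is done lazily so this module still loads when ddgs is missing.
    """
    from ddgs import DDGS  # pip install ddgs

    return list(DDGS().text(query, max_results=max_results))


def search_links(
    query: str,
    num_pages: int = 3,
    search_fn: Optional[SearchFn] = None,
) -> List[Dict[str, str]]:
    """Return up to num_pages results as dicts with title, url, and snippet."""
    search_fn = search_fn or ddgs_text_search
    try:
        raw_results = search_fn(query, num_pages)
    except Exception as exc:
        # Rate limits, network errors, or a missing ddgs install all end up here.
        print(f"Web search failed: {exc}")
        return []

    results = []
    for item in raw_results:
        url = item.get("href")  # assumed key name, see note above
        if not url:
            continue  # skip entries without a usable link
        results.append(
            {
                "title": item.get("title", ""),
                "url": url,
                "snippet": item.get("body", ""),
            }
        )
    return results[:num_pages]
```

Calling `search_links("some query", num_pages=3)` would go through DDGS by default. Passing a different `search_fn` makes it possible to test the function offline or to swap in another backend later without touching the rest of the tool.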

I'm open to building my own tools and modules, but I would love for this to be implemented directly so that I don't need to maintain separate tools outside the source code, especially as I prefer the portable versions you provide.

I still think TGWUI is the best front end for local AI inference :)

  • Aztec

Metadata

Labels: enhancement
