Kasha – Stealth Web Scraper (CLI)

Description

Kasha is a stealth web scraping utility designed for precision and silence. Named after the Japanese demon that steals corpses from funerals, Kasha moves with similar discretion — gathering data from remote web targets without drawing attention.

Built on the Python HTTPX library, with BeautifulSoup for parsing, Kasha sends each request with a randomly chosen User-Agent. The pool is not a small sample: it is a very large built-in collection of known browser strings, too long to list here.

Note: The source includes a large set of browser identifiers, and all of them are available for rotation, so consecutive connections are unlikely to present the same User-Agent.
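The rotation described above can be sketched roughly as follows. This is a minimal illustration, not Kasha's actual code: the three-entry USER_AGENTS list and the names random_headers and fetch are assumptions made for the example, and the real pool is far larger.

```python
import random

# Hypothetical three-entry pool for illustration only; Kasha's real list is far larger.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def random_headers(rng=random):
    """Return request headers carrying a randomly chosen User-Agent."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def fetch(url, timeout=10.0):
    """Fetch one page with a fresh User-Agent (requires the httpx package)."""
    import httpx  # third-party; imported here so random_headers() works without it

    response = httpx.get(
        url,
        headers=random_headers(),
        timeout=timeout,
        follow_redirects=True,
    )
    response.raise_for_status()
    return response.text
```

Calling random_headers() once per request, rather than once per session, is what makes consecutive requests differ.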

Purpose

Kasha is crafted for developers, researchers, and digital samurai who must collect remote data efficiently — whether for:

  • Archiving or mirroring websites
  • Analyzing HTML structures and links
  • Harvesting assets such as images, CSS, or JavaScript
  • Following internal or external link structures recursively
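As a rough sketch of the link analysis and asset harvesting listed above, the function below pulls page links and asset URLs out of an HTML string with BeautifulSoup. The function name and the choice of tags are assumptions for illustration; the README does not show how Kasha does this internally.

```python
from urllib.parse import urljoin


def extract_links_and_assets(html, base_url):
    """Return (links, assets) as lists of absolute URLs found in the page."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")

    # Anchor tags give the pages that could be followed next.
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]

    assets = []
    # Images and scripts reference their files through the src attribute.
    for tag in soup.find_all(["img", "script"], src=True):
        assets.append(urljoin(base_url, tag["src"]))
    # Stylesheets are <link rel="stylesheet" href="..."> elements.
    for tag in soup.find_all("link", href=True):
        if "stylesheet" in (tag.get("rel") or []):
            assets.append(urljoin(base_url, tag["href"]))
    return links, assets
```

urljoin resolves relative references such as logo.png against the page URL, so every returned entry is absolute and can be fetched directly.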

Usage

Invoke from terminal:

./kasha <url> [options]

Available Options:

  Option              Description
  --resources         Scrape and save all assets (images, CSS, JS)
  --dynamic           Enable Playwright mode for dynamic pages
  --logging           Activate detailed logging output
  --follow-internal   Follow and scrape all internal links
  --follow-all        Follow all links (internal & external)
  --rate-limit N      Pause N seconds between requests

Example Command:

./kasha https://example.com --resources --follow-internal --rate-limit 2
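For readers who want to see how such an interface could be declared, here is a minimal argparse sketch of the options listed above. The real parser is not shown in this README, so the function name build_parser, the float type for N, and the default of 0 seconds are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (a sketch, not Kasha's own code)."""
    parser = argparse.ArgumentParser(prog="kasha", description="Stealth web scraper")
    parser.add_argument("url", help="Target URL to scrape")
    parser.add_argument("--resources", action="store_true",
                        help="Scrape and save all assets (images, CSS, JS)")
    parser.add_argument("--dynamic", action="store_true",
                        help="Enable Playwright mode for dynamic pages")
    parser.add_argument("--logging", action="store_true",
                        help="Activate detailed logging output")
    parser.add_argument("--follow-internal", action="store_true",
                        help="Follow and scrape all internal links")
    parser.add_argument("--follow-all", action="store_true",
                        help="Follow all links (internal & external)")
    # Assumption: N may be fractional and defaults to no pause.
    parser.add_argument("--rate-limit", type=float, default=0.0, metavar="N",
                        help="Pause N seconds between requests")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

argparse converts the dashed option names to attributes with underscores, so --follow-internal is read back as args.follow_internal.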

Structure

All scraped data is preserved in the scrapes/ directory, organized by domain:

scrapes/
  ├── example.com/
  │   ├── index.html
  │   ├── assets/
  │   ├── css/
  │   └── js/
  └── anotherdomain.org/
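One plausible way to derive a per-domain location like the tree above is sketched below. It is an assumption, not Kasha's documented behavior: the hypothetical local_path helper simply mirrors the URL path under scrapes/<domain>/ and does not sort files into assets/, css/, or js/ by type.

```python
from pathlib import Path
from urllib.parse import urlparse


def local_path(url, root="scrapes"):
    """Map a URL to a file path under <root>/<domain>/ (layout is an assumption)."""
    parsed = urlparse(url)
    host = parsed.hostname or "unknown-host"
    # Drop empty, "." and ".." segments so a hostile URL cannot escape the root.
    parts = [p for p in parsed.path.split("/") if p not in ("", ".", "..")]
    # Directory-style URLs (no path or a trailing slash) are stored as index.html.
    if not parts or parsed.path.endswith("/"):
        parts.append("index.html")
    return Path(root, host, *parts)


if __name__ == "__main__":
    for example in ("https://example.com/", "https://example.com/css/site.css"):
        print(example, "->", local_path(example))
```

Filtering out ".." segments matters for any scraper that writes to disk, since link targets come from untrusted pages.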

Features

  • Massive randomized User-Agent rotation
  • Recursive link following (internal/external)
  • Optional rate limiting for stealth operations
  • Support for static and dynamic content (Playwright)
  • Clean directory mirroring and structured output
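The recursive following and rate limiting listed above can be illustrated with a small breadth-first crawl loop. This is a sketch under stated assumptions: fetch_links is a hypothetical callback that downloads one page and returns its absolute links, and the max_pages cap and the "internal means same hostname" rule are choices made for the example, not confirmed Kasha behavior.

```python
import time
from collections import deque
from urllib.parse import urldefrag, urlparse


def crawl(start_url, fetch_links, follow="internal", rate_limit=0.0,
          max_pages=50, sleep=time.sleep):
    """Breadth-first crawl; returns the visited URLs in order.

    fetch_links(url) -> iterable of absolute URLs (hypothetical callback).
    follow: "none", "internal" (same hostname only) or "all".
    """
    start_host = urlparse(start_url).hostname
    queue = deque([urldefrag(start_url).url])
    seen = set(queue)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if visited and rate_limit > 0:
            sleep(rate_limit)  # pause between requests, not before the first one
        visited.append(url)
        links = fetch_links(url)
        if follow == "none":
            continue
        for link in links:
            link = urldefrag(link).url  # treat page#a and page#b as the same page
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and similar
            internal = parsed.hostname == start_host
            if link not in seen and (follow == "all" or internal):
                seen.add(link)
                queue.append(link)
    return visited
```

Passing sleep as a parameter keeps the loop easy to test without real delays, and the seen set prevents pages that link to each other from being fetched twice.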

Philosophy

In the tradition of the Ronin, Kasha acts without master or mercy — silent, methodical, and precise.
Each scrape is a strike: deliberate, unseen, and final.

“Strike once, unseen — leave only echoes.”

Requirements

  • Python 3.8+
  • Libraries: httpx, beautifulsoup4, playwright (optional)

Install dependencies:

pip install httpx beautifulsoup4 playwright

If you plan to use --dynamic, Playwright also needs its browser binaries:

playwright install

License

This project is distributed under a permissive open license. Use with responsibility and respect for target servers. The sword is sharp — wield it wisely.
