# selenium_mcp

An MCP (Model Context Protocol) server for browser automation via Selenium WebDriver, served over **Streamable HTTP** transport.

## Features

- **Multi-session management**: run multiple independent browser sessions simultaneously
- **Full browser automation**: navigate, click, type, select, hover, scroll, go back/forward
- **Form filling**: fill multiple fields and submit in a single tool call
- **Content extraction**: get visible text, raw HTML, or all hyperlinks from a page
- **Screenshots**: viewport or full-page PNG captures returned as base64
- **JavaScript execution**: run arbitrary JS and get the return value
- **Smart waits**: wait for elements to be present, visible, clickable, or gone
- **Actionable errors**: every Selenium exception is mapped to a helpful suggestion
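
As a rough illustration of the "actionable errors" idea, the sketch below maps Selenium exception class names to a one-line hint. The table contents, `SUGGESTIONS`, `DEFAULT_SUGGESTION`, and `suggest` are all hypothetical names chosen for this example; the server's actual mapping is not shown in this README and may differ.

```python
# Hypothetical mapping from Selenium exception class names to hints.
# The exception names are real Selenium classes; the hint texts are invented.
SUGGESTIONS = {
    "NoSuchElementException": "Check the selector, or wait for the element with selenium_wait_for.",
    "TimeoutException": "Increase the timeout or confirm the page finished loading.",
    "StaleElementReferenceException": "The page changed; locate the element again before retrying.",
    "ElementClickInterceptedException": "Another element covers the target; scroll to it or close the overlay.",
    "ElementNotInteractableException": "The element is hidden or disabled; wait until it is clickable.",
}

DEFAULT_SUGGESTION = "Unexpected browser error; take a screenshot to inspect the page state."


def suggest(exc: Exception) -> str:
    """Return a one-line hint for an exception, keyed by its class name."""
    # Walk the MRO so subclasses of a known exception inherit its hint.
    for cls in type(exc).__mro__:
        hint = SUGGESTIONS.get(cls.__name__)
        if hint is not None:
            return hint
    return DEFAULT_SUGGESTION
```

Keying on the class name keeps the sketch importable without Selenium installed; a real implementation would more likely key on the exception classes themselves.
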

## Tools

| Tool | Description |
|---|---|
| `selenium_navigate` | Navigate to a URL, optionally wait for an element |
| `selenium_click` | Click an element |
| `selenium_type` | Type text into an input/textarea |
| `selenium_select` | Select a dropdown option by value, text, or index |
| `selenium_find_elements` | Find elements and return their info |
| `selenium_screenshot` | Take a viewport or full-page screenshot |
| `selenium_get_page_content` | Extract text, HTML, or links from the page |
| `selenium_execute_script` | Run JavaScript in the browser |
| `selenium_wait_for` | Wait for an element condition |
| `selenium_fill_form` | Fill multiple form fields, then optionally submit |
| `selenium_scroll` | Scroll up/down/top/bottom |
| `selenium_back` / `selenium_forward` | Browser history navigation |
| `selenium_hover` | Hover over an element |
| `selenium_get_attribute` | Get all attributes/properties of an element |
| `selenium_list_sessions` | List active browser sessions |
| `selenium_close_session` | Close a browser session |
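
Each tool in the table is invoked through an MCP `tools/call` request. The sketch below builds that JSON-RPC payload by hand to show its shape. The helper `tool_call` is hypothetical, and the argument names `url` and `selector` are assumptions for illustration; check each tool's input schema (for example via `list_tools()`) for the real parameter names.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` request for the given tool name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# `url` and `selector` are assumed argument names, not confirmed by this README.
navigate = tool_call("selenium_navigate", url="https://example.com")
click = tool_call("selenium_click", selector="#submit", by="css")
print(json.dumps(navigate, indent=2))
```

With the Python SDK you would not normally build this payload yourself: `await session.call_tool("selenium_navigate", {"url": "https://example.com"})` sends the equivalent request.
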

## Requirements

- Python 3.10+
- Google Chrome / Chromium
- ChromeDriver (auto-managed by Selenium Manager in selenium >= 4.6)

## Installation

```bash
poetry install
```

## Running

```bash
# Default: headless Chrome on port 8000
poetry run selenium-mcp

# stdio transport instead of HTTP
poetry run selenium-mcp-stdio

# Custom port and visible browser
SELENIUM_MCP_PORT=9000 SELENIUM_HEADLESS=false poetry run selenium-mcp
```

With the default settings, the MCP endpoint is available at `http://localhost:8000/mcp`. If you set `SELENIUM_MCP_PORT`, use that port in the URL instead.

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `SELENIUM_MCP_HOST` | `0.0.0.0` | Bind host |
| `SELENIUM_MCP_PORT` | `8000` | Bind port |
| `SELENIUM_HEADLESS` | `true` | Run Chrome in headless mode |
| `SELENIUM_WINDOW_WIDTH` | `1920` | Browser window width |
| `SELENIUM_WINDOW_HEIGHT` | `1080` | Browser window height |
| `CHROME_BINARY` | *(auto)* | Path to Chrome/Chromium binary |
| `CHROMEDRIVER_PATH` | *(auto)* | Path to ChromeDriver binary |
| `SELENIUM_SCREENSHOT_DIR` | `/tmp/selenium_screenshots` | Screenshot storage directory |
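
To make the defaults concrete, here is a minimal sketch of how these variables could be read into a settings object. `Settings`, `load_settings`, and the boolean parsing rule in `_as_bool` are assumptions made for this example, not the server's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumed rule: common truthy spellings, case-insensitive.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    headless: bool
    window_width: int
    window_height: int
    chrome_binary: Optional[str]      # None means "let Selenium find it"
    chromedriver_path: Optional[str]  # None means "let Selenium Manager handle it"
    screenshot_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        host=env.get("SELENIUM_MCP_HOST", "0.0.0.0"),
        port=int(env.get("SELENIUM_MCP_PORT", "8000")),
        headless=_as_bool(env.get("SELENIUM_HEADLESS", "true")),
        window_width=int(env.get("SELENIUM_WINDOW_WIDTH", "1920")),
        window_height=int(env.get("SELENIUM_WINDOW_HEIGHT", "1080")),
        chrome_binary=env.get("CHROME_BINARY") or None,
        chromedriver_path=env.get("CHROMEDRIVER_PATH") or None,
        screenshot_dir=env.get("SELENIUM_SCREENSHOT_DIR", "/tmp/selenium_screenshots"),
    )
```

Passing a plain dict, as in `load_settings({"SELENIUM_MCP_PORT": "9000"})`, makes it easy to check an override without touching the real environment.
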

## MCP Client Configuration

### Claude Code / Claude Desktop (via mcp-remote)

```json
{
  "mcpServers": {
    "selenium": {
      "type": "streamable-http",
      "url": "http://localhost:8000/mcp"
    }
  }
}
```

### Programmatic Python client

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main():
    # Open the Streamable HTTP transport; the third item (session id getter) is unused here.
    async with streamablehttp_client("http://localhost:8000/mcp") as (r, w, _):
        async with ClientSession(r, w) as session:
            await session.initialize()  # MCP handshake
            tools = await session.list_tools()
            print([t.name for t in tools.tools])


asyncio.run(main())
```

## Locator Strategies

All element-targeting tools accept a `by` parameter:

| Value | Selenium `By` |
|---|---|
| `css` (default) | `By.CSS_SELECTOR` |
| `xpath` | `By.XPATH` |
| `id` | `By.ID` |
| `name` | `By.NAME` |
| `tag_name` | `By.TAG_NAME` |
| `class_name` | `By.CLASS_NAME` |
| `link_text` | `By.LINK_TEXT` |
| `partial_link_text` | `By.PARTIAL_LINK_TEXT` |
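
The table above amounts to a small lookup. The sketch below shows one way such a lookup could be written; `LOCATORS` and `resolve_locator` are hypothetical names, and the right-hand strings are the literal values of the corresponding `selenium.webdriver.common.by.By` constants, written out so the example runs without Selenium installed.

```python
# Values mirror the By constants in selenium.webdriver.common.by
# (for example, By.CSS_SELECTOR == "css selector").
LOCATORS = {
    "css": "css selector",
    "xpath": "xpath",
    "id": "id",
    "name": "name",
    "tag_name": "tag name",
    "class_name": "class name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
}


def resolve_locator(selector: str, by: str = "css") -> tuple[str, str]:
    """Translate a `by` value into a (strategy, selector) pair for Selenium."""
    try:
        return LOCATORS[by], selector
    except KeyError:
        raise ValueError(
            f"Unknown locator strategy {by!r}; expected one of {sorted(LOCATORS)}"
        ) from None
```

The returned pair can be unpacked straight into WebDriver calls, for example `driver.find_element(*resolve_locator("#login"))`.
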

## License

MIT