llm-tools/mcps/selenium_mcp
Gregory Gauthier 83ec950df7 first commit
2026-04-08 12:11:04 +01:00

selenium_mcp

An MCP (Model Context Protocol) server for browser automation via Selenium WebDriver, served over the Streamable HTTP transport by default, with an optional stdio mode.

Features

  • Multi-session management — run multiple independent browser sessions simultaneously
  • Full browser automation — navigate, click, type, select, hover, scroll, go back/forward
  • Form filling — fill multiple fields and submit in a single tool call
  • Content extraction — get visible text, raw HTML, or all hyperlinks from a page
  • Screenshots — viewport or full-page PNG captures returned as base64
  • JavaScript execution — run arbitrary JS and get the return value
  • Smart waits — wait for elements to be present, visible, clickable, or gone
  • Actionable errors — every Selenium exception is mapped to a helpful suggestion
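
The "actionable errors" feature can be pictured as a lookup from exception type to hint. The following is a minimal sketch, not the server's actual code: the `SUGGESTIONS` table, its wording, and the `suggest()` helper are hypothetical. The keys are names of real Selenium exception classes, matched by class name only so the snippet runs without Selenium installed.

```python
# Hypothetical sketch: map Selenium exception class names to a hint.
SUGGESTIONS = {
    "NoSuchElementException": "Check the selector, or wait for the element first.",
    "TimeoutException": "Increase the timeout or wait for a different condition.",
    "StaleElementReferenceException": "The page changed; find the element again.",
    "ElementClickInterceptedException": "Another element covers the target; scroll or close overlays.",
}

DEFAULT_HINT = "Take a screenshot and inspect the page state."


def suggest(exc: Exception) -> str:
    """Return a hint for exc, checking its class and then its base classes."""
    for cls in type(exc).__mro__:
        hint = SUGGESTIONS.get(cls.__name__)
        if hint:
            return f"{type(exc).__name__}: {hint}"
    return f"{type(exc).__name__}: {DEFAULT_HINT}"


if __name__ == "__main__":
    # Stand-in class so the demo runs without Selenium installed.
    class NoSuchElementException(Exception):
        pass

    print(suggest(NoSuchElementException("no match for #login")))
```

Walking the MRO means a subclass of a listed exception still gets its parent's hint, and anything unlisted falls back to the generic suggestion.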

Tools

| Tool | Description |
| --- | --- |
| selenium_navigate | Navigate to a URL, optionally wait for an element |
| selenium_click | Click an element |
| selenium_type | Type text into an input/textarea |
| selenium_select | Select a dropdown option by value, text, or index |
| selenium_find_elements | Find elements and return their info |
| selenium_screenshot | Take a viewport or full-page screenshot |
| selenium_get_page_content | Extract text, HTML, or links from the page |
| selenium_execute_script | Run JavaScript in the browser |
| selenium_wait_for | Wait for an element condition |
| selenium_fill_form | Fill multiple form fields, then optionally submit |
| selenium_scroll | Scroll up/down/top/bottom |
| selenium_back / selenium_forward | Browser history navigation |
| selenium_hover | Hover over an element |
| selenium_get_attribute | Get all attributes/properties of an element |
| selenium_list_sessions | List active browser sessions |
| selenium_close_session | Close a browser session |
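
On the wire, each tool above is invoked with MCP's `tools/call` method, which is a JSON-RPC 2.0 request carrying the tool name and an `arguments` object. The sketch below only builds such request bodies. The argument names (`url`, `selector`, `by`) are assumptions for illustration; the schema reported by the server's tool listing is authoritative.

```python
import itertools
import json

_ids = itertools.count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Argument names below are assumed, not taken from the server's schema.
    requests = [
        tool_call("selenium_navigate", url="https://example.com"),
        tool_call("selenium_click", selector="a", by="css"),
    ]
    for request in requests:
        print(json.dumps(request))
```

In practice an MCP client library builds these messages for you; the sketch is only meant to show what a tool invocation looks like underneath.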

Requirements

  • Python 3.10+
  • Google Chrome / Chromium
  • ChromeDriver (auto-managed by Selenium Manager in selenium >= 4.6)

Installation

poetry install

Running

# Default: headless Chrome on port 8000
poetry run selenium-mcp

# stdio transport instead of HTTP
poetry run selenium-mcp-stdio

# Custom port and visible browser
SELENIUM_MCP_PORT=9000 SELENIUM_HEADLESS=false poetry run selenium-mcp

With the default settings, the MCP endpoint is available at http://localhost:8000/mcp. If you set SELENIUM_MCP_PORT, substitute that port in the URL.

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SELENIUM_MCP_HOST | 0.0.0.0 | Bind host |
| SELENIUM_MCP_PORT | 8000 | Bind port |
| SELENIUM_HEADLESS | true | Run Chrome in headless mode |
| SELENIUM_WINDOW_WIDTH | 1920 | Browser window width |
| SELENIUM_WINDOW_HEIGHT | 1080 | Browser window height |
| CHROME_BINARY | (auto) | Path to Chrome/Chromium binary |
| CHROMEDRIVER_PATH | (auto) | Path to ChromeDriver binary |
| SELENIUM_SCREENSHOT_DIR | /tmp/selenium_screenshots | Screenshot storage directory |
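
The variables and defaults above could be read with standard-library code along these lines. This is a sketch rather than the server's actual startup logic: `Settings`, `load_settings`, and the set of accepted boolean spellings are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    headless: bool
    window_width: int
    window_height: int
    chrome_binary: Optional[str]      # None means "let Selenium locate it"
    chromedriver_path: Optional[str]  # None means "let Selenium Manager handle it"
    screenshot_dir: str


def _as_bool(value: str) -> bool:
    # Assumed convention; the server may accept a different set of spellings.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables from env, applying the documented defaults."""
    return Settings(
        host=env.get("SELENIUM_MCP_HOST", "0.0.0.0"),
        port=int(env.get("SELENIUM_MCP_PORT", "8000")),
        headless=_as_bool(env.get("SELENIUM_HEADLESS", "true")),
        window_width=int(env.get("SELENIUM_WINDOW_WIDTH", "1920")),
        window_height=int(env.get("SELENIUM_WINDOW_HEIGHT", "1080")),
        chrome_binary=env.get("CHROME_BINARY") or None,
        chromedriver_path=env.get("CHROMEDRIVER_PATH") or None,
        screenshot_dir=env.get("SELENIUM_SCREENSHOT_DIR", "/tmp/selenium_screenshots"),
    )
```

For example, `load_settings({"SELENIUM_MCP_PORT": "9000", "SELENIUM_HEADLESS": "false"})` yields a port of 9000 and a visible browser, matching the custom-port command shown under Running.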

MCP Client Configuration

Claude Code / Claude Desktop (via mcp-remote)

{
  "mcpServers": {
    "selenium": {
      "type": "streamable-http",
      "url": "http://localhost:8000/mcp"
    }
  }
}

Programmatic Python client

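import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main():
    # Open a Streamable HTTP connection; the third item (a session-ID getter) is unused.
    async with streamablehttp_client("http://localhost:8000/mcp") as (r, w, _):
        async with ClientSession(r, w) as session:
            # Perform the MCP handshake before issuing any requests.
            await session.initialize()
            # Ask the server which tools it exposes and print their names.
            tools = await session.list_tools()
            print([t.name for t in tools.tools])


asyncio.run(main())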
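
Going one step beyond listing tools, a client can invoke one with `ClientSession.call_tool`. The sketch below assumes `selenium_navigate` takes a `url` argument, which is a guess at the schema; inspect the output of `list_tools()` for the real parameter names. `text_of` is a hypothetical helper that collects the text parts of a result.

```python
import asyncio
from typing import Any


def text_of(result: Any) -> str:
    """Join the text parts of a tool result, skipping images and other content."""
    parts = [c.text for c in result.content if getattr(c, "type", None) == "text"]
    return "\n".join(parts)


async def main(url: str = "http://localhost:8000/mcp") -> None:
    # Imported here so text_of() can be used without the mcp package installed.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(url) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # "url" is an assumed argument name; check list_tools() for the schema.
            result = await session.call_tool(
                "selenium_navigate", {"url": "https://example.com"}
            )
            print(text_of(result))
```

With the server running, start it with `asyncio.run(main())`.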

Locator Strategies

All element-targeting tools accept a `by` parameter:

| Value | Selenium By |
| --- | --- |
| css (default) | By.CSS_SELECTOR |
| xpath | By.XPATH |
| id | By.ID |
| name | By.NAME |
| tag_name | By.TAG_NAME |
| class_name | By.CLASS_NAME |
| link_text | By.LINK_TEXT |
| partial_link_text | By.PARTIAL_LINK_TEXT |
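
One plausible way to implement this mapping is a plain dictionary lookup with `css` as the default. The sketch below is an assumption about the approach, not the server's code; `LOCATORS` and `resolve_by` are hypothetical names. The string values are those of the corresponding constants on Selenium's `By` class, written out literally so the snippet runs without Selenium installed.

```python
# Values equal the constants on selenium.webdriver.common.by.By.
LOCATORS = {
    "css": "css selector",                     # By.CSS_SELECTOR
    "xpath": "xpath",                          # By.XPATH
    "id": "id",                                # By.ID
    "name": "name",                            # By.NAME
    "tag_name": "tag name",                    # By.TAG_NAME
    "class_name": "class name",                # By.CLASS_NAME
    "link_text": "link text",                  # By.LINK_TEXT
    "partial_link_text": "partial link text",  # By.PARTIAL_LINK_TEXT
}


def resolve_by(by: str = "css") -> str:
    """Translate a tool-level `by` value into a Selenium locator strategy."""
    try:
        return LOCATORS[by]
    except KeyError:
        expected = ", ".join(sorted(LOCATORS))
        raise ValueError(
            f"unknown locator strategy {by!r}; expected one of: {expected}"
        ) from None
```

The resolved string can be passed wherever Selenium expects a `By` value, for example `driver.find_element(resolve_by("id"), "login")`.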

License

MIT