selenium_mcp

An MCP (Model Context Protocol) server for browser automation via Selenium WebDriver. It is served over Streamable HTTP transport by default, and a stdio entry point is also provided.

Features

Multi-session management — run multiple independent browser sessions simultaneously
Full browser automation — navigate, click, type, select, hover, scroll, go back/forward
Form filling — fill multiple fields and submit in a single tool call
Content extraction — get visible text, raw HTML, or all hyperlinks from a page
Screenshots — viewport or full-page PNG captures returned as base64
JavaScript execution — run arbitrary JS and get the return value
Smart waits — wait for elements to be present, visible, clickable, or gone
Actionable errors — every Selenium exception is mapped to a helpful suggestion
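
Because screenshots come back as base64 text rather than as files, a client usually has to decode them before use. The following is a minimal sketch of that step, assuming the client has already pulled the base64 string out of the tool result. The helper name `decode_base64_png` is hypothetical and not part of this project.

```python
import base64

# Every PNG file starts with these 8 bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_base64_png(b64_data: str) -> bytes:
    """Decode a base64 string and check that the result looks like a PNG.

    Hypothetical helper: how the base64 string is extracted from the tool
    result depends on the client and is not shown here.
    """
    raw = base64.b64decode(b64_data, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return raw
```

The returned bytes can then be written to disk, for example with `pathlib.Path("page.png").write_bytes(...)`.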

Tools

Tool	Description
`selenium_navigate`	Navigate to a URL, optionally wait for an element
`selenium_click`	Click an element
`selenium_type`	Type text into an input/textarea
`selenium_select`	Select a dropdown option by value, text, or index
`selenium_find_elements`	Find elements and return their info
`selenium_screenshot`	Take a viewport or full-page screenshot
`selenium_get_page_content`	Extract text, HTML, or links from the page
`selenium_execute_script`	Run JavaScript in the browser
`selenium_wait_for`	Wait for an element condition
`selenium_fill_form`	Fill multiple form fields, then optionally submit
`selenium_scroll`	Scroll up/down/top/bottom
`selenium_back` / `selenium_forward`	Browser history navigation
`selenium_hover`	Hover over an element
`selenium_get_attribute`	Get all attributes/properties of an element
`selenium_list_sessions`	List active browser sessions
`selenium_close_session`	Close a browser session
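
On the wire, MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the shape of such a request. The argument names `url` and `selector` are assumptions made for illustration; only `by` is documented in this README (see Locator Strategies), so check each tool's schema via `list_tools` for the real parameter names. `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

# Simple incrementing request ids, one per call.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumed argument names ("url", "selector"); only "by" is documented.
navigate = build_tool_call("selenium_navigate", {"url": "https://example.com"})
click = build_tool_call(
    "selenium_click", {"selector": "button[type=submit]", "by": "css"}
)

print(json.dumps(navigate, indent=2))
```

In practice an MCP client library builds and sends these messages after the session has been initialized, so this is mainly useful for understanding or debugging the traffic.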

Requirements

Python 3.10+
Google Chrome / Chromium
ChromeDriver (auto-managed by Selenium Manager in selenium >= 4.6)

Installation

poetry install

Running

# Default: headless Chrome on port 8000
poetry run selenium-mcp

# stdio transport instead of HTTP
poetry run selenium-mcp-stdio

# Custom port and visible browser
SELENIUM_MCP_PORT=9000 SELENIUM_HEADLESS=false poetry run selenium-mcp

With the default settings, the MCP endpoint is available at http://localhost:8000/mcp. If you set SELENIUM_MCP_PORT, use that port in the URL instead.

Environment Variables

Variable	Default	Description
`SELENIUM_MCP_HOST`	`0.0.0.0`	Bind host
`SELENIUM_MCP_PORT`	`8000`	Bind port
`SELENIUM_HEADLESS`	`true`	Run Chrome in headless mode
`SELENIUM_WINDOW_WIDTH`	`1920`	Browser window width in pixels
`SELENIUM_WINDOW_HEIGHT`	`1080`	Browser window height in pixels
`CHROME_BINARY`	(auto)	Path to Chrome/Chromium binary
`CHROMEDRIVER_PATH`	(auto)	Path to ChromeDriver binary
`SELENIUM_SCREENSHOT_DIR`	`/tmp/selenium_screenshots`	Screenshot storage directory
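
To illustrate how the defaults in this table combine with overrides, here is a minimal sketch of reading them from the environment. It is not the server's actual configuration code: the `Settings` class, the `load_settings` function, and the rule for parsing `SELENIUM_HEADLESS` as a boolean are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    headless: bool
    window_width: int
    window_height: int
    chrome_binary: Optional[str]      # None means "auto-detect"
    chromedriver_path: Optional[str]  # None means "auto-detect"
    screenshot_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    # Assumption: "1", "true" and "yes" (any case) count as true.
    headless = env.get("SELENIUM_HEADLESS", "true").strip().lower() in {"1", "true", "yes"}
    return Settings(
        host=env.get("SELENIUM_MCP_HOST", "0.0.0.0"),
        port=int(env.get("SELENIUM_MCP_PORT", "8000")),
        headless=headless,
        window_width=int(env.get("SELENIUM_WINDOW_WIDTH", "1920")),
        window_height=int(env.get("SELENIUM_WINDOW_HEIGHT", "1080")),
        chrome_binary=env.get("CHROME_BINARY"),
        chromedriver_path=env.get("CHROMEDRIVER_PATH"),
        screenshot_dir=env.get("SELENIUM_SCREENSHOT_DIR", "/tmp/selenium_screenshots"),
    )


settings = load_settings({"SELENIUM_MCP_PORT": "9000", "SELENIUM_HEADLESS": "false"})
print(settings.port, settings.headless)
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.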

MCP Client Configuration

Claude Code / Claude Desktop (via mcp-remote)

{
  "mcpServers": {
    "selenium": {
      "type": "streamable-http",
      "url": "http://localhost:8000/mcp"
    }
  }
}

Programmatic Python client

import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client("http://localhost:8000/mcp") as (r, w, _):
        async with ClientSession(r, w) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

asyncio.run(main())

Locator Strategies

All element-targeting tools accept a `by` parameter:

Value	Selenium `By`
`css` (default)	`By.CSS_SELECTOR`
`xpath`	`By.XPATH`
`id`	`By.ID`
`name`	`By.NAME`
`tag_name`	`By.TAG_NAME`
`class_name`	`By.CLASS_NAME`
`link_text`	`By.LINK_TEXT`
`partial_link_text`	`By.PARTIAL_LINK_TEXT`
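
The table above can be read as a simple lookup from the `by` value to a Selenium locator strategy. The sketch below shows one way such a lookup might work. It uses the string values of Selenium's `By` constants directly so that it runs without Selenium installed. `resolve_locator` is a hypothetical name, and the server's real implementation may differ.

```python
from typing import Tuple

# Right-hand sides are the string values of selenium.webdriver.common.by.By
# (for example, By.CSS_SELECTOR == "css selector").
LOCATOR_STRATEGIES = {
    "css": "css selector",
    "xpath": "xpath",
    "id": "id",
    "name": "name",
    "tag_name": "tag name",
    "class_name": "class name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
}


def resolve_locator(selector: str, by: str = "css") -> Tuple[str, str]:
    """Turn a (selector, by) pair into a (strategy, value) locator tuple."""
    try:
        strategy = LOCATOR_STRATEGIES[by]
    except KeyError:
        valid = ", ".join(sorted(LOCATOR_STRATEGIES))
        raise ValueError(f"unknown locator strategy {by!r}; expected one of: {valid}") from None
    return strategy, selector
```

With Selenium installed, the resulting tuple can be unpacked straight into a lookup, as in `driver.find_element(*resolve_locator("#login"))`.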

License

MIT
