This document catalogs Comet's nine specialized tools for browser automation and information retrieval, detailing their parameters, usage patterns, and coordination strategies. It focuses on the technical specifications and interaction patterns that enable Comet to navigate web pages, extract content, interact with page elements, and manage complex multi-step tasks.
For information about Comet's security model and permission system governing when these tools can be used, see Security and Privacy Protocols. For broader system architecture and behavioral principles, see System Architecture and Core Principles.
Comet's tool ecosystem consists of nine specialized tools organized into three functional categories. All browser interaction tools require a tab_id parameter Comet Assistant/System Prompt.txt635-641 to specify which browser tab to operate on, enabling multi-tab workflows.
Tool Category Architecture
Tool Count by Category:
navigate, computer, read_page, find, form_input, get_page_text)search_web)tabs_create, todo_write)Sources: Comet Assistant/tools.json1-230 Comet Assistant/System Prompt.txt22-42 Comet Assistant/System Prompt.txt635-641
The navigate tool controls URL loading and browser history traversal. It is the entry point for all page interactions.
Parameters:
tab_id (required): Target browser tab identifierurl (required): Target URL or history command ("back", "forward")Usage Patterns:
| Operation | Example | Use Case |
|---|---|---|
| Load URL | navigate(url="https://example.com", tab_id=123) | Navigate to new page |
| Protocol-less | navigate(url="example.com", tab_id=123) | Defaults to HTTPS |
| Back | navigate(url="back", tab_id=123) | Return to previous page |
| Forward | navigate(url="forward", tab_id=123) | Advance in history |
The tool supports protocol-less URLs that default to HTTPS Comet Assistant/tools.json22-23 History navigation is preferred over keyboard shortcuts on Windows systems Comet Assistant/System Prompt.txt626
Sources: Comet Assistant/tools.json7-24 Comet Assistant/System Prompt.txt899-907
The computer tool provides low-level browser interaction through eight action types Comet Assistant/tools.json29-37 It returns screenshots after execution, with blue dots marking click locations Comet Assistant/System Prompt.txt32-33
Action Type Specifications and JSON Structure
Complete Action Type Enumeration:
| Action Type | JSON Key | Value Format | Purpose |
|---|---|---|---|
left_click | action | "left_click" | Primary click interaction |
right_click | action | "right_click" | Context menu activation |
double_click | action | "double_click" | Text selection/file opening |
triple_click | action | "triple_click" | Line/paragraph selection |
type | action | "type" | Text input simulation |
key | action | "key" | Keyboard key press |
scroll | action | "scroll" | Page scrolling |
screenshot | action | "screenshot" | Visual capture |
Tool Call JSON Structure:
| Action | Required JSON Fields | Example Tool Call |
|---|---|---|
left_click | action, tab_id, (coordinate OR ref) | {"action": "left_click", "coordinate": [100, 200], "tab_id": 1} |
left_click (with ref) | action, tab_id, ref | {"action": "left_click", "ref": "ref_5", "tab_id": 1} |
type | action, tab_id, text | {"action": "type", "text": "Hello World", "tab_id": 1} |
key | action, tab_id, text | {"action": "key", "text": "ctrl+a", "tab_id": 1} |
key (special) | action, tab_id, text | {"action": "key", "text": "Return", "tab_id": 1} |
scroll | action, tab_id, coordinate, scroll_parameters | {"action": "scroll", "coordinate": [0, 0], "scroll_parameters": {"scroll_direction": "down", "scroll_amount": 3}, "tab_id": 1} |
screenshot | action, tab_id | {"action": "screenshot", "tab_id": 1} |
scroll_parameters Object Structure:
Keyboard Modifier Convention: Windows systems use "ctrl" as the primary modifier key for shortcuts like select all (ctrl+a), copy (ctrl+c), paste (ctrl+v) Comet Assistant/System Prompt.txt625
Action Chaining: Multiple actions can be combined in a single tool call for efficiency Comet Assistant/System Prompt.txt34-36 The pattern click → type should always be a single call rather than separate invocations Comet Assistant/System Prompt.txt35
Sources: Comet Assistant/tools.json25-51 Comet Assistant/System Prompt.txt886-897 Comet Assistant/System Prompt.txt623-626
The read_page tool extracts the DOM accessibility tree, returning element references in the format ref_1, ref_2, ref_3, etc. Comet Assistant/tools.json64 for precise interaction. This is the preferred method for finding elements that may not be visible in screenshots Comet Assistant/System Prompt.txt25
Tool Call JSON Structure:
Parameter Specifications:
| Parameter | Type | Default | Valid Values | Description |
|---|---|---|---|---|
tab_id | integer | required | Any valid tab ID | Target browser tab identifier |
depth | integer | 15 | 1-15 | DOM tree traversal depth limit |
filter | string | none | "interactive", "all" | Element filtering mode |
ref_id | string | none | "ref_N" format | Focus on specific element's children |
Return Value Structure:
ref_1, ref_2, ref_3, ..., ref_NOptimization Strategies:
The tool prioritizes element references over screenshot-based coordinates for precision Comet Assistant/tools.json68-70 When elements are not visible in the latest screenshot, read_page provides references for off-screen elements Comet Assistant/System Prompt.txt24-25
Sources: Comet Assistant/tools.json52-70 Comet Assistant/System Prompt.txt908-917
The find tool searches for page elements using natural language descriptions, returning up to 20 matches with both element references and coordinates Comet Assistant/tools.json81-82
Tool Call JSON Structure:
Parameters:
tab_id (required, integer): Target browser tab identifierquery (required, string): Natural language element descriptionReturn Value Format:
ref (e.g., "ref_7"), coordinate (e.g., [x, y])computer toolQuery Pattern Examples:
| Query Type | Example Query | Matches |
|---|---|---|
| By role | "search button" | Button elements with search-related text |
| By text | "Sign In" | Elements containing exact text |
| By function | "email input field" | Input elements for email addresses |
| By location | "submit button at bottom" | Buttons in lower page regions |
| By state | "checked checkbox" | Checkbox elements in checked state |
Usage Decision Tree:
The tool is specifically recommended when elements are not present in the latest screenshot Comet Assistant/System Prompt.txt24-25 providing an alternative to scrolling through long pages Comet Assistant/System Prompt.txt27
Sources: Comet Assistant/tools.json72-88 Comet Assistant/System Prompt.txt915-920
The form_input tool sets values in form elements without simulating user interactions. It requires element references from read_page Comet Assistant/System Prompt.txt929
Tool Call JSON Structure:
Parameter Specifications:
| Parameter | Type | Required | Format | Description |
|---|---|---|---|---|
tab_id | integer | Yes | Any valid tab ID | Target browser tab identifier |
ref | string | Yes | "ref_N" where N is integer | Element reference from read_page output |
value | string or boolean | Yes | Text string or true/false | Value to set in the form element |
Element Type to Value Type Mapping:
Tool Call Examples by Element Type:
| Element Type | Tool Call JSON | Result |
|---|---|---|
| Text Input | {"ref": "ref_5", "value": "example text", "tab_id": 123} | Sets text field value |
| Email Input | {"ref": "ref_7", "value": "[email protected]", "tab_id": 123} | Sets email field value |
| Checkbox (check) | {"ref": "ref_8", "value": true, "tab_id": 123} | Checks the checkbox |
| Checkbox (uncheck) | {"ref": "ref_8", "value": false, "tab_id": 123} | Unchecks the checkbox |
| Dropdown/Select | {"ref": "ref_12", "value": "Option Text", "tab_id": 123} | Selects option by visible text |
| Textarea | {"ref": "ref_3", "value": "Long text content", "tab_id": 123} | Sets textarea content |
Workflow Pattern: The standard form completion workflow is read_page → identify field refs → form_input for each field → locate submit button → computer click Comet Assistant/tools.json212-218
The tool can handle multiple field updates in sequence Comet Assistant/tools.json107 and ensures accuracy for form completion Comet Assistant/tools.json105-106
Sources: Comet Assistant/tools.json90-107 Comet Assistant/System Prompt.txt928-935
The get_page_text tool extracts plain text content from the current page, prioritizing article and main content areas. It returns text without HTML formatting Comet Assistant/tools.json117-118
Tool Call JSON Structure:
Parameters:
tab_id (required, integer): Target browser tab identifierReturn Value:
<article>, <main>, and primary content elementsUse Cases:
| Scenario | Advantage | Pattern |
|---|---|---|
| Long articles | Avoids repeated scrolling | Navigate → get_page_text → extract info |
| Text-heavy pages | Efficient content access | Navigate → get_page_text → parse |
| Infinite scroll pages | Combine with "max" scroll to load all content | Scroll to bottom → get_page_text |
| Content comparison | Extract raw text from multiple tabs | Create tabs → navigate → get_page_text each |
Tool Selection Decision:
The tool is explicitly recommended for avoiding repeated scrolling on long web pages Comet Assistant/System Prompt.txt27 Some complex web applications (Google Docs, Figma, Canva, Google Slides) require visual tools when read_page returns no meaningful content Comet Assistant/System Prompt.txt29
Sources: Comet Assistant/tools.json109-124 Comet Assistant/System Prompt.txt922-926
The search_web tool performs web searches without controlling the browser, providing an efficient alternative to navigating to search engines Comet Assistant/System Prompt.txt39-41 This tool must be used instead of navigating to google.com Comet Assistant/System Prompt.txt41
Tool Call JSON Structure:
Parameters:
queries (required, array of strings): Keyword-based search queries, maximum 3 per call Comet Assistant/tools.json130Return Value Structure:
id: Citation identifier (format: "web:1", "web:2", "web:3", etc.)title: Page title stringurl: Full URL stringcontent: Text snippet from pageCitation ID Format: Results include id field values like "web:1", "web:2", "web:3" that must be cited as [web:N] in responses Comet Assistant/System Prompt.txt60-63 Comet Assistant/System Prompt.txt982-985
Query Optimization Patterns:
| Query Type | Anti-Pattern | Recommended Pattern | Rationale |
|---|---|---|---|
| Simple fact | "What is the inflation rate in Canada?" | "inflation rate Canada" | Keyword-focused queries return better results |
| Multi-entity | "Compare prices of laptops and desktops" | ["laptop prices", "desktop prices"] | Separate queries per entity |
| Current info | "Tell me about recent elections" | "2025 elections results" | Specific temporal keywords |
| Location-specific | "What are restaurants near me?" | "restaurants [city name]" | Explicit location names |
Search Strategy Flow:
The tool prohibits using google.com for searches; search_web must be used instead Comet Assistant/System Prompt.txt41
Citation Requirements: Search results include id fields that must be cited using the format [web:N] immediately after the relevant statement Comet Assistant/System Prompt.txt60-63
Sources: Comet Assistant/tools.json125-142 Comet Assistant/System Prompt.txt39-41 Comet Assistant/System Prompt.txt969-996
The tabs_create tool creates new browser tabs for parallel task execution. Tabs are automatically grouped together Comet Assistant/System Prompt.txt651
Tool Call JSON Structure:
Parameters:
url (optional, string): Starting URL for new tab, defaults to about:blank Comet Assistant/tools.json148Return Value:
tabId (integer) for use with other toolsTab Context System:
After tool execution or user messages, tab context is provided in <system-reminder> tags when changed Comet Assistant/System Prompt.txt631-633:
The availableTabs array contains objects with three fields:
tabId (integer): Unique tab identifier for tool callstitle (string): Current page titleurl (string): Current page URLTab ID Requirement: Every browser interaction tool requires the tabId parameter Comet Assistant/System Prompt.txt635-641:
computer: {"action": "screenshot", "tabId": 1}
navigate: {"url": "https://example.com", "tabId": 1}
read_page: {"tabId": 1}
find: {"query": "search button", "tabId": 1}
get_page_text: {"tabId": 1}
form_input: {"ref": "ref_1", "value": "text", "tabId": 1}
Multi-Tab Usage Patterns:
Best Practices:
Sources: Comet Assistant/tools.json143-157 Comet Assistant/System Prompt.txt628-654
The todo_write tool creates and manages task lists to demonstrate progress and prevent task forgetting Comet Assistant/System Prompt.txt47 It provides user visibility into complex workflows Comet Assistant/System Prompt.txt45
Tool Call JSON Structure:
Field Specifications:
| Field | Type | Required | Valid Values | Description | Example |
|---|---|---|---|---|---|
todos | array | Yes | Array of todo objects | List of task items | [{...}, {...}] |
content | string | Yes | Imperative verb phrase | Task description in command form | "Run tests", "Build project" |
status | string | Yes | "pending", "in_progress", "completed" | Current task state | "in_progress" |
active_form | string | Yes | Present continuous verb phrase | Task description in -ing form | "Running tests", "Building project" |
Example Tool Call:
Usage Frequency Pattern:
Critical Requirements:
Status Transition State Machine:
Status Values:
"pending": Task not yet started"in_progress": Task currently executing"completed": Task finishedThe tool is particularly valuable for demonstrating thoroughness on complex tasks Comet Assistant/tools.json173 and helps demonstrate that Comet is making progress Comet Assistant/System Prompt.txt864
Sources: Comet Assistant/tools.json159-174 Comet Assistant/System Prompt.txt44-50 Comet Assistant/tools.json164-167
Tool effectiveness depends on proper sequencing and coordination. The following patterns represent recommended workflows for common scenarios.
Pattern 1: Navigate and Read
Pattern 2: Form Completion
Pattern 3: Web Research with Citation
Pattern 4: Element Interaction Decision Tree
Sources: Comet Assistant/tools.json204-229 Comet Assistant/System Prompt.txt22-42 Comet Assistant/tools.json226-229
Action Batching: Multiple actions should be combined in a single computer call when they form a clear sequence Comet Assistant/System Prompt.txt34:
| Anti-Pattern (Sequential Calls) | Optimized Pattern (Single Call) |
|---|---|
computer(action="left_click", ...) then computer(action="type", ...) | computer(actions=["left_click", "type"], ...) |
The specific case of clicking and typing should always be a single call Comet Assistant/System Prompt.txt35
Parallel Tab Utilization:
Combining multiple tabs enables more efficient task completion Comet Assistant/System Prompt.txt646 by working on different tasks in parallel Comet Assistant/tools.json154
Element Reference Invalidation:
Recovery Procedure:
Sources: Comet Assistant/tools.json176-195 Comet Assistant/System Prompt.txt34-37 Comet Assistant/System Prompt.txt936-960
The following matrix guides tool selection based on task requirements:
| Task Requirement | Primary Tool | Alternative Tool | Rationale |
|---|---|---|---|
| Load new page | navigate | - | Only tool for URL loading |
| See page visually | computer (screenshot) | - | Captures visual state |
| Read page text | get_page_text | read_page | get_page_text for plain content, read_page for structure |
| Find element ref | read_page | find | read_page preferred, find if element not in DOM tree output |
| Click element | computer (left_click) | - | Use with coordinates or ref |
| Fill form field | form_input | computer (type) | form_input preferred for accuracy |
| Search internet | search_web | Never use browser to search | search_web is built-in, more efficient |
| Create new workspace | tabs_create | - | Only tool for tab creation |
| Track progress | todo_write | - | Essential for complex tasks |
Tool Selection Flow:
Sources: Comet Assistant/tools.json1-230 Comet Assistant/System Prompt.txt22-42
Refresh this wiki