name: browser description: Control a Chrome session via Stagehand to browse, act, extract, and screenshot on demand inside the Factory CLI.

Skill: Browser

Name: Agent Skills CLI
Rating: 5 (50 reviews)
Author: Karanjot Singh

Use this skill when you need live browser automation during a Factory session—opening sites, clicking through flows, gathering structured data, or capturing screenshots.

Inputs

Target URL or task description (natural language)
Optional structured extraction schema (JSON field → type)

Behavior

Ensure Chrome is running via Stagehand with the local profile stored in .chrome-profile.
Support these commands:
- navigate <url>: open a page and capture a screenshot.
- act "<instruction>": perform natural-language actions.
- extract "<instruction>" '{"field":"type"}': return structured data.
- observe "<goal>": list suggested steps the agent can take.
- screenshot: capture the current viewport.
- close: shut down the session when finished.
Save screenshots to agent/browser_screenshots and report the file path in the response.
When tasks finish, summarize what happened plus any follow-up steps for the user.

Verification

If a navigation/action fails, include the error message and prompt the user for next steps.
Before ending the session, ensure close has been run so Chrome processes don’t linger.

Notes

This skill expects ANTHROPIC_API_KEY (for Stagehand) and, if used via droid exec, FACTORY_API_KEY to already be configured.
The working directory should remain inside the cloned skill folder so relative paths resolve correctly.

browser

Installation

Details

Usage

Skill Instructions