Browser Integration
Antigravity's Chrome extension lets agents autonomously interact with web applications — navigating, clicking, filling forms, reading DOM state, and taking screenshots — without you touching the browser. This module covers installation, capabilities, automation patterns, visual testing, web scraping, and screenshot analysis in depth.
Why Browser Integration Matters
In traditional development, the write-test-verify cycle for UI changes looks like this: write code in the editor, switch to the browser, refresh the page, manually test the change, switch back to the editor, fix issues, repeat. This context switching is slow, error-prone, and tedious.
Antigravity's browser agent collapses this entire cycle into a single autonomous operation. The agent writes code, starts the dev server (if needed), opens Chrome, navigates to the correct page, interacts with the UI, reads the result, and captures a screenshot — all without your involvement. The verification loop that normally requires manual effort becomes part of the agent's automated workflow.
While browser verification is the primary use case, the browser agent is also valuable for web scraping (reading documentation, pulling reference data), visual regression testing (comparing screenshots before and after changes), and debugging UI issues (reading console errors, inspecting network requests). It is a general-purpose browser automation tool controlled by natural language.
Setup and Installation
Install Once, Use Always
Install the Antigravity Chrome extension from the Chrome Web Store. It connects automatically to any running Antigravity session. No per-project configuration needed — it works across all your projects.
The extension requires permissions to read and interact with web pages. This is necessary for form filling, clicking, DOM reading, and screenshot capture. The extension communicates with Antigravity over a local WebSocket connection — no data leaves your machine unless the agent navigates to an external URL.
Connection Verification
After installation, verify the connection: in Antigravity, go to Settings → Browser. You should see "Chrome extension connected" with a green indicator. If the indicator is red, restart Chrome, then restart Antigravity. The extension auto-connects when both are running on the same machine.
Chrome Profile Isolation
The browser agent operates in your default Chrome profile. If you want isolation (recommended for testing), create a dedicated Chrome profile: Chrome → Profile icon → Add Profile → "Antigravity Testing." Install the extension in that profile. This prevents the agent from accessing your logged-in sessions, bookmarks, or saved passwords.
Headless vs Visible Mode
By default, the browser agent runs in visible mode, so you can watch Chrome as the agent interacts with it. To speed up execution, enable headless mode in Settings → Browser → "Run browser agent headless." In headless mode you cannot watch the agent work in real time, but screenshots are still captured.
How the Browser Sub-Agent Works
The browser agent is a specialised sub-agent within Antigravity's architecture. Here is how it operates:
Task includes browser instruction
When your task description includes browser-related instructions (e.g., "test the form in the browser," "verify the page loads correctly," "take a screenshot of the checkout page"), Antigravity automatically activates the browser sub-agent. You do not need to explicitly request it — the agent detects browser intent from natural language.
Chrome session initialises
The browser sub-agent opens a Chrome tab (or reuses an existing one) via the Antigravity extension. It navigates to the specified URL, typically localhost:3000 or whichever address your dev server uses. If your dev server is not running, the agent can start it via a terminal command first.
DOM analysis
Before interacting with the page, the agent reads the full DOM structure. It identifies interactive elements (buttons, inputs, links, checkboxes), their labels, their states (enabled/disabled, checked/unchecked), and their positions. This analysis is what allows the agent to "see" the page and understand what to click or fill.
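The idea of the analysis step can be illustrated with a minimal sketch. This is not Antigravity's implementation: it assumes static HTML as input and uses Python's standard `html.parser` to list interactive elements with their label and enabled/checked state. The names `InteractiveElement`, `InteractiveElementScanner`, and `scan` are hypothetical, and element positions are left out because static HTML carries no layout information.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not an exhaustive list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


@dataclass
class InteractiveElement:
    tag: str
    label: str     # best-effort label: aria-label, then name, then id
    enabled: bool  # False when the "disabled" attribute is present
    checked: bool  # True when the "checked" attribute is present


class InteractiveElementScanner(HTMLParser):
    """Collects interactive elements and their basic states from HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: list[InteractiveElement] = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr = dict(attrs)  # boolean attributes appear with a value of None
        label = attr.get("aria-label") or attr.get("name") or attr.get("id") or ""
        self.elements.append(
            InteractiveElement(
                tag=tag,
                label=label,
                enabled="disabled" not in attr,
                checked="checked" in attr,
            )
        )


def scan(html: str) -> list[InteractiveElement]:
    """Return every interactive element found in the given HTML string."""
    scanner = InteractiveElementScanner()
    scanner.feed(html)
    return scanner.elements
```

Calling `scan` on a form's HTML returns one record per input, button, or link, which is roughly the inventory an agent needs before deciding what to click or fill.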
Action execution
The agent executes actions sequentially: navigate, click, type, select, scroll. After each action, it re-reads the DOM to verify the expected result. If an action fails (element not found, page error), the agent logs the failure and either retries or reports it. Each action is logged in the browser agent's timeline.
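The act, verify, then retry-or-report loop described above can be sketched as follows. This is an illustration under stated assumptions, not Antigravity's code: `Action`, `Timeline`, and `run_actions` are hypothetical names, and the limit of two attempts is an arbitrary choice for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    name: str
    perform: Callable[[], None]  # does the click, type, navigate, ...
    verify: Callable[[], bool]   # re-reads state and checks the expected result


@dataclass
class Timeline:
    entries: list[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        self.entries.append(message)


def run_actions(actions: list[Action], timeline: Timeline, max_attempts: int = 2) -> bool:
    """Run actions in order; stop and report on the first one that keeps failing."""
    for action in actions:
        for attempt in range(1, max_attempts + 1):
            try:
                action.perform()
                ok = action.verify()
                detail = "verification failed"
            except Exception as exc:  # e.g. element not found
                ok = False
                detail = f"raised {exc!r}"
            if ok:
                timeline.log(f"{action.name}: ok (attempt {attempt})")
                break
            timeline.log(f"{action.name}: attempt {attempt} {detail}")
        else:
            # The inner loop finished without a successful attempt.
            timeline.log(f"{action.name}: failed after {max_attempts} attempts")
            return False
    return True
```

Keeping `perform` and `verify` separate mirrors the description: the action itself never decides whether it succeeded, only the follow-up read of the page state does.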
Screenshot capture
After every significant event (a form submission, a page navigation, an error state), the agent captures a full-page screenshot. These screenshots are saved as artifacts in the Manager Surface. You can review them to verify the agent saw what you expected.
Result reporting
The browser agent reports its findings back to the parent agent (or directly to you). The report includes: actions taken, DOM state observed, any errors found, and screenshots. If the browser test was part of a larger code task, the parent agent can use this feedback to iterate — for example, fixing a CSS issue discovered during browser testing.
Chromium Automation Capabilities
Full DOM Access
Agents can read any element on the page — text content, input values, computed CSS styles, element attributes, ARIA labels, and data attributes. They use standard CSS selectors and XPath to locate elements. The agent can also read the full HTML source of any element for inspection.
Form Interaction
Agents fill text inputs, select dropdown options, tick/untick checkboxes, select radio buttons, click buttons, and handle file uploads. They can complete multi-step forms end-to-end: login forms, multi-page wizards, checkout flows. For file uploads, the agent can generate test files on the fly.
Network Monitoring
The browser agent can monitor network requests and responses. It detects 4xx and 5xx errors, reads response bodies, checks request headers, and measures response times. This is invaluable for API integration testing — the agent can fill a form, submit it, and verify the correct API request was made with the expected payload.
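To make the kind of check concrete, here is a small sketch that scans a list of recorded requests and flags 4xx errors, 5xx errors, and slow responses. The `NetworkEntry` shape, the `find_problems` function, and the 1000 ms "slow" threshold are assumptions made for this example; the document does not specify how Antigravity represents network data.

```python
from dataclasses import dataclass


@dataclass
class NetworkEntry:
    method: str
    url: str
    status: int
    duration_ms: float


def find_problems(entries: list[NetworkEntry], slow_ms: float = 1000.0) -> list[str]:
    """Flag client errors (4xx), server errors (5xx), and slow responses."""
    problems = []
    for e in entries:
        if 400 <= e.status < 500:
            problems.append(f"client error {e.status}: {e.method} {e.url}")
        elif 500 <= e.status < 600:
            problems.append(f"server error {e.status}: {e.method} {e.url}")
        if e.duration_ms > slow_ms:
            problems.append(f"slow response ({e.duration_ms:.0f} ms): {e.method} {e.url}")
    return problems
```

A request can be reported twice if it both fails and is slow, since those are separate findings.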
Console Log Reading
The agent reads the browser console output: JavaScript errors, warnings, log statements. If your code throws a runtime error, the agent sees it. This is especially useful for detecting issues that only appear at runtime (type errors, undefined variables, failed API calls) and that static analysis can miss.
[Figure: agent testing the checkout form]
What Browser Agents Can Do
| Action | Example | Detail |
|---|---|---|
| Navigate | "Go to the product page for SKU #4821" | Supports absolute URLs, relative paths, and query parameters. Agent waits for page load before proceeding. |
| Click | "Click the Add to Cart button" | Locates elements by text content, CSS class, ID, ARIA label, or XPath. Waits for element to be visible and clickable. |
| Fill forms | "Fill the shipping form with test data" | Generates realistic test data (names, addresses, emails) or uses specific values you provide. Handles text inputs, textareas, and contenteditable elements. |
| Select dropdowns | "Select 'Express Shipping' from the dropdown" | Works with native HTML select elements and custom dropdown components. Searches by visible text or value attribute. |
| Read DOM | "What is the total shown in the order summary?" | Reads text content, input values, computed styles, and attributes from any element on the page. |
| Verify state | "Confirm the gift wrap checkbox is checked and £2.50 added" | Asserts element states (checked, disabled, visible, hidden) and text content. Reports pass/fail with details. |
| Screenshot | Automatic after every significant step | Full-page screenshots saved as PNG artifacts. Named sequentially: step_01_navigate.png, step_02_fill_form.png, etc. |
| Multi-step flows | "Complete a full checkout as a guest user" | Chains multiple actions into a single flow: navigate → fill → click → verify → screenshot. Agent handles page transitions and loading states. |
| Error detection | "Report any console errors or network 4xx/5xx responses" | Monitors browser console and network tab. Reports JavaScript errors, failed API calls, and slow responses. |
| Wait for element | "Wait for the loading spinner to disappear" | Agent waits for elements to appear, disappear, or change state. Configurable timeout (default 30 seconds). |
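The "Wait for element" row describes polling with a timeout. A minimal sketch of that pattern is below; `wait_for` and its `interval` parameter are hypothetical, and only the 30-second default comes from the table above.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout: float = 30.0, interval: float = 0.25) -> bool:
    """Poll condition() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The condition is any zero-argument callable, so the same helper covers "element appeared", "spinner disappeared", and "state changed". Using `time.monotonic()` keeps the deadline unaffected by system clock adjustments.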
Visual Testing Patterns
The browser agent enables several visual testing patterns that are difficult or impossible with unit tests alone:
Before/After Screenshots
Before making a CSS or layout change, dispatch an agent to take a screenshot of the current state. Then make the change (or have another agent make it). Dispatch a second browser agent to take an "after" screenshot. Compare the two screenshots in the Artifacts panel to verify the visual change is correct.
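If you want a quick numeric check alongside the visual comparison, a pixel diff is one option. The sketch below is a swapped-in technique, not something the Artifacts panel is documented to do: it assumes both screenshots have already been decoded into same-sized grids of RGB tuples (for real PNG files you would need an image library to do that), and `diff_region` is a hypothetical helper.

```python
Pixel = tuple[int, int, int]


def diff_region(before: list[list[Pixel]], after: list[list[Pixel]]):
    """Compare two same-sized pixel grids.

    Returns (changed_fraction, bounding_box), where bounding_box is
    (left, top, right, bottom) with inclusive coordinates, or None if
    the grids are identical.
    """
    if len(before) != len(after) or any(len(b) != len(a) for b, a in zip(before, after)):
        raise ValueError("screenshots must have the same dimensions")
    changed = [
        (x, y)
        for y, (row_before, row_after) in enumerate(zip(before, after))
        for x, (pb, pa) in enumerate(zip(row_before, row_after))
        if pb != pa
    ]
    if not changed:
        return 0.0, None
    total = sum(len(row) for row in before)
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return len(changed) / total, (min(xs), min(ys), max(xs), max(ys))
```

The bounding box tells you where on the page to look, which helps when a change was only meant to touch one component.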
Responsive Testing
Instruct the browser agent to resize the viewport and take screenshots at key breakpoints: "Take screenshots of the checkout page at 1440px, 1024px, 768px, and 375px widths." The agent resizes the Chrome window and captures each state. Review all four screenshots to catch responsive layout issues.
Error State Verification
Test error handling visually: "Navigate to the checkout page, leave the email field empty, and click Submit. Take a screenshot showing the validation error." The agent triggers the error state and captures it. You verify that error messages appear correctly and are user-friendly.
Full User Journey
Test an entire user flow end-to-end: "As a new user, sign up, add a product to the cart, go to checkout, fill the form, submit the order, and verify the confirmation page." The agent walks through every step, capturing screenshots at each stage. The result is a complete visual record of the user journey.
Web Scraping with Browser Agents
The browser agent can also scrape web content — useful for pulling reference data, reading documentation, or comparing your app against a design spec:
| Use Case | Task Description Example | Output |
|---|---|---|
| Read API docs | "Navigate to docs.stripe.com/api/charges and extract the list of required fields for creating a charge" | Structured list of field names and types |
| Compare with design | "Navigate to our Figma embed at [URL] and compare the button styles with what we have on localhost:3000/checkout" | Comparison report with screenshots |
| Pull reference data | "Navigate to [competitor URL] and list all the product categories they show in their navigation" | Text list of categories |
| Check link health | "Navigate to every link in the footer on localhost:3000 and report any that return 404" | List of broken links with status codes |
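The "Check link health" row can be approximated in plain code, which is useful when you want a repeatable version of the same check. This sketch assumes the footer is a `<footer>` element and takes the status lookup as a parameter so it can run without a network; `FooterLinkParser`, `broken_footer_links`, and `get_status` are hypothetical names. It reports any status of 400 or above, not only 404.

```python
from html.parser import HTMLParser
from typing import Callable
from urllib.parse import urljoin


class FooterLinkParser(HTMLParser):
    """Collects href values of <a> tags that appear inside <footer>."""

    def __init__(self) -> None:
        super().__init__()
        self.footer_depth = 0
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "footer":
            self.footer_depth += 1
        elif tag == "a" and self.footer_depth:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "footer" and self.footer_depth:
            self.footer_depth -= 1


def broken_footer_links(
    html: str, base_url: str, get_status: Callable[[str], int]
) -> list[tuple[str, int]]:
    """Return (url, status) for every footer link whose status is 400 or above."""
    parser = FooterLinkParser()
    parser.feed(html)
    broken = []
    for href in parser.links:
        url = urljoin(base_url, href)  # resolve relative paths against the page URL
        status = get_status(url)
        if status >= 400:
            broken.append((url, status))
    return broken
```

In real use, `get_status` might wrap `urllib.request.urlopen` and read the code from the `HTTPError` it raises for 4xx and 5xx responses.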
Screenshot Analysis
Antigravity does not just capture screenshots — the agent can analyse them. Because the underlying AI model is multimodal (it can "see" images), the agent can reason about visual content:
Visual verification
The agent captures a screenshot and analyses it: "Does the checkout page show the gift wrap option below the shipping form?" It reads the visual layout and confirms or denies. This catches rendering issues that DOM structure alone can miss: an element might exist in the DOM but be invisible because of display: none or opacity: 0, or because another element covers it.
Layout comparison
Given two screenshots (before and after a change), the agent can describe the visual differences: "The button moved from the right side to the centre. The font size of the heading increased. The background colour changed from white to light grey." This is useful for reviewing visual changes without opening the browser yourself.
Accessibility observations
The agent can flag potential accessibility issues from screenshots: "The grey text on the light background appears to have low contrast. The button text is very small." While not a substitute for proper accessibility testing tools, this catches obvious issues early.
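When the agent flags a suspected contrast problem, you can confirm it numerically. The sketch below computes the WCAG contrast ratio from two hex colours using the standard relative-luminance formula; the function names are hypothetical, and the colour values would have to be read from your CSS, since the agent's screenshot observation is only an estimate. WCAG AA asks for at least 4.5:1 for normal-size text.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical colours) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)
```

For example, `contrast_ratio("#aaaaaa", "#ffffff")` falls well below 4.5, which matches the "grey text on a light background" case described above.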
The browser agent works with any web app that runs in Chrome — React, Vue, Next.js, Angular, Svelte, plain HTML, Django templates, Rails views, PHP pages. It does not care about the stack. As long as it runs on a URL, the agent can interact with it. Single-page apps, server-rendered pages, and static sites all work equally well.
Security Considerations
Browser agents can click buttons, submit forms, and trigger real actions. Always point them at localhost or a staging environment. Never give a browser agent a production URL — it could accidentally submit orders, delete data, or mutate state. Create a dedicated Chrome profile with no saved passwords or sessions to prevent the agent from accessing your production accounts.
| Risk | Mitigation |
|---|---|
| Agent submits real orders on production | Always use localhost or staging URLs. Never provide production URLs in task descriptions. |
| Agent accesses saved passwords/sessions | Use a dedicated Chrome profile with no saved credentials for Antigravity. |
| Agent navigates to malicious URLs | Review the agent's plan before dispatching. The agent only navigates to URLs you specify or that it finds in your codebase. |
| Screenshots contain sensitive data | Use test data, not real customer data. Screenshots are stored locally in the Antigravity workspace. |
| Extension permissions too broad | The extension only activates when Antigravity dispatches a browser task. It does not monitor your browsing activity. |
Combining Code Changes with Browser Verification
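The most powerful pattern is a single agent task that writes code and verifies it in the browser. A task of this kind might read, for example: "Add a gift wrap checkbox to the checkout form, then open the checkout page in the browser, tick the checkbox, and take screenshots before and after ticking it."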
This single task produces: code changes (the new form field), unit tests (if requested), and visual evidence (two screenshots proving it works). The agent's diff includes the code, and the artifacts include the screenshots. You review everything in one place.
Browser Agent Limitations
The browser agent is powerful but has clear boundaries. Understanding these prevents frustration:
| Limitation | Detail | Workaround |
|---|---|---|
| Single browser only | Only Chrome is supported. No Firefox, Safari, or Edge. | Use Chrome for agent testing. Run cross-browser testing separately with tools like Playwright or BrowserStack. |
| No file download verification | The agent cannot verify that a file download completed correctly. | Write a unit test for the download endpoint instead of relying on browser verification. |
| Authentication complexity | OAuth flows, multi-factor auth, and CAPTCHA cannot be automated. | Pre-authenticate in the test Chrome profile, or use test accounts with simplified auth. |
| Canvas and WebGL | The agent cannot interact with canvas elements or WebGL content meaningfully. | Use screenshots for visual verification. Interactive canvas testing requires specialised tools. |
| Iframes and popups | Cross-origin iframes are not accessible. Popup windows have limited support. | Test iframe content separately. For popups, configure your app to use inline modals instead during testing. |
| Speed | Browser operations are slower than unit tests (seconds vs milliseconds). | Use browser testing selectively — for UI verification, not as a replacement for unit tests. |
Browser Agent vs Traditional E2E Testing
How does Antigravity's browser agent compare to dedicated end-to-end testing tools like Playwright, Cypress, or Selenium?
| Aspect | Antigravity Browser Agent | Playwright / Cypress |
|---|---|---|
| Setup | Install Chrome extension, describe test in natural language | Write test scripts in JavaScript/TypeScript, configure test runner |
| Maintenance | No scripts to maintain — describe the test each time | Scripts require updates when UI changes |
| Repeatability | Non-deterministic: the agent may take slightly different paths | Scripted: the same script runs the same steps on every execution |
| CI/CD integration | Not suitable for CI pipelines (requires running IDE) | Designed for CI pipelines |
| Best for | Exploratory testing, quick verification during development | Regression testing, CI/CD gates, production monitoring |
| Cross-browser | Chrome only | Chromium-based browsers (Chrome, Edge), Firefox, and WebKit (Safari's engine; experimental in Cypress) |
The browser agent is best for quick, exploratory verification during development — "does this change look right?" Traditional E2E tools (Playwright, Cypress) are best for repeatable regression tests in CI/CD. Use both: the browser agent during development, Playwright for your test suite. You can even dispatch an agent to write Playwright test scripts based on the browser agent's exploratory test results.