How to Build a General-Purpose Accessibility Agent for Your Codebase

Introduction

Creating an AI-powered accessibility agent is a practical way to embed inclusive design into your development workflow. This guide walks you through building a general-purpose accessibility agent, based on the principles and lessons learned from GitHub's experimental pilot. The agent serves two main functions: providing just-in-time answers to accessibility questions, and automatically catching and fixing simple, objective issues before they reach production. By the end, you'll have a clear roadmap for augmenting your team's accessibility efforts, along with realistic expectations: the agent is not a silver bullet.

Source: github.blog

What You Need

Access to a large language model (LLM) or API that supports agentic workflows (e.g., GPT-4, Claude, or GitHub Copilot models)
Basic understanding of LLM agents, tool calling, and multi-step reasoning
Version control system (e.g., GitHub) with pull request and code review capabilities
Integration points: command-line interface (CLI) or editor extension (e.g., VS Code)
Automated CI/CD pipeline that can trigger agent actions on front-end code changes
Labeled dataset of common accessibility issues (structure, names, announcements, alt text, focus order) for training/fine-tuning
Team buy-in and a clear scope of responsibility

Step-by-Step Guide

Step 1: Define Your Agent’s Core Goals

Start by deciding what your accessibility agent will do. The original pilot had two primary goals:

Provide on-demand accessibility answers – Integrate with developer tools so engineers can ask questions and get reliable, just-in-time responses (e.g., “How should I label this button?”)
Automatically remediate simple issues – Scan pull requests that modify front-end code for objective accessibility errors and fix them before shipping.

Write your own goal statements. Keep them focused – don't try to solve everything at once. A narrow scope speeds up launch and builds trust.

Step 2: Choose Integration Points

To be genuinely useful, the agent must live where developers work. The pilot used two surfaces:

GitHub Copilot CLI – For answering accessibility questions in the terminal.
Copilot VS Code integration – For in-editor suggestions and real-time feedback.

Identify your primary development environment and pick one or two integration channels. For example, you might use a GitHub Actions bot that comments on PRs, or a Slack command that triggers scans.

Step 3: Set Up Automated PR Review

For the remediation goal, the agent must automatically evaluate every pull request that changes front-end code. Configure your CI pipeline to trigger the agent on pull request events. The agent should:

Check for modifications to HTML, JSX, CSS, or other UI-related files.
Run a series of accessibility checks (structure, names, announcements, alt text, focus order).
If it finds a fixable issue, apply the remediation and commit it directly to the PR branch.

GitHub’s own agent reviewed 3,535 pull requests with a 68% resolution rate. Treat that figure as a reference point when you set your own targets, not as a guarantee.
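
The first check in the list above (has this pull request touched UI-related files?) can be sketched in a few lines of Python. This is a minimal illustration, not the pilot's implementation: the extension set and the function names `is_frontend_file` and `frontend_changes` are assumptions, and the list of changed paths is expected to come from your CI system.

```python
from pathlib import PurePosixPath

# Assumed set of UI-related extensions; adjust it to match your codebase.
FRONTEND_EXTENSIONS = {".html", ".htm", ".jsx", ".tsx", ".css", ".scss", ".vue"}


def is_frontend_file(path: str) -> bool:
    """Return True if the path has a UI-related file extension."""
    return PurePosixPath(path).suffix.lower() in FRONTEND_EXTENSIONS


def frontend_changes(changed_files: list[str]) -> list[str]:
    """Keep only the changed files the agent should review."""
    return [p for p in changed_files if is_frontend_file(p)]


if __name__ == "__main__":
    # Hypothetical list of paths changed in a pull request.
    changed = ["src/Button.jsx", "README.md", "styles/site.CSS", "api/server.py"]
    to_review = frontend_changes(changed)
    if to_review:
        print("Trigger the agent for:", to_review)
    else:
        print("No front-end changes; skip the agent.")
```

Filtering before the agent runs keeps it off pull requests it has nothing to say about, which saves model calls and avoids noisy comments.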

Step 4: Identify and Categorize Issue Types

Based on your codebase’s history, list the most frequent accessibility issues. The pilot found these top five (in order):

Making structure and relationships clear to assistive technologies (e.g., missing landmarks, improper heading hierarchy)
Providing clear and concise names for interactive controls (e.g., missing aria-label attributes or empty button text)
Ensuring users are aware of important announcements (e.g., live region updates)
Ensuring text alternatives for non-text content (e.g., missing alt attributes)
Moving keyboard focus through pages in a logical order (e.g., tabindex misuse)

Map each issue to a set of rules the agent can check programmatically. For trickier decisions, let the agent flag the issue for human review.
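
As one way to picture the mapping from issue types to programmatic rules, the sketch below scans static HTML with Python's standard `html.parser` module. It is an illustration under stated assumptions: the rule identifiers (`img-missing-alt`, `positive-tabindex`), the `Finding` type, and the choice of which rule is auto-fixable are all hypothetical, and real JSX or templated markup would need a different parser.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Finding:
    rule: str           # hypothetical rule identifier
    line: int           # 1-based line where the offending tag starts
    auto_fixable: bool  # False means "flag for human review"


def _is_positive_int(value) -> bool:
    """True if value parses as an integer greater than zero."""
    try:
        return int(value) > 0
    except (TypeError, ValueError):
        return False


class A11yScanner(HTMLParser):
    """Collects findings for two illustrative rules."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[Finding] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        line, _ = self.getpos()
        # Text alternatives: <img> with no alt attribute at all.
        # alt="" is allowed because it marks an image as decorative.
        if tag == "img" and "alt" not in attr_map:
            self.findings.append(Finding("img-missing-alt", line, True))
        # Focus order: a positive tabindex overrides the natural tab order.
        # Assumed here to need a human decision rather than an automatic fix.
        if _is_positive_int(attr_map.get("tabindex")):
            self.findings.append(Finding("positive-tabindex", line, False))


def scan(markup: str) -> list[Finding]:
    """Run all rules over a string of HTML and return the findings."""
    scanner = A11yScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner.findings


if __name__ == "__main__":
    sample = '<img src="logo.png">\n<div tabindex="3">Menu</div>'
    for finding in scan(sample):
        action = "auto-fix" if finding.auto_fixable else "flag for review"
        print(f"line {finding.line}: {finding.rule} -> {action}")
```

Carrying an `auto_fixable` flag on every finding keeps the "fix it" and "ask a human" paths explicit, so a new rule has to declare which path it takes.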

Step 5: Build the Agent Architecture

Design a modular system:

LLM core – A general-purpose model capable of understanding code context and accessibility guidelines.
Tool layer – Functions the agent can call: scan HTML, query a11y rules, generate fixes, commit changes.
Memory / context – Keep track of recent PRs, common patterns, and already-applied fixes to avoid duplication.

Keep in mind that multi-agent workflows add coordination overhead and more ways to fail. Start with a single agent and only split tasks if performance demands it. Use the agents.md pattern to document the agent's behavior.
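
A single-agent skeleton for the tool layer and memory described above might look like the following. This is a rough sketch with hypothetical names (`Agent`, `register`, `call`, `record_fix`); the LLM core is deliberately left out, since how the model requests a tool call depends on the provider you choose.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Named tools plus a memory of fixes that were already applied."""

    tools: dict[str, Callable[..., object]] = field(default_factory=dict)
    applied_fixes: set[tuple[str, str, int]] = field(default_factory=set)

    def register(self, name: str, fn: Callable[..., object]) -> None:
        """Expose a function to the agent under a tool name."""
        self.tools[name] = fn

    def call(self, name: str, **kwargs) -> object:
        """Dispatch a tool call that the model asked for."""
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)

    def record_fix(self, path: str, rule: str, line: int) -> bool:
        """Return True the first time a fix is seen, False for a duplicate."""
        key = (path, rule, line)
        if key in self.applied_fixes:
            return False
        self.applied_fixes.add(key)
        return True


if __name__ == "__main__":
    agent = Agent()
    # Hypothetical tool: count <img> tags in a snippet of markup.
    agent.register("count_images", lambda html: html.count("<img"))
    print(agent.call("count_images", html="<img><p><img>"))
    print(agent.record_fix("index.html", "img-missing-alt", 12))
    print(agent.record_fix("index.html", "img-missing-alt", 12))
```

Keeping tools in a plain dictionary makes it easy to list exactly what the agent can do, which is also what an agents.md file would describe.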

Step 6: Adopt the Right Mindset

This is a critical, human-focused step. The social model of disability teaches that barriers are created by the environment, not by individuals. Your agent is not a silver bullet: it augments human effort rather than replacing it. Communicate this to your team:

The agent catches objective, low-hanging issues, freeing developers to focus on nuanced, context-dependent problems.
It respects the boundaries of what can be automatically fixed (e.g., it won’t redesign a page).
Transparency about the agent’s limitations builds trust and encourages more robust human oversight.

When the scope is clear, your team will embrace the agent as a helpful teammate rather than an imposing audit tool.

Step 7: Launch, Monitor, and Iterate

Start with a small subset of repositories or issue types. Track metrics:

Number of PRs reviewed
Resolution rate (issues fixed as a share of issues found)
False positive/negative rate
Developer satisfaction (survey)

Use the feedback to refine your detection rules and LLM prompts. The pilot achieved a 68% resolution rate. To improve your own numbers, iterate on the context you supply to the agent and on its rule set.
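
To make the metrics concrete, here is a small sketch of how they could be computed. The definitions are assumptions (resolution rate as fixed over found, false positive rate as rejected findings over found), the `ReviewStats` type is hypothetical, and the numbers in the example are invented for illustration rather than taken from the pilot.

```python
from dataclasses import dataclass


@dataclass
class ReviewStats:
    prs_reviewed: int
    issues_found: int
    issues_fixed: int
    false_positives: int  # findings that developers rejected as wrong

    def resolution_rate(self) -> float:
        """Share of found issues that were fixed (assumed definition)."""
        if self.issues_found == 0:
            return 0.0
        return self.issues_fixed / self.issues_found

    def false_positive_rate(self) -> float:
        """Share of found issues that developers rejected."""
        if self.issues_found == 0:
            return 0.0
        return self.false_positives / self.issues_found


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not pilot data.
    stats = ReviewStats(prs_reviewed=20, issues_found=50,
                        issues_fixed=34, false_positives=5)
    print(f"resolution rate: {stats.resolution_rate():.0%}")
    print(f"false positive rate: {stats.false_positive_rate():.0%}")
```

Whatever definitions you settle on, write them down next to the numbers so that month-over-month comparisons measure the same thing.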

Tips for Success

Start small, scale later. Focus on one integration (e.g., PR review) before adding CLI or editor support.
Use real remediation data. The top five issues from your own pull requests give you a training set that is representative of your codebase.
Document agent behavior. Maintain an agents.md file that describes how the agent works, when it triggers, and what it can/cannot fix. This aligns team expectations and aids debugging.
Combine automated and manual review. Let the agent fix obvious errors (missing alt text, incorrect ARIA roles) but always flag subjective ones (contrast decisions, logical order) for a human.
Celebrate wins. When the agent removes barriers for assistive technology users, share that impact with the team. It reinforces the value of accessibility.
Stay current. LLMs and accessibility guidelines evolve. Revisit your agent’s prompts and rules every quarter.