Implement copilot quality alt#38
Conversation
There was a problem hiding this comment.
⚠️ Not ready to approve
There are correctness and reliability issues in new code paths (notably link-context wording in prompts and loadImageAsDataUrl handling of data: URLs / timeouts) that should be addressed before merging.
Pull request overview
Adds a new opt-in, model-backed alt-text-quality rule to evaluate whether an image’s alt text actually describes the image (using GitHub Models, optionally enriched by an Azure AI Vision pre-pass), and extends image extraction to capture additional context for higher-quality judging.
Changes:
- Introduces the async, opt-in
alt-text-qualityrule plus a new judge layer (Copilot judge, optional Azure-augmented judge, caching). - Enhances
extractImagesandImageRecordwith intrinsic dimensions and page/section/prose context used by the model-backed rule. - Adds an offline probe harness + fixtures and updates docs/schema/tests to cover the new rule and async rule evaluation.
File summaries
| File | Description |
|---|---|
| tsconfig.json | Includes scripts/**/*.ts in typechecking. |
| tests/utils/helpers.ts | Adds ImageRecord defaults for new fields; adds sync-only rule evaluation helper. |
| tests/unit/vague-alt-text.test.ts | Updates typing to account for async-capable Rule.evaluate. |
| tests/unit/missing-alt-text.test.ts | Updates typing to account for async-capable Rule.evaluate. |
| tests/unit/judges-caching.test.ts | Adds unit tests for judge/vision caching behavior. |
| tests/unit/alt-text-quality.test.ts | Adds unit tests for the new alt-text-quality rule behavior. |
| tests/fixtures/alt-quality/cases.json | Adds offline probe fixture cases for expected verdicts. |
| tests/fixtures/alt-quality/cases-github.json | Adds GitHub-specific probe fixture cases. |
| tests/extract.test.ts | Adds tests for capturing pageTitle and sectionHeading. |
| tests/example-site.test.ts | Skips opt-in rules when asserting expected findings in the example site. |
| src/utils/load-image-data-url.ts | Adds helper to load images as data URLs for the rule and probe. |
| src/utils/fetch-with-retry.ts | Adds fetch wrapper with timeout and bounded retry for transient failures. |
| src/types.ts | Extends ImageRecord; makes Rule.evaluate async-capable. |
| src/rules/index.ts | Registers alt-text-quality in the append-only rules list. |
| src/rules/alt-text-quality.ts | Implements the async, opt-in model-backed rule orchestrator. |
| src/judges/types.ts | Defines judge interfaces/types shared across judge implementations. |
| src/judges/prompt.ts | Adds the system prompt and strict JSON schema for structured verdicts. |
| src/judges/index.ts | Adds createJudge() with mode selection and wiring (copilot vs azure-augmented + caching). |
| src/judges/copilot-judge.ts | Implements the GitHub Models-backed judge call. |
| src/judges/caching.ts | Adds content-hash caches for verdicts and Azure vision analysis. |
| src/judges/azure-vision-api-client.ts | Implements Azure AI Vision Image Analysis client. |
| src/judges/azure-augmented-judge.ts | Decorates the Copilot judge with optional Azure pre-analysis context. |
| src/extract.ts | Enhances extraction with nearby text, figcaption, link/button context, page title, and section heading. |
| scripts/probe-alt-quality.ts | Adds offline probe CLI for scoring fixture cases against the judge. |
| schema/config.schema.json | Adds alt-text-quality to config schema and updates schema description. |
| README.md | Documents the new opt-in rule, configuration, and required secrets. |
| package.json | Adds probe script and tsx dev dependency. |
| package-lock.json | Locks new dependencies (tsx, esbuild, etc.). |
| index.ts | Awaits rule evaluation to support async rules. |
| .gitignore | Ignores .env (used by the probe). |
| .github/workflows/scan-static-sites.yml | Adds job-level env wiring for model/vision credentials. |
Copilot's findings
- Files reviewed: 28/31 changed files
- Comments generated: 4
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
⚠️ Not ready to approve
It introduces a model call pathway that currently forwards raw outerHTML into prompts (risking sensitive URL/query leakage) and includes a few correctness/perf/test-safety issues that should be addressed before approval.
Copilot's findings
- Files reviewed: 29/32 changed files
- Comments generated: 6
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
There was a problem hiding this comment.
⚠️ Not ready to approve
The model-backed rule currently risks leaking sensitive URL data (e.g., link hrefs and logged image URLs with querystrings) into model context and CI logs, which should be redacted before merging.
Copilot's findings
- Files reviewed: 29/32 changed files
- Comments generated: 5
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
| export async function fetchWithRetry( | ||
| url: string | URL, | ||
| init: RequestInit, | ||
| options: FetchRetryOptions = {}, | ||
| ): Promise<Response> { |
There was a problem hiding this comment.
+1, this one's worth doing before merge, it's in the path of every model and image call.
taarikashenafi
left a comment
There was a problem hiding this comment.
btw on doing the other stretch goals in this branch, might be easier to get this one in first and then branch the next thing off main? this one's already pretty big and the rules registry seems set up for adding one rule at a time, so keeping them split might make each easier to review since Joyce hasn't' started on this yet.
Yeah for sure, good call. |
| console.log(`Cases: ${casesPath}`) | ||
| console.log(`Total: ${cases.length}`) | ||
| if (minIntervalMs > 0) console.log(`Pacing: ${minIntervalMs}ms minimum between cases`) | ||
| console.log('') |
There was a problem hiding this comment.
actually, is this even affecting output?
There was a problem hiding this comment.
By default minIntervalMs is set to zero, so it wouldn't print that pacing log (if that is what you were referring to). But, when minIntervalMs is set, I think it would be helpful to know its value.
Redact query/fragment from URLs sent to model and logs
Co-authored-by: Joyce Zhu <joycezhu@github.com>
| export async function fetchWithRetry( | ||
| url: string | URL, | ||
| init: RequestInit, | ||
| options: FetchRetryOptions = {}, | ||
| ): Promise<Response> { |
| export async function fetchWithRetry( | ||
| url: string | URL, | ||
| init: RequestInit, | ||
| options: FetchRetryOptions = {}, | ||
| ): Promise<Response> { | ||
| const timeoutMs = options.timeoutMs ?? DEFAULT_TIMEOUT_MS | ||
| const maxRetries = options.maxRetries ?? DEFAULT_MAX_RETRIES | ||
| const baseDelayMs = options.baseDelayMs ?? DEFAULT_BASE_DELAY_MS | ||
|
|
||
| let lastError: unknown | ||
| for (let attempt = 0; attempt <= maxRetries; attempt++) { | ||
| const controller = new AbortController() | ||
| const timer = setTimeout(() => controller.abort(), timeoutMs) | ||
| try { | ||
| const res = await fetch(url, {...init, signal: controller.signal}) | ||
| if ((res.status === 429 || res.status >= 500) && attempt < maxRetries) { | ||
| // Discard the body so the connection can be reused, then retry. | ||
| await res.body?.cancel() | ||
| await sleep(retryDelayMs(res, attempt, baseDelayMs)) | ||
| continue | ||
| } | ||
| return res | ||
| } catch (err) { | ||
| lastError = err | ||
| if (attempt >= maxRetries) break | ||
| await sleep(backoffMs(attempt, baseDelayMs)) | ||
| } finally { | ||
| clearTimeout(timer) | ||
| } | ||
| } | ||
|
|
||
| throw lastError instanceof Error ? lastError : new Error(`fetch failed after ${maxRetries + 1} attempts`) | ||
| } |
| if (image.ariaLabel) parts.push(`aria-label="${image.ariaLabel}" is present on the image.`) | ||
| if (image.ariaLabelledBy) parts.push(`aria-labelledby="${image.ariaLabelledBy}" is present on the image.`) |
| const stripped = outerHTML | ||
| .replace(/\s+src\s*=\s*("[^"]*"|'[^']*')/gi, ' src="(omitted)"') | ||
| .replace(/\s+srcset\s*=\s*("[^"]*"|'[^']*')/gi, ' srcset="(omitted)"') |
Closes github/accessibility#10635
Adds an opt-in, model-backed
alt-text-qualityrule that judges whether an image's alt text actually describes the image — a gap the deterministic rules can't reach.Design & rationale: ADR