Skip to content

Dev#5410

Open
Sg312 wants to merge 11 commits into
stagingfrom
dev
Open

Dev#5410
Sg312 wants to merge 11 commits into
stagingfrom
dev

Conversation

@Sg312

@Sg312 Sg312 commented Jul 4, 2026

Copy link
Copy Markdown
Collaborator

Summary

Brief description of what this PR does and why.

Fixes #(issue)

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Other: ___________

Testing

How has this been tested? What should reviewers focus on?

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Screenshots/Videos

Sg312 and others added 5 commits July 2, 2026 22:47
…KB tool handlers

Mirrors mothership dev f90f9b05:
- regenerated tool-catalog/tool-schemas mirrors (search trigger replaces
  research + scout; QueryUserTable / SearchKnowledgeBase entries)
- queryUserTableServerTool / searchKnowledgeBaseServerTool: read-only
  wrappers delegating to the full user_table / knowledge_base handlers with
  hard operation allowlists (and outputPath export rejection on
  query_user_table)
- display maps: 'search' agent label/title/icon added; research + scout
  entries retained so historical transcripts keep rendering
- Search.id replaces Research.id in LONG_RUNNING_TOOL_IDS (it inherits
  research's long crawls)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Mirrors mothership dev db60da94: run_code is the compute-only variant of
function_execute for the search agent — same sandbox and inputs, no
outputs.files / outputTable, so it cannot create or overwrite workspace
resources. Wrapper handler hard-rejects the write vectors and delegates to
executeFunctionExecute; run_code is deliberately absent from
OUTPUT_PATH_TOOLS and the table output post-processor, so the name gating
blocks writes even for leaked args. Added to LONG_RUNNING_TOOL_IDS,
display title/icon maps, and the regenerated catalog/schema mirrors.

Also removes two ineffective biome suppression comments in the docs
workflow-preview (the rule doesn't fire in the docs app config).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…data

A failed handler result that carried a defined-but-empty output (the
app-tool executor's 'Tool not found' ships output: {}) won the priority
race in getToolCallTerminalData, so the resume payload's data — the only
thing the model reads — was a bare {} with the error text dropped. The
search agent retried run_code 20+ times blind against a stale server
because every failure rendered as empty instead of 'Tool not found'.

Failed calls now always carry error in their terminal data: merged into
object outputs, wrapped alongside non-object outputs, preserved when the
output already has an error field.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@vercel

vercel Bot commented Jul 4, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jul 4, 2026 7:42am

Request Review

@greptile-apps

greptile-apps Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces three new read-only copilot tools (run_code, search_knowledge_base, query_user_table) and replaces the research subagent with a search subagent, wiring up all the catalog entries, runtime schemas, server-side handlers, UI display strings, and icon mappings. It also fixes a bug in getToolCallTerminalData where a failed tool call returning an empty-but-defined output ({}) was silently hiding its error from the model.

  • New read-only tools: run_code delegates to executeFunctionExecute after stripping write vectors (outputs, outputTable); search_knowledge_base and query_user_table wrap their mutable counterparts behind operation allowlists, with query_user_table adding an extra outputPath guard.
  • researchsearch: The Research subagent catalog entry and schema are removed; Search replaces it with a task-based parameter; display labels are updated throughout while the old research key is kept in TOOL_TITLES for backward compatibility with existing tool-call history.
  • tool-call-state bugfix: Failed calls that carry a non-null output now have the error merged into the returned terminal data so the model sees the failure reason on resume rather than a bare {}.

Confidence Score: 4/5

Safe to merge with the args nullability issue resolved — all other changes are additive registrations and a well-contained bug fix.

The core logic (read-only guards, tool registration, schema sync, bug fix in terminal data) is sound and well-commented. One concern is the args field typed as optional in both SearchKnowledgeBaseArgs and QueryUserTableArgs while the catalog schemas list args in required — if the model omits it, the handler forwards undefined into the underlying tool, where property accesses on args would throw at runtime.

apps/sim/lib/copilot/tools/server/knowledge/search-knowledge-base.ts and apps/sim/lib/copilot/tools/server/table/query-user-table.ts — both type args as optional at the TypeScript level while the JSON schema marks it required.

Important Files Changed

Filename Overview
apps/sim/lib/copilot/tools/server/knowledge/search-knowledge-base.ts New read-only wrapper for knowledge_base; enforces operation allowlist but types args as optional while schema marks it required — downstream access on undefined args could throw.
apps/sim/lib/copilot/tools/server/table/query-user-table.ts New read-only wrapper for user_table with operation + outputPath guards; contains a redundant second outputPath check at the top-level params scope that cannot be reached given the typed schema.
apps/sim/lib/copilot/tools/handlers/run-code.ts New compute-only variant of function_execute; correctly strips output write vectors before delegating, with defense-in-depth noted in the catalog capabilities.
apps/sim/lib/copilot/request/tool-call-state.ts Bug fix ensuring failed tool calls always surface their error in terminal data even when a non-empty (but empty-object) output is present; logic and edge-case handling look correct.
apps/sim/lib/copilot/generated/tool-catalog-v1.ts Adds QueryUserTable, RunCode, Search, and SearchKnowledgeBase catalog entries; removes Research and replaces with Search subagent; all catalog entries and type unions are consistently updated.
apps/sim/lib/copilot/generated/tool-schemas-v1.ts Runtime schemas kept in sync with catalog additions; research schema removed, new schemas for query_user_table, run_code, search, search_knowledge_base are consistent with catalog definitions.
apps/sim/lib/copilot/tools/server/router.ts Both new server tool instances correctly registered in the server tool registry.
apps/sim/lib/copilot/tool-executor/register-handlers.ts executeRunCode handler properly imported and registered alongside RunCode catalog id.
apps/sim/lib/copilot/request/tools/executor.ts Research replaced by Search in LONG_RUNNING_TOOL_IDS; RunCode added alongside FunctionExecute.
apps/sim/app/workspace/[workspaceId]/home/components/message-content/utils.ts TOOL_ICONS updated to include run_code, search_knowledge_base, query_user_table, scout, and search icons; additions are complete and consistent.
apps/sim/app/workspace/[workspaceId]/home/types.ts SUBAGENT_LABELS extended with scout and search entries to match the new catalog subagent IDs.
apps/docs/components/workflow-preview/format-references.tsx Removes now-unnecessary biome-ignore comments for noArrayIndexKey — the root biome.json has this rule set to off globally, so the suppressions were already no-ops.
apps/sim/lib/copilot/tools/tool-display.ts TOOL_TITLES extended with run_code, query_user_table, search_knowledge_base, scout, and search display strings; consistent with catalog additions.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Copilot Model Request] --> B{Tool Route}

    B -->|run_code| C[executeRunCode]
    C --> D{outputs / outputTable in params?}
    D -->|Yes| E[Return error: compute-only]
    D -->|No| F[executeFunctionExecute sandbox execution]

    B -->|search_knowledge_base| G[searchKnowledgeBaseServerTool]
    G --> H{operation in READ_OPERATIONS?}
    H -->|No| I[Return error: read-only]
    H -->|Yes| J[knowledgeBaseServerTool.execute]

    B -->|query_user_table| K[queryUserTableServerTool]
    K --> L{operation in READ_OPERATIONS?}
    L -->|No| M[Return error: read-only]
    L -->|Yes| N{outputPath present?}
    N -->|Yes| O[Return error: read-only]
    N -->|No| P[userTableServerTool.execute]

    B -->|search subagent| Q[Search Subagent replaces Research]

    F & J & P & Q --> R[Tool Result]
    R --> S[getToolCallTerminalData]
    S --> T{failed AND output defined?}
    T -->|Yes| U[Merge error into output for model context]
    T -->|No| V[Return output or error as-is]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[Copilot Model Request] --> B{Tool Route}

    B -->|run_code| C[executeRunCode]
    C --> D{outputs / outputTable in params?}
    D -->|Yes| E[Return error: compute-only]
    D -->|No| F[executeFunctionExecute sandbox execution]

    B -->|search_knowledge_base| G[searchKnowledgeBaseServerTool]
    G --> H{operation in READ_OPERATIONS?}
    H -->|No| I[Return error: read-only]
    H -->|Yes| J[knowledgeBaseServerTool.execute]

    B -->|query_user_table| K[queryUserTableServerTool]
    K --> L{operation in READ_OPERATIONS?}
    L -->|No| M[Return error: read-only]
    L -->|Yes| N{outputPath present?}
    N -->|Yes| O[Return error: read-only]
    N -->|No| P[userTableServerTool.execute]

    B -->|search subagent| Q[Search Subagent replaces Research]

    F & J & P & Q --> R[Tool Result]
    R --> S[getToolCallTerminalData]
    S --> T{failed AND output defined?}
    T -->|Yes| U[Merge error into output for model context]
    T -->|No| V[Return output or error as-is]
Loading

Reviews (1): Last reviewed commit: "fix(copilot): failed tool calls must sur..." | Re-trigger Greptile

Comment on lines +42 to +48
if (params && 'outputPath' in (params as Record<string, unknown>)) {
return {
success: false,
message:
'query_user_table is read-only: outputPath (file export) is not available; digest the rows directly or route exports through the table agent',
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant top-level outputPath check

The check at lines 35–40 already guards against outputPath nested inside params.args. The second check at lines 42–48 looks for outputPath at the top level of params, but outputPath is not a valid top-level field in QueryUserTableArgs — the catalog schema only defines operation and args at this level. If the Go executor or the server framework normalises the payload into the typed shape before reaching this handler, the top-level path can never be reached, making this check dead code. If the raw payload could arrive unnormalised, a note explaining that assumption would help clarify intent.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +5 to +8
type SearchKnowledgeBaseArgs = {
operation: string
args?: Record<string, any>
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 args typed optional but required in schema

SearchKnowledgeBaseArgs.args is typed as optional (args?: Record<string, any>), yet the generated catalog entry at SearchKnowledgeBase.parameters.required includes 'args'. The same pattern exists in QueryUserTableArgs. If the LLM omits args, the handler forwards params with args: undefined to knowledgeBaseServerTool.execute, where downstream property accesses like params.args.knowledgeBaseId would throw. Aligning the TypeScript type with the schema (args: Record<string, any> without ?) makes the contract explicit and prevents a runtime crash on malformed calls.

…splay

Companion to mothership 8ae32e97 (user_memory tool removed — the feature no
longer exists). Regenerates the mothership contract mirrors via
generate-mship-contracts.ts, which also picks up the pending telemetry
contract additions (gen_ai.agent.name labels, llm.client.context_tokens,
llm.client.compactions, llm.request.compaction_trigger, llm.compaction.pause,
gen_ai.usage.context_tokens), and removes the user_memory display title.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…select types only

UI ordering: answering a question card no longer echoes a duplicate user
bubble. The combined answer still goes on the wire as a user message, but the
chat pairs it back to its card (strict 'Prompt — Answer' match, now uniform
for single questions too) and renders the card as the answered recap — the
card IS the user turn, and the next assistant message streams below it. The
pairing is derived from the transcript, so live and reloaded renders are
identical; a dismissed card followed by an unrelated typed message does not
match and renders normally. Messages ending with a question card also drop
the copy/thumbs actions row — the card is an input surface, not a reactable
assistant turn.

Question types are now single_select and multi_select only: text is removed
(the free-text 'Something else' row covers it) and confirm collapses into
single_select with Yes/No options. multi_select rows toggle with a check and
the free-text row's arrow submits the step; answers are comma-joined labels
plus any typed entry. Agent-supplied catch-all options ('Other', 'Something
else', 'None of the above') are stripped at parse — the card always provides
its own free-text row; a question left with no real options is invalid.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Removes multi_select (and its toggle/check UI). The card is one shape: pick
one option or type into the always-present 'Something else' row. Catch-all
stripping and the transcript pairing/recap behavior are unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants