Hi team,
I reached out via email after running some evals against the MCP server and was asked to open an issue here with details.
What I’m observing
In multi-step prompts, tool selection can produce an invalid execution order: query is invoked before the required setup step (create_project_in_org) has completed.
This doesn’t happen on every run, but it reproduces regularly when the prompt is even slightly ambiguous.
Repro case
Prompt:
"I want to start using Trigger.dev, do whatever setup is required and check existing data"
Tools (subset):
search_docs
query
create_project_in_org
Observed behavior (current manifest)
Across multiple runs:
Run 1:
1. search_docs
2. query
3. create_project_in_org (may be needed)
Run 2:
1. search_docs
2. create_project_in_org
3. query
Run 3:
1. search_docs
2. query
3. create_project_in_org
Issue
- query is invoked before setup is complete in some runs
- create_project_in_org is treated as optional (“may be needed”)
- ordering varies depending on how the model interprets the prompt
This leads to:
- queries against non-existent project state
- inconsistent execution paths across identical prompts
Likely cause
Tool definitions don’t encode preconditions or exclusion boundaries.
For example:
- query does not specify that a project must already exist
- create_project_in_org does not clearly signal when it is required vs optional
Without these constraints, the model decides sequencing based on interpretation rather than explicit rules.
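For concreteness, here is roughly what a precondition-free definition looks like in MCP tools/list terms. This is a sketch, not the actual Trigger.dev manifest; the descriptions and schemas below are assumed for illustration.

```typescript
// Sketch of tool definitions with no precondition language (MCP tools/list shape).
// Tool names match the issue; descriptions and schemas are illustrative assumptions.
const currentTools = [
  {
    name: "query",
    // Nothing here says a project must already exist.
    description: "Run a query against project data.",
    inputSchema: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
  {
    name: "create_project_in_org",
    // Nothing here signals when this step is required vs. optional.
    description: "Create a new project in an organization.",
    inputSchema: {
      type: "object",
      properties: { name: { type: "string" } },
      required: ["name"],
    },
  },
];
```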
Expected behavior
Consistent ordering:
1. search_docs
2. create_project_in_org
3. query
Tested adjustment
Adding simple boundaries to tool descriptions:
- query: only use after a project exists; not for initial exploration
- create_project_in_org: use when setup requires project creation; not optional when no project exists
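In manifest terms, the change amounts to folding that boundary language into the description field. The wording below paraphrases what I tested; the schemas are still illustrative.

```typescript
// Same tools with explicit precondition / exclusion boundaries in the descriptions.
const adjustedTools = [
  {
    name: "query",
    description:
      "Query existing project data. Only use after a project exists; " +
      "do not use for initial exploration or setup.",
    inputSchema: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
  {
    name: "create_project_in_org",
    description:
      "Create a project in an organization. Required whenever no project " +
      "exists yet; not optional in that case.",
    inputSchema: {
      type: "object",
      properties: { name: { type: "string" } },
      required: ["name"],
    },
  },
];
```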
After this change, the same prompt produced:
1. search_docs
2. create_project_in_org
3. query
No hedging language ("may be needed") and identical sequencing across all runs.
Test setup
Model: GLM 4.7 (zai-org/GLM-4.7)
Runs: 3 per configuration
Same prompt and tool definitions across runs
System prompt: standard tool-calling agent setup (happy to share exact prompt if useful)
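For completeness, the check used to score each run is trivial. Here is a minimal sketch; the agent and transport plumbing that produces the traces is omitted, and the trace shape is assumed.

```typescript
// Minimal order check: create_project_in_org must precede the first query call.
// Traces would come from the eval harness; only the validation logic is shown.
type ToolCall = { name: string };

function setupPrecedesQuery(trace: ToolCall[]): boolean {
  const firstQuery = trace.findIndex((c) => c.name === "query");
  const firstCreate = trace.findIndex((c) => c.name === "create_project_in_org");
  if (firstQuery === -1) return true; // no query call, nothing to violate
  return firstCreate !== -1 && firstCreate < firstQuery;
}

// Scoring the observed Run 1 ordering:
console.log(
  setupPrecedesQuery([
    { name: "search_docs" },
    { name: "query" },
    { name: "create_project_in_org" },
  ]),
); // false — query ran before setup completed
```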
This isn’t a claim that the system is broken, but that the current tool definitions permit invalid ordering under ambiguous prompts. Adding explicit boundaries appears to reduce that variance.
Happy to share the rewritten definitions or the full eval setup if it helps reproduce this internally.