
Your Gemini Agent Just Crashed. Here’s Why

You’ve built the perfect agent. It searches the web for live data, crunches numbers with your custom tools, and saves results, all orchestrated by Gemini 2.5. You ship it. It works in testing. Then in production:

400 INVALID_ARGUMENT
Built-in tools ({google_search}) and Function Calling cannot be combined
in the same request.

Your agent is dead. Not because of a bug in your code. Not because of a network issue. Because Google’s own API forbids what you’re trying to do.

And here’s the uncomfortable truth: as of April 2026, no major Python agent framework has a transparent fix for this. Not LangChain. Not CrewAI. Not AutoGen. Not even Google’s own ADK, unless you’re willing to tear apart your agent architecture.

NucleusIQ v0.7.5 does. And we have the test results to prove it.


Why This Is a Bigger Deal Than You Think

This isn’t some obscure edge case. Think about the most natural thing you’d want an AI agent to do:


“Search the web for today’s temperature in Tokyo, convert it to Fahrenheit, run a calculation, and save me a note.”

That task requires native tools (Google Search and Code Execution, the powerful built-in capabilities that run server-side on Google’s infrastructure) and custom tools (your converter and your note-taker, the business logic that makes your agent yours).

On Gemini 2.5, which remains the production model for most teams, you cannot send both types in the same API call. Google’s generateContent endpoint rejects the request flat-out. This is documented in googleapis/python-genai#58 and acknowledged in Google ADK’s official limitations page.

Four native tools are affected: google_search, code_execution, url_context, and google_maps. Add any custom function declaration alongside one of them, and the request fails with a 400 error.
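The constraint is easy to reason about locally. The sketch below re-implements the documented rule as a plain Python check so you can see exactly which tool combinations the API accepts and which it rejects; this is an illustration of the rule, not Google's actual server-side validation:

```python
# Illustrative only: a local re-implementation of the documented Gemini 2.5
# rule, NOT Google's actual server-side validation code.
NATIVE_TOOLS = {"google_search", "code_execution", "url_context", "google_maps"}


def check_tool_mix(tool_names: list[str]) -> None:
    """Raise if a request would mix native tools with custom function declarations."""
    native = [t for t in tool_names if t in NATIVE_TOOLS]
    custom = [t for t in tool_names if t not in NATIVE_TOOLS]
    if native and custom:
        raise ValueError(
            f"400 INVALID_ARGUMENT: Built-in tools ({', '.join(native)}) "
            "and Function Calling cannot be combined in the same request."
        )


check_tool_mix(["google_search", "code_execution"])   # OK: native only
check_tool_mix(["unit_converter", "note_taker"])      # OK: custom only
# check_tool_mix(["google_search", "unit_converter"]) # raises ValueError
```

Either pure-native or pure-custom requests pass; any mix fails, which is exactly the wall every framework below runs into.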

So what does the ecosystem do about it?


The Gap: Every Framework Either Blocks You or Ignores the Problem

We audited every major Python agent framework. The results are striking:

LangChain: Actively blocks you

LangChain’s Google integration (PR #795) added validation that prevents you from even trying to mix native and custom tools. If you pass a grounding tool alongside function declarations, the framework throws an error before the request ever reaches Google. Their solution: pick one or the other.

Google ADK: Makes you rebuild your agent

Google’s own Agent Development Kit documents the limitation and offers a workaround: split each tool type into a separate sub-agent, then wrap them as AgentTool in a root agent:

from google.adk.agents import Agent
from google.adk.code_executors import BuiltInCodeExecutor
from google.adk.tools import AgentTool, google_search

search_agent = Agent(name='SearchAgent', model='gemini-2.0-flash', tools=[google_search])
coding_agent = Agent(name='CodeAgent', model='gemini-2.0-flash', code_executor=BuiltInCodeExecutor())
root_agent = Agent(
    name='RootAgent',
    model='gemini-2.0-flash',
    tools=[AgentTool(agent=search_agent), AgentTool(agent=coding_agent)]
)

This works. But now your single intelligent agent has become a committee. The LLM no longer sees all tools in one decision space. It must delegate to the right sub-agent before that sub-agent runs. Context is fragmented across agents. You’ve traded a clean architecture for a workaround.

CrewAI and AutoGen: Pretend the problem doesn’t exist

CrewAI supports Gemini’s native Google Search grounding but does not address what happens when you mix it with custom tools. Users either discover the error themselves or learn to avoid mixing.

AutoGen’s Gemini function-calling integration is still incomplete. Native tool mixing isn’t even on their radar.

Google’s own fix: Gemini 3 only

In March 2026, Google announced tool combinations: the ability to combine built-in and custom tools in a single call. Their documentation is clear:

Preview: Built-in and custom tools combinations are in Preview and supported for Gemini 3 models only.

Source: ai.google.dev/gemini-api/docs/tool-combination

If you’re on gemini-2.5-flash or gemini-2.5-pro, the models most teams are actually running in production, this doesn’t help you.

The complete picture

| Framework | Status | What you have to do |
| --- | --- | --- |
| LangChain | Blocked | Choose native OR custom. Can’t have both. |
| Google ADK | Sub-agent workaround | Restructure into multiple agents. |
| CrewAI | Not addressed | Discover the 400 error yourself. |
| AutoGen | Not addressed | Basic Gemini tool support still incomplete. |
| Google API | Gemini 3 only | Wait for model upgrade. Still in Preview. |
| NucleusIQ | Solved | Just pass your tools. It works. |

That last row is what the rest of this article is about.


How NucleusIQ Solves It: The Proxy Pattern

NucleusIQ v0.7.5 introduces a transparent proxy pattern that resolves the limitation with three properties that matter:

  1. Zero code changes: you pass tools exactly as you would normally
  2. Works on all Gemini models: 2.5-flash, 2.5-pro, and future models
  3. Works across all execution modes: Direct, Standard, Autonomous

Here’s what it looks like from your perspective:

from nucleusiq.agents import Agent
from nucleusiq.agents.config import AgentConfig
from nucleusiq.agents.task import Task
from nucleusiq_gemini import BaseGemini
from nucleusiq_gemini.tools.gemini_tool import GeminiTool

agent = Agent(
    name="ResearchAgent",
    llm=BaseGemini(model_name="gemini-2.5-flash"),
    tools=[
        GeminiTool.google_search(),      # Native tool
        GeminiTool.code_execution(),     # Native tool
        my_unit_converter,               # Your custom tool
        my_note_taker,                   # Your custom tool
    ],
    config=AgentConfig(execution_mode="standard"),
    ...
)
result = await agent.execute(task)  # Just works.

No sub-agents. No restructuring. No choosing between native and custom. You declare what tools your agent needs, and it uses all of them.

What happens behind the scenes

1. BaseGemini.convert_tool_specs() detects mixed native + custom tools
2. Native tools switch to proxy mode:
   - google_search appears as a function declaration (not a native spec)
   - code_execution appears as a function declaration
3. LLM sees ALL tools as function declarations -- no API rejection
4. LLM decides the order: "search first, then convert, then calculate, then save"
5. When the Executor calls google_search.execute():
   - Proxy mode intercepts the call
   - Makes a separate generateContent() sub-call with the REAL native tool
   - Returns grounded search results as if it were a normal tool response
6. Custom tools execute normally (local Python)
7. Agent completes with all tool results combined

The LLM never knows the difference. It sees four callable functions and chooses the best order. The proxy layer handles the API constraint invisibly.
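The steps above can be sketched in a few lines of plain Python. The class and function names below (ProxyNativeTool, fake_subcall) are hypothetical, chosen for illustration; they are not NucleusIQ's actual internals, and the sub-call is stubbed so the sketch runs offline:

```python
# Hypothetical sketch of the proxy pattern. In the real implementation, the
# sub-call is a generateContent() request carrying the REAL native tool spec.

class ProxyNativeTool:
    """Presents a native tool as an ordinary function declaration, and routes
    execution through a separate sub-call that uses the real native spec."""

    def __init__(self, name: str, subcall):
        self.name = name
        self._subcall = subcall  # callable that performs the native sub-call

    def as_function_declaration(self) -> dict:
        # What the main LLM request sees: a plain function declaration, so it
        # can sit next to custom tools without triggering the 400 error.
        return {
            "name": self.name,
            "description": f"Proxied native tool: {self.name}",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }

    def execute(self, query: str) -> str:
        # Intercept: run a separate request that carries the real native tool,
        # then return the result as if it were a normal tool response.
        return self._subcall(self.name, query)


# Stub standing in for the real generateContent() sub-call.
def fake_subcall(tool_name: str, query: str) -> str:
    return f"[{tool_name} grounded result for: {query}]"


search = ProxyNativeTool("google_search", fake_subcall)
print(search.as_function_declaration()["name"])
print(search.execute("distance NY to London"))
```

From the main request's point of view, google_search is just another function declaration; the native behavior only surfaces inside the intercepted execute() call.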

Why this is architecturally clean

| Aspect | ADK Sub-Agent Approach | NucleusIQ Proxy |
| --- | --- | --- |
| Agent structure | Multiple agents (root + sub-agents) | Single agent |
| Tool visibility | LLM sees sub-agents, not tools | LLM sees all tools directly |
| Tool interleaving | Must delegate to correct sub-agent | LLM picks any tool in any order |
| Code changes needed | Restructure your agent architecture | Zero |
| Context | Fragmented across sub-agents | Single shared conversation |
| Works on Gemini 2.5 | Yes (with architectural cost) | Yes (transparently) |

Zero core framework changes

The proxy pattern lives entirely within the Gemini provider package (nucleusiq-gemini). The core NucleusIQ framework (Agent, StandardMode, Executor) has no idea proxy mode exists. This was achieved through three mechanisms:

  1. Overriding BaseLLM.convert_tool_specs() in the Gemini provider (an existing extension point)
  2. Modifying _GeminiNativeTool.execute() to handle proxy sub-calls
  3. Python object references: the Executor holds the same tool object that was switched to proxy mode

This means OpenAI tools are unaffected, future providers can add their own strategies, and the proxy gracefully retires when Gemini 3 adoption makes it unnecessary.
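The third mechanism is just Python object identity: because the tool list and the Executor reference the same objects, flipping a flag during spec conversion is instantly visible everywhere. A minimal illustration (the proxy_mode attribute and this simplified convert_tool_specs are hypothetical, not NucleusIQ's real signatures):

```python
# Illustrative sketch of the shared-object-reference mechanism; attribute and
# function names are hypothetical, not NucleusIQ's actual internals.

class NativeTool:
    def __init__(self, name: str):
        self.name = name
        self.proxy_mode = False  # hypothetical flag flipped during conversion


def convert_tool_specs(tools: list) -> None:
    """If native and custom tools are mixed, switch natives to proxy mode."""
    natives = [t for t in tools if isinstance(t, NativeTool)]
    if natives and len(natives) < len(tools):  # mixed set detected
        for t in natives:
            t.proxy_mode = True  # mutates the shared object in place


search = NativeTool("google_search")
tools = [search, "my_custom_tool"]  # a mixed native + custom tool list
convert_tool_specs(tools)

# The "Executor" holds the same object, so it sees the switch without any
# coordination: no registry, no message passing, just a shared reference.
assert tools[0] is search
assert search.proxy_mode is True
```

Because the switch is a mutation of a shared object rather than a new wrapper, no core framework code needs to know the conversion happened.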


The Proof: Real Test Results

Claims are easy. Evidence is what counts. All results below are from actual test runs on April 3, 2026, with print() instrumentation in every custom tool so you can see exactly when each tool is called.

Full integration tests: tests/integration/test_mixed_tools.py (13 tests, all passing).

Test 1: 2 Native + 2 Custom on gemini-2.5-flash

Tools: google_search (native), code_execution (native), unit_converter (custom), note_taker (custom)

Task: "Search for NY-to-London distance in km, convert to miles, save a note."

Status:      success
Wall time:   10,670 ms
LLM calls:   4
Tool calls:  3

Tool call trace:
  Round 1: google_search   [NATIVE/PROXY]  success=True  duration=4,419ms
  Round 2: unit_converter  [CUSTOM]        success=True  duration=0.1ms
  Round 3: note_taker      [CUSTOM]        success=True  duration=0.0ms

Custom tool stdout (proof of execution):
  >>> [UnitConverterTool] CALLED #1
      value=5570, from_unit='km', to_unit='miles'
      Result: 5570.0 km = 3461.0365 miles
      Execution time: 0.020 ms
  <<< [UnitConverterTool] DONE

  >>> [NoteTakerTool] CALLED #1
      action='save', title='NY to London'
      content='Distance from New York to London: 5570 km or 3461.0365 miles'
      Execution time: 0.010 ms
  <<< [NoteTakerTool] DONE

Test 2: code_execution (proxy) + unit_converter (custom) on gemini-2.5-flash

Task: "Calculate factorial of 12 with code_execution, convert 100C to F."

Status:      success
Wall time:   5,907 ms
Tool calls:  2

Tool call trace:
  Round 1: code_execution  [NATIVE/PROXY]  success=True  duration=1,743ms
  Round 2: unit_converter  [CUSTOM]        success=True  duration=0.1ms

Output: "The factorial of 12 is 479,001,600. 100 celsius is equal to 212 fahrenheit."

Test 3: ALL 4 Tools in One Session on gemini-2.5-pro

This is the definitive test: every tool (2 native, 2 custom) used together in a single agent execution on the higher-tier model.

Task: "Search Tokyo temperature, convert to fahrenheit, calculate 25^5,
       save a note, then list all notes."

Model:       gemini-2.5-pro
Status:      success
Wall time:   31,452 ms
LLM calls:   6
Tool calls:  5

Tool call trace:
  Round 1: google_search   [NATIVE/PROXY]  success=True  duration=3,306ms
  Round 2: unit_converter  [CUSTOM]        success=True  duration=0.1ms
  Round 3: code_execution  [NATIVE/PROXY]  success=True  duration=3,450ms
  Round 4: note_taker      [CUSTOM]        success=True  duration=0.0ms
  Round 5: note_taker      [CUSTOM]        success=True  duration=0.0ms

LLM call trace:
  [main      ] gemini-2.5-pro  4,879ms  tokens_in=473  tokens_out=18
  [tool_loop ] gemini-2.5-pro  4,505ms  tokens_in=562  tokens_out=32
  [tool_loop ] gemini-2.5-pro  5,387ms  tokens_in=624  tokens_out=24
  [tool_loop ] gemini-2.5-pro  4,290ms  tokens_in=697  tokens_out=62
  [tool_loop ] gemini-2.5-pro  2,604ms  tokens_in=789  tokens_out=15
  [tool_loop ] gemini-2.5-pro  3,028ms  tokens_in=850  tokens_out=130

Custom tool stdout (all tools verified):
  >>> [UnitConverterTool] CALLED #1
      value=15, from_unit='celsius', to_unit='fahrenheit'
      Result: 15.0 celsius = 59.0000 fahrenheit
      Execution time: 0.027 ms

  >>> [NoteTakerTool] CALLED #1
      action='save', title='Tokyo Research'
      content='The temperature in Tokyo is 15C (59F), and 25^5 is 9,765,625.'
      Execution time: 0.016 ms

  >>> [NoteTakerTool] CALLED #2
      action='list', title=''
      Result: Notes (1): 1. Tokyo Research
      Execution time: 0.016 ms

Verification:
  Unique tools used: [code_execution, google_search, note_taker, unit_converter]
  All 4 used:        True
  All calls success: True

Understanding the timings

| Tool | Type | Duration | What’s happening |
| --- | --- | --- | --- |
| google_search | Native/Proxy | 3,300-4,400 ms | Real generateContent API round-trip with native tool |
| code_execution | Native/Proxy | 1,700-3,500 ms | Real generateContent API round-trip with native tool |
| unit_converter | Custom | <0.1 ms | Local Python: dictionary lookup + arithmetic |
| note_taker | Custom | <0.1 ms | Local Python: list append |

Native tools in proxy mode take 1.7-4.4 seconds because each one makes a full API round-trip to Gemini with the real native tool spec. Custom tools execute locally in microseconds; a dictionary lookup is genuinely sub-millisecond. This is expected and correct.


Try It Yourself: The Complete 4-Tool Example

Here’s a fully runnable example with 2 native tools (google_search, code_execution) and 2 custom tools (unit_converter, note_taker). Custom tools include print() instrumentation so you can verify every invocation.

Interactive notebook: [notebooks/agents/gemini_mixed_tools_showcase.ipynb](../../notebooks/agents/gemini_mixed_tools_showcase.ipynb)

pip install nucleusiq nucleusiq-gemini

Step 1: Define custom tools with print instrumentation

import time
from typing import Any
from nucleusiq.tools.base_tool import BaseTool


class UnitConverterTool(BaseTool):
    """Converts between common units with call instrumentation."""

    def __init__(self):
        super().__init__(
            name="unit_converter",
            description=(
                "Convert between units. Supports: km<->miles, kg<->pounds, "
                "celsius<->fahrenheit, liters<->gallons. "
                "Provide value, from_unit, to_unit."
            ),
            version=None,
        )
        self.call_count = 0

    async def initialize(self) -> None:
        pass

    async def execute(
        self, value: float = 0, from_unit: str = "", to_unit: str = "", **kwargs: Any
    ) -> str:
        self.call_count += 1
        t0 = time.perf_counter()
        print(f"\n>>> [UnitConverterTool] CALLED #{self.call_count}")
        print(f"    value={value}, from_unit='{from_unit}', to_unit='{to_unit}'")

        value = float(value)
        conversions = {
            ("km", "miles"): lambda v: v * 0.621371,
            ("miles", "km"): lambda v: v * 1.60934,
            ("kg", "pounds"): lambda v: v * 2.20462,
            ("pounds", "kg"): lambda v: v * 0.453592,
            ("celsius", "fahrenheit"): lambda v: v * 9 / 5 + 32,
            ("fahrenheit", "celsius"): lambda v: (v - 32) * 5 / 9,
            ("liters", "gallons"): lambda v: v * 0.264172,
            ("gallons", "liters"): lambda v: v * 3.78541,
        }
        key = (from_unit.lower().strip(), to_unit.lower().strip())
        if key in conversions:
            result = conversions[key](value)
            output = f"{value} {from_unit} = {result:.4f} {to_unit}"
        else:
            output = f"Unsupported conversion: {from_unit} -> {to_unit}"

        elapsed_ms = (time.perf_counter() - t0) * 1000
        print(f"    Result: {output}")
        print(f"    Execution time: {elapsed_ms:.3f} ms")
        print(f"<<< [UnitConverterTool] DONE\n")
        return output

    def get_spec(self) -> dict[str, Any]:
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {
                "type": "object",
                "properties": {
                    "value": {"type": "number", "description": "Numeric value to convert"},
                    "from_unit": {"type": "string", "description": "Source unit"},
                    "to_unit": {"type": "string", "description": "Target unit"},
                },
                "required": ["value", "from_unit", "to_unit"],
            },
        }


class NoteTakerTool(BaseTool):
    """Stores and retrieves notes with call instrumentation."""

    def __init__(self):
        super().__init__(
            name="note_taker",
            description=(
                "Save a note with title and content (action='save'), "
                "or list all notes (action='list')."
            ),
            version=None,
        )
        self.notes: list[dict[str, str]] = []
        self.call_count = 0

    async def initialize(self) -> None:
        pass

    async def execute(
        self, action: str = "save", title: str = "", content: str = "", **kwargs: Any
    ) -> str:
        self.call_count += 1
        t0 = time.perf_counter()
        print(f"\n>>> [NoteTakerTool] CALLED #{self.call_count}")
        print(f"    action='{action}', title='{title}'")

        if action == "save" and title:
            self.notes.append({"title": title, "content": content})
            output = f"Note saved: '{title}' ({len(content)} chars). Total: {len(self.notes)}"
        elif action == "list":
            if not self.notes:
                output = "No notes saved."
            else:
                lines = [f"  {i+1}. {n['title']}" for i, n in enumerate(self.notes)]
                output = f"Notes ({len(self.notes)}):\n" + "\n".join(lines)
        else:
            output = f"Unknown action '{action}' or missing title."

        elapsed_ms = (time.perf_counter() - t0) * 1000
        print(f"    Result: {output}")
        print(f"    Execution time: {elapsed_ms:.3f} ms")
        print(f"<<< [NoteTakerTool] DONE\n")
        return output

    def get_spec(self) -> dict[str, Any]:
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {
                "type": "object",
                "properties": {
                    "action": {"type": "string", "enum": ["save", "list"]},
                    "title": {"type": "string", "description": "Note title"},
                    "content": {"type": "string", "description": "Note content"},
                },
                "required": ["action"],
            },
        }

Step 2: Create agent with all 4 tools and execute

import asyncio
from nucleusiq.agents import Agent
from nucleusiq.agents.config import AgentConfig
from nucleusiq.agents.task import Task
from nucleusiq_gemini import BaseGemini
from nucleusiq_gemini.tools.gemini_tool import GeminiTool


async def main():
    agent = Agent(
        name="FullToolAgent",
        role="Research assistant with full tool suite",
        objective="Complete multi-step research tasks using all available tools",
        narrative=(
            "You have exactly 4 tools. Use ALL of them:\n"
            "1. google_search - Search the web for facts\n"
            "2. code_execution - Execute Python code for calculations\n"
            "3. unit_converter - Convert units (km/miles, celsius/fahrenheit, etc.)\n"
            "4. note_taker - Save notes (action='save') or list notes (action='list')"
        ),
        llm=BaseGemini(model_name="gemini-2.5-pro", temperature=0.0),
        tools=[
            GeminiTool.google_search(),
            GeminiTool.code_execution(),
            UnitConverterTool(),
            NoteTakerTool(),
        ],
        config=AgentConfig(
            execution_mode="standard",
            verbose=False,
            enable_tracing=True,
            max_tool_calls=15,
        ),
    )
    await agent.initialize()

    result = await agent.execute(
        Task(
            id="all-four",
            objective=(
                "Step 1: Use google_search for the temperature in Tokyo today.\n"
                "Step 2: Use unit_converter to convert that from celsius to fahrenheit.\n"
                "Step 3: Use code_execution to calculate 25 ** 5.\n"
                "Step 4: Use note_taker action='save' title='Tokyo Research' with findings.\n"
                "Step 5: Use note_taker action='list' to show all saved notes."
            ),
        )
    )

    print(f"\nStatus:     {result.status.value}")
    print(f"Wall time:  {result.duration_ms:.0f} ms")
    print(f"LLM calls:  {len(result.llm_calls)}")
    print(f"Tool calls: {len(result.tool_calls)}")
    print()
    for tc in result.tool_calls:
        tool_type = "NATIVE/PROXY" if tc.tool_name in ("google_search", "code_execution") else "CUSTOM"
        print(f"  Round {tc.round}: {tc.tool_name:20s} [{tool_type:12s}]  "
              f"success={tc.success}  duration={tc.duration_ms:.1f}ms")
    print()
    print(result.output)


asyncio.run(main())

When to Use What

| Your situation | What happens |
| --- | --- |
| Gemini 2.5 + mixed tools | Proxy activates automatically. Everything works. |
| Gemini 3 + mixed tools | Native tool combinations. No proxy needed. |
| Native tools only (no custom) | No issue anywhere. All frameworks handle this. |
| Custom tools only (no native) | No issue anywhere. All frameworks handle this. |

NucleusIQ detects automatically: if tools are mixed, proxy mode activates; if not, native mode is preserved. When Gemini 3 becomes widespread, the proxy gracefully retires with zero overhead.


References

NucleusIQ is open-source and MIT-licensed. Star us on GitHub, try the quick start, or read the philosophy.


OK, that’s it, we are done now. If you have any questions or suggestions, please feel free to comment. I’ll come up with more topics on Machine Learning and Data Engineering soon. Please comment and subscribe if you like my work; any suggestions are welcome and appreciated.
