Agent-001 Part-3
Series
- Agent-001 Part-1
- Agent-001 Part-2
- Agent-001 Part-3

In the first part of this series, we explored the problem statement and how to leverage an LLM within a script. The second part covered guiding the LLM to produce structured responses and building automation around those outputs. In this post, we’ll dive into the Agentic model.
With the Agentic model, we don’t prescribe a fixed workflow. Instead, we expose a set of tools to the LLM and provide instructions on when and how to use them. The LLM can then autonomously decide which tools to invoke, in what order, and as many times as needed. Since the LLM operates independently—much like James Bond—we refer to it as an Agent.
As the developer creating these tools for the LLM, you’re essentially playing the role of Q. Pretty cool, right? 😎
The Agentic Architecture
First, let's create the tools that we're going to expose to the LLM. In our case, we're building two tools:
- Browser (`browser.py`)
- Send Email (`send_email.py`)

The Browser tool enables the LLM to fetch up-to-date information about a joke, especially when it references recent events that may not be included in the model’s training data. This helps prevent misclassification of jokes that could be offensive due to current global contexts. The LLM can invoke the browser whenever it encounters unfamiliar references.
The send-email tool is responsible for queuing emails to the outbox, and its implementation remains unchanged from the previous post. Both tools are implemented as standalone Python scripts, each accepting command-line arguments to perform their respective actions.
To facilitate integration and add input validation, we also created lightweight wrapper functions around these scripts. While not strictly required, these wrappers give developers more control over parameter handling before executing the underlying scripts.
For example, the run_browse function accepts two parameters: term (the search query) and joke (the context). It then invokes browser.py and returns the script’s output.
agent.py: run_browse (lines 411-427)
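The original listing isn't reproduced here, but here's a minimal sketch of what such a wrapper could look like. The `--term` and `--joke` flags, the timeout, and the error handling are assumptions for illustration; the actual `browser.py` interface may differ.

```python
import subprocess

def run_browse(term: str, joke: str) -> str:
    """Validate the inputs, run browser.py, and return its output to the agent loop."""
    if not term or not term.strip():
        raise ValueError("A non-empty search term is required")

    # Hypothetical CLI flags; the real browser.py may accept its arguments differently.
    result = subprocess.run(
        ["python", "browser.py", "--term", term, "--joke", joke],
        capture_output=True,
        text=True,
        timeout=120,
    )
    if result.returncode != 0:
        # Surface the failure so the LLM can see why the lookup did not work.
        return f"browser.py failed: {result.stderr.strip()}"
    return result.stdout.strip()
```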
The send_email wrapper is the same as the one explained in Part 2, so I won't cover it again here.
Expose the tools to the LLM
With our two functions (tools) ready, the next step is to make the LLM aware of them. There are two main ways to provide this information:
- Embedding tool descriptions directly in the prompt.
- Supplying tool definitions as part of the API call.

In this example, we use both methods. First, we enhance the SYSTEM_PROMPT with clear, unambiguous descriptions of each tool. Precise instructions are essential—any ambiguity can lead to LLM hallucinations. Here’s how we update the SYSTEM_PROMPT to include these details:
agent.py: SYSTEM_PROMPT (lines 100-128)
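The exact prompt isn't reproduced here; the sketch below only illustrates the general shape: the classification task plus a clear description of each tool and when to use it. The wording and the VERDICT output format are illustrative assumptions.

```python
# A sketch of a system prompt with tool descriptions (wording is illustrative).
SYSTEM_PROMPT = """You are a content reviewer. You will be given a joke and must
classify it as SAFE or OFFENSIVE.

You have two tools:

1. run_browse(term, joke): searches the web for `term` and returns a short summary.
   Use it ONLY when the joke references people, events, or phrases you are unsure
   about, so that recent context is taken into account.

2. send_email(...): queues a notification email to the outbox.
   Use it ONLY when the joke is classified as OFFENSIVE.

Rules:
- Request at most one tool call per turn and wait for its result.
- When you are done, reply with a single line: VERDICT: SAFE or VERDICT: OFFENSIVE.
"""
```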
In addition to embedding tool descriptions in the prompt, we’ll also provide function-call definitions directly in the API request. Some LLM APIs may not support passing tool information via the API, in which case prompt heuristics alone are sufficient. However, OpenAI APIs allow us to specify available tools using a JSON schema. We’ll take advantage of this capability.
Let’s define a JSON structure that specifies each function’s name, type, and parameters, making them explicit to the LLM:
agent.py: FUNCTION_TOOLS (lines 132-180)
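Here's a sketch of such a definition using the OpenAI chat-completions `tools` schema. The descriptions and the `send_email` parameters (`subject`, `body`) are assumptions; the real parameter names come from the wrapper introduced in Part 2.

```python
# A sketch of FUNCTION_TOOLS: one JSON-schema entry per tool.
FUNCTION_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "run_browse",
            "description": "Search the web for a term and return a short summary; "
                           "the joke text is passed along as additional context.",
            "parameters": {
                "type": "object",
                "properties": {
                    "term": {"type": "string", "description": "The search query."},
                    "joke": {"type": "string", "description": "The joke being classified."},
                },
                "required": ["term", "joke"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "send_email",
            "description": "Queue a notification email to the outbox.",
            "parameters": {
                "type": "object",
                "properties": {
                    "subject": {"type": "string", "description": "Email subject line."},
                    "body": {"type": "string", "description": "Email body text."},
                },
                "required": ["subject", "body"],
            },
        },
    },
]
```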
How is this information communicated to the LLM? As described in part 2, the system prompt—containing the instruction heuristics—is included in the message sequence. Additionally, the JSON construct specifying the tools is attached to the API payload when making the API call.
agent.py: classify_and_act_on_joke (lines 306-307)
agent.py: chat_completion (lines 234-236)
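Neither listing is reproduced here, so here's a minimal sketch of how the two pieces could fit together. The Azure endpoint construction, environment-variable names, and API version are placeholders; the part that matters is that chat_completion attaches the optional tools argument to the payload.

```python
import os
import requests

def chat_completion(messages, tools=None):
    """POST the conversation to the chat-completions endpoint; attach tools if given."""
    url = (
        f"{os.environ['AZURE_OPENAI_ENDPOINT']}/openai/deployments/"
        f"{os.environ['AZURE_OPENAI_DEPLOYMENT']}/chat/completions"
        "?api-version=2024-06-01"  # placeholder; use the version your deployment supports
    )
    payload = {"messages": messages}
    if tools:
        payload["tools"] = tools  # the "tools" key described above
    resp = requests.post(
        url,
        headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"]},
        json=payload,
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

# Call site inside classify_and_act_on_joke (roughly lines 306-307):
#     response = chat_completion(messages, tools=FUNCTION_TOOLS)
```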
As shown above, when the tools argument is provided to the chat_completion function (which applies here), the API payload includes a tools key containing the JSON definition of available tools.
In summary, tool information is communicated to the LLM through both the system prompt and the tools field in the API payload.
The agentic loop
Although we've made the tools available to the LLM, it can't directly execute them—these tools exist on our local system. To bridge this gap, we need an environment where the LLM's tool invocation requests are executed and the results are returned. This orchestration happens within what’s called the agentic loop.
The agentic loop operates as follows:
1. Make the initial LLM call, providing the problem statement and tool information.
2. Inspect the LLM’s response for tool calls. If present, execute the requested tool and append the result to the message history.
3. Call the LLM again with the updated messages and repeat step 2.
4. If no tool calls are detected, consider the task complete and exit the loop.

This loop allows the LLM to function autonomously, deciding which tools to use and when, without developer intervention. The main logic is implemented in the classify_and_act_on_joke function.
To prevent the LLM from entering an infinite loop, we set a maximum number of cycles—here, 10. If the LLM doesn’t finish within these iterations, the loop exits automatically.
agent.py: classify_and_act_on_joke (lines 302-307)
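Here's a sketch of the top of the loop, building on the chat_completion and FUNCTION_TOOLS sketches above. The user-message wording is an assumption; the cap of 10 iterations is the one mentioned above.

```python
import json  # used further down to decode tool-call arguments

MAX_TURNS = 10  # safety cap so the agent cannot loop forever

def classify_and_act_on_joke(joke: str) -> None:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Classify this joke:\n{joke}"},
    ]
    for _ in range(MAX_TURNS):
        # Every iteration sends the full conversation so far, plus the tool schema.
        response = chat_completion(messages, tools=FUNCTION_TOOLS)
        # ... the rest of the loop body is walked through below ...
```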
As shown above, even the first LLM call is made inside the for loop. We then capture the response and check for tool calls.
agent.py: classify_and_act_on_joke (lines 312-321)
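Continuing inside the for loop of the sketch above (the field names follow the OpenAI chat-completions response format):

```python
        # Inside the for loop: pull out the assistant message for this turn.
        assistant_msg = response["choices"][0]["message"]

        # Keep the assistant turn in the history (line 317 in the post): the LLM
        # holds no state between calls, so context must travel with `messages`.
        messages.append(assistant_msg)

        tool_calls = assistant_msg.get("tool_calls") or []
        if not tool_calls:
            # No tool requested: the LLM considers the task complete.
            print(assistant_msg.get("content"))
            break
```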
The LLM signals tool use through the tool_calls key in its structured output (for OpenAI models, the main response text is under content, and tool invocations are under tool_calls). We check whether tool_calls is present and non-empty to determine if a tool needs to be executed.
At line 317, the LLM response is appended to the messages array. This step is essential because LLMs do not retain conversational context between calls. To maintain context, every message in the conversation—including the initial system_prompt, each user_prompt, and every llm_response—must be included in the messages list for each API call.
If tool calls are detected, we parse the tool call data to extract the function name and parameters, then invoke the appropriate tool with the parameters provided by the LLM.
agent.py: classify_and_act_on_joke (lines 325-349)
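Still inside the loop, here's a sketch of the dispatch step: decode the JSON-encoded arguments and call the matching wrapper. For simplicity it handles one tool call per turn; send_email is the wrapper from Part 2.

```python
        # Take the requested call and decode its arguments, which the API
        # delivers as a JSON-encoded string.
        call = tool_calls[0]  # one tool call per turn in this sketch
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])

        # Dispatch to the matching wrapper with the parameters chosen by the LLM.
        if name == "run_browse":
            tool_result = run_browse(args.get("term", ""), args.get("joke", ""))
        elif name == "send_email":
            tool_result = send_email(**args)  # wrapper from Part 2
        else:
            tool_result = f"Unknown tool requested: {name}"
```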
The result of the tool execution is captured in the variable tool_result. Now, let's append the result to the conversation as a new message and head back to the top of the loop.
agent.py: classify_and_act_on_joke (lines 350-360)
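Here's a sketch of that step. It appends the output as a tool-role message with the matching tool_call_id, which is what the OpenAI chat-completions API expects once an assistant turn contains tool_calls; appending it as a plain user message is another option some setups use. Either way, the result re-enters the history and the next iteration sends it back to the LLM.

```python
        # Feed the tool output back into the conversation, then loop again.
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(tool_result),
        })
        # The next iteration of the for loop sends the updated history to the LLM.
```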
This loop runs until the LLM stops making tool calls or the maximum number of iterations is exhausted. You can find the full code at the bottom of the page.
The Agent Architecture
We now have a fully functional agent. Let’s break down the core components that make up this architecture:
- Tool Implementations: These are standalone utilities that the LLM can invoke. Any command-line tool that a human could use can be exposed to the LLM, though in this example we focus on non-interactive tools. If you wish to support interactive tools (like vim), you’ll need to simulate user interaction within your execution environment, typically by leveraging LLM APIs to handle the input/output flow.
- Tool Awareness: The LLM needs to know what tools are available. In our example, we provided this information through both prompt heuristics (in the system prompt) and a tool definition in JSON included as part of the API payload.
- Execution Environment: This is where the LLM’s tool invocation requests are executed. In our case, we ran commands directly on the local system. However, for safety, production systems typically use a sandbox environment with only the necessary tools and data.
- LLM Model: Here, we used GPT-5 from Azure OpenAI as the reasoning engine.
- Agent Loop: This is the main interaction point between the LLM and the environment. The loop orchestrates the conversation, tool calls, and result handling. In fact, the agent loop itself can be considered the core of the agent, with the other components serving as supporting structures. As mentioned earlier, this loop can be implemented in under 100 lines of code.

Together, these components form what’s often called agent scaffolding. There’s no universal best approach—scaffolding should be tailored to the specific task for optimal results. Designing effective scaffolding is as much an art as it is engineering, and it’s a key skill for agentic developers.
Conclusion
Series
- Agent-001 Part-1
- Agent-001 Part-2
- Agent-001 Part-3

Thank you for joining me on this three-part journey into building agentic systems with LLMs. In the first post, we explored the foundational problem and learned how to integrate an LLM into a script to process and analyze data. The second part focused on guiding the LLM to produce structured outputs and demonstrated how to automate actions based on those outputs, laying the groundwork for more complex workflows. In this final installment, we delved into the agentic model, where the LLM is empowered to autonomously select and invoke tools, orchestrated through an agentic loop.
Throughout the series, we covered key concepts such as tool creation, prompt engineering, exposing tool definitions to the LLM, and managing the agentic loop for autonomous decision-making. By combining these elements, you can build flexible, powerful agents capable of handling a wide range of tasks with minimal intervention.
I hope this series has provided you with both the technical know-how and the inspiration to experiment with agentic architectures in your own projects. Thank you for reading, and best of luck on your agentic endeavors—may your agents be resourceful, reliable, and always ready for the next challenge!
Code
agent.py
agent.py (full listing, lines 1-489)