AI, Code, or Human — Pick the Right Tool

If all you have is a hammer, everything looks like a nail. Right now, a lot of professionals are anxious about falling behind in what I'll call the AI productivity craze. I see people rushing to apply AI to any new use case. It's hard to overstate how amazing LLMs are and how quickly the models and tooling around them are improving. Even so, AI is not, and will never be, the right solution for every problem. It's best to have other tools in your metaphorical toolbelt.

The way I see it, there are three main ways to get work done: manually, with a human involved (think of an analyst building a model in Excel); via deterministic code running on CPUs (think Python, Java, etc.); and via AI agents running non-deterministically on GPUs. Each has a lane. Mixing them up costs you money, time, or both. Below is a high-level guide on when to use which. I'm sure this guidance will change — and fast.


AI — for the messy stuff

Use AI when the input or output is unstructured. Images, video, or large bodies of text. When a user inputs freeform text, you can't write a clean if-statement to handle every case. Similarly, if you want output that's more flexible than a template, use AI.

AI is also the right call when the output needs judgment, synthesis, or flexibility. Summarizing a document. Extracting meaning from a noisy dataset. Classifying support tickets. Anything where "it depends" is a legitimate answer.

Signal: if an experienced human would still need to think for a bit before completing the task, AI is probably the right call.

Example: a SaaS company routes inbound support tickets. Some are straightforward requests to increase a quota, filled out via a form with preset options — deterministic, handle with code. But the ones that say "this feature seems buggy" or "the product isn't working the way I expected" need interpretation. AI reads the ticket, classifies intent, hypothesizes potential issues, validates until it finds the root cause, and provides the customer with potential resolution paths.
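That split can be sketched as a thin dispatcher: deterministic code handles the preset form actions, and anything freeform is handed off to a model. The field names and the `classify` callback here are hypothetical placeholders, not the company's actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Ticket:
    body: str
    form_action: Optional[str] = None  # set when filed via a preset form option

def route_ticket(ticket: Ticket, classify: Callable[[str], str]) -> str:
    # Deterministic path: a preset form action needs no interpretation.
    if ticket.form_action == "increase_quota":
        return "quota_workflow"
    # Messy path: freeform text needs judgment, so hand it to the model.
    return classify(ticket.body)

# Stub classifier standing in for a real LLM call.
intent = route_ticket(Ticket(body="this feature seems buggy"),
                      classify=lambda text: "bug_report")
```

The cheap, testable branch runs first; the model only sees tickets that code can't decide.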

Expensive vs. cheap models

Not all AI is the same — and the cost difference is real. A frontier model like Claude Opus or GPT-4o might cost 10-50x more per call than a smaller, faster model. That spread matters at scale and will matter more as token subsidies decrease.

Use a slow, expensive model when the task requires deep reasoning, multi-step logic, or nuanced judgment. Writing a due diligence summary. Analyzing a legal clause. Generating a complex data pipeline from a vague spec.

Use a fast, cheap model when the task is simpler — classification, extraction, or short-form generation. The output quality difference is often negligible for these tasks and the latency, cost, and energy savings are significant.

The rule: match model capability to task complexity. Defaulting to the best model for everything is like using a tank to bring the kids to soccer practice — fun, exciting, but really not necessary.
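The rule reduces to a lookup. A minimal sketch, assuming a made-up two-tier lineup; the model names and task buckets are illustrative, not a real pricing table.

```python
# Hypothetical task buckets that a cheap tier handles well.
CHEAP_TASKS = {"classification", "extraction", "short_generation"}

def pick_model(task_type: str) -> str:
    # Match model capability to task complexity: small tier for simple,
    # well-bounded tasks; frontier tier for deep reasoning.
    if task_type in CHEAP_TASKS:
        return "small-fast-model"
    return "frontier-model"
```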

Also consider running batch LLM inference. Batching gets you a much lower price and helps your friendly cloud providers manage peak load. Users should care about that because better cloud utilization leads to lower operating costs, which in theory leads to lower prices.
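Client-side, batching can be as simple as queueing prompts and flushing them in one call. The `send_batch` callback below stands in for whatever batch endpoint your provider exposes; this is a sketch of the pattern, not a real client.

```python
from typing import Callable, List

class BatchQueue:
    """Accumulate prompts and submit them as one batch instead of N calls."""

    def __init__(self, send_batch: Callable[[List[str]], List[str]],
                 max_size: int = 100):
        self.send_batch = send_batch  # hypothetical batch-endpoint wrapper
        self.max_size = max_size
        self.pending: List[str] = []
        self.results: List[str] = []

    def add(self, prompt: str) -> None:
        self.pending.append(prompt)
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self) -> None:
        # One network call for the whole batch.
        if self.pending:
            self.results.extend(self.send_batch(self.pending))
            self.pending = []
```

A hundred classification prompts become one batch request instead of a hundred interactive ones.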


Code — for the repeatable stuff

Use code when the process is deterministic — same input, same output, every time. Parsing a CSV. Calculating a tax rate. Transforming a date format. Sending a webhook. Code is fast, cheap, and auditable. You can test it. You can debug it. It doesn't hallucinate.

The mistake I see constantly: using an LLM to do something a regex could've handled in 3 milliseconds for free. Every token costs something. Code doesn't. Just because the parameters change doesn't mean a Python script is too basic and you need a frontier LLM.

Signal: if you can write a unit test for it, write the code instead.

Example: a fintech app needs to convert currencies on every transaction. The exchange rate comes from an API — the math is fixed. That's a function, not a prompt. Running it through an LLM would be 100x slower and occasionally wrong.
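That function might look like the sketch below. `Decimal` avoids floating-point rounding surprises on money; the two-decimal, round-half-up rule is an assumption — use whatever your ledger requires.

```python
from decimal import Decimal, ROUND_HALF_UP

def convert(amount: Decimal, rate: Decimal) -> Decimal:
    # Deterministic: same inputs, same output, every time.
    # Testable, auditable, and it never hallucinates.
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

You can unit-test this exhaustively, which, per the signal above, is exactly why it should be code.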


Humans — for the rare stuff

Use humans when the stakes are high, the context is deeply variable, or genuine creativity is required. A sensitive customer escalation. A contract negotiation. A strategic decision with incomplete information. The kind of work where being wrong has real consequences and the edge cases are endless.

Humans are the most expensive resource in any system — not just in salary, but in coordination overhead, context-switching cost, and latency. They should be reserved for things that actually need them.

Signal: if you'd be uncomfortable with the output going live without a human reviewing it — a human should probably be doing it.

Example: a PE fund is closing a $50M acquisition. The AI flagged three technical risks in the target's codebase and drafted a summary. A human expert reviews it, catches a nuance around vendor lock-in the model missed, and adjusts the risk rating before it goes to the IC. AI did the legwork — the human made the call.
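That handoff can be encoded as a simple escalation gate. The thresholds and field names are made-up placeholders; the point is that the high-stakes path always ends at a person.

```python
def needs_human_review(ai_confidence: float, deal_value_usd: float) -> bool:
    # High stakes or low confidence: a person makes the final call.
    HIGH_STAKES_USD = 1_000_000  # illustrative threshold, not a recommendation
    MIN_CONFIDENCE = 0.90
    return deal_value_usd >= HIGH_STAKES_USD or ai_confidence < MIN_CONFIDENCE
```

A $50M acquisition trips the gate no matter how confident the model is.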


It's also an economics problem

When built and run under optimal conditions, code is cheapest, AI sits in the middle, and humans are most expensive.

Stack them accordingly. Automate what you can with code. Delegate to AI what code can't handle. Escalate to humans only what AI can't handle reliably. Every time you push something down the ladder, you're buying back time and margin. The economic model of how to get work done is changing — code used to be written only by expensive matcha-drinking engineers. As the cost to produce good-enough code decreases, code will proliferate.
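The ladder reads naturally as a dispatch order: try code first, AI second, humans last. A sketch, with hypothetical task attributes standing in for whatever triage criteria you actually use.

```python
from dataclasses import dataclass

@dataclass
class Task:
    deterministic: bool        # same input, same output?
    ai_handles_reliably: bool  # can a model do it without supervision?

def route_work(task: Task) -> str:
    # Push each task as far down the cost ladder as it will reliably go.
    if task.deterministic:
        return "code"
    if task.ai_handles_reliably:
        return "ai"
    return "human"
```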

The goal isn't to use AI everywhere — it's to use each tool where it actually wins. Get that right and the whole system runs cleaner.
