How I built an agent that sorts leads for me (case study)

In October 2025 it hit me that my morning routine - the one supposedly meant to "kick-start my day" - was 45 minutes of procrastination inside Gmail. I'd open the inbox, scan through about 30 messages, close four, snooze two, and leave the rest hanging with a vague feeling of "it'll sort itself out." Then I'd go make coffee and open Slack.

This wasn't working. So I tried to fix it. Here's what worked and, more importantly, what didn't - because before I got to a usable solution, I went down two dead ends worth mentioning.

Diagnosis: where the 45 minutes went

Before you automate anything, you need to know what's eating your time. I tracked my morning triage for a week.

20-40 emails per day. Mostly noise, but every now and then something lands that can't wait.
~12 minutes just reading and sorting into categories (lead / invoice / internal / spam).
~25 minutes drafting replies to leads - typically I had to pull up the CRM, check a proposal, look up pricing, write the email.
~8 minutes deciding what to defer and what to handle today.

Total: 45 minutes and a head full of context I'd throw away moments later. That's a tax nobody ever puts on an invoice, but everyone pays it.

Attempt #1: Gmail rules

First instinct: rules. Labels, filters, auto-replies. The classics.

I spent a Sunday evening setting up about 15 rules and workflows. Rule: "if the sender domain is client.cz, apply the Client label." Rule: "if the subject contains 'invoice,' move to the Invoices folder." Rule: "if the body mentions a quote, send me a notification."

It worked for one week. Then it started falling apart.

A client emailed from Gmail, not their company domain - the filter missed it.
The subject "Service proposal" matched everything - including our own cold outreach that colleagues forwarded.
"Question about an invoice" vs. "Invoice attached" - the rule saw no difference. An important question vanished into the Invoices folder.

After a month I had 30 rules, half of them disabled, and still 45 minutes a day. Rules are like stop signs at an intersection. They work as long as the driver knows what they mean and the sign is exactly where it should be. The moment an exception shows up, it's chaos and you're on the wrong side of it.
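The invoice failure is easy to reproduce. Below is a minimal sketch in Python, not Gmail's actual filter engine: the keyword_rule function and the label names are made up for illustration. It shows why a subject-keyword rule can't tell a question from an attachment.

```python
def keyword_rule(subject: str) -> str:
    """Hypothetical stand-in for a subject-keyword filter."""
    # A keyword rule only checks whether a word is present, not what the sender wants.
    if "invoice" in subject.lower():
        return "Invoices"
    return "Inbox"


subjects = ["Invoice attached", "Question about an invoice"]
labels = {subject: keyword_rule(subject) for subject in subjects}
print(labels)
```

Both subjects end up under the same label, which is exactly how the important question got buried.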

Attempt #2: ChatGPT by hand

I tried getting help from ChatGPT. Copy-paste the email, "category and suggested reply, please." Functional, but absurdly manual. Copy a message here, read the output there, paste the reply back. After a week I realized I enjoyed this even less than the original routine, because every email now meant two tools and three extra clicks.

The takeaway was key for me: ChatGPT is useful, but it wants you to use it. It does nothing until you come to it. I was looking for something that does its job even when I'm not sitting there.

If you want more on why ChatGPT (a chatbot) isn't the same thing as an agent, I wrote a separate article about that. Here I'll jump straight to the solution.

Attempt #3: the agent

I built an agent that runs on a small server. Every hour it works through steps 1-4; once a day it sends the summary in step 5:

Step 1 - Pull new emails

It connects to my inbox (via an app password - for the agent it's like a key to a mailbox). It downloads every email it hasn't processed yet.
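For the technically curious, here is a rough sketch of this step using Python's standard imaplib and email modules. The function names and the idea of keeping a set of already-processed UIDs are assumptions for illustration, not a copy of the production script.

```python
import email
import imaplib
from email.message import Message


def connect(host: str, user: str, app_password: str) -> imaplib.IMAP4_SSL:
    """Log in and open the inbox read-only; host and credentials are placeholders."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, app_password)
    conn.select("INBOX", readonly=True)
    return conn


def fetch_unprocessed(conn, processed_uids: set[str]) -> dict[str, Message]:
    """Return {uid: parsed message} for every inbox message not handled yet.

    `conn` is a logged-in IMAP connection with the inbox selected.
    `processed_uids` is whatever the agent stored during earlier runs.
    """
    status, data = conn.uid("search", None, "ALL")
    if status != "OK":
        raise RuntimeError("IMAP search failed")
    new_messages = {}
    for raw_uid in data[0].split():
        uid = raw_uid.decode()
        if uid in processed_uids:
            continue
        # BODY.PEEK[] fetches the whole message without marking it as read.
        status, parts = conn.uid("fetch", uid, "(BODY.PEEK[])")
        if status != "OK":
            continue
        # imaplib returns a list; the tuple entries carry (header, raw bytes).
        raw = next(part[1] for part in parts if isinstance(part, tuple))
        new_messages[uid] = email.message_from_bytes(raw)
    return new_messages
```

For Gmail the host would be imap.gmail.com. Tracking UIDs on the agent's side, rather than relying on the read/unread flag, means that opening an email on my phone doesn't hide it from the agent.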

Step 2 - Classify each email

It uses a language model that I initially trained on about 50 examples: "this is a lead, this is internal, this is spam." For each new message it returns a category and a brief rationale.
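One plausible way to wire this up is sketched below. It assumes the labeled examples are placed directly in the prompt (few-shot), which may differ from the real setup. The ask_model callable is a hypothetical wrapper around whatever sends the prompt to the language model, and the needs_review fallback is my own addition for illustration.

```python
import json
from typing import Callable

CATEGORIES = ("lead", "invoice", "internal", "spam")


def build_prompt(examples: list[tuple[str, str]], subject: str, body: str) -> str:
    """Assemble a few-shot prompt from labeled (email text, category) pairs."""
    lines = [
        "Classify the email into one of: " + ", ".join(CATEGORIES) + ".",
        'Answer with JSON only: {"category": "...", "rationale": "..."}',
        "",
    ]
    for text, category in examples:
        lines += [f"Email: {text}", f"Category: {category}", ""]
    lines += [f"Email: Subject: {subject}\n{body}", "Category:"]
    return "\n".join(lines)


def classify(
    ask_model: Callable[[str], str],
    examples: list[tuple[str, str]],
    subject: str,
    body: str,
) -> dict:
    """Ask the model, then validate the answer instead of trusting it blindly."""
    reply = ask_model(build_prompt(examples, subject, body))
    try:
        result = json.loads(reply)
        if result.get("category") in CATEGORIES:
            return {
                "category": result["category"],
                "rationale": result.get("rationale", ""),
            }
    except (json.JSONDecodeError, AttributeError):
        pass
    # Unparseable or unknown answers go to a human (assumed fallback category).
    return {"category": "needs_review", "rationale": "model output could not be parsed"}
```

Keeping the model call behind a plain function makes the step testable with a stub, without touching the API.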

Step 3 - Pull context for leads

If it's a lead, the agent reaches into my CRM (Pipedrive) - think of it as a contact book the agent flips through. It checks whether the client has inquired before, what the quote looked like, which product. If the client isn't in the CRM yet, it creates a new record.
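The find-or-create logic itself is small. The sketch below uses an in-memory stand-in instead of the real Pipedrive API, so every class and method name here (Contact, InMemoryCrm, find_by_email, create, lead_context) is hypothetical; a real client would make REST calls behind the same two methods.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    email: str
    name: str
    past_quotes: list[str] = field(default_factory=list)


class InMemoryCrm:
    """Hypothetical stand-in for the CRM; a real client would call its REST API."""

    def __init__(self) -> None:
        self._contacts: dict[str, Contact] = {}

    def find_by_email(self, email: str) -> Optional[Contact]:
        return self._contacts.get(email.lower())

    def create(self, email: str, name: str) -> Contact:
        contact = Contact(email=email.lower(), name=name)
        self._contacts[contact.email] = contact
        return contact


def lead_context(crm, sender_email: str, sender_name: str) -> tuple[Contact, bool]:
    """Return the contact plus a flag saying whether it was newly created."""
    contact = crm.find_by_email(sender_email)
    if contact is not None:
        return contact, False
    return crm.create(sender_email, sender_name), True


crm = InMemoryCrm()
first, created_first = lead_context(crm, "Jana@Example.com", "Jana")
second, created_second = lead_context(crm, "jana@example.com", "Jana")
print(created_first, created_second)
```

The second lookup finds the record created by the first one, so a returning client doesn't get duplicated just because the address was capitalized differently.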

Step 4 - Draft a reply

Based on the CRM context and standard pricing, it generates a draft reply. Not necessarily ready to send - ready for me to approve.
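A sketch of how such a draft request could be assembled, assuming the CRM history and the price list are passed in as plain Python values. The function names, the prompt wording, and the pending_approval status are illustrative, not taken from the real agent.

```python
def build_draft_prompt(
    email_body: str, past_quotes: list[str], price_list: dict[str, str]
) -> str:
    """Combine the incoming email, CRM history, and standard pricing into one prompt."""
    history = "\n".join(f"- {quote}" for quote in past_quotes) or "- none on record"
    prices = "\n".join(f"- {item}: {price}" for item, price in price_list.items())
    return (
        "Draft a reply to the email below. Use only the prices listed.\n"
        "If something is not covered, say we will follow up rather than guess.\n\n"
        f"Previous quotes:\n{history}\n\n"
        f"Standard pricing:\n{prices}\n\n"
        f"Email:\n{email_body}\n"
    )


def new_draft(lead_id: str, text: str) -> dict:
    """A draft is never sent automatically; it waits for a human decision."""
    return {"lead_id": lead_id, "text": text, "status": "pending_approval"}
```

The important design choice is the status field: the agent can write as much as it likes, but nothing leaves the outbox without a click from me.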

Step 5 - Send me a daily summary

Every morning at 8:30 I get a single email: "today: 14 messages, 6 leads, 3 invoices, 5 other." Each lead has a link to the draft. I click Approve / Edit / Defer. No writing from scratch.
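The headline of that email is just a count per category. A minimal sketch, assuming each processed message carries a category string and that everything other than leads and invoices is lumped into "other":

```python
from collections import Counter


def summary_line(categories: list[str]) -> str:
    """Build the one-line headline of the morning digest."""
    counts = Counter(categories)
    leads = counts.pop("lead", 0)
    invoices = counts.pop("invoice", 0)
    other = sum(counts.values())  # whatever is left: internal, spam, ...
    total = leads + invoices + other
    return f"today: {total} messages, {leads} leads, {invoices} invoices, {other} other"


print(summary_line(["lead"] * 6 + ["invoice"] * 3 + ["internal"] * 4 + ["spam"]))
# prints: today: 14 messages, 6 leads, 3 invoices, 5 other
```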

Architecture - the bird's-eye view

For the technical reader: the agent is a Python script running on a Hetzner Cloud server (cron every hour). It talks to the inbox via IMAP, to the CRM via its REST API, and to the language model via the Claude API. Logs go into a SQLite database; the daily summary goes out via SendGrid. For the non-technical reader: one small computer on the internet that runs four steps every hour and sends me an email each morning.
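To make the logging piece concrete, here is a sketch of a SQLite log that also guards against processing the same email twice when cron fires again. The table name, the column names, and the script path in the comment are placeholders, not the real schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hourly crontab entry (the path is a placeholder):
#   0 * * * * /usr/bin/python3 /path/to/agent.py


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Create the log table if it doesn't exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS processed (
               uid TEXT PRIMARY KEY,
               category TEXT NOT NULL,
               processed_at TEXT NOT NULL
           )"""
    )
    return db


def record(db: sqlite3.Connection, uid: str, category: str) -> bool:
    """Log one message; returns False if this uid was already handled."""
    cur = db.execute(
        "INSERT OR IGNORE INTO processed VALUES (?, ?, ?)",
        (uid, category, datetime.now(timezone.utc).isoformat()),
    )
    db.commit()
    return cur.rowcount == 1


db = open_log()
print(record(db, "101", "lead"))  # True: first time this uid is seen
print(record(db, "101", "lead"))  # False: already logged, nothing inserted
```

Using the message UID as the primary key keeps each hourly run idempotent: if a run crashes halfway, the next one simply picks up whatever isn't in the table yet.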

I built the whole thing in a weekend and two evenings. Not because I'm fast - because each piece is trivial. An agent isn't rocket science. It's Lego assembled from blocks that anyone can use today.

Results after two months

Metric	Before	After
Morning triage time	45 min/day	8 min/day
Missed leads (over 24 h without reply)	~2 per week	0
Time to first reply on a lead	4-18 hours	1.5 hours on average
Mental load of "what's left in the inbox"	constant	occasional

In hard numbers, that's about 3 hours a week back (37 minutes a day over a five-day week). In how it feels, it's more - because those 45 morning minutes in the inbox weren't just lost time, they were a context switch I kept paying for afterwards. When you start your day spending 45 minutes in other people's heads (the email queue), it takes another hour to find your own. The agent gave me that back - and that's a value you can't count in hours.

What I'd do differently

A few lessons if I were starting now:

1. I'd start with the CRM, not Gmail

The longest part was teaching the agent what "active client," "VIP," and "haven't heard from in a while" mean. If my CRM had been set up better, the agent would have been up and running faster: probably the weekend alone, without the two extra evenings.

2. I'd start with a smaller scope

Sorting and notifications alone - no reply drafts. That would have delivered 70% of the value for 30% of the work. I'd add drafts in a second iteration, once I could see which category actually benefits from them.

3. I'd build a dashboard from day one

Instead of "checking the logs" to see what the agent was doing, I should have had a simple dashboard from the start - how many messages processed, how many classified correctly, where it asked for help. It took me a month to figure that out.
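Even a tiny version would have been enough. The sketch below computes those three numbers from a log table. The schema (predicted, corrected, asked_for_help) and the four sample rows are invented purely to make the example runnable; they are not real data from the agent.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE log ("
    "uid TEXT PRIMARY KEY, predicted TEXT, corrected TEXT, asked_for_help INTEGER)"
)
# Made-up sample rows; `corrected` is NULL when the category was accepted as-is.
db.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        ("1", "lead", None, 0),
        ("2", "invoice", "lead", 0),  # a human overrode the category
        ("3", "spam", None, 0),
        ("4", "lead", None, 1),       # the agent flagged this one for review
    ],
)


def dashboard(db: sqlite3.Connection) -> dict:
    """Return the three numbers: processed, classified correctly, asked for help."""
    total, correct, help_requests = db.execute(
        """SELECT COUNT(*),
                  SUM(corrected IS NULL),
                  SUM(asked_for_help)
           FROM log"""
    ).fetchone()
    return {"processed": total, "correct": correct, "asked_for_help": help_requests}


print(dashboard(db))
```

One query over the log is already a dashboard; the charts can come later.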

4. I'd feed corrections back to the agent more systematically

When a draft didn't feel right and I edited it by hand, that edit should have gone back into the training data. Instead, I spent two months correcting the same mistake across different emails because the agent didn't know about it.
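A feedback loop like that doesn't need much. Here is a sketch that appends every real edit to a JSON Lines file so it can be added to the examples later. The file format and the field names (email, rejected, accepted) are assumptions for illustration.

```python
import json
from pathlib import Path


def record_correction(
    store: Path, email_text: str, agent_output: str, human_output: str
) -> bool:
    """Append a correction as one JSON line; skip it if nothing actually changed."""
    if agent_output.strip() == human_output.strip():
        return False
    entry = {"email": email_text, "rejected": agent_output, "accepted": human_output}
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return True


def load_corrections(store: Path) -> list[dict]:
    """Read corrections back so they can be added to the examples the agent sees."""
    if not store.exists():
        return []
    with store.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling record_correction from the Edit path of the daily summary would have captured each fix once, instead of leaving me to repeat it by hand.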

What this means for your business

This case study isn't a template to copy. Your context will be different - different inbox, different message types, different CRM, different risk tolerance. But the principle holds: find a process where you're stuck on routine, describe it step by step, and test whether a new hire with a manual could do it. If you can write that manual, an agent can most likely handle the process.

If you want to find out whether you have a similar process worth handing off to an agent, book a free 15-minute call. Tell me how things work on your end, and I'll give you a straight answer: does it make sense to build now, or should you wait until there are fewer exceptions. No ceremony.

The agent gave me back most of those 45 minutes a day. For you, it might give back a whole person.