The Challenge
Teams make commitments in Slack every day: "I'll send that by Friday", "Let me review the PR", "I'll set up the meeting". These promises get buried in message threads and forgotten. The founder wanted an AI system that could detect these commitments automatically, track them, and hold people accountable. No existing tool did this well.
Discovery Phase
Before writing a single line of code, I ran a paid discovery phase. Analyzed 70+ competing tools across 8 categories (task management, communication platforms, AI meeting tools, workload management, AI copilots, OKR tools, email management). Mapped the competitive landscape, assessed technical feasibility, and delivered a comprehensive report with architecture recommendations. This gave the founder confidence to invest in the full build.
What I Built
The MVP included: Slack OAuth with multi-workspace support, real-time event ingestion from Slack channels and threads, an AI detection pipeline using GPT models with carefully engineered system prompts, an LLM-as-a-Judge evaluation layer where a second AI call independently validates each detection, a web-based evaluation tool for the founder to test detection accuracy, and full deployment on AWS EC2 with PostgreSQL.
The AI Pipeline
The detection system goes beyond simple keyword matching. It handles sarcasm ("I'll solve world hunger and teleport pizzas"), vague language, multilingual messages (German, French, Spanish), conditional statements, passive-aggressive tones, and ambiguous short replies. I built a stress-test suite of 46 edge cases across 8 categories to validate accuracy. The system achieved 87% accuracy across all edge cases.
Outcome
Delivered M1 in 10 days from discovery start. The founder was able to demo the product to potential customers immediately. The evaluation tool let them test detection quality in real-time. An open-source testing framework (pytest-eval) came out of this project as a side product. The engagement expanded from the initial €500 discovery to a €15K multi-milestone contract.
Tech stack
Need something similar?
Every project starts with a conversation. Tell me what you need, I tell you if I can help.