Building a Pluggable AI Brain: A Modern Chatbot Architecture
Building an AI chatbot is deceptively simple at first. You grab a library for Telegram or a web framework like Typebot, wire it up to an LLM, and you have a working prototype in an afternoon. But this initial velocity often hides a critical architectural flaw that leads to significant technical debt: platform lock-in.
I've been thinking about a more robust, scalable approach. What if we could build a central, intelligent "brain" once, and then simply "plug it in" to any frontend we choose—Telegram today, a web app tomorrow, maybe even a Slack bot next month?
This isn't a fantasy. It's achievable with a clean, decoupled architecture.
The Core Problem: The Platform Trap
The common approach is to write your backend logic to directly serve a specific frontend. If you're building for Telegram, your code is filled with logic for handling Telegram `Update` objects and calling the `sendMessage` API. If you're building for a web UI, your code is handling HTTP requests and returning JSON.
The problem is that this hard-wires your core business logic—your RAG implementation, your complex prompting, your very "secret sauce"—to a specific communication protocol. Migrating to a new platform means a significant rewrite. This is not a sustainable model.
The Architectural Shift: A Decoupled Brain
The solution is to enforce a strict separation of concerns. We think of our system not as two parts (frontend and backend), but as three:
- The Frontend: This is the "face" of our application. It could be Typebot for a structured web chat or Telegram for a native mobile experience. Its only job is to manage the UI and communicate with our brain. It is completely interchangeable.
- The Brain (The Backend): This is our protocol-agnostic, core logic service. It contains all the complex AI processing, but it has zero knowledge of whether it's talking to Telegram or a web browser.
- The Deployment Platform: The infrastructure that runs our brain. A modern PaaS like Railway is perfect here. It takes our code, deploys it, and gives it a stable, secure address on the internet.
The Secret Sauce: The Adapter Pattern
So, how does the Brain remain ignorant of the outside world while still communicating with it? The answer is the classic Adapter Pattern.
The idea is simple: you wouldn't rewire your laptop just to plug it into a foreign power outlet; you'd use an adapter. We do the same with our code.
The Brain defines a single, standardized, internal format for requests and responses. Then, for each frontend we want to support, we build a lightweight Adapter.
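What might that standardized internal format look like? Here is a minimal sketch in Python. The names (`ChatRequest`, `ChatResponse`) and fields are illustrative assumptions, not a fixed API; the point is that every adapter produces and consumes exactly this shape.

```python
# A minimal sketch of the Brain's internal contract. Every adapter
# translates into and out of these two types; the Brain sees nothing else.
from dataclasses import dataclass, field


@dataclass
class ChatRequest:
    user_id: str  # stable end-user identity, regardless of platform
    text: str     # the user's message as plain text
    metadata: dict = field(default_factory=dict)  # optional platform hints


@dataclass
class ChatResponse:
    text: str     # the Brain's reply as plain text
    metadata: dict = field(default_factory=dict)
```

Keeping the contract this small is deliberate: anything platform-specific (chat IDs, message IDs, reply markup) stays in the adapter, carried in `metadata` at most.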
An Adapter has two jobs:
- Translate Inbound: It takes a platform-specific request (like a Telegram webhook) and translates it into the Brain's standard internal format.
- Translate Outbound: It takes the Brain's standard internal response and translates it into the specific API calls or HTTP responses that the platform expects.
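Here is a sketch of those two translations for Telegram, using plain dicts for the internal format. The Telegram field names follow the Bot API's `Update` and `sendMessage` shapes; the echo reply is a stand-in for the real Brain call, which is not shown here.

```python
# Inbound: translate a Telegram webhook Update into the Brain's format.
def telegram_to_internal(update: dict) -> dict:
    message = update["message"]
    return {
        "user_id": str(message["from"]["id"]),
        "text": message.get("text", ""),
    }


# Outbound: translate the Brain's response into a sendMessage payload.
def internal_to_telegram(response: dict, chat_id: int) -> dict:
    return {"chat_id": chat_id, "text": response["text"]}


# Usage: a webhook handler glues the two translations around the Brain.
update = {"message": {"from": {"id": 42}, "chat": {"id": 42}, "text": "hi"}}
internal = telegram_to_internal(update)
reply = {"text": f"echo: {internal['text']}"}  # stand-in for the Brain's answer
payload = internal_to_telegram(reply, update["message"]["chat"]["id"])
```

A Typebot or Slack adapter would have the same two functions with different field mappings, and nothing else changes.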
This creates a clean, insulated core.
+----------------+      +---------------------+      +---------------------+
| Telegram       |----->| Telegram Adapter    |----->|                     |
+----------------+      +---------------------+      |     Core "Brain"    |
                                                     | (Protocol Agnostic) |
+----------------+      +---------------------+      | - RAG Logic         |
| Typebot Web UI |----->| Web Adapter         |----->| - LLM Calls         |
+----------------+      +---------------------+      +---------------------+
The core Brain only ever sees a standard request object. It doesn't care if that request originated from a Telegram user or a Typebot session. It does its work and returns a standard response, leaving the adapter to handle the platform-specific translation.
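In code, that insulation can be as simple as a single entry point. The sketch below is hypothetical: `retrieve_context` and `call_llm` are trivial stand-ins for the real RAG lookup and LLM API call, and the dict-based request shape is an assumption.

```python
def retrieve_context(user_id: str, text: str) -> str:
    # Hypothetical RAG step: look up this user's long-term memory.
    return f"(context for {user_id})"


def call_llm(prompt: str) -> str:
    # Hypothetical LLM call; a real service would hit a model API here.
    return f"answer based on: {prompt.splitlines()[0]}"


def handle(request: dict) -> dict:
    """The Brain's single entry point: standard dict in, standard dict out."""
    context = retrieve_context(request["user_id"], request["text"])
    answer = call_llm(prompt=f"{context}\n\nUser: {request['text']}")
    return {"text": answer}
```

Note that nothing in `handle` imports a Telegram library or a web framework; that is the whole point.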
A Note on Security & State
This model handles production concerns gracefully. Security becomes layered. The first layer is a static M2M (machine-to-machine) token to authenticate the frontend application itself. The second is a dynamic user token (like a JWT) passed inside the request body to identify the end-user for long-term memory access (RAG). The adapter is responsible for extracting these details and passing them to the core in a standardized way.
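The two auth layers can be sketched at the adapter boundary like this. This is an assumption-laden illustration: the shared secret is hard-coded for brevity, and returning the raw user token stands in for real JWT signature verification, which a production service must perform.

```python
import hmac

M2M_TOKEN = "frontend-shared-secret"  # static app-level credential (demo only)


def authenticate(headers: dict, body: dict) -> str:
    # Layer 1: is this request from a trusted frontend application at all?
    presented = headers.get("Authorization", "").removeprefix("Bearer ")
    if not hmac.compare_digest(presented, M2M_TOKEN):
        raise PermissionError("unknown frontend")
    # Layer 2: which end-user is this? (stand-in for JWT verification)
    user_token = body.get("user_token")
    if not user_token:
        raise PermissionError("missing user token")
    return user_token  # the Brain receives a verified user identity
```

`hmac.compare_digest` is used instead of `==` to avoid timing side channels when comparing secrets.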
Why This Architecture Matters
Adopting this decoupled, adapter-driven approach provides clear, immediate benefits:
- Flexibility: Adding a new frontend like Discord or Slack is no longer a rewrite. It's simply the work of creating a new adapter.
- Reusability: The most complex and valuable part of your system—the AI brain—is written once and reused everywhere.
- Testability: You can write comprehensive unit tests for your core logic without needing to mock complex web requests or external APIs. You test your brain in complete isolation.
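The benefits above can be made concrete with the testability point: because the core consumes only the internal format, a unit test is a plain function call. The echoing `handle` below is a trivial stand-in core, included so the example is self-contained.

```python
def handle(request: dict) -> dict:
    # Trivial stand-in for the Brain's core logic.
    return {"text": f"You said: {request['text']}"}


def test_handle_echoes_user_text():
    # No webhook payloads, no HTTP mocking, no platform SDKs required.
    response = handle({"user_id": "u1", "text": "hello"})
    assert response["text"] == "You said: hello"


test_handle_echoes_user_text()
```

The adapters, being thin and mostly mechanical, can be tested separately with a handful of fixture payloads per platform.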
This isn't just a theoretical exercise. It's a pragmatic blueprint for building serious, multi-platform AI applications that are designed to evolve, not break.