Why Plugin Kit?
A year ago I was building an AI agent platform inside Codelessly with around thirty different capabilities: auto-model routing, context injection, auto-retry with fallback agents, secret resolution, MCP server integrations, custom system prompts, live-tunable temperature and tools, per-session sandboxing. Every one of those capabilities had to be enable-able, override-able, settings-tunable, and hot-swappable at runtime. A user toggling the testing-agent off in their settings UI had to actually unwire it from the bus, free its resources, and let any higher-priority replacement take over without a restart.
State management did not prepare me for this. DI containers did not prepare me for this. What I needed was something that behaves like your IDE’s plugin and extension system, but in Dart. So I wrote one. It is now plugin_kit, the backbone of Codelessly’s agent platform.
DI container vs plugin runtime
A DI container answers the question “give me an instance of X.” It wires up a fixed graph at startup.
A plugin runtime answers a different question: “given this set of capabilities, this user’s settings, and this session’s scope, compose a coherent system and let me change my mind at any time.” The graph is dynamic. It responds to settings changes, plugin enable/disable, priority overrides, and session boundaries while the application is running.
Most apps do not need this
If your app has one HTTP client, one auth service, one analytics service, and a few screens that call them, use the boring thing. Instantiate the client. Register the service. Ship the app. Nobody gets a trophy for turning a todo app into a miniature operating system.
Plugin Kit starts making sense at the exact moment the word “just” starts lying to you:
- “Just add a setting to disable this provider.”
- “Just let enterprise customers override this behavior.”
- “Just let the admin dashboard turn this feature off without redeploying.”
- “Just experiment quickly with this implementation.”
- “Just make sure the original runs if this way fails.”
Each of those sounds small. One flag. One if statement. One extra service. But after a while, the app stops being a clean tree of dependencies and becomes a negotiation between features, settings, tenants, sessions, fallbacks, priorities, and runtime decisions. That is the point where a singleton is no longer the architecture; it is just the place where the architecture is hiding.
A worked escalation: a chat plugin and a teammate’s experiment
The original system prompt lives inside the chat plugin. It is boring at its best:
```dart
abstract class SystemPrompt {
  String build({required User user});
}

class DefaultSystemPrompt implements SystemPrompt {
  @override
  String build({required User user}) =>
      'You are a helpful assistant. The user is $user.';
}

class ChatPlugin extends SessionPlugin {
  @override
  PluginId get pluginId => const PluginId('chat');

  @override
  void register(ScopedServiceRegistry registry) {
    registry.registerSingleton<SystemPrompt>(
      const ServiceId('system_prompt'),
      () => DefaultSystemPrompt(),
    );
  }
}
```

A teammate wants to A/B a more aggressive prompt that injects the user’s recent activity. They do not want to touch the chat plugin. They do not want to add an if statement. They want their experiment to live as its own plugin, opt in for testing, and disappear cleanly when nobody wants it.
```dart
class ActivityAwarePlugin extends SessionPlugin {
  @override
  PluginId get pluginId => const PluginId('activity_aware');

  @override
  List<FeatureFlag> get featureFlags => const [FeatureFlag.experimental];

  @override
  void register(ScopedServiceRegistry registry) {
    registry.registerSingleton<SystemPrompt>(
      const ServiceId('system_prompt'),
      () => ActivityAwareSystemPrompt(),
      priority: Priority.elevated, // wins
    );
  }
}
```

Two plugins, one slot. The experimental plugin wins because it asked for higher priority. The chat plugin’s prompt is still registered, sitting at the lower priority, ready as a fallback.
The agent loop has not changed:
```dart
final prompt = session.resolve<SystemPrompt>(const ServiceId('system_prompt'));
final reply = await agent.sendMessage(prompt.build(user: myUser));
```

No conditional. No flag. No branch. Whatever is registered at the highest priority for that slot is what resolve returns. The call site does not know it is in an experiment.
Ship to production. A user pulls up the settings dialog and decides they want the original back. The settings UI does not need to know what these plugins do; it just hands the runtime a new settings JSON:
```dart
await runtime.updateSettings(
  const RuntimeSettings(
    plugins: {
      PluginId('activity_aware'): PluginConfig(enabled: false),
    },
  ),
);
```

The experimental plugin detaches. Its instance is torn down. The chat plugin’s original prompt becomes the winner again, because nothing higher is fighting for the slot. The next message uses the original prompt. No reload, no rebuild of the session, no if (experimentalMode) flag scattered across the codebase. Flip enabled back to true and the experiment re-attaches.
A few things to notice. The chat plugin and the experiment plugin do not know about each other. Neither checks for the other. They both register into a slot named system_prompt; the registry decides who wins. The settings change was a statement of intent, not a script. You did not write code that said “when the experiment toggle goes off, restart the agent loop with the default prompt.” You handed the runtime a new picture of the world and it converged on that state.
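The resolution rule described above can be modelled in a few lines. This is a toy sketch, not plugin_kit’s implementation: Candidate, Slot, and buildPromptSlot are hypothetical names, priorities are plain integers, and strings stand in for SystemPrompt instances.

```dart
// Toy model of a prioritized slot. All names here are hypothetical and
// exist only to illustrate "highest enabled priority wins".
class Candidate<T> {
  Candidate(this.owner, this.priority, this.create);

  final String owner; // id of the plugin that registered this candidate
  final int priority; // higher number wins
  final T Function() create;
  bool enabled = true;
}

class Slot<T> {
  final List<Candidate<T>> _candidates = [];

  void register(String owner, int priority, T Function() create) {
    _candidates.add(Candidate<T>(owner, priority, create));
  }

  // Toggling an owner never removes its registration; it only changes
  // whether the candidate is eligible to win.
  void setEnabled(String owner, bool enabled) {
    for (final c in _candidates) {
      if (c.owner == owner) c.enabled = enabled;
    }
  }

  // Returns the value built by the highest-priority enabled candidate.
  T resolve() {
    Candidate<T>? best;
    for (final c in _candidates) {
      if (!c.enabled) continue;
      if (best == null || c.priority > best.priority) best = c;
    }
    if (best == null) throw StateError('No enabled candidate in slot');
    return best.create();
  }
}

// Mirrors the walkthrough: a default prompt and a higher-priority experiment.
Slot<String> buildPromptSlot() {
  final slot = Slot<String>();
  slot.register('chat', 0, () => 'default prompt');
  slot.register('activity_aware', 10, () => 'activity-aware prompt');
  return slot;
}
```

Calling setEnabled('activity_aware', false) makes resolve fall back to the 'chat' candidate, and re-enabling it restores the experiment. The caller of resolve is identical in both cases.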
Plugin Kit vs your state management library
A Flutter developer reading this far is probably thinking: this is yet another state management solution. A little. Plugin Kit has a registry, a typed event bus, session-scoped services, and a lifecycle. If you squint, it sounds like the job a Riverpod, Bloc, Provider, GetIt, or ChangeNotifier pile usually does.
The overlap is not the interesting part. The boundary is.
State management owns presentation state. Plugin Kit owns participation.
- A chat screen showing messages is presentation state. A plugin deciding whether it wants to enrich an outgoing prompt is participation.
- A model selector showing the current model is presentation state. The list of available models changing because provider plugins were enabled or disabled is participation.
- A loading spinner is presentation state. A fallback provider taking over because the first one is unavailable is participation.
Plugin Kit can sit beside Provider, Riverpod, Bloc, or GetIt instead of replacing them. Let your state library keep doing widget-facing work; let Plugin Kit own the runtime protocol underneath. Do not move everything into Plugin Kit if you’re not starting from scratch. Move the seams where participation matters.
Design decisions
Domain-agnostic core. The runtime knows nothing about Flutter, servers, editors, agents, or any specific UI framework. Client code extends PluginContext, Plugin, and PluginService to model domain concepts. The same runtime drops into a CLI, a server, or a Flutter app unchanged.
Competing implementations are a first-class idea. Most DI containers treat override as a special case. Plugin Kit treats it as the normal case. That is why priority is on the registration, not on the slot: the slot is just a name, and any plugin that registers there is a candidate to win.
Metadata without instantiation. Capabilities live on registration wrappers, so host code can discover what a service claims to support without paying construction cost. This is how dynamic settings UIs, extension manifests, and slot inspectors stay fast.
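One way to picture this is a wrapper that keeps the capability set next to a lazy factory. The sketch below is an assumption about the general shape, not plugin_kit’s actual types: Registration, supporting, ExpensiveModelClient, and the capability strings are all hypothetical.

```dart
// Hypothetical registration wrapper: metadata is plain data, while the
// service itself is only built on first access.
class Registration<T extends Object> {
  Registration({required this.capabilities, required T Function() create})
      : _create = create;

  final Set<String> capabilities; // readable without constructing the service
  final T Function() _create;
  T? _instance;

  bool get isInstantiated => _instance != null;

  // Builds the service once, on demand, then reuses it.
  T get instance => _instance ??= _create();
}

// Counts constructions so the laziness is observable.
int constructions = 0;

class ExpensiveModelClient {
  ExpensiveModelClient() {
    constructions++;
  }
}

// Lists the service ids that claim a capability, touching only metadata.
List<String> supporting(
  Map<String, Registration<Object>> registry,
  String capability,
) {
  return [
    for (final entry in registry.entries)
      if (entry.value.capabilities.contains(capability)) entry.key,
  ];
}
```

A settings UI built on something like supporting can render its list of options while constructions stays at zero. The cost is paid only when a caller finally reads instance.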
Settings as reconciliation input, not instruction. Handing the runtime a new RuntimeSettings is a statement of intent (“this is the world as it should be”), not a script. The runtime figures out what has to attach, detach, or reconfigure to converge on the new state. Plugin authors write lifecycle hooks; they do not write “on settings change” handlers for every configurable knob.
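The reconciliation idea can be sketched as a pure diff between the desired state and the current one. This is a simplified illustration, not how plugin_kit computes its plan: Plan and reconcile are hypothetical names, plugins are reduced to string ids, and the sketch assumes a plugin is enabled unless the settings say otherwise.

```dart
// Hypothetical result of a reconciliation pass: what to attach, what to detach.
class Plan {
  Plan(this.attach, this.detach);

  final Set<String> attach;
  final Set<String> detach;
}

// Computes the work needed to move from `active` to the state described by
// `desired`. Assumption: plugins missing from `desired` default to enabled.
Plan reconcile(
  Set<String> active,
  Map<String, bool> desired,
  Set<String> known,
) {
  final wanted = {
    for (final id in known)
      if (desired[id] ?? true) id,
  };
  return Plan(wanted.difference(active), active.difference(wanted));
}
```

With 'chat' and 'activity_aware' both active, passing {'activity_aware': false} yields a plan that detaches only 'activity_aware'. Passing the same settings a second time yields an empty plan, which is what makes the input a description of state rather than a list of commands.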
Errors surface. The bus does not swallow handler exceptions. The runtime’s lifecycle phases aggregate failures and throw PluginLifecycleException, naming the phase that failed. Silent failure in a plugin system is a debugging nightmare a year later.
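The aggregate-then-throw behaviour can be illustrated with a small stand-in. PhaseException and runPhase below are hypothetical and are not the library’s PluginLifecycleException or its lifecycle code; the sketch only shows the pattern of running every hook, collecting failures, and raising one error that names the phase.

```dart
// Hypothetical stand-in for an aggregated lifecycle error.
class PhaseException implements Exception {
  PhaseException(this.phase, this.failures);

  final String phase; // lifecycle phase that failed, e.g. "attach"
  final Map<String, Object> failures; // plugin id -> error it threw

  @override
  String toString() {
    final ids = failures.keys.join(', ');
    return 'Phase $phase failed for: $ids';
  }
}

// Runs every hook even if an earlier one throws, then reports all failures
// together instead of swallowing them or stopping at the first.
void runPhase(String phase, Map<String, void Function()> hooks) {
  final failures = <String, Object>{};
  hooks.forEach((id, hook) {
    try {
      hook();
    } catch (error) {
      failures[id] = error;
    }
  });
  if (failures.isNotEmpty) throw PhaseException(phase, failures);
}
```

Collecting before throwing means one broken plugin does not hide a second broken plugin behind it, and the caller still gets a single exception to handle.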