* feat(terminal): add interactive web terminal for bot containers
Add WebSocket-based terminal endpoint (/container/terminal/ws) that
provides a full PTY shell session inside the bot's MCP container.
Extend the gRPC proto with pty and resize fields, implement PTY exec
on the container side using creack/pty, and add an xterm.js-based
terminal component in the frontend bot detail page.
* chore: add /mcp in .gitignore
* feat(terminal): add multi-tab support, localStorage cache, and reactivity fixes
- Support unlimited terminal tabs with add/close/switch
- Cache terminal content to localStorage via SerializeAddon for session persistence
- Use shallowReactive for tab objects to ensure status updates trigger UI reactivity
- Fix listener leak by tracking and disposing onData/onResize on reconnect
- Fix bottom clipping by using inset offsets instead of padding
* fix(containerd): prevent silent network failures from leaving containers unreachable
Container network setup failures were silently swallowed at multiple
points in the call chain, leaving containers in a "running but
unreachable" ghost state. This patch closes every silent-failure path:
- setupCNINetwork: return error when CNI yields no usable IP
- Manager.Start: roll back container when IP is empty instead of
returning success
- ensureContainerAndTask: extract setupNetworkOrFail with 1 retry,
propagate error to callers
- ReconcileContainers: stop reporting "healthy" when network setup fails
- recoverContainerIP: retry up to 2 times with backoff for transient
CNI failures (IPAM lock contention, etc.)
- gRPC Pool: evict connections stuck in Connecting state for >30s
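The retry behaviour described for recoverContainerIP can be sketched roughly as follows. This is a minimal illustration, not the project's code: retryWithBackoff, recoverIP and flakySetup are hypothetical names, and the attempt count (one initial try plus two retries) and the 10ms base delay are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff calls fn up to attempts times and doubles the wait after
// each failure. An empty IP is treated as a failure, not as success.
func retryWithBackoff(ctx context.Context, attempts int, base time.Duration, fn func() (string, error)) (string, error) {
	var lastErr error
	delay := base
	for i := 0; i < attempts; i++ {
		ip, err := fn()
		if err == nil && ip != "" {
			return ip, nil
		}
		if err == nil {
			err = errors.New("network setup returned no usable IP")
		}
		lastErr = err
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(delay):
		}
		delay *= 2
	}
	return "", fmt.Errorf("network setup failed after %d attempts: %w", attempts, lastErr)
}

// recoverIP is a hypothetical wrapper: one initial attempt plus two retries.
func recoverIP(setup func() (string, error)) (string, error) {
	return retryWithBackoff(context.Background(), 3, 10*time.Millisecond, setup)
}

// flakySetup simulates a CNI call that fails a fixed number of times.
func flakySetup(failures int) func() (string, error) {
	return func() (string, error) {
		if failures > 0 {
			failures--
			return "", errors.New("IPAM lock contention (simulated)")
		}
		return "10.88.0.5", nil
	}
}

func main() {
	ip, err := recoverIP(flakySetup(2))
	fmt.Println(ip, err)
}
```

Treating an empty IP as an error is the point of the sketch: the caller never sees a "success" that leaves the container unreachable.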
* fix(containerd): clean stale cni0 bridge on startup to prevent MAC error
After a Docker container restart, the cni0 bridge interface can linger
with a zeroed MAC (00:00:00:00:00:00) and DOWN state. The CNI bridge
plugin then fails with "could not set bridge's mac: invalid argument",
making all MCP containers unreachable.
Two-layer fix:
- Entrypoint: delete cni0 and flush IPAM state before starting containerd
- Go: detect bridge MAC errors in setupCNINetwork and auto-delete cni0
before retrying, as defense-in-depth for runtime recovery
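A rough sketch of the Go-side recovery path, assuming the error is recognised by matching the message quoted above. The helper names (isBridgeMACError, deleteBridge, setupWithBridgeRecovery) and the 5s timeout are hypothetical; only the error text and the `ip link delete` approach come from this description.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// errSampleBridgeMAC imitates the CNI bridge plugin failure for the demo.
var errSampleBridgeMAC = errors.New(`plugin type="bridge" failed: could not set bridge's mac: invalid argument`)

// isBridgeMACError reports whether err looks like the stale-bridge failure.
func isBridgeMACError(err error) bool {
	return err != nil && strings.Contains(err.Error(), "could not set bridge's mac")
}

// deleteBridge removes a bridge interface. exec.CommandContext ties the
// child process to ctx so it cannot outlive the timeout.
func deleteBridge(ctx context.Context, name string) error {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	out, err := exec.CommandContext(ctx, "ip", "link", "delete", name).CombinedOutput()
	if err != nil {
		return fmt.Errorf("delete %s: %w (%s)", name, err, strings.TrimSpace(string(out)))
	}
	return nil
}

// setupWithBridgeRecovery runs setup once, and if it fails with the bridge
// MAC error, runs cleanup and retries setup exactly one more time.
func setupWithBridgeRecovery(setup func() error, cleanup func() error) error {
	err := setup()
	if !isBridgeMACError(err) {
		return err
	}
	if cerr := cleanup(); cerr != nil {
		return errors.Join(err, cerr)
	}
	return setup()
}

func main() {
	calls := 0
	err := setupWithBridgeRecovery(
		func() error {
			calls++
			if calls == 1 {
				return errSampleBridgeMAC
			}
			return nil
		},
		// A real caller would delete the bridge here, for example
		// deleteBridge(context.Background(), "cni0").
		func() error { return nil },
	)
	fmt.Println(calls, err)
}
```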
* fix(containerd): use exec.CommandContext to satisfy noctx linter
* feat(container): add explicit data workflows and snapshot rollback
Make container upgrades and recreation data-safe by adding explicit preserve, export, import, restore, and rollback flows across the backend, SDK, and web UI.
* fix(container): resolve go lint issues
Fix formatting and lint violations introduced by the container data workflow changes so the Go CI lint job passes cleanly.
Replace the host bind-mount + containerd exec approach with a per-bot
in-container gRPC server (ContainerService, port 9090). All file I/O,
exec, and MCP stdio sessions now go through gRPC instead of running
shell commands or reading host-mounted directories.
Architecture changes:
- cmd/mcp: rewritten as a gRPC server (ContainerService) with full
file and exec API (ReadFile, WriteFile, ListDir, ReadRaw, WriteRaw,
Exec, Stat, Mkdir, Rename, DeleteFile)
- internal/mcp/mcpcontainer: protobuf definitions and generated stubs
- internal/mcp/mcpclient: gRPC client wrapper with connection pool
(Pool) and Provider interface for dependency injection
- mcp.Manager: add per-bot IP cache, gRPC connection pool, and
SetContainerIP/MCPClient methods; remove DataDir/Exec helpers
- containerd.Service: remove ExecTask/ExecTaskStreaming; network setup
now returns NetworkResult{IP} for pool routing
- internal/fs/service.go: deleted (replaced by mcpclient)
- handlers/fs.go: deleted; MCP stdio session logic moved to mcp_stdio.go
- container provider Executor: all tools (read/write/list/edit/exec)
now call gRPC client instead of running shell via exec
- storefs, containerfs, media, skills, memory: all I/O ported to
mcpclient.Provider
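To illustrate how a per-bot IP cache and connection pool might fit together, here is a simplified sketch. It is not the mcpclient.Pool implementation: Conn stands in for a gRPC client connection so that the example needs no third-party imports, and the method bodies are assumptions. Only the SetContainerIP name and port 9090 are taken from the description above.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// Conn stands in for *grpc.ClientConn in this sketch.
type Conn interface{ Close() error }

type Pool struct {
	mu    sync.Mutex
	ips   map[string]string // botID -> container IP
	conns map[string]Conn   // botID -> cached connection
	dial  func(addr string) (Conn, error)
}

func NewPool(dial func(addr string) (Conn, error)) *Pool {
	return &Pool{ips: map[string]string{}, conns: map[string]Conn{}, dial: dial}
}

// SetContainerIP records the bot's IP and evicts the cached connection
// whenever the IP changes, so callers never reuse a stale connection.
func (p *Pool) SetContainerIP(botID, ip string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.ips[botID] == ip {
		return
	}
	p.ips[botID] = ip
	if c, ok := p.conns[botID]; ok {
		c.Close()
		delete(p.conns, botID)
	}
}

// Get returns the cached connection or dials the bot's container.
// Dialing under the lock keeps the sketch short; a real pool may not.
func (p *Pool) Get(botID string) (Conn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.conns[botID]; ok {
		return c, nil
	}
	ip := p.ips[botID]
	if ip == "" {
		return nil, fmt.Errorf("no IP known for bot %s", botID)
	}
	c, err := p.dial(net.JoinHostPort(ip, "9090"))
	if err != nil {
		return nil, err
	}
	p.conns[botID] = c
	return c, nil
}

// fakeConn records its address and whether it was closed.
type fakeConn struct {
	addr   string
	closed bool
}

func (f *fakeConn) Close() error { f.closed = true; return nil }

func main() {
	p := NewPool(func(addr string) (Conn, error) { return &fakeConn{addr: addr}, nil })
	p.SetContainerIP("bot-1", "10.88.0.2")
	c, err := p.Get("bot-1")
	fmt.Println(c.(*fakeConn).addr, err)
}
```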
Database:
- migration 0022: drop host_path column from containers table
One-time data migration:
- migrateBindMountData: on first Start() after upgrade, copies old
bind-mount data into the container via gRPC, then renames src dir
to prevent re-migration; runs in background goroutine
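The copy-then-rename idea can be sketched as below, under the assumption that the source directory is renamed only when every entry was copied. migrateDir, the ".migrated" suffix, and the write callback (standing in for the gRPC write into the container) are hypothetical; walk errors are counted as failures so that a partially walked tree is left in place for a later attempt.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// migrateDir hands every regular file under src to write and renames src
// to src+".migrated" only if nothing failed along the way.
func migrateDir(src string, write func(rel string, data []byte) error) (int, error) {
	failed := 0
	walkErr := filepath.WalkDir(src, func(path string, d fs.DirEntry, werr error) error {
		if werr != nil {
			failed++ // unreadable entry: count it and keep walking
			return nil
		}
		if d.IsDir() {
			return nil
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			failed++
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			failed++
			return nil
		}
		if err := write(rel, data); err != nil {
			failed++
		}
		return nil
	})
	if walkErr != nil {
		return failed, walkErr
	}
	if failed > 0 {
		return failed, fmt.Errorf("%d entries failed; leaving %s in place", failed, src)
	}
	return 0, os.Rename(src, src+".migrated")
}

// runDemo builds a small tree in a temp dir, migrates it, and reports
// whether the source directory was renamed.
func runDemo(failWrites bool) (bool, error) {
	root, err := os.MkdirTemp("", "bindmount-*")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)
	src := filepath.Join(root, "data")
	if err := os.MkdirAll(filepath.Join(src, "sub"), 0o755); err != nil {
		return false, err
	}
	if err := os.WriteFile(filepath.Join(src, "sub", "a.txt"), []byte("hi"), 0o644); err != nil {
		return false, err
	}
	_, err = migrateDir(src, func(rel string, data []byte) error {
		if failWrites {
			return errors.New("simulated write failure")
		}
		return nil
	})
	_, statErr := os.Stat(src + ".migrated")
	return statErr == nil, err
}

func main() {
	renamed, err := runDemo(false)
	fmt.Println(renamed, err)
}
```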
Bug fixes:
- mcp_stdio: callRaw now returns full JSON-RPC envelope
{"jsonrpc","id","result"|"error"} matching protocol spec;
explicit "initialize" call now advances session init state to
prevent duplicate handshake on next non-initialize call
- mcpclient Pool: properly evict stale gRPC connection after snapshot
replace (container process recreated); use SetContainerIP instead
of direct map write so IP changes always evict pool entry
- migrateBindMountData: walkErr on directories now counted as failure
so partially-walked trees don't get incorrectly marked as migrated
- cmd/mcp/Dockerfile: removed dead file (docker/Dockerfile.mcp is the
canonical production build)
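For reference, the JSON-RPC 2.0 response envelope mentioned in the mcp_stdio fix has the following general shape. The Go types below are an illustrative sketch, not the project's own definitions; rpcEnvelope, rpcError, envelope and envelopeString are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// rpcEnvelope carries exactly one of Result or Error, as JSON-RPC 2.0 requires.
type rpcEnvelope struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  json.RawMessage `json:"result,omitempty"`
	Error   *rpcError       `json:"error,omitempty"`
}

// envelope wraps a raw result (or an error) in a full response object.
func envelope(id, result json.RawMessage, rpcErr *rpcError) ([]byte, error) {
	if len(id) == 0 {
		id = json.RawMessage("null")
	}
	env := rpcEnvelope{JSONRPC: "2.0", ID: id}
	if rpcErr != nil {
		env.Error = rpcErr
	} else {
		if len(result) == 0 {
			result = json.RawMessage("null") // success must still carry "result"
		}
		env.Result = result
	}
	return json.Marshal(env)
}

// envelopeString is a convenience wrapper for the demo; it returns ""
// when marshalling fails.
func envelopeString(id, result string, rpcErr *rpcError) string {
	b, err := envelope(json.RawMessage(id), json.RawMessage(result), rpcErr)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(envelopeString("1", `{"ok":true}`, nil))
	fmt.Println(envelopeString("2", "", &rpcError{Code: -32601, Message: "method not found"}))
}
```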
Tests:
- provider_test.go: restored with bufconn in-process gRPC mock
(fakeContainerService + staticProvider), 14 cases covering all 5
tools plus edge cases
- mcp_session_test.go: new, covers JSON-RPC envelope, init state
machine, pending cleanup on cancel/close, readLoop cancel
- storefs/service_test.go: restored (pure function roundtrip tests)
- Fix DeleteContainer FAILED_PRECONDITION by cleaning up stopped task
entries before container deletion
- Fix CreateSnapshot leaving container in broken state: commit turns
the active snapshot read-only, so the full cycle (stop → commit →
prepare → delete → recreate → start) is now applied consistently
- Use context.WithoutCancel for atomic container replacement sequences
to prevent cancelled HTTP requests from corrupting container state
- Use dctx for DB operations (recordSnapshotVersion/insertEvent) to
avoid orphan snapshots in containerd without matching DB records
- Restart task + network after snapshot replacement, fixing Exec after
CreateVersion where the container had no running task
- Extract replaceContainerSnapshot helper to deduplicate the prepare →
delete → recreate → start pattern across three call sites
- Move snapshot list data fetching into Manager.ListBotSnapshotData to
encapsulate per-container locking; remove exported LockBot method
- Use UnixNano for snapshot names to avoid second-precision collisions
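The context.WithoutCancel and UnixNano points above can be illustrated with a small sketch (Go 1.21 or later). replaceAtomically, snapshotName and the 30s deadline are assumptions made for the example; the dctx name follows the description above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// replaceAtomically runs all steps under a context that ignores cancellation
// of the incoming request, so a dropped HTTP connection cannot stop the
// sequence halfway. A separate deadline still bounds the total runtime.
func replaceAtomically(reqCtx context.Context, steps ...func(context.Context) error) error {
	dctx, cancel := context.WithTimeout(context.WithoutCancel(reqCtx), 30*time.Second)
	defer cancel()
	for i, step := range steps {
		if err := step(dctx); err != nil {
			return fmt.Errorf("step %d: %w", i, err)
		}
	}
	return nil
}

// snapshotName uses nanosecond precision so that two snapshots taken within
// the same second get different names.
func snapshotName(botID string, unixNano int64) string {
	return fmt.Sprintf("%s-%d", botID, unixNano)
}

// runWithCancelledRequest shows that every step still runs even though the
// request context was cancelled before the sequence began.
func runWithCancelledRequest() (int, error) {
	reqCtx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a client that disconnected
	ran := 0
	step := func(ctx context.Context) error {
		if err := ctx.Err(); err != nil {
			return err
		}
		ran++
		return nil
	}
	err := replaceAtomically(reqCtx, step, step, step)
	return ran, err
}

func main() {
	ran, err := runWithCancelledRequest()
	fmt.Println(ran, err)
	fmt.Println(snapshotName("bot-1", time.Now().UnixNano()))
}
```

Values stored in the request context remain visible through dctx, which is why WithoutCancel is preferable to starting from context.Background() here.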
* feat(devenv): add containerized development environment
Replace local-process dev workflow with a fully containerized stack
using docker compose. This enables consistent development across
machines without requiring local Go/Node toolchains or containerd.
- Add Dockerfile.server.dev with containerd + CNI networking support
- Add Dockerfile.web.dev for frontend dev server
- Add server-dev-entrypoint.sh for containerd lifecycle management
- Expand devenv/docker-compose.yml with server, agent, web, migrate
and deps services with proper health checks and dependency ordering
- Update app.dev.toml to use container service names instead of localhost
- Refactor mise.toml dev tasks to drive docker compose workflow
- Support agent_gateway.server_addr in config package for inter-container
communication
* feat(devenv): add hot-reload and registry mirror support
- Add air for Go server hot-reload in dev containers
- Fix agent_gateway host in dev config (0.0.0.0 -> agent)
- Add configurable registry mirror for China mainland users
- Unify MCP image refs via MCPConfig.ImageRef()
* feat(scripts): add China mainland mirror option to install script
Prompt users to opt in to the memoh.cn mirror during installation,
which applies the docker-compose.cn.yml overlay and sets the registry
in config.toml for MCP image pulls.
A server container restart drops the cni0 bridge, veth devices, and
iptables masquerade rules in its network namespace, while MCP tasks keep
running in containerd. Reconcile and ensureContainerAndTask now re-run
SetupNetwork for already-running tasks so that outbound connectivity is
restored.
- Refactor RuntimeChecker interface: CheckKeys() + RunCheck() for
individual check dispatch instead of batch-all
- Add GET /bots/:id/checks/keys to list all available check keys
- Add GET /bots/:id/checks/run/:key to evaluate a single check
- MCP ConnectionChecker probes each active connection independently
via tools/list with 8s timeout
- Keep container checks (init/record/task/data_path) as fast builtins
- Graceful network setup failure in containerd handler (log warning
instead of killing task) for containerd-in-docker compatibility
In containerd-in-docker mode, SetupNetwork fails because netns is
unavailable. Previously this killed the task, making stdio MCP tools
unusable. Now the task continues running with a warning log, since
stdio MCP communication does not require networking.
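The CheckKeys() + RunCheck() split described in the list above might look something like this. The method signatures, CheckResult, staticChecker and the demo check keys are assumptions for illustration; only the two method names and the 8s timeout come from the description.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"time"
)

type CheckResult struct {
	Key    string
	OK     bool
	Detail string
}

// RuntimeChecker lists the available keys and evaluates one key at a time,
// instead of running every check in a single batch.
type RuntimeChecker interface {
	CheckKeys(ctx context.Context, botID string) []string
	RunCheck(ctx context.Context, botID, key string) CheckResult
}

type checkFunc func(ctx context.Context, botID string) error

// staticChecker is a hypothetical implementation backed by a fixed map.
type staticChecker struct {
	timeout time.Duration
	checks  map[string]checkFunc
}

var _ RuntimeChecker = (*staticChecker)(nil)

func (c *staticChecker) CheckKeys(_ context.Context, _ string) []string {
	keys := make([]string, 0, len(c.checks))
	for k := range c.checks {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// RunCheck gives the selected check its own deadline; the check function is
// expected to honour ctx.
func (c *staticChecker) RunCheck(ctx context.Context, botID, key string) CheckResult {
	fn, ok := c.checks[key]
	if !ok {
		return CheckResult{Key: key, Detail: "unknown check key"}
	}
	ctx, cancel := context.WithTimeout(ctx, c.timeout)
	defer cancel()
	if err := fn(ctx, botID); err != nil {
		return CheckResult{Key: key, Detail: err.Error()}
	}
	return CheckResult{Key: key, OK: true}
}

// runDemoCheck evaluates one key against two simulated checks.
func runDemoCheck(key string) CheckResult {
	c := &staticChecker{timeout: 8 * time.Second, checks: map[string]checkFunc{
		"task": func(context.Context, string) error { return nil },
		"mcp":  func(context.Context, string) error { return errors.New("tools/list failed (simulated)") },
	}}
	return c.RunCheck(context.Background(), "bot-1", key)
}

func main() {
	fmt.Printf("%+v\n", runDemoCheck("task"))
	fmt.Printf("%+v\n", runDemoCheck("mcp"))
}
```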
- Rename chat module to conversation with flow-based architecture
- Move channelidentities into channel/identities subpackage
- Add channel/route for routing logic
- Add message service with event hub
- Add MCP providers: container, directory, schedule
- Refactor Feishu/Telegram adapters with directory and stream support
- Add platform management page and channel badges in web UI
- Update database schema for conversations, messages and channel routes
- Add @memoh/shared package for cross-package type definitions
Split business executors from federation sources, and migrate the unified tool and federation transports to the official go-sdk for stricter MCP compliance and safer session lifecycle handling. Add targeted regression tests for accept compatibility, initialization retries, and pending cleanup, and include updated swagger artifacts.
Unify auth and chat identity semantics around user_id, enforce personal-bot owner-only authorization, and remove legacy compatibility branches in integration tests.
Align channel identity and bind flow across the backend and app-facing layers, including generated swagger artifacts and package lock updates, while excluding docs content changes.
- Add SetupBotContainer to ContainerLifecycle interface so containers
are automatically created when a bot is created, matching the existing
cleanup-on-delete behavior.
- Refactor schedule tools to use bot-scoped API paths and pass identity
context for proper authorization.
- Introduce dedicated trigger-schedule endpoint in chat resolver with
explicit schedule payload instead of reusing the generic chat path.
- Generate short-lived JWT tokens for schedule trigger callbacks with
resolved bot owner identity.
- Validate required parameters in NewLLMClient and NewOpenAIEmbedder
constructors, returning errors instead of falling back to defaults.
- Add unit tests for schedule token generation and chat resolver.
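As an illustration of the short-lived schedule token idea, the sketch below signs and verifies an HS256 JWT using only the standard library. It is not the project's token code, which may well use a JWT library. The claim layout, function names and the explicit nowUnix parameter are assumptions, and the verifier deliberately skips header inspection to stay short.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

type scheduleClaims struct {
	Sub string `json:"sub"` // resolved bot owner identity
	Exp int64  `json:"exp"` // expiry in Unix seconds
}

var b64 = base64.RawURLEncoding

// sign computes the HMAC-SHA256 signature over "header.payload".
func sign(secret []byte, signingInput string) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return mac.Sum(nil)
}

// issueScheduleToken returns a token that expires ttlSeconds after nowUnix.
func issueScheduleToken(secret []byte, ownerID string, nowUnix, ttlSeconds int64) (string, error) {
	body, err := json.Marshal(scheduleClaims{Sub: ownerID, Exp: nowUnix + ttlSeconds})
	if err != nil {
		return "", err
	}
	input := b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`)) + "." + b64.EncodeToString(body)
	return input + "." + b64.EncodeToString(sign(secret, input)), nil
}

// verifyScheduleToken checks the signature and expiry and returns the owner ID.
func verifyScheduleToken(secret []byte, token string, nowUnix int64) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("malformed token")
	}
	sig, err := b64.DecodeString(parts[2])
	if err != nil || !hmac.Equal(sig, sign(secret, parts[0]+"."+parts[1])) {
		return "", errors.New("bad signature")
	}
	body, err := b64.DecodeString(parts[1])
	if err != nil {
		return "", err
	}
	var c scheduleClaims
	if err := json.Unmarshal(body, &c); err != nil {
		return "", err
	}
	if nowUnix >= c.Exp {
		return "", errors.New("token expired")
	}
	return c.Sub, nil
}

func main() {
	now := time.Now().Unix()
	tok, err := issueScheduleToken([]byte("dev-secret"), "user-1", now, 60)
	if err != nil {
		fmt.Println("issue:", err)
		return
	}
	sub, err := verifyScheduleToken([]byte("dev-secret"), tok, now+30)
	fmt.Println(sub, err)
}
```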
- Refactor channel manager with support for Sender/Receiver interfaces and hot-swappable adapters.
- Implement identity routing and pre-authentication logic for inbound messages.
- Update database schema to support bot pre-auth keys and extended channel session metadata.
- Add Telegram and Feishu channel configuration and adapter enhancements.
- Update Swagger documentation and internal handlers for channel management.
Co-authored-by: Cursor <cursoragent@cursor.com>
Major changes:
1. Core Architecture: Decoupled Bots from Users. Bots now have independent lifecycles, member management (bot_members), and dedicated configurations.
2. Channel Gateway:
- Implemented a unified Channel Manager supporting Feishu, Telegram, and Local (Web/CLI) adapters.
- Added message processing pipeline to normalize interactions across different platforms.
- Introduced a Contact system for identity binding and guest access policies.
3. Database & Tooling:
- Consolidated all migrations into 0001_init with updated schema for bots, channels, and contacts.
- Optimized sqlc.yaml to automatically track the migrations directory.
4. Agent Enhancements:
- Introduced ToolContext to provide Agents with platform-aware execution capabilities (e.g., messaging, contact lookups).
- Added tool logging and fallback mechanisms for toolChoice execution.
5. UI & Docs: Updated frontend stores, UI components, and Swagger documentation to align with the new Bot-centric model.