Commit Graph

25 Commits

Author SHA1 Message Date
BBQ abbb14c59f fix(containerd): prevent silent network failures from leaving containers unreachable (#202)
* fix(containerd): prevent silent network failures from leaving containers unreachable

Container network setup failures were silently swallowed at multiple
points in the call chain, leaving containers in a "running but
unreachable" ghost state. This patch closes every silent-failure path:

- setupCNINetwork: return error when CNI yields no usable IP
- Manager.Start: roll back container when IP is empty instead of
  returning success
- ensureContainerAndTask: extract setupNetworkOrFail with 1 retry,
  propagate error to callers
- ReconcileContainers: stop reporting "healthy" when network setup fails
- recoverContainerIP: retry up to 2 times with backoff for transient
  CNI failures (IPAM lock contention, etc.)
- gRPC Pool: evict connections stuck in Connecting state for >30s

* fix(containerd): clean stale cni0 bridge on startup to prevent MAC error

After a Docker container restart, the cni0 bridge interface can linger
with a zeroed MAC (00:00:00:00:00:00) and DOWN state. The CNI bridge
plugin then fails with "could not set bridge's mac: invalid argument",
making all MCP containers unreachable.

Two-layer fix:
- Entrypoint: delete cni0 and flush IPAM state before starting containerd
- Go: detect bridge MAC errors in setupCNINetwork and auto-delete cni0
  before retrying, as defense-in-depth for runtime recovery

* fix(containerd): use exec.CommandContext to satisfy noctx linter
2026-03-07 17:50:01 +08:00
BBQ 21999b49f4 feat(container): add explicit data workflows and snapshot rollback (#193)
* feat(container): add explicit data workflows and snapshot rollback

Make container upgrades and recreation data-safe by adding explicit preserve, export, import, restore, and rollback flows across the backend, SDK, and web UI.

* fix(container): resolve go lint issues

Fix formatting and lint violations introduced by the container data workflow changes so the Go CI lint job passes cleanly.
2026-03-06 17:57:48 +08:00
BBQ 3feb03aca7 ci: add go lint and race test workflow (#187) 2026-03-05 11:25:33 +08:00
BBQ 9ceabf68c4 feat(mcp): replace bind-mount+exec with in-container gRPC service (#179)
Replace the host bind-mount + containerd exec approach with a per-bot
in-container gRPC server (ContainerService, port 9090). All file I/O,
exec, and MCP stdio sessions now go through gRPC instead of running
shell commands or reading host-mounted directories.

Architecture changes:
- cmd/mcp: rewritten as a gRPC server (ContainerService) with full
  file and exec API (ReadFile, WriteFile, ListDir, ReadRaw, WriteRaw,
  Exec, Stat, Mkdir, Rename, DeleteFile)
- internal/mcp/mcpcontainer: protobuf definitions and generated stubs
- internal/mcp/mcpclient: gRPC client wrapper with connection pool
  (Pool) and Provider interface for dependency injection
- mcp.Manager: add per-bot IP cache, gRPC connection pool, and
  SetContainerIP/MCPClient methods; remove DataDir/Exec helpers
- containerd.Service: remove ExecTask/ExecTaskStreaming; network setup
  now returns NetworkResult{IP} for pool routing
- internal/fs/service.go: deleted (replaced by mcpclient)
- handlers/fs.go: deleted; MCP stdio session logic moved to mcp_stdio.go
- container provider Executor: all tools (read/write/list/edit/exec)
  now call gRPC client instead of running shell via exec
- storefs, containerfs, media, skills, memory: all I/O ported to
  mcpclient.Provider

Database:
- migration 0022: drop host_path column from containers table

One-time data migration:
- migrateBindMountData: on first Start() after upgrade, copies old
  bind-mount data into the container via gRPC, then renames src dir
  to prevent re-migration; runs in background goroutine

Bug fixes:
- mcp_stdio: callRaw now returns full JSON-RPC envelope
  {"jsonrpc","id","result"|"error"} matching protocol spec;
  explicit "initialize" call now advances session init state to
  prevent duplicate handshake on next non-initialize call
- mcpclient Pool: properly evict stale gRPC connection after snapshot
  replace (container process recreated); use SetContainerIP instead
  of direct map write so IP changes always evict pool entry
- migrateBindMountData: walkErr on directories now counted as failure
  so partially-walked trees don't get incorrectly marked as migrated
- cmd/mcp/Dockerfile: removed dead file (docker/Dockerfile.mcp is the
  canonical production build)

Tests:
- provider_test.go: restored with bufconn in-process gRPC mock
  (fakeContainerService + staticProvider), 14 cases covering all 5
  tools plus edge cases
- mcp_session_test.go: new, covers JSON-RPC envelope, init state
  machine, pending cleanup on cancel/close, readLoop cancel
- storefs/service_test.go: restored (pure function roundtrip tests)
2026-03-04 21:50:08 +08:00
BBQ 04bce702b7 feat(devenv): MCP dev hot-reload with image-based approach (#145)
Add mcp-build.sh that compiles the MCP binary and packages it as an
OCI image layer on top of the base rootfs, imported directly into
containerd. Air triggers rebuild on code changes, cleaning stale
containers automatically.

Consolidate dev-only files (Dockerfiles, entrypoint, config, build
script) into devenv/ to separate dev tooling from production artifacts.
Skip image pull when already imported to speed up dev startup.
2026-03-02 14:59:48 +08:00
BBQ d6aebf654f feat(devenv): add containerized development environment (#116)
* feat(devenv): add containerized development environment

Replace local-process dev workflow with a fully containerized stack
using docker compose. This enables consistent development across
machines without requiring local Go/Node toolchains or containerd.

- Add Dockerfile.server.dev with containerd + CNI networking support
- Add Dockerfile.web.dev for frontend dev server
- Add server-dev-entrypoint.sh for containerd lifecycle management
- Expand devenv/docker-compose.yml with server, agent, web, migrate
  and deps services with proper health checks and dependency ordering
- Update app.dev.toml to use container service names instead of localhost
- Refactor mise.toml dev tasks to drive docker compose workflow
- Support agent_gateway.server_addr in config package for inter-container
  communication

* feat(devenv): add hot-reload and registry mirror support

- Add air for Go server hot-reload in dev containers
- Fix agent_gateway host in dev config (0.0.0.0 -> agent)
- Add configurable registry mirror for China mainland users
- Unify MCP image refs via MCPConfig.ImageRef()

* feat(scripts): add China mainland mirror option to install script

Prompt users to opt-in to memoh.cn mirror during installation,
which applies docker-compose.cn.yml overlay and sets registry
in config.toml for MCP image pulls.
2026-02-26 17:32:19 +08:00
Ran 5e12b5a53f fix: ensure unifying on hardcoded /data mount path 2026-02-24 03:35:27 +08:00
Ran 65c4d6f793 feat(container): support for apple container 2026-02-23 22:40:46 +08:00
BBQ 4fc0ca5110 fix(container): propagate host timezone to all containers
Replace TZ env var with /etc/localtime bind-mount in docker-compose
and inject timezone spec opts into containerd bot containers.
2026-02-20 03:22:31 +08:00
BBQ bc374fe8cd refactor: content-addressed assets, cross-channel multimodal, infra simplification (#63)
* refactor(attachment): multimodal attachment refactor with snapshot schema and storage layer

- Add snapshot schema migration (0008) and update init/versions/snapshots
- Add internal/attachment and internal/channel normalize for unified attachment handling
- Move containerfs provider from internal/media to internal/storage
- Update agent types, channel adapters (Telegram/Feishu), inbound and handlers
- Add containerd snapshot lineage and local_channel tests
- Regenerate sqlc, swagger and SDK

* refactor(media): content-addressed asset system with unified naming

- Replace asset_id foreign key with content_hash as sole identifier
  for bot_history_message_assets (pure soft-link model)
- Remove mime, size_bytes, storage_key from DB; derive at read time
  via media.Resolve from actual storage
- Merge migrations 0008/0009 into single 0008; keep 0001 as canonical schema
- Add Docker initdb script for deterministic migration execution order
- Fix cross-channel real-time image display (Telegram → WebUI SSE)
- Fix message disappearing on refresh (null assets fallback)
- Fix file icon instead of image preview (mime derivation from storage)
- Unify AssetID → ContentHash naming across Go, Agent, and Frontend
- Change storage key prefix from 4-char to 2-char for directory sharding
- Add server-entrypoint.sh for Docker deployment migration handling

* refactor(infra): embedded migrations, Docker simplification, and config consolidation

- Embed SQL migrations into Go binary, removing shell-based migration scripts
- Consolidate config files into conf/ directory (app.example.toml, app.docker.toml, app.dev.toml)
- Simplify Docker setup: remove initdb.d scripts, streamline nginx config and entrypoint
- Remove legacy CLI, feishu-echo commands, and obsolete incremental migration files
- Update install script and docs to require sudo for one-click install
- Add mise tasks for dev environment orchestration

* chore: recover migrations

---------

Co-authored-by: Acbox <acbox0328@gmail.com>
2026-02-19 00:20:27 +08:00
斬風千雪 0bdc31311c improvement(mcp): make CNI binary & data path configurable (#55) 2026-02-17 17:57:13 +08:00
BBQ 85251a2905 refactor(core): codebase quality cleanup
- Remove user-level model settings (chat_model_id, memory_model_id,
  embedding_model_id, max_context_load_time, language) from users table
- Merge migration 0002 into 0001, remove compatibility migrations
- Delete dead conversation/resolver.go (1177 lines, only flow/resolver.go used)
- Remove type aliases (Chat=Conversation, types_alias.go)
- Fix SQL: remove AND false stub, fix UpdateChatTitle model_id,
  reset model IDs in DeleteSettings, add preauth expiry filter,
  add ListMessages limit, remove 10 dead queries
- Extract shared handler helpers (RequireChannelIdentityID, AuthorizeBotAccess)
- Rename internal/router to internal/channel/inbound
- Fix identity confusion: remove UserID->ChannelIdentityID fallbacks
- Fix all _ = var patterns with proper error logging
- Fix error propagation: storeMessages, rescheduleJob, botContainerID
- Fix naming: ModelId->ModelID, active->is_active, Duration semantic fix
- Remove dead code: mcpService, ReplyTarget, callMCPServer, sshShellQuote,
  buildSessionMetadata, ChatRequest.Language, TriggerPayload.ChatID
- Fix code quality: errors.Is(), remove goto, CreateHuman deprecated
- Remove Enable model endpoint and user-level settings CLI commands
- Regenerate sqlc, swagger, SDK
2026-02-12 23:50:48 +08:00
BBQ c53d35740e feat(deploy): self-contained containerd with embedded MCP image
- Add Dockerfile.containerd: multi-stage build that compiles MCP binary,
  assembles rootfs, creates Docker image tar, and bundles it with containerd
- Add containerd-entrypoint.sh: auto-imports MCP image on first start
- Fix MCP image reference: rename busybox_image to image in config,
  use fully-qualified docker.io/library/memoh-mcp:latest everywhere
- Make image ref configurable via config.toml instead of hardcoded
- Simplify deploy.sh: remove manual nerdctl/containerd-install steps
2026-02-12 23:50:48 +08:00
BBQ 30281742ef merge(github/main): integrate fx dependency injection framework
Merge upstream fx refactor and adapt all services to use go.uber.org/fx
for dependency injection. Resolve conflicts in main.go, server.go,
and service constructors while preserving our domain model changes.

- Fix telegram adapter panic on shutdown (double close channel)
- Fix feishu adapter processing messages after stop
- Increase directory lookup timeout from 2s to 5s
2026-02-12 16:42:44 +08:00
BBQ ca5c6a1866 refactor(core): restructure conversation, channel and message domains
- Rename chat module to conversation with flow-based architecture
- Move channelidentities into channel/identities subpackage
- Add channel/route for routing logic
- Add message service with event hub
- Add MCP providers: container, directory, schedule
- Refactor Feishu/Telegram adapters with directory and stream support
- Add platform management page and channel badges in web UI
- Update database schema for conversations, messages and channel routes
- Add @memoh/shared package for cross-package type definitions
2026-02-12 15:34:40 +08:00
BBQ 06e8619a37 refactor(core): migrate channel identity and binding across app
Align channel identity and bind flow across backend and app-facing layers, including generated swagger artifacts and package lock updates while excluding docs content changes.
2026-02-11 14:51:58 +08:00
MengYX 6548c31597 refactor: using fx 2026-02-11 10:25:40 +08:00
Ran 26dd8651b7 feat: go cni lifecycle manage 2026-02-08 21:39:34 +08:00
Ran 3f8cb3292c chore: optimize code structure 2026-02-08 01:45:53 +08:00
BBQ 5a35ef34ac feat: channel gateway implementation and multi-bot refactor
- Refactor channel manager with support for Sender/Receiver interfaces and hot-swappable adapters.
- Implement identity routing and pre-authentication logic for inbound messages.
- Update database schema to support bot pre-auth keys and extended channel session metadata.
- Add Telegram and Feishu channel configuration and adapter enhancements.
- Update Swagger documentation and internal handlers for channel management.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-06 14:41:54 +08:00
Ran cb36b68ee4 feat: Atomic update mcp image 2026-02-05 02:40:10 +08:00
BBQ 6aebbe9279 feat: refactor User/Bot architecture and implement multi-channel gateway
Major changes:
1. Core Architecture: Decoupled Bots from Users. Bots now have independent lifecycles, member management (bot_members), and dedicated configurations.
2. Channel Gateway:
   - Implemented a unified Channel Manager supporting Feishu, Telegram, and Local (Web/CLI) adapters.
   - Added message processing pipeline to normalize interactions across different platforms.
   - Introduced a Contact system for identity binding and guest access policies.
3. Database & Tooling:
   - Consolidated all migrations into 0001_init with updated schema for bots, channels, and contacts.
   - Optimized sqlc.yaml to automatically track the migrations directory.
4. Agent Enhancements:
   - Introduced ToolContext to provide Agents with platform-aware execution capabilities (e.g., messaging, contact lookups).
   - Added tool logging and fallback mechanisms for toolChoice execution.
5. UI & Docs: Updated frontend stores, UI components, and Swagger documentation to align with the new Bot-centric model.
2026-02-04 23:49:50 +08:00
BBQ 46d2968e2c feat: refactor logging system to slog with DI and component tagging 2026-02-01 15:23:57 +08:00
Ran 0edaba4e74 fix: update go dependencies 2026-01-20 23:23:07 +07:00
Ran d40cc581d2 refactor: initial go service 2026-01-20 00:04:23 +07:00