Memoh

mirror of https://github.com/memohai/Memoh.git synced 2026-04-25 07:00:48 +09:00

Author	SHA1	Message	Date
EYHN	f2fa845e16	fix(bridge): close stdout/stderr pipes on exec timeout to prevent stream hang (#351)	2026-04-09 21:43:11 +08:00
Acbox	c1e6e0cc7a	feat(agent): add pagination and smart collapsing to container list tool
Large directories like node_modules/.venv could return thousands of entries, wasting tokens and causing timeouts. Add offset/limit pagination to ListDir RPC and collapse heavy subdirectories (>50 items) into summaries in recursive mode. Collapsing runs at the bridge layer before pagination so the page window reflects the collapsed view.	2026-04-02 01:51:19 +08:00
Acbox	86d83108d9	fix: use readline-capable shell for interactive terminal sessions
Container terminals were echoing raw ANSI escape sequences (^[[A, ^[[B, etc.) instead of handling arrow keys because /bin/sh (dash/ash) lacks readline support. Two changes fix this:
1. Bridge execPTY now directly exec's bare paths (e.g. /bin/bash) instead of always wrapping through "/bin/sh -c", preserving readline behavior.
2. Terminal handler detects bash/zsh in the container and prefers them over /bin/sh for interactive PTY sessions.	2026-03-29 19:31:24 +08:00
Acbox Liu	7d7d0e4b51	refactor: introduce multi-session chat support (#session) (#267) * refactor: introduce multi-session chat support (#session) Replace the single-context-per-bot model with multiple chat sessions. Database: - Add bot_sessions table (route_id, channel_type, title, metadata, soft delete) - Migrate bot_history_messages from (route_id, channel_type) to session_id - Add active_session_id to bot_channel_routes - Migration 0036 handles data migration from existing messages Backend: - New internal/session service for session CRUD - Update message service/types to use session_id instead of route_id - Update conversation flow (resolver, history, store) for session context - Channel inbound auto-creates/retrieves active session via SessionEnsurer - New REST endpoints: /bots/:bot_id/sessions (CRUD) - WebSocket and message handlers accept optional session_id - Wire session service into FX dependency graph (agent + memoh) Frontend: - Refactor chat store: sessions replaces chats, sessionId replaces chatId - Session-aware message loading, sending, and pagination - WebSocket sends include session_id - New session sidebar component with select/delete - Chat area header shows active session title + new session button - API layer updated: fetchSessions, createSession, deleteSession - i18n strings for session management (en + zh) SDK: - Regenerated TypeScript SDK and Swagger docs with session endpoints * fix: update tests for session refactoring (RouteID → SessionID) Remove references to removed RouteID and Platform fields from PersistInput/Message in channel_test.go and service_integration_test.go.
* fix: restore accidentally deleted SDK files and guard migration 0032 - Restore packages/sdk/src/container-stream.ts and extra/index.ts that were accidentally removed during SDK regeneration - Wrap migration 0032 route_id index creation in a column existence check to avoid failure on fresh databases where 0001_init.up.sql no longer has route_id * fix: guard migration 0036 data steps for fresh databases Wrap steps 3-7 (which reference route_id/channel_type on bot_history_messages) in a column existence check so the migration is safe on fresh databases where 0001_init.up.sql already reflects the final schema without those columns. * feat: add title model setting and auto-generate session titles on user input - Add title_model_id to bots table (migration 0037) and bot settings API - Implement async title generation triggered at user message time (not after assistant response) for faster title availability - Publish session_title_updated events via SSE event hub for real-time frontend updates without page refresh - Fix SSE message event parsing: use direct JSON.parse instead of normalizeStreamEvent which silently dropped non-chat-stream event types - Add title model selector in bot settings UI with i18n support * fix: session-scoped message filtering and URL-based chat routing - Filter realtime SSE messages by session_id to prevent cross-session message leakage after page refresh - Add /chat/:sessionId? route with bidirectional URL ↔ store sync - Visiting /chat shows a clean state with no bot or session pre-selected - Visiting /chat/:sessionId loads the specific session directly - Session switches from sidebar automatically update the URL - Fix stale RouteID field in dedupe test (removed during session refactor) * fix: skip cross-channel stream events to prevent session leakage The bot-level web stream pushes events from all channels (Telegram, Discord, etc.) without session_id context. 
Previously these were rendered inline in the current chat view regardless of session. Now cross-channel events are ignored in handleLocalStreamEvent; persisted messages arrive via the SSE message events stream with proper session_id filtering through appendRealtimeMessage. * feat: show IM avatars and platform badges on session sidebar - Add sender_avatar_url to route metadata from identity resolution - Resolve group avatar and handle via directory adapter for group chats - JOIN bot_channel_routes in ListSessionsByBot to return route metadata - Display avatar with ChannelBadge on IM session items (group avatar for groups, sender avatar for private chats) - Show @groupname or @username as session sub-label * fix: clean up RunConfig unused fields, fix skill system and copy bug - Remove unused RunConfig fields: Tools, Channels, CurrentChannel, ActiveContextTime - Remove unused SessionContext fields: DisplayName, ConversationType - Fix EnabledSkillNames copy bug: make([]string, 0, n) + copy copies zero elements; changed to make([]string, n) - Fix prepareRunConfig dead code: remove no-op loop over CurrentPlatform runes; compute supportsImageInput from model's InputModalities - Fix EnabledSkills always nil in system prompt: resolve enabled skill entries from EnabledSkillNames + Skills - Fix use_skill tool returning empty response: now returns full skill content (description + instructions) so LLM gets it in the same turn - Skip use_skill tool registration when no skills are available - Conditionally render Skills section in system prompt (hidden when no skills exist) * feat: add session type field and bind sessions to heartbeat/schedule executions - Add `type` column to `bot_sessions` (chat \| heartbeat \| schedule) - Add `session_id` to `bot_heartbeat_logs` for per-execution session tracking - Create `schedule_logs` table binding schedule_id + session_id - Heartbeat and schedule runs now create independent sessions and persist agent messages via storeRound, enabling full 
conversation replay - Add schedule logs API endpoints (list by bot, list by schedule, delete) - Update Triggerer interfaces to return TriggerResult with status/usage/model * refactor: modular system prompts per session type (chat/heartbeat/schedule) Split the monolithic system.md into three type-specific system prompts with shared fragments via {{include:_xxx}} syntax, so each session type gets a focused prompt without irrelevant instructions. * fix: prevent message duplication after task completion message_created events from Persist() had an empty platform field because toMessageFromCreate() didn't extract it from the session. This caused appendRealtimeMessage to fail the platform === 'web' guard, and hasMessageWithId to fail because local IDs differ from server UUIDs, resulting in all messages being appended as duplicates. - Extract platform from metadata in toMessageFromCreate so published events carry the correct value - Pass channel_type: 'web' when creating sessions from the web frontend so List queries return the correct platform via the session JOIN * fix: use per-message usage from SDK instead of misaligned step-level usages Previously, token usage was stored via a separate per-step usages array that didn't align with messages (off-by-one from prepending user message, step count != message count). This caused: - User messages incorrectly receiving usage data - Usage values shifted across messages in multi-step rounds - Last assistant message getting the accumulated total instead of its own step usage - InputTokenDetails/OutputTokenDetails lost during manual accumulation Now each sdk.Message carries its own per-step Usage (set by the SDK in buildStepMessages), which is extracted in sdkMessagesToModelMessages and stored directly via ModelMessage.Usage. The storeRound/storeMessages path no longer needs external usage/usages parameters. Also fixes the totalUsage accumulation in runStream to include all detail fields (InputTokenDetails, OutputTokenDetails). 
* feat: add /new slash command to create a new active session from IM channels Users in Telegram/Discord/Feishu can now send /new to start a fresh conversation, resetting the session context for the current chat thread. The command resolves the channel route, creates a new session, sets it as the active session on the route, and replies with a confirmation message. * feat: distinguish heartbeat and schedule sessions with dedicated icons in sidebar Heartbeat sessions show a heart-pulse icon (rose), schedule sessions show a clock icon (amber), and both display a type label beneath the session title. * refactor: remove enabledSkills system prompt injection, keep sorted skill listing use_skill now returns skill content directly as tool output, so there is no need to inject enabled skill body text into the system prompt. Remove the entire enabledSkills tracking chain (RunConfig.EnabledSkillNames, StreamEvent.Skills, GenerateResult.Skills, ChatRequest/Response.Skills, enableSkill closures in runStream/runGenerate, prepareRunConfig matching). Keep a lightweight skills listing (name + description only) in the system prompt so the model knows which skills are available. Sort entries by name to guarantee deterministic ordering and maximize KV cache reuse. * refactor: remove inbox system, persist passive messages directly to history Replace the bot_inbox table and service with direct writes to bot_history_messages for group conversations where the bot is not @mentioned. Trigger-path messages continue to be persisted after the agent responds (unchanged). 
- Drop bot_inbox table and max_inbox_items column (migration 0039) - Delete internal/inbox/, handlers/inbox.go, command/inbox.go, agent/tools/inbox.go and the MCP message provider - Add persistPassiveMessage() in channel inbound to write user messages into the active session immediately - Rewrite ListObservedConversationsByChannelIdentity to query bot_history_messages + bot_sessions instead of bot_inbox - Extract shared send/react logic into internal/messaging/executor.go; agent/tools/message.go is now a thin SDK adapter - Clean up all inbox references from agent prompts, flow resolver, email trigger, settings, commands, DI wiring, and frontend - Regenerate sqlc, swagger, and SDK * feat: add list_sessions and search_messages agent tools Provide agents with the ability to query session metadata and search message history across all sessions. search_messages supports filtering by time range, keyword (JSONB-aware ILIKE), session, contact, and role, with a default 7-day lookback when no start_time is given. * feat: inject last_heartbeat time and improve heartbeat search guidance Query the previous heartbeat's started_at timestamp and pass it through TriggerPayload into the heartbeat prompt template. Update system prompt and HEARTBEAT.md checklist to guide agents to use search_messages with start_time=last_heartbeat for efficient cross-session message review. * fix: pass BridgeProvider to FSClient and store full heartbeat prompt FSClient was always created with nil provider, causing all container file reads (IDENTITY.md, SOUL.md, MEMORY.md, HEARTBEAT.md, etc.) to silently return empty strings. Expose Agent.BridgeProvider() and wire it into Resolver. Also fix heartbeat trigger to store the full prompt template as the user message instead of the literal "heartbeat" string. 
* feat: add line numbers to container file read output Move line-number formatting from the bridge gRPC server to the agent tool layer so that the raw content stored and transmitted via gRPC remains clean, while the read_file tool output includes numbered lines for easier reference by the agent. * chore(deps): update twilight-ai to v0.3.2 * fix: lint, test	2026-03-21 15:57:22 +08:00
Menci	d5b410d7e3	refactor(workspace): new workspace v3 container architecture (#244) * feat(mcp): workspace container with bridge architecture Migrate MCP containers to use UDS-based bridge communication instead of TCP gRPC. Containers now mount runtime binaries and Unix domain sockets from the host, eliminating the need for a dedicated MCP Docker image. - Remove Dockerfile.mcp and entrypoint.sh in favor of standard base images - Add toolkit Dockerfile for building MCP binary separately - Containers use bind mounts for /opt/memoh (runtime) and /run/memoh (UDS) - Update all config files with new runtime_path and socket_dir settings - Support custom base images per bot (debian, alpine, ubuntu, etc.) - Legacy container detection and TCP fallback for pre-bridge containers - Frontend: add base image selector in container creation UI * feat(container): SSE progress bar for container creation Add real-time progress feedback during container image pull and creation using Server-Sent Events, without breaking the existing synchronous JSON API (content negotiation via Accept header).
Backend: - Add PullProgress/LayerStatus types and OnProgress callback to PullImageOptions (containerd service layer) - DefaultService.PullImage polls ContentStore.ListStatuses every 500ms when OnProgress is set; AppleService ignores it - CreateContainer handler checks Accept: text/event-stream and switches to SSE branch: pulling → pull_progress → creating → complete/error Frontend: - handleCreateContainer/handleRecreateContainer use fetch + SSE instead of the SDK's synchronous postBotsByBotIdContainer - Progress bar shows layer-level pull progress (offset/total) during pulling phase and indeterminate animation during creating phase - i18n keys added for pullingImage and creatingContainer (en/zh) * fix(container): clear stale legacy route and type create SSE * fix(ci): resolve lint errors and arm64 musl node.js download - Fix unused-receiver lint: rename `s` to `_` on stub methods in manager_legacy_test.go - Fix sloglint: use slog.DiscardHandler instead of slog.NewTextHandler(io.Discard, nil) - Handle missing arm64 musl Node.js builds: unofficial-builds.nodejs.org does not provide arm64 musl binaries, fall back to glibc build * fix(lint): address errcheck, staticcheck, and gosec findings - Discard os.Setenv/os.Remove return values explicitly with _ - Use omitted receiver name instead of _ (staticcheck ST1006) - Tighten directory permissions from 0o755 to 0o750 (gosec G301) * fix(lint): sanitize socket path to satisfy gosec G703 filepath.Clean the env-sourced socket path before os.Remove to avoid path-traversal taint warning. * fix(lint): use nolint directive for gosec G703 on socket path filepath.Clean does not satisfy gosec's taint analysis. The socket path comes from MCP_SOCKET_PATH env (operator-configured) or a compiled-in default, not from end-user input. 
* refactor: rename MCP container/bridge to workspace/bridge Split internal/mcp/ to separate container lifecycle management from Model Context Protocol connections, eliminating naming confusion: - internal/mcp/ (container mgmt) → internal/workspace/ - internal/mcp/mcpclient/ → internal/workspace/bridge/ - internal/mcp/mcpcontainer/ → internal/workspace/bridgepb/ - cmd/mcp/ → cmd/bridge/ - config: MCPConfig → WorkspaceConfig, [mcp] → [workspace] - container prefix: mcp-{id} → workspace-{id} - labels: mcp.bot_id → memoh.bot_id, add memoh.workspace=v1 - socket: mcp.sock → bridge.sock, env BRIDGE_SOCKET_PATH - runtime: /opt/memoh/runtime/mcp → /opt/memoh/runtime/bridge - devenv: mcp-build.sh → bridge-build.sh Legacy containers (mcp- prefix) detected by container name prefix and handled via existing fallback path. * fix(container): use memoh.workspace=v3 label value * refactor(container): drop LegacyBotLabelKey, infer bot ID from container name Legacy containers use mcp-{botID} naming, so bot ID can be derived via TrimPrefix instead of looking up the mcp.bot_id label. 
* fix(workspace): resolve containers via manager and drop gateway container ID * docs: fix stale mcp references in AGENTS.md and DEPLOYMENT.md * refactor(workspace): move container lifecycle ownership into manager * dev: isolate local devenv from prod config * toolkit: support musl node runtime * containerd: fix fallback resolv.conf permissions * web: preserve container create progress on completion * web: add bot creation wait hint * fix(workspace): preserve image selection across recreate * feat(web): shorten default docker hub image refs * fix(container): address code review findings - Remove synchronous CreateContainer path (SSE-only now) - Move flusher check before WriteHeader to avoid committed 200 on error - Fix legacy container IP not cached via ensureContainerAndTask path - Add atomic guard to prevent stale pull_progress after PullImage returns - Defensive copy for tzEnv slice to avoid mutating shared backing array - Restore network failure severity in restartContainer (return + Error) - Extract duplicate progress bar into ContainerCreateProgress component - Fix codesync comments to use repo-relative paths - Add SaaS image validation note and kernel version comment on reaper * refactor(devenv): extract toolkit install into shared script Unify the Node.js + uv download logic into docker/toolkit/install.sh, used by the production Dockerfile and runnable locally for dev. Dev environment no longer bakes toolkit into the Docker image — it is volume-mounted from .toolkit/ instead, so wrapper script changes take effect immediately without rebuilding. The entrypoint checks for the toolkit directory and prints a clear error if missing. * fix(ci): address go ci failures * chore(docker): remove unused containerd image * refactor(config): rename workspace image key * fix(workspace): fix legacy container data loss on migration and stop swallowing errors Three root causes were identified and fixed: 1. 
Delete() used hardcoded "workspace-" prefix to look up legacy "mcp-" containers, causing GetContainer to return NotFound. CleanupBotContainer then silently skipped the error and deleted the DB record without ever calling PreserveData. Fix: resolve the actual container ID via ContainerID() (DB → label → scan) before operating. 2. Multiple restore error paths were silently swallowed (logged as Warn but not returned), so the user saw HTTP 200/204 with no data and no error. Fix: all errors in the preserve/restore chain now block the workflow and propagate to the caller. 3. tarGzDir used cached DirEntry.Info() for tar header size, which on overlayfs can differ from the actual file size, causing "archive/tar: write too long". Fix: open the file first, Fstat the fd for a race-free size, and use LimitReader as a safeguard. Also adds a "restoring" SSE phase so the frontend shows a progress indicator ("Restoring data, this may take a while...") during data migration on container recreation. * refactor(workspace): single-point container ID resolution Replace the `containerID func(string) string` field with a single `resolveContainerID(ctx, botID)` method that resolves the actual container ID via DB → label → scan → fallback. All ~16 lookup callsites across manager.go, dataio.go, versioning.go, and manager_lifecycle.go now go through this single resolver, which correctly handles both legacy "mcp-" and new "workspace-" containers. Only `ensureBotWithImage` inlines `ContainerPrefix + botID` for creating brand-new containers — every other path resolves dynamically. * fix(web): show progress during data backup phase of container recreate The recreate flow (delete with preserve_data + create with restore_data) blocked on the DELETE call while backing up /data with no progress indication. Add a 'preserving' phase to the progress component so users see "正在备份数据..." instead of an unexplained hang. 
* chore: remove [MYDEBUG] debug logging Clean up all 112 temporary debug log statements added during the legacy container migration investigation. Kept only meaningful warn-level logs for non-fatal errors (network teardown, rename failures).	2026-03-18 15:19:09 +08:00
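
The third fix listed in commit d5b410d7e3 (open the file first, Fstat the descriptor, and cap the copy with LimitReader) can be sketched roughly as below. This is a minimal illustration under assumptions, not the repository's tarGzDir: the helper names addFile and roundTrip are hypothetical, the gzip layer is left out for brevity, and only a single regular file is handled.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// addFile (hypothetical helper) writes one regular file into tw.
// The size in the tar header comes from Stat on the already-open
// descriptor rather than from a cached directory entry.
func addFile(tw *tar.Writer, path, name string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat() // fstat on the open file descriptor
	if err != nil {
		return err
	}
	hdr, err := tar.FileInfoHeader(info, "")
	if err != nil {
		return err
	}
	hdr.Name = name
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	// Never hand the writer more bytes than the header declared, so a
	// file that grows mid-copy cannot trigger "write too long".
	_, err = io.Copy(tw, io.LimitReader(f, hdr.Size))
	return err
}

// roundTrip (hypothetical helper) archives content from a temp file and
// reads it back, to show that the entry is intact.
func roundTrip(content []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "tar-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.bin")
	if err := os.WriteFile(path, content, 0o600); err != nil {
		return nil, err
	}

	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	if err := addFile(tw, path, "data.bin"); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}

	tr := tar.NewReader(&buf)
	if _, err := tr.Next(); err != nil {
		return nil, err
	}
	return io.ReadAll(tr)
}

func main() {
	out, err := roundTrip([]byte("hello"))
	fmt.Println(string(out), err) // prints: hello <nil>
}
```

The cap only protects against a file that grows after the header is written. If the file shrinks instead, fewer bytes are written than declared and archive/tar reports that when the next header is written or the writer is closed.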

5 Commits
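
The EnabledSkillNames fix described in commit 7d7d0e4b51 follows from how Go's builtin copy works: it copies min(len(dst), len(src)) elements, so a destination created with make([]string, 0, n) has capacity but zero length and receives nothing. A small sketch of the two patterns, with hypothetical function names and sample values that do not come from the repository:

```go
package main

import "fmt"

// cloneBuggy mirrors the pattern the commit message describes as broken:
// dst has capacity len(src) but length 0, so copy moves zero elements.
func cloneBuggy(src []string) []string {
	dst := make([]string, 0, len(src))
	copy(dst, src)
	return dst
}

// cloneFixed allocates dst with length len(src), so copy fills every slot.
func cloneFixed(src []string) []string {
	dst := make([]string, len(src))
	copy(dst, src)
	return dst
}

func main() {
	names := []string{"search", "summarize"} // hypothetical skill names
	fmt.Println(len(cloneBuggy(names)), len(cloneFixed(names))) // prints: 0 2
}
```

An alternative that keeps the zero-length allocation is append(dst, src...), which grows the length as it copies.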