Commit Graph

64 Commits

Author SHA1 Message Date
Acbox 26b01cc463 fix(media): failed to store the media to /data/media and add image part 2026-04-13 12:58:08 +08:00
Fodesu 19619d73a9 fix(chat): respect override model selection (#354) 2026-04-10 02:31:29 +08:00
BBQ d3bf6bc90a fix(channel,attachment): channel quality refactor & attachment pipeline fixes (#349)
* feat(channel): add DingTalk channel adapter

- Add DingTalk channel adapter (`internal/channel/adapters/dingtalk/`) using dingtalk-stream-sdk-go, supporting inbound message receiving and outbound text/markdown reply
- Register DingTalk adapter in cmd/agent and cmd/memoh
- Add go.mod dependency: github.com/memohai/dingtalk-stream-sdk-go
- Add Dingtalk and Wecom SVG icons and Vue components to @memohai/icon
- Refactor existing icon components to remove redundant inline wrappers
- Add `channelTypeDisplayName` util for consistent channel label resolution
- Add DingTalk/WeCom i18n entries (en/zh) for types and typesShort
- Extend channel-icon, bot-channels, channel-settings-panel to support dingtalk/wecom
- Use channelTypeDisplayName in profile page to replace ad-hoc i18n lookup

* fix(channel,attachment): channel quality refactor & attachment pipeline fixes

Channel module:
- Fix RemoveAdapter not cleaning connectionMeta (stale status leak)
- Fix preparedAttachmentTypeFromMime misclassifying image/gif
- Fix sleepWithContext time.After goroutine/timer leak
- Export IsDataURL/IsHTTPURL/IsDataPath, dedup across packages
- Cache OutboundPolicy in managerOutboundStream to avoid repeated lookups
- Split OutboundAttachmentStore: extract ContainerAttachmentIngester interface
- Add ManagerOption funcs (WithInboundQueueSize, WithInboundWorkers, WithRefreshInterval)
- Add thread-safety docs on OutboundStream / managerOutboundStream
- Add debug logs on successful send/edit paths
- Expand outbound_prepare_test.go with 21 new cases
- Convert no-receiver adapter helpers to package-level funcs; drop unused params

DingTalk adapter:
- Implement AttachmentResolver: download inbound media via /v1.0/robot/messageFiles/download
- Fix pure-image inbound messages failing due to missing resolver

Attachment pipeline:
- Fix images invisible to LLM in pipeline (DCP) path: inject InlineImages into
  last user message when cfg.Query is empty
- Fix public_url fallback: skip direct URL-to-LLM when ContentHash is set,
  always prefer inlined persisted asset
- Inject path: carry ImageParts through agent.InjectMessage; inline persisted
  attachments in resolver inject goroutine so mid-stream images reach the model
- Fix ResolveMime for images: prefer content-sniffed MIME over platform-declared
  MIME (fixes Feishu sending image/png header for actual JPEG content → API 400)
2026-04-09 14:36:11 +08:00
Acbox Liu 8d5c38f0e5 refactor: unify providers and models tables (#338)
* refactor: unify providers and models tables

- Rename `llm_providers` → `providers`, `llm_provider_oauth_tokens` → `provider_oauth_tokens`
- Remove `tts_providers` and `tts_models` tables; speech models now live in the unified `models` table with `type = 'speech'`
- Replace top-level `api_key`/`base_url` columns with a JSONB `config` field on `providers`
- Rename `llm_provider_id` → `provider_id` across all references
- Add `edge-speech` client type and `conf/providers/edge.yaml` default provider
- Create new read-only speech endpoints (`/speech-providers`, `/speech-models`) backed by filtered views of the unified tables
- Remove old TTS CRUD handlers; simplify speech page to read-only + test
- Update registry loader to skip malformed YAML files instead of failing entirely
- Fix YAML quoting for model names containing colons in openrouter.yaml
- Regenerate sqlc, swagger, and TypeScript SDK

* fix: exclude speech providers from providers list endpoint

ListProviders now filters out client_type matching '%-speech' so Edge
and future speech providers no longer appear on the Providers page.
ListSpeechProviders uses the same pattern match instead of hard-coding
'edge-speech'.

* fix: use explicit client_type list instead of LIKE pattern

Replace '%-speech' pattern with explicit IN ('edge-speech') for both
ListProviders (exclusion) and ListSpeechProviders (inclusion). New
speech client types must be added to both queries.

* fix: use EXECUTE for dynamic SQL in migrations referencing old schema

PL/pgSQL pre-validates column/table references in static SQL statements
inside DO blocks before evaluating IF/RETURN guards. This caused
migrations 0010-0061 to fail on fresh databases where the canonical
schema uses `providers`/`provider_id` instead of `llm_providers`/
`llm_provider_id`.

Wrap all SQL that references potentially non-existent old schema objects
(llm_providers, llm_provider_id, tts_providers, tts_models, etc.) in
EXECUTE strings so they are only parsed at runtime when actually reached.

* fix: revert canonical schema to use llm_providers for migration compatibility

The CI migrations workflow (up → down → up) failed because 0061 down
renames `providers` back to `llm_providers`, but 0001 down only dropped
`providers` — leaving `llm_providers` as a remnant. On the second
migrate up, 0010 found the stale `llm_providers` and tried to reference
`models.llm_provider_id` which no longer existed.

Revert 0001 canonical schema to use original names (llm_providers,
tts_providers, tts_models) so incremental migrations work naturally and
0061 handles the final rename. Remove EXECUTE wrappers and unnecessary
guards from migrations that now always operate on llm_providers.

* fix: icons

* fix: sync canonical schema with 0061 migration to fix sqlc column mismatch

0001_init.up.sql still used old names (llm_providers, llm_provider_id)
and included dropped tts_providers/tts_models tables. sqlc could not
parse the PL/pgSQL EXECUTE in migration 0061, so generated code retained
stale columns (input_modalities, supports_reasoning) causing runtime
"column does not exist" errors when adding models.

- Update 0001_init.up.sql to current schema (providers, provider_id,
  no tts tables, add provider_oauth_tokens)
- Use ALTER TABLE IF EXISTS in 0010/0041/0042 for backward compat
- Regenerate sqlc

* fix: guard all legacy migrations against fresh schema for CI compat

On fresh databases, 0001_init.up.sql creates providers/provider_id
(not llm_providers/llm_provider_id). Migrations 0013, 0041, 0046, 0047
referenced the old names without guards, causing CI migration failures.

- 0013: check llm_provider_id column exists before adding old constraint
- 0041: check llm_providers table exists before backfill/constraint DDL
- 0046: wrap CREATE TABLE in DO block with llm_providers existence check
- 0047: use ALTER TABLE IF EXISTS + DO block guard
2026-04-08 01:03:44 +08:00
Acbox Liu 43c4153938 feat: introduce DCP pipeline layer for unified context assembly (#329)
* refactor: introduce DCP pipeline layer for unified context assembly

Introduce a Deterministic Context Pipeline (DCP) inspired by Cahciua,
providing event-driven context assembly for LLM conversations.

- Add `internal/pipeline/` package with Canonical Event types, Projection
  (reduce), Rendering (XML RC), Pipeline manager, and EventStore persistence
- Change user message format from YAML front-matter to XML `<message>` tags
  with self-contained attributes (sender, channel, conversation, type)
- Merge CLI/Web dual API into single `/local/` endpoint, remove CLI handler
- Add `bot_session_events` table for event persistence and cold-start replay
- Add `discuss` session type (reserved for future Cahciua-style mode)
- Wire pipeline into HandleInbound: adapt → persist → push on every message
- Lazy cold-start replay: load events from DB on first session access

* feat: implement discuss mode with reactive driver and probe gate

Add discuss session mode where the bot autonomously decides when to speak
in group chats via tool-gated output (send tool only, no direct text reply).

- Add discuss driver (per-session goroutine, RC watch, step loop via
  agent.Generate, TR persistence, late-binding prompt with mention hints)
- Add system_discuss.md prompt template ("text = inner monologue, send = speak")
- Add context composition (MergeContext, ComposeContext, TrimContext) for
  RC + assistant/tool message interleaving by timestamp
- Add probe gate: when discuss_probe_model_id is set, cheap model pre-filters
  group messages; no tool calls = silence, tool calls = activate primary
- Add /new [chat|discuss] command: explicit mode selection, defaults to
  discuss in groups, chat in DMs, chat-only for WebUI
- Add ResolveRunConfig on flow.Resolver for discuss driver to reuse
  model/tools/system-prompt resolution without reimplementing
- Fix send tool for discuss mode: same-conversation sends now go through
  SendDirect (channel adapter) instead of the local emitter shortcut
- Add target attribute to XML message format (reply_target for routing)
- Add discuss_probe_model_id to bots table settings
- Remove pipeline compaction (SetCompactCursor) — reuse existing compaction.Service
- Persist full SDK messages (including tool calls) in discuss mode

* refactor: unify DCP event layer, fix persistence and local channel

- Fix bot_session_events dedup index to include event_kind so that
  message + edit events for the same external_message_id coexist.
- Change CreateSessionEvent from :one to :exec so ON CONFLICT DO NOTHING
  does not produce spurious errors on duplicate delivery.
- Move ACL evaluation before event ingest; denied messages no longer
  enter bot_session_events or the in-memory pipeline.
- Let chat mode consume RenderedContext from the DCP pipeline when
  available, sharing the same event-driven context assembly as discuss.
- Collapse local WebSocket handler to route through HandleInbound
  instead of directly calling StreamChatWS, eliminating the dual
  business entry point.
- Extract buildBaseRunConfig shared builder so resolve() and
  ResolveRunConfig() no longer duplicate model/credentials/skills setup.
- Add StoreRound to RunConfigResolver interface so discuss driver
  persists assistant output with full metadata, usage, and memory
  extraction (same quality as chat mode).
- Fix discuss driver context: use context.Background() instead of the
  short-lived HTTP request context that was getting cancelled.
- Fix model ID passed to StoreRound: return database UUID from
  ResolveRunConfig instead of SDK model name.
- Remove dead CLIAdapter/CLIType and update legacy web/cli references
  in tests and comments.

* fix: stop idle discuss goroutines after 10min timeout

Discuss session goroutines were never cleaned up when a session became
inactive (e.g. after /new). Add a 10-minute idle timer that auto-exits
the goroutine and removes it from the sessions map when no new RC
arrives.

* refactor: pipeline details — event types, structured reply, display content

- Remove [User sent N attachments] placeholder text from buildInboundQuery;
  attachment info is now expressed via pipeline <attachment> tags.
- Unify in-reply-to as structured ReplyRef (Sender/Preview fields) across
  Telegram, Discord, Feishu, and Matrix adapters instead of prepending
  [Reply to ...] text into the message body. Remove now-unused
  buildTelegramQuotedText, buildDiscordQuotedText, buildMatrixQuotedText.
- Make AdaptInbound return CanonicalEvent interface and dispatch to
  adaptMessage/adaptEdit/adaptService based on metadata["event_type"].
- Add event_id column to bot_history_messages (migration 0059) so user
  messages can reference their canonical pipeline event.
- PersistEvent now returns the event UUID; HandleInbound passes it through
  to both persistPassiveMessage and ChatRequest.EventID for storeRound.
- Add FillDisplayContent to message service: extracts plain text from
  event_data for clean frontend display.
- Frontend extractMessageText prefers display_content when available,
  falling back to legacy strip logic for old messages.
- Fix: always generate headerifiedQuery for storage even when usePipeline
  is true, so user messages are persisted via storeRound in chat mode.

* fix: use json.Marshal for pipeline context content serialization

The manual string escaping in buildMessagesFromPipeline only handled
double quotes but not newlines, backslashes, and other JSON special
characters, producing invalid json.RawMessage values. The LLM then
received empty/malformed context and complained about having no history.

* fix: restore WebSocket handler to use StreamChatWS directly

The previous refactoring replaced the WS handler with HandleInbound +
RouteHub subscription, which broke streaming because RouteHub events
use a different format (channel.StreamEvent) than what the frontend
expects (flow.WSStreamEvent with text_delta, tool_call_start, etc.).

Restore the original direct StreamChatWS call path so WebUI streaming
works again. The WS handler now matches the pre-refactoring behavior
while all other changes (pipeline, ACL, event types, etc.) are kept.

* feat: store display_text directly in bot_history_messages

Instead of computing display content at API response time by querying
bot_session_events via event_id, store the raw user text in a dedicated
display_text column at write time. This works for all paths including
the WebSocket handler which does not go through the pipeline/event layer.

- Migration 0060: add display_text TEXT column
- PersistInput gains DisplayText; filled from trimmedText (passive) and
  req.Query (storeRound)
- toMessageFields reads display_text into DisplayContent
- Remove FillDisplayContent runtime query and ListSessionEventsByEventID
- Frontend already prefers display_content when available (no change)

* fix: display_text should contain raw user text, not XML-wrapped query

req.Query gets overwritten to headerifiedQuery (with XML <message> tags)
before storeRound runs. Add RawQuery field to ChatRequest to preserve
the original user text, and use it for display_text in storeMessages.

* fix(web): show discuss sessions

* refactor: introduce DCP pipeline layer for unified context assembly

Introduce a Deterministic Context Pipeline (DCP) inspired by Cahciua,
providing event-driven context assembly for LLM conversations.

- Add `internal/pipeline/` package with Canonical Event types, Projection
  (reduce), Rendering (XML RC), Pipeline manager, and EventStore persistence
- Change user message format from YAML front-matter to XML `<message>` tags
  with self-contained attributes (sender, channel, conversation, type)
- Merge CLI/Web dual API into single `/local/` endpoint, remove CLI handler
- Add `bot_session_events` table for event persistence and cold-start replay
- Add `discuss` session type (reserved for future Cahciua-style mode)
- Wire pipeline into HandleInbound: adapt → persist → push on every message
- Lazy cold-start replay: load events from DB on first session access

* feat: implement discuss mode with reactive driver and probe gate

Add discuss session mode where the bot autonomously decides when to speak
in group chats via tool-gated output (send tool only, no direct text reply).

- Add discuss driver (per-session goroutine, RC watch, step loop via
  agent.Generate, TR persistence, late-binding prompt with mention hints)
- Add system_discuss.md prompt template ("text = inner monologue, send = speak")
- Add context composition (MergeContext, ComposeContext, TrimContext) for
  RC + assistant/tool message interleaving by timestamp
- Add probe gate: when discuss_probe_model_id is set, cheap model pre-filters
  group messages; no tool calls = silence, tool calls = activate primary
- Add /new [chat|discuss] command: explicit mode selection, defaults to
  discuss in groups, chat in DMs, chat-only for WebUI
- Add ResolveRunConfig on flow.Resolver for discuss driver to reuse
  model/tools/system-prompt resolution without reimplementing
- Fix send tool for discuss mode: same-conversation sends now go through
  SendDirect (channel adapter) instead of the local emitter shortcut
- Add target attribute to XML message format (reply_target for routing)
- Add discuss_probe_model_id to bots table settings
- Remove pipeline compaction (SetCompactCursor) — reuse existing compaction.Service
- Persist full SDK messages (including tool calls) in discuss mode

* refactor: unify DCP event layer, fix persistence and local channel

- Fix bot_session_events dedup index to include event_kind so that
  message + edit events for the same external_message_id coexist.
- Change CreateSessionEvent from :one to :exec so ON CONFLICT DO NOTHING
  does not produce spurious errors on duplicate delivery.
- Move ACL evaluation before event ingest; denied messages no longer
  enter bot_session_events or the in-memory pipeline.
- Let chat mode consume RenderedContext from the DCP pipeline when
  available, sharing the same event-driven context assembly as discuss.
- Collapse local WebSocket handler to route through HandleInbound
  instead of directly calling StreamChatWS, eliminating the dual
  business entry point.
- Extract buildBaseRunConfig shared builder so resolve() and
  ResolveRunConfig() no longer duplicate model/credentials/skills setup.
- Add StoreRound to RunConfigResolver interface so discuss driver
  persists assistant output with full metadata, usage, and memory
  extraction (same quality as chat mode).
- Fix discuss driver context: use context.Background() instead of the
  short-lived HTTP request context that was getting cancelled.
- Fix model ID passed to StoreRound: return database UUID from
  ResolveRunConfig instead of SDK model name.
- Remove dead CLIAdapter/CLIType and update legacy web/cli references
  in tests and comments.

* fix: stop idle discuss goroutines after 10min timeout

Discuss session goroutines were never cleaned up when a session became
inactive (e.g. after /new). Add a 10-minute idle timer that auto-exits
the goroutine and removes it from the sessions map when no new RC
arrives.

* refactor: pipeline details — event types, structured reply, display content

- Remove [User sent N attachments] placeholder text from buildInboundQuery;
  attachment info is now expressed via pipeline <attachment> tags.
- Unify in-reply-to as structured ReplyRef (Sender/Preview fields) across
  Telegram, Discord, Feishu, and Matrix adapters instead of prepending
  [Reply to ...] text into the message body. Remove now-unused
  buildTelegramQuotedText, buildDiscordQuotedText, buildMatrixQuotedText.
- Make AdaptInbound return CanonicalEvent interface and dispatch to
  adaptMessage/adaptEdit/adaptService based on metadata["event_type"].
- Add event_id column to bot_history_messages (migration 0059) so user
  messages can reference their canonical pipeline event.
- PersistEvent now returns the event UUID; HandleInbound passes it through
  to both persistPassiveMessage and ChatRequest.EventID for storeRound.
- Add FillDisplayContent to message service: extracts plain text from
  event_data for clean frontend display.
- Frontend extractMessageText prefers display_content when available,
  falling back to legacy strip logic for old messages.
- Fix: always generate headerifiedQuery for storage even when usePipeline
  is true, so user messages are persisted via storeRound in chat mode.

* fix: use json.Marshal for pipeline context content serialization

The manual string escaping in buildMessagesFromPipeline only handled
double quotes but not newlines, backslashes, and other JSON special
characters, producing invalid json.RawMessage values. The LLM then
received empty/malformed context and complained about having no history.

* fix: restore WebSocket handler to use StreamChatWS directly

The previous refactoring replaced the WS handler with HandleInbound +
RouteHub subscription, which broke streaming because RouteHub events
use a different format (channel.StreamEvent) than what the frontend
expects (flow.WSStreamEvent with text_delta, tool_call_start, etc.).

Restore the original direct StreamChatWS call path so WebUI streaming
works again. The WS handler now matches the pre-refactoring behavior
while all other changes (pipeline, ACL, event types, etc.) are kept.

* feat: store display_text directly in bot_history_messages

Instead of computing display content at API response time by querying
bot_session_events via event_id, store the raw user text in a dedicated
display_text column at write time. This works for all paths including
the WebSocket handler which does not go through the pipeline/event layer.

- Migration 0060: add display_text TEXT column
- PersistInput gains DisplayText; filled from trimmedText (passive) and
  req.Query (storeRound)
- toMessageFields reads display_text into DisplayContent
- Remove FillDisplayContent runtime query and ListSessionEventsByEventID
- Frontend already prefers display_content when available (no change)

* fix: display_text should contain raw user text, not XML-wrapped query

req.Query gets overwritten to headerifiedQuery (with XML <message> tags)
before storeRound runs. Add RawQuery field to ChatRequest to preserve
the original user text, and use it for display_text in storeMessages.

* fix(web): show discuss sessions

* chore(feishu): change discuss output to stream card

* fix(channel): unify discuss/chat send path and card markdown delivery

* feat(discuss): switch to stream execution with RouteHub broadcasting

* refactor(pipeline): remove context trimming from ComposeContext

The pipeline path should not trim context by token budget — the
upstream IC/RC already bounds the event window. Remove TrimContext,
FindWorkingWindowCursor, EstimateTokens, FormatLastProcessedMs (all
unused or only used for trimming), the maxTokens parameter from
ComposeContext, and MaxContextTokens from DiscussSessionConfig.

---------

Co-authored-by: 晨苒 <16112591+chen-ran@users.noreply.github.com>
2026-04-06 21:56:25 +08:00
晨苒 830c521f11 feat(feishu): keep mention(at) target 2026-04-06 06:18:03 +08:00
Acbox a31995424c feat: add per-route message dispatch modes (inject/parallel/queue)
Introduce three inbound message handling modes for channel adapters:

- inject (default, /btw): when a route has an active agent stream,
  inject the new user message into the running stream via the SDK's
  PrepareStep hook between tool rounds. The message is interleaved at
  the correct position in the persisted round.
- parallel (/now): start a new agent stream immediately, running
  concurrently with any existing stream (preserves current behavior).
- queue (/next): enqueue the message and process it after the current
  stream completes.

Key components:
- RouteDispatcher: per-route state management with inject channel,
  task queue, and active-stream tracking.
- PrepareStep integration: drains inject channel between tool rounds,
  records insertion position via InjectedRecorder for correct
  persistence ordering.
- interleaveInjectedMessages: inserts injected user messages at their
  actual injection position within the persisted message round.
- Parallel mode isolation: /now streams do not interact with the
  dispatcher, preventing them from clearing another stream's active
  state.
2026-04-03 01:17:33 +08:00
Acbox fc2b603018 fix(agent): skip tools for models without tool-call capability and parse image output
- Add SupportsToolCall to RunConfig; only inject tools into SDK when set
- Update twilight-ai to 497ad09 which adds SSE scanner 10MB buffer
  (fixes token-too-long on large image payloads) and parses the images
  array from OpenAI-compatible chat completions into StreamFilePart
2026-04-03 00:01:14 +08:00
Acbox 33f39c20ff feat: add per-message model and reasoning effort override
Allow users to select a different model and reasoning effort level
directly from the chat input toolbar, overriding the bot defaults
on a per-message basis. The backend accepts optional model_id and
reasoning_effort parameters via both WebSocket and HTTP APIs, with
request-level values taking priority over bot/session settings.

- Backend: extend wsClientMessage and LocalChannelMessageRequest with
  model_id/reasoning_effort fields; add ReasoningEffort to ChatRequest;
  update resolver to prioritize request-level reasoning effort
- Frontend: add ModelOptions and ReasoningEffortSelect shared components;
  refactor model-select to reuse ModelOptions; add model/reasoning
  selectors to chat input toolbar; initialize from bot settings
- Regenerate swagger spec and TypeScript SDK
2026-03-29 19:45:55 +08:00
Acbox 0730ff2945 refactor: remove max_context_load_time and max_context_tokens from bot settings
These two fields controlled history context window (time-based) and token-based
trimming. They are no longer needed — the resolver now always uses the hardcoded
24-hour default and skips token-based history trimming.
2026-03-29 00:00:10 +08:00
Yiming Qi 64378d29ed feat: openai codex support (#292)
* feat(web): add provider oauth management ui

* feat: add OAuth callback support on port 1455

* feat: enhance reasoning effort options and support for OpenAI Codex OAuth

* feat: update twilight-ai dependency to v0.3.4

* refactor: promote openai-codex to first-class client_type, remove auth_type

Replace the previous openai-responses + metadata auth_type=openai-codex-oauth
combo with a dedicated openai-codex client_type. OAuth requirement is now
determined solely by client_type, eliminating the auth_type concept from the
LLM provider domain entirely.

- Add openai-codex to DB CHECK constraint (migration 0047) with data migration
- Add ClientTypeOpenAICodex constant and dedicated SDK/probe branches
- Remove AuthType from SDKModelConfig, ModelCredentials, TriggerConfig, etc.
- Simplify supportsOAuth to check client_type == openai-codex
- Add conf/providers/codex.yaml preset with Codex catalog models
- Frontend: replace auth_type selector with client_type-driven OAuth UI

---------

Co-authored-by: Acbox <acbox0328@gmail.com>
2026-03-27 19:30:45 +08:00
Acbox da2e999ce3 feat: searchable timezone select & bot timezone priority
- Add reusable TimezoneSelect component with search and UTC offset labels
- Replace plain Select with searchable TimezoneSelect in profile settings,
  bot settings, and browser context settings
- Move bot timezone setting from header dialog into bot settings tab
- Resolve timezone with bot > user > system priority for all LLM-facing
  time formatting (user message header, system prompt, heartbeat, tools,
  memory extraction)
- Format tool output timestamps (history, contacts) in resolved timezone
2026-03-26 21:00:21 +08:00
Acbox 65b2797626 refactor: unify SDK model factories into internal/models
Move CreateModel, BuildReasoningOptions, ReasoningBudgetTokens and
related types from internal/agent to internal/models as NewSDKChatModel,
SDKModelConfig, etc. This eliminates duplicate ClientType constants and
centralises all Twilight AI SDK instance creation in a single package.

NewSDKEmbeddingModel now accepts a clientType parameter and dispatches
to the native Google embedding provider for google-generative-ai,
instead of always using the OpenAI-compatible endpoint.
2026-03-26 20:08:35 +08:00
Yiming Qi 03ba13e7e5 feat: add timezone support for schedule and user runtime (#282) 2026-03-26 01:32:02 +08:00
Acbox e9c9ed5ab1 fix(agent): route native images into user message for vision models
Images sent by users were silently dropped when the model supported
vision: routeAttachmentsByCapability classified them as "Native", but
extractFileRefPaths only collected "Fallback" (tool_file_ref) paths,
so the image data URL was computed and then discarded — the model saw
neither the image nor its container path.

- Add InlineImages field to RunConfig to carry native image data
- Replace extractFileRefPaths with extractAttachmentPaths that
  collects paths from both Native (FallbackPath) and Fallback
  attachments so the YAML header always lists every attachment
- Add extractNativeImageParts to extract inline image data URLs
- Pass InlineImages as sdk.ImagePart in prepareRunConfig so the
  LLM receives the actual image content alongside the text query
2026-03-24 19:14:33 +08:00
AlexMa233 609ca49cf5 feat: matrix support (part 1) (#242)
* feat(channel): add Matrix adapter support

* fix(channel): prevent reasoning leaks in Matrix replies

* fix(channel): persist Matrix sync cursors

* fix(channel): improve Matrix markdown rendering

* fix(channel): support Matrix attachments and multimodal history

* fix(channel): expand Matrix reply media context

* fix(handlers): allow media downloads for chat-access bots

* fix(channel): classify Matrix DMs as direct chats

* fix(channel): auto-join Matrix room invites

* fix(channel): resolve Matrix room aliases for outbound send

* fix(web): use Matrix brand icon in channel badges

Replace the generic Matrix hashtag badge with the official brand asset so channel badges feel recognizable and fit the circular mask cleanly.

* fix(channel): add Matrix room whitelist controls

Let Matrix bots decide whether to auto-join invites and restrict inbound activity to allowed rooms or aliases. Expose the new controls in the web settings UI with line-based whitelist input so access rules stay explicit.

* fix(channel): stabilize Matrix multimodal follow-ups and settings

* fix(flow): avoid gosec panic on byte decoding

* fix: fix golangci-lint

* fix(channel): remove Matrix built-in ACL

* fix(channel): preserve Matrix image captions

* fix(channel): validate Matrix homeserver and sync access

Fail Matrix connections early when the homeserver, access token, or /sync capability is misconfigured so bot health checks surface actionable errors.

* fix(channel): preserve optional toggles and relax Matrix startup validation

* fix(channel): tighten Matrix mention fallback parsing

* fix(flow): skip structured assistant tool-call outputs

* fix(flow): resolve merged resolver duplication

Keep the internal agent resolver implementation after merging main so split helper files do not redeclare flow symbols. Restore user message normalization in sanitize and persistence paths to keep flow tests and command packages building.

* fix(flow): remove unused merged resolver helper

Drop the leftover truncate helper and import from the resolver merge fix so golangci-lint passes again without affecting flow behavior.

---------

Co-authored-by: Acbox Liu <acbox0328@gmail.com>
2026-03-22 21:55:34 +08:00
Acbox Liu b3a39ad93d refactor: replace persistent subagents with ephemeral spawn tool (#280)
* refactor: replace persistent subagents with ephemeral spawn tool (#subagent)

- Drop subagents table, remove all persistent subagent infrastructure
- Add 'subagent' session type with parent_session_id on bot_sessions
- Rewrite subagent tool as single 'spawn' tool with parallel execution
- Create system_subagent.md prompt, add _subagent.md include for chat
- Limit subagent tools to file, exec, web_search, web_fetch only
- Merge subagent token usage into parent chat session in reporting
- Remove frontend subagent management page, update chat UI for spawn
- Fix UTF-8 truncation in session title, fix query not passed to agent

* refactor: remove history message page
2026-03-22 19:03:28 +08:00
Acbox Liu b88ca96064 refactor: provider & models (#277)
* refactor: move client_type to provider, replace model fields with config JSONB

- Move `client_type` from `models` to `llm_providers` table
- Add `icon` field to `llm_providers`
- Replace `dimensions`, `input_modalities`, `supports_reasoning` on `models`
  with a single `config` JSONB column containing `dimensions`,
  `compatibilities` (vision, tool-call, image-output, reasoning),
  and `context_window`
- Auto-imported models default to vision + tool-call + reasoning
- Update all backend consumers (agent, flow resolver, handlers, memory)
- Regenerate sqlc, swagger, and TypeScript SDK
- Update frontend forms, display, and i18n for new schema

* ui: show provider icon avatar in sidebar and detail header, remove icon input

* feat: add built-in provider registry with YAML definitions and enable toggle

- Add `enable` column to llm_providers (default true, backward-compatible)
- Create internal/registry package to load YAML provider/model definitions
  on startup and upsert into database (new providers disabled by default)
- Add conf/providers/ with OpenAI, Anthropic, Google YAML definitions
- Add RegistryConfig to TOML config (providers_dir, default conf/providers)
- Model listing APIs and conversation flow now filter by enabled providers
- Frontend: enable switch in provider form, green status dot in sidebar,
  enabled providers sorted to top

* fix: make 0041 migration idempotent for fresh databases

Guard data migration steps with column-existence checks so the
migration succeeds on databases created from the updated init schema.
2026-03-22 17:24:45 +08:00
Acbox Liu de62f94315 feat: add context compaction to automatically summarize old messages (#compaction) (#276)
When input tokens exceed a configurable threshold after a conversation round,
the system asynchronously compacts older messages into a summary. Cascading
compactions reference prior summaries via <prior_context> tags to maintain
conversational continuity without duplicating content.

- Add bot_history_message_compacts table and compact_id on messages
- Add compaction_enabled, compaction_threshold, compaction_model_id to bots
- Implement compaction service (internal/compaction) with LLM summarization
- Integrate into conversation flow: replace compacted messages with summaries
  wrapped in <summary> tags during context loading
- Add REST API endpoints (GET/DELETE /bots/:bot_id/compaction/logs)
- Add frontend Compaction tab with settings and log viewer
- Wire compaction service into both dev (cmd/agent) and prod (cmd/memoh) entry points
- Update test mocks to include new GetBotByID columns
2026-03-22 14:26:00 +08:00
Acbox Liu 7d7d0e4b51 refactor: introduce multi-session chat support (#session) (#267)
* refactor: introduce multi-session chat support (#session)

Replace the single-context-per-bot model with multiple chat sessions.

Database:
- Add bot_sessions table (route_id, channel_type, title, metadata, soft delete)
- Migrate bot_history_messages from (route_id, channel_type) to session_id
- Add active_session_id to bot_channel_routes
- Migration 0036 handles data migration from existing messages

Backend:
- New internal/session service for session CRUD
- Update message service/types to use session_id instead of route_id
- Update conversation flow (resolver, history, store) for session context
- Channel inbound auto-creates/retrieves active session via SessionEnsurer
- New REST endpoints: /bots/:bot_id/sessions (CRUD)
- WebSocket and message handlers accept optional session_id
- Wire session service into FX dependency graph (agent + memoh)

Frontend:
- Refactor chat store: sessions replaces chats, sessionId replaces chatId
- Session-aware message loading, sending, and pagination
- WebSocket sends include session_id
- New session sidebar component with select/delete
- Chat area header shows active session title + new session button
- API layer updated: fetchSessions, createSession, deleteSession
- i18n strings for session management (en + zh)

SDK:
- Regenerated TypeScript SDK and Swagger docs with session endpoints

* fix: update tests for session refactoring (RouteID → SessionID)

Remove references to removed RouteID and Platform fields from
PersistInput/Message in channel_test.go and service_integration_test.go.

* fix: restore accidentally deleted SDK files and guard migration 0032

- Restore packages/sdk/src/container-stream.ts and extra/index.ts that
  were accidentally removed during SDK regeneration
- Wrap migration 0032 route_id index creation in a column existence check
  to avoid failure on fresh databases where 0001_init.up.sql no longer
  has route_id

* fix: guard migration 0036 data steps for fresh databases

Wrap steps 3-7 (which reference route_id/channel_type on
bot_history_messages) in a column existence check so the migration
is safe on fresh databases where 0001_init.up.sql already reflects
the final schema without those columns.

* feat: add title model setting and auto-generate session titles on user input

- Add title_model_id to bots table (migration 0037) and bot settings API
- Implement async title generation triggered at user message time (not after
  assistant response) for faster title availability
- Publish session_title_updated events via SSE event hub for real-time
  frontend updates without page refresh
- Fix SSE message event parsing: use direct JSON.parse instead of
  normalizeStreamEvent which silently dropped non-chat-stream event types
- Add title model selector in bot settings UI with i18n support

* fix: session-scoped message filtering and URL-based chat routing

- Filter realtime SSE messages by session_id to prevent cross-session
  message leakage after page refresh
- Add /chat/:sessionId? route with bidirectional URL ↔ store sync
- Visiting /chat shows a clean state with no bot or session pre-selected
- Visiting /chat/:sessionId loads the specific session directly
- Session switches from sidebar automatically update the URL
- Fix stale RouteID field in dedupe test (removed during session refactor)

* fix: skip cross-channel stream events to prevent session leakage

The bot-level web stream pushes events from all channels (Telegram,
Discord, etc.) without session_id context. Previously these were
rendered inline in the current chat view regardless of session.

Now cross-channel events are ignored in handleLocalStreamEvent;
persisted messages arrive via the SSE message events stream with
proper session_id filtering through appendRealtimeMessage.

* feat: show IM avatars and platform badges on session sidebar

- Add sender_avatar_url to route metadata from identity resolution
- Resolve group avatar and handle via directory adapter for group chats
- JOIN bot_channel_routes in ListSessionsByBot to return route metadata
- Display avatar with ChannelBadge on IM session items (group avatar
  for groups, sender avatar for private chats)
- Show @groupname or @username as session sub-label

* fix: clean up RunConfig unused fields, fix skill system and copy bug

- Remove unused RunConfig fields: Tools, Channels, CurrentChannel,
  ActiveContextTime
- Remove unused SessionContext fields: DisplayName, ConversationType
- Fix EnabledSkillNames copy bug: make([]string, 0, n) + copy copies
  zero elements; changed to make([]string, n)
- Fix prepareRunConfig dead code: remove no-op loop over
  CurrentPlatform runes; compute supportsImageInput from model's
  InputModalities
- Fix EnabledSkills always nil in system prompt: resolve enabled skill
  entries from EnabledSkillNames + Skills
- Fix use_skill tool returning empty response: now returns full skill
  content (description + instructions) so LLM gets it in the same turn
- Skip use_skill tool registration when no skills are available
- Conditionally render Skills section in system prompt (hidden when
  no skills exist)

* feat: add session type field and bind sessions to heartbeat/schedule executions

- Add `type` column to `bot_sessions` (chat | heartbeat | schedule)
- Add `session_id` to `bot_heartbeat_logs` for per-execution session tracking
- Create `schedule_logs` table binding schedule_id + session_id
- Heartbeat and schedule runs now create independent sessions and persist
  agent messages via storeRound, enabling full conversation replay
- Add schedule logs API endpoints (list by bot, list by schedule, delete)
- Update Triggerer interfaces to return TriggerResult with status/usage/model

* refactor: modular system prompts per session type (chat/heartbeat/schedule)

Split the monolithic system.md into three type-specific system prompts
with shared fragments via {{include:_xxx}} syntax, so each session type
gets a focused prompt without irrelevant instructions.

* fix: prevent message duplication after task completion

message_created events from Persist() had an empty platform field because
toMessageFromCreate() didn't extract it from the session. This caused
appendRealtimeMessage to fail the platform === 'web' guard, and
hasMessageWithId to fail because local IDs differ from server UUIDs,
resulting in all messages being appended as duplicates.

- Extract platform from metadata in toMessageFromCreate so published events
  carry the correct value
- Pass channel_type: 'web' when creating sessions from the web frontend so
  List queries return the correct platform via the session JOIN

* fix: use per-message usage from SDK instead of misaligned step-level usages

Previously, token usage was stored via a separate per-step usages array
that didn't align with messages (off-by-one from prepending user message,
step count != message count). This caused:
- User messages incorrectly receiving usage data
- Usage values shifted across messages in multi-step rounds
- Last assistant message getting the accumulated total instead of its own step usage
- InputTokenDetails/OutputTokenDetails lost during manual accumulation

Now each sdk.Message carries its own per-step Usage (set by the SDK in
buildStepMessages), which is extracted in sdkMessagesToModelMessages and
stored directly via ModelMessage.Usage. The storeRound/storeMessages path
no longer needs external usage/usages parameters.

Also fixes the totalUsage accumulation in runStream to include all detail
fields (InputTokenDetails, OutputTokenDetails).

* feat: add /new slash command to create a new active session from IM channels

Users in Telegram/Discord/Feishu can now send /new to start a fresh
conversation, resetting the session context for the current chat thread.
The command resolves the channel route, creates a new session, sets it as
the active session on the route, and replies with a confirmation message.

* feat: distinguish heartbeat and schedule sessions with dedicated icons in sidebar

Heartbeat sessions show a heart-pulse icon (rose), schedule sessions
show a clock icon (amber), and both display a type label beneath the
session title.

* refactor: remove enabledSkills system prompt injection, keep sorted skill listing

use_skill now returns skill content directly as tool output, so there is
no need to inject enabled skill body text into the system prompt. Remove
the entire enabledSkills tracking chain (RunConfig.EnabledSkillNames,
StreamEvent.Skills, GenerateResult.Skills, ChatRequest/Response.Skills,
enableSkill closures in runStream/runGenerate, prepareRunConfig matching).

Keep a lightweight skills listing (name + description only) in the system
prompt so the model knows which skills are available. Sort entries by name
to guarantee deterministic ordering and maximize KV cache reuse.

* refactor: remove inbox system, persist passive messages directly to history

Replace the bot_inbox table and service with direct writes to
bot_history_messages for group conversations where the bot is not
@mentioned. Trigger-path messages continue to be persisted after the
agent responds (unchanged).

- Drop bot_inbox table and max_inbox_items column (migration 0039)
- Delete internal/inbox/, handlers/inbox.go, command/inbox.go,
  agent/tools/inbox.go and the MCP message provider
- Add persistPassiveMessage() in channel inbound to write user
  messages into the active session immediately
- Rewrite ListObservedConversationsByChannelIdentity to query
  bot_history_messages + bot_sessions instead of bot_inbox
- Extract shared send/react logic into internal/messaging/executor.go;
  agent/tools/message.go is now a thin SDK adapter
- Clean up all inbox references from agent prompts, flow resolver,
  email trigger, settings, commands, DI wiring, and frontend
- Regenerate sqlc, swagger, and SDK

* feat: add list_sessions and search_messages agent tools

Provide agents with the ability to query session metadata and search
message history across all sessions. search_messages supports filtering
by time range, keyword (JSONB-aware ILIKE), session, contact, and role,
with a default 7-day lookback when no start_time is given.

* feat: inject last_heartbeat time and improve heartbeat search guidance

Query the previous heartbeat's started_at timestamp and pass it through
TriggerPayload into the heartbeat prompt template. Update system prompt
and HEARTBEAT.md checklist to guide agents to use search_messages with
start_time=last_heartbeat for efficient cross-session message review.

* fix: pass BridgeProvider to FSClient and store full heartbeat prompt

FSClient was always created with nil provider, causing all container
file reads (IDENTITY.md, SOUL.md, MEMORY.md, HEARTBEAT.md, etc.) to
silently return empty strings. Expose Agent.BridgeProvider() and wire
it into Resolver. Also fix heartbeat trigger to store the full prompt
template as the user message instead of the literal "heartbeat" string.

* feat: add line numbers to container file read output

Move line-number formatting from the bridge gRPC server to the agent
tool layer so that the raw content stored and transmitted via gRPC
remains clean, while the read_file tool output includes numbered lines
for easier reference by the agent.

* chore(deps): update twilight-ai to v0.3.2

* fix: lint, test
2026-03-21 15:57:22 +08:00
Ringo.Typowriter ad08f335eb feat(agent): restore read_media in pure Go (#257) 2026-03-21 14:28:50 +08:00
Acbox Liu 1680316c7f refactor(agent): remove agent gateway instead of twilight sdk (#264)
* refactor(agent): replace TypeScript agent gateway with in-process Go agent using twilight-ai SDK

- Remove apps/agent (Bun/Elysia gateway), packages/agent (@memoh/agent),
  internal/bun runtime manager, and all embedded agent/bun assets
- Add internal/agent package powered by twilight-ai SDK for LLM calls,
  tool execution, streaming, sential logic, tag extraction, and prompts
- Integrate ToolGatewayService in-process for both built-in and user MCP
  tools, eliminating HTTP round-trips to the old gateway
- Update resolver to convert between sdk.Message and ModelMessage at the
  boundary (resolver_messages.go), keeping agent package free of
  persistence concerns
- Prepend user message before storeRound since SDK only returns output
  messages (assistant + tool)
- Clean up all Docker configs, TOML configs, nginx proxy, Dockerfile.agent,
  and Go config structs related to the removed agent gateway
- Update cmd/agent and cmd/memoh entry points with setter-based
  ToolGateway injection to avoid FX dependency cycles

* fix(web): move form declaration before computed properties that reference it

The `form` reactive object was declared after computed properties like
`selectedMemoryProvider` and `isSelectedMemoryProviderPersisted` that
reference it, causing a TDZ ReferenceError during setup.

* fix: prevent UTF-8 character corruption in streaming text output

StreamTagExtractor.Push() used byte-level string slicing to hold back
buffer tails for tag detection, which could split multi-byte UTF-8
characters. After json.Marshal replaced invalid bytes with U+FFFD,
the corruption became permanent — causing garbled CJK characters (�)
in agent responses.

Add safeUTF8SplitIndex() to back up split points to valid character
boundaries. Also fix byte-level truncation in command/formatter.go
and command/fs.go to use rune-aware slicing.

* fix: add agent error logging and fix Gemini tool schema validation

- Log agent stream errors in both SSE and WebSocket paths with bot/model context
- Fix send tool `attachments` parameter: empty `items` schema rejected by
  Google Gemini API (INVALID_ARGUMENT), now specifies `{"type": "string"}`
- Upgrade twilight-ai to d898f0b (includes raw body in API error messages)

* chore(ci): remove agent gateway from Docker build and release pipelines

Agent gateway has been replaced by in-process Go agent; remove the
obsolete Docker image matrix entry, Bun/UPX CI steps, and agent-binary
build logic from the release script.

* fix: preserve attachment filename, metadata, and container path through persistence

- Add `name` column to `bot_history_message_assets` (migration 0034) to
  persist original filenames across page refreshes.
- Add `metadata` JSONB column (migration 0035) to store source_path,
  source_url, and other context alongside each asset.
- Update SQL queries, sqlc-generated code, and all Go types (MessageAsset,
  AssetRef, OutboundAssetRef, FileAttachment) to carry name and metadata
  through the full lifecycle.
- Extract filenames from path/URL in AttachmentsResolver before clearing
  raw paths; enrich streaming event metadata with name, source_path, and
  source_url in both the WebSocket and channel inbound ingestion paths.
- Implement `LinkAssets` on message service and `LinkOutboundAssets` on
  flow resolver so WebSocket-streamed bot attachments are persisted to the
  correct assistant message after streaming completes.
- Frontend: update MessageAsset type with metadata field, pass metadata
  through to attachment items, and reorder attachment-block.vue template
  so container files (identified by metadata.source_path) open in the
  sidebar file manager instead of triggering a download.

* refactor(agent): decouple built-in tools from MCP, load via ToolProvider interface

Migrate all 13 built-in tool providers from internal/mcp/providers/ to
internal/agent/tools/ using the twilight-ai sdk.Tool structure. The agent
now loads tools through a ToolProvider interface instead of the MCP
ToolGatewayService, which is simplified to only manage external federation
sources. This enables selective tool loading and removes the coupling
between business tools and the MCP protocol layer.

* refactor(flow): split monolithic resolver.go into focused modules

Break the 1959-line resolver.go into 12 files organized by concern:
- resolver.go: core orchestration (Resolver struct, resolve, Chat, prepareRunConfig)
- resolver_stream.go: streaming (StreamChat, StreamChatWS, tryStoreStream)
- resolver_trigger.go: schedule/heartbeat triggers
- resolver_attachments.go: attachment routing, inlining, encoding
- resolver_history.go: message loading, deduplication, token trimming
- resolver_store.go: persistence (storeRound, storeMessages, asset linking)
- resolver_memory.go: memory provider integration
- resolver_model_selection.go: model selection and candidate matching
- resolver_identity.go: display name and channel identity resolution
- resolver_settings.go: bot settings, loop detection, inbox
- user_header.go: YAML front-matter formatting
- resolver_util.go: shared utilities (sanitize, normalize, dedup, UUID)

* fix(agent): enable Anthropic extended thinking by passing ReasoningConfig to provider

Anthropic's thinking requires WithThinking() at provider creation time,
unlike OpenAI which uses per-request ReasoningEffort. The config was
never wired through, so Claude models could not trigger thinking.

* refactor(agent): extract prompts into embedded markdown templates

Move inline prompt strings from prompt.go into separate .md files under
internal/agent/prompts/, using {{key}} placeholders and a simple render
engine. Remove obsolete SystemPromptParams fields (Language,
MaxContextLoadTime, Channels, CurrentChannel) and their call-site usage.

* fix: lint
2026-03-19 13:31:54 +08:00
MoeMagicMango 3ca84a3ab6 fix(image): add normalization for image parts in messages to fix "The messages do not match the ModelMessage[] schema." (#260) 2026-03-18 15:20:51 +08:00
Menci d5b410d7e3 refactor(workspace): new workspace v3 container architecture (#244)
* feat(mcp): workspace container with bridge architecture

Migrate MCP containers to use UDS-based bridge communication instead of
TCP gRPC. Containers now mount runtime binaries and Unix domain sockets
from the host, eliminating the need for a dedicated MCP Docker image.

- Remove Dockerfile.mcp and entrypoint.sh in favor of standard base images
- Add toolkit Dockerfile for building MCP binary separately
- Containers use bind mounts for /opt/memoh (runtime) and /run/memoh (UDS)
- Update all config files with new runtime_path and socket_dir settings
- Support custom base images per bot (debian, alpine, ubuntu, etc.)
- Legacy container detection and TCP fallback for pre-bridge containers
- Frontend: add base image selector in container creation UI

* feat(container): SSE progress bar for container creation

Add real-time progress feedback during container image pull and creation
using Server-Sent Events, without breaking the existing synchronous JSON
API (content negotiation via Accept header).

Backend:
- Add PullProgress/LayerStatus types and OnProgress callback to
  PullImageOptions (containerd service layer)
- DefaultService.PullImage polls ContentStore.ListStatuses every 500ms
  when OnProgress is set; AppleService ignores it
- CreateContainer handler checks Accept: text/event-stream and switches
  to SSE branch: pulling → pull_progress → creating → complete/error

Frontend:
- handleCreateContainer/handleRecreateContainer use fetch + SSE instead
  of the SDK's synchronous postBotsByBotIdContainer
- Progress bar shows layer-level pull progress (offset/total) during
  pulling phase and indeterminate animation during creating phase
- i18n keys added for pullingImage and creatingContainer (en/zh)

* fix(container): clear stale legacy route and type create SSE

* fix(ci): resolve lint errors and arm64 musl node.js download

- Fix unused-receiver lint: rename `s` to `_` on stub methods in
  manager_legacy_test.go
- Fix sloglint: use slog.DiscardHandler instead of
  slog.NewTextHandler(io.Discard, nil)
- Handle missing arm64 musl Node.js builds: unofficial-builds.nodejs.org
  does not provide arm64 musl binaries, fall back to glibc build

* fix(lint): address errcheck, staticcheck, and gosec findings

- Discard os.Setenv/os.Remove return values explicitly with _
- Use omitted receiver name instead of _ (staticcheck ST1006)
- Tighten directory permissions from 0o755 to 0o750 (gosec G301)

* fix(lint): sanitize socket path to satisfy gosec G703

filepath.Clean the env-sourced socket path before os.Remove
to avoid path-traversal taint warning.

* fix(lint): use nolint directive for gosec G703 on socket path

filepath.Clean does not satisfy gosec's taint analysis. The socket
path comes from MCP_SOCKET_PATH env (operator-configured) or a
compiled-in default, not from end-user input.

* refactor: rename MCP container/bridge to workspace/bridge

Split internal/mcp/ to separate container lifecycle management from
Model Context Protocol connections, eliminating naming confusion:

- internal/mcp/ (container mgmt) → internal/workspace/
- internal/mcp/mcpclient/ → internal/workspace/bridge/
- internal/mcp/mcpcontainer/ → internal/workspace/bridgepb/
- cmd/mcp/ → cmd/bridge/
- config: MCPConfig → WorkspaceConfig, [mcp] → [workspace]
- container prefix: mcp-{id} → workspace-{id}
- labels: mcp.bot_id → memoh.bot_id, add memoh.workspace=v1
- socket: mcp.sock → bridge.sock, env BRIDGE_SOCKET_PATH
- runtime: /opt/memoh/runtime/mcp → /opt/memoh/runtime/bridge
- devenv: mcp-build.sh → bridge-build.sh

Legacy containers (mcp- prefix) detected by container name prefix
and handled via existing fallback path.

* fix(container): use memoh.workspace=v3 label value

* refactor(container): drop LegacyBotLabelKey, infer bot ID from container name

Legacy containers use mcp-{botID} naming, so bot ID can be derived
via TrimPrefix instead of looking up the mcp.bot_id label.

* fix(workspace): resolve containers via manager and drop gateway container ID

* docs: fix stale mcp references in AGENTS.md and DEPLOYMENT.md

* refactor(workspace): move container lifecycle ownership into manager

* dev: isolate local devenv from prod config

* toolkit: support musl node runtime

* containerd: fix fallback resolv.conf permissions

* web: preserve container create progress on completion

* web: add bot creation wait hint

* fix(workspace): preserve image selection across recreate

* feat(web): shorten default docker hub image refs

* fix(container): address code review findings

- Remove synchronous CreateContainer path (SSE-only now)
- Move flusher check before WriteHeader to avoid committed 200 on error
- Fix legacy container IP not cached via ensureContainerAndTask path
- Add atomic guard to prevent stale pull_progress after PullImage returns
- Defensive copy for tzEnv slice to avoid mutating shared backing array
- Restore network failure severity in restartContainer (return + Error)
- Extract duplicate progress bar into ContainerCreateProgress component
- Fix codesync comments to use repo-relative paths
- Add SaaS image validation note and kernel version comment on reaper

* refactor(devenv): extract toolkit install into shared script

Unify the Node.js + uv download logic into docker/toolkit/install.sh,
used by the production Dockerfile and runnable locally for dev.

Dev environment no longer bakes toolkit into the Docker image — it is
volume-mounted from .toolkit/ instead, so wrapper script changes take
effect immediately without rebuilding. The entrypoint checks for the
toolkit directory and prints a clear error if missing.

* fix(ci): address go ci failures

* chore(docker): remove unused containerd image

* refactor(config): rename workspace image key

* fix(workspace): fix legacy container data loss on migration and stop swallowing errors

Three root causes were identified and fixed:

1. Delete() used hardcoded "workspace-" prefix to look up legacy "mcp-"
   containers, causing GetContainer to return NotFound. CleanupBotContainer
   then silently skipped the error and deleted the DB record without ever
   calling PreserveData. Fix: resolve the actual container ID via
   ContainerID() (DB → label → scan) before operating.

2. Multiple restore error paths were silently swallowed (logged as Warn
   but not returned), so the user saw HTTP 200/204 with no data and no
   error. Fix: all errors in the preserve/restore chain now block the
   workflow and propagate to the caller.

3. tarGzDir used cached DirEntry.Info() for tar header size, which on
   overlayfs can differ from the actual file size, causing "archive/tar:
   write too long". Fix: open the file first, Fstat the fd for a
   race-free size, and use LimitReader as a safeguard.

Also adds a "restoring" SSE phase so the frontend shows a progress
indicator ("Restoring data, this may take a while...") during data
migration on container recreation.

* refactor(workspace): single-point container ID resolution

Replace the `containerID func(string) string` field with a single
`resolveContainerID(ctx, botID)` method that resolves the actual
container ID via DB → label → scan → fallback. All ~16 lookup
callsites across manager.go, dataio.go, versioning.go, and
manager_lifecycle.go now go through this single resolver, which
correctly handles both legacy "mcp-" and new "workspace-" containers.

Only `ensureBotWithImage` inlines `ContainerPrefix + botID` for
creating brand-new containers — every other path resolves dynamically.

* fix(web): show progress during data backup phase of container recreate

The recreate flow (delete with preserve_data + create with restore_data)
blocked on the DELETE call while backing up /data with no progress
indication. Add a 'preserving' phase to the progress component so
users see "正在备份数据..." instead of an unexplained hang.

* chore: remove [MYDEBUG] debug logging

Clean up all 112 temporary debug log statements added during the
legacy container migration investigation. Kept only meaningful
warn-level logs for non-fatal errors (network teardown, rename
failures).
2026-03-18 15:19:09 +08:00
晨苒 627b673a5c refactor: multi-provider memory adapters with scan-based builtin (#227)
* refactor: restructure memory into multi-provider adapters, remove manifest.json dependency

- Rename internal/memory/provider to internal/memory/adapters with per-provider subdirectories (builtin, mem0, openviking)
- Replace manifest.json-based delete/update with scan-based index from daily files
- Add mem0 and openviking provider adapters with HTTP client, chat hooks, MCP tools, and CRUD
- Wire provider lifecycle into registry (auto-instantiate on create, evict on update/delete)
- Split docker-compose into base stack + optional overlays (qdrant, browser, mem0, openviking)
- Update admin UI to support dynamic provider config schema rendering

* chore(lint): fix all golangci-lint issues for clean CI

* refactor(docker): replace compose overlay files with profiles

* feat(memory): add built-in memory multi modes

* fix(ci): golangci lint

* feat(memory): edit built-in memory sparse design
2026-03-14 06:04:13 +08:00
Fodesu b46e494d3a feat(tts): introduce TTS system (#195) 2026-03-13 02:49:52 +08:00
Acbox 30653fbdbf fix(agent): reject send tool when targeting the same conversation
Pass replyTarget through the full pipeline (ChatRequest → gateway
identity → agent headers → MCP session) so the send tool can detect
when the destination matches the current conversation and return an
error guiding the agent to reply directly instead.
2026-03-11 16:59:42 +08:00
Acbox Liu 23d49a1c7b feat: message abort and web socket support (#222)
* feat: message abort and web socket support

* fix(web): chat end

* fix: lint

* fix: lint
2026-03-09 23:27:50 +08:00
Menci c741f2410b fix(conversation): correct token trimming edge cases (#207)
- Treat maxTokens=0 as "unconfigured/unlimited" instead of disabling
  trimming for any non-positive value (which masked exhausted budgets)
- Set historyBudget=1 when maxTokens>0 but overhead exceeds the limit,
  ensuring aggressive trimming instead of no trimming
- Estimate token cost for messages without usage data (len/4 fallback)
  so user/tool messages are not free-passed during budget accounting
2026-03-09 13:06:19 +08:00
BBQ 3739def43f fix(text): avoid breaking UTF-8 during truncation
Use rune-aware truncation for user-facing text and log previews so multibyte content is not corrupted in memory context, Telegram messages, or diagnostics.
2026-03-09 12:43:57 +08:00
Ringo.Typowriter e6a6dbe3f6 feat(channel): add QQ channel support and image message pipeline (#199)
* feat(channel): add qq adapter and outbound delivery

* feat(channel): ingest inbound qq messages

* feat(web): expose qq channel in management ui

* feat(channel): support qq attachment ingestion

* fix(mcp): fail read raw immediately for missing files

* fix(agent): parse inline image data into native image parts

* test(agent): align read_media tool tests with SDK options

* fix(channel): harden qq image delivery and reconnect loop

Avoid data URLs for qq channel images, reset reconnect backoff after healthy sessions, and fall back gracefully for malformed public image URLs.

* fix(channel): restore qq media delivery and target resolution

* fix(qq,mcp,agent): fix message/qq regressions and pass go lint

* fix(qq,agent): validate inline base64 and sync heartbeat seq

* fix(qq): validate remote voice mime for upload checks

* fix(qq): fall back intents and restore adapter wiring

* fix(qq): prevent final text leakage and dedupe persisted inbound query
2026-03-07 17:12:06 +08:00
Acbox 4109a141f9 feat: move all tools from @memoh/agent into built-in mcp 2026-03-06 16:48:18 +08:00
BBQ 3feb03aca7 ci: add go lint and race test workflow (#187) 2026-03-05 11:25:33 +08:00
Acbox Liu ea719f7ca7 refactor: memory provider (#140)
* refactor: memory provider

* fix: migrations

* feat: divide collection from different built-in memory

* feat: add `MEMORY.md` and `PROFILES.md`

* use .env for docker compose. fix #142 (#143)

* feat(web): add brand icons for search providers (#144)

Add custom FontAwesome icon definitions for all 9 search providers:
- Yandex: uses existing faYandex from FA free brands
- Tavily, Jina, Exa, Bocha, Serper: custom icons from brand SVGs
- DuckDuckGo, SearXNG, Sogou: custom icons from Simple Icons

Icons are registered with a custom 'fac' prefix and rendered as
monochrome (currentColor) via FontAwesome's standard rendering.

* fix: resolve multiple UI bugs (#147)

* feat: add email service with multi-adapter support (#146)

* feat: add email service with multi-adapter support

Implement a full-stack email service with global provider management,
per-bot bindings with granular read/write permissions, outbox audit
storage, and MCP tool integration for direct mailbox access.

Backend:
- Email providers: CRUD with dynamic config schema (generic SMTP/IMAP, Mailgun)
- Generic adapter: go-mail (SMTP) + go-imap/v2 (IMAP IDLE real-time push via
  UnilateralDataHandler + UID-based tracking + periodic check fallback)
- Mailgun adapter: mailgun-go/v5 with dual inbound mode (webhook + poll)
- Bot email bindings: per-bot provider binding with independent r/w permissions
- Outbox: outbound email audit log with status tracking
- Trigger: inbound emails push notification to bot_inbox (from/subject only,
  LLM reads full content on demand via MCP tools)
- MailboxReader interface: on-demand IMAP queries for listing/reading emails
- MCP tools: email_accounts, email_send, email_list (paginated mailbox),
  email_read (by UID) — all with multi-binding and provider_id selection
- Webhook: /email/mailgun/webhook/:config_id (JWT-skipped, signature-verified)
- DB migration: 0019_add_email (email_providers, bot_email_bindings, email_outbox)

Frontend:
- Email Providers page: /email-providers with MasterDetailSidebarLayout
- Dynamic config form rendered from ordered provider meta schema with i18n keys
- Bot detail: Email tab with bindings management + outbox audit table
- Sidebar navigation entry
- Full i18n support (en + zh)
- Auto-generated SDK from Swagger

Closes #17

* feat(email): trigger bot conversation immediately on inbound email

Instead of only storing an inbox item and waiting for the next chat,
the email trigger now proactively invokes the conversation resolver
so the bot processes new emails right away — aligned with the
schedule/heartbeat trigger pattern.

* fix: lint

---------

Co-authored-by: Acbox <acbox0328@gmail.com>

* chore: update AGENTS.md

* feat: files preview

* feat(web): improve MCP details page

* refactor(skills): import skill with pure markdown string

* merge main into refactor/memory

* fix: migration

* refactor: temp delete qdrant and bm25 index

* fix: clean merge code

* fix: update memory handler

---------

Co-authored-by: Leohearts <leohearts@leohearts.com>
Co-authored-by: Menci <mencici@msn.com>
Co-authored-by: Quincy <69751197+dqygit@users.noreply.github.com>
Co-authored-by: BBQ <35603386+HoneyBBQ@users.noreply.github.com>
Co-authored-by: Ran <16112591+chen-ran@users.noreply.github.com>
2026-03-03 15:33:50 +08:00
Ringo.Typowriter d3edd17d90 feat(agent): loop detection (#152)
* feat(loop-detection): add configurable text and tool loop guards

* style(web): remove duplicate separator in bot settings
2026-03-02 15:00:09 +08:00
Acbox Liu 0cdf822603 feat: token usage state (#153)
* feat: token usage state

* fix: typo
2026-03-01 02:19:07 +08:00
Acbox Liu fe10abf3fc refactor: inbox (#137)
* refactor: inbox

* fix: migrations

* fix: migrations
2026-02-26 20:16:02 +08:00
Acbox Liu 2f38662d4d feat: heartbeat (#108)
* feat: heartbeat

* feat: independent heartbeat model
2026-02-25 16:32:52 +08:00
Acbox Liu 17cd077f34 feat: add thinking support (#100)
* feat: add thinking support

* feat: improve thinking block render in web and filter thinking content in channels

* fix: migrate
2026-02-23 14:41:27 +08:00
Acbox ac929f9f44 feat: add message id in user header 2026-02-23 00:06:15 +08:00
ringotypowriter f00e4bcd4e fix(flow): support UUID model refs when provider filter is set 2026-02-22 16:53:30 +08:00
ringotypowriter 8cd7c4aa86 Merge branch 'main' into fix/provider-scoped-model-id-resolution 2026-02-22 12:47:37 +08:00
Acbox Liu c591af14b0 feat: bot inbox (#77)
* feat: bot inbox

* feat: unified header

* fix: missing tool_call usage

* feat: add group name in header
2026-02-22 01:27:24 +08:00
ringotypowriter 50bdbd519c fix(models,settings,conversation): scope model_id uniqueness per
provider and harden model reference resolution
2026-02-21 22:31:32 +08:00
Ringo.Typowriter 9461f923df fix(flow): stabilize chunked SSE and unify prune limits for read/exec/gateway (#71)
* fix(agent): emit chunked SSE data

fix(flow): reassemble chunked SSE and prune tool payloads

fix: avoid whitespace prune bypass; optimize chunked SSE builder

* refactor: LLM provider pruning use shared textprune library

* chore: smaller range
2026-02-21 17:06:02 +08:00
Acbox 6b7c3db952 refactor: process user header in go side 2026-02-20 21:40:13 +08:00
Acbox ab84e29dde fix: change memory message role from system to user caused by the imcompatibility of anthropic messages api 2026-02-20 15:19:24 +08:00
Acbox 5e1de4fe7b fix: include system tokens in max tokens compute 2026-02-19 18:28:29 +08:00
Acbox a65c741e28 fix(agent): missing user header in message store 2026-02-19 17:19:39 +08:00
BBQ bc374fe8cd refactor: content-addressed assets, cross-channel multimodal, infra simplification (#63)
* refactor(attachment): multimodal attachment refactor with snapshot schema and storage layer

- Add snapshot schema migration (0008) and update init/versions/snapshots
- Add internal/attachment and internal/channel normalize for unified attachment handling
- Move containerfs provider from internal/media to internal/storage
- Update agent types, channel adapters (Telegram/Feishu), inbound and handlers
- Add containerd snapshot lineage and local_channel tests
- Regenerate sqlc, swagger and SDK

* refactor(media): content-addressed asset system with unified naming

- Replace asset_id foreign key with content_hash as sole identifier
  for bot_history_message_assets (pure soft-link model)
- Remove mime, size_bytes, storage_key from DB; derive at read time
  via media.Resolve from actual storage
- Merge migrations 0008/0009 into single 0008; keep 0001 as canonical schema
- Add Docker initdb script for deterministic migration execution order
- Fix cross-channel real-time image display (Telegram → WebUI SSE)
- Fix message disappearing on refresh (null assets fallback)
- Fix file icon instead of image preview (mime derivation from storage)
- Unify AssetID → ContentHash naming across Go, Agent, and Frontend
- Change storage key prefix from 4-char to 2-char for directory sharding
- Add server-entrypoint.sh for Docker deployment migration handling

* refactor(infra): embedded migrations, Docker simplification, and config consolidation

- Embed SQL migrations into Go binary, removing shell-based migration scripts
- Consolidate config files into conf/ directory (app.example.toml, app.docker.toml, app.dev.toml)
- Simplify Docker setup: remove initdb.d scripts, streamline nginx config and entrypoint
- Remove legacy CLI, feishu-echo commands, and obsolete incremental migration files
- Update install script and docs to require sudo for one-click install
- Add mise tasks for dev environment orchestration

* chore: recover migrations

---------

Co-authored-by: Acbox <acbox0328@gmail.com>
2026-02-19 00:20:27 +08:00