Commit Graph

24 Commits

Author SHA1 Message Date
Acbox 63fe03cfff Revert "Feat/speech support (#392)"
This reverts commit c9dcfe287f.
2026-04-22 00:10:36 +08:00
Acbox c9dcfe287f Feat/speech support (#392)
* feat: expand speech provider support with new client types and configuration schema

* feat: add icon support for speech providers and update related configurations

* feat: add SVG support for Deepgram and Elevenlabs with Vue components

* feat: except *-speech client type in llm provider

* feat: enhance speech provider functionality with advanced settings and model import capabilities

* chore: remove go.mod replace

* feat: enhance speech provider functionality with advanced settings and model import capabilities

* chore: update go module dependencies

* feat: Ear and Mouth

* fix: separate ear/mouth page

* fix: separate audio domain and restore transcription templates

Move speech and transcription internals into the audio domain, restore template-driven transcription providers, and regenerate Swagger/SDK so the frontend can stop hand-calling /transcription-* APIs.

---------

Co-authored-by: aki <arisu@ieee.org>
2026-04-22 00:09:46 +08:00
KasuganoSora a40207ab6d feat: Misskey channel adapter, agent reliability hardening & stream error resilience (#359) 2026-04-13 17:10:50 +08:00
Acbox Liu 7a21fd5f07 feat: ui message (#357) 2026-04-11 13:29:41 +08:00
Acbox Liu 43c4153938 feat: introduce DCP pipeline layer for unified context assembly (#329)
* refactor: introduce DCP pipeline layer for unified context assembly

Introduce a Deterministic Context Pipeline (DCP) inspired by Cahciua,
providing event-driven context assembly for LLM conversations.

- Add `internal/pipeline/` package with Canonical Event types, Projection
  (reduce), Rendering (XML RC), Pipeline manager, and EventStore persistence
- Change user message format from YAML front-matter to XML `<message>` tags
  with self-contained attributes (sender, channel, conversation, type)
- Merge CLI/Web dual API into single `/local/` endpoint, remove CLI handler
- Add `bot_session_events` table for event persistence and cold-start replay
- Add `discuss` session type (reserved for future Cahciua-style mode)
- Wire pipeline into HandleInbound: adapt → persist → push on every message
- Lazy cold-start replay: load events from DB on first session access

* feat: implement discuss mode with reactive driver and probe gate

Add discuss session mode where the bot autonomously decides when to speak
in group chats via tool-gated output (send tool only, no direct text reply).

- Add discuss driver (per-session goroutine, RC watch, step loop via
  agent.Generate, TR persistence, late-binding prompt with mention hints)
- Add system_discuss.md prompt template ("text = inner monologue, send = speak")
- Add context composition (MergeContext, ComposeContext, TrimContext) for
  RC + assistant/tool message interleaving by timestamp
- Add probe gate: when discuss_probe_model_id is set, cheap model pre-filters
  group messages; no tool calls = silence, tool calls = activate primary
- Add /new [chat|discuss] command: explicit mode selection, defaults to
  discuss in groups, chat in DMs, chat-only for WebUI
- Add ResolveRunConfig on flow.Resolver for discuss driver to reuse
  model/tools/system-prompt resolution without reimplementing
- Fix send tool for discuss mode: same-conversation sends now go through
  SendDirect (channel adapter) instead of the local emitter shortcut
- Add target attribute to XML message format (reply_target for routing)
- Add discuss_probe_model_id to bots table settings
- Remove pipeline compaction (SetCompactCursor) — reuse existing compaction.Service
- Persist full SDK messages (including tool calls) in discuss mode

* refactor: unify DCP event layer, fix persistence and local channel

- Fix bot_session_events dedup index to include event_kind so that
  message + edit events for the same external_message_id coexist.
- Change CreateSessionEvent from :one to :exec so ON CONFLICT DO NOTHING
  does not produce spurious errors on duplicate delivery.
- Move ACL evaluation before event ingest; denied messages no longer
  enter bot_session_events or the in-memory pipeline.
- Let chat mode consume RenderedContext from the DCP pipeline when
  available, sharing the same event-driven context assembly as discuss.
- Collapse local WebSocket handler to route through HandleInbound
  instead of directly calling StreamChatWS, eliminating the dual
  business entry point.
- Extract buildBaseRunConfig shared builder so resolve() and
  ResolveRunConfig() no longer duplicate model/credentials/skills setup.
- Add StoreRound to RunConfigResolver interface so discuss driver
  persists assistant output with full metadata, usage, and memory
  extraction (same quality as chat mode).
- Fix discuss driver context: use context.Background() instead of the
  short-lived HTTP request context that was getting cancelled.
- Fix model ID passed to StoreRound: return database UUID from
  ResolveRunConfig instead of SDK model name.
- Remove dead CLIAdapter/CLIType and update legacy web/cli references
  in tests and comments.

* fix: stop idle discuss goroutines after 10min timeout

Discuss session goroutines were never cleaned up when a session became
inactive (e.g. after /new). Add a 10-minute idle timer that auto-exits
the goroutine and removes it from the sessions map when no new RC
arrives.

* refactor: pipeline details — event types, structured reply, display content

- Remove [User sent N attachments] placeholder text from buildInboundQuery;
  attachment info is now expressed via pipeline <attachment> tags.
- Unify in-reply-to as structured ReplyRef (Sender/Preview fields) across
  Telegram, Discord, Feishu, and Matrix adapters instead of prepending
  [Reply to ...] text into the message body. Remove now-unused
  buildTelegramQuotedText, buildDiscordQuotedText, buildMatrixQuotedText.
- Make AdaptInbound return CanonicalEvent interface and dispatch to
  adaptMessage/adaptEdit/adaptService based on metadata["event_type"].
- Add event_id column to bot_history_messages (migration 0059) so user
  messages can reference their canonical pipeline event.
- PersistEvent now returns the event UUID; HandleInbound passes it through
  to both persistPassiveMessage and ChatRequest.EventID for storeRound.
- Add FillDisplayContent to message service: extracts plain text from
  event_data for clean frontend display.
- Frontend extractMessageText prefers display_content when available,
  falling back to legacy strip logic for old messages.
- Fix: always generate headerifiedQuery for storage even when usePipeline
  is true, so user messages are persisted via storeRound in chat mode.

* fix: use json.Marshal for pipeline context content serialization

The manual string escaping in buildMessagesFromPipeline only handled
double quotes but not newlines, backslashes, and other JSON special
characters, producing invalid json.RawMessage values. The LLM then
received empty/malformed context and complained about having no history.

* fix: restore WebSocket handler to use StreamChatWS directly

The previous refactoring replaced the WS handler with HandleInbound +
RouteHub subscription, which broke streaming because RouteHub events
use a different format (channel.StreamEvent) than what the frontend
expects (flow.WSStreamEvent with text_delta, tool_call_start, etc.).

Restore the original direct StreamChatWS call path so WebUI streaming
works again. The WS handler now matches the pre-refactoring behavior
while all other changes (pipeline, ACL, event types, etc.) are kept.

* feat: store display_text directly in bot_history_messages

Instead of computing display content at API response time by querying
bot_session_events via event_id, store the raw user text in a dedicated
display_text column at write time. This works for all paths including
the WebSocket handler which does not go through the pipeline/event layer.

- Migration 0060: add display_text TEXT column
- PersistInput gains DisplayText; filled from trimmedText (passive) and
  req.Query (storeRound)
- toMessageFields reads display_text into DisplayContent
- Remove FillDisplayContent runtime query and ListSessionEventsByEventID
- Frontend already prefers display_content when available (no change)

* fix: display_text should contain raw user text, not XML-wrapped query

req.Query gets overwritten to headerifiedQuery (with XML <message> tags)
before storeRound runs. Add RawQuery field to ChatRequest to preserve
the original user text, and use it for display_text in storeMessages.

* fix(web): show discuss sessions

* refactor: introduce DCP pipeline layer for unified context assembly

Introduce a Deterministic Context Pipeline (DCP) inspired by Cahciua,
providing event-driven context assembly for LLM conversations.

- Add `internal/pipeline/` package with Canonical Event types, Projection
  (reduce), Rendering (XML RC), Pipeline manager, and EventStore persistence
- Change user message format from YAML front-matter to XML `<message>` tags
  with self-contained attributes (sender, channel, conversation, type)
- Merge CLI/Web dual API into single `/local/` endpoint, remove CLI handler
- Add `bot_session_events` table for event persistence and cold-start replay
- Add `discuss` session type (reserved for future Cahciua-style mode)
- Wire pipeline into HandleInbound: adapt → persist → push on every message
- Lazy cold-start replay: load events from DB on first session access

* feat: implement discuss mode with reactive driver and probe gate

Add discuss session mode where the bot autonomously decides when to speak
in group chats via tool-gated output (send tool only, no direct text reply).

- Add discuss driver (per-session goroutine, RC watch, step loop via
  agent.Generate, TR persistence, late-binding prompt with mention hints)
- Add system_discuss.md prompt template ("text = inner monologue, send = speak")
- Add context composition (MergeContext, ComposeContext, TrimContext) for
  RC + assistant/tool message interleaving by timestamp
- Add probe gate: when discuss_probe_model_id is set, cheap model pre-filters
  group messages; no tool calls = silence, tool calls = activate primary
- Add /new [chat|discuss] command: explicit mode selection, defaults to
  discuss in groups, chat in DMs, chat-only for WebUI
- Add ResolveRunConfig on flow.Resolver for discuss driver to reuse
  model/tools/system-prompt resolution without reimplementing
- Fix send tool for discuss mode: same-conversation sends now go through
  SendDirect (channel adapter) instead of the local emitter shortcut
- Add target attribute to XML message format (reply_target for routing)
- Add discuss_probe_model_id to bots table settings
- Remove pipeline compaction (SetCompactCursor) — reuse existing compaction.Service
- Persist full SDK messages (including tool calls) in discuss mode

* refactor: unify DCP event layer, fix persistence and local channel

- Fix bot_session_events dedup index to include event_kind so that
  message + edit events for the same external_message_id coexist.
- Change CreateSessionEvent from :one to :exec so ON CONFLICT DO NOTHING
  does not produce spurious errors on duplicate delivery.
- Move ACL evaluation before event ingest; denied messages no longer
  enter bot_session_events or the in-memory pipeline.
- Let chat mode consume RenderedContext from the DCP pipeline when
  available, sharing the same event-driven context assembly as discuss.
- Collapse local WebSocket handler to route through HandleInbound
  instead of directly calling StreamChatWS, eliminating the dual
  business entry point.
- Extract buildBaseRunConfig shared builder so resolve() and
  ResolveRunConfig() no longer duplicate model/credentials/skills setup.
- Add StoreRound to RunConfigResolver interface so discuss driver
  persists assistant output with full metadata, usage, and memory
  extraction (same quality as chat mode).
- Fix discuss driver context: use context.Background() instead of the
  short-lived HTTP request context that was getting cancelled.
- Fix model ID passed to StoreRound: return database UUID from
  ResolveRunConfig instead of SDK model name.
- Remove dead CLIAdapter/CLIType and update legacy web/cli references
  in tests and comments.

* fix: stop idle discuss goroutines after 10min timeout

Discuss session goroutines were never cleaned up when a session became
inactive (e.g. after /new). Add a 10-minute idle timer that auto-exits
the goroutine and removes it from the sessions map when no new RC
arrives.

* refactor: pipeline details — event types, structured reply, display content

- Remove [User sent N attachments] placeholder text from buildInboundQuery;
  attachment info is now expressed via pipeline <attachment> tags.
- Unify in-reply-to as structured ReplyRef (Sender/Preview fields) across
  Telegram, Discord, Feishu, and Matrix adapters instead of prepending
  [Reply to ...] text into the message body. Remove now-unused
  buildTelegramQuotedText, buildDiscordQuotedText, buildMatrixQuotedText.
- Make AdaptInbound return CanonicalEvent interface and dispatch to
  adaptMessage/adaptEdit/adaptService based on metadata["event_type"].
- Add event_id column to bot_history_messages (migration 0059) so user
  messages can reference their canonical pipeline event.
- PersistEvent now returns the event UUID; HandleInbound passes it through
  to both persistPassiveMessage and ChatRequest.EventID for storeRound.
- Add FillDisplayContent to message service: extracts plain text from
  event_data for clean frontend display.
- Frontend extractMessageText prefers display_content when available,
  falling back to legacy strip logic for old messages.
- Fix: always generate headerifiedQuery for storage even when usePipeline
  is true, so user messages are persisted via storeRound in chat mode.

* fix: use json.Marshal for pipeline context content serialization

The manual string escaping in buildMessagesFromPipeline only handled
double quotes but not newlines, backslashes, and other JSON special
characters, producing invalid json.RawMessage values. The LLM then
received empty/malformed context and complained about having no history.

* fix: restore WebSocket handler to use StreamChatWS directly

The previous refactoring replaced the WS handler with HandleInbound +
RouteHub subscription, which broke streaming because RouteHub events
use a different format (channel.StreamEvent) than what the frontend
expects (flow.WSStreamEvent with text_delta, tool_call_start, etc.).

Restore the original direct StreamChatWS call path so WebUI streaming
works again. The WS handler now matches the pre-refactoring behavior
while all other changes (pipeline, ACL, event types, etc.) are kept.

* feat: store display_text directly in bot_history_messages

Instead of computing display content at API response time by querying
bot_session_events via event_id, store the raw user text in a dedicated
display_text column at write time. This works for all paths including
the WebSocket handler which does not go through the pipeline/event layer.

- Migration 0060: add display_text TEXT column
- PersistInput gains DisplayText; filled from trimmedText (passive) and
  req.Query (storeRound)
- toMessageFields reads display_text into DisplayContent
- Remove FillDisplayContent runtime query and ListSessionEventsByEventID
- Frontend already prefers display_content when available (no change)

* fix: display_text should contain raw user text, not XML-wrapped query

req.Query gets overwritten to headerifiedQuery (with XML <message> tags)
before storeRound runs. Add RawQuery field to ChatRequest to preserve
the original user text, and use it for display_text in storeMessages.

* fix(web): show discuss sessions

* chore(feishu): change discuss output to stream card

* fix(channel): unify discuss/chat send path and card markdown delivery

* feat(discuss): switch to stream execution with RouteHub broadcasting

* refactor(pipeline): remove context trimming from ComposeContext

The pipeline path should not trim context by token budget — the
upstream IC/RC already bounds the event window. Remove TrimContext,
FindWorkingWindowCursor, EstimateTokens, FormatLastProcessedMs (all
unused or only used for trimming), the maxTokens parameter from
ComposeContext, and MaxContextTokens from DiscussSessionConfig.

---------

Co-authored-by: 晨苒 <16112591+chen-ran@users.noreply.github.com>
2026-04-06 21:56:25 +08:00
Acbox c172699466 fix(channel): allow attachment-only messages via WebSocket (#331)
The WebSocket handler rejected messages with empty text even when
attachments were present, while the HTTP POST endpoint correctly
used Message.IsEmpty(). Move the empty-check after attachment
parsing so only truly empty messages are rejected.
2026-04-04 21:50:40 +08:00
Acbox f1dd30a388 fix: strip agent tags from IM/WebUI output and fix attachment display after refresh
Three independent bugs fixed:

1. IM channels were sending raw <attachments>/<reactions>/<speech> tag blocks
   alongside file attachments. Now ExtractAssistantOutputs strips these tags
   before building the outbound channel message.

2. WebUI rendered these tags as markdown after page refresh. Now
   extractMessageText strips agent tags for non-user messages.

3. WebUI lost attachment blocks after refresh because convertMessagesToChats
   did not call buildAssetBlocks when merging assistant messages into a
   pending tool-call group. Also made LinkOutboundAssets session-aware so
   assets are linked to the correct assistant message.
2026-03-31 15:11:57 +08:00
Acbox 33f39c20ff feat: add per-message model and reasoning effort override
Allow users to select a different model and reasoning effort level
directly from the chat input toolbar, overriding the bot defaults
on a per-message basis. The backend accepts optional model_id and
reasoning_effort parameters via both WebSocket and HTTP APIs, with
request-level values taking priority over bot/session settings.

- Backend: extend wsClientMessage and LocalChannelMessageRequest with
  model_id/reasoning_effort fields; add ReasoningEffort to ChatRequest;
  update resolver to prioritize request-level reasoning effort
- Frontend: add ModelOptions and ReasoningEffortSelect shared components;
  refactor model-select to reuse ModelOptions; add model/reasoning
  selectors to chat input toolbar; initialize from bot settings
- Regenerate swagger spec and TypeScript SDK
2026-03-29 19:45:55 +08:00
Acbox Liu 7d7d0e4b51 refactor: introduce multi-session chat support (#session) (#267)
* refactor: introduce multi-session chat support (#session)

Replace the single-context-per-bot model with multiple chat sessions.

Database:
- Add bot_sessions table (route_id, channel_type, title, metadata, soft delete)
- Migrate bot_history_messages from (route_id, channel_type) to session_id
- Add active_session_id to bot_channel_routes
- Migration 0036 handles data migration from existing messages

Backend:
- New internal/session service for session CRUD
- Update message service/types to use session_id instead of route_id
- Update conversation flow (resolver, history, store) for session context
- Channel inbound auto-creates/retrieves active session via SessionEnsurer
- New REST endpoints: /bots/:bot_id/sessions (CRUD)
- WebSocket and message handlers accept optional session_id
- Wire session service into FX dependency graph (agent + memoh)

Frontend:
- Refactor chat store: sessions replaces chats, sessionId replaces chatId
- Session-aware message loading, sending, and pagination
- WebSocket sends include session_id
- New session sidebar component with select/delete
- Chat area header shows active session title + new session button
- API layer updated: fetchSessions, createSession, deleteSession
- i18n strings for session management (en + zh)

SDK:
- Regenerated TypeScript SDK and Swagger docs with session endpoints

* fix: update tests for session refactoring (RouteID → SessionID)

Remove references to removed RouteID and Platform fields from
PersistInput/Message in channel_test.go and service_integration_test.go.

* fix: restore accidentally deleted SDK files and guard migration 0032

- Restore packages/sdk/src/container-stream.ts and extra/index.ts that
  were accidentally removed during SDK regeneration
- Wrap migration 0032 route_id index creation in a column existence check
  to avoid failure on fresh databases where 0001_init.up.sql no longer
  has route_id

* fix: guard migration 0036 data steps for fresh databases

Wrap steps 3-7 (which reference route_id/channel_type on
bot_history_messages) in a column existence check so the migration
is safe on fresh databases where 0001_init.up.sql already reflects
the final schema without those columns.

* feat: add title model setting and auto-generate session titles on user input

- Add title_model_id to bots table (migration 0037) and bot settings API
- Implement async title generation triggered at user message time (not after
  assistant response) for faster title availability
- Publish session_title_updated events via SSE event hub for real-time
  frontend updates without page refresh
- Fix SSE message event parsing: use direct JSON.parse instead of
  normalizeStreamEvent which silently dropped non-chat-stream event types
- Add title model selector in bot settings UI with i18n support

* fix: session-scoped message filtering and URL-based chat routing

- Filter realtime SSE messages by session_id to prevent cross-session
  message leakage after page refresh
- Add /chat/:sessionId? route with bidirectional URL ↔ store sync
- Visiting /chat shows a clean state with no bot or session pre-selected
- Visiting /chat/:sessionId loads the specific session directly
- Session switches from sidebar automatically update the URL
- Fix stale RouteID field in dedupe test (removed during session refactor)

* fix: skip cross-channel stream events to prevent session leakage

The bot-level web stream pushes events from all channels (Telegram,
Discord, etc.) without session_id context. Previously these were
rendered inline in the current chat view regardless of session.

Now cross-channel events are ignored in handleLocalStreamEvent;
persisted messages arrive via the SSE message events stream with
proper session_id filtering through appendRealtimeMessage.

* feat: show IM avatars and platform badges on session sidebar

- Add sender_avatar_url to route metadata from identity resolution
- Resolve group avatar and handle via directory adapter for group chats
- JOIN bot_channel_routes in ListSessionsByBot to return route metadata
- Display avatar with ChannelBadge on IM session items (group avatar
  for groups, sender avatar for private chats)
- Show @groupname or @username as session sub-label

* fix: clean up RunConfig unused fields, fix skill system and copy bug

- Remove unused RunConfig fields: Tools, Channels, CurrentChannel,
  ActiveContextTime
- Remove unused SessionContext fields: DisplayName, ConversationType
- Fix EnabledSkillNames copy bug: make([]string, 0, n) + copy copies
  zero elements; changed to make([]string, n)
- Fix prepareRunConfig dead code: remove no-op loop over
  CurrentPlatform runes; compute supportsImageInput from model's
  InputModalities
- Fix EnabledSkills always nil in system prompt: resolve enabled skill
  entries from EnabledSkillNames + Skills
- Fix use_skill tool returning empty response: now returns full skill
  content (description + instructions) so LLM gets it in the same turn
- Skip use_skill tool registration when no skills are available
- Conditionally render Skills section in system prompt (hidden when
  no skills exist)

* feat: add session type field and bind sessions to heartbeat/schedule executions

- Add `type` column to `bot_sessions` (chat | heartbeat | schedule)
- Add `session_id` to `bot_heartbeat_logs` for per-execution session tracking
- Create `schedule_logs` table binding schedule_id + session_id
- Heartbeat and schedule runs now create independent sessions and persist
  agent messages via storeRound, enabling full conversation replay
- Add schedule logs API endpoints (list by bot, list by schedule, delete)
- Update Triggerer interfaces to return TriggerResult with status/usage/model

* refactor: modular system prompts per session type (chat/heartbeat/schedule)

Split the monolithic system.md into three type-specific system prompts
with shared fragments via {{include:_xxx}} syntax, so each session type
gets a focused prompt without irrelevant instructions.

* fix: prevent message duplication after task completion

message_created events from Persist() had an empty platform field because
toMessageFromCreate() didn't extract it from the session. This caused
appendRealtimeMessage to fail the platform === 'web' guard, and
hasMessageWithId to fail because local IDs differ from server UUIDs,
resulting in all messages being appended as duplicates.

- Extract platform from metadata in toMessageFromCreate so published events
  carry the correct value
- Pass channel_type: 'web' when creating sessions from the web frontend so
  List queries return the correct platform via the session JOIN

* fix: use per-message usage from SDK instead of misaligned step-level usages

Previously, token usage was stored via a separate per-step usages array
that didn't align with messages (off-by-one from prepending user message,
step count != message count). This caused:
- User messages incorrectly receiving usage data
- Usage values shifted across messages in multi-step rounds
- Last assistant message getting the accumulated total instead of its own step usage
- InputTokenDetails/OutputTokenDetails lost during manual accumulation

Now each sdk.Message carries its own per-step Usage (set by the SDK in
buildStepMessages), which is extracted in sdkMessagesToModelMessages and
stored directly via ModelMessage.Usage. The storeRound/storeMessages path
no longer needs external usage/usages parameters.

Also fixes the totalUsage accumulation in runStream to include all detail
fields (InputTokenDetails, OutputTokenDetails).

* feat: add /new slash command to create a new active session from IM channels

Users in Telegram/Discord/Feishu can now send /new to start a fresh
conversation, resetting the session context for the current chat thread.
The command resolves the channel route, creates a new session, sets it as
the active session on the route, and replies with a confirmation message.

* feat: distinguish heartbeat and schedule sessions with dedicated icons in sidebar

Heartbeat sessions show a heart-pulse icon (rose), schedule sessions
show a clock icon (amber), and both display a type label beneath the
session title.

* refactor: remove enabledSkills system prompt injection, keep sorted skill listing

use_skill now returns skill content directly as tool output, so there is
no need to inject enabled skill body text into the system prompt. Remove
the entire enabledSkills tracking chain (RunConfig.EnabledSkillNames,
StreamEvent.Skills, GenerateResult.Skills, ChatRequest/Response.Skills,
enableSkill closures in runStream/runGenerate, prepareRunConfig matching).

Keep a lightweight skills listing (name + description only) in the system
prompt so the model knows which skills are available. Sort entries by name
to guarantee deterministic ordering and maximize KV cache reuse.

* refactor: remove inbox system, persist passive messages directly to history

Replace the bot_inbox table and service with direct writes to
bot_history_messages for group conversations where the bot is not
@mentioned. Trigger-path messages continue to be persisted after the
agent responds (unchanged).

- Drop bot_inbox table and max_inbox_items column (migration 0039)
- Delete internal/inbox/, handlers/inbox.go, command/inbox.go,
  agent/tools/inbox.go and the MCP message provider
- Add persistPassiveMessage() in channel inbound to write user
  messages into the active session immediately
- Rewrite ListObservedConversationsByChannelIdentity to query
  bot_history_messages + bot_sessions instead of bot_inbox
- Extract shared send/react logic into internal/messaging/executor.go;
  agent/tools/message.go is now a thin SDK adapter
- Clean up all inbox references from agent prompts, flow resolver,
  email trigger, settings, commands, DI wiring, and frontend
- Regenerate sqlc, swagger, and SDK

* feat: add list_sessions and search_messages agent tools

Provide agents with the ability to query session metadata and search
message history across all sessions. search_messages supports filtering
by time range, keyword (JSONB-aware ILIKE), session, contact, and role,
with a default 7-day lookback when no start_time is given.

* feat: inject last_heartbeat time and improve heartbeat search guidance

Query the previous heartbeat's started_at timestamp and pass it through
TriggerPayload into the heartbeat prompt template. Update system prompt
and HEARTBEAT.md checklist to guide agents to use search_messages with
start_time=last_heartbeat for efficient cross-session message review.

* fix: pass BridgeProvider to FSClient and store full heartbeat prompt

FSClient was always created with nil provider, causing all container
file reads (IDENTITY.md, SOUL.md, MEMORY.md, HEARTBEAT.md, etc.) to
silently return empty strings. Expose Agent.BridgeProvider() and wire
it into Resolver. Also fix heartbeat trigger to store the full prompt
template as the user message instead of the literal "heartbeat" string.

* feat: add line numbers to container file read output

Move line-number formatting from the bridge gRPC server to the agent
tool layer so that the raw content stored and transmitted via gRPC
remains clean, while the read_file tool output includes numbered lines
for easier reference by the agent.

* chore(deps): update twilight-ai to v0.3.2

* fix: lint, test
2026-03-21 15:57:22 +08:00
Acbox Liu 1680316c7f refactor(agent): remove agent gateway instead of twilight sdk (#264)
* refactor(agent): replace TypeScript agent gateway with in-process Go agent using twilight-ai SDK

- Remove apps/agent (Bun/Elysia gateway), packages/agent (@memoh/agent),
  internal/bun runtime manager, and all embedded agent/bun assets
- Add internal/agent package powered by twilight-ai SDK for LLM calls,
  tool execution, streaming, sential logic, tag extraction, and prompts
- Integrate ToolGatewayService in-process for both built-in and user MCP
  tools, eliminating HTTP round-trips to the old gateway
- Update resolver to convert between sdk.Message and ModelMessage at the
  boundary (resolver_messages.go), keeping agent package free of
  persistence concerns
- Prepend user message before storeRound since SDK only returns output
  messages (assistant + tool)
- Clean up all Docker configs, TOML configs, nginx proxy, Dockerfile.agent,
  and Go config structs related to the removed agent gateway
- Update cmd/agent and cmd/memoh entry points with setter-based
  ToolGateway injection to avoid FX dependency cycles

* fix(web): move form declaration before computed properties that reference it

The `form` reactive object was declared after computed properties like
`selectedMemoryProvider` and `isSelectedMemoryProviderPersisted` that
reference it, causing a TDZ ReferenceError during setup.

* fix: prevent UTF-8 character corruption in streaming text output

StreamTagExtractor.Push() used byte-level string slicing to hold back
buffer tails for tag detection, which could split multi-byte UTF-8
characters. After json.Marshal replaced invalid bytes with U+FFFD,
the corruption became permanent — causing garbled CJK characters (�)
in agent responses.

Add safeUTF8SplitIndex() to back up split points to valid character
boundaries. Also fix byte-level truncation in command/formatter.go
and command/fs.go to use rune-aware slicing.

* fix: add agent error logging and fix Gemini tool schema validation

- Log agent stream errors in both SSE and WebSocket paths with bot/model context
- Fix send tool `attachments` parameter: empty `items` schema rejected by
  Google Gemini API (INVALID_ARGUMENT), now specifies `{"type": "string"}`
- Upgrade twilight-ai to d898f0b (includes raw body in API error messages)

* chore(ci): remove agent gateway from Docker build and release pipelines

Agent gateway has been replaced by in-process Go agent; remove the
obsolete Docker image matrix entry, Bun/UPX CI steps, and agent-binary
build logic from the release script.

* fix: preserve attachment filename, metadata, and container path through persistence

- Add `name` column to `bot_history_message_assets` (migration 0034) to
  persist original filenames across page refreshes.
- Add `metadata` JSONB column (migration 0035) to store source_path,
  source_url, and other context alongside each asset.
- Update SQL queries, sqlc-generated code, and all Go types (MessageAsset,
  AssetRef, OutboundAssetRef, FileAttachment) to carry name and metadata
  through the full lifecycle.
- Extract filenames from path/URL in AttachmentsResolver before clearing
  raw paths; enrich streaming event metadata with name, source_path, and
  source_url in both the WebSocket and channel inbound ingestion paths.
- Implement `LinkAssets` on message service and `LinkOutboundAssets` on
  flow resolver so WebSocket-streamed bot attachments are persisted to the
  correct assistant message after streaming completes.
- Frontend: update MessageAsset type with metadata field, pass metadata
  through to attachment items, and reorder attachment-block.vue template
  so container files (identified by metadata.source_path) open in the
  sidebar file manager instead of triggering a download.

* refactor(agent): decouple built-in tools from MCP, load via ToolProvider interface

Migrate all 13 built-in tool providers from internal/mcp/providers/ to
internal/agent/tools/ using the twilight-ai sdk.Tool structure. The agent
now loads tools through a ToolProvider interface instead of the MCP
ToolGatewayService, which is simplified to only manage external federation
sources. This enables selective tool loading and removes the coupling
between business tools and the MCP protocol layer.

* refactor(flow): split monolithic resolver.go into focused modules

Break the 1959-line resolver.go into 12 files organized by concern:
- resolver.go: core orchestration (Resolver struct, resolve, Chat, prepareRunConfig)
- resolver_stream.go: streaming (StreamChat, StreamChatWS, tryStoreStream)
- resolver_trigger.go: schedule/heartbeat triggers
- resolver_attachments.go: attachment routing, inlining, encoding
- resolver_history.go: message loading, deduplication, token trimming
- resolver_store.go: persistence (storeRound, storeMessages, asset linking)
- resolver_memory.go: memory provider integration
- resolver_model_selection.go: model selection and candidate matching
- resolver_identity.go: display name and channel identity resolution
- resolver_settings.go: bot settings, loop detection, inbox
- user_header.go: YAML front-matter formatting
- resolver_util.go: shared utilities (sanitize, normalize, dedup, UUID)

* fix(agent): enable Anthropic extended thinking by passing ReasoningConfig to provider

Anthropic's thinking requires WithThinking() at provider creation time,
unlike OpenAI which uses per-request ReasoningEffort. The config was
never wired through, so Claude models could not trigger thinking.

* refactor(agent): extract prompts into embedded markdown templates

Move inline prompt strings from prompt.go into separate .md files under
internal/agent/prompts/, using {{key}} placeholders and a simple render
engine. Remove obsolete SystemPromptParams fields (Language,
MaxContextLoadTime, Channels, CurrentChannel) and their call-site usage.

* fix: lint
2026-03-19 13:31:54 +08:00
Acbox ac8a935545 refactor: remove bot type 2026-03-15 00:42:09 +08:00
BBQ 839e63acda feat(access): add guest chat ACL (#235) 2026-03-14 17:15:41 +08:00
Fodesu b46e494d3a feat(tts): introduce TTS system (#195) 2026-03-13 02:49:52 +08:00
Acbox Liu 23d49a1c7b feat: message abort and web socket support (#222)
* feat: message abort and web socket support

* fix(web): chat end

* fix: lint

* fix: lint
2026-03-09 23:27:50 +08:00
BBQ 3feb03aca7 ci: add go lint and race test workflow (#187) 2026-03-05 11:25:33 +08:00
BBQ bc374fe8cd refactor: content-addressed assets, cross-channel multimodal, infra simplification (#63)
* refactor(attachment): multimodal attachment refactor with snapshot schema and storage layer

- Add snapshot schema migration (0008) and update init/versions/snapshots
- Add internal/attachment and internal/channel normalize for unified attachment handling
- Move containerfs provider from internal/media to internal/storage
- Update agent types, channel adapters (Telegram/Feishu), inbound and handlers
- Add containerd snapshot lineage and local_channel tests
- Regenerate sqlc, swagger and SDK

* refactor(media): content-addressed asset system with unified naming

- Replace asset_id foreign key with content_hash as sole identifier
  for bot_history_message_assets (pure soft-link model)
- Remove mime, size_bytes, storage_key from DB; derive at read time
  via media.Resolve from actual storage
- Merge migrations 0008/0009 into single 0008; keep 0001 as canonical schema
- Add Docker initdb script for deterministic migration execution order
- Fix cross-channel real-time image display (Telegram → WebUI SSE)
- Fix message disappearing on refresh (null assets fallback)
- Fix file icon instead of image preview (mime derivation from storage)
- Unify AssetID → ContentHash naming across Go, Agent, and Frontend
- Change storage key prefix from 4-char to 2-char for directory sharding
- Add server-entrypoint.sh for Docker deployment migration handling

* refactor(infra): embedded migrations, Docker simplification, and config consolidation

- Embed SQL migrations into Go binary, removing shell-based migration scripts
- Consolidate config files into conf/ directory (app.example.toml, app.docker.toml, app.dev.toml)
- Simplify Docker setup: remove initdb.d scripts, streamline nginx config and entrypoint
- Remove legacy CLI, feishu-echo commands, and obsolete incremental migration files
- Update install script and docs to require sudo for one-click install
- Add mise tasks for dev environment orchestration

* chore: recover migrations

---------

Co-authored-by: Acbox <acbox0328@gmail.com>
2026-02-19 00:20:27 +08:00
BBQ df7876a30c feat: add media asset system, channel lifecycle refactor, and chat attachments (#54) 2026-02-17 19:06:46 +08:00
BBQ 85251a2905 refactor(core): codebase quality cleanup
- Remove user-level model settings (chat_model_id, memory_model_id,
  embedding_model_id, max_context_load_time, language) from users table
- Merge migration 0002 into 0001, remove compatibility migrations
- Delete dead conversation/resolver.go (1177 lines, only flow/resolver.go used)
- Remove type aliases (Chat=Conversation, types_alias.go)
- Fix SQL: remove AND false stub, fix UpdateChatTitle model_id,
  reset model IDs in DeleteSettings, add preauth expiry filter,
  add ListMessages limit, remove 10 dead queries
- Extract shared handler helpers (RequireChannelIdentityID, AuthorizeBotAccess)
- Rename internal/router to internal/channel/inbound
- Fix identity confusion: remove UserID->ChannelIdentityID fallbacks
- Fix all _ = var patterns with proper error logging
- Fix error propagation: storeMessages, rescheduleJob, botContainerID
- Fix naming: ModelId->ModelID, active->is_active, Duration semantic fix
- Remove dead code: mcpService, ReplyTarget, callMCPServer, sshShellQuote,
  buildSessionMetadata, ChatRequest.Language, TriggerPayload.ChatID
- Fix code quality: errors.Is(), remove goto, CreateHuman deprecated
- Remove Enable model endpoint and user-level settings CLI commands
- Regenerate sqlc, swagger, SDK
2026-02-12 23:50:48 +08:00
BBQ ca5c6a1866 refactor(core): restructure conversation, channel and message domains
- Rename chat module to conversation with flow-based architecture
- Move channelidentities into channel/identities subpackage
- Add channel/route for routing logic
- Add message service with event hub
- Add MCP providers: container, directory, schedule
- Refactor Feishu/Telegram adapters with directory and stream support
- Add platform management page and channel badges in web UI
- Update database schema for conversations, messages and channel routes
- Add @memoh/shared package for cross-package type definitions
2026-02-12 15:34:40 +08:00
BBQ 02b33c8e85 refactor(core): finalize user-centric identity and policy cleanup
Unify auth and chat identity semantics around user_id, enforce personal-bot owner-only authorization, and remove legacy compatibility branches in integration tests.
2026-02-11 15:49:38 +08:00
BBQ 06e8619a37 refactor(core): migrate channel identity and binding across app
Align channel identity and bind flow across backend and app-facing layers, including generated swagger artifacts and package lock updates while excluding docs content changes.
2026-02-11 14:51:58 +08:00
BBQ 29e6ddd1f9 refactor: replace global channel registry with instance-based Registry and interface-driven adapters
- Replace global channelRegistry singleton with explicit *Registry passed via dependency injection
- Split monolithic manager.go into connection.go (lifecycle), inbound.go (dispatch), outbound.go (pipeline)
- Introduce optional adapter interfaces: ConfigNormalizer, TargetResolver, BindingMatcher
- Move Descriptor() to Adapter interface, remove init()-based registration
- Relocate SessionHub to adapters/local package
- Extract shared UUID/time helpers to internal/db/uuid.go
- Decompose ConfigStore into fine-grained interfaces: ConfigLister, ConfigResolver, BindingStore, SessionStore
2026-02-06 23:47:12 +08:00
BBQ 5a35ef34ac feat: channel gateway implementation and multi-bot refactor
- Refactor channel manager with support for Sender/Receiver interfaces and hot-swappable adapters.
- Implement identity routing and pre-authentication logic for inbound messages.
- Update database schema to support bot pre-auth keys and extended channel session metadata.
- Add Telegram and Feishu channel configuration and adapter enhancements.
- Update Swagger documentation and internal handlers for channel management.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-06 14:41:54 +08:00
BBQ 6aebbe9279 feat: refactor User/Bot architecture and implement multi-channel gateway
Major changes:
1. Core Architecture: Decoupled Bots from Users. Bots now have independent lifecycles, member management (bot_members), and dedicated configurations.
2. Channel Gateway:
   - Implemented a unified Channel Manager supporting Feishu, Telegram, and Local (Web/CLI) adapters.
   - Added message processing pipeline to normalize interactions across different platforms.
   - Introduced a Contact system for identity binding and guest access policies.
3. Database & Tooling:
   - Consolidated all migrations into 0001_init with updated schema for bots, channels, and contacts.
   - Optimized sqlc.yaml to automatically track the migrations directory.
4. Agent Enhancements:
   - Introduced ToolContext to provide Agents with platform-aware execution capabilities (e.g., messaging, contact lookups).
   - Added tool logging and fallback mechanisms for toolChoice execution.
5. UI & Docs: Updated frontend stores, UI components, and Swagger documentation to align with the new Bot-centric model.
2026-02-04 23:49:50 +08:00