Memoh/internal/conversation/types.go
Acbox Liu 1680316c7f refactor(agent): remove agent gateway in favor of twilight sdk (#264)
* refactor(agent): replace TypeScript agent gateway with in-process Go agent using twilight-ai SDK

- Remove apps/agent (Bun/Elysia gateway), packages/agent (@memoh/agent),
  internal/bun runtime manager, and all embedded agent/bun assets
- Add internal/agent package powered by twilight-ai SDK for LLM calls,
  tool execution, streaming, sentinel logic, tag extraction, and prompts
- Integrate ToolGatewayService in-process for both built-in and user MCP
  tools, eliminating HTTP round-trips to the old gateway
- Update resolver to convert between sdk.Message and ModelMessage at the
  boundary (resolver_messages.go), keeping agent package free of
  persistence concerns
- Prepend user message before storeRound since SDK only returns output
  messages (assistant + tool)
- Clean up all Docker configs, TOML configs, nginx proxy, Dockerfile.agent,
  and Go config structs related to the removed agent gateway
- Update cmd/agent and cmd/memoh entry points with setter-based
  ToolGateway injection to avoid FX dependency cycles

* fix(web): move form declaration before computed properties that reference it

The `form` reactive object was declared after computed properties like
`selectedMemoryProvider` and `isSelectedMemoryProviderPersisted` that
reference it, causing a TDZ ReferenceError during setup.

* fix: prevent UTF-8 character corruption in streaming text output

StreamTagExtractor.Push() used byte-level string slicing to hold back
buffer tails for tag detection, which could split multi-byte UTF-8
characters. After json.Marshal replaced invalid bytes with U+FFFD,
the corruption became permanent — causing garbled CJK characters (�)
in agent responses.

Add safeUTF8SplitIndex() to back up split points to valid character
boundaries. Also fix byte-level truncation in command/formatter.go
and command/fs.go to use rune-aware slicing.
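A minimal sketch of the boundary-backing idea: walk the candidate split index backwards until it lands on a UTF-8 lead byte, so the head slice never ends mid-rune. The actual `safeUTF8SplitIndex` in this commit may differ in detail.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// safeUTF8SplitIndex backs a candidate split index up to the nearest UTF-8
// character boundary, so slicing s[:i] never cuts a multi-byte rune in half.
func safeUTF8SplitIndex(s string, i int) int {
	if i >= len(s) {
		return len(s)
	}
	// Continuation bytes have the form 0b10xxxxxx; utf8.RuneStart reports
	// whether a byte can begin a rune.
	for i > 0 && !utf8.RuneStart(s[i]) {
		i--
	}
	return i
}

func main() {
	s := "你好" // 6 bytes: two 3-byte CJK runes
	i := safeUTF8SplitIndex(s, 4) // index 4 falls mid-rune; backs up to 3
	fmt.Println(i, utf8.ValidString(s[:i])) // 3 true
}
```

Both halves of the split stay valid UTF-8, so a later `json.Marshal` has no invalid bytes to replace with U+FFFD.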

* fix: add agent error logging and fix Gemini tool schema validation

- Log agent stream errors in both SSE and WebSocket paths with bot/model context
- Fix send tool `attachments` parameter: empty `items` schema rejected by
  Google Gemini API (INVALID_ARGUMENT), now specifies `{"type": "string"}`
- Upgrade twilight-ai to d898f0b (includes raw body in API error messages)

* chore(ci): remove agent gateway from Docker build and release pipelines

Agent gateway has been replaced by in-process Go agent; remove the
obsolete Docker image matrix entry, Bun/UPX CI steps, and agent-binary
build logic from the release script.

* fix: preserve attachment filename, metadata, and container path through persistence

- Add `name` column to `bot_history_message_assets` (migration 0034) to
  persist original filenames across page refreshes.
- Add `metadata` JSONB column (migration 0035) to store source_path,
  source_url, and other context alongside each asset.
- Update SQL queries, sqlc-generated code, and all Go types (MessageAsset,
  AssetRef, OutboundAssetRef, FileAttachment) to carry name and metadata
  through the full lifecycle.
- Extract filenames from path/URL in AttachmentsResolver before clearing
  raw paths; enrich streaming event metadata with name, source_path, and
  source_url in both the WebSocket and channel inbound ingestion paths.
- Implement `LinkAssets` on message service and `LinkOutboundAssets` on
  flow resolver so WebSocket-streamed bot attachments are persisted to the
  correct assistant message after streaming completes.
- Frontend: update MessageAsset type with metadata field, pass metadata
  through to attachment items, and reorder attachment-block.vue template
  so container files (identified by metadata.source_path) open in the
  sidebar file manager instead of triggering a download.

* refactor(agent): decouple built-in tools from MCP, load via ToolProvider interface

Migrate all 13 built-in tool providers from internal/mcp/providers/ to
internal/agent/tools/ using the twilight-ai sdk.Tool structure. The agent
now loads tools through a ToolProvider interface instead of the MCP
ToolGatewayService, which is simplified to only manage external federation
sources. This enables selective tool loading and removes the coupling
between business tools and the MCP protocol layer.

* refactor(flow): split monolithic resolver.go into focused modules

Break the 1959-line resolver.go into 12 files organized by concern:
- resolver.go: core orchestration (Resolver struct, resolve, Chat, prepareRunConfig)
- resolver_stream.go: streaming (StreamChat, StreamChatWS, tryStoreStream)
- resolver_trigger.go: schedule/heartbeat triggers
- resolver_attachments.go: attachment routing, inlining, encoding
- resolver_history.go: message loading, deduplication, token trimming
- resolver_store.go: persistence (storeRound, storeMessages, asset linking)
- resolver_memory.go: memory provider integration
- resolver_model_selection.go: model selection and candidate matching
- resolver_identity.go: display name and channel identity resolution
- resolver_settings.go: bot settings, loop detection, inbox
- user_header.go: YAML front-matter formatting
- resolver_util.go: shared utilities (sanitize, normalize, dedup, UUID)

* fix(agent): enable Anthropic extended thinking by passing ReasoningConfig to provider

Anthropic's thinking requires WithThinking() at provider creation time,
unlike OpenAI which uses per-request ReasoningEffort. The config was
never wired through, so Claude models could not trigger thinking.
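The distinction can be illustrated with a functional-options sketch: a creation-time option (Anthropic-style `WithThinking`) versus a per-request field (OpenAI-style `ReasoningEffort`). None of these names are the real twilight-ai API; this only shows why the config must be applied when the provider is constructed.

```go
package main

import "fmt"

// Provider models a client whose thinking mode is fixed at construction.
type Provider struct{ thinking bool }

type Option func(*Provider)

// WithThinking must be passed to NewProvider; it cannot be toggled later.
func WithThinking() Option {
	return func(p *Provider) { p.thinking = true }
}

func NewProvider(opts ...Option) *Provider {
	p := &Provider{}
	for _, o := range opts {
		o(p)
	}
	return p
}

// Request carries per-call settings; effort can vary between calls, but
// thinking stays off if the provider was built without WithThinking.
type Request struct{ ReasoningEffort string }

func main() {
	p := NewProvider(WithThinking())
	fmt.Println(p.thinking, Request{ReasoningEffort: "high"}.ReasoningEffort)
}
```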

* refactor(agent): extract prompts into embedded markdown templates

Move inline prompt strings from prompt.go into separate .md files under
internal/agent/prompts/, using {{key}} placeholders and a simple render
engine. Remove obsolete SystemPromptParams fields (Language,
MaxContextLoadTime, Channels, CurrentChannel) and their call-site usage.
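A render engine of the kind described can be as small as plain string substitution over `{{key}}` placeholders. This is a minimal sketch (the embedded templates themselves would be loaded via `go:embed` from internal/agent/prompts/); it assumes no escaping or nested templates.

```go
package main

import (
	"fmt"
	"strings"
)

// render substitutes {{key}} placeholders in a template with their values.
// Unknown placeholders are left untouched.
func render(tmpl string, vars map[string]string) string {
	for k, v := range vars {
		tmpl = strings.ReplaceAll(tmpl, "{{"+k+"}}", v)
	}
	return tmpl
}

func main() {
	out := render("Hello {{name}}, you are {{role}}.", map[string]string{
		"name": "Memoh",
		"role": "assistant",
	})
	fmt.Println(out) // Hello Memoh, you are assistant.
}
```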

* fix: lint
2026-03-19 13:31:54 +08:00

// Package conversation defines conversation domain types and rules.
package conversation

import (
    "encoding/json"
    "strings"
    "time"
)

// Conversation kind constants.
const (
    KindDirect = "direct"
    KindGroup  = "group"
    KindThread = "thread"
)

// Participant role constants.
const (
    RoleOwner  = "owner"
    RoleAdmin  = "admin"
    RoleMember = "member"
)

// Conversation list access mode constants.
const (
    AccessModeParticipant             = "participant"
    AccessModeChannelIdentityObserved = "channel_identity_observed"
)

// Conversation is the first-class conversation container.
type Conversation struct {
    ID           string         `json:"id"`
    BotID        string         `json:"bot_id"`
    Kind         string         `json:"kind"`
    ParentChatID string         `json:"parent_chat_id,omitempty"`
    Title        string         `json:"title,omitempty"`
    CreatedBy    string         `json:"created_by"`
    Metadata     map[string]any `json:"metadata,omitempty"`
    CreatedAt    time.Time      `json:"created_at"`
    UpdatedAt    time.Time      `json:"updated_at"`
}

// ConversationListItem is a conversation entry with access context for list rendering.
type ConversationListItem struct {
    ID              string         `json:"id"`
    BotID           string         `json:"bot_id"`
    Kind            string         `json:"kind"`
    ParentChatID    string         `json:"parent_chat_id,omitempty"`
    Title           string         `json:"title,omitempty"`
    CreatedBy       string         `json:"created_by"`
    Metadata        map[string]any `json:"metadata,omitempty"`
    CreatedAt       time.Time      `json:"created_at"`
    UpdatedAt       time.Time      `json:"updated_at"`
    AccessMode      string         `json:"access_mode"`
    ParticipantRole string         `json:"participant_role,omitempty"`
    LastObservedAt  *time.Time     `json:"last_observed_at,omitempty"`
}

// ConversationReadAccess is the resolved access context for reading conversation content.
type ConversationReadAccess struct {
    AccessMode      string
    ParticipantRole string
    LastObservedAt  *time.Time
}

// Participant represents a chat member.
type Participant struct {
    ChatID   string    `json:"chat_id"`
    UserID   string    `json:"user_id"`
    Role     string    `json:"role"`
    JoinedAt time.Time `json:"joined_at"`
}

// Settings holds per-chat configuration.
type Settings struct {
    ChatID  string `json:"chat_id"`
    ModelID string `json:"model_id,omitempty"`
}

// CreateRequest is the input for creating a bot-scoped conversation container.
type CreateRequest struct {
    Kind         string         `json:"kind"`
    Title        string         `json:"title,omitempty"`
    ParentChatID string         `json:"parent_chat_id,omitempty"`
    Metadata     map[string]any `json:"metadata,omitempty"`
}

// UpdateSettingsRequest is the input for updating chat settings.
type UpdateSettingsRequest struct {
    ModelID *string `json:"model_id,omitempty"`
}

// ModelMessage is the canonical message format exchanged with the agent gateway.
// Aligned with Vercel AI SDK ModelMessage structure.
type ModelMessage struct {
    Role       string          `json:"role"`
    Content    json.RawMessage `json:"content,omitempty"`
    ToolCalls  []ToolCall      `json:"tool_calls,omitempty"`
    ToolCallID string          `json:"tool_call_id,omitempty"`
    Name       string          `json:"name,omitempty"`
}
// TextContent extracts the plain text from the message content.
// If content is a string, it returns it directly.
// If content is an array of parts, it joins all text-type parts.
func (m ModelMessage) TextContent() string {
    if len(m.Content) == 0 {
        return ""
    }
    var s string
    if err := json.Unmarshal(m.Content, &s); err == nil {
        return s
    }
    var parts []ContentPart
    if err := json.Unmarshal(m.Content, &parts); err == nil {
        texts := make([]string, 0, len(parts))
        for _, p := range parts {
            // Ignore reasoning parts.
            if p.Type == "reasoning" {
                continue
            }
            if strings.TrimSpace(p.Text) != "" {
                texts = append(texts, p.Text)
            }
        }
        return strings.Join(texts, "\n")
    }
    return ""
}

// ContentParts parses the content as an array of ContentPart.
// Returns nil if the content is a plain string or not parseable.
func (m ModelMessage) ContentParts() []ContentPart {
    if len(m.Content) == 0 {
        return nil
    }
    var parts []ContentPart
    if err := json.Unmarshal(m.Content, &parts); err != nil {
        return nil
    }
    return parts
}

// HasContent reports whether the message carries non-empty content or tool calls.
func (m ModelMessage) HasContent() bool {
    if strings.TrimSpace(m.TextContent()) != "" {
        return true
    }
    if len(m.ContentParts()) > 0 {
        return true
    }
    return len(m.ToolCalls) > 0
}

// NewTextContent creates a json.RawMessage from a plain string.
func NewTextContent(text string) json.RawMessage {
    data, err := json.Marshal(text)
    if err != nil {
        return nil
    }
    return data
}
// ContentPart represents one element of a multi-part message content.
type ContentPart struct {
    Type              string         `json:"type"`
    Text              string         `json:"text,omitempty"`
    URL               string         `json:"url,omitempty"`
    Styles            []string       `json:"styles,omitempty"`
    Language          string         `json:"language,omitempty"`
    ChannelIdentityID string         `json:"channel_identity_id,omitempty"`
    Emoji             string         `json:"emoji,omitempty"`
    Metadata          map[string]any `json:"metadata,omitempty"`
}

// HasValue reports whether the content part carries a meaningful value.
func (p ContentPart) HasValue() bool {
    return strings.TrimSpace(p.Text) != "" ||
        strings.TrimSpace(p.URL) != "" ||
        strings.TrimSpace(p.Emoji) != ""
}

// ToolCall represents a function/tool invocation in an assistant message.
type ToolCall struct {
    ID       string           `json:"id,omitempty"`
    Type     string           `json:"type"`
    Function ToolCallFunction `json:"function"`
}

// ToolCallFunction holds the name and serialized arguments of a tool call.
type ToolCallFunction struct {
    Name      string `json:"name"`
    Arguments string `json:"arguments"`
}

// ChatAttachment is a media attachment carried in a chat request.
type ChatAttachment struct {
    Type        string         `json:"type"`
    Base64      string         `json:"base64,omitempty"`
    Path        string         `json:"path,omitempty"`
    URL         string         `json:"url,omitempty"`
    PlatformKey string         `json:"platform_key,omitempty"`
    ContentHash string         `json:"content_hash,omitempty"`
    Name        string         `json:"name,omitempty"`
    Mime        string         `json:"mime,omitempty"`
    Size        int64          `json:"size,omitempty"`
    Metadata    map[string]any `json:"metadata,omitempty"`
}

// OutboundAssetRef carries an asset reference accumulated during outbound streaming.
type OutboundAssetRef struct {
    ContentHash string
    Role        string
    Ordinal     int
    Mime        string
    SizeBytes   int64
    StorageKey  string
    Name        string
    Metadata    map[string]any
}

// ChatRequest is the input for Chat and StreamChat.
type ChatRequest struct {
    BotID                   string `json:"-"`
    ChatID                  string `json:"-"`
    Token                   string `json:"-"`
    UserID                  string `json:"-"`
    SourceChannelIdentityID string `json:"-"`
    DisplayName             string `json:"-"`
    RouteID                 string `json:"-"`
    ChatToken               string `json:"-"`
    ExternalMessageID       string `json:"-"`
    ReplyTarget             string `json:"-"`
    ConversationType        string `json:"-"`
    ConversationName        string `json:"-"`
    UserMessagePersisted    bool   `json:"-"`
    // OutboundAssetCollector returns asset refs accumulated during outbound streaming.
    // Set by the inbound channel processor; called by the resolver at persist time.
    OutboundAssetCollector func() []OutboundAssetRef `json:"-"`
    Query                  string                    `json:"query"`
    Model                  string                    `json:"model,omitempty"`
    Provider               string                    `json:"provider,omitempty"`
    MaxContextLoadTime     int                       `json:"max_context_load_time,omitempty"`
    Channels               []string                  `json:"channels,omitempty"`
    CurrentChannel         string                    `json:"current_channel,omitempty"`
    Messages               []ModelMessage            `json:"messages,omitempty"`
    Skills                 []string                  `json:"skills,omitempty"`
    Attachments            []ChatAttachment          `json:"attachments,omitempty"`
}

// ChatResponse is the output of a non-streaming chat call.
type ChatResponse struct {
    Messages []ModelMessage `json:"messages"`
    Skills   []string       `json:"skills,omitempty"`
    Model    string         `json:"model,omitempty"`
    Provider string         `json:"provider,omitempty"`
}

// StreamChunk is a raw JSON chunk from the streaming response.
type StreamChunk = json.RawMessage

// AssistantOutput holds extracted assistant content for downstream consumers.
type AssistantOutput struct {
    Content string
    Parts   []ContentPart
}