Memoh/internal/bots/service.go
Acbox Liu 7d7d0e4b51 refactor: introduce multi-session chat support (#session) (#267)
* refactor: introduce multi-session chat support (#session)

Replace the single-context-per-bot model with multiple chat sessions.

Database:
- Add bot_sessions table (route_id, channel_type, title, metadata, soft delete)
- Migrate bot_history_messages from (route_id, channel_type) to session_id
- Add active_session_id to bot_channel_routes
- Migration 0036 handles data migration from existing messages

Backend:
- New internal/session service for session CRUD
- Update message service/types to use session_id instead of route_id
- Update conversation flow (resolver, history, store) for session context
- Channel inbound auto-creates/retrieves active session via SessionEnsurer
- New REST endpoints: /bots/:bot_id/sessions (CRUD)
- WebSocket and message handlers accept optional session_id
- Wire session service into FX dependency graph (agent + memoh)

Frontend:
- Refactor chat store: sessions replaces chats, sessionId replaces chatId
- Session-aware message loading, sending, and pagination
- WebSocket sends include session_id
- New session sidebar component with select/delete
- Chat area header shows active session title + new session button
- API layer updated: fetchSessions, createSession, deleteSession
- i18n strings for session management (en + zh)

SDK:
- Regenerated TypeScript SDK and Swagger docs with session endpoints

* fix: update tests for session refactoring (RouteID → SessionID)

Remove references to removed RouteID and Platform fields from
PersistInput/Message in channel_test.go and service_integration_test.go.

* fix: restore accidentally deleted SDK files and guard migration 0032

- Restore packages/sdk/src/container-stream.ts and extra/index.ts that
  were accidentally removed during SDK regeneration
- Wrap migration 0032 route_id index creation in a column existence check
  to avoid failure on fresh databases where 0001_init.up.sql no longer
  has route_id

* fix: guard migration 0036 data steps for fresh databases

Wrap steps 3-7 (which reference route_id/channel_type on
bot_history_messages) in a column existence check so the migration
is safe on fresh databases where 0001_init.up.sql already reflects
the final schema without those columns.

* feat: add title model setting and auto-generate session titles on user input

- Add title_model_id to bots table (migration 0037) and bot settings API
- Implement async title generation triggered at user message time (not after
  assistant response) for faster title availability
- Publish session_title_updated events via SSE event hub for real-time
  frontend updates without page refresh
- Fix SSE message event parsing: use direct JSON.parse instead of
  normalizeStreamEvent which silently dropped non-chat-stream event types
- Add title model selector in bot settings UI with i18n support

* fix: session-scoped message filtering and URL-based chat routing

- Filter realtime SSE messages by session_id to prevent cross-session
  message leakage after page refresh
- Add /chat/:sessionId? route with bidirectional URL ↔ store sync
- Visiting /chat shows a clean state with no bot or session pre-selected
- Visiting /chat/:sessionId loads the specific session directly
- Session switches from sidebar automatically update the URL
- Fix stale RouteID field in dedupe test (removed during session refactor)

* fix: skip cross-channel stream events to prevent session leakage

The bot-level web stream pushes events from all channels (Telegram,
Discord, etc.) without session_id context. Previously these were
rendered inline in the current chat view regardless of session.

Now cross-channel events are ignored in handleLocalStreamEvent;
persisted messages arrive via the SSE message events stream with
proper session_id filtering through appendRealtimeMessage.

* feat: show IM avatars and platform badges on session sidebar

- Add sender_avatar_url to route metadata from identity resolution
- Resolve group avatar and handle via directory adapter for group chats
- JOIN bot_channel_routes in ListSessionsByBot to return route metadata
- Display avatar with ChannelBadge on IM session items (group avatar
  for groups, sender avatar for private chats)
- Show @groupname or @username as session sub-label

* fix: clean up RunConfig unused fields, fix skill system and copy bug

- Remove unused RunConfig fields: Tools, Channels, CurrentChannel,
  ActiveContextTime
- Remove unused SessionContext fields: DisplayName, ConversationType
- Fix EnabledSkillNames copy bug: make([]string, 0, n) + copy copies
  zero elements; changed to make([]string, n)
- Fix prepareRunConfig dead code: remove no-op loop over
  CurrentPlatform runes; compute supportsImageInput from model's
  InputModalities
- Fix EnabledSkills always nil in system prompt: resolve enabled skill
  entries from EnabledSkillNames + Skills
- Fix use_skill tool returning empty response: now returns full skill
  content (description + instructions) so LLM gets it in the same turn
- Skip use_skill tool registration when no skills are available
- Conditionally render Skills section in system prompt (hidden when
  no skills exist)
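
The EnabledSkillNames copy bug in the list above follows from how Go's built-in copy behaves: it copies min(len(dst), len(src)) elements, so a destination of length 0 receives nothing regardless of its capacity. A minimal sketch of the pattern, using hypothetical helper names (cloneBuggy, cloneFixed) that do not come from the repository:

```go
package main

import "fmt"

// cloneBuggy reproduces the faulty pattern: length 0, capacity len(src).
// Hypothetical helper for illustration only.
func cloneBuggy(src []string) []string {
	dst := make([]string, 0, len(src))
	copy(dst, src) // copies min(len(dst), len(src)) = 0 elements
	return dst
}

// cloneFixed allocates the destination with length len(src), so copy
// transfers every element.
func cloneFixed(src []string) []string {
	dst := make([]string, len(src))
	copy(dst, src)
	return dst
}

func main() {
	names := []string{"search", "summarize"}
	fmt.Println(len(cloneBuggy(names)), len(cloneFixed(names))) // prints "0 2"
}
```

An equivalent fix keeps the zero-length slice and uses append(dst, src...) instead of copy.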

* feat: add session type field and bind sessions to heartbeat/schedule executions

- Add `type` column to `bot_sessions` (chat | heartbeat | schedule)
- Add `session_id` to `bot_heartbeat_logs` for per-execution session tracking
- Create `schedule_logs` table binding schedule_id + session_id
- Heartbeat and schedule runs now create independent sessions and persist
  agent messages via storeRound, enabling full conversation replay
- Add schedule logs API endpoints (list by bot, list by schedule, delete)
- Update Triggerer interfaces to return TriggerResult with status/usage/model

* refactor: modular system prompts per session type (chat/heartbeat/schedule)

Split the monolithic system.md into three type-specific system prompts
with shared fragments via {{include:_xxx}} syntax, so each session type
gets a focused prompt without irrelevant instructions.
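
One way such an include directive could be expanded is sketched below. This is an assumption about the mechanism, not the repository's implementation: the function name expandIncludes, the fragment map, the single non-recursive pass, and the choice to leave unknown fragment names untouched are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// includePattern matches directives of the form {{include:_name}}.
var includePattern = regexp.MustCompile(`\{\{include:([A-Za-z0-9_]+)\}\}`)

// expandIncludes replaces each {{include:name}} directive with the matching
// entry in fragments. Directives with no matching fragment are left as-is.
// Hypothetical helper: the expansion is a single pass and not recursive.
func expandIncludes(tmpl string, fragments map[string]string) string {
	return includePattern.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := includePattern.FindStringSubmatch(m)[1]
		if body, ok := fragments[name]; ok {
			return body
		}
		return m
	})
}

func main() {
	fragments := map[string]string{"_tools": "You may call tools."}
	fmt.Println(expandIncludes("Heartbeat prompt. {{include:_tools}}", fragments))
}
```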

* fix: prevent message duplication after task completion

message_created events from Persist() had an empty platform field because
toMessageFromCreate() didn't extract it from the session. This caused
appendRealtimeMessage to fail the platform === 'web' guard, and
hasMessageWithId to fail because local IDs differ from server UUIDs,
resulting in all messages being appended as duplicates.

- Extract platform from metadata in toMessageFromCreate so published events
  carry the correct value
- Pass channel_type: 'web' when creating sessions from the web frontend so
  List queries return the correct platform via the session JOIN

* fix: use per-message usage from SDK instead of misaligned step-level usages

Previously, token usage was stored via a separate per-step usages array
that didn't align with messages (off-by-one from prepending user message,
step count != message count). This caused:
- User messages incorrectly receiving usage data
- Usage values shifted across messages in multi-step rounds
- Last assistant message getting the accumulated total instead of its own step usage
- InputTokenDetails/OutputTokenDetails lost during manual accumulation

Now each sdk.Message carries its own per-step Usage (set by the SDK in
buildStepMessages), which is extracted in sdkMessagesToModelMessages and
stored directly via ModelMessage.Usage. The storeRound/storeMessages path
no longer needs external usage/usages parameters.

Also fixes the totalUsage accumulation in runStream to include all detail
fields (InputTokenDetails, OutputTokenDetails).
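
The idea of per-message usage can be sketched as follows. The types here (usage, message) and the function totalUsage are simplified, hypothetical stand-ins, not the SDK's actual sdk.Message or ModelMessage definitions. The point is only that usage travels with the message that produced it, so user messages carry none and a round total is a plain sum.

```go
package main

import "fmt"

// usage is a simplified, hypothetical stand-in for a token usage record.
type usage struct {
	InputTokens  int
	OutputTokens int
}

// message is a hypothetical message that carries its own usage.
// User messages have a nil Usage.
type message struct {
	Role  string
	Usage *usage
}

// totalUsage sums the usage attached to each message. No index alignment
// between a separate usages slice and the messages slice is required.
func totalUsage(msgs []message) usage {
	var total usage
	for _, m := range msgs {
		if m.Usage == nil {
			continue
		}
		total.InputTokens += m.Usage.InputTokens
		total.OutputTokens += m.Usage.OutputTokens
	}
	return total
}

func main() {
	msgs := []message{
		{Role: "user"},
		{Role: "assistant", Usage: &usage{InputTokens: 10, OutputTokens: 3}},
		{Role: "assistant", Usage: &usage{InputTokens: 12, OutputTokens: 5}},
	}
	fmt.Printf("%+v\n", totalUsage(msgs))
}
```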

* feat: add /new slash command to create a new active session from IM channels

Users in Telegram/Discord/Feishu can now send /new to start a fresh
conversation, resetting the session context for the current chat thread.
The command resolves the channel route, creates a new session, sets it as
the active session on the route, and replies with a confirmation message.

* feat: distinguish heartbeat and schedule sessions with dedicated icons in sidebar

Heartbeat sessions show a heart-pulse icon (rose), schedule sessions
show a clock icon (amber), and both display a type label beneath the
session title.

* refactor: remove enabledSkills system prompt injection, keep sorted skill listing

use_skill now returns skill content directly as tool output, so there is
no need to inject enabled skill body text into the system prompt. Remove
the entire enabledSkills tracking chain (RunConfig.EnabledSkillNames,
StreamEvent.Skills, GenerateResult.Skills, ChatRequest/Response.Skills,
enableSkill closures in runStream/runGenerate, prepareRunConfig matching).

Keep a lightweight skills listing (name + description only) in the system
prompt so the model knows which skills are available. Sort entries by name
to guarantee deterministic ordering and maximize KV cache reuse.
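
A sorted listing of this kind might look like the sketch below. The skill type, the renderSkillListing name, and the output format are assumptions for illustration. What matters is that the same set of skills always renders to the same bytes, whatever order it arrives in, which is what keeps the prompt prefix stable for caching.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// skill is a hypothetical name + description pair.
type skill struct {
	Name        string
	Description string
}

// renderSkillListing returns one line per skill, ordered by name.
// It sorts a copy so the caller's slice is not reordered.
func renderSkillListing(skills []skill) string {
	sorted := make([]skill, len(skills))
	copy(sorted, skills)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	for _, s := range sorted {
		fmt.Fprintf(&b, "- %s: %s\n", s.Name, s.Description)
	}
	return b.String()
}

func main() {
	fmt.Print(renderSkillListing([]skill{
		{Name: "translate", Description: "Translate text"},
		{Name: "summarize", Description: "Summarize a document"},
	}))
}
```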

* refactor: remove inbox system, persist passive messages directly to history

Replace the bot_inbox table and service with direct writes to
bot_history_messages for group conversations where the bot is not
@mentioned. Trigger-path messages continue to be persisted after the
agent responds (unchanged).

- Drop bot_inbox table and max_inbox_items column (migration 0039)
- Delete internal/inbox/, handlers/inbox.go, command/inbox.go,
  agent/tools/inbox.go and the MCP message provider
- Add persistPassiveMessage() in channel inbound to write user
  messages into the active session immediately
- Rewrite ListObservedConversationsByChannelIdentity to query
  bot_history_messages + bot_sessions instead of bot_inbox
- Extract shared send/react logic into internal/messaging/executor.go;
  agent/tools/message.go is now a thin SDK adapter
- Clean up all inbox references from agent prompts, flow resolver,
  email trigger, settings, commands, DI wiring, and frontend
- Regenerate sqlc, swagger, and SDK

* feat: add list_sessions and search_messages agent tools

Provide agents with the ability to query session metadata and search
message history across all sessions. search_messages supports filtering
by time range, keyword (JSONB-aware ILIKE), session, contact, and role,
with a default 7-day lookback when no start_time is given.

* feat: inject last_heartbeat time and improve heartbeat search guidance

Query the previous heartbeat's started_at timestamp and pass it through
TriggerPayload into the heartbeat prompt template. Update system prompt
and HEARTBEAT.md checklist to guide agents to use search_messages with
start_time=last_heartbeat for efficient cross-session message review.

* fix: pass BridgeProvider to FSClient and store full heartbeat prompt

FSClient was always created with nil provider, causing all container
file reads (IDENTITY.md, SOUL.md, MEMORY.md, HEARTBEAT.md, etc.) to
silently return empty strings. Expose Agent.BridgeProvider() and wire
it into Resolver. Also fix heartbeat trigger to store the full prompt
template as the user message instead of the literal "heartbeat" string.

* feat: add line numbers to container file read output

Move line-number formatting from the bridge gRPC server to the agent
tool layer so that the raw content stored and transmitted via gRPC
remains clean, while the read_file tool output includes numbered lines
for easier reference by the agent.
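
Formatting at the tool layer could be as simple as the sketch below. The function name numberLines and the tab-separated layout are assumptions; the actual read_file output format is not shown in this commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// numberLines prefixes each line of content with its 1-based line number.
// The raw content is left untouched; only the returned string is numbered.
// Hypothetical helper: the separator and layout are illustrative.
func numberLines(content string) string {
	if content == "" {
		return ""
	}
	// Drop one trailing newline so it does not produce an empty last line.
	lines := strings.Split(strings.TrimSuffix(content, "\n"), "\n")

	var b strings.Builder
	for i, line := range lines {
		fmt.Fprintf(&b, "%d\t%s\n", i+1, line)
	}
	return b.String()
}

func main() {
	fmt.Print(numberLines("# HEARTBEAT\n- check messages\n"))
}
```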

* chore(deps): update twilight-ai to v0.3.2

* fix: lint, test
2026-03-21 15:57:22 +08:00


package bots

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"log/slog"
	"strings"
	"time"

	"github.com/google/uuid"
	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgtype"

	"github.com/memohai/memoh/internal/db"
	"github.com/memohai/memoh/internal/db/sqlc"
)

// Service provides bot CRUD and membership management.
type Service struct {
	queries               *sqlc.Queries
	logger                *slog.Logger
	containerLifecycle    ContainerLifecycle
	checkers              []RuntimeChecker
	containerReachability func(ctx context.Context, botID string) error
}

const (
	botLifecycleOperationTimeout = 5 * time.Minute
)

var (
	ErrBotNotFound       = errors.New("bot not found")
	ErrBotAccessDenied   = errors.New("bot access denied")
	ErrOwnerUserNotFound = errors.New("owner user not found")
)

// NewService creates a new bot service.
func NewService(log *slog.Logger, queries *sqlc.Queries) *Service {
	if log == nil {
		log = slog.Default()
	}
	return &Service{
		queries: queries,
		logger:  log.With(slog.String("service", "bots")),
	}
}

// SetContainerLifecycle registers a container lifecycle handler for bot operations.
func (s *Service) SetContainerLifecycle(lc ContainerLifecycle) {
s.containerLifecycle = lc
}
// SetContainerReachability registers a function that checks whether a bot's
// container is reachable via gRPC. Returns nil on success, error otherwise.
func (s *Service) SetContainerReachability(fn func(ctx context.Context, botID string) error) {
s.containerReachability = fn
}
// AddRuntimeChecker registers an additional runtime checker.
func (s *Service) AddRuntimeChecker(c RuntimeChecker) {
if c != nil {
s.checkers = append(s.checkers, c)
}
}
// AuthorizeAccess checks whether userID may access the given bot (owner or admin only).
func (s *Service) AuthorizeAccess(ctx context.Context, userID, botID string, isAdmin bool) (Bot, error) {
if s.queries == nil {
return Bot{}, errors.New("bot queries not configured")
}
bot, err := s.Get(ctx, botID)
if err != nil {
if errors.Is(err, pgx.ErrNoRows) {
return Bot{}, ErrBotNotFound
}
return Bot{}, err
}
if isAdmin || bot.OwnerUserID == userID {
return bot, nil
}
return Bot{}, ErrBotAccessDenied
}
// Create creates a new bot owned by owner user.
func (s *Service) Create(ctx context.Context, ownerUserID string, req CreateBotRequest) (Bot, error) {
if s.queries == nil {
return Bot{}, errors.New("bot queries not configured")
}
ownerID := strings.TrimSpace(ownerUserID)
if ownerID == "" {
return Bot{}, errors.New("owner user id is required")
}
ownerUUID, err := db.ParseUUID(ownerID)
if err != nil {
return Bot{}, err
}
if err := s.ensureUserExists(ctx, ownerUUID); err != nil {
return Bot{}, err
}
displayName := strings.TrimSpace(req.DisplayName)
if displayName == "" {
displayName = "bot-" + uuid.NewString()
}
avatarURL := strings.TrimSpace(req.AvatarURL)
isActive := true
if req.IsActive != nil {
isActive = *req.IsActive
}
metadata := req.Metadata
if metadata == nil {
metadata = map[string]any{}
}
payload, err := json.Marshal(metadata)
if err != nil {
return Bot{}, err
}
row, err := s.queries.CreateBot(ctx, sqlc.CreateBotParams{
OwnerUserID: ownerUUID,
DisplayName: pgtype.Text{String: displayName, Valid: displayName != ""},
AvatarUrl: pgtype.Text{String: avatarURL, Valid: avatarURL != ""},
IsActive: isActive,
Metadata: payload,
Status: BotStatusCreating,
})
if err != nil {
return Bot{}, err
}
bot, err := toBot(asSQLCBot(row))
if err != nil {
return Bot{}, err
}
if err := s.attachCheckSummary(ctx, &bot, asSQLCBot(row)); err != nil {
return Bot{}, err
}
s.enqueueCreateLifecycle(ctx, bot.ID)
return bot, nil
}
// Get returns a bot by its ID.
func (s *Service) Get(ctx context.Context, botID string) (Bot, error) {
if s.queries == nil {
return Bot{}, errors.New("bot queries not configured")
}
botUUID, err := db.ParseUUID(botID)
if err != nil {
return Bot{}, err
}
row, err := s.queries.GetBotByID(ctx, botUUID)
if err != nil {
return Bot{}, err
}
bot, err := toBot(asSQLCBot(row))
if err != nil {
return Bot{}, err
}
if err := s.attachCheckSummary(ctx, &bot, asSQLCBot(row)); err != nil {
return Bot{}, err
}
return bot, nil
}
// ListByOwner returns bots owned by the given user.
func (s *Service) ListByOwner(ctx context.Context, ownerUserID string) ([]Bot, error) {
if s.queries == nil {
return nil, errors.New("bot queries not configured")
}
ownerUUID, err := db.ParseUUID(ownerUserID)
if err != nil {
return nil, err
}
rows, err := s.queries.ListBotsByOwner(ctx, ownerUUID)
if err != nil {
return nil, err
}
items := make([]Bot, 0, len(rows))
for _, row := range rows {
item, err := toBot(asSQLCBot(row))
if err != nil {
return nil, err
}
if err := s.attachCheckSummary(ctx, &item, asSQLCBot(row)); err != nil {
return nil, err
}
items = append(items, item)
}
return items, nil
}
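// ListAccessible returns the bots accessible to the given identity.
// Access is currently owner-only, so the ID is treated as an owner user ID
// and the call delegates to ListByOwner.
func (s *Service) ListAccessible(ctx context.Context, channelIdentityID string) ([]Bot, error) {
	return s.ListByOwner(ctx, channelIdentityID)
}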
// Update updates bot profile fields.
func (s *Service) Update(ctx context.Context, botID string, req UpdateBotRequest) (Bot, error) {
if s.queries == nil {
return Bot{}, errors.New("bot queries not configured")
}
botUUID, err := db.ParseUUID(botID)
if err != nil {
return Bot{}, err
}
existing, err := s.queries.GetBotByID(ctx, botUUID)
if err != nil {
return Bot{}, err
}
displayName := strings.TrimSpace(existing.DisplayName.String)
avatarURL := strings.TrimSpace(existing.AvatarUrl.String)
isActive := existing.IsActive
metadata, err := decodeMetadata(existing.Metadata)
if err != nil {
return Bot{}, err
}
if req.DisplayName != nil {
displayName = strings.TrimSpace(*req.DisplayName)
}
if req.AvatarURL != nil {
avatarURL = strings.TrimSpace(*req.AvatarURL)
}
if req.IsActive != nil {
isActive = *req.IsActive
}
if req.Metadata != nil {
metadata = req.Metadata
}
if displayName == "" {
displayName = "bot-" + uuid.NewString()
}
payload, err := json.Marshal(metadata)
if err != nil {
return Bot{}, err
}
row, err := s.queries.UpdateBotProfile(ctx, sqlc.UpdateBotProfileParams{
ID: botUUID,
DisplayName: pgtype.Text{String: displayName, Valid: displayName != ""},
AvatarUrl: pgtype.Text{String: avatarURL, Valid: avatarURL != ""},
IsActive: isActive,
Metadata: payload,
})
if err != nil {
return Bot{}, err
}
bot, err := toBot(asSQLCBot(row))
if err != nil {
return Bot{}, err
}
if err := s.attachCheckSummary(ctx, &bot, asSQLCBot(row)); err != nil {
return Bot{}, err
}
return bot, nil
}
// TransferOwner transfers bot ownership to another user.
func (s *Service) TransferOwner(ctx context.Context, botID string, ownerUserID string) (Bot, error) {
if s.queries == nil {
return Bot{}, errors.New("bot queries not configured")
}
botUUID, err := db.ParseUUID(botID)
if err != nil {
return Bot{}, err
}
ownerUUID, err := db.ParseUUID(ownerUserID)
if err != nil {
return Bot{}, err
}
if err := s.ensureUserExists(ctx, ownerUUID); err != nil {
return Bot{}, err
}
row, err := s.queries.UpdateBotOwner(ctx, sqlc.UpdateBotOwnerParams{
ID: botUUID,
OwnerUserID: ownerUUID,
})
if err != nil {
return Bot{}, err
}
bot, err := toBot(asSQLCBot(row))
if err != nil {
return Bot{}, err
}
if err := s.attachCheckSummary(ctx, &bot, asSQLCBot(row)); err != nil {
return Bot{}, err
}
return bot, nil
}
// Delete removes a bot and its associated resources.
func (s *Service) Delete(ctx context.Context, botID string) error {
if s.queries == nil {
return errors.New("bot queries not configured")
}
botUUID, err := db.ParseUUID(botID)
if err != nil {
return err
}
row, err := s.queries.GetBotByID(ctx, botUUID)
if err != nil {
return err
}
if strings.TrimSpace(row.Status) == BotStatusDeleting {
return nil
}
if err := s.queries.UpdateBotStatus(ctx, sqlc.UpdateBotStatusParams{
ID: botUUID,
Status: BotStatusDeleting,
}); err != nil {
return err
}
s.enqueueDeleteLifecycle(ctx, botID)
return nil
}
// ListChecks evaluates runtime resource checks for a bot.
func (s *Service) ListChecks(ctx context.Context, botID string) ([]BotCheck, error) {
if s.queries == nil {
return nil, errors.New("bot queries not configured")
}
botUUID, err := db.ParseUUID(botID)
if err != nil {
return nil, err
}
row, err := s.queries.GetBotByID(ctx, botUUID)
if err != nil {
return nil, err
}
return s.buildRuntimeChecks(ctx, asSQLCBot(row), true)
}
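// enqueueCreateLifecycle provisions the bot's container in the background and
// then marks the bot ready. The work runs on a context detached from the
// caller's cancellation (context.WithoutCancel) and is bounded by
// botLifecycleOperationTimeout. A container setup failure is logged, and the
// status is still set to ready afterwards.
func (s *Service) enqueueCreateLifecycle(ctx context.Context, botID string) {
	go func() {
		lifecycleCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), botLifecycleOperationTimeout)
		defer cancel()
		if s.containerLifecycle != nil {
			if err := s.containerLifecycle.SetupBotContainer(lifecycleCtx, botID); err != nil {
				s.logger.Error("bot container setup failed",
					slog.String("bot_id", botID),
					slog.Any("error", err),
				)
			}
		}
		if err := s.updateStatus(lifecycleCtx, botID, BotStatusReady); err != nil {
			s.logger.Error("failed to update bot status to ready after create",
				slog.String("bot_id", botID),
				slog.Any("error", err),
			)
		}
	}()
}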
// enqueueDeleteLifecycle cleans up the bot's container in the background and
// then deletes the bot row. Like enqueueCreateLifecycle, it runs on a detached
// context bounded by botLifecycleOperationTimeout. If the bot ID cannot be
// parsed or the row delete fails, the status is reverted to ready.
func (s *Service) enqueueDeleteLifecycle(ctx context.Context, botID string) {
	go func() {
		lifecycleCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), botLifecycleOperationTimeout)
		defer cancel()
		if s.containerLifecycle != nil {
			if err := s.containerLifecycle.CleanupBotContainer(lifecycleCtx, botID, false); err != nil {
				s.logger.Error("bot container cleanup failed",
					slog.String("bot_id", botID),
					slog.Any("error", err),
				)
			}
		}
		botUUID, err := db.ParseUUID(botID)
		if err != nil {
			s.logger.Error("invalid bot id while finalizing delete",
				slog.String("bot_id", botID),
				slog.Any("error", err),
			)
			if err := s.updateStatus(lifecycleCtx, botID, BotStatusReady); err != nil {
				s.logger.Error("revert bot status failed", slog.String("bot_id", botID), slog.Any("error", err))
			}
			return
		}
		if err := s.queries.DeleteBotByID(lifecycleCtx, botUUID); err != nil {
			s.logger.Error("failed to delete bot after cleanup",
				slog.String("bot_id", botID),
				slog.Any("error", err),
			)
			if err := s.updateStatus(lifecycleCtx, botID, BotStatusReady); err != nil {
				s.logger.Error("revert bot status failed", slog.String("bot_id", botID), slog.Any("error", err))
			}
		}
	}()
}
// updateStatus sets the bot's status column to the trimmed status value.
func (s *Service) updateStatus(ctx context.Context, botID, status string) error {
	if s.queries == nil {
		return errors.New("bot queries not configured")
	}
	botUUID, err := db.ParseUUID(botID)
	if err != nil {
		return err
	}
	return s.queries.UpdateBotStatus(ctx, sqlc.UpdateBotStatusParams{
		ID:     botUUID,
		Status: strings.TrimSpace(status),
	})
}

// ensureUserExists returns ErrOwnerUserNotFound when no user row matches
// userID, and passes any other lookup error through unchanged.
func (s *Service) ensureUserExists(ctx context.Context, userID pgtype.UUID) error {
	if s.queries == nil {
		return errors.New("bot queries not configured")
	}
	_, err := s.queries.GetUserByID(ctx, userID)
	if err != nil {
		if errors.Is(err, pgx.ErrNoRows) {
			return ErrOwnerUserNotFound
		}
		return err
	}
	return nil
}
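// asSQLCBot normalizes the per-query row types generated by sqlc into a
// sqlc.Bot. Unrecognized types yield a zero-value sqlc.Bot.
func asSQLCBot(v any) sqlc.Bot {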
switch r := v.(type) {
case sqlc.Bot:
return r
case sqlc.CreateBotRow:
return sqlc.Bot{ID: r.ID, OwnerUserID: r.OwnerUserID, DisplayName: r.DisplayName, AvatarUrl: r.AvatarUrl, IsActive: r.IsActive, Status: r.Status, MaxContextLoadTime: r.MaxContextLoadTime, MaxContextTokens: r.MaxContextTokens, Language: r.Language, ReasoningEnabled: r.ReasoningEnabled, ReasoningEffort: r.ReasoningEffort, ChatModelID: r.ChatModelID, SearchProviderID: r.SearchProviderID, MemoryProviderID: r.MemoryProviderID, HeartbeatEnabled: r.HeartbeatEnabled, HeartbeatInterval: r.HeartbeatInterval, HeartbeatPrompt: r.HeartbeatPrompt, Metadata: r.Metadata, CreatedAt: r.CreatedAt, UpdatedAt: r.UpdatedAt}
case sqlc.GetBotByIDRow:
return sqlc.Bot{ID: r.ID, OwnerUserID: r.OwnerUserID, DisplayName: r.DisplayName, AvatarUrl: r.AvatarUrl, IsActive: r.IsActive, Status: r.Status, MaxContextLoadTime: r.MaxContextLoadTime, MaxContextTokens: r.MaxContextTokens, Language: r.Language, ReasoningEnabled: r.ReasoningEnabled, ReasoningEffort: r.ReasoningEffort, ChatModelID: r.ChatModelID, SearchProviderID: r.SearchProviderID, MemoryProviderID: r.MemoryProviderID, HeartbeatEnabled: r.HeartbeatEnabled, HeartbeatInterval: r.HeartbeatInterval, HeartbeatPrompt: r.HeartbeatPrompt, Metadata: r.Metadata, CreatedAt: r.CreatedAt, UpdatedAt: r.UpdatedAt}
case sqlc.ListBotsByOwnerRow:
return sqlc.Bot{ID: r.ID, OwnerUserID: r.OwnerUserID, DisplayName: r.DisplayName, AvatarUrl: r.AvatarUrl, IsActive: r.IsActive, Status: r.Status, MaxContextLoadTime: r.MaxContextLoadTime, MaxContextTokens: r.MaxContextTokens, Language: r.Language, ReasoningEnabled: r.ReasoningEnabled, ReasoningEffort: r.ReasoningEffort, ChatModelID: r.ChatModelID, SearchProviderID: r.SearchProviderID, MemoryProviderID: r.MemoryProviderID, HeartbeatEnabled: r.HeartbeatEnabled, HeartbeatInterval: r.HeartbeatInterval, HeartbeatPrompt: r.HeartbeatPrompt, Metadata: r.Metadata, CreatedAt: r.CreatedAt, UpdatedAt: r.UpdatedAt}
case sqlc.UpdateBotProfileRow:
return sqlc.Bot{ID: r.ID, OwnerUserID: r.OwnerUserID, DisplayName: r.DisplayName, AvatarUrl: r.AvatarUrl, IsActive: r.IsActive, Status: r.Status, MaxContextLoadTime: r.MaxContextLoadTime, MaxContextTokens: r.MaxContextTokens, Language: r.Language, ReasoningEnabled: r.ReasoningEnabled, ReasoningEffort: r.ReasoningEffort, ChatModelID: r.ChatModelID, SearchProviderID: r.SearchProviderID, MemoryProviderID: r.MemoryProviderID, HeartbeatEnabled: r.HeartbeatEnabled, HeartbeatInterval: r.HeartbeatInterval, HeartbeatPrompt: r.HeartbeatPrompt, Metadata: r.Metadata, CreatedAt: r.CreatedAt, UpdatedAt: r.UpdatedAt}
case sqlc.UpdateBotOwnerRow:
return sqlc.Bot{ID: r.ID, OwnerUserID: r.OwnerUserID, DisplayName: r.DisplayName, AvatarUrl: r.AvatarUrl, IsActive: r.IsActive, Status: r.Status, MaxContextLoadTime: r.MaxContextLoadTime, MaxContextTokens: r.MaxContextTokens, Language: r.Language, ReasoningEnabled: r.ReasoningEnabled, ReasoningEffort: r.ReasoningEffort, ChatModelID: r.ChatModelID, SearchProviderID: r.SearchProviderID, MemoryProviderID: r.MemoryProviderID, HeartbeatEnabled: r.HeartbeatEnabled, HeartbeatInterval: r.HeartbeatInterval, HeartbeatPrompt: r.HeartbeatPrompt, Metadata: r.Metadata, CreatedAt: r.CreatedAt, UpdatedAt: r.UpdatedAt}
default:
return sqlc.Bot{}
}
}
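// toBot converts a database row into the service-level Bot. CheckState starts
// as unknown until attachCheckSummary fills it in.
func toBot(row sqlc.Bot) (Bot, error) {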
displayName := ""
if row.DisplayName.Valid {
displayName = row.DisplayName.String
}
avatarURL := ""
if row.AvatarUrl.Valid {
avatarURL = row.AvatarUrl.String
}
metadata, err := decodeMetadata(row.Metadata)
if err != nil {
return Bot{}, err
}
createdAt := time.Time{}
if row.CreatedAt.Valid {
createdAt = row.CreatedAt.Time
}
updatedAt := time.Time{}
if row.UpdatedAt.Valid {
updatedAt = row.UpdatedAt.Time
}
return Bot{
ID: row.ID.String(),
OwnerUserID: row.OwnerUserID.String(),
DisplayName: displayName,
AvatarURL: avatarURL,
IsActive: row.IsActive,
Status: strings.TrimSpace(row.Status),
CheckState: BotCheckStateUnknown,
CheckIssueCount: 0,
Metadata: metadata,
CreatedAt: createdAt,
UpdatedAt: updatedAt,
}, nil
}
// decodeMetadata unmarshals a JSON metadata payload. An empty payload or a
// JSON null yields an empty, non-nil map.
func decodeMetadata(payload []byte) (map[string]any, error) {
	if len(payload) == 0 {
		return map[string]any{}, nil
	}
	var data map[string]any
	if err := json.Unmarshal(payload, &data); err != nil {
		return nil, err
	}
	if data == nil {
		data = map[string]any{}
	}
	return data, nil
}
// attachCheckSummary runs the builtin checks (dynamic checkers excluded) and
// stores the aggregated state and issue count on bot.
func (s *Service) attachCheckSummary(ctx context.Context, bot *Bot, row sqlc.Bot) error {
	checks, err := s.buildRuntimeChecks(ctx, row, false)
	if err != nil {
		return err
	}
	checkState, issueCount := summarizeChecks(checks)
	bot.CheckState = checkState
	bot.CheckIssueCount = issueCount
	return nil
}
// buildRuntimeChecks composes builtin checks and optional dynamic checker results.
// includeDynamic is disabled when computing list summary to avoid expensive runtime probes.
func (s *Service) buildRuntimeChecks(ctx context.Context, row sqlc.Bot, includeDynamic bool) ([]BotCheck, error) {
status := strings.TrimSpace(row.Status)
checks := make([]BotCheck, 0, 4)
if status == BotStatusCreating {
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerInit,
Type: BotCheckTypeContainerInit,
TitleKey: "bots.checks.titles.containerInit",
Status: BotCheckStatusUnknown,
Summary: "Initialization is in progress.",
Detail: "Bot resources are still being provisioned.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerRecord,
Type: BotCheckTypeContainerRecord,
TitleKey: "bots.checks.titles.containerRecord",
Status: BotCheckStatusUnknown,
Summary: "Container record is pending.",
Detail: "Container record will be checked after initialization.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerTask,
Type: BotCheckTypeContainerTask,
TitleKey: "bots.checks.titles.containerTask",
Status: BotCheckStatusUnknown,
Summary: "Container task state is pending.",
Detail: "Task state will be checked after initialization.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerData,
Type: BotCheckTypeContainerData,
TitleKey: "bots.checks.titles.containerDataPath",
Status: BotCheckStatusUnknown,
Summary: "Container reachability check is pending.",
Detail: "Reachability will be checked after initialization.",
})
if includeDynamic {
checks = s.appendDynamicChecks(ctx, row.ID.String(), checks)
}
return checks, nil
}
if status == BotStatusDeleting {
checks = append(checks, BotCheck{
ID: BotCheckTypeDelete,
Type: BotCheckTypeDelete,
TitleKey: "bots.checks.titles.botDelete",
Status: BotCheckStatusUnknown,
Summary: "Deletion is in progress.",
Detail: "Bot resources are being cleaned up.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerRecord,
Type: BotCheckTypeContainerRecord,
TitleKey: "bots.checks.titles.containerRecord",
Status: BotCheckStatusUnknown,
Summary: "Container record check is skipped.",
Detail: "Bot is deleting and container checks are paused.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerTask,
Type: BotCheckTypeContainerTask,
TitleKey: "bots.checks.titles.containerTask",
Status: BotCheckStatusUnknown,
Summary: "Container task check is skipped.",
Detail: "Bot is deleting and task checks are paused.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerData,
Type: BotCheckTypeContainerData,
TitleKey: "bots.checks.titles.containerDataPath",
Status: BotCheckStatusUnknown,
Summary: "Container reachability check is skipped.",
Detail: "Bot is deleting and reachability checks are paused.",
})
if includeDynamic {
checks = s.appendDynamicChecks(ctx, row.ID.String(), checks)
}
return checks, nil
}
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerInit,
Type: BotCheckTypeContainerInit,
TitleKey: "bots.checks.titles.containerInit",
Status: BotCheckStatusOK,
Summary: "Initialization finished.",
})
containerRow, err := s.queries.GetContainerByBotID(ctx, row.ID)
if err != nil {
if errors.Is(err, pgx.ErrNoRows) {
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerRecord,
Type: BotCheckTypeContainerRecord,
TitleKey: "bots.checks.titles.containerRecord",
Status: BotCheckStatusError,
Summary: "Container record is missing.",
Detail: "No container is attached to this bot.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerTask,
Type: BotCheckTypeContainerTask,
TitleKey: "bots.checks.titles.containerTask",
Status: BotCheckStatusUnknown,
Summary: "Container task state is unknown.",
Detail: "Task state cannot be determined without a container record.",
})
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerData,
Type: BotCheckTypeContainerData,
TitleKey: "bots.checks.titles.containerDataPath",
Status: BotCheckStatusUnknown,
Summary: "Container reachability is unknown.",
Detail: "Reachability cannot be determined without a container record.",
})
if includeDynamic {
checks = s.appendDynamicChecks(ctx, row.ID.String(), checks)
}
return checks, nil
}
return nil, err
}
checks = append(checks, BotCheck{
ID: BotCheckTypeContainerRecord,
Type: BotCheckTypeContainerRecord,
TitleKey: "bots.checks.titles.containerRecord",
Status: BotCheckStatusOK,
Summary: "Container record exists.",
Detail: fmt.Sprintf("container_id=%s", strings.TrimSpace(containerRow.ContainerID)),
Metadata: map[string]any{
"container_id": strings.TrimSpace(containerRow.ContainerID),
"namespace": strings.TrimSpace(containerRow.Namespace),
"image": strings.TrimSpace(containerRow.Image),
},
})
taskStatus := strings.TrimSpace(strings.ToLower(containerRow.Status))
taskCheck := BotCheck{
ID: BotCheckTypeContainerTask,
Type: BotCheckTypeContainerTask,
TitleKey: "bots.checks.titles.containerTask",
Status: BotCheckStatusWarn,
Summary: "Container task state needs attention.",
}
switch taskStatus {
case "running", "created", "stopped", "paused":
taskCheck.Status = BotCheckStatusOK
taskCheck.Summary = "Container task state is reported."
taskCheck.Detail = fmt.Sprintf("status=%s", taskStatus)
case "":
taskCheck.Detail = "status is empty"
default:
taskCheck.Detail = fmt.Sprintf("unexpected status=%s", taskStatus)
}
taskCheck.Metadata = map[string]any{"status": taskStatus}
checks = append(checks, taskCheck)
dataCheck := BotCheck{
ID: BotCheckTypeContainerData,
Type: BotCheckTypeContainerData,
TitleKey: "bots.checks.titles.containerDataPath",
Status: BotCheckStatusWarn,
Summary: "Container reachability needs attention.",
}
if s.containerReachability == nil {
dataCheck.Status = BotCheckStatusUnknown
dataCheck.Summary = "Container reachability check not configured."
} else if err := s.containerReachability(ctx, row.ID.String()); err != nil {
dataCheck.Status = BotCheckStatusError
dataCheck.Summary = "Container is not reachable via gRPC."
dataCheck.Detail = err.Error()
} else {
dataCheck.Status = BotCheckStatusOK
dataCheck.Summary = "Container is reachable via gRPC."
}
checks = append(checks, dataCheck)
if includeDynamic {
checks = s.appendDynamicChecks(ctx, row.ID.String(), checks)
}
return checks, nil
}
// appendDynamicChecks appends checks from registered runtime checkers.
func (s *Service) appendDynamicChecks(ctx context.Context, botID string, checks []BotCheck) []BotCheck {
for _, checker := range s.checkers {
items := checker.ListChecks(ctx, botID)
for _, item := range items {
item.ID = strings.TrimSpace(item.ID)
item.Type = strings.TrimSpace(item.Type)
item.Status = strings.TrimSpace(item.Status)
if item.ID == "" {
if item.Type != "" {
item.ID = item.Type
} else {
item.ID = "runtime.unknown"
if s.logger != nil {
s.logger.Warn("runtime checker returned check without id and type",
slog.String("bot_id", botID))
}
}
}
if item.Type == "" {
item.Type = item.ID
}
if item.Status == "" {
item.Status = BotCheckStatusUnknown
}
checks = append(checks, item)
}
}
return checks
}
// summarizeChecks reduces a list of checks to an overall state and an issue
// count. Any warn or error check yields the issue state. The state is unknown
// when there are no checks or every check is unknown; otherwise it is OK.
func summarizeChecks(checks []BotCheck) (string, int32) {
	if len(checks) == 0 {
		return BotCheckStateUnknown, 0
	}
	var issueCount int32
	unknownCount := 0
	for _, check := range checks {
		switch check.Status {
		case BotCheckStatusWarn, BotCheckStatusError:
			issueCount++
		case BotCheckStatusUnknown:
			unknownCount++
		}
	}
	if issueCount > 0 {
		return BotCheckStateIssue, issueCount
	}
	if unknownCount == len(checks) {
		return BotCheckStateUnknown, 0
	}
	return BotCheckStateOK, 0
}