mirror of
https://github.com/memohai/Memoh.git
synced 2026-04-25 07:00:48 +09:00
1680316c7f
* refactor(agent): replace TypeScript agent gateway with in-process Go agent using twilight-ai SDK
- Remove apps/agent (Bun/Elysia gateway), packages/agent (@memoh/agent),
internal/bun runtime manager, and all embedded agent/bun assets
- Add internal/agent package powered by twilight-ai SDK for LLM calls,
tool execution, streaming, sentinel logic, tag extraction, and prompts
- Integrate ToolGatewayService in-process for both built-in and user MCP
tools, eliminating HTTP round-trips to the old gateway
- Update resolver to convert between sdk.Message and ModelMessage at the
boundary (resolver_messages.go), keeping agent package free of
persistence concerns
- Prepend user message before storeRound since SDK only returns output
messages (assistant + tool)
- Clean up all Docker configs, TOML configs, nginx proxy, Dockerfile.agent,
and Go config structs related to the removed agent gateway
- Update cmd/agent and cmd/memoh entry points with setter-based
ToolGateway injection to avoid FX dependency cycles
* fix(web): move form declaration before computed properties that reference it
The `form` reactive object was declared after computed properties like
`selectedMemoryProvider` and `isSelectedMemoryProviderPersisted` that
reference it, causing a TDZ ReferenceError during setup.
* fix: prevent UTF-8 character corruption in streaming text output
StreamTagExtractor.Push() used byte-level string slicing to hold back
buffer tails for tag detection, which could split multi-byte UTF-8
characters. After json.Marshal replaced invalid bytes with U+FFFD,
the corruption became permanent — causing garbled CJK characters (�)
in agent responses.
Add safeUTF8SplitIndex() to back up split points to valid character
boundaries. Also fix byte-level truncation in command/formatter.go
and command/fs.go to use rune-aware slicing.
* fix: add agent error logging and fix Gemini tool schema validation
- Log agent stream errors in both SSE and WebSocket paths with bot/model context
- Fix send tool `attachments` parameter: empty `items` schema rejected by
Google Gemini API (INVALID_ARGUMENT), now specifies `{"type": "string"}`
- Upgrade twilight-ai to d898f0b (includes raw body in API error messages)
* chore(ci): remove agent gateway from Docker build and release pipelines
Agent gateway has been replaced by in-process Go agent; remove the
obsolete Docker image matrix entry, Bun/UPX CI steps, and agent-binary
build logic from the release script.
* fix: preserve attachment filename, metadata, and container path through persistence
- Add `name` column to `bot_history_message_assets` (migration 0034) to
persist original filenames across page refreshes.
- Add `metadata` JSONB column (migration 0035) to store source_path,
source_url, and other context alongside each asset.
- Update SQL queries, sqlc-generated code, and all Go types (MessageAsset,
AssetRef, OutboundAssetRef, FileAttachment) to carry name and metadata
through the full lifecycle.
- Extract filenames from path/URL in AttachmentsResolver before clearing
raw paths; enrich streaming event metadata with name, source_path, and
source_url in both the WebSocket and channel inbound ingestion paths.
- Implement `LinkAssets` on message service and `LinkOutboundAssets` on
flow resolver so WebSocket-streamed bot attachments are persisted to the
correct assistant message after streaming completes.
- Frontend: update MessageAsset type with metadata field, pass metadata
through to attachment items, and reorder attachment-block.vue template
so container files (identified by metadata.source_path) open in the
sidebar file manager instead of triggering a download.
* refactor(agent): decouple built-in tools from MCP, load via ToolProvider interface
Migrate all 13 built-in tool providers from internal/mcp/providers/ to
internal/agent/tools/ using the twilight-ai sdk.Tool structure. The agent
now loads tools through a ToolProvider interface instead of the MCP
ToolGatewayService, which is simplified to only manage external federation
sources. This enables selective tool loading and removes the coupling
between business tools and the MCP protocol layer.
* refactor(flow): split monolithic resolver.go into focused modules
Break the 1959-line resolver.go into 12 files organized by concern:
- resolver.go: core orchestration (Resolver struct, resolve, Chat, prepareRunConfig)
- resolver_stream.go: streaming (StreamChat, StreamChatWS, tryStoreStream)
- resolver_trigger.go: schedule/heartbeat triggers
- resolver_attachments.go: attachment routing, inlining, encoding
- resolver_history.go: message loading, deduplication, token trimming
- resolver_store.go: persistence (storeRound, storeMessages, asset linking)
- resolver_memory.go: memory provider integration
- resolver_model_selection.go: model selection and candidate matching
- resolver_identity.go: display name and channel identity resolution
- resolver_settings.go: bot settings, loop detection, inbox
- user_header.go: YAML front-matter formatting
- resolver_util.go: shared utilities (sanitize, normalize, dedup, UUID)
* fix(agent): enable Anthropic extended thinking by passing ReasoningConfig to provider
Anthropic's thinking requires WithThinking() at provider creation time,
unlike OpenAI which uses per-request ReasoningEffort. The config was
never wired through, so Claude models could not trigger thinking.
* refactor(agent): extract prompts into embedded markdown templates
Move inline prompt strings from prompt.go into separate .md files under
internal/agent/prompts/, using {{key}} placeholders and a simple render
engine. Remove obsolete SystemPromptParams fields (Language,
MaxContextLoadTime, Channels, CurrentChannel) and their call-site usage.
* fix: lint
229 lines
8.6 KiB
Markdown
---
name: twilight-ai
description: Assist with development in the Twilight AI Go SDK. Use when working in this repository, adding or updating providers, embeddings, tool calling, streaming, examples, or docs for Twilight AI.
---

# Twilight AI

## When To Use

Use this skill when the task involves `twilight-ai`, especially:

- implementing or refactoring SDK APIs in `sdk/`
- adding or updating providers under `provider/`
- working on `GenerateText`, `GenerateTextResult`, `StreamText`, `Embed`, or `EmbedMany`
- adding tool-calling, streaming, reasoning, or embedding support
- writing examples, docs, or usage guidance for this library

## Project Snapshot

Twilight AI is a lightweight Go AI SDK with a provider-agnostic core API.

- Text generation: `sdk.GenerateText`, `sdk.GenerateTextResult`, `sdk.StreamText`
- Embeddings: `sdk.Embed`, `sdk.EmbedMany`
- Tool calling: `sdk.Tool`, `sdk.NewTool[T]`, `WithMaxSteps`, approval flow
- MCP tool integration: `sdk.CreateMCPClient`, `sdk.MCPClient`, `sdk.MCPClientConfig`
- Streaming: typed `StreamPart` events over Go channels
- Current providers:
  - `provider/openai/completions`
  - `provider/openai/responses`
  - `provider/anthropic/messages`
  - `provider/google/generativeai`
  - `provider/openai/embedding`
  - `provider/google/embedding`

## Default Mental Model

Prefer the high-level SDK API first, then drop to provider details only when needed.

- `sdk.Model` binds a chat model to an `sdk.Provider`
- `sdk.EmbeddingModel` binds an embedding model to an `sdk.EmbeddingProvider`
- The client orchestrates tool loops, callbacks, approvals, and streaming lifecycle
- MCP clients can load remote MCP tools and turn them into ordinary `sdk.Tool` values
- Providers handle backend-specific HTTP, request mapping, response parsing, and SSE translation

## Core API Guidance

Choose the narrowest API that matches the task:

- Need only final text: use `sdk.GenerateText`
- Need usage, finish reason, steps, sources, files, or tool details: use `sdk.GenerateTextResult`
- Need live output: use `sdk.StreamText`
- Need one vector: use `sdk.Embed`
- Need multiple vectors or embedding token usage: use `sdk.EmbedMany`

If the task introduces examples or docs, prefer simple end-to-end snippets that start with:

1. construct provider
2. get model
3. call SDK API
4. handle error

## Provider Selection Rules

- Use `openai/completions` for broad OpenAI-compatible support such as DeepSeek, Groq, Ollama, Azure-style compatible endpoints, and generic `/chat/completions` backends.
- Use `openai/responses` when the task needs OpenAI Responses API features such as first-class reasoning models, reasoning summaries, URL citation annotations, or flat input mapping.
- Use `anthropic/messages` for Claude and Anthropic extended thinking via `WithThinking`.
- Use `google/generativeai` for Gemini chat, tool calling, vision, streaming, and Gemini reasoning.
- Use `openai/embedding` or `google/embedding` for embeddings. Keep embedding-provider work separate from chat-provider work.

## Implementation Rules

### Chat Providers

If adding or changing a chat provider, preserve the `sdk.Provider` contract:

- `Name()`
- `ListModels(ctx)`
- `Test(ctx)`
- `TestModel(ctx, modelID)`
- `DoGenerate(ctx, params)`
- `DoStream(ctx, params)`

Keep provider responsibilities focused:

- translate SDK messages/options into backend request format
- parse backend responses into `sdk.GenerateResult`
- map backend streaming events into typed `sdk.StreamPart` values
- report usage, finish reasons, reasoning, tool calls, sources, and files when supported

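The shape of the contract listed above can be sketched as a Go interface. This is a standalone illustration, not the SDK's definition: only the six method names come from this document, while the parameter types, return types, and the `echoProvider` stub are assumptions made up for the example. Check `reference.md` for the real signatures.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Placeholder types: the real parameter and result types are defined by the SDK.
type GenerateParams struct{ Prompt string }

type GenerateResult struct {
	Text         string
	FinishReason string
}

type StreamPart interface{}

// Provider mirrors the six method names listed above; all signatures are assumed.
type Provider interface {
	Name() string
	ListModels(ctx context.Context) ([]string, error)
	Test(ctx context.Context) error
	TestModel(ctx context.Context, modelID string) error
	DoGenerate(ctx context.Context, params GenerateParams) (*GenerateResult, error)
	DoStream(ctx context.Context, params GenerateParams) (<-chan StreamPart, error)
}

// echoProvider is a toy backend that returns the prompt unchanged.
type echoProvider struct{}

func (echoProvider) Name() string { return "echo" }

func (echoProvider) ListModels(context.Context) ([]string, error) {
	return []string{"echo-1"}, nil
}

func (echoProvider) Test(context.Context) error { return nil }

func (echoProvider) TestModel(_ context.Context, modelID string) error {
	if modelID != "echo-1" {
		return errors.New("unknown model: " + modelID)
	}
	return nil
}

func (echoProvider) DoGenerate(_ context.Context, p GenerateParams) (*GenerateResult, error) {
	return &GenerateResult{Text: p.Prompt, FinishReason: "stop"}, nil
}

func (echoProvider) DoStream(_ context.Context, p GenerateParams) (<-chan StreamPart, error) {
	ch := make(chan StreamPart, 1)
	ch <- p.Prompt
	close(ch) // closing the channel signals the end of the stream
	return ch, nil
}

// Compile-time check that echoProvider satisfies the sketched contract.
var _ Provider = echoProvider{}

func main() {
	var p Provider = echoProvider{}
	res, err := p.DoGenerate(context.Background(), GenerateParams{Prompt: "hi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name(), res.Text, res.FinishReason)
}
```

The compile-time assertion is a cheap way to catch a provider that drifts from the contract when a method is renamed or removed.
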
### Embedding Providers

Embedding providers are separate from chat providers. Use `sdk.EmbeddingProvider` and return an `sdk.EmbeddingModel` via `EmbeddingModel(id)`.

When updating embeddings:

- keep `sdk.Embed` for single-string convenience
- keep `sdk.EmbedMany` for batched requests
- preserve `Usage.Tokens`
- only expose dimensions/task-type behavior when the backend supports it

### Tool Calling

Prefer `sdk.NewTool[T]` for new tool examples and integrations. It gives typed input and inferred JSON Schema.

Use these defaults unless the task requires something else:

- `WithToolChoice("auto")` for normal use
- `WithMaxSteps(0)` for inspection-only tool calls
- `WithMaxSteps(N)` for automatic execution loops
- `RequireApproval: true` only for sensitive side effects

When streaming with tools, ensure the implementation can emit:

- tool input construction parts
- tool execution parts
- progress updates
- denial/error events when applicable

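The typed-input idea behind `sdk.NewTool[T]` can be illustrated without the SDK. The sketch below uses only the standard library; `rawTool`, `newTypedTool`, and `weatherInput` are hypothetical names invented for this example, and schema inference is deliberately left out. It shows only how a generic wrapper can decode the model's JSON arguments into a typed struct before calling the handler.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawTool is a hypothetical untyped tool: it receives the model's JSON arguments.
type rawTool struct {
	Name    string
	Execute func(args json.RawMessage) (string, error)
}

// newTypedTool wraps a typed handler so the handler never touches raw JSON.
// T is inferred from the handler's parameter type.
func newTypedTool[T any](name string, handler func(T) (string, error)) rawTool {
	return rawTool{
		Name: name,
		Execute: func(args json.RawMessage) (string, error) {
			var input T
			if err := json.Unmarshal(args, &input); err != nil {
				return "", fmt.Errorf("decode %s input: %w", name, err)
			}
			return handler(input)
		},
	}
}

// weatherInput is an example input type for the demo tool.
type weatherInput struct {
	City string `json:"city"`
}

func main() {
	tool := newTypedTool("get_weather", func(in weatherInput) (string, error) {
		return "sunny in " + in.City, nil
	})

	out, err := tool.Execute(json.RawMessage(`{"city":"Tokyo"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping the decode step inside the wrapper means malformed arguments surface as an ordinary tool error rather than a panic inside the handler.
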
### MCP Tool Calling

Use MCP when the task needs remote tools exposed by an MCP server rather than locally implemented `Execute` handlers.

Default guidance:

- use `sdk.CreateMCPClient(ctx, &sdk.MCPClientConfig{...})`
- use `sdk.MCPTransportHTTP` for streamable HTTP MCP servers
- use `sdk.MCPTransportSSE` only when the server exposes legacy SSE transport
- for stdio, build the transport with the official MCP Go SDK and pass `Transport: ...`
- call `mcpClient.Tools(ctx)` and pass the result into `sdk.WithTools(...)`
- call `defer mcpClient.Close()` after successful creation

Important behavior:

- MCP tools become ordinary `sdk.Tool` values from the caller's perspective
- Twilight AI converts MCP `InputSchema` into `*jsonschema.Schema`
- MCP tool execution is delegated to `tools/call` on the remote server
- remote MCP text output becomes the tool result visible to the model

### Streaming

Twilight AI streaming is channel-first and type-safe. Prefer type switches over loosely typed event parsing.

Important expectations:

- `StreamText` returns `*sdk.StreamResult`
- `sr.Stream` must be consumed before relying on `sr.Steps` or `sr.Messages`
- `Text()` and `ToResult()` are the convenience paths when callers do not want manual event handling

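The type-switch style recommended for streaming can be sketched with stand-in types. Everything below is illustrative: `StreamPart`, `TextDelta`, `Finish`, `StreamError`, and `collect` are hypothetical names, not the SDK's event types. The sketch only shows how to drain a channel of typed parts to completion before reading any aggregate state.

```go
package main

import (
	"fmt"
	"strings"
)

// StreamPart is a hypothetical stand-in for the SDK's typed stream events.
type StreamPart interface{ isStreamPart() }

// TextDelta carries an incremental chunk of generated text (illustrative).
type TextDelta struct{ Text string }

// Finish marks the end of generation (illustrative).
type Finish struct{ Reason string }

// StreamError reports a failure in the middle of the stream (illustrative).
type StreamError struct{ Err error }

func (TextDelta) isStreamPart()   {}
func (Finish) isStreamPart()      {}
func (StreamError) isStreamPart() {}

// collect drains the channel until it is closed, dispatching on the concrete
// type of each part. It returns the joined text, the finish reason, and the
// first error seen, if any.
func collect(stream <-chan StreamPart) (string, string, error) {
	var sb strings.Builder
	var reason string
	var firstErr error
	for part := range stream {
		switch p := part.(type) {
		case TextDelta:
			sb.WriteString(p.Text)
		case Finish:
			reason = p.Reason
		case StreamError:
			if firstErr == nil {
				firstErr = p.Err
			}
		}
	}
	return sb.String(), reason, firstErr
}

func main() {
	stream := make(chan StreamPart, 3)
	stream <- TextDelta{Text: "Hello, "}
	stream <- TextDelta{Text: "world"}
	stream <- Finish{Reason: "stop"}
	close(stream)

	text, reason, err := collect(stream)
	fmt.Println(text, reason, err)
}
```

Ranging over the channel until it closes is what makes it safe to rely on aggregate state afterwards, which matches the expectation that `sr.Stream` is consumed before `sr.Steps` or `sr.Messages` are read.
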
### Messages And Results

Preserve the SDK message model and avoid backend-specific shapes leaking into public usage.

- user, assistant, system, and tool messages should stay in SDK types
- support rich parts where relevant: text, image, file, reasoning, tool call, tool result
- keep finish reason mapping aligned with SDK constants such as `stop`, `length`, `content-filter`, and `tool-calls`

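Finish-reason mapping is a small translation table at the provider boundary. The sketch below assumes an OpenAI-style Chat Completions backend, whose `finish_reason` values use underscores, and maps them onto the four SDK-side strings named above. The constant names and the `mapFinishReason` helper are hypothetical; only the string values `stop`, `length`, `content-filter`, and `tool-calls` come from this document.

```go
package main

import "fmt"

// SDK-side finish reasons as named in the guidance above.
// The Go constant names here are illustrative, not the SDK's identifiers.
const (
	finishStop          = "stop"
	finishLength        = "length"
	finishContentFilter = "content-filter"
	finishToolCalls     = "tool-calls"
)

// mapFinishReason translates an OpenAI-style finish_reason into the SDK-side
// value. The boolean is false when the backend value is not recognized, so
// the caller decides how to treat unknown reasons.
func mapFinishReason(backend string) (string, bool) {
	switch backend {
	case "stop":
		return finishStop, true
	case "length":
		return finishLength, true
	case "content_filter":
		return finishContentFilter, true
	case "tool_calls", "function_call":
		return finishToolCalls, true
	default:
		return "", false
	}
}

func main() {
	for _, raw := range []string{"stop", "content_filter", "tool_calls", "mystery"} {
		reason, ok := mapFinishReason(raw)
		fmt.Printf("%q -> %q (known=%t)\n", raw, reason, ok)
	}
}
```

Returning an explicit "known" flag avoids silently passing a backend-specific string through to public SDK results.
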
## Common Task Patterns

### Add A New Usage Example

Use this structure:

1. pick the correct provider package
2. create provider with explicit options
3. create model via `ChatModel` or `EmbeddingModel`
4. call the top-level `sdk` function
5. show minimal but idiomatic result handling

### Add Or Update A Provider Feature

Check all affected layers:

1. request mapping
2. non-streaming response mapping
3. streaming event mapping
4. finish-reason and usage mapping
5. reasoning/tool/source/file support if the backend exposes them
6. model discovery and provider health checks if endpoints exist

### Add A Custom Provider

Use the built-in providers as the template. A custom provider should feel identical to existing ones from the caller's perspective.

Minimum behavior:

1. return a provider-bound model from `ChatModel`
2. implement discovery and health-check methods
3. support `DoGenerate`
4. support `DoStream` with correct lifecycle parts

## Documentation Rules

When writing Twilight AI docs or README content:

- prefer provider-agnostic phrasing first, provider-specific details second
- use Go examples, not pseudocode, unless explaining an interface contract
- keep examples small and runnable in spirit
- mention exact package paths for imports
- explain when to choose Completions vs Responses when OpenAI is involved
- keep embeddings, tool calling, and streaming as separate concerns unless the example truly combines them

## Terminology

Use these terms consistently:

- Provider: backend implementation for chat generation
- Embedding provider: backend implementation for embeddings
- Model: provider-bound chat model
- Embedding model: provider-bound embedding model
- Tool calling: model requests a tool invocation
- Multi-step execution: automatic tool loop controlled by `WithMaxSteps`
- Stream part: a typed event from `StreamText`

## Quick Checklist

Before finishing work in this repo, verify:

- the chosen provider package matches the intended backend capabilities
- chat and embedding concerns are not mixed accidentally
- public examples use top-level `sdk` APIs unless lower-level behavior is the point
- streaming logic uses typed `StreamPart` handling
- tool-calling changes cover both inspection mode and multi-step mode when relevant
- MCP examples show both transport setup and normal `WithTools(...)` usage when relevant
- provider work includes health checks or model discovery behavior if the backend supports them

## Additional Resources

- For exported APIs, signatures, provider options, and stream/event types, see [reference.md](reference.md)