* feat: port hermes-agent session-search, osv-check, clarify, and SSRF guard

Adds four catch-up features from nousresearch/hermes-agent:

- `session_search`: FTS5 full-text search over stored messages for cross-conversation recall. New migration v21 introduces a `messages_fts` virtual table with triggers that keep it synced on INSERT/UPDATE/DELETE and backfills existing rows; the tool returns ranked snippets with chat metadata.
- `osv_check`: queries api.osv.dev for advisories across npm, PyPI, crates.io, RubyGems, Maven, NuGet, Packagist, Hex, Pub, and Go. Flags MAL-* malware advisories explicitly.
- `clarify`: structured multi-choice or open-ended question tool that delivers the question through the caller's channel and releases the turn so the next user message supplies the answer. Capped at 4 predefined choices plus an automatic "Other" option.
- SSRF pre-flight guard on `web_fetch`: new `block_private_ips` field on `web_fetch_url_validation` (default on) rejects loopback, link-local, private, CGNAT, unique-local IPv6, documentation, benchmarking, and cloud-metadata targets. Runs on the initial URL and every redirect hop.

Wired into `ToolRegistry::new` and, for read-only tools, into `ToolRegistry::new_sub_agent`. Generated docs updated to list 50 built-in tools (was 47).

Covered by unit tests for FTS search, SSRF ranges, and OSV ecosystem canonicalization; full `cargo clippy -D warnings` and `cargo test --all-targets` pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(session_search): scope default to caller's chat, gate cross-chat access

Previous default of searching all chats when chat_id was omitted leaked messages across DMs, groups, and channels on the same microclaw deployment — a caller in chat A could FTS5-search a snippet that only existed in chat B.

Changes:

- Register `session_search` in `should_inject_default_chat_id` so the runtime injects the caller's chat_id when missing.
- In the tool, explicit `chat_id` is gated by `authorize_chat_access` (same caller or control chat only).
- Add `all_chats: true` opt-in that only control chats may use; it drops the chat scope entirely for audit/admin workflows.
- Update tool description so the agent knows the scope semantics.
- Add four unit tests: cross-chat denial, all_chats denial for non-control, all_chats success for control, default-scope is caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add multimedia tool suite (image gen/vision/TTS/STT)

Four OpenAI-compatible multimedia tools, all disabled by default and opt-in per `media.<tool>.enabled`:

- `generate_image`: POST /v1/images/generations, saves PNG under <data_dir>/media/images/, delivers via channel send_attachment when supported. Supports b64_json + URL response shapes.
- `describe_image`: POST /v1/chat/completions with image content block. Accepts file paths inside working_dir, https:// URLs, or data: URIs; remote URLs are SSRF-checked then re-encoded as data: URIs so the provider always sees inline bytes.
- `text_to_speech`: POST /v1/audio/speech, saves audio to <data_dir>/media/audio/, delivers via channel send_attachment. Allowlists voices (alloy/echo/fable/...) and formats (mp3/opus/wav/...).
- `transcribe_audio`: POST /v1/audio/transcriptions as multipart/form-data. Accepts the same location forms as describe_image.

Shared `MediaClient` (microclaw-tools crate):

- Enforces SSRF guard on the configured base URL (prevents operator from pointing media traffic at loopback/private/metadata addresses)
- Redacts API keys from Debug output
- Resolves credentials in priority order: media.api_key (plaintext, discouraged) -> MICROCLAW_OPENAI_API_KEY -> OPENAI_API_KEY -> existing config.openai_api_key (for zero-config on existing deployments)

New config section `media` with per-tool knobs (model, default size/voice/format, language) and a shared `openai_base_url` override.
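For orientation, a minimal YAML-style sketch of what such a section could look like. Only the knob names already mentioned in this message (`media.<tool>.enabled`, per-tool model, `openai_base_url`, `media.api_key`, the size/voice/format defaults) are real; the exact key spellings and layout below are illustrative, not the shipped schema:

```
# Hypothetical layout; key names are illustrative.
media:
  openai_base_url: https://api.openai.com/v1   # shared override, SSRF-checked
  generate_image:
    enabled: true
    model: gpt-image-1
  describe_image:
    enabled: true
    model: gpt-4o-mini
  text_to_speech:
    enabled: true
    model: tts-1
  transcribe_audio:
    enabled: false          # every tool is opt-in; off by default
    model: whisper-1
```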
Defaults match OpenAI's current catalog (gpt-image-1, gpt-4o-mini, tts-1, whisper-1). Built-in tool count goes 50 -> 54. Schema unchanged.

5 new integration tests cover SSRF guard on base URL. Existing 949 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tools): insights — usage summary over trailing window

Aggregates llm_usage_logs and per-model breakdown into a markdown report. Scoped to caller's chat by default; all_chats=true requires control chat (same pattern as session_search).

Ported from hermes-agent's /insights [days] command, adapted to microclaw's existing usage-tracking schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(title): auto-generate session titles via LLM

New module `src/title_generator.rs` with `generate_and_save_title(config, db, chat_id)`. Loads the first ~8 messages of a chat (sessions.messages_json length >= 4), asks the configured LLM for a 3-8 word title, strips quotes/trailing punctuation, and writes it to sessions.label.

Also adds two small Database helpers:

- set_session_label(chat_id, label)
- get_session_label_and_length(chat_id) -> (Option<String>, usize)

Ported from hermes-agent's agent/title_generator.py. No automatic scheduler hook yet — callers (web UI, admin CLI, or a future cron task) invoke the function; the agent loop is never blocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cache): tool result cache (schema v22) + osv_check wiring

New `tool_result_cache` table (migration v22) keyed by SHA-256 of (tool_name + normalized input JSON). Auth-context fields are stripped from the key so identical requests from different callers dedupe. Default TTLs: web_fetch/search/osv/describe_image 15m-1h, session_search 60s.
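The keying scheme can be sketched in isolation. This is a std-only stand-in, not the shipped `cache_key`: the real table keys on SHA-256 of the normalized JSON, while this sketch hashes a sorted map with `DefaultHasher`, and the auth-field names are invented for illustration:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Field names assumed to carry caller identity; invented for illustration.
const AUTH_FIELDS: &[&str] = &["chat_id", "caller", "auth_token"];

// Deterministic key: auth fields stripped, remaining keys visited in
// sorted order (BTreeMap iteration order), hashed together with the
// tool name. The real code keys on SHA-256; DefaultHasher stands in
// here to stay std-only.
fn cache_key(tool_name: &str, input: &BTreeMap<String, String>) -> u64 {
    let mut h = DefaultHasher::new();
    tool_name.hash(&mut h);
    for (k, v) in input {
        if AUTH_FIELDS.contains(&k.as_str()) {
            continue; // different callers, same query => same key
        }
        k.hash(&mut h);
        v.hash(&mut h);
    }
    h.finish()
}

fn main() {
    let a: BTreeMap<String, String> = [
        ("package".to_string(), "serde".to_string()),
        ("chat_id".to_string(), "1".to_string()),
    ]
    .into_iter()
    .collect();
    let mut b = a.clone();
    b.insert("chat_id".to_string(), "2".to_string());
    // Identical requests from different callers dedupe to one key.
    assert_eq!(cache_key("osv_check", &a), cache_key("osv_check", &b));
    println!("ok");
}
```

Sorting before hashing is what makes the key order-insensitive, so `{"a":1,"b":2}` and `{"b":2,"a":1}` land on the same row.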
Helpers on microclaw-tools::tool_cache:

- cache_key(tool_name, &input)
- normalize_input_for_key (key-sorted, auth-stripped)
- default_ttls() catalog

Helpers on Database:

- get_cached_tool_result, put_cached_tool_result, prune_tool_result_cache

Wired into `osv_check` as a proof of concept — repeated queries on the same package/ecosystem are served from SQLite within the TTL. Other network tools can opt in the same way.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): redact module for PII / credential scrubbing

New microclaw-core::redact with `redact(&str) -> String` that replaces well-known credential and PII patterns:

- sk-* / sk-ant-* / sk-proj-* API keys
- "Bearer <token>" headers
- GitHub PATs (ghp_/gho_/ghu_/ghs_/ghr_)
- AWS access keys (AKIA*, ASIA*)
- Slack tokens (xox[baprs]-*)
- Google API keys (AIza*)
- api_key=... in JSON/YAML bodies
- Emails (masked user, domain kept for debugging)
- Phone numbers (E.164 / CN 11-digit)

Ported from hermes-agent's agent/redact.py. Compiled regexes are cached in a Lazy<Vec<...>>. 7 unit tests cover each branch plus a multi-secret case and a plain-text passthrough.

Module is opt-in: callers apply redact at boundaries that might emit sensitive data to logs or error messages. Wiring into the tracing subscriber is deliberately deferred to keep the diff small.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(web): robots.txt consultation for web_fetch

New module crates/microclaw-tools/src/website_policy.rs:

- parse_robots_txt(text, user_agent) with UA-specific and '*' fallbacks
- Longest-prefix match between Allow and Disallow (standard semantics)
- Crawl-Delay surfaced in CrawlHint so callers can pace requests
- Per-host cache (30min TTL, 500KB body cap)
- Fail-open on network errors, 4xx, 5xx

consult_robots(client, url, user_agent) -> CrawlHint returns {allowed, reason, crawl_delay_secs}. Ported from hermes-agent's tools/website_policy.py.
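The longest-prefix rule reduces to a few lines. A sketch of the standard semantics only, not the actual `parse_robots_txt` internals (which also handle UA group selection and Crawl-Delay):

```rust
// Longest-prefix match between Allow and Disallow rules for one UA group.
// Among all rules whose prefix matches the path, the longest wins;
// a tie goes to Allow, and no matching Disallow means fetchable.
fn robots_allowed(path: &str, allows: &[&str], disallows: &[&str]) -> bool {
    // Length of the longest rule that prefixes the path, 0 if none match.
    let best = |rules: &[&str]| {
        rules
            .iter()
            .filter(|r| !r.is_empty() && path.starts_with(**r))
            .map(|r| r.len())
            .max()
            .unwrap_or(0)
    };
    best(allows) >= best(disallows)
}

fn main() {
    // Group equivalent to:  Disallow: /private   Allow: /private/pub
    assert!(!robots_allowed("/private/x", &["/private/pub"], &["/private"]));
    assert!(robots_allowed("/private/pub/page", &["/private/pub"], &["/private"]));
    assert!(robots_allowed("/other", &["/private/pub"], &["/private"]));
    println!("ok");
}
```

Note the degenerate case falls out of the comparison: with no matching rule on either side both lengths are 0, so an unmentioned path is allowed, matching the module's fail-open posture.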
Integration with web_fetch left as a follow-up to keep the diff small; the module is a pure helper today.

6 unit tests cover empty/disallow/allow-overrides/ua-specific/crawl-delay/comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): reqwest redirect hook enforces SSRF on every hop

Adds url_safety::ssrf_redirect_policy(max_hops) that returns a reqwest::redirect::Policy re-validating each redirect target against check_url_private_ip. A blocked hop short-circuits the chain with a descriptive error; the normal limit kicks in at max_hops.

MediaClient now uses this policy so provider-side redirects or any third-party SDK that internally follows Location headers cannot slip traffic into loopback/private/metadata ranges.

web_fetch's manual redirect loop remains unchanged (it already validates each hop); the new policy gives the same guarantee for clients that can't do manual redirect handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tools): truncate oversized tool results to artifacts + fetch_artifact

Tool results above tool_result_truncation_threshold_chars (default 4000) now keep head + tail in the message history and spill the full body to a new tool_result_artifacts table with a TTL. The agent reads further into the body via the new fetch_artifact tool, which is scoped to the chat that produced the artifact.

Stops bash/web_fetch/read_file blasts from inflating every subsequent turn's prompt cost, while keeping the full body recoverable for the rest of the session.

* feat(memory): per-row TTL + recency decay for ranking

Adds an `expires_at` column to memories so the agent can mark time-bounded facts (e.g. "working from Tokyo this week") for auto-prune. write_memory accepts `ttl_days`; the reflector tick deletes anything past its expiry along with stale tool-result artifacts.
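The TTL mechanics above can be sketched as two pure helpers. Names and signatures are illustrative, not the actual write_memory or reflector code:

```rust
use std::time::{Duration, SystemTime};

// At write time: an optional ttl_days becomes an absolute expiry instant
// stored in the new expires_at column.
fn expiry(now: SystemTime, ttl_days: Option<u64>) -> Option<SystemTime> {
    ttl_days.map(|d| now + Duration::from_secs(d * 86_400))
}

// At reflector-tick time: rows past expiry are pruned; no TTL = durable.
fn is_expired(now: SystemTime, expires_at: Option<SystemTime>) -> bool {
    matches!(expires_at, Some(t) if t <= now)
}

fn main() {
    let day = Duration::from_secs(86_400);
    let now = SystemTime::UNIX_EPOCH + day * 1000;
    let e = expiry(now, Some(7)); // "working from Tokyo this week"
    assert!(!is_expired(now + day * 6, e)); // still in-window
    assert!(is_expired(now + day * 8, e)); // past expiry: pruned
    assert!(!is_expired(now + day * 9999, None)); // durable row, never pruned
    println!("ok");
}
```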
L1 ranking in build_db_memory_context now multiplies confidence by an exponential recency-decay (configurable half-life, default 30d) so stale EVENT/KNOWLEDGE rows fall behind durable ones. PROFILE memories are exempt — they describe the user, not transient state.

* feat(agent): per-tool duplicate-call circuit breaker

Tracks the last N (tool_name, args_hash) keys across iterations. When the same call would run for the (limit+1)th time inside the window, the agent loop short-circuits it with an error tool_result that nudges the model to change approach instead of repeating itself. Defaults to a 10-call window with a limit of 3; both knobs are configurable.

Distinct from the existing whole-turn-fingerprint streak guard (MAX_IDENTICAL_TOOL_USE_STREAK), which aborts the loop. The breaker is softer: only the offending call fails, so the model can self-correct in the same turn.

fetch_artifact is exempted because paginated reads of one artifact look like duplicates by design.

* feat(skills): structured tool trajectory for skill review

Replace the lossy "[sender]: content" message dump fed to the skill-review LLM with a step-numbered trajectory built from the structured Vec<Message> loaded from sessions.messages_json. Each tool_use block is rendered with its name + truncated JSON input; each tool_result with its head + error flag. Image blocks are dropped, oversized payloads are head-truncated with a "+N chars" suffix.

Also replace the messages.len() / 3 tool-call estimate with an exact count of tool_use blocks. Skips review when no session row exists rather than running on degraded data.

* feat(skills): success-signal filter before review LLM call

Skill review now consults a cheap heuristic (assess_success) before spending tokens on the review LLM.
Conversations are flagged Unlikely — and skipped — when:

- the duplicate-call circuit breaker fired during the turn
- the agent ran tool calls but emitted no closing text
- more than half of tool_results errored
- the closing assistant text contains apology/failure phrasing (English + Chinese)

Saves the LLM call on obvious failures and prevents codifying broken approaches as reusable skills.

* feat(skills): trigger skill review at end-of-turn instead of reflector tick

Replace the periodic reflector-driven review with an on-completion handoff: the agent loop enqueues chat_id to AppState.skill_review_queue right after persisting the final session, and a dedicated worker task drains the queue (deduping bursts) and runs the review pipeline.

Why this is better than the old path:

- reviews fire seconds after a turn ends, not up to reflector_interval_mins later, so context is fresh and feedback loops are tight
- each conversation is reviewed once per completion, not once per reflector tick (no more re-reviewing the same chat on every tick)
- the agent loop never blocks on review work; the queue is non-blocking and the worker runs out of band
- dedup batching collapses multiple enqueues for the same chat (e.g. rapid user turns) into a single review

Implementation: skill_review.rs gains run_skill_review, build_skill_review_channel, and spawn_skill_review_worker. AppState owns the SkillReviewQueue handle. Scheduler's reflector tick no longer initiates reviews.

* feat(skills): review can edit / patch existing skills, not just create

Replace the create-or-skip review verdict with a four-action enum ({"action": "create" | "edit" | "patch" | "none"}).
The review LLM now sees existing skills with descriptions + a mutability tag and chooses to:

- create: brand-new skill (version: 1)
- edit: full rewrite of an existing agent-created skill (version + 1)
- patch: single-occurrence find/replace inside agent-created skill (version + 1, ambiguous matches refused)
- none: no-op

Human-curated skills (source != "agent-created") are immutable from this path. Each agent-created skill carries a monotonic version counter in its frontmatter, surfaced in the apply-action log line.

Legacy {"create": true|false, ...} responses are still accepted as a transitional shape so older prompts and self-hosted models keep working. Frontmatter version line is updated in place by patch (preserving the rest of the YAML), or rewritten wholesale by create/edit.

* feat(skills): activation tracking + auto-archive of inactive skills

Track every successful activate_skill call to a new skill_activation_logs table (schema v25). The reflector tick walks the skills directory once per cycle and moves agent-created skills that haven't been activated within skill_archive_after_days (default 30) to <skills_dir>/.archived/<name>-<timestamp>/, where the discoverer can't see them but the move is reversible.

The archive policy is split into a pure decision rule (should_archive_skill) and the IO-only sweep (archive_inactive_agent_skills), so the policy is exhaustively unit-testable without mtime gymnastics.

Guards against false-positive archival:

- human-curated skills (source != "agent-created") are never touched
- freshly-written skills (mtime within threshold) are kept regardless of activation history, so a never-yet-activated new skill survives the next sweep
- threshold_days = 0 disables the sweep entirely

Also exposes Database::skill_activation_counts_since for the insights tool to consume in a follow-up.
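The guards above can be sketched as a single pure predicate. Parameter names and the exact signature are illustrative, not the shipped `should_archive_skill`:

```rust
use std::time::{Duration, SystemTime};

// Pure decision rule, separated from the IO sweep so it is testable
// without touching the filesystem. Signature is illustrative.
fn should_archive_skill(
    source: &str,                       // frontmatter `source` field
    last_activated: Option<SystemTime>, // from skill_activation_logs
    file_mtime: SystemTime,
    now: SystemTime,
    threshold_days: u64,
) -> bool {
    if threshold_days == 0 {
        return false; // sweep disabled entirely
    }
    if source != "agent-created" {
        return false; // human-curated skills are never touched
    }
    let threshold = Duration::from_secs(threshold_days * 86_400);
    // A freshly written skill survives even with no activations yet.
    if now.duration_since(file_mtime).unwrap_or_default() < threshold {
        return false;
    }
    match last_activated {
        Some(t) => now.duration_since(t).unwrap_or_default() >= threshold,
        None => true, // old and never activated
    }
}

fn main() {
    let day = Duration::from_secs(86_400);
    let now = SystemTime::UNIX_EPOCH + day * 365;
    let old = now - day * 90;
    // Stale agent skill, last activated 90 days ago: archived.
    assert!(should_archive_skill("agent-created", Some(old), old, now, 30));
    // Same history but human-curated: never archived.
    assert!(!should_archive_skill("curated", Some(old), old, now, 30));
    // Never activated but written yesterday: kept.
    assert!(!should_archive_skill("agent-created", None, now - day, now, 30));
    // threshold 0 disables the sweep.
    assert!(!should_archive_skill("agent-created", Some(old), old, now, 0));
    println!("ok");
}
```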
* feat(skills): retrieval-gated catalog — inline top-K hot matches by query

build_skills_catalog_for_query scores every skill's name+description against the current user query (keyword overlap with CJK n-gram support, reused from memory_service::tokenize_for_relevance) and splits the catalog into:

- Hot bucket (top skills_catalog_top_k matches with score > 0): full SKILL.md body inlined, capped at 1500 chars per skill.
- Cold bucket: name + truncated description only, with the standard "use activate_skill to load" hint.

Trades a bigger token slice for the most-relevant skills (so the agent has procedural knowledge inline and skips an activate_skill round-trip) against keeping the long tail cheap.

Falls back to the flat catalog when top_k = 0, the query is empty, or no skill scores > 0 — all the old behavior is preserved as the degenerate path. skills_catalog_top_k defaults to 3.

* feat(skills): enable autonomous skill review by default

Flip skill_review_min_tool_calls from 0 (disabled) to 5 — the whole end-of-turn skill review pipeline now runs out of the box, no config needed. The 5-tool-call threshold is the same minimum the prompt itself asks the review LLM to look for, so smaller turns still skip review and don't burn the LLM call.

Operators who don't want autonomous skill creation can still opt out by setting skill_review_min_tool_calls: 0.

Also fix the integration test minimal_config() to include the nine config fields added in the recent feature commits — it had drifted because cargo test --lib doesn't catch breakage in tests/.

* feat(security): media tools honor media.allowed_read_dirs allowlist

Vision and STT tools previously rejected any file path outside working_dir. That blocks legitimate setups where media lives on a mounted volume or shared cache directory.
Add an explicit allowlist parameter to load_bytes_from_location and thread media.allowed_read_dirs through describe_image and transcribe_audio so operators can opt in to extra roots without weakening the default guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(voice): cross-channel inbound voice transcription

Telegram already auto-transcribed `voice` messages, but Discord audio attachments and Feishu `audio` events were dropped or surfaced as a "not yet supported" placeholder, and Slack audio file uploads were ignored.

Hoist the STT dispatch + inbound formatting out of telegram into a shared `voice` module, then plug it into the Discord/Slack/Feishu attachment paths so every platform that receives audio routes it through the same OpenAI/local STT provider with a uniform `[voice message from <user>]: <text>` shape the agent already handles.

Web is unchanged — its frontend doesn't capture microphone input today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(context): project-level context files in system prompt

Hermes Agent surfaces a workspace-wide "Context Files" layer that sits above per-chat memory: facts that should shape every conversation in a deployment but are not personality (SOUL.md) and not user-curated recall. MicroClaw had no equivalent — operators had to inline such notes into SOUL.md or rely on the reflector to discover them.

Add a Project Context layer: load all `*.md` files (alphabetical) from `<data_dir>/context/` plus `<runtime_data_dir>/groups/<chat_id>/context/` for chat-scoped overrides, concatenate them, cap at `context_max_chars` (default 8000), and inject into the system prompt between the identity preamble and the dynamic Memories section so the prefix stays cache-friendly.

Path is overridable via `context_dir`; setting `context_max_chars: 0` disables the layer entirely.
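The load-concatenate-cap behavior can be sketched with std alone. This is an illustrative stand-in, not the actual loader (which also merges the per-chat override directory):

```rust
use std::fs;
use std::path::Path;

// Illustrative loader: concatenate all `*.md` files from one context dir
// in alphabetical order, then cap the result. max_chars == 0 disables
// the layer entirely.
fn load_context_layer(dir: &Path, max_chars: usize) -> String {
    if max_chars == 0 {
        return String::new();
    }
    let mut files: Vec<_> = fs::read_dir(dir)
        .into_iter()
        .flatten()
        .flatten()
        .map(|e| e.path())
        .filter(|p| p.extension().is_some_and(|e| e == "md"))
        .collect();
    files.sort(); // alphabetical, so layering order is deterministic
    let mut out = String::new();
    for f in files {
        if let Ok(body) = fs::read_to_string(&f) {
            out.push_str(&body);
            out.push('\n');
        }
    }
    out.chars().take(max_chars).collect()
}

fn main() {
    let dir = std::env::temp_dir().join("ctx_demo");
    fs::create_dir_all(&dir).unwrap();
    fs::write(dir.join("b.md"), "second").unwrap();
    fs::write(dir.join("a.md"), "first\n").unwrap();
    let ctx = load_context_layer(&dir, 8000);
    assert!(ctx.starts_with("first")); // a.md sorts before b.md
    assert_eq!(load_context_layer(&dir, 0), ""); // layer disabled
    println!("ok");
}
```

A missing directory simply yields an empty layer (the `read_dir` error is flattened away), which matches the opt-in nature of the feature.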
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): bash command-content gate + readable approval preview

The HITL approval flow already paused high-risk tools waiting for operator confirmation, but the existing gate had two gaps relative to Hermes-style command approval:

1. Approval was bound to the *tool name* and the *chat type* — bash running in a non-control chat slipped through with no inspection of what command was about to execute.
2. The "waiting for confirmation" message only named the tool, leaving the operator to scroll back through tool-call JSON to see the actual command before approving.

Add `bash_dangerous_patterns` (case-insensitive regex list, with a curated default covering rm -rf /, pipe-to-shell installers, sudo, dd, forkbombs, mkfs, recursive chmod/chown of root) compiled into the BashTool. When a command matches, bash returns `approval_required` even outside control chats; the existing auto-retry path then handles re-execution after the operator approves.

Capture the bash command (or truncated input JSON for future high-risk tools) into `waiting_approval_preview` so the pause message includes a fenced code block of exactly what the agent intends to run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skills): agentskills.io spec compatibility

The SKILL.md parser already understood `name` and `description` plus MicroClaw's nested `compatibility.{os,deps}` form, but the agentskills.io standard is now adopted across Claude, Claude Code, OpenCode, Codex, Cursor, Goose, and a growing list of clients. Skills authored against that spec used three fields the parser silently dropped — `license`, `compatibility` (flat string form), and `allowed-tools` — and skill names with characters allowed by MicroClaw but disallowed by the spec couldn't round-trip to other clients.
Make the parser accept the spec's flat fields alongside MicroClaw's nested forms (untagged enum on `compatibility`), surface the new fields on SkillMetadata for downstream consumers, and tighten the skill_manage create/edit name validator to the spec's character rules (lowercase a-z, digits, hyphens; no leading/trailing/consecutive hyphens; ≤64 chars).

Pre-existing skills with underscores or uppercase names still load — the stricter rule only applies to skills the agent creates from now on, so they're portable to other clients out of the box.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(memory): per-chat USER.md user-model layer

Hermes splits a single curated USER.md narrative from the bag of atomic memories so the agent always has a coherent description of who the user is, regardless of which atomic facts happen to rank high for the current query. MicroClaw had PROFILE-category memories injected at L0, but those still arrive as fragmented rows that compete for budget and can contradict each other as they accumulate.

Add a curated user-model layer:

- New `<runtime_data_dir>/groups/<channel>/<chat_id>/USER.md` per chat, with read/write helpers on MemoryManager.
- `load_user_model` reads the file with a `user_model_max_chars` cap (default 1500, matching Hermes); returns None when the layer is disabled (`user_model_max_chars: 0`) or the file is missing.
- System prompt grows a `# User Model` section between the soul/identity preamble and Project Context, so the user model anchors the prefix-cache prefix above query-driven memory ranking.
- Reflector ends each `reflect_for_chat` with `curate_user_model_for_chat`, which calls a small dedicated LLM with the current USER.md + the chat's PROFILE memories + a recent conversation excerpt and rewrites the file.
The curator is gated by `user_model_curation_due` so it amortizes across `reflector_interval_mins * user_model_curation_interval` (default ~3 reflector ticks) instead of firing every tick.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(reflector): fold USER.md curation into single LLM call

The previous USER.md commit added a second LLM round trip per reflector tick to curate the user model. That doubled reflector cost for chats where PROFILE memories were extracted.

Extend the existing reflector JSON output schema with an optional user_model field, include the current USER.md in the reflector's user message, and persist whatever the LLM returns (or null when no rewrite is needed). Drop the standalone curator function and the user_model_curation_interval knob — the LLM itself decides when a rewrite is warranted by emitting null, so the per-tick amortization gate is no longer load-bearing.

Also tighten the response parser: legacy top-level arrays previously fell through into the embedded-object scan, which would silently match the first array element's braces and drop the rest. Branch on the parsed JSON type so arrays take the legacy memories path unambiguously.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(context): per-chat context dir uses channel/chat_id layout

The project-context loader I added earlier wrote per-chat overlays at runtime/groups/<chat_id>/context/, but every other per-chat artifact (AGENTS.md, USER.md, soul overrides) lives at runtime/groups/<channel>/<chat_id>/. That mismatch forces operators to remember two layouts and would silently pick up the wrong overlay when the same numeric chat_id appeared on different channels.
Thread caller_channel through load_project_context and join it before the chat_id segment so context directories sit alongside the other per-chat files, with a regression test that confirms a chat scoped to telegram doesn't leak its overlay to a discord chat with the same id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skills): surface agentskills.io fields + warn on legacy names

The previous compat commit started parsing license / compatibility / allowed-tools but those fields stayed invisible to operators, and skill_review's name validator still accepted underscores while skill_manage's already enforced the spec — so reviewer-proposed skills could land with names that round-tripped fine inside MicroClaw but broke the moment they were published to other Agent Skills clients.

- Skill listings (`microclaw skill list` / available output) now indent license, compatibility (string form), and allowed-tools beneath each available skill so operators can audit declared metadata at a glance.
- Discovery emits a one-shot warn per non-spec-compliant skill name — silent for legacy installs that already passed loading, but loud enough to nudge a rename before publication.
- skill_review's reviewer delegates to validate_agentskills_name so proposed names match the same rule skill_manage enforces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor): diagnostics for context cap, USER.md, bash patterns

The new system-prompt layers (project context, USER.md) and the bash command-content gate landed without doctor coverage, so a misconfigured deployment — context_max_chars=0, user_model_max_chars=0, or an invalid regex in bash_dangerous_patterns — would silently degrade behavior with no preflight signal.

Add three checks to `microclaw doctor`:

- `context.max_chars` warns when 0 (layer disabled) or absurdly large (prefix-cache risk).
- `user_model.max_chars` warns when 0 or above the curation budget Hermes treats as the upper bound.
- `bash.dangerous_patterns` compiles each entry and FAILs loudly when any regex is invalid — the runtime currently swallows compile errors, which leaves the gate weaker than operators expect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): redact PII before writes to USER.md and memory rows

The redact module already scrubs OpenAI/Anthropic/GitHub/AWS/Slack/Google keys, bearer tokens, and emails for log output, but it wasn't on the write path for any persisted memory artifact. Reflector-extracted memories quote conversation content verbatim, so a user pasting an API key into chat would land that key in long-lived storage and the embedding store, and the new USER.md curator could verbatim-quote the same secrets into a per-chat narrative file.

Apply `redact::redact` at three boundaries:

- MemoryManager::write_chat_user_model — USER.md content gets scrubbed before hitting disk.
- MemoryManager::{write_global,write_chat,write_bot}_memory — same treatment for AGENTS.md narrative files.
- memory_service::apply_reflector_extractions — DB memory rows get scrubbed after normalization but before topic-key dedup and insertion, so neither the topic key nor the embedding payload sees the secret.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(commands): /user shows and clears per-chat USER.md

USER.md is curated by the reflector and silently injected into every system prompt for the chat, but operators had no way to inspect what the curator had decided about them or to nudge it back to a clean slate when the narrative drifted.

Add a slash command:

- `/user` prints the current USER.md with a `(used/cap chars)` header so the operator can see how much room is left, or a friendly hint when the file is empty.
- `/user clear` removes the file via a new `MemoryManager::clear_chat_user_model` helper; the reflector rebuilds it on its next tick.
- Anything else after `/user` falls through to a one-line usage hint.

The handler is a free function so the command logic doesn't need a full AppState fixture to test — the storage helper that does the on-disk work has its own unit test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(voice): outbound TTS round-trip for voice-inbound turns

Inbound voice was already transcribed across Telegram/Discord/Slack/Feishu in an earlier commit, but the bot's reply always came back as text — a jarring asymmetry for users on a phone who expected to listen to the response on the same surface they spoke into.

Add an opt-in `voice_round_trip` config flag that, when paired with `media.tts.enabled`, renders the reply text as audio via the existing OpenAI-compatible /audio/speech endpoint and ships it back through the channel:

- New `voice::synth_speech_to_temp` and `voice::round_trip_enabled` helpers so each channel pulls in two thin wrappers instead of fabricating a tool-input shape just to play back text.
- Telegram tracks `voice_inbound`, then uses `bot.send_voice` so the client renders the reply as a native voice bubble.
- Discord tracks `voice_inbound`, then attaches the audio file to a follow-up message via serenity's CreateMessage builder.
- Slack threads `voice_inbound` through the audio-injection path and uploads the synthesized reply via files.upload (mirroring how the SlackAdapter delivers attachments).
- Feishu deferred — its audio message type requires a separate resource-upload + tenant-token round trip.

Defaults to false because each round-trip burns one extra TTS call; the operator must explicitly opt in alongside `media.tts.enabled`.
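The gate reduces to a three-way conjunction. A hypothetical mirror of `round_trip_enabled` (the struct shapes here are invented; only the flag names come from this commit):

```rust
// Invented config shapes for illustration; only the flag names
// (voice_round_trip, media.tts.enabled, voice_inbound) are real.
struct MediaConfig { tts_enabled: bool }
struct Config { voice_round_trip: bool, media: MediaConfig }

// Outbound TTS fires only when the operator opted in, the TTS tool is
// enabled, and the inbound message actually arrived as voice.
fn round_trip_enabled(cfg: &Config, voice_inbound: bool) -> bool {
    cfg.voice_round_trip && cfg.media.tts_enabled && voice_inbound
}

fn main() {
    let on = Config { voice_round_trip: true, media: MediaConfig { tts_enabled: true } };
    let off = Config { voice_round_trip: false, media: MediaConfig { tts_enabled: true } };
    assert!(round_trip_enabled(&on, true));
    assert!(!round_trip_enabled(&on, false)); // text-inbound turns stay text
    assert!(!round_trip_enabled(&off, true)); // default: no extra TTS call
    println!("ok");
}
```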
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): clippy, audit, docs, and test-compile fixes for #335

- agent_engine.rs: allow clippy::too_many_arguments on build_system_prompt (now 9 args after user_model + project_context); matches existing usage in setup.rs / slack.rs / feishu.rs / db.rs.
- doctor.rs: collapse two if_same_then_else branches in context/USER.md cap checks (warn on 0 OR over-cap) — same status, single arm.
- Cargo.lock: bump rustls-webpki 0.103.10 -> 0.103.13 to clear RUSTSEC-2026-0104 (reachable panic in CRL parsing).
- tests/config_validation.rs: add bash_dangerous_patterns, context_dir, context_max_chars, voice_round_trip, user_model_max_chars to the minimal_config helper. The fields were added in earlier commits on the branch but the test wasn't updated, breaking Rust and Coverage CI on all platforms.
- docs/generated/config-defaults.md: regenerate to match the new fields so the docs --check gate passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(providers): add Xiaomi MiMo preset (MiMo-V2.5-Pro / MiMo-V2.5)

Xiaomi exposes its MiMo line through an OpenAI-compatible endpoint at https://api.xiaomimimo.com/v1, so this is a one-row addition to PROVIDER_PRESETS — `provider_protocol`, `default_model_for_provider`, the setup picker, and the generated provider matrix all derive from that table.

Default model is MiMo-V2.5-Pro; MiMo-V2.5 is offered as the second option in the model picker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(providers): xiaomi MiMo model ids are lowercase

The /v1/models endpoint returns lowercase ids (`mimo-v2.5-pro`, `mimo-v2.5`, `mimo-v2-pro`, `mimo-v2-omni`); the camel-case names I shipped in the previous commit ("MiMo-V2.5-Pro") got rejected as "Not supported model" when the setup wizard ran its model test.

Also expand the model list with mimo-v2-pro and mimo-v2-omni so the picker reflects the full non-TTS lineup.
TTS variants are excluded because microclaw's chat path doesn't drive them.

Verified against https://token-plan-cn.xiaomimimo.com/v1 (the coding plan gateway) — chat completion succeeds with model id mimo-v2.5-pro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(providers): add mimo-v2-flash to xiaomi preset

User confirmed mimo-v2-flash is a real model on the Xiaomi MiMo line even though /v1/models on the coding-plan gateway doesn't currently list it (likely tier-gated behind that endpoint). Added between mimo-v2-pro and mimo-v2-omni so the picker reflects pro -> flash -> omni from heaviest to lightest within the v2 generation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(read_file): clamp offset to line count to avoid slice panic

When the agent passed an offset beyond the file's line count, the slice `lines[offset..end]` panicked with "range start index N out of range for slice of length M" because end was clamped to lines.len() but offset was not — leaving offset > end. The panic surfaced from inside a tokio worker, which is bad: a malformed tool input shouldn't crash the runtime.

Clamp offset to lines.len() so an out-of-range offset yields an empty (offset..offset) slice, and switch offset+limit to saturating_add to be safe against pathological inputs. Added a regression test that reproduces the original panic (offset=369 on a 3-line file).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(reflector): downgrade benign no-op responses, log preview on real failures

When the reflector LLM declined to extract anything (returning empty, "null", "[]", "{}", "none", or a short refusal), the parser fell through all four JSON strategies and logged ERROR — flooding the log with what is actually a benign "nothing to update" signal. Operators saw repeated "parse failed for chat ...: no valid JSON found" errors even though the runtime was behaving correctly.
Distinguish the two cases now:

- Empty / explicit-no-op shapes (length < 16, or matching the common refusal tokens) log at info — the model just had nothing to say.
- Anything else still logs at warn (downgraded from error) and includes a 200-char response preview, so when the prompt schema drifts or a provider misbehaves we can see the actual payload without rerunning with LLM debug streams enabled.

Added two regression tests covering both branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(providers): align preset list with OpenClaw + sort A→Z

Add 17 missing OpenClaw-supported LLM providers and sort the entire PROVIDER_PRESETS table alphabetically by id. The setup wizard's preset picker and the generated provider matrix both mirror this order verbatim, so users now see a predictable A→Z list.

New providers (all OpenAI-compatible):

- arcee, cerebras, cloudflare-ai-gateway, deepinfra, fireworks, groq, inferrs, kilocode, litellm, lmstudio, qianfan, sglang, stepfun, venice, vercel-ai-gateway, vllm, volcengine

Skipped:

- Pure-multimedia providers (azure-speech, comfy, deepgram, elevenlabs, fal, gradium, inworld, runway, senseaudio, vydra) — microclaw routes multimedia through `media.*` tools, not a separate provider concept.
- glm / zai — already covered by the existing `zhipu` preset (label is "Zhipu AI (GLM / Z.AI)").
- qwen — already covered by `aliyun-bailian` and `alibaba`.
- opencode / opencode-go — OpenClaw-internal catalogs.
- github-copilot — needs an OAuth/token-exchange flow that doesn't fit the simple preset shape.
- perplexity — a web-search plugin, not an LLM provider.
- bedrock-mantle — variant of bedrock; the existing `bedrock` entry already covers the OpenAI-compat surface.
- claude-max-api-proxy — community proxy; the existing `custom` preset is the right shape for any OpenAI-compat localhost endpoint.

Added a regression test enforcing the A→Z invariant so future additions don't drift.
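The benign-no-op classification in the reflector fix above could look roughly like this. The function names and the exact token set are assumptions drawn only from the commit text, not the real reflector code:

```rust
// Hypothetical classifier for the reflector fix: short or explicitly-empty
// responses are a benign "nothing to update" signal (log at info); anything
// else is a real parse failure (log at warn with a 200-char preview).
fn is_benign_noop(response: &str) -> bool {
    let t = response.trim().to_lowercase();
    // length < 16 already covers "", "null", "[]", "{}", "none", and most
    // short refusals; the explicit match documents the known shapes.
    t.len() < 16 || matches!(t.as_str(), "null" | "[]" | "{}" | "none")
}

// 200-char preview for the warn path, cut on a char boundary so multi-byte
// UTF-8 payloads don't panic the logger.
fn preview(response: &str) -> &str {
    match response.char_indices().nth(200) {
        Some((i, _)) => &response[..i],
        None => response,
    }
}

fn main() {
    assert!(is_benign_noop("null"));
    assert!(is_benign_noop("  [] "));
    assert!(!is_benign_noop("Unexpected provider error: upstream 502 while extracting"));
    assert_eq!(preview("short payload"), "short payload");
    println!("ok");
}
```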
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(readme): surface hermes-era features and new tools

Update Features and Tools sections to reflect what shipped on this branch — per-chat USER.md user model, cross-channel voice, multimedia suite, defensive web_fetch defaults, tool-result truncation + fetch_artifact, skill lifecycle, plus the nine new built-in tools (session_search, clarify, osv_check, insights, fetch_artifact, generate_image, describe_image, text_to_speech, transcribe_audio).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated Built-in Tools
This file is generated by scripts/generate_docs_artifacts.mjs. Do not edit manually.
Total built-in tools: 56
a2a_list_peers, a2a_send, activate_skill, bash, browser, calculate, cancel_scheduled_task, clarify, compare_time, describe_image, edit_file, export_chat, fetch_artifact, generate_image, get_current_time, get_task_history, glob, grep, insights, knowledge_graph_add, knowledge_graph_query, list_scheduled_task_dlq, list_scheduled_tasks, osv_check, pause_scheduled_task, read_file, read_memory, replay_scheduled_task_dlq, resume_scheduled_task, schedule_task, send_message, session_search, sessions_spawn, skill_manage, structured_memory_delete, structured_memory_search, structured_memory_update, subagents_focus, subagents_focused, subagents_info, subagents_kill, subagents_list, subagents_log, subagents_orchestrate, subagents_retry_announces, subagents_send, subagents_unfocus, sync_skills, text_to_speech, todo_read, todo_write, transcribe_audio, web_fetch, web_search, write_file, write_memory