Changelog Archive: v0.6.x Series¶
[v0.6.39] - 2025-09-17¶
CI Bump¶
[v0.6.38] - 2025-09-17¶
Added¶
- List input resolver service shared by CLI and MCP (resolveNameToPatternIfLocalFile) to detect when users pass a local file path and normalize it to an exact path pattern for the list tool.
- CLI list: when --name points to a local file, prints a concise line-based diff against the indexed content after the listing (non-disruptive; no mutations).
- MCP list tool: new opt-in flag include_diff to attach a structured diff block per matching document when name is a local file. Response annotates local_input_file for traceability.
Changed¶
- The hybrid search engine now defaults to Reciprocal Rank Fusion (RRF) instead of a linear combination of weights. This provides more robust and balanced search results out-of-the-box and simplifies tuning.
- Build/Deps: Bumped Tracy profiler to 0.12.2 (from 0.12.1); enabled by default in yams-debug preset.
- List command parity: CLI and MCP now share the same local-file detection and path normalization for list, reducing surprises between surfaces.
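Reciprocal Rank Fusion scores each document by summing 1/(k + rank) over every ranked list it appears in, so no per-source weights need tuning. The sketch below is only an illustration of that formula, not the engine's code: the function name, the constant k = 60 (a commonly used default) and the sample result lists are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical keyword and vector result lists.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = rrf_fuse([keyword_hits, vector_hits])
```

Documents that appear in both lists ("b" and "c") rise above those found by only one source, even though "a" was the top keyword hit.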
Connection Handling¶
- SocketServer: pre-accept backpressure now clamped to a very small delay (≤20ms) to avoid kernel backlog saturation and spurious ECONNREFUSED. Added overload counters to daemon stats (acceptBackpressureDelays, acceptCapacityDelays).
- RequestHandler: persistent connections are no longer closed on benign read timeouts; unified closure logs with explicit reasons (EOF, invalid frame, parse error, FSM not alive).
- Client transport/pool: clearer error messages that distinguish ECONNREFUSED from timeouts and include socket path with actionable hints.
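A minimal sketch of how a client might tell a refused connection from a timeout, assuming Python exception types stand in for the transport's error codes. The function name and the message wording are hypothetical, not the actual client output.

```python
import errno


def describe_connect_error(exc, socket_path):
    """Map a connect failure to a user-facing hint (wording is hypothetical)."""
    refused = isinstance(exc, ConnectionRefusedError) or (
        getattr(exc, "errno", None) == errno.ECONNREFUSED
    )
    if refused:
        return f"connection refused at {socket_path}: is the daemon running?"
    if isinstance(exc, TimeoutError):
        return f"timed out talking to {socket_path}: the daemon may be busy"
    return f"connection error at {socket_path}: {exc}"
```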
Fixed¶
- The add_directory tool now correctly validates that the input path is a directory and handles path normalization more robustly, fixing an issue where it would fail to index any files.
- Corrected the if condition syntax in the release.yml workflow to resolve a parsing error related to the secrets context.
[v0.6.37] - 2025-09-17¶
Hotfixes¶
- Daemon Stability: Fixed a critical race condition in the SHA256 hasher that could cause the daemon to crash with a SIGSEGV during concurrent file ingestion.
- Bounded IO pool (ipc_io) to prevent CPU spikes on initial activity:
  - Growth gate now requires either high mux backlog OR activeConn > ioThreads * ioConnPerThread.
  - Dynamic max: ipc_io.max = min([tuning].pool_io_max, TuneAdvisor::recommendedThreads(0.5)).
  - StatusResponse now includes ipcPoolSize and ioPoolSize for quick verification.
- Early tuner start: TuningManager now starts before sockets/services to avoid the race where the daemon accepted work without centralized tuning active.
- Tracy zones: added lightweight zones around TuningManager loop/tick, WorkerPool threads, and SocketServer accept/connection to profile remaining hotspots on demand.
- Portability: snapshot registries switched to atomic_load/store free functions for shared_ptr to fix libc++ build errors (no std::atomic<std::shared_ptr<...>>).
- PostIngestQueue:
  - Introduced dynamic worker resize with PoolManager(“post_ingest”) integration.
  - Implemented bounded queue with backpressure (default capacity 1000; env YAMS_POST_INGEST_QUEUE_MAX).
  - TuningManager now scales post_ingest up on backlog when idle and down when busy.
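The growth gate and dynamic maximum for the IO pool can be written as two small predicates. This is a sketch of the conditions as stated in the entry above; the function and parameter names are illustrative, and the recommended thread count is passed in rather than computed.

```python
def should_grow(mux_backlog_high, active_conn, io_threads, io_conn_per_thread):
    """Grow only on high mux backlog OR when connections exceed pool capacity."""
    return mux_backlog_high or active_conn > io_threads * io_conn_per_thread


def io_pool_max(pool_io_max, recommended_threads):
    """Upper bound for the pool: the smaller of the configured and advised limits."""
    return min(pool_io_max, recommended_threads)
```

With io_conn_per_thread = 16 and two IO threads, the pool stays at two threads until a 33rd connection arrives or the mux backlog is flagged as high.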
Updated Config guidance¶
- Suggested [tuning] while validating: pool_io_max = 4, io_conn_per_thread = 16, pool_cooldown_ms >= 750.
- Validate using ipcPoolSize/ioPoolSize in status; pools should grow gradually and shrink at idle.
- Capture a 5–10s trace with Tracy to identify any remaining hot loops.
[v0.6.36] - 2025-09-16¶
Fixed¶
- Adding homebrew formula as module
[v0.6.35] - 2025-09-16¶
Added¶
- CLI doctor: proactively triggers a plugin scan when neither typed providers nor plugins_json are available, then briefly waits so results appear. Matches yams plugin list behavior for faster diagnosis.
- CLI add: auto-detects directory paths and enables --recursive automatically (LLM/UX friendly) with an informational log.
Changed¶
- Doctor plugin listing now prefers daemon typed providers from status.providers and falls back to getStats().additionalStats["plugins_json"] for parity with yams plugin list. Retains macOS otool -L hint for degraded plugins.
- Directory ingestion: daemon dispatcher treats any directory path as directory ingestion (forces recursive=true) even if clients omit the flag.
Fixed¶
- ServiceManager: corrected placement of env_truthy() (was nested in another function) and added <cctype> include to fix macOS build failures.
- DocumentService: early, clear error when a directory path is sent to the single-file store() path (prevents file_size: Is a directory).
[v0.6.34] - 2025-09-16¶
Fixed¶
- CI improvements and fixes for release assets
- Fixing regression from LLD cmake linking
[v0.6.33] - 2025-09-16¶
Fixed¶
- MCP server
  - Improved download logic for retrieving content from retrieved files
- CI/CD
  - Sourcehut: Improving build packaging step
  - GitHub Actions: Improving test.yml configuration step
[v0.6.32] - 2025-09-16¶
Fixed¶
- Fixing linking errors seen on ubuntu 24.04
- CI/CD
  - Sourcehut: Fixing how packaging is exposed
  - GitHub Actions: Fixing test dependency setup and release package steps
[v0.6.31] - 2025-09-16¶
Fixed¶
- CI/CD
  - GitHub Actions: tests workflow YAML fixed (indentation of timeout-minutes under the job).
  - Docker workflow: ARM64 builds now use QEMU + Buildx on ubuntu-latest; native ubuntu-24.04-arm runners no longer required for ARM publishing. Multi-arch manifest unchanged.
[v0.6.30] - 2025-08-16¶
Fixed¶
- Fixed regressions in CLI / MCP UI performance from the introduction of the daemon
- ARM Build: Boost find_package failure resolved via corrected Conan toolchain path and added generator directory visibility during CMake configure.
Added¶
- Packaging: Introduced Homebrew formula templating (packaging/homebrew/yams.rb.template) with placeholder substitution for version + sha256 during stable releases.
- Release workflow: Generates and commits latest.json manifest plus Homebrew formula in a single pass; adds commit step guarded to stable channel.
- Homebrew: Added livecheck block for future tap/live version detection.
- Services
  - RetrievalService parity for CLI and MCP (get/list/grep) with shared facade.
  - DocumentIngestionService used by CLI add and MCP add for daemon‑first ingestion.
- Daemon: Native chunked get protocol handlers (GetInit/GetChunk/GetEnd) backed by in‑memory RetrievalSessionManager, enabling efficient large content retrieval and capped buffers.
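The init/chunk/end exchange can be pictured with a small in-memory session table. This is a sketch under assumptions, not the daemon's protocol: the class, method names, integer session ids and the 4-byte chunk cap are all made up for illustration.

```python
import itertools


class RetrievalSessions:
    """Toy in-memory session manager for a GetInit/GetChunk/GetEnd style flow."""

    def __init__(self, max_chunk=4):
        self._sessions = {}
        self._ids = itertools.count(1)
        self._max_chunk = max_chunk  # server-side cap on any single buffer

    def get_init(self, content):
        session_id = next(self._ids)
        self._sessions[session_id] = content
        return session_id, len(content)

    def get_chunk(self, session_id, offset, length):
        length = min(length, self._max_chunk)
        return self._sessions[session_id][offset:offset + length]

    def get_end(self, session_id):
        self._sessions.pop(session_id, None)


def fetch_all(manager, content):
    """Client side: open a session, pull capped chunks, always close the session."""
    session_id, total = manager.get_init(content)
    received = bytearray()
    try:
        while len(received) < total:
            received += manager.get_chunk(session_id, len(received), 1024)
    finally:
        manager.get_end(session_id)
    return bytes(received)
```

The client asks for 1024 bytes at a time, but the server never hands back more than its cap, which is what keeps buffers bounded for large documents.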
Changed¶
- CI/Build
  - Unpinned Conan across tests and release workflows (allow latest compatible Conan 2.x) while retaining reproducibility via profile + lock hashing.
  - Updated CMakePresets.json toolchain paths (removed incorrect nested build/Debug/build/Release segments) enabling correct Boost discovery on ARM.
  - Extended test job timeout to 50 min (ARM warm builds) and added Conan/Boost diagnostics aiding cache hit analysis.
  - Replaced heredoc inline formula generation with template-driven substitution reducing workflow maintenance surface and diff noise.
- CLI + MCP: Switched get/list/grep to RetrievalService; add paths to DocumentIngestionService to eliminate bespoke logic and timeouts/hangs divergences.
- CLI: Guarded cfg.dataDir assignment behind hasExplicitDataDir() throughout commands per refactor plan. Centralized DaemonClient data‑dir resolution in default constructor (env → config core.data_dir → XDG/HOME → ./yams_data).
- RetrievalService: getChunkedBuffer prefers the native chunked protocol and falls back to unary Get only for NotImplemented/Timeout/NetworkError. Improved name‑smart get to resolve via filename when hybrid search returns a path (macOS path aliasing resilience).
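The data-dir resolution order (env → config core.data_dir → XDG/HOME → ./yams_data) can be sketched as a simple fall-through. The environment variable names are taken from other entries in this changelog, and the exact XDG/HOME subdirectories are assumptions; the real constructor may differ.

```python
import posixpath


def resolve_data_dir(env, config):
    """Return the first data directory found, in priority order."""
    for name in ("YAMS_STORAGE", "YAMS_DATA_DIR"):  # assumed env names
        if env.get(name):
            return env[name]
    configured = config.get("core", {}).get("data_dir")
    if configured:
        return configured
    if env.get("XDG_DATA_HOME"):
        return posixpath.join(env["XDG_DATA_HOME"], "yams")  # assumed layout
    if env.get("HOME"):
        return posixpath.join(env["HOME"], ".local", "share", "yams")  # assumed layout
    return "./yams_data"
```

Passing the environment and parsed config in as plain dictionaries keeps the priority order easy to test without touching the real process environment.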
[v0.6.29] - 2025-09-14¶
CI Bump¶
- Fixing github actions by removing conan version pin and fixing test.yml
[v0.6.28] - 2025-09-14¶
CI Bump¶
- Fixing github actions macos key signing regression
[v0.6.27] - 2025-09-14¶
Changed¶
- CI/build: Nightly packaging auto-enables on refs/heads/main (no time-based scheduler needed). Improved repo-root detection (yams/ vs .) across all tasks, and broadened artifact globs to support both layouts.
- CI/build: Hardened packaging tasks: CPack RPM now guarded by rpmbuild availability and won’t fail the job if RPM tooling is missing; DEB packaging remains enabled. Added set -e to packaging steps for clearer early exits while keeping RPM optional.
- Knowledge Graph: Added unit tests for alias exact/fuzzy resolution, neighbor retrieval, and doc-entity roundtrips (kg_store_alias_and_entities_test.cpp).
- Search + KG: Added SimpleKGScorer end-to-end test validating entity and structural scores and explanations (kg_scorer_simple_test.cpp).
[v0.6.26]¶
CI Bump¶
- Sourcehut debian image updates for packaging on fedora
- Github CI improvements for testing and release builds
[v0.6.25] - 2025-09-14¶
CI Bump¶
- Fixing github actions cache and release naming
[v0.6.24] - 2025-09-14¶
CI Bump¶
- Fixing build matrix used in github actions to fix artifact uploading.
[v0.6.23] - 2025-09-14¶
CI Bump¶
- Increasing test timeouts and fixing packing errors with sourcehut builds
[v0.6.22] - 2025-09-14¶
CI Bump¶
- Fixing macos env error
[v0.6.21] - 2025-09-14¶
Fixes¶
- Fixing prefix for sourcehut builds
- Adding cache for Github CI
[v0.6.20] - 2025-09-13¶
Added¶
- Asynchronous post‑ingest pipeline (daemon): decouples heavy work from add/add_directory.
  - New PostIngestQueue performs extraction → full‑text index → knowledge graph upserts in the background.
  - Queue metrics (threads, queued, processed, failed) with EMAs for latency and throughput are exposed via status.
- Knowledge Graph enrichment beyond tags:
  - Lightweight analyzers extract URLs, emails, and optional file paths and add MENTIONS edges.
  - Caps and toggles are controlled via TuneAdvisor.
- Batch KG operations:
  - KnowledgeGraphStore::upsertNodes(std::vector<KGNode>) used in the post‑ingest worker to avoid read‑after‑write races.
  - New KnowledgeGraphStore::addEdgesUnique(std::vector<KGEdge>) batches edge inserts with on‑conflict de‑duplication on (src_node_id, dst_node_id, relation).
- TuneAdvisor controls (code‑level, no env):
  - kgBatchNodesEnabled, kgBatchEdgesEnabled (default true)
  - Analyzer toggles: analyzerUrls, analyzerEmails, analyzerFilePaths (file paths off by default)
  - maxEntitiesPerDoc (default 32)
- MCP observability and parity:
  - New status tool that returns daemon readiness and counters (includes post‑ingest and MCP worker pools).
  - New yams://status resource for symmetry with yams://stats.
  - Minimal doctor tool summarizes degraded subsystems and suggests actions.
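On-conflict de-duplication of batched edge inserts can be illustrated with SQLite's UNIQUE constraint and INSERT OR IGNORE. This is a sketch of the general technique, not the store's schema: the table name and the helper function are hypothetical, and only the key columns (src_node_id, dst_node_id, relation) come from the entry above.

```python
import sqlite3


def add_edges_unique(conn, edges):
    """Insert a batch of edges; rows that hit the unique key are skipped."""
    conn.executemany(
        "INSERT OR IGNORE INTO kg_edges (src_node_id, dst_node_id, relation) "
        "VALUES (?, ?, ?)",
        edges,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE kg_edges (
           src_node_id INTEGER NOT NULL,
           dst_node_id INTEGER NOT NULL,
           relation    TEXT    NOT NULL,
           UNIQUE (src_node_id, dst_node_id, relation)
       )"""
)
# The first batch contains an internal duplicate; the second repeats an existing edge.
add_edges_unique(conn, [(1, 2, "MENTIONS"), (1, 2, "MENTIONS"), (1, 3, "MENTIONS")])
add_edges_unique(conn, [(1, 2, "MENTIONS")])
edge_count = conn.execute("SELECT COUNT(*) FROM kg_edges").fetchone()[0]
```

Because the database enforces uniqueness, the worker does not need to read existing edges before writing, which is what removes the extra round-trips.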
Changed¶
- add and add_directory delegate extraction/index/graph work to the post‑ingest queue; DocumentService honors deferExtraction to keep adds fast.
- Post‑ingest worker now batches KG node upserts and uses unique batch edge inserts to reduce DB round‑trips.
- Vector/Readiness metrics mirrored into status.requestCounts for MCP/CLI parity.
Fixed¶
- Daemon readiness consistency across CLI/UI:
  - DaemonMetrics now derives boolean ready from the lifecycle state (authoritative), not deprecated fullyReady(). Prevents doctor showing “NOT READY, state=Ready”.
  - Normalized status strings to lowercase (“ready”, “degraded”, etc.) in the status path to match serializer/client expectations and avoid case-related derivation bugs.
  - Stats not_ready flag now reflects lifecycle readiness (Ready/Degraded → false) instead of always true, so yams stats won’t unnecessarily fall back to local mode.
  - Bootstrap status JSON (yams-daemon.status.json) now writes a lowercase overall field for consistency with IPC lifecycle strings.
- Stats daemon reachability:
  - Increased yams stats short-mode timeout from 1.5s → 5s and set explicit connect/header/body timeouts to avoid premature local fallback, aligning with status/doctor behavior.
  - not_ready heuristic is now lifecycle-driven (Ready/Degraded → false), so stats no longer assumes unready by default.
  - Temporary mitigation: yams stats currently forces local-only output while we investigate a daemon GetStats streaming/header path issue that can stall and cause slow fallbacks. The daemon path will be re-enabled once the bug is fixed.
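The lifecycle-driven flags described above amount to a single normalization step followed by two comparisons. In this sketch the function name and the returned dictionary shape are hypothetical, and treating ready as true only for the "ready" state is an assumption; the Ready/Degraded → not_ready = false mapping is the one stated in the entry.

```python
def derive_flags(lifecycle_state):
    """Derive readiness flags from one authoritative lifecycle string."""
    state = lifecycle_state.strip().lower()  # normalize case once, up front
    return {
        "overall": state,
        "ready": state == "ready",  # assumption: only the ready state counts
        "not_ready": state not in ("ready", "degraded"),
    }
```

Deriving both flags from the same normalized string is what rules out contradictory output such as “NOT READY, state=Ready”.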
[v0.6.19] - 2025-09-13¶
Fixed¶
- MCP stdio framing correctness and async send:
  - Fixed double-encoding on outbound async path: outboundDrainAsync now uses a framed string sender instead of passing a serialized JSON string to send(const json&).
  - Removed extra CRLF after framed payload; framed responses now send exactly Content-Length bytes after the blank line (LSP/MCP compliant).
  - Added StdioTransport::sendFramedSerialized(const std::string&) for pre-serialized payloads used by the outbound drain.
- MCP prompts/get schema correctness for MCP Inspector:
  - Roles now use only assistant and user (removed system).
  - content is a single object { "type": "text", "text": ... }, not an array.
- macOS/Linux build/link improvements:
  - Resolved yams-daemon undefined symbols on Linux by including IPC OBJECT files in the daemon static lib and linking yams_ipc_proto.
  - Fixed yams-mcp-server linking on macOS by avoiding IPC OBJECTs that pull daemon-only symbols; explicitly link IPC/client/proto libs instead.
Changed¶
- StdioTransport testability: in test builds (YAMS_TESTING), receive() uses cin.rdbuf()->in_avail() to support reliable stringstream-driven tests.
- Default server send now includes Content-Type: application/vscode-jsonrpc; charset=utf-8 and a trailing CRLF when using header framing to maximize client compatibility.
- ONNX plugin dependency hygiene: Removed unintended PDFium bundling logic from the ONNX plugin CMake (it never linked or required PDFium). Added a configure-time guard warning if pdfium::pdfium becomes linked accidentally to keep plugin responsibilities isolated. PDF extraction remains solely in the pdf_extractor plugin.
Added¶
- ONNX/GenAI path (vector library):
  - Scaffolded a minimal GenAI adapter in the vector layer and made GenAI the default when ONNX is enabled; falls back to raw ONNX Runtime when GenAI is unavailable.
Daemon & Plugins¶
- Plugin autoload diagnostics: adoption logs now include the plugin file path; environment YAMS_PLUGIN_DIR is prioritized over system directories to avoid stale installs.
[v0.6.18] - 2025-09-13¶
Hot fix¶
- CI Bump : CMake and conan update for
[v0.6.17] - 2025-09-13¶
Hot fix¶
- CI Bump : Resolving compile compatibility
Changed¶
- Build: Removed legacy -fcoroutines-ts flag for modern Clang toolchains (C++20 coroutines are enabled by default). Prevents build failures on distributions where the flag is rejected.
- Build: Fixed macOS Conan host profile parsing error by removing unsupported [env] section; libc++ enforcement now handled via workflow sanitization.
- Build: Added defensive scrub in CMakeLists.txt to strip any externally injected -fcoroutines-ts occurrences from cached flags.
[v0.6.16] - 2025-09-13¶
Hot fix¶
- CI Bump : Fixing syntax
[v0.6.15] - 2025-09-13¶
Hot fix¶
- CI Bump : Updating clang version in host.ninja for builds
[v0.6.14] - 2025-09-12¶
Added¶
- CLI Doctor:
  - yams doctor dedupe: detects (and optionally deletes) duplicate documents grouped by --mode (path|name|hash) with keep strategies (--strategy keep-newest|keep-oldest|keep-largest), dry-run by default, and safety flag --force for hash mismatches.
  - yams doctor embeddings clear-degraded: attempts to clear the embedding degraded state by loading a preferred/available model (interactive confirmation, best-effort).
- Status + CLI plugin diagnostics:
  - StatusResponse now carries typed providers and a skippedPlugins array (path + reason).
  - yams plugins list prefers typed providers and supports --verbose to print a “Skipped plugins” section with clear reasons (preflight/dlopen/symbols/name-policy).
- Config and init:
  - New [daemon].plugin_name_policy with values relaxed (default) or spec (require libyams_*_plugin.*).
  - examples/config.toml documents the field; yams init --enable-plugins now writes plugin_dir and sets plugin_name_policy = "spec" for canonical naming by default.
  - New [daemon].auto_repair_batch_size defaulted to 16; daemon applies it when starting the RepairCoordinator.
- macOS plugin robustness:
  - ONNX plugin CMake now bundles ONNX Runtime dylibs (existing) and PDFium dylib into local subdirs and injects @loader_path/onnxruntime and @loader_path/pdfium into rpaths. SIP‑safe, no DYLD tweaks.
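The grouping and keep strategies behind a dedupe pass can be sketched as a dry-run planner that only reports what would be removed. The record fields (path, name, hash, mtime, size) and the function name are hypothetical; only the mode and strategy values come from the entry above.

```python
from collections import defaultdict

# One selector per keep strategy; each picks the record to keep from a group.
KEEPERS = {
    "keep-newest": lambda group: max(group, key=lambda d: d["mtime"]),
    "keep-oldest": lambda group: min(group, key=lambda d: d["mtime"]),
    "keep-largest": lambda group: max(group, key=lambda d: d["size"]),
}


def plan_dedupe(docs, mode="hash", strategy="keep-newest"):
    """Dry run: return the documents that would be deleted, deleting nothing."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[mode]].append(doc)  # mode is one of path, name, hash
    to_delete = []
    for group in groups.values():
        if len(group) > 1:
            keep = KEEPERS[strategy](group)
            to_delete.extend(d for d in group if d is not keep)
    return to_delete


docs = [
    {"path": "/a", "name": "a.txt", "hash": "h1", "mtime": 1, "size": 10},
    {"path": "/b", "name": "a.txt", "hash": "h1", "mtime": 2, "size": 5},
    {"path": "/c", "name": "c.txt", "hash": "h2", "mtime": 3, "size": 7},
]
```

Returning a plan rather than deleting matches the dry-run-by-default behavior: the caller decides whether to act on the list.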
Changed¶
- Safer ABI plugin scanning:
  - scanTarget() no longer dlopen()s candidates (prevents constructor crashes); macOS dlopen_preflight() is performed in load() and failures are handled gracefully.
  - Filename filter accepts libyams_*_plugin.* and yams_*_plugin.*; strict policy can be enforced via config/env.
- CLI yams plugins list auto‑starts the daemon, triggers a scan, waits briefly for readiness, and prints typed providers. Helpful guidance is printed if no info found.
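The two naming policies can be expressed with shell-style globs: relaxed accepts both filename patterns, spec accepts only the lib-prefixed one. A minimal sketch, assuming the function name and the idea of passing the policy as a string; the patterns themselves are the ones listed in this release.

```python
from fnmatch import fnmatchcase


def accepts_plugin(filename, policy="relaxed"):
    """Return True when a plugin filename passes the naming policy."""
    if fnmatchcase(filename, "libyams_*_plugin.*"):
        return True  # canonical name, accepted under both policies
    return policy == "relaxed" and fnmatchcase(filename, "yams_*_plugin.*")
```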
Fixed¶
- macOS crash on plugin scan/load:
  - Eliminated segfaults by avoiding dlopen() during scan; added preflight + robust error logging in load path; fixed dlerror() usage (read once, avoid null concatenation).
- Repair throttling and backpressure:
  - RepairCoordinator honors config batch size and TuneAdvisor env knobs (YAMS_REPAIR_MAX_BATCH, YAMS_REPAIR_TOKENS_IDLE/BUSY, YAMS_REPAIR_MAX_BATCHES_PER_SEC, etc.) and pauses token issuance when hitting per‑second caps to keep the daemon responsive.
- Linux build/linking:
  - yams-daemon now links explicitly against yams::daemon_ipc and yams::daemon_client, resolving undefined symbol errors (ConnectionFsm/RequestHandler/ResourceTuner) on Ubuntu/ELF toolchains.
- Test/CI stability:
  - run_sync rewritten to use a promise + co_spawn(detached) and co_await std::move(aw) to avoid move/copy pitfalls with Boost.Asio awaitables.
- Examples/config:
  - examples/config.toml updated with plugin_name_policy and daemon plugin notes; defaults align with init.
Notes¶
- If your macOS ONNX plugin previously failed with libpdfium.dylib not found, rebuild/reinstall the plugin; the new CMake will bundle PDFium into a local pdfium/ folder and adjust rpaths to @loader_path/pdfium.
- Background repair can be tuned at runtime via environment without restarting; see the TuneAdvisor variables above for recommended settings during heavy foreground load.
[v0.6.13] - 2025-09-11¶
Hotfixes¶
- Build stabilizations and improvements from MacOS testing
[v0.6.12] - 2025-09-11¶
Added¶
- CLI: yams graph: read‑only graph viewer that mirrors get --graph.
  - Usage: yams graph <hash> [--depth 1-5] or yams graph --name "<path>" [--depth N].
  - Shows related documents via the knowledge graph without fetching content.
- Config: added [daemon] section to examples/config.toml with enable = true, auto_load_plugins = true, optional plugin_dir, and notes about YAMS_PLUGIN_DIR discovery.
- CLI/build: restored Hot/Cold mode helpers and include wiring for list/grep/retrieval.
- yams doctor
  - Yams plugin and daemon health are now under one command
  - Doctor checks for vector DB vs model embedding dimension mismatch with clear fix steps.
  - Doctor warnings for missing companion files (config.json, tokenizer.json) next to model.onnx.
- Plugin System
  - Generic C‑ABI plugin loader with trust policy (scan/load/unload; trust list/add/remove)
  - IPC handlers for plugin scan/load/unload/trust; CLI yams plugin supports list/info/scan/load/unload/trust
  - DR CLI gating: yams dr subcommands (status/agent/help) now require plugins and print a guidance message; see docs/PLUGINS.md
  - Spec docs: docs/spec/plugin_spec.md, WIT drafts (object_storage_v1.wit, dr_provider_v1.wit), JSON Schemas (manifest/object_storage_v1/dr_provider_v1)
  - External examples scaffold under docs/examples/plugins and guide at docs/guide/external_dr_plugins.md for out‑of‑tree GPL plugins (S3/R2)
- CLI model tooling
  - yams model provider: shows active model provider status (loaded models) and the preferred model from config for transparency.
- Daemon plugin UX
  - Autoload from trusted + standard directories on startup (disable with YAMS_AUTOLOAD_PLUGINS=0).
  - Trust‐add autoload: newly trusted directories/files are scanned and loaded immediately (best‑effort).
  - Load by name: resolves logical names by scanning plugin descriptors across default dirs (e.g., yams plugin load onnx).
  - External plugin host (metadata only): supports .yams-plugin.json layouts for non‑ABI external integrations with trust and listing.
- Embedding provider safeguard
  - When a plugin model_provider_v1 is adopted, the daemon auto‑selects embeddings.preferred_model = "all-MiniLM-L6-v2" if the user hasn’t set one (or if it’s set to a non‑ONNX default). This nudges embeddings to the ONNX path by default.
- Status proto v3: extended StatusResponse over protobuf to carry readiness (per-subsystem), init progress, overall_status, and request counts. Serializer/deserializer updated with back‑compat casting and safe parsing.
- Streaming progress events for long ops: EmbedEvent and ModelLoadEvent over IPC.
- Startup hints: daemon logs a hint to run yams repair --embeddings on new/migrated data dirs and a hint to use yams stats --verbose for monitoring.
Changed¶
- Documentation: general improvements and clarifications.
- CI: improved build reliability and semantic version enforcement.
- yams doctor summary now includes a Knowledge Graph section; when the graph is empty it recommends: yams doctor repair --graph.
- PDF Extraction: The built-in PDF text extraction logic has been refactored into a standalone plugin (pdf_extractor) that implements the new content_extractor_v1 and search_provider_v1 interfaces.
- Model download command now uses the unified downloader for all artifacts (model + companions).
  - Improved repo/URL inference; clearer errors list exact URLs attempted.
- ONNX model provider: batched ONNX inference for efficiency; expanded config parsing (hidden_size, max_position_embeddings).
- Init: default v2 config now includes a [daemon] section with enable = true and auto_load_plugins = true so plugin adoption (e.g., ONNX) works out‑of‑the‑box after yams init when plugins are present in standard directories.
- CLI: added DR command (gated) and extended plugin command with install/enable/disable/verify stubs (no‑ops for now)
- Stats default view becomes adaptive: when service‑specific metrics or recommendations are relevant, the non‑verbose path includes the Recommendations and Service Status sections automatically; the compact STATS footer remains.
- Recommendation text updated to be actionable: suggests yams repair --embeddings instead of a vague “run indexing”.
- Doctor now performs smarter PID checks: prints the configured PID file and also checks a stable fallback in /tmp, reporting active/stale states explicitly.
- CLI daemon alignment: get, grep, and list now initialize DaemonClient with the CLI data directory and standardized timeouts (header 30s, body 120s). The CLI also seeds YAMS_STORAGE and YAMS_DATA_DIR so daemon auto-start binds to the same storage as the CLI (parity with search).
- MCP server storage selection: server now resolves and sets DaemonClient dataDir from environment/config (priority: YAMS_STORAGE → YAMS_DATA_DIR → ~/.config/yams/config.toml → XDG_DATA_HOME/HOME) and seeds the same env vars for consistent daemon storage across processes.
- MCP tools: get tool schema now defaults include_content to true so content is returned by default unless explicitly disabled.
- Embeddings: streaming and stability improvements
  - Client streams batch embeddings by default when batch > 1 and retries transient failures with exponential backoff and dynamic sub‑batching (min 4).
  - Added detailed ONNX per‑item inference progress logs and total timing in ModelManager::runInference for better diagnosis of hangs.
  - Increased default daemon client timeouts to 120s; configurable via environment: YAMS_REQUEST_TIMEOUT_MS, YAMS_HEADER_TIMEOUT_MS, YAMS_BODY_TIMEOUT_MS.
  - Doctor/repair now surfaces EmbeddingEvent progress to stdout during yams doctor repair --embeddings so users see live progress.
  - Daemon respects client streaming hints for embedding requests; emits progress chunks and a final response.
  - Daemon logs batch embedding processing time (ms).
- Daemon mux tuning centralized via TuneAdvisor (header-only):
  - New server-side knobs: YAMS_SERVER_MAX_INFLIGHT, YAMS_SERVER_QUEUE_FRAMES_CAP, YAMS_SERVER_QUEUE_BYTES_CAP, YAMS_SERVER_WRITER_BUDGET_BYTES.
  - Existing knobs: YAMS_WRITER_BUDGET_BYTES, YAMS_MAX_MUX_BYTES, YAMS_WORKER_THREADS, YAMS_CHUNK_SIZE.
  - SocketServer now honors TuneAdvisor::chunkSize() for streaming chunk size (no 32 KiB cap).
  - RequestHandler reads server settings via TuneAdvisor for consistent behavior.
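Retrying with exponential backoff while shrinking the batch can be sketched as below. Only the minimum sub-batch of 4 comes from the entry above; the starting batch size, attempt limit, base delay, the halving rule and the use of TimeoutError as the transient failure are assumptions made for illustration.

```python
import time


def embed_with_retry(items, embed_batch, batch_size=32, min_batch=4,
                     max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Embed items in batches; on a transient failure, back off and halve the batch."""
    results = []
    index = 0
    attempts = 0
    while index < len(items):
        chunk = items[index:index + batch_size]
        try:
            results.extend(embed_batch(chunk))
            index += len(chunk)
            attempts = 0  # a success resets the backoff
        except TimeoutError:
            attempts += 1
            if attempts >= max_attempts:
                raise
            sleep(base_delay * 2 ** (attempts - 1))  # 0.5s, 1s, 2s, ...
            batch_size = max(min_batch, batch_size // 2)
    return results
```

Injecting the sleep function makes the backoff schedule observable in tests without actually waiting.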
Fixed¶
- Knowledge Graph repair: yams doctor repair --graph now populates nodes/entities.
  - Implemented SQLite KG node upsert/lookups; previously stubbed methods caused updated=0 (no-op) during repair even with existing tags/metadata.
- Plugin list over protobuf
  - GetStatsResponse JSON now embeds plugins_json; serializer/deserializer preserve it so yams plugin list shows loaded plugins reliably.
- Framing/transport for plugin ops
  - Resolved “Frame build failed” during plugin scan/load by aligning CLI and daemon payloads on the protobuf Envelope path.
- Build/link fixes
  - Link nlohmann_json::nlohmann_json into yams_ipc_proto (proto serializer uses JSON).
  - Link -ldl for CLI doctor on Linux (dlopen in doctor plugin).
  - Fixed unterminated string literals in plugin CLI output.
  - macOS: CMake configuration and linking fixes for Darwin toolchain (guard LINK_GROUP to Linux; use -force_load where appropriate).
- External plugin host
  - Corrected trust file newline writes, regex escapes, and Result usage in load(); ensured proper namespace scoping to fix build.
Notes¶
- When [embeddings].enable = false, the daemon disables the model provider and plugin auto‑loading to reduce startup overhead; enable embeddings to allow autoload to adopt a provider (e.g., ONNX).
Known Issues¶
- Embeddings with nomic-embed-text-v1.5 (ONNX) may time out or fail under the daemon’s streaming path on some systems.
  - Symptoms: yams repair --embeddings reports “Embedding stream failed: Read timeout” and logs show EmbedDocuments dispatch: model='nomic-embed-text-v1.5'.
  - Status: under investigation in the experimental branch (plugin/SDK path).
  - Workarounds:
    - Prefer other ONNX models: all-MiniLM-L6-v2 (fast) or all-mpnet-base-v2 (higher quality):
      - yams config embeddings model all-MiniLM-L6-v2 (or pass --model all-MiniLM-L6-v2 to commands)
      - yams daemon restart
    - Raise client timeouts for long embedding runs:
      - export YAMS_CLIENT_HEADER_TIMEOUT_MS=60000; export YAMS_CLIENT_BODY_TIMEOUT_MS=300000
    - Reduce batch pressure: yams config embeddings tune balanced or yams config embeddings batch_size 8 then restart daemon.
    - Preload the model and smoke test a single embed:
      - yams model load all-MiniLM-L6-v2
      - yams embed --model all-MiniLM-L6-v2 "hello world"
    - Ensure the ONNX plugin is loaded and onnxruntime resolves:
      - yams plugin load onnx; ldd /usr/local/lib/yams/plugins/libyams_onnx_plugin.so
- MIME detection on add: single‑file and directory adds now perform MIME detection (magic → extension fallback) when not provided; MIME is stored in both content metadata and DB.
- Embeddings generation robustness:
  - Batch UTF‑8 sanitization prevents protobuf “invalid UTF‑8” errors.
  - Default embedding targets restricted to text‑like MIME; binary types (e.g., PDFs/images) are skipped unless explicitly included via --include-mime.
  - Daemon model preload is attempted only when the model provider is ready to avoid spurious read timeouts; Hybrid backend continues via local fallback.
- Auto-repair during degraded state: RepairCoordinator now allows small background batches while the daemon is not fully ready, limited by a configurable active‑connections threshold; also gates work using live activeConnections instead of assuming idle.
- MCP server: store by path could return “File not found” due to fragile path resolution. handleStoreDocument now normalizes paths robustly:
  - Strips file:// scheme, expands ~, resolves relative paths against $PWD and process current_path(), and applies weakly_canonical best‑effort.
  - Prevents spurious failures on repeated adds and differing working directories.
- Retrieval via daemon appearing to return no content or “not found” while search worked. CLI get and MCP get/cat now reliably hit the same storage as search, resolving mismatches caused by unset dataDir or differing environments.
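The path normalization steps for store-by-path (strip the file:// scheme, expand ~, resolve relative paths, canonicalize) can be approximated as follows. This is a lexical sketch with a hypothetical function name: it uses POSIX path rules and normpath, which collapses "." and ".." without resolving symlinks, so it only approximates weakly_canonical.

```python
import posixpath
from urllib.parse import unquote, urlparse


def normalize_store_path(raw, cwd):
    """Turn a user-supplied path or file:// URI into an absolute, tidy path."""
    if raw.startswith("file://"):
        raw = unquote(urlparse(raw).path)  # drop the scheme, decode %20 and friends
    raw = posixpath.expanduser(raw)        # expand a leading ~
    if not posixpath.isabs(raw):
        raw = posixpath.join(cwd, raw)     # resolve against the working directory
    return posixpath.normpath(raw)         # collapse "." and ".." segments
```

Normalizing before lookup means the same file resolves to the same key regardless of which directory the client was started from.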
[v0.6.11] - 2025-09-06¶
Fixed¶
- MCP server: add_directory and get_by_name could appear to hang when the daemon was not ready or when directories were passed without recursive. The MCP handlers now:
  - Check daemon readiness up front and fall back to local services (IndexingService/DocumentService) when unavailable, matching CLI behavior.
  - Fast‑fail directory adds without --recursive with a clear error (parity with CLI).
  - Resolve name→hash via DocumentService in get_by_name before attempting streamed transfers; this avoids unnecessary GetInit/GetChunk paths.
  - Reduce overly long default timeouts for directory indexing to avoid perceived hangs (header 30s, body 300s).
- Daemon status -d could hang while models were loading. Status path no longer calls blocking provider APIs (e.g., getLoadedModels()), and the defensive objects-dir scan is capped by files/time to ensure immediate responses.
- Embeddings via daemon IPC were unreliable because server handlers were missing. Added request handling on the daemon side for generate embedding(s), load/unload model, and model status; client RPCs now complete as expected.
- ONNX plugin discovery: plugin loader now detects installed plugins using a compile‑time YAMS_INSTALL_PREFIX macro; resolves cases where the ONNX provider failed to load after install.
- Extraction backlog: RepairCoordinator now performs a one‑time backlog enqueue on startup (when idle) so existing documents without embeddings get scheduled; prevents “extraction_pending” from staying high indefinitely.
- Build (Darwin/Clang): resolved “no type named ‘stop_source’ in namespace ‘std’” in SocketServer by using the existing yams::compat::stop_token and removing the unnecessary <stop_token> include.
- Daemon components: fixed mismatched forward declaration of StateComponent (class vs struct) to silence -Wmismatched-tags.
- IPC RequestHandler logging: avoided evaluated operand in typeid usage to silence -Wpotentially-evaluated-expression.
- Extraction: corrected a dangling std::string_view when parsing range tokens in format_handler.cpp.
Changed¶
- MCP add_directory: recursive now defaults to true in the MCP tool schema for parity with CLI directory indexing. Timeouts tuned (30s/300s) and clearer error messages on misuse.
- Keep-hot semantics for embeddings: when [embeddings].keep_model_hot = true (i.e., lazyLoading=false), the ONNX model pool now pre‑creates a session for hot models (preCreateResources=true) so embeddings are immediately usable after preload.
- Status detail now reports provider presence without enumerating models to avoid lock contention; onnx_models_loaded is “unknown” in this non-blocking mode.
- Tests: enable model provider and plugin auto‑loading in daemon unit tests (with lazy loading, no forced preload) to exercise real provider paths instead of bypassing them.
- Build (SourceHut): .build.yml adds a Boost fallback install when BoostConfig.cmake is missing, ensuring find_package(Boost CONFIG) succeeds with Conan CMakeDeps output.
- Build/CMake: define YAMS_INSTALL_PREFIX for yams_daemon so runtime plugin discovery includes $prefix/lib/yams/plugins.
- Embedding service: unified on compat jthread/stop_token; simplifies cross‑platform behavior and cooperative stop.
- PDF extractor: locally suppressed deprecated std::wstring_convert/std::codecvt warnings under Clang to reduce build noise while a modern replacement is planned.
- Storage plugin loader: marked unused parameters to quiet -Wunused-parameter without changing behavior.
Notes¶
- Large extraction_pending values reflect missing embeddings, not just text extraction. With the backlog enqueue and ONNX provider discovery fixes, pending counts should drop as embeddings are generated. Ensure the ONNX plugin builds (onnxruntime present) or set YAMS_PLUGIN_DIR to the plugin output directory during development/CI.
[v0.6.10] - 2025-09-06¶
CI bump¶
- macOS build failures: replaced direct std::jthread/std::stop_token usages with our portability shim (yams::compat) across daemon, services, and embedding service; resolves libc++ gaps on hosted macOS.
- GitHub Actions release workflow: corrected heredoc in summary generation to avoid “unexpected EOF”; now writes Python output to a temp file and reads it safely.
[v0.6.9] - 2025-09-05¶
Notes¶
- Upgrading daemon is recommended: older daemons (v1) will log a one‑time warning “Daemon protocol v1 < client v2” and some fields will be supplemented client‑side.
Known Issues¶
- Daemon startup may remain in “Initializing” when embeddings are configured to keep the model hot (preload) and the ONNX stack cannot complete early model resolution on some systems. Workarounds:
- Set [embeddings].keep_model_hot = false and [embeddings].preload_on_startup = false in ~/.config/yams/config.toml to use lazy loading (model loads on first use).
- Or export YAMS_DISABLE_MODEL_PRELOAD=1 before starting the daemon.
- Status derives from the lifecycle FSM; optional subsystems (models) should not gate readiness in recent builds. If you still see “Initializing”, ensure you are running the updated daemon and ONNX plugin.
Changed¶
- Status handler hardened to a minimal, safe snapshot; FSM/MUX metrics gated by lifecycle (Ready/Degraded) to avoid init races.
- Linux CPU proxy: read Threads: from /proc/self/status (robust) instead of parsing /proc/self/stat field 20.
- Stats defaults:
- yams stats now begins with the System Health section before compact counters.
- yams stats: prefers daemon JSON values to avoid zeros; local fallback includes vector DB size.
- yams status: supplements from stats JSON when talking to older daemons (v1) to avoid zeros; shows services summary and waiting components when not ready.
- Service detectors hardened:
- SearchExecutor now includes a reason when unavailable: database_not_ready | metadata_repo_not_ready | not_initialized.
- ONNX models status reports loaded count with clear guidance to download a preferred model.
- CLI search defaults to hybrid and now auto-retries with fuzzy when strict/hybrid returns zero results for better “true hybrid” behavior.
- Service metadata search path now falls back to fuzzy when full-text returns no results.
- Model CLI help updated with subcommands and --url usage; clearer guidance to avoid confusion with the top-level download command.
- Model CLI guidance: after downloads, success output includes yams config embeddings model <name> and related steps; yams model list and yams model info now mirror a one-line configuration hint.
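The Linux CPU proxy entry above switches to reading the Threads: line of /proc/self/status. A minimal Python sketch of that kind of parsing follows, assuming the usual "Key:<tab>value" layout of the status file. The function name parse_thread_count and the sample text are hypothetical, and the daemon's own C++ code may differ.

```python
from typing import Optional


def parse_thread_count(status_text: str) -> Optional[int]:
    """Return the value of the 'Threads:' line, or None if it is absent or malformed."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Threads":
            value = value.strip()
            return int(value) if value.isdigit() else None
    return None


# Sample in the shape of /proc/self/status (values are made up for illustration).
SAMPLE = "Name:\tyams-daemon\nState:\tS (sleeping)\nThreads:\t12\nVmRSS:\t  20480 kB\n"
```

Looking a value up by key avoids counting whitespace-separated fields in /proc/self/stat, where the command-name field may itself contain spaces; that is one plausible reason the changelog calls the new approach more robust.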
Added¶
- IPC Protocol v2:
- StatusResponse now carries runtime fields: running, ready, uptime_seconds, requests_processed, active_connections, memory_mb, cpu_pct, version.
- GetStatsResponse carries numeric fields alongside JSON: total_documents, total_size, indexed_documents, vector_index_size, compression_ratio.
- Daemon stats JSON: explicit not_ready flag; includes durations_ms, top_slowest, and latency percentiles (latency_p50_ms, latency_p95_ms).
- Bootstrap status file: ~/.local/state/yams/yams-daemon.status.json with readiness, progress, durations, and top_slowest for pre‑IPC visibility.
- CLI doctor/status: shows top 3 slowest components with elapsed ms when available (from bootstrap JSON).
- CLI UI (retro): compact text for status and stats; yams stats -v shows System Health and detailed sections; stats vectors prints compact block.
- CLI daemon status (-d):
- One-line services summary: SVC : ✓ Content | ✓ Repo | ✓ Search | ⚠ (0) Models.
- WAIT line during initialization listing not-ready services with progress (e.g., WAIT : search_engine (70%), model_provider (0%)).
- ServiceManager:
- Preferred model preload on startup: uses configured embeddings.preloadModels (first entry) when no models are loaded; falls back to all-MiniLM-L6-v2.
- Sanity warning when SearchExecutor is not initialized despite database and metadata repo ready.
- FSM metrics emission (server): counts payload writes/bytes sent, header reads/bytes received; exposed in Status when Ready/Degraded.
- LatencyRegistry (server): lightweight histogram; p50/p95 emitted via stats JSON.
- Protocol compatibility guard: one‑time warning when daemon/client protocol versions differ.
- SocketServer::setDispatcher(RequestDispatcher*) for safe future hot‑rebinds (not used by default).
- Integration test: daemon_status_stats_integration_test covers Status/Stats proto presence and basic FSM exposure.
- MCP list: added paths_only parameter to the MCP list tool; forwarded to daemon ListRequest.pathsOnly to engage the hot path (no snippet/metadata hydration).
- MCP grep: introduced fast_first option that returns a quick semantic suggestions burst when requested.
- MCP server: startup flags to set hot/cold modes consistently with CLI:
- --list-mode → sets YAMS_LIST_MODE
- --grep-mode → sets YAMS_GREP_MODE
- --retrieval-mode → sets YAMS_RETRIEVAL_MODE
- CLI model management:
- --url override to download any model by name from a custom URL.
- Subcommands: yams model list|download|info|check (aliases to flags).
- yams model check shows ONNX runtime support and plugin directory status, plus autodiscovers installed models under ~/.yams/models.
- Model autodiscovery: yams model list now lists locally installed models found at ~/.yams/models/<name>/model.onnx.
- Daemon model selection: honors embeddings.preferred_model from ~/.config/yams/config.toml (or XDG config) to preload and prefer that model at runtime.
- Model path resolution: honors embeddings.model_path as the models root (with ~ expansion) and resolves name-based models in priority order: configured root → ~/.yams/models → models/ → /usr/local/share/yams/models; full paths are used as-is.
- Docs: PROMPT updated to show multi-path yams add src/ include/ ... examples.
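The model path resolution order listed above can be illustrated with a short Python sketch. It is not the daemon's implementation: the function names are hypothetical, the rule for what counts as a "full path" is an assumption, and the <root>/<name>/model.onnx layout is documented above only for ~/.yams/models and is assumed here for the other roots.

```python
import os
from pathlib import Path
from typing import List, Optional


def candidate_model_paths(name_or_path: str, configured_root: Optional[str] = None) -> List[Path]:
    """List the locations to probe for a model, highest priority first."""
    expanded = Path(os.path.expanduser(name_or_path))
    # Assumption: anything absolute or ending in .onnx counts as a "full path".
    if expanded.is_absolute() or expanded.suffix == ".onnx":
        return [expanded]
    roots = []
    if configured_root:
        roots.append(Path(os.path.expanduser(configured_root)))
    roots += [
        Path(os.path.expanduser("~/.yams/models")),
        Path("models"),
        Path("/usr/local/share/yams/models"),
    ]
    # Assumption: every root uses the <root>/<name>/model.onnx layout.
    return [root / name_or_path / "model.onnx" for root in roots]


def resolve_model(name_or_path: str, configured_root: Optional[str] = None) -> Optional[Path]:
    """Return the first candidate that exists on disk, or None."""
    for candidate in candidate_model_paths(name_or_path, configured_root):
        if candidate.is_file():
            return candidate
    return None
```

Separating candidate generation from the existence check keeps the priority order easy to inspect without touching the filesystem.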
Fixed¶
- Build: resolved C++ signature mismatch for EmbeddingService::runRepair by aligning the implementation with header declarations and providing a proper non-jthread legacy runner.
- MCP schemas updated for list (paths_only) and grep (fast_first) to reflect new behaviors.
- ONNX plugin: CMake fixes to ensure dynamic plugin loads cleanly:
- Disable IPO/LTO on yams_onnx_plugin (INTERPROCEDURAL_OPTIMIZATION FALSE) to prevent LTO symbol internalization and tooling noise.
- Ensure exported C symbols have default visibility (C_VISIBILITY_PRESET/CXX_VISIBILITY_PRESET default, VISIBILITY_INLINES_HIDDEN OFF) so getProviderName/createOnnxProvider remain visible.
- On ELF, link with -Wl,-z,defs to catch unresolved symbols at link time.
- Resolves daemon warning: “Plugin missing getProviderName function: …/libyams_onnx_plugin.so”.
- Retrieval now accepts sha256:<hex> hashes (normalized before validation) in DocumentService retrieve and cat.
- Model downloads provide clearer errors when offline (curl exit code mapping for host resolution/connect failures).
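The "normalized before validation" step for sha256:<hex> hashes might look like the following Python sketch. The function name normalize_hash is hypothetical, and two behaviors are assumptions rather than documented facts: that only full 64-character digests are accepted, and that digests are lowercased.

```python
import re

_HEX64 = re.compile(r"[0-9a-f]{64}")


def normalize_hash(value: str) -> str:
    """Strip an optional 'sha256:' prefix and validate the remaining digest."""
    candidate = value.strip()
    if candidate.lower().startswith("sha256:"):
        candidate = candidate[len("sha256:"):]
    candidate = candidate.lower()  # assumption: digests are compared in lowercase
    if not _HEX64.fullmatch(candidate):
        raise ValueError(f"not a SHA-256 hex digest: {value!r}")
    return candidate
```

Stripping the prefix first means the same validator serves both the bare and the prefixed form.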
[v0.6.8] - 2025-09-04¶
Changed¶
- Data directory resolution uses a consistent precedence across CLI and daemon: configuration file > environment (YAMS_STORAGE/YAMS_DATA_DIR) > CLI flag. This avoids CLI defaults masking configured storage roots.
- MCP search defaults hardened: runs in hybrid mode with fuzzy matching enabled by default and a similarity threshold of 0.7 when the client doesn’t provide options.
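The data directory precedence described above (configuration file, then environment, then CLI flag) can be sketched in Python as follows. The function name resolve_data_dir is hypothetical, and the order in which the two environment variables are consulted is an assumption, since the changelog does not state it.

```python
from typing import Mapping, Optional


def resolve_data_dir(config_value: Optional[str],
                     env: Mapping[str, str],
                     cli_value: Optional[str]) -> Optional[str]:
    """Pick the storage root using: config file > environment > CLI flag."""
    if config_value:
        return config_value
    # Assumption: YAMS_STORAGE is consulted before YAMS_DATA_DIR.
    for var in ("YAMS_STORAGE", "YAMS_DATA_DIR"):
        if env.get(var):
            return env[var]
    return cli_value or None
```

For example, resolve_data_dir("/srv/yams", {"YAMS_STORAGE": "/tmp/yams"}, "/home/me/data") returns "/srv/yams", so a CLI default cannot mask the configured root.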
Added¶
- MCP get-by-name and cat now include a fuzzy fallback: if direct lookup fails, a hybrid fuzzy search runs and returns the strongest single match, then retrieves by hash (preferred) or normalized path.
- URL-aware naming: post-index for MCP downloads now sets the document name from the URL basename (query stripped), improving name-based retrieval. MCP get-by-name also accepts full URLs and normalizes them to the basename automatically.
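The URL-aware naming rule above (basename of the URL with the query stripped) can be sketched in Python using the standard library. The function name name_from_url is hypothetical, and percent-decoding the result is an assumption that the changelog does not mention.

```python
from posixpath import basename
from urllib.parse import unquote, urlsplit


def name_from_url(url: str) -> str:
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path          # drops ?query and #fragment
    return unquote(basename(path.rstrip("/")))
```

A plain name such as report.pdf passes through unchanged, so the same helper could serve both full URLs and ordinary names in a get-by-name lookup.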
Fixed¶
- macOS build compatibility and warnings:
- std::jthread/std::stop_token portability: embedding service falls back to std::thread with an atomic stop on platforms lacking libc++ support.
- Third‑party noise reduced: mark hnswlib includes as SYSTEM and silence sqlite-vec.c; fence cast-qual warnings under Clang for HNSW.
- Resolved shadowing and unused variable/parameter warnings across content handlers (PDF, audio, image, binary, archive) and daemon client helpers.
- Name propagation for downloads: downloaded artifacts are indexed with a human-friendly name, making get --name <file> work reliably after MCP/CLI downloads.
[v0.6.7] - 2025-09-04¶
CI bump¶
- SourceHut build fixes
- GitHub CI updates
[v0.6.6] - 2025-09-04¶
CI bump¶
- SourceHut build fixes
- GitHub CI updates
Fixed¶
- Search regression introduced by adding query results. Queries are now supported by default in the fallback path after expressions are extracted.
Known Issues¶
- Data directory path resolution needs to be audited
[v0.6.5] - 2025-09-04¶
Changed¶
- Release workflow: migrate to Conan 2 profiles using a dedicated host profile (conan/profiles/host.jinja) that enforces compiler.cppstd=20 and sets compiler.libcxx=libc++ on macOS.
- CMake (Darwin): replace GNU ld flags (--start-group/--end-group, --whole-archive/--no-whole-archive) with -Wl,-force_load for specific archives in CLI, MCP server, and daemon.
- CI (act): install build tools when missing in containers (cmake, ninja, build-essential/pkg-config) to allow local act runs of the release job.
Added¶
- Search ergonomics:
- -q, --query flagged alias to safely pass queries that start with ‘-’ (e.g., yams search -q "--start-group --whole-archive").
- --stdin and --query-file <path> to read the query from stdin or a file (--query-file - also reads stdin).
- Helpful “Query is required” guidance when no query is provided, with tips to use -q/--query, -- to stop option parsing, or stdin/file inputs.
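The search input options above can be pictured as a small source-selection function. This Python sketch is illustrative only: the function name resolve_query and its parameters are hypothetical, and both the precedence among sources and the trimming of surrounding whitespace are assumptions not stated in the changelog.

```python
import io
import sys
from typing import Optional, TextIO


def resolve_query(positional: Optional[str] = None,
                  query_flag: Optional[str] = None,
                  use_stdin: bool = False,
                  query_file: Optional[str] = None,
                  stdin: Optional[TextIO] = None) -> str:
    """Return the query text from the first source that was supplied."""
    if query_flag is not None:            # -q/--query: value is taken verbatim
        return query_flag
    if use_stdin or query_file == "-":    # --stdin, or --query-file -
        stream = stdin if stdin is not None else sys.stdin
        return stream.read().strip()
    if query_file is not None:            # --query-file <path>
        with open(query_file, encoding="utf-8") as handle:
            return handle.read().strip()
    if positional:
        return positional
    raise ValueError("Query is required: use -q/--query, --stdin, or --query-file")


# Example: the query arrives on stdin via `--query-file -`.
print(resolve_query(query_file="-", stdin=io.StringIO("--whole-archive\n")))
```

Because the flagged value is taken verbatim, a query that begins with a dash never has to pass through option parsing.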
Fixes¶
- macOS linking failures due to GNU-only linker flags appearing in executable link lines; now prevented and defensively stripped if injected elsewhere.
- SourceHut .build.yml: use the Conan host profile and disable ONNX (-o enable_onnx=False) to avoid upstream onnxruntime 1.18 build errors with GCC 15; preserves successful release builds on Arch.
[v0.6.4] - 2025-01-05¶
Known Issues¶
- Daemon not showing correct storage details
Added¶
- Streaming metrics: Track and expose stream_total_streams, stream_batches_emitted, stream_keepalives, and average stream_ttfb_avg_ms via yams stats (JSON and technical details in text mode).
- Grep performance: line-by-line streaming scanner (no full-file buffering) and literal fast-path for --literal without word-bounds/case-folding.
- Add performance: parallel directory traversal/processing with bounded workers.
- Session helpers: yams session pin|list|unpin|warm for hot data management (feature is experimental and may not work reliably).
Changed¶
- List paths-only mode now avoids snippet/metadata hydration end-to-end for faster responses; honors pathsOnly through daemon to services.
- Retrieval (cat/get) prefers extracted text (hot) when available before falling back to CAS (cold), respecting per-document force_cold.
- Grep hot/cold race: daemon emits hot first-burst and cancels cold; batch/first-burst tuning supported via env for testing.
- Delete command now mirrors rm ergonomics:
- rm alias retained; -f is an alias for --force; -r is an alias for --recursive
- Positional targets supported (names/paths/patterns); when a single target is provided with -r, it is treated as a directory
- Multiple positional targets are treated as a names list
- Selector requirement relaxed when positional targets are used; mode (directory/pattern/name) is inferred heuristically
- CMake: Platform-aware linker selection.
- Removed hard-coded -fuse-ld=lld from presets.
- Toolchain now attempts -fuse-ld=lld only on non-Apple platforms when supported, gated by YAMS_PREFER_LLD (default ON).
- macOS Release builds use -Wl,-dead_strip instead of GNU --gc-sections.
- Release workflow (macOS): Align Conan arch with target:
- macos-arm64 uses -s arch=armv8
- macos-x86_64 uses -s arch=x86_64
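The retrieval entry in the list above (prefer extracted text, fall back to CAS, honor per-document force_cold) reduces to a small decision. The Python sketch below illustrates it with a hypothetical StoredDocument type; the field names other than force_cold are invented for illustration and do not reflect the project's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredDocument:
    raw_bytes: bytes                      # content as stored in CAS (cold path)
    extracted_text: Optional[str] = None  # cached text extraction (hot path), if any
    force_cold: bool = False              # per-document override


def read_content(doc: StoredDocument) -> bytes:
    """Prefer extracted text when present unless the document forces the cold path."""
    if doc.extracted_text is not None and not doc.force_cold:
        return doc.extracted_text.encode("utf-8")
    return doc.raw_bytes
```

The cold path stays the fallback in every case, so a document without extracted text is still served from CAS.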
Fixed¶
- Streaming search finalization: avoid infinite keepalive loop on empty results; keepalive cadence configurable via YAMS_KEEPALIVE_MS.
- Stats command: yams stats help now prints a concise system metrics guide.
- MCP server timeouts with some clients:
- Async handlers now run detached (no premature task destruction), preventing “Context server request timeout”
- Server sends notifications/ready after client notifications/initialized for better client compatibility
- Stdio framing hardened to consume trailing CR/LF after payloads to avoid parse glitches across clients
- Docker: ARM64 build now mirrors AMD64 by using Buildx with proper tag suffixes (-arm64) and pushes versioned and latest tags. Multi-arch manifest creation no longer fails with “not found … -arm64”.
- Release workflow (macOS/Linux): Conan 2 profile updates use correct keys (compiler.cppstd, compiler.libcxx) via conan profile update; removed brittle sed/grep edits that broke on macOS.
- Release workflow (Windows): Corrected Conan 2 profile update syntax (compiler.cppstd=20).
[v0.6.3] - 2025-01-04¶
Changes¶
- CI version bump for SourceHut and GitHub Actions builds
- yams add supports multiple files
- Adding small optimizations to decompression
Known Issues¶
- MCP server usage may vary in successful returns
- [v0.6.2] CLI performance still degraded, will be addressed in subsequent release
[v0.6.2] - 2025-01-04¶
Hot fixes¶
- CI version bump
- MCP server patch (tested working in Zed and LM Studio but not working in Goose or Jan)
[v0.6.1] - 2025-01-03¶
Hot fixes¶
- Docker and SourceHut build file updates
- Fixed a bug in the daemon liveness checks on startup where the daemon did not signal that it had started
- Added more aggressive parallelization for search now that it holds all resources (will make tunable)
Known Issues¶
- MCP server init not working with some clients
- CLI performance is degraded because the CLI creates a new socket to the daemon for each request; a session mode is in progress
- yams stats -v not working as expected
[v0.6.0] - 2025-01-03¶
Repository¶
- Will move all future development work to experimental branch so that main and releases become more stable
- I apologize for recent instabilities
Added¶
- Server multiplexing: fair round-robin writer with per-turn byte budget and backpressure caps (default ON).
- Status: exposes multiplexing metrics (active handlers, queued bytes, writer budget).
- Cancel control frame: request type added and server-side cancel scaffolding (per-request contexts).
- Development Changes
- Updated tasks for VS Code and Zed
- Enforcing clang-tidy warnings as errors
- Attempting to stabilize build system as plugin system is implemented per roadmap
- Daemon Logging Rotation
- Use rotating file sink to preserve logs across crashes
- Log rotation info: "Log rotation enabled: /path/to/logfile.log (max 10MB x 5 files)"
- Connection State Machine
- Deterministic ConnectionFsm for IPC connection lifecycle management
- Clean state transitions: Disconnected → Connecting → Connected → ReadingHeader → ReadingPayload → WritingHeader → StreamingChunks → Error/Closed
- FSM metrics collection gated by daemon log level (debug/trace only)
- Coordination with RepairFsm for safe data integrity operations
- Guardrails for readable/writable operations based on FSM state
- Header-first streaming with chunked transfer support
- Protobuf IPC Migration
- Complete migration from custom binary serialization to Protocol Buffers
- ProtoSerializer as the single payload codec for client/server
- Comprehensive ipc_envelope.proto schema with oneof for all request/response types
- Transport framing (header, CRC, streaming) remains unchanged
- MAX_MESSAGE_SIZE enforcement in encode/decode paths
- Older non-protobuf clients are incompatible after this release
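The server multiplexing entry in the list above describes a fair round-robin writer with a per-turn byte budget. The Python sketch below shows the general technique only: the function name round_robin_write and its in-memory queues are hypothetical, backpressure caps are not modeled, and the server's C++ implementation may be organized quite differently.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def round_robin_write(queues: Dict[str, bytes], budget: int) -> List[Tuple[str, bytes]]:
    """Interleave pending payloads, sending at most `budget` bytes per stream per turn."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    pending: Deque[Tuple[str, bytes]] = deque((k, v) for k, v in queues.items() if v)
    wire: List[Tuple[str, bytes]] = []
    while pending:
        stream_id, data = pending.popleft()
        wire.append((stream_id, data[:budget]))   # one turn for this stream
        rest = data[budget:]
        if rest:
            pending.append((stream_id, rest))     # go to the back of the line
    return wire
```

With a budget of 4 bytes, a 6-byte response on stream "a" and a 2-byte response on stream "b" are written as a, b, a, so the short response is not stuck behind the long one.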
Changed¶
- CLI Commands
- All CLI commands migrated to async-first pattern with run_sync bridge
- Removed deprecated daemon_first() and daemon_first_pooled() helpers
- Commands now use async_daemon_first() with proper timeout handling
- Consistent error handling and retry logic across all commands
- MCP Server
- All handlers converted to async (yams::Task<Result<T>>)
- Improved concurrency and reduced blocking operations
- Daemon Architecture
- Socket server runs in-process, eliminating need for external IPC server
- state_.readiness.ipcServerReady now properly reflects socket server status
- Graceful shutdown sequence: socket server stops before services
- Better resource lifecycle management and error propagation
Fixed¶
- Async Infrastructure
- Removed spin-wait bridges that caused CPU waste
- Fixed head-of-line blocking in request processing
- Proper cancellation handling for in-flight operations
- Eliminated Task::get() usage from production code paths
Removed¶
- Deprecated Synchronous APIs
- AsioClientPool::roundtrip(), status(), ping(), call() (sync versions)
- PooledRequestManager::execute() (now returns NotImplemented error)
- YAMS_ASYNC_CLIENT environment variable (async is now mandatory)
- Legacy MessageSerializer and custom binary serialization code
- All synchronous daemon helper functions
- Forward declarations of BinarySerializer/BinaryDeserializer remain in ipc_protocol.h for cleanup in v0.7.0