
Changelog Archive: v0.4.x Series

[v0.4.8] - 2025-08-21

Added

  • Shell Wildcard Expansion Support
  • add command now accepts multiple file paths from shell expansion
  • Commands like yams add *.md now work correctly without “arguments were not expected” errors
  • Properly handles shell wildcard expansion (e.g., *.cpp, *.md, *.txt)
  • Dynamic Plugin Loading
  • Implemented PluginLoader class for runtime plugin loading
  • Added automatic plugin discovery from standard directories
  • Support for YAMS_PLUGIN_DIR environment variable
  • ONNX model provider now loads dynamically as a plugin
  • Daemon automatically loads plugins on startup
  • Created comprehensive test suite for plugin loader

Fixed

  • Tag Storage and Filtering
  • Fixed tag filtering in list command not working (--tags option)
  • Tags are now properly parsed as comma-separated values when stored
  • Fixed tag filtering logic to correctly match against stored comma-separated tag values
  • Multiple tags in filter work with OR logic (e.g., --tags "work,important" shows documents with either tag)
  • Daemon Path Resolution
  • Fixed “Added 0 documents” issue when using relative paths with daemon
  • CLI now converts relative paths to absolute paths before sending to daemon
  • Added proper error messages for invalid paths and non-recursive directory attempts
  • Daemon now returns helpful error messages instead of silently returning 0
  • Feature Parity Between CLI and Daemon
  • Added all missing fields to GrepRequest for complete CLI feature parity
  • Added all missing fields to SearchRequest for complete CLI feature parity
  • Updated serialization/deserialization for both request types
  • CLI commands now pass all parameters to daemon requests
  • Ensures consistent behavior whether using daemon or local execution
  • Wildcard Pattern Matching Bug
  • Fixed broken wildcard pattern matching in add and restore commands
  • Replaced regex-based implementation with efficient iterative algorithm
  • Patterns like *.sol, *.cpp, *.js now work correctly
  • Dots and other special characters in patterns are now handled properly
  • Linux Build Errors
  • Fixed missing #include <cstring> in compression_benchmark.cpp for std::strlen
  • Fixed C++ compiler flag -Wnon-virtual-dtor being incorrectly applied to C files
  • Used CMake generator expression $<$<COMPILE_LANGUAGE:CXX>:-Wnon-virtual-dtor> for language-specific flags
  • Fixed missing Rabin chunker header include in ingestion_benchmark.cpp
  • Fixed GCOptions struct initializer warnings by adding progressCallback field
  • Fixed query parser benchmark Result access patterns (use .value() instead of *result)
  • Fixed metadata benchmark Database constructor usage pattern
  • Fixed benchmark API calls to use storeBytes() instead of non-existent addContent()
  • Vector Index Loading
  • Fixed empty error message when vector index file doesn’t exist
  • Added proper file existence check before attempting to load
  • Shows debug message instead of warning for missing index file on startup
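
The wildcard fix above replaces a regex-based matcher with an iterative one. The changelog does not show the actual implementation, so the following is only a minimal sketch of one common iterative approach (a single backtracking point at the most recent `*`); the function name wildcardMatch is hypothetical. Because every non-`*` character is compared literally, dots in patterns such as *.cpp need no escaping.

```cpp
#include <cstddef>
#include <string_view>

// Sketch of iterative wildcard matching where '*' matches any run of
// characters (including none) and every other character matches itself.
bool wildcardMatch(std::string_view pattern, std::string_view text) {
    constexpr std::size_t npos = std::string_view::npos;
    std::size_t p = 0;          // current position in pattern
    std::size_t t = 0;          // current position in text
    std::size_t starPos = npos; // position of the most recent '*' in pattern
    std::size_t starText = 0;   // text position that '*' is currently matched up to

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            // Remember the star and first try matching it against nothing.
            starPos = p++;
            starText = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (starPos != npos) {
            // Mismatch: let the last '*' absorb one more character and retry.
            p = starPos + 1;
            t = ++starText;
        } else {
            return false;
        }
    }
    // Only trailing stars may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;
    }
    return p == pattern.size();
}
```

For example, wildcardMatch("*.sol", "Token.sol") holds, while wildcardMatch("*.cpp", "maincpp") does not, since the dot must match literally.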

[v0.4.7] - 2025-08-21

Fixed

  • Release Workflow Build Failures
  • Fixed std::from_chars compilation error on macOS hosted runners (Xcode 15.2)
  • Replaced std::from_chars with portable std::stoull in http_adapter_curl.cpp for parsing Content-Length headers
  • Fixed missing benchmark package in Linux self-hosted validation builds
  • Added proper Conan options (-o build_tests=True -o build_benchmarks=True) to validation build step
  • Ensures compatibility with older macOS standard libraries that lack full C++20 support
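
To illustrate the std::from_chars replacement, here is a minimal sketch of parsing a Content-Length value with std::stoull. The function name parseContentLength and the validation rules are assumptions for this example, not the code in http_adapter_curl.cpp. Since std::stoull tolerates leading whitespace and sign characters, the sketch checks for digits explicitly before converting.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse a Content-Length header value.
// Returns std::nullopt for empty, non-numeric, or out-of-range input.
std::optional<std::uint64_t> parseContentLength(const std::string& value) {
    const char* kWhitespace = " \t\r\n";
    const auto first = value.find_first_not_of(kWhitespace);
    if (first == std::string::npos) {
        return std::nullopt; // empty or whitespace only
    }
    const auto last = value.find_last_not_of(kWhitespace);
    const std::string digits = value.substr(first, last - first + 1);

    // std::stoull would accept "-1" or "+5"; require plain decimal digits.
    if (digits.find_first_not_of("0123456789") != std::string::npos) {
        return std::nullopt;
    }
    try {
        return static_cast<std::uint64_t>(std::stoull(digits));
    } catch (const std::out_of_range&) {
        return std::nullopt; // value does not fit in unsigned long long
    }
}
```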

[v0.4.6] - 2025-08-21

Changed

  • MCP Server Tool Naming
  • Simplified tool names to match CLI commands for better consistency
  • Tools now use generic names: search, grep, download, get, list, store, add, delete, cat, update, stats

Fixed

  • Added proper lifetime management for io_uring operations to prevent accessing freed memory
  • Implemented operation tracking and cancellation to ensure clean shutdown
  • Storage Backend Improvements
  • Fixed FilesystemBackend sharding to use hash-based approach for consistent key distribution
  • Replaced key-prefix sharding with SHA256 hash-based sharding to avoid path conflicts
  • Chunking Deduplication
  • Fixed RabinChunker deduplication by resetting window state at chunk boundaries
  • Ensures identical data patterns produce identical chunks for proper deduplication
  • Code Improvements
  • Added automatic configuration correction for invalid chunking configs (when min > max)
  • Enhanced preprocessText() to trim leading/trailing whitespace
  • Improved paragraph boundary detection to point after “\n\n” markers
  • CI/CD
  • Fixed release workflow by changing preset from conan-validation to conan-release
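
The two text-handling items under Code Improvements can be sketched as follows. This is an illustration only: trimWhitespace and nextParagraphBoundary are hypothetical names, and the exact set of whitespace characters that preprocessText() trims is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Sketch: remove leading and trailing whitespace.
// Assumption: whitespace means space, tab, CR, LF, form feed, vertical tab.
std::string trimWhitespace(std::string_view text) {
    constexpr std::string_view kWhitespace = " \t\r\n\f\v";
    const auto first = text.find_first_not_of(kWhitespace);
    if (first == std::string_view::npos) {
        return {}; // nothing but whitespace
    }
    const auto last = text.find_last_not_of(kWhitespace);
    return std::string(text.substr(first, last - first + 1));
}

// Sketch: find the next paragraph boundary at or after `from`.
// The returned index points just after the "\n\n" marker, so the next
// paragraph starts exactly at the boundary. Returns npos if none is found.
std::size_t nextParagraphBoundary(std::string_view text, std::size_t from) {
    const auto pos = text.find("\n\n", from);
    return pos == std::string_view::npos ? std::string_view::npos : pos + 2;
}
```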

Known Issues

  • Temporarily Disabled Tests (to be fixed in v0.5.0)
  • VectorIndexManager: removeVector, index persistence, HNSW operations not implemented
  • ModelManagement: Registry initialization issues in test environment
  • OnnxRuntime: Tests timeout waiting for non-existent model files
  • All disabled tests are marked with TODO(v0.5.0) comments for tracking

[v0.4.5] - 2025-08-21

Fixed

  • MCP Server
  • Added mutex protection to StdioTransport for thread-safe I/O operations
  • Fixed potential JSON response interleaving when multiple clients connect
  • Prevents “Expected ‘,’ or ‘]’ after array element” errors in concurrent scenarios
  • Fixed missing includes for file operations
  • Test Failures
  • Fixed FilesystemBackend::list() key reconstruction from sharded paths
  • Fixed ManifestManager statistics by moving static counters to member variables
  • Fixed file type detection consistency in CommandIntegrationTest
  • Detection Module
  • Ensured FileSignature creation uses consistent methods for isBinary and fileType
  • Fixed mismatch between FileSignature fields and classification methods

[v0.4.4] - 2025-08-21

CI version bump

[v0.4.3] - 2025-08-20

Fixed

  • Build System
  • Updated GTest from 1.14.0 to 1.15.0 for Conan 2.0 compatibility
  • Homebrew Formula
  • Fixed documentation URL in Homebrew formula to point to correct repository

Known Issues

  • Daemon may crash when processing certain stats requests (investigation ongoing)

[v0.4.2] - 2025-08-20

Added

  • Linux Package Support
  • Added AppImage support for universal Linux distribution
  • Integrated package building into GitHub release workflow

Fixed

  • Linux Build Compilation
  • Fixed C++ template name lookup issues in message_serializer.cpp for GCC 13
  • Added namespace qualification to 31 deserialize calls for proper template resolution
  • Fixed missing <utility> header in async_socket.h and connection_pool.h for std::exchange
  • Fixed missing <netinet/in.h> header in async_socket.cpp for IPPROTO_TCP constant
  • Position-Independent Code (-fPIC) Linker Errors
  • Fixed shared library linking errors by enabling PIC for all static libraries in dependency chain
  • Added POSITION_INDEPENDENT_CODE ON property to 11 static libraries
  • Resolved yams_onnx_plugin.so build failure on Linux x86_64
  • Missing Symbol Linker Errors
  • Fixed undefined reference to HybridFuzzySearch by linking yams_metadata to yams::search
  • Resolved circular dependency issues in library linking order
  • CLI Output Cleanup
  • Changed TextExtractorFactory initialization logging from info to debug level
  • Removed spurious log messages from normal CLI output
  • Homebrew Formula
  • Fixed documentation URL in Homebrew formula to point to correct repository
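
For context on the missing <utility> fix: std::exchange is declared in that header, and some standard library versions do not pull it in transitively, which is a typical cause of this kind of GCC-only build error. The sketch below shows a common use of std::exchange in a move-only handle; SocketHandle is a hypothetical type for illustration, not the class defined in async_socket.h.

```cpp
#include <utility> // std::exchange lives here; do not rely on transitive includes

// Sketch of a move-only descriptor wrapper. A real handle would also close
// the descriptor in its destructor and before overwriting it on assignment.
class SocketHandle {
public:
    explicit SocketHandle(int fd = -1) noexcept : fd_(fd) {}

    SocketHandle(const SocketHandle&) = delete;
    SocketHandle& operator=(const SocketHandle&) = delete;

    // Take ownership and leave the source in the "empty" state (-1).
    SocketHandle(SocketHandle&& other) noexcept
        : fd_(std::exchange(other.fd_, -1)) {}

    SocketHandle& operator=(SocketHandle&& other) noexcept {
        if (this != &other) {
            fd_ = std::exchange(other.fd_, -1);
        }
        return *this;
    }

    int get() const noexcept { return fd_; }
    bool valid() const noexcept { return fd_ >= 0; }

private:
    int fd_;
};
```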

[v0.4.1] - 2025-08-20

Added

  • Smart Text Extraction by Default
  • CLI commands now extract text from PDFs, HTML, and other supported formats by default
  • cat command: Shows readable text instead of binary data for PDFs and HTML files
  • get command: Auto-detects output destination - extracts text for terminal, raw for files
  • --raw flag added to both commands for accessing original content when needed
  • --extract flag for get command to force text extraction even when piping

Changed

  • MCP Server Text Extraction
  • Now uses TextExtractorFactory for all supported file types (not just HTML)
  • PDF text extraction works correctly in MCP tools
  • Consistent text extraction behavior across CLI and MCP interfaces

Fixed

  • PDF Text Extraction in MCP
  • Fixed issue where PDF files returned empty content in MCP tools
  • MCP server now properly extracts text from PDFs using PDFium
  • Added missing HtmlTextExtractor to build system
  • Compilation Issues
  • Fixed missing HtmlTextExtractor in CMakeLists.txt
  • Fixed ErrorCode enum references in HTML text extractor
  • Fixed regex_replace lambda usage for C++ standard compliance
  • Fixed IPC Response variant construction
  • Fixed the GitHub Actions release workflow: embedded multi-line Python f-string code is now wrapped in bash here-docs, and the JSON file path is passed via argv in the benchmarks block, which avoids shell syntax errors on runners

[v0.4.0] - 2025-08-20

Added

  • Universal Content Handler System
  • New IContentHandler interface supporting all file types with metadata extraction
  • ContentHandlerRegistry with thread-safe handler management
  • TextContentHandler adapter wrapping existing PlainTextExtractor
  • PdfContentHandler wrapper for existing PdfExtractor
  • BinaryContentHandler as universal fallback for unknown file types
  • Updated DocumentIndexer to use new ContentHandlerRegistry system
  • Maintained backward compatibility with legacy TextExtractor system
  • High-Performance Daemon Architecture
  • New yams-daemon background service for persistent resource management
  • Unix domain socket IPC with zero-copy transfers for large payloads
  • Automatic daemon lifecycle management with configurable policies
  • yams daemon start/stop/status/restart command suite
  • Auto-start on first command if daemon enabled in config
  • Shared EmbeddingGenerator across all operations eliminates model loading overhead
  • VectorIndexManager cached in daemon memory
  • Robust Downloader Module
  • New download module with libcurl adapter, repo-local staging
  • SHA-256 integrity verification and rate limiting
  • Atomic finalize into CAS (store-only by default)
  • New yams download command returning JSON {hash, stored_path, size_bytes}
  • Progress output (human/json) with no user-path writes unless export requested
  • Configuration v2 Architecture
  • New [daemon] section for service configuration
  • [daemon.models] for model lifecycle management
  • [daemon.lifecycle] for auto-start and shutdown policies
  • [daemon.ipc] for communication tuning
  • Backward compatible with direct mode (no daemon)
  • Enhanced Search Capabilities
  • --literal-text flag for search and grep commands
  • Treats query patterns as literal text instead of regex/operators
  • Works across all search engines (fuzzy, hybrid, metadata, vector)
  • Example: yams search "call(" --literal-text safely searches for parentheses

Changed

  • Performance Architecture
  • All CLI commands can leverage shared daemon resources
  • Shared result renderer system across CLI and daemon
  • Deferred initialization eliminates startup overhead
  • MCP Integration Improvements
  • Removed HTTP transport support, now stdio-only for cleaner local integration
  • Improved EmbeddingGenerator lifecycle with lazy loading
  • Better resource management with on-demand initialization

Fixed

  • Daemon Integration Issues
  • Eliminated “Failed to preload model” warnings
  • Fixed daemon configuration defaults for missing config sections
  • Daemon now gracefully handles missing [daemon] sections in config files
  • Configuration System
  • ConfigMigrator properly handles v1 to v2 migrations without breaking existing configs
  • MCP Schema Compliance
  • Fixed “Expected object, received null” error in tool definitions
  • Removed empty properties: {} fields from tools with no parameters
  • MCP server now loads correctly without schema validation errors
  • Search Engine Robustness
  • Special characters in search queries no longer break FTS5
  • Automatic sanitization prevents syntax errors from ()[]{}*" characters
  • All search paths (fuzzy, hybrid, metadata) handle special characters safely
  • Raw query strings flow through pipeline unchanged until FTS5 level
  • Critical Compilation Errors
  • Fixed ChunkingStrategy enum reference mismatch in document chunker (FixedSize → FIXED_SIZE)
  • Fixed constructor initialization order in VectorIndexOptimizer to match member declaration order
  • Fixed compression level configuration in RecoveryManager (now uses level 3 per performance benchmarks)
  • Code Quality Issues
  • Fixed hundreds of uninitialized variable errors identified by cppcheck analysis
  • Eliminated critical errors in recovery_manager.cpp, error_handler.cpp, and metadata_api.cpp
  • Added proper RAII initialization patterns across vector and compression components
  • Build System Stability
  • All modules now compile successfully without errors
  • Fixed warning configurations that were breaking dependency builds
  • Improved cross-platform compilation compatibility
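
The search robustness entries above say that raw queries are sanitized only at the FTS5 level. The changelog does not describe the sanitization itself, so the following is merely one common technique, shown as an assumption: wrapping the query in an FTS5 string literal, where the text is enclosed in double quotes and any embedded double quote is doubled. The name quoteForFts5 is hypothetical.

```cpp
#include <string>
#include <string_view>

// Sketch: turn arbitrary user text into an FTS5 string literal so that
// characters such as ( ) [ ] { } * are not parsed as query syntax.
std::string quoteForFts5(std::string_view query) {
    std::string out;
    out.reserve(query.size() + 2);
    out.push_back('"');
    for (char c : query) {
        if (c == '"') {
            out.push_back('"'); // escape an embedded quote by doubling it
        }
        out.push_back(c);
    }
    out.push_back('"');
    return out;
}
```

The quoted string would still be passed to SQLite as a bound parameter of the MATCH expression rather than concatenated into the SQL text.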