Changelog Archive: v0.4.x Series
[v0.4.8] - 2025-08-21
Added
- Shell Wildcard Expansion Support
- add command now accepts multiple file paths from shell expansion
- Commands like yams add *.md now work correctly without “arguments were not expected” errors
- Properly handles shell wildcard expansion (e.g., *.cpp, *.md, *.txt)
- Dynamic Plugin Loading
- Implemented PluginLoader class for runtime plugin loading
- Added automatic plugin discovery from standard directories
- Support for YAMS_PLUGIN_DIR environment variable
- ONNX model provider now loads dynamically as a plugin
- Daemon automatically loads plugins on startup
- Created comprehensive test suite for plugin loader
Fixed
- Tag Storage and Filtering
- Fixed tag filtering in list command not working (--tags option)
- Tags are now properly parsed as comma-separated values when stored
- Fixed tag filtering logic to correctly match against stored comma-separated tag values
- Multiple tags in filter work with OR logic (e.g., --tags "work,important" shows documents with either tag)
- Daemon Path Resolution
- Fixed “Added 0 documents” issue when using relative paths with daemon
- CLI now converts relative paths to absolute paths before sending to daemon
- Added proper error messages for invalid paths and non-recursive directory attempts
- Daemon now returns helpful error messages instead of silently returning 0
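The relative-path fix above can be illustrated with a minimal C++ sketch. It assumes the CLI resolves paths against its own working directory before building the daemon request. The helper name toAbsoluteForDaemon is hypothetical and is not taken from the YAMS source.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the YAMS code base): resolve a user-supplied
// path against the CLI's current working directory, so a daemon running with
// a different working directory still refers to the same file.
// Returns an empty path when the input cannot be resolved.
fs::path toAbsoluteForDaemon(const std::string& userPath) {
    std::error_code ec;
    fs::path abs = fs::absolute(fs::path(userPath), ec);
    if (ec) {
        return {};
    }
    // Collapse "." and ".." components without touching the filesystem.
    return abs.lexically_normal();
}
```

A caller would send the returned path to the daemon and report an error to the user when the result is empty, rather than letting the request silently match nothing.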
- Feature Parity Between CLI and Daemon
- Added all missing fields to GrepRequest for complete CLI feature parity
- Added all missing fields to SearchRequest for complete CLI feature parity
- Updated serialization/deserialization for both request types
- CLI commands now pass all parameters to daemon requests
- Ensures consistent behavior whether using daemon or local execution
- Wildcard Pattern Matching Bug
- Fixed broken wildcard pattern matching in add and restore commands
- Replaced regex-based implementation with efficient iterative algorithm
- Patterns like *.sol, *.cpp, *.js now work correctly
- Dots and other special characters in patterns are now handled properly
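For readers unfamiliar with the technique, the following is a minimal sketch of an iterative wildcard matcher. It is the common two-pointer approach with backtracking to the most recent '*', and is not necessarily the algorithm used in YAMS. The function name wildcardMatch and the support for '?' are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Iterative glob match: '*' matches any run of characters (including none),
// '?' matches exactly one character, and every other character, including
// '.', is compared literally. No regex engine is involved.
bool wildcardMatch(std::string_view pattern, std::string_view text) {
    std::size_t p = 0;
    std::size_t t = 0;
    std::size_t starPos = std::string_view::npos; // index of the last '*' seen
    std::size_t resumeAt = 0;                     // text index to retry from

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            // Remember the star and first try matching it against nothing.
            starPos = p++;
            resumeAt = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starPos != std::string_view::npos) {
            // Mismatch: let the last '*' absorb one more character and retry.
            p = starPos + 1;
            t = ++resumeAt;
        } else {
            return false;
        }
    }
    // Only trailing '*' characters may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;
    }
    return p == pattern.size();
}
```

Because the dot is treated as an ordinary character, *.cpp matches main.cpp but not maincpp, which is the kind of case a pattern translated naively into a regex can get wrong.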
- Linux Build Errors
- Fixed missing #include <cstring> in compression_benchmark.cpp for std::strlen
- Fixed C++ compiler flag -Wnon-virtual-dtor being incorrectly applied to C files
- Used CMake generator expression $<$<COMPILE_LANGUAGE:CXX>:-Wnon-virtual-dtor> for language-specific flags
- Fixed missing Rabin chunker header include in ingestion_benchmark.cpp
- Fixed GCOptions struct initializer warnings by adding progressCallback field
- Fixed query parser benchmark Result access patterns (use .value() instead of *result)
- Fixed metadata benchmark Database constructor usage pattern
- Fixed benchmark API calls to use storeBytes() instead of non-existent addContent()
- Vector Index Loading
- Fixed empty error message when vector index file doesn’t exist
- Added proper file existence check before attempting to load
- Shows debug message instead of warning for missing index file on startup
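The vector index fix above amounts to checking for the file before trying to load it. The sketch below shows one way to do that, assuming a std::filesystem check and a non-empty message for every outcome. The names IndexFileStatus, IndexFileCheck, and checkIndexFile are hypothetical and do not come from the YAMS source.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical result types, used only for this illustration.
enum class IndexFileStatus { Present, Missing, Error };

struct IndexFileCheck {
    IndexFileStatus status;
    std::string message; // never empty, so the caller always has text to log
};

// Check for the index file before attempting to load it. A missing file is
// expected on first startup, so it is reported separately from real errors
// and the caller can log it at debug level instead of as a warning.
IndexFileCheck checkIndexFile(const std::filesystem::path& indexPath) {
    std::error_code ec;
    if (!std::filesystem::exists(indexPath, ec)) {
        if (ec) {
            return {IndexFileStatus::Error,
                    "cannot access " + indexPath.string() + ": " + ec.message()};
        }
        return {IndexFileStatus::Missing,
                "vector index not found at " + indexPath.string()};
    }
    return {IndexFileStatus::Present,
            "vector index found at " + indexPath.string()};
}
```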
[v0.4.7] - 2025-08-21
Fixed
- Release Workflow Build Failures
- Fixed std::from_chars compilation error on macOS hosted runners (Xcode 15.2)
- Replaced std::from_chars with portable std::stoull in http_adapter_curl.cpp for parsing Content-Length headers
- Fixed missing benchmark package in Linux self-hosted validation builds
- Added proper Conan options (-o build_tests=True -o build_benchmarks=True) to validation build step
- Ensures compatibility with older macOS standard libraries that lack full C++20 support
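Swapping std::from_chars for std::stoull changes the error handling, because std::stoull throws exceptions, skips leading whitespace, and accepts a sign. The sketch below shows one defensive way to parse a Content-Length value with it. The helper parseContentLength is hypothetical and is not the code in http_adapter_curl.cpp.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse the value of a Content-Length header.
// Returns std::nullopt for empty, signed, non-numeric, or overflowing input.
std::optional<unsigned long long> parseContentLength(const std::string& value) {
    // std::stoull would accept "-1" (wrapping it around) and "+5", so require
    // a digit as the first character after optional spaces or tabs.
    const std::size_t first = value.find_first_not_of(" \t");
    if (first == std::string::npos || value[first] < '0' || value[first] > '9') {
        return std::nullopt;
    }
    try {
        std::size_t consumed = 0;
        const unsigned long long n =
            std::stoull(value.substr(first), &consumed, 10);
        // Anything other than trailing whitespace after the digits is invalid.
        if (value.find_first_not_of(" \t\r\n", first + consumed) !=
            std::string::npos) {
            return std::nullopt;
        }
        return n;
    } catch (const std::invalid_argument&) {
        return std::nullopt;
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}
```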
[v0.4.6] - 2025-08-21
Changed
- MCP Server Tool Naming
- Simplified tool names to match CLI commands for better consistency
- Tools now use generic names: search, grep, download, get, list, store, add, delete, cat, update, stats
Fixed
- Added proper lifetime management for io_uring operations to prevent accessing freed memory
- Implemented operation tracking and cancellation to ensure clean shutdown
- Storage Backend Improvements
- Fixed FilesystemBackend sharding to use hash-based approach for consistent key distribution
- Replaced key-prefix sharding with SHA256 hash-based sharding to avoid path conflicts
- Chunking Deduplication
- Fixed RabinChunker deduplication by resetting window state at chunk boundaries
- Ensures identical data patterns produce identical chunks for proper deduplication
- Code Improvements
- Added automatic configuration correction for invalid chunking configs (when min > max)
- Enhanced preprocessText() to trim leading/trailing whitespace
- Improved paragraph boundary detection to point after “\n\n” markers
- CI/CD
- Fixed release workflow by changing preset from conan-validation to conan-release
Known Issues
- Temporarily Disabled Tests (to be fixed in v0.5.0)
- VectorIndexManager: removeVector, index persistence, HNSW operations not implemented
- ModelManagement: Registry initialization issues in test environment
- OnnxRuntime: Tests timeout waiting for non-existent model files
- All disabled tests are marked with TODO(v0.5.0) comments for tracking
[v0.4.5] - 2025-08-21
Fixed
- MCP Server
- Added mutex protection to StdioTransport for thread-safe I/O operations
- Fixed potential JSON response interleaving when multiple clients connect
- Prevents “Expected ‘,’ or ‘]’ after array element” errors in concurrent scenarios
- Fixed missing includes for file operations
- Test Failures
- Fixed FilesystemBackend::list() key reconstruction from sharded paths
- Fixed ManifestManager statistics by moving static counters to member variables
- Fixed file type detection consistency in CommandIntegrationTest
- Detection Module
- Ensured FileSignature creation uses consistent methods for isBinary and fileType
- Fixed mismatch between FileSignature fields and classification methods
[v0.4.4] - 2025-08-21
CI version bump
[v0.4.3] - 2025-08-20
Fixed
- Build System
- Updated GTest from 1.14.0 to 1.15.0 for Conan 2.0 compatibility
- Homebrew Formula
- Fixed documentation URL in Homebrew formula to point to correct repository
Known Issues
- Daemon may crash when processing certain stats requests (investigation ongoing)
[v0.4.2] - 2025-08-20
Added
- Linux Package Support
- Added AppImage support for universal Linux distribution
- Integrated package building into GitHub release workflow
Fixed
- Linux Build Compilation
- Fixed C++ template name lookup issues in message_serializer.cpp for GCC 13
- Added namespace qualification to 31 deserialize calls for proper template resolution
- Fixed missing <utility> header in async_socket.h and connection_pool.h for std::exchange
- Fixed missing <netinet/in.h> header in async_socket.cpp for IPPROTO_TCP constant
- Position-Independent Code (-fPIC) Linker Errors
- Fixed shared library linking errors by enabling PIC for all static libraries in dependency chain
- Added POSITION_INDEPENDENT_CODE ON property to 11 static libraries
- Resolved yams_onnx_plugin.so build failure on Linux x86_64
- Missing Symbol Linker Errors
- Fixed undefined reference to HybridFuzzySearch by linking yams_metadata to yams::search
- Resolved circular dependency issues in library linking order
- CLI Output Cleanup
- Changed TextExtractorFactory initialization logging from info to debug level
- Removed spurious log messages from normal CLI output
- Homebrew Formula
- Fixed documentation URL in Homebrew formula to point to correct repository
[v0.4.1] - 2025-08-20
Added
- Smart Text Extraction by Default
- CLI commands now extract text from PDFs, HTML, and other supported formats by default
- cat command: Shows readable text instead of binary data for PDFs and HTML files
- get command: Auto-detects output destination - extracts text for terminal, raw for files
- --raw flag added to both commands for accessing original content when needed
- --extract flag for get command to force text extraction even when piping
Changed
- MCP Server Text Extraction
- Now uses TextExtractorFactory for all supported file types (not just HTML)
- PDF text extraction works correctly in MCP tools
- Consistent text extraction behavior across CLI and MCP interfaces
Fixed
- PDF Text Extraction in MCP
- Fixed issue where PDF files returned empty content in MCP tools
- MCP server now properly extracts text from PDFs using PDFium
- Added missing HtmlTextExtractor to build system
- Compilation Issues
- Fixed missing HtmlTextExtractor in CMakeLists.txt
- Fixed ErrorCode enum references in HTML text extractor
- Fixed regex_replace lambda usage for C++ standard compliance
- Fixed IPC Response variant construction
- GitHub Actions release workflow: wrap embedded multi-line Python f-string code in bash here-docs and pass JSON file path via argv in the benchmarks block to avoid shell syntax errors on runners.
[v0.4.0] - 2025-08-20
Added
- Universal Content Handler System
- New IContentHandler interface supporting all file types with metadata extraction
- ContentHandlerRegistry with thread-safe handler management
- TextContentHandler adapter wrapping existing PlainTextExtractor
- PdfContentHandler wrapper for existing PdfExtractor
- BinaryContentHandler as universal fallback for unknown file types
- Updated DocumentIndexer to use new ContentHandlerRegistry system
- Maintained backward compatibility with legacy TextExtractor system
- High-Performance Daemon Architecture
- New yams-daemon background service for persistent resource management
- Unix domain socket IPC with zero-copy transfers for large payloads
- Automatic daemon lifecycle management with configurable policies
- yams daemon start/stop/status/restart command suite
- Auto-start on first command if daemon enabled in config
- Shared EmbeddingGenerator across all operations eliminates model loading overhead
- VectorIndexManager cached in daemon memory
- Robust Downloader Module
- New download module with libcurl adapter, repo-local staging
- SHA-256 integrity verification and rate limiting
- Atomic finalize into CAS (store-only by default)
- New yams download command returning JSON {hash, stored_path, size_bytes}
- Progress output (human/json) with no user-path writes unless export requested
- Configuration v2 Architecture
- New [daemon] section for service configuration
- [daemon.models] for model lifecycle management
- [daemon.lifecycle] for auto-start and shutdown policies
- [daemon.ipc] for communication tuning
- Backward compatible with direct mode (no daemon)
- Enhanced Search Capabilities
- --literal-text flag for search and grep commands
- Treats query patterns as literal text instead of regex/operators
- Works across all search engines (fuzzy, hybrid, metadata, vector)
- Example: yams search "call(" --literal-text safely searches for parentheses
Changed
- Performance Architecture
- All CLI commands can leverage shared daemon resources
- Shared result renderer system across CLI and daemon
- Deferred initialization eliminates startup overhead
- MCP Integration Improvements
- Removed HTTP transport support, now stdio-only for cleaner local integration
- Improved EmbeddingGenerator lifecycle with lazy loading
- Better resource management with on-demand initialization
Fixed
- Daemon Integration Issues
- Eliminated “Failed to preload model” warnings
- Fixed daemon configuration defaults for missing config sections
- Daemon now gracefully handles missing [daemon] sections in config files
- Configuration System
- ConfigMigrator properly handles v1 to v2 migrations without breaking existing configs
- MCP Schema Compliance
- Fixed “Expected object, received null” error in tool definitions
- Removed empty properties: {} fields from tools with no parameters
- MCP server now loads correctly without schema validation errors
- Search Engine Robustness
- Special characters in search queries no longer break FTS5
- Automatic sanitization prevents syntax errors from ()[]{}*" characters
- All search paths (fuzzy, hybrid, metadata) handle special characters safely
- Raw query strings flow through pipeline unchanged until FTS5 level
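The changelog does not say how the sanitization is implemented. One common approach, sketched here as an assumption, is to wrap the raw query in double quotes at the FTS5 boundary so it is parsed as a single string literal, doubling any embedded double quote as the FTS5 query syntax requires. The helper name quoteForFts5 is hypothetical.

```cpp
#include <string>

// Hypothetical helper: turn a raw user query into one FTS5 string literal.
// Inside the quotes, characters such as ( ) [ ] { } * are no longer treated
// as query syntax. An embedded double quote is escaped by doubling it.
std::string quoteForFts5(const std::string& raw) {
    std::string out;
    out.reserve(raw.size() + 2);
    out.push_back('"');
    for (char c : raw) {
        if (c == '"') {
            out.push_back('"'); // "" represents one literal quote in FTS5
        }
        out.push_back(c);
    }
    out.push_back('"');
    return out;
}
```

Quoting at the last step keeps the raw string intact for the other search paths. The trade-off is that a quoted query is matched as a phrase, so FTS5 operators and prefix matching are disabled for that query.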
- Critical Compilation Errors
- Fixed ChunkingStrategy enum reference mismatch in document chunker (FixedSize → FIXED_SIZE)
- Fixed constructor initialization order in VectorIndexOptimizer to match member declaration order
- Fixed compression level configuration in RecoveryManager (now uses level 3 per performance benchmarks)
- Code Quality Issues
- Fixed hundreds of uninitialized variable errors identified by cppcheck analysis
- Eliminated critical errors in recovery_manager.cpp, error_handler.cpp, and metadata_api.cpp
- Added proper RAII initialization patterns across vector and compression components
- Build System Stability
- All modules now compile successfully without errors
- Fixed warning configurations that were breaking dependency builds
- Improved cross-platform compilation compatibility