
V0.3.0 raft #1

Open
Tilnel wants to merge 23 commits into master from v0.3.0-raft

@Tilnel Tilnel commented Mar 14, 2026

No description provided.

Tilnel and others added 23 commits March 12, 2026 08:07
… fixes

Major changes:

1. Raft Test Enhancements (examples/raft/):
   - Add snapshot support: send_snapshot callback, RAFT_MSG_SNAPSHOT handling
   - Add automatic entry generation for leader every 2 periods
   - Auto-trigger snapshot when committed entries > 5
   - Implement comprehensive safety checker (check.cpp) with:
     * Term monotonicity check (prevents stale snapshot bug)
     * Leader uniqueness check
     * Snapshot term consistency validation
   - Add Raft message parser for readable syscall tracing

2. Core State Management (src/core/):
   - Add raft_check_state to tracee_state for BFS-correct checking
   - Add FdManager state serialization to preserve fd allocation
   - Fix memory region permission preservation (prot flags)
   - Improve cmd_diff to compare all state components including:
     * Raft state, FdManager state, filesystem, sockets
   - Add proper copy/move constructors for state structures

3. Build System:
   - Add raft_msg_parser.cpp to build
   - Add -D_FORTIFY_SOURCE=0 to avoid format string crashes

Known Issues:
- Deep state comparison may crash with '***%n in writable section detected***'
  This is a FORTIFY_SOURCE format string protection triggering on raft output.
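
The term monotonicity and leader uniqueness checks listed under item 1 could look roughly like the sketch below. It is not the code in check.cpp: `NodeView` and `SafetyChecker` are hypothetical names, and the fields are assumptions about what such a checker would read from each node.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical per-node view of the Raft fields the checker looks at.
struct NodeView {
    int id;
    std::uint64_t term;
    bool is_leader;
};

// Remembers the last term seen per node so a regression can be reported.
class SafetyChecker {
public:
    // Returns an error message if an invariant is violated, std::nullopt otherwise.
    std::optional<std::string> check(const std::vector<NodeView>& nodes) {
        std::map<std::uint64_t, int> leader_of_term;
        for (const NodeView& n : nodes) {
            // Term monotonicity: a node's current term must never decrease.
            auto it = last_term_.find(n.id);
            if (it != last_term_.end() && n.term < it->second) {
                return "term regressed on node " + std::to_string(n.id);
            }
            last_term_[n.id] = n.term;
            // Leader uniqueness: at most one leader per term.
            if (n.is_leader) {
                auto [pos, inserted] = leader_of_term.emplace(n.term, n.id);
                if (!inserted && pos->second != n.id) {
                    return "two leaders in term " + std::to_string(n.term);
                }
            }
        }
        return std::nullopt;
    }

private:
    std::map<int, std::uint64_t> last_term_;
};
```

The checker is a plain copyable value, so a copy can be kept alongside each explored state. That is one way to read the "BFS-correct checking" note in item 2: the remembered terms belong to the state being explored, not to a global.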

Co-authored-by: kimi
  - Migrate to 64-bit hash (XXH64) for state identification
  - Optimize StateStore memory management (L1/L2 eviction, malloc_trim support)
  - Fix ZSTD memory bloat via context caching
  - Integrate SysStateStore for cmd_load operations
  - Add depth-based state space analysis with exponential fitting
  - Introduce benchmarks for StateStore, SysStateStore, and XXHash
  - Add memory fragmentation diagnostic tools
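
For the depth-based analysis, one common way to do an exponential fit is a least-squares line through the logarithm of the per-depth state counts. The sketch below assumes that approach; the commit does not say which method is used. `ExpFit`, `fit_exponential`, and the input layout are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Model: count(depth) ~= a * b^depth.
struct ExpFit {
    double a;
    double b;
};

// Least-squares fit of ln(count) = ln(a) + depth * ln(b).
// counts[d] is the number of states found at depth d (assumed layout).
// Requires at least two depths and strictly positive counts.
ExpFit fit_exponential(const std::vector<double>& counts) {
    const std::size_t n = counts.size();
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t d = 0; d < n; ++d) {
        const double x = static_cast<double>(d);
        const double y = std::log(counts[d]);
        sx += x;
        sy += y;
        sxx += x * x;
        sxy += x * y;
    }
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    const double intercept = (sy - slope * sx) / n;
    return {std::exp(intercept), std::exp(slope)};
}
```

The fitted `b` estimates the branching factor per depth level, which gives a rough idea of how many states the next depth would add.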
Add full-featured terminal UI with scrollable log view, mouse selection,
prompt interface, and real-time status display. Refactor logging to use
UI-aware macros. Fix StateStore async write consistency on SIGINT.
  - Fix struct padding in message parser (AE/RVR/RV/AER)
  - Fix snapshot membership restoration after load
  - Add test_snapshot_leader_bug test cases
  - Add verify_sstate integrity tool
  - Replace cJSON with nlohmann/json (header-only, modern C++)
  - Update Makefile: remove -lcjson, add build/ directory support
  - Clean up: remove unused memory_analysis.md and test_ui.sh
  - Add proc_status.h with DIFEXITED/DIFCRASHED macros for tracee status
  - Replace exited[NP] with status[NP] in sys_state
  - Handle fatal signals (SIGSEGV/SIGABRT): backtrace, checkpoint, mark
  as crashed
  - Fix exit/exit_group to properly set exit status
- Rename cmd_c to cmd_bfs, use BFS as default auto mode
- Add DFS search algorithm (cmd_dfs + exec_dfs/do_dfs)
- Add random search command skeleton (cmd_rand + exec_rand)
- Add enable/disable_prefetch APIs to StateStore
- Remove depth_stats.txt file operations
Scheduler:
- Implement exec_rand() for depth-limited random walk search
- Add g_traces_searched counter and display in status monitor
- Set explicit search mode (BFS/RAND) in exec functions
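
A depth-limited random walk of the kind exec_rand() is described as doing can be sketched as follows. This is an illustration only: `State`, `Successors`, and `random_walk` are hypothetical names, and the real scheduler works on tracee states rather than integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

using State = std::uint64_t;  // stand-in for a state hash
using Successors = std::function<std::vector<State>(State)>;

// Performs one random walk from `start`, taking at most `max_depth` steps.
// Returns the visited trace, including the start state.
std::vector<State> random_walk(State start, const Successors& next,
                               std::size_t max_depth, std::mt19937& rng) {
    std::vector<State> trace{start};
    State cur = start;
    for (std::size_t depth = 0; depth < max_depth; ++depth) {
        std::vector<State> succ = next(cur);
        if (succ.empty()) break;  // dead end: no enabled transition
        std::uniform_int_distribution<std::size_t> pick(0, succ.size() - 1);
        cur = succ[pick(rng)];
        trace.push_back(cur);
    }
    return trace;
}
```

Calling this in a loop and counting the completed walks would give a figure comparable to the g_traces_searched counter mentioned above.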

Raft example:
- Fix node recovery after snapshot: handle raft_add_node() NULL return
  by falling back to raft_get_node() when node already exists
- traceviz.py: create visible message sequence chart
- trace2batch.py: generate batch script from trace
- add more checks to raft
- submodule redisraft
- add checkers for redisraft (buggy)
- fix: `typedef struct raft_node raft_node_t;` could override the cached type
  entry for the definition of `struct raft_node`, which broke Eval.
- fix raft_msg_parser for redisraft
- add dwarf test (it did not help with the fix)
- redisraft/raft/main.c for checking
- fix check.cpp for safety check
- add state_store_packed.cpp
- also, traceviz.py can generate send/recv sequence now
- disable redisraft log_poll for committed log checking (not ideal)
- process level incremental save and load
- optimize state store
- breakdown scheduler: add state_transition.cpp
- patch AT_RANDOM
- patch _dl_x86_cpu_features (cpuid)
- emulate getcpu()
- remove unused functions
- optimize: load promotes a disk entry to L2, then decompresses it and
  promotes it to L1
- change status box information
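
The load path described above (disk entry promoted to L2, then decompressed and promoted to L1) can be sketched as a small tiered lookup. Everything here is hypothetical: `TieredStore` is not the project's StateStore, the maps stand in for real storage, and `fake_compress`/`fake_decompress` are placeholders so the example runs without ZSTD. Eviction is left out.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

using Hash = std::uint64_t;
using Blob = std::string;  // stand-in for a byte buffer

// Placeholder codec: "compression" only tags the data.
inline Blob fake_compress(const Blob& raw) { return "Z" + raw; }
inline Blob fake_decompress(const Blob& packed) { return packed.substr(1); }

class TieredStore {
public:
    void put_on_disk(Hash h, const Blob& raw) { disk_[h] = fake_compress(raw); }

    // Lookup order: L1 (decompressed) -> L2 (compressed, in memory) -> disk.
    std::optional<Blob> load(Hash h) {
        if (auto it = l1_.find(h); it != l1_.end()) return it->second;
        auto l2 = l2_.find(h);
        if (l2 == l2_.end()) {
            auto d = disk_.find(h);
            if (d == disk_.end()) return std::nullopt;
            l2 = l2_.emplace(h, d->second).first;  // promote disk entry to L2
        }
        Blob raw = fake_decompress(l2->second);
        l1_[h] = raw;  // then promote the decompressed copy to L1
        return raw;
    }

    bool in_l1(Hash h) const { return l1_.count(h) != 0; }
    bool in_l2(Hash h) const { return l2_.count(h) != 0; }

private:
    std::unordered_map<Hash, Blob> l1_, l2_, disk_;
};
```

Keeping the compressed copy in L2 means a later L1 eviction costs only a decompression, not another disk read.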
Major features:
- Full TCP socket lifecycle: socket, bind, listen, accept, connect, send, recv, close
- Epoll support: epoll_create, epoll_ctl, epoll_wait with proper event handling
- Address-based routing using config addrs (handles 0.0.0.0 correctly)
- Cross-process socket communication with real IP mapping
- Pipe support: pipe/pipe2 for inter-process communication
- Mmap support for VFS files (host allocates, copies to tracee)
- Socket state display in monitor (info sock)
- Thread names for better debugging in htop
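
One way address-based routing with a 0.0.0.0 wildcard could work is sketched below. The commit does not show the actual lookup, so `Listener`, `route_connect`, and the idea that `config_addrs` is indexed by process id are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct Listener {
    int pid;               // process that owns the listening socket
    std::string bound_ip;  // address passed to bind(), may be "0.0.0.0"
    std::uint16_t port;
};

// config_addrs[pid] is the address assigned to that process (assumed layout).
// Returns the pid that should receive a connect() to dst_ip:dst_port.
std::optional<int> route_connect(const std::vector<std::string>& config_addrs,
                                 const std::vector<Listener>& listeners,
                                 const std::string& dst_ip,
                                 std::uint16_t dst_port) {
    for (const Listener& l : listeners) {
        if (l.port != dst_port) continue;
        if (l.pid < 0 || static_cast<std::size_t>(l.pid) >= config_addrs.size()) continue;
        // The destination must be the owner's configured address...
        if (config_addrs[l.pid] != dst_ip) continue;
        // ...and the socket must be bound to that address or to the wildcard.
        if (l.bound_ip == dst_ip || l.bound_ip == "0.0.0.0") return l.pid;
    }
    return std::nullopt;
}
```

Without the configured address, two processes that both bind 0.0.0.0 on the same port would be indistinguishable; the per-process address is what resolves the wildcard.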

Bug fixes:
- Fix original_size race condition in state_store
- Fix offset mismatch in packed storage
- Fix epoll_wait to skip closed fd events
- Fix close() to return 0 for sockets
- Fix recv() to return EOF (0) when peer closes

Files modified:
- state_transition.cpp: Add handlers for epoll, accept, pipe, mmap
- sockstate.cpp/h: Full TCP implementation with address routing
- fsstate.cpp/h: Add mmap, pipe, chdir support
- monitor.cpp: Enhanced socket state display
- scheduler.cpp: Thread naming
- fd_manager.h: Add PIPE type
- syscall_fmt.cpp: Add formatters for new syscalls
- check_state() used to consume ~30% of the time in dfs/bfs/rand load; now
  optimized to 10% with a dwarf type cache, an expr parse cache, and batched memory reads.
- optimize tracee_state::save_full_state_to_state_store(): replacing the
  vector with a pre-allocated uint8_t[] cancels the `memset` initialization,
  which the compiler could not optimize away, and saves `memcpy` calls.
- NOTE: also edited redis/fs_root; set env `LC_ALL=C` to remove redis's
  dependence on the locale.
- dep: use libxxhash instead of xxhash.h
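
The vector-to-uint8_t[] change above relies on a C++ detail: `std::vector<uint8_t>(n)` value-initializes (zero-fills) its elements, while `new uint8_t[n]` leaves them uninitialized. The sketch below shows the idea; `RawBuffer` is a hypothetical type, not the one used in save_full_state_to_state_store().

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// A byte buffer whose storage is not zero-filled on allocation.
struct RawBuffer {
    std::unique_ptr<std::uint8_t[]> data;
    std::size_t size = 0;
    std::size_t used = 0;

    explicit RawBuffer(std::size_t n)
        : data(new std::uint8_t[n]),  // default-initialized: no memset, unlike vector(n)
          size(n) {}

    // Copies one region to the write cursor; returns false if it does not fit.
    bool append(const void* src, std::size_t len) {
        if (len > size - used) return false;
        std::memcpy(data.get() + used, src, len);
        used += len;
        return true;
    }
};
```

Each region is copied once, straight into its final position, so no intermediate buffers are filled and then copied again.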