feat: add devnet-runner and devnet-log-review Claude skills #130
MegaRedHand merged 1 commit into main
Add project-level skills for managing local devnets and analyzing their logs. devnet-runner: orchestrates lean-quickstart devnets from the repo root, with validator config management, timed runs, and automatic log collection. devnet-log-review: analyzes devnet logs with scripts for error counting, block production stats, consensus progress, and reference guides for fork analysis, finalization debugging, and error classification.
🤖 Kimi Code Review
This PR adds comprehensive devnet tooling and documentation for lean consensus testing. The changes are well-structured and provide valuable infrastructure for debugging consensus issues. Here are my observations:
Strengths
Minor Issues
1. Script portability in
🤖 Codex Code Review
Summary
Findings
Consensus-layer review
If you want, I can propose a small patch to harden
Automated review by OpenAI Codex · custom prompt
🤖 Claude Code Review
Based on my review of PR #130, here's my analysis:
PR #130 Review: Add devnet-runner and devnet-log-review Claude Skills
Summary
This PR adds comprehensive Claude AI skills for managing local devnets and analyzing their logs. The code is well-structured with clear documentation and follows good shell scripting practices. I found several issues ranging from minor to moderate severity that should be addressed.
Issues Found
1. Script Portability & Robustness Issues
run-devnet-with-timeout.sh:12-19 - Path resolution could fail in edge cases
    REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
run-devnet-with-timeout.sh:34 - Silent kill may leave orphaned processes
    kill $PID 2>/dev/null
Suggested fix:
    # Graceful shutdown
    kill -TERM $PID 2>/dev/null
    sleep 2
    kill -KILL $PID 2>/dev/null || true
    wait $PID 2>/dev/null || true
    # Stop containers
    cd "$QUICKSTART_DIR"
    NETWORK_DIR=local-devnet ./spin-node.sh --node all --stop 2>/dev/null || true
2. Security & Input Validation
run-devnet-with-timeout.sh:7-10 - No validation of timeout parameter
    if [ -z "$1" ]; then
        echo "Usage: $0 <seconds>"
        exit 1
    fi
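The check quoted above only rejects an empty argument, so a value such as `abc` or `0` would still reach `sleep`. A minimal sketch of a stricter check follows. The helper name `validate_timeout` is hypothetical and not part of the PR, and the sketch assumes the timeout is meant to be a positive whole number of seconds.

```shell
# Hypothetical helper (not in the PR): succeed only for a positive integer.
validate_timeout() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -gt 0 ]                 # reject zero
}

# Possible use at the top of a script:
#   validate_timeout "$1" || { echo "Usage: $0 <seconds>" >&2; exit 1; }
```

The `case` pattern avoids relying on `sleep` to report a malformed value after the devnet has already started.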
Multiple scripts - Use of
3. Error Handling Issues
analyze-logs.sh:58-59 - Complex error counting could silently fail
    errors=$(grep -i "error" "$f" 2>/dev/null | grep -cvE "manifest unknown|NoFinalizedStateFound|HandshakeTimedOut" 2>/dev/null) || errors=0
count-errors-warnings.sh:19-26 -
    result=$(grep -i "$pattern" "$file" 2>/dev/null | grep -cvE "$exclude" 2>/dev/null) || result=0
    echo "${result:-0}"
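Both snippets rely on `grep -c` exiting non-zero when the count is zero, which is why they need the `|| ...=0` fallback. One alternative, sketched here under assumptions, counts with `wc -l` so that a number is always printed. The function name `count_matches` is illustrative only. The empty-exclude branch guards against `grep -vE ""`, which would filter out every line.

```shell
# Hypothetical helper (not in the PR): count lines in a file that match
# a pattern (case-insensitive), ignoring lines matching a benign-pattern regex.
count_matches() {
    pattern=$1
    exclude=$2
    file=$3
    if [ -n "$exclude" ]; then
        grep -i -- "$pattern" "$file" 2>/dev/null \
            | grep -vE -- "$exclude" \
            | wc -l | tr -d ' '    # wc always prints a number; tr trims padding
    else
        # An empty regex matches every line, so skip the exclusion step entirely.
        grep -i -- "$pattern" "$file" 2>/dev/null | wc -l | tr -d ' '
    fi
}
```

A caller could then write `errors=$(count_matches error "$benign" "$f")` without a fallback assignment, where `benign` is whatever shared pattern the scripts settle on.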
4. Race Conditions & Timing Issues
run-devnet-with-timeout.sh:22-24 - Race condition in process management
    ./spin-node.sh --node all --generateGenesis 2>&1 | tee "$REPO_ROOT/devnet.log" &
    PID=$!
    sleep "$1"
Suggested fix:
    # Use process substitution or job control
    cd "$QUICKSTART_DIR"
    NETWORK_DIR=local-devnet ./spin-node.sh --node all --generateGenesis 2>&1 | tee "$REPO_ROOT/devnet.log" &
    SPIN_PID=$!
    sleep "$1"
    # Kill the actual spin-node.sh process
    kill $SPIN_PID 2>/dev/null
5. Data Quality & Log Analysis Issues
count-blocks.sh:54-88 - Client detection could fail with unconventional naming
    client="${node%_*}"
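To make the naming concern concrete: `${node%_*}` removes the shortest suffix that starts with an underscore, and leaves the value unchanged when there is no underscore at all. The sketch below wraps that expansion in a hypothetical `client_of` helper that reports `unknown` in the no-underscore case. The node names are invented for illustration and are not taken from the PR.

```shell
# Hypothetical helper (not in the PR): derive a client name from a node name
# of the assumed form <client>_<index>.
client_of() {
    node=$1
    case "$node" in
        *_*) printf '%s\n' "${node%_*}" ;;   # strip the last "_<suffix>"
        *)   printf '%s\n' "unknown" ;;      # no underscore: nothing to strip
    esac
}

client_of zeam_0        # illustrative name
client_of my_client_1   # only the final "_1" is removed
client_of standalone    # falls through to "unknown"
```

Whether an explicit `unknown` bucket is preferable to passing the raw name through depends on how the block-counting patterns are keyed, which the quoted excerpt does not show.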
show-errors.sh:37 - ANSI strip function uses basic sed
    strip_ansi() {
        sed 's/\x1b\[[0-9;]*m//g'
    }
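The rest of this comment is not shown, so the following is an assumption about one possible concern: the `\x1b` escape is understood by GNU sed but is not part of POSIX sed, so the pattern may not match on other implementations. A sketch that builds the ESC byte with `printf` instead:

```shell
# Sketch only: same substitution as above, with the ESC byte produced by
# printf so the sed expression does not depend on "\x1b" being understood.
ESC=$(printf '\033')

strip_ansi() {
    # Removes SGR colour sequences of the form ESC [ <digits;...> m
    sed "s/${ESC}\[[0-9;]*m//g"
}
```

Like the original, this handles only colour/style sequences ending in `m`. Cursor-movement and other escape sequences would pass through.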
6. Documentation Issues
SKILL.md files - Inconsistent timing calculations
SKILL.md references to ports 9001-9008 - No validation that ports are available
7. Minor Code Quality Issues
Multiple scripts - Inconsistent shebang options
count-blocks.sh:16 - ANSI strip is defined but could be more efficient
    strip_ansi < "$file" | grep -cE "$pattern"
analyze-logs.sh:16-21 - Uses shopt -s nullglob
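For context on what `nullglob` changes: without it, a glob that matches nothing is left as the literal pattern, so a loop over `*.log` in an empty directory would run once on a file named `*.log`. With it, the glob expands to nothing. A small bash sketch of scoping the option to one function follows. `collect_logs` and `log_files` are illustrative names here, and the sketch assumes `nullglob` is otherwise off.

```shell
#!/usr/bin/env bash
# Sketch only (bash-specific): gather *.log files into an array, enabling
# nullglob just for the expansion.
collect_logs() {
    local dir=$1
    shopt -s nullglob            # unmatched globs expand to an empty list
    log_files=("$dir"/*.log)
    shopt -u nullglob            # assumes nullglob was off before the call
}

# Possible use:
#   collect_logs .
#   if [ "${#log_files[@]}" -eq 0 ]; then echo "no logs found" >&2; fi
```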
Positive Observations
Recommendations
High Priority
Medium Priority
Low Priority
Verdict
🟡 APPROVE WITH CHANGES - The PR provides valuable devnet management tooling with good documentation. However, the process management issues in
The shell scripts are generally well-written and handle edge cases thoughtfully. Once the high-priority items are addressed, this will be a solid addition to the development workflow.
Automated review by Claude (Anthropic) · custom prompt
Greptile Overview
Greptile Summary
Adds two Claude skills for managing local devnets and analyzing their logs.
devnet-runner provides orchestration for running lean consensus validators with configurable timeouts, automatic log collection, and comprehensive documentation covering validator config management, Docker image tags, port allocation, and troubleshooting workflows.
devnet-log-review provides analysis tooling with client-aware pattern matching for block counting, error classification with benign pattern filtering, consensus progress tracking, and detailed reference guides for fork analysis, finalization debugging, and client-specific log patterns.
The implementation follows good shell scripting practices with proper error handling (
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh | Adds script to run devnet with timeout and automatic log collection. Minor issue: signal propagation may not work reliably with backgrounded pipeline. |
| .claude/skills/devnet-log-review/scripts/analyze-logs.sh | Main analysis entry point orchestrating error counts, block stats, and consensus progress. Duplicates error counting logic from count-errors-warnings.sh. |
| .claude/skills/devnet-log-review/scripts/count-blocks.sh | Client-aware block counting with pattern matching for each client type. Well-structured with ANSI stripping and safe defaults. |
| .claude/skills/devnet-log-review/scripts/check-consensus-progress.sh | Shows last slot reached and proposer assignments per node. Handles multiple slot format variations correctly. |
| .claude/skills/devnet-runner/SKILL.md | Comprehensive documentation for devnet management with clear workflows, timing calculations, and troubleshooting guides. |
| .claude/skills/devnet-log-review/SKILL.md | Well-organized analysis documentation with progressive disclosure pattern and clear investigation workflows. |
Sequence Diagram
sequenceDiagram
participant User
participant RunScript as run-devnet-with-timeout.sh
participant SpinNode as spin-node.sh
participant Docker
participant AnalyzeScript as analyze-logs.sh
User->>RunScript: Execute with timeout (e.g., 120s)
RunScript->>RunScript: Validate lean-quickstart exists
RunScript->>SpinNode: Start nodes with --generateGenesis
SpinNode->>Docker: Start validator containers
Docker-->>SpinNode: Containers running
RunScript->>RunScript: Sleep for specified duration
RunScript->>Docker: Dump logs from all containers
Docker-->>RunScript: Log files (*.log)
RunScript->>SpinNode: Kill process (SIGTERM)
RunScript-->>User: Logs saved to repo root
User->>AnalyzeScript: Analyze saved logs
AnalyzeScript->>AnalyzeScript: Count errors/warnings
AnalyzeScript->>AnalyzeScript: Count blocks proposed/processed
AnalyzeScript->>AnalyzeScript: Check consensus progress
AnalyzeScript-->>User: Markdown summary with health status
Last reviewed commit: 8d92edf
    NETWORK_DIR=local-devnet ./spin-node.sh --node all --generateGenesis 2>&1 | tee "$REPO_ROOT/devnet.log" &
    PID=$!
    sleep "$1"
Signal may not propagate correctly to spin-node.sh since $PID is the shell running the pipeline, not the actual process. The kill on line 33 sends SIGTERM to the pipeline shell, but this may not reliably stop the spin-node.sh process.
Consider capturing the process group and using kill -TERM -$PID to signal the entire process group, or use pkill to target spin-node.sh directly.
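The process-group suggestion can be sketched as follows. This is a hedged illustration, not the PR's script. `run_for` is a hypothetical name, the one-second grace period is arbitrary, and the sketch assumes bash with job control enabled via `set -m`, so that each background job gets its own process group. Note that in bash, `$!` after a backgrounded pipeline generally refers to the last command in it (here `tee`), which is why the sketch wraps the command in a subshell: the subshell is the job's only direct process, so its PID is also the group ID.

```shell
#!/usr/bin/env bash
# Sketch only: run a command in its own process group for a fixed time,
# then signal the whole group.
set -m   # job control: background jobs are placed in their own process group

run_for() {
    local seconds=$1
    shift
    ( "$@" ) &                    # subshell PID doubles as the process group ID
    local pgid=$!
    sleep "$seconds"
    # A negative PID addresses every process in the group.
    kill -TERM -- "-$pgid" 2>/dev/null || true
    sleep 1                       # arbitrary grace period before forcing
    kill -KILL -- "-$pgid" 2>/dev/null || true
    wait "$pgid" 2>/dev/null || true
}

# Possible use (command line taken from the snippet above, wrapped in bash -c
# so the pipeline runs inside the group):
#   run_for 120 bash -c './spin-node.sh --node all --generateGenesis 2>&1 | tee devnet.log'
```

Containers started by the script would still need their own stop step, since they are not children of the shell.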
    total_errors=0
    for f in "${log_files[@]}"; do
        node=$(basename "$f" .log)
        if [[ "$node" != "devnet" ]]; then
            errors=$(grep -i "error" "$f" 2>/dev/null | grep -cvE "manifest unknown|NoFinalizedStateFound|HandshakeTimedOut" 2>/dev/null) || errors=0
            total_errors=$((total_errors + errors))
        fi
    done
Duplicates error counting logic from count-errors-warnings.sh with slightly different filter patterns. The benign patterns here (manifest unknown|NoFinalizedStateFound|HandshakeTimedOut) should match those in count-errors-warnings.sh:15 to ensure consistency.
Consider sourcing the benign patterns from a shared location or calling the existing script's function.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Motivation
We need project-level Claude skills for managing local devnets and analyzing their logs directly from this repo, instead of relying on global user-level skills.
Description
Adds two Claude skills under .claude/skills/:
devnet-runner - Orchestrates local devnets from the repo root:
lean-quickstart validator config and client image tags
run-devnet-with-timeout.sh
cd lean-quickstart && ...)
devnet-log-review - Analyzes devnet logs:
analyze-logs.sh - Main entry point producing markdown summary
count-errors-warnings.sh - Per-node error/warning counts (filters benign patterns)
count-blocks.sh - Client-aware block production stats
check-consensus-progress.sh - Last slot reached and proposer assignments
show-errors.sh - Detailed error display with filtering
How to Test