Skip to content

feat: [cli] - Integrated Session Spawning & Event-Driven Orchestration#81

Open
payneio wants to merge 1 commit intomainfrom
papayne/bundle-spawn-and-events
Open

feat: [cli] - Integrated Session Spawning & Event-Driven Orchestration#81
payneio wants to merge 1 commit intomainfrom
papayne/bundle-spawn-and-events

Conversation

@payneio
Copy link

@payneio payneio commented Feb 4, 2026

Summary

Part of the Integrated Session Spawning & Event-Driven Orchestration implementation.

Changes

  • Refactor to use spawn_bundle() for session spawning
  • Store working_dir in session metadata for sync

Dependencies

  • Depends on microsoft/amplifier-foundation PR for spawn_bundle()

Testing

  • All unit tests pass
  • Validation recipes pass

🤖 Generated with Amplifier

Copy link
Contributor

@colombod colombod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore this comment

Copy link
Contributor

@colombod colombod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore this comment

Copy link
Contributor

@colombod colombod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall Assessment

This is a strong architectural improvement that centralizes session lifecycle management in spawn_bundle(). The refactoring reduces complexity and improves maintainability.

✅ Strengths

  1. Clear phased structure - The PHASE 1-7 comments make the flow easy to follow
  2. Separation of concerns - CLI-specific logic moved to cli_pre_execute_hook
  3. Error safety - try/finally ensures display nesting is always restored
  4. Code reduction - Removes ~94 lines of complex manual session management

🔍 Questions & Considerations

1. Inline Bundle Creation (Line 39-45)

Creating an inline Bundle from merged_config is clever, but:

  • Should we validate that merged_config has the expected structure?
  • What happens if merged_config is malformed?

2. Inheritance Flags Set to False (Line 390-392)

inherit_providers=False,
inherit_tools=False,
inherit_hooks=False,

This is correct IF merged_config already contains the filtered modules. Can you confirm:

  • The merged_config in PHASE 1 already handles tool/hook inheritance filtering?
  • These False flags prevent double-inheritance?

3. Capability Override Pattern (Line 196-205)

The hook overrides session.spawn with CLI's version. This suggests:

  • spawn_bundle() registers its own simpler session.spawn
  • CLI needs the extended signature with agent resolution
  • Is this documented in spawn_bundle's API contract?

4. metadata_extra Field (Line 339-353)

Does spawn_bundle() merge metadata_extra into the persisted session metadata? This is critical for:

  • trace_id (W3C Trace Context)
  • child_span (short_id resolution)
  • working_dir (session sync between CLI and web)

🎯 Recommendation

Approve with minor clarifications. The architecture is sound. Please confirm:

  1. Inheritance flags logic
  2. spawn_bundle's handling of metadata_extra
  3. Whether capability override pattern is intended behavior

Copy link
Contributor

@colombod colombod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Report: feat: [cli] - Integrated Session Spawning & Event-Driven Orchestration

📋 Executive Summary

Overall Verdict: ⚠️ REQUEST CHANGES

This PR introduces significant functionality for session spawning and orchestration with strong architectural foundations and excellent security practices. However, critical test failures and high code complexity require attention before merge.

Key Concerns:

  1. Python quality checks failed (blocking)
  2. Shadow environment version verification failed (testing infrastructure issue)
  3. ⚠️ High code complexity with ~200 lines of duplication (24% of file)

Strengths:

  • ✅ Excellent security posture (8.5/10)
  • ✅ Zero contract violations with amplifier-core
  • ✅ Solid architecture using proper Amplifier primitives
  • ✅ Comprehensive documentation and error handling

🧪 Test Results Summary

Category Status Score Details
Contract Compatibility ✅ PASS - No breaking changes to amplifier-core contracts
Security Audit ✅ PASS 8.5/10 Strong security, zero critical issues
Code Quality ⚠️ CONCERNS 6/10 Functional but high complexity (7.5/10)
Python Checks ❌ FAIL - Quality checks failed (details needed)
Shadow Smoke Tests ❌ FAIL - Version verification issue (4df7b9c vs e227296)
Host Verification ✅ PASS - No unintended host changes

🔴 Critical Issues (Must Address Before Merge)

1. Python Quality Checks Failed ❌

Status: BLOCKING

The automated Python quality checks (ruff, pyright, formatting) failed but specific failure details were not captured. This must be resolved before merge.

Action Required:

# Run locally to see failures:
cd amplifier-app-cli
ruff check amplifier_app_cli/
pyright amplifier_app_cli/
ruff format --check amplifier_app_cli/

2. Shadow Environment Version Mismatch ❌

Status: TESTING INFRASTRUCTURE ISSUE (may not block merge)

Issue: Installed version shows 4df7b9c (main branch) instead of PR commit e227296, despite workspace and git showing correct commit.

Evidence:

  • ✅ Git workspace: e227296 (correct)
  • ✅ Injected code: e227296 (correct)
  • amplifier --version: 4df7b9c (incorrect)

Root Cause: Version string likely generated during build from git metadata that follows branch history to main rather than using current HEAD.

Impact: Cannot confirm PR code is actually running in shadow environment, though manual CLI tests passed.

Recommendation:

  • Investigate version detection mechanism in amplifier --version
  • Consider using git rev-parse HEAD instead of git describe --tags
  • This appears to be a testing infrastructure issue, not a flaw in PR code

⚠️ High Priority Recommendations

3. Extreme Code Duplication (24% of file)

Location: session_spawner.py
Impact: High maintenance burden, potential for diverging bugs

Issue 1: Identical Filter Functions (~120 lines)

_filter_tools() (lines 64-124) and _filter_hooks() (lines 211-271) are copy-pasted with only variable names changed.

Recommended Fix:

def _filter_modules(
    config: dict,
    module_key: str,  # "tools" or "hooks"
    inheritance_policy: dict[str, list[str]],
    explicit_modules: list[str] | None = None,
) -> dict:
    """Generic module filtering for tools, hooks, or any list of modules."""
    # ... extract common logic ...

Issue 2: Duplicated Capability Functions (~60 lines)

Identical child_spawn_capability() functions defined in both spawn_sub_session() (lines 436-465) and resume_sub_session() (lines 731-760).

Recommended Fix:

def _create_spawn_capability(agent_configs: dict[str, dict]) -> Callable:
    """Factory for spawn capability closure."""
    async def child_spawn_capability(...) -> dict:
        return await spawn_sub_session(...)
    return child_spawn_capability

4. Function Complexity Exceeds Maintainability Threshold

Metrics:

  • spawn_sub_session(): 290 lines, 15 parameters
  • resume_sub_session(): 260 lines
  • File total: 831 lines (threshold: <500)

Impact: Difficult to test, understand, and modify

Recommended Refactoring:

async def spawn_sub_session(...) -> dict:
    config = _prepare_spawn_config(...)
    metadata = _create_spawn_metadata(...)
    bundle = _create_inline_bundle(...)
    
    async def pre_execute_hook(child):
        await _register_cli_capabilities(child, metadata, agent_configs)
    
    result = await spawn_bundle(bundle, ..., pre_execute_hook=pre_execute_hook)
    return {"output": result.output, "session_id": result.session_id}

Break into:

  • _prepare_session_config() - config merging/filtering (lines 329-384)
  • _create_session_metadata() - metadata preparation (lines 505-522)
  • _register_cli_capabilities() - capability registration (lines 427-497)

💡 Medium Priority Suggestions

5. Encapsulation Concern: Private Attribute Access

Location: _extract_bundle_context() (lines 39-44)

Accesses private attributes ._bundle, ._paths, ._bundle_mappings which breaks encapsulation.

Suggestion: Add public API to foundation layer:

# In amplifier_foundation
class BundleModuleResolver:
    def get_module_paths(self) -> dict[str, Path]:
        """Public API for serializing module paths."""
        return dict(self._paths)

6. Session ID Generation Security Verification

Severity: Medium (verification needed)

Location: session_spawner.py:389

Cannot verify from this PR if generate_sub_session_id() uses cryptographically secure random generation (secrets module vs random module).

Action: Verify in amplifier_foundation that it uses:

import secrets
session_id = secrets.token_hex(16)  # ✅ SECURE

7. Documentation Drift

Issue: amplifier-core/docs/CAPABILITY_REGISTRY.md shows simplified signatures that don't reflect the full 13 available parameters for session.spawn.

Impact: Low - not breaking, but developers may not discover advanced features like provider_preferences or tool_inheritance.

Suggestion: Update core docs to show full signature or note "...additional optional parameters available"


✅ Positive Notes & Strengths

Outstanding Security Practices 🛡️

  • Path traversal protection - Comprehensive sanitization of session IDs
  • No command injection vectors - Zero unsafe code execution patterns
  • No hardcoded secrets - All credentials properly managed
  • Atomic file operations - Prevents corruption with backup recovery
  • Safe deserialization - Uses JSON, not pickle

Excellent Architecture 🏗️

  • Uses proper primitives - Built on spawn_bundle() as foundation
  • W3C Trace Context - Proper distributed tracing (lines 505, 514)
  • Clear separation of concerns - Well-documented phases (PHASE 1-7)
  • Observability - Emits events for session lifecycle
  • Explicit error handling - Clear error messages with context

Comprehensive Documentation 📚

  • Detailed docstrings - Every parameter and phase explained
  • Clear comments - Phase markers aid understanding
  • Good naming - Descriptive function and variable names

📊 Quality Metrics

Metric Current Target Status
Security Score 8.5/10 7/10 ✅ Excellent
Philosophy Alignment 6/10 8/10 ⚠️ Needs work
Code Duplication ~24% <5% ❌ High
Function Length 290 lines <100 ❌ Exceeds
Parameters per Function 15 <7 ❌ Exceeds
Contract Safety 100% 100% ✅ Perfect

🎯 Recommended Action Plan

Before Merge (REQUIRED)

  1. Fix Python quality check failures - Run and resolve ruff/pyright issues
  2. ⚠️ Investigate shadow version mismatch - Verify this is testing infrastructure issue, not code problem

High Priority (STRONGLY RECOMMENDED)

  1. Eliminate code duplication - Extract _filter_modules() and capability factories
    • Impact: Removes ~180 lines of duplication
    • Effort: 2-3 hours
    • Risk: Low (extract method refactoring)

Medium Priority (RECOMMENDED FOR FUTURE PR)

  1. Break down large functions - Decompose spawn/resume into smaller pieces
  2. Add public APIs to foundation - Replace private attribute access
  3. Verify session ID security - Confirm uses secrets module

🏁 Final Verdict

REQUEST CHANGES ⚠️

Blocking Issues:

  1. Python quality checks must pass
  2. Shadow version mismatch needs explanation (likely not code issue, but needs verification)

Strong Recommendation:
Address code duplication before merge. While not technically blocking, the 24% duplication rate creates significant maintenance risk and violates the "single source of truth" principle.

Once addressed, this PR represents solid work with excellent security practices and proper use of Amplifier architecture. The core functionality is sound and ready for deployment after quality issues are resolved.


Commit: e227296
Files Changed: amplifier_app_cli/session_spawner.py (+831 lines)
Review Date: 2026-02-09


Review generated by pr-review-full recipe

@payneio
Copy link
Author

payneio commented Feb 9, 2026

Response to Feedback

Thank you for the thorough review. We accept all the technical feedback and will address the issues.

Blocking Issues — Will Fix

Issue Action
Python quality checks failed Will run ruff/pyright and resolve all issues
Shadow version mismatch Will investigate version detection mechanism (appears to be testing infrastructure, not code issue)

High Priority — Will Address in This PR

Issue Action
24% code duplication Extract _filter_modules() to handle both tools and hooks generically
Duplicated capability functions Extract _create_spawn_capability() and _create_resume_capability() factories
Function complexity (290 lines, 15 params) Decompose into _prepare_spawn_config(), _create_spawn_metadata(), _register_cli_capabilities()

Medium Priority — Will Address

Issue Action
Private attribute access Will propose public APIs to foundation: resolver.get_module_paths(), mention_resolver.get_bundle_mappings()
Session ID security Will verify generate_sub_session_id() uses secrets module (it does)
Documentation drift Will update capability registry with full signature

Clarifications from First Feedback

  1. Inheritance flags set to False: Correct — merged_config already contains filtered modules from PHASE 1. The False flags prevent double-inheritance.

  2. Capability override pattern: This is intended behavior. spawn_bundle() registers a simple session.spawn, and CLI overrides with an extended version that includes agent name resolution from the agents registry. Will document this contract.

  3. metadata_extra handling: Confirmed — spawn_bundle() merges metadata_extra into persisted session metadata via SessionStorage.save(). This preserves trace_id, child_span, and working_dir for session resumption.

Dependency Note

This PR depends on the outcome of the foundation PR discussion (#63). If spawn_bundle() moves to a bundle or the API changes based on that discussion, we will update this PR accordingly.

Will push fixes once we have alignment on foundation.

Refactors spawn_sub_session() to use the new spawn_bundle() primitive:
- Implements CLISessionStorage for session persistence
- Delegates to spawn_bundle() for actual session execution
- Preserves backward compatibility with existing delegate tool
- Fixes resume_sub_session() to register session.spawn/resume capabilities
- Restores session.working_dir capability on resume

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@payneio payneio force-pushed the papayne/bundle-spawn-and-events branch from eea7a66 to 7c30edb Compare February 9, 2026 22:02
@colombod
Copy link
Contributor

PR Review (Automated): Re-check at commit 7c30edb

Follow-up automated review. The previous review flagged duplication, private attribute access, and python quality failures. @payneio accepted all feedback. This re-check confirms which items are still outstanding on the current head.


Test Results

Category Result Details
Cross-Repo Contracts SAFE All consumed core APIs match current signatures. No reverse dependencies.
Code Review ⚠️ Concerns 3 critical issues, 4 moderate issues identified
Security Audit PASS No critical security findings
Python Checks FAIL Linting/type errors still present
Shadow - CLI (non-LLM) 9/9 PASS Version, help, config, resources, recipes, source override all pass
Shadow - LLM-dependent ⚠️ 0/8 All failures caused by test harness uv tool install --force stripping provider SDKs - not a PR defect
Host Isolation PASS Host Amplifier unchanged at 4df7b9c; shadow correctly ran 7c30edb

Still Outstanding: Critical Issues

These were flagged in the previous review and remain unresolved at 7c30edb.

1. Capability registration duplicated 3x verbatim (~120 lines)

The child_spawn_capability / child_resume_capability closures and approval provider registration are copy-pasted identically across spawn_sub_session and resume_sub_session. Each closure is a 15-parameter passthrough wrapper. When spawn_sub_session's signature changes, all three sites must be updated in lockstep.

Recommendation: Extract a single _register_cli_capabilities(session, sub_session_id, self_delegation_depth) helper. This collapses ~120 lines to ~20.

2. Private attribute access (_bundle._paths, _bundle_mappings)

if hasattr(resolver, "_bundle") and hasattr(resolver._bundle, "_paths"):
    module_paths = {k: str(v) for k, v in resolver._bundle._paths.items()}

Reaches 2 levels into private state. If either resolver refactors its internals, resume_sub_session silently breaks.

Recommendation: Add public get_module_paths() / get_bundle_mappings() methods to the resolver interfaces in amplifier_foundation.

3. Removed behavioral contracts with no visible replacement

The old spawn_sub_session explicitly handled:

  • Cancellation propagation via parent_cancellation.register_child() / unregister_child()
  • sys.path sharing via parent_session.loader._added_paths and bundle_package_paths

Neither appears in the new code. If spawn_bundle() doesn't handle these:

  • Ctrl+C during nested agent execution may leave orphaned sessions
  • Bundle packages may not be importable in child sessions

Recommendation: Either cite where spawn_bundle() handles these (e.g., # Cancellation propagation handled by spawn_bundle -- see foundation/spawn.py:L142), or restore the wiring.


Suggestions for Improvement

# Priority Issue Suggested Action
4 Medium resume_sub_session not simplified (264 lines of manual wiring) File follow-up issue for a resume_bundle() foundation primitive
5 Medium Hollow Bundle(agents={}, context_files=[]) constructed to satisfy API Consider config: dict alternative in spawn_bundle() signature
6 Medium f-string in logger calls (lines 497, 669, 702, 811, 824) Use lazy %s formatting: logger.debug("msg %s", val)
7 Medium Zero test coverage for new spawn path Add tests for spawn_bundle() integration, cli_pre_execute_hook, and display nesting cleanup on error

Logic concern (non-blocking): In resume_sub_session at line 638, AmplifierSession() is constructed without is_resumed=True (defaults to False), causing the kernel to emit session:start. But line 794 manually emits session:resume. This produces a duplicate/contradictory event pair. Consider passing is_resumed=True and removing the manual emit.


What This PR Gets Right

  1. Correct architectural direction - Delegating to spawn_bundle() is textbook Amplifier philosophy: push lifecycle complexity into well-tested foundation primitives, keep the CLI layer thin.
  2. try/finally for display nesting - The old code could leak display state on error. The new pattern is strictly better.
  3. pre_execute_hook pattern - Clean inversion of control. The CLI injects its concerns via hook while spawn_bundle() owns the lifecycle.
  4. metadata_extra boundary - Cleanly separates CLI-specific metadata from foundation metadata.
  5. Resume session capabilities - Adding spawn/resume/approval capabilities to resumed sessions fixes real bugs where resumed sessions couldn't delegate.
  6. Cross-repo safety - All core API contracts are respected.

Bottom Line

The architectural direction is right. The move from hand-rolled session lifecycle to spawn_bundle() is a significant simplification. Items 1-3 above should be addressed in this PR to avoid merging known duplication and fragility. Items 4-7 are fine as documented follow-ups.

Verdict: REQUEST CHANGES (same as previous review - issues acknowledged but not yet resolved at this commit)


Automated review by Amplifier PR Review (pr-review-full v2.7.0) at commit 7c30edb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants