Wumms snapshot of ~65 files from ongoing throughput/UX iteration:
Backend:
- classification_channel cleanup (running/recognition/ejecting), ghost handling, ignored regions, tracker exemption for pending-drop
- local_state: ghost ignore memory + dossier groundwork
- piece_transport: motion-sync metrics, distribution positioning fix
- vision: overlay scaling for 4K feeds, stream overlay cleanups (helper IN/white-red DROP removed), camera_feed/service tweaks, tracking history + polar_tracker refinements
- server/api + detection router: known-objects endpoints, dossier read paths
- new: role_aliases, overlays/scaling, tests/test_distribution_positioning
Frontend:
- RecentObjects + TrackPathComposite + tracked/[uuid] restored from APFS snapshot, burst filmstrip, KnownObject-by-uuid fallback
- recent-pieces.ts extracted (dedup key logic)
- settings sections updates, CameraFeed crop rotation gate
- stores/sortingProfile + manager.svelte.ts wiring
gitignore:
- exclude stray tmp-*.png / dashboard-*.png / sorter-dashboard-*.png debug screenshots from repo root
Safety-net commit before the unified piece-dossier refactor (plan at ~/.claude/plans/lass-uns-gerade-einmal-wiggly-parrot.md). Intent is recoverability, not clean history; the next commits split this work by subsystem.
- new piece_segments table persisting per-segment polar track data, sector snapshots, and recognition results per physical piece
- indexes for (piece_uuid, sequence) and (session_id, role, first_seen_ts)
- helpers remember_piece_segment / list_piece_segments / clear_piece_segments_for_session
- additive only: existing piece_dossiers untouched
- phase 3 will wire this table from the vision manager
Phase 2 of unified piece-dossier plan.
- piece_transport.registerIncomingPiece now cascades tracked_global_id through active lookup, DB dossier lookup, explicit piece_uuid, and finally fresh uuid, preventing double-register on tracker glitches
- new resumeExistingPiece() rehydrates a dossier from SQLite into active pieces
- new KnownObject.from_dossier() classmethod reciprocal to event serialiser
- classification_channel running.py _recoverExistingTrackedPieces uses resumeExistingPiece on DB-hit; _registerNewIntakePiece guards against double-register at call-site too
- tests for idempotent register + dossier rehydration
Phase 1 of unified piece-dossier plan.
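The lookup cascade described for registerIncomingPiece can be sketched as below. This is a minimal illustration, not the project's code: `PieceTransportSketch`, `Piece`, and the two in-memory dicts (one standing in for the SQLite dossier table) are hypothetical names introduced here.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Piece:
    piece_uuid: str
    tracked_global_id: Optional[int] = None


@dataclass
class PieceTransportSketch:
    """Hypothetical stand-in for piece_transport; not the real class."""

    active: dict = field(default_factory=dict)    # tracked_global_id -> Piece
    dossiers: dict = field(default_factory=dict)  # stand-in for the SQLite dossier table

    def register_incoming_piece(self, tracked_global_id, piece_uuid=None):
        # 1. Active lookup: this tracked id is already registered.
        hit = self.active.get(tracked_global_id)
        if hit is not None:
            return hit
        # 2. DB dossier lookup: reuse the persisted uuid.
        stored_uuid = self.dossiers.get(tracked_global_id)
        if stored_uuid is not None:
            piece = Piece(stored_uuid, tracked_global_id)
        # 3. Explicit piece_uuid supplied by the caller.
        elif piece_uuid is not None:
            piece = Piece(piece_uuid, tracked_global_id)
        # 4. Fresh uuid as the last resort.
        else:
            piece = Piece(str(uuid.uuid4()), tracked_global_id)
        self.active[tracked_global_id] = piece
        self.dossiers[tracked_global_id] = piece.piece_uuid
        return piece
```

Calling `register_incoming_piece` twice with the same tracked id returns the same object, which is the idempotency the tests in this commit are described as covering.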
- blob_manager: piece_crops_dir + write_piece_crop, best-effort on OSError
- PieceHistoryBuffer: on_segment_archived callback slot
- vision_manager: wires archive callback -> resolves piece_uuid via
piece_transport, creates stub dossier on miss, writes sector_snapshot
wedge/piece JPEGs to blob/piece_crops/<uuid>/seg<n>/, then
remember_piece_segment() with paths instead of b64
- api.py: GET /api/piece-crops/{uuid}/seg{n}/{kind}/{idx}.jpg
FileResponse with path-traversal guard and long cache
- piece_transport: bindStubPieceUuid helper
- tests for roundtrip, stub-dossier creation, and error tolerance
Phase 3 of unified piece-dossier plan.
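A path-traversal guard for the piece-crops endpoint could look roughly like this. It is a sketch under assumptions: the helper name `resolve_piece_crop`, the `ALLOWED_KINDS` set, and the on-disk layout below `seg<n>/` are guesses derived from the URL shape, not taken from the repository.

```python
from pathlib import Path
from typing import Optional

# Assumed from the "wedge/piece JPEGs" wording; the real set may differ.
ALLOWED_KINDS = {"wedge", "piece"}


def resolve_piece_crop(root: Path, piece_uuid: str, seg: int, kind: str, idx: int) -> Optional[Path]:
    """Return the crop path, or None if the request would escape `root`."""
    if kind not in ALLOWED_KINDS or seg < 0 or idx < 0:
        return None
    root = root.resolve()
    candidate = (root / piece_uuid / f"seg{seg}" / kind / f"{idx}.jpg").resolve()
    # is_relative_to (Python 3.9+) rejects anything that "../" moved outside root.
    if not candidate.is_relative_to(root):
        return None
    return candidate
```

A handler would return 404 when this yields `None` or the file does not exist, and otherwise pass the path to the file response.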
- _LiveTrack.piece_uuid assigned when C3 track reaches 4 stable hits
- stub dossier written to SQLite via remember_piece_dossier on first bind, so downstream lookups find the record from the moment the piece becomes reliably tracked
- PendingHandoff and handoff.register_track propagate piece_uuid across C3 -> Carousel transition
- TrackAngularExtent surfaces piece_uuid to classification channel
- running.py passes piece_uuid through registerIncomingPiece so the idempotent cascade (phase 1) reuses the same UUID on C4 intake
- tests for binding threshold, handoff preservation, and register idempotency with explicit piece_uuid
Phase 4 of unified piece-dossier plan.
- local_state.build_piece_detail_payload merges dossier + segments
- GET /api/tracked/pieces/{uuid} now reads DB first, falls back to
runtime LRU, returns 404 only when truly unknown — kills the
spurious "Track Not Found" on still-known pieces
- response adds track_detail.{segments, live} block; live tracker
details merged when the piece is still active
- GET /api/tracked/pieces items carry has_track_segments flag
- bulk segment-count helper keeps list endpoint O(1) per row
- tests for persisted-after-restart, live-merge, and fallback paths
Phase 5 of unified piece-dossier plan.
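One way to keep the list endpoint cheap is a single grouped query per page instead of one COUNT per row. The sketch below assumes a `piece_segments` table with a `piece_uuid` column (as named in the phase 2 commit); the function name `segment_counts` and the in-memory demo data are illustrative only.

```python
import sqlite3


def segment_counts(conn: sqlite3.Connection, piece_uuids: list[str]) -> dict[str, int]:
    """One GROUP BY query for the whole page; missing uuids map to 0."""
    if not piece_uuids:
        return {}
    placeholders = ",".join("?" for _ in piece_uuids)
    rows = conn.execute(
        f"SELECT piece_uuid, COUNT(*) FROM piece_segments "
        f"WHERE piece_uuid IN ({placeholders}) GROUP BY piece_uuid",
        piece_uuids,
    ).fetchall()
    counts = dict(rows)
    return {u: counts.get(u, 0) for u in piece_uuids}


# Demo with an in-memory database (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE piece_segments (piece_uuid TEXT, sequence INTEGER)")
conn.executemany("INSERT INTO piece_segments VALUES (?, ?)", [("a", 0), ("a", 1), ("b", 0)])
items = [
    {"uuid": u, "has_track_segments": n > 0}
    for u, n in segment_counts(conn, ["a", "b", "c"]).items()
]
```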
- recent-pieces.ts key logic: piece_uuid first, tracked_global_id fallback, recentPhysicalKeyOrNull() returns null when neither is available (drop the item instead of inventing keys)
- RecentObjects + tracked/+page dedupe skip null-key items
- tracked/[uuid] reads track_detail.segments from /api/tracked/pieces response, only hits /api/feeder/tracking/history when track_detail.live === true
- TrackPathComposite prefers jpeg_path (-> /api/piece-crops/...) over jpeg_b64 for snapshot rendering
- removed "Track Not Found" label; replaced with neutral loading/error states that reflect the DB-first reality
Phase 6 of unified piece-dossier plan.
- cursor-pointer on camera tile buttons (USB + network) so hover signals clickability, disabled:cursor-not-allowed while saving
- reassign-confirm modal raised above camera picker so the secondary dialog actually surfaces instead of being stacked under
- camera stream auto-refreshes after assign by propagating feedRevision into the background feed component (key or query param)
- preview-unavailable fallback on tiles whose MJPEG stream never produces a frame (was showing bare broken-image alt)
- tile MJPEG status callback uses direct property assignment (tileStreamStatus[idx] = status) instead of an object spread. The spread cloned the whole record on every frame for 6 parallel streams, saturating Svelte's reactivity graph and freezing the renderer before saveCameraRole's fetch could resolve
- close camera picker unconditionally after save rather than gating on cameraError; an error now shows as an inline alert, and the modal no longer stays open silently even though the backend accepted the assignment
- new CameraPreviewHub broadcaster: one VideoCapture per device,
fan-out to N subscribers via asyncio.Queue at ~10fps preview rate
- new /ws/camera-preview/{index} websocket endpoint — clients
receive raw jpeg bytes per message
- refuses subscription when device is owned by vision_manager
(primary role feeds), to avoid device-handle conflicts
- new wsJpegStream svelte action mirroring the mjpeg action API
(status callback, reconnect, destroy)
- ZoneSection modal tiles switch to wsJpegStream; the main
zone feed and dashboard feed remain on MJPEG
- legacy GET /api/cameras/stream/{index} kept as fallback
Eliminates the 6-connection-per-host HTTP/1.1 pool exhaustion
that froze the picker whenever saveCameraRole's POST couldn't
acquire a socket.
Clean-slate before the dossier refactor left a 1 GB local_state.sqlite.backup-<ts> file behind as a safety net. Keep those around locally but don't commit them.
motion_confirmed was a sticky latch — once a track briefly displaced past the 18 px birth threshold, the stagnant-false-track filter was disarmed for that track forever. Apparatus ghosts that wiggled during autoexposure settle at startup latched True and then blocked the dropzone indefinitely: feeder pipeline stalled on wait_chX_dropzone_clear with no self-recovery path. Add a role-agnostic recent-cartesian-stationary helper and short-circuit the motion_confirmed gate: when a latched track has stayed within RECENT_STATIONARY_MAX_DISPLACEMENT_PX (6 px) over RECENT_STATIONARY_MIN_COVERAGE_S (1.8 s) of a 2.5 s window, allow the stagnant filter to fire. Legit pieces pausing briefly mid-travel (sub-second) are unaffected because their recent window still contains the anchor from pre-pause motion.
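A rough sketch of the recent-stationary check follows. The two threshold constants are named in the commit; `RECENT_STATIONARY_WINDOW_S`, the function name, and the choice of the oldest in-window sample as anchor are assumptions made for illustration.

```python
import math

RECENT_STATIONARY_MAX_DISPLACEMENT_PX = 6.0
RECENT_STATIONARY_MIN_COVERAGE_S = 1.8
RECENT_STATIONARY_WINDOW_S = 2.5  # name assumed; the commit only gives the 2.5 s value


def recently_stationary(samples, now):
    """samples: list of (ts, x, y) tuples, oldest first."""
    window = [s for s in samples if now - s[0] <= RECENT_STATIONARY_WINDOW_S]
    if len(window) < 2:
        return False
    covered = window[-1][0] - window[0][0]
    if covered < RECENT_STATIONARY_MIN_COVERAGE_S:
        return False  # not enough history inside the window to judge
    _, ax, ay = window[0]  # anchor: oldest sample still inside the window
    return all(
        math.hypot(x - ax, y - ay) <= RECENT_STATIONARY_MAX_DISPLACEMENT_PX
        for _, x, y in window[1:]
    )
```

A piece that moved and then paused for under a second still has its pre-pause position as the anchor, so the displacement check fails and the stagnant filter stays disarmed for it.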
… dynamic Feeder roles on hive:* or gemini_sam never consume the MOG2 detector output, yet a FeederAnalysisThread was still running per channel at 33Hz with BGR->Lab conversion + MOG2 background subtraction. Gate the thread at init-time and on runtime algorithm-switch so it only runs when static mog2 is actually selected.
…achinery
Fundamentally inverts the feeder-tracker gate: a track is no longer "real until proven ghost"; it is now "unknown until proven real". Every tracked piece starts with ``confirmed_real=False`` and flips sticky-True only after demonstrating monotonic polar-angular progress ≥ 5° or centroid drift ≥ 40 px across a ≥ 6-sample window. Apparatus ghosts (screws, reflections, guides) physically cannot clear that bar: their jitter is fixed-position, not monotonic. Dossier mint + segment archival gate on ``confirmed_real`` instead of the Phase-7 displacement thresholds.
Everything removed in a single tombstone:
* stagnant_false_track filter (all flavours: carousel polar, motion-confirmed recovery, pending-drop protection);
* persistent_static_ghost_regions infrastructure (load/save/prune/suppress + role wiring + state_entries rows with a migration that purges any legacy rows on backend boot);
* motion_confirmed latch + MIN_ANGULAR/PIECE_UUID_DISPLACEMENT gates in _maybe_bind_piece_uuid;
* ANGULAR_SPAN < 3° motion-gate in _archive_segment_to_dossier_impl (DB-lookup-before-mint path stays).
_archive_segment_to_dossier_impl now consults the live tracker for ``confirmed_real``; if the originating track is already dead, it falls back to segment_sector_angular_span_deg ≥ 5° as a safe-but-lax check. TrackedPiece exposes ``confirmed_real`` so downstream consumers can filter on it in the follow-up propagation commit.
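The whitelist rule can be illustrated with a small predicate. This is a simplified sketch, not the tracker's `_evaluate_confirmed_real`: the constant names, the strict sign-based reading of "monotonic", and the first-to-last centroid drift are assumptions chosen to keep the example short.

```python
import math

MIN_SAMPLES = 6
MIN_ANGULAR_PROGRESS_DEG = 5.0
MIN_CENTROID_DRIFT_PX = 40.0


def evaluate_confirmed_real(samples):
    """samples: list of (angle_deg, x, y), oldest first; angles assumed unwrapped."""
    if len(samples) < MIN_SAMPLES:
        return False
    angles = [s[0] for s in samples]
    deltas = [b - a for a, b in zip(angles, angles[1:])]
    # "Monotonic" is read here as: every step moves in the same direction.
    monotonic = all(d >= 0 for d in deltas) or all(d <= 0 for d in deltas)
    if monotonic and abs(angles[-1] - angles[0]) >= MIN_ANGULAR_PROGRESS_DEG:
        return True
    (_, x0, y0), (_, x1, y1) = samples[0], samples[-1]
    return math.hypot(x1 - x0, y1 - y0) >= MIN_CENTROID_DRIFT_PX
```

The sticky part lives in the caller: once this returns True for a track, the flag is set and never cleared.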
…e, handoff
Downstream consumers of the polar tracker now all gate on the new ``confirmed_real`` whitelist flag:
* ``_channelDetectionsFromTracks`` filters tracks before they reach ``analyzeFeederChannels``. This is the single chokepoint that drives ch2/ch3 dropzone occupancy, clump/massage decisions, and exit wiggle. Apparatus ghosts never populate that path anymore, so the feeder stops stalling behind them.
* ``subsystems/feeder/feeding.py`` classification-channel admission: ``classification_channel_track_count`` counts only confirmed-real tracks (prevents a carousel apparatus ghost from pinning ch3).
* ``subsystems/channels/c1_bulk.py`` c2 saturation: the bulk feeder's max-ch2-pieces cap counts only confirmed tracks (prevents a c2 guide ghost from halting bulk feed forever).
* ``vision/overlays/tracker.py`` live overlay: unconfirmed tracks render in dim grey with no id chip; confirmed tracks keep their green/amber/magenta colour coding + ``#xxxx`` pill.
Handoff manager kept as-is. The existing ghost-reject gate on ``PendingHandoff.last_displacement_px`` already filters stationary upstream pendings, and a real piece near C3-exit can hand off before it has covered the full 5° whitelist arc, so requiring confirmed_real at notify_track_death would break that edge case.
… driver
Two emergency back-pressure gates so the machine doesn't pile pieces onto C4 while upstream filtering stabilises:
- admission.py: MAX_CLASSIFICATION_CHANNEL_DETECTION_CAP=3 enforces a hard admission block on C3 once raw YOLO sees 3+ bboxes on the carousel, independent of transport/zone state. Protects the pipeline when the whitelist hasn't confirmed tracks yet but C4 is physically full.
- ch2_separation.py: CH2_SEPARATION_ENABLED=False kill-switch on the slip-stick driver. The CW/CCW rocking was firing even for 2 pieces on C2 with no real cluster; hard-disable until the cluster gate is tightened.
Snapshot of where tonight's ghost-elimination rework left the repo. **Pipeline is NOT working end-to-end.** Captured as WIP so we can reset with a clear head tomorrow.
### What works
- Vulkan-wired ncnn inference (6f2c307): YOLO off CPU.
- MOG2 thread skipped when detection algorithm is dynamic (9bbf3e8): CPU drops from ~600% to ~17% in idle pipeline.
- Hard cap of 3 raw C4 detections blocks C3 from piling the carousel (a636f9d).
- Ch2 slip-stick separation driver hard-disabled via kill-switch flag (a636f9d).
### What is broken
- **Whitelist refactor (706a24d + a213794) is too lax.** The `_evaluate_confirmed_real` check uses full-path median(first-5) vs median(last-5) centroid drift, which grows monotonically for ghost tracks that live forever now that the stagnant-false-track filter was removed. In a 45 s window on a physically clean C4 the pipeline still mints ~27 confirmed dossiers and ~24 segments. Expected fix: rolling window (~1.5 s) + dormancy kill when a track shows no progress for ~3 s.
- **CPU regression to ~350 %.** MOG2 was not the only hog; whitelist adds per-frame path-evaluation overhead and trackers never die naturally now, so active-track count grows.
- **dist = 0, multi_drop_fail = 3 out of 5 pieces seen**: classify success 0 %. Either cluster arrivals at C4 or downstream classify pipeline stalled. Root cause not isolated tonight.
- Startup-purge (`classification_channel/idle.py`) did not fire on reboot; C4 had to be emptied by hand. Either the Idle state's `transport.activePieces()` early-skip catches too much, or the classification state machine was stuck in a previous idle snapshot. Needs diagnosis.
- Runtime-stats snapshot appeared stale for minutes at a time across restarts; `main_to_server_queue` + broadcast pipe may have wedged.
### Open workstreams for next session
1. Rolling-window + dormancy fix for `_evaluate_confirmed_real` (polar_tracker.py) so unconfirmed ghost tracks die and never confirm via cumulative drift.
2. Investigate CPU regression post-whitelist: profile which calls dominate (suspect: per-frame path comparisons + embedding EMA for ever-living tracks).
3. Classify 0 % success: trace one C4 piece through `classification_channel` state machine to see where crops / hive recognition drop out.
4. Feeder over-eager on C2: cluster-gate needs tightening, not just the slip-stick disable.
5. Runtime-stats staleness: audit the main-loop broadcast path.
### How to reset tomorrow
- `git log --oneline sorthive ^main` lists all ghost-elimination commits. Rollback candidates: 706a24d, a213794 (whitelist core + propagation). The Vulkan + MOG2 + admission-cap + ch2-disable fixes (6f2c307, 9bbf3e8, a636f9d) are keepers.
- Full DB + blob purge + supervisor restart is already scripted inline in this session's notes.
Documentation baseline for the runtime rearchitecture on sorthive.
All runtime work after this commit targets the new architecture:
- runtime-architecture.html (canonical visual vision)
- docs/lab/runtime-rebuild-design.md (engineering companion)
The LEGACY map (runtime-current-state-map.md) captures the pre-rebuild state
as migration reference only.
Added:
docs/lab/runtime-current-state-map.md — IST-state, LEGACY-banner, file:line cites
docs/lab/runtime-rebuild-design.md — 4-layer x 5-column design, contracts,
7-step C4<->Distributor handshake,
strategy plugins, 6-phase migration
Changed:
docs/lab/software-architecture-decisions/index.md — reference the two canonical docs
Removed:
docs/lab/software-architecture-decisions/vision-camera-runtime-refactor.md
(superseded; broader rebuild replaces narrower VisionManager-split proposal)
REVIEW-2026-04-16.md (obsolete one-off)
First phase of the runtime rearchitecture per docs/lab/runtime-rebuild-design.md.
Contracts-only; no algorithm ports, no runtime implementations.
Added under software/sorter/backend/rt/:
contracts/ Feed, Zone (Rect/Polygon/Polar), Detector, Tracker, Filter,
FilterChain, Classifier, Calibration, Runtime (ABC),
AdmissionStrategy, EjectionTimingStrategy, RulesEngine,
EventBus + StrategyRegistry with 8 register_* decorators
(detector/tracker/filter/classifier/calibration/admission/
ejection_timing/rules_engine)
config/ Pydantic v2 schema (SorterConfig with feeds/pipelines/
runtimes/classification/distribution) + TOML+SQLite loader
with deep-merge override
events/ InProcessEventBus — bounded queue (maxsize=2048), dispatcher
thread, drop-oldest on overflow, fnmatch glob topics,
drain() for synchronous tests. 9 typed topic constants
context.py RuntimeContext DI container (replaces shared_state.py globals)
__init__.py build_runtime() stub — NotImplementedError until phase 2+
tests/ 8 tests (registry round-trip, config validation, event bus
subscribe/publish/glob/unsubscribe); all green under
uv run pytest rt/tests/
Placeholder __init__.py for later phases:
perception/ (detectors, trackers, filters, classifiers, calibration)
runtimes/ (+ _states/) rules/ classification/ coupling/
hardware/ irl/
1243 LoC total; every file under 250 LoC. No imports from or edits to
the existing backend/ tree.
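The drop-oldest and glob-matching behaviour of the event bus can be sketched without the dispatcher thread. This is a simplified illustration, not `InProcessEventBus`: the class name and the topic strings are hypothetical, and dispatch happens only inside `drain()`.

```python
from collections import deque
from fnmatch import fnmatchcase


class EventBusSketch:
    def __init__(self, maxsize=2048):
        # Appending to a full deque with maxlen silently evicts the oldest entry.
        self._queue = deque(maxlen=maxsize)
        self._subs = []  # list of (glob pattern, handler)

    def subscribe(self, pattern, handler):
        self._subs.append((pattern, handler))

    def publish(self, topic, payload):
        self._queue.append((topic, payload))

    def drain(self):
        """Dispatch everything queued so far on the caller's thread."""
        while self._queue:
            topic, payload = self._queue.popleft()
            for pattern, handler in self._subs:
                if fnmatchcase(topic, pattern):
                    handler(topic, payload)


bus = EventBusSketch(maxsize=2)
seen = []
bus.subscribe("perception.*", lambda topic, payload: seen.append((topic, payload)))
bus.publish("perception.tracks", 1)
bus.publish("hardware.error", 2)
bus.publish("perception.tracks", 3)  # queue is full, so the first event is evicted
bus.drain()
```

A synchronous `drain()` like this is what makes subscriber behaviour testable without sleeping on a background thread.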
…+ filters)
First perception stack ported into the rt/ contract frame. Phase 2a of 6,
self-contained in rt/perception/: no main.py integration, no shadow mode,
no changes to the old backend/ tree.
Added under software/sorter/backend/rt/perception/:
feeds.py CameraFeed adapter — pulls frames from legacy
backend.vision.camera_service (explicit temporary bridge
until rt/hardware/ lands). Monotonic frame_seq per feed
zones.py build_zone(ZoneConfig) factory — Rect/Polygon/Polar
detectors/mog2.py Mog2Detector port (registered "mog2"). Rect+Polygon
masks ok; Polar raises NotImplementedError stub
trackers/polar.py PolarTracker port (registered "polar"). Polar Kalman +
Hungarian matching with cartesian fallback. Whitelist
confirmation (angular >=5 deg OR centroid drift >=40px)
filters/size.py SizeFilter (registered "size")
filters/ghost.py GhostFilter (registered "ghost") — pulls confirmed_real
gating out of the tracker into an explicit filter step
pipeline.py PerceptionPipeline (detect -> track -> filter) +
build_pipeline_from_config factory that wires
PolarZone geometry into the polar tracker
pipeline_runner.py PerceptionRunner — daemon thread per feed, periodic
pipeline execution, duplicate frame_seq skip,
error-threshold circuit-breaker, EventBus publish of
PERCEPTION_TRACKS + HARDWARE_ERROR
Subpackage __init__.py files updated with side-effect imports so
`import rt.perception` registers all strategies.
Tests (21 new, 29 total green):
test_mog2_detector.py synthetic frames + MOG2 warmup + detection
test_polar_tracker.py polar + cartesian fallback, confirmation gating
test_filters.py size passes/blocks, ghost filters unconfirmed
test_pipeline.py end-to-end detect -> track -> filter
test_pipeline_runner.py start/stop lifecycle, duplicate-seq skip,
circuit-breaker on repeated failures
Verified: uv run pytest rt/tests/ -v -> 29 passed, 0 failed, 0.96s.
Known limits (all explicit in code):
- PolarZone masking in Mog2Detector is a NotImplementedError stub
- PolarTracker does not carry OSNet appearance embeddings or history
buffer (scope-excluded; those land with handoff in a later phase)
- global_id == track_id for now; PieceHandoffManager is separate
Phase 2b hooks ready: PerceptionRunner lifecycle handles, EventBus
publishing, non-blocking latest_tracks() accessor, build_pipeline_from_config
factory.
Live-hardware finding: the C4 transport move is configured for up to 24 000 µsteps/s but only reaches ~3% of that in practice — 53 µsteps per pulse is far too short to clear the acceleration ramp. The motor sits in a short triangular profile with an abrupt stop at the end, and pieces on the carousel slide because the stop's inertia outpaces friction. New PROFILE_TRANSPORT_PULSED splits a single transport_move into sub-pulses of configurable size with explicit settle pauses in between. Each sub-pulse is still a small triangular move, but the pause lets the piece re-grip the carousel before the next kick. Matches the manual pattern Marc observed working better during hand testing. Opt in via RT_C4_TRANSPORT_PROFILE=pulsed; sub-pulse size and settle duration are env-tunable (RT_C4_TRANSPORT_SUB_PULSE_DEG, RT_C4_TRANSPORT_SETTLE_MS) so we can A/B without a rebuild. Default profile stays "transport" — no behaviour change unless opted in.
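The sub-pulse split can be sketched as a pure planning function. The name `plan_pulsed_move`, the step-tuple format, and the choice to skip the pause after the final sub-pulse are assumptions; in the real profile the sizes would come from RT_C4_TRANSPORT_SUB_PULSE_DEG and RT_C4_TRANSPORT_SETTLE_MS.

```python
def plan_pulsed_move(total_deg, sub_pulse_deg, settle_ms):
    """Split one move into ('move', deg) and ('settle', ms) steps."""
    if sub_pulse_deg <= 0:
        raise ValueError("sub_pulse_deg must be positive")
    steps = []
    remaining = abs(total_deg)
    sign = 1 if total_deg >= 0 else -1
    while remaining > 1e-9:
        chunk = min(sub_pulse_deg, remaining)
        steps.append(("move", sign * chunk))
        remaining -= chunk
        if remaining > 1e-9:
            # Pause so the piece can re-grip; no pause after the final sub-pulse.
            steps.append(("settle", settle_ms))
    return steps


plan = plan_pulsed_move(5, 2, 120)
```

Keeping the plan separate from the motor driver makes it easy to compare sub-pulse sizes offline before trying them on hardware.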
Marc reported the Recent Pieces list was "wuseln" (German for scurrying about): pieces flickering in and out, reordering on nearly every frame. Measured at live sorting: 19 enters / 18 leaves / constant reorders over 15 s. After this change, the same 15 s window shows 0 / 0 / 0.
Two separate instability sources addressed:
- **Reorder source.** `RecentObjects.upcoming` was sorted by `exitDistanceDeg`, the piece's current angle relative to the drop point. Every small angle update from the tracker swapped adjacent rows. Switched to a stable FIFO sort on `first_carousel_seen_ts`: newest at top, oldest (next to drop) at the bottom right above the distributed divider. Order changes only when pieces actually enter or leave.
- **Enter/leave source.** `MachineManager.handleKnownObject` ran each incoming event through `shouldKeepRecentObject` (== the display filter) and **removed** existing entries that no longer matched. A piece rotating past the drop zone flipped `classification_channel_zone_state` from `active` to `superseded` and got evicted from storage. The upcoming `$derived` then emitted "leave" + "enter" when it cycled back. Fixed by always updating in place on subsequent events: the display filter decides what to render, the storage layer just keeps the history. First-insertion eviction still uses the filter.
Also dropped the 15 s "same-gid recently distributed" dedup in the upcoming list. It was a workaround for the pre-BoTSORT tracker splitting one physical piece across many global_ids, and now just hides active pieces for 15 s whenever their gid happens to match a just-distributed entry.
Live audit: the widget was showing nonsense because every dossier admission on C4 bumped the counters, and with BoTSORT keeping a stable tracked_global_id across carousel rotations the same physical piece can produce dozens of dossiers. Live sample before fix: pieces_seen=1092, classified=797, distributed=248, against roughly 17 actual physical pieces in the carousel. Reading "87 ppm feed rate" when only 17 pieces have ever been fed is actively misleading.
Backend: RuntimeStatsCollector.snapshot() now also reports
- unique_pieces_seen
- unique_pieces_classified
- unique_pieces_distributed
by folding ``tracked_global_id`` into sets. Dossier-level counters (pieces_seen / classified / distributed) stay for the Totals footer so the attempt-level numbers remain visible.
Widget:
- Feed rate uses unique_pieces_seen / running_time_s.
- Distributed / min uses unique_pieces_distributed / running_time_s instead of the dossier-based throughput.overall_ppm.
- Multi-drop denominator is finished classifications, not pieces_seen (pieces_seen was inflated by admissions of pieces still on C4).
- New "Unique pieces" tile: "4 of 17 seen", the honest answer to "how many pieces has the sorter actually handled?".
- Dropped the broken "C4 active ppm" tile. ``channel_throughput.classification_channel.active_ppm`` is permanently None because ``observeChannelExit`` was never wired into the rt graph after the cutover; fixing that is a separate task.
- Fixed a dead-code lookup on ``outcomes.classified?.active_ppm`` (the real key is ``classified_success``).
Totals footer still shows the raw dossier counts so operators can see the inflation if they want to debug it, but the headline tiles no longer double-count re-circulating pieces.
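Folding tracked ids into sets alongside the raw counters can be sketched as follows. `StatsSketch` and its method names are hypothetical and cover only the seen/distributed pair; the snapshot keys mirror the ones named in the commit.

```python
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Hypothetical collector: raw attempt counters plus unique-id sets."""

    pieces_seen: int = 0
    pieces_distributed: int = 0
    _seen_gids: set = field(default_factory=set)
    _distributed_gids: set = field(default_factory=set)

    def observe_admission(self, tracked_global_id):
        self.pieces_seen += 1                  # one per dossier admission
        self._seen_gids.add(tracked_global_id)  # one per physical piece

    def observe_distributed(self, tracked_global_id):
        self.pieces_distributed += 1
        self._distributed_gids.add(tracked_global_id)

    def snapshot(self):
        return {
            "pieces_seen": self.pieces_seen,
            "pieces_distributed": self.pieces_distributed,
            "unique_pieces_seen": len(self._seen_gids),
            "unique_pieces_distributed": len(self._distributed_gids),
        }


stats = StatsSketch()
for gid in (1, 1, 1, 2):  # piece 1 re-admitted on three carousel rotations
    stats.observe_admission(gid)
stats.observe_distributed(2)
snap = stats.snapshot()
```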
Follow-up to the unique-piece-count fix. The widget was still showing nonsense rates right after a backend restart: unique_pieces_seen is cumulative across sessions (backed by the piece-dossier DB) but running_time_s resets to zero each start, so ``cumulative_count * 60 / running_time_s`` inflated wildly for the first few minutes and then slowly decayed.
Backend:
- ``throughput.recent_ppm``: pieces distributed per minute observed over a rolling 5-minute window, computed from the actual wall-clock span of the ``distributed_at`` timestamps. Session-independent.
- ``throughput.feed_recent_ppm``: same idea keyed off the ``first_carousel_seen_ts`` of each unique tracked_global_id, so the feed rate reflects the real cadence at which new physical pieces enter C4 rather than a ratio against a just-reset clock.
- Snapshot lazily syncs ``_is_running`` from the rt_handle's ``started`` / ``paused`` flags so running_time_s ticks even when the command-queue ``setLifecycleState`` path races or drops the event (observed live: rt reported not-paused while the collector still claimed "initializing").
Widget: ``dist_rate_ppm`` and ``feed_rate_ppm`` prefer the new recent-window values, falling through to the cumulative-over-time numbers only when no recent events exist.
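A rolling-window rate based on the wall-clock span of the events themselves might look like this. It is a sketch under assumptions: the function name, the interval-based estimate (n events bound n-1 intervals), and returning `None` to signal the cumulative fallback are choices made here, not confirmed details of the collector.

```python
def recent_ppm(timestamps, now, window_s=300.0):
    """Events per minute over the span of the events inside the window."""
    recent = sorted(t for t in timestamps if 0 <= now - t <= window_s)
    if len(recent) < 2:
        return None  # caller falls back to the cumulative rate
    span_s = recent[-1] - recent[0]
    if span_s <= 0:
        return None
    # n events bound n-1 intervals, so the rate uses len(recent) - 1.
    return (len(recent) - 1) * 60.0 / span_s
```

Because the denominator is the span between the first and last recent event rather than `running_time_s`, a backend restart does not distort the figure.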
Now that BoTSORT+ReID is the production primary tracker on every feed, painting both the primary boxes (solid green) and the shadow boxes (dashed magenta) on the same stream is just visual noise. Operators need the live boxes from the tracker that's actually making sorting decisions — the shadow comparison is debug info that belongs on ``/api/rt/status`` (still wired there) or in offline benchmark runs, not on the camera feed. RuntimeAnnotationProvider returns only RuntimeTrackOverlay + RuntimeGhostOverlay by default. The shadow overlay class stays in the codebase so ad-hoc debugging can re-add it, just not on the default provider chain. Tests updated to match.
Live observation: with BoTSORT as the production tracker, tracks with hit_count well into the thousands were sitting at confirmed_real=False because the rotation-window verdict only fires when c2/c3/c4 happens to publish a PERCEPTION_ROTATION while the tracker has accumulated 6+ samples in that window. The verdict gate was a useful ghost filter when motion-only trackers regularly birthed false positives on apparatus pixels — with appearance-aware association it just hides the live tracker output operators want to see. RuntimeTrackOverlay now draws the ID label for every non-ghost track. Box colour still reflects the verdict (green = confirmed, dim grey = pending) so the visual signal is preserved, but the ID is always visible. Ghosts continue to be filtered out and rendered separately by RuntimeGhostOverlay.
Summary
This is the current `sorthive` branch delta against `main`. It carries the runtime rebuild from shim-heavy legacy wiring onto the new RT runtime, a generic cross-channel purge architecture, a fully persisted tracked-pieces view with per-segment debug crops, a broad service-extraction sweep, the Hive integration with Gemini-backed condition labeling, and a BoTSORT + OSNet ReID primary tracker that replaces motion-only association.
Architectural direction is codified in [lab/sorter-architecture-principles]({{ '/lab/sorter-architecture-principles/' | relative_url }}); the docs tree has been realigned so the principles doc is the active guide.
Recent additions (on top of the base cutover work)
BoTSORT + OSNet ReID primary tracker
- `rt/perception/trackers/boxmot_reid.py` wraps boxmot's BoTSORT with OSNet ReID and emits `appearance_embedding` on each `Track`. Weights (`osnet_x0_25_msmt17`, ~3 MB) auto-download into `blob/reid_models/` on first use.
- `botsort_reid` (override via `RT_PRIMARY_TRACKER_KEY`); shadow is `turntable_groundplane`. Motion-only trackers stay registered as benchmarks.
- `rt/perception/trackers/_geometry.py` so the three legacy adapters share one implementation instead of three near-identical copies.
- `TrackTransitRegistry` still owns the cross-channel C3 → C4 hand-off but now carries the source track's embedding and gates `claim()` by cosine similarity (default 0.55). Catches the documented failure mode where a red slope's track got merged with an unrelated white-piece track. Missing embeddings never block, so pure-motion fleets keep working.
Hive integration
Gemini condition teacher
- `condition_teacher.py` adapter scans persisted piece crops and asks Gemini for composition (`single_part`/`compound_part`/`multi_part`/ …) and condition (`clean_ok`/`minor_wear`/`dirty`/`damaged`/`trash_candidate`), emitting a `condition` sample type
- `sample_payloads.build_sample_payload` produces a first-class `condition` analysis block
- `teacher_detection`/`condition`/`classification`/`other`) so the queue can filter and backfill by type
- `SampleConditionCard` on sample detail + review pages
Runtime + channels
- `HandoffPort`
Extractions + sweep
- `cameras` router (and out of sidebar)
Base cutover (what opened this PR)
Runtime cutover (RT)
Generic purge architecture
- `PurgePort` contract and `PurgeStrategy` abstraction
Tracked pieces rebuild
- `/api/tracked/pieces` endpoint feeds list + detail + modal (no more parallel code paths)
- `SegmentRecorder` persists per-track path points, channel geometry, and wedge crops to SQLite + disk
Service extractions + ports (principle: composition wires, services coordinate)
- `SorterLifecyclePort` replaces hardware callable-globals
- `rt/projections` package with piece_dossier subscriber
- `rt/hardware/channel_callables.py`, `rt/config/channels.py`, perception runner builder → own modules
- `blob_manager` forwarder layer dissolved
Dead-code sweep
- `vision_manager` + `controller_ref` compat stubs and branches removed from cameras/steppers/training/sorting-profile/detection-config/camera_preview_hub
- `runtime_variables` module + endpoint path removed (hidden config store violating principle 4)
- `__init__.py` files, dead local variables (ruff F841) swept
- `gc_ref.xxx` field reads removed from routers
Bug fixes surfaced by live-debug workflow
- `confirmed_real` gate dropped in C2/C3 (enables ring visibility)
Docs
- `docs/sorter/architecture.md` no longer carries the stale pre-RT component map; it's a short pointer into the principles doc
Validation
- `test_boxmot_reid_tracker` + embedding-aware `test_track_transit` + C3 embedding forwarding test
- `condition_sample` payload passing
Live hardware run (2026-04-24 evening, ~5 min total)
- tracker=botsort_reid); ReID weights auto-downloaded to `blob/reid_models/osnet_x0_25_msmt17.pt` on first backend boot
- `Track` via the extended `/api/rt/tracks/{feed_id}` surface
- `tracked_global_id` resolved to exactly one classified `part_id`/`color` combo; the Police-vs-Slope class of identity swap did not reproduce. BoTSORT keeps the same `gid` across full carousel rotations of the same physical piece, which is the intended behaviour and matches operator expectations.
- `runtime-stats` widget (bottom-right UI) was reporting empty `state_machines` after the RT cutover; nothing in the rt graph was calling `RuntimeStatsCollector.observeStateTransition`. Fixed by adding a `state_observer` callback on `BaseRuntime` and wiring it in bootstrap; widget now shows current state + transition counts + per-state time shares for c1/c2/c3/c4/distributor.
- 82980aa0): HandoffPort gets an `available_slots()` probe, each dossier has a 250 ms retry cooldown after a busy rejection. C4 stops hammering the distributor at tick rate; the 2.6% acceptance rate was pure bookkeeping noise from `handoff_request` being called repeatedly while the distributor worked through a single piece.
- fde8c8c): new `PROFILE_TRANSPORT_PULSED` splits a transport move into sub-pulses with explicit settle pauses between them, so pieces on the carousel can re-grip between kicks instead of sliding under the abrupt stops. Opt-in via `RT_C4_TRANSPORT_PROFILE=pulsed`; sub-pulse size (`RT_C4_TRANSPORT_SUB_PULSE_DEG`, default 2°) and settle duration (`RT_C4_TRANSPORT_SETTLE_MS`, default 120 ms) are env-tunable so Marc can A/B without a rebuild. Default profile stays `transport`: no behaviour change unless opted in.
Notes
- `main` (200+ commits)