Target Worker (specsplit.workers.target)
The Target Worker ("The Tortoise") runs the large, accurate model and verifies draft speculative token trees using tree-attention.
It returns the longest accepted token prefix (and an optional correction token when the draft diverges from the target distribution).
Responsibilities
- Load the configured target model and tokenizer.
- Maintain per-session KV-cache state (
session_id) to avoid prompt recomputation across verification rounds. - Perform batched tree-attention to score the draft tree efficiently.
- Roll back/crop cached KV state to the accepted prefix.
Where the code lives
specsplit/workers/target/engine.pyTargetEngine: verification + session cache orchestrationVerificationResult: accepted/correction outputs + acceptance ratespecsplit/workers/target/tree_attn.py- tree-attention mask construction and position/id logic
specsplit/workers/target/kv_cache.pyStaticKVCacheand rollback/compaction operationsspecsplit/workers/target/service.py- gRPC handlers for
TargetService
Entry points
- Run the worker:
python -m specsplit.workers.target.service
Relevant tests
tests/unit/test_target_engine.py(session caching + rollback)tests/unit/test_tree_attn.py(mask + tree attention helpers)tests/unit/test_verification.py(greedy/stochastic verification math)