CatOpt-Flow: Agent Architecture Guide
=====================================
This document outlines the architectural approach used in CatOpt-Flow and establishes guidelines for contributors and automated tooling. It complements the codebase and the CI/test contracts already in place.
Overview
- CatOpt-Flow is a production-oriented platform for multi-tenant ML training pipelines across heterogeneous accelerators.
- It models optimization as category-theory-inspired primitives: Objects (local tasks), Morphisms (data-exchange channels with versioned schemas), and Functors (adapters mapping device-specific problems to a vendor-agnostic representation).
- Limits/Colimits enforce global constraints by acting as an aggregator that stitches local problems into a globally consistent plan.
- An ADMM-like distributed solver runs on each node and communicates summarized statistics through a delta-sync protocol that tolerates dynamic scaling and partial failures.
- A lightweight schema registry and contract marketplace enable plug-and-play adapters for popular ML frameworks and hardware backends.
- Code generation tooling emits orchestration stubs (Rust/C++) and Python bindings for rapid deployment with minimal vendor lock-in.
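
To make the delta-sync idea from the overview concrete, here is a minimal sketch, assuming each node summarizes its state as a flat dictionary of named statistics. The names `make_delta` and `apply_delta`, the statistic keys, and the tolerance are hypothetical and not taken from the CatOpt-Flow codebase; the actual protocol may differ.

```python
from typing import Dict

Stats = Dict[str, float]


def make_delta(previous: Stats, current: Stats, tol: float = 1e-9) -> Stats:
    """Return only the entries of `current` that are new or changed versus `previous`."""
    return {
        key: value
        for key, value in current.items()
        if key not in previous or abs(value - previous[key]) > tol
    }


def apply_delta(state: Stats, delta: Stats) -> Stats:
    """Merge a delta into a copy of the receiver's last known state."""
    merged = dict(state)
    merged.update(delta)
    return merged


# Hypothetical summaries from two consecutive solver rounds on one node.
previous = {"primal_residual": 0.50, "dual_residual": 0.20, "loss": 1.30}
current = {"primal_residual": 0.25, "dual_residual": 0.20, "loss": 1.10}

delta = make_delta(previous, current)    # only the changed statistics are sent
restored = apply_delta(previous, delta)  # the receiver rebuilds the full summary
```

This sketch does not handle removed keys, retries, or nodes joining mid-run; a protocol that tolerates dynamic scaling and partial failures would need at least versioning or acknowledgements on top of it.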
What to Build (MVP Path)
- Protocol skeleton with two starter adapters per platform.
- Delta-sync, simple governance ledger, and identity primitives (DID-based).
- Cross-domain demo with a simulated domain (Phase 2) and hardware-in-the-loop (HIL) validation (Phase 3).
- A minimal DSL sketch: LocalProblem/SharedVariables/PlanDelta and toy adapters to bootstrap interoperability.
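
One possible shape for the DSL sketch named above, written as plain Python dataclasses. Only the three type names (LocalProblem, SharedVariables, PlanDelta) come from this guide; every field, the versioning rule, and the `toy_adapter` and `apply_plan_delta` helpers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LocalProblem:
    node_id: str
    objective: str  # hypothetical: a label for the local objective
    variables: Dict[str, float] = field(default_factory=dict)


@dataclass
class SharedVariables:
    version: int
    values: Dict[str, float] = field(default_factory=dict)


@dataclass
class PlanDelta:
    base_version: int  # version of SharedVariables this delta was computed against
    changes: Dict[str, float] = field(default_factory=dict)


def toy_adapter(problem: LocalProblem, shared: SharedVariables) -> PlanDelta:
    """Propose the node's local values for shared keys whose values differ."""
    changes = {
        key: value
        for key, value in problem.variables.items()
        if key in shared.values and shared.values[key] != value
    }
    return PlanDelta(base_version=shared.version, changes=changes)


def apply_plan_delta(shared: SharedVariables, delta: PlanDelta) -> SharedVariables:
    """Apply a delta, rejecting it if it was computed against an older version."""
    if delta.base_version != shared.version:
        raise ValueError("stale PlanDelta: base_version does not match")
    values = dict(shared.values)
    values.update(delta.changes)
    return SharedVariables(version=shared.version + 1, values=values)


shared = SharedVariables(version=1, values={"batch_size": 32.0, "lr": 0.1})
problem = LocalProblem(
    node_id="gpu-0",
    objective="min_step_time",
    variables={"batch_size": 64.0, "lr": 0.1, "local_only": 1.0},
)
plan_delta = toy_adapter(problem, shared)
updated = apply_plan_delta(shared, plan_delta)
```

Rejecting a delta whose `base_version` is stale is one simple way to keep concurrent adapters from overwriting each other; it is a design choice of this sketch, not a documented CatOpt-Flow rule.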
Development Rules
- All changes should be driven by tests. If a feature requires a new test, add it alongside the implementation.
- Use the existing `test.sh` to run the test suite and validate the packaging build. The script runs `pytest` and builds the package via `python -m build`.
- Do not break the public API unless explicitly requested. If you add new classes, export them from the package’s `__init__` to ease discoverability.
- When in doubt, add a small integration test demonstrating a 2-node ADMM interaction before expanding scope.
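
As an illustration of the 2-node integration test suggested above, the following sketch runs textbook scaled-form consensus ADMM on two scalar quadratic objectives, f_i(x) = 0.5 * (x - a_i)^2, and checks that both nodes agree on the average of their targets. It uses no CatOpt-Flow APIs: `admm_consensus` is a hypothetical stand-in for the real solver, and the objectives are chosen only because their updates have a closed form.

```python
from typing import List, Tuple


def admm_consensus(
    targets: List[float], rho: float = 1.0, iterations: int = 60
) -> Tuple[List[float], float]:
    """Scaled-form consensus ADMM for f_i(x) = 0.5 * (x - a_i)^2, one a_i per node."""
    n = len(targets)
    x = [0.0] * n  # local primal variables, one per node
    u = [0.0] * n  # scaled dual variables, one per node
    z = 0.0        # shared consensus variable
    for _ in range(iterations):
        # Local step: closed-form minimizer of f_i(x) + (rho / 2) * (x - z + u_i)^2.
        x = [(a + rho * (z - u_i)) / (1.0 + rho) for a, u_i in zip(targets, u)]
        # Aggregation step: average of the local estimates plus scaled duals.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: accumulate each node's disagreement with the consensus.
        u = [u_i + x_i - z for x_i, u_i in zip(x, u)]
    return x, z


def test_two_node_admm_reaches_consensus() -> None:
    x, z = admm_consensus([1.0, 3.0])
    # The minimizer of the summed objectives is the mean of the targets.
    assert abs(z - 2.0) < 1e-6
    # Both nodes should agree with the consensus value.
    assert all(abs(x_i - z) < 1e-6 for x_i in x)
```

A real integration test would replace `admm_consensus` with two solver instances exchanging messages, but the assertions (agreement between nodes, and agreement with a known optimum) carry over unchanged.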
Publishing and Governance
- Publishable artifacts should include a clear README, a small DSL sketch, and a contract registry skeleton.
- A ready-to-publish signal is provided via a READY_TO_PUBLISH file in the repository root once all required checks pass.
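
The readiness signal could be maintained by a small gate like the sketch below, which creates READY_TO_PUBLISH only when every required artifact is present and removes a stale marker otherwise. The artifact names in `required` and the helper names are hypothetical, and checking for file presence is only a stand-in for whatever "all required checks" means in the project's CI.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

MARKER_NAME = "READY_TO_PUBLISH"


def missing_artifacts(repo_root: Path, required: Iterable[str]) -> List[str]:
    """List the required paths that do not exist under the repository root."""
    return [name for name in required if not (repo_root / name).exists()]


def update_ready_marker(repo_root: Path, required: Iterable[str]) -> bool:
    """Create the marker if nothing is missing; otherwise remove any stale marker."""
    marker = repo_root / MARKER_NAME
    if missing_artifacts(repo_root, required):
        marker.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
        return False
    marker.touch()
    return True


# Demonstration in a throwaway directory; the artifact names are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    required = ["README.md", "dsl_sketch.md", "contract_registry"]
    (root / "README.md").write_text("# demo\n")
    first = update_ready_marker(root, required)   # two artifacts still missing
    (root / "dsl_sketch.md").write_text("LocalProblem / SharedVariables / PlanDelta\n")
    (root / "contract_registry").mkdir()
    second = update_ready_marker(root, required)  # everything is present now
    marker_exists = (root / MARKER_NAME).exists()
```

Removing the marker when an artifact goes missing keeps the file from outliving the state it is meant to certify.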
Contributing
- Open issues and PRs should reference sections of this guide and align with the MVP roadmap.
- Documentation updates should accompany code changes.
This file is intentionally lightweight but should be kept current with repository changes.