CatOpt-Flow: Agent Architecture Guide
=====================================
This document outlines the architectural approach used in CatOpt-Flow and establishes guidelines for contributors and automated tooling. It complements the codebase and the CI/test contracts already in place.

Overview
--------
- CatOpt-Flow is a production-oriented platform for multi-tenant ML training pipelines across heterogeneous accelerators.
- It models optimization as category-theory-inspired primitives: Objects (local tasks), Morphisms (data-exchange channels with versioned schemas), and Functors (adapters mapping device-specific problems to a vendor-agnostic representation).
- Limits/Colimits enforce global constraints, acting as an aggregator that stitches local problems into a globally consistent plan.
- An ADMM-like distributed solver runs on each node and communicates summarized statistics through a delta-sync protocol that tolerates dynamic scaling and partial failures.
- A lightweight schema registry and contract marketplace enable plug-and-play adapters for popular ML frameworks and hardware backends.
- Code generation tooling is provided to output orchestration stubs (Rust/C++) and Python bindings for rapid deployment with minimal vendor lock-in.
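
The primitives listed above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not CatOpt-Flow's actual API: the class layouts, the `compose` method, the schema-version check, and the `mem_mib`/`mem_gib` parameter names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Object:
    """A local task, e.g. one training job on one device."""
    name: str
    params: Dict[str, float]  # hypothetical device-specific parameters


@dataclass(frozen=True)
class Morphism:
    """A data-exchange channel between two Objects with a versioned schema."""
    source: str
    target: str
    schema_version: str  # the version format is an assumption

    def compose(self, other: "Morphism") -> "Morphism":
        """Return the channel `self` followed by `other`."""
        if self.target != other.source:
            raise ValueError("channels do not connect")
        if self.schema_version != other.schema_version:
            raise ValueError("schema versions differ")
        return Morphism(self.source, other.target, self.schema_version)


@dataclass(frozen=True)
class Functor:
    """An adapter mapping device-specific Objects to a vendor-agnostic form."""
    map_params: Callable[[Dict[str, float]], Dict[str, float]]

    def on_object(self, obj: Object) -> Object:
        return Object(obj.name, self.map_params(obj.params))

    def on_morphism(self, m: Morphism) -> Morphism:
        # This toy adapter keeps object names, so channels map to themselves.
        return m


if __name__ == "__main__":
    gpu_job = Object("gpu-job", {"mem_mib": 16384.0})
    to_gib = Functor(lambda p: {"mem_gib": p["mem_mib"] / 1024.0})
    print(to_gib.on_object(gpu_job).params)  # {'mem_gib': 16.0}
```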

What to Build (MVP Path)
------------------------
- Protocol skeleton with two starter adapters per platform.
- Delta-sync, a simple governance ledger, and DID-based identity primitives.
- Cross-domain demo with a simulated domain (Phase 2) and hardware-in-the-loop (HIL) validation (Phase 3).
- A minimal DSL sketch: LocalProblem/SharedVariables/PlanDelta and toy adapters to bootstrap interoperability.
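
One possible shape for the DSL sketch named above, written as plain Python dataclasses. Only the three type names come from this guide; the fields, the version counter, and the stale-delta policy in `apply_delta` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LocalProblem:
    """One node's subproblem over its own named variables (assumed layout)."""
    node_id: str
    variables: Dict[str, float]


@dataclass
class SharedVariables:
    """Values that several LocalProblems must agree on, with a version."""
    version: int = 0
    values: Dict[str, float] = field(default_factory=dict)


@dataclass(frozen=True)
class PlanDelta:
    """An incremental update to SharedVariables, tied to a base version."""
    base_version: int
    changes: Dict[str, float]


def apply_delta(shared: SharedVariables, delta: PlanDelta) -> SharedVariables:
    """Return a new SharedVariables with the delta applied.

    Deltas computed against a stale version are rejected. This is an
    assumed policy, not one taken from the repository.
    """
    if delta.base_version != shared.version:
        raise ValueError("stale delta")
    merged = {**shared.values, **delta.changes}
    return SharedVariables(version=shared.version + 1, values=merged)


if __name__ == "__main__":
    shared = SharedVariables(values={"lr": 0.1})
    shared = apply_delta(shared, PlanDelta(base_version=0, changes={"lr": 0.05}))
    print(shared.version, shared.values)  # 1 {'lr': 0.05}
```

Returning a new object instead of mutating in place keeps earlier versions available for comparison, which is convenient when a node has to replay deltas after a partial failure.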

Development Rules
-----------------
- All changes should be driven by tests. If a feature requires a new test, add it alongside the implementation.
- Use the existing `test.sh` to run the tests and verify the packaging build. The script runs `pytest` and builds the package via `python -m build`.
- Do not break the public API unless explicitly requested. If you add new classes, export them from the package's `__init__.py` to ease discoverability.
- When in doubt, add a small integration test demonstrating a 2-node ADMM interaction before expanding scope.
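
A 2-node test of the kind suggested above might look like the following sketch. It uses textbook scaled-form consensus ADMM on a toy quadratic objective, not the project's own solver; `Node`, `run_consensus`, and the choice of objective are hypothetical names and assumptions. Each node shares only the summary `x + u`, loosely mirroring the "summarized statistics" idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """One worker with a private target `a`; it minimizes 0.5 * (x - a)**2."""
    a: float
    x: float = 0.0
    u: float = 0.0  # scaled dual variable

    def local_step(self, z: float, rho: float) -> float:
        # Closed-form minimizer of 0.5*(x - a)^2 + (rho/2)*(x - z + u)^2.
        self.x = (self.a + rho * (z - self.u)) / (1.0 + rho)
        return self.x + self.u  # the only value sent to the aggregator

    def dual_step(self, z: float) -> None:
        # Accumulate the disagreement between the local and the global value.
        self.u += self.x - z


def run_consensus(nodes: List[Node], rho: float = 1.0, iters: int = 60) -> float:
    """Run consensus ADMM and return the agreed global value z."""
    z = 0.0
    for _ in range(iters):
        summaries = [n.local_step(z, rho) for n in nodes]
        z = sum(summaries) / len(summaries)  # global averaging step
        for n in nodes:
            n.dual_step(z)
    return z


def test_two_node_admm_reaches_consensus():
    nodes = [Node(a=1.0), Node(a=3.0)]
    z = run_consensus(nodes)
    # The minimizer of the summed objectives is the mean of the targets.
    assert abs(z - 2.0) < 1e-6
    assert all(abs(n.x - z) < 1e-6 for n in nodes)
```

Because the test function follows the `test_*` naming convention, `pytest` collects it without extra configuration.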

Publishing and Governance
-------------------------
- Publishable artifacts should include a clear README, a small DSL sketch, and a contract registry skeleton.
- Once all required checks pass, a `READY_TO_PUBLISH` file in the repository root signals that the package is ready to publish.
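
The marker-file convention could be maintained by a small helper like the one below. This is a sketch under assumptions: the function name, the `checks` mapping, and the choice of an empty marker file are hypothetical, and the repository may produce the file differently.

```python
import tempfile
from pathlib import Path
from typing import Mapping


def update_ready_marker(repo_root: Path, checks: Mapping[str, bool]) -> bool:
    """Create READY_TO_PUBLISH only when every check passed; else remove it."""
    marker = repo_root / "READY_TO_PUBLISH"
    ready = bool(checks) and all(checks.values())
    if ready:
        marker.touch()  # empty file; its contents are an assumption
    elif marker.exists():
        marker.unlink()  # never leave a stale signal behind
    return ready


if __name__ == "__main__":
    # Demonstrate against a throwaway directory, not a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        print(update_ready_marker(Path(tmp), {"pytest": True, "build": True}))  # True
```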

Contributing
------------
- Issues and PRs should reference the relevant sections of this guide and align with the MVP roadmap.
- Documentation updates should accompany code changes.

This file is intentionally lightweight but should be kept current with repository changes.