# Technical Debt & Refactoring Plan
**Created**: 2026-01-21
**Status**: Pending (UX improvements take priority)
**Last Updated**: 2026-01-21

---

## Executive Summary

GameConfigIdeaEditBrainstorm is an AI-powered game design platform with 24,343 lines of Python across 79 files. It is feature-rich, but it has accumulated technical debt that hurts maintainability, testability, and extensibility.

**Decision**: Focus on UX improvements first to better understand real usage patterns, then return to this restructuring plan with concrete insights.

---

## Current Architecture Overview

```
├── app.py (4,104 lines)              # Monolithic main file - PRIMARY CONCERN
├── Core Engine
│   ├── my_text_game_engine_attempt.py
│   ├── game_state.py
│   ├── condition_evaluator.py
│   └── game_configs.py
├── AI/ML Integration
│   ├── leveraging_machine_learning.py
│   └── llm_playtester.py
├── UI Tabs (ui_tabs/)                # Already modularized - GOOD
├── Exporters (exporters/)            # 14 platforms - GOOD separation
├── External Ports
│   ├── narrativeengine_hfport/
│   ├── storygenattempt_hfport/
│   └── dnd_game_master_hfport/
└── Scenario Templates (*_scenarios.py)
```

---

## Issues by Priority

### P0 - Critical (Blocks scaling)

| Issue | Impact | Effort | Notes |
|-------|--------|--------|-------|
| Monolithic `app.py` (4,104 lines) | Hard to maintain, test, or onboard contributors | High | Break into feature modules |
| No automated tests | Can't refactor safely | Medium | Add pytest suite |
| Tight UI-engine coupling | Can't unit test game logic | High | Extract pure engine layer |

### P1 - High (Impacts development velocity)

| Issue | Impact | Effort | Notes |
|-------|--------|--------|-------|
| Mixed state management (Player + GameState) | Confusing, potential bugs | Medium | Complete migration to GameState |
| Hardcoded list of 60+ LLMs | Hard to maintain/extend | Low | Create ModelRegistry class |
| Sparse error handling | Silent failures confuse users | Medium | Add structured logging |
| Lambda consequences + declarative effects coexisting | Inconsistent, harder to validate | Medium | Migrate all to declarative |

### P2 - Medium (Quality of life)

| Issue | Impact | Effort | Notes |
|-------|--------|--------|-------|
| Code duplication in exporters | Maintenance burden | Medium | Extract base exporter class |
| Missing type hints | Weaker IDE support, type errors caught only at runtime | Low | Add progressively |
| Inconsistent naming | Cognitive load | Low | Establish conventions |
| Magic strings/numbers | Typo-prone, hard to refactor | Low | Create enums/constants |

### P3 - Low (Nice to have)

| Issue | Impact | Effort | Notes |
|-------|--------|--------|-------|
| Exporter quality variance | Some exports may fail | Medium | Add capability metadata |
| No caching for LLM inferences | Repeated work | Medium | Add caching layer |
| Sparse docstrings | Onboarding difficulty | Low | Document as we go |
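
The caching row above could start as small as the sketch below. Everything here is an assumption for illustration: `InferenceCache`, its method names, and the `infer(model, prompt, params)` callable are hypothetical, and the cache is in-memory only. Reusing a cached result is only appropriate where repeated outputs are acceptable, for example with deterministic sampling settings.

```python
import hashlib
import json
from typing import Callable, Dict


class InferenceCache:
    """Hypothetical in-memory cache keyed by (model, prompt, params)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def make_key(model: str, prompt: str, params: dict) -> str:
        # sort_keys makes the key independent of dict insertion order.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_compute(
        self,
        model: str,
        prompt: str,
        params: dict,
        infer: Callable[[str, str, dict], str],
    ) -> str:
        """Return a cached result, or call `infer` once and store its output."""
        key = self.make_key(model, prompt, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = infer(model, prompt, params)
        self._store[key] = result
        return result
```

Tracking `hits` and `misses` gives a cheap way to check whether the cache is worth keeping before investing in persistence or eviction.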

---

## Proposed Refactoring Phases

### Phase 1: Foundation (After UX work)
- [ ] Extract `app.py` into logical modules:
  - `app_core.py` - Gradio app setup, shared state
  - `app_generation.py` - Content generation handlers
  - `app_playtest.py` - Playtest/preview handlers
  - `app_export.py` - Export handlers
  - `app_media.py` - Media generation handlers
- [ ] Add basic pytest infrastructure
- [ ] Create constants/enums for magic strings
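
As a rough illustration of the last two items, magic strings could move into an enum with a small parser, and the first pytest file could exercise it. The names below (`ExportPlatform`, `parse_platform`, and the placeholder platform values) are hypothetical and not taken from the codebase.

```python
from enum import Enum


class ExportPlatform(str, Enum):
    """Placeholder platform identifiers; the real values would come from exporters/."""

    PLATFORM_A = "platform_a"
    PLATFORM_B = "platform_b"


def parse_platform(raw: str) -> ExportPlatform:
    """Convert user or config input into an ExportPlatform, failing loudly on typos."""
    try:
        return ExportPlatform(raw.strip().lower())
    except ValueError:
        valid = ", ".join(p.value for p in ExportPlatform)
        raise ValueError(
            f"Unknown platform {raw!r}; expected one of: {valid}"
        ) from None


# A pytest-style test that could live in a hypothetical tests/test_constants.py.
def test_parse_platform_normalizes_input():
    assert parse_platform("  Platform_A ") is ExportPlatform.PLATFORM_A
```

Mixing in `str` keeps the enum members comparable to the existing string literals, so call sites can be migrated one at a time.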

### Phase 2: Engine Isolation
- [ ] Extract pure game engine (no Gradio dependencies)
- [ ] Complete Player → GameState migration
- [ ] Migrate lambda consequences to declarative effects
- [ ] Add engine unit tests
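
One possible shape for declarative effects is sketched below, with no Gradio imports so it can be unit tested directly. The `GameState` fields, the effect schema (`type`, `item`, `flag`, `amount`), and the function names are all assumptions for illustration; the real `game_state.py` may look different.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

Effect = Dict[str, Any]


@dataclass
class GameState:
    """Stand-in for the real GameState; these fields are assumed."""

    inventory: List[str] = field(default_factory=list)
    flags: Dict[str, bool] = field(default_factory=dict)
    money: int = 0


def _add_item(state: GameState, effect: Effect) -> None:
    state.inventory.append(effect["item"])


def _set_flag(state: GameState, effect: Effect) -> None:
    state.flags[effect["flag"]] = bool(effect.get("value", True))


def _add_money(state: GameState, effect: Effect) -> None:
    state.money += int(effect["amount"])


# Hypothetical effect vocabulary: handler plus required keys per effect type.
EFFECT_HANDLERS: Dict[str, Callable[[GameState, Effect], None]] = {
    "add_item": _add_item,
    "set_flag": _set_flag,
    "add_money": _add_money,
}
REQUIRED_KEYS: Dict[str, Set[str]] = {
    "add_item": {"item"},
    "set_flag": {"flag"},
    "add_money": {"amount"},
}


def validate_effects(effects: List[Effect]) -> List[str]:
    """Return human-readable problems without touching any state."""
    errors: List[str] = []
    for i, effect in enumerate(effects):
        kind = effect.get("type")
        if kind not in EFFECT_HANDLERS:
            errors.append(f"effect {i}: unknown type {kind!r}")
            continue
        missing = REQUIRED_KEYS[kind] - set(effect)
        if missing:
            errors.append(f"effect {i}: missing {sorted(missing)}")
    return errors


def apply_effects(state: GameState, effects: List[Effect]) -> GameState:
    """Validate the whole list first so a bad effect never leaves state half-updated."""
    errors = validate_effects(effects)
    if errors:
        raise ValueError("; ".join(errors))
    for effect in effects:
        EFFECT_HANDLERS[effect["type"]](state, effect)
    return state
```

Unlike a lambda, an effect list is plain data: it can be validated at config load time, serialized to JSON, and inspected by exporters.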

### Phase 3: ML Infrastructure
- [ ] Create ModelRegistry class with metadata
- [ ] Add structured error handling + logging
- [ ] Implement inference caching
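
A minimal sketch of what a registry with metadata might look like follows. The class and field names (`ModelInfo`, `family`, `params_billions`, `supports_chat`) are assumptions chosen for illustration, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical metadata for one model entry."""

    model_id: str
    family: str
    params_billions: float
    supports_chat: bool = True


class ModelRegistry:
    """Single source of truth replacing a hardcoded model list."""

    def __init__(self) -> None:
        self._models: Dict[str, ModelInfo] = {}

    def register(self, info: ModelInfo) -> None:
        if info.model_id in self._models:
            raise ValueError(f"Model already registered: {info.model_id}")
        self._models[info.model_id] = info

    def get(self, model_id: str) -> ModelInfo:
        try:
            return self._models[model_id]
        except KeyError:
            raise KeyError(f"Unknown model: {model_id}") from None

    def ids(self) -> List[str]:
        """Sorted ids, suitable for populating a dropdown."""
        return sorted(self._models)

    def filter(
        self,
        *,
        family: Optional[str] = None,
        max_params_billions: Optional[float] = None,
    ) -> List[ModelInfo]:
        """Return entries matching every criterion that was supplied."""
        results = []
        for info in self._models.values():
            if family is not None and info.family != family:
                continue
            if (
                max_params_billions is not None
                and info.params_billions > max_params_billions
            ):
                continue
            results.append(info)
        return sorted(results, key=lambda m: m.model_id)
```

With metadata attached, UI code can ask for "chat models under a given size" instead of maintaining parallel lists by hand.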

### Phase 4: Polish
- [ ] Extract shared exporter base class
- [ ] Add type hints throughout
- [ ] Comprehensive documentation pass
- [ ] Add integration tests
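
The shared exporter base class might follow a template-method shape like the sketch below. The config layout (a dict of state ids to dicts with a `description` key), the `experimental` flag, and the class names are assumptions for illustration only.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseExporter(ABC):
    """Hypothetical base class: shared validation and assembly, per-platform rendering."""

    name: str = "base"
    # Capability metadata: lets the UI label exporters that are not production-ready.
    experimental: bool = True

    def export(self, config: Dict[str, Dict[str, Any]]) -> str:
        """Validate once, then render states in a stable (sorted) order."""
        self.validate(config)
        body = "\n".join(
            self.render_state(state_id, state)
            for state_id, state in sorted(config.items())
        )
        return self.header() + body + self.footer()

    def validate(self, config: Dict[str, Dict[str, Any]]) -> None:
        if not config:
            raise ValueError("config has no states")
        for state_id, state in config.items():
            if "description" not in state:
                raise ValueError(f"state {state_id!r} has no description")

    def header(self) -> str:
        return ""

    def footer(self) -> str:
        return ""

    @abstractmethod
    def render_state(self, state_id: str, state: Dict[str, Any]) -> str:
        """Return the platform-specific text for one state."""


class PlainTextExporter(BaseExporter):
    """Smallest possible subclass, used here only to show the pattern."""

    name = "plain_text"
    experimental = False

    def render_state(self, state_id: str, state: Dict[str, Any]) -> str:
        return f"[{state_id}] {state['description']}"
```

Each concrete exporter then only overrides `render_state` (and optionally `header` and `footer`), so validation fixes land in one place.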

---

## Metrics to Track

- Lines in `app.py` (target: <500)
- Test coverage % (target: >60%)
- Average function length (target: <50 lines)
- Number of untyped functions (target: 0)
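
Three of these metrics can be computed with the standard library alone. The sketch below uses `ast` to measure a source string; `measure_source` and the rule it applies (a function counts as typed only if it has a return annotation and every parameter other than `self`/`cls` is annotated) are assumptions, so adjust the rule to match whatever convention the project settles on.

```python
import ast
from typing import Dict, Union

FunctionNode = Union[ast.FunctionDef, ast.AsyncFunctionDef]


def _is_typed(func: FunctionNode) -> bool:
    """True if the return type and all non-self/cls parameters are annotated."""
    args = func.args
    params = args.posonlyargs + args.args + args.kwonlyargs
    if args.vararg is not None:
        params.append(args.vararg)
    if args.kwarg is not None:
        params.append(args.kwarg)
    named = [p for p in params if p.arg not in ("self", "cls")]
    return func.returns is not None and all(
        p.annotation is not None for p in named
    )


def measure_source(source: str) -> Dict[str, float]:
    """Compute line count, average function length, and untyped-function count."""
    tree = ast.parse(source)
    funcs = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # end_lineno is available on AST nodes in Python 3.8+.
    lengths = [f.end_lineno - f.lineno + 1 for f in funcs]
    return {
        "lines": len(source.splitlines()),
        "functions": len(funcs),
        "avg_function_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "untyped_functions": sum(1 for f in funcs if not _is_typed(f)),
    }
```

To track `app.py` over time, read the file and pass its text in, for example `measure_source(pathlib.Path("app.py").read_text(encoding="utf-8"))`.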

---

## UX Insights to Gather First

Before restructuring, document insights from UX work:

- [ ] Which tabs/features are actually used most?
- [ ] What are common user workflows?
- [ ] Where do users get confused or stuck?
- [ ] Which exporters are production-quality vs experimental?
- [ ] What error messages do users encounter?

These insights will inform which modules to prioritize and how to structure the codebase for real usage patterns.

---

## Notes / Updates

*Add notes here as UX work progresses*

- 2026-01-21: Plan created. Starting UX improvements first.