Backend Architecture Deep Dive

The Yappa Knowledge Hub backend is built with Symfony 7 and follows a strict layered architecture that keeps external interfaces, business logic, and infrastructure cleanly separated.

Architecture Layers


The Dual-Storage Pattern

The core architectural decision is a Dual-Storage approach combining Notion's collaborative power with the speed of a local SQLite database.

  • SQLite: High-performance local cache. All reads for the Slack Bot and API come from here.
  • Notion: Master UI and persistent cloud storage. Collaborative editing and long-term storage happen here.
  • Sync Engine: A specialized service layer ensuring consistency between both worlds.
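
The read/write split above can be sketched as follows. This is a minimal illustration in Python rather than the project's PHP, and `FakeNotionClient`, `DualStore`, and the `knowledge` table layout are hypothetical names chosen for the example. It assumes the sync happens synchronously on write, which the real Sync Engine may not do.

```python
import sqlite3


class FakeNotionClient:
    """Hypothetical stand-in for a Notion API client; keeps pages in memory."""

    def __init__(self):
        self.pages = {}

    def upsert_page(self, key, title):
        self.pages[key] = title


class DualStore:
    """Writes go to SQLite and are then pushed to Notion; reads only touch SQLite."""

    def __init__(self, db, notion):
        self.db = db
        self.notion = notion
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS knowledge "
            "(id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )

    def save(self, title):
        cursor = self.db.execute("INSERT INTO knowledge (title) VALUES (?)", (title,))
        self.db.commit()
        item_id = cursor.lastrowid
        # Sync step (assumed synchronous here for simplicity).
        self.notion.upsert_page(item_id, title)
        return item_id

    def find(self, item_id):
        # Reads never call Notion: the local database is the only read path.
        row = self.db.execute(
            "SELECT title FROM knowledge WHERE id = ?", (item_id,)
        ).fetchone()
        return row[0] if row else None


notion = FakeNotionClient()
store = DualStore(sqlite3.connect(":memory:"), notion)
item_id = store.save("Symfony 7 upgrade notes")
print(store.find(item_id))
```

Keeping the Notion client behind a single `save` path means a slow or unavailable Notion API can only affect writes, never the reads served to the Slack Bot and API.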

Project Structure

backend/
├── src/
│   ├── Command/          # CLI tools (sync, digest generation, AI test)
│   ├── Controller/       # HTTP entry points (6 controllers, 18 routes)
│   ├── Entity/           # Doctrine models (Category, Knowledge, Digest, DigestDelivery, Subscription)
│   ├── Repository/       # Database query logic
│   ├── Mapper/           # Entity ↔ DTO mapping
│   ├── Constants/        # No magic strings
│   └── Service/          # Business logic (38 services)
│       ├── Orchestrator/ # High-level workflows
│       ├── Notion/       # Notion API, sync, content
│       ├── Digest/       # Digest generation, distribution
│       ├── Knowledge/    # Knowledge persistence, query
│       ├── Category/     # Category persistence, query
│       ├── Subscription/ # Subscription management
│       ├── Ai/           # AI provider, content cleaning
│       ├── Content/      # URL extraction
│       ├── Slack/        # Slack delivery
│       └── Transport/    # HTTP transport
└── tests/                # PHPUnit tests

Layers of Responsibility

1. Presentation (Controllers/Commands)

Handles raw input from HTTP requests or the CLI. It validates basic request structure and delegates all business logic to orchestrators.
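
As a rough sketch of that division of labour (in Python rather than the project's PHP), the entry point below only checks the shape of the request and hands everything else to a callable. The function name, the `title` and `category` fields, and the status codes are assumptions made for illustration.

```python
import json


def create_knowledge_action(raw_body, orchestrate):
    """Validate request shape only, then delegate; returns (status_code, body)."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be valid JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "body must be a JSON object"}
    missing = [field for field in ("title", "category") if not payload.get(field)]
    if missing:
        return 400, {"error": "missing fields: " + ", ".join(missing)}
    # No business rules here: the orchestrator decides what happens next.
    return 201, orchestrate(payload)


def fake_orchestrator(payload):
    """Hypothetical stand-in for an orchestrator service."""
    return {"id": 1, "title": payload["title"]}


ok = create_knowledge_action(
    '{"title": "Doctrine tips", "category": "backend"}', fake_orchestrator
)
bad = create_knowledge_action('{"title": "Doctrine tips"}', fake_orchestrator)
print(ok)
print(bad)
```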

2. Orchestration (Orchestrators)

Coordinates multi-service workflows. Follows a pipeline pattern: Resolve → Map → Persist → Sync.
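
The four pipeline stages can be illustrated with a small Python sketch (Python is used here only for brevity). `KnowledgePipeline`, the `Knowledge` dataclass, and the in-memory lists standing in for the repository and the Notion sync service are all hypothetical; the actual orchestrators may divide the work differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Knowledge:
    """Hypothetical entity with only the fields this example needs."""
    title: str
    category: str
    id: Optional[int] = None


class KnowledgePipeline:
    def __init__(self, categories):
        self.categories = categories  # known category names
        self.saved = []               # stand-in for the SQLite repository
        self.synced = []              # stand-in for the Notion sync service
        self.trace = []               # records the order in which stages run

    def resolve(self, payload):
        self.trace.append("resolve")
        name = payload["category"].strip().lower()
        if name not in self.categories:
            raise ValueError(f"unknown category: {name}")
        return name

    def map(self, payload, category):
        self.trace.append("map")
        return Knowledge(title=payload["title"].strip(), category=category)

    def persist(self, entity):
        self.trace.append("persist")
        entity.id = len(self.saved) + 1
        self.saved.append(entity)
        return entity

    def sync(self, entity):
        self.trace.append("sync")
        self.synced.append(entity.id)

    def handle(self, payload):
        # Resolve -> Map -> Persist -> Sync, in that order.
        category = self.resolve(payload)
        entity = self.persist(self.map(payload, category))
        self.sync(entity)
        return entity


pipeline = KnowledgePipeline(categories={"backend", "frontend"})
item = pipeline.handle({"title": " Doctrine tips ", "category": "Backend"})
print(pipeline.trace)
```

Because syncing runs last, a failure to resolve the category stops the workflow before anything is written locally or sent to Notion.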

3. Domain (Entities + Services)

Rich domain models and focused business logic services (highlight generation, content formatting, date range parsing).

4. Infrastructure (Repositories + External Clients)

Handles communication with the outside world (SQLite via Doctrine, the Notion API, and the AI provider API, OpenAI or OpenRouter).


Core Lifecycles

Knowledge Capture

  1. Slack Bot receives message → Calls backend API.
  2. KnowledgeOrchestrator persists to SQLite + syncs to Notion.
  3. The AI summary is generated lazily during digest creation, not during capture.
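
The lazy-summary behaviour in the last step can be sketched like this, in Python for brevity. `KnowledgeItem`, `LazySummarizer`, and `fake_ai` are hypothetical names; the sketch assumes the summary is stored on the item once generated, which the source does not state explicitly.

```python
class KnowledgeItem:
    def __init__(self, text):
        self.text = text
        self.summary = None  # left empty at capture time


class LazySummarizer:
    """Generates a summary only when one is first requested, then reuses it."""

    def __init__(self, summarize):
        self.summarize = summarize
        self.calls = 0  # counts how often the (expensive) AI call is made

    def summary_for(self, item):
        if item.summary is None:
            self.calls += 1
            item.summary = self.summarize(item.text)
        return item.summary


def fake_ai(text):
    """Stand-in for an AI call: returns the first sentence of the text."""
    return text.split(".")[0] + "."


item = KnowledgeItem("SQLite serves all reads. Notion stays the master UI.")
summarizer = LazySummarizer(fake_ai)
first = summarizer.summary_for(item)   # triggers the AI call
second = summarizer.summary_for(item)  # reuses the stored summary
print(first, summarizer.calls)
```

Deferring the call keeps capture fast and avoids paying for summaries of items that never end up in a digest.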

Digest Generation (7-Step Pipeline)

  1. Resolve category
  2. Parse date range
  3. Query knowledge items
  4. Generate AI highlights (lazy, 2-3 sentence Dutch summaries)
  5. Format as Markdown
  6. Create entity
  7. Persist + sync to Notion
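
A compact sketch of the date-range, query, highlight, and Markdown-formatting stages follows, in Python for brevity. The `last-N-days` range syntax, the dictionary item shape, and the `highlight` callable are assumptions made for this example; entity creation and persistence are left out.

```python
from datetime import date, timedelta


def parse_range(spec, today):
    """Parse a hypothetical 'last-N-days' spec into an inclusive (start, end) pair."""
    prefix, count, unit = spec.split("-")
    if prefix != "last" or unit != "days":
        raise ValueError(f"unsupported range: {spec}")
    return today - timedelta(days=int(count)), today


def build_digest(category, spec, items, highlight, today):
    start, end = parse_range(spec, today)
    # Query step: keep items of this category inside the date range.
    selected = [
        item for item in items
        if item["category"] == category and start <= item["created"] <= end
    ]
    # Format step: one Markdown bullet per item, with its generated highlight.
    lines = [f"# {category} digest ({start.isoformat()} to {end.isoformat()})", ""]
    for item in selected:
        lines.append(f"- **{item['title']}**: {highlight(item['text'])}")
    return "\n".join(lines)


items = [
    {"category": "backend", "title": "Doctrine tips",
     "text": "Use repositories.", "created": date(2024, 5, 6)},
    {"category": "backend", "title": "Old note",
     "text": "Stale.", "created": date(2024, 1, 1)},
    {"category": "frontend", "title": "Vue",
     "text": "Other.", "created": date(2024, 5, 6)},
]
# str.upper stands in for the AI highlight generator.
digest = build_digest("backend", "last-7-days", items, str.upper, date(2024, 5, 8))
print(digest)
```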

Digest Distribution

  1. Resolve category subscribers + initiator
  2. Convert Markdown → Slack mrkdwn → Send via DM
  3. Track per-user delivery status with retry support
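
The distribution steps can be sketched as below, in Python for brevity. The converter covers only a small Markdown subset (headings, bold, links) and does not handle nesting. The status labels, the immediate-retry loop, and `flaky_send` are assumptions for illustration, since the source does not say how retries are scheduled.

```python
import re


def to_mrkdwn(markdown):
    """Convert a small Markdown subset to Slack mrkdwn (headings, bold, links)."""
    # Slack has no heading syntax, so headings become bold lines.
    text = re.sub(r"^#{1,6}[ \t]+(.+)$", r"*\1*", markdown, flags=re.MULTILINE)
    # Markdown **bold** becomes Slack *bold*.
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)
    # Markdown [label](url) becomes Slack <url|label>.
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"<\2|\1>", text)
    return text


def deliver(recipients, message, send, max_attempts=3):
    """Send a DM to each recipient and record a per-user delivery status."""
    statuses = {}
    for user in recipients:
        statuses[user] = "failed"
        for _ in range(max_attempts):
            try:
                send(user, message)
                statuses[user] = "delivered"
                break
            except ConnectionError:
                continue  # retry until max_attempts is exhausted
    return statuses


attempts = {}


def flaky_send(user, message):
    """Hypothetical sender: U2 fails once, U3 always fails."""
    attempts[user] = attempts.get(user, 0) + 1
    if user == "U3" or (user == "U2" and attempts[user] == 1):
        raise ConnectionError("simulated Slack outage")


subscribers, initiator = ["U1", "U2", "U3"], "U1"
# The initiator may also be a subscriber, so de-duplicate while keeping order.
recipients = list(dict.fromkeys(subscribers + [initiator]))
message = to_mrkdwn("# Digest\n**New:** [docs](https://example.com)")
statuses = deliver(recipients, message, flaky_send)
print(message)
print(statuses)
```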

Technical Context

  • Framework: Symfony 7
  • Runtime: PHP 8.2+
  • ORM: Doctrine
  • Database: SQLite 3
  • AI: OpenAI / OpenRouter via AiProviderInterface
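
To illustrate why an interface sits between the services and the AI vendors, here is a Python analogue of that abstraction. Only the name `AiProviderInterface` comes from the stack list above; the `complete` method, `EchoProvider`, and `make_provider` are hypothetical and make no real API calls.

```python
from typing import Protocol


class AiProvider(Protocol):
    """Python analogue of AiProviderInterface; the method name is assumed."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in provider that echoes the prompt instead of calling an API."""

    def __init__(self, label: str):
        self.label = label

    def complete(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


def make_provider(name: str) -> AiProvider:
    """Pick a provider by configuration value (hypothetical factory)."""
    providers = {
        "openai": EchoProvider("openai"),
        "openrouter": EchoProvider("openrouter"),
    }
    if name not in providers:
        raise ValueError(f"unknown AI provider: {name}")
    return providers[name]


def summarize(provider: AiProvider, text: str) -> str:
    # Callers depend only on the interface, never on a concrete vendor.
    return provider.complete(f"Summarize in 2-3 sentences: {text}")


result = summarize(make_provider("openrouter"), "Dual-storage keeps reads local.")
print(result)
```

Swapping vendors then becomes a configuration change rather than an edit to every service that needs a summary.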

TIP

For a deep dive into the full 10-layer architecture, see Layered Architecture.