MVP Scope — What Was Delivered
Overview
The Yappa Knowledge Hub MVP is a complete, working system that enables teams to capture, organize, and digest knowledge through Slack with AI-generated summaries and Notion integration.
Core Features
1. Knowledge Capture via Slack
Users capture knowledge without leaving Slack:
- Message Shortcut — Save any Slack message as knowledge via the three-dot menu
- Global Shortcut — Quick add from anywhere in Slack
- App Home Button — Add knowledge from the dashboard
- URL Content Extraction — Automatic page content extraction when a URL is included
- Auto-sync to Notion — Every knowledge item creates a Notion page
2. Category Management
Organise knowledge into thematic lists:
- Create categories with name, description, and emoji icon
- Categories synced to Notion bidirectionally
- Target groups (audience labels) synced from Notion multi-select
- Digest configuration per category (frequency, day, time)
3. AI-Powered Summaries
Automatic Dutch-language summaries for knowledge items:
- 2–3 sentence summaries generated by OpenAI or OpenRouter
- Lazy generation — summaries created when needed for digests
- Stored in the Knowledge.highlight field
- Graceful fallback to content truncation on AI failure
- Single hardcoded prompt optimised for Dutch summaries
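The lazy generation and fallback behaviour listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's code: the dictionary shape, the `summarise` callable, and the 280-character truncation limit are all assumptions made for the example.

```python
from typing import Callable

MAX_FALLBACK_CHARS = 280  # hypothetical limit; the source does not state one


def truncate(content: str, limit: int = MAX_FALLBACK_CHARS) -> str:
    """Fallback summary: collapse whitespace and cut at a word boundary."""
    text = " ".join(content.split())
    if len(text) <= limit:
        return text
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut + "..."


def ensure_highlight(item: dict, summarise: Callable[[str], str]) -> str:
    """Return the stored highlight, generating it only when it is missing."""
    if item.get("highlight"):
        return item["highlight"]  # already generated, no AI call needed
    try:
        highlight = summarise(item["content"])  # assumed AI provider call
    except Exception:
        highlight = truncate(item["content"])  # graceful fallback
    item["highlight"] = highlight  # store so later digests reuse it
    return highlight
```

Calling `ensure_highlight` twice on the same item triggers at most one provider call, which is the point of generating summaries lazily.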
4. Digest Generation
AI-enhanced digest reports per category:
- 7-step pipeline: category → date range → query → AI highlights → format → persist → Notion sync
- Supports multiple date modes: last N days, date range, all time
- Dutch-language Markdown formatting with per-item summaries, tags, and read time
- Immutable snapshots — digests preserve the AI output at generation time
- Synced to Notion with full content embedding
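As a rough illustration of the seven steps and the three date modes, the pipeline might be sketched like this in Python. The mode names (`last_n_days`, `date_range`, `all_time`), the item dictionary keys, and the `summarise`, `persist`, and `sync` callables are hypothetical stand-ins, not the project's actual interfaces.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional, Tuple

DateRange = Tuple[Optional[datetime], Optional[datetime]]


def resolve_date_range(mode: str, now: datetime, days: int = 7,
                       start: Optional[datetime] = None,
                       end: Optional[datetime] = None) -> DateRange:
    """Step 2: turn a date mode into (from, to); None means unbounded."""
    if mode == "last_n_days":
        return now - timedelta(days=days), now
    if mode == "date_range":
        if start is None or end is None or start > end:
            raise ValueError("date_range needs start <= end")
        return start, end
    if mode == "all_time":
        return None, None
    raise ValueError(f"unknown date mode: {mode}")


def in_range(created_at: datetime, date_range: DateRange) -> bool:
    lo, hi = date_range
    return (lo is None or created_at >= lo) and (hi is None or created_at <= hi)


def generate_digest(category: str, items: List[dict], date_range: DateRange,
                    summarise: Callable[[str], str],
                    persist: Callable[[dict], None],
                    sync: Callable[[dict], None]) -> dict:
    # Steps 1-3: select the category's items inside the date range.
    selected = [i for i in items
                if i["category"] == category
                and in_range(i["createdAt"], date_range)]
    # Step 4: reuse stored highlights, generate missing ones on demand.
    highlights = [{"title": i["title"],
                   "highlight": i.get("highlight") or summarise(i["content"])}
                  for i in selected]
    # Step 5: format as Markdown.
    content = "\n\n".join(f"### {h['title']}\n{h['highlight']}"
                          for h in highlights)
    # The copied highlights act as the snapshot taken at generation time.
    digest = {"category": category, "content": content,
              "knowledgeHighlights": highlights}
    persist(digest)  # Step 6
    sync(digest)     # Step 7: Notion sync
    return digest
```

Because the highlights are copied into the digest rather than referenced, later changes to a knowledge item do not alter a digest that was already generated.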
5. Digest Distribution
Deliver digests to the right people:
- Slack DM delivery to category subscribers
- Per-user delivery tracking with retry support (max 3 attempts)
- Automatic subscriber resolution + initiator inclusion
- Markdown → Slack mrkdwn conversion
- Distribution results synced to Notion
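The Markdown to Slack mrkdwn conversion mentioned above could look something like the following Python sketch. It covers only a handful of common constructs and is an assumption about the general approach, not the converter that ships with the project. Slack's mrkdwn uses `*bold*`, `~strike~`, and `<url|label>` links, and has no heading syntax, so headings are rendered as bold lines here.

```python
import re


def markdown_to_mrkdwn(md: str) -> str:
    """Convert a small subset of Markdown to Slack mrkdwn."""
    # [label](url) -> <url|label>
    out = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"<\2|\1>", md)
    # "- item" or "* item" -> bullet character
    out = re.sub(r"^([ \t]*)[-*][ \t]+", r"\1• ", out, flags=re.MULTILINE)
    # **bold** -> *bold*
    out = re.sub(r"\*\*(.+?)\*\*", r"*\1*", out)
    # ~~strike~~ -> ~strike~
    out = re.sub(r"~~(.+?)~~", r"~\1~", out)
    # "## Heading" -> "*Heading*" (mrkdwn has no headings)
    out = re.sub(r"^#{1,6}[ \t]+(.+?)[ \t]*$", r"*\1*", out,
                 flags=re.MULTILINE)
    return out
```

One known limitation of this sketch: Markdown italics written as `*text*` pass through unchanged and therefore show up as bold in Slack, and bold text nested inside a heading is not handled.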
6. Subscriptions
User-controlled notification preferences:
- Subscribe/unsubscribe to categories via App Home overflow menu
- Soft-delete with history preservation
- Unique constraint: one subscription per user per category
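A minimal sketch of soft-delete combined with the uniqueness rule, using Python's built-in `sqlite3` module. The table and column names are assumptions for illustration, as is the choice to reactivate the existing row on re-subscription; the source only states that history is preserved and that one subscription exists per user per category.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE subscription (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT NOT NULL,
    category_id INTEGER NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1,
    subscribed_at TEXT NOT NULL,
    unsubscribed_at TEXT,
    UNIQUE (user_id, category_id)
)
"""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def subscribe(db: sqlite3.Connection, user_id: str, category_id: int) -> None:
    """Insert a subscription, or reactivate the soft-deleted row."""
    row = db.execute(
        "SELECT id FROM subscription WHERE user_id = ? AND category_id = ?",
        (user_id, category_id)).fetchone()
    if row:
        db.execute(
            "UPDATE subscription SET is_active = 1, subscribed_at = ?, "
            "unsubscribed_at = NULL WHERE id = ?", (_now(), row[0]))
    else:
        db.execute(
            "INSERT INTO subscription (user_id, category_id, subscribed_at) "
            "VALUES (?, ?, ?)", (user_id, category_id, _now()))


def unsubscribe(db: sqlite3.Connection, user_id: str, category_id: int) -> None:
    """Soft-delete: keep the row, flag it inactive, record when."""
    db.execute(
        "UPDATE subscription SET is_active = 0, unsubscribed_at = ? "
        "WHERE user_id = ? AND category_id = ? AND is_active = 1",
        (_now(), user_id, category_id))


def active_subscribers(db: sqlite3.Connection, category_id: int) -> list:
    rows = db.execute(
        "SELECT user_id FROM subscription "
        "WHERE category_id = ? AND is_active = 1 ORDER BY user_id",
        (category_id,))
    return [r[0] for r in rows]
```

The `UNIQUE (user_id, category_id)` constraint means a second row for the same pair is rejected by the database itself, so application code cannot create duplicates by accident.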
7. Notion Integration
Bidirectional sync for all data:
- 4 Notion databases: Knowledge, Categories, Digests, Subscriptions
- Real-time outbound sync on creation/update
- On-demand inbound sync from Notion
- Full Markdown content embedding in Notion digest pages
- Dual-ID resolution (local integer + Notion UUID)
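Dual-ID resolution can be pictured as a small lookup that accepts either identifier form. The Python sketch below is an assumption about how such a resolver might behave; the function names and the row shape are hypothetical.

```python
import re
from typing import Iterable, Optional

# Notion IDs are UUIDs, written with or without hyphens.
_UUID = re.compile(
    r"[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}",
    re.IGNORECASE)


def classify_id(raw: str) -> str:
    """Return 'notion' for a UUID, 'local' for an integer ID."""
    value = raw.strip()
    if _UUID.fullmatch(value):
        return "notion"
    if re.fullmatch(r"[0-9]+", value):
        return "local"
    raise ValueError(f"not a local or Notion id: {raw!r}")


def _norm(uuid: str) -> str:
    # Compare UUIDs without hyphens and case differences.
    return uuid.replace("-", "").lower()


def find_by_any_id(rows: Iterable[dict], raw: str) -> Optional[dict]:
    """Look a record up by local integer ID or by Notion UUID."""
    kind = classify_id(raw)
    value = raw.strip()
    for row in rows:
        if kind == "local" and row["id"] == int(value):
            return row
        if (kind == "notion" and row.get("notionId")
                and _norm(row["notionId"]) == _norm(value)):
            return row
    return None
```

The UUID check runs first so that a hyphen-free, all-digit UUID is not mistaken for a local integer ID.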
8. Digest Scheduling
Automated digest generation:
- Per-category scheduling: frequency (weekly/biweekly/monthly/quarterly)
- Day of week and time configuration
- Scheduler service checks due categories and triggers generation
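The due check performed by the scheduler might resemble the Python sketch below. Several details are assumptions, since the source does not specify them: `digestDay` is taken as 0 = Monday, `digestTime` as an `HH:MM` string, and monthly and quarterly frequencies are approximated as 28 and 91 days so that runs stay on the configured weekday.

```python
from datetime import datetime, time

# Assumed minimum gap between digests, in days, per frequency.
MIN_GAP_DAYS = {"weekly": 7, "biweekly": 14, "monthly": 28, "quarterly": 91}


def is_due(category: dict, now: datetime) -> bool:
    """Decide whether a category's scheduled digest should run now."""
    if not category.get("digestEnabled"):
        return False
    if now.weekday() != category["digestDay"]:  # assumed 0 = Monday
        return False
    hour, minute = map(int, category["digestTime"].split(":"))
    if now.time() < time(hour, minute):  # configured time not reached yet
        return False
    last = category.get("lastDigestAt")
    if last is None:
        return True  # never generated before
    # Compare calendar dates so a few seconds of drift do not skip a run.
    gap = (now.date() - last.date()).days
    return gap >= MIN_GAP_DAYS[category["digestFrequency"]]
```

A scheduler loop would then filter categories with `is_due` and trigger generation for each match, updating `lastDigestAt` afterwards.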
Technical Architecture
Database
| Aspect | Implementation |
|---|---|
| Engine | SQLite |
| ORM | Doctrine ORM with attribute mapping |
| Entities | 5: Category, Knowledge, Digest, DigestDelivery, Subscription |
| Migrations | Schema created via doctrine:schema:create |
| Timestamps | Gedmo Timestampable (auto-managed createdAt/updatedAt) |
Service Architecture
Controller → Orchestrator → Domain Services → Persistence → Repository
↕
AI Provider
↕
Notion Sync Service
38 service files organized by domain:
- Orchestrator layer (2 files) — coordinates multi-step workflows
- Digest domain (10 files) — generation, formatting, distribution, scheduling, delivery tracking
- Knowledge domain (3 files) — creation, persistence, query
- Subscription domain (3 files) — facade + CQRS split
- Notion domain (11 files) — sync, repositories, page builder, mappers
- AI domain (2 files) — provider interface + OpenAI/OpenRouter implementation
- Content domain (2 files) — extraction + HTML parser
- Transport (2 files) — message transport interface + Slack implementation
- Root (2 files) — CategoryService + EmojiConverter
API Surface
18 authenticated endpoints across 6 controllers:
| Controller | Endpoints |
|---|---|
| CategoryController | POST/GET /api/categories |
| CategorySettingsController | GET/PUT /api/categories/{id}/settings/digest |
| DigestController | POST generate, POST distribute, GET delivery-status, POST retry-failed, GET by-category |
| KnowledgeController | POST/GET /api/knowledge |
| NotionSyncController | POST from-notion, POST categories-from-notion, POST to-notion |
| SubscriptionController | POST subscribe, POST unsubscribe, GET user/{userId}, GET category/{categoryId}/subscribers |
Authentication
All API endpoints require ROLE_ADMIN via bearer token authentication. The AccessTokenHandler validates tokens configured via environment variables.
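Conceptually, bearer token validation comes down to parsing the `Authorization` header and comparing the token against the configured values. The Python sketch below illustrates that idea only; it is not the AccessTokenHandler itself, and the function names are hypothetical.

```python
import hmac
from typing import Iterable, List, Optional


def extract_bearer(header: Optional[str]) -> Optional[str]:
    """Pull the token out of an 'Authorization: Bearer <token>' value."""
    if not header:
        return None
    scheme, _, token = header.strip().partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def roles_for(header: Optional[str], valid_tokens: Iterable[str]) -> List[str]:
    """Grant ROLE_ADMIN when the token matches a configured one."""
    token = extract_bearer(header)
    if token is None:
        return []
    for candidate in valid_tokens:
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(token.encode(), candidate.encode()):
            return ["ROLE_ADMIN"]
    return []
```

In this sketch `valid_tokens` would be populated from the environment at startup; a request without a matching token receives no roles and is rejected.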
AI Provider
Configuration via environment variables: OPENAI_MODEL, OPENAI_API_KEY, OPENAI_API_URL.
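A small loader that fails fast when one of these variables is missing could look like this Python sketch. The variable names come from the line above; the function name, the returned keys, and the decision to treat all three as required are assumptions.

```python
from typing import Mapping

REQUIRED = ("OPENAI_MODEL", "OPENAI_API_KEY", "OPENAI_API_URL")


def load_ai_config(env: Mapping[str, str]) -> dict:
    """Read the AI provider settings, raising if any is absent or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing AI settings: " + ", ".join(missing))
    return {
        "model": env["OPENAI_MODEL"],
        "api_key": env["OPENAI_API_KEY"],
        # Strip a trailing slash so paths can be appended consistently.
        "base_url": env["OPENAI_API_URL"].rstrip("/"),
    }
```

In practice this would be called as `load_ai_config(os.environ)`. Pointing `OPENAI_API_URL` at a different compatible endpoint is presumably how the same settings serve both OpenAI and OpenRouter.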
Entities
Category
| Field | Type | Description |
|---|---|---|
| id | int (auto) | Primary key |
| name | string (255) | Display name (2-100 chars, required) |
| description | string (255) | Optional description |
| icon | string (50) | Emoji icon, default :file_folder: |
| targetGroups | JSON | Audience labels from Notion |
| notionId | string | Notion page reference |
| isActive | bool | Visibility flag, default true |
| sortOrder | int | Display order, default 0 |
| digestEnabled | bool | Whether scheduled digests are active |
| digestFrequency | enum | weekly/biweekly/monthly/quarterly |
| digestDay | int | Day of week for scheduled digests |
| digestTime | string | Time for scheduled digests |
| lastDigestAt | datetime | When last digest was generated |
| createdAt/updatedAt | datetime | Gedmo timestamps |
Knowledge
| Field | Type | Description |
|---|---|---|
| id | int (auto) | Primary key |
| title | string (255) | Title (3-255 chars, required) |
| content | text | Main content + extracted URL content |
| tags | JSON | User-defined tags |
| targetGroups | JSON | Audience labels |
| userId | string | Slack user ID of creator |
| status | string | pending/Not started/In progress/Done/approved/rejected/archived/draft/published |
| url | string (500) | Source URL |
| urlMetadata | JSON | Extraction details |
| sourceMessage | JSON | Original Slack message data |
| highlight | text | AI-generated 2-3 sentence Dutch summary |
| summaries | JSON | AI summary data from Notion sync |
| notionId / notionUrl | string | Notion references |
| createdAt/updatedAt | datetime | Gedmo timestamps |
Digest
| Field | Type | Description |
|---|---|---|
| id | int (auto) | Primary key |
| title | string (255) | Auto-generated title |
| content | text | Full Markdown digest |
| period | string | Date range label |
| status | string | Done/sent |
| statistics | JSON | highlightCount, days, tokenCount, cost, durationMs |
| knowledgeHighlights | JSON | Immutable snapshot of highlights at generation time |
| generatedBy | string | Slack user who triggered generation |
| notionId / notionUrl | string | Notion references |
| createdAt/updatedAt | datetime | Gedmo timestamps |
| category | ManyToOne | Required category reference |
| knowledges | ManyToMany | Linked knowledge items |
DigestDelivery
| Field | Type | Description |
|---|---|---|
| id | int (auto) | Primary key |
| userId | string | Recipient Slack user ID |
| deliveryChannel | string | slack/notion/email |
| status | string | pending/sent/failed |
| retryCount | int | Retry attempts (max 3) |
| errorMessage | text | Failure details |
| sentAt | datetime | Delivery timestamp |
| digest | ManyToOne | Required digest reference (CASCADE) |
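The status and retryCount fields suggest a simple state machine for deliveries. The Python sketch below shows one way it might work; it assumes retryCount counts retries after the first attempt, and the `send` callable is a hypothetical stand-in for the Slack transport.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional

MAX_RETRIES = 3  # from the table above


@dataclass
class Delivery:
    user_id: str
    status: str = "pending"  # pending / sent / failed
    retry_count: int = 0
    error_message: Optional[str] = None
    sent_at: Optional[datetime] = None


def attempt(delivery: Delivery, send: Callable[[str], None]) -> Delivery:
    """Try to deliver once and record the outcome on the delivery."""
    if delivery.status == "sent":
        return delivery  # never send the same digest twice
    try:
        send(delivery.user_id)
    except Exception as exc:
        delivery.status = "failed"
        delivery.error_message = str(exc)
    else:
        delivery.status = "sent"
        delivery.error_message = None
        delivery.sent_at = datetime.now(timezone.utc)
    return delivery


def retry_failed(deliveries: List[Delivery],
                 send: Callable[[str], None]) -> List[Delivery]:
    """Retry failed deliveries that still have retries left."""
    retried = []
    for d in deliveries:
        if d.status == "failed" and d.retry_count < MAX_RETRIES:
            d.retry_count += 1
            attempt(d, send)
            retried.append(d)
    return retried
```

Once retry_count reaches the maximum, a failed delivery is skipped by later `retry_failed` calls and stays in the failed state with its last error message.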
Subscription
| Field | Type | Description |
|---|---|---|
| id | int (auto) | Primary key |
| userId | string | Subscriber Slack user ID |
| slackUserName | string | Display name |
| isActive | bool | Default true |
| subscribedAt | datetime | Subscription timestamp |
| unsubscribedAt | datetime | Unsubscription timestamp (null if active) |
| notionId | string | Notion page reference |
| category | ManyToOne | Required category reference (CASCADE) |
What's Not in the MVP
| Feature | Status | Notes |
|---|---|---|
| Per-role AI summaries | Not built | Single summary per item |
| Prompt template management | Not built | Hardcoded prompt constant |
| Knowledge edit/delete UI | Not built | Only create + list |
| Advanced search | Not built | Basic keyword search only |
| PDF/RSS/YouTube processing | Not built | HTML only |
| Redis caching | Not built | Not needed for current scale |
| Symfony Messenger | Not built | Synchronous processing |
| User authentication system | Not built | Simple bearer token |
| Web dashboard | Not built | Slack-only interface |