Yappa Knowledge Hub - Consolidated Product Requirements Document

Version: 4.0 (Consolidated)
Date: 2026-02-20
Status: Sprint 0 Complete | MVP Planning Phase
Owner: Yappa Internship Project


Executive Summary

The Yappa Knowledge Hub is an internal knowledge management system that enables Yappa employees to capture, organize, and distribute knowledge through Slack, with AI-powered summaries and automated digests. The system uses Notion as the single source of truth, Symfony as the processing layer, and Slack as the primary user interface.

Sprint 0 (POC) Achievement: Successfully demonstrated end-to-end integration between Slack, Symfony, and Notion with 19 knowledge items stored across 10 categories. The system validates the technical approach and is ready for team review before merging to main branch.

MVP Path: Requires 80-100 additional tickets (400-500 story points) to reach production-ready status with AI summaries, digest automation, and production infrastructure.


1. Product Vision

What is Yappa Knowledge Hub

A central system that makes internal knowledge:

  1. Easy to capture - Low friction submission via Slack
  2. Smartly processed - AI summaries tailored per target audience
  3. Actively distributed - Periodic digests by thematic list and target group
  4. Centrally accessible - Notion as single source of truth with web UI

Target Users

  • Contributors (Yappa employees): Submit content, select lists, target groups, and tags
  • List Owners/Admins: Create/edit lists, configure digest schedules, manage prompts
  • Digest Recipients: Receive scheduled digests via Slack DM/channel

Target Groups (Audience Segments)

Semantic roles used for tone and content focus:

  • Developers
  • Marketers
  • CEO/Leadership
  • Service Desk
  • Sales
  • Operations

Core Value Proposition

Problem: Internal knowledge at Yappa is scattered across Slack messages, emails, notes, and announcements. Information is shared "in the moment" but becomes hard to find, contextualize, and distribute.

Solution: A Slack-first knowledge hub that captures information with minimal friction, processes it with AI for different audiences, and distributes it through automated digests.


2. Sprint 0 (POC) Achievements

What Was Built (Honest Assessment)

Completion Date: 2026-02-16 | Status: Functional prototype validating technical approach

Working Infrastructure

  • Slack bot connected via Socket Mode (Node.js + @slack/bolt)
  • Symfony 7.2 backend with REST API (PHP 8.2)
  • Notion databases (3) created and accessible
  • Basic CRUD operations for knowledge and categories
  • Slack → Notion sync in under 1 second
  • Cron-based Notion → Slack sync (every 2 minutes)
  • Dutch UI strings (basic localization)
  • 19 knowledge items successfully stored
  • 10 categories with icons and descriptions

Slack Bot Features

  • Message shortcuts ("Save to Knowledge Hub")
  • Global shortcuts ("Quick Add Knowledge")
  • Slash command /knowledge with subcommands (add, search, dashboard, help)
  • Modal forms for knowledge submission
  • App Home tab with stats and category browser
  • URL detection with basic metadata scraping
  • File share detection (detection only, no processing)
  • Emoji reaction trigger (basic implementation)
  • Dutch confirmation messages

Symfony Backend

  • REST API endpoints for Knowledge and Categories
  • NotionClient with basic retry logic
  • NotionKnowledgeService for CRUD operations
  • NotionCategoryService for category management
  • NotionPropertyMapper for data conversion
  • NotionSyncService for bidirectional sync
  • Basic error handling and logging

Notion Integration

  • Knowledge Database (ID: 306e292a15d58004a8cbc222dcd48bb2)
  • Categories Database (ID: 306e292a15d5805dae13e64bed8519c5)
  • Digests Database (ID: 306e292a15d580d7a0f6fe8421baff10)
  • Manual sync endpoint operational
  • Direct web access for manual editing

What Was NOT Built

Missing Core Features:

  • AI summary generation (stub only, no OpenAI integration)
  • Digest scheduling and delivery (mock formatter only)
  • Real-time webhook sync (uses 2-minute polling)
  • Persistent state management (in-memory tracking lost on restart)
  • Production infrastructure (no queues, retry logic, monitoring)
  • User authentication and permissions
  • Analytics and reporting
  • Web dashboard
  • Advanced search and filtering

Missing Extra Stories:

  • Proper URL extraction (basic scraping only)
  • PDF upload support (detection only)
  • Tag normalization and autocomplete
  • URL validation and sanitization
  • List suggestions and multi-list assignment
  • Rate limiting for APIs
  • Queue system for async processing
  • Health monitoring endpoints

30 Tickets Completed

The Sprint 0 work represents approximately 30 completed tickets covering:

  • Slack bot foundation (8 tickets)
  • Symfony backend setup (6 tickets)
  • Notion integration (5 tickets)
  • Basic CRUD operations (6 tickets)
  • Sync system (3 tickets)
  • Dutch localization (2 tickets)

Total Story Points Completed: ~150 SP

Architecture Decisions Made

  1. Notion as Primary Database: All data stored in Notion databases, no local database required
  2. Slack-First UX: Primary interface through Slack bot, web dashboard secondary
  3. Symfony REST API: Backend processing layer between Slack and Notion
  4. Socket Mode for Development: Simple connection model for POC, HTTP mode for production
  5. Dutch Localization: All user-facing strings in Dutch
  6. Target Group Model: Audience-based content tailoring for summaries and digests

3. Sprint 0 Status

Completed Features

Ready for Main Branch Merge:

  • Slack bot with multiple interaction patterns (shortcuts, commands, modals)
  • Functional Symfony REST API with Notion integration
  • Bidirectional sync (Slack → Notion instant, Notion → Slack 2-min polling)
  • 19 knowledge items successfully stored
  • 10 categories with icons and descriptions
  • Dutch localization for user-facing strings
  • Basic URL detection and metadata scraping

Technical Quality:

  • Code follows Symfony and Node.js best practices
  • Basic error handling implemented
  • Logging configured
  • Documentation complete
  • All systems operational

Pending Team Review

Before Merging to Main:

  1. Code review by senior developers
  2. Security review of API endpoints
  3. Dutch translation review by native speaker
  4. UX review of Slack modals
  5. Performance testing with concurrent users
  6. Decision on production deployment strategy

Known Limitations

Technical Debt:

  • In-memory view tracking (lost on restart)
  • No rate limiting (API quota risk)
  • No robust retry logic (only basic retries in NotionClient; failed requests can still be lost)
  • Socket Mode only (not production-ready)
  • Basic error handling (needs enhancement)
  • Cron-based sync (not real-time)

Scope Limitations:

  • Single category per knowledge item
  • Hardcoded target groups (6 groups)
  • No AI functionality
  • No digest system
  • No user management
  • No analytics

4. Product Roadmap

Sprint 1-2: Infrastructure Hardening (Weeks 1-4)

Objective: Make POC production-ready

Deliverables:

  • Redis integration for persistent state
  • Queue system (Symfony Messenger)
  • Retry logic with exponential backoff
  • Rate limiting for Notion API
  • Health check endpoints
  • Structured logging with context

Story Points: 80 SP | Priority: Critical (P0)

Acceptance Criteria:

  • System survives restart without data loss
  • Failed operations retry automatically
  • API rate limits respected
  • Health endpoint returns 200 OK
  • Logs include request IDs and context
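
The "retry logic with exponential backoff" deliverable can be sketched as follows. This is a minimal illustration in JavaScript (the Slack bot's language), not the project's implementation; on the Symfony side the same behaviour would more likely come from the Messenger retry configuration. The names `backoffDelayMs` and `withRetry`, the 500 ms base delay, the 30 s cap, and the five-attempt limit are all assumptions.

```javascript
// Hypothetical helper: delay before the next try after a failed attempt (0-based), capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: runs an async operation, waiting longer after each failure.
// `sleep` is injectable so the sketch can be exercised without real waiting.
async function withRetry(operation, options = {}) {
  const maxAttempts = options.maxAttempts ?? 5;
  const sleep = options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Do not sleep after the final attempt; just surface the error.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

With these defaults the waits between attempts are 500 ms, 1 s, 2 s, and 4 s. Adding random jitter to each delay is a common refinement when many workers retry at once.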

Sprint 3-4: Content Ingestion Extra Story (Weeks 3-6)

Objective: Improve content capture quality

Deliverables:

  • Full URL extraction service
  • Tag normalization and autocomplete
  • Enhanced emoji reaction processing
  • URL validation and sanitization
  • Multi-list assignment support

Story Points: 92 SP | Priority: High (P1)

Acceptance Criteria:

  • URLs extract full article content
  • Tags suggest existing options
  • Invalid URLs rejected with clear errors
  • Knowledge items can belong to multiple lists
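
Tag normalization and autocomplete, as listed in the deliverables above, might look like the sketch below. The normalization rules (lowercase, hyphen-separated, ASCII letters and digits only) and the function names are assumptions chosen for illustration; the document does not define the actual rules.

```javascript
// Hypothetical rule set: lowercase, strip leading '#', hyphenate whitespace/underscores,
// drop anything that is not a-z, 0-9 or '-', and collapse repeated hyphens.
function normalizeTag(raw) {
  return raw
    .trim()
    .toLowerCase()
    .replace(/^#+/, '')
    .replace(/[\s_]+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-+/g, '-')
    .replace(/^-|-$/g, '');
}

// Suggests existing (already normalized) tags that start with the user's input.
function suggestTags(input, existingTags, limit = 5) {
  const prefix = normalizeTag(input);
  if (prefix === '') return [];
  return existingTags.filter((tag) => tag.startsWith(prefix)).slice(0, limit);
}

console.log(normalizeTag('  #Dev Ops '));
console.log(suggestTags('De', ['design', 'dev-ops', 'devtools', 'sales']));
```

Because the UI is Dutch, a real implementation would probably need to decide how to treat accented characters, which this ASCII-only sketch simply drops.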

Sprint 5-6: AI Integration (Weeks 5-9)

Objective: Enable AI-powered summaries

Deliverables:

  • OpenAI API integration
  • Target-group-specific prompt templates
  • Async summary generation
  • Summary regeneration UI
  • Token usage tracking

Story Points: 99 SP | Priority: Critical (P0)

Acceptance Criteria:

  • Summaries generated within 30 seconds
  • Each target group gets unique summary
  • Users can regenerate summaries
  • Token costs tracked per summary
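
One possible shape for target-group-specific prompt templates is sketched below. The six group names come from the Target Groups section; the focus descriptions, the 120-word default, and the function name `buildSummaryPrompt` are hypothetical placeholders, and no OpenAI call is made here.

```javascript
// Hypothetical focus text per target group; real templates would be managed by list owners.
const TARGET_GROUP_FOCUS = new Map([
  ['Developers', 'technical details and implementation impact'],
  ['Marketers', 'messaging, positioning, and campaign relevance'],
  ['CEO/Leadership', 'strategic impact, risks, and decisions needed'],
  ['Service Desk', 'customer-facing changes and support procedures'],
  ['Sales', 'customer value and talking points'],
  ['Operations', 'process changes and planning impact'],
]);

// Builds the prompt text for one knowledge item and one target group.
function buildSummaryPrompt(targetGroup, item, maxWords = 120) {
  const focus = TARGET_GROUP_FOCUS.get(targetGroup);
  if (focus === undefined) {
    throw new Error(`Unknown target group: ${targetGroup}`);
  }
  return [
    `Summarize the following knowledge item for ${targetGroup}.`,
    `Focus on ${focus}.`,
    `Use at most ${maxWords} words.`,
    '',
    `Title: ${item.title}`,
    `Content: ${item.content}`,
  ].join('\n');
}
```

Generating one prompt per selected target group is what would let each group receive its own summary, at the cost of one model call per group.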

Sprint 7-8: Digest Automation (Weeks 8-11)

Objective: Automated knowledge distribution

Deliverables:

  • Digest scheduling per category
  • Digest generation with summaries
  • Slack channel/DM delivery
  • Digest history in Notion

Story Points: 68 SP | Priority: Critical (P0)

Acceptance Criteria:

  • Digests sent on schedule (weekly/biweekly/monthly)
  • Digests include AI summaries
  • Delivery success rate > 99%
  • Users can view digest history
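
The scheduling rule behind "digests sent on schedule" can be illustrated with a small date calculation. This sketch assumes the frequency values of the Digest Frequency property in the Categories database and works in UTC; the function names are hypothetical and the Digest Day setting is ignored.

```javascript
// Fixed-length frequencies expressed in days; 'Monthly' is handled separately.
const FREQUENCY_DAYS = { Daily: 1, Weekly: 7, 'Bi-weekly': 14 };

// Returns the date of the next digest, given the last one and the category's frequency.
function nextDigestDate(lastDigest, frequency) {
  const next = new Date(lastDigest.getTime());
  if (frequency === 'Monthly') {
    next.setUTCMonth(next.getUTCMonth() + 1);
  } else if (Object.prototype.hasOwnProperty.call(FREQUENCY_DAYS, frequency)) {
    next.setUTCDate(next.getUTCDate() + FREQUENCY_DAYS[frequency]);
  } else {
    throw new Error(`Unknown digest frequency: ${frequency}`);
  }
  return next;
}

// True when a scheduler run at `now` should generate and send the digest.
function isDigestDue(lastDigest, frequency, now) {
  return now.getTime() >= nextDigestDate(lastDigest, frequency).getTime();
}
```

One caveat of this simple approach: adding a month to a date such as January 31 rolls over into March, so a production scheduler would need an explicit end-of-month rule.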

Sprint 9-10: Search & Discovery (Weeks 10-13)

Objective: Improve discoverability

Deliverables:

  • Advanced search with filters
  • Date range filtering
  • Tag-based filtering
  • Status filtering
  • Search result pagination

Story Points: 87 SP | Priority: High (P1)

Acceptance Criteria:

  • Search returns relevant results
  • Filters work correctly
  • Results paginate properly
  • Search performance < 500ms
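
The combination of text search, filters, and pagination could behave like the in-memory sketch below. It assumes items are plain objects with `title`, `content`, `tags`, `status`, and an ISO `created` date string (`YYYY-MM-DD`); these field names and `searchKnowledge` itself are hypothetical, and a real implementation would query Notion or a search index rather than filter an array.

```javascript
// Filters items by free text, status, tags (all must match), and date range, then paginates.
function searchKnowledge(items, filters = {}) {
  const { query = '', tags = [], status = null, from = null, to = null, page = 1, pageSize = 10 } = filters;
  const needle = query.trim().toLowerCase();
  const matches = items.filter((item) => {
    const text = `${item.title} ${item.content}`.toLowerCase();
    if (needle && !text.includes(needle)) return false;
    if (status && item.status !== status) return false;
    if (!tags.every((tag) => item.tags.includes(tag))) return false;
    // ISO date strings compare correctly as plain strings.
    if (from && item.created < from) return false;
    if (to && item.created > to) return false;
    return true;
  });
  const start = (page - 1) * pageSize;
  return {
    total: matches.length,
    page,
    pageCount: Math.ceil(matches.length / pageSize),
    results: matches.slice(start, start + pageSize),
  };
}
```

Returning `total` and `pageCount` alongside the page of results lets a Slack modal or web view render "page 2 of 5" style navigation without a second query.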

Sprint 11-12: User Management (Weeks 12-15)

Objective: Add access control

Deliverables:

  • Role-based permissions
  • User preferences
  • Admin interface
  • Permission system

Story Points: 95 SP | Priority: Medium (P2)

Acceptance Criteria:

  • Users have roles (admin, contributor, viewer)
  • Permissions enforced on all endpoints
  • Users can set preferences
  • Admins can manage users
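
A role-to-permission lookup is one simple way to enforce the three roles named in the acceptance criteria. The permission names below (`knowledge:read` and so on) and the `can` helper are assumptions for illustration; the document does not specify the permission model.

```javascript
// Hypothetical permission sets per role (admin, contributor, viewer).
const ROLE_PERMISSIONS = new Map([
  ['admin', new Set(['knowledge:read', 'knowledge:write', 'knowledge:delete', 'categories:manage', 'users:manage'])],
  ['contributor', new Set(['knowledge:read', 'knowledge:write'])],
  ['viewer', new Set(['knowledge:read'])],
]);

// Returns true only when the role exists and explicitly grants the permission.
function can(role, permission) {
  const granted = ROLE_PERMISSIONS.get(role);
  return granted !== undefined && granted.has(permission);
}

console.log(can('contributor', 'knowledge:write'));
console.log(can('viewer', 'knowledge:delete'));
```

Denying by default for unknown roles or permissions keeps a misconfigured user from gaining access by accident.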

5. Backlog

Extra Story Tickets (201 Total)

Created Tickets:

  • Batch 1 Part 1: Content Ingestion Foundation (US-100 to US-115)
  • Batch 1 Part 2: Content Submission Features (US-116 to US-130)
  • Batch 3: AI Summaries (US-166 to US-200)

Pending Tickets:

  • Batch 2: Lists & Categories (US-131 to US-165)
  • Batch 4: Digest & Infrastructure (US-201 to US-235)
  • Batch 5: Search & User Management (US-236 to US-280)
  • Batch 6: Advanced & Analytics (US-281 to US-320)

Total Extra Story Scope: Tracked dynamically in Jira workflow dashboard.

Priority Ordering

Tier 1: Foundation (Must Have) - 35 tickets | 180 SP

  • Infrastructure hardening (Redis, queues, retry, rate limiting, health checks)
  • Content ingestion extra stories (URL extraction, tag normalization, validation)
  • Category management extra stories (list suggestions, multi-list, filtering)

Tier 2: Core Features (High Priority) - 45 tickets | 220 SP

  • AI summaries (OpenAI integration, target-group prompts, regeneration)
  • Digest system (scheduling, generation, Slack delivery)
  • Search & discovery (advanced search, filtering)

Tier 3: Polish (Nice to Have) - 20 tickets | 100 SP

  • User management (role-based permissions, preferences)
  • Analytics (usage tracking, dashboard)

Story Point Estimates

By Epic:

  • Epic 1: Content Ingestion - 92 SP
  • Epic 2: Lists & Categories - 81 SP
  • Epic 3: AI Summaries - 99 SP
  • Epic 4: Digest System - 68 SP
  • Epic 5: Infrastructure - 83 SP
  • Epic 6: Notion Sync - 47 SP
  • Epic 7: Search & Discovery - 87 SP
  • Epic 8: User Management - 95 SP
  • Epic 9: Analytics - 60 SP
  • Epic 10: Web Dashboard - 71 SP

Total: 783 SP

By Priority:

  • Critical (P0): 234 SP (29.9%)
  • High (P1): 163 SP (20.8%)
  • Medium (P2): 186 SP (23.8%)
  • Low (P3): 200 SP (25.5%)
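
The percentages above follow directly from the per-priority totals, as this small check illustrates (the variable names are only for the example):

```javascript
// Story points per priority, taken from the list above.
const storyPointsByPriority = { P0: 234, P1: 163, P2: 186, P3: 200 };

// Sum of all priorities; matches the 783 SP total from the epic breakdown.
const totalStoryPoints = Object.values(storyPointsByPriority).reduce((sum, sp) => sum + sp, 0);

// Share of each priority as a percentage string with one decimal.
const shareByPriority = Object.fromEntries(
  Object.entries(storyPointsByPriority).map(([priority, sp]) => [
    priority,
    ((sp / totalStoryPoints) * 100).toFixed(1),
  ])
);

console.log(totalStoryPoints, shareByPriority);
```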

6. Technical Requirements

Architecture


  Slack Bot   <-->   Symfony   <-->    Notion
  (Node.js)          Backend          Databases
                        |
                        +-->  OpenAI API
                        |     (Summaries)
                        |
                        +-->  Redis
                              (Cache/State)

Technology Stack

Input Layer:

  • Slack Bot (Node.js 18+, @slack/bolt framework)
  • Socket Mode (development) / HTTP Mode (production)

Processing Layer:

  • Symfony 7.2 (PHP 8.2)
  • Doctrine ORM
  • Symfony Messenger (queues)

Storage Layer:

  • Notion Databases (primary storage)
  • Redis (cache and state management)
  • SQLite (local development backup)

AI Layer:

  • OpenAI GPT-4 (target-group-specific summaries)
  • Configurable AI provider interface

Infrastructure:

  • Redis 7+ (caching, view tracking, session management)
  • Symfony Messenger (async job processing)
  • Monolog (structured logging)

Infrastructure Needs

Development Environment:

  • Node.js 18+
  • PHP 8.2+
  • Composer
  • npm/yarn
  • Redis (optional for POC)

Production Environment:

  • VPS or cloud hosting (AWS/DigitalOcean/Hetzner)
  • Redis instance (persistent)
  • Process manager (PM2 for Node.js, Supervisor for Symfony)
  • Reverse proxy (nginx/Apache)
  • SSL certificates
  • Monitoring (Prometheus/Grafana recommended)

External Services:

  • Slack workspace with bot app installed
  • Notion workspace with API access
  • OpenAI API account (for MVP)
  • Domain name and DNS configuration

API Endpoints

Knowledge Management:

POST   /api/knowledge              Create knowledge item
GET    /api/knowledge              List knowledge items
GET    /api/knowledge/{id}         Get single item
PUT    /api/knowledge/{id}         Update item
DELETE /api/knowledge/{id}         Delete item
POST   /api/knowledge/search       Search knowledge

Category Management:

POST   /api/categories             Create category
GET    /api/categories             List categories
GET    /api/categories/{id}        Get category
PUT    /api/categories/{id}        Update category
DELETE /api/categories/{id}        Delete category

Sync Operations:

POST   /api/notion/sync/bidirectional      Full bidirectional sync
POST   /api/notion/sync/from-notion        Sync from Notion to local
POST   /api/notion/sync/categories         Sync categories
GET    /api/notion/sync/status             Check sync status
POST   /api/webhooks/notion                Notion webhook endpoint

Digest Operations (Planned):

POST   /api/digests/generate       Generate digest report
GET    /api/digests                List digests
GET    /api/digests/{id}           Get digest
POST   /api/digests/{id}/send      Send digest to Slack

Database Schema (Notion)

Knowledge Database (306e292a15d58004a8cbc222dcd48bb2):

  • Title (title) - required
  • Content (rich_text) - required, up to 2000 chars
  • Status (status) - Draft / Review / Published / Archived
  • Tags (multi_select)
  • Categories (relation) - link to Categories DB
  • Priority (select) - High / Medium / Low
  • Target Groups (multi_select) - audience segments
  • Source Type (select) - Slack Message / Manual / File / URL
  • Source URL (url)
  • Author (people)
  • AI Summary (rich_text) - generated per target group
  • Slack User ID, Message TS, Channel ID (rich_text)
  • View Count (number)
  • Last Reviewed (date)
  • Attachments (files)
  • Created, Last Edited (auto)
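
To make the schema concrete, the sketch below maps a plain item object to a Notion page `properties` payload for a subset of the Knowledge Database fields. It is an illustration in JavaScript, not the PHP NotionPropertyMapper from the backend; the input field names (`title`, `content`, `tags`, `targetGroups`, `sourceType`, `sourceUrl`) and the default status of Draft are assumptions.

```javascript
// Content is limited to 2000 characters in the schema above.
const NOTION_TEXT_LIMIT = 2000;

// Builds a single-element Notion rich text array, truncated to the limit.
const richText = (value) => [{ type: 'text', text: { content: value.slice(0, NOTION_TEXT_LIMIT) } }];

// Hypothetical mapper from an item object to Notion property values.
function toNotionProperties(item) {
  return {
    Title: { title: richText(item.title) },
    Content: { rich_text: richText(item.content) },
    Status: { status: { name: item.status ?? 'Draft' } },
    Tags: { multi_select: (item.tags ?? []).map((name) => ({ name })) },
    'Target Groups': { multi_select: (item.targetGroups ?? []).map((name) => ({ name })) },
    'Source Type': { select: { name: item.sourceType } },
    // Only include the URL property when a source URL is present.
    ...(item.sourceUrl ? { 'Source URL': { url: item.sourceUrl } } : {}),
  };
}
```

The resulting object would be sent as the `properties` field when creating or updating a page in the Knowledge Database.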

Categories Database (306e292a15d5805dae13e64bed8519c5):

  • Name (title) - required
  • Description (rich_text)
  • Icon (rich_text) - emoji
  • Default Target Groups (multi_select)
  • Subscribers (people)
  • Digest Frequency (select) - Daily / Weekly / Bi-weekly / Monthly
  • Digest Day (select)
  • Active (checkbox)
  • Knowledge Count (rollup) - auto
  • Last Digest (date)
  • Created (auto)

Digests Database (306e292a15d580d7a0f6fe8421baff10):

  • Title (title) - required
  • Category (relation) - link to Categories DB
  • Period Start, Period End (date)
  • Items Count (number)
  • Target Groups (multi_select)
  • Generated By (people)
  • Status (status) - Generating / Sent / Failed
  • Slack Sent (checkbox)
  • Recipients (rich_text) - Slack user IDs
  • Generated At (auto)

Performance Targets

Current Performance (POC):

  • Slack modal response time: ~200ms (target < 500ms)
  • Notion API sync latency: ~800ms (target < 2s)
  • Sync completion time: ~15s for 19 items (target < 30s)

MVP Performance Targets:

  • End-to-end submission latency: < 3s (Slack submit → Notion confirmation)
  • Notion webhook processing: < 5s (Notion change → Slack home updated)
  • View refresh latency: < 2s (Sync triggered → Modal updated)
  • Concurrent view tracking: 100+ views
  • Sync reliability: 99.9% (successful syncs / total attempts)
  • API response time (p95): < 500ms
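
Two of these targets are computed values, so it helps to pin down the formulas. The sketch below uses the nearest-rank method for the p95 response time and a simple ratio for sync reliability; both function names are hypothetical, and the nearest-rank method is one of several common percentile definitions.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(values, p) {
  if (values.length === 0) throw new Error('percentile needs at least one sample');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Sync reliability as successful syncs divided by total attempts.
function syncReliability(successfulSyncs, totalAttempts) {
  return totalAttempts === 0 ? 1 : successfulSyncs / totalAttempts;
}

// Example: 20 response times of 10ms, 20ms, ..., 200ms.
const responseTimesMs = Array.from({ length: 20 }, (_, i) => (i + 1) * 10);
console.log(percentile(responseTimesMs, 95) < 500);
console.log(syncReliability(999, 1000) >= 0.999);
```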

Security Requirements

  • Slack signature verification (handled by Bolt framework)
  • Notion API key stored in environment variables
  • Internal API keys for service-to-service communication
  • Least-privilege Slack scopes
  • HTTPS for all external communication
  • Input validation and sanitization
  • Rate limiting on all endpoints
  • SSRF protection for URL scraping
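
For the SSRF requirement, a first line of defence is to validate URLs before the scraper fetches them. The sketch below is a minimal, assumption-laden example: the function names are hypothetical, only literal hostnames are checked, and all IPv6 literals are rejected for simplicity.

```javascript
// True for loopback, link-local, and RFC 1918 private IPv4 literals.
function isPrivateIPv4(host) {
  const match = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Accepts only http(s) URLs whose host is not obviously internal.
function validateScrapeUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: 'invalid URL' };
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return { ok: false, reason: 'unsupported protocol' };
  }
  const host = url.hostname.toLowerCase();
  // IPv6 literals appear in brackets in url.hostname; this sketch blocks them all.
  if (host === 'localhost' || host.endsWith('.localhost') || host.startsWith('[') || isPrivateIPv4(host)) {
    return { ok: false, reason: 'blocked host' };
  }
  return { ok: true, url: url.href };
}
```

A hostname check alone does not stop a public domain from resolving to an internal address, so a complete solution would also verify the resolved IP at connection time and re-validate after redirects.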

Cost Controls

  • Rate limiting for LLM API calls
  • Per-summary token tracking
  • Notion API request monitoring
  • Configurable summary length limits
  • Caching to reduce API calls
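
Rate limiting for outbound LLM and Notion calls is often implemented as a token bucket. The sketch below is one such implementation under assumed parameters; the class name, capacity, and refill rate are illustrative, and the actual limits should be taken from each provider's documentation.

```javascript
// Token bucket: holds up to `capacity` tokens and refills at `refillPerSecond`.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true and consumes a token if a request may be sent at time `now` (ms).
  tryRemove(now = Date.now()) {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Example: allow bursts of 3 requests, refilling 3 tokens per second (assumed values).
const notionBucket = new TokenBucket(3, 3);
console.log(notionBucket.tryRemove());
```

When `tryRemove` returns false, the caller can queue the request or delay it instead of sending it and risking a rate-limit error from the API.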

7. Success Metrics

User Adoption Metrics

Metric                    | Target | Measurement Method
Active users (weekly)     | 20+    | Unique Slack user IDs submitting knowledge
Knowledge items submitted | 100+   | Count in Notion database
Active categories         | 5+     | Categories with items added in last 30 days
Digest subscribers        | 30+    | Users receiving digests
Digest open rate          | 60%+   | Slack message read receipts

Performance Metrics

Metric                  | Target  | Measurement Method
Submission latency      | < 3s    | Slack submit → Notion confirmation
AI summary generation   | < 30s   | Async job completion time
Digest delivery success | 99%+    | Successful deliveries / Total attempts
API response time (p95) | < 500ms | Symfony monitoring
Sync reliability        | 99.9%+  | Successful syncs / Total sync attempts

Quality Metrics

Metric             | Target   | Measurement Method
Error rate         | < 1%     | Failed requests / Total requests
User-reported bugs | < 5/week | Slack feedback channel
AI summary quality | 4+/5     | User ratings
Tag consistency    | 80%+     | Normalized tags / Total tags

Business Metrics

Metric               | Target   | Measurement Method
Time saved per week  | 2+ hours | User survey
Knowledge reuse rate | 30%+     | Items viewed > 1 time
Cross-team sharing   | 50%+     | Items with multiple target groups
Digest engagement    | 60%+     | Users clicking digest links

8. Risks & Mitigations

Technical Risks

Risk                            | Impact | Mitigation                                | Priority
Notion API downtime             | High   | SQLite backup, queue + retries            | P0
OpenAI API downtime             | High   | Allow ingestion without immediate summary | P0
Slack rate limits               | Medium | Request throttling, caching               | P1
Redis single point of failure   | High   | Document recovery procedures              | P1
Concurrent user race conditions | Medium | Load testing before Phase 3               | P1

Product Risks

Risk                          | Impact | Mitigation                             | Priority
Low user adoption             | High   | User onboarding, training sessions     | P0
Poor AI summary quality       | High   | Editable templates, regenerate support | P0
Scope creep (PDF/audio/video) | Medium | Strict phase gates, defer to post-MVP  | P1
Tag taxonomy inconsistency    | Medium | Tag normalization, autocomplete        | P1
Dutch translation quality     | Low    | Native speaker review                  | P2

Cost Risks

Risk                 | Impact | Mitigation                                     | Priority
OpenAI API costs     | High   | Token logging, rate limits, length constraints | P0
Notion API quota     | Medium | Caching, batch operations                      | P1
Infrastructure costs | Low    | Start small, scale as needed                   | P2

9. Next Steps

Immediate Actions (Week 1)

  1. Team Review of Sprint 0 Work

    • Code review by senior developers
    • Security review of API endpoints
    • Dutch translation review
    • UX review of Slack modals
  2. Merge to Main Branch

    • Create pull request with Sprint 0 work
    • Address review feedback
    • Merge to main branch
    • Tag release as v0.1.0-poc
  3. Create Remaining Extra Story Tickets

    • Batch 2: Lists & Categories (35 tickets)
    • Batch 4: Digest & Infrastructure (35 tickets)
    • Batch 5: Search & User Management (35 tickets)
    • Batch 6: Advanced & Analytics (30 tickets)

Short-Term Actions (Weeks 2-4)

  1. Begin Sprint 1: Infrastructure Hardening

    • Set up Redis for persistent state
    • Implement queue system with Symfony Messenger
    • Add retry logic with exponential backoff
    • Implement rate limiting
    • Add health check endpoints
  2. Prepare for AI Integration

    • Set up OpenAI account
    • Design prompt templates for each target group
    • Create AI service architecture
    • Plan token usage tracking
  3. Production Deployment Planning

    • Select hosting provider
    • Plan infrastructure setup
    • Configure CI/CD pipeline
    • Set up monitoring and alerting

Medium-Term Actions (Weeks 5-10)

  1. Implement Core Features

    • AI summary generation
    • Digest scheduling and delivery
    • Advanced search and filtering
  2. Add Production Infrastructure

    • Monitoring and logging
    • Error tracking (Sentry)
    • Performance optimization
    • Load testing
  3. User Testing

    • Internal beta testing
    • Gather feedback
    • Iterate on UX
    • Refine AI prompts

Appendix A: File References

Documentation Sources

  • /IMPLEMENTATION_GUIDE.md
  • /project/PRD.md
  • /project/PRD_UPDATED.md
  • /project/backlog.md
  • /project/roadmap.md
  • /Architecture.md

POC Status Files

  • /home/ubuntu/yappa-knowledge-hub/var/backup-md/POC_COMPLETE.md
  • /home/ubuntu/yappa-knowledge-hub/var/backup-md/POC_FINAL_STATUS.md
  • /home/ubuntu/POC_COMPLETION_VERIFICATION.md

Jira Extra Story Files

  • /home/ubuntu/jira_extra story_tickets_summary.md
  • /home/ubuntu/jira_extra story_batch1_part1.md
  • /home/ubuntu/jira_extra story_batch1_part2.md
  • /home/ubuntu/jira_extra story_batch3.md
  • /home/ubuntu/EXTRA STORY_TICKETS.md

Feature Analysis Files

  • /home/ubuntu/FEATURE_COMPARISON_MATRIX.md
  • /home/ubuntu/COMPLETE_PROJECT_STATUS.md

Appendix B: Configuration

Backend Environment (backend/.env)

# Notion API
NOTION_API_KEY=ntn_...
NOTION_VERSION=2022-06-28

# Database IDs
NOTION_DATABASE_KNOWLEDGE=306e292a15d58004a8cbc222dcd48bb2
NOTION_DATABASE_CATEGORIES=306e292a15d5805dae13e64bed8519c5
NOTION_DATABASE_DIGESTS=306e292a15d580d7a0f6fe8421baff10

# OpenAI (for MVP)
OPENAI_API_KEY=sk-...

# Redis (for MVP)
REDIS_DSN=redis://localhost:6379
REDIS_URL=redis://localhost:6379

# Notion Webhook (for MVP)
NOTION_WEBHOOK_SECRET=your_webhook_secret_here
NOTION_WEBHOOK_URL=https://your-domain.com/api/webhooks/notion

# Slack Bot Integration
SLACK_BOT_URL=http://localhost:3000

Slack Bot Environment (.env)

SLACK_BOT_TOKEN=xoxb-...
SLACK_SIGNING_SECRET=...
SLACK_APP_TOKEN=xapp-...
API_BASE_URL=http://localhost:8000
PORT=3000

# Redis (for view tracking)
REDIS_URL=redis://localhost:6379

Notion URLs


Appendix C: Glossary

Thematic List (Category): A curated stream of resources designed for a specific purpose and audience context. Each list has a name, description, icon, default tags, and digest schedule.

Resource (Knowledge Item): A single knowledge item submitted by a user, including title, content, metadata, tags, and AI summaries.

Target Group: A role-based audience segment that shapes summarization (e.g., Developers, Marketers, CEO).

Digest: A periodic report generated per list and tailored per target group, delivered via Slack.

Sprint 0 (POC): Proof of Concept phase validating the technical approach with basic functionality.

MVP: Minimum Viable Product with AI summaries, digest automation, and production infrastructure.

Story Points (SP): Relative measure of effort required to complete a ticket (Fibonacci scale: 1, 2, 3, 5, 8, 13, 21).


Document Version: 4.0 (Consolidated) | Last Updated: 2026-02-20 | Next Review: After Sprint 1 completion