MVP Scope & Requirements
MVP Vision
What the MVP Delivers
The Yappa Knowledge Hub MVP transforms the proof-of-concept into a production-ready AI-powered knowledge management system. It enables teams to capture, organize, and digest knowledge resources through Slack, with AI-generated summaries tailored to different audience roles.
Core value proposition:
- Submit URLs and content via Slack commands
- Automatic AI-powered summarization in Dutch
- Role-based summary targeting (developer, manager, executive)
- Organized knowledge in thematic lists with automatic tagging
- Bidirectional sync with Notion for extended workflows
- On-demand summary regeneration and manual editing
Target Users and Use Cases
Primary Users:
- Development teams sharing technical resources
- Product managers curating feature documentation
- Executives reviewing strategic content
- Knowledge workers building team wikis
Key Use Cases:
- Developer shares technical article → AI generates technical summary for team
- Manager submits product roadmap → AI creates executive summary for leadership
- Team member adds resource to thematic list → Auto-tagged and synced to Notion
- User regenerates summary with different role perspective
- Team browses categorized knowledge through Slack App Home
Success Criteria
Technical Success:
- End-to-end flow: Slack input → AI processing → Slack output in under 30 seconds
- 99% uptime for backend services
- Dutch-language summaries with at least 90% language accuracy
- Error rate below 5%
- Async processing handles 100+ concurrent requests
User Success:
- Users can submit and retrieve resources without leaving Slack
- AI summaries reduce reading time by 70%
- Role-based summaries provide relevant context for each audience
- Manual editing and regeneration provide control over output
- Thematic lists enable organized knowledge discovery
Business Success:
- MVP demonstrates feasibility for production deployment
- Foundation supports post-MVP features (PDF processing, advanced search)
- Architecture scales to 1000+ resources and 50+ users
- OpenAI costs remain under budget ($100/month for MVP phase)
MVP Scope: 31 User Stories
Phase 1: Foundation (Week 1-2)
Data Model & Infrastructure (7 stories)
MSP2-405: US-116 - Refactor Knowledge to Resource Entity
- Status: To Do
- Rename Knowledge entity to Resource for semantic clarity
- Update all references in codebase
MSP2-406: US-117 - Migrate Existing Data to Resource Schema
- Status: To Do
- Create migration script for existing data
- Validate data integrity post-migration
MSP2-407: US-118 - Update API Endpoints for Resource Entity
- Status: To Do
- Change /api/knowledge to /api/resources
- Update Slack bot API calls
MSP2-420: US-131 - Rename Category to ThematicList Entity
- Status: In Progress
- Rename for semantic clarity
- Update database schema and all references
MSP2-421: US-132 - Create ThematicListManager Service
- Status: To Do
- Centralized service for list operations
- Support filtering and statistics
MSP2-422: US-133 - Implement Default Tags Feature
- Status: In Progress
- Auto-apply tags when resource added to list
- Configurable per thematic list
MSP2-52: US-039 - Persistent Backend Service
- Status: In Progress
- Deploy backend to VPS with monitoring
- Configure auto-restart and health checks
Phase 2: AI Infrastructure (Week 3-4)
AI Services & Integration (8 stories)
MSP2-457: US-168 - Implement AiProviderInterface
- Status: To Do
- Abstract interface for AI providers
- Enable future provider flexibility
MSP2-458: US-169 - Integrate OpenAI Provider
- Status: To Do
- Refactor OpenAiService to implement interface
- Add rate limiting and cost tracking
MSP2-455: US-166 - Create AiSummaryService Foundation
- Status: To Do
- Core service for AI summary operations
- Caching, retry logic, async processing
MSP2-456: US-167 - Create Summary Entity and Migration
- Status: To Do
- Separate summaries from resources
- Support multiple summaries per resource
MSP2-459: US-170 - Add Summary Generation Endpoint
- Status: To Do
- REST API: POST /api/resources/{id}/summaries
- Async processing with job status tracking
MSP2-460: US-171 - Create SummaryPromptBuilder Service
- Status: To Do
- Centralized prompt construction
- Template versioning and variable injection
MSP2-461: US-172 - Create Role Entity for Target Groups
- Status: To Do
- Dynamic roles with prompt templates
- Seed default roles: developer, manager, executive
MSP2-462: US-173 - Implement Advanced Prompt Templates
- Status: To Do
- Role-specific prompt variations
- Template optimization for quality output
Phase 3: Role-Based Summaries (Week 5-6)
Role Management & Language (6 stories)
MSP2-36: US-023 - Tailor Summary to Audience Role
- Status: Done
- Generate summaries based on target role
- Different content depth per role
MSP2-37: US-024 - Summaries in Dutch
- Status: To Do
- Enforce Dutch language in all summaries
- Validate language quality
MSP2-464: US-175 - Test Prompts with Different Roles
- Status: To Do
- Validate role-specific output quality
- Iterate on prompt templates
MSP2-465: US-176 - Add Explicit Language Directive
- Status: To Do
- Strengthen Dutch language enforcement
- Handle edge cases where AI defaults to English
MSP2-75: US-062 - Assign Roles to Users
- Status: To Do
- User profile with role assignment
- Default role for summary generation
MSP2-76: US-063 - Auto-Subscribe by Role
- Status: To Do
- Automatic list subscriptions based on role
- Configurable subscription rules
Phase 4: UI & Integration (Week 7-8)
User Interface & Validation (10 stories)
MSP2-35: US-022 - Trigger AI Summary On-Demand
- Status: In Progress
- Slack button to generate summary
- Loading indicator during generation
MSP2-40: US-027 - Manually Edit Summary
- Status: To Do
- Edit modal in Slack
- Mark summaries as manually edited
MSP2-481: US-192 - Add Summary Edit UI
- Status: To Do
- Enhanced edit interface
- Preview before saving
MSP2-475: US-186 - Add Summary Regeneration UI
- Status: To Do
- Regenerate button with confirmation
- Store previous versions
MSP2-30: US-017 - Filter Lists
- Status: In Progress
- Filter thematic lists by criteria
- Search and sort capabilities
MSP2-415: US-126 - Create UrlValidatorService
- Status: To Do
- Validate URL format and reachability
- Extract metadata from URLs
MSP2-23: US-010 - Validate URL Reachability
- Status: To Do
- HTTP HEAD request to check availability
- Handle redirects and timeouts
MSP2-88: US-075 - URL Validation
- Status: To Do
- Comprehensive URL validation
- Error messages for invalid URLs
MSP2-416: US-127 - Integrate URL Validation into Submission Flow
- Status: To Do
- Validate before content extraction
- User feedback for validation failures
MSP2-419: US-130 - Add Validation Status to UI
- Status: To Do
- Display validation status in Slack
- Visual indicators for valid/invalid URLs
Core Features
AI-Powered Summaries
Summary Generation:
- On-demand generation via Slack button
- Async processing with job status tracking
- Caching to avoid duplicate API calls
- Retry logic with exponential backoff
- Cost tracking per summary
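The caching bullet above could be implemented with a deterministic cache key. The sketch below is an assumption, not the project's implementation: the key composition (resource ID, role, prompt version, language) and the function names are hypothetical, and a plain array stands in for Redis.

```php
<?php
declare(strict_types=1);

/**
 * Builds a deterministic cache key for a summary request.
 * The choice of inputs is an assumption made for illustration.
 */
function summaryCacheKey(
    string $resourceId,
    string $roleName,
    string $promptVersion,
    string $language = 'nl'
): string {
    // Identical requests hash to the same key, so a repeat request can skip the API call.
    $fingerprint = hash('sha256', implode('|', [$resourceId, $roleName, $promptVersion, $language]));

    return 'summary:' . $fingerprint;
}

/**
 * Returns the cached summary when present; otherwise calls $generate and stores the result.
 * $cache is a plain array standing in for Redis. $bypassCache models a forced regeneration.
 */
function getOrGenerate(array &$cache, string $key, callable $generate, bool $bypassCache = false): string
{
    if (!$bypassCache && isset($cache[$key])) {
        return $cache[$key];
    }
    $cache[$key] = $generate();

    return $cache[$key];
}
```

Including the prompt version in the key means a template change invalidates older cached summaries without an explicit flush.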
Quality Assurance:
- Prompt versioning for reproducibility
- Template optimization based on feedback
- Dutch language enforcement
- Length constraints (150-300 words)
- Metadata tracking (model, tokens, cost)
Role-Based Targeting
Supported Roles:
Developer
- Technical depth and implementation details
- Code examples and architecture insights
- API references and integration patterns
Manager
- Business context and team impact
- Timeline and resource implications
- Decision points and trade-offs
Executive
- Strategic overview and business value
- High-level outcomes and ROI
- Risk assessment and recommendations
Role Assignment:
- User profiles with default role
- Override role per summary request
- Auto-subscribe to relevant lists by role
Dutch Language Support
Implementation:
- Explicit language directive in all prompts
- System message: "IMPORTANT: Respond ONLY in Dutch language"
- Validation of output language
- Fallback handling if English detected
- Quality metrics for Dutch fluency
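A minimal sketch of how the language directive and placeholder substitution could work, assuming bracketed placeholders such as [ROLE] and [TITLE] in the templates. The function names mirror SummaryPromptBuilderInterface, but the bodies and the sample values are illustrative assumptions.

```php
<?php
declare(strict_types=1);

const LANGUAGE_DIRECTIVE = 'IMPORTANT: Respond ONLY in Dutch language.';

/** Replaces [NAME] placeholders with values; unknown placeholders are left untouched. */
function renderTemplate(string $template, array $variables): string
{
    $replacements = [];
    foreach ($variables as $name => $value) {
        $replacements['[' . strtoupper((string) $name) . ']'] = (string) $value;
    }

    // strtr() does not rescan substituted text, so a "[ROLE]" inside the
    // submitted content is never expanded a second time.
    return strtr($template, $replacements);
}

/** Prepends the directive unless the prompt already starts with it. */
function addLanguageDirective(string $prompt): string
{
    if (str_starts_with($prompt, LANGUAGE_DIRECTIVE)) {
        return $prompt;
    }

    return LANGUAGE_DIRECTIVE . "\n" . $prompt;
}

$template = "You are summarizing content for a [ROLE].\nTitle: [TITLE]\nURL: [URL]";
$prompt = addLanguageDirective(renderTemplate($template, [
    'role'  => 'developer',
    'title' => 'Example article',
    'url'   => 'https://example.com/post',
]));

echo $prompt, PHP_EOL;
```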
Prompt Structure:
IMPORTANT: Respond ONLY in Dutch language.
You are summarizing content for a [ROLE].
Title: [TITLE]
URL: [URL]
Content:
[CONTENT]
Generate a concise summary in Dutch that:
- Highlights key points relevant to [ROLE]
- Uses appropriate technical depth for [ROLE]
- Is 150-300 words
- Is written entirely in Dutch
URL Validation
Validation Steps:
- Format validation (RFC 3986)
- Reachability check (HTTP HEAD)
- Content-Type verification
- Redirect handling (max 3 hops)
- Timeout handling (5 seconds)
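The first two steps above might look like the following sketch. It rests on several assumptions: only http and https URLs are accepted, any 2xx or 3xx final status counts as reachable, and the function names are hypothetical. PHP's FILTER_VALIDATE_URL follows the older RFC 2396, so it only approximates the RFC 3986 check named above. Content-Type verification is omitted for brevity.

```php
<?php
declare(strict_types=1);

const MAX_REDIRECTS = 3;   // "max 3 hops" from the validation steps
const TIMEOUT_SECONDS = 5; // "5 seconds" from the validation steps

/** Format check: an absolute http(s) URL with a host. */
function hasValidFormat(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true)
        && parse_url($url, PHP_URL_HOST) !== null;
}

/** Reachability check via an HTTP HEAD request; requires ext-curl. */
function isReachable(string $url): bool
{
    $handle = curl_init($url);
    if ($handle === false) {
        return false;
    }
    curl_setopt_array($handle, [
        CURLOPT_NOBODY         => true,  // send HEAD instead of GET
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => MAX_REDIRECTS,
        CURLOPT_TIMEOUT        => TIMEOUT_SECONDS,
    ]);
    $completed = curl_exec($handle) !== false;
    $status = (int) curl_getinfo($handle, CURLINFO_RESPONSE_CODE);

    return $completed && $status >= 200 && $status < 400;
}
```

Some servers reject HEAD requests, so a production check may need to fall back to a ranged GET before marking a URL invalid.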
Validation States:
- pending: Validation queued
- valid: URL is reachable
- invalid: URL failed validation
- error: Validation error occurred
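The four states could be modeled as a backed enum whose string values match what Resource stores in validationStatus. This sketch requires PHP 8.1 or later. The helper methods, and the choice to treat only the error state as retryable, are assumptions for illustration.

```php
<?php
declare(strict_types=1);

enum ValidationStatus: string
{
    case Pending = 'pending';
    case Valid   = 'valid';
    case Invalid = 'invalid';
    case Error   = 'error';

    /** Assumption: only a failed check is worth retrying; "invalid" is a definitive answer. */
    public function isRetryable(): bool
    {
        return $this === self::Error;
    }

    /** A final state means validation produced a definitive result. */
    public function isFinal(): bool
    {
        return $this === self::Valid || $this === self::Invalid;
    }
}
```

ValidationStatus::from() converts a stored string back to a case and throws on unknown values, while tryFrom() returns null instead.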
User Feedback:
- Real-time validation status in Slack
- Error messages with suggested fixes
- Retry option for transient failures
Summary Editing and Regeneration
Manual Editing:
- Edit modal with current summary content
- Preview before saving
- Track edit history
- Mark as manually edited
- Preserve original AI-generated version
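One way to keep the original AI output while tracking edits is an append-only version list. The class names below are hypothetical and the sketch ignores persistence. It only shows that an edit adds a new version rather than overwriting the first one.

```php
<?php
declare(strict_types=1);

final class SummaryVersion
{
    public function __construct(
        public readonly int $version,
        public readonly string $content,
        public readonly bool $manuallyEdited,
        public readonly ?string $editedBy = null,
    ) {
    }
}

final class SummaryHistory
{
    /** @var SummaryVersion[] oldest first; index 0 is the original AI output */
    private array $versions;

    public function __construct(string $aiContent)
    {
        $this->versions = [new SummaryVersion(1, $aiContent, false)];
    }

    /** Appends a manually edited version; earlier versions stay untouched. */
    public function edit(string $newContent, string $editedBy): SummaryVersion
    {
        $next = new SummaryVersion(count($this->versions) + 1, $newContent, true, $editedBy);
        $this->versions[] = $next;

        return $next;
    }

    public function current(): SummaryVersion
    {
        return $this->versions[array_key_last($this->versions)];
    }

    public function original(): SummaryVersion
    {
        return $this->versions[0];
    }
}
```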
Regeneration:
- Force new generation bypassing cache
- Increment version number
- Store previous versions
- Rate limiting (max 3 regenerations per hour)
- Confirmation dialog to prevent accidental regeneration
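The limit of 3 regenerations per hour could be enforced with a sliding window. The sketch below assumes the limit applies per summary, which the requirements do not specify, and keeps timestamps in memory. A deployed version would more likely keep them in Redis. The class name is hypothetical.

```php
<?php
declare(strict_types=1);

final class RegenerationLimiter
{
    /** @var array<string, int[]> timestamps of accepted regenerations per summary ID */
    private array $accepted = [];

    public function __construct(
        private readonly int $maxPerWindow = 3,
        private readonly int $windowSeconds = 3600,
    ) {
    }

    /** Allows and records the attempt if fewer than the maximum occurred within the window. */
    public function tryAcquire(string $summaryId, int $now): bool
    {
        $cutoff = $now - $this->windowSeconds;

        // Keep only the timestamps that are still inside the window.
        $recent = array_values(array_filter(
            $this->accepted[$summaryId] ?? [],
            static fn (int $timestamp): bool => $timestamp > $cutoff
        ));

        if (count($recent) >= $this->maxPerWindow) {
            $this->accepted[$summaryId] = $recent;

            return false; // rejected attempts are not recorded
        }

        $recent[] = $now;
        $this->accepted[$summaryId] = $recent;

        return true;
    }
}
```

Passing the current time as an argument keeps the limiter testable without waiting for the clock.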
Technical Requirements
New Entities
Summary Entity:
class Summary
{
    private Uuid $id;
    private Resource $resource;
    private Role $role;
    private string $content;
    private string $language; // 'nl'
    private int $version;
    private \DateTime $generatedAt;
    private ?\DateTime $editedAt;
    private ?string $editedBy;
    private array $metadata; // tokens, cost, model, provider
}
Role Entity:
class Role
{
    private Uuid $id;
    private string $name; // developer, manager, executive
    private string $description;
    private string $promptTemplate;
    private \DateTime $createdAt;
    private \DateTime $updatedAt;
}
Resource Entity (Enhanced):
class Resource
{
    private Uuid $id;
    private string $sourceType;
    private string $sourceUrl;
    private string $status;
    private string $validationStatus; // NEW
    private ?\DateTime $validatedAt; // NEW
    private array $metadata; // ENHANCED
    private Collection $summaries; // One-to-many
}
ThematicList Entity (Renamed from Category):
class ThematicList
{
    private Uuid $id;
    private string $name;
    private string $description;
    private string $icon;
    private Collection $defaultTags; // NEW
    private ?User $owner; // NEW
    private Collection $resources;
}
New Services
AiSummaryService:
interface AiSummaryServiceInterface
{
    public function generateSummary(
        Resource $resource,
        Role $role,
        bool $async = true
    ): Summary|string;
    public function regenerateSummary(Summary $summary): Summary;
    public function editSummary(
        Summary $summary,
        string $newContent,
        string $editedBy
    ): Summary;
    public function getSummaryStatus(string $jobId): array;
}
SummaryPromptBuilder:
interface SummaryPromptBuilderInterface
{
    public function buildPrompt(Resource $resource, Role $role): string;
    public function getPromptTemplate(string $roleName): string;
    public function renderTemplate(string $template, array $variables): string;
    public function addLanguageDirective(string $prompt): string;
    public function getPromptVersion(): string;
}
UrlValidatorService:
interface UrlValidatorServiceInterface
{
    public function validateUrl(string $url): ValidationResult;
    public function checkReachability(string $url): bool;
    public function extractMetadata(string $url): array;
    public function getValidationStatus(string $url): string;
}
ThematicListManager:
interface ThematicListManagerInterface
{
    public function createList(array $data): ThematicList;
    public function assignDefaultTags(
        ThematicList $list,
        Resource $resource
    ): void;
    public function filterLists(array $criteria): array;
    public function getListStats(ThematicList $list): array;
}
OpenAI Integration
Configuration:
- API Key: Environment variable OPENAI_API_KEY
- Model: Configurable via OPENAI_MODEL (default: gpt-4)
- Temperature: 0.7 for balanced creativity
- Max Tokens: 500 for summary length control
Rate Limiting:
- Max 3 retries with exponential backoff
- Retry delays: 1s, 2s, 4s
- Queue system for burst handling
- Graceful degradation on rate limit errors
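The retry schedule above (1s, 2s, 4s) follows from a base delay of 1000 ms doubled on each retry. A minimal sketch, assuming transient failures surface as RuntimeException and using hypothetical function names:

```php
<?php
declare(strict_types=1);

/** Delay in milliseconds before retry number $retry (1-based): 1000, 2000, 4000 by default. */
function backoffDelayMs(int $retry, int $baseMs = 1000, int $multiplier = 2): int
{
    return $baseMs * ($multiplier ** max(0, $retry - 1));
}

/**
 * Runs $operation and retries up to $maxRetries times when it throws.
 * $sleep is injectable so the schedule can be tested without actually waiting.
 */
function withRetries(callable $operation, int $maxRetries = 3, ?callable $sleep = null): mixed
{
    $sleep ??= static fn (int $ms) => usleep($ms * 1000);

    for ($attempt = 0; ; $attempt++) {
        try {
            return $operation();
        } catch (\RuntimeException $exception) {
            if ($attempt >= $maxRetries) {
                throw $exception; // retries exhausted; the caller decides how to degrade
            }
            $sleep(backoffDelayMs($attempt + 1));
        }
    }
}
```

With the defaults, one request makes at most four attempts and waits at most 7 seconds in total between them.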
Cost Tracking:
- Log prompt tokens, completion tokens, total tokens
- Calculate cost per summary
- Daily cost reports
- Alert on budget threshold (80% of monthly limit)
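Per-summary cost and the 80% alert can be derived from the logged token counts. In this sketch the function names are hypothetical and the per-1,000-token prices are passed in as parameters, because actual rates depend on the configured model and change over time.

```php
<?php
declare(strict_types=1);

/**
 * Cost in USD for one completion.
 * The prices per 1,000 tokens are caller-supplied; no real rates are assumed here.
 */
function summaryCostUsd(
    int $promptTokens,
    int $completionTokens,
    float $promptPricePer1k,
    float $completionPricePer1k
): float {
    return ($promptTokens / 1000) * $promptPricePer1k
        + ($completionTokens / 1000) * $completionPricePer1k;
}

/** True once month-to-date spend reaches the alert threshold (80% of the budget by default). */
function shouldAlert(float $monthToDateUsd, float $monthlyBudgetUsd = 100.0, float $threshold = 0.8): bool
{
    return $monthToDateUsd >= $monthlyBudgetUsd * $threshold;
}
```

For example, with placeholder prices of 0.03 and 0.06 per 1,000 tokens, a call using 2,000 prompt tokens and 500 completion tokens costs 0.09 USD.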
Error Handling:
- Retry on transient failures
- Fallback to cached summaries
- User notification on persistent failures
- Detailed error logging for debugging
Async Processing
Symfony Messenger Configuration:
framework:
    messenger:
        failure_transport: failed
        transports:
            async:
                dsn: '%env(MESSENGER_TRANSPORT_DSN)%'
                options:
                    auto_setup: true
                retry_strategy:
                    max_retries: 3
                    delay: 1000
                    multiplier: 2
            failed: 'doctrine://default?queue_name=failed'
        routing:
            'App\Message\GenerateSummaryMessage': async
            'App\Message\ValidateUrlMessage': async
            'App\Message\SyncNotionMessage': async
Message Handlers:
- GenerateSummaryMessageHandler: Process summary generation
- ValidateUrlMessageHandler: Validate URL reachability
- SyncNotionMessageHandler: Sync to Notion databases
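A message and its handler might be shaped as follows. This is a sketch under stated assumptions: the message fields, the two injected closures (stand-ins for AiSummaryService and a repository), and the empty-summary check are all hypothetical. In the application the classes would live under the App\Message and App\MessageHandler namespaces. The AsMessageHandler attribute is Symfony Messenger's marker for handler classes.

```php
<?php
declare(strict_types=1);

use Symfony\Component\Messenger\Attribute\AsMessageHandler;

/** Message payload: IDs only, so the handler can reload fresh entities. */
final class GenerateSummaryMessage
{
    public function __construct(
        public readonly string $resourceId,
        public readonly string $roleName,
    ) {
    }
}

#[AsMessageHandler]
final class GenerateSummaryMessageHandler
{
    /**
     * @param \Closure(string, string): string $generate hypothetical stand-in for the AI call
     * @param \Closure(string, string): void   $persist  hypothetical stand-in for storage
     */
    public function __construct(
        private readonly \Closure $generate,
        private readonly \Closure $persist,
    ) {
    }

    public function __invoke(GenerateSummaryMessage $message): void
    {
        $summary = ($this->generate)($message->resourceId, $message->roleName);

        // Throwing lets Messenger apply the configured retry strategy and,
        // once retries are exhausted, move the message to the failure transport.
        if ($summary === '') {
            throw new \RuntimeException('Empty summary for resource ' . $message->resourceId);
        }

        ($this->persist)($message->resourceId, $summary);
    }
}
```

Carrying IDs rather than entities keeps the serialized message small and avoids acting on stale data when a worker picks it up later.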
Queue Management:
- Redis as message transport
- Supervisor for worker process management
- Health checks for worker status
- Dead letter queue for failed messages
Architecture Changes from POC
What's Being Added
New Infrastructure:
- Redis for caching and message queue
- Symfony Messenger for async processing
- Supervisor for worker management
- Health check endpoints
New Services:
- AiSummaryService (replaces stub)
- SummaryPromptBuilder (new)
- UrlValidatorService (new)
- ThematicListManager (new)
- AiProviderInterface (abstraction layer)
New Entities:
- Role (dynamic target groups)
- Enhanced Summary (with role relationship)
- Enhanced Resource (with validation status)
- ThematicList (renamed from Category)
New UI Components:
- Summary generation button
- Loading indicators
- Edit summary modal
- Regenerate confirmation dialog
- Validation status display
What's Being Refactored
Entity Renames:
- Knowledge → Resource (semantic clarity)
- Category → ThematicList (semantic clarity)
Service Enhancements:
- OpenAiService → Implements AiProviderInterface
- SummaryService → Full implementation (was stub)
- NotionDigestService → Enhanced with summaries
API Endpoints:
- /api/knowledge → /api/resources
- /api/categories → /api/lists
- New: /api/resources/{id}/summaries
- New: /api/summaries/{id}/regenerate
Database Schema:
- Add validation fields to Resource
- Add role relationship to Summary
- Add default tags to ThematicList
- Add indexes for performance
Migration Strategy
Phase 1: Additive Changes (Week 1-2)
- Create new entities (Role, enhanced Summary)
- Add new fields to existing entities
- Deploy new services alongside old ones
- No breaking changes to existing functionality
Phase 2: Refactoring (Week 3-4)
- Rename entities (Knowledge → Resource, Category → ThematicList)
- Update API endpoints with versioning
- Migrate data with zero downtime
- Update Slack bot to use new endpoints
Phase 3: Integration (Week 5-6)
- Connect AI services to Slack UI
- Enable async processing
- Add validation to submission flow
- Test end-to-end flows
Phase 4: Cleanup (Week 7-8)
- Remove deprecated endpoints
- Clean up old code
- Optimize database queries
- Performance tuning
Rollback Plan:
- Database migrations are reversible
- Old API endpoints remain during transition
- Feature flags for new functionality
- Backup before each major change
Implementation Timeline
8 Weeks Total, 4 Phases of 2 Weeks Each
Week 1-2: Foundation
- Focus: Data model and infrastructure
- Stories: 7 (MSP2-405, 406, 407, 420, 421, 422, 52)
- Deliverable: Enhanced database schema, deployed backend
- Risk: Medium (entity renames require careful refactoring)
Week 3-4: AI Infrastructure
- Focus: AI services and OpenAI integration
- Stories: 8 (MSP2-457, 458, 455, 456, 459, 460, 461, 462)
- Deliverable: Working AI summary generation
- Risk: High (OpenAI integration, async processing)
Week 5-6: Role-Based Summaries
- Focus: Role management and language enforcement
- Stories: 6 (MSP2-36, 37, 464, 465, 75, 76)
- Deliverable: Role-tailored Dutch summaries
- Risk: Medium (Dutch language quality validation)
Week 7-8: UI & Integration
- Focus: Slack UI and URL validation
- Stories: 10 (MSP2-35, 40, 481, 475, 30, 415, 23, 88, 416, 419)
- Deliverable: Complete end-to-end demo
- Risk: Low (UI polish and integration)
Week-by-Week Breakdown
Week 1:
- Day 1-2: Refactor Knowledge to Resource entity
- Day 3-4: Rename Category to ThematicList entity
- Day 5: Deploy persistent backend service
Week 2:
- Day 1-2: Migrate existing data to new schema
- Day 3-4: Implement ThematicListManager and default tags
- Day 5: Update API endpoints
Week 3:
- Day 1-2: Implement AiProviderInterface and OpenAI integration
- Day 3-4: Create AiSummaryService foundation
- Day 5: Create Summary entity and migration
Week 4:
- Day 1-2: Add summary generation endpoint
- Day 3-4: Create SummaryPromptBuilder service
- Day 5: Create Role entity and seed data
Week 5:
- Day 1-2: Implement advanced prompt templates
- Day 3-4: Test prompts with different roles
- Day 5: Add explicit language directive
Week 6:
- Day 1-2: Assign roles to users
- Day 3-4: Implement auto-subscribe by role
- Day 5: Validate Dutch language quality
Week 7:
- Day 1-2: Trigger AI summary on-demand (Slack UI)
- Day 3-4: Add summary edit and regeneration UI
- Day 5: Implement filter lists
Week 8:
- Day 1-2: Create UrlValidatorService
- Day 3-4: Integrate URL validation into submission flow
- Day 5: Add validation status to UI, final testing
Demo Requirements
End-to-End Flow
Demo Script (5 minutes):
Setup (30 seconds)
- Show Slack channel with existing thematic list
- Explain the knowledge hub concept
Submit Resource (1 minute)
- User types /yappa add [URL] in Slack
- System validates URL (show validation status)
- Content is extracted and stored
- Resource appears in thematic list
Generate Summary (1.5 minutes)
- User clicks "Generate Summary" button
- Loading indicator shows "Generating summary..."
- AI generates Dutch summary tailored to developer role
- Summary appears in Slack message
Edit Summary (1 minute)
- User clicks "Edit" button
- Edit modal opens with current summary
- User makes changes and saves
- Updated summary displayed with "Edited" badge
Regenerate Summary (1 minute)
- User clicks "Regenerate" button
- Confirmation dialog appears
- New summary generated with different phrasing
- Both versions stored in history
Browse Knowledge (30 seconds)
- Show App Home with thematic lists
- Filter lists by category
- Show resource count and recent additions
What Will Be Demonstrated
Core Functionality:
- URL submission via Slack command
- Automatic URL validation
- Content extraction and storage
- AI summary generation in Dutch
- Role-based summary targeting
- Manual summary editing
- Summary regeneration
- Thematic list organization
- Notion sync (background)
Technical Capabilities:
- Async processing (no blocking)
- Error handling (invalid URL, API failure)
- Loading states and user feedback
- Caching (regenerate bypasses cache)
- Version tracking (edit history)
Quality Attributes:
- Response time: < 30 seconds for summary
- Dutch language quality: Natural and fluent
- Role appropriateness: Technical depth matches role
- UI responsiveness: Immediate feedback
- Error recovery: Graceful degradation
Acceptance Criteria
Functional Requirements:
- User can submit URL via Slack command
- System validates URL before processing
- Content is extracted and stored as Resource
- User can trigger AI summary generation
- Summary is generated in Dutch language
- Summary is tailored to user's role
- Summary is displayed in Slack message
- User can edit summary manually
- User can regenerate summary
- Resource is synced to Notion
- User can browse thematic lists
- User can filter lists by criteria
Non-Functional Requirements:
- Summary generation completes in < 30 seconds
- Dutch language quality is 90%+ correct
- System handles 100+ concurrent requests
- Error rate is < 5%
- Backend uptime is 99%+
- API response time is < 500ms
- OpenAI costs are under budget
User Experience:
- Loading indicators show progress
- Error messages are clear and actionable
- UI is intuitive and requires no training
- Feedback is immediate (< 1 second)
- Summaries are readable and useful
Out of Scope
Features NOT in MVP
Advanced Content Processing:
- PDF text extraction and summarization
- YouTube video transcription
- Podcast audio transcription
- RSS feed auto-import
- Image OCR and analysis
Advanced Search:
- Full-text search across all resources
- Semantic search using embeddings
- Search filters (date, author, tags)
- Search result ranking
- Saved searches
User Management:
- User authentication and authorization
- Team management and permissions
- User profiles and preferences
- Activity tracking and analytics
- Usage quotas and limits
Advanced AI Features:
- Multi-language support (beyond Dutch)
- Custom prompt templates per user
- AI-powered tagging suggestions
- Automatic categorization
- Sentiment analysis
- Key phrase extraction
Integration Enhancements:
- Webhook support for external systems
- API rate limiting per user
- OAuth authentication
- SSO integration
- Export to other platforms (Confluence, SharePoint)
Analytics and Reporting:
- Usage dashboards
- Summary quality metrics
- User engagement analytics
- Cost analysis and optimization
- A/B testing for prompts
Post-MVP Roadmap Items
Phase 2 (Months 3-4):
- PDF processing and summarization
- Advanced search with filters
- User authentication and teams
- Custom prompt templates
- Usage analytics dashboard
Phase 3 (Months 5-6):
- Multi-language support (English, German)
- YouTube and podcast transcription
- AI-powered auto-tagging
- Webhook integrations
- Export to Confluence/SharePoint
Phase 4 (Months 7-8):
- Semantic search with embeddings
- Sentiment analysis
- Custom AI models (fine-tuning)
- Advanced analytics and reporting
- Mobile app (iOS/Android)
Why They're Deferred
Complexity vs. Value:
- MVP focuses on core value: AI summaries in Slack
- Advanced features add complexity without validating core hypothesis
- Better to iterate on core features based on user feedback
Resource Constraints:
- 8-week timeline requires focus on essentials
- Advanced features require additional development time
- OpenAI costs need to be validated before expanding scope
Risk Mitigation:
- Validate core concept before investing in advanced features
- Gather user feedback to prioritize post-MVP features
- Ensure technical foundation is solid before adding complexity
Learning Opportunities:
- MVP will reveal which features users actually need
- Usage patterns will inform prioritization
- Cost data will guide optimization efforts
Success Metrics
Technical Metrics
Performance:
- Summary generation time: < 30 seconds (target: 20 seconds)
- API response time: < 500ms (target: 200ms)
- Database query time: < 100ms (target: 50ms)
- Async job processing time: < 60 seconds (target: 30 seconds)
Reliability:
- Backend uptime: 99% (target: 99.5%)
- Error rate: < 5% (target: < 2%)
- OpenAI API success rate: > 95% (target: > 98%)
- Message queue processing rate: > 95% (target: > 99%)
Scalability:
- Concurrent users: 50 (target: 100)
- Resources stored: 1000 (target: 5000)
- Summaries generated: 500/day (target: 1000/day)
- API requests: 10,000/day (target: 50,000/day)
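The summary volume above can be checked against the $100/month OpenAI budget with simple arithmetic. The sketch assumes the whole budget goes to summary generation and a 30-day month. The function name is hypothetical.

```php
<?php
declare(strict_types=1);

/** Average spend allowed per summary if the entire monthly budget is spent on summaries. */
function budgetPerSummaryUsd(float $monthlyBudgetUsd, int $summariesPerDay, int $daysPerMonth = 30): float
{
    return $monthlyBudgetUsd / ($summariesPerDay * $daysPerMonth);
}

printf("%.4f\n", budgetPerSummaryUsd(100.0, 500));  // prints 0.0067
printf("%.4f\n", budgetPerSummaryUsd(100.0, 1000)); // prints 0.0033
```

At 500 summaries per day the budget allows roughly 0.67 cents per summary, which is worth comparing against measured per-summary cost once cost tracking is in place.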
Quality:
- Dutch language accuracy: 90% (target: 95%)
- Summary relevance: 80% (target: 90%)
- Code coverage: 70% (target: 80%)
- Security vulnerabilities: 0 critical (target: 0 high/critical)
User Adoption Targets
Week 1-2 (Foundation):
- 5 beta users testing basic flows
- 50 resources submitted
- 0 summaries generated (not yet implemented)
Week 3-4 (AI Infrastructure):
- 10 beta users testing AI summaries
- 100 resources submitted
- 50 summaries generated
- Feedback on Dutch language quality
Week 5-6 (Role-Based Summaries):
- 20 beta users with assigned roles
- 200 resources submitted
- 150 summaries generated
- Feedback on role appropriateness
Week 7-8 (UI & Integration):
- 30 beta users using full feature set
- 500 resources submitted
- 300 summaries generated
- 50 summaries edited manually
- 20 summaries regenerated
Post-MVP (Month 3):
- 50 active users
- 1000 resources submitted
- 500 summaries generated
- 80% user satisfaction
- 70% weekly active users
Quality Metrics
Dutch Language Quality:
- Grammar correctness: 95%
- Vocabulary appropriateness: 90%
- Fluency and naturalness: 85%
- Summaries with no English fallback: 100%
Summary Quality:
- Relevance to source content: 90%
- Appropriate length (150-300 words): 95%
- Role-appropriate depth: 85%
- Actionable insights: 80%
User Satisfaction:
- Overall satisfaction: 80% (4/5 stars)
- Would recommend: 75%
- Saves time: 85%
- Summaries are useful: 80%
System Quality:
- Zero critical bugs
- < 5 high-priority bugs
- < 10 medium-priority bugs
- All P0 issues resolved within 24 hours
Document Owner: Development Team
Last Updated: 2026-02-25
Version: 1.0
Status: Active Development