Database Documentation
This document provides comprehensive documentation for the database setup, migrations, and management in the Yappa Knowledge Hub backend.
Overview
The backend uses SQLite as the local database with Doctrine ORM for object-relational mapping and Doctrine Migrations for schema management.
Why SQLite?
- Simplicity: No separate database server required
- Portability: Single file database
- Performance: Fast for read-heavy workloads
- Zero Configuration: Works out of the box
- Perfect for Development: Easy to reset and seed
Database Location
backend/var/data.dbThe database file is created automatically when you run migrations.
Database Configuration
Environment Variables
Configure the database in .env or .env.local:
# SQLite database
DATABASE_URL="sqlite:///%kernel.project_dir%/var/data.db"
# Alternative: Absolute path
DATABASE_URL="sqlite:///https://github.com/undead2146/KnowledgeHub/blob/main/backend/var/data.db"Doctrine Configuration
Located in config/packages/doctrine.yaml:
doctrine:
dbal:
url: '%env(resolve:DATABASE_URL)%'
orm:
auto_generate_proxy_classes: true
enable_lazy_ghost_objects: true
report_fields_where_declared: true
validate_xml_mapping: true
naming_strategy: doctrine.orm.naming_strategy.underscore_number_aware
auto_mapping: true
mappings:
App:
type: attribute
is_bundle: false
dir: '%kernel.project_dir%/src/Entity'
prefix: 'App\Entity'
alias: AppDatabase Schema
Tables
The database consists of 8 tables:
- category - Knowledge categories
- knowledge - Main knowledge items
- resource - External resources
- resource_content - Extracted content
- summary - AI summaries
- tag - Tags
- resource_tag - Many-to-many join table
- resource_list - Many-to-many join table
Schema Diagram
See the ER diagram in entities.md for visual representation.
Detailed Schema
category
CREATE TABLE category (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
name VARCHAR(255) NOT NULL,
description VARCHAR(255) DEFAULT NULL,
icon VARCHAR(50) DEFAULT ':file_folder:',
target_groups TEXT DEFAULT NULL, -- JSON array
notion_id VARCHAR(255) DEFAULT NULL,
last_synced_at DATETIME DEFAULT NULL,
notion_last_edited_at DATETIME DEFAULT NULL,
created_at DATETIME DEFAULT NULL,
is_active BOOLEAN DEFAULT 1,
sort_order INTEGER DEFAULT 0
);
CREATE INDEX IDX_category_notion_id ON category(notion_id);knowledge
CREATE TABLE knowledge (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
category_id INTEGER DEFAULT NULL,
tags TEXT DEFAULT NULL, -- JSON array
target_groups TEXT DEFAULT NULL, -- JSON array
user_id VARCHAR(255) DEFAULT NULL,
status VARCHAR(50) DEFAULT 'pending',
url VARCHAR(500) DEFAULT NULL,
url_metadata TEXT DEFAULT NULL, -- JSON object
source_message TEXT DEFAULT NULL, -- JSON object
attachments TEXT DEFAULT NULL, -- JSON array
summaries TEXT DEFAULT NULL, -- JSON array
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL,
notion_id VARCHAR(255) DEFAULT NULL,
notion_url VARCHAR(500) DEFAULT NULL,
last_synced_at DATETIME DEFAULT NULL,
notion_last_edited_at DATETIME DEFAULT NULL,
CONSTRAINT FK_knowledge_category
FOREIGN KEY (category_id)
REFERENCES category (id)
ON DELETE SET NULL
);
CREATE INDEX IDX_knowledge_category ON knowledge(category_id);
CREATE INDEX IDX_knowledge_notion_id ON knowledge(notion_id);
CREATE INDEX IDX_knowledge_status ON knowledge(status);
CREATE INDEX IDX_knowledge_user_id ON knowledge(user_id);resource
CREATE TABLE resource (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
title VARCHAR(255) NOT NULL,
url VARCHAR(500) DEFAULT NULL,
source_type VARCHAR(20) NOT NULL, -- enum: url, text, pdf, rss, youtube, podcast
status VARCHAR(20) NOT NULL, -- enum: pending, processing, done, failed
notion_id VARCHAR(255) DEFAULT NULL,
last_synced_at DATETIME DEFAULT NULL,
notion_last_edited_at DATETIME DEFAULT NULL,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL
);
CREATE INDEX IDX_resource_notion_id ON resource(notion_id);
CREATE INDEX IDX_resource_status ON resource(status);
CREATE INDEX IDX_resource_source_type ON resource(source_type);resource_content
CREATE TABLE resource_content (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
content TEXT NOT NULL,
content_type VARCHAR(50) DEFAULT 'text/plain',
resource_id INTEGER NOT NULL,
created_at DATETIME NOT NULL,
CONSTRAINT FK_resource_content_resource
FOREIGN KEY (resource_id)
REFERENCES resource (id)
ON DELETE CASCADE
);
CREATE INDEX IDX_resource_content_resource ON resource_content(resource_id);summary
CREATE TABLE summary (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
content TEXT NOT NULL,
model VARCHAR(50) DEFAULT NULL,
resource_id INTEGER NOT NULL,
created_at DATETIME NOT NULL,
CONSTRAINT FK_summary_resource
FOREIGN KEY (resource_id)
REFERENCES resource (id)
ON DELETE CASCADE
);
CREATE INDEX IDX_summary_resource ON summary(resource_id);tag
CREATE TABLE tag (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
name VARCHAR(255) NOT NULL UNIQUE
);
CREATE UNIQUE INDEX UNIQ_tag_name ON tag(name);resource_tag (join table)
CREATE TABLE resource_tag (
resource_id INTEGER NOT NULL,
tag_id INTEGER NOT NULL,
PRIMARY KEY (resource_id, tag_id),
CONSTRAINT FK_resource_tag_resource
FOREIGN KEY (resource_id)
REFERENCES resource (id)
ON DELETE CASCADE,
CONSTRAINT FK_resource_tag_tag
FOREIGN KEY (tag_id)
REFERENCES tag (id)
ON DELETE CASCADE
);
CREATE INDEX IDX_resource_tag_resource ON resource_tag(resource_id);
CREATE INDEX IDX_resource_tag_tag ON resource_tag(tag_id);resource_list (join table)
CREATE TABLE resource_list (
resource_id INTEGER NOT NULL,
category_id INTEGER NOT NULL,
PRIMARY KEY (resource_id, category_id),
CONSTRAINT FK_resource_list_resource
FOREIGN KEY (resource_id)
REFERENCES resource (id)
ON DELETE CASCADE,
CONSTRAINT FK_resource_list_category
FOREIGN KEY (category_id)
REFERENCES category (id)
ON DELETE CASCADE
);
CREATE INDEX IDX_resource_list_resource ON resource_list(resource_id);
CREATE INDEX IDX_resource_list_category ON resource_list(category_id);Migrations
Doctrine Migrations manages database schema changes through versioned migration files.
Migration Files
Located in backend/migrations/:
Version20260211153507.php- Initial schemaVersion20260211155807.php- Schema updates
Creating Migrations
Generate Migration from Entity Changes
When you modify entities, generate a migration:
cd backend
php bin/console make:migrationThis analyzes your entities and generates a migration file with the necessary SQL changes.
Review the Migration
Always review the generated migration before running it:
// migrations/VersionXXXXXXXXXXXXXX.php
public function up(Schema $schema): void
{
// Review these SQL statements
$this->addSql('ALTER TABLE knowledge ADD COLUMN new_field VARCHAR(255)');
}Running Migrations
Execute All Pending Migrations
cd backend
php bin/console doctrine:migrations:migrateThis runs all migrations that haven't been executed yet.
Execute Specific Migration
php bin/console doctrine:migrations:execute --up VersionXXXXXXXXXXXXXXDry Run (Preview SQL)
php bin/console doctrine:migrations:migrate --dry-runMigration Status
Check which migrations have been executed:
php bin/console doctrine:migrations:statusOutput:
== Configuration
>> Name: Application Migrations
>> Database Driver: pdo_sqlite
>> Database Host: localhost
>> Database Name: data.db
>> Configuration Source: manually configured
>> Version Table Name: doctrine_migration_versions
>> Version Column Name: version
>> Migrations Namespace: DoctrineMigrations
>> Migrations Directory: /path/to/migrations
>> Previous Version: Already at first version
>> Current Version: 2026-02-11 15:58:07 (Version20260211155807)
>> Next Version: Already at latest version
>> Latest Version: 2026-02-11 15:58:07 (Version20260211155807)
>> Executed Migrations: 2
>> Executed Unavailable Migrations: 0
>> Available Migrations: 2
>> New Migrations: 0Rolling Back Migrations
Rollback to Previous Version
php bin/console doctrine:migrations:migrate prevRollback to Specific Version
php bin/console doctrine:migrations:migrate VersionXXXXXXXXXXXXXXRollback All Migrations
php bin/console doctrine:migrations:migrate firstDatabase Management
Creating the Database
Create the database file:
cd backend
php bin/console doctrine:database:createThis creates var/data.db if it doesn't exist.
Dropping the Database
Delete the database file:
php bin/console doctrine:database:drop --forceWarning: This deletes all data!
Validating Schema
Check if the database schema matches your entities:
php bin/console doctrine:schema:validateOutput:
[Mapping] OK - The mapping files are correct.
[Database] OK - The database schema is in sync with the mapping files.Updating Schema (Development Only)
For quick development, update schema without migrations:
php bin/console doctrine:schema:update --forceWarning: Only use in development! Use migrations in production.
Seeding Data
Using Data Fixtures
Data fixtures provide sample data for development and testing.
Create Fixture
// src/DataFixtures/DemoDataFixtures.php
namespace App\DataFixtures;
use App\Entity\Category;
use App\Entity\Knowledge;
use Doctrine\Bundle\FixturesBundle\Fixture;
use Doctrine\Persistence\ObjectManager;
class DemoDataFixtures extends Fixture
{
public function load(ObjectManager $manager): void
{
// Create categories
$category = new Category();
$category->setName('Development')
->setDescription('Development resources')
->setIcon('')
->setTargetGroups(['developers'])
->setIsActive(true)
->setSortOrder(1);
$manager->persist($category);
// Create knowledge items
for ($i = 1; $i <= 10; $i++) {
$knowledge = new Knowledge();
$knowledge->setTitle("Knowledge Item $i")
->setContent("This is the content for knowledge item $i")
->setCategory($category)
->setTags(['tag1', 'tag2'])
->setTargetGroups(['developers'])
->setStatus('published');
$manager->persist($knowledge);
}
$manager->flush();
}
}Load Fixtures
cd backend
php bin/console doctrine:fixtures:loadWarning: This purges the database before loading!
Append Fixtures (Don't Purge)
php bin/console doctrine:fixtures:load --appendUsing Console Commands
The backend includes custom commands for seeding:
PopulateKnowledgeCommand
php bin/console app:populate-knowledgePopulates knowledge items from a predefined source.
SeedPocDataCommand
php bin/console app:seed-poc-dataSeeds proof-of-concept data for testing.
Querying the Database
Using Doctrine ORM
// Get entity manager
$entityManager = $this->getDoctrine()->getManager();
// Get repository
$repository = $entityManager->getRepository(Knowledge::class);
// Find by ID
$knowledge = $repository->find(1);
// Find one by criteria
$knowledge = $repository->findOneBy(['title' => 'My Title']);
// Find all
$items = $repository->findAll();
// Find with criteria and sorting
$items = $repository->findBy(
['status' => 'published'],
['createdAt' => 'DESC'],
10, // limit
0 // offset
);
// Custom query with QueryBuilder
$qb = $repository->createQueryBuilder('k')
->where('k.title LIKE :search')
->andWhere('k.status = :status')
->setParameter('search', '%symfony%')
->setParameter('status', 'published')
->orderBy('k.createdAt', 'DESC')
->setMaxResults(10);
$results = $qb->getQuery()->getResult();Using DQL (Doctrine Query Language)
$dql = 'SELECT k FROM App\Entity\Knowledge k
WHERE k.title LIKE :search
ORDER BY k.createdAt DESC';
$query = $entityManager->createQuery($dql)
->setParameter('search', '%symfony%')
->setMaxResults(10);
$results = $query->getResult();Using Native SQL
$conn = $entityManager->getConnection();
$sql = 'SELECT * FROM knowledge WHERE title LIKE :search';
$stmt = $conn->prepare($sql);
$stmt->bindValue('search', '%symfony%');
$resultSet = $stmt->executeQuery();
$results = $resultSet->fetchAllAssociative();Direct SQLite Access
You can also query the database directly:
cd backend/var
sqlite3 data.db
# List tables
.tables
# Describe table
.schema knowledge
# Query
SELECT * FROM knowledge LIMIT 10;
# Exit
.quitBackup and Restore
Backup Database
Since SQLite is a single file, backup is simple:
# Copy the database file
cp backend/var/data.db backend/var/data.db.backup
# Or with timestamp
cp backend/var/data.db backend/var/data.db.$(date +%Y%m%d_%H%M%S)Restore Database
# Restore from backup
cp backend/var/data.db.backup backend/var/data.dbExport to SQL
cd backend/var
sqlite3 data.db .dump > backup.sqlImport from SQL
cd backend/var
sqlite3 data.db < backup.sqlPerformance Optimization
Indexes
The schema includes indexes on frequently queried columns:
notion_idcolumns (for sync operations)category_id(for joins)status(for filtering)user_id(for user queries)
Query Optimization
- Use Indexes: Ensure WHERE clauses use indexed columns
- Limit Results: Always use LIMIT for large datasets
- Lazy Loading: Doctrine lazy-loads relationships by default
- Eager Loading: Use JOIN FETCH for related entities you need
// Eager load category with knowledge items
$qb = $repository->createQueryBuilder('k')
->leftJoin('k.category', 'c')
->addSelect('c')
->where('k.status = :status')
->setParameter('status', 'published');- Pagination: Use Doctrine Paginator for large result sets
use Doctrine\ORM\Tools\Pagination\Paginator;
$query = $repository->createQueryBuilder('k')
->setFirstResult(0)
->setMaxResults(20)
->getQuery();
$paginator = new Paginator($query);Database Maintenance
SQLite doesn't require much maintenance, but you can:
Vacuum (Reclaim Space)
sqlite3 backend/var/data.db "VACUUM;"Analyze (Update Statistics)
sqlite3 backend/var/data.db "ANALYZE;"Troubleshooting
Database Locked Error
Problem: database is locked error
Solution:
- Close all connections to the database
- Check for long-running transactions
- Increase timeout in configuration
# config/packages/doctrine.yaml
doctrine:
dbal:
options:
timeout: 30Schema Out of Sync
Problem: Schema doesn't match entities
Solution:
# Check differences
php bin/console doctrine:schema:update --dump-sql
# Generate migration
php bin/console make:migration
# Run migration
php bin/console doctrine:migrations:migrateMigration Failed
Problem: Migration execution failed
Solution:
# Check migration status
php bin/console doctrine:migrations:status
# Mark migration as executed (if already applied manually)
php bin/console doctrine:migrations:version VersionXXXXXXXXXXXXXX --add
# Or rollback and retry
php bin/console doctrine:migrations:migrate prev
php bin/console doctrine:migrations:migrateDatabase Corruption
Problem: Database file is corrupted
Solution:
# Check integrity
sqlite3 backend/var/data.db "PRAGMA integrity_check;"
# If corrupted, restore from backup
cp backend/var/data.db.backup backend/var/data.db
# Or recreate from migrations
rm backend/var/data.db
php bin/console doctrine:database:create
php bin/console doctrine:migrations:migrate
php bin/console doctrine:fixtures:loadBest Practices
- Always Use Migrations: Never modify schema directly
- Review Generated Migrations: Check SQL before running
- Backup Before Migrations: Backup database before running migrations in production
- Use Transactions: Wrap multiple operations in transactions
- Validate Schema: Run
doctrine:schema:validateregularly - Use Fixtures for Testing: Create fixtures for consistent test data
- Index Frequently Queried Columns: Add indexes for performance
- Use Query Builder: Prefer QueryBuilder over raw SQL
- Lazy Load Relationships: Don't eager load unless needed
- Monitor Query Performance: Log slow queries in production
Production Considerations
Database Location
In production, consider storing the database outside the web root:
DATABASE_URL="sqlite:////var/lib/yappa-knowledge-hub/data.db"Permissions
Ensure proper file permissions:
chmod 664 /var/lib/yappa-knowledge-hub/data.db
chown www-data:www-data /var/lib/yappa-knowledge-hub/data.dbBackup Strategy
Implement automated backups:
# Cron job for daily backups
0 2 * * * cp /var/lib/yappa-knowledge-hub/data.db /backups/data.db.$(date +\%Y\%m\%d)Monitoring
Monitor database size and performance:
# Check database size
ls -lh backend/var/data.db
# Check table sizes
sqlite3 backend/var/data.db "SELECT name, SUM(pgsize) as size FROM dbstat GROUP BY name ORDER BY size DESC;"