# Technical Architecture

Deep dive into GEO Wiki Pro's technical architecture, including frontend, backend, storage, and security design.

## Architecture Overview

GEO Wiki Pro adopts a **frontend-backend separation** architecture, with the frontend as a Single Page Application (SPA) and the backend as a RESTful API service. The entire system design follows the **file-as-database** concept, eliminating the need for traditional databases.

```
┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │
│   Vue 3 SPA     │────▶│  Express.js API │
│   (Vite 5)      │     │  (Node.js 22)   │
│                 │     │                 │
└─────────────────┘     └────────┬────────┘
                                 │
                    ┌────────────┴────────────┐
                    │                         │
              ┌─────┴─────┐           ┌───────┴───────┐
              │ Markdown  │           │   Uploaded    │
              │   File    │           │  Media Files  │
              │  Storage  │           │    Storage    │
              └───────────┘           └───────────────┘
```

## Frontend Architecture

### Tech Stack

- **Framework**: Vue 3 (Composition API)
- **Build Tool**: Vite 5
- **Styling**: Tailwind CSS 3.4
- **State Management**: Pinia
- **Routing**: Vue Router 4
- **Search**: Fuse.js (client-side fuzzy search)

### Directory Structure

```
src/
  pages/          # 20 page components, route-mapped
  components/     # 24 reusable components
  stores/         # Pinia state stores: docsStore, uiStore, siteConfigStore
  api/            # API client modules
    client.js     # Exports apiClient object
    request.js    # request() (authenticated) + requestPublic() (unauthenticated)
    docs.js       # Document CRUD + search
    auth.js       # Login/logout/register
    llm.js        # AI features
    admin.js      # Admin panel APIs
  utils/          # Utility functions: i18n.js, markdown.js, icons.js, schema.js
  i18n/           # Translation files: zh.json, en.json, jp.json
```

### Core Modules

- **docsStore**: Manages document list, category, tag state and caching
- **uiStore**: Manages UI state (language, sidebar, theme, etc.)
- **siteConfigStore**: Manages site configuration (Hero section, logo, featured documents, etc.)

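
The caching role attributed to `docsStore` can be illustrated with a small standalone sketch. This is not the project's store: it deliberately avoids Pinia so it runs on its own, and `createDocsCache`, `fetchDocs`, `ttlMs`, and `now` are all hypothetical names chosen for illustration. It shows one plausible approach, assuming the store keeps the last loaded list for a fixed time and shares a single in-flight request between concurrent callers.

```javascript
// Hypothetical sketch of a docsStore-like cache in plain JavaScript (no Pinia).
// Assumptions: `fetchDocs` is any async function returning the document list,
// `ttlMs` is an illustrative freshness window, `now` is an injectable clock.
function createDocsCache(fetchDocs, ttlMs = 60_000, now = Date.now) {
  let docs = null;     // last successfully loaded list
  let loadedAt = 0;    // clock value at the time of that load
  let inflight = null; // shared promise while a load is running

  async function load(force = false) {
    const fresh = docs !== null && now() - loadedAt < ttlMs;
    if (fresh && !force) return docs;
    if (inflight) return inflight; // concurrent callers reuse the same request

    inflight = (async () => {
      try {
        docs = await fetchDocs();
        loadedAt = now();
        return docs;
      } finally {
        inflight = null; // allow a new request after success or failure
      }
    })();
    return inflight;
  }

  // Drop the cached list so the next load() fetches again.
  function invalidate() {
    docs = null;
  }

  return { load, invalidate };
}
```

In a real Pinia store the same three pieces of state would typically live in `state` or `ref`s, with `load` exposed as an action; the de-duplication of concurrent loads is a design choice, not something the source confirms.
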
## Backend Architecture

### Tech Stack

- **Runtime**: Node.js 22
- **Framework**: Express.js
- **Storage**: File system (Markdown + JSON)
- **Authentication**: JWT (jsonwebtoken)
- **Security**: Helmet (CSP), express-rate-limit

### Directory Structure

```
server/
  index.js              # Express entry, middleware chain (fixed order)
  db.js                 # File-based DB core (read/write/queue/cache)
  routes/
    docs.js             # Public: GET /docs, GET /docs/:slug
    auth.js             # POST /login, POST /logout, GET /me
    captcha.js          # Captcha generation/verification
    meta.js             # GET /categories, GET /tags
    geo.js              # llms.txt, sitemap.xml, manifest
    llm.js              # AI content analysis/generation
    media.js            # File upload/download
    config.js           # Site configuration
    versions.js         # Document version history
    admin/              # Protected admin routes
      index.js          # Mounts all admin sub-routes
      docs.js           # CRUD + reorder
      drafts.js         # Draft pipeline
      categories.js, tags.js, feedback.js
      geo.js, stats.js, crawler-block.js
      guestbook.js, users.js
  middleware/
    auth.js             # JWT verify, requireAuth, requireRole
    csrf.js             # CSRF protection (double-submit cookie)
    rateLimiter.js
    crawlerTracker.js
  utils/
    logger.js           # pino structured JSON logging
    promptGuard.js      # LLM prompt injection detection
    path.js             # Path traversal prevention
```

### Middleware Chain (fixed order)

```
Helmet → CORS → JSON parser → rateLimiter → crawlerTracker
  → URL dedup → logger → CSRF → CSRF cookie refresh
  → authRoutes → captchaRoutes → versionsRoutes → docsRoutes
  → metaRoutes → geoRoutes → llmRoutes → configRoutes → mediaRoutes
  → adminRoutes (requireAuth + requireRole + requirePasswordChange)
```

::: warning
Middleware registration order is critical. Auth routes (login/logout) must be registered before admin routes; otherwise the login endpoints will be intercepted by the auth middleware.
:::

## File Storage Design

### Storage Structure

```
data/
  docs/{lang}/*.md        # Markdown documents (zh/en/jp)
  drafts/{lang}/*.md      # Drafts (AI review pipeline)
  geo-wiki.json           # Category/tag/user/feedback configuration
  media-meta.json         # Uploader information
  history/{lang}/*.json   # Document version history
  crawler-visits.json     # AI crawler visit log
  csrf-secret.txt         # CSRF secret (auto-generated)
public/media/             # Uploaded media files
```

### Write Queue

All file writes go through `enqueueWrite()`, a serialized promise chain. Writes are batched with a 5-second debounce, and `invalidateCache()` is called after every write.

**Important**: Do not call `fs.writeFileSync` directly. Always use `saveDoc()`, `saveDb()`, or `enqueueWrite()`.

### Cache Mechanism

- Cache TTL: 60 seconds (`CACHE_TTL_MS`)
- Automatic invalidation on every write
- Manual invalidation support: `invalidateCache()`

## Security Architecture

### Authentication Flow

```
User Login → Verify Password → Generate JWT → Set httpOnly Cookie
                                                     ↓
API Request → Extract Cookie → Verify JWT → Check Role → Check Password Change Status
```

### CSRF Protection

Double-Submit Cookie HMAC mechanism:

1. The server generates a random secret and stores it in a file
2. The secret is written to both a cookie and a response header
3. The client sends the secret back in a header with each request
4. The server verifies the HMAC signature

### CSP Policy

Nonce-based Content Security Policy:

- A unique nonce is generated for each request
- Only scripts carrying the correct nonce are allowed to execute
- The nonce is injected into dynamically loaded CSS via a MutationObserver

## API Design

### Public API

All public endpoints are located under `/api/v1/` and require no authentication:

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/docs` | Document list (paginated, filterable) |
| GET | `/api/v1/docs/search` | Full-text search |
| GET | `/api/v1/docs/:slug` | Single document details |
| GET | `/api/v1/categories` | Category list |
| GET | `/api/v1/tags` | Tag list |
| GET | `/api/v1/config` | Public site configuration |
| GET | `/api/v1/llms.txt` | AI crawler data |
| GET | `/api/v1/geo/sitemap.xml` | XML sitemap |

### Admin API

Requires authentication and role authorization:

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/auth/login` | Login |
| POST | `/api/v1/auth/logout` | Logout |
| GET | `/api/v1/auth/me` | Current user info |
| CRUD | `/api/v1/admin/docs` | Document management |
| PUT | `/api/v1/admin/docs/reorder` | Document reordering |
| CRUD | `/api/v1/admin/categories` | Category management |
| CRUD | `/api/v1/admin/tags` | Tag management |
| POST | `/api/v1/media/upload` | File upload |
| POST | `/api/v1/admin/geo/rebuild` | Rebuild GEO files |

### Response Format

```json
// Success
{ "success": true, "data": { ... } }

// Paginated
{ "success": true, "data": [...], "pagination": { "page": 1, "limit": 20, "total": 42 } }

// Error
{ "success": false, "message": "Not found" }
```

## Deployment Architecture

### Docker Multi-stage Build

```
Stage 1: Build Frontend
  Node.js 22 → npm install → npm run build → dist/

Stage 2: Production Runtime
  Node.js 22 → Express.js API + Nginx static files
```

### Environment Variables

Key configuration items:

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | 3002 |
| `JWT_SECRET` | JWT secret (required) | - |
| `CORS_ORIGINS` | Allowed CORS origins | - |
| `RATE_LIMIT_*` | Rate limiting configuration | 300/min |
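
As a rough illustration of how these variables might be consumed at startup, the sketch below validates them once and fails fast when `JWT_SECRET` is missing. `loadConfig` is a hypothetical helper, not a function from the project, and treating `CORS_ORIGINS` as a comma-separated list is an assumption. The `RATE_LIMIT_*` variables are left out because their exact names are not given here.

```javascript
// Hypothetical startup helper; only the variable names and the 3002 default
// come from the table above. Everything else is an illustrative assumption.
function loadConfig(env = process.env) {
  const jwtSecret = env.JWT_SECRET;
  if (!jwtSecret) {
    // JWT_SECRET is marked as required, so refuse to start without it.
    throw new Error("JWT_SECRET is required");
  }

  const port = Number.parseInt(env.PORT ?? "3002", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  // Assumption: CORS_ORIGINS is a comma-separated list of origins.
  const corsOrigins = (env.CORS_ORIGINS ?? "")
    .split(",")
    .map((origin) => origin.trim())
    .filter(Boolean);

  return { port, jwtSecret, corsOrigins };
}
```

Passing the environment in as a parameter keeps the helper easy to exercise in tests, for example `loadConfig({ JWT_SECRET: "dev-only", PORT: "8080" })`.
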