Completes task 15.6: Server health dashboard

New admin dashboard for real-time server health, with four main parts:

1. Service status: parallel health checks for all 7 services (PG, STDB, Caddy, Authentik, LiteLLM, Whisper, LiveKit) with latency measurement and status reporting (up/down/degraded).
2. System metrics: CPU load via /proc/loadavg, memory via /proc/meminfo, disk via statvfs, uptime via /proc/uptime. Shown with progress bars and color-coded thresholds.
3. PG statistics: active connections, max connections, database size, and active queries.
4. Log access: filterable view of logs from all services. Uses journalctl for systemd services and docker logs for containers. Configurable number of lines per service.

Backend: health.rs with tokio::join! for parallel checks. Frontend: /admin/health with auto-polling every 10 seconds. The backup check reports ok/stale/missing (no backup is set up yet).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
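The CPU figure described above is derived from the 1-minute load average rather than from sampling /proc/stat. A minimal std-only sketch of that approximation (the function names are illustrative, not taken from the repository):

```rust
/// Parse the first three fields of a /proc/loadavg line.
fn parse_load_avg(line: &str) -> [f32; 3] {
    let mut it = line.split_whitespace().filter_map(|s| s.parse::<f32>().ok());
    [
        it.next().unwrap_or(0.0),
        it.next().unwrap_or(0.0),
        it.next().unwrap_or(0.0),
    ]
}

/// Approximate CPU usage as 1-minute load divided by core count, capped at 100%.
fn approx_cpu_percent(load_1m: f32, cores: usize) -> f32 {
    ((load_1m / cores as f32) * 100.0).min(100.0)
}

fn main() {
    // Sample /proc/loadavg content: three load figures, then runnable/total, then last PID.
    let load = parse_load_avg("0.42 0.30 0.25 1/123 4567");
    println!("{:.1}%", approx_cpu_percent(load[0], 4)); // prints "10.5%"
}
```

A 1-minute load of 0.42 on 4 cores thus reads as roughly 10% CPU; the trade-off is that short spikes between polls are invisible, which matches the "snapshot, not average" caveat in health.rs.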
This commit is contained in:
parent 9d4902ec48
commit 56b7df8bf8

6 changed files with 945 additions and 2 deletions
@@ -107,6 +107,27 @@ Real-time overview of system state.

- **Log access:** Latest errors and warnings from all services, filterable
- **Backup status:** Last successful backup per type, next scheduled run

#### Implemented (task 15.6)

- **Backend:** `maskinrommet/src/health.rs` — parallel health checks for all
  services, system metrics via `/proc`, PG statistics, backup check, log access
- **API endpoints:**
  - `GET /admin/health` — complete health dashboard (service status, CPU/memory/disk,
    PG stats, backup status)
  - `GET /admin/health/logs?service=&lines=` — log access per service or for all
- **Frontend:** `/admin/health` — dashboard with service cards (up/down/degraded with
  latency), system metrics with progress bars, PG connections and DB size,
  backup status, and a filterable log view
- **Service checks:** PG (SQL ping), STDB (no-op call), Caddy (admin API),
  Authentik (health endpoint), LiteLLM/Whisper/LiveKit (HTTP health). All run
  in parallel with a 5 s timeout
- **Metrics:** CPU load via `/proc/loadavg`, memory via `/proc/meminfo`,
  disk via `statvfs`, uptime via `/proc/uptime`
- **Logs:** systemd journal for native services (maskinrommet, caddy),
  `docker logs` for containers. Filterable per service, configurable number of lines
- **Backup:** checks the standard backup directories for PG dumps and CAS files.
  Reports status as ok/stale/missing based on file age

### 5. User and access overview

- **Active users:** Who is logged in now, last activity
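The ok/stale/missing classification above can be sketched as a pure function of the newest backup file's age. This is a sketch under assumptions: the excerpt cuts off before the body of `check_backups`, so the threshold handling and the treatment of clock anomalies are illustrative, not the repository's actual logic.

```rust
use std::time::{Duration, SystemTime};

/// Classify a backup by the age of its newest file.
/// No file at all is "missing"; fresher than `stale_after` is "ok";
/// anything older (or a clock anomaly) is "stale".
/// The threshold value is chosen by the caller — an assumption here.
fn backup_status(last_modified: Option<SystemTime>, stale_after: Duration) -> &'static str {
    match last_modified {
        None => "missing",
        Some(t) => match t.elapsed() {
            Ok(age) if age <= stale_after => "ok",
            // Err means the mtime is in the future; treat that as stale too.
            _ => "stale",
        },
    }
}

fn main() {
    let day = Duration::from_secs(24 * 3600);
    let fresh = SystemTime::now() - Duration::from_secs(3600);
    let old = SystemTime::now() - Duration::from_secs(3 * 24 * 3600);
    println!("{}", backup_status(Some(fresh), day)); // ok
    println!("{}", backup_status(Some(old), day));   // stale
    println!("{}", backup_status(None, day));        // missing
}
```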
@@ -945,3 +945,96 @@ export function deleteAiRouting(
): Promise<{ success: boolean }> {
  return post(accessToken, '/admin/ai/delete_routing', { job_type: jobType });
}

// =============================================================================
// Server health dashboard (task 15.6)
// =============================================================================

export interface ServiceStatus {
  name: string;
  status: 'up' | 'down' | 'degraded';
  latency_ms: number | null;
  details: string | null;
}

export interface SystemMetrics {
  cpu_usage_percent: number;
  cpu_cores: number;
  load_avg: [number, number, number];
  memory_total_bytes: number;
  memory_used_bytes: number;
  memory_available_bytes: number;
  memory_usage_percent: number;
  disk: {
    mount_point: string;
    total_bytes: number;
    used_bytes: number;
    available_bytes: number;
    usage_percent: number;
    alert_level: string | null;
  };
  uptime_seconds: number;
}

export interface BackupInfo {
  backup_type: string;
  last_success: string | null;
  path: string | null;
  status: 'ok' | 'missing' | 'stale';
}

export interface PgStats {
  active_connections: number;
  max_connections: number;
  database_size_bytes: number;
  active_queries: number;
}

export interface HealthDashboard {
  services: ServiceStatus[];
  metrics: SystemMetrics;
  backups: BackupInfo[];
  pg_stats: PgStats;
}

export interface LogEntry {
  timestamp: string;
  service: string;
  level: string;
  message: string;
}

export interface LogsResponse {
  entries: LogEntry[];
}

/** Fetch the complete server health dashboard. */
export async function fetchHealthDashboard(accessToken: string): Promise<HealthDashboard> {
  const res = await fetch(`${BASE_URL}/admin/health`, {
    headers: { Authorization: `Bearer ${accessToken}` }
  });
  if (!res.ok) {
    const body = await res.text();
    throw new Error(`health dashboard failed (${res.status}): ${body}`);
  }
  return res.json();
}

/** Fetch logs for one service (or all of them). */
export async function fetchHealthLogs(
  accessToken: string,
  params: { service?: string; lines?: number } = {}
): Promise<LogsResponse> {
  const qs = new URLSearchParams();
  if (params.service) qs.set('service', params.service);
  if (params.lines) qs.set('lines', String(params.lines));
  const query = qs.toString();
  const res = await fetch(`${BASE_URL}/admin/health/logs${query ? `?${query}` : ''}`, {
    headers: { Authorization: `Bearer ${accessToken}` }
  });
  if (!res.ok) {
    const body = await res.text();
    throw new Error(`health logs failed (${res.status}): ${body}`);
  }
  return res.json();
}
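For the log access above, the commit message says the backend shells out to journalctl for systemd services and to `docker logs` for containers. The corresponding backend code is not in this excerpt, so here is a hedged sketch of how the argument lists might be assembled; the systemd/container grouping and the exact flag set are assumptions, not taken from the repository.

```rust
/// Build the argument list for fetching logs for one service:
/// journalctl for services that run under systemd, `docker logs` for
/// containerized ones. Grouping and flags are illustrative assumptions.
fn log_command(service: &str, lines: usize) -> Vec<String> {
    let systemd = ["maskinrommet", "caddy"];
    if systemd.contains(&service) {
        vec![
            "journalctl".into(),
            "-u".into(), service.into(),            // unit name
            "-n".into(), lines.to_string(),         // number of lines
            "--no-pager".into(),
            "-o".into(), "short-iso".into(),        // ISO timestamps, easy to parse
        ]
    } else {
        vec![
            "docker".into(), "logs".into(),
            "--tail".into(), lines.to_string(),
            service.into(),
        ]
    }
}

fn main() {
    println!("{:?}", log_command("caddy", 50));
    println!("{:?}", log_command("postgres", 100));
}
```

The vector would then feed `std::process::Command`, with stdout parsed line by line into `LogEntry` values.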
308 additions: frontend/src/routes/admin/health/+page.svelte (new file)

@@ -0,0 +1,308 @@
<script lang="ts">
	/**
	 * Admin — server health dashboard (task 15.6)
	 *
	 * Shows service status, system metrics (CPU, memory, disk),
	 * backup status and log access for all services in the stack.
	 */
	import { page } from '$app/stores';
	import {
		fetchHealthDashboard,
		fetchHealthLogs,
		type HealthDashboard,
		type LogsResponse
	} from '$lib/api';

	const session = $derived($page.data.session as Record<string, unknown> | undefined);
	const accessToken = $derived(session?.accessToken as string | undefined);

	let dashboard = $state<HealthDashboard | null>(null);
	let logs = $state<LogsResponse | null>(null);
	let error = $state<string | null>(null);
	let logsError = $state<string | null>(null);

	// Log filter
	let selectedService = $state<string>('');
	let logLines = $state(50);
	let showLogs = $state(false);

	// Poll the dashboard every 10 seconds
	$effect(() => {
		if (!accessToken) return;
		loadDashboard();
		const interval = setInterval(loadDashboard, 10_000);
		return () => clearInterval(interval);
	});

	async function loadDashboard() {
		if (!accessToken) return;
		try {
			dashboard = await fetchHealthDashboard(accessToken);
			error = null;
		} catch (e) {
			error = String(e);
		}
	}

	async function loadLogs() {
		if (!accessToken) return;
		logsError = null;
		try {
			logs = await fetchHealthLogs(accessToken, {
				service: selectedService || undefined,
				lines: logLines
			});
		} catch (e) {
			logsError = String(e);
		}
	}

	function statusColor(status: string): string {
		if (status === 'up' || status === 'ok') return 'text-green-400';
		if (status === 'down' || status === 'missing') return 'text-red-400';
		if (status === 'degraded' || status === 'stale') return 'text-yellow-400';
		return 'text-neutral-400';
	}

	function statusDot(status: string): string {
		if (status === 'up') return 'bg-green-400';
		if (status === 'down') return 'bg-red-400';
		if (status === 'degraded') return 'bg-yellow-400';
		return 'bg-neutral-400';
	}

	function formatBytes(bytes: number): string {
		if (bytes === 0) return '0 B';
		const units = ['B', 'KB', 'MB', 'GB', 'TB'];
		const i = Math.floor(Math.log(bytes) / Math.log(1024));
		return `${(bytes / Math.pow(1024, i)).toFixed(1)} ${units[i]}`;
	}

	function formatUptime(seconds: number): string {
		const days = Math.floor(seconds / 86400);
		const hours = Math.floor((seconds % 86400) / 3600);
		const mins = Math.floor((seconds % 3600) / 60);
		if (days > 0) return `${days}d ${hours}t ${mins}m`;
		if (hours > 0) return `${hours}t ${mins}m`;
		return `${mins}m`;
	}

	function logLevelColor(level: string): string {
		if (level === 'error') return 'text-red-400';
		if (level === 'warn') return 'text-yellow-400';
		if (level === 'info') return 'text-blue-400';
		return 'text-neutral-500';
	}

	const allServices = ['maskinrommet', 'caddy', 'postgres', 'spacetimedb', 'authentik', 'litellm', 'whisper', 'livekit'];
</script>

<div class="min-h-screen bg-neutral-950 text-neutral-100 p-4 sm:p-8">
	<div class="max-w-6xl mx-auto space-y-6">
		<div class="flex items-center justify-between">
			<h1 class="text-2xl font-bold">Serverhelse</h1>
			<a href="/admin" class="text-sm text-neutral-400 hover:text-white">Tilbake til admin</a>
		</div>

		{#if error}
			<div class="bg-red-900/30 border border-red-700 rounded-lg p-4 text-red-300 text-sm">
				{error}
			</div>
		{/if}

		{#if !dashboard}
			<p class="text-neutral-500">Laster...</p>
		{:else}
			<!-- Service status -->
			<section>
				<h2 class="text-lg font-semibold mb-3">Tjenester</h2>
				<div class="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4 gap-3">
					{#each dashboard.services as svc}
						<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
							<div class="flex items-center gap-2 mb-1">
								<span class="w-2.5 h-2.5 rounded-full {statusDot(svc.status)}"></span>
								<span class="font-medium">{svc.name}</span>
							</div>
							<div class="text-sm {statusColor(svc.status)} capitalize">{svc.status}</div>
							{#if svc.latency_ms !== null}
								<div class="text-xs text-neutral-500 mt-1">{svc.latency_ms} ms</div>
							{/if}
							{#if svc.details}
								<div class="text-xs text-neutral-600 mt-1 truncate" title={svc.details}>{svc.details}</div>
							{/if}
						</div>
					{/each}
				</div>
			</section>

			<!-- System metrics -->
			<section>
				<h2 class="text-lg font-semibold mb-3">System</h2>
				<div class="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-4 gap-3">
					<!-- CPU -->
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400 mb-1">CPU</div>
						<div class="text-2xl font-mono">{dashboard.metrics.cpu_usage_percent.toFixed(0)}%</div>
						<div class="text-xs text-neutral-500 mt-1">
							{dashboard.metrics.cpu_cores} kjerner | Load: {dashboard.metrics.load_avg.map(v => v.toFixed(2)).join(', ')}
						</div>
					</div>

					<!-- Memory -->
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400 mb-1">Minne</div>
						<div class="text-2xl font-mono">{dashboard.metrics.memory_usage_percent.toFixed(0)}%</div>
						<div class="text-xs text-neutral-500 mt-1">
							{formatBytes(dashboard.metrics.memory_used_bytes)} / {formatBytes(dashboard.metrics.memory_total_bytes)}
						</div>
						<!-- Bar -->
						<div class="mt-2 h-1.5 bg-neutral-800 rounded-full overflow-hidden">
							<div
								class="h-full rounded-full transition-all {dashboard.metrics.memory_usage_percent > 90 ? 'bg-red-500' : dashboard.metrics.memory_usage_percent > 70 ? 'bg-yellow-500' : 'bg-green-500'}"
								style="width: {Math.min(dashboard.metrics.memory_usage_percent, 100)}%"
							></div>
						</div>
					</div>

					<!-- Disk -->
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400 mb-1">Disk</div>
						<div class="text-2xl font-mono">{dashboard.metrics.disk.usage_percent.toFixed(1)}%</div>
						<div class="text-xs text-neutral-500 mt-1">
							{formatBytes(dashboard.metrics.disk.used_bytes)} / {formatBytes(dashboard.metrics.disk.total_bytes)}
						</div>
						{#if dashboard.metrics.disk.alert_level}
							<div class="text-xs text-red-400 mt-1 uppercase">{dashboard.metrics.disk.alert_level}</div>
						{/if}
						<div class="mt-2 h-1.5 bg-neutral-800 rounded-full overflow-hidden">
							<div
								class="h-full rounded-full transition-all {dashboard.metrics.disk.usage_percent > 90 ? 'bg-red-500' : dashboard.metrics.disk.usage_percent > 80 ? 'bg-yellow-500' : 'bg-green-500'}"
								style="width: {Math.min(dashboard.metrics.disk.usage_percent, 100)}%"
							></div>
						</div>
					</div>

					<!-- Uptime -->
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400 mb-1">Oppetid</div>
						<div class="text-2xl font-mono">{formatUptime(dashboard.metrics.uptime_seconds)}</div>
						<div class="text-xs text-neutral-500 mt-1">Siden siste reboot</div>
					</div>
				</div>
			</section>

			<!-- PG statistics -->
			<section>
				<h2 class="text-lg font-semibold mb-3">PostgreSQL</h2>
				<div class="grid grid-cols-2 sm:grid-cols-4 gap-3">
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400">Tilkoblinger</div>
						<div class="text-xl font-mono">{dashboard.pg_stats.active_connections} / {dashboard.pg_stats.max_connections}</div>
					</div>
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400">DB-størrelse</div>
						<div class="text-xl font-mono">{formatBytes(dashboard.pg_stats.database_size_bytes)}</div>
					</div>
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400">Aktive spørringer</div>
						<div class="text-xl font-mono">{dashboard.pg_stats.active_queries}</div>
					</div>
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
						<div class="text-sm text-neutral-400">Frie tilkoblinger</div>
						<div class="text-xl font-mono">{dashboard.pg_stats.max_connections - dashboard.pg_stats.active_connections}</div>
					</div>
				</div>
			</section>

			<!-- Backup status -->
			<section>
				<h2 class="text-lg font-semibold mb-3">Backup</h2>
				<div class="grid grid-cols-1 sm:grid-cols-2 gap-3">
					{#each dashboard.backups as backup}
						<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4">
							<div class="flex items-center gap-2">
								<span class="w-2 h-2 rounded-full {statusDot(backup.status === 'ok' ? 'up' : backup.status === 'stale' ? 'degraded' : 'down')}"></span>
								<span class="font-medium">{backup.backup_type}</span>
							</div>
							<div class="text-sm {statusColor(backup.status)} mt-1 capitalize">{backup.status}</div>
							{#if backup.last_success}
								<div class="text-xs text-neutral-500 mt-1">Siste: {new Date(backup.last_success).toLocaleString('nb-NO')}</div>
							{:else}
								<div class="text-xs text-neutral-500 mt-1">Ingen backup funnet</div>
							{/if}
						</div>
					{/each}
				</div>
			</section>

			<!-- Log access -->
			<section>
				<div class="flex items-center justify-between mb-3">
					<h2 class="text-lg font-semibold">Logger</h2>
					<button
						class="text-sm text-neutral-400 hover:text-white"
						onclick={() => { showLogs = !showLogs; if (showLogs) loadLogs(); }}
					>
						{showLogs ? 'Skjul' : 'Vis logger'}
					</button>
				</div>

				{#if showLogs}
					<div class="bg-neutral-900 border border-neutral-800 rounded-lg p-4 space-y-3">
						<!-- Filters -->
						<div class="flex flex-wrap gap-3 items-center">
							<select
								bind:value={selectedService}
								class="bg-neutral-800 border border-neutral-700 rounded px-3 py-1.5 text-sm"
								onchange={() => loadLogs()}
							>
								<option value="">Alle tjenester</option>
								{#each allServices as svc}
									<option value={svc}>{svc}</option>
								{/each}
							</select>
							<select
								bind:value={logLines}
								class="bg-neutral-800 border border-neutral-700 rounded px-3 py-1.5 text-sm"
								onchange={() => loadLogs()}
							>
								<option value={25}>25 linjer</option>
								<option value={50}>50 linjer</option>
								<option value={100}>100 linjer</option>
								<option value={200}>200 linjer</option>
							</select>
							<button
								class="bg-neutral-800 hover:bg-neutral-700 px-3 py-1.5 rounded text-sm"
								onclick={() => loadLogs()}
							>
								Oppdater
							</button>
						</div>

						{#if logsError}
							<div class="text-red-400 text-sm">{logsError}</div>
						{/if}

						{#if logs}
							<div class="max-h-96 overflow-y-auto font-mono text-xs space-y-0.5">
								{#each logs.entries as entry}
									<div class="flex gap-2 py-0.5 border-b border-neutral-800/50">
										<span class="text-neutral-600 shrink-0 w-40 truncate">{entry.timestamp}</span>
										<span class="text-neutral-500 shrink-0 w-24">{entry.service}</span>
										<span class="shrink-0 w-12 {logLevelColor(entry.level)}">{entry.level}</span>
										<span class="text-neutral-300 break-all">{entry.message}</span>
									</div>
								{/each}
								{#if logs.entries.length === 0}
									<div class="text-neutral-500 py-4 text-center">Ingen logger funnet</div>
								{/if}
							</div>
						{:else}
							<div class="text-neutral-500 text-sm">Laster logger...</div>
						{/if}
					</div>
				{/if}
			</section>
		{/if}
	</div>
</div>
518 additions: maskinrommet/src/health.rs (new file)

@@ -0,0 +1,518 @@
// Serverhelse-dashboard — tjeneste-status, metrikker, backup-status, logg-tilgang.
|
||||||
|
//
|
||||||
|
// Sjekker alle tjenester i stacken (PG, STDB, Caddy, Authentik, LiteLLM,
|
||||||
|
// Whisper, LiveKit) og samler system-metrikker (CPU, minne, disk).
|
||||||
|
//
|
||||||
|
// Ref: docs/concepts/adminpanelet.md § 4 "Serverhelse", oppgave 15.6
|
||||||
|
|
||||||
|
use axum::{extract::{Query, State}, http::StatusCode, Json};
|
||||||
|
use serde::{Deserialize, Serialize};
|
||||||
|
use sqlx::PgPool;
|
||||||
|
|
||||||
|
use crate::auth::AuthUser;
|
||||||
|
use crate::AppState;
|
||||||
|
|
||||||
|
// =============================================================================
|
||||||
|
// Typer
|
||||||
|
// =============================================================================
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct ServiceStatus {
|
||||||
|
pub name: String,
|
||||||
|
pub status: String, // "up", "down", "degraded"
|
||||||
|
pub latency_ms: Option<u64>,
|
||||||
|
pub details: Option<String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct SystemMetrics {
|
||||||
|
pub cpu_usage_percent: f32,
|
||||||
|
pub cpu_cores: usize,
|
||||||
|
pub load_avg: [f32; 3],
|
||||||
|
pub memory_total_bytes: u64,
|
||||||
|
pub memory_used_bytes: u64,
|
||||||
|
pub memory_available_bytes: u64,
|
||||||
|
pub memory_usage_percent: f32,
|
||||||
|
pub disk: crate::resources::DiskStatus,
|
||||||
|
pub uptime_seconds: u64,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct BackupInfo {
|
||||||
|
pub backup_type: String,
|
||||||
|
pub last_success: Option<String>,
|
||||||
|
pub path: Option<String>,
|
||||||
|
pub status: String, // "ok", "missing", "stale"
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct LogEntry {
|
||||||
|
pub timestamp: String,
|
||||||
|
pub service: String,
|
||||||
|
pub level: String,
|
||||||
|
pub message: String,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct HealthDashboard {
|
||||||
|
pub services: Vec<ServiceStatus>,
|
||||||
|
pub metrics: SystemMetrics,
|
||||||
|
pub backups: Vec<BackupInfo>,
|
||||||
|
pub pg_stats: PgStats,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct PgStats {
|
||||||
|
pub active_connections: i64,
|
||||||
|
pub max_connections: i64,
|
||||||
|
pub database_size_bytes: i64,
|
||||||
|
pub active_queries: i64,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Deserialize)]
|
||||||
|
pub struct LogsQuery {
|
||||||
|
pub service: Option<String>,
|
||||||
|
pub lines: Option<usize>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Serialize)]
|
||||||
|
pub struct LogsResponse {
|
||||||
|
pub entries: Vec<LogEntry>,
|
||||||
|
}
|
||||||
|
|
||||||
|
// =============================================================================
|
||||||
|
// Tjeneste-sjekker
|
||||||
|
// =============================================================================
|
||||||
|
|
||||||
|
async fn check_pg(db: &PgPool) -> ServiceStatus {
|
||||||
|
let start = std::time::Instant::now();
|
||||||
|
match sqlx::query_scalar::<_, i32>("SELECT 1").fetch_one(db).await {
|
||||||
|
Ok(_) => ServiceStatus {
|
||||||
|
name: "PostgreSQL".to_string(),
|
||||||
|
status: "up".to_string(),
|
||||||
|
latency_ms: Some(start.elapsed().as_millis() as u64),
|
||||||
|
details: None,
|
||||||
|
},
|
||||||
|
Err(e) => ServiceStatus {
|
||||||
|
name: "PostgreSQL".to_string(),
|
||||||
|
status: "down".to_string(),
|
||||||
|
latency_ms: None,
|
||||||
|
details: Some(format!("{e}")),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_stdb(stdb: &crate::stdb::StdbClient) -> ServiceStatus {
|
||||||
|
let start = std::time::Instant::now();
|
||||||
|
match stdb.delete_node("__healthcheck_nonexistent__").await {
|
||||||
|
Ok(()) => ServiceStatus {
|
||||||
|
name: "SpacetimeDB".to_string(),
|
||||||
|
status: "up".to_string(),
|
||||||
|
latency_ms: Some(start.elapsed().as_millis() as u64),
|
||||||
|
details: None,
|
||||||
|
},
|
||||||
|
Err(e) => ServiceStatus {
|
||||||
|
name: "SpacetimeDB".to_string(),
|
||||||
|
status: "down".to_string(),
|
||||||
|
latency_ms: None,
|
||||||
|
details: Some(format!("{e}")),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Sjekk en HTTP-tjeneste med timeout.
|
||||||
|
async fn check_http_service(name: &str, url: &str) -> ServiceStatus {
|
||||||
|
let client = reqwest::Client::builder()
|
||||||
|
.timeout(std::time::Duration::from_secs(5))
|
||||||
|
.build()
|
||||||
|
.unwrap();
|
||||||
|
|
||||||
|
let start = std::time::Instant::now();
|
||||||
|
match client.get(url).send().await {
|
||||||
|
Ok(resp) => {
|
||||||
|
let latency = start.elapsed().as_millis() as u64;
|
||||||
|
let status_code = resp.status();
|
||||||
|
if status_code.is_success() || status_code.as_u16() == 401 || status_code.as_u16() == 403 {
|
||||||
|
// 401/403 betyr at tjenesten kjører, bare auth mangler
|
||||||
|
ServiceStatus {
|
||||||
|
name: name.to_string(),
|
||||||
|
status: "up".to_string(),
|
||||||
|
latency_ms: Some(latency),
|
||||||
|
details: None,
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
ServiceStatus {
|
||||||
|
name: name.to_string(),
|
||||||
|
status: "degraded".to_string(),
|
||||||
|
latency_ms: Some(latency),
|
||||||
|
details: Some(format!("HTTP {}", status_code)),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
Err(e) => ServiceStatus {
|
||||||
|
name: name.to_string(),
|
||||||
|
status: "down".to_string(),
|
||||||
|
latency_ms: None,
|
||||||
|
details: Some(format!("{e}")),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_caddy() -> ServiceStatus {
|
||||||
|
// Caddy kjører lokalt, sjekk admin-API
|
||||||
|
check_http_service("Caddy", "http://localhost:2019/config/").await
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_authentik() -> ServiceStatus {
|
||||||
|
// Authentik via Caddy
|
||||||
|
check_http_service("Authentik", "https://auth.sidelinja.org/-/health/ready/").await
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_litellm() -> ServiceStatus {
|
||||||
|
let url = std::env::var("AI_GATEWAY_URL")
|
||||||
|
.unwrap_or_else(|_| "http://localhost:4000".to_string());
|
||||||
|
check_http_service("LiteLLM", &format!("{url}/health")).await
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_whisper() -> ServiceStatus {
|
||||||
|
let url = std::env::var("WHISPER_URL")
|
||||||
|
.unwrap_or_else(|_| "http://localhost:8000".to_string());
|
||||||
|
check_http_service("Whisper", &format!("{url}/health")).await
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn check_livekit() -> ServiceStatus {
|
||||||
|
let url = std::env::var("LIVEKIT_URL")
|
||||||
|
.unwrap_or_else(|_| "http://localhost:7880".to_string());
|
||||||
|
check_http_service("LiveKit", &url).await
|
||||||
|
}
|
||||||
|
|
||||||
|
// =============================================================================
|
||||||
|
// System-metrikker
|
||||||
|
// =============================================================================
|
||||||
|
|
||||||
|
fn read_cpu_usage() -> f32 {
|
||||||
|
// Les /proc/stat for CPU-bruk (snapshot, ikke gjennomsnitt).
|
||||||
|
// For enkel implementering bruker vi load average i stedet.
|
||||||
|
// CPU-prosent beregnes fra load_avg[0] / antall kjerner.
|
||||||
|
let cores = num_cpus();
|
||||||
|
let load = read_load_avg();
|
||||||
|
// Tilnærming: load / cores * 100, capped til 100
|
||||||
|
((load[0] / cores as f32) * 100.0).min(100.0)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn num_cpus() -> usize {
|
||||||
|
std::thread::available_parallelism()
|
||||||
|
.map(|n| n.get())
|
||||||
|
.unwrap_or(1)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn read_load_avg() -> [f32; 3] {
|
||||||
|
let content = std::fs::read_to_string("/proc/loadavg").unwrap_or_default();
|
||||||
|
let parts: Vec<f32> = content
|
||||||
|
.split_whitespace()
|
||||||
|
.take(3)
|
||||||
|
.filter_map(|s| s.parse().ok())
|
||||||
|
.collect();
|
||||||
|
[
|
||||||
|
parts.first().copied().unwrap_or(0.0),
|
||||||
|
parts.get(1).copied().unwrap_or(0.0),
|
||||||
|
parts.get(2).copied().unwrap_or(0.0),
|
||||||
|
]
|
||||||
|
}
|
||||||
|
|
||||||
|
fn read_memory_info() -> (u64, u64, u64) {
|
||||||
|
// Les /proc/meminfo
|
||||||
|
let content = std::fs::read_to_string("/proc/meminfo").unwrap_or_default();
|
||||||
|
let mut total: u64 = 0;
|
||||||
|
let mut available: u64 = 0;
|
||||||
|
|
||||||
|
for line in content.lines() {
|
||||||
|
if let Some(val) = line.strip_prefix("MemTotal:") {
|
||||||
|
total = parse_meminfo_kb(val) * 1024;
|
||||||
|
} else if let Some(val) = line.strip_prefix("MemAvailable:") {
|
||||||
|
available = parse_meminfo_kb(val) * 1024;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
let used = total.saturating_sub(available);
|
||||||
|
(total, used, available)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn parse_meminfo_kb(s: &str) -> u64 {
|
||||||
|
s.trim().split_whitespace().next()
|
||||||
|
.and_then(|v| v.parse().ok())
|
||||||
|
.unwrap_or(0)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn read_uptime() -> u64 {
|
||||||
|
let content = std::fs::read_to_string("/proc/uptime").unwrap_or_default();
|
||||||
|
content.split_whitespace().next()
|
||||||
|
.and_then(|v| v.parse::<f64>().ok())
|
||||||
|
.map(|v| v as u64)
|
||||||
|
.unwrap_or(0)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn collect_metrics() -> SystemMetrics {
|
||||||
|
let load = read_load_avg();
|
||||||
|
let cores = num_cpus();
|
||||||
|
let cpu = read_cpu_usage();
|
||||||
|
let (mem_total, mem_used, mem_available) = read_memory_info();
|
||||||
|
let mem_percent = if mem_total > 0 {
|
||||||
|
(mem_used as f64 / mem_total as f64 * 100.0) as f32
|
||||||
|
} else {
|
||||||
|
0.0
|
||||||
|
};
|
||||||
|
|
||||||
|
let cas_root = std::env::var("CAS_ROOT")
|
||||||
|
.unwrap_or_else(|_| "/srv/synops/media/cas".to_string());
|
||||||
|
let disk = crate::resources::check_disk_usage(&cas_root)
|
||||||
|
.unwrap_or_else(|_| crate::resources::check_disk_usage("/").unwrap_or(
|
||||||
|
crate::resources::DiskStatus {
|
||||||
|
mount_point: "/".to_string(),
|
||||||
|
total_bytes: 0,
|
||||||
|
used_bytes: 0,
|
||||||
|
available_bytes: 0,
|
||||||
|
usage_percent: 0.0,
|
||||||
|
alert_level: None,
|
||||||
|
}
|
||||||
|
));
|
||||||
|
|
||||||
|
SystemMetrics {
|
||||||
|
cpu_usage_percent: cpu,
|
||||||
|
cpu_cores: cores,
|
||||||
|
load_avg: load,
|
||||||
|
memory_total_bytes: mem_total,
|
||||||
|
memory_used_bytes: mem_used,
|
||||||
|
memory_available_bytes: mem_available,
|
||||||
|
memory_usage_percent: mem_percent,
|
||||||
|
disk,
|
||||||
|
uptime_seconds: read_uptime(),
|
||||||
|
}
|
||||||
|
}

// =============================================================================
// PG statistics
// =============================================================================

async fn collect_pg_stats(db: &PgPool) -> PgStats {
    let active = sqlx::query_scalar::<_, i64>(
        "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'",
    )
    .fetch_one(db)
    .await
    .unwrap_or(0);

    let max_conn = sqlx::query_scalar::<_, i64>(
        "SELECT setting::bigint FROM pg_settings WHERE name = 'max_connections'",
    )
    .fetch_one(db)
    .await
    .unwrap_or(100);

    let db_size = sqlx::query_scalar::<_, i64>(
        "SELECT pg_database_size(current_database())",
    )
    .fetch_one(db)
    .await
    .unwrap_or(0);

    let queries = sqlx::query_scalar::<_, i64>(
        "SELECT count(*) FROM pg_stat_activity WHERE state = 'active' AND query NOT LIKE '%pg_stat_activity%'",
    )
    .fetch_one(db)
    .await
    .unwrap_or(0);

    PgStats {
        active_connections: active,
        max_connections: max_conn,
        database_size_bytes: db_size,
        active_queries: queries,
    }
}

// =============================================================================
// Backup status
// =============================================================================

fn check_backups() -> Vec<BackupInfo> {
    // Check whether PG dumps exist in the standard backup directories.
    let backup_paths = [
        "/srv/synops/backups",
        "/srv/synops/data/backups",
        "/var/backups/synops",
    ];

    let mut backups = Vec::new();

    // PG dump
    let mut pg_backup = BackupInfo {
        backup_type: "PostgreSQL dump".to_string(),
        last_success: None,
        path: None,
        status: "missing".to_string(),
    };

    for dir in &backup_paths {
        if let Ok(entries) = std::fs::read_dir(dir) {
            for entry in entries.flatten() {
                let name = entry.file_name().to_string_lossy().to_string();
                if name.contains("pg") || name.ends_with(".sql") || name.ends_with(".dump") {
                    if let Ok(meta) = entry.metadata() {
                        if let Ok(modified) = meta.modified() {
                            let age = modified.elapsed().unwrap_or_default();
                            let ts = chrono::DateTime::<chrono::Utc>::from(modified);
                            pg_backup.last_success = Some(ts.to_rfc3339());
                            pg_backup.path = Some(entry.path().to_string_lossy().to_string());
                            // Fresher than 24 hours counts as ok; anything
                            // older is stale. "missing" is only used when no
                            // dump file is found at all.
                            pg_backup.status = if age.as_secs() < 86_400 {
                                "ok".to_string()
                            } else {
                                "stale".to_string()
                            };
                        }
                    }
                }
            }
        }
    }

    backups.push(pg_backup);

    // CAS (media) backup: not set up yet, so always reported as missing.
    backups.push(BackupInfo {
        backup_type: "CAS media".to_string(),
        last_success: None,
        path: None,
        status: "missing".to_string(),
    });

    backups
}
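The freshness thresholds used above can be sketched as a pure function (`classify_backup_age` is a hypothetical name, extracted here only for illustration; "missing" is handled separately, when no dump file is found):

```rust
// Sketch of the backup-status thresholds: under 24h is "ok",
// anything older is "stale".
fn classify_backup_age(age_secs: u64) -> &'static str {
    if age_secs < 86_400 { "ok" } else { "stale" }
}

fn main() {
    assert_eq!(classify_backup_age(3_600), "ok");       // 1 hour old
    assert_eq!(classify_backup_age(2 * 86_400), "stale"); // 2 days old
}
```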

// =============================================================================
// Log access
// =============================================================================

fn read_service_logs(service: &str, max_lines: usize) -> Vec<LogEntry> {
    // Use journalctl for systemd services and docker logs for containers.
    let cmd = match service {
        "maskinrommet" | "caddy" | "sveltekit" => {
            format!("journalctl -u {service} --no-pager -n {max_lines} --output=short-iso 2>/dev/null")
        }
        "postgres" | "spacetimedb" | "authentik" | "litellm" | "whisper" | "livekit" => {
            let container = match service {
                "postgres" => "sidelinja-postgres-1",
                "spacetimedb" => "sidelinja-spacetimedb-1",
                "authentik" => "sidelinja-authentik-server-1",
                "litellm" => "sidelinja-ai-gateway-1",
                "whisper" => "sidelinja-faster-whisper-1",
                "livekit" => "sidelinja-livekit-1",
                _ => return Vec::new(),
            };
            format!("docker logs --tail {max_lines} --timestamps {container} 2>&1")
        }
        _ => return Vec::new(),
    };

    let output = std::process::Command::new("bash")
        .arg("-c")
        .arg(&cmd)
        .output();

    match output {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            text.lines()
                .rev() // newest first
                .take(max_lines)
                .map(|line| {
                    // Try to split off the timestamp at the start of the line.
                    // Guard against slicing mid-character: log output may
                    // contain multi-byte UTF-8, and indexing at a byte offset
                    // that is not a char boundary would panic.
                    let (ts, msg) = if line.len() > 24 && line.is_char_boundary(24) {
                        (line[..24].trim().to_string(), line[24..].trim().to_string())
                    } else {
                        (String::new(), line.to_string())
                    };

                    let level = if msg.contains("ERROR") || msg.contains("error") || msg.contains("ERR") {
                        "error"
                    } else if msg.contains("WARN") || msg.contains("warn") {
                        "warn"
                    } else if msg.contains("INFO") || msg.contains("info") {
                        "info"
                    } else {
                        "debug"
                    };

                    LogEntry {
                        timestamp: ts,
                        service: service.to_string(),
                        level: level.to_string(),
                        message: msg,
                    }
                })
                .collect()
        }
        Err(_) => Vec::new(),
    }
}
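The substring-based level detection above can be sketched as a pure function (`classify_level` is a hypothetical helper; the committed code inlines this logic in the `map` closure):

```rust
// Sketch of the log-level heuristic: first match among
// error/warn/info substrings wins, otherwise "debug".
fn classify_level(msg: &str) -> &'static str {
    if msg.contains("ERROR") || msg.contains("error") || msg.contains("ERR") {
        "error"
    } else if msg.contains("WARN") || msg.contains("warn") {
        "warn"
    } else if msg.contains("INFO") || msg.contains("info") {
        "info"
    } else {
        "debug"
    }
}

fn main() {
    assert_eq!(classify_level("ERROR failed to connect"), "error");
    assert_eq!(classify_level("WARN slow query"), "warn");
    assert_eq!(classify_level("plain line"), "debug");
}
```

Note that the heuristic is purely lexical: a message that merely mentions the word "error" is classified as an error.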

// =============================================================================
// API handlers
// =============================================================================

/// GET /admin/health - complete server-health dashboard.
pub async fn health_dashboard(
    State(state): State<AppState>,
    _user: AuthUser,
) -> Result<Json<HealthDashboard>, (StatusCode, Json<crate::intentions::ErrorResponse>)> {
    // Run all service checks in parallel.
    let (pg, stdb, caddy, authentik, litellm, whisper, livekit) = tokio::join!(
        check_pg(&state.db),
        check_stdb(&state.stdb),
        check_caddy(),
        check_authentik(),
        check_litellm(),
        check_whisper(),
        check_livekit(),
    );

    let services = vec![pg, stdb, caddy, authentik, litellm, whisper, livekit];
    let metrics = collect_metrics();
    let backups = check_backups();
    let pg_stats = collect_pg_stats(&state.db).await;

    Ok(Json(HealthDashboard {
        services,
        metrics,
        backups,
        pg_stats,
    }))
}

/// GET /admin/health/logs?service=maskinrommet&lines=50
pub async fn health_logs(
    _user: AuthUser,
    Query(params): Query<LogsQuery>,
) -> Json<LogsResponse> {
    let max_lines = params.lines.unwrap_or(50).min(200);

    let entries = if let Some(service) = &params.service {
        read_service_logs(service, max_lines)
    } else {
        // All services, the most recent lines from each.
        let services = [
            "maskinrommet", "caddy", "postgres", "spacetimedb",
            "authentik", "litellm", "whisper", "livekit",
        ];
        let per_service = (max_lines / services.len()).max(10);
        let mut all = Vec::new();
        for svc in &services {
            all.extend(read_service_logs(svc, per_service));
        }
        // Sort by timestamp, newest first.
        all.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
        all.truncate(max_lines);
        all
    };

    Json(LogsResponse { entries })
}
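When no `?service=` filter is given, the handler splits the line budget evenly across the eight services, with a floor of 10 lines each. That arithmetic can be sketched as (`per_service_budget` is a hypothetical name for the inline expression):

```rust
// Sketch of the per-service line budget used by health_logs:
// an even split of max_lines, but never fewer than 10 per service.
fn per_service_budget(max_lines: usize, service_count: usize) -> usize {
    (max_lines / service_count).max(10)
}

fn main() {
    assert_eq!(per_service_budget(50, 8), 10);  // default: 50/8 = 6, floored up to 10
    assert_eq!(per_service_budget(200, 8), 25); // at the 200-line cap
}
```

Because of the floor, the combined result can exceed `max_lines` before `truncate` trims it back down.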

@@ -12,6 +12,7 @@ pub mod maintenance;
 pub mod pruning;
 mod queries;
 pub mod publishing;
+pub mod health;
 pub mod resources;
 mod rss;
 mod serving;

@@ -228,6 +229,9 @@ async fn main() {
         .route("/admin/ai/delete_provider", post(ai_admin::delete_provider))
         .route("/admin/ai/update_routing", post(ai_admin::update_routing))
         .route("/admin/ai/delete_routing", post(ai_admin::delete_routing))
+        // Server-health dashboard (task 15.6)
+        .route("/admin/health", get(health::health_dashboard))
+        .route("/admin/health/logs", get(health::health_logs))
         .route("/query/audio_info", get(intentions::audio_info))
         .route("/pub/{slug}/feed.xml", get(rss::generate_feed))
         .route("/pub/{slug}", get(publishing::serve_index))

tasks.md

@@ -168,8 +168,7 @@ Independent phases can still be picked up.
 - [x] 15.3 Job-queue overview: admin UI for active, pending, and failed jobs. Filter by type/collection/status. Manual retry and cancel.
 - [x] 15.4 AI Gateway configuration: admin UI for model overview, API keys (encrypted), routing rules per job type, fallback chains, usage overview per collection. Ref: `docs/infra/ai_gateway.md`.
 - [x] 15.5 Resource management: priority rules between job types, resource limits per worker, a resource governor for automatic deprioritization during active LiveKit sessions, disk status with alerts.
-- [~] 15.6 Server-health dashboard: service status (PG, STDB, Caddy, Authentik, LiteLLM, Whisper, LiveKit), metrics (CPU, memory, disk), backup status, log access.
-> Started: 2026-03-18T04:03
+- [x] 15.6 Server-health dashboard: service status (PG, STDB, Caddy, Authentik, LiteLLM, Whisper, LiveKit), metrics (CPU, memory, disk), backup status, log access.
 - [ ] 15.7 Resource-usage logging: `resource_usage_log` table in PG. Maskinrommet logs AI tokens (in/out, per model tier), Whisper time (sec), TTS characters, CAS storage (bytes), LiveKit time (participant-minutes). Bandwidth via Caddy log parsing. Ref: `docs/features/ressursforbruk.md`.
 - [ ] 15.8 Usage overview in admin: aggregated view per collection, per resource type, per time period. Drill-down to job type and model tier.
 - [ ] 15.9 User-visible usage: each user sees their own usage in profile/settings. Per-node usage visible in node details for owners.