Lydstudio: audio editing via FFmpeg in the browser

Non-destructive editing via an EDL (Edit Decision List):
- Backend: audio.rs with an FFmpeg subprocess for cuts, normalization,
  silence trimming, fades, noise reduction, EQ, compressor
- Frontend: /studio/[id] with wavesurfer.js RegionsPlugin,
  tool panel, session persistence, and render dialog
- Studio trait for collections, version history via derived_from edges
- API: audio_analyze (synchronous), audio_process (job queue), audio_info

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vegard 2026-03-18 00:45:53 +00:00
parent 5d14acf921
commit b4c4bb8a0f
11 changed files with 2229 additions and 3 deletions

docs/features/lydstudio.md Normal file

@ -0,0 +1,130 @@
# Lydstudio: audio editing in the browser
**Status:** In development (v1)
## Concept
The audio studio is a view (`/studio/[id]`) of a media node that gives the
user tools for basic audio editing directly in the browser.
Think "Audacity light" integrated into the Synops platform.
**Principle:** Non-destructive editing. The original in CAS is never touched.
Operations are stored as an EDL (Edit Decision List) and rendered to a new
file via maskinrommet + ffmpeg.
## Architecture
### Node/edge model
```
Original media node (media, cas_hash: "abc...")
←derived_from── Processed media node (media, cas_hash: "def...", metadata.edl)
←has_studio──── Studio session (content, metadata.edl = {...})
```
- **The studio** is a *view*, not a new node_kind
- A **studio session** is a content node that stores the EDL (so editing can be resumed)
- A **processed file** is a new media node with a `derived_from` edge
### EDL format (Edit Decision List)
```json
{
  "source_hash": "abc123...",
  "operations": [
    { "type": "cut", "start_ms": 15200, "end_ms": 17800 },
    { "type": "normalize", "target_lufs": -16.0 },
    { "type": "trim_silence", "threshold_db": -30.0, "min_duration_ms": 500 },
    { "type": "fade_in", "duration_ms": 1000 },
    { "type": "fade_out", "duration_ms": 2000 },
    { "type": "noise_reduction", "strength_db": -25.0 },
    { "type": "equalizer", "low_gain": 2.0, "mid_gain": 0.0, "high_gain": -1.0 },
    { "type": "compressor", "threshold_db": -20.0, "ratio": 4.0 }
  ]
}
```
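A client-side sanity check of an EDL before queuing can be sketched like this. The interfaces mirror the `EdlDocument`/`EdlOperation` types in `$lib/api`; the specific validation rules are an illustrative assumption, not part of the backend contract:

```typescript
// Sketch: validate an EDL document before queuing a render.
// The known operation types come from the operations table in this doc.

interface EdlOperation {
  type: string;
  [key: string]: unknown;
}

interface EdlDocument {
  source_hash: string;
  operations: EdlOperation[];
}

const KNOWN_OPS = new Set([
  'cut', 'normalize', 'trim_silence', 'fade_in',
  'fade_out', 'noise_reduction', 'equalizer', 'compressor',
]);

function validateEdl(doc: EdlDocument): string[] {
  const errors: string[] = [];
  if (!doc.source_hash) errors.push('missing source_hash');
  doc.operations.forEach((op, i) => {
    if (!KNOWN_OPS.has(op.type)) {
      errors.push(`operation ${i}: unknown type "${op.type}"`);
    }
    if (op.type === 'cut') {
      const { start_ms, end_ms } = op as { start_ms?: number; end_ms?: number };
      if (typeof start_ms !== 'number' || typeof end_ms !== 'number' || end_ms <= start_ms) {
        errors.push(`operation ${i}: cut needs start_ms < end_ms`);
      }
    }
  });
  return errors;
}
```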
### Processing flow
```
Frontend (EDL)
→ POST /intentions/audio_process
→ job queue (audio_process, priority 5)
→ maskinrommet: EDL → ffmpeg filtergraph → subprocess
→ result stored in CAS → new media node + derived_from edge
```
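The EDL-to-filtergraph step can be sketched as follows. This is an illustrative TypeScript version of logic that actually lives in Rust (`audio.rs`); the exact filter argument strings are assumptions based on the ffmpeg filter names used elsewhere in this document:

```typescript
// Sketch: map EDL operations onto an ffmpeg -af filter chain.
// Treat the precise argument syntax as an assumption; the production
// implementation is the Rust code in maskinrommet/src/audio.rs.

type Op = { type: string; [k: string]: unknown };

function toFilter(op: Op): string | null {
  switch (op.type) {
    case 'cut': {
      // Drop samples inside [start, end], then rebuild timestamps.
      const s = (op.start_ms as number) / 1000;
      const e = (op.end_ms as number) / 1000;
      return `aselect='not(between(t,${s},${e}))',asetpts=N/SR/TB`;
    }
    case 'normalize':
      return `loudnorm=I=${op.target_lufs as number}`;
    case 'fade_in':
      return `afade=t=in:d=${(op.duration_ms as number) / 1000}`;
    case 'fade_out':
      return `afade=t=out:d=${(op.duration_ms as number) / 1000}`;
    case 'noise_reduction':
      return `afftdn=nr=${Math.abs(op.strength_db as number)}`;
    case 'compressor':
      return `acompressor=threshold=${op.threshold_db as number}dB:ratio=${op.ratio as number}`;
    default:
      return null; // e.g. trim_silence is expanded into cuts beforehand
  }
}

function buildFilterChain(ops: Op[]): string {
  return ops.map(toFilter).filter((f): f is string => f !== null).join(',');
}
```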
## Operations
| Operation | FFmpeg filter | Description |
|-----------|---------------|-------------|
| Cut | `aselect` + `asetpts` | Remove a region (a sneeze, a phone ringing, etc.) |
| Normalize | `loudnorm` (two-pass) | EBU R128 loudness normalization, typically -16 LUFS |
| Trim silence | `silencedetect` → cuts | Shorten/remove silent regions |
| Fade in | `afade=t=in` | Gradual entry |
| Fade out | `afade=t=out` | Gradual exit |
| Noise reduction | `afftdn` | FFT-based noise reduction |
| EQ | `equalizer` | Three-band parametric (low/mid/high) |
| Compressor | `acompressor` | Dynamic range compression ("radio sound") |
### Operation order (at render time)
1. Cuts (`aselect`): removes regions
2. Noise reduction (`afftdn`)
3. EQ (`equalizer`)
4. Compressor (`acompressor`)
5. Normalize (`loudnorm`): always second to last
6. Fades (`afade`): always last
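The ordering rule above can be expressed as a small sort helper (a sketch; the real renderer is in `audio.rs`, and `trim_silence` is grouped with cuts here since it expands into cuts):

```typescript
// Sketch: sort EDL operations into the documented render order.
// Relies on Array.prototype.sort being stable (guaranteed since ES2019),
// so operations of the same type keep their insertion order.
const RENDER_ORDER = [
  'cut', 'trim_silence', 'noise_reduction', 'equalizer',
  'compressor', 'normalize', 'fade_in', 'fade_out',
];

function sortForRender<T extends { type: string }>(ops: T[]): T[] {
  return [...ops].sort(
    (a, b) => RENDER_ORDER.indexOf(a.type) - RENDER_ORDER.indexOf(b.type)
  );
}
```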
## API endpoints
### `POST /intentions/audio_analyze`
Synchronous analysis of an audio file: loudness, silence regions, metadata.
```json
// Request
{ "cas_hash": "abc...", "silence_threshold_db": -30.0, "silence_min_duration_ms": 500 }
// Response
{
  "loudness": { "input_i": -23.1, "input_tp": -5.2, "input_lra": 14.0, "input_thresh": -34.0 },
  "silence_regions": [{ "start_ms": 1200, "end_ms": 2800, "duration_ms": 1600 }],
  "info": { "duration_ms": 180000, "sample_rate": 44100, "channels": 2, "codec": "mp3", "format": "mp3" }
}
```
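A typical client-side use of this response is converting `silence_regions` into `cut` operations for the EDL. The padding value below is an assumption (kept to avoid clipping speech edges), not something the backend prescribes:

```typescript
// Sketch: turn detected silence regions into EDL cut operations,
// keeping `padMs` of silence on each side of the cut.

interface SilenceRegion {
  start_ms: number;
  end_ms: number;
  duration_ms: number;
}

function silenceToCuts(regions: SilenceRegion[], padMs = 150) {
  return regions
    .map((r) => ({
      type: 'cut' as const,
      start_ms: r.start_ms + padMs,
      end_ms: r.end_ms - padMs,
    }))
    // Drop regions too short to cut once padding is applied.
    .filter((c) => c.end_ms > c.start_ms);
}
```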
### `POST /intentions/audio_process`
Queues a render job for the EDL. Returns a job_id for polling.
```json
// Request
{ "media_node_id": "uuid", "edl": { "source_hash": "...", "operations": [...] }, "output_format": "mp3" }
// Response
{ "job_id": "uuid" }
```
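Polling can be sketched as below. The job-status endpoint itself is not documented here, so the status fetcher is injected as a parameter with a hypothetical signature:

```typescript
// Sketch: poll a queued audio_process job until it leaves 'pending'.
// `getStatus` is an assumed callback, since the job-status endpoint is
// outside the scope of this document.

type JobStatus = 'pending' | 'done' | 'failed';

async function pollJob(
  getStatus: (jobId: string) => Promise<JobStatus>,
  jobId: string,
  { intervalMs = 1000, maxAttempts = 60 } = {}
): Promise<JobStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await getStatus(jobId);
    if (status !== 'pending') return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${jobId} timed out`);
}
```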
### `GET /query/audio_info?hash=...`
Quick metadata about an audio file (ffprobe).
## Frontend
- **Route:** `/studio/[id]`: waveform view of a media node
- **Waveform:** wavesurfer.js with RegionsPlugin for visual region selection
- **Tool panel:** every operation available as buttons/sliders
- **Keyboard:** Space (play/pause), Delete (cut), Ctrl+Z (undo)
- **Transcription:** segments synchronized with the waveform (click → seek)
- **Render:** dialog with format selection, then job polling
## Dependencies
- **ffmpeg 6.1.1**: installed natively on the server
- **wavesurfer.js**: already in use (AudioPlayer.svelte)
- **Trait:** `studio`: enables the "Rediger i studioet" button on media nodes
## Files
| File | Role |
|------|------|
| `maskinrommet/src/audio.rs` | EDL parser, ffmpeg commands, job handler |
| `maskinrommet/src/jobs.rs` | `audio_process` dispatch |
| `maskinrommet/src/intentions.rs` | API endpoints for analyze/process/info |
| `frontend/src/routes/studio/[id]/+page.svelte` | Main page |
| `frontend/src/lib/components/studio/` | Waveform, panel, render dialog |


@ -397,6 +397,88 @@ export interface SegmentChoice {
choice: 'new' | 'old';
}
// =============================================================================
// Lydstudio
// =============================================================================
export interface AudioInfo {
duration_ms: number;
sample_rate: number;
channels: number;
codec: string;
format: string;
bit_rate: number | null;
}
export interface LoudnessInfo {
input_i: number;
input_tp: number;
input_lra: number;
input_thresh: number;
}
export interface SilenceRegion {
start_ms: number;
end_ms: number;
duration_ms: number;
}
export interface AnalyzeResult {
loudness: LoudnessInfo;
silence_regions: SilenceRegion[];
info: AudioInfo;
}
export interface EdlOperation {
type: string;
[key: string]: unknown;
}
export interface EdlDocument {
source_hash: string;
operations: EdlOperation[];
}
/** Analyze an audio file: loudness, silence regions, metadata. */
export function audioAnalyze(
accessToken: string,
casHash: string,
silenceThresholdDb?: number,
silenceMinDurationMs?: number
): Promise<AnalyzeResult> {
return post(accessToken, '/intentions/audio_analyze', {
cas_hash: casHash,
silence_threshold_db: silenceThresholdDb,
silence_min_duration_ms: silenceMinDurationMs
});
}
/** Queue audio processing with an EDL. Returns a job_id. */
export function audioProcess(
accessToken: string,
mediaNodeId: string,
edl: EdlDocument,
outputFormat?: string
): Promise<{ job_id: string }> {
return post(accessToken, '/intentions/audio_process', {
media_node_id: mediaNodeId,
edl,
output_format: outputFormat
});
}
/** Fetch audio file metadata (ffprobe). */
export async function audioInfo(accessToken: string, hash: string): Promise<AudioInfo> {
const res = await fetch(`${BASE_URL}/query/audio_info?hash=${encodeURIComponent(hash)}`, {
headers: { Authorization: `Bearer ${accessToken}` }
});
if (!res.ok) {
const body = await res.text();
throw new Error(`audio_info failed (${res.status}): ${body}`);
}
return res.json();
}
/** Apply the user's per-segment choices after re-transcription. */
export function resolveRetranscription(
accessToken: string,


@ -0,0 +1,304 @@
<script lang="ts">
import type { EdlOperation, LoudnessInfo, SilenceRegion, AudioInfo } from '$lib/api';
interface Props {
/** Current loudness analysis (null if not yet analyzed) */
loudness: LoudnessInfo | null;
/** Detected silence regions */
silenceRegions: SilenceRegion[];
/** Audio info */
audioInfo: AudioInfo | null;
/** Whether analysis is running */
analyzing: boolean;
/** Current EDL operations */
operations: EdlOperation[];
/** Active region from waveform selection */
activeRegion: { start: number; end: number } | null;
/** Callbacks */
onanalyze: () => void;
oncut: () => void;
onaddop: (op: EdlOperation) => void;
onremoveop: (index: number) => void;
onrender: () => void;
onundo: () => void;
ontrimsilence: () => void;
}
let {
loudness,
silenceRegions,
audioInfo,
analyzing,
operations,
activeRegion,
onanalyze,
oncut,
onaddop,
onremoveop,
onrender,
onundo,
ontrimsilence,
}: Props = $props();
// Tool settings
let normTarget = $state(-16);
let silenceThreshold = $state(-30);
let silenceMinMs = $state(500);
let fadeInMs = $state(1000);
let fadeOutMs = $state(2000);
let noiseStrength = $state(-25);
let eqLow = $state(0);
let eqMid = $state(0);
let eqHigh = $state(0);
let compThreshold = $state(-20);
let compRatio = $state(4);
function opLabel(op: EdlOperation): string {
switch (op.type) {
case 'cut': return `Klipp ${fmtMs(op.start_ms as number)}-${fmtMs(op.end_ms as number)}`;
case 'normalize': return `Normaliser (${op.target_lufs} LUFS)`;
case 'trim_silence': return `Trim stillhet (${op.threshold_db}dB)`;
case 'fade_in': return `Fade inn (${op.duration_ms}ms)`;
case 'fade_out': return `Fade ut (${op.duration_ms}ms)`;
case 'noise_reduction': return `Noise reduction (${op.strength_db}dB)`;
case 'equalizer': return `EQ (L:${op.low_gain} M:${op.mid_gain} H:${op.high_gain})`;
case 'compressor': return `Kompressor (${op.threshold_db}dB, ${op.ratio}:1)`;
default: return op.type;
}
}
function fmtMs(ms: number): string {
const sec = ms / 1000;
const m = Math.floor(sec / 60);
const s = Math.floor(sec % 60);
return `${m}:${s.toString().padStart(2, '0')}`;
}
</script>
<div class="flex flex-col gap-4 rounded-lg border border-gray-200 bg-white p-4">
<!-- Analyse -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Analyse</h3>
<button
onclick={onanalyze}
disabled={analyzing}
class="w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200 disabled:opacity-50"
>
{analyzing ? 'Analyserer...' : 'Analyser lydfil'}
</button>
{#if loudness}
<div class="mt-2 grid grid-cols-2 gap-1 text-xs text-gray-500">
<span>Loudness:</span>
<span class="font-mono">{loudness.input_i.toFixed(1)} LUFS</span>
<span>True Peak:</span>
<span class="font-mono">{loudness.input_tp.toFixed(1)} dBTP</span>
<span>LRA:</span>
<span class="font-mono">{loudness.input_lra.toFixed(1)} LU</span>
</div>
{/if}
{#if audioInfo}
<div class="mt-2 grid grid-cols-2 gap-1 text-xs text-gray-500">
<span>Varighet:</span>
<span class="font-mono">{fmtMs(audioInfo.duration_ms)}</span>
<span>Format:</span>
<span class="font-mono">{audioInfo.codec} / {audioInfo.sample_rate}Hz</span>
<span>Kanaler:</span>
<span class="font-mono">{audioInfo.channels}</span>
</div>
{/if}
{#if silenceRegions.length > 0}
<p class="mt-2 text-xs text-gray-400">
{silenceRegions.length} stille regioner funnet
</p>
{/if}
</section>
<!-- Cut -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Klipp</h3>
<button
onclick={oncut}
disabled={!activeRegion}
class="w-full rounded bg-red-50 px-3 py-2 text-sm text-red-700 transition-colors hover:bg-red-100 disabled:opacity-50"
>
{activeRegion ? 'Klipp markert region' : 'Marker en region først'}
</button>
</section>
<!-- Silence trimming -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Trim stillhet</h3>
<div class="flex items-center gap-2">
<label class="text-xs text-gray-500">Terskel</label>
<input type="number" bind:value={silenceThreshold} class="w-16 rounded border px-1 py-0.5 text-xs" step="5" />
<span class="text-xs text-gray-400">dB</span>
</div>
<div class="mt-1 flex items-center gap-2">
<label class="text-xs text-gray-500">Min varighet</label>
<input type="number" bind:value={silenceMinMs} class="w-16 rounded border px-1 py-0.5 text-xs" step="100" />
<span class="text-xs text-gray-400">ms</span>
</div>
<button
onclick={ontrimsilence}
class="mt-2 w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200"
>
Trim all stillhet
</button>
</section>
<!-- Normalization -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Normaliser loudness</h3>
<div class="flex items-center gap-2">
<label class="text-xs text-gray-500">Mål</label>
<input type="number" bind:value={normTarget} class="w-16 rounded border px-1 py-0.5 text-xs" step="1" />
<span class="text-xs text-gray-400">LUFS</span>
</div>
<button
onclick={() => onaddop({ type: 'normalize', target_lufs: normTarget })}
class="mt-2 w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200"
>
Legg til normalisering
</button>
</section>
<!-- Fades -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Fades</h3>
<div class="flex gap-2">
<div class="flex-1">
<label class="text-xs text-gray-500">Fade inn</label>
<div class="flex items-center gap-1">
<input type="number" bind:value={fadeInMs} class="w-16 rounded border px-1 py-0.5 text-xs" step="500" />
<span class="text-xs text-gray-400">ms</span>
</div>
<button
onclick={() => onaddop({ type: 'fade_in', duration_ms: fadeInMs })}
class="mt-1 w-full rounded bg-gray-100 px-2 py-1 text-xs transition-colors hover:bg-gray-200"
>
+ Fade inn
</button>
</div>
<div class="flex-1">
<label class="text-xs text-gray-500">Fade ut</label>
<div class="flex items-center gap-1">
<input type="number" bind:value={fadeOutMs} class="w-16 rounded border px-1 py-0.5 text-xs" step="500" />
<span class="text-xs text-gray-400">ms</span>
</div>
<button
onclick={() => onaddop({ type: 'fade_out', duration_ms: fadeOutMs })}
class="mt-1 w-full rounded bg-gray-100 px-2 py-1 text-xs transition-colors hover:bg-gray-200"
>
+ Fade ut
</button>
</div>
</div>
</section>
<!-- Noise reduction -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Noise reduction</h3>
<div class="flex items-center gap-2">
<label class="text-xs text-gray-500">Styrke</label>
<input type="range" min="-50" max="0" bind:value={noiseStrength} class="flex-1" />
<span class="w-10 text-right font-mono text-xs text-gray-500">{noiseStrength}dB</span>
</div>
<button
onclick={() => onaddop({ type: 'noise_reduction', strength_db: noiseStrength })}
class="mt-2 w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200"
>
Legg til noise reduction
</button>
</section>
<!-- EQ -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Equalizer</h3>
<div class="space-y-1">
<div class="flex items-center gap-2">
<label class="w-8 text-xs text-gray-500">Lav</label>
<input type="range" min="-12" max="12" step="0.5" bind:value={eqLow} class="flex-1" />
<span class="w-10 text-right font-mono text-xs text-gray-500">{eqLow}dB</span>
</div>
<div class="flex items-center gap-2">
<label class="w-8 text-xs text-gray-500">Mid</label>
<input type="range" min="-12" max="12" step="0.5" bind:value={eqMid} class="flex-1" />
<span class="w-10 text-right font-mono text-xs text-gray-500">{eqMid}dB</span>
</div>
<div class="flex items-center gap-2">
<label class="w-8 text-xs text-gray-500">Høy</label>
<input type="range" min="-12" max="12" step="0.5" bind:value={eqHigh} class="flex-1" />
<span class="w-10 text-right font-mono text-xs text-gray-500">{eqHigh}dB</span>
</div>
</div>
<button
onclick={() => onaddop({ type: 'equalizer', low_gain: eqLow, mid_gain: eqMid, high_gain: eqHigh })}
class="mt-2 w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200"
>
Legg til EQ
</button>
</section>
<!-- Compressor -->
<section>
<h3 class="mb-2 text-sm font-semibold text-gray-700">Kompressor</h3>
<div class="space-y-1">
<div class="flex items-center gap-2">
<label class="text-xs text-gray-500">Terskel</label>
<input type="number" bind:value={compThreshold} class="w-16 rounded border px-1 py-0.5 text-xs" step="1" />
<span class="text-xs text-gray-400">dB</span>
</div>
<div class="flex items-center gap-2">
<label class="text-xs text-gray-500">Ratio</label>
<input type="number" bind:value={compRatio} class="w-16 rounded border px-1 py-0.5 text-xs" step="0.5" min="1" />
<span class="text-xs text-gray-400">:1</span>
</div>
</div>
<button
onclick={() => onaddop({ type: 'compressor', threshold_db: compThreshold, ratio: compRatio })}
class="mt-2 w-full rounded bg-gray-100 px-3 py-2 text-sm transition-colors hover:bg-gray-200"
>
Legg til kompressor
</button>
</section>
<!-- Operation list -->
{#if operations.length > 0}
<section>
<div class="flex items-center justify-between">
<h3 class="text-sm font-semibold text-gray-700">Operasjoner ({operations.length})</h3>
<button
onclick={onundo}
class="text-xs text-gray-400 hover:text-gray-600"
>
Angre siste
</button>
</div>
<ul class="mt-1 space-y-1">
{#each operations as op, i}
<li class="flex items-center justify-between rounded bg-gray-50 px-2 py-1">
<span class="text-xs text-gray-600">{i + 1}. {opLabel(op)}</span>
<button
onclick={() => onremoveop(i)}
class="text-xs text-red-400 hover:text-red-600"
>
x
</button>
</li>
{/each}
</ul>
</section>
{/if}
<!-- Render -->
<button
onclick={onrender}
disabled={operations.length === 0}
class="w-full rounded bg-blue-600 px-4 py-2.5 text-sm font-medium text-white transition-colors hover:bg-blue-700 disabled:opacity-50"
>
Render ({operations.length} operasjoner)
</button>
</div>


@ -0,0 +1,93 @@
<script lang="ts">
import type { EdlOperation } from '$lib/api';
interface Props {
operations: EdlOperation[];
rendering: boolean;
jobId: string | null;
resultNodeId: string | null;
onconfirm: (format: string) => void;
onclose: () => void;
}
let { operations, rendering, jobId, resultNodeId, onconfirm, onclose }: Props = $props();
let format = $state('mp3');
</script>
<!-- svelte-ignore a11y_no_static_element_interactions -->
<div
class="fixed inset-0 z-50 flex items-center justify-center bg-black/50"
onclick={onclose}
onkeydown={(e) => { if (e.key === 'Escape') onclose(); }}
>
<!-- svelte-ignore a11y_click_events_have_key_events -->
<div
class="w-full max-w-md rounded-lg bg-white p-6 shadow-xl"
onclick={(e) => e.stopPropagation()}
>
<h2 class="mb-4 text-lg font-semibold text-gray-800">Render lyd</h2>
{#if resultNodeId}
<!-- Done -->
<div class="rounded bg-green-50 p-4 text-center">
<p class="text-sm text-green-700">Rendering fullført!</p>
<p class="mt-1 text-xs text-green-600">Node: {resultNodeId}</p>
</div>
<button
onclick={onclose}
class="mt-4 w-full rounded bg-gray-100 px-4 py-2 text-sm hover:bg-gray-200"
>
Lukk
</button>
{:else if rendering}
<!-- In progress -->
<div class="flex flex-col items-center gap-3 py-4">
<svg class="h-8 w-8 animate-spin text-blue-500" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z"></path>
</svg>
<p class="text-sm text-gray-600">Rendrer lyd...</p>
{#if jobId}
<p class="text-xs text-gray-400">Jobb: {jobId.slice(0, 8)}...</p>
{/if}
</div>
{:else}
<!-- Config -->
<div class="space-y-3">
<div>
<label class="text-sm text-gray-600">Format</label>
<select bind:value={format} class="mt-1 w-full rounded border px-2 py-1.5 text-sm">
<option value="mp3">MP3 (standard)</option>
<option value="wav">WAV (ukomprimert)</option>
<option value="flac">FLAC (lossless)</option>
<option value="ogg">OGG Vorbis</option>
</select>
</div>
<div class="rounded bg-gray-50 p-3">
<p class="text-xs font-medium text-gray-500">Operasjoner ({operations.length})</p>
<ul class="mt-1 space-y-0.5">
{#each operations as op, i}
<li class="text-xs text-gray-600">{i + 1}. {op.type}</li>
{/each}
</ul>
</div>
</div>
<div class="mt-4 flex gap-2">
<button
onclick={onclose}
class="flex-1 rounded bg-gray-100 px-4 py-2 text-sm hover:bg-gray-200"
>
Avbryt
</button>
<button
onclick={() => onconfirm(format)}
class="flex-1 rounded bg-blue-600 px-4 py-2 text-sm font-medium text-white hover:bg-blue-700"
>
Render
</button>
</div>
{/if}
</div>
</div>


@ -0,0 +1,274 @@
<script lang="ts">
import { onMount, onDestroy } from 'svelte';
import WaveSurfer from 'wavesurfer.js';
import RegionsPlugin from 'wavesurfer.js/dist/plugins/regions.js';
import type { SilenceRegion } from '$lib/api';
interface Props {
src: string;
onready?: (duration: number) => void;
ontimeupdate?: (time: number) => void;
}
let { src, onready, ontimeupdate }: Props = $props();
let container: HTMLDivElement | undefined = $state();
let wavesurfer: WaveSurfer | undefined = $state();
let regions: RegionsPlugin | undefined = $state();
let playing = $state(false);
let currentTime = $state(0);
let totalDuration = $state(0);
let ready = $state(false);
let loadError = $state(false);
let zoom = $state(50); // pixels per second
/** Currently selected region (for cut operations) */
let activeRegion: { id: string; start: number; end: number } | null = $state(null);
function formatTime(seconds: number): string {
if (!seconds || !isFinite(seconds)) return '0:00';
const m = Math.floor(seconds / 60);
const s = Math.floor(seconds % 60);
return `${m}:${s.toString().padStart(2, '0')}`;
}
export function getActiveRegion() {
return activeRegion;
}
export function clearActiveRegion() {
if (activeRegion && regions) {
const r = regions.getRegions().find((reg) => reg.id === activeRegion!.id);
if (r) r.remove();
activeRegion = null;
}
}
/** Add cut region overlay (red); the `cut-` id prefix keeps it out of the selection logic */
export function addCutRegion(startMs: number, endMs: number) {
if (!regions) return;
regions.addRegion({
id: `cut-${startMs}-${endMs}`,
start: startMs / 1000,
end: endMs / 1000,
color: 'rgba(239, 68, 68, 0.25)',
drag: false,
resize: false,
});
}
/** Show silence regions as gray overlay */
export function showSilenceRegions(silenceRegions: SilenceRegion[]) {
if (!regions) return;
// Remove old silence markers
for (const r of regions.getRegions()) {
if (r.id.startsWith('silence-')) r.remove();
}
for (let i = 0; i < silenceRegions.length; i++) {
const sr = silenceRegions[i];
regions.addRegion({
id: `silence-${i}`,
start: sr.start_ms / 1000,
end: sr.end_ms / 1000,
color: 'rgba(156, 163, 175, 0.3)',
drag: false,
resize: false,
});
}
}
/** Clear all regions */
export function clearAllRegions() {
if (!regions) return;
regions.clearRegions();
activeRegion = null;
}
export function seekTo(timeSec: number) {
if (!wavesurfer || !ready) return;
wavesurfer.setTime(timeSec);
}
export function play() {
wavesurfer?.play();
}
export function pause() {
wavesurfer?.pause();
}
export function togglePlayback() {
wavesurfer?.playPause();
}
export function isPlaying() {
return playing;
}
onMount(() => {
if (!container) return;
regions = RegionsPlugin.create();
wavesurfer = WaveSurfer.create({
container,
height: 128,
waveColor: '#93c5fd',
progressColor: '#2563eb',
cursorColor: '#1d4ed8',
cursorWidth: 2,
barWidth: 2,
barGap: 1,
barRadius: 2,
normalize: true,
minPxPerSec: zoom,
url: src,
plugins: [regions],
});
// Enable drag-to-select region
regions.enableDragSelection({
color: 'rgba(59, 130, 246, 0.3)',
});
regions.on('region-created', (region) => {
// Only keep one selection at a time (not cut/silence overlays)
if (!region.id.startsWith('silence-') && !region.id.startsWith('cut-')) {
// Remove previous active selection
if (activeRegion && activeRegion.id !== region.id) {
const old = regions!.getRegions().find((r) => r.id === activeRegion!.id);
if (old && !old.id.startsWith('cut-')) old.remove();
}
activeRegion = {
id: region.id,
start: region.start,
end: region.end,
};
}
});
regions.on('region-updated', (region) => {
if (activeRegion && region.id === activeRegion.id) {
activeRegion = {
id: region.id,
start: region.start,
end: region.end,
};
}
});
wavesurfer.on('ready', () => {
ready = true;
totalDuration = wavesurfer!.getDuration();
onready?.(totalDuration);
});
wavesurfer.on('timeupdate', (time: number) => {
currentTime = time;
ontimeupdate?.(time);
});
wavesurfer.on('finish', () => {
playing = false;
});
wavesurfer.on('error', () => {
loadError = true;
});
wavesurfer.on('play', () => {
playing = true;
});
wavesurfer.on('pause', () => {
playing = false;
});
});
onDestroy(() => {
wavesurfer?.destroy();
});
function handleZoom(newZoom: number) {
zoom = newZoom;
wavesurfer?.zoom(newZoom);
}
function handleKeydown(e: KeyboardEvent) {
if (e.target instanceof HTMLInputElement || e.target instanceof HTMLTextAreaElement) return;
if (e.code === 'Space') {
e.preventDefault();
togglePlayback();
}
}
</script>
<svelte:window onkeydown={handleKeydown} />
<div class="studio-waveform flex flex-col gap-2">
<!-- Waveform -->
<div class="overflow-x-auto rounded-lg border border-gray-200 bg-gray-50 p-2">
{#if loadError}
<p class="py-8 text-center text-sm text-red-400">Kunne ikke laste lydfilen</p>
{:else if !ready}
<div class="flex items-center justify-center py-8">
<svg class="h-5 w-5 animate-spin text-blue-500" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z"></path>
</svg>
<span class="ml-2 text-sm text-gray-400">Laster waveform...</span>
</div>
{/if}
<div bind:this={container} class="w-full" class:hidden={!ready && !loadError}></div>
</div>
<!-- Transport + zoom -->
<div class="flex items-center justify-between">
<div class="flex items-center gap-2">
<!-- Play/pause -->
<button
onclick={togglePlayback}
disabled={!ready}
class="flex h-10 w-10 items-center justify-center rounded-full transition-colors
{ready ? 'bg-blue-600 text-white hover:bg-blue-700' : 'bg-gray-200 text-gray-400'}
disabled:cursor-not-allowed"
aria-label={playing ? 'Pause' : 'Spill av'}
>
{#if playing}
<svg class="h-4 w-4" fill="currentColor" viewBox="0 0 24 24">
<rect x="6" y="4" width="4" height="16" rx="1" />
<rect x="14" y="4" width="4" height="16" rx="1" />
</svg>
{:else}
<svg class="h-4 w-4 ml-0.5" fill="currentColor" viewBox="0 0 24 24">
<path d="M8 5v14l11-7z" />
</svg>
{/if}
</button>
<!-- Time display -->
<span class="font-mono text-sm text-gray-600">
{formatTime(currentTime)} / {formatTime(totalDuration)}
</span>
<!-- Selection info -->
{#if activeRegion}
<span class="rounded bg-blue-50 px-2 py-0.5 text-xs text-blue-600">
Markert: {formatTime(activeRegion.start)} - {formatTime(activeRegion.end)}
</span>
{/if}
</div>
<!-- Zoom -->
<div class="flex items-center gap-2">
<span class="text-xs text-gray-400">Zoom</span>
<input
type="range"
min="10"
max="500"
bind:value={zoom}
oninput={() => handleZoom(zoom)}
class="w-24"
/>
</div>
</div>
</div>


@ -0,0 +1,70 @@
<script lang="ts">
import type { Node } from '$lib/spacetime';
import { edgeStore, nodeStore, nodeVisibility } from '$lib/spacetime';
import TraitPanel from './TraitPanel.svelte';
interface Props {
collection: Node;
config: Record<string, unknown>;
userId?: string;
}
let { collection, config, userId }: Props = $props();
/** Media nodes (audio files) belonging to this collection */
const audioNodes = $derived.by(() => {
const nodes: Node[] = [];
for (const edge of edgeStore.byTarget(collection.id)) {
if (edge.edgeType !== 'belongs_to') continue;
const node = nodeStore.get(edge.sourceId);
if (!node || nodeVisibility(node, userId) === 'hidden') continue;
if (node.nodeKind === 'media') {
try {
const meta = JSON.parse(node.metadata ?? '{}');
if (typeof meta.mime === 'string' && meta.mime.startsWith('audio/')) {
nodes.push(node);
}
} catch { /* skip */ }
}
}
nodes.sort((a, b) => {
const ta = a.createdAt?.microsSinceUnixEpoch ?? 0n;
const tb = b.createdAt?.microsSinceUnixEpoch ?? 0n;
return tb > ta ? 1 : tb < ta ? -1 : 0;
});
return nodes;
});
/** Check if a node has processed versions (derived_from edges) */
function hasVersions(nodeId: string): boolean {
for (const edge of edgeStore.byTarget(nodeId)) {
if (edge.edgeType === 'derived_from') return true;
}
return false;
}
</script>
<TraitPanel title="Lydstudio" icon="studio">
{#if audioNodes.length === 0}
<p class="text-sm text-gray-400">Ingen lydfiler i denne samlingen ennå.</p>
{:else}
<ul class="space-y-2">
{#each audioNodes as node}
<li class="flex items-center justify-between rounded border border-gray-100 p-2">
<div class="min-w-0 flex-1">
<p class="truncate text-sm text-gray-700">{node.title ?? 'Uten tittel'}</p>
{#if hasVersions(node.id)}
<span class="text-[10px] text-green-600">Har prosesserte versjoner</span>
{/if}
</div>
<a
href="/studio/{node.id}"
class="ml-2 shrink-0 rounded bg-blue-50 px-3 py-1 text-xs text-blue-600 transition-colors hover:bg-blue-100"
>
Rediger
</a>
</li>
{/each}
</ul>
{/if}
</TraitPanel>


@ -13,6 +13,7 @@
import CalendarTrait from '$lib/components/traits/CalendarTrait.svelte';
import RecordingTrait from '$lib/components/traits/RecordingTrait.svelte';
import TranscriptionTrait from '$lib/components/traits/TranscriptionTrait.svelte';
import StudioTrait from '$lib/components/traits/StudioTrait.svelte';
import GenericTrait from '$lib/components/traits/GenericTrait.svelte';
import TraitAdmin from '$lib/components/traits/TraitAdmin.svelte';
@ -47,7 +48,7 @@
/** Traits with dedicated components */
const knownTraits = new Set([
'editor', 'chat', 'kanban', 'podcast', 'publishing',
'rss', 'calendar', 'recording', 'transcription'
'rss', 'calendar', 'recording', 'transcription', 'studio'
]);
/** Traits that have a dedicated component */
@ -165,6 +166,8 @@
<RecordingTrait collection={collectionNode} config={traits[trait]} />
{:else if trait === 'transcription'}
<TranscriptionTrait collection={collectionNode} config={traits[trait]} />
{:else if trait === 'studio'}
<StudioTrait collection={collectionNode} config={traits[trait]} userId={nodeId} />
{/if}
{/each}


@ -0,0 +1,417 @@
<script lang="ts">
import { page } from '$app/stores';
import { connectionState, nodeStore, edgeStore } from '$lib/spacetime';
import {
casUrl,
audioAnalyze,
audioProcess,
fetchSegments,
createNode,
createEdge,
type EdlOperation,
type EdlDocument,
type LoudnessInfo,
type SilenceRegion,
type AudioInfo,
type Segment,
} from '$lib/api';
import StudioWaveform from '$lib/components/studio/StudioWaveform.svelte';
import OperationPanel from '$lib/components/studio/OperationPanel.svelte';
import RenderDialog from '$lib/components/studio/RenderDialog.svelte';
const session = $derived($page.data.session as Record<string, unknown> | undefined);
const accessToken = $derived(session?.accessToken as string | undefined);
const nodeId = $derived(session?.nodeId as string | undefined);
const connected = $derived(connectionState.current === 'connected');
const mediaNodeId = $derived($page.params.id ?? '');
// Media node from STDB
const mediaNode = $derived(connected ? nodeStore.get(mediaNodeId) : undefined);
const metadata = $derived.by(() => {
if (!mediaNode?.metadata) return null;
try { return JSON.parse(mediaNode.metadata); } catch { return null; }
});
const casHash = $derived(metadata?.cas_hash as string | undefined);
const audioSrc = $derived(casHash ? casUrl(casHash) : '');
// Studio state
let operations: EdlOperation[] = $state([]);
let loudness: LoudnessInfo | null = $state(null);
let silenceRegions: SilenceRegion[] = $state([]);
let audioInfo: AudioInfo | null = $state(null);
let analyzing = $state(false);
let waveformRef: StudioWaveform | undefined = $state();
let currentTime = $state(0);
// Render state
let showRenderDialog = $state(false);
let rendering = $state(false);
let renderJobId: string | null = $state(null);
let resultNodeId: string | null = $state(null);
// Session persistence
let sessionNodeId: string | null = $state(null);
let saving = $state(false);
// Version history: processed nodes derived from this media node
const versions = $derived.by(() => {
if (!connected || !mediaNodeId) return [];
const nodes: { id: string; title: string; createdAt: bigint }[] = [];
for (const edge of edgeStore.byTarget(mediaNodeId)) {
if (edge.edgeType !== 'derived_from') continue;
const node = nodeStore.get(edge.sourceId);
if (node) {
nodes.push({
id: node.id,
title: node.title ?? 'Prosessert',
createdAt: node.createdAt?.microsSinceUnixEpoch ?? 0n,
});
}
}
nodes.sort((a, b) => (b.createdAt > a.createdAt ? 1 : -1));
return nodes;
});
// Transcription
let segments: Segment[] = $state([]);
let segmentsLoaded = $state(false);
// Load existing studio session (if any)
$effect(() => {
if (!connected || !mediaNodeId) return;
for (const edge of edgeStore.byTarget(mediaNodeId)) {
if (edge.edgeType !== 'has_studio') continue;
const node = nodeStore.get(edge.sourceId);
if (node) {
try {
const meta = JSON.parse(node.metadata ?? '{}');
if (meta.edl?.operations) {
operations = meta.edl.operations;
sessionNodeId = node.id;
}
} catch { /* ignore */ }
break; // Use first session found
}
}
});
// Load segments on mount
$effect(() => {
if (accessToken && mediaNodeId && !segmentsLoaded) {
segmentsLoaded = true;
fetchSegments(accessToken, mediaNodeId)
.then((res) => {
segments = res.segments || [];
})
.catch(() => {
segments = [];
});
}
});
async function handleAnalyze() {
if (!accessToken || !casHash) return;
analyzing = true;
try {
const result = await audioAnalyze(accessToken, casHash);
loudness = result.loudness;
silenceRegions = result.silence_regions;
audioInfo = result.info;
// Show silence regions on waveform
waveformRef?.showSilenceRegions(silenceRegions);
} catch (e) {
console.error('Analyse feilet:', e);
} finally {
analyzing = false;
}
}
function handleCut() {
const region = waveformRef?.getActiveRegion();
if (!region) return;
operations = [
...operations,
{
type: 'cut',
start_ms: Math.round(region.start * 1000),
end_ms: Math.round(region.end * 1000),
},
];
// Convert selection to cut overlay
waveformRef?.clearActiveRegion();
waveformRef?.addCutRegion(
Math.round(region.start * 1000),
Math.round(region.end * 1000)
);
}
function handleAddOp(op: EdlOperation) {
// Prevent duplicate types for non-cut operations
if (op.type !== 'cut') {
operations = operations.filter((o) => o.type !== op.type);
}
operations = [...operations, op];
}
function handleRemoveOp(index: number) {
operations = operations.filter((_, i) => i !== index);
}
function handleUndo() {
if (operations.length === 0) return;
operations = operations.slice(0, -1);
}
function handleTrimSilence() {
// Replace any existing trim_silence op instead of stacking duplicates
// (consistent with the dedupe rule in handleAddOp)
operations = [
...operations.filter((o) => o.type !== 'trim_silence'),
{
type: 'trim_silence',
threshold_db: -30,
min_duration_ms: 500,
},
];
}
function handleRenderStart() {
showRenderDialog = true;
}
async function handleRenderConfirm(format: string) {
if (!accessToken || !casHash) return;
rendering = true;
const edl: EdlDocument = {
source_hash: casHash,
operations,
};
try {
const result = await audioProcess(accessToken, mediaNodeId, edl, format);
renderJobId = result.job_id;
// Poll for completion (simple approach)
pollJobResult(result.job_id);
} catch (e) {
console.error('Render feilet:', e);
rendering = false;
}
}
function pollJobResult(jobId: string) {
// Poll job_queue for completion via simple interval
// In production this would use SSE or WebSocket
const interval = setInterval(async () => {
try {
const res = await fetch(`/api/query/job_status?job_id=${encodeURIComponent(jobId)}`, {
headers: { Authorization: `Bearer ${accessToken}` },
});
if (res.ok) {
const data = await res.json();
if (data.status === 'completed') {
clearInterval(interval);
resultNodeId = data.result?.processed_node_id ?? null;
rendering = false;
} else if (data.status === 'error') {
clearInterval(interval);
rendering = false;
console.error('Jobb feilet:', data.error_msg);
}
}
} catch {
// Ignore polling errors
}
}, 2000);
// Timeout after 5 minutes
setTimeout(() => {
clearInterval(interval);
if (rendering) {
rendering = false;
console.error('Render timeout');
}
}, 5 * 60 * 1000);
}
function handleKeydown(e: KeyboardEvent) {
if (e.target instanceof HTMLInputElement || e.target instanceof HTMLTextAreaElement) return;
if (e.key === 'Delete' || e.key === 'Backspace') {
const region = waveformRef?.getActiveRegion();
if (region) {
e.preventDefault();
handleCut();
}
}
if ((e.ctrlKey || e.metaKey) && e.key === 'z') {
e.preventDefault();
handleUndo();
}
}
async function handleSaveSession() {
if (!accessToken || !casHash || operations.length === 0) return;
saving = true;
try {
const edl: EdlDocument = { source_hash: casHash, operations };
if (sessionNodeId) {
// Update existing session node
const { updateNode } = await import('$lib/api');
await updateNode(accessToken, {
node_id: sessionNodeId,
metadata: { edl },
});
} else {
// Create new session node + has_studio edge
const res = await createNode(accessToken, {
node_kind: 'content',
title: `Studio-sesjon: ${mediaNode?.title ?? 'Ukjent'}`,
visibility: 'hidden',
metadata: { edl },
});
sessionNodeId = res.node_id;
await createEdge(accessToken, {
source_id: res.node_id,
target_id: mediaNodeId,
edge_type: 'has_studio',
});
}
} catch (e) {
console.error('Kunne ikke lagre sesjon:', e);
} finally {
saving = false;
}
}
function handleSegmentClick(segment: Segment) {
waveformRef?.seekTo(segment.start_ms / 1000);
}
</script>
<svelte:window onkeydown={handleKeydown} />
<div class="flex min-h-screen flex-col bg-gray-50">
<!-- Header -->
<header class="flex items-center justify-between border-b border-gray-200 bg-white px-4 py-3">
<div class="flex items-center gap-3">
<a href="/" class="text-gray-400 hover:text-gray-600">
<svg class="h-5 w-5" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
<path stroke-linecap="round" stroke-linejoin="round" d="M10 19l-7-7m0 0l7-7m-7 7h18" />
</svg>
</a>
<h1 class="text-lg font-semibold text-gray-800">
{mediaNode?.title ?? 'Lydstudio'}
</h1>
{#if audioInfo}
<span class="rounded bg-gray-100 px-2 py-0.5 text-xs text-gray-500">
{audioInfo.codec} / {audioInfo.sample_rate}Hz / {audioInfo.channels}ch
</span>
{/if}
</div>
<div class="flex items-center gap-2">
{#if operations.length > 0}
<button
onclick={handleSaveSession}
disabled={saving}
class="rounded bg-gray-100 px-3 py-1.5 text-sm text-gray-700 transition-colors hover:bg-gray-200 disabled:opacity-50"
>
{saving ? 'Lagrer...' : sessionNodeId ? 'Oppdater sesjon' : 'Lagre sesjon'}
</button>
{/if}
</div>
</header>
{#if !audioSrc}
<div class="flex flex-1 items-center justify-center">
<p class="text-gray-400">
{#if !connected}
Kobler til...
{:else if !mediaNode}
Finner ikke medienoden
{:else}
Ingen lydfil tilgjengelig
{/if}
</p>
</div>
{:else}
<div class="flex flex-1 gap-4 p-4">
<!-- Main area: waveform + transcription -->
<div class="flex flex-1 flex-col gap-4">
<StudioWaveform
bind:this={waveformRef}
src={audioSrc}
onready={(d) => { audioInfo = audioInfo ?? { duration_ms: d * 1000, sample_rate: 0, channels: 0, codec: '', format: '', bit_rate: null }; }}
ontimeupdate={(t) => { currentTime = t; }}
/>
<!-- Transcription segments -->
{#if segments.length > 0}
<div class="rounded-lg border border-gray-200 bg-white p-3">
<h3 class="mb-2 text-sm font-semibold text-gray-700">Transkripsjon</h3>
<div class="max-h-64 space-y-0.5 overflow-y-auto">
{#each segments as seg}
{@const active = currentTime >= seg.start_ms / 1000 && currentTime < seg.end_ms / 1000}
<button
onclick={() => handleSegmentClick(seg)}
class="flex w-full items-start gap-2 rounded px-2 py-1 text-left transition-colors
{active ? 'bg-blue-50 text-blue-800' : 'hover:bg-gray-50 text-gray-600'}"
>
<span class="mt-0.5 shrink-0 font-mono text-[10px] text-gray-400">
{Math.floor(seg.start_ms / 60000)}:{Math.floor((seg.start_ms % 60000) / 1000).toString().padStart(2, '0')}
</span>
<span class="text-xs">{seg.content}</span>
</button>
{/each}
</div>
</div>
{/if}
</div>
<!-- Sidebar: operation panel -->
<div class="flex w-72 shrink-0 flex-col gap-4 overflow-y-auto lg:w-80">
<OperationPanel
{loudness}
{silenceRegions}
{audioInfo}
{analyzing}
{operations}
activeRegion={waveformRef?.getActiveRegion() ?? null}
onanalyze={handleAnalyze}
oncut={handleCut}
onaddop={handleAddOp}
onremoveop={handleRemoveOp}
onrender={handleRenderStart}
onundo={handleUndo}
ontrimsilence={handleTrimSilence}
/>
<!-- Version history -->
{#if versions.length > 0}
<div class="rounded-lg border border-gray-200 bg-white p-4">
<h3 class="mb-2 text-sm font-semibold text-gray-700">Versjoner</h3>
<ul class="space-y-1">
<li class="flex items-center justify-between rounded bg-blue-50 px-2 py-1">
<span class="text-xs font-medium text-blue-700">Original</span>
<span class="text-[10px] text-blue-500">Nåværende</span>
</li>
{#each versions as ver, i}
<li class="flex items-center justify-between rounded px-2 py-1 hover:bg-gray-50">
<a href="/studio/{ver.id}" class="text-xs text-gray-600 hover:text-blue-600">
v{i + 1}: {ver.title}
</a>
</li>
{/each}
</ul>
</div>
{/if}
</div>
</div>
{/if}
</div>
<!-- Render dialog -->
{#if showRenderDialog}
<RenderDialog
{operations}
{rendering}
jobId={renderJobId}
{resultNodeId}
onconfirm={handleRenderConfirm}
onclose={() => { showRenderDialog = false; rendering = false; renderJobId = null; resultNodeId = null; }}
/>
{/if}

maskinrommet/src/audio.rs Normal file
@@ -0,0 +1,724 @@
//! Lydstudio — audio processing via an FFmpeg subprocess.
//!
//! Non-destructive editing: the original in CAS is never touched.
//! An EDL (Edit Decision List) describes the operations. On render,
//! ffmpeg is run and the result is stored as a new CAS entry.
use serde::{Deserialize, Serialize};
use sqlx::PgPool;
use uuid::Uuid;
use crate::cas::CasStore;
use crate::jobs::JobRow;
use crate::stdb::StdbClient;
// ─── EDL data structures ─────────────────────────────────────────
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct EdlDocument {
pub source_hash: String,
pub operations: Vec<EdlOperation>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum EdlOperation {
Cut {
start_ms: i64,
end_ms: i64,
},
Normalize {
target_lufs: f64,
},
TrimSilence {
threshold_db: f32,
min_duration_ms: u32,
},
FadeIn {
duration_ms: u32,
},
FadeOut {
duration_ms: u32,
},
NoiseReduction {
strength_db: f32,
},
Equalizer {
low_gain: f32,
mid_gain: f32,
high_gain: f32,
},
Compressor {
threshold_db: f32,
ratio: f32,
},
}
// ─── Analysis results ────────────────────────────────────────────
#[derive(Debug, Serialize, Deserialize)]
pub struct LoudnessInfo {
pub input_i: f64,
pub input_tp: f64,
pub input_lra: f64,
pub input_thresh: f64,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct SilenceRegion {
pub start_ms: i64,
pub end_ms: i64,
pub duration_ms: i64,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct AudioInfo {
pub duration_ms: i64,
pub sample_rate: u32,
pub channels: u32,
pub codec: String,
pub format: String,
pub bit_rate: Option<u64>,
}
#[derive(Debug, Serialize)]
pub struct AnalyzeResult {
pub loudness: LoudnessInfo,
pub silence_regions: Vec<SilenceRegion>,
pub info: AudioInfo,
}
// ─── FFmpeg commands ─────────────────────────────────────────────
/// Fetch metadata about an audio file via ffprobe.
pub async fn get_audio_info(cas: &CasStore, hash: &str) -> Result<AudioInfo, String> {
let path = cas.path_for(hash);
if !path.exists() {
return Err(format!("Filen finnes ikke i CAS: {hash}"));
}
let output = tokio::process::Command::new("ffprobe")
.args([
"-v", "quiet",
"-print_format", "json",
"-show_format",
"-show_streams",
])
.arg(&path)
.output()
.await
.map_err(|e| format!("Kunne ikke kjøre ffprobe: {e}"))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(format!("ffprobe feilet: {stderr}"));
}
let json: serde_json::Value = serde_json::from_slice(&output.stdout)
.map_err(|e| format!("Kunne ikke parse ffprobe-output: {e}"))?;
// Find the first audio stream
let stream = json["streams"]
.as_array()
.and_then(|streams| streams.iter().find(|s| s["codec_type"] == "audio"))
.ok_or("Ingen audio-stream funnet")?;
let format = &json["format"];
let duration_secs: f64 = format["duration"]
.as_str()
.and_then(|s| s.parse().ok())
.unwrap_or(0.0);
Ok(AudioInfo {
duration_ms: (duration_secs * 1000.0) as i64,
sample_rate: stream["sample_rate"]
.as_str()
.and_then(|s| s.parse().ok())
.unwrap_or(44100),
channels: stream["channels"].as_u64().unwrap_or(2) as u32,
codec: stream["codec_name"]
.as_str()
.unwrap_or("unknown")
.to_string(),
format: format["format_name"]
.as_str()
.unwrap_or("unknown")
.to_string(),
bit_rate: format["bit_rate"]
.as_str()
.and_then(|s| s.parse().ok()),
})
}
/// Analyze loudness (EBU R128) via ffmpeg's loudnorm filter.
pub async fn analyze_loudness(cas: &CasStore, hash: &str) -> Result<LoudnessInfo, String> {
let path = cas.path_for(hash);
if !path.exists() {
return Err(format!("Filen finnes ikke i CAS: {hash}"));
}
let output = tokio::process::Command::new("ffmpeg")
.args(["-i"])
.arg(&path)
.args([
"-af", "loudnorm=print_format=json",
"-f", "null", "-",
])
.output()
.await
.map_err(|e| format!("Kunne ikke kjøre ffmpeg loudnorm: {e}"))?;
// loudnorm writes its JSON report to stderr
let stderr = String::from_utf8_lossy(&output.stderr);
// Locate the JSON block in stderr
let json_start = stderr
.find("{\n")
.ok_or("Fant ikke loudnorm JSON i ffmpeg-output")?;
let json_end = stderr[json_start..]
.find("\n}")
.map(|i| json_start + i + 2)
.ok_or("Ufullstendig loudnorm JSON")?;
let json_str = &stderr[json_start..json_end];
let json: serde_json::Value = serde_json::from_str(json_str)
.map_err(|e| format!("Kunne ikke parse loudnorm JSON: {e}\n{json_str}"))?;
Ok(LoudnessInfo {
input_i: parse_loudnorm_field(&json, "input_i")?,
input_tp: parse_loudnorm_field(&json, "input_tp")?,
input_lra: parse_loudnorm_field(&json, "input_lra")?,
input_thresh: parse_loudnorm_field(&json, "input_thresh")?,
})
}
fn parse_loudnorm_field(json: &serde_json::Value, field: &str) -> Result<f64, String> {
json[field]
.as_str()
.and_then(|s| s.parse::<f64>().ok())
.ok_or_else(|| format!("Mangler felt '{field}' i loudnorm-output"))
}
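// ─── Sketch (hypothetical helper, not in this commit) ────────────
// The same stderr JSON scan appears in both `analyze_loudness` and
// `analyze_with_filter`; it could be factored out roughly like this:
#[allow(dead_code)]
fn extract_loudnorm_json(stderr: &str) -> Option<&str> {
    // loudnorm prints a pretty-printed JSON object; locate its braces
    let start = stderr.find("{\n")?;
    let end = stderr[start..].find("\n}").map(|i| start + i + 2)?;
    Some(&stderr[start..end])
}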
/// Detect silent regions in an audio file.
pub async fn detect_silence(
cas: &CasStore,
hash: &str,
threshold_db: f32,
min_duration_ms: u32,
) -> Result<Vec<SilenceRegion>, String> {
let path = cas.path_for(hash);
if !path.exists() {
return Err(format!("Filen finnes ikke i CAS: {hash}"));
}
let min_duration_secs = min_duration_ms as f64 / 1000.0;
let filter = format!("silencedetect=noise={threshold_db}dB:d={min_duration_secs}");
let output = tokio::process::Command::new("ffmpeg")
.args(["-i"])
.arg(&path)
.args(["-af", &filter, "-f", "null", "-"])
.output()
.await
.map_err(|e| format!("Kunne ikke kjøre ffmpeg silencedetect: {e}"))?;
let stderr = String::from_utf8_lossy(&output.stderr);
let mut regions = Vec::new();
let mut current_start: Option<f64> = None;
for line in stderr.lines() {
if let Some(pos) = line.find("silence_start: ") {
let val_str = &line[pos + 15..];
if let Some(secs) = val_str.split_whitespace().next().and_then(|s| s.parse::<f64>().ok()) {
current_start = Some(secs);
}
}
if let Some(pos) = line.find("silence_end: ") {
let val_str = &line[pos + 13..];
if let Some(end_secs) = val_str.split_whitespace().next().and_then(|s| s.parse::<f64>().ok()) {
if let Some(start_secs) = current_start.take() {
regions.push(SilenceRegion {
start_ms: (start_secs * 1000.0) as i64,
end_ms: (end_secs * 1000.0) as i64,
duration_ms: ((end_secs - start_secs) * 1000.0) as i64,
});
}
}
}
}
Ok(regions)
}
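// ─── Sketch (hypothetical helper, not in this commit) ────────────
// The silencedetect line parsing above in isolation: extract the
// float that follows a key such as "silence_start: " in an ffmpeg
// stderr line.
#[allow(dead_code)]
fn parse_detect_value(line: &str, key: &str) -> Option<f64> {
    let pos = line.find(key)?;
    line[pos + key.len()..].split_whitespace().next()?.parse().ok()
}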
// ─── EDL → FFmpeg filter graph ───────────────────────────────────
/// Build an ffmpeg filter graph from the EDL operations.
/// Returns the combined filter string (empty if no filters apply).
///
/// Operation order:
/// 1. Cuts (aselect) — removes regions
/// 2. Trim silence — already converted to cuts by the caller
/// 3. Noise reduction (afftdn)
/// 4. EQ (equalizer)
/// 5. Compressor (acompressor)
/// 6. Normalize (loudnorm) — always last before fades
/// 7. Fades (afade) — applied last
pub fn build_filter_chain(
ops: &[EdlOperation],
duration_ms: i64,
loudness_measured: Option<&LoudnessInfo>,
) -> String {
let mut filters: Vec<String> = Vec::new();
// Collect all cuts (incl. those generated from trim_silence)
let mut cuts: Vec<(i64, i64)> = Vec::new();
for op in ops {
if let EdlOperation::Cut { start_ms, end_ms } = op {
cuts.push((*start_ms, *end_ms));
}
}
// Sort the cuts and build the aselect filter
if !cuts.is_empty() {
cuts.sort_by_key(|c| c.0);
let conditions: Vec<String> = cuts
.iter()
.map(|(s, e)| {
format!(
"between(t,{:.3},{:.3})",
*s as f64 / 1000.0,
*e as f64 / 1000.0
)
})
.collect();
filters.push(format!(
"aselect='not({})',asetpts=N/SR/TB",
conditions.join("+")
));
}
// Noise reduction
for op in ops {
if let EdlOperation::NoiseReduction { strength_db } = op {
filters.push(format!("afftdn=nf={strength_db}"));
}
}
// EQ — three-band parametric
for op in ops {
if let EdlOperation::Equalizer {
low_gain,
mid_gain,
high_gain,
} = op
{
let mut eq_parts = Vec::new();
if *low_gain != 0.0 {
eq_parts.push(format!("equalizer=f=100:t=h:w=200:g={low_gain}"));
}
if *mid_gain != 0.0 {
eq_parts.push(format!("equalizer=f=1000:t=h:w=1000:g={mid_gain}"));
}
if *high_gain != 0.0 {
eq_parts.push(format!("equalizer=f=8000:t=h:w=4000:g={high_gain}"));
}
filters.extend(eq_parts);
}
}
// Compressor
for op in ops {
if let EdlOperation::Compressor {
threshold_db,
ratio,
} = op
{
filters.push(format!(
"acompressor=threshold={threshold_db}dB:ratio={ratio}:attack=5:release=50"
));
}
}
// Normalize (loudnorm) — two-pass when measured values are available
for op in ops {
if let EdlOperation::Normalize { target_lufs } = op {
if let Some(measured) = loudness_measured {
filters.push(format!(
"loudnorm=I={target_lufs}:TP=-1.5:LRA=11:\
measured_I={:.1}:measured_TP={:.1}:measured_LRA={:.1}:\
measured_thresh={:.1}:linear=true",
measured.input_i,
measured.input_tp,
measured.input_lra,
measured.input_thresh,
));
} else {
// Single pass (lower quality, but works)
filters.push(format!("loudnorm=I={target_lufs}:TP=-1.5:LRA=11"));
}
}
}
// Compute the duration after cuts for fade-out positioning
let total_cut_ms: i64 = cuts.iter().map(|(s, e)| e - s).sum();
let effective_duration_ms = duration_ms - total_cut_ms;
// Fades — always last
for op in ops {
match op {
EdlOperation::FadeIn { duration_ms } => {
let d = *duration_ms as f64 / 1000.0;
filters.push(format!("afade=t=in:d={d:.3}"));
}
EdlOperation::FadeOut { duration_ms: dur } => {
let d = *dur as f64 / 1000.0;
let start = (effective_duration_ms as f64 / 1000.0) - d;
if start > 0.0 {
filters.push(format!("afade=t=out:st={start:.3}:d={d:.3}"));
}
}
_ => {}
}
}
filters.join(",")
}
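// ─── Sketch (hypothetical, not in this commit) ───────────────────
// The cut→aselect construction in isolation, mirroring the logic in
// `build_filter_chain` so the string format can be unit-tested
// without running ffmpeg:
#[allow(dead_code)]
fn cuts_to_aselect(mut cuts: Vec<(i64, i64)>) -> String {
    cuts.sort_by_key(|c| c.0);
    let conds: Vec<String> = cuts
        .iter()
        .map(|(s, e)| format!("between(t,{:.3},{:.3})", *s as f64 / 1000.0, *e as f64 / 1000.0))
        .collect();
    // not(...) keeps everything outside the cut regions;
    // asetpts closes the resulting timestamp gaps
    format!("aselect='not({})',asetpts=N/SR/TB", conds.join("+"))
}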
// ─── Processing ──────────────────────────────────────────────────
/// Run ffmpeg with the EDL operations and store the result in CAS.
pub async fn process_audio(
cas: &CasStore,
edl: &EdlDocument,
output_format: &str,
) -> Result<(String, u64), String> {
let source_path = cas.path_for(&edl.source_hash);
if !source_path.exists() {
return Err(format!("Kildefil finnes ikke i CAS: {}", edl.source_hash));
}
// Fetch info for the fade-out calculation
let info = get_audio_info(cas, &edl.source_hash).await?;
// Check whether two-pass loudnorm is needed
let has_normalize = edl.operations.iter().any(|op| matches!(op, EdlOperation::Normalize { .. }));
let loudness_measured = if has_normalize {
// Run silence detection for trim_silence operations
let silence_cuts = resolve_silence_cuts(cas, edl).await?;
// Build a temporary EDL without normalize for pass 1
let mut pass1_ops: Vec<EdlOperation> = edl.operations.clone();
pass1_ops.retain(|op| !matches!(op, EdlOperation::Normalize { .. }));
pass1_ops.extend(silence_cuts.iter().cloned());
let pass1_filter = build_filter_chain(&pass1_ops, info.duration_ms, None);
// Pass 1: measure loudness after the other filters are applied
let measured = if pass1_filter.is_empty() {
analyze_loudness(cas, &edl.source_hash).await?
} else {
analyze_with_filter(cas, &edl.source_hash, &pass1_filter).await?
};
Some(measured)
} else {
None
};
// Resolve trim_silence into actual cuts (re-runs detection; could be cached from pass 1)
let silence_cuts = resolve_silence_cuts(cas, edl).await?;
let mut all_ops = edl.operations.clone();
// Drop the TrimSilence ops and add the generated cuts
all_ops.retain(|op| !matches!(op, EdlOperation::TrimSilence { .. }));
all_ops.extend(silence_cuts);
let filter = build_filter_chain(&all_ops, info.duration_ms, loudness_measured.as_ref());
if filter.is_empty() {
return Err("Ingen operasjoner å utføre".to_string());
}
// Pick the output codec based on format
let codec_args = match output_format {
"mp3" => vec!["-codec:a", "libmp3lame", "-q:a", "2"],
"wav" => vec!["-codec:a", "pcm_s16le"],
"flac" => vec!["-codec:a", "flac"],
"ogg" => vec!["-codec:a", "libvorbis", "-q:a", "6"],
_ => vec!["-codec:a", "libmp3lame", "-q:a", "2"], // default: mp3
};
let ext = match output_format {
"wav" => "wav",
"flac" => "flac",
"ogg" => "ogg",
_ => "mp3",
};
// Write the output to a temp file
let tmp_dir = cas.root().join("tmp");
tokio::fs::create_dir_all(&tmp_dir)
.await
.map_err(|e| format!("Kunne ikke opprette tmp-katalog: {e}"))?;
let tmp_output = tmp_dir.join(format!("audio_process_{}.{ext}", Uuid::now_v7()));
let mut cmd = tokio::process::Command::new("ffmpeg");
cmd.args(["-i"])
.arg(&source_path)
.args(["-af", &filter])
.args(&codec_args)
.args(["-y"])
.arg(&tmp_output);
tracing::info!(
source = %edl.source_hash,
filter = %filter,
output = %tmp_output.display(),
"Kjører ffmpeg audio processing"
);
let output = cmd
.output()
.await
.map_err(|e| format!("Kunne ikke kjøre ffmpeg: {e}"))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
// Clean up the temp file
let _ = tokio::fs::remove_file(&tmp_output).await;
return Err(format!("ffmpeg feilet: {stderr}"));
}
// Read the result and store it in CAS
let result_bytes = tokio::fs::read(&tmp_output)
.await
.map_err(|e| format!("Kunne ikke lese ffmpeg-output: {e}"))?;
let _ = tokio::fs::remove_file(&tmp_output).await;
let store_result = cas
.store(&result_bytes)
.await
.map_err(|e| format!("Kunne ikke lagre i CAS: {e}"))?;
tracing::info!(
source = %edl.source_hash,
result = %store_result.hash,
size = store_result.size,
"Audio processing fullført"
);
Ok((store_result.hash, store_result.size))
}
/// Run loudnorm analysis with a pre-filter applied (for two-pass normalization).
async fn analyze_with_filter(
cas: &CasStore,
hash: &str,
pre_filter: &str,
) -> Result<LoudnessInfo, String> {
let path = cas.path_for(hash);
let filter = format!("{pre_filter},loudnorm=print_format=json");
let output = tokio::process::Command::new("ffmpeg")
.args(["-i"])
.arg(&path)
.args(["-af", &filter, "-f", "null", "-"])
.output()
.await
.map_err(|e| format!("Kunne ikke kjøre ffmpeg loudnorm pass 1: {e}"))?;
let stderr = String::from_utf8_lossy(&output.stderr);
let json_start = stderr
.find("{\n")
.ok_or("Fant ikke loudnorm JSON i pass 1")?;
let json_end = stderr[json_start..]
.find("\n}")
.map(|i| json_start + i + 2)
.ok_or("Ufullstendig loudnorm JSON i pass 1")?;
let json: serde_json::Value = serde_json::from_str(&stderr[json_start..json_end])
.map_err(|e| format!("Kunne ikke parse loudnorm pass 1: {e}"))?;
Ok(LoudnessInfo {
input_i: parse_loudnorm_field(&json, "input_i")?,
input_tp: parse_loudnorm_field(&json, "input_tp")?,
input_lra: parse_loudnorm_field(&json, "input_lra")?,
input_thresh: parse_loudnorm_field(&json, "input_thresh")?,
})
}
/// Convert TrimSilence operations into actual Cut operations
/// by running silence detection.
async fn resolve_silence_cuts(
cas: &CasStore,
edl: &EdlDocument,
) -> Result<Vec<EdlOperation>, String> {
let mut cuts = Vec::new();
for op in &edl.operations {
if let EdlOperation::TrimSilence {
threshold_db,
min_duration_ms,
} = op
{
let regions = detect_silence(cas, &edl.source_hash, *threshold_db, *min_duration_ms).await?;
for region in regions {
// Keep 200 ms of silence on each side for a natural sound
let margin_ms = 200;
let start = region.start_ms + margin_ms;
let end = region.end_ms - margin_ms;
if end > start {
cuts.push(EdlOperation::Cut {
start_ms: start,
end_ms: end,
});
}
}
}
}
Ok(cuts)
}
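// ─── Sketch (hypothetical helper, not in this commit) ────────────
// The margin rule above in isolation: a detected silence region is
// shrunk by a margin on each side, and regions shorter than 2×margin
// yield no cut at all.
#[allow(dead_code)]
fn shrink_silence(start_ms: i64, end_ms: i64, margin_ms: i64) -> Option<(i64, i64)> {
    let (s, e) = (start_ms + margin_ms, end_ms - margin_ms);
    (e > s).then_some((s, e))
}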
// ─── Job handler ─────────────────────────────────────────────────
/// Handles `audio_process` jobs from the job queue.
///
/// Payload:
/// ```json
/// {
/// "media_node_id": "uuid",
/// "edl": { "source_hash": "...", "operations": [...] },
/// "output_format": "mp3",
/// "requested_by": "uuid"
/// }
/// ```
pub async fn handle_audio_process_job(
job: &JobRow,
db: &PgPool,
stdb: &StdbClient,
cas: &CasStore,
) -> Result<serde_json::Value, String> {
let media_node_id: Uuid = job.payload["media_node_id"]
.as_str()
.and_then(|s| s.parse().ok())
.ok_or("Mangler media_node_id i payload")?;
let edl: EdlDocument = serde_json::from_value(job.payload["edl"].clone())
.map_err(|e| format!("Ugyldig EDL i payload: {e}"))?;
let output_format = job.payload["output_format"]
.as_str()
.unwrap_or("mp3");
let requested_by: Uuid = job.payload["requested_by"]
.as_str()
.and_then(|s| s.parse().ok())
.ok_or("Mangler requested_by i payload")?;
// Run the processing
let (result_hash, result_size) = process_audio(cas, &edl, output_format).await?;
// Pick the MIME type
let mime = match output_format {
"mp3" => "audio/mpeg",
"wav" => "audio/wav",
"flac" => "audio/flac",
"ogg" => "audio/ogg",
_ => "audio/mpeg",
};
// Create a new media node for the processed file
let processed_node_id = Uuid::now_v7();
let metadata = serde_json::json!({
"cas_hash": result_hash,
"mime": mime,
"size_bytes": result_size,
"source_hash": edl.source_hash,
"edl": edl,
});
// Fetch the title from the original node
let original_title: Option<String> = sqlx::query_scalar(
"SELECT title FROM nodes WHERE id = $1"
)
.bind(media_node_id)
.fetch_optional(db)
.await
.map_err(|e| format!("DB-feil: {e}"))?
.flatten();
let title = original_title
.map(|t| format!("{t} (prosessert)"))
.unwrap_or_else(|| "Prosessert lyd".to_string());
// Insert processed media node
sqlx::query(
r#"
INSERT INTO nodes (id, node_kind, title, visibility, metadata, created_by)
VALUES ($1, 'media', $2, 'hidden', $3, $4)
"#,
)
.bind(processed_node_id)
.bind(&title)
.bind(&metadata)
.bind(requested_by)
.execute(db)
.await
.map_err(|e| format!("Kunne ikke opprette prosessert node: {e}"))?;
// Create the derived_from edge: processed → original
let edge_id = Uuid::now_v7();
sqlx::query(
r#"
INSERT INTO edges (id, source_id, target_id, edge_type, system, created_by)
VALUES ($1, $2, $3, 'derived_from', true, $4)
"#,
)
.bind(edge_id)
.bind(processed_node_id)
.bind(media_node_id)
.bind(requested_by)
.execute(db)
.await
.map_err(|e| format!("Kunne ikke opprette derived_from edge: {e}"))?;
// Sync to SpacetimeDB
let metadata_str = serde_json::to_string(&metadata).unwrap_or_default();
let _ = stdb
.create_node(
&processed_node_id.to_string(),
"media",
&title,
"",
"hidden",
&metadata_str,
&requested_by.to_string(),
)
.await;
let _ = stdb
.create_edge(
&edge_id.to_string(),
&processed_node_id.to_string(),
&media_node_id.to_string(),
"derived_from",
"{}",
true,
&requested_by.to_string(),
)
.await;
tracing::info!(
original = %media_node_id,
processed = %processed_node_id,
hash = %result_hash,
"Audio process-jobb fullført"
);
Ok(serde_json::json!({
"processed_node_id": processed_node_id.to_string(),
"cas_hash": result_hash,
"size_bytes": result_size,
}))
}

@@ -36,7 +36,7 @@ const VALID_TRAITS: &[&str] = &[
// Publisering & distribusjon
"publishing", "rss", "newsletter", "custom_domain", "analytics", "embed", "api",
// Lyd & video
"podcast", "recording", "transcription", "tts", "clips", "playlist",
"podcast", "recording", "transcription", "tts", "clips", "playlist", "studio",
// Kommunikasjon
"chat", "forum", "comments", "guest_input", "announcements", "polls", "qa",
// Organisering
@@ -2783,6 +2783,131 @@
}))
}
// =============================================================================
// Lydstudio
// =============================================================================
#[derive(Deserialize)]
pub struct AudioAnalyzeRequest {
pub cas_hash: String,
pub silence_threshold_db: Option<f32>,
pub silence_min_duration_ms: Option<u32>,
}
/// POST /intentions/audio_analyze
///
/// Synchronous analysis of an audio file: loudness (LUFS), silence regions, and metadata.
/// Used by the studio to show the current state before editing.
pub async fn audio_analyze(
State(state): State<AppState>,
_user: AuthUser,
Json(req): Json<AudioAnalyzeRequest>,
) -> Result<Json<crate::audio::AnalyzeResult>, (StatusCode, Json<ErrorResponse>)> {
let cas = &state.cas;
if !cas.exists(&req.cas_hash) {
return Err(bad_request("Filen finnes ikke i CAS"));
}
let info = crate::audio::get_audio_info(cas, &req.cas_hash)
.await
.map_err(|e| internal_error(&e))?;
let loudness = crate::audio::analyze_loudness(cas, &req.cas_hash)
.await
.map_err(|e| internal_error(&e))?;
let threshold = req.silence_threshold_db.unwrap_or(-30.0);
let min_dur = req.silence_min_duration_ms.unwrap_or(500);
let silence_regions = crate::audio::detect_silence(cas, &req.cas_hash, threshold, min_dur)
.await
.map_err(|e| internal_error(&e))?;
Ok(Json(crate::audio::AnalyzeResult {
loudness,
silence_regions,
info,
}))
}
#[derive(Deserialize)]
pub struct AudioProcessRequest {
pub media_node_id: Uuid,
pub edl: crate::audio::EdlDocument,
pub output_format: Option<String>,
}
#[derive(Serialize)]
pub struct AudioProcessResponse {
pub job_id: Uuid,
}
/// POST /intentions/audio_process
///
/// Enqueues an audio processing job. The result becomes a new media node
/// with a derived_from edge to the original.
pub async fn audio_process(
State(state): State<AppState>,
user: AuthUser,
Json(req): Json<AudioProcessRequest>,
) -> Result<Json<AudioProcessResponse>, (StatusCode, Json<ErrorResponse>)> {
// Check that the media node exists
if !node_exists(&state.db, req.media_node_id).await.map_err(|e| {
tracing::error!("DB-feil: {e}");
internal_error("Databasefeil")
})? {
return Err(bad_request("media_node_id finnes ikke"));
}
// Check that the source file exists in CAS
if !state.cas.exists(&req.edl.source_hash) {
return Err(bad_request("source_hash finnes ikke i CAS"));
}
let output_format = req.output_format.unwrap_or_else(|| "mp3".to_string());
let payload = serde_json::json!({
"media_node_id": req.media_node_id.to_string(),
"edl": req.edl,
"output_format": output_format,
"requested_by": user.node_id.to_string(),
});
let job_id = crate::jobs::enqueue(&state.db, "audio_process", payload, None, 5)
.await
.map_err(|e| {
tracing::error!("Kunne ikke køe audio_process-jobb: {e}");
internal_error("Kunne ikke køe jobb")
})?;
Ok(Json(AudioProcessResponse { job_id }))
}
#[derive(Deserialize)]
pub struct AudioInfoQuery {
pub hash: String,
}
/// GET /query/audio_info?hash=...
///
/// Fetch metadata about an audio file (duration, sample rate, channels, codec).
pub async fn audio_info(
State(state): State<AppState>,
_user: AuthUser,
axum::extract::Query(query): axum::extract::Query<AudioInfoQuery>,
) -> Result<Json<crate::audio::AudioInfo>, (StatusCode, Json<ErrorResponse>)> {
if !state.cas.exists(&query.hash) {
return Err(bad_request("Filen finnes ikke i CAS"));
}
let info = crate::audio::get_audio_info(&state.cas, &query.hash)
.await
.map_err(|e| internal_error(&e))?;
Ok(Json(info))
}
// =============================================================================
// Tests
// =============================================================================
@@ -2842,7 +2967,7 @@ mod tests {
let all_traits = vec![
"editor", "versioning", "collaboration", "translation", "templates",
"publishing", "rss", "newsletter", "custom_domain", "analytics", "embed", "api",
"podcast", "recording", "transcription", "tts", "clips", "playlist",
"podcast", "recording", "transcription", "tts", "clips", "playlist", "studio",
"chat", "forum", "comments", "guest_input", "announcements", "polls", "qa",
"kanban", "calendar", "timeline", "table", "gallery", "bookmarks", "tags",
"knowledge_graph", "wiki", "glossary", "faq", "bibliography",

@@ -10,6 +10,7 @@ use uuid::Uuid;
use crate::agent;
use crate::ai_edges;
use crate::audio;
use crate::cas::CasStore;
use crate::stdb::StdbClient;
use crate::summarize;
@@ -167,6 +168,9 @@ async fn dispatch(
"tts_generate" => {
tts::handle_tts_job(job, db, stdb, cas).await
}
"audio_process" => {
audio::handle_audio_process_job(job, db, stdb, cas).await
}
other => Err(format!("Ukjent jobbtype: {other}")),
}
}