Advised on transcription model choices for clean and noisy audio on Apple Silicon.
Confirmed profile creation methods and chose direct SQLite insertion.
Created two transcription profiles in the database.
Reviewed repo structure for utility implementation.
Discussed options for seeding profiles via CLI or standalone binary.
help me choose best model for transcription
It depends on what you're transcribing. Quick guide for Scriberr's four families:
large-v3: best default. Strongest multilingual (~99 langs), works well with diarization (the path you just restored), handles noisy/accented audio. Needs ~10GB VRAM; slower. Drop to medium or small if you're CPU-only or English-only and want speed.
Two quick questions to narrow it down:
Tell me those and I'll point at one.
Good — on Apple Silicon, WhisperX is your only realistic option (Parakeet/Canary are CUDA-only; Voxtral is too heavy without GPU). The good news: WhisperX runs fine via faster-whisper on CPU/Accelerate. Here are two profiles I'd start with:
For interviews on a good mic, podcasts, single speaker, low background noise.
Model: medium.en (English) or medium (multilingual). large-v3 is overkill for clean audio and ~3x slower.
Device: cpu
Compute type: int8. Big speedup on Apple Silicon; accuracy loss is negligible on clean speech.
Batch size: 8
VAD: silero (faster, fine for clean audio)
Chunk size: 30
For phone calls, field recordings, overlapping speakers, accents, background noise.
Model: large-v3
Device: cpu
Compute type: float32 (accuracy matters more than speed here)
Batch size: 4 (lower: larger model, keep memory steady)
VAD: pyannote (better at separating speech from noise)
Chunk size: 20 (shorter chunks reduce hallucination drift in tough audio)
If large-v3 feels too slow, large-v2 is often nearly as accurate and a touch faster. large-v3-turbo is another option if it's exposed in your model list (not listed in TranscriptionConfigDialog.tsx:136).
Want me to look at TranscriptionConfigDialog.tsx and tell you exactly which dropdown values to pick, or open the profile creation flow?
can you create these profiles directly via code?
Let me investigate how profiles are stored before answering.
Yes, two ways — both feasible. Want me to confirm which before I touch anything?
Option A: Direct SQLite insert into data/scriberr.db (table transcription_profiles).
Bypasses the BeforeCreate/BeforeSave hooks. Safe here since neither profile is is_default (you already have "Default") and I'll generate UUIDs manually.
Option B: HTTP POST to http://localhost:8080/api/v1/profiles with curl.
I'd lean Option A unless you want to validate the API path. Confirm which, and I'll create both profiles with the params I recommended (clean = medium + int8 + silero; noisy = large-v3 + float32 + pyannote + diarize).
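Since Option A skips the hooks, the row IDs have to be produced by hand. Here is a minimal sketch of one way to do that in Go using only the standard library. It assumes the id column holds a text UUID and that a random version-4 UUID is acceptable; neither is confirmed by the schema here, and newUUIDv4 is a hypothetical helper name.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 returns a random version-4 UUID in canonical 8-4-4-4-12 form.
// Hypothetical helper: the project may generate IDs differently in its hooks.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	// One ID per profile that would be inserted by hand.
	for _, name := range []string{"Clean audio — fast", "Noisy audio — accurate"} {
		id, err := newUUIDv4()
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s  %s\n", id, name)
	}
}
```

The printed IDs would then be pasted into the INSERT statements for transcription_profiles.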
a
Now creating the two profiles.
Both profiles are in. Verified:
| Profile | Model | Compute | VAD | Chunk | Batch | Diarize |
|---|---|---|---|---|---|---|
| Clean audio — fast | medium | int8 | silero | 30 | 8 | off |
| Noisy audio — accurate | large-v3 | float32 | pyannote | 20 | 4 | on |
Notes:
On first use, the Noisy profile will download large-v3 (~3GB) and the pyannote diarization model. The Clean profile will download medium (~1.5GB). Both cache after first use.
If you want to tweak language-specific behavior (e.g. force medium.en for English-only on the clean profile, or set language=en on noisy to skip auto-detect and shave a bit of time), just say which.
can you create a utility that does this to the repo?
Let me check the existing repo structure first so the utility fits the conventions.
Before I write, three real shape options — they have different blast radius:
A. New subcommand on existing scriberr-cli (e.g. scriberr profiles seed-presets)
internal/cli/login.go)
B. New standalone binary cmd/seed-profiles/main.go that opens the DB via database.Initialize + GORM repository
seedDefaultProfile pattern in internal/database/database.go:115
FindByName
C. SQL seed file committed to e.g. data/seeds/profiles.sql, run via make seed-profiles
My recommendation: B. It's the right idiom for this codebase (you already have seedDefaultProfile), works in fresh installs without auth setup, stays idempotent, and reuses models.TranscriptionProfile so the schema can never drift from the seed.
Scope: hard-code the two Apple-Silicon presets (Clean / Noisy), idempotent (skip by name), --force flag to overwrite. Add a make seed-profiles target.
Confirm B and I'll build it. Or pick A/C and I'll build that instead.
should the existing seed code be refactored? what's the cleanest solution?
Good question — and yes, it's worth refactoring. The cleaner shape removes the need for a separate CLI utility entirely.
Today seedDefaultProfile (database.go:115) is:
database.go
count > 0 (i.e. only seeds an empty DB)
That count > 0 check is the real problem. It means: once any profile exists, no built-in profile can ever be added again. So if you ship a new preset in v1.2, no existing install will ever see it.
Move presets to a name-keyed registry, idempotent by name, runs on every startup:
internal/database/
  database.go        // calls ensureBuiltInProfiles(db)
  profile_seeds.go   // []models.TranscriptionProfile registry + ensure fn
If a user deliberately deletes a built-in profile, it comes back on the next startup. Two ways to handle that:
seeded_builtin_profiles(name TEXT PRIMARY KEY)): only seed each name once, ever. Cleaner UX, but ~15 extra lines and a new table for a corner case.
I'd ship #1 and only move to #2 if it actually annoys someone.
The "create a utility" task collapses into: refactor seedDefaultProfile → ensureBuiltInProfiles, add the two new presets to the slice, delete the count > 0 check. ~40 lines of diff, no new files, no new binary.
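To make the "idempotent by name, runs on every startup" idea concrete, here is a rough sketch with an in-memory map standing in for the database. Profile is a hypothetical stand-in for models.TranscriptionProfile, only a couple of fields are shown, and the real function would go through GORM rather than a map.

```go
package main

import "fmt"

// Profile is a hypothetical, trimmed stand-in for models.TranscriptionProfile.
type Profile struct {
	Name  string
	Model string
}

// builtInProfiles is the name-keyed preset registry. The Default entry's
// other fields are omitted because its settings are not shown here.
var builtInProfiles = []Profile{
	{Name: "Default"},
	{Name: "Clean audio — fast", Model: "medium"},
	{Name: "Noisy audio — accurate", Model: "large-v3"},
}

// ensureBuiltInProfiles inserts every preset whose Name is missing from
// existing and returns the names it added. There is no "count > 0" gate,
// so a preset added in a later release still reaches populated installs.
func ensureBuiltInProfiles(existing map[string]Profile, presets []Profile) []string {
	var added []string
	for _, p := range presets {
		if _, ok := existing[p.Name]; ok {
			continue // already present: never overwrite user edits
		}
		existing[p.Name] = p
		added = append(added, p.Name)
	}
	return added
}

func main() {
	// An install that already holds one user-created profile.
	db := map[string]Profile{"whisper": {Name: "whisper"}}

	fmt.Println(ensureBuiltInProfiles(db, builtInProfiles)) // first startup
	fmt.Println(ensureBuiltInProfiles(db, builtInProfiles)) // second startup
}
```

The first call adds all three presets next to the existing profile; the second adds nothing, which is the property that makes it safe to run on every startup.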
Want me to implement it?
yes
Let me check the model field details before writing.
Now update database.go to call the new function and remove the old one.
Build passes. Let me run go vet and the relevant tests.
Let me show the final shape of the change.
Done. Build and vet are clean.
internal/database/profile_seeds.go (~115 lines): preset registry + ensureBuiltInProfiles
internal/database/database.go: seedDefaultProfile (52 lines) replaced with one call to ensureBuiltInProfiles. Net: −50 lines
Default, Clean audio — fast, Noisy audio — accurate) and inserts only those whose Name is missing.
IsDefault: true on Default only takes effect when no profile in the DB is currently marked default; otherwise it's downgraded to non-default so existing user defaults are never overridden.
The count > 0 check is gone, so future presets you add to the slice will land on existing installs automatically.
You currently have three profiles, none of which are marked default:
whisper                   is_default=0
Clean audio — fast        is_default=0
Noisy audio — accurate    is_default=0
On your next server restart, the seeder will add a fresh Default profile (IsDefault=true) because no profile is currently default. You'll end up with four profiles. If you'd rather have whisper be the default, set it in Settings before restarting (or UPDATE transcription_profiles SET is_default=1 WHERE name='whisper';) — the seeder will then see a default exists and skip creating its own.
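The default-handling rule can be sketched the same way. This is an illustration only: it follows the "downgraded to non-default" description, uses a slice instead of the database, and Profile and seedWithDefaultRule are hypothetical names. Whether the committed seeder inserts a non-default Default or skips it entirely when a default already exists is not confirmed by this sketch.

```go
package main

import "fmt"

// Profile is a hypothetical stand-in carrying only the fields the rule needs.
type Profile struct {
	Name      string
	IsDefault bool
}

// seedWithDefaultRule appends presets whose Name is missing from stored.
// A preset's IsDefault flag is honored only when no profile is default yet;
// otherwise the preset is inserted as non-default.
func seedWithDefaultRule(stored, presets []Profile) []Profile {
	names := make(map[string]bool, len(stored))
	defaultExists := false
	for _, p := range stored {
		names[p.Name] = true
		defaultExists = defaultExists || p.IsDefault
	}
	for _, p := range presets {
		if names[p.Name] {
			continue // idempotent by name
		}
		if p.IsDefault && defaultExists {
			p.IsDefault = false // never override the user's chosen default
		}
		defaultExists = defaultExists || p.IsDefault
		names[p.Name] = true
		stored = append(stored, p)
	}
	return stored
}

func main() {
	presets := []Profile{
		{Name: "Default", IsDefault: true},
		{Name: "Clean audio — fast"},
		{Name: "Noisy audio — accurate"},
	}

	// Current state: three profiles, none marked default.
	now := []Profile{{Name: "whisper"}, {Name: "Clean audio — fast"}, {Name: "Noisy audio — accurate"}}
	fmt.Println(seedWithDefaultRule(now, presets))

	// Same state after marking whisper as default before the restart.
	fixed := []Profile{{Name: "whisper", IsDefault: true}, {Name: "Clean audio — fast"}, {Name: "Noisy audio — accurate"}}
	fmt.Println(seedWithDefaultRule(fixed, presets))
}
```

In the first case the seeded Default keeps IsDefault=true; in the second, whisper stays the only default.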
Want me to commit this?
yes
Committed as 55166dd (2 commits ahead of origin/main, not pushed). Working tree clean.