Voice MD

Record your voice, get markdown. Powered by OpenAI's latest audio models — works on desktop and mobile.

Transcribe directly into your active note, or use meeting mode to identify speakers in conversations. Optionally run GPT post-processing to turn raw transcripts into structured notes with headings, lists, and paragraphs.

Quick start

1. Install from Obsidian Community Plugins (search "Voice MD")
2. Add your OpenAI API key in Settings → Voice MD
3. Click the microphone icon in the ribbon or run "Start voice recording" from the command palette
4. Speak. Stop. Text appears at your cursor.

Features

Voice-to-markdown

Record audio and get a transcription inserted directly at your cursor position. Uses gpt-4o-mini-transcribe — fast, accurate, and supports automatic language detection. You can also force a specific language (en, de, fr, etc.) in settings.
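
As a rough illustration of the request behind this feature, the sketch below builds the multipart form for OpenAI's audio transcription endpoint (`POST /v1/audio/transcriptions`) and reads the `text` field from the JSON response. The helper names `buildTranscriptionForm` and `transcribe` and the upload file name are hypothetical, and the plugin's own code may be organized differently. The `language` field is only sent when a language code is forced.

```typescript
// Hypothetical helpers, not taken from the plugin source.
const TRANSCRIBE_MODEL = "gpt-4o-mini-transcribe";

// Build the multipart form. The language field is omitted for auto-detect.
function buildTranscriptionForm(audio: Blob, language?: string): FormData {
  const form = new FormData();
  form.append("file", audio, "recording.webm"); // file name is an assumption
  form.append("model", TRANSCRIBE_MODEL);
  const lang = language?.trim();
  if (lang) form.append("language", lang);
  return form;
}

// Send the recording to OpenAI and return the transcribed text.
async function transcribe(apiKey: string, audio: Blob, language?: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, language),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: HTTP ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Leaving `language` undefined (or blank) corresponds to the auto-detect behaviour described above.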

Meeting mode

Enable the "Meeting Mode" checkbox in the recording modal to identify speakers automatically. Uses gpt-4o-transcribe-diarize — best with 2–6 speakers and recordings over 30 seconds.

Output looks like this:

**Speaker A:** Let's review the Q3 numbers.

**Speaker B:** Revenue was up 12%, mostly driven by the enterprise segment.

**Speaker A:** What about churn?
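
One way to produce output in this shape is sketched below. It assumes the diarized transcript arrives as a list of segments with a speaker label and text; the `Segment` type and `formatSpeakerTurns` name are hypothetical, and the real response from the diarization model may be shaped differently.

```typescript
// Hypothetical segment shape; the real API response may differ.
interface Segment {
  speaker: string; // e.g. "A"
  text: string;
}

// Merge consecutive segments from the same speaker, then render one
// bold-labelled paragraph per speaker turn.
function formatSpeakerTurns(segments: Segment[]): string {
  const turns: Segment[] = [];
  for (const seg of segments) {
    const text = seg.text.trim();
    if (!text) continue; // skip empty segments
    const last = turns[turns.length - 1];
    if (last && last.speaker === seg.speaker) {
      last.text += " " + text;
    } else {
      turns.push({ speaker: seg.speaker, text });
    }
  }
  return turns.map((t) => `**Speaker ${t.speaker}:** ${t.text}`).join("\n\n");
}
```

Merging adjacent segments keeps a single speaker's continuous speech in one paragraph instead of repeating the label for every segment.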

Post-processing (smart formatting)

Enable the "Post-Processing" checkbox in the recording modal to run GPT formatting on your transcript. This creates two files in a Voice Transcriptions/ folder:

- `transcription-2025-05-13-143022-raw.md`: verbatim transcript
- `transcription-2025-05-13-143022.md`: formatted version with headings, lists, and structure

The formatted file links back to the raw version so you never lose the original. You can configure the GPT model (gpt-4o-mini, gpt-4o, etc.) and write a custom formatting prompt for your use case.
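
The naming scheme can be sketched as follows. This is a minimal illustration that follows the file name pattern shown above; the function name `transcriptPaths`, the use of local time, and the wikilink style for the link back to the raw file are assumptions.

```typescript
// Zero-pad a number to two digits.
const pad = (n: number): string => String(n).padStart(2, "0");

// Build the raw and formatted file paths for one recording.
// Folder and name pattern follow the README; local time is an assumption.
function transcriptPaths(now: Date, folder = "Voice Transcriptions") {
  const stamp =
    `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())}` +
    `-${pad(now.getHours())}${pad(now.getMinutes())}${pad(now.getSeconds())}`;
  const base = `transcription-${stamp}`;
  return {
    raw: `${folder}/${base}-raw.md`,
    formatted: `${folder}/${base}.md`,
    // Link from the formatted note back to the raw one (link style assumed).
    backlink: `[[${base}-raw]]`,
  };
}
```

Because both paths share one timestamp, the raw and formatted files for a recording sort next to each other in the folder.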

Both checkboxes (Meeting Mode and Post-Processing) start from your global settings, and any change you make in the recording modal is saved back to them, so your last choice carries over to the next recording.

Mobile support

Works on iOS and Android. Audio format detection adapts to your platform (WebM on desktop, MP4 on iOS, OGG/WAV as fallback).
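
Format detection of this kind is usually done by probing a list of candidate MIME types. The sketch below is one possible approach, not the plugin's actual code: the candidate list, its order, and the name `pickMimeType` are assumptions. The support check is passed in as a function so the logic can be exercised outside a browser.

```typescript
// Candidate formats in preference order. The order is an assumption,
// chosen to match the platform notes above.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm",
  "audio/mp4",
  "audio/ogg",
  "audio/wav",
];

// Return the first candidate the runtime reports as recordable,
// or undefined so the caller can fall back to the recorder's default.
function pickMimeType(isSupported: (mime: string) => boolean): string | undefined {
  return CANDIDATE_MIME_TYPES.find((mime) => isSupported(mime));
}
```

In Obsidian or a browser, the check would typically be `(mime) => MediaRecorder.isTypeSupported(mime)`, using the standard `MediaRecorder.isTypeSupported` static method.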

Settings

Settings → Voice MD

| Setting | Description | Default |
| --- | --- | --- |
| OpenAI API key | Required. Get one from your OpenAI account | — |
| Max recording duration | Maximum length of a recording, in seconds | 300 |
| Auto-start recording | Start recording immediately when the modal opens | Off |
| Language | Force a language code, or leave blank for auto-detect | Auto |
| Enable post-processing | Default for the post-processing checkbox | Off |
| Chat model | GPT model used for post-processing | `gpt-4o-mini` |
| Custom formatting prompt | Override the default formatting instructions | — |
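
For readers who think in code, the defaults listed here could be modelled roughly as below. The field names, the `VoiceMdSettings` type, and the `resolveSettings` helper are hypothetical; only the default values come from the table.

```typescript
// Field names are hypothetical; default values mirror the settings table.
interface VoiceMdSettings {
  openaiApiKey: string;
  maxRecordingSeconds: number;
  autoStartRecording: boolean;
  language: string; // empty string means auto-detect
  enablePostProcessing: boolean;
  chatModel: string;
  customPrompt: string; // empty string means use the built-in prompt
}

const DEFAULT_SETTINGS: VoiceMdSettings = {
  openaiApiKey: "",
  maxRecordingSeconds: 300,
  autoStartRecording: false,
  language: "",
  enablePostProcessing: false,
  chatModel: "gpt-4o-mini",
  customPrompt: "",
};

// Merge saved values over the defaults so missing keys still get a value.
function resolveSettings(saved: Partial<VoiceMdSettings> | null): VoiceMdSettings {
  return { ...DEFAULT_SETTINGS, ...(saved ?? {}) };
}
```

Merging saved data over a defaults object is a common pattern in Obsidian plugins, since it keeps older saved settings valid when new options are added.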

Installation

Community plugins (recommended)

1. Settings → Community plugins → Browse
2. Search "Voice MD"
3. Install and enable

Manual

1. Download the latest GitHub release
2. Copy `main.js`, `manifest.json`, and `styles.css` to `<vault>/.obsidian/plugins/voice-md/`
3. Enable in Settings → Community plugins

Beta (BRAT)

1. Install the BRAT plugin
2. BRAT settings → Add Beta plugin → `DenizOkcu/voice-md`

Privacy

- Audio is sent to OpenAI for transcription only; it is never stored on disk or written to your vault
- Post-processing, if enabled, sends the transcript text to OpenAI
- Your API key stays in local Obsidian storage
- No telemetry, no tracking, and no third-party services other than OpenAI

Troubleshooting

| Problem | Fix |
| --- | --- |
| "No API key" error | Add your OpenAI API key in settings |
| Recording won't start | Grant microphone permission to Obsidian in your OS settings |
| Transcription fails | Check that your API key is valid and your OpenAI account has credits |
| No speaker labels | Meeting mode works best with 2–6 speakers and recordings over 30 seconds |

For anything else, open an issue.

License

MIT
