Note Reader CosyVoice

by Note Reader CosyVoice Contributors
5
4
3
2
1
New Plugin

Description

Read Obsidian notes through the local CosyVoice TTS pipeline. - This plugin has not been manually reviewed by Obsidian staff.

Reviews

No reviews yet.

Stats

stars
downloads
0
forks
0
days
NaN
days
NaN
days
0
total PRs
0
open PRs
0
closed PRs
0
merged PRs
0
total issues
0
open issues
0
closed issues
0
commits

Latest Version

Invalid date

Changelog

README file from

Github

Note Reader CosyVoice

中文说明

An Obsidian desktop plugin that reads the current note or selected text through a local CosyVoice TTS pipeline.

Screenshots

Reader control panel:

CosyVoice Reader control panel

Plugin settings. The script path shown here is a redacted example:

Note Reader CosyVoice settings

Features

  • Reads the current note, current selection, or from the current selection start to the end of the note.
  • Opens a right-side CosyVoice Reader control panel.
  • Shows synthesis/playback phase, whole-reading progress, percentage, and text preview.
  • Supports pause, resume, stop, and progress dragging while the current audio chunk is playing.
  • Provides right-panel speed presets: 1x, 1.25x, 1.5x, 2x, 1.1x, 1.2x, 1.3x, and 1.4x.
  • Uses a local PowerShell wrapper script instead of a cloud TTS service.
  • Cleans Markdown before synthesis.
  • Provides a settings-page Restore defaults button for resetting all plugin settings.
  • Handles common LaTeX before synthesis with a configurable Math reading language setting:
    • Skips formulas longer than 12 non-space characters.
    • English is the default for public releases, for example $a_b$ -> a subscript b.
    • Chinese keeps Chinese math words, for example $a_b$ -> a 下标 b.
    • Skip math skips short formulas as well as long formulas.
    • Leaves common Greek commands as English names, such as \alpha -> alpha, \beta -> beta, and \pi -> pi.
    • Reads common non-Greek symbols such as \leq, \times, and _.
    • Unwraps style commands such as \textbf{...}, \mathbf{...}, and \boldsymbol{...}.
    • Reads short \frac{a}{b} as a over b in English mode or a 分之 b in Chinese mode.

Privacy

The plugin is designed for local TTS. It does not send note text to Microsoft, OpenAI, or any remote TTS service. Text is written to a temporary file under the plugin cache and passed to the configured local CosyVoice wrapper.

Disclosures

  • Network use: the plugin itself does not call any remote service. Your configured CosyVoice wrapper may talk to a local service such as 127.0.0.1.
  • Shell execution: the plugin launches the PowerShell wrapper script that you configure in settings. This is required to call a local TTS runtime.
  • Direct filesystem access: the plugin writes temporary text and WAV files under this plugin's cache folder in the vault, checks that the configured wrapper script exists, and can launch a wrapper stored outside the vault.
  • Telemetry: the plugin does not include client-side or server-side telemetry.
  • Updates: the plugin does not include a self-update mechanism.

Requirements

  • Obsidian desktop.
  • A working local CosyVoice setup.
  • A PowerShell wrapper compatible with:
cosyvoice-wrapper.ps1 -InputPath <txt> -OutputPath <wav> -Speed <speed>

A recommended script path is:

%LOCALAPPDATA%\note-reader-cosyvoice\cosyvoice-wrapper.ps1

For local CosyVoice installation, hardware guidance, OS-specific notes, and wrapper examples, see Local CosyVoice setup.

Model Storage, Other TTS Engines, And Chunk Limits

This plugin does not download models. Plan storage for the local TTS runtime before installing a voice model:

  • Current CosyVoice model repositories are often several GB each. As of 2026-06, public Hugging Face examples range from about 2.5 GB for a 300M model to about 9 GB for a 0.5B CosyVoice3 model.
  • Reserve more than the raw model size. A practical starting point is 10-20 GB for one model and 30 GB+ if you keep multiple models, source checkouts, Conda environments, and caches.
  • Put model files and caches on a local SSD when possible. Avoid syncing model folders through cloud-drive clients.

The configured script can call another local TTS engine instead of CosyVoice if it follows the same wrapper contract: read UTF-8 text from -InputPath, write a valid WAV file to -OutputPath, accept -Speed, and exit non-zero with a clear error on failure. Check the other model's license, language coverage, audio format, speed controls, startup latency, and whether it sends text outside your machine or trusted local network.

Use Chunk limits to balance startup latency and synthesis stability:

  • CPU-only or low-end GPU: start with 30,60,90,120,160,200.
  • Mid-range GPU: use the default 40,80,120,160,280,320.
  • Faster GPU or low-latency local service: try 80,140,220,320,480,640.
  • If synthesis times out, fails, or the first audio takes too long, lower the numbers. If speech sounds too fragmented and your model is stable, raise them gradually.

Commands

  • Open CosyVoice reader controls
  • Read current note with CosyVoice
  • Read selection with CosyVoice
  • Read from selection with CosyVoice
  • Pause or resume CosyVoice reading
  • Stop CosyVoice reading

Progress Seeking

The progress bar shows whole-reading progress across all chunks. While audio is playing, the bar can be clicked or dragged. Seeking is limited to the currently loaded audio chunk; dragging outside that chunk is clamped to the nearest point in the current chunk.

Shared Package Contents

The install package contains only:

  • manifest.json
  • main.js
  • styles.css
  • README.md
  • INSTALL.md
  • LICENSE

It intentionally excludes data.json, cache, last-error.log, and local test files.

License

MIT.