Plugin Development

10 min read • Intermediate • January 5, 2026

No Shortcuts, No Exceptions
How We Test at KnobSmith

Or: Why shipping plugins that don't crash is actually harder than you think

Testing • QA • Quality Assurance • Plugin Development • War Stories

Look, we get it. Testing is boring. QA is the vegetables of software development—nobody's excited about it, but you know you need it or things fall apart.

But here's the thing: In audio plugin land, "things fall apart" doesn't mean a button is the wrong color or a tooltip has a typo. It means someone's DAW crashes mid-session. Their project becomes corrupted. That vocal take they spent three hours perfecting? Gone. And now they're on Reddit, very politely explaining why your plugin is, quote, "absolute garbage."

So yeah, we're kind of obsessive about testing.

The Reality Check Nobody Talks About

You know what the difference is between a plugin that "mostly works" and one that professionals actually use in production?

About 200 edge cases you didn't think to test.

• What happens when someone automates every parameter at once at 1/32nd notes?
• What if they load your plugin 47 times on a single track?
• What if they change the sample rate mid-session?
• What if they're running it on a 2015 MacBook Air with 4GB RAM and 73 other plugins?
• What if they feed it a +48dB signal because they don't believe in gain staging?

If your answer to any of these is "well, nobody would actually do that," congratulations—you've discovered the difference between theory and production.

The KnobSmith Quality Gates

We don't ship a plugin unless it passes every single one of these gates. No exceptions. No "but it mostly works." No "we'll fix it in the next release."

All gates must pass. No shortcuts.

Gate 1: Unit Tests

Every DSP module has unit tests. Not "we should write tests someday" tests. Actual tests. Running in CI. Blocking merges if they fail.

What we test:

• DC offset (must be < 0.0001)
• Noise floor (must be < -60 dB)
• Parameter smoothing (no clicks, no zipper noise)
• Gain staging (output levels in expected range)
• Edge cases (what happens at ±Inf? At denormals? At sample rate changes?)
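
A minimal Python sketch of checks in this spirit. It is illustrative rather than the actual KnobSmith suite: `example_gain` is a hypothetical stand-in for a DSP module, the thresholds are the ones listed above, and the noise floor is assumed to be measured as RMS relative to a full scale of 1.0 with silence at the input.

```python
import math

DC_OFFSET_LIMIT = 1e-4         # "must be < 0.0001"
NOISE_FLOOR_LIMIT_DB = -60.0   # "must be < -60 dB" (assumed relative to full scale 1.0)


def dc_offset(samples):
    """Absolute value of the mean sample value."""
    return abs(sum(samples) / len(samples))


def rms_db(samples):
    """RMS level in dB relative to 1.0; -inf for pure digital silence."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def all_finite(samples):
    """True if no sample is NaN or +/-Inf."""
    return all(math.isfinite(s) for s in samples)


def example_gain(samples, gain=0.5):
    """Hypothetical module under test: a plain gain stage."""
    return [gain * s for s in samples]


def check_module(process, sample_rate=48000):
    """Run a sine and a block of silence through `process` and report each check."""
    n = sample_rate  # one second of audio
    sine = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(n)]
    silence = [0.0] * n
    out_sine = process(sine)
    out_silence = process(silence)
    return {
        "dc_offset_ok": dc_offset(out_sine) < DC_OFFSET_LIMIT,
        "noise_floor_ok": rms_db(out_silence) < NOISE_FLOOR_LIMIT_DB,
        "finite_ok": all_finite(out_sine) and all_finite(out_silence),
    }
```

A module that leaks a constant 0.01 into its output fails both the DC offset and the noise floor check here, which is the kind of regression a merge-blocking test is meant to stop.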

Gate 2: pluginval Level 10

pluginval is the nuclear option of plugin testing. It's like hiring someone to abuse your plugin in every way imaginable, then file detailed bug reports.

What it does:

• Loads and unloads your plugin 100 times
• Randomizes every parameter at audio rate
• Changes sample rates and buffer sizes mid-stream
• Tests state save/restore with every permutation
• Tries to make your plugin cry

The first time we ran pluginval on AlterOne, it found 14 issues. Fourteen! Uninitialized state, race conditions, buffer size assumptions—things we'd never have found manually.

Pass rate required: 100%
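
For a rough idea of how a CI step could enforce this, here is a small Python wrapper. It assumes pluginval's command-line flags `--strictness-level` and `--validate`, and that the process exits with 0 only when validation passes; check both against `pluginval --help` for your version. The function names and the binary location are hypothetical.

```python
import subprocess


def build_pluginval_command(plugin_path, strictness=10, pluginval_bin="pluginval"):
    """Build the argument list for one validation run.

    Assumption: the flag names match the installed pluginval release.
    """
    return [
        pluginval_bin,
        "--strictness-level", str(strictness),
        "--validate", plugin_path,
    ]


def run_gate(plugin_path, **kwargs):
    """Return True only if pluginval exits with status 0 (assumed to mean 'all tests passed')."""
    result = subprocess.run(build_pluginval_command(plugin_path, **kwargs))
    return result.returncode == 0
```

A CI job would call `run_gate` once per built plugin format and fail the pipeline on the first `False`.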

Gate 3: DAW Compatibility Matrix

We test every plugin in every major DAW. Not "someone on the team uses Logic so we probably work in Logic." Actual testing. With actual projects. On actual hardware.

✓ Tested

• Ableton Live
• Logic Pro
• Reaper
• FL Studio
• Studio One

What we check

• Loads without crashing
• Presets save/recall
• Automation works
• State persists
• No UI glitches

Gate 4: CPU Budget

Our target: < 1% CPU per instance at 48 kHz on an M1 MacBook Air.

Not Pro. Not Max. The Air. Because if someone's running your plugin on a potato, it should still work.

Real numbers from our last release:

• AlterOne: 0.8% CPU @ 48 kHz
• LihisSplattener: 0.3% CPU @ 48 kHz
• Phasecraft: 1.2% CPU @ 48 kHz

These aren't marketing numbers. These are "10 instances in a mix and your DAW still works" numbers.
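
One common way to arrive at a per-instance figure is to compare how long processing takes with how much audio was processed. The sketch below assumes that definition (share of one core's real-time budget); `process_block` is a hypothetical callable standing in for a plugin's process function, and a real benchmark would run on the target hardware with realistic input.

```python
import time


def cpu_percent(processing_seconds, num_samples, sample_rate):
    """Processing time as a percentage of the real-time duration of the audio."""
    audio_seconds = num_samples / sample_rate
    return 100.0 * processing_seconds / audio_seconds


def measure(process_block, block_size=512, sample_rate=48000, num_blocks=200):
    """Time `process_block` over many blocks and return the CPU percentage."""
    block = [0.0] * block_size
    start = time.perf_counter()
    for _ in range(num_blocks):
        process_block(block)
    elapsed = time.perf_counter() - start
    return cpu_percent(elapsed, block_size * num_blocks, sample_rate)
```

As a worked example: a 512-sample block at 48 kHz lasts 512 / 48000 ≈ 10.67 ms, so a 1% budget leaves roughly 0.107 ms of processing time per block.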

Gate 5: Audio Quality Golden Files

Here's where it gets really nerdy. We have a library of test signals—sine sweeps, noise bursts, drums, vocals, guitar—and we process them through each plugin at known settings.

Then we save the output as "golden files." Every time we make a change, we re-render and compare against them. If the difference between the new render and the golden file rises above -60 dB, the test fails.

This catches:

• Accidental algorithm changes
• Precision issues
• Platform-specific behavior
• "Improvements" that actually changed the sound

Because if someone built a project with AlterOne v1.0 and it sounds a certain way, v1.1 better sound exactly the same unless we explicitly changed the algorithm and documented it.
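
The golden-file comparison described above can be sketched in a few lines of Python. This version makes two assumptions the text does not spell out: the difference is measured as the peak sample-by-sample residual, and -60 dB is relative to a full scale of 1.0. Loading the audio files is left out; both inputs are plain lists of floats.

```python
import math

THRESHOLD_DB = -60.0  # from the text; assumed relative to full scale 1.0


def residual_peak_db(rendered, golden):
    """Peak of |rendered - golden| in dB; -inf if the two renders are identical."""
    if len(rendered) != len(golden):
        raise ValueError("render and golden file differ in length")
    peak = max((abs(r - g) for r, g in zip(rendered, golden)), default=0.0)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak)


def matches_golden(rendered, golden, threshold_db=THRESHOLD_DB):
    """True if the residual stays at or below the threshold."""
    return residual_peak_db(rendered, golden) <= threshold_db
```

A single sample that is off by 0.01 produces a residual of about -40 dB and fails; a deviation of 0.0001 sits near -80 dB and passes. Treating a length mismatch as an error rather than a quiet failure also catches latency changes, which shift the whole render.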

The Stuff That Sounds Ridiculous Until It Catches a Bug

You want to know what obsessive quality control looks like? Here's a staged shot of our project root (based on an actual screenshot):

• knob-comparison.png
• LIHIS-DEBUG-REPORT.md
• measure-knob-size.mjs

Let that sink in for a moment. We have a JavaScript script specifically dedicated to measuring knob sizes.

Why? Because at some point during UI development, someone made one knob 2 pixels bigger than the others, and when you have OCD and 15 knobs on screen, that 2-pixel difference will haunt you at 3am.

So now we have measure-knob-size.mjs. It checks every knob. It compares dimensions. It fails the build if they don't match. Problem solved.
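
The real script is JavaScript and is not reproduced here. As a rough Python sketch of the same idea, assume the knob dimensions have already been collected into a dictionary (how they get measured is out of scope, and the names below are hypothetical). The most common size is treated as the reference, and anything that differs is reported.

```python
from collections import Counter


def find_mismatched_knobs(knobs):
    """Return the names of knobs whose (width, height) differs from the most common size.

    `knobs` maps a knob name to a (width, height) tuple in pixels.
    """
    if not knobs:
        return []
    reference, _count = Counter(knobs.values()).most_common(1)[0]
    return sorted(name for name, size in knobs.items() if size != reference)


def knob_check_exit_code(knobs):
    """0 if every knob matches the reference size, 1 otherwise (suitable as a build step result)."""
    return 1 if find_mismatched_knobs(knobs) else 0
```

With `{"gain": (48, 48), "mix": (48, 48), "tone": (50, 50)}` as input, the check reports `tone` and the build step returns 1.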

What these files tell you:

• knob-comparison.png = We visually diff every UI change
• LIHIS-DEBUG-REPORT.md = We document bugs that take 4 days to find
• measure-knob-size.mjs = We automate the stuff that drives us crazy

Are these overkill? Maybe. Do they prevent bugs that would've shipped? Definitely.

This is what "no shortcuts" actually looks like. Scripts with ridiculous names. Debug reports that are longer than the actual code. Screenshots of knobs next to other knobs to verify they're the same size.

Why We're So Anal About This

When you ship bugs:

• User loses work
• User gets mad
• User refunds
• User tells everyone on Reddit
• Your reputation takes a hit you can't undo

When you ship quality:

• User makes music
• User doesn't think about your plugin (this is good!)
• User recommends it to friends
• Word spreads organically
• You sleep at night

We've chosen option 2.

The Cost of Quality

Here's the uncomfortable truth: Testing takes longer than writing the feature.

We spent:

• Implementing AlterOne's pitch shifter: 2 weeks
• Testing it (every edge case, DAW, platform): 3 weeks

That's more time testing than building. And that's... actually normal for production-quality software.

The difference is most audio plugin developers don't talk about it. They ship fast, patch bugs later, and hope nobody notices.

But we're not building hobbyist plugins. We're building tools that professionals rely on. Tools that need to work every single time, in every single DAW, on every single platform.

When Testing Saves Your Ass

Three weeks before AlterOne's release, pluginval caught a bug we'd never have found manually:

"Plugin crashes when you automate pitch while hard tune is enabled, only in Ableton Live on Windows, only at 88.2 kHz, only if you've loaded a preset from disk."

Six. Separate. Conditions.

If we'd shipped without catching this? One in a thousand users would hit this exact edge case, crash, lose work, and rightfully lose their minds.

But we didn't ship. Because pluginval found it. Because we test obsessively. Because "mostly works" isn't good enough.

The fix was three lines of code. The investigation took four days. Worth every minute.

Our Promise to You

When you load a KnobSmith plugin, here's what you can count on:

✓ It won't crash. We've tested the edge cases.
✓ It won't glitch. We've smoothed the parameters.
✓ It won't waste CPU. We've profiled and optimized.
✓ It'll sound consistent. We've locked down the algorithms.
✓ It'll work in your DAW. We've tested them all.

That's not marketing speak. That's the result of hundreds of hours of testing, thousands of test runs, and a refusal to compromise on quality.

The Bottom Line

Testing isn't sexy. Nobody downloads a plugin because it "passes pluginval Level 10." Nobody brags about < 1% CPU usage on Reddit.

But you know what they do brag about?

"I've used this plugin on 50 tracks and it's never crashed once."
"It just works. I don't even think about it."
"I trust it in critical sessions."

That's the goal. Not to be exciting. To be reliable.

Because in professional audio, reliability is the feature that matters most. And reliability doesn't happen by accident.

No shortcuts. No exceptions. No regrets.
