Your IDE Crashed. Its 10 Background Processes Didn't.

I was building a screen recording skill for Claude Code. The idea: record yourself working, auto-generate subtitles, compile a demo video. Simple enough.

The recording worked. The problem was stopping it. I'd send SIGINT to ffmpeg, the recording would stop, I'd move on. Forty minutes later my laptop fans were spinning. ps aux showed two ffmpeg processes still running. Two tail -f processes. A bats test runner from an earlier test run.

None of them had parents anymore. They were just... there.

I killed them manually and kept building. It happened again the next session. And the one after that.

The bash fix

The obvious solution was a script. I wrote orphan-monitor.sh: 163 lines, installed as a LaunchAgent, running every 60 seconds. It checked for ffmpeg processes whose parent had died, recordings that had run past their expected duration, and orphaned tail watchers. Dead simple.

It caught real orphans on the first run. That should have been the end of it.

But I'd been poking at this problem long enough that I wanted to understand why it kept happening. So I opened GitHub and started searching.

It's not just me

189 chrome processes, 27GB RAM in 10 hours. A single Claude Code session spawning --claude-in-chrome-mcp processes at 4 per minute with no cleanup. The issue has been open since 2025.

641 chroma-mcp processes in 5 minutes. Nearly crashed WSL2. 64GB virtual memory.

30+ processes after 3 days of Cursor use. 3-5GB leaked RAM. Forum thread from last year, still getting replies.

kill-port has over 600,000 weekly npm downloads. Downloads aren't a headcount (CI installs count too), but that's still a lot of developers, every week, hunting down port-squatting processes and killing them one at a time.

This isn't a bug in any one tool. It's structural.

Why macOS makes it worse

Linux has PR_SET_PDEATHSIG, a prctl(2) option that lets a child process say "send me a signal when my parent dies." It's not perfect but it works. macOS doesn't have it. No PR_SET_CHILD_SUBREAPER either. When your IDE crashes on macOS, every child process it spawned becomes a permanent resident of your process table, reparented to launchd with PPID 1, running indefinitely until you notice or reboot.

The OS provides no safety net. The IDE can't set one up in advance because the feature doesn't exist. So every MCP server, every dev server, every headless browser just... stays.
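For contrast, here is roughly what the Linux safety net looks like from Go. This is a minimal sketch, not code from devreap: commandWithDeathSignal and deathSignal are names I made up for illustration, and the file only compiles on Linux because syscall.SysProcAttr has no Pdeathsig field on macOS.

```go
// Linux-only sketch: syscall.SysProcAttr has no Pdeathsig field on macOS.
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// deathSignal is the signal the kernel delivers to the child when its parent goes away.
const deathSignal = syscall.SIGTERM

// commandWithDeathSignal (hypothetical helper) builds a command whose child
// process asks the kernel for deathSignal once the thread that started it exits.
func commandWithDeathSignal(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Pdeathsig: deathSignal}
	return cmd
}

func main() {
	cmd := commandWithDeathSignal("sleep", "60")
	if err := cmd.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("child pid:", cmd.Process.Pid)
	// If this program crashed here, the kernel would send SIGTERM to the child.
	// For the demo we clean up explicitly instead.
	_ = cmd.Process.Kill()
	_ = cmd.Wait()
}
```

The signal is tied to the thread that created the child rather than to the whole parent process, which is one reason it's "not perfect."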

85% of developers now use AI coding tools regularly (JetBrains 2025, 24,534 respondents). Each session spawns 3 to 10 background processes. None of them reliably clean up on crash.

The bash script was fine for my specific problem. For this one, I needed something distributable.

Why Go

I considered rewriting the bash script as a proper tool. The options were Python, Rust, Go.

Python needs a runtime. That's a non-starter for a system utility — you can't brew install something that needs the user to already have the right Python version.

Rust would have been fine. There's actually a tool called proc-janitor in Rust that does something adjacent. Different approach, different language, but proof the idea existed.

Go compiles to a single static binary. Zero runtime dependencies. gopsutil gives you cross-platform process enumeration without cgo. cobra is what kubectl and the GitHub CLI use for their command interfaces. It cross-compiles from my Mac to Linux and both macOS architectures with one command.

More practically: Go is readable. If someone finds devreap and wants to contribute a pattern, they shouldn't need to learn Rust to do it.

The core problem with naive orphan detection

The first version of the detection logic was: if PPID == 1 and the process name matches a known pattern, it's an orphan. Kill it.

This is wrong.

PPID == 1 just means the parent died. It doesn't mean the process is abandoned. If you have VS Code open with 4 Claude Code sessions, each session spawns MCP servers. Those servers have PPID 1 on macOS — it's how processes work. They're not orphans. Killing them would break your active session.

The fix was to stop thinking in binary and start thinking in probability.

Each process gets scored across five signals:

Signal	Weight
PPID is 1 (parent died, reparented to launchd)	0.40
No IDE running anywhere on the machine	0.30
Running longer than this type of process should	0.25
Bound to a listening port	0.20
No controlling terminal	0.15

Default kill threshold: 0.6. A process needs multiple signals to get flagged.

MCP server, PPID=1, no Cursor running → 0.70 → killed. MCP server, PPID=1, Cursor IS running → 0.40 → safe.

That one distinction eliminates the most common false positive. The IDE check is the signal that makes the whole thing safe to run automatically.
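The scoring can be sketched in a few lines of Go. The weights and the 0.6 threshold come from the table above; the Signals struct and the function names are my own illustration, not devreap's actual types.

```go
package main

import "fmt"

// Signals holds the five observations from the table for one process.
// The type and field names are hypothetical.
type Signals struct {
	ParentDead    bool // PPID is 1
	NoIDERunning  bool // no IDE running anywhere on the machine
	OverDuration  bool // running longer than this type of process should
	ListeningPort bool // bound to a listening port
	NoTTY         bool // no controlling terminal
}

const killThreshold = 0.6

// Score adds up the weight of every signal that fired.
func Score(s Signals) float64 {
	score := 0.0
	if s.ParentDead {
		score += 0.40
	}
	if s.NoIDERunning {
		score += 0.30
	}
	if s.OverDuration {
		score += 0.25
	}
	if s.ListeningPort {
		score += 0.20
	}
	if s.NoTTY {
		score += 0.15
	}
	return score
}

// ShouldKill reports whether the score reaches the default threshold.
func ShouldKill(s Signals) bool {
	return Score(s) >= killThreshold
}

func main() {
	orphan := Signals{ParentDead: true, NoIDERunning: true}
	active := Signals{ParentDead: true}
	fmt.Printf("%.2f %v\n", Score(orphan), ShouldKill(orphan)) // 0.70 true
	fmt.Printf("%.2f %v\n", Score(active), ShouldKill(active)) // 0.40 false
}
```

No single signal reaches 0.6 on its own, which is the point: the largest weight is 0.40, so at least two signals have to agree before anything gets killed.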

The IDE detection problem

"Is an IDE running?" sounds trivial. It's not.

My first implementation used substring matching on process names. Checked if anything called "cursor" or "code" was running.

CursorUIViewService is a macOS system process. Has nothing to do with the Cursor IDE. It's Apple's text cursor service, which is running on every Mac whether you have Cursor installed or not. Substring matching would always report an IDE as alive.

The fix was path-based signatures. Not "does any process name contain cursor" but "is there a process with /Applications/Cursor.app/Contents/MacOS/Cursor in its path." That's the actual IDE binary. System processes don't live there.
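Here is a small sketch of the difference between the two checks. Only the Cursor.app path comes from the text above; the function names and the system path are placeholders for illustration. In a real scanner the executable paths would come from a process library rather than a hard-coded slice.

```go
package main

import (
	"fmt"
	"strings"
)

// ideSignatures lists full executable paths that identify a real IDE binary.
var ideSignatures = []string{
	"/Applications/Cursor.app/Contents/MacOS/Cursor",
}

// nameContainsIDE is the naive check: substring matching on process names.
func nameContainsIDE(names []string) bool {
	for _, n := range names {
		if strings.Contains(strings.ToLower(n), "cursor") {
			return true
		}
	}
	return false
}

// ideRunning is the path-based check: an executable path must equal a signature.
func ideRunning(exePaths []string) bool {
	for _, p := range exePaths {
		for _, sig := range ideSignatures {
			if p == sig {
				return true
			}
		}
	}
	return false
}

func main() {
	// Placeholder path standing in for wherever the system service actually lives.
	system := []string{"/System/Library/Example/CursorUIViewService"}
	fmt.Println(nameContainsIDE([]string{"CursorUIViewService"})) // true: the false positive
	fmt.Println(ideRunning(system))                               // false
}
```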

This is the kind of thing you only find by running it on a real machine.

MCP config cross-referencing

Every AI coding agent has a config file that lists which MCP servers should be running.

Claude Code: ~/.claude.json
Cursor: ~/.cursor/mcp.json
VS Code: ~/.vscode/mcp.json

devreap reads those files. It knows exactly which servers are supposed to exist and what their command signatures look like. When it scans, it can compare "what's configured" against "what's running" against "is the IDE alive."

Other tools are approaching this space: cmcp does MCP config cross-referencing to find orphaned servers, cc-reaper cleans up orphan Claude Code processes, and proc-janitor does PPID=1 pattern scanning. Where devreap differs is combining config awareness with the multi-signal scoring. It's not just "is this server in the config?" but "is the IDE alive AND is the parent dead AND has it exceeded its expected duration?" The config is one input to the scoring, not the whole decision.
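The "configured vs. running vs. IDE alive" comparison can be sketched like this. Everything here is hypothetical (the types, the field names, the sample command lines), and it's reduced to a yes/no answer for brevity where a scoring approach would treat each condition as one weighted input.

```go
package main

import (
	"fmt"
	"strings"
)

// ConfiguredServer is a hypothetical, simplified view of one MCP config entry.
type ConfiguredServer struct {
	Name      string
	Signature string // substring expected in the server's command line
}

// RunningProc is a hypothetical snapshot of one live process.
type RunningProc struct {
	PID     int
	PPID    int
	Cmdline string
}

// abandonedServers returns PIDs of processes that match a configured server,
// have been reparented to PID 1, and have no live IDE left to serve.
func abandonedServers(procs []RunningProc, configured []ConfiguredServer, ideAlive bool) []int {
	var pids []int
	if ideAlive {
		return pids // a live IDE may still own these servers
	}
	for _, p := range procs {
		if p.PPID != 1 {
			continue
		}
		for _, c := range configured {
			// Skip empty signatures: every string contains "".
			if c.Signature != "" && strings.Contains(p.Cmdline, c.Signature) {
				pids = append(pids, p.PID)
				break
			}
		}
	}
	return pids
}

func main() {
	procs := []RunningProc{
		{PID: 101, PPID: 1, Cmdline: "node /opt/chroma-mcp/index.js"},
		{PID: 102, PPID: 1, Cmdline: "ffmpeg -i in.mov out.mp4"},
	}
	cfg := []ConfiguredServer{{Name: "chroma", Signature: "chroma-mcp"}}
	fmt.Println(abandonedServers(procs, cfg, false)) // [101]
}
```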

What broke during the build

Three bugs worth mentioning.

The empty allowlist bug. The allowlist lets you protect specific processes from being killed. The matching code looped through each allowlist entry and checked if it matched the process name. An empty string "" passes every strings.Contains check. Every process was silently matching the allowlist and being protected. Added an if pattern == "" { continue } guard. Simple fix, bad consequences if it had shipped.
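A minimal reproduction of the guard, assuming the allowlist is a plain slice of substrings (isAllowlisted is an illustrative name, not necessarily the real one):

```go
package main

import (
	"fmt"
	"strings"
)

// isAllowlisted reports whether name matches any non-empty allowlist entry.
func isAllowlisted(name string, allowlist []string) bool {
	for _, pattern := range allowlist {
		if pattern == "" {
			continue // strings.Contains(name, "") is true for every name
		}
		if strings.Contains(name, pattern) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(strings.Contains("ffmpeg", ""))          // true: the root of the bug
	fmt.Println(isAllowlisted("ffmpeg", []string{""}))   // false: empty entry is skipped
	fmt.Println(isAllowlisted("ffmpeg", []string{"ffm"})) // true
}
```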

Silent MCP parse failures. If your ~/.claude.json had malformed JSON, the config loader returned an empty result with no error. You'd run devreap doctor, get a clean result, and never know your MCP config wasn't being read. Refactored to return warnings alongside results so failures surface.
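The "warnings alongside results" shape looks something like the sketch below. It assumes the config files keep their servers under a top-level "mcpServers" key with "command" and "args" fields; the type and function names are mine, and the loader takes raw bytes so the example doesn't depend on files in your home directory.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mcpServer and mcpConfig assume the "mcpServers" layout described above.
type mcpServer struct {
	Command string   `json:"command"`
	Args    []string `json:"args"`
}

type mcpConfig struct {
	Servers map[string]mcpServer `json:"mcpServers"`
}

// parseConfigs parses each config blob, keyed by its path. Files that fail to
// parse produce a warning instead of silently contributing zero servers.
func parseConfigs(files map[string][]byte) (map[string]mcpServer, []string) {
	servers := make(map[string]mcpServer)
	var warnings []string
	for path, data := range files {
		var cfg mcpConfig
		if err := json.Unmarshal(data, &cfg); err != nil {
			warnings = append(warnings, fmt.Sprintf("%s: %v", path, err))
			continue
		}
		for name, srv := range cfg.Servers {
			servers[name] = srv
		}
	}
	return servers, warnings
}

func main() {
	servers, warnings := parseConfigs(map[string][]byte{
		"good.json": []byte(`{"mcpServers":{"chroma":{"command":"chroma-mcp"}}}`),
		"bad.json":  []byte(`{"mcpServers":`),
	})
	fmt.Println(len(servers), "server(s),", len(warnings), "warning(s)")
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

A doctor-style command can then print the warnings even when the scan itself comes back clean.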

Zombie processes in integration tests. The killer tests spawn real sleep processes, kill them, then verify they're dead. On the first run, half the tests failed with "process still running after all signals." The processes were zombies — killed but not reaped, because the test (the parent) hadn't called cmd.Wait(). gopsutil correctly reports zombies as running. Fixed with go cmd.Wait() in a goroutine so children are reaped immediately after death.
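The reaping fix, sketched as a standalone program. startReaped, killAndConfirm, and defaultTimeout are illustrative names; the example assumes a Unix-like system with a sleep binary on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

const defaultTimeout = 2 * time.Second

// startReaped starts cmd and waits on it in a goroutine. The returned channel
// is closed once Wait has collected the child's exit status.
func startReaped(cmd *exec.Cmd) (<-chan struct{}, error) {
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	done := make(chan struct{})
	go func() {
		_ = cmd.Wait() // reaps the child so it doesn't linger as a zombie
		close(done)
	}()
	return done, nil
}

// killAndConfirm spawns a sleep process, sends it SIGTERM, and reports
// whether the child was reaped before the timeout.
func killAndConfirm(timeout time.Duration) (bool, error) {
	cmd := exec.Command("sleep", "30")
	done, err := startReaped(cmd)
	if err != nil {
		return false, err
	}
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return false, err
	}
	select {
	case <-done:
		return true, nil
	case <-time.After(timeout):
		return false, nil
	}
}

func main() {
	reaped, err := killAndConfirm(defaultTimeout)
	fmt.Println("reaped:", reaped, "err:", err)
}
```

Without the goroutine calling cmd.Wait(), the killed child would sit in the process table as a zombie until the parent exited, and a liveness check could still see it.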

The lesson that cost the most

After everything was "done" — compiling, tests passing, all commands working — I asked for an honest coverage audit.

40%.

Not because we hadn't written tests. We had. Logger had tests. Config had tests. Pattern matching had tests. But the killer package had zero integration tests. The daemon had zero tests. All 11 CLI commands were at 0% because they were only tested by running the binary as a subprocess, and a plain go test -cover run only measures code executed inside the test process.

None of this was obvious until someone specifically looked for it. "Tests pass" became a proxy for "code is correct" when those are different things.

The fix was three weeks of integration tests — real processes, real signals, real kill sequences. The daemon now has a mock scanner interface. The killer tests spawn actual sleep processes and verify they die. The CLI tests build the binary once in TestMain and exercise every command.

Measured coverage landed at 62%. Real coverage is higher because subprocess tests don't count. More importantly, the integration tests catch things unit tests can't: zombie reaping, signal delivery timing, PID reuse between scan and kill.

The rule I wrote down afterward: a test coverage audit is mandatory before declaring anything done. Not optional. Not something to check if you have time.

What it is now

devreap is a Go daemon that:

Scans every 30 seconds for orphaned developer processes
Scores each candidate across 5 signals, kills at 0.6 threshold
Handles 18 built-in patterns: MCP servers, dev servers, headless browsers, ffmpeg
Sends the right signal for each type (SIGINT for ffmpeg so it writes clean MP4 headers, SIGTERM for everything else)
Installs as a macOS LaunchAgent so it runs automatically
Cross-references your IDE MCP configs to find truly abandoned servers
Ships as a single static binary with zero runtime dependencies
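The per-type signal choice is the simplest piece to show in code. This sketch matches on an exact process name, which is an assumption on my part; signalFor is a made-up name.

```go
package main

import (
	"fmt"
	"syscall"
)

// signalFor picks the first signal to send to a process of the given name.
func signalFor(name string) syscall.Signal {
	if name == "ffmpeg" {
		return syscall.SIGINT // lets ffmpeg finish writing the MP4 before exiting
	}
	return syscall.SIGTERM
}

func main() {
	fmt.Println(signalFor("ffmpeg"))
	fmt.Println(signalFor("node"))
}
```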

brew install tjp2021/devreap/devreap
devreap install   # set up LaunchAgent
devreap status    # confirm it's running

That's it. No config needed. It works out of the box.

The thing I keep thinking about: the bash script was the right tool for my problem. devreap is the right tool for the broader one. The bash script pointed at the real product (not the screen recording skill, but the orphan monitor), and once I saw that clearly, the right response was obvious.

Sometimes the thing you build to solve your problem is the distraction. The thing you notice while building it is the actual product.

Repo: github.com/tjp2021/devreap