Privacy model

No worker ever hears a full recording.

Before any audio reaches a worker, the coordinator slices it into 60-second fragments. Each fragment goes to a different machine. No single worker receives enough audio to reconstruct the original.

How sharding works

Dispatch without exposing the full picture.

How sharding protects you

  • A 30-minute meeting becomes roughly 30 one-minute shards.
  • Each shard includes only a small overlap for clean transcript stitching.
  • Adjacent shards are sent to different Macs, not the same machine twice in a row.
  • No worker hears enough contiguous audio to follow the full conversation.
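
The slicing step described above can be sketched as follows. This is a minimal illustration, not HiveCompute's actual implementation; the `shard_audio` helper and the 2-second overlap value are assumptions for the example.

```python
SHARD_SECONDS = 60
OVERLAP_SECONDS = 2  # assumed: a small overlap for clean transcript stitching

def shard_audio(total_seconds: float) -> list[tuple[float, float]]:
    """Return (start, end) times for each ~60-second shard of a recording."""
    shards = []
    start = 0.0
    while start < total_seconds:
        end = min(start + SHARD_SECONDS, total_seconds)
        shards.append((start, end))
        if end >= total_seconds:
            break
        # The next shard starts just before this one ends,
        # so adjacent shards share a short seam for stitching.
        start = end - OVERLAP_SECONDS
    return shards

# A 30-minute meeting (1800 s) yields roughly 30 one-minute shards.
shards = shard_audio(1800)
```

Each tuple describes one shard's time window; the coordinator then dispatches each window to a different machine.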

Sharding diagram

[Diagram: consecutive shards fanning out across workers Mac A, Mac B, Mac C, and Mac D]

Anti-correlation

The coordinator never assigns neighboring shards to the same worker.

Even if a worker is compromised, it should only ever see non-adjacent slices. That makes reconstruction materially harder.

Why adjacency matters

  • Shard 12 and shard 13 together reveal more context than shard 12 and shard 27.
  • Scheduling spreads neighboring audio across different machines on purpose.
  • A worker can be accurate on its clip without learning what was said before or after.
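
A minimal sketch of this scheduling rule, assuming a hypothetical `assign_shards` helper: each shard goes to a randomly chosen worker, excluding whoever handled the previous shard, so neighboring audio never lands on the same machine.

```python
import random

def assign_shards(num_shards: int, workers: list[str]) -> list[str]:
    """Assign each shard to a worker, never reusing the previous shard's worker."""
    if len(workers) < 2:
        raise ValueError("anti-correlation needs at least two workers")
    assignments: list[str] = []
    for _ in range(num_shards):
        # Exclude the worker that received the immediately preceding shard.
        candidates = [w for w in workers if not assignments or w != assignments[-1]]
        assignments.append(random.choice(candidates))
    return assignments
```

A real scheduler would also weigh load and reputation; this shows only the adjacency constraint.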

Anti-correlation visualization

Shard   Assigned worker
01      Mac A
02      Mac B
03      Mac C
04      Mac A
05      Mac D
06      Mac B

Validation

What the coordinator does to validate workers.

Hidden canaries and reputation scoring let HiveCompute validate output without assuming any worker is trustworthy on day one.

Double-check new workers

New workers start at zero trust. Their early output can be verified by a second independent worker before it is accepted.
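
One way to frame the double-check is a simple agreement test between two independently produced transcripts. This is an illustrative sketch, not the production check; the similarity threshold and helper names are assumptions (a real system might use word error rate instead).

```python
from difflib import SequenceMatcher

AGREEMENT_THRESHOLD = 0.85  # assumed similarity cutoff

def transcripts_agree(a: str, b: str) -> bool:
    """Rough agreement check between two independently produced transcripts."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= AGREEMENT_THRESHOLD

def accept_from_new_worker(primary: str, secondary: str) -> bool:
    # A zero-trust worker's output is accepted only when a second,
    # independent worker produces a closely matching transcript.
    return transcripts_agree(primary, secondary)
```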

Canary injection

The coordinator occasionally swaps in known-answer audio. Workers do not know when they are being tested, which makes gaming the system harder.
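
The canary mechanic can be sketched like this. The rate, field names, and helpers are assumptions for illustration; in practice the coordinator would keep the expected answer server-side so the worker sees an ordinary task.

```python
import random

CANARY_RATE = 0.05  # assumed: roughly 1 in 20 tasks is a known-answer clip

def pick_task(real_shards: list, canaries: list):
    """Occasionally substitute a known-answer canary clip for a real shard."""
    if canaries and random.random() < CANARY_RATE:
        clip, expected = random.choice(canaries)
        # "_expected" stays on the coordinator; the worker receives only audio.
        return {"audio": clip, "_expected": expected, "is_canary": True}
    return {"audio": real_shards.pop(0), "is_canary": False}

def score_result(task, transcript: str, normalize=str.lower):
    """For canaries, compare the worker's transcript against the known answer."""
    if not task["is_canary"]:
        return None  # real work is validated via reputation and spot checks
    return normalize(transcript) == normalize(task["_expected"])
```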

Reputation scoring

Workers build up from new to provisional to trusted to elite. Bad output lowers reputation and reduces future work.
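
A toy version of the reputation ladder, with assumed thresholds and scoring weights (the tier names come from the text; everything else is illustrative):

```python
# Assumed score floors for each tier named above.
TIERS = [(0, "new"), (10, "provisional"), (50, "trusted"), (200, "elite")]

class Reputation:
    def __init__(self):
        self.score = 0

    def record(self, passed: bool):
        # Good output builds reputation slowly; bad output costs more than
        # a success earns, so cheating is a losing strategy.
        self.score = max(0, self.score + (1 if passed else -5))

    @property
    def tier(self) -> str:
        return next(name for floor, name in reversed(TIERS) if self.score >= floor)
```

A worker's tier would then feed back into scheduling: lower tiers receive less work and more verification.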

Visibility

What a worker receives, and what stays hidden.

Workers see

  • A single 60-second audio clip
  • The language to transcribe
  • A result endpoint for that clip
  • Only the current shard timing
  • Transport over HTTPS

Workers do not see

  • Your name, company, or email
  • The full recording
  • Other shards from your job
  • The final merged transcript
  • Any account metadata about you

Comparison

How HiveCompute compares.

Your audio goes to

  • HiveCompute: Distributed Macs, one shard each
  • OpenAI Whisper API: OpenAI servers, full file
  • Self-hosted: Your own machine or cluster

Who can access it

  • HiveCompute: No single worker hears the full recording
  • OpenAI Whisper API: OpenAI systems handling the job
  • Self-hosted: Your team and infrastructure

Cost

  • HiveCompute: Around $0.003 per minute
  • OpenAI Whisper API: Around $0.006 per minute
  • Self-hosted: Hardware and ops overhead

Speed

  • HiveCompute: Parallel shards across a fleet
  • OpenAI Whisper API: Single hosted request path
  • Self-hosted: Bound by your own capacity

Retention

  • HiveCompute: Deleted after completion
  • OpenAI Whisper API: 30-day retention by default
  • Self-hosted: You decide

FAQ

Direct answers to the obvious questions.

Can a worker reconstruct my full recording?

No. Shards are short, adjacent clips are intentionally split across different workers, and a worker does not receive the merged transcript.

What if a worker saves the audio?

The worker still only has a short clip with no identifying metadata. Hidden canaries and reputation scoring make low-quality or malicious behavior easier to detect and remove.

Is the transcript encrypted?

In transit, yes. Audio and transcripts move over HTTPS. At rest, the transcript sits in the coordinator database behind your API access controls.

Can I use this for HIPAA-regulated audio?

Not yet. The pilot is designed for teams that want lower cost and good privacy boundaries, but it is not represented as HIPAA-certified today.

Open source

The worker is open. Audit it yourself.

The local worker code is public. You can read exactly what runs on your machine — how it polls, what it sends back, and how audio is handled.

View source on GitHub →