Most teams should build an in-house data labelling team only when annotation is a core, sensitive, long-term capability with stable, high volume, and buy a managed team in nearly every other case. Building carries heavy fixed costs in recruitment, tooling, training and management, which a managed pod absorbs for a fixed monthly fee. The practical decision turns on volume stability, domain sensitivity, how fast you need to start, and your tolerance for management overhead. This guide gives you a cost framework and a decision table to make the call cleanly.
We define the work, lay out the full cost of each path, and finish with a side-by-side that maps your situation to a recommendation.
What does "build vs buy" mean for data labelling?
Definition (data annotation / labelling): Data annotation is the process of adding human-generated labels, rankings or corrections to raw data (text, images, code, audio) so it can be used to train or evaluate a machine-learning model.
"Build" means standing up your own annotation operation: hiring annotators, buying or building tooling, writing guidelines, training people, and managing quality yourself. "Buy" means engaging an external provider, either a per-task marketplace or a managed pod, to deliver labelled data or RLHF judgments for you.
The same framework applies to RLHF work (preference ranking, demonstration writing, red-teaming), not just classic labelling, because the cost structure is similar: it is fundamentally a people-and-process problem. For background on why human-labelled data is so central to model quality, see Wikipedia's overview of data labelling for machine learning.
What does it really cost to build in-house?
The salary of an annotator is the visible cost. The hidden costs are what make building expensive and slow.
A realistic in-house build includes:
- Recruitment of annotators and, crucially, domain experts (often the hardest and most expensive to hire).
- Annotation tooling: licence or build-and-maintain cost for the labelling platform.
- Guideline and rubric authoring, typically by senior staff.
- Training and calibration time before anyone is productive.
- QA and adjudication, including a senior reviewer's time.
- Management: scheduling, performance, retention. Annotation churn is high.
- Ramp time: weeks to months before quality stabilises.
The result is high fixed cost and slow time-to-value. Building makes sense when volume is large, stable, and ongoing, so those fixed costs amortise well, and when data is too sensitive to share externally.
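The amortisation argument can be sketched in a few lines of Python. The cost buckets mirror the list above (ramp time is left out because it is a delay rather than a line item). Every figure is a made-up placeholder, and the model assumes headcount stays constant as volume grows, which real teams will only approximate.

```python
from dataclasses import dataclass


@dataclass
class InHouseCosts:
    """Annual in-house cost buckets. All values are hypothetical placeholders."""
    salaries: float           # annotator and expert pay, including on-costs
    recruitment: float        # per-hire cost, repeated when people leave
    tooling: float            # platform licence or build-and-maintain cost
    guidelines: float         # senior staff time spent writing rubrics
    training: float           # calibration time before anyone is productive
    qa_and_management: float  # review, adjudication and scheduling time

    def total(self) -> float:
        return (self.salaries + self.recruitment + self.tooling
                + self.guidelines + self.training + self.qa_and_management)


def cost_per_label(costs: InHouseCosts, labels_per_year: int) -> float:
    """Spread the annual total over the labels actually produced."""
    if labels_per_year <= 0:
        raise ValueError("labels_per_year must be positive")
    return costs.total() / labels_per_year


# Made-up figures, chosen only to show the shape of the curve.
example = InHouseCosts(
    salaries=120_000, recruitment=15_000, tooling=20_000,
    guidelines=10_000, training=10_000, qa_and_management=25_000,
)
for volume in (50_000, 200_000, 1_000_000):
    print(f"{volume:>9,} labels/year -> {cost_per_label(example, volume):.2f} per label")
```

With the same annual spend, the per-label cost falls as volume rises, which is why building only pays off when volume is large and stable.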
What does it cost to buy?
Buying replaces most of that up-front fixed cost with an operating cost, either variable (per task or per hour) or a fixed monthly fee, and shifts the management burden to the provider.
Two buy models:
- Per-task marketplace (for example Mercor, Surge): you pay per hour or per task. Flexible for bursts, but you still own calibration and QA, and ongoing cost adds up at scale.
- Managed pod (for example OSCABE): a dedicated, trained team runs for you at a fixed monthly fee, with management included.
OSCABE's managed-pod pricing is transparent:
| OSCABE managed pod | From (per month) | Best for |
|---|---|---|
| Coding RLHF Team | £6,000 | Code review and coding RLHF |
| Training Data Pipeline Team | £8,000 | Annotation and data pipelines |
| Domain Expert AI Team | £9,000 | Legal, medical, finance, STEM |
| RLHF Evaluation Team | £10,000 | Preference data, eval, red-teaming |
That is roughly 75 to 80% cheaper than the effective cost of sourcing equivalent expert hours on per-hour gig platforms, with management and a UK contract included. See pricing for current figures.
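To make the percentage concrete, the sketch below works backwards from the "from" prices in the table to the gig-platform cost that a 75 to 80% saving would imply. The function name `implied_gig_cost` is hypothetical, and the outputs are arithmetic consequences of the stated claim, not quoted marketplace prices.

```python
# "From" monthly prices in GBP, copied from the pricing table above.
POD_FEES_GBP = {
    "Coding RLHF Team": 6_000,
    "Training Data Pipeline Team": 8_000,
    "Domain Expert AI Team": 9_000,
    "RLHF Evaluation Team": 10_000,
}


def implied_gig_cost(pod_fee: float, saving_pct: int) -> float:
    """Return the cost the pod fee is compared against, if the fee is saving_pct% cheaper."""
    if not 0 <= saving_pct < 100:
        raise ValueError("saving_pct must be in the range 0 to 99")
    # fee = baseline * (1 - saving), so baseline = fee / (1 - saving)
    return pod_fee * 100 / (100 - saving_pct)


for name, fee in POD_FEES_GBP.items():
    low = implied_gig_cost(fee, 75)
    high = implied_gig_cost(fee, 80)
    print(f"{name}: £{fee:,}/month implies £{low:,.0f} to £{high:,.0f}/month on gig platforms")
```

Because the table lists starting prices, the implied figures are lower bounds for each pod type.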
How do the costs compare side by side?
Here is an illustrative annual comparison for a small expert-annotation capability (assume the equivalent of a managed Training Data Pipeline Team). Figures are indicative estimates to show structure, not a quote.
| Cost component | Build in-house (annual) | OSCABE managed pod (annual) |
|---|---|---|
| Talent (salaries / fee) | High, plus on-costs | Included in fee |
| Recruitment | Per hire, repeated | Included |
| Tooling / platform | Licence or build cost | Included |
| Training and calibration | Senior staff time | Included ("Trained First") |
| QA and management | Ongoing senior time | Included |
| Replacement / churn | You absorb | Handled by provider |
| Indicative total | Significantly higher | From ~£96,000 (£8,000 × 12) |
The managed pod folds recruitment, tooling, training, QA, management and replacement into one predictable fee, which is why it usually wins on total cost for anything short of a very large, permanent in-house operation.
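The comparison reduces to a simple subtraction once you have your own in-house estimate. The sketch below annualises the Training Data Pipeline Team's "from" price and compares it with a few hypothetical in-house totals. The in-house figures are placeholders, not data from this guide.

```python
POD_MONTHLY_FEE_GBP = 8_000  # Training Data Pipeline Team, "from" price


def pod_annual_cost(monthly_fee: int = POD_MONTHLY_FEE_GBP, months: int = 12) -> int:
    """Annualise a fixed monthly fee."""
    return monthly_fee * months


def annual_saving_from_buying(in_house_annual: float,
                              monthly_fee: int = POD_MONTHLY_FEE_GBP) -> float:
    """Positive means the pod is cheaper; negative means building is cheaper."""
    return in_house_annual - pod_annual_cost(monthly_fee)


# Hypothetical in-house annual totals; replace with your own estimate.
for in_house in (80_000, 96_000, 200_000):
    saving = annual_saving_from_buying(in_house)
    verdict = "buy" if saving > 0 else "build" if saving < 0 else "break-even"
    print(f"in-house £{in_house:,}: difference £{saving:,} -> {verdict}")
```

The break-even point is simply the annual fee, so the decision rests on how honestly the in-house total accounts for recruitment, tooling, training, QA and churn.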
What is the decision framework?
Use these factors to decide. The more of your answers that land in the right-hand column, the more buying makes sense.
| Factor | Lean BUILD when | Lean BUY (managed pod) when |
|---|---|---|
| Volume | Very high and permanent | Variable or moderate |
| Data sensitivity | Cannot leave your walls at all | Shareable under NDA / controls |
| Time to start | You can wait months | You need to start in weeks |
| Domain expertise | You can recruit and retain experts | You need experts fast and managed |
| Management capacity | You have spare ops bandwidth | You want overhead removed |
| Cost predictability | You can carry fixed cost | You want a fixed monthly fee |
| Calibration over time | You will invest in it | You want it handled |
For most companies fine-tuning or aligning models, the centre of gravity sits firmly in the "buy" column: they need to start quickly, need domain experts they cannot easily hire, and do not want to run an annotation operation. A managed pod fits that profile. For the broader sourcing landscape, see Mercor vs Surge vs OSCABE for AI training.
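One way to apply the table is as a simple tally, sketched below. The factor keys and the `recommend` function are hypothetical. Two assumptions are baked in: data that cannot leave your walls is treated as a hard constraint that forces a build, and the remaining factors are weighted equally, which you may well want to change.

```python
FACTORS = (
    "volume", "data_sensitivity", "time_to_start", "domain_expertise",
    "management_capacity", "cost_predictability", "calibration",
)


def recommend(leans: dict[str, str]) -> str:
    """Map each factor's lean ('build' or 'buy') to an overall recommendation."""
    missing = set(FACTORS) - set(leans)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    if any(leans[f] not in ("build", "buy") for f in FACTORS):
        raise ValueError("each lean must be 'build' or 'buy'")

    # Assumption: data that cannot be shared at all overrides every other factor.
    if leans["data_sensitivity"] == "build":
        return "build"

    # Assumption: all other factors carry equal weight (simple majority).
    build_votes = sum(leans[f] == "build" for f in FACTORS)
    return "build" if build_votes > len(FACTORS) / 2 else "buy"


# A profile like the one described above: needs speed and experts, wants no overhead.
typical = {f: "buy" for f in FACTORS}
typical["volume"] = "build"  # even with permanent high volume, the rest lean buy
print(recommend(typical))
```

With seven factors there is never a tie, so the majority rule always returns a clear answer.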
Why does a managed pod beat a marketplace for ongoing work?
If you have decided to buy, the next question is marketplace versus managed pod. For sustained programmes, the managed pod's advantage is calibration that compounds. On a marketplace, contributors rotate, and each new contributor re-learns your rubric at your expense. With a managed pod, the same dedicated people stay on your project, so quality improves over time and rework falls.
OSCABE's "Trained First" model formalises this: the pod is trained on your rubric and workflow before producing live labels, so you are not paying for early-stage mistakes. The staffing model is described under how it works, and the wider options under managed teams and teams. The broad market for annotation and AI training services is widely estimated to be growing as more teams fine-tune models, though specific figures vary by source and should be read as general estimates.
Frequently asked questions
Should I build or buy a data labelling team?
Buy in most cases. Build in-house only when annotation is a core, permanent capability with high, stable volume, or when data cannot leave your walls at all. For everyone else, the fixed costs of recruitment, tooling, training, QA and management make a managed pod cheaper and faster. OSCABE's managed pods start from £6,000 per month and include management and a UK contract.
What hidden costs do people miss when building in-house?
The big ones are recruitment of scarce domain experts, annotation tooling, rubric authoring and training time, ongoing QA and adjudication, management, and the cost of churn. Annotation roles have high turnover, so you re-pay recruitment and training repeatedly. These costs often exceed annotator salaries and are why building looks cheaper than it is.
Is outsourced annotation lower quality than in-house?
Not inherently. Quality depends on calibration and management, not location. A managed pod with a clear rubric, measured inter-annotator agreement and a stable team can match or beat an under-managed in-house effort. OSCABE's "Trained First" approach trains the pod on your rubric before live work to protect quality from day one. For domain-heavy tasks, see hiring domain experts for AI model evaluation.
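Inter-annotator agreement can be measured in several ways. The sketch below uses Cohen's kappa for two annotators, which is one common choice and an assumption here, since this guide does not name a specific metric. The labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Agreement between two annotators, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: share of items where both chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labelled at random
    # with their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels for four items.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(f"kappa = {cohens_kappa(annotator_1, annotator_2):.2f}")
```

Tracking a score like this over time gives you a like-for-like way to compare an in-house team with an external one.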
How quickly can a managed pod start versus building in-house?
A managed pod can typically be scoped and started in weeks, because the provider already has the talent, tooling and management in place. Building in-house usually takes months once you account for recruitment, tooling, guideline authoring and the ramp to stable quality. If speed matters, buying wins clearly.
Make the build-vs-buy call on real numbers
The build-vs-buy decision is not ideological; it is a cost-and-speed calculation. Build when annotation is a core, permanent, high-volume capability you can staff and manage. Buy a managed pod in nearly every other case, because it removes the fixed costs and the management burden while protecting quality.
To get a concrete monthly figure for your annotation or RLHF programme, explore OSCABE's AI Training Teams or contact us. We will scope a managed pod trained on your rubric, with transparent pricing and a UK contract, so you can compare it directly against the true cost of building in-house.