
Matching Data Service

Matching Data is a clustered search technology that takes a data sample you provide and returns matching segments from the TAUS Data repository, selected by domain relevance. The result is a clean, high-quality dataset for MT training, tuned to your specific content.


How It Works

You Provide Us with a Data Sample

Tell us what kind of data you are looking for by providing a sample that represents the domain and language pair of interest.

We Search for Matching Sentences

We identify the best-matching data in the TAUS Data repository at the segment level and create three separate data selections.
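To make the idea of segment-level matching concrete, here is a minimal sketch in Python. It is not the TAUS implementation, which this page does not describe. It assumes a simple bag-of-words cosine similarity as a stand-in for the actual relevance measure, and it assumes the three selections differ only by how strict the similarity cutoff is. The function names, the stopword list, and the threshold values are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list so common words do not dominate the score.
STOPWORDS = {"the", "to", "a", "an", "at", "was", "after", "of", "and"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_matches(sample, repository, thresholds=(0.5, 0.3, 0.1)):
    """Score every repository segment against the sample's word profile.

    Returns one list of segments per threshold, each ordered best match first.
    The three thresholds are an assumption made for illustration only.
    """
    profile = Counter()
    for sentence in sample:
        profile.update(tokenize(sentence))

    scored = [(cosine(profile, Counter(tokenize(seg))), seg) for seg in repository]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    return [[seg for score, seg in scored if score >= t] for t in thresholds]
```

Calling `select_matches(sample, repository)` with a handful of in-domain sample sentences returns three lists, from the strictest selection to the broadest. A production system would more likely use multilingual sentence embeddings than raw word counts, but the overall flow (profile the sample, score each segment, cut at several levels) stays the same.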

You Review Matching Data Selections

We share the volumes (number of words and segments), samples, and price with you. You purchase only if you like the results.

Explore the Ready-Made Datasets

Colloquial corpus

eCommerce corpus

IT Instructions