Why the quality of annotations is more important than the quantity

Bad Teacher = Bad Students
Bad Data = Bad Model

Any machine learning model can only be as good as its training data. If you don’t annotate all occurrences of a field, the model receives contradictory signals about when it should extract that field and when it shouldn’t, and it will respond with low confidence scores.
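To make that concrete, here is a minimal, hypothetical sketch (the texts, the `invoice_number` field name and the character offsets are invented for illustration): the same value appears in two training documents, but only one occurrence is annotated, so the model is rewarded and punished for the exact same extraction.

```python
# Two hypothetical training examples for an "invoice_number" field.
# Texts, field name and offsets are invented for illustration only.
doc_a = {
    "text": "Invoice number: INV-2041. Payment due within 30 days.",
    "annotations": [
        {"field": "invoice_number", "start": 16, "end": 24, "value": "INV-2041"},
    ],
}

doc_b = {
    "text": "Please refer to invoice number INV-2041 in all correspondence.",
    "annotations": [],  # the same field occurs here but was never annotated
}

# During training, doc_a rewards the model for extracting this pattern,
# while doc_b punishes it for the exact same behaviour. The only way the
# model can "resolve" the contradiction is by lowering its confidence.
```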

Think of it as a teacher (you) who teaches their students (the neural network extraction model) confusing, contradictory or incomplete information. Those students will never get good results. This holds for any deep learning model: regardless of whether you are teaching a strong A.I. to play chess with reinforcement learning while rewarding bad moves, or using machine learning algorithms for document intelligence.

In the world of AI technology, annotators who don’t make mistakes are science fiction.

Annotation is hard and human annotators make mistakes all the time. Even highly educated, laser-focused engineers, accountants or sales reps fueled by coffee make mistakes on any specific task. Some companies resort to having every document annotated independently, from scratch, twice. That is a huge drain on your human capital! And even after applying this so-called “four-eyes principle”, an error margin of 2% often cannot be avoided. I repeat: annotation is hard and human annotators make mistakes all the time.

Depending on the document type, as much as 35% of documents can still contain (small) mistakes after the initial annotation. After a first full human validation, that number can drop to around 10%.

Quality is more important than quantity

Strong A.I. starts with good QA engineers

Correctness of annotations is hard to achieve, but crucial for good performance. Fixing annotation mistakes can easily lead to a 10-20% improvement of your model, and it takes considerably less effort than annotating extra documents from scratch. Moreover, adding new documents will not improve the accuracy of the model as long as the existing documents still contain mistakes. To get an accuracy improvement similar to simply fixing the original dataset, you need to add roughly five times more annotated training data. So we always recommend correcting the existing data first, before adding new data. The table below puts rough numbers on these options.

| Action | When | Effort | Example improvement |
|---|---|---|---|
| Perform a first annotation review on 100 documents | Annotations have never been reviewed, or are known to contain mistakes | At 2 minutes per document: about 3h20 | 10% |
| Perform a second annotation review on 100 documents | Annotations have been reviewed at least once, but could still contain some mistakes here and there | At 1 minute per document: about 1h40 | 2% |
| Add an additional 100 documents | You have checked 100 documents and found not a single annotation mistake | At 5 minutes per document: about 8h20 | 2% |
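As a quick sanity check of the table above, the snippet below computes the cost per percentage point of improvement from the table’s own ballpark figures (the per-document times and improvements are this article’s rough estimates, not measurements):

```python
# Cost-per-improvement comparison using the estimates from the table
# above: (action, number of documents, minutes per document, example
# improvement in percentage points). Ballpark figures, not measurements.
actions = [
    ("First annotation review", 100, 2, 10.0),
    ("Second annotation review", 100, 1, 2.0),
    ("Annotate 100 new documents", 100, 5, 2.0),
]

for name, docs, minutes_per_doc, improvement in actions:
    hours = docs * minutes_per_doc / 60
    print(f"{name}: {hours:.1f}h total, "
          f"{hours / improvement:.2f}h per % improvement")

# Output (rounded):
# First annotation review: 3.3h total, 0.33h per % improvement
# Second annotation review: 1.7h total, 0.83h per % improvement
# Annotate 100 new documents: 8.3h total, 4.17h per % improvement
```

Per percentage point of accuracy gained, reviewing existing annotations is roughly an order of magnitude cheaper than annotating new documents.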

Review tasks

Review tasks are used to help correct mistakes. In Metamaze, review tasks are automatically suggested based on data that might be misannotated. Think of it as automated testing for annotations.
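Metamaze’s suggested review tasks use a model under the hood, but a minimal, hand-rolled version of “automated testing for annotations” could look like the sketch below. The document structure and the helper function are hypothetical, not the Metamaze API: it flags documents in which a value that was annotated as a given field elsewhere appears in the text without an annotation.

```python
# Minimal consistency check (hypothetical data structures, not the
# Metamaze API): flag documents where a value that was annotated as a
# given field elsewhere in the dataset appears unannotated in the text.
from collections import defaultdict

def suspicious_documents(docs):
    # Collect every value that was ever annotated for each field.
    known_values = defaultdict(set)
    for doc in docs:
        for ann in doc["annotations"]:
            known_values[ann["field"]].add(ann["value"])

    flagged = []
    for doc in docs:
        annotated = {(a["field"], a["value"]) for a in doc["annotations"]}
        for field, values in known_values.items():
            for value in values:
                if value in doc["text"] and (field, value) not in annotated:
                    flagged.append((doc["id"], field, value))
    return flagged
```

Documents flagged this way are good candidates for a review task: either an annotation is genuinely missing, or the occurrence legitimately should not be annotated, which is exactly the kind of ambiguity a human reviewer should resolve.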

After the initial annotation of a certain document type, we recommend running a QA task on all documents using the Custom QA tasks module.

When a certain document type or field is underperforming, you can also run a targeted, custom QA task focusing on specific entities, annotators, variations, …

The road to max efficiency

However, checking all data of a certain document type can be inefficient: most documents will (hopefully) be fully and correctly annotated, so you waste time going over all of them.

Therefore, we introduced Suggested Review tasks in the Metamaze Platform, which use Artificial Intelligence and deep learning to select which documents need an extra review and which you can safely skip. The next blog post will go more in depth on how that works at a technical level.

Jos Polfliet, CTO Metamaze
For the past 8 years, Jos has contributed to 100+ Artificial Intelligence projects in various industries, countries and use cases. For the past 4 years, Jos has created and led multiple Artificial Intelligence implementation teams including the teams at Faktion, Chatlayer and now Metamaze. His mission is to inspire companies and individuals to benefit greatly from the use of A.I. and M.L. by doing useful things.
