Who are data annotators?

As the foundation for building training datasets, image annotators are the invisible heroes who support the development of neural networks and AI solutions.

Sadhli Roomy

Data Annotators.

Real trainer of thinking machines

Video originally published on The Medical Futurist YouTube channel on 10 June 2019.

Dr. Bertalan Meskó, a.k.a. The Medical Futurist, is correct: data annotators are indeed the frontliners who pave the way towards efficient AI solutions. Why? Because the accuracy of data labels ultimately affects the functional accuracy of an ML model. The demand for data annotation services far exceeds the global supply of trained annotators in fields such as geospatial and satellite imagery labeling, autonomous vehicle annotation, medical AI, human pose estimation, and sports analytics.

Case of Medical AI

Take medical AI, for instance: a field so specialised that it requires doctors, trained nurses and/or annotators with special qualifications to detect and label foreign bodies in medical imagery. Looking at the image below, I could not, for the life of me, differentiate between the two chest X-rays. I know something is up in the right image, given the haze over the lower lung, but then again, there is a black spot and some minute haze at the bottom right of the left image.

Pneumonia vs. Normal Chest X-Ray

I showed the image to a doctor intern at Acme AI, and he immediately recognised that the right one showed a form of pneumonia. This expert tier of annotators works well and is extremely precise. While annotation work for medical AI pays well, it might not be the go-to for doctors who earn the lion's share of their livelihood by providing medical services to patients.

From the perspective of the Global South, recent trends show final-year pharmacy students and/or young doctors expressing an interest in providing data annotation services. They do so predominantly because it is an avenue to accumulate experience and to earn remotely and on the go (a cultural pivot experienced during COVID-19), while the pull of working on a transformative medical AI model is also a major factor.

On a similar note, we saw a marked increase in the quality and agility of pose estimation annotation from animators and/or people holding a certificate in character modelling, as opposed to their peers. This is due to their relative familiarity with rigging human and animal anatomical structures, which gives them an advantage in specific tiers of work. Replicate this by sector, and there are specific people whose experience and education give them advantages in certain types of annotation.

The requirement for expert human labelers is tied to the rise of AI and the 4th Industrial Revolution that we are on the cusp of realising. It is instrumental for AI developers and researchers to have accessible, quality, and affordable training datasets. This is a formula that we are playing around with to provide the best possible package for our clients and, hopefully, to play a part in meeting the demand for annotation services in the near future.

Data preparation and engineering tasks represent over 80% of the time consumed in most AI and Machine Learning projects. The market for third-party Data Labeling solutions is $150M in 2018 growing to over $1B by 2023.

Cognilytica, 2019

Let's start a project together.

Get high-quality and pixel-accurate labelled data through us. We bring the best-in-class, scalable, and adaptive annotation and quality assurance services. Begin the final trek towards fine-tuning your supervised learning model with us.