Abstract
This paper explores data annotators’ employment and job-security patterns in the face of evolving artificial intelligence (AI). The paradox is that annotators, by improving AI models, potentially contribute to making their own jobs obsolete. Addressing a data gap in the often-overlooked annotation industry, the study draws on interviews with human data annotators, their managers, and AI company leaders. The findings challenge the fear that AI will render human data annotators unemployed. They also refute the lump of labor fallacy, the assumption that there is a fixed amount of work, so that any task taken over by AI necessarily means a job lost to humans. Moreover, the study emphasizes the importance of reskilling and upskilling for annotators to stay relevant: annotator roles will evolve along with AI, and higher-skilled workers will be in greater demand as more complex annotations are needed to improve AI model accuracy.
Introduction
Data annotation is a vital yet often overlooked economic facet in the rapidly evolving landscape of Artificial Intelligence (AI). It is a $1.5 billion market with a noteworthy Compound Annual Growth Rate (CAGR) of 25%. Despite its economic and technological significance, data annotation operates in relative obscurity, shrouded in secrecy due to its proprietary nature and the employment of annotators from economically disadvantaged backgrounds.
Data annotation, the labeling and categorizing of data for AI and Machine Learning (ML) models, challenges conventional economic models of automation and employment. The paradox is that while increased productivity is typically associated with expanded employment opportunities, data annotators face a unique challenge: their work inherently contributes to the automation that threatens their job security. In a striking twist, they are both builders and potential victims of the systems they help train.
This research paper aims to counter the oversimplified narrative that AI will render annotators obsolete. Instead, it argues that AI and human annotators operate in tandem, with automation acting as a productivity enhancer rather than a job destroyer. Interviews reveal that many annotators transition into higher-level roles—such as quality assurance, project management, or data strategy—as AI evolves. This finding aligns with studies warning that using AI purely for cost-cutting, rather than as a complement to human capability, undermines training initiatives and long-term growth.
Crucially, this paper is supported not only by qualitative interviews but also by a 13-year quantitative analysis of nearly 5,000 workers on Upwork, one of the world’s largest freelancing platforms (see the Quantitative Statistics section). That data reveals clear patterns: annotators who upskilled saw significant income increases, while those who did not were more likely to experience stagnant wages or leave the market entirely. This bifurcation of outcomes provides direct, empirical evidence of the labor market forces illustrated in Figures 1 and 2.
Figures 1 and 2 graphically represent these dynamics within an AI company in an unregulated economy, contrasting outcomes for high-skilled workers (e.g., managers or advanced annotators) and low-skilled workers (e.g., basic labelers). Figure 1 demonstrates the trend for low-skilled workers. As task complexity increases, the demand curve for low-skilled annotators shifts leftward (D to D₁), indicating reduced demand. This is confirmed by the Upwork findings: hundreds of annotators with stagnant skill sets exited the market entirely, and those remaining without upskilling saw diminished income over time. Together, the figures and data highlight the growing divide in outcomes based on skill acquisition and adaptability in the face of AI advancement.
In contrast, in Figure 2 the steeper supply curve (S₁) reflects the rising opportunity cost of high-skilled labor as AI tasks grow more complex and require specialized skills. The rightward shift in demand (D to D₁) shows how automation of simpler tasks increases the need for human oversight and strategic input, raising both wages and employment at the new equilibrium (E**). As confirmed by the Upwork data, this demand for higher-skilled roles has driven income growth for many annotators who adapt and transition.
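To make the mechanism behind Figures 1 and 2 concrete, the sketch below works through a minimal linear supply-and-demand model; the functional forms, parameter signs, and the size of the demand shift are illustrative assumptions chosen for exposition, not quantities estimated from the interview or Upwork data.

```latex
% Minimal linear sketch of the labor-market shifts in Figures 1 and 2.
% All functional forms and parameters are illustrative assumptions.
\[
\text{Supply: } w = a + bL, \qquad \text{Demand: } w = c - dL, \qquad a, b, c, d > 0
\]
\[
L^{*} = \frac{c - a}{b + d}, \qquad w^{*} = a + bL^{*}
\]
% Comparative statics of a demand shift (c -> c + \Delta c):
\[
\frac{\partial L^{*}}{\partial c} = \frac{1}{b + d} > 0,
\qquad
\frac{\partial w^{*}}{\partial c} = \frac{b}{b + d} > 0
\]
% Figure 1 (low-skilled): the demand intercept falls, so equilibrium employment
% and wages both fall. Figure 2 (high-skilled): the demand intercept rises and
% the supply slope b is steeper, so wages and employment both rise at E**.
```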
Literature Review
Data annotation plays a crucial role in the development and accuracy of AI models but remains an underexplored area in academic research. While studies have examined the social and technical frameworks surrounding work and technology, there is a significant gap in understanding how evolving AI impacts the annotation workforce’s job security and career paths.
The concept of visible and invisible work highlights how essential yet often overlooked tasks support technological systems1. This concept is especially relevant to data annotators, whose labor is critical but largely hidden behind AI’s success. Classification and categorization, in turn, shape social and technical systems, relating directly to annotation’s core function of labeling data for AI training2.
Intersectionality theory adds an important perspective by showing how multiple social identities intersect to affect marginalized groups disproportionately represented in annotation roles3, a point echoed in recent studies focusing on annotators’ social backgrounds and precarity4.
More recent empirical work specifically addressing AI annotation labor includes detailed examinations of annotators’ experiences, revealing complex motivations and the evolving nature of their work4. Other research explores the challenges faced by crowdworkers on platforms such as Mechanical Turk, emphasizing issues around job security, compensation, and autonomy—factors directly relevant to data annotators worldwide5.
As AI models become more complex, the demand for skilled annotators capable of handling varying and higher-order tasks grows. One study shows that annotation accuracy significantly declines as tasks grow more complex—such as those involving structured or multi-object labels—without sophisticated aggregation techniques, implying that annotators must continuously upskill and refine their methods to maintain data quality in evolving AI workflows6.
The literature also highlights the dual role of AI as both a disruptor and a complement to annotation work. Technology can reshape labor by reducing demand for routine jobs while creating new opportunities requiring human judgment and oversight7. Further studies underline that automation cannot fully replace the nuanced decisions and quality checks human annotators provide, a point reinforced by the interviews in this research8.
In summary, the literature supports the view that while automation influences the data annotation industry, human annotators remain essential—especially as tasks grow more complex—and that upskilling is critical for maintaining employment in this evolving field.
The paper is structured as follows: Section II explains the methods and interview questions used for this paper; Section III includes the data gathered from the interviews conducted with annotators, their managers, and the leaders of AI organizations; Section IV contains the quantitative statistics, and Section V includes the results and conclusion.
Methods
This research was conducted through qualitative interviews with various groups in the AI industry: four data annotators, six annotator managers, and eight heads of AI organizations from China, India, Pakistan, and the USA. Particular weight was placed on the responses of heads of AI organizations because of their power in shaping the industry’s direction. Three different questionnaires were developed for the three groups, with annotation “suppliers” asked about career progression and “demanders” asked about market changes (all interview questions are shown in the Appendix). The questionnaires were designed to fit within a 30-minute interview window. Interviews were conducted via Zoom or in person, depending on participants’ preferences and availability. In-person interviews were recorded with a phone voice-recording app and Zoom interviews with the platform’s built-in recording option, in both cases with participants’ consent. The initial respondents were personal industry contacts, and feedback from these early sessions informed refinements to the questionnaires for the subsequent interviews.
The questions were developed to capture trends in the field. The stakeholders represented both the supply and demand sides, which required different questions for each of the three groups. This ensured that the questions were relevant to each role and elicited the appropriate insights from each perspective.
The questions were standardized within each group of participants based on their role, and a semi-structured interview format was used. This allowed for follow-up questions and deeper responses. Once the interviews were conducted, the answers were transcribed with the help of an AI tool, Rev (rev.com), and I rechecked the AI-generated voice-to-text transcription to ensure accuracy. The data was then categorized in Excel, where differences in opinion were noted and analyzed to compare the supply and demand sides.
Additional respondents were found through LinkedIn. Of the 250 interview requests sent via emails and LinkedIn messages, 18 participants were secured for the study.
In addition, a scraper program was written to collect data on 4,967 users of Upwork, one of the largest freelancing websites, to serve as a proxy for the annotation job market. Data was obtained separately for five different categories of annotators. A Python program was then used to convert the scraped .json data into a usable dataset for answering questions from the supply side. The data covers a 13-year period (2012-2024).
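As an illustration of this conversion step, a minimal Python sketch is shown below; the file names and JSON field names (such as “earnings_by_year” and “category”) are hypothetical, since the paper does not reproduce the scraped Upwork schema.

```python
# Minimal sketch of the .json-to-table conversion step described above.
# The field names ("earnings_by_year", "category", etc.) are illustrative
# assumptions; the actual scraped Upwork schema is not reproduced in the paper.
import csv
import json
from pathlib import Path

def flatten_profiles(json_path: str, csv_path: str) -> None:
    """Read scraped freelancer profiles and write one row per worker-year."""
    profiles = json.loads(Path(json_path).read_text(encoding="utf-8"))

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["worker_id", "category", "year", "earnings"])
        for p in profiles:
            # earnings_by_year is assumed to map "2012"..."2024" to annual income
            for year, earnings in sorted(p.get("earnings_by_year", {}).items()):
                writer.writerow([p["id"], p.get("category", ""), year, earnings])

if __name__ == "__main__":
    flatten_profiles("upwork_annotators.json", "upwork_annotators.csv")
```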
Data Collected
The paper provides a snapshot of the annotation industry as of 2023. The heads of AI companies interviewed are engaged in state-of-the-art application development for a global market, offering insights into current needs and foreseeable trends. The annotators, generally based in developing countries, are broadly representative of annotation workers worldwide.
1) Annotators
Background and Entry into Data Annotation
Four data annotators were interviewed: Participant A (from China), Participant B (from Bangladesh), Participant C (from Pakistan), and Participant D (from the USA). All participants held undergraduate degrees, and one held a Master’s degree. Each annotator had, on average, about a year of experience in their current position. Entry into the data annotation field came through internships, remote work engagements, or connections from undergraduate alumni networks.
Upon joining the industry, the annotators displayed varied initial career aspirations. Participant B started by cleaning data and preparing it for exploratory analysis but wanted to transition into a role more directly connected to training AI models. This eagerness stemmed from curiosity about the purpose behind his work.
Participant D sought a flexible role with competitive hourly wages and resources to navigate the intricacies of their newly chosen field. Participant C, whose focus was on China, joined a project related to Asian American hate incidents. This opportunity came from an alumni network of a Chinese learning program. Their interest in Asian Americans and China served as a driving force, complementing their prior engagement in social justice and human rights endeavors.
Work Environment and Job Performance
The annotators worked about 20 hours a week and did not disclose how much they were paid for their efforts. Most worked part-time or as interns, as they were looking for remote-work experience or an entry point into the AI industry.
They had distinct working arrangements regarding hours, remote work, and flexibility. Participant B’s role centered on translating Bengali into English for AI model training. Two of the annotators worked remotely, transforming qualitative data into binary codes. This required interpreting information based on feelings and personal opinions (subjective data analysis). On the other hand, Participant D was tasked with translating and labeling data. She appreciated the convenient working hours and the ability to work remotely.
Participant B worked remotely from Pakistan with My Digital Health, a Bangladesh-based company. For his internship, he navigated health-related queries, categorizing them based on organs or diseases, with translation into multiple languages to train AI models. On the other hand, Participant D reviewed various forms of content and output for accuracy based on the batches of data uploaded by the client. Communication approaches and workload allocation varied with company and project scale. Meanwhile, Participant C’s workload fluctuated due to team expansion, with weekly hours spanning 1.5 to 10, facilitated by remote arrangements and performance objectives. Their chief role involved qualitative data review on Google spreadsheets for criminal justice purposes.
Job Satisfaction and Career Aspirations
Most annotators found their jobs satisfying due to the friendly remote work culture, supportive management, and the opportunity to earn while studying. However, they found the tasks redundant and less stimulating over time, as they involved repetitive labeling and translation of similar data. Despite their interest in training AI models, their role was limited to data annotation.
One key finding is that, according to the interviews, the companies engaging data annotators generally did not show strong support for career advancement, since contractual arrangements did not promise such progression. However, thanks to the training and experience most annotators gained, most saw potential for vertical movement within the industry. Notably, Participant B transitioned to a role in product management and worked on training ML models to detect inappropriate images. These experiences, though contrasting with the general trend in the annotation industry, highlight instances where data annotators find avenues for personal and professional growth within their roles.
Relevance of Human Data Annotators and Transferable Skills
To stay relevant in the annotation field, one annotator believes it still makes sense to use people to check data quality even as automation progresses: humans can screen what other humans ask of a system and what they will have access to, which adds a solid safety measure to the field. Participant C talked about developing their judgment and reading skills, since AI can mostly draw only simple conclusions from its input. As for applying skills and expertise from data annotation to other fields or industries, the annotators emphasized the importance of understanding human behavior in their job performance. They believed that an empathetic understanding of people’s moods and desires, particularly as they explore new things, is a skill that AI cannot automate. Generative AI holds significant promise as the future wave, and it is worth emphasizing the word “future”: while McKinsey and Company project widespread adoption of generative AI by 2050, the managers of the annotator teams believe that AI will never attain perfection and that human oversight will always be necessary to ensure the quality of results, so annotators are unlikely to become redundant as of yet9.
They also highlighted how their role developed their judgment and initiative, and noted gaining proficiency in data-organization tools such as Google Sheets and Excel, which they could apply in other work contexts.
2) Managers
Team Structure and Responsibilities
Six managers were interviewed. They were responsible for checking the quality of work through model accuracy, for assigning tasks, and for training annotators. They had 5-7 years of experience in the AI industry on average.
These managers, each overseeing annotation teams of 2 to 300 annotators, are essential in coordinating the annotation process. Annotators’ primary tasks encompass image and video tagging, assigning tags to feedback, and contributing to developing machine learning solutions. “Understanding what the requirements are, what’s required, why it’s required, and enjoying the work” is what managers seek in annotators. These key attributes include understanding the work’s demands and purpose, deriving satisfaction from the tasks, producing high-quality work, domain expertise, and comfort with ambiguity, that is, an ability to annotate uncertain scenarios that may not be explicitly outlined in guidelines or tasks. Precision and keen attention to detail are vital traits, as annotators often need to identify subtle, easily overlooked features within the data they work on. This skill, honed through practice, is invaluable: annotators who can pinpoint potential issues and fix them help make AI models more accurate.
The typical career progression for data annotators within these managers’ organizations begins with more complex annotation roles: as annotators gain experience, they take on more complex annotation work. One main path is QA (quality assurance), where annotators progress to ensuring the quality of the annotations being produced. Another path is productivity management, where they manage productivity and oversee the use of annotation tools within the team. They may also transition into leadership roles such as project management, join the product team working closely with the company’s main product, become product managers or engineers, or move into design roles working on the annotation tools the team uses.
Challenges and Job Security in Data Annotation
Data annotation quality is managed through industry-standard practices, including multiple reviews, a cascading review system for conflicting annotations, individual performance monitoring against a dataset baseline, a quality assurance team, coaching, training, vocabulary-enhancing meetings, model evaluation, and tools for result quality estimation with prompt feedback and reevaluation when needed.
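As a rough illustration of one of these practices, individual performance monitoring against a baseline dataset, the sketch below computes each annotator’s agreement with gold labels and flags low scorers for coaching; the data layout and the 0.9 threshold are assumptions for illustration, not details reported by the managers.

```python
# Illustrative sketch of monitoring annotators against a gold-labeled baseline.
# The 0.9 accuracy threshold and the (item_id -> label) layout are assumptions.
from collections import defaultdict

def accuracy_against_baseline(annotations, gold):
    """annotations: list of (annotator_id, item_id, label); gold: {item_id: label}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for annotator_id, item_id, label in annotations:
        if item_id in gold:                     # only score items with a gold label
            total[annotator_id] += 1
            correct[annotator_id] += int(label == gold[item_id])
    return {a: correct[a] / total[a] for a in total}

def flag_for_coaching(scores, threshold=0.9):
    """Return annotators whose baseline accuracy falls below the threshold."""
    return [a for a, acc in scores.items() if acc < threshold]

# Example usage with made-up labels.
gold = {"img1": "car", "img2": "person"}
annotations = [("ann_a", "img1", "car"), ("ann_a", "img2", "person"),
               ("ann_b", "img1", "car"), ("ann_b", "img2", "tree")]
scores = accuracy_against_baseline(annotations, gold)
print(scores, flag_for_coaching(scores))   # ann_b falls below the threshold
```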
These organizations prioritize managing work-related stress and overtime by reevaluating workloads to prevent stress and unrealistic burdens. They compensate for overtime during high-demand situations, adopt comprehensive scheduling systems, and encourage team well-being through regular engagement sessions and tailored support.
Organizations ensure data annotator job security amid AI advancements by prioritizing retention, offering career progression through mandatory courses, and maintaining alumni networks. Project transitions guarantee continuous employment, and AI is leveraged to enhance annotator productivity instead of replacing them, with the focus remaining on data quality oversight. As one manager succinctly put it, “The automation isn’t there to replace the annotator but is there to boost [them] instead.” This approach reflects the organizations’ commitment to sustained employment and professional development within the evolving landscape of AI technology.
Certain tasks pose challenges for automation, though which ones depend mainly on the AI’s stage of evolution. More advanced AI can handle more cognitive tasks but still lags behind humans, as one manager put it: “Anything my model cannot visually detect or requires a multi-step understanding.” This quote is representative of the consensus among managers that AI, particularly when trained primarily on visual data, struggles with tasks requiring multi-step reasoning and with non-visual elements such as intentions or abstract concepts. Similarly, another manager said that work requiring “excessive brainstorming” forces the annotators to step in. This limitation is evident in scenarios such as evaluating sequences of events to determine risky behavior, where human abilities excel.
The Evolving Role of Data Annotators
The demand for data annotation work is different between organizations. While some managers note substantial growth in the need for data annotators, complemented by more product demand, others perceive a demand-based fluctuation. One manager highlights the impact of recent tools on reducing the necessity for additional annotators. Still, another manager describes a scenario where the demand for data annotators plummeted after a specific project concluded. They emphasize the distinctive nature of the role’s evolution across industries, driven by AI products’ varying progress and maturity. This insight acknowledges the complexity of predicting trends in annotation demand solely based on overarching industry trends. Another manager predicts a higher demand for annotators and an increasing role complexity.
Managers report that as AI models improve, annotators transition into quality experts who review AI model outcomes rather than annotating each piece of data. Still, a different perspective regards the role of data annotators as temporary, particularly in the context of evolving AI technology, as one manager noted: “It’s something that is not a long-term thing.” Data annotators’ skills are developed through upskilling and reskilling initiatives, either through educational programs or mandatory courses, reflecting a commitment to adapting their expertise to meet changing demands.
Opinions on the feasibility of completely automating data annotation vary among the respondents. While some believed that annotation could be fully automated, the majority disagreed. One manager firmly asserts, “I believe it cannot be. There will always be some human involvement needed.” They highlight the requirement for human input in the process. Building on this, another leader talks about the limitations of wholly automated systems. They note the success of LLMs but mention the risk of hallucination as more data is incorporated, necessitating human intervention to rectify erroneous outputs. This perspective acknowledges the irreplaceable role of human annotators in ensuring data accuracy. Another manager builds on the statement that complete automation is unattainable due to the risk of machine learning models generating inaccurate results, as they may not be verified. These opinions may be accurate because the managers have seen firsthand the need for human-annotated data to make models run accurately and the issues with fully automated systems.
3) Heads of AI Organizations
Organization Overview and Data Annotation Practices
The eight leaders interviewed came from diverse industries, including vision computing, information technology (IT), software development, and technology services. They held top roles, such as CEO or CFO, or headed the company’s entire AI division.
Organizations employ data annotation services for different purposes, such as annotating unsafe behaviors in factories, using supervised learning for model training in domains like computer vision or Natural Language Processing, and providing global data annotation services for digital transformation initiatives. Outsourced annotation tasks include drawing bounding boxes around objects or potentially dangerous actions, image annotation and classification, object recognition involving the annotation of multiple objects within images, text annotation, and labeling situations through human review. Such labeling helps train machine learning models for predictive purposes.
Data annotation is of the utmost importance to the interviewed organizations. As one leader put it, “If you don’t have good quality [human] annotated data, then I think it is tough to train good quality models…high-quality annotated data is probably like oxygen. Without it, there is no chance that you can train those models.” The emphasis is on the critical role of high-quality annotated data in model training.
Automation in Data Annotation and Investment Decisions
In the AI organizations interviewed, task automation ranges from 10% to 33%, and most organizations want to increase this share, though their motivations differ. One participant cites the need for more data to improve accuracy: “We will increase the use of annotators as well because to get more and more accuracy, we just need a lot more data.” Another participant talks about expanding machine learning algorithms and models through increased automation: “Yes, absolutely, we would want to increase this number because annotation is a very human-dependent task. Suppose we can increase the number of datasets that are annotated without the intervention. In that case, we can significantly increase the base of developing new machine learning algorithms and models. Otherwise, your biggest constraint is the availability of that data, so if machines can start solving the problem of annotated data, then I think we can train models faster.” Conversely, one organization does not want to escalate automation because its AI domain requires tailored human involvement for optimal results. The interest in expanding this ratio stems from the desire to save costs, improve data quality, and achieve quicker results. Generally, the value and scalability provided by AI are so high that organizations are willing to spend on both annotators and automation; in other words, the result is more automation and more annotation jobs.
The expected demand for human data annotation within organizations varies. Certain respondents anticipate an upsurge driven by the expanding workload and the constant need to improve accuracy. Viewpoint A anticipates stability in human data annotation services over the next 2-3 years, followed by a decrease. Meanwhile, perspective B predicts reduced annotator employment, even though they still use annotation services. This perspective emphasizes the influence of task complexity on demand, as more complicated assignments can be challenging for annotators. To put this in perspective, viewpoint A, working on more complex tasks that cannot be automated, aligns with Figure 2, whereas perspective B, working on tasks that can be easily automated, aligns with Figure 1, where demand for human annotators decreases as automated solutions are developed.
Quality Concerns and Comparisons between Human and Automated Annotation
A relative strength of automation is that human-annotated data has a higher chance of containing mistakes. Inadequate training of data annotators or ambiguous instructions can result in significant errors, usually leading to the data being discarded and the task being redone. Quality problems can persist as flawed data pours into the models, leading to noticeable issues during model implementation. One organization interviewed had many challenges with human annotators: some annotators lacked computer proficiency, misclassified data, or labeled data incorrectly, affecting the end product. Moreover, the slowness of human annotation work is acknowledged, with one participant noting that a human can annotate only a few hundred images in an hour compared to the practically unrestricted speed of automation. This human limitation becomes evident in scenarios requiring model enhancements or the addition of new objects, which may necessitate hiring more personnel or speeding up processes.
Additionally, human annotation is more expensive than automated solutions. Limitations in accuracy are also acknowledged: while human annotators excel at recognizing complicated differences between objects, human accuracy levels may not keep pace once automation reaches its full potential.
The advantage of human annotation over automated solutions is in gathering initial data for systems to learn from, especially where no existing data is available. Once a base model has been established, strategies like ensemble modeling can be used, in which multiple models “vote” when there is a disagreement between them. Another organization discusses the benefit of lower error rates achieved through human annotation. While AI advancements are evident in fields like NLP, they acknowledge that AI’s capabilities are still evolving, especially in domains beyond text analysis. They stress the irreplaceable role of human intelligence in recognizing complex patterns and variations.
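A minimal sketch of this ensemble idea is shown below: several base models label the same item and a majority vote decides; routing unresolved ties to a human annotator is an assumption added for illustration rather than a detail reported by the organizations.

```python
# Minimal sketch of the "models vote on disagreements" idea described above.
# The escalation-to-human rule on ties is an illustrative assumption.
from collections import Counter

def ensemble_label(predictions):
    """predictions: labels from several models for one item, e.g. ["cat", "cat", "dog"]."""
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    tied = [l for l, v in counts.items() if v == votes]
    if len(tied) > 1:
        return None          # no clear majority: route the item to a human annotator
    return label

# Example: two of three base models agree, so the ensemble keeps "cat";
# a three-way split is sent for human review.
print(ensemble_label(["cat", "cat", "dog"]))   # -> "cat"
print(ensemble_label(["cat", "dog", "bird"]))  # -> None (human review)
```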
The cost-effectiveness comparison between human annotators and automated solutions, as observed in the interviewed organizations, varies with the complexity of the task. One organization emphasizes that human annotators cannot become redundant given the current technology landscape, highlighting the need for both approaches to coexist. They point out that while automation can handle more straightforward tasks, anything beyond its capabilities calls for human annotators. Several organizations agree that the cost-effectiveness of human annotators remains high, particularly when automated solutions lack the desired accuracy. Another organization highlights the cost dynamics of a place like Pakistan, where low wages make manual, non-technical, and non-analytical work financially viable, and stresses the feasibility of hiring many data taggers or annotators through platforms like Amazon’s Mechanical Turk. On this view, developing automated systems requires substantial investment in hardware, including GPU-based machines. In contrast, another organization plans to continue employing human data annotators rather than relying on automated solutions due to cost disparities.
Although AI has advanced, it has not yet mastered automating complex tasks, keeping human annotators crucial for their precision and cost-effectiveness. This evolving tech landscape requires both human expertise and automation to maintain accuracy and efficiency. Speed in annotation is vital for organizations, impacting model development and customer satisfaction. With AI taking over more straightforward tasks, human annotators are shifting towards more complex and specific challenges. This progression points to a future where humans may supervise and refine the work of automated systems like ChatGPT. As AI improves at general pattern recognition, the human role in data annotation will increasingly focus on complex decision-making.
Future Plans and Anticipated Changes in Data Annotation
“I don’t see a future where no human annotators will be required at all,” expressed one organization leader. Most leaders believe that ‘human reinforcement’, the final step in which human feedback enhances AI models, will keep annotators relevant, as AI will constantly generate new tasks whose completion benefits AI systems. This supports the idea that leaders, although wanting to increase the use of automated solutions, still need human involvement for tasks that require contextual understanding, such as data from specialized manufacturing or industrial plants. The unavailability of initial data undermines the performance of AI models, and human intervention remains essential in these situations.
Certain roles remain entirely human, such as annotating niche languages that AI models do not yet know. Given the variety of languages and dialects globally, this remains an enormous task. These advancements contribute to heightened task complexity and demand increased expertise from data annotators. Another leader links the surge in interest in generative AI and AI/ML, particularly with the advent of LLMs and ChatGPT, to more challenging and diverse data annotation requirements, showcasing the dynamic nature of the field.
Ethical Considerations, Career Growth, and Future Relevance in the AI Industry
Although data annotation is not complex work, it can offer at least some form of employment to marginalized communities that lack ample job opportunities. However, leaders acknowledge that ethical concerns tend to surface when these communities encounter exploitative conditions such as low wages and inhumane working environments. The crux lies in ensuring equitable wages and fair treatment, which, according to the leaders, can give data annotation work a positive impact on the marginalized communities it involves.
According to the leaders of AI organizations, staying relevant within the AI industry requires consistent learning. They advise engaging in diverse projects to broaden experience while stressing the importance of upskilling. One leader talks about integrating AI assistants, particularly generative AI assistants, into workflows to streamline tasks, boost efficiency, and cut costs; adopting this mindset can lead to continuous improvement, productivity, and precision. Another leader recommends the principle of lifelong learning, stating that staying updated on current AI tools and concepts can advance annotators’ careers. Embracing a professional identity within their field, annotators are encouraged to actively pursue skill-enhancement opportunities. Keeping pace with AI industry trends and breakthroughs is essential to remain relevant.
Quantitative Statistics
Historical and projected wage trends for data annotators in various markets show that the rise of AI and machine learning has fueled demand for annotated datasets, leading to a sharp increase in work opportunities and income.
AI complements rather than replaces annotators, but as AI progresses, annotators who fail to upskill lose their jobs. Analysis of workers’ income from 2012 to 2024 in a dataset of 4,947 workers revealed patterns of upward mobility. Based on income changes, workers were classified as “Upskilled” or “Obsolete.” Workers were considered upskilled if their income increased significantly, indicating the improved skills necessary to remain relevant in a fast-changing AI landscape. Those whose incomes stagnated or declined were deemed obsolete, suggesting reduced workforce relevance due to a lack of work. The results showed that 76.6% of workers experienced upskilling, while 23.4% became obsolete. This shows how important upskilling is for remaining employed in this labor market. It is also consistent with task complexity increasing over time, since rising complexity is what pushes annotators to upskill.
According to the data, job losses occur when workers become redundant. Since AI is ever-changing and annotation work becomes more complex as AI improves, annotators who fail to learn new skills cannot handle the more complicated annotations and risk losing work. A second dataset contains 1,157 workers. Of these, 402 dropped out (no earnings recorded across the dataset), leaving 755 active workers. Among the 755 active workers, 18 showed a downward trend in earnings and 737 an upward trend.
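A minimal sketch of this classification logic, applied to the yearly earnings trajectories described above, might look like the following; the growth threshold and the example trajectories are illustrative assumptions, since the study reports only the resulting proportions.

```python
# Illustrative sketch of classifying workers by their yearly earnings trajectory.
# Threshold values and example data are assumptions; the study reports only shares.
def classify_worker(yearly_earnings, growth_threshold=0.10):
    """yearly_earnings: earnings ordered by year, e.g. [1200, 1500, 0, 0]."""
    if not any(yearly_earnings):
        return "dropped_out"                 # no earnings recorded at all
    first = next(e for e in yearly_earnings if e > 0)
    last = next(e for e in reversed(yearly_earnings) if e > 0)
    change = (last - first) / first
    if change >= growth_threshold:
        return "upskilled"                   # income grew: proxy for new skills
    return "obsolete"                        # stagnant or declining income

workers = {
    "w1": [800, 1200, 2500, 4000],   # rising income -> upskilled
    "w2": [900, 850, 700, 650],      # declining income -> obsolete
    "w3": [0, 0, 0, 0],              # never earned -> dropped out
}
labels = {wid: classify_worker(e) for wid, e in workers.items()}
share_upskilled = sum(v == "upskilled" for v in labels.values()) / len(labels)
print(labels, f"upskilled share: {share_upskilled:.1%}")
```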
Moreover, the graph below provides further evidence of annotators leaving the industry or moving into better jobs.
Consistent with the interview data, annotation is often a short-term job, a stepping stone into the AI industry, or a way of making extra money, which is why many people do not stay in the field for long. The average length of employment was 3.97 years.
The Upwork data scraped for this study finds strong support in existing global research. Annotators who failed to upskill experienced stagnant or declining incomes, mirroring the leftward shift in demand for low-skilled labor shown in Figure 1. This trend aligns with influential research estimating that 47% of U.S. jobs (mostly routine, low-skilled ones) are at high risk of automation10. The World Bank similarly notes that in developing countries, low-skilled roles face the highest automation threat11. An ILO report on crowdwork also cautions that annotation tasks often come with minimal labor protections, leading to volatility in employment and pay12. In contrast, Figure 2 illustrates rising demand for high-skilled roles, a pattern reinforced by a McKinsey Global Institute report, which projects growth in jobs requiring advanced AI-related skills such as quality assurance, data strategy, and oversight9. The OECD’s AI Employment Outlook echoes this, noting net employment gains in AI-intensive sectors due to increasing task complexity13. Stanford’s AI Index further supports this shift, citing a global rise in job postings for roles adjacent to high-skill annotation14. Together, these sources validate the Upwork trends and illustrate a clear bifurcation: annotators who evolve with AI remain relevant, while those who do not risk obsolescence.
Results
This research investigates the evolving dynamics of the data annotation industry by integrating qualitative insights from 18 interviews (annotators, managers, and AI leaders) with quantitative data scraped from 4,967 Upwork profiles. The results reveal three key trends. First, upskilling is essential: 76.6% of workers showed the income growth associated with upskilling, while 23.4% became obsolete, either dropping out of the industry or experiencing declining income. Second, the role of annotators is changing. While once limited to repetitive tagging, annotators are increasingly moving into quality assurance and productivity management roles. As one manager put it, “The automation isn’t there to replace the annotator but to boost [them] instead.” Third, the demand for human judgment in complex tasks remains high. Annotators and AI leaders alike emphasized that AI still struggles with tasks involving empathy, intent, or nuance. One annotator reflected, “AI can’t understand what people feel—we still have to read between the lines.” Together, these findings challenge the narrative that AI will replace annotation work entirely. Instead, they suggest that annotation is evolving into a higher-skill field, one that still risks exclusion and exploitation unless ethical standards and training access improve. This study calls for more inclusive, globally diverse research and policy frameworks to ensure data annotators are not left behind in the race toward automation.
Conclusion
This research explores the data annotation industry, bridging gaps in our understanding of AI creation by interviewing four annotators, six managers, and eight heads of AI firms. It also includes a quantitative analysis of annotators on one of the largest freelancing platforms to gain further insight into how much annotators are paid, their job progression, and their employment patterns. It reveals a complex reality beyond the fear of AI-induced unemployment: a dynamic industry where niches requiring nuanced human judgment persist despite the drive towards automation. The findings advocate that annotators upskill for better job security and view annotation as a potential stepping stone towards managerial roles, while acknowledging the ethical implications of such employment for marginalized communities. The quantitative data supports this: annotators who did not upskill were left with stagnant wages and eventually exited the job market. As AI advances, the industry faces a critical juncture, necessitating continued investigation and dialogue, especially considering the moral dimensions and the need for a more inclusive and geographically diverse research approach to navigate its evolving landscape successfully.
Acknowledgments
I sincerely appreciate those who helped me with this research. My biggest thanks go to my mentors, Mr. Cameron Ricciardi and Mr. Trevor Dean Arnold, for their invaluable guidance throughout the planning and conduct of this research paper. I also want to express my gratitude to the leaders, managers, and annotators who contributed significantly despite time zone differences.
I would also like to extend a special thanks to Mr. Umair Khan and Mr. Yaseer Bashir for their exceptional support and collaboration in writing this paper. Thank you both for your help.
Appendix
Questions asked of annotators
- What is your educational background?
- How did you come to work in data annotation?
- What were your career aspirations when you started as an annotator?
- Can you describe your work environment?
- How many hours do you work on an average day?
- Can you describe any job performance pressure you may experience?
- Have you tried to transition into other roles within the AI/ML industry? If yes, can you explain your work’s day-to-day procedures and structures?
- How would you describe your current role and responsibilities as a data annotator?
- Have you received any training or upskilling in your job? If yes, what skills?
- What skills have you acquired or honed during your tenure in data annotation?
- How has your role evolved since you started in this field?
- Have you observed any changes in the demand for data annotation in your organization?
- What is your perception of the potential for career progression within the data annotation sector?
- Have you considered transitioning to roles outside of data annotation due to concerns about automation? If so, which ones?
- Do you consider data annotation to be a stepping stone to the technology and AI industry?
- What factors do you believe contribute to maintaining the relevance of human data annotators despite advancements in AI?
- Can you describe any efforts or initiatives your company takes to support your career aspirations?
- Are you satisfied with your current job? Why or why not?
- What types of tasks do you typically perform that you believe cannot be easily automated?
- How do you feel the skills and expertise from your data annotation work apply to other fields or industries?
- Have you received any promise of promotion or career advancement?
- Do you feel that your job provides you with sufficient opportunities for career progression?
Questions asked of managers of annotator teams
- How many data annotators do you manage?
- Can you describe the type of work your team is responsible for?
- How do you assess the performance of your team members?
- Have you noticed a change in demand for data annotation work in your organization?
- How do you see the role of data annotators changing as AI technology evolves?
- Are there any plans for upskilling or reskilling within your team in response to advancements in AI?
- Do you think data annotation work can be fully automated in the future? Why or why not?
- What qualities do you value most in a data annotator?
- What are the common career progression paths for data annotators within your organization?
- How do you manage the quality of data annotation in your team?
- How are issues like overtime and work-related stress handled within your team?
- Are there specific annotation tasks that are particularly difficult to automate?
- What steps is your organization taking to ensure job security for data annotators in light of AI advancements?
Questions asked of the leaders of AI organizations
- What is the primary industry and nature of your organization?
- Can you briefly explain your organization’s use of data annotation services?
- What kind of data annotation tasks do you typically outsource?
- What is the importance of data annotation in your business operations?
- How much of your data annotation tasks are currently automated? Do you plan to increase this ratio? If yes, why?
- Have you invested, or are you considering investing in the development of automation technologies for data annotation? If yes, could you share the reasons behind this decision?
- Have you experienced any issues with the quality of human-annotated data? If so, could you describe these issues?
- What do you perceive as the most significant benefits of human data annotators over automated solutions?
- What do you perceive as the most significant drawbacks of human data annotators compared to automated solutions?
- How do you perceive the cost-effectiveness of human data annotators compared to automated solutions?
- How important are the speed and turnaround time for data annotation in your organization’s projects?
- How do you see the role of human data annotators evolving in your organization as AI technology advances?
- Do you think there are certain jobs that only humans can do for data annotation, even as AI technology improves? If so, can you explain what those jobs are?
- What are your future plans for utilizing human data annotators versus automated annotation? What factors influence these decisions?
- Have you observed an increase in the complexity or diversity of data annotation tasks required by your organization? If so, in what ways?
- Do you anticipate that the demand for human data annotators in your organization will increase, decrease, or remain stable over the next 5 years?
- What does your organization think about the ethical concerns of giving data annotation work to marginalized societies?
- How does your company help data annotators grow in their careers? Do you have plans to train them or help them move into more advanced roles?
- In your opinion, what steps, if any, should data annotators take to ensure their relevance in the AI industry as it evolves?
References
- Star, S. L., & Strauss, A. (1999). Layers of silence, arenas of voice: The ecology of visible and invisible work. Computer Supported Cooperative Work (CSCW), 8(1), 9–30. https://doi.org/10.1023/A:1008651105359
- Bowker, G. C., & Star, S. L. (2000). Sorting things out: Classification and its consequences. MIT Press.
- Crenshaw, K. W. (1991). Mapping the margins: Intersectionality, identity politics, and violence against women of color. Stanford Law Review, 43(6), 1241–1299. https://www.jstor.org/stable/1229039
- Wang, D., et al. (2022). Whose AI dream? In search of the aspiration in data annotation. arXiv. https://doi.org/10.48550/arXiv.2203.10748
- Irani, L. C., & Silberman, M. S. (2013). Turkopticon: Interrupting worker invisibility in Amazon Mechanical Turk. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 611–620. https://doi.org/10.1145/2470654.2470742
- Braylan, J., et al. (2023). A general model for aggregating annotations across simple, complex, and multi-object annotation tasks. arXiv. https://arxiv.org/pdf/2312.13437v1
- Berg, M. (1998). The politics of technology: On bringing social theory into technological design. Science, Technology, & Human Values, 23(4), 456–490. https://doi.org/10.1177/016224399802300406; Dourish, P. (2010). HCI and environmental sustainability: The politics of design and the design of politics. Proceedings of ACM DIS 2010. https://research.ed.ac.uk/files/364486013/Bringing_Sustainability_through_PROST_DOA30042023_AFV_CC_BY.pdf
- Felstiner, A. L. (2011). Working the crowd: Employment and labor law in the crowdsourcing industry. Berkeley Journal of Employment & Labor Law, 32(1), 143–204. https://www.researchgate.net/publication/43919661_Working_the_Crowd_Employment_and_Labor_Law_in_the_Crowdsourcing_Industry; Ruhleder, K., & Star, S. L. (2001). The moral and infrastructural politics of “the digital divide”: Rethinking the concept of access. Proceedings of the 7th European Conference on Computer-Supported Cooperative Work, 247–266.
- McKinsey & Company. (2023, August 1). The state of AI in 2023: Generative AI’s breakout year. McKinsey Global Institute. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year
- Frey, C. B., & Osborne, M. A. (2013). The future of employment: How susceptible are jobs to computerisation? Oxford Martin School, University of Oxford. https://www.oxfordmartin.ox.ac.uk/downloads/academic/The_Future_of_Employment.pdf
- World Bank. (2016). World development report 2016: Digital dividends. World Bank Group. https://www.worldbank.org/en/publication/wdr2016
- International Labour Organization. (2021). The role of digital labour platforms in transforming the world of work. https://www.ilo.org/global/research/global-reports/weso/2021/WCMS_771749/lang--en/index.htm
- Organisation for Economic Co-operation and Development. (2021). The impact of artificial intelligence on the labour market. https://www.oecd.org/employment/the-impact-of-ai-on-the-labour-market.htm
- Stanford Institute for Human-Centered Artificial Intelligence. (2023). AI index report 2023. Stanford University. https://aiindex.stanford.edu/report/