Faces for cookware: data collection industry flourishes as China pursues AI ambitions

Published 28/06/2019, 01:36
Updated 28/06/2019, 01:36
© Reuters. Villagers wait in line as they attend a facial data collection project run by Qian Ji Data Co, which would serve for developing artificial intelligence (AI) and machine learning technology, in Jia county, Henan

By Cate Cadell

PINGDINGSHAN, China (Reuters) - In a village in central China's Henan province, amid barking dogs and wandering chickens, villagers gather along a dirt road to trade images of their faces for kettles, pots and tea cups.

At the front of the line, a woman stands before a camera zip-tied to a tripod. She holds a photograph of her head, with the eyes and nose cut out, in front of her face and slowly rotates from side to side.

Villagers waiting their turn take a numbered ticket. Some of them say it's the third or fourth time they've come to do this sort of work.

The project, run out of a sleepy village courtyard house adorned with posters of former Chinese leader Mao Zedong, is collecting material that could train AI software to distinguish between real facial features and still images.

"The largest projects have tens of thousands of people, all of whom live in this area," said Liu Yangfeng, CEO of Qianji Data Co Ltd, which collects and labels data for several of China's largest tech firms and is based in the nearby city of Pingdingshan.

"We are creating more data sets to serve more AI algorithm companies, so they can serve the development of artificial intelligence in China," said Liu, declining to disclose his clients.

The boom in demand for data to train AI algorithms is feeding a new global industry that gathers information such as photos and videos, which are then labelled to tell the machines what they are seeing.

Companies involved in data labelling, or data annotation as it is also called, include crowdsourcing platforms such as Amazon.com's (O:AMZN) Mechanical Turk, which offer users small amounts of money in return for simple tasks, outsourcing firms such as India's Wipro Ltd (NS:WIPR), and professional labellers like Qianji.

Cognilytica, a U.S. research firm specialising in AI, estimates the global market for machine-learning related data annotation grew 66% to $500 million in 2018 and is set to more than double by 2023. Some industry insiders say, however, that much of the work done is not disclosed, making accurate estimates difficult.

WEAK PRIVACY LAWS, CHEAP LABOUR

China has emerged as a key hub for data collection and labelling thanks to insatiable demand from a burgeoning artificial intelligence sector backed by the ruling Communist Party, which sees AI as an engine of economic growth and a tool for social control.

A plethora of firms have invested heavily in an area of AI known as machine learning, which is at the core of facial recognition technology and other systems based on finding patterns in data.

These include tech giants Alibaba Group Holding Ltd (N:BABA), Tencent Holdings Ltd (HK:0700) and Baidu Inc (O:BIDU), as well as younger companies such as AI specialist SenseTime Group Ltd and speech recognition firm Iflytek Co Ltd (SZ:002230).

The result has been a proliferation of AI products and services in China, from facial recognition-based payment systems to automated surveillance and even AI-animated state media news anchors. Chinese consumers mostly see these technologies as novel and futuristic, despite concerns raised by some over more invasive applications.

Weak data privacy laws and cheap labour have also been a competitive advantage for China as it races to become a global leader in AI. The Henan villagers were happy to trade several sessions in front of a camera for a tea cup, or several hours for a stove-top pot.

OVERSEAS CUSTOMERS

Beijing-based BasicFinder, a leading data labelling firm with locations across Hebei, Shandong and Shanxi provinces, boasts a robust mix of domestic and overseas clients.

During a recent visit to its Beijing offices, some staff were labelling images of sleepy people that will be used by an autonomous driving project to identify drivers who might be falling asleep at the wheel.

Others were labelling British documents from the 1800s for a Western online ancestry service, marking fields for dates, names and genders on birth and death certificates.

According to BasicFinder Chief Executive Du Lin, hiring trained labellers in China is cheaper than using Western crowdsourcing marketplaces.

A Princeton University project related to autonomous driving initially put a task on Amazon's Mechanical Turk, but as the task became more complicated, people began making mistakes, and BasicFinder was brought in to help correct the results, said Du.

In that project, one trained BasicFinder labeller was able to do the work of three crowdsourced labellers, he added.

"Gradually they saw they were paying less for labelling from us, so they hired us to label all the works from the very beginning," said Du.

Princeton declined to comment.

For labelling employees, the reasons for joining China's data industry are straightforward. The work, though sometimes tedious, is an upgrade on other jobs available to young workers who want to return home to small Chinese cities and villages.

Labellers at Qianji make roughly 100 yuan ($14.50) a day marking data points on photographs of people, surveillance footage and street images.

The work is usually simple, according to the employees, though some overseas content poses a challenge.

"One time we thought we were classifying Europe-style cooker machines that have a washer attached," said Jia Yahui, a labeller at Qianji. "Later we were told it's actually two separate things, a stove and a dishwasher."

The labelling work brings some of the employment benefits of the tech sector to rural areas, but those benefits may prove short-lived if AI improves enough to perform many of the tasks labellers do.

"We think this industry will still exist in three to five years. It may not be a long-term career - we can only think of the five-year plan for now," said Qianji CEO Liu.
