DataFi: A New Blue Ocean of AI Data Economy in the Web3 Field
Data Is an Asset: DataFi Is Opening a New Blue Ocean
The biggest topic in AI circles this month is undoubtedly Meta's large-scale talent recruitment, which has assembled a star-studded AI team composed mainly of Chinese researchers. The team is led by Alexandr Wang, the 28-year-old founder of Scale AI. Scale AI is currently valued at $29 billion and provides data services to customers including the U.S. military and AI giants such as OpenAI, Anthropic, and Meta. Its core business is supplying large volumes of accurately labeled data.
Scale AI stands out among numerous unicorns because it recognized early on the critical role of data in the AI industry. Computing power, models, and data are the three pillars of AI. If we compare a large model to a person, then the model architecture is the body, computing power is the food, and data is the knowledge and information.
During the rapid development of large language models, the industry's focus has shifted from models to computing power. Nowadays, most models use the transformer as the basic framework, occasionally incorporating innovations like MoE or MoRe; major companies either build their own supercomputing clusters or sign long-term agreements with cloud service providers to secure computing power. With architectures and compute largely settled, the importance of data is becoming increasingly prominent.
Scale AI focuses on building a solid data foundation for AI models, with its business not only involving the mining of existing data but also encompassing data generation services. The company has also formed an AI training team composed of experts from various fields to provide high-quality training data for AI models.
Model training is divided into two stages: pre-training and fine-tuning. Pre-training is similar to the process of a baby learning to speak, requiring a large amount of information such as text and code gathered from the internet. Fine-tuning is akin to school education, with clear goals and directions, cultivating the model's specific abilities through carefully designed datasets.
Therefore, the AI data sector mainly involves two types of datasets. The first consists of large volumes of data that require little processing, usually sourced from UGC platforms such as Reddit, Twitter, and GitHub, from public literature databases, or from private corporate databases. The second requires careful design and selection to ensure that it cultivates specific capabilities in the model, which calls for data cleaning, filtering, labeling, and human feedback.
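To make the second type of dataset more concrete, here is a minimal Python sketch of the cleaning, filtering, and deduplication steps that typically come before labeling. It is an illustration only, not any company's actual pipeline: the `Example` class, the `clean` and `curate` functions, and the `min_words` threshold are all hypothetical names and values chosen for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Example:
    """One curated training item (hypothetical structure for illustration)."""
    text: str
    label: Optional[str] = None  # left empty here; an annotator would fill it in


def clean(text: str) -> str:
    """Strip HTML-like tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def curate(raw_texts: List[str], min_words: int = 5) -> List[Example]:
    """Clean each item, drop very short items, and drop exact duplicates."""
    seen: Set[str] = set()
    curated: List[Example] = []
    for raw in raw_texts:
        text = clean(raw)
        if len(text.split()) < min_words:
            continue  # filtering: too short to be useful (assumed threshold)
        key = text.lower()
        if key in seen:
            continue  # deduplication: case-insensitive exact match only
        seen.add(key)
        curated.append(Example(text=text))
    return curated


raw = [
    "<p>How do I reverse a list in Python?</p>",
    "how do i reverse a   list in python?",
    "lol",
    "Use slicing with a step of -1 to reverse a list.",
]
dataset = curate(raw)
for example in dataset:
    print(example.text)
```

Of the four raw items, the duplicate question and the one-word comment are dropped, leaving two unlabeled examples. Real pipelines usually go further, with near-duplicate detection, quality scoring, and human review, which is where labeling and feedback services come in.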
As model capabilities continue to improve, increasingly refined and specialized training data will become the key factor determining model performance. In the long run, AI data is a long-term investment sector with a snowball effect: as preliminary work accumulates, data assets gain the ability to generate compound returns, and their value keeps increasing.
Web3 DataFi: Fertile Ground for AI Data
Compared with the remote manual labeling workforces of hundreds of thousands of people that certain companies have assembled across multiple countries, Web3 has a natural advantage in the AI data field, which has given rise to the new concept of DataFi. Ideally, the advantages of Web3 DataFi include:
For ordinary users, DataFi is also the easiest kind of decentralized AI project to take part in. Users can get involved through simple actions such as providing data, evaluating models, using AI tools for simple creative work, or participating in data trading.
Promising Web3 DataFi Projects
Several DataFi projects have already raised significant funding. Here are some representative ones:
Sahara AI: Committed to building a decentralized AI super infrastructure and trading market.
Yupp: AI model feedback platform that collects user feedback on model outputs.
Vana: Converts user personal data into monetizable digital assets.
Chainbase: Focused on on-chain data, covering over 200 blockchains.
Sapien: Aims to transform human knowledge on a large scale into high-quality AI training data.
Prisma X: Committed to becoming an open coordination layer for robots.
Masa: One of the leading subnet projects in the Bittensor ecosystem.
Irys: Focused on programmable data storage and computation.
ORO: Empowering ordinary people to participate in AI contributions.
Gata: Positioned as a decentralized data layer.
The barriers to entry for these projects are generally not high at the moment, but once a platform accumulates users and ecosystem stickiness, its advantages will compound quickly. Early-stage projects should therefore focus on incentives and user experience. At the same time, these platforms need to consider how to manage participants and ensure data quality, so as to avoid a situation of "bad money driving out good."
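One common way to guard data quality in crowdsourced labeling is to collect several votes per item and accept a label only when enough contributors agree. The sketch below illustrates that general idea with a simple majority vote. It does not describe how any of the projects listed above actually works, and the function name `aggregate_labels` and the `min_agreement` threshold of 0.6 are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Optional


def aggregate_labels(
    votes: Dict[str, List[str]], min_agreement: float = 0.6
) -> Dict[str, Optional[str]]:
    """Accept the most common label per item only if enough voters agree.

    Items whose top label falls below the agreement threshold map to None,
    meaning they should be re-labeled or escalated rather than trusted.
    """
    accepted: Dict[str, Optional[str]] = {}
    for item_id, labels in votes.items():
        if not labels:
            accepted[item_id] = None  # no votes at all
            continue
        top_label, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)
        accepted[item_id] = top_label if agreement >= min_agreement else None
    return accepted


votes = {
    "item-1": ["cat", "cat", "dog"],  # two of three contributors agree
    "item-2": ["cat", "dog"],         # contributors are split evenly
}
result = aggregate_labels(votes)
print(result)
```

Here `item-1` is accepted as `cat`, while `item-2` is held back because neither label reaches the threshold. A production system would likely also weight votes by each contributor's track record, which ties the quality check back to the incentive design.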
In addition, increasing transparency is also a major challenge faced by current on-chain projects. Many projects still lack sufficient publicly available and traceable data, which is detrimental to the long-term healthy development of Web3 DataFi.
The path to large-scale adoption of DataFi has two parts. The first is attracting enough individual users, who both form a strong workforce for data collection and generation and act as consumers in the AI economy. The second is gaining the recognition of mainstream enterprises, since they are the main source of large data orders in the short term.
DataFi represents the long-term cultivation of machine intelligence by human intelligence, with smart contracts ensuring that human labor is rewarded, so that machine intelligence ultimately gives back to humanity. For those who feel uncertain about the AI era or still hold blockchain ideals, participating in DataFi may be a timely choice.