Will data flow overseas with international large models? China's first generative artificial intelligence legislation provides clarity

**Source:** AI Pioneer Officer

![](https://img-cdn.gateio.im/resized-social/moments-bab2147faf-ca39e06195-dd1a6f-7649e1)

*Image source: Generated by Unbounded AI tool*

The "war of a thousand models" triggered by GPT has gradually entered a decisive battle over data. "High-quality data" and "data volume" have become the keys for large models to quickly build barriers and compete for the future.

According to Wu Chao, Director of the CITIC Think Tank Expert Committee and Director of the China Securities Research Institute, "In the future, 20% of a model will be determined by algorithms, and 80% by data quality. Next, high-quality data will be the key to improving model performance."

Yet real data is becoming scarce. AI-powered bots like ChatGPT may soon "run out of text in the universe". A joint study, "Will we run out of data?", gives a timeline: human-generated data may become increasingly scarce, and high-quality natural language data may be exhausted by large language models as early as 2026.

How can a steady stream of data be secured for large model training? International open source organizations and business giants keep trying, but they are also mired in controversy. Data collection has raised endless questions about property rights protection, data privacy, and network security.

In March this year, many Windows 11 users reported that they were shown a forced pop-up window stating that "your data will be processed outside the country or region you are in". There was no "cancel" option; users could only click "next", otherwise they could not reach the system desktop.

The move sparked concerns among Windows users about private data leaking abroad. In response, Microsoft stated that after users update and use Windows 11, their data will be sent out of China.
Because Microsoft's software registration center is in the United States, and because the integration of ChatGPT into Bing search and the Edge browser also requires the support of US data centers, the data of Chinese users may be sent abroad.

Microsoft's close partner OpenAI, while benefiting from the former's massive data, has also faced doubts. At the end of June, OpenAI was hit with a class action lawsuit accusing it of stealing "a large amount of personal data" to train ChatGPT. On July 1, Musk imposed a temporary limit on the number of tweets users could read, in response to the same kind of data scraping.

Alphabet has warned employees not to use chatbots indiscriminately, including Google Bard, which it is promoting in global markets. On June 1, Google updated its privacy statement, warning users: "please do not involve confidential or sensitive information in conversations with Bard."

On the one hand, these companies go all in on building data flywheels from global users; on the other hand, they are cautious about their own business data. This "double standard" has forced most companies around the world into "active defense". Many, such as Samsung and Amazon, have begun to set up guardrails for AI chatbots. Microsoft and Google, for their part, promptly launched dialogue tools for commercial customers, guaranteeing that data will not be absorbed into public AI models, though customers must pay high fees for this.

Regulators in various countries have intervened over the risks that may arise from the way AIGC uses and obtains data. **Italian data regulator Garante announced a complete ban on ChatGPT on March 31, 2023** and prohibited OpenAI from processing Italian user data. After OpenAI promised to make corresponding improvements, ChatGPT resumed its service in Italy.

Subsequently, **Germany, France, and Ireland also took countermeasures**. Spain asked the European Data Protection Board (EDPB) to evaluate ChatGPT’s privacy protection issues.
The Korean Personal Information Protection Commission also stated that it had launched an investigation into the leakage of ChatGPT's Korean user data.

China also acted early. On July 13, **the Cyberspace Administration of China, jointly with other departments, issued the "Interim Measures for the Management of Generative Artificial Intelligence Services"** (referred to as the "Interim Measures"). This is China's first dedicated legislation in the field of generative artificial intelligence.

The "Interim Measures" set out the principle for the first time: "If the provision of generative artificial intelligence services from outside the People's Republic of China does not comply with laws, administrative regulations and the provisions of these measures, the national network information department shall notify the relevant agencies to take technical measures and other necessary measures to deal with it."

The Measures also clarify their scope of application: they apply to services that generate text, pictures, audio, video and other content for the public in China, and they explicitly exclude R&D and application activities that do not provide services to the domestic public.

This means that **overseas AIGC service providers (whether at the model layer or the application layer) will be subject to the relevant provisions of the "Interim Measures", whether they provide services directly to China or indirectly through API interfaces, "encapsulation" or "nesting".** For domestic vendors, the Interim Measures apply regardless of whether they are properly authorized by overseas AIGC service providers.

Data knows no borders, but data security does.
The promulgation of the "Interim Measures" has drawn boundaries for large domestic technology companies and entrepreneurs working on large models, and provides a reference for the subsequent "Artificial Intelligence Law". Academics and enterprises generally believe that the "Interim Measures" were released in a timely manner and have built confidence in the development of artificial intelligence in China.

Beyond legislation, the industry is also seeking breakthroughs through its own efforts. Technology companies that have launched large models in the past six months have stressed the safety and credibility called for in the "Interim Measures".

Baidu said that a good innovation ecosystem can be created only by establishing and improving laws and regulations, institutional systems, and ethics to ensure the healthy development of artificial intelligence. 360 proposes to build proprietary large models that are "safe, reliable, controllable and easy to use". Alibaba Cloud said that "building a safe and reliable artificial intelligence" has gradually become an industry consensus.
JD Cloud stated that next-generation digital infrastructure needs four characteristics: integration and openness, efficient collaboration, extreme cost performance, and security and controllability.

Region-based industrial planning has also begun.

Not long ago, Beijing released the "Twenty Measures on Data", giving opinions on industrial collaboration and on building a trusted data circulation system: support the Beijing Economic and Technological Development Zone and other areas in piloting data infrastructure systems, and create policy highlands, trusted spaces, and data factories.

In fact, as early as May this year, the Beijing Municipal Bureau of Economy and Information Technology, the Zhongguancun Management Committee of the Municipal Science and Technology Commission, and the Municipal Development and Reform Commission jointly launched the "Beijing General Artificial Intelligence Industry Innovation Partnership Program", and the second-phase partner list has now been announced.

The program aims to bring together independent and credible innovative enterprises in Beijing to promote the compliant, high-quality development of the artificial intelligence industry. The list includes computing power partners such as Alibaba Cloud, data partners such as Beijing Big Data Center, model partners such as Baidu, application partners such as Tongxin UOS and WPS, and investment partners such as IDG and CDH.

The program covers leading companies at key nodes of the artificial intelligence industry chain, laying an independent and credible ecosystem foundation for China to benchmark against the OpenAI + Microsoft + Nvidia artificial intelligence ecosystem.