Stanford AI Team Says Sorry to Chinese Startup Model Best for Plagiarizing LLM Code
Lv Qian
DATE:  Jun 04 2024
/ SOURCE:  Yicai

(Yicai) June 4 -- An artificial intelligence team from Stanford University has apologized to Model Best after plagiarizing code from the Chinese firm's open-source large language model MiniCPM-Llama3-V 2.5 for its Llama3-V project.

"We want to apologize to the original authors of MiniCPM," Siddharth Sharma, a member of the team, said on X today. "Aksh Garg and I posted Llama3-V with Mustafa Aljadery," he added, noting that Aljadery wrote the code.

"We asked Mustafa (Aljadery) about proof of originality for Llama3-V and asked for the training code but we haven't seen any response so far," Sharma noted. "We were waiting for Mustafa (Aljadery) to take the lead, but instead, we are releasing our own statement."

Sharma and Garg are students at Stanford University, while Aljadery studies deep learning and computational neuroscience at the University of Southern California.

Li Dahai, chief executive of Model Best, said yesterday that he deeply regrets the plagiarism and urged the development of an open, cooperative, and trusting community environment. The incident shows that international teams recognize MiniCPM, albeit in an unexpected way, Li noted.

"I know nothing here; it seems done by a few undergrads, some at Stanford," said Christopher Manning, director of the Stanford AI Lab. "'Fake it before you make it' is an ignoble product of Silicon Valley. There's good open-source work around TsinghuaNLP, helping advance science." 

The Llama3-V model launched by the Stanford team has the same model structure and configuration file as MiniCPM-Llama3-V 2.5. It also replicates Model Best's newly developed ability to recognize Tsinghua Bamboo Slips, down to making the same mistakes.

After being questioned about the plagiarism, the Stanford AI team hid its Llama3-V model on the AI platform Hugging Face to fix the model's inference problems.

The development of AI is inseparable from the open-source sharing of global algorithms, data, and models, said Liu Zhiyuan, co-founder of Model Best. MiniCPM-Llama3-V 2.5 uses the latest Llama3 as its base, Liu added.

However, what Stanford's Llama3-V team did seriously undermined the foundations of open-source sharing, including adherence to open-source protocols, trust in other contributors, and respect for previous achievements, Liu pointed out.

Chinese large-scale model teams, including ChatGLM, Ali Qwen, DeepSeek, and Model-Best-Tsinghua OpenBMB, have received widespread international attention through continuous open-source sharing, Liu noted, adding that the incident reflects that domestic innovations have also received global attention.

Founded in August 2022, Model Best launched MiniCPM (Chinese Pretrained Model), significantly accelerating image coding. This year, it raised hundreds of millions of Chinese yuan, equivalent to tens of millions of US dollars, in a Series A financing round led by Primavera Venture Capital and Huawei Hubble, with Beijing Artificial Intelligence Industry Investment Fund also participating.

Editor: Martin Kadiev

Keywords:   Stanford University