Meta’s New Large Language Model Was Only Online for Three Days.

On November 15, Meta announced Galactica, a large language model designed to aid scientists. But instead of making the big splash Meta had hoped for, Galactica died after three days of fierce criticism. Yesterday, the company took down the public demo it had invited everyone to try out.

Meta’s misstep, and its hubris, show that Big Tech is blind to the serious limitations of large language models. A large body of research highlights the weaknesses of this technology, including its tendency to reproduce prejudice and assert falsehoods as facts.

Yet Meta and other companies working on large language models, including Google, have not taken this research seriously.

Galactica is a language model for science, trained on 48 million scientific articles, websites and lecture notes. Meta promoted the model as a way to save time for students and researchers. Galactica, according to the company, “can summarise academic papers, solve math problems, generate Wiki articles and write scientific code, annotate proteins and molecules, and much more.”

However, the shiny veneer wore off quickly. Galactica, like all language models, is a mindless bot that can’t tell fact from fiction. Within hours, scientists were sharing its biased and inaccurate results on social media.

“I am both amazed and unsurprised by this new effort,” says Chirag Shah, who studies search technology at the University of Washington. “These things look amazing, magical, and intelligent when they are demoed. Yet people still don’t seem to understand that such things won’t work the way they are hyped up to.”

Asked by MIT Technology Review why the demo was removed, Meta pointed to a tweet that said: “Thanks everyone for trying out the Galactica model demonstration. We are grateful for the positive feedback from the community and have temporarily halted the demo. Researchers who are interested in learning more about our work or reproducing results in the paper can access our models.”

Galactica’s fundamental problem is that it cannot distinguish truth from falsehood, a basic requirement for any language model that generates scientific text. It made up fake papers, sometimes attributing them to real authors. It generated Wiki articles about the history of bears in space just as readily as ones about protein complexes or the speed of light. Fiction is easy to spot when it involves space bears. It is much harder to spot when the topic is one that users might not know much about.
