The Low-Down: The Best Artificial Intelligence Still Flunks 8th Grade Science

Feb 18, 2016

But just because it isn't all that smart doesn't mean it isn't capable of taking jobs from humans who might also fit that description. JL

Cade Metz reports in Wired:

8th-grade science asks students to solve problems that require several steps, and combine multiple facts to show understanding. So, we’ve yet to build a machine that’s even sorta close to real intelligence.
In 2012, IBM Watson went to medical school. So said The New York Times, announcing that the tech giant’s artificially intelligent question-and-answer machine had begun a “stint as a medical student” at the Cleveland Clinic Lerner College of Medicine.
This was just a metaphor. Clinicians were helping IBM train Watson for use in medical research. But as metaphors go, it wasn’t a very good one. Three years later, our artificially intelligent machines can’t even pass an eighth-grade science test, much less go to medical school.

So says Oren Etzioni, a professor of computer science at the University of Washington and the executive director of the Allen Institute for Artificial Intelligence, the AI think tank funded by Microsoft co-founder Paul Allen. Etzioni and the not-for-profit Allen Institute recently ran a contest, inviting nearly 800 teams of researchers to build AI systems that could take an eighth-grade science test, and today, the Institute released the results: The top performers successfully answered about 60 percent of the questions. In other words, they flunked.
For Etzioni, this five-month-long contest serves as a reality check for the state of artificial intelligence. Yes, thanks to the rise of deep neural networks, networks of hardware and software that approximate the web of neurons in the human brain, companies like Google and Facebook and Microsoft have achieved human-like performance in identifying images and recognizing spoken words, among other tasks. But we’re still a long way from machines that can really think, from AI that can carry on a real conversation, even from systems that can pass a basic science test.

Whither Watson?
You might say that, way back in 2011, IBM Watson beat the best humans on Earth at Jeopardy!, the venerable TV trivia game show. And it did. Google just built a system that could top a professional at the ancient game of Go. But for a machine, these are somewhat easier tasks than taking a science test. “Jeopardy! is [about] finding a single fact, while I would imagine—and hope—that 8th-grade science asks students to solve problems that require several steps, and combine multiple facts to show understanding,” says Chris Nicholson, CEO and founder of AI startup Skymind.
The Allen Institute’s science test includes more than just trivia. It asks that machines understand basic ideas, serving up not only questions like “Which part of the eye does light hit first?” but more complex questions that revolve around concepts like evolutionary adaptation. “Some types of fish live most of their adult lives in salt water but lay their eggs in freshwater,” one question read. “The ability of these fish to survive in these different environments is an example of [what]?” These were multiple-choice questions—and the machines still couldn’t pass, despite using state-of-the-art techniques, including deep neural nets. “Natural language processing, reasoning, picking up a science textbook and understanding—this presents a host of more difficult challenges,” Etzioni says. “To get these questions right requires a lot more reasoning.”
Yes, most of the contestants were academics, independent researchers, or computer scientists outside the largest tech companies. But Etzioni isn’t sure the tech giants would perform all that much better, despite employing some of the top researchers in the field. “It’s entirely possible that the scores would have gone higher had companies like Google and others put their ‘big guns’ to work,” he says. “[But] the ‘wisdom of the crowds’ is quite powerful and there [are] some very talented folks engaged in these contests.” Chaim Linhart, an Israeli researcher who participated in the competition, agrees. “In most competitions, I think the winning models are very specific to the test dataset, so even companies that work in the same domain don’t necessarily have a significant advantage,” he says.
What about Watson? According to Etzioni, IBM declined to participate (the company says it has turned its attentions away from contests like this and towards “real world” applications). But Watson is perhaps not the best litmus test. Watson was good at Jeopardy!. That’s what it was built for. But today, Watson is really just a brand name for a wide range of AI tools offered by IBM, and those tools aren’t necessarily state of the art.

Back to Work
Etzioni’s eighth grade science test is really a test of natural language understanding—how well a machine understands the natural way humans speak and write. IBM’s services do include natural language processing, but since Watson’s arrival, this kind of tech has received a new boost from deep neural nets. Just as you can teach a neural net to recognize a cat by feeding it myriad cat photos, you can teach it to understand natural language using mountains of digital dialogue. Google, for instance, has used neural nets to build a chatbot that debates the meaning of life.
But this chatbot wasn’t completely convincing. As it stands, the state of the art lies beyond any one technology. “So far, there is no universal method,” says Dutch researcher Benedikt Wilbertz, another participant in the Allen AI contest. “This challenge needed its own mix of machine learning and [other] AI tools.” Indeed, the top participants in the Allen AI challenge used deep learning as well as various other techniques. And the end result was still well below perfect.
Doug Lenat, who runs an AI project called Cyc, says that teaching today’s machines to take basic science tests doesn’t even make much sense. We should be striving for something more, something much further out. “If you’re talking about passing multiple-choice science tests, I always felt that was not actually the test AI should be aiming to pass,” he says. “The focus on natural language understanding (science tests, and so on) is something that should follow from a program being actually intelligent. Otherwise, you end up hitting the target but producing the veneer of understanding.” In other words, a machine that passes an eighth-grade science test isn’t all that smart.
So, we’ve yet to build a machine that’s even sorta close to real intelligence. But work will continue.

A Blog by Jonathan Low
