1. Media source: MIT Technology Review - Robotics
Machines Can Now Recognize Something After Seeing It Once
Algorithms usually need thousands of examples to learn something. Researchers
at Google DeepMind found a way around that.
by Will Knight, November 3, 2016
Most of us can recognize an object after seeing it once or twice. But the
algorithms that power computer vision and voice recognition need thousands of
examples to become familiar with each new image or word.
Researchers at Google DeepMind now have a way around this. They made a few
clever tweaks to a deep-learning algorithm that allow it to recognize
objects in images and other things from a single example—something known as
"one-shot learning." The team demonstrated the trick on a large database of
tagged images, as well as on handwriting and language.
The best algorithms can recognize things reliably, but their need for data
makes building them time-consuming and expensive. An algorithm trained to
spot cars on the road, for instance, needs to ingest many thousands of
examples to work reliably in a driverless car. Gathering so much data is
often impractical—a robot that needs to navigate an unfamiliar home, for
instance, can’t spend countless hours wandering around learning.
Oriol Vinyals, a research scientist at Google DeepMind, a U.K.-based
subsidiary of Alphabet that’s focused on artificial intelligence, added a
memory component to a deep-learning system—a type of large neural network
that’s trained to recognize things by adjusting the sensitivity of many
layers of interconnected components roughly analogous to the neurons in a
brain. Such systems need to see lots of images to fine-tune the connections
between these virtual neurons.
The team demonstrated the capabilities of the system on a database of labeled
photographs called ImageNet. The software still needs to analyze several
hundred categories of images, but after that it can learn to recognize new
objects—say, a dog—from just one picture. It effectively learns to
recognize the characteristics in images that make them unique. The algorithm
was able to recognize images of dogs with an accuracy close to that of a
conventional data-hungry system after seeing just one example.
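
As a rough illustration of recognizing a new class from a single stored example, the sketch below compares a query embedding against one labeled embedding per class and turns the similarities into a softmax weighting. This is only an assumption about the general idea, not DeepMind's published method: the function names, the three-dimensional vectors, and the choice of cosine similarity are all hypothetical, and a real system would take its embeddings from a trained deep network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_shot_classify(query, support):
    """Pick the label whose single stored example is most similar to the query.

    `support` maps each label to exactly one embedding (the "one shot").
    Returns the winning label and the softmax weight for every label.
    """
    labels = list(support)
    sims = [cosine_similarity(query, support[label]) for label in labels]
    weights = softmax(sims)
    best = max(range(len(labels)), key=lambda i: weights[i])
    return labels[best], dict(zip(labels, weights))


# Hypothetical embeddings: one stored example per class (made-up numbers).
support = {
    "dog": [0.9, 0.1, 0.2],
    "cat": [0.1, 0.9, 0.3],
    "car": [0.2, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of a new, unlabeled picture

label, weights = one_shot_classify(query, support)
print(label, weights)
```

Adding a new class here only requires storing one more embedding in `support`. No retraining is involved, which is the property the article describes.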
Vinyals says the work could be especially useful if it could quickly
recognize the meaning of a new word. This could be important for Google,
Vinyals says, since it could allow a system to quickly learn the meaning of a
new search term.
Others have developed one-shot learning systems, but these are usually not
compatible with deep-learning systems. An academic project last year used
probabilistic programming techniques to enable this kind of very efficient
learning (see "This Algorithm Learns Tasks As Fast As We Do").
But deep-learning systems are becoming more capable, especially with the
addition of memory mechanisms. Another group at Google DeepMind recently
developed a network with a flexible kind of memory, making it capable of
performing simple reasoning tasks—for example, learning how to navigate a
subway system after analyzing several much simpler network diagrams (see
"What Happens When You Give a Computer a Working Memory?").
"I think this is a very interesting approach, providing a novel way of doing
one-shot learning on such large-scale data sets," says Sang Wan Lee, who
leads the Laboratory for Brain and Machine Intelligence at the Korea
Advanced Institute of Science and Technology in Daejeon, South Korea. "This
is a technical contribution to the AI community, which is something that
computer vision researchers might fully appreciate."
Others are more skeptical about its usefulness, given how different it still
is from human learning. For one thing, says Sam Gershman, an assistant
professor in Harvard's Department for Brain Science, humans generally learn
by understanding the components that make up an image, which may require some
real-world, or commonsense, knowledge. For example, "a Segway might look very
different from a bicycle or motorcycle, but it can be composed from the same
According to both Gershman and Wan Lee, it will be some time yet before
machines match human learning. "We still remain far from revealing humans’
secret of performing one-shot learning," Wan Lee says, "but this proposal
clearly poses new challenges that merit further study."
