Researchers Use Artificial Intelligence to Identify, Count, Describe Wild Animals


Motion sensor “camera traps” unobtrusively take pictures of animals in their natural environment, oftentimes yielding images not otherwise observable. The artificial intelligence system automatically processes such images, here correctly reporting this as a picture of two impala standing.

A new paper in the Proceedings of the National Academy of Sciences (PNAS) reports how a cutting-edge artificial intelligence technique called deep learning can automatically identify, count and describe animals in their natural habitats.

Photographs that are automatically collected by motion-sensor cameras can then be automatically described by deep neural networks. The result is a system that can automate animal identification for up to 99.3 percent of images while still performing at the same 96.6 percent accuracy rate as crowdsourced teams of human volunteers.
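One way to picture the trade-off between how many images are automated and how accurate the labels are is a confidence threshold: accept the network's label only when it is confident, and leave the rest for volunteers. The article does not spell out this mechanism, so treat the sketch below as an assumption. The function name, the threshold value and the toy records are all hypothetical and are not Snapshot Serengeti data.

```python
from typing import List, Tuple

# Each record: (network confidence in its top guess, predicted label, true label).
# These values are invented for illustration only.
Prediction = Tuple[float, str, str]


def automate_with_threshold(preds: List[Prediction], threshold: float) -> Tuple[float, float]:
    """Return (fraction of images automated, accuracy on those images)."""
    # Keep only the images where the network is confident enough.
    accepted = [p for p in preds if p[0] >= threshold]
    if not accepted:
        return 0.0, 0.0
    coverage = len(accepted) / len(preds)
    correct = sum(1 for _, predicted, actual in accepted if predicted == actual)
    return coverage, correct / len(accepted)


toy_predictions = [
    (0.99, "impala", "impala"),
    (0.95, "lion", "lion"),
    (0.90, "zebra", "zebra"),
    (0.60, "cheetah", "leopard"),   # a low-confidence mistake
    (0.55, "elephant", "elephant"),
]

coverage, accuracy = automate_with_threshold(toy_predictions, threshold=0.8)
print(f"automated {coverage:.0%} of images at {accuracy:.0%} accuracy")
```

In this toy data, lowering the threshold to 0 automates every image but lets the one low-confidence mistake through. That is the kind of balance the 99.3 percent and 96.6 percent figures describe.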

“This technology lets us accurately, unobtrusively and inexpensively collect wildlife data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology and animal behavior into ‘big data’ sciences. This will dramatically improve our ability to both study and conserve wildlife and precious ecosystems,” says Jeff Clune, the senior author of the paper. He is the Harris Associate Professor at the University of Wyoming and a senior research manager at Uber’s Artificial Intelligence Labs.

The paper was written by Clune; his Ph.D. student Mohammad Sadegh Norouzzadeh; his former Ph.D. student Anh Nguyen (now at Auburn University); Margaret Kosmala (Harvard University); Ali Swanson (University of Oxford); and Meredith Palmer and Craig Packer (both from the University of Minnesota).

Deep neural networks are a form of computational intelligence loosely inspired by how animal brains see and understand the world. They require vast amounts of training data to work well, and the data must be accurately labeled (e.g., each image being correctly tagged with which species of animal is present, how many there are, etc.).

This study obtained the necessary data from Snapshot Serengeti, a citizen science project on the http://www.zooniverse.org platform. Snapshot Serengeti has deployed a large number of “camera traps” (motion-sensor cameras) in Tanzania that collect millions of images of animals in their natural habitat, such as lions, leopards, cheetahs and elephants. The information in these photographs is only useful once it has been converted into text and numbers. For years, the best method for extracting such information was to ask crowdsourced teams of human volunteers to label each image manually. The study published today harnessed 3.2 million labeled images produced in this manner by more than 50,000 human volunteers over several years.

“When I told Jeff Clune we had 3.2 million labeled images, he stopped in his tracks,” says Packer, who heads the Snapshot Serengeti project. “We wanted to test whether we could use machine learning to automate the work of human volunteers. Our citizen scientists have done phenomenal work, but we needed to speed up the process to handle ever greater amounts of data. The deep learning algorithm is amazing and far surpassed my expectations. This is a game changer for wildlife ecology.”

Swanson, who founded Snapshot Serengeti, adds: “There are hundreds of camera-trap projects in the world, and very few of them are able to recruit large armies of human volunteers to extract their data. That means that much of the knowledge in these important data sets remains untapped. Although projects are increasingly turning to citizen science for image classification, we’re starting to see it take longer and longer to label each batch of images as the demand for volunteers grows. We believe deep learning will be key in alleviating the bottleneck for camera-trap projects: the effort of converting images into usable data.”

“Not only does the artificial intelligence system tell you which of 48 different species of animal is present, but it also tells you how many there are and what they are doing. It will tell you if they are eating, sleeping, if babies are present, etc.,” adds Kosmala, another Snapshot Serengeti leader. “We estimate that the deep learning technology pipeline we describe would save more than eight years of human labeling effort for each additional 3 million images. That is a lot of valuable volunteer time that can be redeployed to help other projects.”

First-author Sadegh Norouzzadeh points out that “Deep learning is still improving rapidly, and we expect that its performance will only get better in the coming years. Here, we wanted to demonstrate the value of the technology to the wildlife ecology community, but we expect that as more people research how to improve deep learning for this application and publish their datasets, the sky’s the limit. It is exciting to think of all the different ways this technology can help with our important scientific and conservation missions.”

The paper in PNAS is titled “Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning.”

http://www.uwyo.edu/uw/news/2018/06/researchers-use-artificial-intelligence-to-identify,-count,-describe-wild-animals.html

Microsoft Thinks Machines Can Learn to Converse by Making Chat a Game

MICROSOFT IS BUYING a deep learning startup based in Montreal, a global hub for deep learning research. But two years ago, this startup wasn’t based in Montreal, and it had nothing to do with deep learning. Which just goes to show: striking it big in the world of tech is all about being in the right place at the right time with the right idea.

Sam Pasupalak and Kaheer Suleman founded Maluuba in 2011 as students at the University of Waterloo, about 400 miles from Montreal. The company’s name is an insider’s nod to one of their undergraduate computer science classes. From an office in Waterloo, they started building something like Siri, the digital assistant that would soon arrive on the iPhone, and they built it in much the same way Apple built the original, using techniques that had driven the development of conversational computing for years, techniques that require extremely slow and meticulous work, where engineers construct AI one tiny piece at a time. But as they toiled away in Waterloo, companies like Google and Facebook embraced deep neural networks, and this technology reinvented everything from image recognition to machine translation, rapidly learning these tasks by analyzing vast amounts of data. Soon, Pasupalak and Suleman realized they should change tack.

In December 2015, the two founders opened a lab in Montreal, and they started recruiting deep learning specialists from places like McGill University and the University of Montreal. Just thirteen months later, after growing to a mere 50 employees, the company sold itself to Microsoft. And that’s not an unusual story. The giants of tech are buying up deep learning startups almost as quickly as they’re created. At the end of December, Uber acquired Geometric Intelligence, a two-year-old AI startup spanning fifteen academic researchers that offered no product and no published research. The previous summer, Twitter paid a reported $150 million for Magic Pony, a two-year-old deep learning startup based in the UK. And in recent months, similarly small, similarly young deep learning companies have disappeared into the likes of General Electric, Salesforce, and Apple.

Microsoft did not disclose how much it paid for Maluuba, but some of these deep learning acquisitions have reached hefty sums, including Intel’s $400 million purchase of Nervana and Google’s $650 million acquisition of DeepMind, the British AI lab that made headlines last spring when it cracked the ancient game of Go, a feat experts didn’t expect for another decade.

At the same time, Microsoft’s buy is a little different from the rest. Maluuba is a deep learning company that focuses on natural language understanding, the ability to not just recognize the words that come out of our mouths but actually understand them and respond in kind. This is the breed of AI needed to build a good chatbot. Now that deep learning has proven so effective with speech recognition, image recognition, and translation, natural language is the next frontier. “In the past, people had to build large lexicons, dictionaries, ontologies,” Suleman says. “But with neural nets, we no longer need to do that. A neural net can learn from raw data.”

The acquisition is part of an industry-wide race towards digital assistants and chatbots that can converse like a human. Yes, we already have digital assistants like Microsoft Cortana, the Google Search Assistant, Facebook M, and Amazon Alexa. And chatbots are everywhere. But none of these services know how to chat (a particular problem for the chatbots). So, Microsoft, Google, Facebook, and Amazon are now looking at deep learning as a way of improving the state of the art.

Two summers ago, Google published a research paper describing a chatbot underpinned by deep learning that could debate the meaning of life (in a way). Around the same time, Facebook described an experimental system that could read a shortened form of The Lord of the Rings and answer questions about the Tolkien trilogy. Amazon is gathering data for similar work. And, none too surprisingly, Microsoft is gobbling up a startup that only just moved into the same field.

Winning the Game
Deep neural networks are complex mathematical systems that learn to perform discrete tasks by recognizing patterns in vast amounts of digital data. Feed millions of photos into a neural network, for instance, and it can learn to identify objects and people in photos. Pairing these systems with the enormous amounts of computing power inside their data centers, companies like Google, Facebook, and Microsoft have pushed artificial intelligence far further, far more quickly, than they ever could in the past.

Now, these companies hope to reinvent natural language understanding in much the same way. But there are big caveats: It’s a much harder task, and the work has only just begun. “Natural language is an area where more research needs to be done in terms of research, even basic research,” says University of Montreal professor Yoshua Bengio, one of the founding fathers of the deep learning movement and an advisor to Maluuba.

Part of the problem is that researchers don’t yet have the data needed to train neural networks for true conversation, and Maluuba is among those working to fill the void. Like Facebook and Amazon, it’s building brand new datasets for training natural language models: One involves questions and answers, and the other focuses on conversational dialogue. What’s more, the company is sharing this data with the larger community of researchers and encouraging them to share their own, a common strategy that seeks to accelerate the progress of AI research.

But even with adequate data, the task is quite different from image recognition or translation. Natural language isn’t necessarily something that neural networks can solve on their own. Dialogue isn’t a single task. It’s a series of tasks, each building on the one before. A neural network can’t just identify a pattern in a single piece of data. It must somehow identify patterns across an endless stream of data and keep a “memory” of this stream. That’s why Maluuba is exploring AI beyond neural networks, including a technique called reinforcement learning.

With reinforcement learning, a system repeats the same task over and over again, while carefully keeping tabs on what works and what doesn’t. Engineers at Google’s DeepMind lab used this method in building AlphaGo, the system that topped Korean grandmaster Lee Sedol at the ancient game of Go. In essence, the machine learned to play Go at a higher level than any human by playing game after game against itself, tracking which moves won the most territory on the board. In similar fashion, reinforcement learning can help machines learn to carry on a conversation. Like a game, Bengio says, dialogue is interactive. It’s a back and forth.
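The idea of repeating a task while keeping tabs on what works can be sketched with a multi-armed bandit, one of the simplest reinforcement learning settings. This is not how AlphaGo or Maluuba's systems work. The three "actions" and their success rates are hypothetical stand-ins (think of three candidate reply styles), and the function name and parameters are chosen only for illustration.

```python
import random
from typing import List


def epsilon_greedy(success_rates: List[float], steps: int = 5000,
                   epsilon: float = 0.1, seed: int = 0) -> List[float]:
    """Learn an estimated success rate per action by trial and error."""
    rng = random.Random(seed)
    counts = [0] * len(success_rates)
    estimates = [0.0] * len(success_rates)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_rates))
        else:
            # Exploit: pick the action that has worked best so far.
            action = max(range(len(estimates)), key=estimates.__getitem__)
        # The environment returns a reward of 1 (worked) or 0 (did not).
        reward = 1.0 if rng.random() < success_rates[action] else 0.0
        # Keep tabs: update the running average reward for this action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


# Hypothetical hidden success rates for three actions.
estimates = epsilon_greedy([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=estimates.__getitem__)
print("estimated success rates:", [round(e, 2) for e in estimates])
print("preferred action:", best)
```

The learner never sees the hidden success rates. It only observes rewards, which is the sense in which it learns from repeated play. Dialogue is far harder because each reply changes what comes next, so real systems must credit rewards across many turns rather than a single choice.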

For Microsoft, winning the game of conversation means winning an enormous market. Natural language could streamline practically any computer interface. With this in mind, the company is already building an army of chatbots, but so far, the results are mixed. In China, the company says, its Xiaoice chatbot has been used by 40 million people. But when it first unleashed a similar bot in the US, the service was coaxed into spewing racism, and the replacement is flawed in so many other ways. That’s why Microsoft acquired Maluuba. The startup was in the right place at the right time. And it may carry the right idea.

https://www.wired.com/2017/01/microsoft-thinks-machines-can-learn-converse-chats-become-game/