CIDIS - Publicaciones -- Query Results

	Roberto Jacome Galarza, Miguel-Andrés Realpe-Robalino, Chamba-Eras LuisAntonio, & Viñán-Ludeña MarlonSantiago and Sinche-Freire Javier-Francisco. (2019). Computer vision for image understanding. A comprehensive review. In International Conference on Advances in Emerging Trends and Technologies (ICAETT 2019); Quito, Ecuador (pp. 248–259). Abstract: Computer Vision has its own Turing test: Can a machine describe the contents of an image or a video in the way a human being would do? In this paper, the progress of Deep Learning for image recognition is analyzed in order to know the answer to this question. In recent years, Deep Learning has increased considerably the precision rate of many tasks related to computer vision. Many datasets of labeled images are now available online, which leads to pre-trained models for many computer vision applications. In this work, we gather information of the latest techniques to perform image understanding and description. As a conclusion we obtained that the combination of Natural Language Processing (using Recurrent Neural Networks and Long Short-Term Memory) plus Image Understanding (using Convolutional Neural Networks) could bring new types of powerful and useful applications in which the computer will be able to answer questions about the content of images and videos. In order to build datasets of labeled images, we need a lot of work and most of the datasets are built using crowd work. These new applications have the potential to increase the human machine interaction to new levels of usability and user’s satisfaction. Permanent link \| Save citation: RTF PDF LaTeX \| Export record: Atom XML MODS XML ODF XML

Abstract: Computer Vision has its own Turing test: Can a machine describe the contents of an image or a video in the way a human being would do? In this paper, the progress of Deep Learning for image recognition is analyzed in order to know the answer to this question. In recent years, Deep Learning has increased considerably the precision rate of many tasks related to computer vision. Many datasets of labeled images are now available online, which leads to pre-trained models for many computer vision applications. In this work, we gather information of the latest techniques to perform image understanding and description. As a conclusion we obtained that the combination of Natural Language Processing (using Recurrent Neural Networks and Long Short-Term Memory) plus Image Understanding (using Convolutional Neural Networks) could bring new types of powerful and useful applications in which the computer will be able to answer questions about the content of images and videos. In order to build datasets of labeled images, we need a lot of work and most of the datasets are built using crowd work. These new applications have the potential to increase the human machine interaction to new levels of usability and user’s satisfaction.

Permanent link

| Save citation: RTF PDF LaTeX

| Export record: Atom XML MODS XML ODF XML

	CIDIS - Publicaciones Home \| Show All \| Simple Search \| Advanced Search	Login Quick Search: Field: contains: ...
	1–1 of 1 record found matching your query (RSS \| history):