|
Henry O. Velesaca, Raul A. Mira, Patricia L. Suarez, Christian X. Larrea, & Angel D. Sappa. (2020). Deep Learning based Corn Kernel Classification. In The 1st International Workshop and Prize Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture on the Conference Computer on Vision and Pattern Recongnition (CVPR 2020) (Vol. 2020-June, pp. 294–302).
Abstract: This paper presents a full pipeline to classify sample sets of corn kernels. The proposed approach follows a segmentation-classification scheme. The image segmentation is performed through a well known deep learning based
approach, the Mask R-CNN architecture, while the classification is performed by means of a novel-lightweight network specially designed for this task—good corn kernel, defective corn kernel and impurity categories are considered.
As a second contribution, a carefully annotated multitouching corn kernel dataset has been generated. This dataset has been used for training the segmentation and
the classification modules. Quantitative evaluations have been performed and comparisons with other approaches provided showing improvements with the proposed pipeline.
|
|
|
Henry O. Velesaca, S. A., Patricia L. Suarez, Ángel Sanchez & Angel D. Sappa. (2020). Off-the-Shelf Based System for Urban Environment Video Analytics. In The 27th International Conference on Systems, Signals and Image Processing (IWSSIP 2020) (Vol. 2020-July, pp. 459–464).
Abstract: This paper presents the design and implementation details of a system build-up by using off-the-shelf algorithms for urban video analytics. The system allows the connection to public video surveillance camera networks to obtain the necessary
information to generate statistics from urban scenarios (e.g., amount of vehicles, type of cars, direction, numbers of persons, etc.). The obtained information could be used not only for traffic management but also to estimate the carbon footprint of urban scenarios. As a case study, a university campus is selected to
evaluate the performance of the proposed system. The system is implemented in a modular way so that it is being used as a testbed to evaluate different algorithms. Implementation results are provided showing the validity and utility of the proposed approach.
|
|
|
Cristhian A. Aguilera, C. A., Cristóbal A. Navarro, & Angel D. Sappa. (2020). Fast CNN Stereo Depth Estimation through Embedded GPU Devices. Sensors 2020, Vol. 2020-June(11), pp. 1–13.
Abstract: Current CNN-based stereo depth estimation models can barely run under real-time
constraints on embedded graphic processing unit (GPU) devices. Moreover, state-of-the-art
evaluations usually do not consider model optimization techniques, being that it is unknown what is
the current potential on embedded GPU devices. In this work, we evaluate two state-of-the-art models
on three different embedded GPU devices, with and without optimization methods, presenting
performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth
estimation. More importantly, based on our evaluation, we propose the use of a U-Net like architecture
for postprocessing the cost-volume, instead of a typical sequence of 3D convolutions, drastically
augmenting the runtime speed of current models. In our experiments, we achieve real-time inference
speed, in the range of 5–32 ms, for 1216 368 input stereo images on the Jetson TX2, Jetson Xavier,
and Jetson Nano embedded devices.
|
|
|
Ángel Morera, Á. S., A. Belén Moreno, Angel D. Sappa, & José F. Vélez. (2020). SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities. In Sensors, Vol. 2020-August(16), pp. 1–23.
Abstract: This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO)
deep neural networks for the outdoor advertisement panel detection problem by handling multiple
and combined variabilities in the scenes. Publicity panel detection in images oers important
advantages both in the real world as well as in the virtual one. For example, applications like Google
Street View can be used for Internet publicity and when detecting these ads panels in images, it could
be possible to replace the publicity appearing inside the panels by another from a funding company.
In our experiments, both SSD and YOLO detectors have produced acceptable results under variable
sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex
background and multiple panels in scenes. Due to the diculty of finding annotated images for the
considered problem, we created our own dataset for conducting the experiments. The major strength
of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable
when the publicity contained inside the panel is analyzed after detecting them. On the other side,
YOLO produced better panel localization results detecting a higher number of True Positive (TP)
panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models
with dierent types of semantic segmentation networks and using the same evaluation metrics is
also included.
|
|
|
Patricia L. Suárez, A. D. S. and B. X. V. (2021). Deep learning-based vegetation index estimation. In Generative Adversarial Networks for Image-to-Image Translation Book. (Vol. Chapter 9, pp. 205–232).
|
|
|
Patricia L. Suárez, A. D. S., Boris X. Vintimilla. (2021). Cycle generative adversarial network: towards a low-cost vegetation index estimation. In IEEE International Conference on Image Processing (ICIP 2021) (Vol. 2021-September, pp. 2783–2787).
Abstract: This paper presents a novel unsupervised approach to estimate the Normalized Difference Vegetation Index (NDVI).The NDVI is obtained as the ratio between information from the visible and near infrared spectral bands; in the current work, the NDVI is estimated just from an image of the visible spectrum through a Cyclic Generative Adversarial Network (CyclicGAN). This unsupervised architecture learns to estimate the NDVI index by means of an image translation between the red channel of a given RGB image and the NDVI unpaired index’s image. The translation is obtained by means of a ResNET architecture and a multiple loss function. Experimental results obtained with this unsupervised scheme show the validity of the implemented model. Additionally, comparisons with the state of the art approaches are provided showing improvements with the proposed approach.
|
|
|
Rafael E. Rivadeneira, A. D. S. and B. X. V. (2022). Multi-Image Super-Resolution for Thermal Images. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications VISIGRAPP 2022 (Vol. 4, pp. 635–642).
|
|
|
Angel D. Sappa, P. L. S., Henry O. Velesaca, Darío Carpio. (2022). Domain adaptation in image dehazing: exploring the usage of images from virtual scenarios. In 16th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing (CGVCVIP 2022), julio 20-22 (pp. 85–92).
|
|
|
Henry O. Velesaca, P. L. S., Dario Carpio, and Angel D. Sappa. (2021). Synthesized Image Datasets: Towards an Annotation-Free Instance Segmentation Strategy. In 16 International Symposium on Visual Computing. Octubre 4-6, 2021. Lecture Notes in Computer Science (Vol. 13017, pp. 131–143).
|
|
|
Jorge L. Charco, A. D. S., Boris X. Vintimilla. (2022). Human Pose Estimation through A Novel Multi-View Scheme. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications VISIGRAPP 2022 (Vol. 5, pp. 855–862).
Abstract: This paper presents a multi-view scheme to tackle the challenging problem of the self-occlusion in human
pose estimation problem. The proposed approach first obtains the human body joints of a set of images,
which are captured from different views at the same time. Then, it enhances the obtained joints by using a
multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from
another view, especially intended to tackle the self occlusions cases. A network architecture initially proposed
for the monocular case is adapted to be used in the proposed multi-view scheme. Experimental results and
comparisons with the state-of-the-art approaches on Human3.6m dataset are presented showing improvements
in the accuracy of body joints estimations.
|
|