Henry O. Velesaca, S. A., Patricia L. Suarez, Ángel Sánchez & Angel D. Sappa. (2020). Off-the-Shelf Based System for Urban Environment Video Analytics. In The 27th International Conference on Systems, Signals and Image Processing (IWSSIP 2020) (Vol. 2020-July, pp. 459–464).
Abstract: This paper presents the design and implementation details of a system built from off-the-shelf algorithms for urban video analytics. The system connects to public video surveillance camera networks to obtain the information needed to generate statistics from urban scenarios (e.g., number of vehicles, type of cars, direction, number of persons, etc.). The obtained information could be used not only for traffic management but also to estimate the carbon footprint of urban scenarios. As a case study, a university campus is selected to evaluate the performance of the proposed system. The system is implemented in a modular way so that it can be used as a testbed to evaluate different algorithms. Implementation results are provided showing the validity and utility of the proposed approach.
|
Cristhian A. Aguilera, C. A., Cristóbal A. Navarro, & Angel D. Sappa. (2020). Fast CNN Stereo Depth Estimation through Embedded GPU Devices. Sensors 2020, Vol. 2020-June(11), pp. 1–13.
Abstract: Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphics processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, so the actual potential of embedded GPU devices remains unknown. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth estimation. More importantly, based on our evaluation, we propose the use of a U-Net-like architecture for postprocessing the cost volume, instead of a typical sequence of 3D convolutions, drastically increasing the runtime speed of current models. In our experiments, we achieve real-time inference speed, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.
|
Ángel Morera, Á. S., A. Belén Moreno, Angel D. Sappa, & José F. Vélez. (2020). SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities. In Sensors, Vol. 2020-August(16), pp. 1–23.
Abstract: This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world and in the virtual one. For example, applications like Google Street View can be used for Internet publicity, and when these ad panels are detected in images, it could be possible to replace the publicity appearing inside the panels with another from a funding company. In our experiments, both SSD and YOLO detectors produced acceptable results under variable panel sizes, illumination conditions, viewing perspectives, partial occlusion of panels, complex backgrounds and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the near elimination of False Positive (FP) cases, a situation that is preferable when the publicity contained inside the panels is analyzed after detecting them. On the other hand, YOLO produced better panel localization results, detecting a higher number of True Positive (TP) panels with higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks, using the same evaluation metrics, is also included.
|
Patricia L. Suarez, Angel D. Sappa, & Boris X. Vintimilla. (2017). Infrared Image Colorization based on a Triplet DCGAN Architecture. In 13th IEEE Workshop on Perception Beyond the Visible Spectrum – In conjunction with CVPR 2017 (Vol. 2017-July, pp. 212–217). This paper received the Best Paper Award.
|
Angel Morera, Angel Sánchez, Angel D. Sappa, & José F. Vélez. (2019). Robust Detection of Outdoor Urban Advertising Panels in Static Images. In 17th International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS 2019); Ávila, Spain. Communications in Computer and Information Science (Vol. 1047, pp. 246–256).
Abstract: One interesting publicity application for Smart City environments is recognizing brand information contained in urban advertising panels. For such a purpose, a prior step is to accurately detect and locate the position of these panels in images. This work presents an effective solution to this problem using a Single Shot Detector (SSD) based on a deep neural network architecture that minimizes the number of false detections under multiple variable conditions regarding the panels and the scene. Experimental results obtained using the Intersection over Union (IoU) accuracy metric make this proposal applicable to real, complex urban images.
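The Intersection over Union (IoU) metric named in this abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function name `iou` and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made here for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples; this layout is an
    assumption for the sketch, not taken from the paper.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

A detection is commonly counted as correct when its IoU with the ground-truth box exceeds a chosen threshold; the threshold used in the paper is not stated in this abstract.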
|
Patricia L. Suarez, Angel D. Sappa, & Boris X. Vintimilla. (2018). Adaptive Harris Corners Detector Evaluated with Cross-Spectral Images. In International Conference on Information Technology & Systems (ICITS 2018). Advances in Intelligent Systems and Computing (Vol. 721).
Abstract: This paper proposes a novel approach that uses cross-spectral images to improve the performance of the proposed Adaptive Harris corner detector, comparing its results with those obtained on visible-spectrum images. Images from the urban, field, old-building and country categories were used in the experiments; the variety of textures in these images makes verifying the proposal more challenging. The new scope is to improve the detection of characteristic points by using cross-spectral images (NIR, G, B) and applying pruning techniques. The channel combination used for this fusion is the one that yields the largest variance in the intensity of the merged pixels and therefore maximizes the entropy of the resulting cross-spectral images. Harris is one of the most widely used corner detection algorithms, so any improvement in its efficiency is an important contribution to the field of computer vision. The experiments conclude that including a NIR channel in the image, as a result of combining the spectra, greatly improves corner detection because of the better entropy of the fused image. The fusion process therefore improves the results of subsequent processes such as object or pattern identification, classification and/or segmentation.
|
Cristhian A. Aguilera, Xavier Soria, Angel D. Sappa, & Ricardo Toledo. (2017). RGBN Multispectral Images: a Novel Color Restoration Approach. In 15th International Conference on Practical Applications of Agents and Multi-Agent Systems (Vol. 619, pp. 155–163).
|
Julien Poujol, Cristhian A. Aguilera, Etienne Danos, Boris X. Vintimilla, Ricardo Toledo, & Angel D. Sappa. (2015). A Visible-Thermal Fusion based Monocular Visual Odometry. In Iberian Robotics Conference (ROBOT 2015), Lisbon, Portugal, 2015 (Vol. 417, pp. 517–528).
Abstract: The manuscript evaluates the performance of a monocular visual odometry approach when images from different spectra are considered, both independently and fused. The objective of this evaluation is to analyze whether classical approaches can be improved when the given images, which come from different spectra, are fused and represented in new domains. The images in these new domains should have some of the following properties: i) more robust to noisy data; ii) less sensitive to changes (e.g., lighting); iii) richer in descriptive information, among others. In particular, two different image fusion strategies are considered in the current work. Firstly, images from the visible and thermal spectrum are fused using a Discrete Wavelet Transform (DWT) approach. Secondly, a monochrome threshold strategy is considered. The obtained representations are evaluated under a visual odometry framework, highlighting their advantages and disadvantages, using different urban and semi-urban scenarios. Comparisons with both monocular visible-spectrum and monocular infrared-spectrum approaches are also provided, showing the validity of the proposed approach.
|
Miguel Oliveira, Vítor Santos, Angel D. Sappa, & Paulo Dias. (2015). Scene representations for autonomous driving: an approach based on polygonal primitives. In Iberian Robotics Conference (ROBOT 2015), Lisbon, Portugal, 2015 (Vol. 417, pp. 503–515). Springer International Publishing Switzerland 2016.
Abstract: In this paper, we present a novel methodology to compute a 3D scene representation. The algorithm uses macro-scale polygonal primitives to model the scene. This means that the representation of the scene is given as a list of large-scale polygons that describe the geometric structure of the environment. Results show that the approach is capable of producing accurate descriptions of the scene. In addition, the algorithm is very efficient when compared to other techniques.
|
Henry O. Velesaca, P. L. S., Dario Carpio, Rafael E. Rivadeneira, Ángel Sánchez, & Angel D. Sappa. (2022). Video Analytics in Urban Environments: Challenges and Approaches. In ICT Applications for Smart Cities. Intelligent Systems Reference Library (Vol. 224, pp. 101–122).
|