|
Miguel Oliveira, Vítor Santos, Angel D. Sappa, Paulo Dias, & A. Paulo Moreira. (2016). Incremental Scenario Representations for Autonomous Driving using Geometric Polygonal Primitives. Robotics and Autonomous Systems Journal, Vol. 83, pp. 312–325.
Abstract: When an autonomous vehicle is traveling through some scenario it receives a continuous stream of sensor data. This sensor data arrives in an asynchronous fashion and often contains overlapping or redundant information. Thus, it is not trivial how a representation of the environment observed by the vehicle can be created and updated over time. This paper presents a novel methodology to compute an incremental 3D representation of a scenario from 3D range measurements. We propose to use macro scale polygonal primitives to model the scenario. This means that the representation of the scene is given as a list of large scale polygons that describe the geometric structure of the environment. Furthermore, we propose mechanisms designed to update the geometric polygonal primitives over time whenever fresh sensor data is collected. Results show that the approach is capable of producing accurate descriptions of the scene, and that it is computationally very efficient when compared to other reconstruction techniques.
|
|
|
Miguel Oliveira, Vítor Santos, Angel D. Sappa, Paulo Dias, & A. Paulo Moreira. (2016). Incremental Texture Mapping for Autonomous Driving. Robotics and Autonomous Systems Journal, Vol. 84, pp. 113–128.
Abstract: Autonomous vehicles have a large number of on-board sensors, not only for providing coverage all around the vehicle, but also to ensure multi-modality in the observation of the scene. Because of this, it is not trivial to come up with a single, unique representation that feeds from the data given by all these sensors. We propose an algorithm which is capable of mapping texture collected from vision based sensors onto a geometric description of the scenario constructed from data provided by 3D sensors. The algorithm uses a constrained Delaunay triangulation to produce a mesh which is updated using a specially devised sequence of operations. These enforce a partial configuration of the mesh that avoids bad quality textures and ensures that there are no gaps in the texture. Results show that this algorithm is capable of producing fine quality textures.
|
|
|
Juan A. Carvajal, Dennis G. Romero, & Angel D. Sappa. (2016). Fine-tuning based deep covolutional networks for lepidopterous genus recognition. In XXI IberoAmerican Congress on Pattern Recognition (pp. 1–9).
Abstract: This paper describes an image classication approach ori- ented to identify specimens of lepidopterous insects recognized at Ecuado- rian ecological reserves. This work seeks to contribute to studies in the area of biology about genus of butter ies and also to facilitate the reg- istration of unrecognized specimens. The proposed approach is based on the ne-tuning of three widely used pre-trained Convolutional Neural Networks (CNNs). This strategy is intended to overcome the reduced number of labeled images. Experimental results with a dataset labeled by expert biologists, is presented|a recognition accuracy above 92% is reached. 1 Introductio
|
|
|
Angel D. Sappa, Cristhian A. Aguilera, Juan A. Carvajal Ayala, Miguel Oliveira, Dennis Romero, Boris X. Vintimilla, et al. (2016). Monocular visual odometry: a cross-spectral image fusion based approach. Robotics and Autonomous Systems Journal, Vol. 86, pp. 26–36.
Abstract: This manuscript evaluates the usage of fused cross-spectral images in a monocular visual odometry approach. Fused images are obtained through a Discrete Wavelet Transform (DWT) scheme, where the best setup is em- pirically obtained by means of a mutual information based evaluation met- ric. The objective is to have a exible scheme where fusion parameters are adapted according to the characteristics of the given images. Visual odom- etry is computed from the fused monocular images using an off the shelf approach. Experimental results using data sets obtained with two different platforms are presented. Additionally, comparison with a previous approach as well as with monocular-visible/infrared spectra are also provided showing the advantages of the proposed scheme.
|
|
|
Cristhian A. Aguilera, Angel D. Sappa, & R. Toledo. (2015). LGHD: A feature descriptor for matching across non-linear intensity variations. In IEEE International Conference on, Quebec City, QC, 2015 (pp. 178–181). Quebec City, QC, Canada: IEEE.
Abstract: This paper presents a new feature descriptor suitable to the task of matching features points between images with nonlinear intensity variations. This includes image pairs with significant illuminations changes, multi-modal image pairs and multi-spectral image pairs. The proposed method describes the neighbourhood of feature points combining frequency and spatial information using multi-scale and multi-oriented Log- Gabor filters. Experimental results show the validity of the proposed approach and also the improvements with respect to the state of the art.
|
|
|
M. Oliveira, L. Seabra Lopes, G. Hyun Lim, S. Hamidreza Kasaei, Angel D. Sappa, & A. Tomé. (2015). Concurrent Learning of Visual Codebooks and Object Categories in Open- ended Domains. In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, Hamburg, Germany, 2015 (pp. 2488–2495). Hamburg, Germany: IEEE.
Abstract: In open-ended domains, robots must continuously learn new object categories. When the training sets are created offline, it is not possible to ensure their representativeness with respect to the object categories and features the system will find when operating online. In the Bag of Words model, visual codebooks are usually constructed from training sets created offline. This might lead to non-discriminative visual words and, as a consequence, to poor recognition performance. This paper proposes a visual object recognition system which concurrently learns in an incremental and online fashion both the visual object category representations as well as the codebook words used to encode them. The codebook is defined using Gaussian Mixture Models which are updated using new object views. The approach contains similarities with the human visual object recognition system: evidence suggests that the development of recognition capabilities occurs on multiple levels and is sustained over large periods of time. Results show that the proposed system with concurrent learning of object categories and codebooks is capable of learning more categories, requiring less examples, and with similar accuracies, when compared to the classical Bag of Words approach using codebooks constructed offline.
|
|
|
Dennis G. Romero, A. Frizera, Angel D. Sappa, Boris X. Vintimilla, & T.F. Bastos. (2015). A predictive model for human activity recognition by observing actions and context. In ACIVS 2015 (Advanced Concepts for Intelligent Vision Systems), International Conference on, Catania, Italy, 2015 (pp. 323–333).
Abstract: This paper presents a novel model to estimate human activities – a human activity is defined by a set of human actions. The proposed approach is based on the usage of Recurrent Neural Networks (RNN) and Bayesian inference through the continuous monitoring of human actions and its surrounding environment. In the current work human activities are inferred considering not only visual analysis but also additional resources; external sources of information, such as context information, are incorporated to contribute to the activity estimation. The novelty of the proposed approach lies in the way the information is encoded, so that it can be later associated according to a predefined semantic structure. Hence, a pattern representing a given activity can be defined by a set of actions, plus contextual information or other kind of information that could be relevant to describe the activity. Experimental results with real data are provided showing the validity of the proposed approach.
|
|
|
Julien Poujol, Cristhian A. Aguilera, Etienne Danos, Boris X. Vintimilla, Ricardo Toledo, & Angel D. Sappa. (2015). A visible-Thermal Fusion based Monocular Visual Odometry. In Iberian Robotics Conference (ROBOT 2015), International Conference on, Lisbon, Portugal, 2015 (Vol. 417, pp. 517–528).
Abstract: The manuscript evaluates the performance of a monocular visual odometry approach when images from different spectra are considered, both independently and fused. The objective behind this evaluation is to analyze if classical approaches can be improved when the given images, which are from different spectra, are fused and represented in new domains. The images in these new domains should have some of the following properties: i) more robust to noisy data; ii) less sensitive to changes (e.g., lighting); iii) more rich in descriptive information, among other. In particular in the current work two different image fusion strategies are considered. Firstly, images from the visible and thermal spectrum are fused using a Discrete Wavelet Transform (DWT) approach. Secondly, a monochrome threshold strategy is considered. The obtained representations are evaluated under a visual odometry framework, highlighting their advantages and disadvantages, using different urban and semi-urban scenarios. Comparisons with both monocular-visible spectrum and monocular-infrared spectrum, are also provided showing the validity of the proposed approach.
|
|
|
Miguel Oliveira, Vítor Santos, Angel D. Sappa, & Paulo Dias. (2015). Scene representations for autonomous driving: an approach based on polygonal primitives. In Iberian Robotics Conference (ROBOT 2015), Lisbon, Portugal, 2015 (Vol. 417, pp. 503–515). Springer International Publishing Switzerland 2016.
Abstract: In this paper, we present a novel methodology to compute a 3D scene representation. The algorithm uses macro scale polygonal primitives to model the scene. This means that the representation of the scene is given as a list of large scale polygons that describe the geometric structure of the environment. Results show that the approach is capable of producing accurate descriptions of the scene. In addition, the algorithm is very efficient when compared to other techniques.
|
|
|
A. Amato, F. Lumbreras, & Angel D. Sappa. (2014). A general-purpose crowdsourcing platform for mobile devices. In Computer Vision Theory and Applications (VISAPP), 2014 International Conference on, Lisbon, Portugal, 2014 (Vol. 3, pp. 211–215). Lisbon, Portugal: IEEE.
Abstract: This paper presents details of a general purpose micro-taskon-demand platform based on the crowdsourcing philosophy. This platformwas specifically developed for mobile devices in order to exploit the strengths of such devices; namely: i) massivity, ii) ubiquityand iii) embedded sensors.The combined use of mobile platforms and the crowdsourcing model allows to tackle from the simplest to the most complex tasks.Users experience is the highlighted feature of this platform (this fact is extended to both task-proposer and task- solver).Proper tools according with a specific task are provided to a task-solver in order to perform his/her job in a simpler, faster and appealing way.Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications.Examples of its usage in computer vision and computer games are provided illustrating the potentiality of the platform.
|
|