Abstract: The simultaneous use of images from different spectra can be helpful to improve the performance of many com- puter vision tasks. The core idea behind the usage of cross- spectral approaches is to take advantage of the strengths of each spectral band providing a richer representation of a scene, which cannot be obtained with just images from one spectral band. In this work we tackle the cross-spectral image similarity problem by using Convolutional Neural Networks (CNNs). We explore three different CNN archi- tectures to compare the similarity of cross-spectral image patches. Specifically, we train each network with images from the visible and the near-infrared spectrum, and then test the result with two public cross-spectral datasets. Ex- perimental results show that CNN approaches outperform the current state-of-art on both cross-spectral datasets. Ad- ditionally, our experiments show that some CNN architec- tures are capable of generalizing between different cross- spectral domains.