Learning cross space mapping via DNN using large scale click-through logs
- URL: http://arxiv.org/abs/2302.13275v1
- Date: Sun, 26 Feb 2023 09:00:35 GMT
- Title: Learning cross space mapping via DNN using large scale click-through logs
- Authors: Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui
- Abstract summary: The gap between low-level visual signals and high-level semantics has been progressively bridged by the continuous development of deep neural networks (DNNs).
We propose a unified DNN model for image-query similarity calculation by simultaneously modeling image and query in one network.
Both qualitative and quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.
- Score: 38.94796244812248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The gap between low-level visual signals and high-level semantics has been
progressively bridged by the continuous development of deep neural networks
(DNNs). With the recent progress of DNNs, almost all image classification tasks
have achieved new records of accuracy. To extend the ability of DNNs to image
retrieval tasks, we propose a unified DNN model for image-query similarity
calculation by simultaneously modeling the image and the query in one network.
The unified DNN is named the cross space mapping (CSM) model; it contains two
parts, a convolutional part and a query-embedding part. The image and the query
are mapped to a common vector space via these two parts respectively, and
image-query similarity is naturally defined as the inner product of their
mappings in that space. To ensure good generalization ability of the DNN, we
learn its weights from a large number of click-through logs, which consist of
23 million clicked image-query pairs between 1 million images and 11.7 million
queries. Both qualitative and quantitative results on an image retrieval
evaluation task with 1000 queries demonstrate the superiority of the proposed
method.
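The two-tower structure described in the abstract (a convolutional part for the image, a query-embedding part for the query, joined by an inner product in a common space) can be sketched in miniature. Everything here is an illustrative assumption: the paper uses a CNN on the image side, whereas this toy stands in precomputed features and plain linear maps, and all function names and dimensions are hypothetical.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def embed_query(word_ids, word_embeddings, W_q):
    """Query side: average the query's word embeddings, then apply a
    linear map W_q into the common space (toy stand-in for the paper's
    query-embedding part)."""
    d = len(word_embeddings[0])
    avg = [sum(word_embeddings[i][j] for i in word_ids) / len(word_ids)
           for j in range(d)]
    return matvec(W_q, avg)

def embed_image(cnn_features, W_i):
    """Image side: a linear map W_i on (here precomputed) convolutional
    features, standing in for the convolutional part."""
    return matvec(W_i, cnn_features)

def similarity(img_vec, qry_vec):
    """Image-query similarity is the inner product in the common space."""
    return sum(a * b for a, b in zip(img_vec, qry_vec))
```

In training, clicked image-query pairs from the logs would be pushed toward high inner products and non-clicked pairs toward low ones; this sketch only shows the scoring path.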
Related papers
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The distilled "boosted" two-layer GNN achieves much higher performance on Mini-ImageNet than CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z) - Architecturing Binarized Neural Networks for Traffic Sign Recognition [0.0]
Binarized neural networks (BNNs) have shown promising results in computationally limited and energy-constrained devices.
We propose BNN architectures which achieve more than 90% accuracy on the German Traffic Sign Recognition Benchmark (GTSRB).
The number of parameters of these architectures varies from 100k to less than 2M.
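The core idea behind BNNs is constraining weights to two values so that multiply-accumulates reduce to cheap bitwise operations. A minimal sketch of one common scheme, XNOR-Net-style binarization with a scaling factor, is shown below; this is a generic illustration of binarization, not the specific architectures proposed in the paper.

```python
def binarize(weights):
    """Binarize weights to {-alpha, +alpha}, where alpha is the mean
    absolute weight (XNOR-Net-style scaling; an assumption here, since
    the paper does not specify its binarization scheme in the summary)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]

def binary_dot(bin_w, x):
    """Dot product with binarized weights; on constrained hardware this
    reduces to XNOR and popcount operations."""
    return sum(w * xi for w, xi in zip(bin_w, x))
```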
arXiv Detail & Related papers (2023-03-27T08:46:31Z) - Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID)
Our NID assembles a group of coordinate-based implicit networks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes about two orders of magnitude faster while using up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression [26.501979992447605]
This paper investigates compression from the perspective of compactly representing and storing trained parameters.
We leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters.
We conduct experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks.
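Additive quantization represents each parameter sub-vector as a sum of several codewords drawn from learned codebooks. The sketch below shows the simpler single-codeword (product-quantization-style) variant with a hand-made codebook, as a rough illustration of how codes replace raw parameters; the paper's actual method is more elaborate, and the codebook here is a toy assumption.

```python
def quantize_params(params, codebook, sub_dim):
    """Split a flat parameter vector into sub-vectors of length sub_dim
    and store, for each, the index of the nearest codeword. Additive
    quantization would instead store several codeword indices per
    sub-vector and sum the codewords."""
    codes = []
    for start in range(0, len(params), sub_dim):
        sub = params[start:start + sub_dim]
        best = min(range(len(codebook)),
                   key=lambda c: sum((s - q) ** 2
                                     for s, q in zip(sub, codebook[c])))
        codes.append(best)
    return codes

def dequantize(codes, codebook):
    """Reconstruct an approximate parameter vector from the stored codes."""
    out = []
    for c in codes:
        out.extend(codebook[c])
    return out
```

Storage drops from one float per parameter to one small integer per sub-vector plus the shared codebook, which is the source of the compression.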
arXiv Detail & Related papers (2021-11-19T17:03:11Z) - Frequency learning for image classification [1.9336815376402716]
This paper presents a new approach, composed of trainable frequency filters, for exploring the Fourier transform of the input images.
We propose a slicing procedure to allow the network to learn both global and local features from the frequency-domain representations of the image blocks.
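The underlying operation, transforming a signal to the frequency domain, applying a filter there, and transforming back, can be shown with a 1-D toy. The paper operates on 2-D image blocks with trainable filters; the fixed filter and naive 1-D DFT below are simplifying assumptions for illustration only.

```python
import cmath

def dft(x):
    """Naive 1-D discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse 1-D discrete Fourier transform."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N
            for n in range(N)]

def filter_in_frequency(signal, freq_filter):
    """Multiply the spectrum elementwise by a filter (fixed here; the
    paper learns such filters) and transform back to the signal domain."""
    X = dft(signal)
    Y = [Xk * fk for Xk, fk in zip(X, freq_filter)]
    return [y.real for y in idft(Y)]
```

An all-ones filter recovers the input; zeroing selected frequency bins suppresses those components, which is the degree of freedom the trainable filters exploit.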
arXiv Detail & Related papers (2020-06-28T00:32:47Z) - When CNNs Meet Random RNNs: Towards Multi-Level Analysis for RGB-D Object and Scene Recognition [10.796613905980609]
We propose a novel framework that extracts discriminative feature representations from multi-modal RGB-D images for object and scene recognition tasks.
To cope with the high dimensionality of CNN activations, a random weighted pooling scheme is proposed.
Experiments verify that the fully randomized structure in the RNN stage successfully encodes CNN activations into discriminative solid features.
arXiv Detail & Related papers (2020-04-26T10:58:27Z) - Expressing Objects just like Words: Recurrent Visual Embedding for Image-Text Matching [102.62343739435289]
Existing image-text matching approaches infer the similarity of an image-text pair by capturing and aggregating the affinities between the text and each independent object of the image.
We propose a Dual Path Recurrent Neural Network (DP-RNN) which processes images and sentences symmetrically via recurrent neural networks (RNNs).
Our model achieves the state-of-the-art performance on Flickr30K dataset and competitive performance on MS-COCO dataset.
arXiv Detail & Related papers (2020-02-20T00:51:01Z) - R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.