NFT1000: A Cross-Modal Dataset for Non-Fungible Token Retrieval
- URL: http://arxiv.org/abs/2402.16872v2
- Date: Thu, 17 Oct 2024 02:53:23 GMT
- Title: NFT1000: A Cross-Modal Dataset for Non-Fungible Token Retrieval
- Authors: Shuxun Wang, Yunfei Lei, Ziqi Zhang, Wei Liu, Haowei Liu, Li Yang, Wenjuan Li, Bing Li, Weiming Hu
- Abstract summary: We introduce a benchmark dataset named "NFT Top1000 Visual-Text Dataset" (NFT1000), containing 7.56 million image-text pairs.
Based on this dataset and leveraging the CLIP series of pre-trained models, we propose a dynamic masking fine-tuning scheme.
We also propose a robust metric, the Comprehensive Variance Index (CVI), to assess the similarity and retrieval difficulty of visual-text pair data.
- Score: 38.63307493935328
- License:
- Abstract: With the rise of the "Metaverse" and "Web 3.0", the Non-Fungible Token (NFT) has emerged as a pivotal kind of digital asset, garnering significant attention. By the end of March 2024, more than 1.7 billion NFTs had been minted across various blockchain platforms. To effectively locate a desired NFT, conducting searches within a vast array of NFTs is essential. The challenge in NFT retrieval is heightened by the high degree of similarity among different NFTs in both regional and semantic aspects. In this paper, we introduce a benchmark dataset named the "NFT Top1000 Visual-Text Dataset" (NFT1000), containing 7.56 million image-text pairs collected from the 1,000 most famous PFP (profile picture) NFT collections by sales volume on the Ethereum blockchain. Based on this dataset and leveraging the CLIP series of pre-trained models as our foundation, we propose a dynamic masking fine-tuning scheme. This approach yields a 7.4% improvement in top-1 accuracy while utilizing merely 13% of the total training data (0.79 million vs. 6.1 million pairs). We also propose a robust metric, the Comprehensive Variance Index (CVI), to assess the similarity and retrieval difficulty of visual-text pair data. The dataset will be released as an open-source resource. For more details, please refer to: https://github.com/ShuxunoO/NFT-Net.git.
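As a concrete illustration of the cross-modal retrieval task that NFT1000 targets, the sketch below computes top-1 image-to-text retrieval accuracy with an off-the-shelf CLIP checkpoint from Hugging Face Transformers. This is only a minimal sketch: the checkpoint name, the small batch of aligned image-text pairs, and the accuracy definition are illustrative assumptions, and it does not reproduce the paper's dynamic masking fine-tuning scheme or its CVI metric.

```python
# Minimal sketch of image -> text top-1 retrieval accuracy with an off-the-shelf CLIP model.
# Not the NFT1000 pipeline: checkpoint and evaluation setup are illustrative assumptions.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def top1_retrieval_accuracy(images, captions):
    """images: list of PIL.Image, captions: list of str, aligned by index."""
    inputs = processor(text=captions, images=images, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image[i, j] = similarity between image i and caption j
    preds = outputs.logits_per_image.argmax(dim=-1)
    targets = torch.arange(len(images))
    return (preds == targets).float().mean().item()
```

In a realistic evaluation, the caption pool would span all 7.56 million pairs rather than a single batch, so the similarity matrix would be computed block-wise over precomputed image and text embeddings.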
Related papers
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning [58.7966588457529]
InfiMM-WebMath-40B is a high-quality dataset of interleaved image-text documents.
It comprises 24 million web pages, 85 million associated image URLs, and 40 billion text tokens, all meticulously extracted and filtered from CommonCrawl.
Our evaluations on text-only benchmarks show that, despite utilizing only 40 billion tokens, our dataset significantly enhances the performance of our 1.3B model.
Our models set a new state-of-the-art among open-source models on multi-modal math benchmarks such as MathVerse and We-Math.
arXiv Detail & Related papers (2024-09-19T08:41:21Z)
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens [113.9621845919304]
We release MINT-1T, the most extensive and diverse open-source Multimodal INTerleaved dataset to date.
MINT-1T comprises one trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets.
Our experiments show that LMMs trained on MINT-1T rival the performance of models trained on the previous leading dataset, OBELICS.
arXiv Detail & Related papers (2024-06-17T07:21:36Z)
- What Determines the Price of NFTs? [26.368626684043992]
We analyze both on-chain and off-chain data of NFT collections trading on OpenSea to understand what influences NFT pricing.
Our results show that while text and image data of the NFTs can be used to explain price variations within collections, the extracted features do not generalize to new, unseen collections.
arXiv Detail & Related papers (2023-10-03T06:09:59Z)
- Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning [69.60868581184366]
We propose a Diffusion-based generation framework with Multiple Visual-Policies as rewards for NFT images.
The proposed framework consists of a large language model (LLM), a diffusion-based image generator, and a series of visual rewards by design.
Our framework can generate NFT images showing more visually engaging elements and higher market value, compared with SOTA approaches.
arXiv Detail & Related papers (2023-06-20T17:59:46Z)
- NFTVis: Visual Analysis of NFT Performance [12.491701063977825]
A non-fungible token (NFT) is a data unit stored on the blockchain.
Current rarity models have flaws and are sometimes unconvincing.
It is therefore difficult to take all relevant factors into account and analyze NFT performance efficiently.
arXiv Detail & Related papers (2023-06-05T09:02:48Z)
- Show me your NFT and I tell you how it will perform: Multimodal representation learning for NFT selling price prediction [2.578242050187029]
Non-Fungible Tokens (NFTs) represent deeds of ownership, based on blockchain technologies and smart contracts, of unique crypto assets in digital art forms (e.g., artworks or collectibles).
We propose MERLIN, a novel multimodal deep learning framework designed to train Transformer-based language and visual models, along with graph neural network models, on collections of NFTs' images and texts.
A key aspect of MERLIN is its independence from financial features, as it exploits only the primary data that a user interested in NFT trading would deal with.
arXiv Detail & Related papers (2023-02-03T11:56:38Z)
- Bubble or Not: Measurements, Analyses, and Findings on the Ethereum ERC721 and ERC1155 Non-fungible Token Ecosystem [22.010657813215413]
The market capitalization of NFTs reached 21.5 billion USD in 2021, almost 200 times that of all previous transactions.
The rapid decline in NFT market fever in the second quarter of 2022 casts doubt on the ostensible boom in the NFT market.
By collecting data from the entire blockchain, we construct three graphs, namely the NFT create graph, NFT transfer graph, and NFT hold graph, to characterize NFT traders (a minimal transfer-graph sketch appears after this list).
We propose new indicators to quantify the activeness and value of NFTs, and an algorithm that combines these indicators with graph analyses to find bubble NFTs.
arXiv Detail & Related papers (2023-01-05T10:17:57Z)
- Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration [138.24994198567794]
iTPN is built around two elaborated designs: 1) the first pre-trained feature pyramid upon a vision transformer (ViT).
Fast-iTPN can accelerate the inference procedure by up to 70%, with negligible performance loss.
arXiv Detail & Related papers (2022-11-23T06:56:12Z)
- Predicting Non-Fungible Token (NFT) Collections: A Contextual Generative Approach [8.246077490514848]
Non-fungible tokens (NFTs) are digital assets stored on a blockchain representing real-world objects such as art or collectibles.
In this paper, we take a contextual generative approach that learns the diverse characteristics of NFT collections.
We then generate potential market value predictions for newly minted collections.
arXiv Detail & Related papers (2022-10-14T12:50:22Z)
- Probably Something: A Multi-Layer Taxonomy of Non-Fungible Tokens [62.997667081978825]
Non-Fungible Tokens (NFTs) are hyped and increasingly marketed as essential building blocks of the Metaverse.
This paper aims to establish a fundamental and comprehensive understanding of NFTs by identifying and structuring common characteristics within a taxonomy.
arXiv Detail & Related papers (2022-08-29T18:00:30Z)
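As referenced in the "Bubble or Not" entry above, the following is a minimal sketch of the transfer-graph idea: wallet addresses as nodes, token transfers as directed edges, and a simple degree-based activeness indicator. The transfer records, the activeness definition, and the use of networkx are illustrative assumptions, not the cited paper's implementation.

```python
# Minimal sketch of an "NFT transfer graph": nodes are wallet addresses,
# directed multi-edges are individual token transfers.
# The transfer log and the activeness proxy below are illustrative assumptions.
import networkx as nx

# Hypothetical transfer events: (from_address, to_address, token_id)
transfers = [
    ("0xAlice", "0xBob", 1),
    ("0xBob", "0xCarol", 1),
    ("0xAlice", "0xCarol", 2),
]

G = nx.MultiDiGraph()
for sender, receiver, token_id in transfers:
    G.add_edge(sender, receiver, token=token_id)

# Crude "activeness" proxy: total number of transfers a wallet participated in.
activeness = {node: G.in_degree(node) + G.out_degree(node) for node in G.nodes}
print(activeness)  # {'0xAlice': 2, '0xBob': 2, '0xCarol': 2}
```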