Related papers: A Deep Learning Approach for Imbalanced Tabular Data in Advertiser Prospecting: A Case of Direct Mail Prospecting

URL: http://arxiv.org/abs/2410.01157v1
Date: Wed, 2 Oct 2024 01:19:40 GMT
Title: A Deep Learning Approach for Imbalanced Tabular Data in Advertiser Prospecting: A Case of Direct Mail Prospecting
Authors: Sadegh Farhang, William Hayes, Nick Murphy, Jonathan Neddenriep, Nicholas Tyris
Abstract summary: We propose a supervised learning approach for identifying new customers, i.e., prospecting, which comprises how we define labels for our data and rank potential customers. This framework is designed to tackle large imbalanced datasets with a vast number of numerical and categorical features. Our framework comprises two components: an autoencoder and a feed-forward neural network.
Score: 0.6990493129893112
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Acquiring new customers is a vital process for growing businesses. Prospecting is the process of identifying and marketing to potential customers using methods ranging from online digital advertising, linear television, and out-of-home advertising to direct mail. Despite the rapid growth in digital advertising (particularly social and search), research shows that direct mail remains one of the most effective ways to acquire new customers. However, there is a notable gap in the application of modern machine learning techniques within the direct mail space, which could significantly enhance targeting and personalization strategies. Methodologies deployed through direct mail are the focus of this paper. We propose a supervised learning approach for identifying new customers, i.e., prospecting, which comprises how we define labels for our data and rank potential customers. Casting prospecting as a supervised learning problem leads to imbalanced tabular data. The current state-of-the-art approach for tabular data is an ensemble of tree-based methods like random forest and XGBoost. We propose a deep learning framework for imbalanced tabular data. This framework is designed to tackle large imbalanced datasets with a vast number of numerical and categorical features. Our framework comprises two components: an autoencoder and a feed-forward neural network. We demonstrate the effectiveness of our framework through a transparent real-world case study of prospecting in direct mail advertising. Our results show that our proposed deep learning framework outperforms the state-of-the-art tree-based random forest approach when applied in the real world.

Related papers

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation [129.27104172458363]
We develop a framework for organizing web pages in terms of both their topic and format. We automatically annotate pre-training data by distilling annotations from a large language model into efficient curations. Our work demonstrates that constructing and mixing domains provides a valuable complement to quality-based data curation methods.
arXiv Detail & Related papers (2025-02-14T18:02:37Z)
Lowering the Barrier of Machine Learning: Achieving Zero Manual Labeling in Review Classification Using LLMs [0.0]
This paper introduces an approach that integrates large language models (LLMs), specifically Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT). Our approach retains high classification accuracy without the need for manual labeling, expert knowledge in tuning and data annotation, or substantial computational power.
arXiv Detail & Related papers (2025-02-05T05:31:54Z)
Context-aware Advertisement Modeling and Applications in Rapid Transit Systems [1.342834401139078]
We present an advertisement model using behavioral and tracking analysis. The model uses the agent-based modeling (ABM) technique and takes rapid transit system users as its target audience, with the aim of reaching the right person in advertisement applications.
arXiv Detail & Related papers (2024-09-16T02:59:36Z)
An explainable machine learning-based approach for analyzing customers' online data to identify the importance of product attributes [0.6437284704257459]
We propose a game theory machine learning (ML) method that extracts comprehensive design implications for product development. We apply our method to a real-world dataset of laptops from Kaggle, and derive design implications based on the results.
arXiv Detail & Related papers (2024-02-03T20:50:48Z)
A Survey of Graph Unlearning [12.86327535559885]
Graph unlearning provides the means to remove sensitive data traces from trained models, upholding the right to be forgotten. We present the first systematic review of graph unlearning approaches, encompassing a diverse array of methodologies. We explore the versatility of graph unlearning across various domains, including but not limited to social networks, recommender systems, and resource-constrained environments like the Internet of Things.
arXiv Detail & Related papers (2023-08-23T20:50:52Z)
Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z)
Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection. We provide an analysis of both classic and new applications in the field. The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach [95.74102207187545]
We show that a prior-free autonomous data augmentation's objective can be derived from a representation learning principle. We then propose a practical surrogate to the objective that can be efficiently optimized and integrated seamlessly into existing methods.
arXiv Detail & Related papers (2022-11-02T02:02:51Z)
Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms [2.062593640149623]
Online reviews play a crucial role in helping consumers evaluate and compare products and services. Fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers. We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm.
arXiv Detail & Related papers (2022-05-26T15:27:46Z)
Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is cost-aware and allows for fine-grained selection of examples through partially labeled scenes. Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
Data-efficient Online Classification with Siamese Networks and Active Learning [11.501721946030779]
We investigate learning from limited labelled, nonstationary and imbalanced data in online classification. We propose a learning method that synergistically combines siamese neural networks and active learning. Our study shows that the proposed method is robust to data nonstationarity and imbalance, and significantly outperforms baselines and state-of-the-art algorithms in terms of both learning speed and performance.
arXiv Detail & Related papers (2020-10-04T19:07:19Z)
Learning to Infer User Hidden States for Online Sequential Advertising [52.169666997331724]
We propose our Deep Intents Sequential Advertising (DISA) method to address these issues. The key part of interpretability is to understand a consumer's purchase intent, which is, however, unobservable (called hidden states).
arXiv Detail & Related papers (2020-09-03T05:12:26Z)
Adversarial Knowledge Transfer from Unlabeled Data [62.97253639100014]
We present a novel Adversarial Knowledge Transfer framework for transferring knowledge from internet-scale unlabeled data to improve the performance of a classifier. An important novel aspect of our method is that the unlabeled source data can be of different classes from those of the labeled target data, and there is no need to define a separate pretext task.
arXiv Detail & Related papers (2020-08-13T08:04:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.