Related papers: An Ensemble-based approach for assigning text to correct Harmonized system code

Towards Trustworthy Multimodal Moderation via Policy-Aligned Reasoning and Hierarchical Labeling [22.914127076888086]
Hi-Guard is a multimodal moderation framework that introduces a new policy-aligned decision paradigm. To ensure alignment with evolving moderation policies, Hi-Guard directly incorporates rule definitions into the model prompt. Experiments and real-world deployment demonstrate that Hi-Guard achieves superior classification accuracy, generalization, and interpretability.
arXiv Detail & Related papers (2025-08-05T10:16:04Z)
Universal Item Tokenization for Transferable Generative Recommendation [89.42584009980676]
We propose UTGRec, a universal item tokenization approach for transferable Generative Recommendation. By devising tree-structured codebooks, we discretize content representations into corresponding codes for item tokenization. For raw content reconstruction, we employ dual lightweight decoders to reconstruct item text and images from discrete representations. For collaborative knowledge integration, we assume that co-occurring items are similar and integrate collaborative signals through co-occurrence alignment and reconstruction.
arXiv Detail & Related papers (2025-04-06T08:07:49Z)
AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A [15.86510147965235]
The General Data Protection Regulation (GDPR) requires precise processing information to be clear and accessible. This paper examines state-of-the-art Retrieval-Augmented Generation (RAG) systems enhanced with alignment techniques to fulfill these obligations.
arXiv Detail & Related papers (2025-02-10T16:42:00Z)
Benchmarking Harmonized Tariff Schedule Classification Models [0.0]
The study evaluates several industry-leading solutions, including those provided by Zonos, Tarifflo, Avalara, and WCO BACUDA. Results highlight areas for industry-wide improvement and innovation.
arXiv Detail & Related papers (2024-12-04T16:29:05Z)
An Open Knowledge Graph-Based Approach for Mapping Concepts and Requirements between the EU AI Act and International Standards [1.9142148274342772]
The EU's AI Act will shift the focus of such organizations toward conformance with the technical requirements for regulatory compliance. This paper offers a simple and repeatable mechanism for mapping the terms and requirements relevant to normative statements in regulations and standards.
arXiv Detail & Related papers (2024-08-21T18:21:09Z)
Learnable Item Tokenization for Generative Recommendation [78.30417863309061]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity. LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z)
Towards Standards-Compliant Assistive Technology Product Specifications via LLMs [7.30389619012625]
We introduce CompliAT, a pioneering framework designed to streamline the compliance process of AT product specifications. CompliAT addresses three critical tasks: checking terminology consistency, classifying products according to standards, and tracing key product specifications to standard requirements. We propose a novel approach for product classification, leveraging a retrieval-augmented generation model to accurately categorize AT products in alignment with international standards.
arXiv Detail & Related papers (2024-04-04T00:10:39Z)
RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules [30.239044569301534]
Weakly supervised text classification (WSTC) has attracted increasing attention due to its applicability in classifying a mass of texts. We propose a prompting PLM-based approach named RulePrompt for the WSTC task, consisting of a rule mining module and a rule-enhanced pseudo label generation module. Our approach yields interpretable category rules, proving its advantage in disambiguating easily-confused categories.
arXiv Detail & Related papers (2024-03-05T12:50:36Z)
Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions [50.92702206798324]
We propose a generative prompting framework for zero-shot text classification. GEN-Z measures the LM likelihood of input text conditioned on natural language descriptions of labels. We show that zero-shot classification with simple contextualization of the data source consistently outperforms both zero-shot and few-shot baselines.
arXiv Detail & Related papers (2023-11-13T07:12:57Z)
Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation [2.024620791810963]
This study benchmarks the performance of Prompt Tuning and baselines for multi-label text classification. It is applied to classifying companies into an investment firm's proprietary industry taxonomy. We confirm that the model's performance is consistent across both well-known and less-known companies.
arXiv Detail & Related papers (2023-09-21T13:45:32Z)
Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals [0.0]
A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals. We show that systems differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate). We then show that an ensemble model that pools labeling systems alleviates some of these limitations, exceeding the labeling performance of all currently available systems.
arXiv Detail & Related papers (2023-01-25T07:44:46Z)
Hybrid Rule-Neural Coreference Resolution System based on Actor-Critic Learning [53.73316523766183]
Coreference resolution systems need to tackle two main tasks. One task is to detect all of the potential mentions; the other is to learn the linking of an antecedent for each possible mention. We propose a hybrid rule-neural coreference resolution system based on actor-critic learning.
arXiv Detail & Related papers (2022-12-20T08:55:47Z)
Learning Label Modular Prompts for Text Classification in the Wild [56.66187728534808]
We propose text classification in-the-wild, which introduces different non-stationary training/testing stages. Decomposing a complex task into modular components can enable robust generalisation under such a non-stationary environment. We propose MODULARPROMPT, a label-modular prompt tuning framework for text classification tasks.
arXiv Detail & Related papers (2022-11-30T16:26:38Z)
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models [94.30953696090758]
We build compositional end-to-end spoken language understanding systems. By relying on intermediate decoders trained for ASR, our end-to-end systems transform the input modality from speech to token-level representations. Our models outperform both cascaded and direct end-to-end models on a labeling task of named entity recognition.
arXiv Detail & Related papers (2022-10-27T19:33:18Z)
Interpretable Reinforcement Learning with Multilevel Subgoal Discovery [77.34726150561087]
We propose a novel Reinforcement Learning model for discrete environments. In the model, an agent learns information about the environment in the form of probabilistic rules. No reward function is required for learning; an agent only needs to be given a primary goal to achieve.
arXiv Detail & Related papers (2022-02-15T14:04:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.