"Near Data" and "Far Data" for Urban Sustainability: How Do Community Advocates Envision Data Intermediaries?
- URL: http://arxiv.org/abs/2501.07661v1
- Date: Mon, 13 Jan 2025 19:47:44 GMT
- Title: "Near Data" and "Far Data" for Urban Sustainability: How Do Community Advocates Envision Data Intermediaries?
- Authors: Han Qiao, Siyi Wu, Christoph Becker
- Abstract summary: Data intermediaries are crucial stakeholders in facilitating data access and use.
Community advocates live in these sites of social injustices and opportunities for change.
This paper examines the unique perspectives that community advocates offer on data intermediaries.
- Score: 1.8900583145555927
- Abstract: In the densifying data ecosystem of today's cities, data intermediaries are crucial stakeholders in facilitating data access and use. Community advocates live in these sites of social injustices and opportunities for change. Highly experienced in working with data to enact change, they offer distinctive insights on data practices and tools. This paper examines the unique perspectives that community advocates offer on data intermediaries. Based on interviews with 17 advocates working with 23 grassroots and nonprofit organizations, we propose the quality of "near" and "far" to be seriously considered in data intermediaries' works and articulate advocates' vision of connecting "near data" and "far data." To pursue this vision, we identified three pathways for data intermediaries: align data exploration with ways of storytelling, communicate context and uncertainties, and decenter artifacts for relationship building. These pathways help data intermediaries to put data feminism into practice, surface design opportunities and tensions, and raise key questions for supporting the pursuit of the Right to the City.
Related papers
- A Survey on Data Markets [73.07800441775814]
The growing trend of trading data for greater welfare has led to the emergence of data markets.
A data market is any mechanism whereby the exchange of data products including datasets and data derivatives takes place.
It serves as a coordinating mechanism by which several functions, including the pricing and the distribution of data, interact.
arXiv Detail & Related papers (2024-11-09T15:09:24Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Situating Data Sets: Making Public Data Actionable for Housing Justice [5.281983320884712]
We investigate and describe the work of making eviction data open to tenant organizers.
This work combines observation, direct participation in data work, and creating media artifacts, specifically digital maps.
arXiv Detail & Related papers (2024-02-19T20:13:42Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- Assessing Scientific Contributions in Data Sharing Spaces [64.16762375635842]
This paper introduces the SCIENCE-index, a blockchain-based metric measuring a researcher's scientific contributions.
To incentivize researchers to share their data, the SCIENCE-index is augmented to include a data-sharing parameter.
Our model is evaluated by comparing the distribution of its output for geographically diverse researchers to that of the h-index.
arXiv Detail & Related papers (2023-03-18T19:17:47Z)
- Contributing to Accessibility Datasets: Reflections on Sharing Study Data by Blind People [14.625384963263327]
We present a pair of studies where 13 blind participants engage in data capturing activities.
We see how different factors influence blind participants' willingness to share study data as they assess risk-benefit tradeoffs.
The majority support sharing of their data to improve technology but also express concerns over commercial use, associated metadata, and the lack of transparency about the impact of their data.
arXiv Detail & Related papers (2023-03-09T00:42:18Z)
- Increasing Data Equity Through Accessibility [25.06163815093506]
This response considers data equity specifically for people with disabilities.
We argue that one critically underserved community in the context of data equity is people with disabilities.
arXiv Detail & Related papers (2022-10-04T20:53:36Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
- Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power? [0.0]
We argue that reducing societal problems to "bias" misses the context-based nature of data.
We highlight the corporate forces and market imperatives involved in the labor of data workers that subsequently shape ML datasets.
arXiv Detail & Related papers (2021-09-16T17:38:26Z)
- Ontologies in CLARIAH: Towards Interoperability in History, Language and Media [0.05277024349608833]
One of the most important goals of digital humanities is to provide researchers with data and tools for new research questions.
The FAIR principles provide a framework as these state that data needs to be: Findable, as they are often scattered among various sources; Accessible, since some might be offline or behind paywalls; Interoperable, thus using standard knowledge representation formats and shared.
We describe the tools developed and integrated in the Dutch national project CLARIAH to address these issues.
arXiv Detail & Related papers (2020-04-06T17:38:47Z)