Leveraging Google's Publisher-specific IDs to Detect Website
Administration
- URL: http://arxiv.org/abs/2202.05074v1
- Date: Thu, 10 Feb 2022 14:59:17 GMT
- Title: Leveraging Google's Publisher-specific IDs to Detect Website
Administration
- Authors: Emmanouil Papadogiannakis, Panagiotis Papadopoulos, Evangelos P.
Markatos, Nicolas Kourtellis
- Abstract summary: We propose a novel, graph-based methodology to detect administration of websites on the Web.
We apply our methodology across the top 1 million websites and study the characteristics of the created graphs of website administration.
Our findings show that approximately 90% of the websites are each associated with a single publisher, and that small publishers tend to manage less popular websites.
- Score: 3.936965297430477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Digital advertising is the most popular way for content monetization on the
Internet. Publishers spawn new websites, and older ones change hands with the
sole purpose of monetizing user traffic. In this ever-evolving ecosystem, it is
challenging to effectively answer questions such as: Which entities monetize
what websites? What categories of websites does an average entity typically
monetize, and how diverse are these websites? How has this website
administration ecosystem changed across time?
In this paper, we propose a novel, graph-based methodology to detect
administration of websites on the Web, by exploiting the ad-related
publisher-specific IDs. We apply our methodology across the top 1 million
websites and study the characteristics of the created graphs of website
administration. Our findings show that approximately 90% of the websites are
each associated with a single publisher, and that small publishers tend to
manage less popular websites. We perform a historical analysis of up to 8
million websites, and find a new, constantly rising number of (intermediary)
publishers that control and monetize traffic from hundreds of websites, seeking
a share of the ad-market pie. We also observe that over time, websites tend to
move from big to smaller administrators.
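
The abstract describes the approach only at a high level: websites are linked through shared ad-related publisher-specific IDs, and the resulting graph is analysed. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes IDs of the form `pub-` followed by 16 digits (the AdSense-style format), and the names `extract_publisher_ids` and `group_websites` are hypothetical. Websites that share an ID, directly or transitively, end up in the same connected component of a website/ID bipartite graph.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Assumption: publisher IDs look like "pub-" followed by 16 digits.
PUB_ID_RE = re.compile(r"pub-\d{16}")


def extract_publisher_ids(html: str) -> set[str]:
    """Return all publisher-ID-like strings found in a page's source."""
    return set(PUB_ID_RE.findall(html))


def group_websites(pages: dict[str, str]) -> list[set[str]]:
    """Group websites by connected component of the website/ID graph.

    `pages` maps a website name to its (already downloaded) HTML source.
    """
    parent: dict[tuple[str, str], tuple[str, str]] = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for site, html in pages.items():
        site_node = ("site", site)
        find(site_node)  # register the site even if it carries no IDs
        for pub_id in extract_publisher_ids(html):
            union(site_node, ("id", pub_id))

    groups: dict[tuple[str, str], set[str]] = defaultdict(set)
    for node in list(parent):
        if node[0] == "site":
            groups[find(node)].add(node[1])
    # Sort for a deterministic result.
    return sorted(groups.values(), key=lambda g: sorted(g))
```

In this sketch a website without any recognisable ID forms a group of its own. A real crawl would also need to fetch and render pages and handle other ID formats, which is outside the scope of the example.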
Related papers
- The Web unpacked: a quantitative analysis of global Web usage [0.0]
We estimate the total web traffic and investigate its distribution among domains and industry sectors.
Our analysis reveals a significant concentration of web traffic, with a diminutive number of top websites capturing the majority of visits.
Much of the traffic goes to for-profit but mostly free-of-charge websites, highlighting the dominance of business models not based on paywalls.
arXiv Detail & Related papers (2024-04-26T01:05:47Z)
- User Attitudes to Content Moderation in Web Search [49.1574468325115]
We examine the levels of support for different moderation practices applied to potentially misleading and/or potentially offensive content in web search.
We find that the most supported practice is informing users about potentially misleading or offensive content, and the least supported one is the complete removal of search results.
More conservative users and users with lower levels of trust in web search results are more likely to be against content moderation in web search.
arXiv Detail & Related papers (2023-10-05T10:57:15Z)
- Measuring and Modeling the Free Content Web [13.982229874909978]
We investigate the similarities and differences between free content and premium websites.
For risk analysis, we consider and examine the maliciousness of these websites at the website- and component-level.
arXiv Detail & Related papers (2023-04-26T04:17:43Z)
- "Way back then": A Data-driven View of 25+ years of Web Evolution [4.055696230852368]
We look at the top 100 Alexa websites for over 25 years from the Internet Archive or the "Wayback Machine", archive.org.
We study the changes in popularity, from Geocities and Yahoo! in the mid-to-late 1990s to the likes of Google, Facebook, and TikTok of today.
We also look at different categories of websites and their popularity over the years and find evidence for the decline in popularity of news and education-related websites.
arXiv Detail & Related papers (2022-02-16T18:36:03Z)
- Who Funds Misinformation? A Systematic Analysis of the Ad-related Profit Routines of Fake News sites [3.936965297430477]
We study more than 2400 popular fake and real news websites and show that well-known legitimate ad networks have a direct advertising relation with more than 40% of these fake news websites.
We show that entities who own fake news websites also own (or operate) other types of websites for entertainment, business, and politics, pointing to the fact that owning a fake news website is part of a broader business operation.
arXiv Detail & Related papers (2022-02-10T15:07:33Z)
- Where the Earth is flat and 9/11 is an inside job: A comparative algorithm audit of conspiratorial information in web search results [62.997667081978825]
We examine the distribution of conspiratorial information in search results across five search engines: Google, Bing, DuckDuckGo, Yahoo and Yandex.
We find that all search engines except Google consistently displayed conspiracy-promoting results and returned links to conspiracy-dedicated websites in their top results.
Most conspiracy-promoting results came from social media and conspiracy-dedicated websites while conspiracy-debunking information was shared by scientific websites and, to a lesser extent, legacy media.
arXiv Detail & Related papers (2021-12-02T14:29:21Z)
- Online Advertising Revenue Forecasting: An Interpretable Deep Learning Approach [0.0]
We propose a novel attention-based architecture to predict publishers' advertising revenues.
Our results outperform several benchmark deep-learning time-series forecast models over multiple time horizons.
arXiv Detail & Related papers (2021-11-16T23:55:02Z)
- The Rise and Fall of Fake News sites: A Traffic Analysis [62.51737815926007]
We investigate the online presence of fake news websites and characterize their behavior in comparison to real news websites.
Based on our findings, we build a content-agnostic ML classifier for automatic detection of fake news websites.
arXiv Detail & Related papers (2021-03-16T18:10:22Z)
- A novel auction system for selecting advertisements in Real-Time bidding [68.8204255655161]
Real-Time Bidding is a new Internet advertising system that has become very popular in recent years.
We propose an alternative betting system with a new approach that not only considers the economic aspect but also other relevant factors for the functioning of the advertising system.
arXiv Detail & Related papers (2020-10-22T18:36:41Z)
- Political audience diversity and news reliability in algorithmic ranking [54.23273310155137]
We propose using the political diversity of a website's audience as a quality signal.
Using news source reliability ratings from domain experts and web browsing data from a diverse sample of 6,890 U.S. citizens, we first show that websites with more extreme and less politically diverse audiences have lower journalistic standards.
arXiv Detail & Related papers (2020-07-16T02:13:55Z)
- Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns [115.96295568115251]
We study the problem of automating the online joint bid/daily budget optimization of pay-per-click advertising campaigns over multiple channels.
For every campaign, we capture the dependency of the number of clicks on the bid and daily budget by Gaussian Processes.
We design four algorithms and show that they suffer from a regret that is upper bounded with high probability as $O(\sqrt{T})$.
We present the results of the adoption of our algorithms in a real-world application with a daily average spend of 1,000 Euros for more than one year.
arXiv Detail & Related papers (2020-03-03T11:07:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.