Search Engine Similarity Analysis: A Combined Content and Rankings
Approach
- URL: http://arxiv.org/abs/2011.00650v2
- Date: Fri, 6 Nov 2020 17:11:10 GMT
- Title: Search Engine Similarity Analysis: A Combined Content and Rankings
Approach
- Authors: Konstantina Dritsa, Thodoris Sotiropoulos, Haris Skarpetis, Panos
Louridas
- Abstract summary: We present an analysis of the affinity of the two major search engines, Google and Bing, along with DuckDuckGo.
We developed a new similarity metric that leverages both the content and the ranking of search responses.
We found that Google stands apart, but Bing and DuckDuckGo are largely indistinguishable from each other.
- Score: 6.69087470775851
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How different are search engines? The search engine wars are a favorite topic
of on-line analysts, as two of the biggest companies in the world, Google and
Microsoft, battle for prevalence in the web search space. Differences in search
engine popularity can be explained by their effectiveness or other factors,
such as familiarity with the most popular first engine, peer imitation, or
force of habit. In this work we present a thorough analysis of the affinity of
the two major search engines, Google and Bing, along with DuckDuckGo, which
goes to great lengths to emphasize its privacy-friendly credentials. To do so,
we collected search results using a comprehensive set of 300 unique queries for
two time periods in 2016 and 2019, and developed a new similarity metric that
leverages both the content and the ranking of search responses. We evaluated
the characteristics of the metric against other metrics and approaches that
have been proposed in the literature, and used it to investigate (1) the
similarities of search engine results, (2) the evolution of their affinity over
time, (3) which aspects of the results influence similarity, and (4) how the
metric differs across different kinds of search services. We found that Google
stands apart, but Bing and DuckDuckGo are largely indistinguishable from each
other.
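The abstract does not say how the metric is defined. Purely as an illustration of what "leveraging both content and ranking" can mean, the Python sketch below scores two ranked result lists by matching each result to its most similar counterpart and discounting that match by how far apart the two sit in the rankings. Everything in it is an assumption made for this illustration and is not taken from the paper: the function names, the Jaccard token overlap used as the content measure, the 1/(1+|i-j|) rank-distance penalty, and the 1/(rank+1) position weights.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens of a result snippet (hypothetical content model)."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def directed_similarity(results_a: List[str], results_b: List[str]) -> float:
    """Position-weighted average of each A result's best rank-discounted match in B."""
    if not results_a:
        return 1.0 if not results_b else 0.0
    total = 0.0
    weight_sum = 0.0
    for i, doc_a in enumerate(results_a):
        weight = 1.0 / (i + 1)  # results near the top count more
        weight_sum += weight
        best = max(
            (
                jaccard(tokens(doc_a), tokens(doc_b)) / (1 + abs(i - j))
                for j, doc_b in enumerate(results_b)
            ),
            default=0.0,
        )
        total += weight * best
    return total / weight_sum


def combined_similarity(results_a: List[str], results_b: List[str]) -> float:
    """Symmetric score in [0, 1]: the mean of the two directed scores."""
    return 0.5 * (
        directed_similarity(results_a, results_b)
        + directed_similarity(results_b, results_a)
    )


if __name__ == "__main__":
    engine_x = ["python tutorial for beginners", "official python documentation"]
    engine_y = ["official python documentation", "python tutorial for beginners"]
    print(combined_similarity(engine_x, engine_x))
    print(combined_similarity(engine_x, engine_y))
```

In this sketch, identical lists score 1, lists with no shared tokens score 0, and lists with the same content in a different order fall in between, which is the behaviour a purely content-based or purely rank-based measure would miss.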
Related papers
- The Essence of the Essence from the Web:The Metasearch Engine [0.0]
A metasearch engine reduces the user's burden by dispatching queries to multiple search engines in parallel.
These engines do not own a database of Web pages; rather, they send search terms to the databases maintained by the search engine companies.
In this paper, we describe the working of a typical metasearch engine and then present a comparative study of traditional search engines and metasearch engines on the basis of different parameters.
arXiv Detail & Related papers (2024-11-06T06:56:22Z)
- Exploring Query Understanding for Amazon Product Search [62.53282527112405]
We study how query understanding-based ranking features influence the ranking process.
We propose a query understanding-based multi-task learning framework for ranking.
We present our studies and investigations using the real-world system on Amazon Search.
arXiv Detail & Related papers (2024-08-05T03:33:11Z)
- Tree Search for Language Model Agents [69.43007235771383]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments.
Our approach is a form of best-first tree search that operates within the actual environment space.
It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z)
- A comparison of online search engine autocompletion in Google and Baidu [3.5016560416031886]
We study the characteristics of search auto-completions in two different linguistic and cultural contexts: Baidu and Google.
We find differences between the two search engines in the way they suppress or modify original queries.
Our study highlights the need for more refined, culturally sensitive moderation strategies in current language technologies.
arXiv Detail & Related papers (2024-05-03T08:17:04Z)
- The Use of Generative Search Engines for Knowledge Work and Complex Tasks [26.583783763090732]
We analyze the types and complexity of tasks that people use Bing Copilot for compared to Bing Search.
Findings indicate that people use the generative search engine for more knowledge work tasks that are higher in cognitive complexity than were commonly done with a traditional search engine.
arXiv Detail & Related papers (2024-03-19T18:17:46Z)
- User Attitudes to Content Moderation in Web Search [49.1574468325115]
We examine the levels of support for different moderation practices applied to potentially misleading and/or potentially offensive content in web search.
We find that the most supported practice is informing users about potentially misleading or offensive content, and the least supported one is the complete removal of search results.
More conservative users and users with lower levels of trust in web search results are more likely to be against content moderation in web search.
arXiv Detail & Related papers (2023-10-05T10:57:15Z)
- Evaluating Verifiability in Generative Search Engines [70.59477647085387]
Generative search engines directly generate responses to user queries, along with in-line citations.
We conduct human evaluation to audit four popular generative search engines.
We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations.
arXiv Detail & Related papers (2023-04-19T17:56:12Z)
- NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost [4.186775801993103]
We describe NeuralSearchX, a metasearch engine based on a multi-purpose large reranking model to merge results and highlight sentences.
We show that our design choices led to a much more cost-effective system with competitive QPS while having close to state-of-the-art results on a wide range of public benchmarks.
arXiv Detail & Related papers (2022-10-26T16:36:53Z)
- Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries, and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
- The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines [68.8204255655161]
We look at the text search results for "us elections", "donald trump", "joe biden" and "bernie sanders" queries on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex.
Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents.
arXiv Detail & Related papers (2021-05-03T11:18:19Z)