Revealing the Black Box of Device Search Engine: Scanning Assets, Strategies, and Ethical Consideration
- URL: http://arxiv.org/abs/2412.15696v1
- Date: Fri, 20 Dec 2024 09:15:48 GMT
- Title: Revealing the Black Box of Device Search Engine: Scanning Assets, Strategies, and Ethical Consideration
- Authors: Mengying Wu, Geng Hong, Jinsong Chen, Qi Liu, Shujun Tang, Youhao Li, Baojun Liu, Haixin Duan, Min Yang
- Abstract summary: This study presents the first comprehensive examination of device search engines' operational and ethical dimensions.
We developed a novel framework to trace the IP addresses utilized by these engines and collected 1,407 scanner IPs.
Our findings reveal significant ethical concerns, including a lack of transparency, harmlessness, and anonymity.
- Score: 24.74127068662522
- Abstract: In the digital age, device search engines such as Censys and Shodan play crucial roles by scanning the internet to catalog online devices, aiding in the understanding and mitigation of network security risks. While previous research has used these tools to detect devices and assess vulnerabilities, there remains uncertainty regarding the assets they scan, the strategies they employ, and whether they adhere to ethical guidelines. This study presents the first comprehensive examination of these engines' operational and ethical dimensions. We developed a novel framework to trace the IP addresses utilized by these engines and collected 1,407 scanner IPs. By uncovering their IPs, we gain the first deep insights into the actions of device search engines and report original findings. By employing 28 honeypots to monitor their scanning activities extensively over one year, we demonstrate that users can hardly evade scans by blocklisting scanner IPs or migrating service ports. Our findings reveal significant ethical concerns, including a lack of transparency, harmlessness, and anonymity. Notably, these engines often fail to provide transparency and do not allow users to opt out of scans. Further, the engines send malformed requests, attempt to access excessive details without authorization, and even publish personally identifiable information (PII) and screenshots in search results. These practices compromise user privacy and expose devices to further risks by potentially aiding malicious entities. This paper emphasizes the urgent need for stricter ethical standards and enhanced transparency in the operations of device search engines, offering crucial insights into safeguarding against invasive scanning practices and protecting digital infrastructures.
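The blocklisting approach the abstract shows to be largely ineffective amounts to a membership check of incoming connections against a list of known scanner networks. A minimal sketch follows; the networks are documentation-range placeholders, not the 1,407 scanner IPs the paper actually collected:

```python
import ipaddress

# Hypothetical excerpt of a scanner blocklist; a real list (such as the
# 1,407 IPs traced in the paper) would be far larger and change over time.
SCANNER_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),  # RFC 5737 documentation range
    ipaddress.ip_network("203.0.113.7/32"),
]

def is_known_scanner(ip: str) -> bool:
    """Return True if `ip` falls inside any blocklisted scanner network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SCANNER_NETWORKS)
```

As the honeypot study shows, this defense fails in practice because engines rotate and expand their scanner pools faster than blocklists are updated.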
Related papers
- Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges [52.96987928118327]
We find that embedding models for retrieval, rerankers, and large language model (LLM) relevance judges are vulnerable to content injection attacks.
We identify two primary threats: (1) inserting unrelated or harmful content within passages that still appear deceptively "relevant", and (2) inserting entire queries or key query terms into passages to boost their perceived relevance.
Our study systematically examines the factors that influence an attack's success, such as the placement of injected content and the balance between relevant and non-relevant material.
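The second threat above, injecting query terms into a passage to boost its perceived relevance, can be illustrated with a deliberately naive lexical scorer. `overlap_score` here is a toy stand-in for the neural retrievers and rerankers the paper actually attacks, but the failure mode is analogous:

```python
def overlap_score(query: str, passage: str) -> float:
    """Naive lexical relevance: fraction of query terms found in the passage.
    A stand-in for an embedding-based retriever, used only to illustrate
    how injected query terms inflate a relevance score."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms)

query = "device search engine ethics"
benign = "a recipe for sourdough bread"
poisoned = benign + " device search engine ethics"  # query terms injected
```

The unrelated passage scores zero against the query, while the same passage with the query appended scores perfectly, despite carrying no relevant content.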
arXiv Detail & Related papers (2025-01-30T18:02:15Z)
- Fingerprinting and Tracing Shadows: The Development and Impact of Browser Fingerprinting on Digital Privacy [55.2480439325792]
Browser fingerprinting is a growing technique for identifying and tracking users online without traditional methods like cookies.
This paper gives an overview by examining the various fingerprinting techniques and analyzes the entropy and uniqueness of the collected data.
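The entropy analysis mentioned above is typically plain Shannon entropy over the observed distribution of a fingerprint attribute: the more bits of entropy an attribute carries, the more it distinguishes users. A small sketch, with made-up toy data for a single attribute:

```python
import math
from collections import Counter

def shannon_entropy(values) -> float:
    """Shannon entropy (in bits) of an observed attribute distribution."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy observations of one fingerprint attribute (e.g. reported time zone).
observed = ["UTC+1"] * 6 + ["UTC-5"] * 2
```

A uniformly split attribute over two values yields exactly 1 bit; the skewed toy distribution above yields about 0.81 bits, i.e. it identifies users less effectively.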
arXiv Detail & Related papers (2024-11-18T20:32:31Z)
- Deepfake Media Forensics: State of the Art and Challenges Ahead [51.33414186878676]
AI-generated synthetic media, also called Deepfakes, have influenced many domains, from entertainment to cybersecurity.
Deepfake detection has become a vital area of research, focusing on identifying subtle inconsistencies and artifacts with machine learning techniques.
This paper reviews the primary algorithms that address these challenges, examining their advantages, limitations, and future prospects.
arXiv Detail & Related papers (2024-08-01T08:57:47Z)
- Darknet Traffic Analysis: A Systematic Literature Review [0.0]
The objective of an anonymity tool is to protect the anonymity of its users through the implementation of strong encryption and obfuscation techniques.
The strong anonymity feature also functions as a refuge for those involved in illicit activities who aim to avoid being traced on the network.
This paper presents a comprehensive analysis of darknet traffic analysis methods that use machine learning techniques to monitor and identify attacks inside the darknet.
arXiv Detail & Related papers (2023-11-27T19:27:50Z)
- When Authentication Is Not Enough: On the Security of Behavioral-Based Driver Authentication Systems [53.2306792009435]
We develop two lightweight driver authentication systems based on Random Forest and Recurrent Neural Network architectures.
We are the first to propose attacks against these systems by developing two novel evasion attacks, SMARTCAN and GANCAN.
Through our contributions, we aid practitioners in safely adopting these systems, help reduce car thefts, and enhance driver security.
arXiv Detail & Related papers (2023-06-09T14:33:26Z)
- Reproducing Random Forest Efficacy in Detecting Port Scanning [0.0]
Port scanning is a method used by hackers to identify vulnerabilities in a network or system.
It is important to detect port scanning because it is often the first step in a cyber attack.
Researchers have worked for over a decade to develop robust methods to detect port scanning.
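The classic heuristic underlying port-scan detection is that one source touching many distinct ports is suspicious. A minimal sketch follows; the event format and threshold are illustrative, and a learned detector such as the Random Forest in the paper would use far richer features:

```python
from collections import defaultdict

def detect_scanners(events, port_threshold: int = 10) -> set:
    """Flag source IPs contacting more than `port_threshold` distinct ports.

    `events` is an iterable of (src_ip, dst_port) pairs, e.g. parsed from
    flow logs. This is only the classic distinct-port heuristic, not the
    feature set a Random Forest classifier would be trained on.
    """
    ports_by_src = defaultdict(set)
    for src, port in events:
        ports_by_src[src].add(port)
    return {src for src, ports in ports_by_src.items()
            if len(ports) > port_threshold}
```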
arXiv Detail & Related papers (2023-02-18T12:28:53Z)
- Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance [11.32995543117422]
We experimentally characterize the potential for one type of misuse -- attackers manipulating the content scanning system to perform physical surveillance on target locations.
Our contributions include: (1) we offer a definition of physical surveillance in the context of client-side image scanning systems; and (2) we experimentally characterize this risk, creating a surveillance algorithm that achieves physical surveillance rates of >40% by poisoning 5% of the perceptual hash database.
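Client-side scanning systems of this kind compare perceptual hashes rather than exact bytes: images whose hashes are close are treated as the same content, which is exactly the property a poisoned hash database abuses. A minimal average-hash sketch (not the production pHash/PDQ/NeuralHash algorithms) illustrates the idea:

```python
def average_hash(pixels) -> int:
    """Simple average-hash: bit i is set iff pixel i exceeds the mean.

    `pixels` is a flat list of grayscale values for an already-resized
    (e.g. 8x8) image. Deployed systems use more robust perceptual hashes,
    but matching still means "hashes within a small Hamming distance".
    """
    mean = sum(pixels) / len(pixels)
    return sum(1 << i for i, p in enumerate(pixels) if p > mean)

def hamming_distance(h1: int, h2: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(h1 ^ h2).count("1")
```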
arXiv Detail & Related papers (2022-12-08T06:52:14Z)
- Synthetic ID Card Image Generation for Improving Presentation Attack Detection [12.232059909207578]
This work explores three methods for synthetically generating ID card images to increase the amount of data while training fraud-detection networks.
Our results indicate that databases can be supplemented with synthetic images without any loss in performance for the print/scan Presentation Attack Instrument Species (PAIS) and a loss in performance of 1% for the screen capture PAIS.
arXiv Detail & Related papers (2022-10-31T19:07:30Z)
- Deep Learning Algorithm for Threat Detection in Hackers Forum (Deep Web) [0.0]
We propose a novel approach for detecting cyberthreats using a deep learning algorithm, Long Short-Term Memory (LSTM).
Our model can be easily deployed by organizations to secure digital communications and detect vulnerability exposure before a cyberattack.
arXiv Detail & Related papers (2022-02-03T07:49:44Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
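The Full DOS idea rests on a quirk of the PE format: the Windows loader reads only the `MZ` magic at offset 0 and the `e_lfanew` field at offset 0x3C from the DOS header, so the bytes in between are slack an attacker can overwrite without breaking execution. The toy editor below makes that edit on raw bytes; it illustrates the header layout under those assumptions and is not the authors' attack code:

```python
def inject_dos_header(pe_bytes: bytes, payload: bytes) -> bytes:
    """Sketch of a Full DOS-style edit: overwrite the unused DOS-header
    bytes of a PE file with an adversarial payload.

    Only the 'MZ' magic (offset 0) and e_lfanew (offset 0x3C) are read
    by the loader, so offsets 2..0x3B can be repurposed. A real attack
    would also exploit the DOS stub and, for Extend/Shift, rewrite
    e_lfanew and section offsets.
    """
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE file")
    slack = 0x3C - 2  # 58 bytes between the magic and e_lfanew
    if len(payload) > slack:
        raise ValueError("payload too large for DOS header slack")
    data = bytearray(pe_bytes)
    data[2:2 + len(payload)] = payload
    return bytes(data)
```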
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.