Alternate identifier:
-
Related identifier:
-
Creator/Author:
Demir, Nurullah https://orcid.org/0000-0001-8721-4412 [Institut für Informationssicherheit und Verlässlichkeit]
Hörnemann, Jan
Große-Kampmann, Matteo
Urban, Tobias
Holz, Thorsten [Helmholtz-Zentrum für Informationssicherheit]
Pohlmann, Norbert
Wressnegger, Christian [Institut für Informationssicherheit und Verlässlichkeit]
Contributors:
-
Title:
Dataset: On the Similarity of Web Measurements Under Different Experimental Setups
Additional titles:
-
Description:
(Abstract) Measurement studies are essential for research and industry alike to understand the Web's inner workings better and help quantify specific phenomena. Performing such studies is demanding due to the dynamic nature and size of the Web. An experiment's careful design and setup are complex, and many factors might affect the results. However, while several works have independently observed differences in the outcome of an experiment (e.g., the number of observed trackers) based on the measurement setup, it is unclear what causes such deviations. This work investigates the reasons for these differences by visiting 1.7M webpages with five different measurement setups. Based on this, we build "dependency trees" for each page and cross-compare the nodes in the trees. The results show that the measured trees differ considerably, that the cause of differences can be attributed to specific nodes, and that even identical measurement setups can produce different results.
(Technical Remarks) This repository hosts the dataset corresponding to the paper "On the Similarity of Web Measurements Under Different Experimental Setups", which was published in the Proceedings of the 23rd ACM Internet Measurement Conference 2023.
Keywords:
-
Related information:
-
Language:
-
Production year:
Subject areas:
Computer Science
Resource type:
Dataset
Data source:
-
Software used:
-
Data processing:
-
Publication year:
Rights holders:

Hörnemann, Jan
Große-Kampmann, Matteo
Urban, Tobias
Holz, Thorsten
Pohlmann, Norbert
Wressnegger, Christian
Funding:
-

Number of views in the previous six months:
Dataset page views: 299
Downloads: 47
Overall statistics:
Period      Landing page accessed   Dataset downloaded
Jul 2024    32                      0
Jun 2024    61                      0
May 2024    86                      47
Apr 2024    53                      0
Mar 2024    34                      0
Feb 2024    33                      0
Before      253                     1
Total       552                     48
Status:
Published
Uploaded by:
kitopen
Created on:
Archiving date:
2023-08-23
Archive size:
247.7 GB
Archive creator:
kitopen
Archive checksum:
6a6ea4a3c60bf5653caffabe37c2d091 (MD5)
Embargo end date:
-