Volunteers Needed to Manually Classify Web Spam, Research Project from Universita’ di Roma “La Sapienza”
From an e-mail:
At the Algorithmic Engineering group at Universita’ di Roma “La Sapienza”, we are currently building a reference collection for testing Web Spam detection algorithms. While similar collections for research on e-mail spam filtering exist, there are no publicly available collections for testing Web Spam detection techniques. This collection will be freely available once it is completed. We are currently tagging a large subset of 8,000 .UK domains.
The objective is to classify every domain as spam, normal or suspicious. There are 12 volunteers at the moment, and we want at least two judges for each classified domain. Having a heterogeneous group of judges also makes the collection more valuable.
Classifying 100 domains takes about 2 to 3 hours. We provide guidelines and examples for the classification task, and an easy-to-use web-based interface for the volunteers.
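To get a rough sense of the effort involved, the figures in the e-mail can be combined into a back-of-the-envelope estimate. The sketch below is only illustrative: it assumes that 8,000 is the number of domains to be labeled, that each domain gets exactly the stated minimum of two judges, and that the work is split evenly among the current 12 volunteers. None of these assumptions is confirmed by the e-mail, and the function name `workload` is made up for this example.

```python
import math

# Figures taken from the e-mail; how they combine is an assumption.
DOMAINS = 8_000           # assumed to be the number of domains to label
JUDGES_PER_DOMAIN = 2     # the stated minimum ("at least two judges")
BATCH_SIZE = 100          # domains per batch
HOURS_PER_BATCH = (2, 3)  # low and high estimate for one batch
VOLUNTEERS = 12           # volunteers at the time of the e-mail


def workload(domains, judges, batch_size, hours_per_batch, volunteers):
    """Return (judgments, batches, total_hours, hours_per_volunteer)."""
    judgments = domains * judges
    batches = math.ceil(judgments / batch_size)
    total_hours = tuple(batches * h for h in hours_per_batch)
    # Assumes an even split, which real volunteer work rarely is.
    per_volunteer = tuple(t / volunteers for t in total_hours)
    return judgments, batches, total_hours, per_volunteer


judgments, batches, total_hours, per_volunteer = workload(
    DOMAINS, JUDGES_PER_DOMAIN, BATCH_SIZE, HOURS_PER_BATCH, VOLUNTEERS
)

print(f"Judgments needed: {judgments}")
print(f"Batches of {BATCH_SIZE}: {batches}")
print(f"Total hours: {total_hours[0]} to {total_hours[1]}")
print(f"Hours per volunteer: {per_volunteer[0]:.1f} to {per_volunteer[1]:.1f}")
```

Under these assumptions the project needs 16,000 individual judgments, or roughly 320 to 480 hours of work in total, which is why additional volunteers make a noticeable difference.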
