Research Gatherings: Workshop on Genre-Enabled Search Engines

W9: Towards Genre-Enabled Search Engines: The Impact of NLP
Scheduled to take place on September 30, 2007 in Borovets, Bulgaria.
The workshop is being held in conjunction with the 6th International Conference “Recent Advances in Natural Language Processing” (RANLP-2007).

Generally speaking, genres are textual categories that streamline communication by relying on acknowledged conventions and raising predictable expectations. For instance, the conventions underlying the BLOG genre are represented by a sequence of daily entries that contain the narratives, opinions, and feelings of the blogger, an individual who wishes to participate in a discussion on a certain subject. These entries are public, and other bloggers can comment on them directly by sending their own postings. These conventions differ from those underlying the EDITORIAL genre, where a single person presents an argumentative statement of views that are considered representative of a newspaper as a whole. In brief, genres convey the context of communication. This context is essential when determining the relevance of the information contained in a text…

The distinction between topical and non-topical textual dimensions is crucial when it comes to features. Traditionally, topics and domains rely on features based on content words (e.g. in the bag-of-words approach), while genre classes appear to be more easily identified through grammatical features (such as function words, POS tags, and syntactic features). Because Natural Language Processing (NLP) provides methods for extracting grammatical features, its influence on automatic genre identification deserves close study. For this reason, we wish to investigate to what extent NLP can help identify genre in an information retrieval (IR) scenario.
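
The contrast between content-word features and grammatical features can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not part of the workshop material: the tiny FUNCTION_WORDS list, the helper names, and the two example sentences are all hypothetical, and a real system would use a fuller word list or a POS tagger.

```python
import re
from collections import Counter

# Tiny illustrative function-word list (an assumption, not a standard resource).
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "but",
    "i", "you", "we", "my", "is", "are", "was", "that", "this", "it",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def topical_features(text):
    """Bag of content words: counts of tokens that are not function words."""
    return Counter(t for t in tokenize(text) if t not in FUNCTION_WORDS)


def genre_features(text):
    """Relative frequency of every function word in the text."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty input
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    return {w: counts[w] / total for w in sorted(FUNCTION_WORDS)}


# Hypothetical snippets standing in for a blog entry and an editorial.
blog = "I think my day was great, and I hope you liked it."
editorial = "The policy of the government is a threat to the economy."

print(topical_features(editorial))
print(genre_features(blog)["i"], genre_features(editorial)["i"])
```

The first feature set captures what a text is about, while the second is largely independent of subject matter, which is why the latter kind is often considered for genre. Either vector could then be passed to any standard classifier.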

Papers to be presented:

1) Braslavski P.
Combining Relevance and Genre-Related Rankings: An Exploratory Study

2) Jebari C.
Combining classifiers for flexible genre categorization of web pages

3) Mehler A., Gleim R., Wegner A.
Structural Uncertainty of Hypertext Types. An Empirical Study

4) Stubbe A., Ringlstetter C., Goebel R.
Elements of a Learning Interface for Genre Qualified Search

5) Symonenko S.
Recognizing Genre-Like Regularities in Website Content Structure

6) Tavosanis M.
Juvenile Netspeak and subgenre classification issues in Italian blogs

7) Vidulin V., Luštrek M., Gams M.
Using Genres to Improve Search Engines

Source: SIG-IR