New Demos from Next Generation Search Group at University of Helsinki
From the NGIR web site:
The Next Generation Information Retrieval group looks at search and information retrieval in a world impacted by Linux and Google where open source and open standards are becoming a dominant paradigm for internet services, and information retrieval is viewed as a key function in productive internet use. The group uses probabilistic and information-theoretic methods to model information retrieval, and is committed to open source software development. The group also believes distributed, semantic-based and multilingual methods will have a central role in the future of information retrieval.
Alvis — Superpeer Semantic Search Engine
What is ALVIS?
The project will conduct research in the design, use and interoperability of topic-specific search engines with the goal of developing an open source prototype of a distributed, semantic-based search engine. Existing search engines provide a poor foundation for semantic web operations… Alvis is funded by the EU’s Sixth Framework Programme for Research and Technological Development.
Demos
+ Alvis News Search Engine
News articles gathered at regular intervals from a predefined set of Internet search-related news sites. The database covers the latest articles as well as archives from popular Web news and blog resources.
+ Alvis Wikipedia Search
Background about the Wikipedia search engine.
+ SMART
Topic-specific search for the recent EU project Statistical Multilingual Analysis for Retrieval and Translation (SMART).
Overview of SMART search available here. Uses ALVIS technology.
The crawl uses the ALVIS focussed crawler, which is guided by keywords. Relevance to the crawl is defined by key phrases and takes one of the following forms:
relevance to statistical machine translation, with key phrases: cross-?lingual information access, smt systems?, statistical machine translation, textual information access, statistical translation models?, cross-?lingual information retrieval, information extraction
or both of:
relevance to machine learning, with key phrases: machine learning, statistical learning, kernel methods?, string kernels?, rational kernels?, online learning, support vector machines?, SVM, principal component analysis, independent component analysis, PCA, ICA, discriminative language models?, canonical correlation analysis, margin-?based translation models?, statistical language, latent dirichlet, automatic processing
and relevance to machine translation, with key phrases: machine translation, information retrieval, language models?, translation models?, computational linguistics, lexicon extraction, comprehension aids?, multilingual lexicon, user trials, user evaluation, parallel corpora, language modelling, computer aided translation, comprehension aids, multilingual lexicons?, multilingual corpora, cross-?language information retrieval, natural language processing, multilingual lexicon extraction, human language technology, machine translation technology, machine translation systems?, cross-?lingual information retrieval, linguistic resources.
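The key phrases above read like regular expressions (a trailing "?" makes the preceding character optional), and the rule they describe is: relevant to statistical machine translation, or relevant to both machine learning and machine translation. A minimal Python sketch of that rule follows. It is an illustration only, not the ALVIS crawler's code: the function names are hypothetical, only a subset of the phrases is included, and case-insensitive whole-word matching is an assumption.

```python
import re

# Subsets of the key phrases listed above, treated as regular expressions.
# Assumption: "?" marks the preceding character as optional, as in regex syntax.
SMT_PHRASES = [
    "cross-?lingual information access",
    "smt systems?",
    "statistical machine translation",
    "statistical translation models?",
]
ML_PHRASES = [
    "machine learning",
    "kernel methods?",
    "support vector machines?",
    "SVM",
    "PCA",
]
MT_PHRASES = [
    "machine translation",
    "language models?",
    "parallel corpora",
    "natural language processing",
]


def compile_group(phrases):
    """Build one pattern matching any phrase in the group (hypothetical helper)."""
    # Allow any run of whitespace wherever a phrase contains a space.
    flexible = [p.replace(" ", r"\s+") for p in phrases]
    alternation = "|".join(f"(?:{p})" for p in flexible)
    # Assumption: whole-word, case-insensitive matching.
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


SMT_RE = compile_group(SMT_PHRASES)
ML_RE = compile_group(ML_PHRASES)
MT_RE = compile_group(MT_PHRASES)


def is_relevant(text):
    """Relevant if SMT phrases match, or if both ML and MT phrases match."""
    if SMT_RE.search(text):
        return True
    return bool(ML_RE.search(text)) and bool(MT_RE.search(text))
```

Under these assumptions, `is_relevant("Kernel methods for machine translation")` is true because both the machine learning and machine translation groups match, while a page that mentions only machine learning is not considered relevant.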
A list of seed sites for the crawl is also available here.
Source: ALVIS Consortium, Next Generation Information Search at University of Helsinki
