Center for Web Research D.C.S. University of Chile

Table of contents

Overview

Downloading and installing

Documentation

Acknowledgements



WIRE in Spanish

Go to the main page of the WIRE project in Spanish

WIRE - Web Information Retrieval Environment

Overview

The WIRE project is an effort started by the Center for Web Research to create an information retrieval application designed for use on the Web.

Currently, it includes:
  • A simple format for storing a collection of web documents.
  • A web crawler.
  • Tools for extracting statistics from the collection.
  • Tools for generating reports about the collection.
The main characteristics of the WIRE software are:
  • Scalability: designed to work with large volumes of documents, tested with several million documents.
  • Performance: written in C/C++ for high performance.
  • Configurability: all crawling and indexing parameters can be set in an XML file.
  • Analysis: includes several tools for analyzing, extracting statistics from, and generating reports on subsets of the web, e.g., the web of a country or a large intranet.
  • Free software: the code is freely available under the GPL license.
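
To illustrate the XML-based configuration mentioned in the list above, here is a minimal Python sketch of reading crawl parameters from an XML file. The element names (`collection`, `crawler`, `maxdoc`, `maxdepth`, `useragent`) and the `load_settings` helper are hypothetical and chosen only for illustration; they are not WIRE's actual configuration schema, which is described in the online documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: the element names below are illustrative
# assumptions, not WIRE's real schema.
EXAMPLE_CONFIG = """
<config>
  <collection>
    <maxdoc>1000000</maxdoc>
  </collection>
  <crawler>
    <maxdepth>5</maxdepth>
    <useragent>example-bot</useragent>
  </crawler>
</config>
"""


def load_settings(xml_text):
    """Flatten a two-level XML config into a dict keyed by 'section/parameter'."""
    root = ET.fromstring(xml_text.strip())
    settings = {}
    for section in root:
        for param in section:
            settings[f"{section.tag}/{param.tag}"] = (param.text or "").strip()
    return settings


settings = load_settings(EXAMPLE_CONFIG)
# Values are read as strings; convert where a number is expected.
max_depth = int(settings["crawler/maxdepth"])
```

Keeping every parameter in one file of this kind means a crawl can be reproduced or adjusted without recompiling the C/C++ code.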

Downloading and installing

The home page of WIRE is http://www.cwr.cl/projects/WIRE/

The latest version can be downloaded from http://www.cwr.cl/projects/WIRE/releases/. Download and unpack the distribution, then follow the installation instructions.

Documentation and support

If you use WIRE, it is advisable to join the wire-crawler@groups.yahoo.com mailing list to receive announcements of new releases.

See the online documentation.

See also the PhD thesis and publications on the WIRE crawler.


Third-party software

In 2011, NIC Brazil created WIRE-Nic, a fork of WIRE that adds bug fixes and improvements to the original system. They also developed ConNeCTOR, a tool that analyzes websites downloaded with WIRE in order to measure IPv6 adoption, perform geolocation, and check adherence to HTML standards, among other tasks.

Luis Alberto García Hernández created a front-end in Java to configure and execute WIRE, and to visualize/analyze web graphs.

NOKUBI Takatsugu built a SWIG-based library for accessing WIRE. It is useful for reading a collection generated by WIRE from Ruby, Perl, Tcl, and other languages.

Acknowledgements

This project is funded by the Center for Web Research (CWR), which is made possible by the Millennium Science Initiative Program.

Design:
Programming:

 

Department of Computer Sciences
University of Chile
Blanco Encalada #2120
Santiago, Chile

Questions/Comments: cwr@dcc.uchile.cl

Millennium Science Initiative, Ministry of Planning and Cooperation - Government of Chile

