WIRE - Crawling experiment 2: log analysis
Objective
The objective of this experiment
is to measure how deep users go into a website.
Description
For this experiment, the website will be downloaded once and
analyzed to generate a Web graph. We will study the correlation
between the depth of each page (its distance from the root page in
the Web graph) and its number of visits.
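As an illustration only, the depth measurement described above can be sketched as a breadth-first search over the Web graph, followed by a sum of visits at each depth. This is not the tool used in the experiment. The names page_depths and visits_by_depth, the adjacency-list representation, and the toy graph and visit counts are all assumptions made for this example.

```python
from collections import deque


def page_depths(graph, root):
    """Return the depth of every page reachable from root.

    graph maps a page to the list of pages it links to (assumed format).
    Depth is the number of links on the shortest path from root.
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def visits_by_depth(depths, visits):
    """Sum the visit counts of all pages found at each depth."""
    totals = {}
    for page, depth in depths.items():
        totals[depth] = totals.get(depth, 0) + visits.get(page, 0)
    return totals


# Toy data, invented for this example (not real measurements).
graph = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/b": ["/a/1", "/b/1"],
    "/b/1": ["/"],
}
visits = {"/": 100, "/a": 40, "/b": 25, "/a/1": 10, "/b/1": 5}

depths = page_depths(graph, "/")
totals = visits_by_depth(depths, visits)
print(totals)
```

In a real run, the graph would come from the downloaded copy of the site and the visit counts from the access log; how log entries are mapped to pages is left open here.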
Requirements
An access log file of the website, with at least 1,000 page-views
(and no more than 10,000,000 page-views), is required.
Neither the period of time nor the number of daily
visits is relevant for this experiment, so no data about the
number of visits to each individual website will be released.
Any web server (Windows, UNIX/Linux, etc.) is compatible
with this experiment, as we only require a log file.
To participate:
1. Prepare a file with the last 50-100 MB
of your access log file.
2. If possible, compress this file to
save bandwidth.
3. Leave the file in a temporary directory
in your public FTP or HTTP directory.
4. Send an e-mail to ccastill@dcc.uchile.cl with instructions
to download the file.
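Steps 1 and 2 could be carried out with a short script such as the sketch below, which copies the last part of a log file into a gzip-compressed file without loading the whole log into memory. The function name tail_and_compress, the default of 100 MB, and the file names in the usage comment are assumptions for this example; any equivalent tool works just as well.

```python
import gzip
import os


def tail_and_compress(log_path, out_path, max_bytes=100 * 1024 * 1024):
    """Write the last max_bytes of log_path to out_path, gzip-compressed.

    The output always starts at a line boundary, so a log entry that is
    cut by the byte limit is dropped rather than sent half-written.
    """
    size = os.path.getsize(log_path)
    start = max(0, size - max_bytes)
    with open(log_path, "rb") as src, gzip.open(out_path, "wb") as dst:
        if start > 0:
            # Step back one byte and read to the end of that line: this
            # skips a partial entry, or only the preceding newline when
            # start already falls on a line boundary.
            src.seek(start - 1)
            src.readline()
        while True:
            chunk = src.read(1024 * 1024)  # copy in 1 MB pieces
            if not chunk:
                break
            dst.write(chunk)


# Example call (hypothetical file names):
# tail_and_compress("access.log", "access_tail.log.gz")
```

The resulting .gz file is the one to place in the temporary FTP or HTTP directory in step 3.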
Thank you for your collaboration.