ITS, September 2000

How do I ... Generate an Alphabetical List of Web Sites


What is alpha-list

The tool alpha-list is available on the ITS general-purpose machine panther.uwo.ca and was designed to help web information providers maintain an alphabetical listing of a large number of sites. It is primarily useful when you have more than 50 links that work well when listed alphabetically. One example of its use is the Index of Public Mailing Lists, available at http://www.uwo.ca/westerndir/publists/browse-name.html.

An account on panther.uwo.ca is required to use it, but the list it produces can be used on any web server.

alpha-list makes an alphabetical index from a list of site names and URLs. The format of the command is:

   alpha-list [-t template] [-s sepchar] [-noalp] [-exhaustive]
              [-name #] [-url #] [-click col#]
              [-other #,#,...] [inputfile [outputfile]]
alpha-list reads from standard input a file containing a list of site names and URLs. The site names and URLs are separated by sepchar, which by default is a semicolon (;). By default, the first field is the site name and the second is the URL; this can be changed with the -name and -url options, described below. Any other fields can be included with the -other option, also described below. If a line has a site name but no URL, the site is printed as plain text rather than as a link.
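
The input format can be illustrated with a short Python sketch. This is not the alpha-list source; the function names, the whitespace trimming, and the HTML escaping are assumptions made for illustration, and "Emeritus Office" is a made-up entry without a URL.

```python
import html


def parse_line(line, sep=";", name_field=1, url_field=2):
    """Split one input line into (name, url); url is None when absent."""
    fields = [f.strip() for f in line.rstrip("\n").split(sep)]

    def field(n):
        # Field numbers are 1-based, like the -name and -url options.
        return fields[n - 1] if 1 <= n <= len(fields) else ""

    return field(name_field), (field(url_field) or None)


def render_entry(name, url):
    """Return an HTML link, or plain text when the line has no URL."""
    text = html.escape(name)
    if url:
        return f'<a href="{html.escape(url, quote=True)}">{text}</a>'
    return text


print(render_entry(*parse_line("Chemistry;/chem/")))
print(render_entry(*parse_line("Emeritus Office")))
```

The first call prints a link and the second prints the bare name, mirroring the plain-text behaviour described above.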

The list is alphabetized (see below) and then inserted into the template file, by default index-template.html. The index is inserted at the position where the template file contains a line beginning with the characters:

   ###
This gives you the option of adding a header and a footer to the index.

The index itself can consist of several fields (see -other). You can specify headings for these fields in the marker line. For example:

   ###;who;where;why
would give the first field of the index the heading who, and so on. The separator character, ; in the example, is the same as the separator character you use to separate the fields in the input file.
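
As a rough illustration of the marker line, the following Python sketch replaces the first line beginning with ### by an already generated index and returns any headings found on that line. The function name is hypothetical, and handling only the first marker line is an assumption; how alpha-list itself renders the headings is not shown here.

```python
def fill_template(template_text, index_html, sep=";"):
    """Replace the first line starting with '###' by the index.

    Returns (filled_text, headings), where headings are the optional
    field headings given on the marker line.
    """
    out, headings, done = [], [], False
    for line in template_text.splitlines():
        if not done and line.startswith("###"):
            # Text after the marker, split on the separator, gives the headings.
            headings = [h for h in line[3:].split(sep) if h]
            out.append(index_html)
            done = True
        else:
            out.append(line)
    return "\n".join(out), headings


template = "<h1>Index</h1>\n###;who;where;why\n<hr>"
filled, headings = fill_template(template, "<ul><li>Chemistry</li></ul>")
print(headings)
print(filled)
```

Everything before the marker line acts as the header and everything after it as the footer.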

Default input is from standard input, default output to standard output. If the last argument on the command line is a file name, that file will be taken as input. If the last two arguments are file names, the first of the two is the input file, the second the output file. If an output file is specified, its permissions will be set to rw-r--r--.

Alphabetization

The site name field is alphabetized as follows:
  1. If the first word of the name is The, it is not included in the alphabetization, i.e. The Elbow Room is alphabetized under the E, not the T, and it will show as Elbow Room.
  2. Site names starting with Centre of, Centre for, Faculty of, School of, Graduate School of, Office of and Society of are alphabetized both under the C for Centre, the F for Faculty, etc., and under their principal constituent. For example, Faculty of Arts will be alphabetized under both the F and the A. Note that, combined with rule 1 above, The Faculty of Arts will also show up under the F and the A. Under the F it will show as Faculty of Arts, while under the A it will show as Arts, Faculty of.
  3. Entries beginning with Campus or Student will be treated in a manner similar to rule 2, so that Campus Computer Store will show up as Campus Computer Store and as Computer Store, Campus-.
  4. The alphabetization of entries starting with St, St., Ste and Ste. will be as if they started with Saint or Sainte.
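
The four rules can be approximated in Python as follows. This is a sketch under stated assumptions, not the tool's actual code: matching is case-sensitive, the names index_entries and sort_key are invented for illustration, and "St. Peter's Seminary" is only an illustrative input.

```python
import re

# Rule 2 prefixes; the longest is listed first so it is matched first.
PREFIXES = ["Graduate School of", "Centre of", "Centre for", "Faculty of",
            "School of", "Office of", "Society of"]
WORD_PREFIXES = ["Campus", "Student"]  # Rule 3
SAINTS = {"St": "Saint", "St.": "Saint", "Ste": "Sainte", "Ste.": "Sainte"}


def sort_key(display):
    """Rule 4: sort St, St., Ste and Ste. as if spelled Saint or Sainte."""
    first, _, rest = display.partition(" ")
    first = SAINTS.get(first, first)
    return f"{first} {rest}".strip().lower()


def index_entries(name):
    """Return the (sort_key, display) pairs under which a name is listed."""
    name = re.sub(r"^The\s+", "", name.strip())  # Rule 1: drop leading "The"
    displays = [name]
    for prefix in PREFIXES:  # Rule 2: also list under the principal constituent
        if name.startswith(prefix + " "):
            displays.append(f"{name[len(prefix) + 1:]}, {prefix}")
            break
    else:
        for word in WORD_PREFIXES:  # Rule 3: same idea, with a trailing hyphen
            if name.startswith(word + " "):
                displays.append(f"{name[len(word) + 1:]}, {word}-")
                break
    return [(sort_key(d), d) for d in displays]


entries = index_entries("The Faculty of Arts") + index_entries("Campus Computer Store")
for _, shown in sorted(entries):
    print(shown)
```

Keeping the sort key separate from the displayed text lets rule 4 change the ordering without changing what is shown.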

Exhaustive Alphabetization

If you use the -exhaustive option, alphabetization will be done on each word (except stop words like the, and, a) in the name field. For example, an entry International Horse Show would appear under the I (international), under the H (horse) and under the S (show). See the section Example of Exhaustive Alphabetization for details.
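
A minimal Python sketch of the idea follows. The stop-word list is an assumption: the text names only the, and, and a, and "of" is added because it does not appear as a keyword in the example output. The function names and the word-splitting rule are likewise illustrative.

```python
import re

# Assumed stop words; the actual list used by alpha-list is not documented here.
STOP_WORDS = {"the", "and", "a", "of"}


def keywords(name):
    """Return the lowercase words of a name that each get an index entry."""
    words = re.findall(r"[A-Za-z]+", name)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


def exhaustive_index(names):
    """Map each initial letter to the sorted (keyword, name) pairs under it."""
    index = {}
    for name in names:
        for word in keywords(name):
            index.setdefault(word[0].upper(), []).append((word, name))
    return {letter: sorted(pairs) for letter, pairs in sorted(index.items())}


for letter, pairs in exhaustive_index(["International Horse Show"]).items():
    print(letter, pairs)
```

With this input the entry is filed under H, I and S, once for each non-stop word.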

Options

-other n,n,n...
If you want fields from your input file, other than the name field and the url field, to be displayed in the list, specify the field numbers as a comma-separated list.
-t templatefile
By default, the header and footer template file is index-template.html; this option specifies a different file.
-s sepchar
Changes the field separator, a ; by default, to something else. sepchar should be a single character.
-noalp
Switches off the alphabetization rules described above; the alphabetical sort is done on the site names as they are.
-name n
Sets the field number of the name field, 1 by default (n should be a number).
-url n
Sets the field number of the url field, 2 by default (n should be a number).
-click n
Sets the field number of the "clickable" field, 1 by default (n should be a number). This is the field whose text becomes the link that points to the URL.
-exhaustive
Switches on exhaustive alphabetization.

Example

Input File X.list

Faculty of Science;/sci/
Chemistry;/chem/
Astronomy;http://phobos.astro.uwo.ca/
Computer Science;http://www.csd.uwo.ca/
Plant Sciences;/plantsci/
Applied Mathematics;http://pineapple.apmaths.uwo.ca/
Earth Sciences;/earth/es4.html
Statistical & Actuarial Sciences;http://fisher.stats.uwo.ca/
Zoology;/zoo/
Physics;http://www.physics.uwo.ca/
Note that if the url is on the same server as the resulting index, only a short url is needed, for example /chem/ for Chemistry. Astronomy however runs its own server, hence a full url http://phobos.astro.uwo.ca/ is needed.
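
The difference between short and full URLs can be checked with Python's standard urljoin, which resolves a link the way a browser would. The location of the generated index used below is an assumption chosen for illustration; it is not given in this document.

```python
from urllib.parse import urljoin

# Assumed address of the generated index page (hypothetical).
INDEX_URL = "http://www.uwo.ca/sci/X.html"

for url in ("/chem/", "http://phobos.astro.uwo.ca/"):
    # A short URL is resolved against the index's server; a full URL is kept.
    print(url, "->", urljoin(INDEX_URL, url))
```

The short URL resolves to the same server as the index, while the full URL is left unchanged.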

Part of the Template File X_template.html

(HTML that will appear before the list)

###

(HTML that will appear after the list)

Command Used

alpha-list -t X_template.html X.list X.html

Output in X.html

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z



A
Applied Mathematics
Astronomy
C
Chemistry
Computer Science
E
Earth Sciences
F
Faculty of Science
P
Physics
Plant Sciences
S
Science, Faculty of
Statistical & Actuarial Sciences
Z
Zoology


This list was created on Tue May 6 11:45:07 1997

Example of Exhaustive Alphabetization

If the same file as above had been processed with the command:
   alpha-list -exhaustive -t X_template.html X.list X.html
The output would have been as follows:
Index of Faculty of Science

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z



A
actuarial      Statistical & Actuarial Sciences
applied        Applied Mathematics
astronomy      Astronomy
C
chemistry      Chemistry
computer       Computer Science
E
earth          Earth Sciences
F
faculty        Faculty of Science
M
mathematics    Applied Mathematics
P
physics        Physics
plant          Plant Sciences
S
science        Computer Science
"              Faculty of Science
science,       Science, Faculty of
sciences       Earth Sciences
"              Plant Sciences
"              Statistical & Actuarial Sciences
statistical    Statistical & Actuarial Sciences
Z
zoology        Zoology


This list was created on Tue May 6 12:30:40 1997

About alpha-list

This tool was written by Gerard Stafleu, ITS, UWO. Your comments and suggestions for improvements are welcome.
©1997, The University of Western Ontario. Permission is granted to copy in whole or in part provided that due credit is given to the authors, the Division of Information Technology Services, and The University of Western Ontario.


Gerard Stafleu, ITS, UWO <gerard.stafleu@uwo.ca>
Last Update: September 29, 2000
URL: http://www.uwo.ca/its/doc/hdi/web/alpha-list.html