MAdEnGIne v1.0 Copyright  1998, Linda M. Myro Network Design, 
all rights reserved.  Permission granted to use and modify 
for any good use, as long as you agree to hold Linda M. Myro 
Network Design harmless, in the event of any disaster 
either real or imagined, resulting from the use of this software. 
Modification and/or installation of this software either by you, 
or on your behalf, shall be considered agreement of you, the user,  
with the above conditions. 
Questions, problems, etc...email  
 
Madengine uses a set of Perl libraries written by Jeffrey Friedl,
a simple-to-use set that has saved my
butt on a number of occasions.  If you ever see this, Jeffrey,
you have my sincerest gratitude.
 
 
Madengine is a non-spidering, robot-driven contextual search engine
written in Perl for *nix-based web servers.
 
Madengine reads a list of admin-configurable URLs, fetches those URLs
over the web into temp files, removes most HTML tags, then adds the
text to a database file for each page.  When madengine is queried, it
returns a list of LINES from those pages that match the query string,
along with a linked URL for the page each match is from.  This gives
the user of madengine a better idea of how their query string is used
in context, rather than just a raw match.  Madengine DOES NOT follow or
index URLs found inside the pages on the preconfigured list...they must
be added by the administrator.
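
To make the "lines in context" idea concrete, here is a minimal shell
sketch of the strip-then-match step.  It is NOT madengine's own code
(madengine is written in Perl); the function names strip_tags and
search_page, the sample page, and the example.com URL are all made up
for illustration, and the tag stripping is deliberately crude.

```shell
#!/bin/sh
# Illustrative sketch only; not taken from the madengine scripts.

# strip_tags: remove anything that looks like <...> within a line.
# Tags that span several lines are not handled.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# search_page URL FILE QUERY: print "URL: line" for every line of FILE
# that contains QUERY (case-insensitive, plain-string match).
search_page() {
    url=$1
    file=$2
    query=$3
    grep -i -F -e "$query" "$file" | while IFS= read -r line; do
        printf '%s: %s\n' "$url" "$line"
    done
}

# Demo with a made-up page; the real robot fetches pages over the web.
page=${TMPDIR:-/tmp}/madsketch.$$
printf '%s\n' \
    '<h1>Parts for sale</h1>' \
    '<p>Pentium <b>II</b> board, $50</p>' \
    '<p>Used modem</p>' | strip_tags > "$page"
search_page "http://www.example.com/parts.html" "$page" "pentium"
rm -f "$page"
```

The demo prints the one matching line with its page address in front:
"http://www.example.com/parts.html: Pentium II board, $50".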
 
Madengine uses a web-based admin script, secured by a simple password,
which allows you to add or delete URLs from your searched URL list, and
provides an update feature which will refresh the database on demand.
 
Madengine contains an automatic update script, which, if called from 
unix cron, will do the update at predetermined intervals, assuring that 
your database is always fresh. 
 
Suggested Use for Madengine 
 
Madengine would best be used to produce a searchable database of
similar-content URLs.  It suits computer hardware sites, or other
"for sale" kinds of sites in particular, because it provides an
"in-context" return of the query string.
 
Having said that: 
 
CAUTION:   
1) Madengine can be system-resource intensive, and the database should
   not be updated during peak server usage hours.
2) Madengine uses large amounts of disk space, since it stores the
   entire database as plain text files.  Therefore, you should limit
   the number of URLs in your URL list accordingly.  Madengine is not
   intended for use as a broad-purpose search engine.
3) Madengine does some "sneaky" things in order to get past servers
   with "no-bot" security features added.  If the madengine robot is
   used indiscreetly, you will make enemies.
4) Madengine is not compliant with accepted robot guidelines, since it
   is NOT a spidering robot, and does not follow URLs contained in the
   indexed documents.
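
Since caution 2 is about disk space, a quick size check of the data
directory can help when deciding how many URLs to list.  This is only
a sketch: data_kb is a made-up helper name, and the path in the
example call is the placeholder from config.pl, not a real location.

```shell
#!/bin/sh
# data_kb DIR: print the total size of DIR in kilobytes.
data_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Example call (substitute the data directory on YOUR server):
# data_kb /home/username/cgi-bin/madengine/data
```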
 
Installation Instructions
 
1) Unzip or uncompress the madengine package.
2) Make a directory (preferably called "madengine", to use the defaults)
   within your cgi-bin.
3) Upload all the files contained in the madengine package to this
   directory.
4) Make a subdirectory called "data" inside this directory.
5) Make sure the first lines of madmin.cgi, madengine.cgi, and update.pl
   all point to the correct location of the perl interpreter for your
   system.
6) chmod all files to 0755, and chmod the data directory to 0777.
7) Make necessary changes to "config.pl" as described here.
 
	Change this line to whatever you like.  It is the name the
	madmin.cgi script uses for itself.
	$madmintitle = "MadMin"; 
 
	Decide whether you'd like to use wallpaper or a background color 
	$colororimage = "1"; 
 
	Define the background color if determined above 
	$bgcolor = "WHITE"; 
 
	Define the background image url if determined above 
	$bgimageurl = ""; 
 
	Define the text color for all madengine pages 
	$textcolor = "BLACK"; 
 
	Define the link color for all madengine pages 
	$linkcolor = "RED"; 
 
	Define the visited link color for all madengine pages 
	$vlinkcolor = "BLUE"; 
 
	Define the active link color for all madengine pages 
	$alinkcolor = "PURPLE"; 
 
	Set the url of the madmin.cgi script 
	$madminurl = "http://www.yourdomain.com/cgi-bin/madengine/madmin.cgi"; 
 
	Set the full path to the data directory 
	$datadirectorypath = "/home/username/cgi-bin/madengine/data"; 
 
	Set the password for madmin 
	$madminpass = "memad"; 
 
	What Madengine refers to itself as in madengine pages 
	$madenginetitle = "madengine"; 
 
	Set the url for the madengine.cgi script 
	$madengineurl = "http://www.yourdomain.com/cgi-bin/madengine/madengine.cgi"; 
 
	Set this to your email address, to help the madengine bot get  
	through no-bot sites 
	$pretendfrom = "you@yourdomain.com";  
 
	Set this to a browser type you'd like the madengine bot to  
	masquerade as 
	$useragent = "Mozilla/4.05";  
 
8) Log into madmin.cgi, and begin creating your url list. 
 
9) Update your database, with the "update" refresh button. 
 
10) Test madengine.cgi with a sample search string. 
 
11) If you'd like to have the update done automatically, say once per
    day, you need to edit your crontab file.

	If you have telnet access, try typing "crontab -e" at the
	command prompt.  If you get some sort of "denied" or "not allowed"
	message, you must write to your server administrator for assistance.

	If an editor comes up with your crontab file (or a blank file),
	you can add a line something like the one below (please be sure to
	set the path to update.pl to YOUR path on YOUR server).

	15 2 * * * /home/username/www/cgi-bin/madengine/update.pl 1> /dev/null 2> /dev/null

	This will update your database once a day, at 02:15 am, when
	system resources should be more available than during peak
	usage hours.
 
12) You're done...enjoy your copy of MAdEnGIne... 
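
If you have shell access, steps 4, 5 and 6 above can be done from the
command prompt.  The sketch below is one way to do it, not part of the
madengine package: setup_madengine_dir and check_shebang are made-up
helper names, and the path in the example calls is only the placeholder
used in config.pl.

```shell
#!/bin/sh
# Illustrative helpers; run them AFTER uploading the files (step 3).

# setup_madengine_dir DIR: create DIR/data (step 4), then set the
# permissions from step 6: plain files 0755, data directory 0777.
setup_madengine_dir() {
    dir=$1
    mkdir -p "$dir/data" || return 1
    for f in "$dir"/*; do
        if [ -f "$f" ]; then
            chmod 0755 "$f"
        fi
    done
    chmod 0777 "$dir/data"
}

# check_shebang FILE: succeed only if the interpreter named on the
# first line of FILE (step 5) exists and is executable.
check_shebang() {
    interp=$(head -n 1 "$1" |
        sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p')
    [ -n "$interp" ] && [ -x "$interp" ]
}

# Example calls (substitute the directory on YOUR server):
# setup_madengine_dir /home/username/cgi-bin/madengine
# check_shebang /home/username/cgi-bin/madengine/madmin.cgi || echo "fix line 1"
```

Keep in mind that 0777 lets every user on the server write to the data
directory, so this layout is best kept to hosts you trust.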
 
Problems, questions, suggestions, comments..contact