MAdEnGIne v1.0
Copyright 1998, Linda M. Myro Network Design, all rights reserved.

Permission granted to use and modify for any good use, as long as you agree to hold Linda M. Myro Network Design harmless in the event of any disaster, either real or imagined, resulting from the use of this software. Modification and/or installation of this software, either by you or on your behalf, shall be considered agreement of you, the user, with the above conditions.

Questions, problems, etc...email

Madengine uses a set of perl libraries written by Jeffrey Friedl.. , a simple-to-use library that has saved my butt on a number of occasions. If you ever see this, Jeffrey, you have my sincerest gratitude.

Madengine is a Non-Spidering, robot-driven contextual search engine written in perl for *nix-based webservers. Madengine reads a list of admin-configurable URLs, fetches these urls via the web into temp files, removes most html tags, then adds the info to a database file for that page.

When madengine is queried, it returns a list of LINES from those pages that match the query string, along with a linked url for the page each match is from. This gives the user of madengine a better idea of how their query string is used in context, rather than just a raw match.

Madengine DOES NOT follow or index URLS contained within the pages on the preconfigured list...they must be added by the administrator.

Madengine uses a simple, web-based, password-secured admin script which allows you to add or delete urls from your searched url list, as well as an update feature which will refresh the database on demand.

Madengine contains an automatic update script which, if called from unix cron, will do the update at predetermined intervals, assuring that your database is always fresh.

Suggested Use for Madengine

Madengine would best be used to produce a searchable database of similar-content urls.
Computer hardware sites, or other "for sale" kinds of sites in particular, benefit from the "in-context" return of the query string. Having said that:

CAUTION:

1) Madengine can be system-resource intensive, and should not be updated during peak server usage hours.

2) Madengine uses large amounts of disk space, since it stores the entire database as plain text files. Therefore, you should limit the number of urls in your url list accordingly. Madengine is not intended for use as a broad-purpose search engine.

3) Madengine does some "sneaky" things in order to get past servers with "no-bot" security features added. If the madengine robot is used indiscreetly, you will make enemies.

4) Madengine is not compliant with accepted robot guidelines, since it is NOT a spidering robot, and does not follow urls contained in the indexed documents.

Installation Instructions

1) Unzip or uncompress the madengine package.

2) Make a directory (preferably called "madengine" --to use the defaults) within your cgi-bin.

3) Upload all the files contained in the madengine package to this directory.

4) Make a subdirectory called "data" inside this directory.

5) Make sure the first line of madmin.cgi, madengine.cgi, and update.pl all point to the correct location of the perl interpreter for your system.

6) chmod all files to 0755, and chmod the data directory to 0777.

7) Make necessary changes to "config.pl" as described here.

   Change this line to whatever you like. It is what the madmin.cgi script refers to itself as.
       $madmintitle = "MadMin";

   Decide whether you'd like to use wallpaper or a background color.
       $colororimage = "1";

   Define the background color, if chosen above.
       $bgcolor = "WHITE";

   Define the background image url, if chosen above.
       $bgimageurl = "";

   Define the text color for all madengine pages.
       $textcolor = "BLACK";

   Define the link color for all madengine pages.
       $linkcolor = "RED";

   Define the visited link color for all madengine pages.
       $vlinkcolor = "BLUE";

   Define the active link color for all madengine pages.
       $alinkcolor = "PURPLE";

   Set the url of the madmin.cgi script.
       $madminurl = "http://www.yourdomain.com/cgi-bin/madengine/madmin.cgi";

   Set the full path to the data directory.
       $datadirectorypath = "/home/username/cgi-bin/madengine/data";

   Set the password for madmin.
       $madminpass = "memad";

   What Madengine refers to itself as in madengine pages.
       $madenginetitle = "madengine";

   Set the url for the madengine.cgi script.
       $madengineurl = "http://www.yourdomain.com/cgi-bin/madengine/madengine.cgi";

   Set this to your email address, to help the madengine bot get through no-bot sites.
       $pretendfrom = "you@yourdomain.com";

   Set this to a browser type you'd like the madengine bot to masquerade as.
       $useragent = "Mozilla/4.05";

8) Log into madmin.cgi, and begin creating your url list.

9) Update your database with the "update" refresh button.

10) Test madengine.cgi with a sample search string.

11) If you'd like to have the update done automatically, say, once per day, you need to edit your crontab file. If you have telnet access, try typing "crontab -e" at the command prompt. If you get some sort of "denied" or "not allowed" message, you must write to your server administrator for assistance. If an editor comes up with your crontab file (or a blank file), you can add a line something like the following (be sure to use the path to update.pl on YOUR server):

       15 2 * * * /home/username/www/cgi-bin/madengine/update.pl 1> /dev/null 2> /dev/null

   This will update your database once a day, at 02:15 am, when system resources should be more available than during peak usage hours.

12) You're done...enjoy your copy of MAdEnGIne...

Problems, questions, suggestions, comments..contact
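
Illustration: in-context line matching

For anyone curious how the "list of LINES" search described at the top of this readme might work, here is a minimal sketch in perl. It is NOT Madengine's actual code. The helper names (strip_tags, matching_lines), the example url, and the page content are all made up for illustration, and the page text is held in memory rather than in database files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: crude removal of anything that looks like an html tag.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/<[^>]*>//g;
    return $html;
}

# Hypothetical helper: return the non-blank lines of $text that contain
# $query, matched case-insensitively as a literal string (not a regex).
sub matching_lines {
    my ($query, $text) = @_;
    my @hits;
    for my $raw (split /\n/, $text) {
        (my $line = $raw) =~ s/^\s+|\s+$//g;   # trim surrounding whitespace
        next unless length $line;
        push @hits, $line if $line =~ /\Q$query\E/i;
    }
    return @hits;
}

# Made-up page content, keyed by a placeholder url.
my %pages = (
    'http://www.example.com/drives.html' => join("\n",
        '<html><body>',
        '<h1>Hard Drives</h1>',
        '<p>4.3GB IDE drive, $199</p>',
        '<p>Call for SCSI pricing</p>',
        '</body></html>',
    ),
);

# Report each matching line together with the url it came from.
for my $url (sort keys %pages) {
    my $text = strip_tags($pages{$url});
    print "$url: $_\n" for matching_lines('ide', $text);
}
```

Run against the sample page, the query "ide" matches only the line "4.3GB IDE drive, $199", which is printed next to its url. The query is wrapped in \Q...\E so that characters such as "." or "$" in a search string are treated literally.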