/*
This is the PLWAP algorithm program, which demonstrates the
results described in:
C. I. Ezeife and Y. Lu. "Mining Web Log sequential Patterns with
Position Coded Pre-Order Linked WAP-tree" in DMKD.
1. DEVELOPMENT ENVIRONMENT:
Although the initial version was developed under the
hardware/software environment specified below, the program also
runs on more powerful and faster multiprocessor UNIX environments.
(1)Hardware: Intel Celeron 400 PC, 64M Memory;
(2)Operating system: Windows 98;
(3)Development tool: Inprise(Borland) C++ Builder 6.0.
Note: The algorithm was developed under C++ Builder 6.0. However,
it is possible to compile and run the program under any standard
C++ development tool.
The program is in plwap.cpp. On Unix, compile it with
"g++ plwap.cpp" and execute the resulting a.out.
2. INPUT:
(1) test.data:
To simplify the input process of the program, we assume that
all input data have been preprocessed so that all events
belonging to the same user id have been gathered together and
formed into a sequence, which is saved in a text file called
"test.data". The "test.data" file is composed of
hundreds of thousands of lines of sequences, where each line
represents the web access sequence of one user.
Every line of the input data file ("test.data") includes the UserID,
the length of the sequence, and the sequence itself, separated by tabs.
An example input line is:
100 5 10 20 40 10 30
Here, 100 is the UserID, 5 is the length of the sequence, and the
sequence is 10, 20, 40, 10, 30.
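The line format above can be read with a few stream extractions.
The following is only a minimal sketch, not the parsing code used
in plwap.cpp: the AccessSequence type and the parseLine function
are hypothetical names introduced here for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the name and fields are illustrative only.
struct AccessSequence {
    int userId = 0;
    std::vector<int> events;
};

// Parses one line of the form "UserID length e1 e2 ... en".
// Fields may be separated by tabs or spaces.
// Returns false if a field is missing or the line holds fewer
// events than the declared length.
bool parseLine(const std::string& line, AccessSequence& out) {
    std::istringstream in(line);
    int length = 0;
    if (!(in >> out.userId >> length) || length < 0) {
        return false;
    }
    out.events.clear();
    for (int i = 0; i < length; ++i) {
        int event = 0;
        if (!(in >> event)) {
            return false;  // declared length exceeds the events present
        }
        out.events.push_back(event);
    }
    return true;
}
```

Calling parseLine on the example line fills userId with 100 and
events with the five values 10, 20, 40, 10, 30. Checking the
declared length against the events actually present is a design
choice of this sketch; the original program may not validate it.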
(2) minimum support:
The program also requires a value between 0 and 1, called the
minimum support. The user enters it interactively when prompted
during the execution of the program. For a minimum support of
50%, the user should type 0.5; for a minimum support of 5%, the
user should type .05; and so on.
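A fractional minimum support is usually turned into an absolute
sequence count before mining. The sketch below assumes the
threshold is rounded up, so that a pattern is frequent when it
occurs in at least that many sequences; this rounding rule and
the name minSupportCount are assumptions made for illustration,
and plwap.cpp may compute the threshold differently.

```cpp
#include <cmath>

// Assumption: a pattern is frequent when it occurs in at least
// ceil(minSupport * totalSequences) of the input sequences.
long minSupportCount(double minSupport, long totalSequences) {
    double raw = minSupport * static_cast<double>(totalSequences);
    return static_cast<long>(std::ceil(raw));
}
```

With a minimum support of 0.5 and 7 input sequences, this sketch
requires a pattern to occur in at least 4 sequences.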
3. OUTPUT: result_PLWAP.data
Once the program terminates, the resulting frequent patterns can
be found in a file named "result_PLWAP.data".
The file may contain multiple lines; each line represents one pattern.
4. FUNCTIONS USED IN THE CODE:
(1)BuildTree: Builds the PLWAP tree
(2)BuildLinkage: Builds the linkage for PLWAP tree
(3)makeCode: Makes the position code for a node
(4)checkPosition: Checks the position between any two nodes in
the PLWAP tree
(5)MiningProcess: Mines sequential frequent patterns from the PLWAP tree
5. DATA STRUCTURE
Three structs are used in this program:
(1) the node struct represents a PLWAP node, which contains the
following information:
a. the event name
b. the number of occurrences of the event
c. a link to the position code
d. the length of the position code
e. the linkage to the next node with the same event name in the PLWAP tree
f. a pointer to its left son
g. a pointer to its right sibling
h. a pointer to its parent
i. the number of its sons
(2) a position code struct
(3) a linkage struct
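The node fields listed in (1) could be laid out roughly as shown
below. This is a sketch under assumptions: every identifier here
(Node, PositionCode, addSon and all field names) is hypothetical,
the event name is assumed to be an integer as in the input example,
and the actual declarations in plwap.cpp may differ.

```cpp
// Forward declaration only; the layout of the position code struct
// is not described in this header, so it is left opaque here.
struct PositionCode;

// Hypothetical layout of a PLWAP node, following items a to i above.
struct Node {
    int event = 0;                  // a. event name
    int count = 0;                  // b. number of occurrences
    PositionCode* code = nullptr;   // c. link to the position code
    int codeLength = 0;             // d. length of the position code
    Node* nextSameEvent = nullptr;  // e. next node with the same event
    Node* leftSon = nullptr;        // f. left son
    Node* rightSibling = nullptr;   // g. right sibling
    Node* parent = nullptr;         // h. parent
    int sonCount = 0;               // i. number of sons
};

// Appends child as the last son of parent, using the
// left-son / right-sibling representation of the tree.
void addSon(Node& parent, Node& child) {
    child.parent = &parent;
    if (parent.leftSon == nullptr) {
        parent.leftSon = &child;
    } else {
        Node* last = parent.leftSon;
        while (last->rightSibling != nullptr) {
            last = last->rightSibling;
        }
        last->rightSibling = &child;
    }
    ++parent.sonCount;
}
```

The left-son and right-sibling pointers let a node with any number
of sons be stored with a fixed number of pointers per node, which
is why the list above needs only fields f and g to reach all sons.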
6. ADDITIONAL INFORMATION:
The run time is displayed on the screen with start time, end time
and total seconds for running the program.
*/
#include
#include