*COMDECK *HELP
  
        A    G U I D E    T O    S O R T 5
        -    ---------    ---    ---------
  
        PROGRAM DESIGN. D.J.FORTHOFFER AND R.L.MCALLESTER, CONTROL DATA 
CORPORATION, SUNNYVALE, CALIFORNIA. ENTIRE CONTENTS 
*CALL COPYRIGHT 
  
        IMPLEMENTATION. D.J.FORTHOFFER, T.N.DUONG, R.L.MCALLESTER,
W.F.DALE, G.S. PUTERBAUGH, AND JOHN ROBINSON. 
        --------------------------------------------------------------- 
  
        THE PREVIOUS COMDECK HELP HAS BY NOW BECOME TOTALLY ERRONEOUS.
THIS IS INTENDED AS THE BEGINNING OF A REPLACEMENT DECK.
        *** A GUIDE TO THE PERPLEXED ***
  
  
        THE MAIN STRUCTURE OF THE PROGRAM IS CONTAINED IN THE CAPSULE 
S$MAIN AND IN THE CAPSULE S$SRTPH WHICH IS LOADED SOON AFTER. 
        SORT5 CAN BE CALLED IN SEVERAL DIFFERENT WAYS. THESE ARE
1) CONTROL CARD CALLS, 2) THE INTERACTIVE DIALOG (WHICH ALWAYS *BEGINS*
AS A CONTROL CARD CALL), AND 3) THE RELOCATABLE CALLS. THE RELOCATABLE
CALLS ARE IN TURN
SUBDIVIDED INTO THE 'SM' CALLS AND THE 'SM5' CALLS - THE FORMER ARE THE 
SORT-4-COMPATIBLE CALLS, AND THE LATTER ARE THE CALLS DESCRIBED IN THE
SORT5 REFERENCE MANUAL. 
        ALL OF THESE LEAD TO THE LOADING OF S$MAIN. S$MAIN IS THE *ROOT*
CAPSULE IN THE STRUCTURE OF SORT5. IT AND S$SRTPH ARE ALWAYS IN MEMORY
WHEN YOU TAKE A DUMP FROM A SORT5 ABORT. YOU WILL ALWAYS BE ABLE TO FIND
YOUR WAY AS FAR AS THOSE, OR ELSE THERE IS SOMETHING RADICALLY WRONG
WITH CMM. 
        OTHER CAPSULES WILL DEPEND ON WHAT *PHASE* THE SORT IS IN. THESE
PHASES ARE: 1) DEFINING THE SPECIFICATION OF THE SORT, 2) GENERATING THE
CODE FOR THE SORT, AND 3) SORTING.
        DURING PHASE ONE (DEFINING THE SORT) WHAT YOU FIND IN MEMORY
WILL DEPEND ENTIRELY ON THE SORT. FOR CONTROL CARD OR INTERACTIVE DIALOG, 
YOU WILL CERTAINLY HAVE THE SORT5 (0,0) OVERLAY IN MEMORY. THIS CONTAINS
THE PROGRAM SORT5, AT THE BEGINNING OF THE SYMPL LISTING. HOWEVER, THIS 
(0,0) OVERLAY ALSO CONTAINS A LOT OF OTHER STUFF WHICH HAD TO BE RESIDENT 
IN STATIC MEMORY. SEE A LOAD MAP OF THIS OVERLAY - A COMPANION DOCUMENT IS
A LISTING OF THE COMDECK SKELETON, WHICH IS ALSO A MAJOR GUIDE TO THE 
UNDERSTANDING OF THE SORTING PROGRAM'S STRUCTURE. 
        THE MAIN THING TO LOOK FOR IS THE CAPSULE S$GTCSP (*GET *CONTROL
*STATEMENT *PARAMETER).  THIS ONE CRACKS THE CONTROL CARD(S) AND/OR THE
SUPPLEMENTARY DIRECTIVE FILE.  IF THE DIALOG IS INVOLVED, THE CAPSULE 
S$GINPR WILL BE IN MEMORY.
        IF YOU ARE DEALING WITH RELOCATABLE CALLS, LOOK FOR THE DYNAMIC 
AREA OF SORT5 (S$MAIN AND FROM THERE ONWARD). YOU WILL ALSO FIND THE USER 
PROGRAM IN MEMORY. THE BEST WAY TO COMPREHEND THE DUMP IS WITH A LOAD MAP 
OF THE USER PROGRAM.
        IN GENERAL, NONE OF THESE PROCEDURES IS LIKELY TO ABORT, AND IN 
MOST CASES WE WIND UP AT S$SETWH, WHICH LOOKS AT ALL USER SPECIFICATIONS
AND PUTS THEM INTO AN ARRAY CALLED SPEC$. (SEE LISTING OF COMDECK IF
INTERESTED).
        THIS ARRAY IS THEN MASSAGED BY A PROCEDURE CALLED S$PRSPC
(*PROCESS *SPEC$).  BY THIS TIME, WE'RE ALMOST READY TO START SORTING.
HOWEVER, BEFORE WE DO THAT, WE HAVE TO GENERATE THE CODE WHICH WILL PERFORM 
100% OF THE ACTION FROM HERE ON.
        THE SORT IS A TOURNAMENT SORT. THE MAIN ACTIVITIES WHICH IT
CARRIES OUT ARE 1) GETTING A RECORD FROM THE USER, 2) STUFFING THE
RECORD INTO THE TOURNAMENT AND HANDLING THE BACK END AS USUAL, AND
3) GIVING A SORTED OR OUTPUT RECORD BACK TO THE USER.
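        AS AN ILLUSTRATION ONLY (THIS IS PYTHON, NOT THE GENERATED
COMPASS CODE), A TOURNAMENT SORT CAN BE SKETCHED WITH A WINNER TREE.
THE NAMES TOURNAMENT_SORT, LEAVES, TREE AND EMPTY ARE INVENTED FOR THE
SKETCH AND DO NOT APPEAR IN SORT5. HOW SORT5 LAYS OUT ITS OWN
TOURNAMENT IS NOT DESCRIBED HERE.

```python
def tournament_sort(records):
    """Sort records by repeatedly replaying a winner tree (illustrative sketch)."""
    n = len(records)
    if n == 0:
        return []
    size = 1
    while size < n:              # round the leaf count up to a power of two
        size *= 2
    EMPTY = object()             # sentinel: an empty leaf loses every match
    leaves = list(records) + [EMPTY] * (size - n)

    def winner(i, j):
        # Return the leaf index holding the smaller record; ties go to the left leaf.
        if leaves[i] is EMPTY:
            return j
        if leaves[j] is EMPTY:
            return i
        return i if leaves[i] <= leaves[j] else j

    # tree[1] is the root; tree[size + k] is leaf k. Each node stores a leaf index.
    tree = [0] * (2 * size)
    for k in range(size):
        tree[size + k] = k
    for node in range(size - 1, 0, -1):
        tree[node] = winner(tree[2 * node], tree[2 * node + 1])

    out = []
    for _ in range(n):
        w = tree[1]              # overall winner sits at the root
        out.append(leaves[w])
        leaves[w] = EMPTY        # retire the winner, then replay its path to the root
        node = (size + w) // 2
        while node >= 1:
            tree[node] = winner(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return out


print(tournament_sort([5, 3, 8, 1, 9, 2]))   # prints [1, 2, 3, 5, 8, 9]
```

        ONLY THE MATCHES ON THE PATH FROM THE RETIRED LEAF TO THE ROOT
ARE REPLAYED, WHICH IS WHY EACH OUTPUT RECORD COSTS ABOUT LOG2(N)
COMPARISONS.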
        TO GENERATE THE CODE, WE LOAD THE CAPSULE S$GNSRT.  (*GENERATE
*SORT).  THIS GENERATES THREE SETS OF CODE.  THEY ARE CALLED INIT-CODE, 
SHORT-CODE, AND LONG-CODE.  THE CODE-GENERATION CAPSULES ARE CALLED 
S$GNINI, S$GNSHT, AND S$GNLNG.  ALL THREE POSSIBLE CODES ARE GENERATED
BEFORE ANY SORTING TAKES PLACE.  (THE GENERATION TAKES PLACE IN THE 
TWINKLING OF AN EYE, ALTHOUGH THERE IS SOME PP TIME IN HAULING IN THE 
CAPSULES TO DO IT). 
        ONCE THE CODES ARE GENERATED (AND TWO OF THEM HAVE BEEN SAVED ON
DISK), WE PROCEED WITH THE CODE LAST GENERATED, WHICH WAS INIT-CODE.
        THE FUNCTION OF INIT-CODE IS TO FILL UP THE TOURNAMENT. SO IT 
GETS RECORDS FROM THE USER UNTIL THE TOURNAMENT IS FULL.  IN THE CASE 
WHERE IT RUNS OUT OF RECORDS BEFORE IT CAN FILL THE TOURNAMENT, IT
PASSES THE INFORMATION BACK TO S$SRTPH (*SORT *PHASE), WHICH IS IN
CONTROL NOW. TO FOLLOW INIT-CODE, YOU GO TO S$GNINI(SYMPL) AND THE COMPASS
ROUTINES IT CALLS, WHICH ARE S$GNINI, S$GNIN2 ETC.
        NOW, IF THE TOURNAMENT IS STILL PARTLY EMPTY AND THE USER IS
ALL OUT OF RECORDS, WE PLAINLY HAVE A SHORT SORT. SO S$SRTPH LOADS IN 
SHORT-CODE. SHORT-CODE JUST DRAGS THE RECORDS OUT OF THE TOURNAMENT AND 
RETURNS THEM TO THE USER. 
        IF THE USER HAS MORE RECORDS AND THE TOURNAMENT IS FULL, WE HAVE
A LONG SORT. IN THIS CASE, LONG-CODE IS LOADED INTO MEMORY INSTEAD OF 
SHORT-CODE. THE FUNCTION OF LONG-CODE IS TO EXHAUST ALL USER RECORDS. IN
THE PROCESS IT CONTINUES RUNNING THE TOURNAMENT AND WRITING SORTED
STRINGS TO OUTPUT ON ZZZZZ FILES. IF THE USER NEVER RUNS OUT OF RECORDS,
THEN WE HAVE AN INFINITE LOOP.
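        THE INIT / SHORT / LONG SEQUENCE ABOVE CAN BE SKETCHED IN PYTHON
AS FOLLOWS. THIS IS A SKETCH UNDER ASSUMPTIONS, NOT THE GENERATED CODE:
A HEAP STANDS IN FOR THE TOURNAMENT, A LIST OF LISTS STANDS IN FOR THE
SORTED STRINGS ON THE ZZZZZ FILES, AND THE RUN-NUMBER TAGGING (THE
TECHNIQUE USUALLY CALLED REPLACEMENT SELECTION) IS ASSUMED RATHER THAN
TAKEN FROM THE SORT5 SOURCE. MAKE_RUNS AND CAPACITY ARE INVENTED NAMES.

```python
import heapq

_DONE = object()   # marks "user has no more records"


def make_runs(records, capacity):
    """Return sorted runs built with a heap of at most `capacity` records.

    capacity must be at least 1.
    """
    it = iter(records)

    # INIT step: take records from the user until the "tournament" is full.
    heap = []
    for rec in it:
        heap.append((0, rec))            # (run number, record)
        if len(heap) == capacity:
            break
    heapq.heapify(heap)

    # SHORT case: input ran out early, so the loop below simply drains
    # the heap into one run. LONG case: each record written out is
    # replaced by the next user record until the user has no more.
    runs, current, run_no = [], [], 0
    while heap:
        r, rec = heapq.heappop(heap)
        if r != run_no:                  # nothing left for the current run
            runs.append(current)
            current, run_no = [], r
        current.append(rec)
        nxt = next(it, _DONE)
        if nxt is not _DONE:
            # A record smaller than the one just written cannot join the
            # current run, so it is tagged for the next one.
            heapq.heappush(heap, (run_no if nxt >= rec else run_no + 1, nxt))
    if current:
        runs.append(current)
    return runs


print(make_runs([5, 1, 4, 2, 3, 0], 3))   # prints [[1, 2, 3, 4, 5], [0]]
print(make_runs([3, 1, 2], 8))            # prints [[1, 2, 3]]
```

        THE SECOND CALL IS THE SHORT SORT: THE INPUT ENDS BEFORE THE HEAP
IS FULL, SO A SINGLE RUN COMES OUT AND NO MERGING WOULD BE NEEDED.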
        SO, WE COME TO A POINT IN TIME WHERE THE USER HAS NO MORE 
RECORDS. AT THIS POINT WE SHUT THE FRONT DOOR AND PUSH THE TOURNAMENT 
OUT ONTO A ZZZZZ-FILE AND TAKE STOCK OF WHAT WE HAVE ON DISK. THIS PHASE
CONSULTS A TABLE OF ALL THE SORTED STRINGS IN THE ZZZZZ FILES, AND
GENERATES THE CODE FOR THE INTERMEDIATE MERGE - IF NECESSARY. IT IS
SMART ENOUGH TO GO DIRECTLY TO FINAL MERGE IF THERE ARE JUST A FEW FILES.
        FINAL MERGE (WITH OR WITHOUT INTERMEDIATE MERGING) WILL PRODUCE 
OUTPUT RECORDS WHICH ARE GIVEN TO THE USER WITH CODE GENERATED BY 
S$GNPRU (*GENERATE *PUT *RECORD TO *USER).
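        THE CHOICE BETWEEN INTERMEDIATE AND FINAL MERGE CAN BE PICTURED
WITH THE PYTHON SKETCH BELOW. IT IS AN ASSUMPTION-LADEN ILLUSTRATION:
MERGE_RUNS AND MAX_FAN_IN ARE INVENTED NAMES, THE RUNS ARE LISTS IN
MEMORY RATHER THAN STRINGS ON ZZZZZ FILES, AND THE RULE SORT5 REALLY
USES TO DECIDE HOW MANY STRINGS TO MERGE AT ONCE IS NOT GIVEN HERE.

```python
import heapq


def merge_runs(runs, max_fan_in):
    """Merge sorted runs, at most `max_fan_in` at a time (illustrative sketch)."""
    if max_fan_in < 2:
        raise ValueError("max_fan_in must be at least 2")
    runs = [list(r) for r in runs]

    # Intermediate merge: only needed while there are too many runs
    # to merge in a single pass.
    while len(runs) > max_fan_in:
        merged = list(heapq.merge(*runs[:max_fan_in]))
        runs = runs[max_fan_in:] + [merged]

    # Final merge: few enough runs remain, so produce the output records.
    return list(heapq.merge(*runs))


print(merge_runs([[1, 4], [2, 5], [0, 3], [6]], 2))
# prints [0, 1, 2, 3, 4, 5, 6]
```

        WHEN THE NUMBER OF RUNS IS ALREADY WITHIN MAX_FAN_IN, THE WHILE
LOOP IS SKIPPED AND THE SKETCH GOES STRAIGHT TO THE FINAL MERGE.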
        GETTING AND PUTTING RECORDS IS A SIMPLE BUSINESS EXCEPT IN SORT5. 
THE REASON FOR THIS IS EFFICIENCY. THERE IS NO QUESTION IN SORT5 OF 'KEY
EXTRACTION' VERSUS 'KEY COMPARISON'. SORT5 TAKES THE ENTIRE USER RECORD 
AND CONVERTS IT TO INTERNAL FORMAT BEFORE SORTING IT. THIS PROCESS IS 
CALLED INVERSION. WHEN THE SORT IS COMPLETED, THE INTERNAL RECORD IS
CONVERTED BACK TO ITS NORMAL STATE BEFORE WE PUT IT OUT TO THE USER.
THIS PROCESS IS CALLED REVERSION. 
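        A SMALL PYTHON SKETCH MAY HELP FIX THE IDEA OF INVERSION AND
REVERSION. SORT5'S ACTUAL INTERNAL FORMAT IS NOT DESCRIBED HERE, SO
EVERYTHING SPECIFIC BELOW IS AN ASSUMPTION MADE FOR ILLUSTRATION: A
RECORD IS TAKEN TO BE A SIGNED 32-BIT KEY PLUS SOME PAYLOAD BYTES, AND
THE INTERNAL FORM BIASES THE KEY SO THAT A PLAIN BYTE-BY-BYTE COMPARE
OF TWO INTERNAL RECORDS GIVES THE KEY ORDER. INVERT AND REVERT ARE
INVENTED NAMES.

```python
import struct

BIAS = 0x80000000   # assumed: shifts signed 32-bit keys into unsigned range


def invert(key, payload):
    """Convert (signed 32-bit key, payload bytes) to the assumed internal form."""
    # Big-endian unsigned bytes compare in the same order as the signed keys.
    return struct.pack(">I", key + BIAS) + payload


def revert(internal):
    """Convert the assumed internal form back to (key, payload)."""
    (biased,) = struct.unpack(">I", internal[:4])
    return biased - BIAS, internal[4:]


records = [(5, b"E"), (-3, b"A"), (0, b"C")]
inverted = sorted(invert(k, p) for k, p in records)   # compare raw bytes only
print([revert(r) for r in inverted])
# prints [(-3, b'A'), (0, b'C'), (5, b'E')]
```

        THE SORT ITSELF NEVER LOOKS AT FIELD TYPES IN THIS SKETCH: ALL
KNOWLEDGE OF THE KEY LAYOUT LIVES IN INVERT AND REVERT.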
        THEREFORE, THE GENERATED CODE WHICH GETS A RECORD FROM THE USER 
(S$GNGRU) REALLY HAS TWO PARTS. ONE PART DOES THE STRAIGHTFORWARD 
READING OF A FILE OR OBTAINING THE RECORD FROM OWN-CODE. THE OTHER PART 
INVERTS IT. THE LOGIC FOR INVERSION CAN BE FOUND IN S$GNINV (*GENERATE
*INVERSION), WHICH CALLS OTHER INVERSION ROUTINES AS NECESSARY. THE
LOGIC FOR REVERTING THE RECORD CAN BE FOUND IN S$GNREV.
