14c14
< % Version 1.2 allows `0' in response to an error, et alia (October 1984).
---
> % Version 1.2 allowed `0' in response to an error, et alia (October 1984).
19,36c19,38
< % Version 2.1 corrects anomalies in discretionary breaks (January 1987).
< % Version 2.2 corrects "(Please type...)" with null \endlinechar (April 1987).
< % Version 2.3 avoids incomplete page in premature termination (August 1987).
< % Version 2.4 fixes \noaligned rules in indented displays (August 1987).
< % Version 2.5 saves cur_order when expanding tokens (September 1987).
< % Version 2.6 adds 10sp slop when shipping leaders (November 1987).
< % Version 2.7 improves rounding of negative-width characters (November 1987).
< % Version 2.8 fixes weird bug if no \patterns are used (December 1987).
< % Version 2.9 makes \csname\endcsname's "relax" local (December 1987).
< % Version 2.91 fixes \outer\def\a0{}\a\a bug (April 1988).
< % Version 2.92 fixes \patterns, also file names with complex macros (May 1988).
< % Version 2.93 fixes negative halving in allocator when mem_min<0 (June 1988).
< % Version 2.94 keeps open_log_file from calling fatal_error (November 1988).
< % Version 2.95 solves that problem a better way (December 1988).
< % Version 2.96 corrects bug in "Infinite shrinkage" recovery (January 1989).
< % Version 2.97 corrects blunder in creating 2.95 (February 1989).
< % Version 2.98 omits save_for_after at outer level (March 1989).
< % Version 2.99 catches $$\begingroup\halign...$$ (June 1989).
---
> % Version 2.1 corrected anomalies in discretionary breaks (January 1987).
> % Version 2.2 corrected "(Please type...)" with null \endlinechar (April 1987).
> % Version 2.3 avoided incomplete page in premature termination (August 1987).
> % Version 2.4 fixed \noaligned rules in indented displays (August 1987).
> % Version 2.5 saved cur_order when expanding tokens (September 1987).
> % Version 2.6 added 10sp slop when shipping leaders (November 1987).
> % Version 2.7 improved rounding of negative-width characters (November 1987).
> % Version 2.8 fixed weird bug if no \patterns are used (December 1987).
> % Version 2.9 made \csname\endcsname's "relax" local (December 1987).
> % Version 2.91 fixed \outer\def\a0{}\a\a bug (April 1988).
> % Version 2.92 fixed \patterns, also file names with complex macros (May 1988).
> % Version 2.93 fixed negative halving in allocator when mem_min<0 (June 1988).
> % Version 2.94 kept open_log_file from calling fatal_error (November 1988).
> % Version 2.95 solved that problem a better way (December 1988).
> % Version 2.96 corrected bug in "Infinite shrinkage" recovery (January 1989).
> % Version 2.97 corrected blunder in creating 2.95 (February 1989).
> % Version 2.98 omitted save_for_after at outer level (March 1989).
> % Version 2.99 caught $$\begingroup\halign..$$ (June 1989).
> % Version 2.991 caught .5\ifdim.6... (June 1989).
> % Version 2.992 introduced major changes for 8-bit extensions (September 1989).
38c40,41
< % A reward of $163.84 will be paid to the first finder of any remaining bug.
---
> % A reward of $163.84 will be paid to the first finder of any remaining bug,
> % not counting changes introduced in September 1989.
84,85c87,90
< @* \[1] Introduction.
< This is \TeX, a document compiler intended to produce high-quality typesetting.
---
> 
> @* \[1] Introduction.
> This is \TeX, a document compiler intended to produce typesetting of high
> quality.
151a157,159
> A final revision in September 1989 extended the input character set to
> eight-bit codes and introduced the ability to hyphenate words from
> different languages, based on some ideas of Michael~J. Ferguson.
153a162
> @^Ferguson, Michael John@>
176c185
< @d banner=='This is TeX, Version 2.99' {printed when \TeX\ starts}
---
> @d banner=='This is TeX, Version 2.992' {printed when \TeX\ starts}
397a407
> @!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
495c505,506
< @* \[2] The character set.
---
> 
> @* \[2] The character set.
497,498c508,509
< computers, all of its input text is converted to an internal seven-bit
< code that is essentially standard ASCII, the ``American Standard Code for
---
> computers, all of its input text is converted to an internal eight-bit
> code that includes standard ASCII, the ``American Standard Code for
523c534
< @!ASCII_code=0..127; {seven-bit numbers}
---
> @!ASCII_code=0..255; {eight-bit numbers}
553c564
< @d last_text_char=127 {ordinal number of the largest element of |text_char|}
---
> @d last_text_char=255 {ordinal number of the largest element of |text_char|}
556c567
< @!i:0..last_text_char;
---
> @!i:integer;
568,574c579,585
< @ Since we are assuming that our \PASCAL\ system is able to read and write the
< visible characters of standard ASCII (although not necessarily using the
< ASCII codes to represent them), the following assignment statements initialize
< most of the |xchr| array properly, without needing any system-dependent
< changes. On the other hand, it is possible to implement \TeX\ with
< less complete character sets, and in such cases it will be necessary to
< change something here.
---
> @ Since we are assuming that our \PASCAL\ system is able to read and
> write the visible characters of standard ASCII (although not
> necessarily using the ASCII codes to represent them), the following
> assignment statements initialize the standard part of the |xchr| array
> properly, without needing any system-dependent changes. On the other
> hand, it is possible to implement \TeX\ with less complete character
> sets, and in such cases it will be necessary to change something here.
673,674d683
< xchr[0]:=' '; xchr[@'177]:=' ';
<   {ASCII codes 0 and |@'177| do not appear in text}
681c690
< @d invalid_code=@'177 {ASCII code that should not appear}
---
> @d invalid_code=@'177 {ASCII code that many systems prohibit in text files}
693c702
< what codes are specified in |xchr[1..@'37]|, but the safest policy is to
---
> what codes are specified in |xchr[0..@'37]|, but the safest policy is to
698,702c707
< like `\.^^Z' instead of `\.{\\ne}'. At MIT, for example, it would be more
< appropriate to substitute the code
< $$\hbox{|for i:=1 to @'37 do xchr[i]:=chr(i);|}$$
< \TeX's character set is essentially the same as MIT's, even with respect to
< characters less than~@'40. People with extended character sets can
---
> like `\.^^Z' instead of `\.{\\ne}'. People with extended character sets can
708a714,715
> To get the most ``permissive'' character set, change |' '| on the
> right of these assignment statements to |chr(i)|.
713c720,721
< for i:=1 to @'37 do xchr[i]:=' ';
---
> for i:=0 to @'37 do xchr[i]:=' ';
> for i:=@'177 to @'377 do xchr[i]:=' ';
723,724c731,734
< for i:=1 to @'176 do xord[xchr[i]]:=i;
< @* \[3] Input and output.
---
> for i:=@'200 to @'377 do xord[xchr[i]]:=i;
> for i:=0 to @'176 do xord[xchr[i]]:=i;
> 
> @* \[3] Input and output.
932,933c942
<         overflow("buffer size",buf_size);
< @:TeX capacity exceeded buffer size}{\quad buffer size@>
---
>         @<Report overflow of the input buffer, and abort@>;
995a1005,1018
> The first line is special also because it may be read before \TeX\ has
> input a format file. In such cases, normal error messages cannot yet
> be given. The following code uses concepts that will be explained later.
> 
> @<Report overflow of the input buffer, and abort@>=
> if format_ident=0 then
>   begin write_ln(term_out,'Buffer size exceeded!'); goto final_end;
> @.Buffer size exceeded@>
>   end
> else begin cur_input.loc_field:=first; cur_input.limit_field:=last-1;
>   overflow("buffer size",buf_size);
> @:TeX capacity exceeded buffer size}{\quad buffer size@>
>   end
> 
1050c1073,1074
< @* \[4] String handling.
---
> 
> @* \[4] String handling.
1052c1076
< of seven-bit characters. Since \PASCAL\ does not have a well-developed string
---
> of eight-bit characters. Since \PASCAL\ does not have a well-developed string
1057c1081
< The array |str_pool| contains all of the (seven-bit) ASCII codes in all
---
> The array |str_pool| contains all of the (eight-bit) ASCII codes in all
1067c1091
< String numbers 0 to 127 are reserved for strings that correspond to single
---
> String numbers 0 to 255 are reserved for strings that correspond to single
1075c1099
< into some integer greater than~127. String number 46 will presumably be the
---
> into some integer greater than~255. String number 46 will presumably be the
1078,1079c1102,1103
< ASCII character, so the first 128 strings are used to specify exactly what
< should be printed for each of the 128 possibilities.
---
> ASCII character, so the first 256 strings are used to specify exactly what
> should be printed for each of the 256 possibilities.
1085a1110,1116
> Some \PASCAL\ compilers won't pack integers into a single byte unless the
> integers lie in the range |-128..127|. To accommodate such systems
> we access the string pool only via macros that can easily be redefined.
> 
> @d si(#) == # {convert from |ASCII_code| to |packed_ASCII_code|}
> @d so(#) == # {convert from |packed_ASCII_code| to |ASCII_code|}
> 
1088a1120
> @!packed_ASCII_code = 0..255; {elements of |str_pool| array}
1091c1123
< @!str_pool:packed array[pool_pointer] of ASCII_code; {the characters}
---
> @!str_pool:packed array[pool_pointer] of packed_ASCII_code; {the characters}
1123c1155
< begin str_pool[pool_ptr]:=#; incr(pool_ptr);
---
> begin str_pool[pool_ptr]:=si(#); incr(pool_ptr);
1163c1195
<   begin if str_pool[j]<>buffer[k] then
---
>   begin if so(str_pool[j])<>buffer[k] then
1200c1232
< var k,@!l:0..127; {small indices or counters}
---
> var k,@!l:0..255; {small indices or counters}
1206c1238
< @<Make the first 128 strings@>;
---
> @<Make the first 256 strings@>;
1212,1213c1244,1248
< @ @<Make the first 128...@>=
< for k:=0 to 127 do
---
> @ @d app_lc_hex(#)==l:=#;
>   if l<10 then append_char(l+"0")@+else append_char(l-10+"a")
> 
> @<Make the first 256...@>=
> for k:=0 to 255 do
1217c1252,1254
<     else append_char(k-@'100);
---
>     else if k<@'200 then append_char(k-@'100)
>     else begin app_lc_hex(k div 16); app_lc_hex(k mod 16);
>       end;
1234,1239c1271,1283
< The boolean expression defined here should be |true| unless \TeX\ internal
< code number~|k| corresponds to a non-troublesome visible symbol in the
< local character set.  At MIT, for example, the appropriate formula would
< be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177]|'.  If character |k| cannot be
< printed, then character |k+@'100| or |k-@'100| must be printable; thus, at
< least 64 printable characters are needed.
---
> Unprintable characters of codes 128--255 are, similarly, rendered
> \.{\^\^80}--\.{\^\^ff}.
> 
> The boolean expression defined here should be |true| unless \TeX\
> internal code number~|k| corresponds to a non-troublesome visible
> symbol in the local character set.  An appropriate formula for the
> extended character set recommended in {\sl The \TeX book\/} would, for
> example, be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177..@'377]|'.
> If character |k| cannot be printed, and |k<@'200|, then character |k+@'100| or
> |k-@'100| must be printable; moreover, ASCII codes |[@'41..@'46,
> @'60..@'71, @'141..@'146, @'160..@'171]| must be printable.
> Thus, at least 80 printable characters are needed.
> @:TeXbook}{\sl The \TeX book@>
1312c1356,1357
< @* \[5] On-line and off-line printing.
---
> 
> @* \[5] On-line and off-line printing.
1457c1502
< else if s<128 then
---
> else if s<256 then
1465c1510
<   begin print_char(str_pool[j]); incr(j);
---
>   begin print_char(so(str_pool[j])); incr(j);
1478c1523
< else if s<128 then
---
> else if s<256 then
1486c1531
<   begin print(str_pool[j]); incr(j);
---
>   begin print(so(str_pool[j])); incr(j);
1520c1565
< if c>=0 then if c<128 then print(c);
---
> if c>=0 then if c<256 then print(c);
1580,1581c1625,1627
< @ In certain situations, \TeX\ prints either a standard visible ASCII
< character or its hexadecimal ASCII code.
---
> @ Old versions of \TeX\ needed a procedure called |print_ASCII| whose function
> is now subsumed by |print|. We retain the old name here as a possible aid to
> future software arch\ae ologists.
1583,1589c1629
< @p procedure print_ASCII(@!c:integer); {prints a character or its code}
< begin if (c>=0) and (c<=127) then print(c)
< else  begin print_char("[");
<   if c<0 then print_int(c)@+else print_hex(c);
<   print_char("]");
<   end;
< end;
---
> @d print_ASCII == print
1601c1641
<     begin print_char(str_pool[j]); n:=n-v;
---
>     begin print_char(so(str_pool[j])); n:=n-v;
1604,1606c1644,1646
<   k:=j+2; u:=v div (str_pool[k-1]-"0");
<   if str_pool[k-1]="2" then
<     begin k:=k+2; u:=u div (str_pool[k-1]-"0");
---
>   k:=j+2; u:=v div (so(str_pool[k-1])-"0");
>   if str_pool[k-1]=si("2") then
>     begin k:=k+2; u:=u div (so(str_pool[k-1])-"0");
1609c1649
<     begin print_char(str_pool[k]); n:=n+u;
---
>     begin print_char(so(str_pool[k])); n:=n+u;
1611c1651
<   else  begin j:=j+2; v:=v div (str_pool[j-1]-"0");
---
>   else  begin j:=j+2; v:=v div (so(str_pool[j-1])-"0");
1623c1663
<   begin print_char(str_pool[j]); incr(j);
---
>   begin print_char(so(str_pool[j])); incr(j);
1647c1687,1688
< @* \[6] Reporting errors.
---
> 
> @* \[6] Reporting errors.
1841c1882
<   @<Delete |c-"0"| tokens and |goto continue|@>;
---
>   @<Delete \(c)|c-"0"| tokens and |goto continue|@>;
1908c1949
< @<Delete |c-"0"| tokens...@>=
---
> @<Delete \(c)|c-"0"| tokens...@>=
2044c2085
< highest interaction level and lets the user have the full flexibility of
---
> highest interaction level and lets the user have nearly the full flexibility of
2062c2103,2104
< @* \[7] Arithmetic with scaled dimensions.
---
> 
> @* \[7] Arithmetic with scaled dimensions.
2173c2215,2216
< and~|y| are |scaled| and |n| is an integer.
---
> and~|y| are |scaled| and |n| is an integer. We will also use it to
> multiply integers.
2175c2218,2221
< @p function nx_plus_y(@!n:integer;@!x,@!y:scaled):scaled;
---
> @d nx_plus_y(#)==mult_and_add(#,@'7777777777)
> @d mult_integers(#)==mult_and_add(#,0,@'17777777777)
> 
> @p function mult_and_add(@!n:integer;@!x,@!y,@!max_answer:scaled):scaled;
2179,2182c2225,2228
< if n=0 then nx_plus_y:=y
< else if ((x<=(@'7777777777-y) div n)and(-x<=(@'7777777777+y) div n)) then
<   nx_plus_y:=n*x+y
< else  begin arith_error:=true; nx_plus_y:=0;
---
> if n=0 then mult_and_add:=y
> else if ((x<=(max_answer-y) div n)and(-x<=(max_answer+y) div n)) then
>   mult_and_add:=n*x+y
> else  begin arith_error:=true; mult_and_add:=0;
2288c2334,2335
< @* \[8] Packed data.
---
> 
> @* \[8] Packed data.
2428c2475,2476
< @* \[9] Dynamic memory allocation.
---
> 
> @* \[9] Dynamic memory allocation.
2716c2764
< while p<>old_rover do @<Sort |p| into the list starting at |rover|
---
> while p<>old_rover do @<Sort \(p)|p| into the list starting at |rover|
2729c2777
< @<Sort |p|...@>=
---
> @<Sort \(p)|p|...@>=
2737c2785,2786
< @* \[10] Data structures for boxes and their friends.
---
> 
> @* \[10] Data structures for boxes and their friends.
2766,2767c2815
< are accessed outside of math mode only via ligatures and the \.{\\char}
< operator).
---
> are more difficult to access on most keyboards).
2912,2918c2960,2969
< @ A |ligature_node|, which occurs only in horizontal lists, specifies a
< composite character that was formed from two or more actual characters.
< The second word of the node, which is called the |lig_char| word, contains
< |font| and |character| fields just as in a |char_node|. The characters
< that generated the ligature have not been forgotten, since they are needed
< for diagnostic messages and for hyphenation; the |lig_ptr| field points to
< a linked list of character nodes for those characters.
---
> @ A |ligature_node|, which occurs only in horizontal lists, specifies
> a character that was fabricated from the interaction of two or more
> actual characters.  The second word of the node, which is called the
> |lig_char| word, contains |font| and |character| fields just as in a
> |char_node|. The characters that generated the ligature have not been
> forgotten, since they are needed for diagnostic messages and for
> hyphenation; the |lig_ptr| field points to a linked list of character
> nodes for all original characters that have been deleted. (This list
> might be empty if the characters that generated the ligature were
> retained in other nodes.)
2919a2971,2973
> The |subtype| field is 0, plus 2 and/or 1 if the original source of the
> ligature included implicit left and/or right boundaries.
> 
2925c2979,2982
< contents of the |font|, |character|, and |lig_ptr| fields.
---
> contents of the |font|, |character|, and |lig_ptr| fields. We also have
> a |new_lig_item| function, which returns a two-word node having a given
> |character| field. Such nodes are used for temporary processing as ligatures
> are being created.
2930d2986
< subtype(p):=0; {the |subtype| is not used}
2932c2988
< new_ligature:=p;
---
> subtype(p):=0; new_ligature:=p;
2933a2990,2995
> @#
> function new_lig_item(@!c:quarterword):pointer;
> var p:pointer; {the new node}
> begin p:=get_node(small_node_size); character(p):=c; lig_ptr(p):=null;
> new_lig_item:=p;
> end;
3200c3262,3263
< @* \[11] Memory layout.
---
> 
> @* \[11] Memory layout.
3283c3346
< @t\hskip1em@>@!was_free: packed array [mem_min..mem_max] of boolean;
---
> @t\hskip10pt@>@!was_free: packed array [mem_min..mem_max] of boolean;
3285c3348
< @t\hskip1em@>@!was_mem_end,@!was_lo_max,@!was_hi_min: pointer;
---
> @t\hskip10pt@>@!was_mem_end,@!was_lo_max,@!was_hi_min: pointer;
3287c3350
< @t\hskip1em@>@!panicking:boolean; {do we want to check memory constantly?}
---
> @t\hskip10pt@>@!panicking:boolean; {do we want to check memory constantly?}
3410c3473,3474
< @* \[12] Displaying boxes.
---
> 
> @* \[12] Displaying boxes.
3735,3736c3799,3802
< font_in_short_display:=font(lig_char(p));
< short_display(lig_ptr(p)); print_char(")");
---
> if subtype(p)>1 then print_char("|");
> font_in_short_display:=font(lig_char(p)); short_display(lig_ptr(p));
> if odd(subtype(p)) then print_char("|");
> print_char(")");
3776c3842,3843
< @* \[13] Destroying boxes.
---
> 
> @* \[13] Destroying boxes.
3849c3916,3917
< @* \[14] Copying boxes.
---
> 
> @* \[14] Copying boxes.
3936c4004,4005
< @* \[15] The command codes.
---
> 
> @* \[15] The command codes.
4032,4037c4101,4107
< @d radical=65 {square root and similar signs ( \.{\\radical} )}
< @d end_cs_name=66 {end control sequence ( \.{\\endcsname} )}
< @d min_internal=67 {the smallest code that can follow \.{\\the}}
< @d char_given=67 {character code defined by \.{\\chardef}}
< @d math_given=68 {math code defined by \.{\\mathchardef}}
< @d last_item=69 {most recent item ( \.{\\lastpenalty},
---
> @d no_boundary=65 {suppress boundary ligatures ( \.{\\noboundary} )}
> @d radical=66 {square root and similar signs ( \.{\\radical} )}
> @d end_cs_name=67 {end control sequence ( \.{\\endcsname} )}
> @d min_internal=68 {the smallest code that can follow \.{\\the}}
> @d char_given=68 {character code defined by \.{\\chardef}}
> @d math_given=69 {math code defined by \.{\\mathchardef}}
> @d last_item=70 {most recent item ( \.{\\lastpenalty},
4039c4109
< @d max_non_prefixed_command=69 {largest command code that can't be \.{\\global}}
---
> @d max_non_prefixed_command=70 {largest command code that can't be \.{\\global}}
4046,4053c4116,4123
< @d toks_register=70 {token list register ( \.{\\toks} )}
< @d assign_toks=71 {special token list ( \.{\\output}, \.{\\everypar}, etc.~)}
< @d assign_int=72 {user-defined integer ( \.{\\tolerance}, \.{\\day}, etc.~)}
< @d assign_dimen=73 {user-defined length ( \.{\\hsize}, etc.~)}
< @d assign_glue=74 {user-defined glue ( \.{\\baselineskip}, etc.~)}
< @d assign_mu_glue=75 {user-defined muglue ( \.{\\thinmuskip}, etc.~)}
< @d assign_font_dimen=76 {user-defined font dimension ( \.{\\fontdimen} )}
< @d assign_font_int=77 {user-defined font integer ( \.{\\hyphenchar},
---
> @d toks_register=71 {token list register ( \.{\\toks} )}
> @d assign_toks=72 {special token list ( \.{\\output}, \.{\\everypar}, etc.~)}
> @d assign_int=73 {user-defined integer ( \.{\\tolerance}, \.{\\day}, etc.~)}
> @d assign_dimen=74 {user-defined length ( \.{\\hsize}, etc.~)}
> @d assign_glue=75 {user-defined glue ( \.{\\baselineskip}, etc.~)}
> @d assign_mu_glue=76 {user-defined muglue ( \.{\\thinmuskip}, etc.~)}
> @d assign_font_dimen=77 {user-defined font dimension ( \.{\\fontdimen} )}
> @d assign_font_int=78 {user-defined font integer ( \.{\\hyphenchar},
4055,4058c4125,4128
< @d set_aux=78 {specify state info ( \.{\\spacefactor}, \.{\\prevdepth} )}
< @d set_prev_graf=79 {specify state info ( \.{\\prevgraf} )}
< @d set_page_dimen=80 {specify state info ( \.{\\pagegoal}, etc.~)}
< @d set_page_int=81 {specify state info ( \.{\\deadcycles},
---
> @d set_aux=79 {specify state info ( \.{\\spacefactor}, \.{\\prevdepth} )}
> @d set_prev_graf=80 {specify state info ( \.{\\prevgraf} )}
> @d set_page_dimen=81 {specify state info ( \.{\\pagegoal}, etc.~)}
> @d set_page_int=82 {specify state info ( \.{\\deadcycles},
4060,4079c4130,4149
< @d set_box_dimen=82 {change dimension of box ( \.{\\wd}, \.{\\ht}, \.{\\dp} )}
< @d set_shape=83 {specify fancy paragraph shape ( \.{\\parshape} )}
< @d def_code=84 {define a character code ( \.{\\catcode}, etc.~)}
< @d def_family=85 {declare math fonts ( \.{\\textfont}, etc.~)}
< @d set_font=86 {set current font ( font identifiers )}
< @d def_font=87 {define a font file ( \.{\\font} )}
< @d register=88 {internal register ( \.{\\count}, \.{\\dimen}, etc.~)}
< @d max_internal=88 {the largest code that can follow \.{\\the}}
< @d advance=89 {advance a register or parameter ( \.{\\advance} )}
< @d multiply=90 {multiply a register or parameter ( \.{\\multiply} )}
< @d divide=91 {divide a register or parameter ( \.{\\divide} )}
< @d prefix=92 {qualify a definition ( \.{\\global}, \.{\\long}, \.{\\outer} )}
< @d let=93 {assign a command code ( \.{\\let}, \.{\\futurelet} )}
< @d shorthand_def=94 {code definition ( \.{\\chardef}, \.{\\countdef}, etc.~)}
< @d read_to_cs=95 {read into a control sequence ( \.{\\read} )}
< @d def=96 {macro definition ( \.{\\def}, \.{\\gdef}, \.{\\xdef}, \.{\\edef} )}
< @d set_box=97 {set a box ( \.{\\setbox} )}
< @d hyph_data=98 {hyphenation data ( \.{\\hyphenation}, \.{\\patterns} )}
< @d set_interaction=99 {define level of interaction ( \.{\\batchmode}, etc.~)}
< @d max_command=99 {the largest command code seen at |big_switch|}
---
> @d set_box_dimen=83 {change dimension of box ( \.{\\wd}, \.{\\ht}, \.{\\dp} )}
> @d set_shape=84 {specify fancy paragraph shape ( \.{\\parshape} )}
> @d def_code=85 {define a character code ( \.{\\catcode}, etc.~)}
> @d def_family=86 {declare math fonts ( \.{\\textfont}, etc.~)}
> @d set_font=87 {set current font ( font identifiers )}
> @d def_font=88 {define a font file ( \.{\\font} )}
> @d register=89 {internal register ( \.{\\count}, \.{\\dimen}, etc.~)}
> @d max_internal=89 {the largest code that can follow \.{\\the}}
> @d advance=90 {advance a register or parameter ( \.{\\advance} )}
> @d multiply=91 {multiply a register or parameter ( \.{\\multiply} )}
> @d divide=92 {divide a register or parameter ( \.{\\divide} )}
> @d prefix=93 {qualify a definition ( \.{\\global}, \.{\\long}, \.{\\outer} )}
> @d let=94 {assign a command code ( \.{\\let}, \.{\\futurelet} )}
> @d shorthand_def=95 {code definition ( \.{\\chardef}, \.{\\countdef}, etc.~)}
> @d read_to_cs=96 {read into a control sequence ( \.{\\read} )}
> @d def=97 {macro definition ( \.{\\def}, \.{\\gdef}, \.{\\xdef}, \.{\\edef} )}
> @d set_box=98 {set a box ( \.{\\setbox} )}
> @d hyph_data=99 {hyphenation data ( \.{\\hyphenation}, \.{\\patterns} )}
> @d set_interaction=100 {define level of interaction ( \.{\\batchmode}, etc.~)}
> @d max_command=100 {the largest command code seen at |big_switch|}
4106c4176,4177
< @* \[16] The semantic nest.
---
> 
> @* \[16] The semantic nest.
4176c4247
< \yskip\hang|aux| is an auxiliary integer that gives further information
---
> \yskip\hang|aux| is an auxiliary |memory_word| that gives further information
4184,4185c4255,4258
< known as |space_factor|; it holds the current space factor used in spacing
< calculations. In math mode, |aux| is also known as |incompleat_noad|; if
---
> known as |space_factor| and |clang|; it holds the current space factor used in
> spacing calculations, and the current language used for hyphenation.
> (The value of |clang| is undefined in restricted horizontal mode.)
> In math mode, |aux| is also known as |incompleat_noad|; if
4208c4281,4282
<   @!pg_field,@!aux_field,@!ml_field: integer;
---
>   @!pg_field,@!ml_field: integer;
>   @!aux_field: memory_word;
4216,4218c4290,4293
< @d prev_depth==aux {the name of |aux| in vertical mode}
< @d space_factor==aux {the name of |aux| in horizontal mode}
< @d incompleat_noad==aux {the name of |aux| in math mode}
---
> @d prev_depth==aux.sc {the name of |aux| in vertical mode}
> @d space_factor==aux.hh.lh {part of |aux| in horizontal mode}
> @d clang==aux.hh.rh {the other part of |aux| in horizontal mode}
> @d incompleat_noad==aux.int {the name of |aux| in math mode}
4221d4295
< 
4277c4351
< @!a:integer; {auxiliary}
---
> @!a:memory_word; {auxiliary}
4300,4301c4374,4375
<   if a<=ignore_depth then print("ignored")
<   else print_scaled(a);
---
>   if a.sc<=ignore_depth then print("ignored")
>   else print_scaled(a.sc);
4308c4382,4385
< 1: begin print_nl("spacefactor "); print_int(a);
---
> 1: begin print_nl("spacefactor "); print_int(a.hh.lh);
>   if m>0 then if a.hh.rh>0 then
>     begin print(", current language "); print_int(a.hh.rh);
>     end;
4310,4311c4387,4388
< 2: if a<>null then
<   begin print("this will be denominator of:"); show_box(a);
---
> 2: if a.int<>null then
>   begin print("this will be denominator of:"); show_box(a.int);
4314c4391,4392
< @* \[17] The table of equivalents.
---
> 
> @* \[17] The table of equivalents.
4386,4387c4464,4465
< In the first region we have 128 equivalents for ``active characters'' that
< act as control sequences, followed by 128 equivalents for single-character
---
> In the first region we have 256 equivalents for ``active characters'' that
> act as control sequences, followed by 256 equivalents for single-character
4397,4398c4475,4476
< @d single_base=active_base+128 {equivalents of one-letter control sequences}
< @d null_cs=single_base+128 {equivalent of \.{\\csname\\endcsname}}
---
> @d single_base=active_base+256 {equivalents of one-character control sequences}
> @d null_cs=single_base+256 {equivalent of \.{\\csname\\endcsname}}
4595c4673
< bulk of this region is taken up by five tables that are indexed by seven-bit
---
> bulk of this region is taken up by five tables that are indexed by eight-bit
4616,4621c4694,4699
<   {table of 128 command codes (the ``catcodes'')}
< @d lc_code_base=cat_code_base+128 {table of 128 lowercase mappings}
< @d uc_code_base=lc_code_base+128 {table of 128 uppercase mappings}
< @d sf_code_base=uc_code_base+128 {table of 128 spacefactor mappings}
< @d math_code_base=sf_code_base+128 {table of 128 math mode mappings}
< @d int_base=math_code_base+128 {beginning of region 5}
---
>   {table of 256 command codes (the ``catcodes'')}
> @d lc_code_base=cat_code_base+256 {table of 256 lowercase mappings}
> @d uc_code_base=lc_code_base+256 {table of 256 uppercase mappings}
> @d sf_code_base=uc_code_base+256 {table of 256 spacefactor mappings}
> @d math_code_base=sf_code_base+256 {table of 256 math mode mappings}
> @d int_base=math_code_base+256 {beginning of region 5}
4706c4784
< for k:=0 to 127 do
---
> for k:=0 to 255 do
4839c4917,4922
< @d int_pars=50 {total number of integer parameters}
---
> @d language_code=50 {current hyphenation table}
> @d left_hyphen_min_code=51 {minimum left hyphenation fragment size}
> @d right_hyphen_min_code=52 {minimum right hyphenation fragment size}
> @d holding_inserts_code=53 {do not remove insertion nodes from \.{\\box255}}
> @d error_context_lines_code=54 {maximum intermediate line pairs shown}
> @d int_pars=55 {total number of integer parameters}
4841,4842c4924,4925
< @d del_code_base=count_base+256 {128 delimiter code mappings}
< @d dimen_base=del_code_base+128 {beginning of region 6}
---
> @d del_code_base=count_base+256 {256 delimiter code mappings}
> @d dimen_base=del_code_base+256 {beginning of region 6}
4896a4980,4984
> @d language==int_par(language_code)
> @d left_hyphen_min==int_par(left_hyphen_min_code)
> @d right_hyphen_min==int_par(right_hyphen_min_code)
> @d holding_inserts==int_par(holding_inserts_code)
> @d error_context_lines==int_par(error_context_lines_code)
4955a5044,5048
> language_code:print_esc("language");
> left_hyphen_min_code:print_esc("lefthyphenmin");
> right_hyphen_min_code:print_esc("righthyphenmin");
> holding_inserts_code:print_esc("holdinginserts");
> error_context_lines_code:print_esc("errorcontextlines");
5065a5159,5168
> primitive("language",assign_int,int_base+language_code);@/
> @!@:language_}{\.{\\language} primitive@>
> primitive("lefthyphenmin",assign_int,int_base+left_hyphen_min_code);@/
> @!@:left_hyphen_min_}{\.{\\lefthyphenmin} primitive@>
> primitive("righthyphenmin",assign_int,int_base+right_hyphen_min_code);@/
> @!@:right_hyphen_min_}{\.{\\righthyphenmin} primitive@>
> primitive("holdinginserts",assign_int,int_base+holding_inserts_code);@/
> @!@:holding_inserts_}{\.{\\holdinginserts} primitive@>
> primitive("errorcontextlines",assign_int,int_base+error_context_lines_code);@/
> @!@:error_context_lines_}{\.{\\errorcontextlines} primitive@>
5081c5184
< for k:=0 to 127 do del_code(k):=-1;
---
> for k:=0 to 255 do del_code(k):=-1;
5160c5263,5264
< @d dimen_pars=20 {total number of dimension parameters}
---
> @d emergency_stretch_code=20 {reduces badnesses on final pass of line-breaking}
> @d dimen_pars=21 {total number of dimension parameters}
5186a5291
> @d emergency_stretch==dimen_par(emergency_stretch_code)
5209a5315
> emergency_stretch_code:print_esc("emergencystretch");
5256a5363,5364
> primitive("emergencystretch",assign_dimen,dimen_base+emergency_stretch_code);@/
> @!@:emergency_stretch_}{\.{\\emergencystretch} primitive@>
5311c5419,5420
< @* \[18] The hash table.
---
> 
> @* \[18] The hash table.
5474c5583
< begin if s<128 then cur_val:=s+single_base
---
> begin if s<256 then cur_val:=s+single_base
5477c5586
<   for j:=0 to l-1 do buffer[j]:=str_pool[k+j];
---
>   for j:=0 to l-1 do buffer[j]:=so(str_pool[k+j]);
5542a5652,5653
> primitive("noboundary",no_boundary,0);@/
> @!@:no_boundary_}{\.{\\noboundary} primitive@>
5610a5722
> no_boundary:print_esc("noboundary");
5642c5754,5755
< @* \[19] Saving and restoring equivalents.
---
> 
> @* \[19] Saving and restoring equivalents.
5978c6091,6092
< @* \[20] Token lists.
---
> 
> @* \[20] Token lists.
5985c6099
< |cs_token_flag+p|. Here |cs_token_flag=@t$2^{12}$@>| is larger than
---
> |cs_token_flag+p|. Here |cs_token_flag=@t$2^{12}-1$@>| is larger than
5996,5997c6110,6111
< @d cs_token_flag==@'10000 {amount added to the |eqtb| location in a
<   token that stands for a control sequence; is a multiple of~256}
---
> @d cs_token_flag==@'7777 {amount added to the |eqtb| location in a
>   token that stands for a control sequence; is a multiple of~256, less~1}
6110c6224
<   if (info(p)<0)or(c>127) then print_esc("BAD.")
---
>   if info(p)<0 then print_esc("BAD.")
6143c6257
< begin if p<>null then show_token_list(link(p),null,1000);
---
> begin if p<>null then show_token_list(link(p),null,10000000);
6159c6273,6274
< @* \[21] Introduction to the syntactic routines.
---
> 
> @* \[21] Introduction to the syntactic routines.
6254c6369,6370
< @* \[22] Input stacks and states.
---
> 
> @* \[22] Input stacks and states.
6307c6423
< (Incidentally, on a machine with byte-oriented addressing, it would be
---
> (Incidentally, on a machine with byte-oriented addressing, it might be
6589a6706,6707
> @!nn:integer; {number of contexts shown so far, less one}
> @!bottom_line:boolean; {have we reached the final context to be shown?}
6592a6711
> nn:=-1; bottom_line:=false;
6594d6712
<   @<Display the current context@>;
6596c6714,6720
<     if (name>17) or (base_ptr=0) then goto done;
---
>     if (name>17) or (base_ptr=0) then bottom_line:=true;
>   if (base_ptr=input_ptr)or bottom_line or(nn<error_context_lines) then
>     @<Display the current context@>
>   else if nn=error_context_lines then
>     begin print_nl("..."); incr(nn); {omitted if |error_context_lines<0|}
>     end;
>   if bottom_line then goto done;
6603c6727
< if (base_ptr=input_ptr) or (state<>token_list) or
---
> begin if (base_ptr=input_ptr) or (state<>token_list) or
6617c6741,6743
<   end
---
>   incr(nn);
>   end;
> end
6753c6879,6880
< @* \[23] Maintaining the input stacks.
---
> 
> @* \[23] Maintaining the input stacks.
6911c7038,7039
< @* \[24] Getting the next token.
---
> 
> @* \[24] Getting the next token.
7051a7180,7181
> @!c,@!cc:ASCII_code; {constituents of a possible expanded code}
> @!d:2..3; {number of excess characters in an expanded code}
7104,7105c7234,7236
< any_state_plus(sup_mark): @<If this |sup_mark| starts a control character
<   like~\.{\^\^A}, then |goto reswitch|, otherwise set |state:=mid_line|@>;
---
> any_state_plus(sup_mark): @<If this |sup_mark| starts an expanded character
>   like~\.{\^\^A} or~\.{\^\^df}, then |goto reswitch|,
>   otherwise set |state:=mid_line|@>;
7173,7177c7304,7323
< @ @<If this |sup_mark| starts a control character...@>=
< begin if (cur_chr=buffer[loc])and(loc<limit) then
<   begin if buffer[loc+1]<@'100 then cur_chr:=buffer[loc+1]+@'100
<   else cur_chr:=buffer[loc+1]-@'100;
<   loc:=loc+2; goto reswitch;
---
> @ Notice that a code like \.{\^\^8} becomes \.x if not followed by a hex digit.
>  
> @d is_hex(#)==(((#>="0")and(#<="9"))or((#>="a")and(#<="f")))
> @d hex_to_cur_chr==
>   if c<="9" then cur_chr:=c-"0" @+else cur_chr:=c-"a"+10;
>   if cc<="9" then cur_chr:=16*cur_chr+cc-"0"
>   else cur_chr:=16*cur_chr+cc-"a"+10
> 
> @<If this |sup_mark| starts an expanded character...@>=
> begin if cur_chr=buffer[loc] then if loc<limit then
>   begin c:=buffer[loc+1]; @+if c<@'200 then {yes we have an expanded char}
>     begin loc:=loc+2; 
>     if is_hex(c) then if loc<=limit then
>       begin cc:=buffer[loc]; @+if is_hex(cc) then
>         begin incr(loc); hex_to_cur_chr; goto reswitch;
>         end;
>       end;
>     if c<@'100 then cur_chr:=c+@'100 @+else cur_chr:=c-@'100;
>     goto reswitch;
>     end;
7198c7344,7345
< If expanded control characters like `\.{\^\^A}' appear in or just following
---
> If expanded characters like `\.{\^\^A}' or `\.{\^\^df}'
> appear in or just following
7211c7358
<     if an expanded control code is encountered, reduce it
---
>     if an expanded code is encountered, reduce it
7215,7216c7362
<   else @<If an expanded control code is present, reduce it
<     and |goto start_cs|@>;
---
>   else @<If an expanded code is present, reduce it and |goto start_cs|@>;
7225c7371,7372
< expanded control code like \.{\^\^A} appears in |buffer[(k-1)..(k+1)]|, we
---
> expanded code like \.{\^\^A} or \.{\^\^df} appears in |buffer[(k-1)..(k+1)]|
> or |buffer[(k-1)..(k+2)]|, we
7227,7228c7374
< the buffer left two places.  The value of |cur_chr| may be changed here,
< but not the value of |cat|.
---
> the buffer left two or three places.
7231,7237c7377,7392
< begin if buffer[k]=cur_chr then if cat=sup_mark then if k<limit then
<   begin cur_chr:=buffer[k+1];
<   if cur_chr<@'100 then buffer[k-1]:=cur_chr+@'100
<   else buffer[k-1]:=cur_chr-@'100;
<   limit:=limit-2; first:=first-2;
<   while k<=limit do
<     begin buffer[k]:=buffer[k+2]; incr(k);
---
> begin if buffer[k]=cur_chr then @+if cat=sup_mark then @+if k<limit then
>   begin c:=buffer[k+1]; @+if c<@'200 then {yes, one is indeed present}
>     begin d:=2;
>     if is_hex(c) then @+if k+2<=limit then
>       begin cc:=buffer[k+2]; @+if is_hex(cc) then incr(d);
>       end;
>     if d>2 then
>       begin hex_to_cur_chr; buffer[k-1]:=cur_chr;
>       end
>     else if c<@'100 then buffer[k-1]:=c+@'100
>     else buffer[k-1]:=c-@'100;
>     limit:=limit-d; first:=first-d;
>     while k<=limit do
>       begin buffer[k]:=buffer[k+d]; incr(k);
>       end;
>     goto start_cs;
7239d7393
<   goto start_cs;
7303a7458,7459
> @d end_line_char_inactive == (end_line_char<0)or(end_line_char>255)
> 
7323c7479
<     if (end_line_char<0)or(end_line_char>127) then decr(limit)
---
>     if end_line_char_inactive then decr(limit)
7354c7510
< if (end_line_char<0)or(end_line_char>127) then decr(limit)
---
> if end_line_char_inactive then decr(limit)
7413c7569,7570
< @* \[25] Expanding the next token.
---
> 
> @* \[25] Expanding the next token.
7453c7610
< else @<Insert a |frozen_endv| token@>;
---
> else @<Insert a token containing |frozen_endv|@>;
7568c7725
< @<Insert a |frozen_endv| token@>=
---
> @<Insert a token containing |frozen_endv|@>=
7788c7945
< if (info(r)>match_token+127)or(info(r)<match_token) then s:=null
---
> if (info(r)>match_token+255)or(info(r)<match_token) then s:=null
7955c8112,8113
< @* \[26] Basic scanning subroutines.
---
> 
> @* \[26] Basic scanning subroutines.
8022c8180
<    ((cur_chr=str_pool[k])or(cur_chr=str_pool[k]-"a"+"A")) then
---
>    ((cur_chr=so(str_pool[k]))or(cur_chr=so(str_pool[k])-"a"+"A")) then
8158c8316
< begin scan_seven_bit_int;
---
> begin scan_char_num;
8184c8342
< @ A user is allowed to refer to `\.{\\the\\spacefactor}' only in horizontal
---
> @ Users refer to `\.{\\the\\spacefactor}' only in horizontal
8191c8349
< or |glue_val|.
---
> |glue_val|, |input_line_no_code|, or |badness_code|.
8192a8351,8353
> @d input_line_no_code=glue_val+1 {code for \.{\\inputlineno}}
> @d badness_code=glue_val+2 {code for \.{\\badness}}
> 
8213a8375,8378
> primitive("inputlineno",last_item,input_line_no_code);
> @!@:input_line_no_}{\.{\\inputlineno} primitive@>
> primitive("badness",last_item,badness_code);
> @!@:badness_}{\.{\\badness} primitive@>
8223,8225c8388,8394
< last_item: if chr_code=int_val then print_esc("lastpenalty")
< else if chr_code=dimen_val then print_esc("lastkern")
< else print_esc("lastskip");
---
> last_item: case chr_code of
>   int_val: print_esc("lastpenalty");
>   dimen_val: print_esc("lastkern");
>   glue_val: print_esc("lastskip");
>   input_line_no_code: print_esc("inputlineno");
>   othercases print_esc("badness")
>   endcases;
8240,8241c8409,8410
< else  begin cur_val:=aux;
<   if m=vmode then cur_val_level:=dimen_val@+else cur_val_level:=int_val;
---
> else if m=vmode then
>   begin cur_val:=prev_depth; cur_val_level:=dimen_val;
8242a8412,8413
> else begin cur_val:=space_factor; cur_val_level:=int_val;
>   end
8277a8449,8451
> We also handle \.{\\inputlineno} and \.{\\badness} here, because they are
> legal in similar contexts.
> 
8279,8296c8453,8475
< begin if cur_chr=glue_val then cur_val:=zero_glue@+else cur_val:=0;
< cur_val_level:=cur_chr;
< if not is_char_node(tail)and(mode<>0) then
<   case cur_chr of
<   int_val: if type(tail)=penalty_node then cur_val:=penalty(tail);
<   dimen_val: if type(tail)=kern_node then cur_val:=width(tail);
<   glue_val: if type(tail)=glue_node then
<     begin cur_val:=glue_ptr(tail);
<     if subtype(tail)=mu_glue then cur_val_level:=mu_val;
<     end;
<   end {there are no other cases}
< else if (mode=vmode)and(tail=head) then
<   case cur_chr of
<   int_val: cur_val:=last_penalty;
<   dimen_val: cur_val:=last_kern;
<   glue_val: if last_glue<>max_halfword then cur_val:=last_glue;
<   end; {there are no other cases}
< end
---
> if cur_chr>glue_val then
>   begin if cur_chr=input_line_no_code then cur_val:=line
>   else cur_val:=last_badness; {|cur_chr=badness_code|}
>   cur_val_level:=int_val;
>   end
> else begin if cur_chr=glue_val then cur_val:=zero_glue@+else cur_val:=0;
>   cur_val_level:=cur_chr;
>   if not is_char_node(tail)and(mode<>0) then
>     case cur_chr of
>     int_val: if type(tail)=penalty_node then cur_val:=penalty(tail);
>     dimen_val: if type(tail)=kern_node then cur_val:=width(tail);
>     glue_val: if type(tail)=glue_node then
>       begin cur_val:=glue_ptr(tail);
>       if subtype(tail)=mu_glue then cur_val_level:=mu_val;
>       end;
>     end {there are no other cases}
>   else if (mode=vmode)and(tail=head) then
>     case cur_chr of
>     int_val: cur_val:=last_penalty;
>     dimen_val: cur_val:=last_kern;
>     glue_val: if last_glue<>max_halfword then cur_val:=last_glue;
>     end; {there are no other cases}
>   end
8365c8544
< |scan_something_internal|:
---
> |scan_something_internal|.
8367,8377d8545
< @<Declare procedures that scan restricted classes of integers@>=
< procedure scan_seven_bit_int;
< begin scan_int;
< if (cur_val<0)or(cur_val>127) then
<   begin print_err("Bad character code");
< @.Bad character code@>
<   help2("The numeric code for a character must be between 0 and 127.")@/
<     ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
<   end;
< end;
< 
8390c8558
< procedure scan_four_bit_int;
---
> procedure scan_char_num;
8392,8395c8560,8563
< if (cur_val<0)or(cur_val>15) then
<   begin print_err("Bad number");
< @.Bad number@>
<   help2("Since I expected to read a number between 0 and 15,")@/
---
> if (cur_val<0)or(cur_val>255) then
>   begin print_err("Bad character code");
> @.Bad character code@>
>   help2("A character number must be between 0 and 255.")@/
8404c8572
< procedure scan_char_num;
---
> procedure scan_four_bit_int;
8406,8409c8574,8577
< if (cur_val<0)or(cur_val>255) then
<   begin print_err("Bad character code");
< @.Bad character code@>
<   help2("A character number must be between 0 and 255.")@/
---
> if (cur_val<0)or(cur_val>15) then
>   begin print_err("Bad number");
> @.Bad number@>
>   help2("Since I expected to read a number between 0 and 15,")@/
8506c8674
< if cur_val>127 then
---
> if cur_val>255 then
8649c8817,8818
< @!k:small_number; {number of digits in a decimal fraction}
---
> @!k,@!kk:small_number; {number of digits in a decimal fraction}
> @!p,@!q:pointer; {top of decimal digit stack}
8669c8838
< begin k:=0; get_token; {|point_token| is being re-scanned}
---
> begin k:=0; p:=null; get_token; {|point_token| is being re-scanned}
8673c8842,8843
<     begin dig[k]:=cur_tok-zero_token; incr(k);
---
>     begin q:=get_avail; link(q):=p; info(q):=cur_tok-zero_token;
>     p:=q; incr(k);
8676c8846,8849
< done1: f:=round_decimals(k);
---
> done1: for kk:=k downto 1 do
>   begin dig[kk-1]:=info(p); q:=p; p:=link(p); free_avail(q);
>   end;
> f:=round_decimals(k);
8894c9067,9068
< @* \[27] Building token lists.
---
> 
> @* \[27] Building token lists.
8915c9089
<   begin t:=str_pool[k];
---
>   begin t:=so(str_pool[k]);
9221c9395
< if (end_line_char<0)or(end_line_char>127) then decr(limit)
---
> if end_line_char_inactive then decr(limit)
9268c9442,9443
< @* \[28] Conditional processing.
---
> 
> @* \[28] Conditional processing.
9572c9747
< if (cur_cmd>active_char)or(cur_chr>127) then
---
> if (cur_cmd>active_char)or(cur_chr>255) then
9578c9753
< if (cur_cmd>active_char)or(cur_chr>127) then
---
> if (cur_cmd>active_char)or(cur_chr>255) then
9651c9826,9827
< @* \[29] File names.
---
> 
> @* \[29] File names.
9808,9810c9984,9986
< for j:=str_start[a] to str_start[a+1]-1 do append_to_name(str_pool[j]);
< for j:=str_start[n] to str_start[n+1]-1 do append_to_name(str_pool[j]);
< for j:=str_start[e] to str_start[e+1]-1 do append_to_name(str_pool[j]);
---
> for j:=str_start[a] to str_start[a+1]-1 do append_to_name(so(str_pool[j]));
> for j:=str_start[n] to str_start[n+1]-1 do append_to_name(so(str_pool[j]));
> for j:=str_start[e] to str_start[e+1]-1 do append_to_name(so(str_pool[j]));
9823a10000
> @d format_extension=".fmt" {the extension, as a \.{WEB} constant}
9937c10114
< loop@+begin if (cur_cmd>other_char)or(cur_chr>127) then {not a character}
---
> loop@+begin if (cur_cmd>other_char)or(cur_chr>255) then {not a character}
9977c10154
<   |".fmt"|}
---
>   |format_extension|}
10132c10309
< if (end_line_char<0)or(end_line_char>127) then decr(limit)
---
> if end_line_char_inactive then decr(limit)
10136c10313,10314
< @* \[30] Font metric data.
---
> 
> @* \[30] Font metric data.
10306,10307c10484,10486
< \yskip\hang first byte: |stop_bit|, indicates that this is the final program
<   step if the byte is 128 or more.\par
---
> \yskip\hang first byte: |skip_byte|, indicates that this is the final program
>   step if the byte is 128 or more, otherwise the next step is obtained by
>   skipping this number of intervening steps.\par
10310c10489
< \hang third byte: |op_bit|, indicates a ligature step if less than~128,
---
> \hang third byte: |op_byte|, indicates a ligature step if less than~128,
10314,10317c10493,10495
< In a ligature step the current character and |next_char| are replaced by
< the single character whose code is |remainder|. In a kern step, an
< additional space equal to |@!kern[remainder]| is inserted between the
< current character and |next_char|. (The value of |kern[remainder]| is
---
> In a kern step, an
> additional space equal to |kern[256*(op_byte-128)+remainder]| is inserted
> between the current character and |next_char|. This amount is
10319c10497
< by kerning; but it might be positive.)
---
> by kerning; but it might be positive.
10321,10324c10499,10530
< @d stop_flag=128+min_quarterword
<   {value indicating `\.{STOP}' in a lig/kern program}
< @d kern_flag=128+min_quarterword {op code for a kern step}
< @d stop_bit(#)==#.b0
---
> There are eight kinds of ligature steps, having |op_byte| codes $4a+2b+c$ where
> $0\le a\le b+c$ and $0\le b,c\le1$. The character whose code is
> |remainder| is inserted between the current character and |next_char|;
> then the current character is deleted if $b=0$, and |next_char| is
> deleted if $c=0$; then we pass over $a$~characters to reach the next
> current character (which may have a ligature/kerning program of its own).
> 
> If the very first instruction of the |lig_kern| array has |skip_byte=255|,
> the |next_char| byte is the so-called right boundary character of this font;
> the value of |next_char| need not lie between |bc| and~|ec|.
> If the very last instruction of the |lig_kern| array has |skip_byte=255|,
> there is a special ligature/kerning program for a left boundary character,
> beginning at location |256*op_byte+remainder|.
> The interpretation is that \TeX\ puts implicit boundary characters
> before and after each consecutive string of characters from the same font.
> These implicit characters do not appear in the output, but they can affect
> ligatures and kerning.
> 
> If the very first instruction of a character's |lig_kern| program has
> |skip_byte>128|, the program actually begins in location
> |256*op_byte+remainder|. This feature allows access to large |lig_kern|
> arrays, because the first instruction must otherwise
> appear in a location |<=255|.
> 
> Any instruction with |skip_byte>128| in the |lig_kern| array must have
> |256*op_byte+remainder<nl|. If such an instruction is encountered during
> normal program execution, it denotes an unconditional halt; no ligature
> or kerning command is performed.
> 
> @d stop_flag==qi(128) {value indicating `\.{STOP}' in a lig/kern program}
> @d kern_flag==qi(128) {op code for a kern step}
> @d skip_byte(#)==#.b0
10326c10532
< @d op_bit(#)==#.b2
---
> @d op_byte(#)==#.b2
10399a10606
> @!font_index=0..font_mem_size; {index into |font_info|}
10402a10610,10612
> @d non_char==qi(256) {a |halfword| code that can't match a real character}
> @d non_address==font_mem_size {a spurious |font_index|}
> 
10404c10614
< @!font_info:array[0..font_mem_size] of memory_word;
---
> @!font_info:array[font_index] of memory_word;
10406c10616
< @!fmem_ptr:0..font_mem_size; {first unused word of |font_info|}
---
> @!fmem_ptr:font_index; {first unused word of |font_info|}
10426a10637,10643
> @!bchar_label:array[internal_font_number] of font_index;
>   {start of |lig_kern| program for left boundary character,
>   |non_address| if there is none}
> @!font_bchar:array[internal_font_number] of min_quarterword..non_char;
>   {right boundary character, |non_char| if there is none}
> @!font_false_bchar:array[internal_font_number] of min_quarterword..non_char;
>   {|font_bchar| if it doesn't exist in the font, otherwise |non_char|}
10538c10755,10758
< kerning command~|j| in font~|f|.
---
> kerning command~|j| in font~|f|. If |j| is the |char_info| for a character
> with a ligature/kern program, the first instruction of that program is either
> |i=font_info[lig_kern_start(f)(j)]| or |font_info[lig_kern_restart(f)(i)]|,
> depending on whether or not |skip_byte(i)<=stop_flag|.
10540,10541c10760,10764
< @d lig_kern_start(#)==lig_kern_base[#]+rem_byte {beginning of lig/kern program}
< @d char_kern_end(#)==rem_byte(#)].sc
---
> The constant |kern_base_offset| should be simplified, for \PASCAL\ compilers
> that do not do local optimization.
> @^system dependencies@>
> 
> @d char_kern_end(#)==256*op_byte(#)+rem_byte(#)].sc
10542a10766,10769
> @d kern_base_offset==256*(128+min_quarterword)
> @d lig_kern_start(#)==lig_kern_base[#]+rem_byte {beginning of lig/kern program}
> @d lig_kern_restart_end(#)==256*op_byte(#)+rem_byte(#)+32768-kern_base_offset
> @d lig_kern_restart(#)==lig_kern_base[#]+lig_kern_restart_end
10582c10809
< var k:0..font_mem_size; {index into |font_info|}
---
> var k:font_index; {index into |font_info|}
10589a10817,10818
> @!bch_label:integer; {left boundary start location, or infinity}
> @!bchar:0..256; {right boundary character, or 256}
10675a10905,10907
> if bc>255 then {|bc=256| and |ec=255|}
>   begin bc:=1; ec:=0;
>   end;
10705,10706c10937,10938
< kern_base[f]:=lig_kern_base[f]+nl;
< exten_base[f]:=kern_base[f]+nk;
---
> kern_base[f]:=lig_kern_base[f]+nl-kern_base_offset;
> exten_base[f]:=kern_base[f]+kern_base_offset+nk;
10812,10817c11044,11065
< @ @<Read ligature/kern program@>=
< begin for k:=lig_kern_base[f] to kern_base[f]-1 do
<   begin store_four_quarters(font_info[k].qqqq);
<   check_byte_range(b);
<   if c<128 then check_byte_range(d) {check ligature}
<   else if d>=nk then abort; {check kern}
---
> @ @d check_existence(#)==@t@>@;@/
>   begin check_byte_range(#);
>   qw:=char_info(f)(#); {N.B.: not |qi(#)|}
>   if not char_exists(qw) then abort;
>   end
> 
> @<Read ligature/kern program@>=
> bch_label:=@'77777; bchar:=256;
> if nl>0 then
>   begin for k:=lig_kern_base[f] to kern_base[f]+kern_base_offset-1 do
>     begin store_four_quarters(font_info[k].qqqq);
>     if a>128 then
>       begin if 256*c+d>=nl then abort;
>       if a=255 then if k=lig_kern_base[f] then bchar:=b;
>       end
>     else begin if b<>bchar then check_existence(b);
>       if c<128 then check_existence(d) {check ligature}
>       else if 256*(c-128)+d>=nk then abort; {check kern}
>       if a<128 then if k-lig_kern_base[f]+a+1>=nl then abort;
>       end;
>     end;
>   if a=255 then bch_label:=256*c+d;
10819,10820c11067
< if (nl>0)and(a<128) then abort; {check for stop bit on last command}
< for k:=kern_base[f] to exten_base[f]-1 do
---
> for k:=kern_base[f]+kern_base_offset to exten_base[f]-1 do
10822d11068
< end
10827,10830c11073,11076
<   if a<>0 then check_byte_range(a);
<   if b<>0 then check_byte_range(b);
<   if c<>0 then check_byte_range(c);
<   check_byte_range(d);
---
>   if a<>0 then check_existence(a);
>   if b<>0 then check_existence(b);
>   if c<>0 then check_existence(c);
>   check_existence(d);
10858a11105,11112
> if bch_label<nl then bchar_label[f]:=bch_label+lig_kern_base[f]
> else bchar_label[f]:=non_address;
> font_bchar[f]:=qi(bchar);
> font_false_bchar[f]:=qi(bchar);
> if bchar<=ec then if bchar>=bc then
>   begin qw:=char_info(f)(bchar); {N.B.: not |qi(bchar)|}
>   if char_exists(qw) then font_false_bchar[f]:=non_char;
>   end;
10963c11217,11218
< @* \[31] Device-independent file format.
---
> 
> @* \[31] Device-independent file format.
11311,11312c11566,11569
< to~2. (Some day we will set |i=3|, when \.{DVI} format makes another
< incompatible change---perhaps in 1992.)
---
> to~2. (The value |i=3| is currently used for an extended format that
> allows a mixture of right-to-left and left-to-right typesetting.
> Some day we will set |i=4|, when \.{DVI} format makes another
> incompatible change---perhaps in the year 2048.)
11456c11713,11714
< @* \[32] Shipping pages out.
---
> 
> @* \[32] Shipping pages out.
11633c11891
<   dvi_out(str_pool[k]);
---
>   dvi_out(so(str_pool[k]));
11635c11893
<   dvi_out(str_pool[k])
---
>   dvi_out(so(str_pool[k]))
11941c12199
<   for s:=str_start[str_ptr] to pool_ptr-1 do dvi_out(str_pool[s]);
---
>   for s:=str_start[str_ptr] to pool_ptr-1 do dvi_out(so(str_pool[s]));
12419c12677,12678
< @* \[33] Packaging.
---
> 
> @* \[33] Packaging.
12489c12748,12749
< glue of each kind is present.
---
> glue of each kind is present. A global variable |last_badness| is used
> to implement \.{\\badness}.
12493a12754
> @!last_badness:integer; {badness of the most recently packaged box}
12503c12764
< @ @<Set init...@>=adjust_tail:=null;
---
> @ @<Set init...@>=adjust_tail:=null; last_badness:=0;
12518,12519c12779
< @!b:integer; {badness of the new box}
< begin r:=get_node(box_node_size); type(r):=hlist_node;
---
> begin last_badness:=0; r:=get_node(box_node_size); type(r):=hlist_node;
12648c12908
< if (hbadness<inf_bad)and(o=normal)and(list_ptr(r)<>null) then
---
> if o=normal then if list_ptr(r)<>null then
12661,12662c12921,12922
< begin b:=badness(x,total_stretch[normal]);
< if b>hbadness then
---
> begin last_badness:=badness(x,total_stretch[normal]);
> if last_badness>hbadness then
12664,12665c12924,12925
<   if b>100 then print_nl("Underfull")@+else print_nl("Loose");
<   print(" \hbox (badness "); print_int(b);
---
>   if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose");
>   print(" \hbox (badness "); print_int(last_badness);
12708c12968,12969
<   begin set_glue_ratio_one(glue_set(r)); {this is the maximum shrinkage}
---
>   begin last_badness:=1000000;
>   set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
12712c12973
< else if (hbadness<100)and(o=normal)and(list_ptr(r)<>null) then
---
> else if o=normal then if list_ptr(r)<>null then
12738,12740c12999,13001
< begin b:=badness(-x,total_shrink[normal]);
< if b>hbadness then
<   begin print_ln; print_nl("Tight \hbox (badness "); print_int(b);
---
> begin last_badness:=badness(-x,total_shrink[normal]);
> if last_badness>hbadness then
>   begin print_ln; print_nl("Tight \hbox (badness "); print_int(last_badness);
12763,12764c13024
< @!b:integer; {badness of the new box}
< begin r:=get_node(box_node_size); type(r):=vlist_node;
---
> begin last_badness:=0; r:=get_node(box_node_size); type(r):=vlist_node;
12839c13099
< if (vbadness<inf_bad)and(o=normal)and(list_ptr(r)<>null) then
---
> if o=normal then if list_ptr(r)<>null then
12846,12847c13106,13107
< begin b:=badness(x,total_stretch[normal]);
< if b>vbadness then
---
> begin last_badness:=badness(x,total_stretch[normal]);
> if last_badness>vbadness then
12849,12850c13109,13110
<   if b>100 then print_nl("Underfull")@+else print_nl("Loose");
<   print(" \vbox (badness "); print_int(b);
---
>   if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose");
>   print(" \vbox (badness "); print_int(last_badness);
12879c13139,13140
<   begin set_glue_ratio_one(glue_set(r)); {this is the maximum shrinkage}
---
>   begin last_badness:=1000000;
>   set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
12883c13144
< else if (vbadness<100)and(o=normal)and(list_ptr(r)<>null) then
---
> else if o=normal then if list_ptr(r)<>null then
12898,12900c13159,13161
< begin b:=badness(-x,total_shrink[normal]);
< if b>vbadness then
<   begin print_ln; print_nl("Tight \vbox (badness "); print_int(b);
---
> begin last_badness:=badness(-x,total_shrink[normal]);
> if last_badness>vbadness then
>   begin print_ln; print_nl("Tight \vbox (badness "); print_int(last_badness);
12922c13183,13184
< @* \[34] Data structures for math mode.
---
> 
> @* \[34] Data structures for math mode.
13375c13637,13638
< @* \[35] Subroutines for math mode.
---
> 
> @* \[35] Subroutines for math mode.
13546,13547c13809,13810
< continue: if (qo(y)>=font_bc[g])and(qo(y)<=font_ec[g]) then
<   begin q:=char_info(g)(y);
---
> if (qo(y)>=font_bc[g])and(qo(y)<=font_ec[g]) then
>   begin continue: q:=char_info(g)(y);
13725c13988,13989
< @* \[36] Typesetting math formulas.
---
> 
> @* \[36] Typesetting math formulas.
14116a14381
>   if not char_exists(i) then goto done;
14128,14132c14393,14396
<     repeat cur_i:=font_info[a].qqqq;
<     if qo(next_char(cur_i))=skew_char[cur_f] then
<       begin if op_bit(cur_i)>=kern_flag then
<         s:=char_kern(cur_f)(cur_i);
<       goto done1;
---
>     cur_i:=font_info[a].qqqq;
>     if skip_byte(cur_i)>stop_flag then
>       begin a:=lig_kern_restart(cur_f)(cur_i);
>       cur_i:=font_info[a].qqqq;
14134,14135c14398,14406
<     incr(a);
<     until stop_bit(cur_i)>=stop_flag;
---
>     loop@+ begin if qo(next_char(cur_i))=skew_char[cur_f] then
>         begin if op_byte(cur_i)>=kern_flag then
> 	  if skip_byte(cur_i)<=stop_flag then s:=char_kern(cur_f)(cur_i);
> 	goto done1;
> 	end;
>       if skip_byte(cur_i)>=stop_flag then goto done1;
>       a:=a+qo(skip_byte(cur_i))+1;
>       cur_i:=font_info[a].qqqq;
>       end;
14253a14525
> @!c:quarterword;@+@!i:four_quarters; {registers for character examination}
14260,14261c14532,14535
<     begin cur_c:=rem_byte(cur_i); character(nucleus(q)):=cur_c;
<     cur_i:=char_info(cur_f)(cur_c);
---
>     begin c:=rem_byte(cur_i); i:=char_info(cur_f)(c);
>     if char_exists(i) then
>       begin cur_c:=c; cur_i:=i; character(nucleus(q)):=c;
>       end;
14324a14599,14600
> No boundary characters enter into these ligatures.
> 
14329c14605
< @!p:pointer; {temporary register for list manipulation}
---
> @!p,@!r:pointer; {temporary registers for list manipulation}
14331,14332c14607,14608
< if (math_type(subscr(q))=empty)and(math_type(supscr(q))=empty)and@|
<   (math_type(nucleus(q))=math_char) then
---
> if math_type(subscr(q))=empty then if math_type(supscr(q))=empty then
>  if math_type(nucleus(q))=math_char then
14342,14348c14618,14630
<         repeat cur_i:=font_info[a].qqqq;@/
<         @<If instruction |cur_i| is a kern with |cur_c|,
<          attach the kern after |q| and |return|;
<          or if it is a ligature with |cur_c|, combine
<          noads |q| and |p| and |goto restart|@>;
<         incr(a);
<         until stop_bit(cur_i)>=stop_flag;
---
> 	cur_i:=font_info[a].qqqq;
> 	if skip_byte(cur_i)>stop_flag then
> 	  begin a:=lig_kern_restart(cur_f)(cur_i);
> 	  cur_i:=font_info[a].qqqq;
> 	  end;
> 	loop@+ begin @<If instruction |cur_i| is a kern with |cur_c|, attach
> 	    the kern after~|q|; or if it is a ligature with |cur_c|, combine
> 	    noads |q| and~|p| appropriately; then |return| if the cursor has
> 	    moved past a noad, or |goto restart|@>;
> 	  if skip_byte(cur_i)>=stop_flag then return;
> 	  a:=a+qo(skip_byte(cur_i))+1;
> 	  cur_i:=font_info[a].qqqq;
> 	  end;
14355c14637,14639
< is replaced by an |ord_noad|. Presumably a font designer will define such
---
> is replaced by an |ord_noad|, when the two noads collapse into one.
> But we could make a parenthesis (say) change shape when it follows
> certain letters. Presumably a font designer will define such
14357a14642,14643
> \chardef\?='174 % vertical line to indicate character retention
> 
14359,14360c14645,14646
< if next_char(cur_i)=cur_c then
<   if op_bit(cur_i)>=kern_flag then
---
> if next_char(cur_i)=cur_c then if skip_byte(cur_i)<=stop_flag then
>   if op_byte(cur_i)>=kern_flag then
14364,14368c14650,14668
<   else  begin link(q):=link(p); math_type(nucleus(q)):=math_char;
<     character(nucleus(q)):=rem_byte(cur_i);@/
<     mem[subscr(q)]:=mem[subscr(p)];
<     mem[supscr(q)]:=mem[supscr(p)];
<     free_node(p,noad_size); goto restart;
---
>   else  begin check_interrupt; {allow a way out of infinite ligature loop}
>     case op_byte(cur_i) of
>   qi(1),qi(5): character(nucleus(q)):=rem_byte(cur_i); {\.{=:\?}, \.{=:\?>}}
>   qi(2),qi(6): character(nucleus(p)):=rem_byte(cur_i); {\.{\?=:}, \.{\?=:>}}
>   qi(3),qi(7),qi(11):begin r:=new_noad; {\.{\?=:\?}, \.{\?=:\?>}, \.{\?=:\?>>}}
>       character(nucleus(r)):=rem_byte(cur_i);
>       fam(nucleus(r)):=fam(nucleus(q));@/
>       link(q):=r; link(r):=p;
>       if op_byte(cur_i)<qi(11) then math_type(nucleus(r)):=math_char
>       else math_type(nucleus(r)):=math_text_char; {prevent combination}
>       end;
>     othercases begin link(q):=link(p);
>       character(nucleus(q)):=rem_byte(cur_i); {\.{=:}}
>       mem[subscr(q)]:=mem[subscr(p)]; mem[supscr(q)]:=mem[supscr(p)];@/
>       free_node(p,noad_size);
>       end
>     endcases;
>     if op_byte(cur_i)>qi(3) then return;
>     math_type(nucleus(q)):=math_char; goto restart;
14618c14918
<   begin case str_pool[r_type*8+t+magic_offset] of
---
>   begin case so(str_pool[r_type*8+t+magic_offset]) of
14650c14950,14951
< @* \[37] Alignment.
---
> 
> @* \[37] Alignment.
14890c15191
<   begin mode:=-vmode; prev_depth:=nest[nest_ptr-2].aux_field;
---
>   begin mode:=-vmode; prev_depth:=nest[nest_ptr-2].aux_field.sc;
14944,14945c15245,15246
< @d span_code=128 {distinct from any character}
< @d cr_code=129 {distinct from |span_code| and from any character}
---
> @d span_code=256 {distinct from any character}
> @d cr_code=257 {distinct from |span_code| and from any character}
15069c15370,15371
< begin push_nest; mode:=(-hmode-vmode)-mode; aux:=0;
---
> begin push_nest; mode:=(-hmode-vmode)-mode;
> if mode=-hmode then space_factor:=0 @+else prev_depth:=0;
15278a15581
> @!aux_save:memory_word; {temporary storage for |aux|}
15512a15816,15819
> In horizontal mode, the |clang| part of |aux| is undefined; an over-cautious
> \PASCAL\ runtime system may complain about this.
> @^dirty Pascal@>
> 
15514c15821
< t:=aux; p:=link(head); q:=tail; pop_nest;
---
> aux_save:=aux; p:=link(head); q:=tail; pop_nest;
15516c15823
< else  begin aux:=t; link(tail):=p;
---
> else  begin aux:=aux_save; link(tail):=p;
15520c15827,15828
< @* \[38] Breaking paragraphs into lines.
---
> 
> @* \[38] Breaking paragraphs into lines.
15578c15886
< label done,done1,done2,done3,done4;
---
> label done,done1,done2,done3,done4,continue;
15600c15908
< @<Get ready...@>=
---
> @<Get ready to start...@>=
15824c16132
< @ @<Get ready...@>=
---
> @ @<Get ready to start...@>=
15848c16156,16157
< |threshold=pretolerance| and |second_pass=false|. If this pass fails to find a
---
> |threshold=pretolerance| and |second_pass=final_pass=false|.
> If this pass fails to find a
15850a16160,16161
> If that fails too, we add |emergency_stretch| to the background
> stretchability and set |final_pass=true|.
15854a16166
> @!final_pass:boolean; {is this our final attempt to break this paragraph?}
15960c16272
< @ @<Get ready...@>=
---
> @ @<Get ready to start...@>=
16215c16527
< @<Get ready...@>=
---
> @<Get ready to start...@>=
16325c16637
< @ During the second pass, we dare not lose all active nodes, lest we lose
---
> @ During the final pass, we dare not lose all active nodes, lest we lose
16335c16647
< begin if second_pass and (minimum_demerits=awful_bad) and@|
---
> begin if final_pass and (minimum_demerits=awful_bad) and@|
16464c16776,16777
< @* \[39] Breaking paragraphs into lines, continued.
---
> 
> @* \[39] Breaking paragraphs into lines, continued.
16487c16800
< @!q,@!r,@!s:pointer; {miscellaneous nodes of temporary interest}
---
> @!q,@!r,@!s,@!prev_s:pointer; {miscellaneous nodes of temporary interest}
16499c16812
<   second_pass:=false;
---
>   second_pass:=false; final_pass:=false;
16501a16815
>   final_pass:=(emergency_stretch<=0);
16504,16505c16818,16820
< loop@+  begin @<Create an active breakpoint representing the beginning of
<     the paragraph@>;
---
> loop@+  begin if threshold>inf_bad then threshold:=inf_bad;
>   if second_pass then @<Initialize for hyphenating a paragraph@>;
>   @<Create an active breakpoint representing the beginning of the paragraph@>;
16518,16520c16833,16840
<   @!stat if tracing_paragraphs>0 then print_nl("@@secondpass");@;@+tats@/
<   threshold:=tolerance; second_pass:=true; {if at first you don't
<     succeed, \dots}
---
>   if not second_pass then
>     begin@!stat if tracing_paragraphs>0 then print_nl("@@secondpass");@;@+tats@/
>     threshold:=tolerance; second_pass:=true; final_pass:=(emergency_stretch<=0);
>     end {if at first you don't succeed, \dots}
>   else begin @!stat if tracing_paragraphs>0 then
>       print_nl("@@emergencypass");@;@+tats@/
>     background[2]:=background[2]+emergency_stretch; final_pass:=true;
>     end;
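Reviewer note on the hunk above: the retry logic now distinguishes a second pass from a final pass. The following Python sketch restates that control flow under stated assumptions. `schedule_passes` and the `try_pass` callback are hypothetical names introduced only for illustration; the branch that skips the first pass when the pretolerance is negative is not visible in these hunks and is assumed here. It is a reading aid, not the WEB code itself.

```python
INF_BAD = 10000  # TeX's "infinitely bad" badness value


def schedule_passes(pretolerance, tolerance, emergency_stretch, try_pass):
    """Return the names of the line-breaking passes attempted, in order.

    try_pass(threshold, extra_stretch) is a hypothetical callback that
    returns True when a pass finds a feasible set of breakpoints.
    """
    extra_stretch = 0  # amount added to the background stretchability
    if pretolerance >= 0:  # assumed condition for running a first pass
        threshold, second_pass, final_pass = pretolerance, False, False
        name = "first"
    else:
        threshold, second_pass = tolerance, True
        final_pass = emergency_stretch <= 0
        name = "second"
    attempted = []
    while True:
        threshold = min(threshold, INF_BAD)  # clamp, as in the new loop head
        attempted.append(name)
        # The final pass never loses all active nodes, so it always ends.
        if try_pass(threshold, extra_stretch) or final_pass:
            return attempted
        if not second_pass:
            threshold, second_pass = tolerance, True
            final_pass = emergency_stretch <= 0
            name = "second"
        else:
            extra_stretch += emergency_stretch
            final_pass = True
            name = "emergency"
```

With a callback that always fails, a positive `emergency_stretch` produces three attempts and a non-positive one produces two, which matches the `final_pass:=(emergency_stretch<=0)` assignments in the diff.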
16522c16842,16844
< done: @!stat if tracing_paragraphs>0 then end_diagnostic(true);@;@+tats@/
---
> done: @!stat if tracing_paragraphs>0 then
>   begin end_diagnostic(true); normalize_selector;
>   end;@+tats@/
16572c16894
< whatsit_node: @<Advance \(p)past a whatsit node in the |line_break| loop@>;
---
> whatsit_node: @<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>;
16659c16981
< correct ``looseness.'' On the second pass, there will be at least one active
---
> correct ``looseness.'' On the final pass, there will be at least one active
16681c17003
<   if (actual_looseness=looseness)or second_pass then goto done;
---
>   if (actual_looseness=looseness)or final_pass then goto done;
16936c17258,17259
< @* \[40] Pre-hyphenation.
---
> 
> @* \[40] Pre-hyphenation.
16960c17283
< nonletter (if |c>=128| or |lc_code(c)=0|), a lowercase letter (if
---
> nonletter (if |lc_code(c)=0|), a lowercase letter (if
16974c17297
< |1<=a<=b+1<=m|. If |n>=5|, this string qualifies for hyphenation; however,
---
> |1<=a<=b+1<=m|. If |n>=min_hyf|, this string qualifies for hyphenation; however,
16990a17314,17320
> @<Initialize for hyphenating...@>=
> begin @!init if trie_not_ready then init_trie;@+tini@;@/
> l_hyf:=left_hyphen_min-1;@+if l_hyf<0 then l_hyf:=0;
> r_hyf:=right_hyphen_min-1;@+if r_hyf<0 then r_hyf:=0;
> min_hyf:=l_hyf+r_hyf+2; cur_lang:=0;
> end
> 
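Reviewer note on the hunk above: the hard-coded limits (five letters, hyphen positions 2 through `hn-3`) are replaced by values derived from `left_hyphen_min` and `right_hyphen_min`. A small Python sketch of that arithmetic follows; the function names are hypothetical and chosen only for this illustration.

```python
def hyphen_limits(left_hyphen_min, right_hyphen_min):
    """Mirror the computation of l_hyf, r_hyf and min_hyf in the new code."""
    l_hyf = max(left_hyphen_min - 1, 0)
    r_hyf = max(right_hyphen_min - 1, 0)
    min_hyf = l_hyf + r_hyf + 2
    return l_hyf, r_hyf, min_hyf


def allowed_positions(hn, left_hyphen_min, right_hyphen_min):
    """Positions j scanned by `for j:=l_hyf+1 to hn-r_hyf-1` for a word of hn letters."""
    l_hyf, r_hyf, _ = hyphen_limits(left_hyphen_min, right_hyphen_min)
    return list(range(l_hyf + 1, hn - r_hyf))
```

For the values 2 and 3 this gives `min_hyf = 5` and, for a seven-letter word, positions 2 through 4, which agrees with the constants that the old code used.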
16993c17323
< nodes $p_a$ and~$p_b$ in the description above are placed into variables
---
> nodes $p_{a-1}$ and~$p_b$ in the description above are placed into variables
16997c17327
< @!hc:array[0..65] of halfword; {word to be hyphenated}
---
> @!hc:array[0..65] of 0..256; {word to be hyphenated}
17001c17331
< @!hu:array[1..63] of ASCII_code; {like |hc|, before conversion to lowercase}
---
> @!hu:array[0..63] of 0..256; {like |hc|, before conversion to lowercase}
17002a17333,17334
> @!cur_lang:ASCII_code; {current hyphenation table of interest}
> @!l_hyf,@!r_hyf,@!min_hyf:integer; {limits on fragment sizes}
17014c17346,17347
< begin s:=link(cur_p);
---
> begin if min_hyf>63 then goto done1;
> prev_s:=cur_p; s:=link(prev_s);
17019,17020c17352,17353
<   @<Check that the nodes following |hb| permit hyphenation and that
<     at least five letters have been found, otherwise |goto done1|@>;
---
>   @<Check that the nodes following |hb| permit hyphenation and that at least
>     |min_hyf| letters have been found, otherwise |goto done1|@>;
17028c17361
< label done,found,not_found,found1,exit;
---
> label common_ending,done,found,found1,not_found,not_found+1,exit;
17030c17363
< begin @<Find hyphen locations for the word in |hc|@>;
---
> begin @<Find hyphen locations for the word in |hc|, or |return|@>;
17036c17369
< @ The first thing we need to do is find the node |ha| that contains the
---
> @ The first thing we need to do is find the node |ha| just before the
17044c17377,17383
<     begin q:=lig_ptr(s); c:=qo(character(q)); hf:=font(q);
---
>     if lig_ptr(s)=null then goto continue
>     else begin q:=lig_ptr(s); c:=qo(character(q)); hf:=font(q);
>       end
>   else if (type(s)=kern_node)and(subtype(s)=normal) then goto continue
>   else if type(s)=whatsit_node then
>     begin @<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>;
>     goto continue;
17046,17047d17384
<   else if (type(s)=kern_node)and(subtype(s)=normal) then c:=128
<   else if type(s)=whatsit_node then c:=128
17049c17386
<   if c<128 then if lc_code(c)<>0 then
---
>   if lc_code(c)<>0 then
17052c17389
<   s:=link(s);
---
> continue: prev_s:=s; s:=link(prev_s);
17057c17394
< ha:=s
---
> ha:=prev_s
17066,17068c17403,17405
<     if c>=128 then goto done3;
<     if (lc_code(c)=0)or(hn=63) then goto done3;
<     hb:=s; incr(hn); hu[hn]:=c; hc[hn]:=lc_code(c)-1;
---
>     if lc_code(c)=0 then goto done3;
>     if hn=63 then goto done3;
>     hb:=s; incr(hn); hu[hn]:=c; hc[hn]:=lc_code(c);
17084,17091c17421,17429
< begin j:=hn; q:=lig_ptr(s);
< if font(q)<>hf then goto done3;
< repeat c:=qo(character(q));
< if c>=128 then goto done3;
< if (lc_code(c)=0)or(j=63) then goto done3;
< incr(j); hu[j]:=c; hc[j]:=lc_code(c)-1;@/
< q:=link(q);
< until q=null;
---
> begin if font(lig_char(s))<>hf then goto done3;
> j:=hn; q:=lig_ptr(s);
> while q>null do
>   begin c:=qo(character(q));
>   if lc_code(c)=0 then goto done3;
>   if j=63 then goto done3;
>   incr(j); hu[j]:=c; hc[j]:=lc_code(c);@/
>   q:=link(q);
>   end;
17096c17434
< if hn<5 then goto done1;
---
> if hn<min_hyf then goto done1;
17108c17446,17447
< @* \[41] Post-hyphenation.
---
> 
> @* \[41] Post-hyphenation.
17117a17457,17459
> @!init_list:pointer; {list of punctuation characters preceding the word}
> @!init_lig:boolean; {does |init_list| represent a ligature?}
> @!init_lft:boolean; {if so, did the ligature involve a left boundary?}
17121a17464
> @!bchar:halfword; {right boundary character of hyphenated word, or |non_char|}
17123,17126c17466,17471
< @ \TeX\ will never insert a hyphen that has fewer than two letters before
< it or fewer than three after it; hence, a five-letter word has comparatively
< little chance of being hyphenated. If no hyphens have been found,
< we can save time by not having to make any changes to the paragraph.
---
> @ \TeX\ will never insert a hyphen that has fewer than
> \.{\\lefthyphenmin} letters before it or fewer than
> \.{\\righthyphenmin} after it; hence, a short word has
> comparatively little chance of being hyphenated. If no hyphens have
> been found, we can save time by not having to make any changes to the
> paragraph.
17129c17474
< for j:=2 to hn-3 do if odd(hyf[j]) then goto found1;
---
> for j:=l_hyf+1 to hn-r_hyf-1 do if odd(hyf[j]) then goto found1;
17134,17135c17479,17486
< subsequence of nodes |ha..hb|. The variable |s| will point to the node
< preceding |ha|, and |q| will point to the node following |hb|, so that
---
> subsequence of nodes between |ha| and~|hb|. An attempt is made to
> preserve the effect that implicit boundary characters and punctuation marks
> had on ligatures inside the hyphenated word, by storing a left boundary or
> preceding character in |hu[0]| and by storing a possible right boundary
> in |bchar|. We set |j:=0| if |hu[0]| is to be part of the reconstruction;
> otherwise |j:=1|.
> The variable |s| will point to the tail of the current hlist, and
> |q| will point to the node following |hb|, so that
17139c17490,17505
< q:=link(hb); link(hb):=null; s:=cur_p;
---
> q:=link(hb); link(hb):=null; r:=link(ha); link(ha):=null; bchar:=non_char;
> if type(hb)=ligature_node then if odd(subtype(hb)) then
>   bchar:=font_bchar[hf];
> if is_char_node(ha) then
>   begin init_list:=ha; init_lig:=false; hu[0]:=qo(character(ha));
>   end
> else if type(ha)=ligature_node then
>   begin init_list:=lig_ptr(ha); init_lig:=true; init_lft:=(subtype(ha)>1);
>   hu[0]:=qo(character(lig_char(ha)));
>   if init_list=null then if init_lft then
>     begin hu[0]:=256; init_lig:=false;
>     end; {in this case a ligature will be reconstructed from scratch}
>   free_node(ha,small_node_size);
>   end
> else goto not_found+1; {no punctuation found}
> s:=cur_p; {we have |cur_p<>ha| because |type(cur_p)=glue_node|}
17141,17142c17507,17515
< link(s):=null; flush_node_list(ha);
< @<Reconstitute nodes for the hyphenated word, inserting discretionary hyphens@>
---
> j:=0; goto common_ending;
> not_found+1: j:=1; s:=ha; init_list:=null;
> if not is_char_node(r) then if type(r)=ligature_node then
>  if subtype(r)>1 then
>   begin j:=0; hu[0]:=256; init_lig:=false;
>   end;
> common_ending: flush_node_list(r);
> @<Reconstitute nodes for the hyphenated word, inserting discretionary hyphens@>;
> flush_list(init_list)
17145c17518
< {\def\!{\kern-1pt}
---
> {\def\!{\kern-1pt}%
17148,17150c17521,17523
< discusses the difficulties of the word ``difficult'', but since fonts can
< include highly general ligatures, the discretionary material surrounding a
< hyphen can be even more complex than that. For example, suppose that
---
> discusses the difficulties of the word ``difficult'', and
> the discretionary material surrounding a
> hyphen can be considerably more complex than that. For example, suppose that
17160a17534,17543
> Still further complications arise in the presence of ligatures that do not
> delete the original characters. When punctuation precedes the word being
> hyphenated, \TeX's method is not perfect under all possible scenarios,
> because punctuation marks and letters can propagate information back and forth.
> For example, suppose the original pre-hyphenation pair
> \.{*a} changes to \.{*y} via a \.{\?=:} ligature, which changes to \.{xy}
> via a \.{=:\?} ligature; if $p_{a-1}=\.x$ and $p_a=\.y$, the reconstitution
> procedure isn't smart enough to obtain \.{xy} again. In such cases the
> font designer should include a ligature that goes from \.{xa} to \.{xy}.
> 
17162,17169c17545,17555
< an index~|j|, this function creates a node for the next character or ligature
< found in the |hu| array starting at |hu[j]|, using font |hf|. For example,
< if |hu[j..j+2]| contains the three letters `f', `i', and `x', and if
< font |hf| contains an `fi' ligature but no `fix' ligature, then |reconstitute|
< will create a ligature node for `fi'. The index of the last character
< consumed, in this case |j+1|, will be returned. Furthermore, a kern node
< is created and appended, if kerning is called for between the consumed
< character or ligature and the next (unconsumed) character.
---
> a string of characters $x_j\ldots x_n$, there is a smallest index $m\ge j$
> such that the ``translation'' of $x_j\ldots x_n$ by ligatures and kerning
> has the form $y_1\ldots y_t$ followed by the translation of $x_{m+1}\ldots x_n$,
> where $y_1\ldots y_t$ is some nonempty sequence of character, ligature, and
> kern nodes. We call $x_j\ldots x_m$ a ``cut prefix'' of $x_j\ldots x_n$.
> For example, if $x_1x_2x_3=\.{fly}$, and if the font contains `fl' as a
> ligature and a kern between `fl' and `y', then $m=2$, $t=2$, and $y_1$ will
> be a ligature node for `fl' followed by an appropriate kern node~$y_2$.
> In the most common case, $x_j$~forms no ligature with $x_{j+1}$ and we
> simply have $m=j$, $y_1=x_j$. If $m<n$ we can repeat the procedure on
> $x_{m+1}\ldots x_n$ until the entire translation has been found.
17171,17173c17557,17564
< A second parameter, |n|, gives the limit beyond which this procedure does not
< advance. In other words, |hu[n]| might be consumed, but |hu[n+1]| is never
< accessed.
---
> The |reconstitute| function returns the integer $m$ and puts the nodes
> $y_1\ldots y_t$ into a linked list starting at |link(hold_head)|,
> getting the input $x_j\ldots x_n$ from the |hu| array. If $x_j=256$,
> we consider $x_j$ to be an implicit left boundary character; in this
> case |j| must be strictly less than~|n|. There is a
> parameter |bchar|, which is either 256 or an implicit right boundary character
> assumed to be present just following~$x_n$. (The value |hu[n+1]| is never
> explicitly examined, but the algorithm imagines that |bchar| is there.)
17175,17180c17566,17569
< The global variable |hyphen_passed| is set to~|k| if this procedure
< consumes two characters |hu[k]| and |hu[k+1]| such that |hyf[k]| is odd,
< i.e., if the ligature might have to be broken by a hyphen, or if a kern is
< inserted between |hu[k]| and |hu[k+1]|.  If this condition holds for more
< than one value of |k|, the smallest value is used; and if the condition
< holds for no values of |k|, |hyphen_passed| is set to zero.
---
> If there exists an index |k| in the range $j\le k\le m$ such that |hyf[k]|
> is odd and such that the result of reconstitute would have been different
> if $x_{k+1}$ had been |hchar|, then |reconstitute| sets |hyphen_passed|
> to the smallest such~|k|. Otherwise it sets |hyphen_passed| to zero.
17182,17184c17571,17577
< After |reconstitute| has acted, |link(hold_head)| points to the character
< or ligature node that was created, and |link(link(hold_head))| will either
< be |null| or a pointer to the kern node that was appended.
---
> A special convention is used in the case |j=0|: Then we assume that the
> translation of |hu[0]| appears in a special list of charnodes starting at
> |init_list|; moreover, if |init_lig| is |true|, then |hu[0]| will be
> a ligature character, involving a left boundary if |init_lft| is |true|.
> This facility is provided for cases when a hyphenated
> word is preceded by punctuation (like single or double quotes) that might
> affect the translation of the beginning of the word.
17190c17583
< function reconstitute(@!j,@!n:small_number):
---
> function reconstitute(@!j,@!n:small_number;@!bchar,@!hchar:halfword):
17193,17194c17586,17587
< var p:pointer; {a node being created}
< @!s:pointer; {a node being appended to}
---
> var @!p:pointer; {temporary register for list manipulation} 
> @!t:pointer; {a node being appended to}
17196,17197c17589,17590
< @!c:quarterword; {current character}
< @!d:quarterword; {current character or ligature}
---
> @!cur_rh:halfword; {hyphen character for ligature testing}
> @!test_char:halfword; {hyphen or other character for ligature testing}
17199,17203c17592,17599
< @!r:0..font_mem_size; {position of current lig/kern instruction}
< begin @<Build a list of characters in a maximal ligature, and set |w|
<   to the amount of kerning that should follow@>;
< @<If the list has more than one element, create a ligature node@>;
< @<Attach kerning, if |w<>0|@>;
---
> @!k:font_index; {position of current lig/kern instruction}
> begin hyphen_passed:=0; t:=hold_head; w:=0; link(hold_head):=null;
>  {at this point |ligature_present=lft_hit=rt_hit=false|}
> @<Set up data structures with the cursor following position |j|@>;
> continue:@<If there's a ligature or kern at the cursor position, update the data
>   structures, possibly advancing~|j|; continue until the cursor moves@>;
> @<Append a ligature and/or kern to the translation;
>   |goto continue| if the stack of inserted ligatures is nonempty@>;
17207,17213c17603,17681
< @ @<Build a list of characters in a maximal ligature...@>=
< hyphen_passed:=0; s:=hold_head; w:=0; d:=qi(hu[j]); c:=d;
< loop@+  begin continue: p:=get_avail; font(p):=hf;
<   character(p):=c; link(s):=p;@/
<   @<Look for a ligature or kern between |d| and the following
<     character; update the data structure and |goto continue|
<     if a ligature is found, otherwise update~|w| and |goto done|@>;
---
> @ The reconstitution procedure shares many of the global data structures
> by which \TeX\ has processed the words before they were hyphenated.
> There is an implied ``cursor'' between characters |cur_l| and |cur_r|;
> these characters will be tested for possible ligature activity. If
> |ligature_present| then |cur_l| is a ligature character formed from the
> original characters following |cur_q| in the current translation list.
> There is a ``ligature stack'' between the cursor and character |j+1|,
> consisting of pseudo-ligature nodes linked together by their |link| fields.
> This stack is normally empty unless a ligature command has created a new
> character that will need to be processed later. A pseudo-ligature is
> a special node having a |character| field that represents a potential
> ligature and a |lig_ptr| field that points to a |char_node| or is |null|.
> We have
> $$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr
>   |qi(hu[j+1])|,&if |lig_stack=null| and |j<n|;\cr
>   bchar,&if |lig_stack=null| and |j=n|.\cr}$$
> 
> @<Glob...@>=
> @!cur_l,@!cur_r:halfword; {characters before and after the cursor}
> @!cur_q:pointer; {where a ligature should be detached}
> @!lig_stack:pointer; {unfinished business to the right of the cursor}
> @!ligature_present:boolean; {should a ligature node be made for |cur_l|?}
> @!lft_hit,@!rt_hit:boolean; {did we hit a ligature with a boundary character?}
> 
> @ @d append_charnode_to_t(#)== begin link(t):=get_avail; t:=link(t);
>     font(t):=hf; character(t):=#;
>     end
> @d set_cur_r==begin if j<n then cur_r:=qi(hu[j+1])@+else cur_r:=bchar;
>     if odd(hyf[j]) then cur_rh:=hchar@+else cur_rh:=non_char;
>     end
> 
> @<Set up data structures with the cursor following position |j|@>=
> cur_l:=qi(hu[j]); cur_q:=t;
> if j=0 then
>   begin ligature_present:=init_lig; p:=init_list;
>   if ligature_present then lft_hit:=init_lft;
>   while p>null do
>     begin append_charnode_to_t(character(p)); p:=link(p);
>     end;
>   end
> else if cur_l<non_char then append_charnode_to_t(cur_l);
> lig_stack:=null; set_cur_r
> 
> @ We may want to look at the lig/kern program twice, once for a hyphen
> and once for a normal letter. (The hyphen might appear after the letter
> in the program, so we'd better not try to look for both at once.)
> 
> @<If there's a ligature or kern at the cursor position, update...@>=
> if cur_l=non_char then
>   begin k:=bchar_label[hf];
>   if k=non_address then goto done@+else q:=font_info[k].qqqq;
>   end
> else begin q:=char_info(hf)(cur_l);
>   if char_tag(q)<>lig_tag then goto done;
>   k:=lig_kern_start(hf)(q); q:=font_info[k].qqqq;
>   if skip_byte(q)>stop_flag then
>     begin k:=lig_kern_restart(hf)(q); q:=font_info[k].qqqq;
>     end;
>   end; {now |k| is the starting address of the lig/kern program}
> if cur_rh<non_char then test_char:=cur_rh@+else test_char:=cur_r;
> loop@+begin if next_char(q)=test_char then if skip_byte(q)<=stop_flag then
>     if cur_rh<non_char then
>       begin hyphen_passed:=j; hchar:=non_char; cur_rh:=non_char;
>       goto continue;
>       end
>     else begin if hchar<non_char then if odd(hyf[j]) then
>         begin hyphen_passed:=j; hchar:=non_char;
>         end;
>       if op_byte(q)<kern_flag then
>       @<Carry out a ligature replacement, updating the cursor structure
>         and possibly advancing~|j|; |goto continue| if the cursor doesn't
> 	advance, otherwise |goto done|@>;
>       w:=char_kern(hf)(q); goto done; {this kern will be inserted below}
>      end;
>   if skip_byte(q)>=stop_flag then
>     if cur_rh=non_char then goto done
>     else begin cur_rh:=non_char; goto continue;
>       end;
>   k:=k+qo(skip_byte(q))+1; q:=font_info[k].qqqq;
17217,17228c17685,17688
< @ @<Look for a ligature or kern between |d| and...@>=
< if j=n then goto done;
< q:=char_info(hf)(d);
< if char_tag(q)<>lig_tag then goto done;
< r:=lig_kern_start(hf)(q); c:=qi(hu[j+1]);
< loop@+  begin q:=font_info[r].qqqq;
<   if next_char(q)=c then
<     begin if odd(hyf[j])and(hyphen_passed=0) then hyphen_passed:=j;
<     if op_bit(q)<kern_flag then
<       @<Append to the ligature and |goto continue|@>
<     else  begin w:=char_kern(hf)(q);
<       goto done;
---
> @ @d wrap_lig(#)==if ligature_present then
>     begin p:=new_ligature(hf,cur_l,link(cur_q));
>     if lft_hit then
>       begin subtype(p):=2; lft_hit:=false;
17229a17690,17693
>     if # then if lig_stack=null then
>       begin incr(subtype(p)); rt_hit:=false;
>       end;
>     link(cur_q):=p; t:=p; ligature_present:=false;
17231,17232c17695,17710
<   else if stop_bit(q)<stop_flag then incr(r)
<   else goto done;
---
> @d pop_lig_stack==begin if lig_ptr(lig_stack)>null then
>     begin link(t):=lig_ptr(lig_stack); {this is a charnode for |hu[j+1]|}
>     t:=link(t); incr(j);
>     end;
>   p:=lig_stack; lig_stack:=link(p); free_node(p,small_node_size);
>   if lig_stack=null then set_cur_r@+else cur_r:=character(lig_stack);
>   end {if |lig_stack| isn't |null| we have |cur_rh=non_char|}
> 
> @<Append a ligature and/or kern to the translation...@>=
> wrap_lig(rt_hit);
> if w<>0 then
>   begin link(t):=new_kern(w); t:=link(t); w:=0;
>   end;
> if lig_stack>null then
>   begin cur_q:=t; cur_l:=character(lig_stack); ligature_present:=true;
>   pop_lig_stack; goto continue;
17235,17237c17713,17744
< @ @<Append to the ligature...@>=
< begin d:=rem_byte(q);
< incr(j); s:=p; goto continue;
---
> @ @<Carry out a ligature replacement, updating the cursor structure...@>=
> begin if cur_l=non_char then lft_hit:=true;
> if j=n then if lig_stack=null then rt_hit:=true;
> check_interrupt; {allow a way out in case there's an infinite ligature loop}
> case op_byte(q) of
> qi(1),qi(5):begin cur_l:=rem_byte(q); {\.{=:\?}, \.{=:\?>}}
>   ligature_present:=true;
>   end;
> qi(2),qi(6):begin cur_r:=rem_byte(q); {\.{\?=:}, \.{\?=:>}}
>   if lig_stack>null then character(lig_stack):=cur_r
>   else begin lig_stack:=new_lig_item(cur_r);
>     if j=n then bchar:=non_char
>     else begin p:=get_avail; lig_ptr(lig_stack):=p;
>       character(p):=qi(hu[j+1]); font(p):=hf;
>       end;
>     end;
>   end;
> qi(3):begin cur_r:=rem_byte(q); {\.{\?=:\?}}
>   p:=lig_stack; lig_stack:=new_lig_item(cur_r); link(lig_stack):=p;
>   end;
> qi(7),qi(11):begin wrap_lig(false); {\.{\?=:\?>}, \.{\?=:\?>>}}
>   cur_q:=t; cur_l:=rem_byte(q); ligature_present:=true;
>   end;
> othercases begin cur_l:=rem_byte(q); ligature_present:=true; {\.{=:}}
>   if lig_stack>null then pop_lig_stack
>   else if j=n then goto done
>   else begin append_charnode_to_t(cur_r); incr(j); set_cur_r;
>     end;
>   end
> endcases;
> if op_byte(q)>qi(4) then if op_byte(q)<>qi(7) then goto done;
> goto continue;
17240,17250d17746
< @ After the list has been built, |link(s)| points to the final list element.
< 
< @<If the list has more than one element, create a ligature node@>=
< if s<>hold_head then
<   begin p:=new_ligature(hf,d,link(hold_head));
<   link(hold_head):=p;
<   end
< 
< @ @<Attach kerning, if |w<>0|@>=
< if w<>0 then link(link(hold_head)):=new_kern(w)
< 
17253,17254c17749,17750
< |hu[1..hn]| after node |s|, and node |q| should be appended to the result.
< During this process, the variable |i| will be a temporary counter or an
---
> |hu[1..hn]| after node |ha|, and node |q| should be appended to the result.
> During this process, the variable |i| will be a temporary
17263a17760,17761
> @!c_loc:0..63; {where that character came from}
> @!r_count:integer; {replacement count for discretionary}
17266,17267c17764
< @ When the following code is performed, |hyf[j]| will be zero for |j=1|
< and for |j>=hn-2|.
---
> @ When the following code is performed, |hyf[0]| and |hyf[hn]| will be zero.
17270,17277c17767,17773
< j:=0;
< repeat l:=j; j:=reconstitute(j+1,hn);
< if hyphen_passed<>0 then
<   @<Create and append a discretionary node as an alternative to the
<     ligature, and continue to develop both branches until they
<     become equivalent@>
< else  begin link(s):=link(hold_head); s:=link(s);
<   if link(s)<>null then s:=link(s);
---
> repeat l:=j; j:=reconstitute(j,hn,bchar,qi(hyf_char))+1;
> if hyphen_passed=0 then
>   begin link(s):=link(hold_head);
>   while link(s)>null do s:=link(s);
>   if odd(hyf[j-1]) then
>     begin l:=j; hyphen_passed:=j-1; link(hold_head):=null;
>     end;
17279,17280c17775,17779
< if odd(hyf[j]) then @<Insert a discretionary hyphen after |s|@>;
< until j=hn;
---
> if hyphen_passed>0 then
>   @<Create and append a discretionary node as an alternative to the
>     unhyphenated word, and continue to develop both branches until they
>     become equivalent@>;
> until j>hn;
17283c17782,17785
< @ @<Create and append a discretionary node as an alternative...@>=
---
> @ @d advance_major_tail==begin major_tail:=link(major_tail); incr(r_count);
>     end
> 
> @<Create and append a discretionary node as an alternative...@>=
17285,17287c17787,17789
< link(s):=r; link(r):=link(hold_head); type(r):=disc_node;
< major_tail:=link(hold_head);
< if link(major_tail)<>null then major_tail:=link(major_tail);
---
> link(r):=link(hold_head); type(r):=disc_node;
> major_tail:=r; r_count:=0;
> while link(major_tail)>null do advance_major_tail;
17289c17791
< @<Put the \(c)characters |hu[l+1..i]| and a hyphen into |pre_break(r)|@>;
---
> @<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>;
17297c17799
< or kern. At this point we have |l<i<=j| and |i<=hn-3|.
---
> or kern. At this point we have |l-1<=i<=j| and |i<hn|.
17299,17300c17801,17802
< @<Put the \(c)characters |hu[l+1..i]| and a hyphen into |pre_break(r)|@>=
< minor_tail:=null; hyf_node:=new_character(hf,hyf_char);
---
> @<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>=
> minor_tail:=null; pre_break(r):=null; hyf_node:=new_character(hf,hyf_char);
17302c17804
<   begin incr(i); c:=hu[i]; hu[i]:=hyf_char;
---
>   begin incr(i); c:=hu[i]; hu[i]:=hyf_char; free_avail(hyf_node);
17304,17309c17806,17814
< repeat l:=reconstitute(l+1,i);
< if minor_tail=null then pre_break(r):=link(hold_head)
< else link(minor_tail):=link(hold_head);
< minor_tail:=link(hold_head);
< if link(minor_tail)<>null then minor_tail:=link(minor_tail);
< until l=i;
---
> while l<=i do
>   begin l:=reconstitute(l,i,font_bchar[hf],non_char)+1;
>   if link(hold_head)>null then
>     begin if minor_tail=null then pre_break(r):=link(hold_head)
>     else link(minor_tail):=link(hold_head);
>     minor_tail:=link(hold_head);
>     while link(minor_tail)>null do minor_tail:=link(minor_tail);
>     end;
>   end;
17312,17314c17817,17818
<   free_avail(hyf_node); decr(i); l:=i;
<   end;
< hyf[i]:=0
---
>   l:=i; decr(i);
>   end
17316c17820
< @ The synchronization algorithm begins with |l=i<=j|.
---
> @ The synchronization algorithm begins with |l=i+1<=j|.
17319c17823,17826
< minor_tail:=null; post_break(r):=null;
---
> minor_tail:=null; post_break(r):=null; c_loc:=0;
> if bchar_label[hf]<non_address then {put left boundary at beginning of new line}
>   begin decr(l); c:=hu[l]; c_loc:=l; hu[l]:=256;
>   end;
17321,17326c17828,17830
<   begin repeat l:=reconstitute(l+1,hn);
<   if minor_tail=null then post_break(r):=link(hold_head)
<   else link(minor_tail):=link(hold_head);
<   minor_tail:=link(hold_head);
<   if link(minor_tail)<>null then
<     begin hyf[l]:=0; minor_tail:=link(minor_tail); {kern present}
---
>   begin repeat l:=reconstitute(l,hn,bchar,non_char)+1;
>   if c_loc>0 then
>     begin hu[c_loc]:=c; c_loc:=0;
17327a17832,17837
>   if link(hold_head)>null then
>     begin if minor_tail=null then post_break(r):=link(hold_head)
>     else link(minor_tail):=link(hold_head);
>     minor_tail:=link(hold_head);
>     while link(minor_tail)>null do minor_tail:=link(minor_tail);
>     end;
17330,17336c17840
<     begin j:=reconstitute(j+1,hn);
<     link(major_tail):=link(hold_head);
<     major_tail:=link(hold_head);
<     if link(major_tail)<>null then
<       begin hyf[j]:=0; major_tail:=link(major_tail); {kern present}
<       end;
<     end;
---
>     @<Append characters of |hu[j..@,]| to |major_tail|, advancing~|j|@>;
17339,17344c17843,17847
< @ @<Move pointer |s| to the end of the current list...@>=
< i:=0; s:=r;
< while link(s)<>null do
<   begin incr(i); s:=link(s);
<   end;
< replace_count(r):=i
---
> @ @<Append characters of |hu[j..@,]|...@>=
> begin j:=reconstitute(j,hn,bchar,non_char)+1;
> link(major_tail):=link(hold_head);
> while link(major_tail)>null do advance_major_tail;
> end
17346c17849,17851
< @ At this point |link(s)| is |null|.
---
> @ Ligature insertion can cause a word to grow exponentially in size. Therefore
> we must test the size of |r_count| here, even though the hyphenated text
> was at most 63 characters long.
17348,17352c17853,17861
< @<Insert a discretionary hyphen after |s|@>=
< begin r:=new_disc; pre_break(r):=new_character(hf,hyf_char);
< link(s):=r; s:=r;
< end
< @* \[42] Hyphenation.
---
> @<Move pointer |s| to the end of the current list...@>=
> if r_count>127 then {we have to forget the discretionary hyphen}
>   begin link(s):=link(r); link(r):=null; flush_node_list(r);
>   end
> else begin link(s):=r; replace_count(r):=r_count;
>   end;
> s:=major_tail
> 
> @* \[42] Hyphenation.
17381c17890,17891
< $p_1\ldots p_k$ by setting |@t$z_1$@>:=@t$p_1$@>| and then, for |1<i<=k|,
---
> $p_1\ldots p_k$ by letting $z_0$ be one greater than the relevant language
> index and then, for |1<=i<=k|,
17394c17904,17905
< the letters in |hc[(l-k+1)..l@,]|, we perform all of the required operations
---
> the letters in |hc[(l-k+1)..l@,]| of language |t|,
> we perform all of the required operations
17396,17398c17907,17909
< |v:=trie_op(@t$z_k$@>)|. Then set |hyf[l-hyf_distance[v]]:=@tmax@>(
< hyf[l-hyf_distance[v]], hyf_num[v])|, and |v:=hyf_next[v]|; repeat, if
< necessary, until |v=min_quarterword|.
---
> |v:=trie_op(@t$z_k$@>)|. Then set |v:=v+op_start[t]|,
> |hyf[l-hyf_distance[v]]:=@tmax@>(hyf[l-hyf_distance[v]], hyf_num[v])|,
> and |v:=hyf_next[v]|; repeat, if necessary, until |v=min_quarterword|.
17409,17411c17920,17923
< @!hyf_distance:array[quarterword] of small_number; {position |k-j| of $n_j$}
< @!hyf_num:array[quarterword] of small_number; {value of $n_j$}
< @!hyf_next:array[quarterword] of quarterword; {continuation of this |trie_op|}
---
> @!hyf_distance:array[1..trie_op_size] of small_number; {position |k-j| of $n_j$}
> @!hyf_num:array[1..trie_op_size] of small_number; {value of $n_j$}
> @!hyf_next:array[1..trie_op_size] of quarterword; {continuation code}
> @!op_start:array[ASCII_code] of 0..trie_op_size; {offset for current language}
17415c17927
< @!v:quarterword; {an index into |hyf_distance|, etc.}
---
> @!v:integer; {an index into |hyf_distance|, etc.}
17419c17931
< to the impossible value 128, in order to guarantee that |hc[hn+3]| will
---
> to the impossible value 256, in order to guarantee that |hc[hn+3]| will
17422c17934
< @<Find hyphen locations for the word in |hc|@>=
---
> @<Find hyphen locations for the word in |hc|...@>=
17426,17429c17938,17942
< hc[0]:=127; hc[hn+1]:=127; hc[hn+2]:=128; {insert delimiters}
< for j:=0 to hn-2 do
<   begin z:=hc[j]; l:=j;
<   while hc[l]=trie_char(z) do
---
> if trie_char(cur_lang+1)<>qi(cur_lang) then return; {no patterns for |cur_lang|}
> hc[0]:=0; hc[hn+1]:=0; hc[hn+2]:=256; {insert delimiters}
> for j:=0 to hn-r_hyf do
>   begin z:=trie_link(cur_lang+1)+hc[j]; l:=j;
>   while hc[l]=qo(trie_char(z)) do
17435c17948,17949
< found: hyf[1]:=0; hyf[hn-2]:=0; hyf[hn-1]:=0; hyf[hn]:=0
---
> found: for j:=0 to l_hyf do hyf[j]:=0;
> for j:=0 to r_hyf do hyf[hn-j]:=0
17439c17953
< repeat i:=l-hyf_distance[v];
---
> repeat v:=v+op_start[cur_lang]; i:=l-hyf_distance[v];
17489c18003,18004
< find the word or we don't.
---
> find the word or we don't. Words from different languages are kept
> separate by appending the language code to the string.
17492c18007
< h:=hc[1];
---
> h:=hc[1]; incr(hn); hc[hn]:=cur_lang;
17499c18014
< not_found:
---
> not_found: decr(hn)
17506,17507c18021,18022
<   repeat if str_pool[u]<hc[j] then goto not_found;
<   if str_pool[u]>hc[j] then goto done;
---
>   repeat if so(str_pool[u])<hc[j] then goto not_found;
>   if so(str_pool[u])>hc[j] then goto done;
17511c18026
<   goto found;
---
>   decr(hn); goto found;
17534a18050,18053
> @d set_cur_lang==if language<=0 then cur_lang:=0
>   else if language>255 then cur_lang:=0
>   else cur_lang:=language
> 
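Reviewer note on the hunk above: `set_cur_lang` maps any out-of-range `language` value to language 0. A minimal Python restatement, with a hypothetical function name used only for illustration:

```python
def set_cur_lang(language):
    """Return the hyphenation table index for a given \\language value.

    Values outside 1..255 fall back to 0, as in the macro added by the diff.
    """
    if language <= 0 or language > 255:
        return 0
    return language
```

The result is always in the range 0..255, so it fits the `ASCII_code` type that the diff declares for `cur_lang`.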
17536c18055
< label reswitch, exit, found, not_found, done;
---
> label reswitch, exit, found, not_found;
17545a18065
> set_cur_lang;
17558c18078
<   spacer,right_brace: begin if n>4 then @<Enter a hyphenation exception@>;
---
>   spacer,right_brace: begin if n>1 then @<Enter a hyphenation exception@>;
17577c18097
< else  begin if (cur_chr>127)or(lc_code(cur_chr)=0) then
---
> else  begin if lc_code(cur_chr)=0 then
17585c18105
<     begin incr(n); hc[n]:=lc_code(cur_chr)-1;
---
>     begin incr(n); hc[n]:=lc_code(cur_chr);
17590c18110
< begin if n>1 then
---
> begin if n<63 then
17596c18116
< begin str_room(n); h:=0;
---
> begin incr(n); hc[n]:=cur_lang; str_room(n); h:=0;
17602,17606c18122
< loop@+  begin if p=null then goto done;
<   if info(p)<n-2 then goto done;
<   q:=link(p); free_avail(p); p:=q; {eliminate hyphens that \TeX\ doesn't like}
<   end;
< done: @<Insert the \(p)pair |(s,p)| into the exception table@>;
---
> @<Insert the \(p)pair |(s,p)| into the exception table@>;
17632c18148,18149
< @* \[43] Initializing the hyphenation tables.
---
> 
> @* \[43] Initializing the hyphenation tables.
17639,17641c18156,18159
< The initialization first builds a trie that is linked instead of packed
< into sequential storage, so that insertions are readily made. Then it
< compresses the linked trie by identifying common subtries, and finally the
---
> The first step is to build a trie that is linked, instead of packed
> into sequential storage, so that insertions are readily made.
> After all patterns have been processed, \.{INITEX}
> compresses the linked trie by identifying common subtries. Finally the
17645c18163,18164
< @p @!init @<Declare procedures for preprocessing hyphenation patterns@>@;
---
> @<Declare subprocedures for |line_break|@>=
> @!init @<Declare procedures for preprocessing hyphenation patterns@>@;
17665c18184
< three have not appeared before.
---
> three have not appeared before for the current language.
17668a18188,18189
> Statistics printed during a dump make it possible for users to tell
> if this has happened.
17670,17672d18190
< @d quarterword_diff=max_quarterword-min_quarterword
< @d trie_op_hash_size=quarterword_diff+quarterword_diff {double}
< 
17674,17677c18192,18201
< @!init@! trie_op_hash:array[0..trie_op_hash_size] of quarterword;
<   {trie op codes for triples}
< tini@;@/
< @t\hskip1em@>@!trie_op_ptr:quarterword; {highest |trie_op| assigned}
---
> @!init@! trie_op_hash:array[-trie_op_size..trie_op_size] of 0..trie_op_size;
>   {trie op codes for quadruples}
> @!trie_used:array[ASCII_code] of quarterword;
>   {largest opcode used so far for this language}
> @!trie_op_lang:array[1..trie_op_size] of ASCII_code;
>   {language part of a hashed quadruple}
> @!trie_op_val:array[1..trie_op_size] of quarterword;
>   {opcode corresponding to a hashed quadruple}
> @!trie_op_ptr:0..trie_op_size; {number of stored ops so far}
> tini
17679,17683c18203,18208
< @ The hash function used by |new_trie_op| is based on the observation that
< 313/510 is an approximation to the golden ratio [cf.\ {\sl The Art of
< Computer Programming \bf3} (1973), 510--512]; |trie_op_hash_size| is
< usually a multiple of 510.  But the choice is comparatively unimportant in
< this particular application.
---
> @ It's tempting to remove the |overflow| stops in the following procedure;
> |new_trie_op| could return |min_quarterword| (thereby simply ignoring
> part of a hyphenation pattern) instead of aborting the job. However, that would
> lead to different hyphenation results on different installations of \TeX\
> using the same patterns. The |overflow| stops are necessary for portability
> of patterns.
17688c18213
< var h:0..trie_op_hash_size; {trial hash location}
---
> var h:-trie_op_size..trie_op_size; {trial hash location}
17690,17696c18215,18227
< begin h:=abs(n+313*d+361*v) mod trie_op_hash_size;
< loop@+  begin u:=trie_op_hash[h];
<   if u=min_quarterword then {empty position found}
<     begin if trie_op_ptr=max_quarterword then {overflow}
<       begin new_trie_op:=min_quarterword; return;
<       end;
<     incr(trie_op_ptr); hyf_distance[trie_op_ptr]:=d;
---
> @!l:0..trie_op_size; {pointer to stored data}
> begin h:=abs(n+313*d+361*v+1009*cur_lang) mod (trie_op_size+trie_op_size)
>   - trie_op_size;
> loop@+  begin l:=trie_op_hash[h];
>   if l=0 then {empty position found for a new op}
>     begin if trie_op_ptr=trie_op_size then
>       overflow("pattern memory ops",trie_op_size);
>     u:=trie_used[cur_lang];
>     if u=max_quarterword then
>       overflow("pattern memory ops per language",
>         max_quarterword-min_quarterword);
>     incr(trie_op_ptr); incr(u); trie_used[cur_lang]:=u;
>     hyf_distance[trie_op_ptr]:=d;
17698,17699c18229,18230
<     trie_op_hash[h]:=trie_op_ptr;
<     new_trie_op:=trie_op_ptr; return;
---
>     trie_op_lang[trie_op_ptr]:=cur_lang; trie_op_hash[h]:=trie_op_ptr;
>     trie_op_val[trie_op_ptr]:=u; new_trie_op:=u; return;
17701,17702c18232,18234
<   if (hyf_distance[u]=d)and(hyf_num[u]=n)and(hyf_next[u]=v) then
<     begin new_trie_op:=u; return;
---
>   if (hyf_distance[l]=d)and(hyf_num[l]=n)and(hyf_next[l]=v)
>    and(trie_op_lang[l]=cur_lang) then
>     begin new_trie_op:=trie_op_val[l]; return;
17704c18236
<   if h>0 then decr(h)@+else h:=trie_op_hash_size;
---
>   if h>-trie_op_size then decr(h)@+else h:=trie_op_size;
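Editorial sketch (not part of either version of the source): the revised |new_trie_op| shown in the hunks above hashes a quadruple (distance, number, next op, language), probes downward with wraparound, and hands out opcodes that are numbered separately per language. The Python below is a rough model under stated assumptions: `make_op_table`, the tuple layout, and the default sizes are hypothetical, |min_quarterword| is taken to be 0, and Python exceptions stand in for |overflow|.

```python
def make_op_table(trie_op_size=500, max_ops_per_lang=255):
    """Return a new_trie_op-like function plus the storage it fills."""
    # Hash slots are indexed -size..size; a 0 entry means "empty".
    slots = {h: 0 for h in range(-trie_op_size, trie_op_size + 1)}
    ops = [None]   # ops[l] = (d, n, v, lang, opcode); index 0 is unused
    used = {}      # language -> largest opcode handed out so far

    def new_trie_op(d, n, v, lang):
        h = (abs(n + 313 * d + 361 * v + 1009 * lang)
             % (2 * trie_op_size)) - trie_op_size
        while True:
            l = slots[h]
            if l == 0:  # empty position found for a new op
                if len(ops) - 1 == trie_op_size:
                    raise OverflowError("pattern memory ops")
                u = used.get(lang, 0)
                if u == max_ops_per_lang:
                    raise OverflowError("pattern memory ops per language")
                u += 1
                used[lang] = u
                ops.append((d, n, v, lang, u))
                slots[h] = len(ops) - 1
                return u
            if ops[l][:4] == (d, n, v, lang):  # quadruple already stored
                return ops[l][4]
            # linear probing, wrapping from the bottom slot to the top
            h = h - 1 if h > -trie_op_size else trie_op_size

    return new_trie_op, ops, used
```

The returned opcode is the per-language count, not the global storage index, which is why two languages can both receive opcode 1 for different quadruples.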
17707a18240,18264
> @ After |new_trie_op| has compressed the necessary opcode information,
> plenty of information is available to unscramble the data into the
> final form needed by our hyphenation algorithm.
> 
> @<Sort \(t)the hyphenation op tables into proper order@>=
> op_start[0]:=-min_quarterword;
> for j:=1 to 255 do op_start[j]:=op_start[j-1]+qo(trie_used[j-1]);
> for j:=1 to trie_op_ptr do
>   trie_op_hash[j]:=op_start[trie_op_lang[j]]+trie_op_val[j]; {destination}
> for j:=1 to trie_op_ptr do while trie_op_hash[j]>j do
>   begin k:=trie_op_hash[j];@/
>   t:=hyf_distance[k]; hyf_distance[k]:=hyf_distance[j]; hyf_distance[j]:=t;@/
>   t:=hyf_num[k]; hyf_num[k]:=hyf_num[j]; hyf_num[j]:=t;@/
>   t:=hyf_next[k]; hyf_next[k]:=hyf_next[j]; hyf_next[j]:=t;@/
>   trie_op_hash[j]:=trie_op_hash[k]; trie_op_hash[k]:=k;
>   end
> 
> @ Before we forget how to initialize the data structures that have been
> mentioned so far, let's write down the code that gets them started.
> 
> @<Initialize table entries...@>=
> for k:=-trie_op_size to trie_op_size do trie_op_hash[k]:=0;
> for k:=0 to 255 do trie_used[k]:=min_quarterword;
> trie_op_ptr:=0;
> 
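Editorial sketch (not part of either version of the source): the section "Sort the hyphenation op tables into proper order" in the hunk above first computes a destination for every stored op (the start offset of its language plus its per-language opcode) and then permutes the parallel arrays in place by following cycles. The Python below models both steps; the function names are hypothetical, index 0 of each table is unused to match the 1-based arrays, and |min_quarterword| is assumed to be 0.

```python
def destinations(lang, val, used):
    """dest[j] = op_start[lang[j]] + val[j] for 1-indexed tables."""
    op_start = [0] * len(used)
    for j in range(1, len(used)):
        op_start[j] = op_start[j - 1] + used[j - 1]
    return [0] + [op_start[lang[j]] + val[j] for j in range(1, len(lang))]


def sort_ops_in_place(dest, *tables):
    """Move the entry at j to position dest[j] in every table.

    dest is consumed: on return dest[j] == j for all j >= 1.
    """
    for j in range(1, len(dest)):
        while dest[j] > j:
            k = dest[j]
            for t in tables:
                t[j], t[k] = t[k], t[j]   # entry j is now at its home k
            dest[j], dest[k] = dest[k], k  # j inherits k's old destination
```

For example, with ops stored in the order (language 1, op 1), (language 0, op 1), (language 0, op 2) and `used = [2, 1]`, `destinations` yields `[0, 3, 1, 2]`, so the language-0 ops end up in positions 1 and 2 and the language-1 op in position 3.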
17726,17727c18283,18285
< @!init @!trie_c:packed array[trie_pointer] of ASCII_code; {characters to match}
< @t\hskip1em@>@!trie_o:packed array[trie_pointer] of quarterword;
---
> @!init @!trie_c:packed array[trie_pointer] of packed_ASCII_code;
>   {characters to match}
> @t\hskip10pt@>@!trie_o:packed array[trie_pointer] of quarterword;
17729c18287
< @t\hskip1em@>@!trie_l:packed array[trie_pointer] of trie_pointer;
---
> @t\hskip10pt@>@!trie_l:packed array[trie_pointer] of trie_pointer;
17731c18289
< @t\hskip1em@>@!trie_r:packed array[trie_pointer] of trie_pointer;
---
> @t\hskip10pt@>@!trie_r:packed array[trie_pointer] of trie_pointer;
17733c18291,18293
< @t\hskip1em@>@!trie_ptr:trie_pointer; {the number of nodes in the trie}
---
> @t\hskip10pt@>@!trie_ptr:trie_pointer; {the number of nodes in the trie}
> @t\hskip10pt@>@!trie_hash:packed array[trie_pointer] of trie_pointer;
>   {used to identify equivalent subtries}
17742,17746c18302
< @<Glob...@>=
< @!init @!trie_hash:packed array[trie_pointer] of trie_pointer;
< tini  {to identify equivalent subtries}
< 
< @ The function |trie_node(p)| returns |p| if |p| is distinct from other nodes
---
> The function |trie_node(p)| returns |p| if |p| is distinct from other nodes
17749a18306,18309
> Notice that we might make subtries equivalent even if they correspond to
> patterns for different languages, in which the trie ops might mean quite
> different things. That's perfectly all right.
> 
17785,17796d18344
< @ Before we forget how to initialize the data structures that have been
< mentioned so far, let's write a procedure that does the initialization.
< 
< @<Declare procedures for preprocessing hyph...@>=
< procedure init_pattern_memory; {gets ready to build a linked trie}
< var h:0..trie_op_hash_size; {an index into |trie_op_hash|}
< @!p:trie_pointer; {an index into |trie_hash|}
< begin for h:=0 to trie_op_hash_size do trie_op_hash[h]:=min_quarterword;
< trie_op_ptr:=min_quarterword; trie_root:=0; trie_c[0]:=0; trie_ptr:=0;
< for p:=0 to trie_size do trie_hash[p]:=0;
< end;
< 
17801c18349,18350
< |trie_ref[p]| will be nonzero if the linked trie node |p| is the oldest sibling
---
> |trie_ref[p]| will be nonzero only if the linked trie node |p| is the
> smallest character
17808a18358,18359
> To save time at the low end of the trie, we maintain array entries
> |trie_min[c]| pointing to the smallest hole that is greater than~|c|.
17818c18369
< @!init@!trie_taken:packed array[trie_pointer] of boolean;
---
> @!init@!trie_taken:packed array[1..trie_size] of boolean;
17820,17823c18371,18375
< @t\hskip1em@>@!trie_min:trie_pointer;
<   {all locations |<=trie_min| are vacant in |trie|}
< tini@;@/
< @t\hskip1em@>@!trie_max:trie_pointer; {largest location used in |trie|}
---
> @t\hskip10pt@>@!trie_min:array[ASCII_code] of trie_pointer;
>   {the first possible slot for each character}
> @t\hskip10pt@>@!trie_max:trie_pointer; {largest location used in |trie|}
> @t\hskip10pt@>@!trie_not_ready:boolean; {is the trie still in linked form?}
> tini
17825c18377,18381
< @ Here is how these data structures are initialized.
---
> @ Each time \.{\\patterns} appears, it contributes further patterns to
> the future trie, which will be built only when hyphenation is attempted or
> when a format file is dumped. The boolean variable |trie_not_ready|
> will change to |false| when the trie is compressed; this will disable
> further patterns.
17827,17836c18383,18384
< @<Declare procedures for preprocessing hyph...@>=
< procedure init_trie_memory; {gets ready to pack into |trie|}
< var p:trie_pointer; {index into |trie_ref|, |trie|, |trie_taken|}
< begin for p:=0 to trie_ptr do trie_ref[p]:=0;
< trie_max:=128; trie_min:=128; trie_link(0):=1; trie_taken[0]:=false;
< trie_link(trie_size):=0; trie_back(0):=trie_size; {wrap around}
< for p:=1 to 128 do
<   begin trie_back(p):=p-1; trie_link(p):=p+1; trie_taken[p]:=false;
<   end;
< end;
---
> @<Initialize table entries...@>=
> trie_not_ready:=true; trie_root:=0; trie_c[0]:=si(0); trie_ptr:=0;
17838,17842c18386,18389
< @ Each time \.{\\patterns} appears, it overrides any patterns that were
< entered earlier, so the arrays are not initialized until \TeX\ sees
< \.{\\patterns}. However, some of the global variables must be
< initialized when \.{INITEX} is loaded, in case the user never mentions
< any \.{\\patterns}.
---
> @ Here is how the trie-compression data structures are initialized.
> If storage is tight, it would be possible to overlap |trie_op_hash|,
> |trie_op_lang|, and |trie_op_val| with |trie|, |trie_hash|, and |trie_taken|,
> because we finish with the former just before we need the latter.
17844,17848c18391,18397
< @<Initialize table entries...@>=
< trie_op_ptr:=min_quarterword;@/
< trie_link(0):=0; trie_char(0):=0; trie_op(0):=min_quarterword;
< for k:=1 to 127 do trie[k]:=trie[0];
< trie_max:=127;
---
> @<Get ready to compress the trie@>=
> @<Sort \(t)the hyphenation...@>;
> for p:=0 to trie_size do trie_hash[p]:=0;
> trie_root:=compress_trie(trie_root); {identify equivalent subtries}
> for p:=0 to trie_ptr do trie_ref[p]:=0;
> for p:=0 to 255 do trie_min[p]:=p+1;
> trie_link(0):=1; trie_max:=0
17864,17870c18413,18418
< begin c:=trie_c[p];
< if c<trie_min then trie_min:=c;
< if trie_min=0 then z:=trie_link(trie_size)
< else z:=trie_link(trie_min-1); {get the first conceivably good hole}
< loop@+  begin if z<c then goto not_found;
<   h:=z-c;@/
<   @<Ensure that |trie_max>=h+128|@>;
---
> @!l,@!r:trie_pointer; {left and right neighbors}
> @!ll:1..256; {upper limit of |trie_min| updating}
> begin c:=so(trie_c[p]);
> z:=trie_min[c]; {get the first conceivably good hole}
> loop@+  begin h:=z-c;@/
>   @<Ensure that |trie_max>=h+256|@>;
17879c18427
< @ By making sure that |trie_max| is at least |h+128|, we can be sure that
---
> @ By making sure that |trie_max| is at least |h+256|, we can be sure that
17883,17885c18431,18433
< @<Ensure that |trie_max>=h+128|@>=
< if trie_max<h+128 then
<   begin if trie_size<=h+128 then overflow("pattern memory",trie_size);
---
> @<Ensure that |trie_max>=h+256|@>=
> if trie_max<h+256 then
>   begin if trie_size<=h+256 then overflow("pattern memory",trie_size);
17889c18437
<   until trie_max=h+128;
---
>   until trie_max=h+256;
17895c18443
<   begin if trie_link(h+trie_c[q])=0 then goto not_found;
---
>   begin if trie_link(h+so(trie_c[q]))=0 then goto not_found;
17902,17903c18450,18457
< repeat z:=h+trie_c[q]; trie_back(trie_link(z)):=trie_back(z);
< trie_link(trie_back(z)):=trie_link(z); trie_link(z):=0; q:=trie_r[q];
---
> repeat z:=h+so(trie_c[q]); l:=trie_back(z); r:=trie_link(z);
> trie_back(r):=l; trie_link(l):=r; trie_link(z):=0;
> if l<256 then
>   begin if z<256 then ll:=z @+else ll:=256;
>   repeat trie_min[l]:=r; incr(l);
>   until l=ll;
>   end;
> q:=trie_r[q];
17922,17927c18476,18477
< information. Null pointers in the linked trie will be replaced by the
< first untaken position |r| in |trie|, since this properly implements an
< ``empty'' family. The value of |r| is stored in |trie_ref[0]| just before
< the fixup process starts. Note that |trie_max| will always be at least as
< large as |r+127|, since it is always at least 128 more than each location
< that is taken.
---
> information. Null pointers in the linked trie will be represented by the
> value~0, which properly implements an ``empty'' family.
17930,17933c18480,18491
< r:=0;
< while trie_taken[r] do incr(r);
< trie_ref[0]:=r; {|r| will be used for null pointers}
< trie_fix(trie_root) {this fixes the non-holes in |trie|}
---
> h.rh:=0; h.b0:=min_quarterword; h.b1:=min_quarterword; {|trie_link:=0|,
>   |trie_op:=min_quarterword|, |trie_char:=qi(0)|}
> if trie_root=0 then {no patterns were given}
>   begin for r:=0 to 256 do trie[r]:=h;
>   trie_max:=256;
>   end
> else begin trie_fix(trie_root); {this fixes the non-holes in |trie|}
>   r:=0; {now we will zero out all the holes}
>   repeat s:=trie_link(r); trie[r]:=h; r:=s;
>   until r>trie_max;
>   end;
> trie_char(0):=qi("?"); {make |trie_char(c)<>c| for all |c|}
17947,17952c18505,18509
< while p<>0 do
<   begin q:=trie_l[p]; c:=trie_c[p];
<   trie_link(z+c):=trie_ref[q]; trie_char(z+c):=c; trie_op(z+c):=trie_o[p];
<   if q>0 then trie_fix(q);
<   p:=trie_r[p];
<   end;
---
> repeat q:=trie_l[p]; c:=so(trie_c[p]);
> trie_link(z+c):=trie_ref[q]; trie_char(z+c):=qi(c); trie_op(z+c):=trie_o[p];
> if q>0 then trie_fix(q);
> p:=trie_r[p];
> until p=0;
17955,17960c18512,18514
< @ Now let's put all these routines together. When \.{INITEX} has scanned
< the `\.{\\patterns}' control sequence, it calls on |new_patterns| to do
< the right thing. After |new_patterns| has acted, the compacted pattern data
< will appear in the array |trie[1..trie_max]|, and the associated numeric
< hyphenation data will appear in locations |[(min_quarterword+1)..trie_op_ptr]|
< of the arrays |hyf_distance|, |hyf_num|, |hyf_next|.
---
> @ Now let's go back to the easier problem, of building the linked
> trie.  When \.{INITEX} has scanned the `\.{\\patterns}' control
> sequence, it calls on |new_patterns| to do the right thing.
17971,17975c18525,18527
< @!r,@!s:trie_pointer; {used to clean up the packed |trie|}
< @!h:two_halves; {template used to zero out |trie|'s holes}
< begin scan_left_brace; {a left brace must follow \.{\\patterns}}
< init_pattern_memory;@/
< @<Enter all of the patterns into a linked trie, until coming to a right
---
> begin if trie_not_ready then
>   begin set_cur_lang; scan_left_brace; {a left brace must follow \.{\\patterns}}
>   @<Enter all of the patterns into a linked trie, until coming to a right
17977,17978c18529,18533
< trie_root:=compress_trie(trie_root); {compress the trie}
< @<Pack the trie@>;
---
>   end
> else begin print_err("Too late for "); print_esc("patterns");
>   help1("All patterns must be given before typesetting begins.");
>   error; link(garbage):=scan_toks(false,false); flush_list(def_ref);
>   end;
18005c18560
<   begin if cur_chr="." then cur_chr:=128 {edge-of-word delimiter}
---
>   begin if cur_chr="." then cur_chr:=0 {edge-of-word delimiter}
18010c18565
<       help1("(See Appendix H.)"); error; cur_chr:=128;
---
>       help1("(See Appendix H.)"); error;
18014c18569
<     begin incr(k); hc[k]:=cur_chr-1; hyf[k]:=0; digit_sensed:=false;
---
>     begin incr(k); hc[k]:=cur_chr; hyf[k]:=0; digit_sensed:=false;
18017,18018c18572,18573
< else  begin hyf[k]:=cur_chr-"0";
<   if k<63 then digit_sensed:=true;
---
> else if k<63 then
>   begin hyf[k]:=cur_chr-"0"; digit_sensed:=true;
18027,18030c18582,18585
< q:=0;
< while l<k do
<   begin incr(l); c:=hc[l]; p:=trie_l[q]; first_child:=true;
<   while (p>0)and(c>trie_c[p]) do
---
> q:=0; hc[0]:=cur_lang;
> while l<=k do
>   begin c:=hc[l]; incr(l); p:=trie_l[q]; first_child:=true;
>   while (p>0)and(c>so(trie_c[p])) do
18033c18588
<   if (p=0)or(c<trie_c[p]) then
---
>   if (p=0)or(c<so(trie_c[p])) then
18051c18606
< trie_c[p]:=c; trie_o[p]:=min_quarterword;
---
> trie_c[p]:=si(c); trie_o[p]:=min_quarterword;
18055,18056c18610,18611
< if hc[1]=127 then hyf[0]:=0;
< if hc[k]=127 then hyf[k]:=0;
---
> if hc[1]=0 then hyf[0]:=0;
> if hc[k]=0 then hyf[k]:=0;
18063,18064c18618,18621
< @ The following packing routine is rigged so that the root of the linked
< tree gets mapped into location 0 of |trie|, as required by the hyphenation
---
> @ Finally we put everything together: Here is how the trie gets to its
> final, efficient form.
> The following packing routine is rigged so that the root of the linked
> tree gets mapped into location 1 of |trie|, as required by the hyphenation
18066c18623
< ``take'' location~0.
---
> ``take'' location~1.
18068,18069c18625,18631
< @<Pack the trie@>=
< init_trie_memory;
---
> @<Declare procedures for preprocessing hyphenation patterns@>=
> procedure init_trie;
> var @!p:trie_pointer; {pointer for initialization}
> @!j,@!k,@!t:integer; {all-purpose registers for initialization}
> @!r,@!s:trie_pointer; {used to clean up the packed |trie|}
> @!h:two_halves; {template used to zero out |trie|'s holes}
> begin @<Get ready to compress the trie@>;
18074,18079c18636,18639
< r:=trie_size; {finally, we will zero out the holes}
< h.rh:=0; h.b0:=min_quarterword; h.b1:=0; {|trie_link:=0|,
<   |trie_op:=min_quarterword|, |trie_char:=0|}
< repeat s:=trie_link(r); trie[r]:=h; r:=s;
< until r>trie_max
< @* \[44] Breaking vertical lists into pages.
---
> trie_not_ready:=false;
> end;
> 
> @* \[44] Breaking vertical lists into pages.
18333c18893,18894
< @* \[45] The page builder.
---
> 
> @* \[45] The page builder.
18786c19347
<   @!stat if tracing_pages>0 then @<Display page break cost@>;@+tats@;@/
---
>   @!stat if tracing_pages>0 then @<Display the page break cost@>;@+tats@;@/
18803c19364
< @ @<Display page break cost@>=
---
> @ @<Display the page break cost@>=
18896c19457
< @!stat if tracing_pages>0 then @<Display insertion split cost@>;@+tats@;@/
---
> @!stat if tracing_pages>0 then @<Display the insertion split cost@>;@+tats@;@/
18905c19466
< @ @<Display insertion split cost@>=
---
> @ @<Display the insertion split cost@>=
18977c19538,19539
< @<Prepare all the boxes involved in insertions to act as queues@>;
---
> if holding_inserts<=0 then
>   @<Prepare all the boxes involved in insertions to act as queues@>;
18980,18983c19542,19547
<   begin if type(p)=ins_node then @<Either insert the material
<     specified by node |p| into the appropriate box, or
<     hold it for the next page; also delete node |p| from
<     the current page@>
---
>   begin if type(p)=ins_node then
>     begin if holding_inserts<=0 then
>        @<Either insert the material specified by node |p| into the
>          appropriate box, or hold it for the next page;
>          also delete node |p| from the current page@>;
>     end
18991c19555
< @<Delete the page-insertion nodes@>
---
> @<Delete \(t)the page-insertion nodes@>
19048c19612
< r:=link(page_ins_head);
---
> begin r:=link(page_ins_head);
19058c19622,19623
<   end
---
>   end;
> end
19060c19625
< @ @<Delete the page-insertion nodes@>=
---
> @ @<Delete \(t)the page-insertion nodes@>=
19194c19759,19760
< @* \[46] The chief executive.
---
> 
> @* \[46] The chief executive.
19240,19244c19806,19812
< @d main_loop=70 {go here to typeset |cur_chr| in the current font}
< @d main_loop_1=71 {like |main_loop|, but |(f,c)| = current font and char}
< @d main_loop_2=72 {like |main_loop_1|, but |c| is known to be in range}
< @d main_loop_3=73 {like |main_loop_2|, but several variables are set up}
< @d append_normal_space=74 {go here to append a normal space between words}
---
> @d main_loop=70 {go here to typeset a string of consecutive characters}
> @d main_loop_wrapup=80 {go here to finish a character or ligature}
> @d main_loop_move=90 {go here to advance the ligature cursor}
> @d main_loop_move_lig=95 {same, when advancing past a generated ligature}
> @d main_loop_lookahead=100 {go here to bring in another character, if any}
> @d main_lig_loop=110 {go here to check for ligatures or kerning}
> @d append_normal_space=120 {go here to append a normal space between words}
19249c19817,19820
< label big_switch,reswitch,main_loop,main_loop_1,main_loop_2,main_loop_3,
---
> label big_switch,reswitch,main_loop,main_loop_wrapup,
>   main_loop_move,main_loop_move+1,main_loop_move+2,main_loop_move_lig,
>   main_loop_lookahead,main_loop_lookahead+1,
>   main_lig_loop,main_lig_loop+1,main_lig_loop+2,
19251,19252c19822
< var t:integer; {general-purpose temporary variable}
< @<Local variables for the inner loop of |main_control|@>@;
---
> var@!t:integer; {general-purpose temporary variable}
19258c19828,19832
< hmode+char_num: begin scan_char_num; cur_chr:=cur_val; goto main_loop;
---
> hmode+char_num: begin scan_char_num; cur_chr:=cur_val; goto main_loop;@+end;
> hmode+no_boundary: begin get_x_token;
>   if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given)or
>    (cur_cmd=char_num) then cancel_boundary:=true;
>   goto reswitch;
19284,19288c19858,19862
< @ In the following program, |l| is the current character or ligature;
< it might grow into a longer ligature. One or more characters has been
< used to define |l|, and the last of these was |c|. The chief use of |c|
< will be to modify |space_factor| and to insert discretionary nodes after
< explicit hyphens in the text.
---
> @ The following part of the program was first written in a structured
> manner, according to the philosophy that ``premature optimization is
> the root of all evil.'' Then it was rearranged into pieces of
> spaghetti so that the most common actions could proceed with little or
> no redundancy.
19290,19301c19864,19873
< @<Local variables for the inner loop of |main_control|@>=
< @!l:quarterword; {the current character or ligature}
< @!c:eight_bits; {the most recent character}
< @!f:internal_font_number; {the current font}
< @!r:halfword; {the next character for ligature/kern matching}
< @!p:pointer; {the current |char_node|}
< @!k:0..font_mem_size; {index into |font_info|}
< @!q:pointer; {where a ligature should be detached}
< @!i:four_quarters; {character information bytes for |l|}
< @!j:four_quarters; {ligature/kern command}
< @!s:integer; {space factor code}
< @!ligature_present:boolean; {should a ligature node be made?}
---
> The original unoptimized form of this algorithm resembles the
> |reconstitute| procedure, which was described earlier in connection with
> hyphenation. Again we have an implied ``cursor'' between characters
> |cur_l| and |cur_r|. The main difference is that the |lig_stack| can now
> contain a charnode as well as pseudo-ligatures; that stack is now
> usually nonempty, because the next character of input (if any) has been
> appended to it. In |main_control| we have
> $$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr
>   |font_bchar[cur_font]|,&otherwise.\cr}$$
> Several additional global variables are needed.
19303,19314c19875,19885
< @ @<Append character |cur_chr| and the following characters...@>=
< f:=cur_font; c:=cur_chr;
< main_loop_1: if (c<font_bc[f])or(c>font_ec[f]) then
<   begin char_warning(f,c); goto big_switch;
<   end;
< main_loop_2: q:=tail; ligature_present:=false; l:=qi(c);
< main_loop_3: @<Adjust \(t)the space factor,
<   based on its current value and |c|@>;
< @<Append character |l| and the following characters (if any) to the current
<   hlist, in font |f|; if |ligature_present|, detach a ligature node
<   starting at |link(q)|; if |c| is a hyphen, append a null |disc_node|;
<   finally |goto reswitch|@>
---
> @<Glob...@>=
> @!main_f:internal_font_number; {the current font}
> @!main_i:four_quarters; {character information bytes for |cur_l|}
> @!main_j:four_quarters; {ligature/kern command}
> @!main_k:font_index; {index into |font_info|}
> @!main_p:pointer; {temporary register for list manipulation}
> @!main_s:integer; {space factor value}
> @!bchar:halfword; {right boundary character of current font, or |non_char|}
> @!false_bchar:halfword; {nonexistent character matching |bchar|, or |non_char|}
> @!cancel_boundary:boolean; {should the left boundary be ignored?}
> @!ins_disc:boolean; {should we insert a discretionary node?}
19316,19320c19887,19889
< @ We leave |space_factor| unchanged if |sf_code(c)=0|; otherwise we set it
< to |sf_code(c)|, except that the space factor never changes from a value
< less than 1000 to a value exceeding 1000. If |c>=128|, its |sf_code| is
< implicitly 1000. The most common case is |sf_code(c)=1000|, so we want
< that case to be fast.
---
> @ The boolean variables of the main loop are normally false, and always reset
> to false before the loop is left. That saves us the extra work of initializing
> each time.
19322,19327c19891,19907
< @<Adjust \(t)the space factor...@>=
< if c<128 then
<   begin s:=sf_code(c);
<   if s=1000 then space_factor:=1000
<   else if s<1000 then
<     begin if s>0 then space_factor:=s;
---
> @<Set init...@>=
> ligature_present:=false; cancel_boundary:=false; lft_hit:=false; rt_hit:=false;
> ins_disc:=false;
> 
> @ We leave |space_factor| unchanged if |sf_code(cur_chr)=0|; otherwise we
> set it to |sf_code(cur_chr)|, except that the space factor never changes
> from a value less than 1000 to a value exceeding 1000. The most common
> case is |sf_code(cur_chr)=1000|, so we want that case to be fast.
> 
> The overall structure of the main loop is presented here. Some program labels
> are inside the individual sections.
> 
> @d adjust_space_factor==@t@>@;@/
>   main_s:=sf_code(cur_chr);
>   if main_s=1000 then space_factor:=1000
>   else if main_s<1000 then
>     begin if main_s>0 then space_factor:=main_s;
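Editorial sketch (not part of either version of the source): the space-factor rule stated in the prose of the hunk above can be written as a small pure function. One branch of |adjust_space_factor| lies on a line that is unchanged between the two versions and so does not appear in this diff; the sketch assumes that when the old space factor is below 1000 and the new code exceeds 1000, the result is exactly 1000. The function name and signature are hypothetical.

```python
def adjust_space_factor(space_factor, sf_code):
    """Return the new space factor after a character with the given sf_code."""
    if sf_code == 1000:          # the common case, tested first for speed
        return 1000
    if sf_code < 1000:
        # sf_code 0 leaves the space factor unchanged
        return sf_code if sf_code > 0 else space_factor
    if space_factor < 1000:      # assumed branch: never jump from <1000 to >1000
        return 1000
    return sf_code
```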
19330c19910,19921
<   else space_factor:=s;
---
>   else space_factor:=main_s
> 
> @<Append character |cur_chr|...@>=
> adjust_space_factor;@/
> main_f:=cur_font;
> bchar:=font_bchar[main_f]; false_bchar:=font_false_bchar[main_f];
> if mode>0 then if language<>clang then fix_language;
> fast_get_avail(lig_stack); font(lig_stack):=main_f; cur_l:=qi(cur_chr);
> character(lig_stack):=cur_l;@/
> cur_q:=tail;
> if cancel_boundary then
>   begin cancel_boundary:=false; main_k:=non_address;
19332c19923,19938
< else space_factor:=1000
---
> else main_k:=bchar_label[main_f];
> if main_k=non_address then goto main_loop_move+2; {no left boundary processing}
> cur_r:=cur_l; cur_l:=non_char;
> goto main_lig_loop+1; {begin with cursor after left boundary}
> @#
> main_loop_wrapup:@<Make a ligature node, if |ligature_present|;
>   insert a null discretionary, if appropriate@>;
> main_loop_move:@<If the cursor is immediately followed by the right boundary,
>   |goto reswitch|; if it's followed by an invalid character, |goto big_switch|;
>   otherwise move the cursor one step to the right and |goto main_lig_loop|@>;
> main_loop_lookahead:@<Look ahead for another character, or leave |lig_stack|
>   empty if there's none there@>;
> main_lig_loop:@<If there's a ligature/kern command relevant to |cur_l| and
>   |cur_r|, adjust the text appropriately; exit to |main_loop_wrapup|@>;
> main_loop_move_lig:@<Move the cursor past a pseudo-ligature, then
>   |goto main_loop_lookahead| or |main_lig_loop|@>
19334,19336c19940,19944
< @ Now we come to the inner loop, in which the characters of a word are
< gathered at (hopefully) high speed.
< @^inner loop@>
---
> @ If the current horizontal list is empty, the reference to |character(tail)|
> here is not strictly legal, since |tail| will be a node freshly returned by
> |get_avail|. But this should cause no problem on most implementations, and we
> do want the inner loop to be fast.
> @^dirty Pascal@>
19338,19343c19946,19958
< @<Append character |l| and the following...@>=
< i:=char_info(f)(l);
< if char_exists(i) then
<   begin fast_get_avail(p);
<   font(p):=f; character(p):=qi(c);
<   link(tail):=p; tail:=p;
---
> A discretionary break is not inserted for an explicit hyphen when we are in
> restricted horizontal mode. In particular, this avoids putting discretionary
> nodes inside of other discretionaries.
> 
> @d pack_lig(#)== {the parameter is either |rt_hit| or |false|}
>   begin main_p:=new_ligature(main_f,cur_l,link(cur_q));
>   if lft_hit then
>     begin subtype(main_p):=2; lft_hit:=false;
>     end;
>   if # then if lig_stack=null then
>     begin incr(subtype(main_p)); rt_hit:=false;
>     end;
>   link(cur_q):=main_p; tail:=main_p; ligature_present:=false;
19345,19347d19959
< else char_warning(f,qo(l));
< @<Look ahead for ligature or kerning, either continuing the main loop
<   or going to |reswitch|@>
19349,19351c19961,19969
< @ The result of \.{\\char} can participate in a ligature or kern, so
< we must look ahead for it.
< @^inner loop@>
---
> @d wrapup(#)==if cur_l<non_char then
>   begin if character(tail)=qi(hyphen_char[main_f]) then if link(cur_q)>null then
>     ins_disc:=true;
>   if ligature_present then pack_lig(#);
>   if ins_disc then
>     begin ins_disc:=false;
>     if mode>0 then tail_append(new_disc);
>     end;
>   end
19353,19364c19971,19979
< @<Look ahead for ligature...@>=
< get_next; {set only |cur_cmd| and |cur_chr|}
< if cur_cmd=letter then r:=qi(cur_chr)
< else if cur_cmd=other_char then r:=qi(cur_chr)
< else if cur_cmd=char_given then r:=qi(cur_chr)
< else  begin x_token; {set |cur_cmd|, |cur_chr|, |cur_tok|}
<   if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given) then
<     r:=qi(cur_chr)
<   else if cur_cmd=char_num then
<     begin scan_char_num; r:=qi(cur_val);
<     end
<   else r:=qi(256); {this flag means that no character follows}
---
> @<Make a ligature node, if |ligature_present|;...@>=
> wrapup(rt_hit)
> 
> @ @<If the cursor is immediately followed by the right boundary...@>=
> if lig_stack=null then goto reswitch;
> cur_q:=tail; cur_l:=cur_r; {or |character(lig_stack)|}
> main_loop_move+1:if not is_char_node(lig_stack) then goto main_loop_move_lig;
> main_loop_move+2:if(cur_chr<font_bc[main_f])or(cur_chr>font_ec[main_f]) then
>   begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch;
19366,19371c19981,19985
< if char_tag(i)=lig_tag then if r<>qi(256) then
<   @<Follow the lig/kern program; |goto main_loop_3| if scoring a hit@>;
< @<Make a ligature node, if |ligature_present|; insert a discretionary
<   node for an explicit hyphen, if |c| is the current |hyphen_char|@>;
< if r=qi(256) then goto reswitch; {|cur_cmd|, |cur_chr|, |cur_tok| are untouched}
< c:=qo(r); goto main_loop_1 {|f| is still valid}
---
> main_i:=char_info(main_f)(cur_l);
> if not char_exists(main_i) then
>   begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch;
>   end;
> tail_append(lig_stack) {|main_loop_lookahead| is next}
19373,19379c19987,19988
< @ Even though comparatively few characters have a lig/kern program, the |repeat|
< construction here counts as part of \TeX's inner loop, since it involves a
< potentially long sequential search. For example, tests with one commonly
< used font showed that about 40 per cent of all characters had a lig/kern
< program, and the |repeat| loop was performed about four times for every
< such character.
< @^inner loop@>
---
> @ Here we are at |main_loop_move_lig|.
> When we begin this code we have |cur_l=character(lig_stack)| and |cur_q=tail|.
19381,19389c19990,20000
< @<Follow the lig/kern...@>=
< begin k:=lig_kern_start(f)(i);
< repeat j:=font_info[k].qqqq; {fetch a lig/kern command}
< if next_char(j)=r then
<   if op_bit(j)<kern_flag then @<Extend a ligature, |goto main_loop_3|@>
<   else @<Append a kern, |goto main_loop_2|@>;
< incr(k);
< until stop_bit(j)>=stop_flag;
< end
---
> @<Move the cursor past a pseudo-ligature...@>=
> main_p:=lig_ptr(lig_stack);
> if main_p>null then tail_append(main_p);
> temp_ptr:=lig_stack; lig_stack:=link(temp_ptr);
> free_node(temp_ptr,small_node_size);
> main_i:=char_info(main_f)(cur_l); ligature_present:=true;
> if lig_stack=null then
>   if main_p>null then goto main_loop_lookahead
>   else cur_r:=bchar
> else cur_r:=character(lig_stack);
> goto main_lig_loop
19391,19395c20002,20003
< @ @<Append a kern,...@>=
< begin @<Make a ligature node,...@>;
< tail_append(new_kern(char_kern(f)(j)));
< c:=qo(r); goto main_loop_2;
< end
---
> @ The result of \.{\\char} can participate in a ligature or kern, so we must
> look ahead for it.
19397,19400c20005,20022
< @ A discretionary break is not inserted for an explicit hyphen when we are
< in restricted horizontal mode. In particular, this avoids putting
< discretionary nodes inside of other discretionaries.
< @^explicit hyphens@>
---
> @<Look ahead for another character...@>=
> get_next; {set only |cur_cmd| and |cur_chr|, for speed}
> if cur_cmd=letter then goto main_loop_lookahead+1;
> if cur_cmd=other_char then goto main_loop_lookahead+1;
> if cur_cmd=char_given then goto main_loop_lookahead+1;
> x_token; {now expand and set |cur_cmd|, |cur_chr|, |cur_tok|}
> if cur_cmd=letter then goto main_loop_lookahead+1;
> if cur_cmd=other_char then goto main_loop_lookahead+1;
> if cur_cmd=char_given then goto main_loop_lookahead+1;
> if cur_cmd=char_num then
>   begin scan_char_num; cur_chr:=cur_val; goto main_loop_lookahead+1;
>   end;
> if cur_cmd=no_boundary then bchar:=non_char;
> cur_r:=bchar; lig_stack:=null; goto main_lig_loop;
> main_loop_lookahead+1: adjust_space_factor;
> fast_get_avail(lig_stack); font(lig_stack):=main_f;
> cur_r:=qi(cur_chr); character(lig_stack):=cur_r;
> if cur_r=false_bchar then cur_r:=non_char {this prevents spurious ligatures}
19402,19404c20024,20045
< @<Make a ligature node,...@>=
< if ligature_present then
<   begin p:=new_ligature(f,l,link(q)); link(q):=p; tail:=p;
---
> @ Even though comparatively few characters have a lig/kern program, several
> of the instructions here count as part of \TeX's inner loop, since a
> potentially long sequential search must be performed. For example, tests with
> Computer Modern Roman showed that about 40 per cent of all characters
> actually encountered in practice had a lig/kern program, and that about four
> lig/kern commands were investigated for every such character.
> 
> At the beginning of this code we have |main_i=char_info(main_f)(cur_l)|.
> 
> @<If there's a ligature/kern command...@>=
> if char_tag(main_i)<>lig_tag then goto main_loop_wrapup;
> main_k:=lig_kern_start(main_f)(main_i); main_j:=font_info[main_k].qqqq;
> if skip_byte(main_j)<=stop_flag then goto main_lig_loop+2;
> main_k:=lig_kern_restart(main_f)(main_j);
> main_lig_loop+1:main_j:=font_info[main_k].qqqq;
> main_lig_loop+2:if next_char(main_j)=cur_r then
>  if skip_byte(main_j)<=stop_flag then
>   @<Do ligature or kern command, returning to |main_lig_loop|
>   or |main_loop_wrapup| or |main_loop_move|@>;
> if skip_byte(main_j)=qi(0) then incr(main_k)
> else begin if skip_byte(main_j)>=stop_flag then goto main_loop_wrapup;
>   main_k:=main_k+qo(skip_byte(main_j))+1;
19406c20047
< if c=hyphen_char[f] then if mode=hmode then tail_append(new_disc)
---
> goto main_lig_loop+1
19408,19409c20049,20096
< @ @<Extend a ligature...@>=
< begin ligature_present:=true; l:=rem_byte(j); c:=qo(r); goto main_loop_3;
---
> @ When a ligature or kern instruction matches a character, we know from
> |read_font_info| that the character exists in the font, even though we
> haven't verified its existence in the normal way.
> 
> This section could be made into a subroutine, if the code inside
> |main_control| needs to be shortened.
> 
> \chardef\?='174 % vertical line to indicate character retention
> 
> @<Do ligature or kern command...@>=
> begin if op_byte(main_j)>=kern_flag then
>   begin wrapup(rt_hit);
>   tail_append(new_kern(char_kern(main_f)(main_j))); goto main_loop_move;
>   end;
> if cur_l=non_char then lft_hit:=true
> else if lig_stack=null then rt_hit:=true;
> check_interrupt; {allow a way out in case there's an infinite ligature loop}
> case op_byte(main_j) of
> qi(1),qi(5):begin cur_l:=rem_byte(main_j); {\.{=:\?}, \.{=:\?>}}
>   main_i:=char_info(main_f)(cur_l); ligature_present:=true;
>   end;
> qi(2),qi(6):begin cur_r:=rem_byte(main_j); {\.{\?=:}, \.{\?=:>}}
>   if lig_stack=null then {right boundary character is being consumed}
>     begin lig_stack:=new_lig_item(cur_r); bchar:=non_char;
>     end
>   else if is_char_node(lig_stack) then {|link(lig_stack)=null|}
>     begin main_p:=lig_stack; lig_stack:=new_lig_item(cur_r);
>     lig_ptr(lig_stack):=main_p;
>     end
>   else character(lig_stack):=cur_r;
>   end;
> qi(3):begin cur_r:=rem_byte(main_j); {\.{\?=:\?}}
>   main_p:=lig_stack; lig_stack:=new_lig_item(cur_r);
>   link(lig_stack):=main_p;
>   end;
> qi(7),qi(11):begin wrapup(false); {\.{\?=:\?>}, \.{\?=:\?>>}}
>   cur_q:=tail; cur_l:=rem_byte(main_j);
>   main_i:=char_info(main_f)(cur_l); ligature_present:=true;
>   end;
> othercases begin cur_l:=rem_byte(main_j); ligature_present:=true; {\.{=:}}
>   if lig_stack=null then goto main_loop_wrapup
>   else goto main_loop_move+1;
>   end
> endcases;
> if op_byte(main_j)>qi(4) then
>   if op_byte(main_j)<>qi(7) then goto main_loop_wrapup;
> if cur_l<non_char then goto main_lig_loop;
> main_k:=bchar_label[main_f]; goto main_lig_loop+1;
19424c20111
<   begin @<Find the glue specification, |p|, for
---
>   begin @<Find the glue specification, |main_p|, for
19426c20113
<   q:=new_glue(p);
---
>   temp_ptr:=new_glue(main_p);
19428,19429c20115,20116
< else q:=new_param_glue(space_skip_code);
< link(tail):=q; tail:=q;
---
> else temp_ptr:=new_param_glue(space_skip_code);
> link(tail):=temp_ptr; tail:=temp_ptr;
19438,19444c20125,20131
< begin p:=font_glue[cur_font];
< if p=null then
<   begin f:=cur_font; p:=new_spec(zero_glue); k:=param_base[f]+space_code;
<   width(p):=font_info[k].sc; {that's |space(f)|}
<   stretch(p):=font_info[k+1].sc; {and |space_stretch(f)|}
<   shrink(p):=font_info[k+2].sc; {and |space_shrink(f)|}
<   font_glue[f]:=p;
---
> begin main_p:=font_glue[cur_font];
> if main_p=null then
>   begin main_p:=new_spec(zero_glue); main_k:=param_base[cur_font]+space_code;
>   width(main_p):=font_info[main_k].sc; {that's |space(cur_font)|}
>   stretch(main_p):=font_info[main_k+1].sc; {and |space_stretch(cur_font)|}
>   shrink(main_p):=font_info[main_k+2].sc; {and |space_shrink(cur_font)|}
>   font_glue[cur_font]:=main_p;
19450,19453c20137
< var p:pointer; {glue specification}
< @!q:pointer; {glue node}
< @!f:internal_font_number; {the current font}
< @!k:0..font_mem_size; {index into |font_info|}
---
> var@!q:pointer; {glue node}
19456c20140
< else  begin if space_skip<>zero_glue then p:=space_skip
---
> else  begin if space_skip<>zero_glue then main_p:=space_skip
19458,19460c20142,20144
<   p:=new_spec(p);
<   @<Modify the glue specification in |p| according to the space factor@>;
<   q:=new_glue(p); glue_ref_count(p):=null;
---
>   main_p:=new_spec(main_p);
>   @<Modify the glue specification in |main_p| according to the space factor@>;
>   q:=new_glue(main_p); glue_ref_count(main_p):=null;
19465,19468c20149,20152
< @ @<Modify the glue specification in |p| according to the space factor@>=
< if space_factor>=2000 then width(p):=width(p)+extra_space(cur_font);
< stretch(p):=xn_over_d(stretch(p),space_factor,1000);
< shrink(p):=xn_over_d(shrink(p),1000,space_factor)
---
> @ @<Modify the glue specification in |main_p| according to the space factor@>=
> if space_factor>=2000 then width(main_p):=width(main_p)+extra_space(cur_font);
> stretch(main_p):=xn_over_d(stretch(main_p),space_factor,1000);
> shrink(main_p):=xn_over_d(shrink(main_p),1000,space_factor)
19476c20160
< any_mode(relax),vmode+spacer,mmode+spacer:do_nothing;
---
> any_mode(relax),vmode+spacer,mmode+spacer,mmode+no_boundary:do_nothing;
19594c20278,20279
< @* \[47] Building boxes and lists.
---
> 
> @* \[47] Building boxes and lists.
20204c20889
<    vmode+ex_space:@t@>@;@/
---
>    vmode+ex_space,vmode+no_boundary:@t@>@;@/
20213c20898
< push_nest; mode:=hmode; space_factor:=1000;
---
> push_nest; mode:=hmode; space_factor:=1000; clang:=0;
20756c21441,21442
< @* \[48] Building math lists.
---
> 
> @* \[48] Building math lists.
20807c21493,21495
< mmode+eq_no: if privileged then start_eq_no;
---
> mmode+eq_no: if privileged then
>   if cur_group=math_shift_group then start_eq_no
>   else off_save;
20964,20965c21652
< letter,other_char,char_given: if cur_chr>=128 then c:=cur_chr
<   else  begin c:=ho(math_code(cur_chr));
---
> letter,other_char,char_given: begin c:=ho(math_code(cur_chr));
21010,21012c21697,21698
< mmode+letter,mmode+other_char,mmode+char_given: if cur_chr<128 then
<     set_math_char(ho(math_code(cur_chr)))
<   else set_math_char(cur_chr);
---
> mmode+letter,mmode+other_char,mmode+char_given:
>   set_math_char(ho(math_code(cur_chr)));
21014,21015c21700
<   if cur_chr<128 then set_math_char(ho(math_code(cur_chr)))
<   else set_math_char(cur_chr);
---
>   set_math_char(ho(math_code(cur_chr)));
21606c22291
< push_nest; mode:=hmode; space_factor:=1000;
---
> push_nest; mode:=hmode; space_factor:=1000; clang:=0;
21707c22392
< prev_depth:=t; resume_after_display;
---
> prev_depth:=aux_save.sc; resume_after_display;
21717c22402,22403
< @* \[49] Mode-independent processing.
---
> 
> @* \[49] Mode-independent processing.
21803c22489
< @!k:0..font_mem_size; {index into |font_info|}
---
> @!k:font_index; {index into |font_info|}
22115c22801
<   p:=cur_chr; scan_seven_bit_int; p:=p+cur_val; scan_optional_equals;
---
>   p:=cur_chr; scan_char_num; p:=p+cur_val; scan_optional_equals;
22136c22822
< else n:=127
---
> else n:=255
22231c22917,22919
<   if q=multiply then cur_val:=nx_plus_y(eqtb[l].int,cur_val,0)
---
>   if q=multiply then
>     if p=int_val then cur_val:=mult_integers(eqtb[l].int,cur_val)
>     else cur_val:=nx_plus_y(eqtb[l].int,cur_val,0)
22679,22680c23367,23368
< We also change active characters, using the fact that |cs_token_flag|
< is a multiple of~256.
---
> We also change active characters, using the fact that
> |cs_token_flag+active_base| is a multiple of~256.
22685,22689c23373,23374
<   begin if t>=cs_token_flag then t:=t-active_base;
<   c:=t mod 256;
<   if c<128 then if equiv(b+c)<>0 then t:=256*(t div 256)+equiv(b+c);
<   if t>=cs_token_flag then info(p):=t+active_base
<   else info(p):=t;
---
>   begin c:=t mod 256;
>   if equiv(b+c)<>0 then info(p):=t-c+equiv(b+c);
22793c23478,23479
< @* \[50] Dumping and undumping the tables.
---
> 
> @* \[50] Dumping and undumping the tables.
22942,22943c23628,23629
<   w.b0:=str_pool[k]; w.b1:=str_pool[k+1];
<   w.b2:=str_pool[k+2]; w.b3:=str_pool[k+3];
---
>   w.b0:=qi(so(str_pool[k])); w.b1:=qi(so(str_pool[k+1]));
>   w.b2:=qi(so(str_pool[k+2])); w.b3:=qi(so(str_pool[k+3]));
22960,22961c23646,23647
<   str_pool[k]:=w.b0; str_pool[k+1]:=w.b1;
<   str_pool[k+2]:=w.b2; str_pool[k+3]:=w.b3
---
>   str_pool[k]:=si(qo(w.b0)); str_pool[k+1]:=si(qo(w.b1));
>   str_pool[k+2]:=si(qo(w.b2)); str_pool[k+3]:=si(qo(w.b3))
22971c23657,23658
< k:=pool_ptr-4; undump_four_ASCII
---
> k:=pool_ptr-4; undump_four_ASCII;
> init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr
23006c23693
< p:=mem_bot; q:=rover; x:=0;
---
> p:=mem_bot; q:=rover;
23154a23842,23844
> dump_int(bchar_label[k]);
> dump_int(font_bchar[k]);
> dump_int(font_false_bchar[k]);@/
23182c23872,23875
< undump(min_halfword)(lo_mem_max)(font_glue[k]);
---
> undump(min_halfword)(lo_mem_max)(font_glue[k]);@/
> undump(0)(font_mem_size)(bchar_label[k]);
> undump(min_quarterword)(non_char)(font_bchar[k]);
> undump(min_quarterword)(non_char)(font_false_bchar[k]);
23189a23883,23885
> print_ln; print_int(hyph_count); print(" hyphenation exception");
> if hyph_count<>1 then print_char("s");
> if trie_not_ready then init_trie;
23193c23889
< for k:=min_quarterword+1 to trie_op_ptr do
---
> for k:=1 to trie_op_ptr do
23198,23199d23893
< print_ln; print_int(hyph_count); print(" hyphenation exception");
< if hyph_count<>1 then print_char("s");
23202,23203c23896,23903
< print(" has "); print_int(qo(trie_op_ptr)); print(" op");
< if trie_op_ptr<>min_quarterword+1 then print_char("s")
---
> print(" has "); print_int(trie_op_ptr); print(" op");
> if trie_op_ptr<>1 then print_char("s");
> print(" out of "); print_int(trie_op_size);
> for k:=255 downto 0 do if trie_used[k]>min_quarterword then
>   begin print_nl("  "); print_int(qo(trie_used[k]));
>   print(" for language "); print_int(k);
>   dump_int(k); dump_int(qo(trie_used[k]));
>   end
23205c23905,23907
< @ @<Undump the hyphenation tables@>=
---
> @ Only ``nonempty'' parts of |op_start| need to be restored.
> 
> @<Undump the hyphenation tables@>=
23212,23215c23914,23917
< undump_size(0)(trie_size)('trie size')(trie_max);
< for k:=0 to trie_max do undump_hh(trie[k]);
< undump(min_quarterword)(max_quarterword)(trie_op_ptr);
< for k:=min_quarterword+1 to trie_op_ptr do
---
> undump_size(0)(trie_size)('trie size')(j); {|trie_max|}
> for k:=0 to j do undump_hh(trie[k]);
> undump_size(0)(trie_op_size)('trie op size')(j); {|trie_op_ptr|}
> for k:=1 to j do
23219c23921,23926
<   end
---
>   end;
> k:=256;
> while j>0 do
>   begin undump(0)(k-1)(k); undump(1)(j)(x); j:=j-x; op_start[k]:=qo(j);
>   end;
> @!init trie_not_ready:=false @+tini
23243,23244c23950,23952
< pack_job_name(".fmt");
< while not w_open_out(fmt_file) do prompt_file_name("format file name",".fmt");
---
> pack_job_name(format_extension);
> while not w_open_out(fmt_file) do
>   prompt_file_name("format file name",format_extension);
23252c23960,23961
< @* \[51] The main program.
---
> 
> @* \[51] The main program.
23326a24036
> init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr; fix_date_and_time;
23331d24040
< init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr;@/
23470c24179
< if (end_line_char<0)or(end_line_char>127) then decr(limit)
---
> if end_line_char_inactive then decr(limit)
23478c24187,24188
< @* \[52] Debugging.
---
> 
> @* \[52] Debugging.
23546c24256,24257
< @* \[53] Extensions.
---
> 
> @* \[53] Extensions.
23557,23561c24268,24273
< `\.{\\write}', `\.{\\openout}', `\.{\\closeout}', `\.{\\immediate}, and
< `\.{\\special}' as if they were extensions. These commands are actually
< primitives of \TeX82, and they should appear in all implementations of the
< system; but let's try to imagine that they aren't. Then the program below
< illustrates how a person could add them.
---
> `\.{\\write}', `\.{\\openout}', `\.{\\closeout}', `\.{\\immediate}',
> `\.{\\special}', and `\.{\\setlanguage}' as if they were extensions.
> These commands are actually primitives of \TeX, and they should
> appear in all implementations of the system; but let's try to imagine
> that they aren't. Then the program below illustrates how a person
> could add them.
23587,23589c24299,24301
< We shall introduce four |subtype| values here, corresponding to the
< control sequences \.{\\openout}, \.{\\write}, \.{\\closeout}, and
< \.{\\special}. The second word of such whatsits has a |write_stream| field
---
> We shall introduce five |subtype| values here, corresponding to the
> control sequences \.{\\openout}, \.{\\write}, \.{\\closeout}, \.{\\special}, and
> \.{\\setlanguage}. The second word of I/O whatsits has a |write_stream| field
23602a24315,24316
> @d language_node=4 {|subtype| in whatsits that change the current language}
> @d stored_language(#)==mem[#+1].int {language number, in the range |0..255|}
23626a24341
> @d set_language_code=5 {command modifier for \.{\\setlanguage}}
23638a24354,24355
> primitive("setlanguage",extension,set_language_code);@/
> @!@:set_language_}{\.{\\setlanguage} primitive@>
23652a24370
>   set_language_code:print_esc("setlanguage");
23672a24391
> set_language_code:@<Implement \.{\\setlanguage}@>;
23755a24475,24477
> language_node:begin print_esc("setlanguage");
>   print_int(stored_language(p));
>   end;
23766c24488,24489
< close_node: begin r:=get_node(small_node_size); words:=small_node_size;
---
> close_node,language_node: begin r:=get_node(small_node_size);
>   words:=small_node_size;
23778c24501
< close_node: free_node(p,small_node_size);
---
> close_node,language_node: free_node(p,small_node_size);
23791c24514,24515
< @ @<Advance \(p)past a whatsit node in the |line_break| loop@>=do_nothing
---
> @ @<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>=
> if subtype(cur_p)=language_node then cur_lang:=stored_language(cur_p)
23792a24517,24519
> @ @<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>=
> if subtype(s)=language_node then cur_lang:=stored_language(s)
> 
23805,23807d24531
< @ @<Finish the extensions@>=
< for k:=0 to 15 do if write_open[k] then a_close(write_file[k])
< 
23826c24550
< for k:=str_start[str_ptr] to pool_ptr-1 do dvi_out(str_pool[k]);
---
> for k:=str_start[str_ptr] to pool_ptr-1 do dvi_out(so(str_pool[k]));
23857c24581
< show_token_list(link(def_ref),null,10000000); print_ln;
---
> token_show(def_ref); print_ln;
23904a24629
> language_node:do_nothing;
23947c24672,24702
< @* \[54] System-dependent changes.
---
> 
> @ The \.{\\language} extension is somewhat different.
> We need a subroutine that comes into play when a character of
> a non-|clang| language is being appended to the current paragraph.
> 
> @<Declare action...@>=
> procedure fix_language;
> var @!l:ASCII_code; {the new current language}
> begin if language<=0 then l:=0
> else if language>255 then l:=0
> else l:=language;
> if l<>clang then
>   begin new_whatsit(language_node,small_node_size);
>   stored_language(tail):=l; clang:=l;
>   end;
> end;
> 
> @ @<Implement \.{\\setlanguage}@>=
> if abs(mode)<>hmode then report_illegal_case
> else begin new_whatsit(language_node,small_node_size);
>   scan_int;
>   if cur_val<=0 then clang:=0
>   else if cur_val>255 then clang:=0
>   else clang:=cur_val;
>   stored_language(tail):=clang;
>   end
> 
> @ @<Finish the extensions@>=
> for k:=0 to 15 do if write_open[k] then a_close(write_file[k])
> 
> @* \[54] System-dependent changes.
23956c24711,24712
< @* \[55] Index.
---
> 
> @* \[55] Index.
