Class | SM::SimpleMarkup |
In: | markup/simple_markup.rb |
Parent: | Object |
This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.
  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  p = SM::SimpleMarkup.new
  h = SM::ToHtml.new
  puts p.convert(input_string, h)
You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:
  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  class WikiHtml < SM::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  p = SM::SimpleMarkup.new
  p.add_word_pair("{", "}", :STRIKE)
  p.add_html("no", :STRIKE)
  p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  h = WikiHtml.new
  h.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>" + p.convert(ARGF.read, h) + "</body>"
  SPACE = ?\s

  SIMPLE_LIST_RE = /^(
      (  \*          (?# bullet)
        |-           (?# bullet)
        |\d+\.       (?# numbered )
        |[A-Za-z]\.  (?# alphabetically numbered )
      )
      \s+
    )\S/x

List entries look like:

  *       text
  1.      text
  [label] text
  label:: text

Flag it as a list entry, and work out the indent for subsequent lines

  LABEL_LIST_RE = /^(
      (  \[.*?\]    (?# labeled  )
        |\S.*::     (?# note     )
      )(?:\s+|$)
    )/x
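To see what the two patterns capture, here is a minimal sketch that copies them locally and applies them to the four sample entries. The helper name list_prefix is hypothetical and not part of SimpleMarkup; it only reports the two capture groups the parser relies on: the prefix itself (group 2) and the width of prefix plus trailing whitespace (group 1), which becomes the indent for subsequent lines.

```ruby
# Local copies of the patterns documented above, so the sketch runs on its own.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]    (?# labeled  )
    |\S.*::     (?# note     )
  )(?:\s+|$)
)/x

# Hypothetical helper: returns the list prefix and the body offset,
# or nil when the line is not a list entry.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  return nil unless m
  { prefix: m[2], offset: m[1].length }
end

["* text", "1. text", "[label] text", "label:: text", "plain text"].each do |line|
  puts "#{line.inspect} => #{list_prefix(line).inspect}"
end
```

Note that a simple list prefix must be followed by whitespace and then a non-blank character, so a word such as *bold* at the start of a line is not taken for a bullet.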
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
  # File markup/simple_markup.rb, line 207
  def initialize
    @am = AttributeManager.new
    @output = nil
    @block_exceptions = nil
  end
Add to the sequences recognized as general markup
  # File markup/simple_markup.rb, line 226
  def add_html(tag, name)
    @am.add_html(tag, name)
  end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method
  # File markup/simple_markup.rb, line 240
  def add_special(pattern, name)
    @am.add_special(pattern, name)
  end
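The WikiWord pattern can be tried on its own, without the parser. The sketch below is only an illustration of what the regular expression matches: the helper highlight_wiki_words is hypothetical, and it uses a plain gsub rather than the attribute manager and formatter that SimpleMarkup actually routes specials through.

```ruby
# The same pattern passed to add_special in the example above.
WIKIWORD_RE = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical stand-in for the formatter's special handling: wrap every
# WikiWord in the markup used by the WikiHtml example.
def highlight_wiki_words(text)
  text.gsub(WIKIWORD_RE) { "<font color=red>#{$1}</font>" }
end

puts highlight_wiki_words("See FrontPage and plain Ruby words")
# prints: See <font color=red>FrontPage</font> and plain Ruby words
```

A word needs a second capital letter after at least one lowercase letter, so ordinary capitalised words such as "See" and "Ruby" are left alone.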
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name
  # File markup/simple_markup.rb, line 218
  def add_word_pair(start, stop, name)
    @am.add_word_pair(start, stop, name)
  end
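As a rough picture of what a word pair does, the sketch below rewrites {word} as strike-through text with a single substitution. This is an approximation under stated assumptions, not the AttributeManager's algorithm: the helper strike_braced is hypothetical, and it assumes the delimiters enclose one run of non-space characters.

```ruby
# Hypothetical approximation of a "{" ... "}" word pair mapped to :STRIKE.
# Assumes the pair wraps a single word with no embedded whitespace.
def strike_braced(text)
  text.gsub(/\{(\S+?)\}/) { "<strike>#{$1}</strike>" }
end

puts strike_braced("this is {gone} now")
# prints: this is <strike>gone</strike> now
```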
Look through the text, examining line indentation. We flag each line as being blank, a paragraph, a list element, or verbatim text
  # File markup/simple_markup.rb, line 274
  def assign_types_to_lines(margin = 0, level = 0)
    now_blocking = false
    while line = @lines.next
      if @block_exceptions
        if now_blocking
          line.stamp(Line::PARAGRAPH, level)
          @block_exceptions.each{ |be|
            if now_blocking == be['name']
              be['replaces'].each{ |rep|
                line.text.gsub!(rep['from'], rep['to'])
              }
            end
            if now_blocking == be['name'] && line.text =~ be['end']
              now_blocking = false
              break
            end
          }
          next
        else
          @block_exceptions.each{ |be|
            if line.text =~ be['start']
              now_blocking = be['name']
              line.stamp(Line::PARAGRAPH, level)
              break
            end
          }
          next if now_blocking
        end
      end

      if line.isBlank?
        line.stamp(Line::BLANK, level)
        next
      end

      # if a line contains non-blanks before the margin, then it must belong
      # to an outer level

      text = line.text

      for i in 0...margin
        if text[i] != SPACE
          @lines.unget
          return
        end
      end

      active_line = text[margin..-1]

      # Rules (horizontal lines) look like
      #
      #  ---   (three or more hyphens)
      #
      # The more hyphens, the thicker the rule
      #

      if /^(---+)\s*$/ =~ active_line
        line.stamp(Line::RULE, level, $1.length-2)
        next
      end

      # Then look for list entries. First the ones that have to have
      # text following them (* xxx, - xxx, and dd. xxx)

      if SIMPLE_LIST_RE =~ active_line

        offset = margin + $1.length
        prefix = $2
        prefix_length = prefix.length

        flag = case prefix
               when "*","-" then ListBase::BULLET
               when /^\d/ then ListBase::NUMBER
               when /^[A-Z]/ then ListBase::UPPERALPHA
               when /^[a-z]/ then ListBase::LOWERALPHA
               else raise "Invalid List Type: #{self.inspect}"
               end

        line.stamp(Line::LIST, level+1, prefix, flag)
        text[margin, prefix_length] = " " * prefix_length
        assign_types_to_lines(offset, level + 1)
        next
      end


      if LABEL_LIST_RE =~ active_line
        offset = margin + $1.length
        prefix = $2
        prefix_length = prefix.length

        next if handled_labeled_list(line, level, margin, offset, prefix)
      end

      # Headings look like
      # = Main heading
      # == Second level
      # === Third
      #
      # Headings reset the level to 0

      if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
        prefix_length = $1.length
        prefix_length = 6 if prefix_length > 6
        line.stamp(Line::HEADING, 0, prefix_length)
        line.strip_leading(margin + prefix_length)
        next
      end

      # If the character's a space, then we have verbatim text,
      # otherwise

      if active_line[0] == SPACE
        line.strip_leading(margin) if margin > 0
        line.stamp(Line::VERBATIM, level)
      else
        line.stamp(Line::PARAGRAPH, level)
      end
    end
  end
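The order of the checks matters, so it may help to see them in a stripped-down form. The following is a simplified sketch, not the method above: the function classify is hypothetical, it looks at one line in isolation, and it leaves out margins, nesting, labeled lists and block exceptions. It keeps the same order of tests (blank, rule, simple list, heading, then verbatim or paragraph) and the same parameters for rules and headings.

```ruby
# Hypothetical single-line classifier mirroring the order of checks used by
# assign_types_to_lines. Returns [type, param].
def classify(line)
  return [:blank, nil] if line.strip.empty?

  # Three or more hyphens form a rule; each extra hyphen thickens it.
  if (m = line.match(/^(---+)\s*$/))
    return [:rule, m[1].length - 2]
  end

  # Simple list entries: "* x", "- x", "1. x", "a. x".
  if (m = line.match(/^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/))
    return [:list, m[2]]
  end

  # Headings: the number of "=" signs gives the level, capped at 6.
  if (m = line.match(/^(=+)\s*(.*)/))
    return [:heading, [m[1].length, 6].min]
  end

  # A leading space means verbatim text; anything else is a paragraph.
  line.start_with?(" ") ? [:verbatim, nil] : [:paragraph, nil]
end

["", "-----", "* item", "=== Third", "  x = 1", "Just text"].each do |line|
  puts "#{line.inspect} => #{classify(line).inspect}"
end
```

Because the rule test runs before the list test, a line of hyphens is never mistaken for a "-" bullet, and a lone "- item" fails the rule test since it has fewer than three hyphens.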
For debugging, we allow access to our line contents as text

  # File markup/simple_markup.rb, line 493
  def content
    @lines.as_text
  end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result
  # File markup/simple_markup.rb, line 250
  def convert(str, op, block_exceptions=nil)
    @lines = Lines.new(str.split(/\r?\n/).collect { |aLine|
                         Line.new(aLine) })
    return "" if @lines.empty?
    @lines.normalize
    @block_exceptions = block_exceptions
    assign_types_to_lines
    group = group_lines
    # call the output formatter to handle the result
    #      group.to_a.each {|i| p i}
    group.accept(@am, op)
  end
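The first step of convert is an ordinary String#split, and its behaviour explains two details of the method. The sketch below isolates that step; the wrapper name split_lines is hypothetical.

```ruby
# Hypothetical wrapper around the split used at the top of convert.
def split_lines(str)
  str.split(/\r?\n/)
end

p split_lines("one\r\ntwo\nthree")  # Unix and Windows line endings both work
p split_lines("a\n\nb")             # interior blank lines are kept
p split_lines("a\n\n")              # trailing blank lines are dropped
p split_lines("")                   # no lines at all
```

An empty input string yields an empty array, which is the case convert guards against when it returns "" early.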
For debugging, return the list of line types

  # File markup/simple_markup.rb, line 499
  def get_line_types
    @lines.line_types
  end
Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead
  # File markup/simple_markup.rb, line 464
  def group_lines
    @lines.rewind

    inList = false
    wantedType = wantedLevel = nil

    block = LineCollection.new
    group = nil

    while line = @lines.next
      if line.level == wantedLevel and line.type == wantedType
        group.add_text(line.text)
      else
        group = block.fragment_for(line)
        block.add(group)
        if line.type == Line::LIST
          wantedType = Line::PARAGRAPH
        else
          wantedType = line.type
        end
        wantedLevel = line.type == Line::HEADING ? line.param : line.level
      end
    end

    block.normalize
    block
  end
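The merging rule, including the remapping of lines that follow a list introduction, can be sketched with plain structs. This is an illustration under assumptions rather than the real implementation: Entry and group_entries are hypothetical names, merged text is joined with a single space (the real fragments may join differently), and the special level handling for headings is left out.

```ruby
# Hypothetical stand-in for a typed line or fragment.
Entry = Struct.new(:type, :level, :text)

# Merge consecutive entries that share the wanted type and level.
# After a :list entry we expect :paragraph lines at the same level and
# fold them into the list entry instead of starting a new group.
def group_entries(entries)
  groups = []
  wanted_type = wanted_level = nil

  entries.each do |e|
    if !groups.empty? && e.level == wanted_level && e.type == wanted_type
      groups.last.text << " " << e.text
    else
      groups << Entry.new(e.type, e.level, e.text.dup)
      wanted_type  = (e.type == :list ? :paragraph : e.type)
      wanted_level = e.level
    end
  end

  groups
end

lines = [
  Entry.new(:list,      1, "first item"),
  Entry.new(:paragraph, 1, "continued on the next line"),
  Entry.new(:paragraph, 0, "a new paragraph"),
  Entry.new(:paragraph, 0, "that runs on"),
]

group_entries(lines).each { |g| puts "#{g.type} (level #{g.level}): #{g.text}" }
```

The four input lines collapse into two groups: one list entry that has absorbed its continuation line, and one paragraph at level 0.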
Handle labeled list entries. We have a special case to deal with. Because the labels can be long, they force the remaining block of text over to the right:
this is a long label that I wrote: | and here is the block of text with a silly margin |
So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin
this is a long label that I wrote: | here is a more reasonably indented block which will be attached to the label. |
  # File markup/simple_markup.rb, line 411
  def handled_labeled_list(line, level, margin, offset, prefix)
    prefix_length = prefix.length
    text = line.text
    flag = nil
    case prefix
    when /^\[/
      flag = ListBase::LABELED
      prefix = prefix[1, prefix.length-2]
    when /:$/
      flag = ListBase::NOTE
      prefix.chop!
    else raise "Invalid List Type: #{self.inspect}"
    end

    # body is on the next line

    if text.length <= offset
      original_line = line
      line = @lines.next
      return(false) unless line
      text = line.text

      for i in 0..margin
        if text[i] != SPACE
          @lines.unget
          return false
        end
      end
      i = margin
      i += 1 while text[i] == SPACE
      if i >= text.length
        @lines.unget
        return false
      else
        offset = i
        prefix_length = 0
        @lines.delete(original_line)
      end
    end

    line.stamp(Line::LIST, level+1, prefix, flag)
    text[margin, prefix_length] = " " * prefix_length
    assign_types_to_lines(offset, level + 1)
    return true
  end
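Two pieces of this method are easy to study in isolation: how the label is cleaned up, and how the new margin is found when the body starts on the following line. The sketch below restates both as standalone functions. The names split_label and continuation_offset are hypothetical, symbols stand in for the ListBase constants, and strings are compared with " " (the original compares against SPACE, which is a character code in the Ruby version the library was written for).

```ruby
# Hypothetical restatement of the prefix handling: "[label]" loses its
# brackets, "label::" loses only its final colon.
def split_label(prefix)
  case prefix
  when /^\[/ then [:labeled, prefix[1, prefix.length - 2]]
  when /:$/  then [:note, prefix.chop]
  else raise ArgumentError, "Invalid list type: #{prefix.inspect}"
  end
end

# Hypothetical restatement of the special case: when the label stands alone,
# the next line must be indented past the margin, and the column of its first
# non-space character becomes the new offset. Returns nil if it does not qualify.
def continuation_offset(next_line, margin)
  return nil unless (0..margin).all? { |i| next_line[i] == " " }
  i = margin
  i += 1 while next_line[i] == " "
  i >= next_line.length ? nil : i
end

p split_label("[label]")                         # => [:labeled, "label"]
p split_label("label::")                         # => [:note, "label:"]
p continuation_offset("    indented body", 0)    # => 4
p continuation_offset("not indented", 0)         # => nil
```

Note that a note label keeps one trailing colon after the chop, exactly as in the method above.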