Class SM::SimpleMarkup
In: markup/simple_markup.rb
Parent: Object

Synopsis

This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  p = SM::SimpleMarkup.new
  h = SM::ToHtml.new

  puts p.convert(input_string, h)

You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  class WikiHtml < SM::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  p = SM::SimpleMarkup.new
  p.add_word_pair("{", "}", :STRIKE)
  p.add_html("no", :STRIKE)

  p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  h = WikiHtml.new
  h.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>" + p.convert(ARGF.read, h) + "</body>"

Output Formatters

missing

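Methods

add_html  add_special  add_word_pair  assign_types_to_lines  content  convert  get_line_types  group_lines  handled_labeled_list  new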

Constants

SPACE = ?\s
SIMPLE_LIST_RE = /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x

List entries look like:
 *       text
 1.      text
 [label] text
 label:: text

Flag it as a list entry, and work out the indent for subsequent lines

LABEL_LIST_RE = /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x
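As a rough illustration of what the two list patterns capture, the sketch below copies them into a standalone script (laid out over several lines, which the /x flag permits) and reports the marker and the width of the marker plus its trailing whitespace. The helper name list_prefix is hypothetical and is not part of SimpleMarkup.

```ruby
# The two patterns from the Constants section, spread over several lines.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]   (?# labeled )
    |\S.*::    (?# note )
  )(?:\s+|$)
)/x

# Hypothetical helper: returns the list marker and the column at which the
# entry's text starts, or nil when the line is not a list entry.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  m && { marker: m[2], indent: m[1].length }
end

p list_prefix("*       text")
p list_prefix("[label] text")
p list_prefix("label:: text")
p list_prefix("just a paragraph")
```

Group 1 covers the marker and the whitespace after it, which is why its length can serve as the indent for the lines that follow; group 2 is the marker alone.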

Public Class methods

Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.

     # File markup/simple_markup.rb, line 207
207:     def initialize
208:       @am = AttributeManager.new
209:       @output = nil
210:       @block_exceptions = nil
211:     end

Public Instance methods

Add to the sequences recognized as general markup (HTML-style tag pairs such as <no>text...</no>).

     # File markup/simple_markup.rb, line 226
226:     def add_html(tag, name)
227:       @am.add_html(tag, name)
228:     end

Add to other inline sequences. For example, we could add WikiWords using something like:

   parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

Each wiki word will be presented to the output formatter via the accept_special method

     # File markup/simple_markup.rb, line 240
240:     def add_special(pattern, name)
241:       @am.add_special(pattern, name)
242:     end
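To see which words the WikiWord pattern above would pick out, it can be tried on its own, outside the parser. This is only a sketch: the helpers wiki_words and mark_wiki_words are hypothetical, and the replacement markup simply mirrors the handle_special_WIKIWORD example from the synopsis.

```ruby
# Same pattern that is passed to add_special in the example above.
WIKIWORD = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical helper: list every WikiWord found in the text.
def wiki_words(text)
  text.scan(WIKIWORD).flatten
end

# Hypothetical helper: wrap each WikiWord the way the synopsis'
# WikiHtml formatter does, using plain gsub instead of the parser.
def mark_wiki_words(text)
  text.gsub(WIKIWORD) { "<font color=red>#{$1}</font>" }
end

p wiki_words("Visit HomePage or FrontPage, not Homepage")
puts mark_wiki_words("See HomePage.")
```

A word needs a capital, at least one lowercase letter, and then a second capital to qualify, so "Homepage" is left alone.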

Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.

     # File markup/simple_markup.rb, line 218
218:     def add_word_pair(start, stop, name)
219:       @am.add_word_pair(start, stop, name)
220:     end

Look through the text line by line, taking indentation into account. We flag each line as blank, a paragraph, a list element, or verbatim text (rules and headings are flagged as well).

     # File markup/simple_markup.rb, line 274
274:     def assign_types_to_lines(margin = 0, level = 0)
275:       now_blocking = false
276:       while line = @lines.next
277:         if @block_exceptions
278:           if now_blocking
279:             line.stamp(Line::PARAGRAPH, level)
280:             @block_exceptions.each{ |be|
281:               if now_blocking == be['name']
282:                 be['replaces'].each{ |rep|
283:                   line.text.gsub!(rep['from'], rep['to'])
284:                 }
285:               end
286:               if now_blocking == be['name'] && line.text =~ be['end']
287:                 now_blocking = false
288:                 break
289:               end
290:             }
291:             next
292:           else
293:             @block_exceptions.each{ |be|
294:               if line.text =~ be['start']
295:                 now_blocking = be['name']
296:                 line.stamp(Line::PARAGRAPH, level)
297:                 break
298:               end
299:             }
300:             next if now_blocking
301:           end
302:         end
303: 
304:         if line.isBlank?
305:           line.stamp(Line::BLANK, level)
306:           next
307:         end
308:         
309:         # if a line contains non-blanks before the margin, then it must belong
310:         # to an outer level
311: 
312:         text = line.text
313:         
314:         for i in 0...margin
315:           if text[i] != SPACE
316:             @lines.unget
317:             return
318:           end
319:         end
320: 
321:         active_line = text[margin..-1]
322: 
323:         # Rules (horizontal lines) look like
324:         #
325:         #  ---   (three or more hyphens)
326:         #
327:         # The more hyphens, the thicker the rule
328:         #
329: 
330:         if /^(---+)\s*$/ =~ active_line
331:           line.stamp(Line::RULE, level, $1.length-2)
332:           next
333:         end
334: 
335:         # Then look for list entries. First the ones that have to have
336:         # text following them (* xxx, - xxx, and dd. xxx)
337: 
338:         if SIMPLE_LIST_RE =~ active_line
339: 
340:           offset = margin + $1.length
341:           prefix = $2
342:           prefix_length = prefix.length
343: 
344:           flag = case prefix
345:                  when "*","-" then ListBase::BULLET
346:                  when /^\d/   then ListBase::NUMBER
347:                  when /^[A-Z]/ then ListBase::UPPERALPHA
348:                  when /^[a-z]/ then ListBase::LOWERALPHA
349:                  else raise "Invalid List Type: #{self.inspect}"
350:                  end
351: 
352:           line.stamp(Line::LIST, level+1, prefix, flag)
353:           text[margin, prefix_length] = " " * prefix_length
354:           assign_types_to_lines(offset, level + 1)
355:           next
356:         end
357: 
358: 
359:         if LABEL_LIST_RE =~ active_line
360:           offset = margin + $1.length
361:           prefix = $2
362:           prefix_length = prefix.length
363: 
364:           next if handled_labeled_list(line, level, margin, offset, prefix)
365:         end
366: 
367:         # Headings look like
368:         # = Main heading
369:         # == Second level
370:         # === Third
371:         #
372:         # Headings reset the level to 0
373: 
374:         if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
375:           prefix_length = $1.length
376:           prefix_length = 6 if prefix_length > 6
377:           line.stamp(Line::HEADING, 0, prefix_length)
378:           line.strip_leading(margin + prefix_length)
379:           next
380:         end
381:         
382:         # If the character's a space, then we have verbatim text,
383:         # otherwise 
384: 
385:         if active_line[0] == SPACE
386:           line.strip_leading(margin) if margin > 0
387:           line.stamp(Line::VERBATIM, level)
388:         else
389:           line.stamp(Line::PARAGRAPH, level)
390:         end
391:       end
392:     end
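The order of the checks in the listing above can be condensed into a small standalone classifier. This is a simplified sketch, not the library's method: it assumes a margin of 0, ignores nesting, block exceptions and the next-line special case for labels, and the name classify and its symbol results are hypothetical.

```ruby
# Compact forms of the two list patterns (comments and /x layout removed).
SIMPLE_LIST = /^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/
LABEL_LIST  = /^((\[.*?\]|\S.*::)(?:\s+|$))/

# Hypothetical helper: classify one line, testing in the same order as
# assign_types_to_lines (blank, rule, list, heading, verbatim, paragraph).
def classify(line)
  case line
  when /\A\s*\z/       then [:blank]
  when /^(---+)\s*$/   then [:rule, $1.length - 2]          # more hyphens, thicker rule
  when SIMPLE_LIST     then [:list, $2]
  when LABEL_LIST      then [:list, $2]
  when /^(=+)\s*(.*)/  then [:heading, [$1.length, 6].min]  # capped at level 6
  when /^ /            then [:verbatim]
  else                      [:paragraph]
  end
end

["", "-----", "* item", "[cat] animal", "== Title", "  code", "plain text"].each do |l|
  p [l, classify(l)]
end
```

Because rules are tested before list entries, a line of hyphens is never mistaken for a "-" bullet.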

For debugging, we allow access to our line contents as text.

     # File markup/simple_markup.rb, line 493
493:     def content
494:       @lines.as_text
495:     end

We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result

     # File markup/simple_markup.rb, line 250
250:     def convert(str, op, block_exceptions=nil)
251:       @lines = Lines.new(str.split(/\r?\n/).collect { |aLine| 
252:                            Line.new(aLine) })
253:       return "" if @lines.empty?
254:       @lines.normalize
255:       @block_exceptions = block_exceptions
256:       assign_types_to_lines
257:       group = group_lines
258:       # call the output formatter to handle the result
259:       #      group.to_a.each {|i| p i}
260:       group.accept(@am, op)
261:     end
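The final step, handing the grouped fragments to an output formatter, follows the Visitor pattern. The standalone sketch below shows the general shape only. Fragment, FragmentList and TinyHtml are hypothetical names, and the callback signatures are simplified assumptions rather than the real formatter interface (which also receives the attribute manager).

```ruby
# Hypothetical fragment type: a kind (:paragraph, :verbatim) plus its text.
Fragment = Struct.new(:kind, :text)

# Hypothetical container that walks its fragments and calls back into
# whatever formatter (the visitor) it is given.
class FragmentList
  def initialize(fragments)
    @fragments = fragments
  end

  def accept(formatter)
    formatter.start_accepting
    @fragments.each do |frag|
      formatter.public_send("accept_#{frag.kind}", frag)
    end
    formatter.end_accepting
  end
end

# Hypothetical formatter: one accept_* method per fragment kind.
# No HTML escaping is done here, to keep the sketch short.
class TinyHtml
  def start_accepting
    @out = String.new
  end

  def accept_paragraph(frag)
    @out << "<p>#{frag.text}</p>\n"
  end

  def accept_verbatim(frag)
    @out << "<pre>#{frag.text}</pre>\n"
  end

  def end_accepting
    @out
  end
end

list = FragmentList.new([Fragment.new(:paragraph, "Hello"),
                         Fragment.new(:verbatim, "  code")])
puts list.accept(TinyHtml.new)
```

Keeping the walk in the container and the rendering in the formatter is what lets one parsed document be rendered by different formatters without reparsing.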

For debugging, return the list of line types.

     # File markup/simple_markup.rb, line 499
499:     def get_line_types
500:       @lines.line_types
501:     end

Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead

     # File markup/simple_markup.rb, line 464
464:     def group_lines
465:       @lines.rewind
466: 
467:       inList = false
468:       wantedType = wantedLevel = nil
469: 
470:       block = LineCollection.new
471:       group = nil
472: 
473:       while line = @lines.next
474:         if line.level == wantedLevel and line.type == wantedType
475:           group.add_text(line.text)
476:         else
477:           group = block.fragment_for(line)
478:           block.add(group)
479:           if line.type == Line::LIST
480:             wantedType = Line::PARAGRAPH
481:           else
482:             wantedType = line.type
483:           end
484:           wantedLevel = line.type == Line::HEADING ? line.param : line.level
485:         end
486:       end
487: 
488:       block.normalize
489:       block
490:     end
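The merging rule can be sketched on plain data. This is an assumption-laden simplification: SketchLine and merge_lines are hypothetical names, merged text is joined with a single space (the real add_text may join differently), and the special handling of heading levels is left out.

```ruby
# Hypothetical stand-in for a stamped line: its type, nesting level and text.
SketchLine = Struct.new(:type, :level, :text)

# Merge consecutive lines that share a type and level. After a :list line
# we expect :paragraph lines at the same level and fold them into the entry.
def merge_lines(lines)
  groups = []
  wanted_type = wanted_level = nil

  lines.each do |line|
    if line.level == wanted_level && line.type == wanted_type
      groups.last[:text] << " " << line.text
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end
  groups
end

lines = [
  SketchLine.new(:list,      1, "one"),
  SketchLine.new(:paragraph, 1, "more"),
  SketchLine.new(:paragraph, 0, "para"),
  SketchLine.new(:paragraph, 0, "cont"),
  SketchLine.new(:verbatim,  0, " code")
]
merge_lines(lines).each { |g| p g }
```

Two list entries in a row still start separate groups, because the second :list line does not match the :paragraph type expected after the first.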

Handle labeled list entries. We have a special case to deal with: because the labels can be long, they force the remaining block of text over to the right:

   this is a long label that I wrote:: and here is the
                                       block of text with
                                       a silly margin

So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin:

   this is a long label that I wrote::
       here is a more reasonably indented block which
       will be attached to the label.

     # File markup/simple_markup.rb, line 411
411:     def handled_labeled_list(line, level, margin, offset, prefix)
412:       prefix_length = prefix.length
413:       text = line.text
414:       flag = nil
415:       case prefix
416:       when /^\[/
417:         flag = ListBase::LABELED
418:         prefix = prefix[1, prefix.length-2]
419:       when /:$/
420:         flag = ListBase::NOTE
421:         prefix.chop!
422:       else raise "Invalid List Type: #{self.inspect}"
423:       end
424:       
425:       # body is on the next line
426:       
427:       if text.length <= offset
428:         original_line = line
429:         line = @lines.next
430:         return(false) unless line
431:         text = line.text
432:         
433:         for i in 0..margin
434:           if text[i] != SPACE
435:             @lines.unget
436:             return false
437:           end
438:         end
439:         i = margin
440:         i += 1 while text[i] == SPACE
441:         if i >= text.length
442:           @lines.unget
443:           return false
444:         else
445:           offset = i
446:           prefix_length = 0
447:           @lines.delete(original_line)
448:         end
449:       end
450:       
451:       line.stamp(Line::LIST, level+1, prefix, flag)
452:       text[margin, prefix_length] = " " * prefix_length
453:       assign_types_to_lines(offset, level + 1)
454:       return true
455:     end
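The margin rule for labels can be isolated in a few lines. The sketch below is a standalone approximation: label_body_offset is a hypothetical helper that only computes the column where the entry's body starts, and it does not touch any line collection the way the real method does.

```ruby
# Compact form of the label pattern (comments and /x layout removed).
LABEL_RE = /^((\[.*?\]|\S.*::)(?:\s+|$))/

# Hypothetical helper: return the column at which the body of a labeled
# entry starts, or nil if the line is not a usable labeled entry.
def label_body_offset(label_line, next_line, margin = 0)
  m = LABEL_RE.match(label_line[margin..-1].to_s)
  return nil unless m

  offset = margin + m[1].length
  return offset if label_line.length > offset   # body shares the label's line
  return nil if next_line.nil?

  # Body is on the next line: it must be indented past the margin
  # and must contain something other than spaces.
  indent = next_line[/\A */].length
  return nil if indent <= margin
  return nil if indent >= next_line.length
  indent
end

p label_body_offset("label:: text", nil)
p label_body_offset("this is a long label that I wrote::", "    here is a block")
p label_body_offset("label::", "not indented")
```

In the second call the body's own indent, rather than the width of the long label, becomes the margin for the rest of the entry.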
