| Using Markdown as Python Library |
| ================================ |
| |
| First and foremost, Python-Markdown is intended to be a python library module |
| used by various projects to convert Markdown syntax into HTML. |
| |
| The Basics |
| ---------- |
| |
| To use markdown as a module: |
| |
| import markdown |
| html = markdown.markdown(your_text_string) |
| |
| Encoded Text |
| ------------ |
| |
| Note that ``markdown()`` expects **Unicode** as input (although a simple ASCII |
| string should work) and returns output as Unicode. Do not pass encoded strings to it! |
| If your input is encoded, e.g. as UTF-8, it is your responsibility to decode |
| it. E.g.: |
| |
| input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8") |
| text = input_file.read() |
| html = markdown.markdown(text, extensions) |
| |
| If you later want to write it to disk, you should encode it yourself: |
| |
| output_file = codecs.open("some_file.html", "w", encoding="utf-8") |
| output_file.write(html) |
| |
| More Options |
| ------------ |
| |
| If you want to pass more options, you can create an instance of the ``Markdown`` |
| class yourself and then use ``convert()`` to generate HTML: |
| |
| import markdown |
| md = markdown.Markdown( |
| extensions=['footnotes'], |
| extension_configs= {'footnotes' : ('PLACE_MARKER','~~~~~~~~')}, |
| safe_mode=True, |
| output_format='html4' |
| ) |
| return md.convert(some_text) |
| |
| You should also use this method if you want to process multiple strings: |
| |
| md = markdown.Markdown() |
| html1 = md.convert(text1) |
| html2 = md.convert(text2) |
| |
| Working with Files |
| ------------------ |
| |
| While the Markdown class is only intended to work with Unicode text, some |
| encoding/decoding is required for the command line features. These functions |
| and methods are only intended to fit the common use case. |
| |
| The ``Markdown`` class has the method ``convertFile`` which reads in a file and |
| writes out to a file-like-object: |
| |
| md = markdown.Markdown() |
| md.convertFile(input="in.txt", output="out.html", encoding="utf-8") |
| |
| The markdown module also includes a shortcut function ``markdownFromFile`` that |
| wraps the above method. |
| |
| markdown.markdownFromFile(input="in.txt", |
| output="out.html", |
| extensions=[], |
| encoding="utf-8", |
| safe=False) |
| |
| In either case, if the ``output`` keyword is passed a file name (i.e.: |
| ``output="out.html"``), it will try to write to a file by that name. If |
| ``output`` is passed a file-like-object (i.e. ``output=StringIO.StringIO()``), |
| it will attempt to write out to that object. Finally, if ``output`` is |
| set to ``None``, it will write to ``stdout``. |
| |
| Using Extensions |
| ---------------- |
| |
| One of the parameters that you can pass is a list of Extensions. Extensions |
| must be available as python modules either within the ``markdown.extensions`` |
| package or on your PYTHONPATH with names starting with `mdx_`, followed by the |
| name of the extension. Thus, ``extensions=['footnotes']`` will first look for |
| the module ``markdown.extensions.footnotes``, then a module named |
| ``mdx_footnotes``. See the documentation specific to the extension you are |
| using for help in specifying configuration settings for that extension. |
| |
| Note that some extensions may need their state reset between each call to |
| ``convert``: |
| |
| html1 = md.convert(text1) |
| md.reset() |
| html2 = md.convert(text2) |
| |
| Safe Mode |
| --------- |
| |
| If you are using Markdown on a web system which will transform text provided |
| by untrusted users, you may want to use the "safe_mode" option which ensures |
| that the user's HTML tags are either replaced, removed or escaped. (They can |
| still create links using Markdown syntax.) |
| |
| * To replace HTML, set ``safe_mode="replace"`` (``safe_mode=True`` still works |
| for backward compatibility with older versions). The HTML will be replaced |
| with the text defined in ``markdown.HTML_REMOVED_TEXT`` which defaults to |
| ``[HTML_REMOVED]``. To replace the HTML with something else: |
| |
| markdown.HTML_REMOVED_TEXT = "--RAW HTML IS NOT ALLOWED--" |
| md = markdown.Markdown(safe_mode="replace") |
| |
| **Note**: You could edit the value of ``HTML_REMOVED_TEXT`` directly in |
| markdown/__init__.py but you will need to remember to do so every time you |
| upgrade to a newer version of Markdown. Therefore, this is not recommended. |
| |
| * To remove HTML, set ``safe_mode="remove"``. Any raw HTML will be completely |
| stripped from the text with no warning to the author. |
| |
| * To escape HTML, set ``safe_mode="escape"``. The HTML will be escaped and |
| included in the document. |
| |
| Output Formats |
| -------------- |
| |
| If Markdown is outputing (X)HTML as part of a web page, most likely you will |
| want the output to match the (X)HTML version used by the rest of your page/site. |
| Currently, Markdown offers two output formats out of the box; "HTML4" and |
| "XHTML1" (the default) . Markdown will also accept the formats "HTML" and |
| "XHTML" which currently map to "HTML4" and "XHTML" respectively. However, |
| you should use the more explicit keys as the general keys may change in the |
| future if it makes sense at that time. The keys can either be lowercase or |
| uppercase. |
| |
| To set the output format do: |
| |
| html = markdown.markdown(text, output_format='html4') |
| |
| Or, when using the Markdown class: |
| |
| md = markdown.Markdown(output_format='html4') |
| html = md.convert(text) |
| |
| Note that the output format is only set once for the class and cannot be |
| specified each time ``convert()`` is called. If you really must change the |
| output format for the class, you can use the ``set_output_format`` method: |
| |
| md.set_output_format('xhtml1') |