aboutsummaryrefslogtreecommitdiff
path: root/node_modules/highlight.js/docs/language-guide.rst
diff options
context:
space:
mode:
Diffstat (limited to 'node_modules/highlight.js/docs/language-guide.rst')
-rw-r--r--node_modules/highlight.js/docs/language-guide.rst264
1 files changed, 0 insertions, 264 deletions
diff --git a/node_modules/highlight.js/docs/language-guide.rst b/node_modules/highlight.js/docs/language-guide.rst
deleted file mode 100644
index f48c748be..000000000
--- a/node_modules/highlight.js/docs/language-guide.rst
+++ /dev/null
@@ -1,264 +0,0 @@
-Language definition guide
-=========================
-
-Highlighting overview
----------------------
-
-Programming language code consists of parts with different rules of parsing: keywords like ``for`` or ``if``
-don't make sense inside strings, strings may contain backslash-escaped symbols like ``\"``
-and comments usually don't contain anything interesting except the end of the comment.
-
-In highlight.js such parts are called "modes".
-
-Each mode consists of:
-
-* starting condition
-* ending condition
-* list of contained sub-modes
-* lexing rules and keywords
-* …exotic stuff like another language inside a language
-
-The parser's work is to look for modes and their keywords.
-Upon finding, it wraps them into the markup ``<span class="...">...</span>``
-and puts the name of the mode ("string", "comment", "number")
-or a keyword group name ("keyword", "literal", "built-in") as the span's class name.
-
-
-General syntax
---------------
-
-A language definition is a JavaScript object describing the default parsing mode for the language.
-This default mode contains sub-modes which in turn contain other sub-modes, effectively making the language definition a tree of modes.
-
-Here's an example:
-
-::
-
- {
- case_insensitive: true, // language is case-insensitive
- keywords: 'for if while',
- contains: [
- {
- className: 'string',
- begin: '"', end: '"'
- },
- hljs.COMMENT(
- '/\\*', // begin
- '\\*/', // end
- {
- contains: [
- {
- className: 'doc', begin: '@\\w+'
- }
- ]
- }
- )
- ]
- }
-
-Usually the default mode accounts for the majority of the code and describes all language keywords.
-A notable exception here is XML in which a default mode is just a user text that doesn't contain any keywords,
-and most interesting parsing happens inside tags.
-
-
-Keywords
---------
-
-In the simple case language keywords are defined in a string, separated by space:
-
-::
-
- {
- keywords: 'else for if while'
- }
-
-Some languages have different kinds of "keywords" that might not be called as such by the language spec
-but are very close to them from the point of view of a syntax highlighter. These are all sorts of "literals", "built-ins", "symbols" and such.
-To define such keyword groups the attribute ``keywords`` becomes an object each property of which defines its own group of keywords:
-
-::
-
- {
- keywords: {
- keyword: 'else for if while',
- literal: 'false true null'
- }
- }
-
-The group name becomes then a class name in a generated markup enabling different styling for different kinds of keywords.
-
-To detect keywords highlight.js breaks the processed chunk of code into separate words — a process called lexing.
-The "word" here is defined by the regexp ``[a-zA-Z][a-zA-Z0-9_]*`` that works for keywords in most languages.
-Different lexing rules can be defined by the ``lexemes`` attribute:
-
-::
-
- {
- lexemes '-[a-z]+',
- keywords: '-import -export'
- }
-
-
-Sub-modes
----------
-
-Sub-modes are listed in the ``contains`` attribute:
-
-::
-
- {
- keywords: '...',
- contains: [
- hljs.QUOTE_STRING_MODE,
- hljs.C_LINE_COMMENT,
- { ... custom mode definition ... }
- ]
- }
-
-A mode can reference itself in the ``contains`` array by using a special keyword ``'self``'.
-This is commonly used to define nested modes:
-
-::
-
- {
- className: 'object',
- begin: '{', end: '}',
- contains: [hljs.QUOTE_STRING_MODE, 'self']
- }
-
-
-Comments
---------
-
-To define custom comments it is recommended to use a built-in helper function ``hljs.COMMENT`` instead of describing the mode directly, as it also defines a few default sub-modes that improve language detection and do other nice things.
-
-Parameters for the function are:
-
-::
-
- hljs.COMMENT(
- begin, // begin regex
- end, // end regex
- extra // optional object with extra attributes to override defaults
- // (for example {relevance: 0})
- )
-
-
-Markup generation
------------------
-
-Modes usually generate actual highlighting markup — ``<span>`` elements with specific class names that are defined by the ``className`` attribute:
-
-::
-
- {
- contains: [
- {
- className: 'string',
- // ... other attributes
- },
- {
- className: 'number',
- // ...
- }
- ]
- }
-
-Names are not required to be unique, it's quite common to have several definitions with the same name.
-For example, many languages have various syntaxes for strings, comments, etc…
-
-Sometimes modes are defined only to support specific parsing rules and aren't needed in the final markup.
-A classic example is an escaping sequence inside strings allowing them to contain an ending quote.
-
-::
-
- {
- className: 'string',
- begin: '"', end: '"',
- contains: [{begin: '\\\\.'}],
- }
-
-For such modes ``className`` attribute should be omitted so they won't generate excessive markup.
-
-
-Mode attributes
----------------
-
-Other useful attributes are defined in the :doc:`mode reference </reference>`.
-
-
-.. _relevance:
-
-Relevance
----------
-
-Highlight.js tries to automatically detect the language of a code fragment.
-The heuristics is essentially simple: it tries to highlight a fragment with all the language definitions
-and the one that yields most specific modes and keywords wins. The job of a language definition
-is to help this heuristics by hinting relative relevance (or irrelevance) of modes.
-
-This is best illustrated by example. Python has special kinds of strings defined by prefix letters before the quotes:
-``r"..."``, ``u"..."``. If a code fragment contains such strings there is a good chance that it's in Python.
-So these string modes are given high relevance:
-
-::
-
- {
- className: 'string',
- begin: 'r"', end: '"',
- relevance: 10
- }
-
-On the other hand, conventional strings in plain single or double quotes aren't specific to any language
-and it makes sense to bring their relevance to zero to lessen statistical noise:
-
-::
-
- {
- className: 'string',
- begin: '"', end: '"',
- relevance: 0
- }
-
-The default value for relevance is 1. When setting an explicit value it's recommended to use either 10 or 0.
-
-Keywords also influence relevance. Each of them usually has a relevance of 1, but there are some unique names
-that aren't likely to be found outside of their languages, even in the form of variable names.
-For example just having ``reinterpret_cast`` somewhere in the code is a good indicator that we're looking at C++.
-It's worth to set relevance of such keywords a bit higher. This is done with a pipe:
-
-::
-
- {
- keywords: 'for if reinterpret_cast|10'
- }
-
-
-Illegal symbols
----------------
-
-Another way to improve language detection is to define illegal symbols for a mode.
-For example in Python first line of class definition (``class MyClass(object):``) cannot contain symbol "{" or a newline.
-Presence of these symbols clearly shows that the language is not Python and the parser can drop this attempt early.
-
-Illegal symbols are defined as a a single regular expression:
-
-::
-
- {
- className: 'class',
- illegal: '[${]'
- }
-
-
-Pre-defined modes and regular expressions
------------------------------------------
-
-Many languages share common modes and regular expressions. Such expressions are defined in core highlight.js code
-at the end under "Common regexps" and "Common modes" titles. Use them when possible.
-
-
-Contributing
-------------
-
-Follow the :doc:`contributor checklist </language-contribution>`.