<p>Prism is awesome out of the box, but it’s even awesomer when it’s customized to your own needs. This section will help you write new language definitions, plugins and all-around Prism hacking.</p>
<p>A regular expression literal is the simplest way to express a token. An alternative way, with more options, is by using an object literal. With that notation, the regular expression describing the token would be the <code>pattern</code> attribute:</p>
<p>So far the functionality is exactly the same between the short and extended notations. However, the extended notation allows for additional options:</p>
This makes it easier to define certain languages. However, keep in mind that they’re slower and if coded poorly, can even result in infinite recursion.
For an example of nested tokens, check out the Markup language definition:
<dd>Accepts an object literal with tokens and appends them to the end of the current object literal. Useful for referring to tokens defined elsewhere. For an example where <code>rest</code> is useful, check the Markup definitions above.</dd>
<p>Unless explicitly allowed through the <code>inside</code> property, each token cannot contain other tokens, so their order is significant. Although per the ECMAScript specification, objects are not required to have a specific ordering of their properties, in practice they do in every modern browser.</p>
<p>In most languages there are multiple different ways of declaring the same constructs (e.g. comments, strings, ...) and sometimes it is difficult or unpractical to match all of them with one single regular expression. To add multiple regular expressions for one token name an array can be used:</p>
<p>This is a helper method to ease modifying existing languages. For example, the CSS language definition not only defines CSS highlighting for CSS documents,
but also needs to define highlighting for CSS embedded in HTML through <codeclass="language-markup"><style></code> elements. To do this, it needs to modify
<code>Prism.languages.markup</code> and add the appropriate tokens. However, <code>Prism.languages.markup</code>
is a regular JavaScript object literal, so if you do this:</p>
<pre><code>Prism.languages.markup.style = {
/* tokens */
};</code></pre>
<p>then the <code>style</code> token will be added (and processed) at the end. <code>Prism.languages.insertBefore</code> allows you to insert
tokens <em>before</em> existing tokens. For the CSS example above, you would use it like this:</p>
<p>This section will explain the usual workflow of creating a new language definition.</p>
<p>As an example, we will create the language definition of the fictional <em>Foo's Binary, Artistic Robots™</em> language or just Foo Bar for short.</p>
<p>Create a new file <code>components/prism-foo-bar.js</code>.</p>
<p>In this example, we choose <code>foo-bar</code> as the id of the new language. The language id has to be unique and should work well with the <code>language-xxxx</code> CSS class name Prism uses to refer to language definitions. Your language id should ideally match the regular expression <code>/^[a-z][a-z\d]*(?:-[a-z][a-z\d]*)*$/</code>.</p>
<p>Edit <code>components.json</code> to register the new language by adding it to the <code>languages</code> object. (Please note that all language entries are sorted alphabetically by title.) <br>
<p>If your language definition depends any other languages, you have to specify this here as well by adding a <codeclass="language-js">"require"</code> property. E.g. <codeclass="language-js">"require": "clike"</code>, or <codeclass="language-js">"require" : ["markup", "css"]</code>.</p>
<p><em>Note:</em> Any changes made to <code>components.json</code> require a rebuild (see step 3).</p>
</li>
<li>
<p>Rebuild Prism by running <codeclass="language-bash">npx gulp</code>.</p>
<p>This will make your language available to the <ahref="test.html">test page</a>, or more precise: your local version of it. You can open your local <code>test.html</code> page in any browser, select your language, and see how your language definition highlights any code you input.</p>
<p>Aliases for are useful if your language is known under more than just one name or there are very common abbreviations for your language (e.g. JS for JavaScript). Keep in mind that aliases are very similar to language ids in that they also have to be unique (i.e. there cannot be an alias which is the same as another alias of language id) and work as CSS class names.</p>
<p>In this example, we will register the alias <code>foo</code> for <code>foo-bar</code> because Foo Bar code is stored in <code>.foo</code> files.</p>
<p>To add the alias, we add this line at the end of <code>prism-foo-bar.js</code>:</p>
<p>Aliases also have to be registered in <code>components.json</code> by adding the <code>alias</code> property to the language entry. In this example, the updated entry will look like this:</p>
<pre><codeclass="language-js">"foo-bar": {
"title": "Foo Bar",
"alias": "foo",
"owner": "Your GitHub name"
}</code></pre>
<p><em>Note:</em><code>alias</code> can also be a string array if you need to register multiple aliases.</p>
<p>Using <code>aliasTitles</code>, it's also possible to give aliases specific titles. In this example, this won't be necessary but a good example as to where this is useful is the markup language:</p>
<p>Create a folder <code>tests/languages/foo-bar/</code>. This is where your test files will live. The test format and how to run tests is described <ahref="test-suite.html">here</a>.</p>
<p>You should add a test for every major feature of your language. Test files should test the common case and certain edge cases (if any). Good examples are <ahref="https://github.com/PrismJS/prism/tree/master/tests/languages/javascript">the tests of the JavaScript language</a>.</p>
<p>This command will check only your test files. The new test will fail because the specified JSON is incorrect but the error message of the failed test will also include the JSON of the simplified token stream Prism created. This is what we're after. Replace the current incorrect JSON with the output labeled <em>Token Stream</em>. (Please also adjust the indentation. We use tabs.)</p>
</li>
<li>
<p><strong>Carefully check</strong> that the token stream JSON you just inserted is what you expect.</p>
<p>Create a new file <code>examples/prism-foo-bar.html</code>. This will be the template containing the example markup. Just look at other examples to see how these files are structured. <br>
We don't have any rules as to what counts as an example, so a single <em>Full example</em> section where you present the highlighting of the major features of the language is enough.</p>
<p>Prism’s plugin architecture is fairly simple. To add a callback, you use <codeclass="language-javascript">Prism.hooks.add(hookname, callback)</code>.
<code>hookname</code> is a string with the hook id, that uniquely identifies the hook your code should run at.
<code>callback</code> is a function that accepts one parameter: an object with various variables that can be modified, since objects in JavaScript are passed by reference.
<p>Of course, to understand which hooks to use you would have to read Prism’s source. Imagine where you would add your code and then find the appropriate hook.
If there is no hook you can use, you may <ahref="https://github.com/PrismJS/prism/issues">request one to be added</a>, detailing why you need it there.
<dd>The element containing the code. It must have a class of <code>language-xxxx</code> to be processed, where <code>xxxx</code> is a valid language identifier.</dd>
<p>This is the heart of Prism, and the most low-level function you can use. It accepts a string of text as input and the language definitions to use, and returns an array with the tokenized code.
When the language definition includes nested tokens, the function is called recursively on each of these tokens. This method could be useful in other contexts as well, as a very crude parser.</p>