Customizable Parsing Test Repository: Language

Constructor

new Language(name, converter, groupers, linter)

To add a new language to a Converter, call this constructor and pass the converter object as the second parameter; the constructor installs the new language in that converter.

The default grouping symbols for any newly installed language are ( and ). You may wish to override this default for a number of reasons. For example, internally, when this class adds the putdown language, it specifies the empty array, since putdown uses no grouping symbols. And if you were to define a parser for LaTeX, you might want to add the symbols { and } to the list, since they are groupers in LaTeX. Note that the array must be of even length, pairs of open and close groupers, in that order, as in [ '(', ')', '{', '}' ] for LaTeX.

The default linter for any language is the identity function, meaning that no cleanup is needed for expressions of that language. If you want the convert() function, upon creating an expression in this language, to apply to it any specific formatting conventions you would like to see in the output, you can specify a linter, which will be run before convert() returns its result. Add such a function only if you see output from convert() that doesn't meet your standards, aesthetically or for some functional reason.

For example, when the constructor for this class installs putdown as the initial language, it provides a linter that removes unnecessary spaces around parentheses.
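
As a rough illustration of the two optional arguments, the sketch below builds a LaTeX-style groupers array and a simple linter. The helper names (groupersToPairs, tidyParentheses, installLatex) are hypothetical and not part of this repository. The linter assumes it receives a string and returns a string, and it is only a guess at the kind of cleanup described above, not the linter putdown actually uses. The Language class and a Converter instance are passed in as arguments because how they are imported depends on your setup.

```javascript
// Flat array of open/close pairs, in that order, as described above.
const latexGroupers = ['(', ')', '{', '}']

// Hypothetical helper: enforce the even-length rule and view the array as pairs.
function groupersToPairs(groupers) {
  if (groupers.length % 2 !== 0)
    throw new Error('groupers must contain open/close pairs (even length)')
  const pairs = []
  for (let i = 0; i < groupers.length; i += 2)
    pairs.push([groupers[i], groupers[i + 1]])
  return pairs
}

// Hypothetical linter (assumed signature: string in, string out).
// It removes whitespace just inside parentheses.
const tidyParentheses = text =>
  text.replace(/\(\s+/g, '(').replace(/\s+\)/g, ')')

// Construct the language. Per the text above, the constructor itself
// installs the new language into the given converter.
function installLatex(Language, converter) {
  groupersToPairs(latexGroupers) // throws if the array is malformed
  return new Language('latex', converter, latexGroupers, tidyParentheses)
}
```

Passing an empty array as groupers (as is done for putdown) or omitting the linter would follow the same pattern.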

Parameters

name String
the name of the new language (e.g., "latex")
converter Converter
the converter into which to install this language
groupers Array.<String>
any pairs of grouping symbols used by the language (as documented above)
linter function <nullable>
a function that cleans up notation in this language (as documented above)

Source

language.js , line 34

Members

regularExpressions

This static member of the class contains regular expressions for some common types of notation. The following regular expressions are available to make it easier to define new concepts or notations.

oneLetterVariable - a single letter variable expressed in Roman letters (lower-case or upper-case A, B, C, ...)
nonnegativeInteger - an integer expressed using just the digits 0-9
integer - same as the previous, but possibly preceded by a -
nonnegativeNumber - a number expressed using the digits 0-9 and a decimal point (optional)
number - same as the previous, but possibly preceded by a -
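
To make the descriptions above concrete, here is a sketch of patterns that would fit them. These are illustrative guesses, not the expressions stored in Language.regularExpressions, which may differ (for instance, in whether a number may start or end with the decimal point; this sketch requires digits on both sides). The names sketchedPatterns and matchesWhole are hypothetical.

```javascript
// Illustrative patterns only; consult Language.regularExpressions for the real ones.
const sketchedPatterns = {
  oneLetterVariable: /[a-zA-Z]/,
  nonnegativeInteger: /[0-9]+/,
  integer: /-?[0-9]+/,
  nonnegativeNumber: /[0-9]+(?:\.[0-9]+)?/,
  number: /-?[0-9]+(?:\.[0-9]+)?/
}

// Hypothetical helper: does the entire string match the pattern?
function matchesWhole(pattern, text) {
  return new RegExp('^(?:' + pattern.source + ')$').test(text)
}

console.log(matchesWhole(sketchedPatterns.integer, '-42'))            // true
console.log(matchesWhole(sketchedPatterns.nonnegativeInteger, '-42')) // false
console.log(matchesWhole(sketchedPatterns.number, '3.14'))            // true
console.log(matchesWhole(sketchedPatterns.oneLetterVariable, 'xy'))   // false
```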

Source

language.js , line 359

Methods

static fromJSON(name, converter, json, addConceptsAlso)

Rather than creating an empty language using this class's constructor, then adding each notation with a separate function call, you can construct an instance and add all the notations in one function call with this method.

The format for the JSON data structure passed as the third argument is as follows.

It should have a "groupers" field that is an array of strings containing the exact same data you would pass as the groupers argument to the constructor.
It should have a "notations" field that is an array of objects, each object having the fields "concept", "notation", and "options", which correspond directly to the three parameters of the addNotation() function.

To see an example of such a data structure, examine the contents of the file latex-notation.js in this repository.
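
For a quick impression of the shape without opening that file, the sketch below shows a small, made-up description in the format just described. The concept names ('addition', 'negation', 'logicalfalse'), the language name 'tiny', and the helpers looksLikeLanguageJSON and buildFromJSON are all hypothetical; the contents of latex-notation.js remain the authoritative example. The Language class and converter are passed in because their import paths depend on your setup.

```javascript
// Hypothetical language description; concept names are placeholders and
// must match concepts known to your converter.
const tinyLanguageJSON = {
  groupers: ['(', ')', '{', '}'],
  notations: [
    { concept: 'addition', notation: 'A+B', options: {} },
    { concept: 'negation', notation: '-A', options: {} },
    { concept: 'logicalfalse', notation: '\\bot', options: {} }
  ]
}

// Hypothetical sanity check of the two documented fields.
function looksLikeLanguageJSON(json) {
  return Array.isArray(json.groupers)
    && json.groupers.length % 2 === 0
    && json.groupers.every(g => typeof g === 'string')
    && Array.isArray(json.notations)
    && json.notations.every(n =>
      typeof n.concept === 'string' && typeof n.notation === 'string')
}

function buildFromJSON(Language, converter) {
  if (!looksLikeLanguageJSON(tinyLanguageJSON))
    throw new Error('malformed language description')
  // Fourth argument omitted, so addConceptsAlso takes its default value.
  return Language.fromJSON('tiny', converter, tinyLanguageJSON)
}
```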

Parameters

name string
the name of the language, just as in this class's constructor
converter Converter
a Converter instance, just as in this class's constructor
json Object
the JSON representation of the language, as described above
addConceptsAlso boolean (default: true)
if true, before constructing the Language, examine all built-in concepts mentioned in any of its notations, and add them to the converter using addBuiltIns().

Source

language.js , line 398

addNotation(conceptName, notation, options)

Add a new notation to this language, for one of its converter's concepts. Specify the name of the concept being represented, then the notation using a string in which the letters A, B, and C represent the first, second, and third arguments, respectively. (You can omit any arguments you do not need. For example, you might write A+B for addition, -A for negation, or just \\bot for the logical constant "false" in LaTeX.)

The options object supports the following fields.

If you need to use one of the letters A, B, or C in the notation itself, or if you need more than three parameters in your notation (continuing on to D, E, etc.), you can use the options object to specify the variables in your notation. For example, you could use the notation x+y and then use the options object to specify { variables : [ 'x', 'y' ] }. Note that every occurrence of a variable counts as the variable (except inside another word), even if it is used multiple times, so choose variable names that do not otherwise show up in the notation you are introducing.
If this notation should be used only for representing the concept in this language, but not for parsing from this language into an AST, then you can set writeOnly : true. This can be useful in two cases.
1. If you have multiple notations for the same concept in some languages, but not in others, you can map each notation to a separate concept, then map all of those concepts to one notation in the smaller language, marking all but one as write-only, thus establishing a canonical form. Translation between any two languages that support all the notations can then still preserve the notational subtleties.
2. If you have some notation that is just a shorthand for a more complex notation, you can parse the notation to a concept named for that notation, but convert to putdown form in a write-only way, expanding the notation to its underlying (compound) meaning. Then the converter will not attempt to invert that expansion when parsing putdown, but will preserve its expanded meaning.

There are no other options at this time besides those documented above, but the options object is available for future expansion.
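
The calls below sketch how the default variables, the variables option, and the writeOnly option might be used together. The function name addSampleNotations and every concept name are hypothetical; the language argument is assumed to be an existing Language instance whose converter already knows those concepts. The last two calls illustrate case 1 above, with two concepts sharing one notation in a smaller language.

```javascript
// `language` is assumed to be a Language instance; concept names are placeholders.
function addSampleNotations(language) {
  // Default variables: A and B stand for the first and second arguments.
  language.addNotation('addition', 'A+B')
  // Only one argument is needed, so only A appears.
  language.addNotation('negation', '-A')

  // Custom variable names supplied through the options object.
  language.addNotation('subtraction', 'x-y', { variables: ['x', 'y'] })

  // Two concepts share one notation here; only 'division' is used when
  // parsing, while 'fraction' is written in the same form but never parsed.
  language.addNotation('division', 'A/B')
  language.addNotation('fraction', 'A/B', { writeOnly: true })
}
```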

Parameters

conceptName String
the name of the concept represented by this new notation
notation String
the notation being added
options Object <nullable>
any additional options, as documented above

Source

language.js , line 183

convertTo(text, language, ambiguous) → {String|Array.<String>}

Convert text in this language to text in another language. If the text cannot be parsed in this language, then undefined is returned instead. Note that this object and language must have the same Converter instance associated with them, or this function will throw an error.

Parameters

text String
the text in this language to be converted to the other language
language Language
the destination language
ambiguous boolean (default: false)
passed to the parse() function, and thus determines whether the result of this is a string or an array thereof

Returns

String | Array.<String>
the converted text, if the conversion was possible, and undefined otherwise (or an array of strings if ambiguous is true)
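
Because the result can be a string, undefined, or an array, callers may want to handle each case explicitly. The following is a minimal sketch of one way to do that; the function names translateOrExplain and allTranslations and the shape of the returned objects are hypothetical. Both language arguments are assumed to be Language instances.

```javascript
// Convert with ambiguous left at its default (false): string or undefined.
function translateOrExplain(source, destination, text) {
  let result
  try {
    result = source.convertTo(text, destination)
  } catch (error) {
    // Thrown, for example, when the two languages do not share a Converter.
    return { ok: false, reason: 'error: ' + error.message }
  }
  if (result === undefined)
    return { ok: false, reason: 'text could not be parsed' }
  return { ok: true, text: result }
}

// Convert with ambiguous set to true: an array of strings is expected.
function allTranslations(source, destination, text) {
  const results = source.convertTo(text, destination, true)
  return Array.isArray(results) ? results : []
}
```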

Source

language.js , line 312

parse(text, ambiguous) → {AST|Array.<AST>}

Treat the given text as an expression in this language and attempt to parse it. Return an abstract syntax tree (AST) on success, or undefined on failure. Or, if you set the optional second parameter to true, it will return an array of all possible parsings, each as an AST.

Parameters

text String
the input text to parse
ambiguous boolean (default: false)
if true, return all possible meanings of the given text, which will be more than one if the text is ambiguous; defaults to false, which returns just one AST or undefined

See

compact()

Returns

AST | Array.<AST>
if ambiguous is set to false, returns the parsed AST, or undefined if parsing failed; if ambiguous is set to true, returns all parsed ASTs as an array, which may be empty
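
The two modes can be wrapped in small helpers, as in the sketch below. The names parseOrNull and parseStrictly are hypothetical, and the language argument is assumed to be a Language instance. The strict variant uses the ambiguous mode to reject text that has no parsing or more than one.

```javascript
// Default mode: one AST, or undefined on failure (mapped to null here).
function parseOrNull(language, text) {
  const ast = language.parse(text)
  return ast === undefined ? null : ast
}

// Ambiguous mode: inspect every parsing and insist on exactly one.
function parseStrictly(language, text) {
  const all = language.parse(text, true) // array, possibly empty
  if (all.length === 0)
    throw new Error('cannot parse: ' + text)
  if (all.length > 1)
    throw new Error('ambiguous (' + all.length + ' parsings): ' + text)
  return all[0]
}
```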

Source

language.js , line 267

rulesFor(target) → {Array}

Get all grammar rules for the given concept or syntactic type. The result is an array of the right-hand sides of the grammar rules for the concept or syntactic type. Each such right-hand side is the array of tokens or type names used internally by the parser.

Parameters

target String | AST
if this is a string, it must be the name of the concept or syntactic type to look up; if it is a leaf AST, then its contents as a string are used; if it is a compound AST, then its head is used

Returns

Array
an array of the right-hand sides of the grammar rules
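
When debugging a grammar, it can help to print these right-hand sides in readable form. The helper below is a minimal sketch of that idea; the name describeRules is hypothetical, and it assumes each entry of a right-hand side can be converted to text with String(), which holds for strings and regular expressions but has not been checked against the parser's internal token types.

```javascript
// Return one line of text per grammar rule for the given concept or type.
// `language` is assumed to be a Language instance.
function describeRules(language, target) {
  return language.rulesFor(target).map(rightHandSide =>
    rightHandSide.map(String).join(' '))
}
```

Logging the result of describeRules for a concept name gives a quick view of every notation currently registered for it.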

Source

language.js , line 337