Customizable Parsing Test Repository

Class

AST

An abstract syntax tree (AST) is a data structure typically created by parsing text using a grammar. We create this class for that purpose. It has the following attributes:

  • Its contents, which may be a string to indicate that this AST is a leaf of the tree, or it may be an array of other ASTs, which would be the operator and operands, in that order.
  • The Language from which it was parsed.

Any AST object then has methods for workflows that involve parsing, representing, and classifying.

Constructor

new AST(language, contents)

Construct a new AST from two ingredients, the Language from which it was parsed, and its contents (which can be a string if this AST is a leaf, or an array of ASTs otherwise). Also, the second parameter can be a JSON structure of nested arrays and strings, which will be recursively passed to AST constructors to convert it to an AST.

Note that constructing an AST from a JSON structure takes it as given, and does not alter it. To construct an AST that performs changes based on associativity of concepts, see fromJSON().

In an AST, every head (i.e., operator) must be a leaf AST. Even if, for example, you plan to use ASTs to represent function application, you must use a leaf AST to represent the idea of function application. Thus, for example, "f composed with g, applied to x" should not be represented as the structure [ [ 'compose', 'f', 'g' ], 'x' ], but rather as the structure [ 'apply', [ 'compose', 'f', 'g' ], 'x' ].

Parameters

  • language Language

    the language from which this AST was parsed

  • contents String | Array

    see explanation above

Source

Classes

AST

Methods

static

fromJSON(language, json, topnullable) → {AST}

The Language class can parse text into hierarchies of JavaScript arrays. Those arrays sometimes contain unnecessary information about the details of the parsing. For instance, if x*y were the notation for multiplication in some language, then that Language class's parsing function would return an array with contents something like ['multiplication','x','*','y']. In other words, all tokens that were parsed become part of the parsing result, even though one of them is redundant now that the semantic type (in this example "multiplication") has been added.

This function not only converts such an array into an AST instance, but also removes the unnecessary tokens (the '*' entry in the example above), since the meaning is clear from the first element of the array, the name of the semantic type parsed from the '*' infix operator.

This function also flattens any nested ASTs for concepts that are classified as associative. For example, if the JSON representation of the AST were ['addition','x',['addition','y','z']], and the concept of addition were marked associative (in the Converter's list of concepts), then this function will not create a nested tree imitating that JSON, but will create one isomorphic to ['addition','x','y','z'] instead. (Note that associativity is actually not a predicate about each operator separately, but a binary relation between the outer and inner operators, so this example is a special case.)

Parameters

  • language Language

    the language from which the JSON was parsed, and which should be used when constructing the AST

  • json Array

    a hierarchy of JavaScript Arrays (with strings as leaves) representing an AST

  • top boolean <nullable>
    true

    true if this is the top-level call, for internal use only; clients should omit this parameter

Returns

  • AST

    the AST represented by the JSON

Source

arg(index) → {AST}

A non-leaf AST has a nonempty array of children, beginning with the operator of the AST, often called its head, followed by the operands. This function returns the operand at the given index, where the first operand after the head has index 0, and the last operand has index this.numArgs() - 1. If this AST is a leaf, this function returns undefined.

Parameters

  • index number

    the index of the operand to return

Returns

  • AST

    the operand at the given index

Source

args() → {AST}

A non-leaf AST has a nonempty array of children, beginning with the operator of the AST, often called its head. This function returns the rest of that list, the operands, as a JavaScript array (not an AST). If this AST is a leaf, this function returns undefined.

See

Returns

  • AST

    the operands of this AST

Source

associatesCorrectly() → {boolean}

One of the options that can be provided when defining a new concept in a Converter is how it associates when used multiple times in a row. For example, if we see the ordinary mathematical notation 1+2+3, does that mean (1+2)+3 or 1+(2+3)? In some cases, it doesn't matter, but in others it does. For example, the implication arrow of propositional logic typically associates to the right, as in A->(B->C).

This function checks to ensure that every subtree of this AST associates according to the options associated with every one of its operators. This can be used to filter out undesired results from a list of possible ASTs generated by parsing. If an operator has no "associates" option set for it, then it has no restrictions on how it associates, and thus ambiguity of parsing remains for that operator.

Note that this function checks only when an operator is nested inside of itself, because that is what the "associates" option on concepts means. Note further that it applies only to binary operators; associativity is not defined for other types. In other words, only subtrees of the form op(op(arg1,arg2),arg3) or op(arg1,op(arg2,arg3)) are considered when determining the return value of this function.

Finally, if a subtree was created originally from notation that used groupers, then there was no ambiguity to resolve, and thus there is no check we need to perform. The user is always free to override the default associating behavior with the use of groupers.

Returns

  • boolean

    true if this AST associates according to the options provided in the "associates" option for all of the concepts used within it, false otherwise

Source

concept() → {Object}

The SyntacticTypes module defines a set of syntactic types common to mathematical notation. Each Converter instance can define a set of semantic types to add to the set of syntactic types. Those semantic types are called concepts. An AST can represent a concept in either of two ways, as documented in the isConcept() method.

When an AST represents a concept, this function returns that concept as an object with the following properties.

  • parentType, the name of the parent syntactic type
  • putdown, the putdown representation of the concept (for example, it might be (+ summand summand) for the concept addition)
  • typeSequence, an array of the syntactic types of each of the arguments extracted from the putdown notation (for example, it would be ["summand","summand"] in the example above)

Returns

  • Object

    the data about the concept represented in this AST

Source

A non-leaf AST has a nonempty array of children, beginning with the operator of the AST, often called its head. This function returns that entry. (It does not shift the head off the array, but only returns it, leaving the AST unaltered.) If this AST is a leaf, this function returns undefined.

See

Returns

  • AST

    the operator of this AST, if one exists

Source

isCompound() → {boolean}

An AST may be a leaf or it may have subtrees, that is to say, it may be compound. This function detects which is the case, and returns true iff this AST is compound.

See

Returns

  • boolean

    whether this AST is a non-leaf (i.e., is compound)

Source

isConcept() → {boolean}

The SyntacticTypes module defines a set of syntactic types common to mathematical notation. Each Converter instance can define a set of semantic types to add to the set of syntactic types. Those semantic types are called concepts. This function asks whether this AST represents a concept. This can happen in two ways.

  • First, if this AST is compound, then its head may represent a concept. Consider two examples:
    • If the operator were the leaf "expression", that would indicate that this AST represents an instance of the syntactic type "expression", and this AST would not be a concept, and this function would return false.
    • If this AST were for a Language whose Converter defined the concept "factorial" to be a specific type of expression, and this AST began with the operator "factorial", then this function would return true, because this AST is not an instance of a syntactic type (expression), but an instance of a semantic type (factorial).
  • Second, if this AST is a leaf, then its contents (as a string) may represent a concept. For example, if this AST were for a Language whose Converter defined the constant "pi" as a concept, and this AST is a leaf containing just the string "pi", then this function would return true.

Returns

  • boolean

    true if this AST represents a concept

Source

isLeaf() → {boolean}

An AST may be a leaf or it may have subtrees. This function detects which is the case, and returns true iff this AST is a leaf. This is equivalent to !this.isCompound(), and is therefore just a convenience function.

Returns

  • boolean

    whether this AST is a leaf

Source

numArgs() → {number}

A non-leaf AST has a nonempty array of children, beginning with the operator of the AST, often called its head, followed by the operands. This function returns the number of operands (which is one less than the length of the array). If this AST is a leaf, this function returns undefined.

See

Returns

  • number

    the number of operands

Source

toJSON() → {Array|String}

Converts this AST into a nested hierarchy of JavaScript arrays, which is isomorphic to the AST hierarchy itself, but without any of the data or features provided by the AST class. For example, the compound AST whose string representation is addition(x,multiplication(y,z)) would have the JSON form [ 'addition', 'x', [ 'multiplication', 'y', 'z' ] ]. But a leaf AST has its string contents as its JSON form (i.e., it is not an array).

Returns

  • Array String

    a JSON representation of this AST

Source

toLanguage(language) → {String}

Represent this AST in the given language. For example, if the AST were one whose string representation were addition(x,y), then we might call .toLanguage(latex) on that AST and expect to get x+y, or we might call .toLanguage(putdown) and expect to get (+ x y). The parameter passed to toLanguage() must be a Language that shares the same Converter instance as this AST's language.

This function requires the AST on which it is called to be in compact form, as produced by the compact() member function. The behavior of this function is undefined if this requirement is not met.

Parameters

  • language Language

    the language in which to write the expression stored in this AST

Returns

  • String

    the representation, in the specified language, of this AST

Source

toString() → {String}

A simple string representation of this AST, useful for debugging. For example, if this AST represented the addition of x and the product of y and z, the representation might be addition(x,multiplication(y,z)).

Returns

  • String

    a simple string representation of this AST

Source