How to Build a Parse Tree in C++

  1. Use the if-else Statements to Build a Parse Tree in C++
  2. Use Conditional Statements to Build a Parse Tree in C++
  3. Use Polymorphism to Build a Parse Tree in C++
  4. Conclusion
How to Build a Parse Tree in C++

A parse tree, also known as a syntax tree, is a hierarchical representation of the syntactic structure of a source code or a sequence of tokens. It serves as a vital tool in compilers, interpreters, and language processing applications.

In this article, we will explore different methods to build parse trees in C++, providing detailed explanations and examples for each method. We will teach three primary C++ methods to build a parse tree in C++.

The first method uses selection statements if (condition) statement and if (condition) statement else statement. The second method uses conditional statements with a simple recursive structure, and the third uses polymorphism.

Creating a parse tree requires lexing (recognizing words and removing comments), parsing (structuring the linear input stream into an abstract syntax tree), and evaluation or code generation.

Use the if-else Statements to Build a Parse Tree in C++

A parse tree (as a C++ BNF) features context-free grammar for parsing, and if-else or selection statements play an important part in its creation. Alternatively, you can use try {if (statement) add (statement-tag);} finally {show(your_code);} to point your parser to a condition that is true to execute a block of code.

It’s easy to understand the order of evaluation for the whole expression through the hierarchy of the parse tree as it plays an integral part in the sentence and mathematical expression creation to represent real-world constructions. The creation of a parse tree is based on the four primary rules, coded in the following C++ code sequence-wise.

In each clause of the if conditional statement, you can understand and study the parse tree created by implementing the rules with calls to the binary tree methods. Most importantly, it’s essential to understand the reason in case of any error, and using the value error exception in the else clause of the conditional statement is extremely helpful.

#include <algorithm>
#include <iostream>
#include <sstream>
#include <stack>
#include <string>
#include <vector>

using namespace std;

class parsetree_main {
 private:
  string root_key;
  parsetree_main *node_left;
  parsetree_main *node_right;

 public:
  parsetree_main(string obj_root) {
    this->root_key = obj_root;
    this->node_left = NULL;
    this->node_right = NULL;
  }

  void insertNode_left(string insert_newnode) {
    if (this->node_left == NULL) {
      this->node_left = new parsetree_main(insert_newnode);
    } else {
      parsetree_main *t = new parsetree_main(insert_newnode);
      t->node_left = this->node_left;
      this->node_left = t;
    }
  }

  void insertNode_right(string insert_newnode) {
    if (this->node_right == NULL) {
      this->node_right = new parsetree_main(insert_newnode);
    } else {
      parsetree_main *active_parsetree = new parsetree_main(insert_newnode);
      active_parsetree->node_right = this->node_right;
      this->node_right = active_parsetree;
    }
  }

  parsetree_main *getchild_rightside() { return this->node_right; }

  parsetree_main *getchield_leftside() { return this->node_left; }

  void set_rootvalue(string obj_rootvalue) { this->root_key = obj_rootvalue; }

  string get_rootvalue() { return this->root_key; }
};

parsetree_main *build_newparsetree(string obj_parse) {
  string var_buffer;
  stringstream ss(obj_parse);
  vector<string> vector_list;
  while (ss >> var_buffer) {
    vector_list.push_back(var_buffer);
  }
  stack<parsetree_main *> parsetree_stack;
  parsetree_main *pri_parsetree = new parsetree_main("");
  parsetree_stack.push(pri_parsetree);
  parsetree_main *active_parsetree = pri_parsetree;

  string parsetreearray_1[] = {"+", "-", "*", "/"};
  vector<string> parse_vector_1(
      parsetreearray_1, parsetreearray_1 + (sizeof(parsetreearray_1) /
                                            sizeof(parsetreearray_1[0])));

  string parsetreearray_2[] = {"+", "-", "*", "/", ")"};
  vector<string> parse_vector_2(
      parsetreearray_2, parsetreearray_2 + (sizeof(parsetreearray_2) /
                                            sizeof(parsetreearray_2[0])));

  for (unsigned int i = 0; i < vector_list.size(); i++) {
    if (vector_list[i] == "(") {
      active_parsetree->insertNode_left("");
      parsetree_stack.push(active_parsetree);
      active_parsetree = active_parsetree->getchield_leftside();
    }

    else if (find(parse_vector_1.begin(), parse_vector_1.end(),
                  vector_list[i]) != parse_vector_1.end()) {
      active_parsetree->set_rootvalue(vector_list[i]);
      active_parsetree->insertNode_right("");
      parsetree_stack.push(active_parsetree);
      active_parsetree = active_parsetree->getchild_rightside();
    }

    else if (vector_list[i] == ")") {
      active_parsetree = parsetree_stack.top();
      parsetree_stack.pop();
    }

    else if (find(parse_vector_2.begin(), parse_vector_2.end(),
                  vector_list[i]) == parse_vector_2.end()) {
      try {
        active_parsetree->set_rootvalue(vector_list[i]);
        parsetree_main *node_parent = parsetree_stack.top();
        parsetree_stack.pop();
        active_parsetree = node_parent;
      }

      catch (string error_value) {
        cerr << "Token:  " << vector_list[i]
             << " Error: It's not a valid integer!" << endl;
      }
    }
  }
  return pri_parsetree;
}

void postorder(parsetree_main *createInst_parseTree) {
  if (createInst_parseTree != NULL) {
    postorder(createInst_parseTree->getchield_leftside());
    postorder(createInst_parseTree->getchild_rightside());
    cout << createInst_parseTree->get_rootvalue() << endl;
  }
}

int main() {
  parsetree_main *main_parsetree = build_newparsetree("( ( 10 + 5 ) * 3 )");
  postorder(main_parsetree);

  return 0;
}

The code defines a class parsetree_main representing nodes in a parse tree for arithmetic expressions. The code includes methods for inserting nodes, setting root values, and accessing child nodes.

A function build_newparsetree constructs a parse tree from a given arithmetic expression using a stack-based approach. The postorder function performs a postorder traversal of the parse tree, printing the root values.

The main function demonstrates parsing the expression "( ( 10 + 5 ) * 3 )" and printing the postorder traversal result:

Output:

10
5
+
3
*

A parse tree with symbol tables compiled from tree visits can even serve as the basis for a simple interpreter and allows transformations and optimizations of the source. A grammar-rule function requires adding a new child to generate a parse tree representing the source part matched by the corresponding grammar rule.

An early compiler that did not generate such a tree makes code generation much easier. However, memory is a precious resource, and its usage can overwhelm some programmers.

Use Conditional Statements to Build a Parse Tree in C++

Conditional statements, as the corresponding recursive descent parser, have a recursive structure. Furthermore, the interior conditions are parsed as <condition> -> if <expression> then <statement> [else <statement>].

The tokenized input sequence also has this form, which enables the parse tree to detect syntactic errors during the lexical analysis. Additionally, the outer conditions are parsed as <conditional> -> if <expression> then <statement> [else <statement>].

#include <iostream>
#include <vector>

using namespace std;

class priTree_binary {
 private:
  char node_root;
  priTree_binary *node_left = NULL;
  priTree_binary *node_right = NULL;

 public:
  priTree_binary(char value_rootnode) {
    // constructor
    node_root = value_rootnode;
  }
  void insert_rightnode(char rightnode_value) {
    priTree_binary *create_newnode = new priTree_binary(rightnode_value);
    if (node_right == NULL)
      node_right = create_newnode;
    else {
      create_newnode->node_right = node_right;
      node_right = create_newnode;
    }
  }
  void insert_leftnode(char leftnode_value) {
    priTree_binary *create_newnode = new priTree_binary(leftnode_value);
    if (node_left == NULL)
      node_left = create_newnode;
    else {
      create_newnode->node_left = node_left;
      node_left = create_newnode;
    }
  }
  priTree_binary *get_rightnode() { return node_right; }
  priTree_binary *get_leftnode() { return node_left; }
  char get_rootnode() { return node_root; }
  void set_rootnode(char rootnode_value) { node_root = rootnode_value; }
};

bool check_operator(char opr_token) {
  string valid_operators = "+-/*";
  for (unsigned long long int ing = 0; ing < valid_operators.length(); ing++)
    if (valid_operators[ing] == opr_token) return true;
  return false;
}

priTree_binary *build_parse_tree(string valid_expression) {
  vector<priTree_binary *> vector_stack;
  priTree_binary *binarytree_primary = new priTree_binary(' ');
  vector_stack.push_back(binarytree_primary);
  for (long long unsigned int i = 0; i < valid_expression.length(); i++) {
    if (valid_expression[i] == '(') {
      binarytree_primary->insert_leftnode(' ');
      vector_stack.push_back(binarytree_primary);
      binarytree_primary = binarytree_primary->get_leftnode();
    } else if (isdigit(valid_expression[i])) {
      binarytree_primary->set_rootnode(valid_expression[i]);
      binarytree_primary = vector_stack.back();
      vector_stack.pop_back();
    } else if (check_operator(valid_expression[i])) {
      binarytree_primary->set_rootnode(valid_expression[i]);
      binarytree_primary->insert_rightnode(' ');
      vector_stack.push_back(binarytree_primary);
      binarytree_primary = binarytree_primary->get_rightnode();
    } else {
      binarytree_primary = vector_stack.back();
      vector_stack.pop_back();
    }
  }
  return binarytree_primary;
}

int eval_parse_tree(priTree_binary *parsetree_main) {
  // cout << "Root: " << tree -> get_root() << endl;
  char _operation;
  int node_left, node_right;
  priTree_binary *leftnode_child = parsetree_main->get_leftnode();
  priTree_binary *rightnode_child = parsetree_main->get_rightnode();
  if (leftnode_child && rightnode_child) {
    _operation = parsetree_main->get_rootnode();
    node_left = eval_parse_tree(leftnode_child);
    node_right = eval_parse_tree(rightnode_child);
    switch (_operation) {
      case '+':
        return (int)node_left + (int)node_right;
      case '-':
        return (int)node_left - (int)node_right;
      case '*':
        return (int)node_left * (int)node_right;
      case '/':
        return (int)node_left / (int)node_right;
    }
  } else
    return parsetree_main->get_rootnode();
}

int main(void) {
  cout << "Parse tree build successful! \nThe result is: " << endl;
  cout << eval_parse_tree(build_parse_tree("(5+(2*7))"))
       << endl;  // returns 2803 instead of 19
}

The code defines a class priTree_binary for building binary trees to represent arithmetic expressions. It includes methods for inserting nodes, accessing child nodes, and getting/setting root values.

The code also provides functions for checking if a character is a valid operator, building a parse tree from an arithmetic expression, and evaluating the parse tree.

The main function demonstrates the construction and evaluation of a parse tree for the expression "(5+(2*7))", although there is an error in the expected result, which should be 19 instead of 2803.

Output:

Parse tree build successful!
The result is:
2803

The actual implementation of conditional statements for building a parse tree is much more tedious and detailed. However, the parse tree construction using conditional statements requires a form from a tokenized input sequence.

The function that evaluates the conditional statement based on a structure similar to the function that parses a conditional statement as:

<statement> -> cond(input): <evaluate> if <statement> then <expression> [else if <statement> then <evaluation> else <expression>]

It is important to evaluate the if part of the condition to a Boolean value to determine which portion of the conditional statement to evaluate.

Use Polymorphism to Build a Parse Tree in C++

This method of building a parse tree demonstrates the use of polymorphism in C++ as an example of an extremely useful data structure. It will parse an arithmetic expression and convert it into a tree structure whose nodes are arithmetic operators, and leaf nodes are numbers.

Most importantly, the parse tree representation does not require any parentheses or the knowledge of operator precedence and uniquely describes the calculation to be performed. You will learn how all parse tree nodes as objects inheriting from the class parsetree_node, which represents a number, and BinNode (which represents a binary operation).

#include <iostream>

class parsetree_node {
 public:
  virtual ~parsetree_node() {}
  virtual double binaryOpr_calculation() const = 0;
};

class numeric_node : public parsetree_node {
 public:
  numeric_node(double node_number) : parsetree_nodeRep(node_number) {}
  double binaryOpr_calculation() const;

 private:
  const double parsetree_nodeRep;
};

double numeric_node::binaryOpr_calculation() const {
  std::cout << "The numeric node: " << parsetree_nodeRep << std::endl;
  return parsetree_nodeRep;
}

class binary_node : public parsetree_node {
 public:
  binary_node(parsetree_node* parse_leftnode, parsetree_node* parse_rightnode)
      : imp_leftnode(parse_leftnode), imp_rightnode(parse_rightnode) {}
  ~binary_node();

 protected:
  parsetree_node* const imp_leftnode;
  parsetree_node* const imp_rightnode;
};

binary_node::~binary_node() {
  delete imp_leftnode;
  delete imp_rightnode;
}

class multiplication_node : public binary_node {
 public:
  multiplication_node(parsetree_node* parse_leftnode,
                      parsetree_node* parse_rightnode)
      : binary_node(parse_leftnode, parse_rightnode) {}
  double binaryOpr_calculation() const;
};

double multiplication_node::binaryOpr_calculation() const {
  std::cout << "Multiplying\n";
  return imp_leftnode->binaryOpr_calculation() *
         imp_rightnode->binaryOpr_calculation();
}

class add_node : public binary_node {
 public:
  add_node(parsetree_node* parse_leftnode, parsetree_node* parse_rightnode)
      : binary_node(parse_leftnode, parse_rightnode) {}
  double binaryOpr_calculation() const;
};

double add_node::binaryOpr_calculation() const {
  std::cout << "Adding\n";
  return imp_leftnode->binaryOpr_calculation() +
         imp_rightnode->binaryOpr_calculation();
}

int main() {
  // ( 20.0 + (-10.0) ) * 0.1
  parsetree_node* parsetree_node_1 = new numeric_node(20.0);
  parsetree_node* parsetree_node_2 = new numeric_node(-10.0);
  parsetree_node* parsetree_node_3 =
      new add_node(parsetree_node_1, parsetree_node_2);
  parsetree_node* parsetree_node_4 = new numeric_node(0.1);
  parsetree_node* parsetree_node_5 =
      new multiplication_node(parsetree_node_3, parsetree_node_4);
  std::cout << "Calculating the parse tree\n";

  // Tell the root to calculate itself
  double x = parsetree_node_5->binaryOpr_calculation();
  std::cout << "Result: " << x << std::endl;

  delete parsetree_node_5;  // and all children
}

The code implements a parse tree for arithmetic expressions using polymorphism. The code defines a hierarchy of classes representing nodes in the parse tree, including numeric nodes and binary operation nodes (addition and multiplication).

The program creates a parse tree for the expression (20.0 + (-10.0)) * 0.1, calculates the result using the binaryOpr_calculation() function, and displays the process. The use of polymorphism allows for a flexible and extensible representation of different types of nodes in the parse tree.

The output demonstrates the construction and calculation of the parse tree, producing a result of 1.

Output:

Calculating the parse tree
Multiplying
Adding
The numeric node: 20
The numeric node: -10
The numeric node: 0.1
Result: 1

The numeric_class class stores a double value (initialized in its constructor) and overloads the Calc() virtual function. The binary_node class knows how to implement the virtual function inherited from the parsetree_node class.

A parse tree resolves and describes the syntactic roles of its components or performs the act of parsing a string or a text in a hierarchical tree structure (with a root value) represented as a group of linked nodes. In C++, it represents terminal and non-terminal symbols to derivate the grammar to yield input strings, and each interior node represents the productions of a grammar.

A parse tree uses a single physical tree node per non-terminal (generally, this results in a huge tree). It features two pointer members, a member used for node identification, and a virtual destructor to support derived nodes.

Conclusion

This article explores methods for building parse trees in C++, providing detailed explanations and examples for each approach. Three primary methods are covered:

  1. Using if-else Statements:
    • Demonstrates the construction of a parse tree for arithmetic expressions using if-else statements.
    • The approach involves a stack-based algorithm, creating nodes and handling different cases for parentheses, operators, and operands.
  2. Conditional Statements with Recursive Structure:
    • Introduces a method based on conditional statements with a recursive structure to build a binary tree for representing arithmetic expressions.
    • Code includes functions for constructing the parse tree and evaluating expressions.
  3. Using Polymorphism:
    • Demonstrates the use of polymorphism in C++ to build a parse tree for arithmetic expressions.
    • Defines classes for numeric nodes and binary operations, showcasing dynamic evaluation of expressions.

This tutorial has taught you how easy it is to build parse trees during parsing and improve and build highly efficient parsers. A parse tree can make anything easier, and its only disadvantage is its high runtime and memory cost.

Syed Hassan Sabeeh Kazmi avatar Syed Hassan Sabeeh Kazmi avatar

Hassan is a Software Engineer with a well-developed set of programming skills. He uses his knowledge and writing capabilities to produce interesting-to-read technical articles.

GitHub

Related Article - C++ Tree