LinuxQuestions.org
Share your knowledge at the LQ Wiki.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 08-13-2014, 07:12 AM   #61
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,688
Blog Entries: 4

Rep: Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947

I still detect the lack of "separation of concerns" here. You're still muddling code-generation with parsing.

An AST structure for an IF-statement might, for example, be a 2-tree with children for the "then" and "else" parts, with children consisting of, say, a list of statement-blocks for each part. Once the statement was completely recognized, code would be generated from the AST (fragment). It might be littered with semantic clues, as well.

One analogy that has been bantered-about is that, whereas the parser deals with "words," the AST represents "sentences."

Really, an example is worth a thousand words, and on the Internet such examples are legion. One of the obligatory (for some reason) classes that you always find at the graduate level in a computer-science curriculum is the building of a compiler, fortunately almost-always using a toolkit.
 
Old 08-13-2014, 10:44 PM   #62
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
So I got it to work now. It is a flat tree, but it works. Here is my definitions:

Code:
class ASTNode
{
 protected:
  String name;
  Array<String> value;
  unsigned long valuecount;


 public:
  ASTNode(const String n = "");
  ASTNode(const Array<String> v, const String n = "");
	
  ASTNode(ASTNode & n);

  ~ASTNode();

  
  String getName();
  Array<String> getValue();

  void setName(const String n);
  void setValue(const Array<String> v);


  int conprint(const char * format, ...);


  ASTNode operator=(ASTNode & n);
};

class ASTTree
{
 protected:
  Array<ASTNode *> nodes;


 public:
  ASTTree();
  ASTTree(ASTTree & t);
  ~ASTTree();


  void add_node(ASTNode & node);
  void remove_node();
  void remove_node(String n);

  ASTNode & getNode(String n);
  ASTNode & getNode(unsigned long offset);
  void setNode(String n, ASTNode & node);
  void setNode(unsigned long offset, ASTNode & node);

  unsigned long nodenumber();
};
 
Old 08-16-2014, 12:32 AM   #63
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
I need a little help on a possible algorithm to add the line numbers in. Here is an example of code:

Code:
VERSION TVOID 0V


; Begin if :1
ALOAD TBOOLEAN true
CGOTOL TVOID if_part :1
GOTOL TVOID else_part_or_end :1


LBL TVOID 0V
VOID TVOID 0V
; End if code :1
GOTOL TVOID end


LBL TVOID 0V
; Begin if :2
ALOAD TBOOLEAN false
CGOTOL TVOID if_part :2
GOTOL TVOID else_part_or_end :2


LBL TVOID 0V
VOID TVOID 0V
; End if code :2
GOTOL TVOID end


LBL TVOID 0V
; End if or else code :1
GOTOL TVOID end


LBL TVOID 0V
; End if :1

END TVOID 0V
What I need to do, is use the comments to add the line numbers. Here is what I got so far for an algorithm:

Code:
  1. Count the labels until there is an if-else.
  2. Count the if else labels.
  3. Replace end with the results of (step 1 - 1) + (step 2 - 1).
I need to have a complete algorithm to add the line numbers, and it needs to work with all the possible if-else possibilities.

I "think" the comments and psudocode give enough information for the program to figure it out from there. Am I on the right track for what might work? I'd like to learn more about these ASTs, but for now, whatever will work will be fine, if I'm still mixing the code generation in too much. As long as it works, for now it is a fine algorithm. That way I can simply continue to learn as I go, and not have to wait until I finally figure out the ASTs, before finishing.
 
Old 08-16-2014, 06:35 AM   #64
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,688
Blog Entries: 4

Rep: Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947
What I did, in a script compiler that needed to refer to line-numbers for error locating purposes, was to simply insert the source-line number (counted by the scanner and ignoring $INCLUDEs) directly into every p-code instruction that was generated. A source-line-range reference (not always accurate) was also inserted into AST entries. It enabled you to at least get close to the line-number where the error occurred, for debugging only. (The error-messages say, "near line such-and-so," and provide no useful information at all if the error is found in an included file. They're used to point a source-file editor to an approximate location, and nothing more.

Line-numbers should not be a syntactically-significant part of your language. Even the earliest BASIC interpreters got rid of the notion of "one statement per line," and used "line numbers" only as labels ... later replacing them with, well, labels. No one missed that throwback to the teletype days.
 
Old 08-18-2014, 05:22 PM   #65
pwalden
Member
 
Registered: Jun 2003
Location: Washington
Distribution: Raspbian, Ubuntu, Chrome/Crouton
Posts: 374

Rep: Reputation: 50
Quote:
Originally Posted by sundialsvcs View Post
Line-numbers should not be a syntactically-significant part of your language. Even the earliest BASIC interpreters got rid of the notion of "one statement per line," and used "line numbers" only as labels ... later replacing them with, well, labels. No one missed that throwback to the teletype days.
I don't think he's listening.
 
Old 08-19-2014, 08:15 PM   #66
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
I realize now I said line numbers, and what I meant was that I was using label numbers to jump to a statement in my generated code. A label is the LBL instruction.
 
Old 08-19-2014, 08:28 PM   #67
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
The first label number in the program starts at 0, and it goes up from there.
 
Old 08-23-2014, 11:12 PM   #68
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
My book finally came from the library. It's called "Writing Compilers and Interpreters: An Applied Approach". By the end of reading the book, I will have a working Pascal compiler and interpreter. I will then modify it with my understanding of the concepts, or attempt to anyway, to compile to PNFASM, instead of to Microsoft Quick Assembler, which I cannot use. Then I will modify the language itself, to allow at the very least an ASM keyword, which will output PNFASM code directly, for the times which I need more features of the language. Then I will have my first working compiler for a high level language. I still want to make this language, as it's more powerful yet, but I will have lots of power at my fingertips after that.
 
Old 08-23-2014, 11:14 PM   #69
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
I still would like advice on the solution to this problem though, in case the book does not answer my question. The book covers C compilers, and not bison and flex compilers, but the ideas of the code generation and ASTs, will be the same.
 
Old 08-25-2014, 01:43 PM   #70
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
It is not clear what your question is. Actually, this topic looks like a blog, not an usual questions/answers topic.
 
Old 08-25-2014, 09:54 PM   #71
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
The question was and remains to basically get the if-else statement working, in a nutshell. What I've done so far, is gone from one state to another, still not fully working. A summary of the steps were shown in this thread. Right now the only part of the code that's not working in any way is that the label numbers are not generated correctly. So far in my current code, there is no other problems, so if we could fix that from this point, without breaking anything else, then the solution will have been provided.

This language compiles to my PNFASM language, so that is what the final output needs to be.
 
Old 08-25-2014, 10:05 PM   #72
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
At this point, I am learning from AST trees and all that, but as long as the if else will work and generate "right" code, it is irrelevant, and something to look at in the future, or to handle later if I have trouble in another spot, in which case if needed, I'd post a new thread to learn how to deal with it. My book I will go through, and I'd been waiting on this book, but if I get this to work without the knowledge of the book, all the better. Not that now that I'm aware of the general idea of generating AST trees, I won't look into it for later, but I don't care right now for this, if it uses it or not, as long as the compiler produces working code. If I'd ended up designing it and implementing the language, and never needed ASTs, further than what I got, that's fine, and in the future when I learn about it, I'll use them. But for now, any way that works, is okay with me. Even if I had to rewrite the program just to get that to work as well as what already works, that would be fine with me, but the point of this thread is to get it to work.
 
Old 08-25-2014, 10:16 PM   #73
des_a
Senior Member
 
Registered: Sep 2006
Posts: 1,441

Original Poster
Blog Entries: 43

Rep: Reputation: 36
See: http://forums.techguy.org/software-d...ml#post8958614, for an updated summary of the problem's state.
 
Old 08-26-2014, 03:11 AM   #74
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
You've got complete examples, did you actually examined any of them?
 
Old 08-26-2014, 10:43 AM   #75
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,688
Blog Entries: 4

Rep: Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947Reputation: 3947
In the compilers that I have done, and as I have already said, forward-jumps are handled by means of a "fixup" data structure. For example, if I'm compiling an if/then/else construction, then the AST that I receive will already contain the one-or-two children ("then" and optionally "else"), and I know that I need to now generate code for each of them. But the design of my compiler in this section is non-recursive, using a "to-do list" of sorts to drive the exploration. There is a push-down stack of the various nested structures that I am "in" at any time, so I push an entry for if/then/else. One of the properties of that stack-object is a fixup list that will correspond to the end of the construct. Anyone that needs to branch there pushes a fixup-entry. Later on, when the end of the construct is reached, the code pops the stack-entry (verifying that it is an if/then/else entry as expected), then "resolves" the fixup-list by stamping the now-known correct address (and, at the same time, confirming that the location being updated is of the expected opcode-type ... one can never be too sure).

"Line numbers in the source code" should have no semantic meaning whatsoever. The source-code is translated into an AST data-structure ... usually, a chunk at a time ... and this is what you work from. If you need to generate forward or backward branches, etc., then the whole process at that point is entirely abstracted away from the source-code. It is done using internal compiler data structures and dictionaries, such as the "fixup lists" and "pushdown stacks" aforementioned.

Puh-l-l-l-leauze stop and study the resources that have already been pointed out to you. You're walking down a pathway that is so familiar and so old that it has been paved, but you've got a blindfold on and seem to be determined to learn everything in the school of hard knocks.

Last edited by sundialsvcs; 08-26-2014 at 10:44 AM.
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
[SOLVED] Portable Numbers Format: PNFHA des_a Programming 14 07-18-2014 10:51 PM
[SOLVED] Prime Generation Code Not Working Properly ashok.g Programming 29 12-14-2009 03:59 AM
A GCC code generation problem? dogbird Programming 4 12-09-2005 11:52 AM
tool for code generation? bcalmac Linux - Software 0 08-22-2005 10:41 AM
UML and C++ code generation GŠutama Linux - Software 2 11-13-2003 06:56 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 03:11 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration