LinuxQuestions.org
Download your favorite Linux distribution at LQ ISO.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 03-26-2010, 09:59 AM   #121
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454

Quote:
Originally Posted by MTK358 View Post
Could you please just tell me the answer? I don't even understand what I should say.
How do you implement in code "followed by" ? Try, for starters, to answer this question.

In English you describe language constructs as "something followed by something else" - among other things. So, how do you check in plain procedural code that something is followed by something else ?
 
Click here to see the post LQ members have rated as the most helpful post in this thread.
Old 03-26-2010, 10:10 AM   #122
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
Is this what you want:

"xcv = ();" is wrong, because a variable assignment is a variable, followed by '=', followed by '(', an expression, and a ')'.
 
Old 03-26-2010, 10:34 AM   #123
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by MTK358 View Post
Is this what you want:

"xcv = ();" is wrong, because a variable assignment is a variable, followed by '=', followed by '(', an expression, and a ')'.
Not quite, though you are closer - I gave just two legal constructs:
  1. identifier_to_identifier assignment
  2. number_to_identifier_assignment
.

I have not mentioned an expression. In the parser studying toy world there is no notion of expression yet.

And I'm still waiting for an implementation of "followed by" check.
 
Old 03-26-2010, 11:01 AM   #124
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
Then parentheses are not a valid character.
 
Old 03-26-2010, 11:25 AM   #125
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by MTK358 View Post
Then parentheses are not a valid character.
Correct.

So, what about the valid cases ? I.e. how would your parser deal with the input example:

Code:
abc = def;
fgh = 123;
xcv = ();
?

I.e. what is the implementation of "followed by" check ?

In the light of presence of just two legal constructs:
  1. identifier_to_identifier assignment
  2. number_to_identifier_assignment

how should the error regarding 'xcv = ();' be reported ? To be more exact, at what place of input stream should the error be reported ?

Don't start with error reporting until you implement the legal constructs:
  1. identifier_to_identifier assignment
  2. number_to_identifier_assignment

- the error reporting directly depends on the way you implement the two constructs.
 
Old 03-26-2010, 11:28 AM   #126
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
variable, followed by '=', followed by variable, and a ';'.

variable, followed by '=', followed by number, and a ';'.

variable, followed by '=', followed by illegal character (ERROR).
 
Old 03-26-2010, 11:36 AM   #127
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by MTK358 View Post
variable, followed by '=', followed by variable, and a ';'.

variable, followed by '=', followed by number, and a ';'.

variable, followed by '=', followed by illegal character (ERROR).
There is a number of issues with you answer:

a language like Perl or "C" does not have "followed by" statement/operator, so I'm still waiting for some procedural code;
your sequence of checks will fail for the following (I swapped the first two lines):

Code:
fgh = 123;
abc = def;
xcv = ();
, i.e. if you wrote the parser in such a manner, the very first check of yours:

Quote:
variable, followed by '=', followed by variable, and a ';'
would fail - which is a wrong failure.

So, again, first write procedurally something minimal, like just one check:

Quote:
variable, followed by '=', followed by variable, and a ';'
, and then the other check:

Quote:
variable, followed by '=', followed by number, and a ';'.
, and then we'll discuss how to make the toy parser insensitive to the correct constructs order.
 
Old 03-26-2010, 01:45 PM   #128
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
Quote:
Originally Posted by smeezekitty View Post
GCC is written with L&Y.
Seems to me that the fact that the de-facto standard C compiler uses it, that most of its basic concepts are quite simple, and that it's a popular, standard Unix tool, is enough reason to use it.
 
Old 03-26-2010, 05:48 PM   #129
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by MTK358 View Post
Seems to me that the fact that the de-facto standard C compiler uses it, that most of its basic concepts are quite simple, and that it's a popular, standard Unix tool, is enough reason to use it.
The fact that tool A is written with tool B does not help your understanding of what is under the hood - remember your difficulties with 'Text::Balanced', 'substr', regular expressions.

As I've already said, the disconnect between English and code in this thread and the above named difficulties have the same root cause.
 
Old 03-26-2010, 06:09 PM   #130
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908
Sergei:
I'm not sure MTK358 really wants to learn how things work under the hood. He just wants to get a drivers license.
If he wanted to write a parser from scratch, then understanding the theory would be good. He just wants to use an existing tool. Using the rules of engagement for that tool. Even if it isn't the best tool out there. And even if you or he didn't write it. Thats why the title of the thread is 'Lex/YACC Question'.
Frankly, your assertion that tokenizing the input data is unhelpful doesn't stand up well to my (novice) analysis. Since the essence of a grammar is a way of generalizing the description of how parts of a language (analogous to parts of speech) are organized as a collective, it seems to make perfect sense to categorize the input into its more general component parts (tokens). This is exactly what the lexical analyzer does. A computer language grammar uses names to describe its component parts in consistent ways: variables, constants, operators, functions, statements and expressions are part of the common lexicon. Using these component names to describe the elements that are parsed from the data seems like a good way to create an unambiguous description of the grammar that the parser is to read. What is missing from this logic?

--- rod.

Last edited by theNbomr; 03-26-2010 at 06:13 PM.
 
1 members found this post helpful.
Old 03-26-2010, 06:39 PM   #131
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by theNbomr View Post
Sergei:
I'm not sure MTK358 really wants to learn how things work under the hood. He just wants to get a drivers license.
If he wanted to write a parser from scratch, then understanding the theory would be good. He just wants to use an existing tool. Using the rules of engagement for that tool. Even if it isn't the best tool out there. And even if you or he didn't write it. Thats why the title of the thread is 'Lex/YACC Question'.
Frankly, your assertion that tokenizing the input data is unhelpful doesn't stand up well to my (novice) analysis. Since the essence of a grammar is a way of generalizing the description of how parts of a language (analogous to parts of speech) are organized as a collective, it seems to make perfect sense to categorize the input into its more general component parts (tokens). This is exactly what the lexical analyzer does. A computer language grammar uses names to describe its component parts in consistent ways: variables, constants, operators, functions, statements and expressions are part of the common lexicon. Using these component names to describe the elements that are parsed from the data seems like a good way to create an unambiguous description of the grammar that the parser is to read. What is missing from this logic?

--- rod.
And I see the world differently - in terms of pattern recognition. Parallel thread: http://www.linuxquestions.org/questi...f-text-797670/ with its example:

Code:
always begin
if(V(gnd) < 0.01) #1 gnd_d = 1'b0; else #1 gnd_d = 1'b1;
if(V(vdd) > 0.50) #1 vdd_d = 1'b1; else #1 vdd_d = 1'b0;
end
is an example of a pattern to be recognized and then transformed.

The pattern in "scientific terms" is Verilog 'always' statement, but not just any Verilog 'always' statement (typically in designs one sees

Code:
always @(posedge clk)
  <do_something>
, '@(posedge clk)' being event specification), but specifically event-less 'always' statement (i.e. 'always' keyword), followed by sequential block ('begin' keyword followed by some operators, followed by 'end' keyword), where 'some operators' is a sequences of specifically 'if' statements (opposed to much wider set of generally allowed statements), and the peculiarity of the 'if' statements is call to 'V' function with 'gnd' argument as aprt of difference.

So, to solve the OP's in that thread problem one has to write code which recognizes such patterns, extracts from the patterns vital info (for the first 'if' it's

V(gnd) < 0.01
gnd_d = 1'b0
gnd_d = 1'b1

)

and generates another pattern.

As I look back, the vast majority of work I did with design code was pattern recognition. For example, I once wrote a piece if code which recognized sequences of two back to back connected inverters in a netlist, eliminated them and connected together the two unconnected nodes which became unconnected because of the inverters elimination.

I'm trying to say that pattern (language constructs) recognition is a much more generic approach and IMO it is much closer to natural human thinking.

As I mentioned many times, I learned about this from Prolog - a language meant to imitate human thinking/reasoning.
 
Old 03-26-2010, 07:00 PM   #132
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
Quote:
Originally Posted by Sergei Steshenko View Post
I'm trying to say that pattern (language constructs) recognition is a much more generic approach and IMO it is much closer to natural human thinking.

As I mentioned many times, I learned about this from Prolog - a language meant to imitate human thinking/reasoning.
IMO the way computers and humans think are basically polar opposites.

Can you mentally solve millions of complicated algorithms in a split second, with not even one error?

Can your computer be shown a picture and be able to tell exactly what every object in the picture is?

So why explain something that is intended for a computer in human terms?
 
1 members found this post helpful.
Old 03-27-2010, 01:07 AM   #133
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454Reputation: 454Reputation: 454Reputation: 454Reputation: 454
Quote:
Originally Posted by MTK358 View Post
IMO the way computers and humans think are basically polar opposites.

Can you mentally solve millions of complicated algorithms in a split second, with not even one error?

Can your computer be shown a picture and be able to tell exactly what every object in the picture is?

So why explain something that is intended for a computer in human terms?
Prolog was meant as an AI tool. Computer programs have made tremendous pregress in image/pattern recognition.

The challenge is to develop tools/approaches which model human thinking well.

The "thing" I'm trying to get from you for your own sake (implementing "followed by" in a procedural language) is very basic and quite unambiguous.

Another interesting issue is robust "half-hearted" parsers which work very similar to humans, i.e. they know how to understand "what it's all about" omitting unnecessary details. I myself once wrote such a parser "for money", i.e. I needed it to do my job. The idea was original at the moment i.e. I came to it myself; my friends/colleagues really liked it.

But before all that "followed by" in a procedural language needs to be done. "followed by" is fundamental, and from my observation of your lack of understanding of quite relevant reasonable Perl documentation (the main items of the documentation are universal - not Perl-specific) your inability to comprehend/implement "followed by" and relate it to other fundamental things in computers/programming is the main reason you have difficulty understanding how parsing works.

Programmers who implemented various parser generators understood "followed by" perfectly well.

Last edited by Sergei Steshenko; 03-27-2010 at 01:11 AM.
 
Old 03-28-2010, 11:11 AM   #134
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723Reputation: 723
I have decided to make this simpler and make my compiler to convert code to C instead of compile or interpret (at least for now).

I have almost finished my AST-builder, and I am using a union for tokens' semantic values, like this:

Code:
%union {
	char* string;
}

%token <string> INTEGER
%token <string> IDENTIFIER
(For now I am using only a string type).

But what is this error here:

Code:
$ ./compile
Running YACC:
lang.y:24.31-32: $1 of `stmt' has no declared type
lang.y:24.45-46: $1 of `stmt' has no declared type
lang.y:28.19-20: $$ of `expr' has no declared type
lang.y:29.24-25: $$ of `expr' has no declared type
lang.y:30.36-37: $$ of `expr' has no declared type
lang.y:30.76-77: $3 of `expr' has no declared type
lang.y:31.27-28: $$ of `expr' has no declared type
lang.y:31.49-50: $1 of `expr' has no declared type
lang.y:31.53-54: $3 of `expr' has no declared type
lang.y:32.27-28: $$ of `expr' has no declared type
lang.y:32.49-50: $1 of `expr' has no declared type
lang.y:32.53-54: $3 of `expr' has no declared type
lang.y:33.27-28: $$ of `expr' has no declared type
lang.y:33.49-50: $1 of `expr' has no declared type
lang.y:33.53-54: $3 of `expr' has no declared type
lang.y:34.27-28: $$ of `expr' has no declared type
lang.y:34.49-50: $1 of `expr' has no declared type
lang.y:34.53-54: $3 of `expr' has no declared type
lang.y:35.26-27: $$ of `expr' has no declared type
lang.y:35.31-32: $2 of `expr' has no declared type
YACC returned with exit code 1
./compile is a little script that I wrote that runs Bison, Flex, and GCC and checks if each succeeds before starting the next one.
 
Old 03-28-2010, 12:21 PM   #135
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908Reputation: 908
Since all of your types are string, add to your yacc file:
Code:
%type <string> expr
%type <string> stmnt
These go in the same part of the file where your %tokens are defined.
--- rod.

Last edited by theNbomr; 03-28-2010 at 12:22 PM.
 
1 members found this post helpful.
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
lex and yacc in RHEL4 birds Linux - Software 1 06-03-2009 02:56 AM
LEX and YACC with C gr33ndata Programming 4 11-18-2007 05:12 PM
Lex and Yacc on Federo 2.0 vivekian Fedora 6 05-20-2006 09:09 AM
Lex and Yacc on Mandrake 9.2.2 Anuradha Linux - Software 0 07-02-2005 03:32 AM
Lex & YACC coolfrog Programming 3 09-25-2004 07:00 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 04:21 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration