Correct way to derive parse trees, for a given input, to show ambiguity.
ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
Infinite loop is very easy to produce, you could have found out it by now:
Code:
/* subexp -> NUMBER | subexp - NUMBER */
int parse_subexp() {
int left= parse_subexp();
if (token=='-') {
token= nextToken();
int right= parse_number();
return left-right;
} else {
return left;
}
}
Thanks, am trying to fit it into a complete program.
But, if in your earlier code, if the same could be reproduced; then better.
The reason is your code, creates a 'linked list of structures', which is not easy for me to manipulate for the purpose of this code.
Infinite recursion has a disadvantage: it is 'infinite' because it never terminates, so it has only theoretical significance, it is never used in actual programs.
Infinite recursion has a disadvantage: it is 'infinite' because it never terminates, so it has only theoretical significance, it is never used in actual programs.
It would not be a useful program, but need show the infinite loop.
It is a program with theoretical interests only.
Edit: It seems from your response, that your earlier program cannot go into infinite loop, for even the left-recursive grammar. If yes, why?
Wish I could see this, but am unable to check.
Naive top-down recursive parser cannot handle left recursion ( e.g. A -> A - B ), because if it tried, the recursion would never end.
A real-life program can work on modified rules ( e.g. A -> B - A ), create some data-structure (typically a tree) as a temporay result, then transform the data-structure to get the correct representation, e.g:
Naive top-down recursive parser cannot handle left recursion ( e.g. A -> A - B ), because if it tried, the recursion would never end.
A real-life program can work on modified rules ( e.g. A -> B - A ), create some data-structure (typically a tree) as a temporary result, then transform the data-structure to get the correct representation, e.g:
In that trivial case the return value is simply an array. Mind you, the building of the array (lines 98-100) is rather ineffective: it emulates operation 'insert-first' with alloc+copy+free sequence.
In that trivial case the return value is simply an array. Mind you, the building of the array (lines 98-100) is rather ineffective: it emulates operation 'insert-first' with alloc+copy+free sequence.
The stated lines are:
Code:
retp= malloc(1+strlen(right)+1);
/* allocates memory unit - but not clear how the size for allocation is being determined, as in line #96, 'right = stmt_list(pd);',
the variable 'right' is assigned by a recursive call;
so for input of: 'Test1("ABCD");', have on line #37, call: 'char *list= stmt_list(&pd);'.
This, in-turn on line #93 has: 'left = stmt(pd);', which serves to just increment (on each call) the 'int i' field.
The function of 'stmt(&pd)' seems trivial, & ideally should have been within the 'stmt_list(&pd)' itself.
Unable to find the size of : 'retp', as in line #99, of the modified code at: https://onlinegdb.com/OF17ddCgv.
*/
retp[0]= left;
strcpy (retp+1, right);
But, the fix for left-recursive grammar is still not visible to me. Please show what you mean by: the transformation of the data-structure (here, which seems to me is a linked list of struct type (ParseData) elements; and not a forest) being in the trivial case being an array?
I.e., what is the trivial case? and where is the array? as see a linked-list of struct type elements named: ParseData?
Thanks, and request to show the infinite loop problem for left-recursive grammar.
This problem is stated universally (across all literature, of any sort) as a disadvantage of brute-force top-down parsing, with left-recursive grammars.
Indeed, it is universally explained in parsing literature. Have you read those and tried to understand what they are telling you? How would another example from myself be any different? See post #41, from which:
Quote:
A left-recursive implementation would look like stmt_list(){ stmt_list(); stmt(); }, and it should be obvious why it would loop endlessly.
But this problem, like so many in the parsing realm, is inherent in the concept of top-down parsing, not just an implementation detail. As such, it is best understood conceptually rather than as a code example.
Top-down parsing begins with the input string and the start symbol, then performs predictions by replacing the leading non-terminal in each alternative until the left-most symbol in each prediction is a terminal. At that point it matches those terminals with the left-most end of the (remaining) input string, removes those that match into the parse tree and discards the rest of the predictions, then continues with the next leading non-terminals until it reaches the end of the input string.
The problem with left-recursion in this scenario is that you can never reach a set of predictions in which all left-most symbols are terminals. For example (adapted from Dick Grune):
Code:
Grammar: S -> Sb | a
Input string: ab
Deriving predictions...
Parse Tree | Prediction
_______________|________________
| S
...
S | Sb
S | a
...
SS | Sbb
SS | ab
S | a
...
SSS | Sbbb
SS | ab
S | a
...
ad infinitum...
You can never get rid of the leading non-terminal in all predictions, so you can never move to the match phase.
The infinite progression is present in the concept of a top-down parser, not only recursive implementations where it manifests as infinite recursion.
Trying to see it as code alone will not lead to understanding.
Last edited by astrogeek; 08-21-2023 at 11:01 AM.
Reason: left-most -> leading, more precise
Seems am not understood in my last response.
Changed line #99 in the code at: https://onlinegdb.com/Q6XDXBJGU, to have: %p, as printf's format specifier.
The output is now somewhat comprehensible, but still not know why state that you have created an array in: stmt_list(ParseData *pd, int callno);
rather than a linked list.
Even not sure, how to see the linked list of: struct ParseData, nodes; for each of the inputs, say: 'ABCD'.
Also, would like to add (at the risk of being ignored) the earlier asked queries:
Code:
=====
But, the fix for left-recursive grammar is still not visible to me. Please show what you mean by: the transformation of the data-structure (here, which seems to me is a linked list of struct type (ParseData) elements; and not a forest) being in the trivial case being an array?
I.e., what is the trivial case? and where is the array? as see a linked-list of struct type elements named: ParseData?
=======
Also, what do you mean by : 'ineffective'?
Please don't ignore my instruction: read about global variables and you will find out what ParseData is. You other questions won't be answered before that.
Please don't ignore my instruction: read about global variables and you will find out what ParseData is. You other questions won't be answered before that.
The webpage is about the way to program, but is not applicable about the global variable: ParseData, as its contents are not modifiable. At max., it tells that it should be declared a const; which your code has done by declaring: const char * arr;
The toughest use is made, in your code, of global-data-coupling, i.e. of variable 'i', in struct ParseData.
Even now need to ascertain (by putting in suitable printf() commands, outside the two coupled functions) that only the two functions: stmt_list(), stmt(); are having access to the global variaable: 'i', i.e.: pd->i. In fact, the page states the same in a very apt line:
Code:
However, this type of coupling is easy to miss, because it is not reflected in the software design, and it is not made visible except through source code analysis and searching for names.
I made a modification, to the code; but still not able to see how the code's stmt_list() builds an array, and not a linked-list.
Am unable to see how the dynamic memory allocation is done, i.e. the logic for getting memory allocated as: 1+strlen(right)+1; in line #98
Code:
retp= malloc(1+strlen(right)+1);
Could not find even the strlen(right), as then get weird output; as if inserted below before line #98 (modified code):
You are right, ParseData is meant to get rid of global variables -- but it is most certainly not 'const', member 'i' is increment whenever a token accepted.
Right now, there is no linked list in the code, functions malloc, memcpy and free are called at every character (lines 98-100), this is a waste of CPU.
You are right, ParseData is meant to get rid of global variables -- but it is most certainly not 'const', member 'i' is increment whenever a token accepted.
Right now, there is no linked list in the code, functions malloc, memcpy and free are called at every character (lines 98-100), this is a waste of CPU.
But, is there any way to see the fact that stmt_list() is creating an array based on the struct ParseData.
I cannot see a single array, for each given input; say, for input: 'ABCD'.
Any method, that can show by printf() would be better, as can show my students too.
Also, the reason for the last part of my previous response, is requested, i.e. the reason for size for malloc being given as: 1+ strlen(right)+1;
and why am unable to see the working of the modified code.
Whenever `malloc` returns something, it is an array. Whenever `strcpy` or `memcpy` is called, it is is a copy between arrays. To tell the truth, C doesn't have string type, it uses character-arrays instead.
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.