LinuxQuestions.org
Visit Jeremy's Blog.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 08-21-2023, 10:26 AM   #61
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4

Quote:
Originally Posted by NevemTeve View Post
Whenever `malloc` returns something, it is an array. Whenever `strcpy` or `memcpy` is called, it is is a copy between arrays. To tell the truth, C doesn't have string type, it uses character-arrays instead.
Wished (& could see also) if could show by some simple means (like, printf()) the way in which the malloc() returns memory space, and all these are connected in one array; for some given input, say 'ABCD'.
 
Old 08-21-2023, 11:12 AM   #62
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
https://onlinegdb.com/NClPkNgSH

Added debug-prints into stmt_list,
now here I idendeted the debug lines to show to recursion:

Code:
stmt_list: input='ABCD'
  stmt_list: input='BCD'
    stmt_list: input='CD'
      stmt_list: input='D'
      stmt_list: input='D', first='D' rest=''(array of 0 characters)
      now I'll merge them into another array of 1 characters
      the result is 'D'
    stmt_list: input='CD', first='C' rest='D'(array of 1 characters)
    now I'll merge them into another array of 2 characters
    the result is 'CD'
  stmt_list: input='BCD', first='B' rest='CD'(array of 2 characters)
  now I'll merge them into another array of 3 characters
  the result is 'BCD'
stmt_list: input='ABCD', first='A' rest='BCD'(array of 3 characters)
now I'll merge them into another array of 4 characters
the result is 'ABCD'

ABCD interpreted as left associative: (((A).B).C).D
ABCD interpreted as right associative: A.(B.(C.(D)))
Also remember the terminating zero: is marks the end of the characters. That's why we allocate one more byte every time.
 
1 members found this post helpful.
Old 08-21-2023, 03:49 PM   #63
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by NevemTeve View Post
Infinite loop is very easy to produce, you could have found out it by now:
Code:
/* subexp -> NUMBER | subexp - NUMBER */
int parse_subexp() {
    int left= parse_subexp();
    if (token=='-') {
          token= nextToken();
          int right= parse_number();
          return left-right;
    } else {
          return left;
    }
}
Wanted to build code, on the similar grounds as given by @astrogeek here.

But, could not understand the line #22 (and the initial condition of the for-loop at line #70), in the given code.

Have some skeleton till now, as here.

Last edited by ajiten; 08-21-2023 at 10:15 PM.
 
Old 08-21-2023, 10:08 PM   #64
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by astrogeek View Post
Indeed, it is universally explained in parsing literature. Have you read those and tried to understand what they are telling you? How would another example from myself be any different? See post #41, from which:



But this problem, like so many in the parsing realm, is inherent in the concept of top-down parsing, not just an implementation detail. As such, it is best understood conceptually rather than as a code example.

Top-down parsing begins with the input string and the start symbol, then performs predictions by replacing the leading non-terminal in each alternative until the left-most symbol in each prediction is a terminal. At that point it matches those terminals with the left-most end of the (remaining) input string, removes those that match into the parse tree and discards the rest of the predictions, then continues with the next leading non-terminals until it reaches the end of the input string.

The problem with left-recursion in this scenario is that you can never reach a set of predictions in which all left-most symbols are terminals. For example (adapted from Dick Grune):

Code:
Grammar:        S -> Sb | a
Input string:   ab

Deriving predictions...

        Parse Tree | Prediction
    _______________|________________
                   |  S
                  ...
                 S |  Sb
                 S |  a
                  ...
                SS |  Sbb
                SS |  ab
                 S |  a
                  ...
               SSS |  Sbbb
                SS |  ab
                 S |  a
                  ...
ad infinitum...
You can never get rid of the leading non-terminal in all predictions, so you can never move to the match phase.

The infinite progression is present in the concept of a top-down parser, not only recursive implementations where it manifests as infinite recursion.

Trying to see it as code alone will not lead to understanding.
Please tell how to interpret the line #22:
Code:
   if((parseptr-parse)>=(MAXPARSE - 1))
and the initialization condition, in the for loop, on line #70:
Code:
     for(char *indx = parse + strlen(parse); indx>=parse; indx--)
The code is on onlinegdb.com, here.

Last edited by ajiten; 08-21-2023 at 10:14 PM.
 
Old 08-21-2023, 10:50 PM   #65
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206
Quote:
Originally Posted by ajiten View Post
Please tell how to interpret the line #22:
Code:
   if((parseptr-parse)>=(MAXPARSE - 1))
That prevents overwriting the final NULL in the parse buffer, creating an unterminated string.

Quote:
Originally Posted by ajiten View Post
and the initialization condition, in the for loop, on line #70:
Code:
     for(char *indx = parse + strlen(parse); indx>=parse; indx--)
The code is on onlinegdb.com, here.
That initializes the pointer to the end of the parse string. Should be one less actually, but writing the NULL terminator has null effect so no harm done, and is safe for zero length string (indx was originally an int index into array, modified for posting here, oops).

Last edited by astrogeek; 08-21-2023 at 11:00 PM.
 
1 members found this post helpful.
Old 08-21-2023, 11:14 PM   #66
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Please see your post on 08-19-23, at 10:19
Quote:
Originally Posted by NevemTeve View Post
Naive top-down recursive parser cannot handle left recursion ( e.g. A -> A - B ), because if it tried, the recursion would never end.
A real-life program can work on modified rules ( e.g. A -> B - A ), create some data-structure (typically a tree) as a temporay result, then transform the data-structure to get the correct representation, e.g:

Input: 2-3-4
Temporary result: 2-(3-4)
Fixed result: (2-3)-4
Please tell how your code does the fixing of the result, in order to avoid the infinite-recursion problem, in left-recursive grammars.

Last edited by ajiten; 08-21-2023 at 11:17 PM.
 
Old 08-22-2023, 12:36 AM   #67
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by astrogeek View Post
Code:
     for(char *indx = parse + strlen(parse); indx>=parse; indx--)
That initializes the pointer to the end of the parse string. Should be one less actually, but writing the NULL terminator has null effect so no harm done, and is safe for zero length string (indx was originally an int index into array, modified for posting here, oops).
Okay, the variable: parse, would give the start location of the character array. Would need to add just: len(parse)-1.
Though am unclear, what you mean by : writing the NULL terminator has null effect...

Last edited by ajiten; 08-22-2023 at 02:02 AM.
 
Old 08-22-2023, 03:54 AM   #68
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
> Please tell how your code does the fixing of the result, in order to avoid the infinite-recursion problem, in left-recursive grammars.

The program in question is way too simple to show the problem, let alone the solution.
I might come up with an actual example, but it is not trivial.

As a prerequisite, you have to know binary expression trees. Sample:

https://github.com/lzsiga/hello-worl...naive_parser.c

Then move on to lexical parsing (same program).

Last edited by NevemTeve; 08-22-2023 at 04:53 AM.
 
1 members found this post helpful.
Old 08-22-2023, 08:37 AM   #69
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by NevemTeve View Post
> Please tell how your code does the fixing of the result, in order to avoid the infinite-recursion problem, in left-recursive grammars.

The program in question is way too simple to show the problem, let alone the solution.
I might come up with an actual example, but it is not trivial.

As a prerequisite, you have to know binary expression trees. Sample:

https://github.com/lzsiga/hello-worl...naive_parser.c

Then move on to lexical parsing (same program).
Request some details on your statement regarding the insufficient complexity of the program, in order to show the infinite loop problem, for left-recursive grammar.
This can be used by me to elaborate to class as well, the need for the given program.
Am in process of understanding code, so if you feel that would acquire the knowledge on understanding the program, then no issues; though see no hope for that.

Also, why the function prototype declaration on line #32, 33, 34 is repeated on line #44, 45, 46; is unclear.
Code:
static LexData *LexInit(const char *from);
static int LexGet(LexData *ld, LexToken *into); /* returns into->type */
static void LexTerm(LexData *ld);
 
Old 08-22-2023, 08:44 AM   #70
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
You're late, by now it has been improved, it has a naive parser now, here is an example showing the problem:
Code:
NaiveParsTest: input="96 / 2 / 3 / 4"
Result: (96/(2/(3/4)))
It should be (((96/2)/3)/4)

Now back to your perpetual request: I most certainly won't add infinite recursion into the program, do that yourself. You have been given examples, e.g. in post #45

Please learn the current state of the program by heart.

Last edited by NevemTeve; 08-22-2023 at 09:11 AM.
 
1 members found this post helpful.
Old 08-22-2023, 11:04 AM   #71
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206Reputation: 4206
Updated example in blog post renames loop pointer and corrects offset.

Quote:
Originally Posted by ajiten View Post
Okay, the variable: parse, would give the start location of the character array. Would need to add just: len(parse)-1.
Though am unclear, what you mean by : writing the NULL terminator has null effect...
It means what it says, how do you interpret it?
 
Old 08-23-2023, 04:14 AM   #72
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
I'll admit, my first idea to fix the associativity problem failed. The second one uses stacks, and I think it sort of works.
Code:
$ ./naive_parser "2+3-4+5-6" "2*3^3*4^4^5"

NaiveParsTest: input="2+3-4+5-6"
Result: ((((2+3)-4)+5)-6)
Result: 0

NaiveParsTest: input="2*3^3*4^4^5"
Result: ((2*(3^3))*(4^(4^5)))
Result: inf
It doesn't support signs yet (+3, -5), which might also be interesting, e.g wether -3^2 is (-3)^2 or -(3^2) (I support the latter, but have heard arguments about it.)

Last edited by NevemTeve; 08-23-2023 at 06:40 AM.
 
1 members found this post helpful.
Old 08-24-2023, 08:04 PM   #73
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by NevemTeve View Post
You're late, by now it has been improved, it has a naive parser now, here is an example showing the problem:
Code:
NaiveParsTest: input="96 / 2 / 3 / 4"
Result: (96/(2/(3/4)))
It should be (((96/2)/3)/4)

Now back to your perpetual request: I most certainly won't add infinite recursion into the program, do that yourself. You have been given examples, e.g. in post #45

Please learn the current state of the program by heart.
Sorry, due to internal examn. was too busy for few days, as couldn't adjust to different tt.
Please tell why the struct LexData, is empty, as how an empty datatype can be a parameter, as well as a return data type, as on the lines #40-42.
Have read here, but couldn't find something relevant. Any hint will work, in kicking in the right direction.

Last edited by ajiten; 08-24-2023 at 10:14 PM.
 
Old 08-25-2023, 12:06 AM   #74
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,880
Blog Entries: 1

Rep: Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871Reputation: 1871
This is called `opaque type`. It's not important in this little program, but in multi-file projects it is useful: implementation can be separated from the specification.

Line 38 declares 'struct LexData' then the definition is in lines 434-440

In a bigger project, lines 415-501 would be in a separated file called `lex.c`

Also here: https://c-faq.com/struct/sd1.html
 
Old 08-26-2023, 09:12 AM   #75
ajiten
Member
 
Registered: Jun 2023
Posts: 377

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by NevemTeve View Post
This is called `opaque type`. It's not important in this little program, but in multi-file projects it is useful: implementation can be separated from the specification.

Line 38 declares 'struct LexData' then the definition is in lines 434-440

In a bigger project, lines 415-501 would be in a separated file called `lex.c`

Also here: https://c-faq.com/struct/sd1.html
Please state why on line #172, the function has ... in the parameter list?
Code:
static void NP_PrintErrF(ParseData *p, char *fmt, ...) {
 
  


Reply

Tags
compiler



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Unable to understand why the given Lex program fails to recognize the given input string. ajiten Programming 11 07-11-2023 12:28 AM
LXer: Trees, B-Trees, B+Trees and H-Trees LXer Syndicated Linux News 0 07-15-2013 02:11 PM
How to Boot Linux/ Linux like OS from Pen derive kumars.nitin123 Linux - General 2 11-05-2009 11:00 PM
Unable to derive IRQ merchtemeagle Linux - Hardware 2 06-21-2007 08:14 PM
mount derive in lfs aneel Linux From Scratch 2 11-24-2005 12:23 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 01:27 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration