The character you see in gedit is a hidden control character: the form feed, hexadecimal 0C in the ASCII table (listed as NP, "new page", or FF, depending on the chart). It is not there by chance: it is inserted in plain-text books to mark page breaks, so that pages can be told apart programmatically.
A simple GNU awk program can accomplish your task, but if you replace the form feeds with numbers, how will you distinguish them from other numbers in the text? In other words, the ability to separate pages is lost unless you choose a unique sequence of characters, or another hidden control character that does not appear in the text. I ask just out of curiosity.
Anyway, here is a rough solution in gawk:
Code:
gawk 'BEGIN{ RS = "\xC" } { printf "%s%d", $0, ++c }' book > page_numbered_book
Basically, it treats the ASCII character with hexadecimal code C as the record separator (RS) and prints every record untouched, that is, the content of one page (newlines included), followed by a counter that is incremented for each record. Because the form feed is the separator, it is not part of any record, so it is dropped from the output.
At this point (if you are experienced in awk, of course) you can easily add conditional expressions to adjust the numbering as you like and customize the format of the number, even adding the \xC character back. Feel free to ask if in doubt.
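As a rough sketch of that kind of customization (not the only way to do it), the variant below prints each page number on its own line as "-- N --" and puts the form feed back between pages, so the pages remain separable afterwards. The file names sample_book and sample_numbered and the "-- N --" format are made up for the illustration, and "\f" is just another spelling of the same \xC byte:

```shell
# Use gawk if present, otherwise fall back to plain awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# Hypothetical three-page sample: pages are separated by a form feed (\f, hex 0C).
printf 'page one\n\fpage two\n\fpage three\n' > sample_book

"$AWK" 'BEGIN { RS = "\f" }                  # one record per page
        NR > 1 { printf "\f" }               # restore the separator before every page but the first
        { printf "%s-- %d --\n", $0, NR }    # page text, then its number on a separate line
       ' sample_book > sample_numbered
```

Running cat -v sample_numbered afterwards should show a ^L in front of the second and third page, which confirms the separator survived the numbering.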
As an aside: most likely you cannot see the form feed character on the command line, for example with the cat command (unless you use the -v option). It typically shows up as a kind of line break, because terminal emulators interpret it as a move to the next line. For example, suppose I create a single-line file with the following content:
Code:
page one<np>page two<np>page three<np>fine
Using cat I will see it like this:
Code:
$ cat book
page one
page two
page three
fine
With the -v option, instead, the hidden character is revealed as ^L:
Code:
$ cat -v book
page one^Lpage two^Lpage three^Lfine
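If you want to reproduce this yourself, here is a small sketch. The file name demo_book is only an example (chosen so that your real book is not overwritten), and od -c is shown as a second way to inspect the bytes:

```shell
# Create a single-line file in which \f (form feed, hex 0C) separates the pages.
printf 'page one\fpage two\fpage three\ffine\n' > demo_book

# cat -v prints control characters in caret notation, so each form feed becomes ^L.
cat -v demo_book
# prints: page one^Lpage two^Lpage three^Lfine

# od -c dumps the file character by character; the form feed is displayed as \f.
od -c demo_book
```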
Hope this helps.