Strange floating point variable behaviour, in Python? Or am I ignorant about the situation here?

dedec0 · 02-08-2024, 07:57 AM

I am learning Python. I have a fairly good knowledge in C language, and a few other languages, since several years.

In the book I am reading with the basics of Python, there are exercises to train each new aspect of the language. The exercises are not meant to be hard, since the book is also for people who could be in the start of any kind of software programming.

So, the trouble for this thread is an exercise that asks us a program to ask an amount of money to pay, and gives the answer of the smaller amount of papers ("cédulas", in my language) and/or coins which is necessary for that payment.

For example, if the "things" (papers or coins) we could have to pay here are only 100; 50; 20; 1; 0,50; 0,01", and the user wants to pay $101, the minimal amount of "things" are one paper of 100, and one paper of 1. Easy and clear?

But my program is not working when I changed it to accept the cents of a money value. We start with an exercise just for integer values, and where we solve it with simple integer variables, to values that can include cents, and thus we use the float type.

For $105,7, the output is:

Code:

Value ($) to pay: 105.7
1 cédula(s) de R$100
0 cédula(s) de R$50
0 cédula(s) de R$20
0 cédula(s) de R$10
1 cédula(s) de R$5
0 cédula(s) de R$1
1 moeda(s) de R$0.50
0 moeda(s) de R$0.25
2 moeda(s) de R$0.10
0 moeda(s) de R$0.05
1 moeda(s) de R$0.01

Notice the "extra" 1 cent coin, which is wrong, for what the user typed, and what the algorithm I typed is meant to do (I mean that when the payment has something smaller than 1 cent, which is the smaller coin, but bigger than zero, we add a 1 cent coin to the given answer; thus, we always round up the value to 1 cent, when necessary). But the decimal value here is simply 70 cents! Should be given as one coin of $0,50 and 2 coins of $0,10... without anything else. I think this would work as expected, with the same algorithm written in C, with the same variables (I think even a float instead of double would work there). I tested this example by adding more precision to the input: $105.700000 (five zeroes!). Same result!!

For 105,5 is works:

Code:

Value ($) to pay: 105.5
1 cédula(s) de R$100
0 cédula(s) de R$50
0 cédula(s) de R$20
0 cédula(s) de R$10
1 cédula(s) de R$5
0 cédula(s) de R$1
1 moeda(s) de R$0.50

For 105,1 is has the strangest behaviour:

Code:

Value ($) to pay: 105.1
1 cédula(s) de R$100
0 cédula(s) de R$50
0 cédula(s) de R$20
0 cédula(s) de R$10
1 cédula(s) de R$5
0 cédula(s) de R$1
0 moeda(s) de R$0.50
0 moeda(s) de R$0.25
0 moeda(s) de R$0.10
1 moeda(s) de R$0.05
5 moeda(s) de R$0.01

I execute it with Python 3. My current code:

Code:

valor = float( input( "Value ($) to pay: " ) )
numCédulas = 0
cédulaAtual = 100               # Começa com a cédula de maior valor
# À medida que vai contando as notas, o valor diminui, e este restante 
# sobre para as notas de menor valor, até chegar em R$1
restanteAPagar = valor

while True :
    # Podemos usar mais uma cédula da maior possível atualmente?
    if restanteAPagar >= cédulaAtual :
        restanteAPagar -= cédulaAtual
        numCédulas += 1

    # Se a cédula atual não funcionar, tentamos a próxima maior, menor que ela,
    # até acabar.
    # Mostra quantas cédulas do valor atual que serão usadas, antes de passar
    # para a próxima.
    else :
        if cédulaAtual >= 1 :
            print( f"{numCédulas} cédula(s) de R${cédulaAtual}" )
        elif cédulaAtual > 0.01 :
            print( f"{numCédulas} moeda(s) de R${cédulaAtual:.2f}" )

        if restanteAPagar == 0 :
            break

        # Quando o valor ficar menor que 1 centavo, mas ainda não zerado, 
        # consideramos ele como o último centavo a ser contado
        if cédulaAtual == 0.01 and restanteAPagar < 0.01 and restanteAPagar >0 :
           restanteAPagar = 0
           numCédulas += 1
           print( f"{numCédulas} moeda(s) de R${cédulaAtual:.2f}" )
           break

        if cédulaAtual == 100 :
            cédulaAtual = 50
        elif cédulaAtual == 50 :
            cédulaAtual = 20
        elif cédulaAtual == 20 :
            cédulaAtual = 10
        elif cédulaAtual == 10 :
            cédulaAtual = 5
        elif cédulaAtual == 5 :
            cédulaAtual = 1
        # A partir daqui, "cédula" serve para moeda também
        elif cédulaAtual == 1 :
            cédulaAtual = 0.50
        elif cédulaAtual == 0.50 :
            cédulaAtual = 0.25
        elif cédulaAtual == 0.25 :
            cédulaAtual = 0.10
        elif cédulaAtual == 0.10 :
            cédulaAtual = 0.05
        elif cédulaAtual == 0.05 :
            cédulaAtual = 0.01

        # Mudou o valor da cédula, recomeça a contagem
        numCédulas = 0

I wrote this post quickly. I must run to lunch now. I will review (and possibly correct) any basic errors that I wrote here - sorry if there is a big one; just wait my fix, if you prefer!

ntubski · 02-08-2024, 08:11 AM

https://floating-point-gui.de/basic/

Quote:

Why don't my numbers, like 0.1 + 0.2 add up to a nice round 0.3, and instead I get a weird result like 0.30000000000000004?

Because internally, computers use a format (binary floating-point) that cannot accurately represent a number like 0.1, 0.2 or 0.3 at all.

When the code is compiled or interpreted, your "0.1" is already rounded to the nearest number in that format, which results in a small rounding error even before the calculation happens.

boughtonp · 02-08-2024, 08:47 AM

The solution is to use Python's type of decimal, not float.

Quote:

Originally Posted by https://docs.python.org/3/library/decimal.html

The decimal module provides support for fast correctly rounded decimal floating point arithmetic. It offers several advantages over the float datatype:

Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
The decimal module incorporates a notion of significant places so that 1.30 + 1.20 is 2.50. The trailing zero is kept to indicate significance. This is the customary presentation for monetary applications. For multiplication, the “schoolbook” approach uses all the figures in the multiplicands. For instance, 1.3 * 1.2 gives 1.56 while 1.30 * 1.20 gives 1.5600.
Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem
...

sundialsvcs · 02-08-2024, 08:56 AM

To extend this notion:

CPUs also provide a BCD = Binary-Coded Decimal mode which represents individual digits as four-bit groups. (The best-known language with full and built-in support for this is: good ol' COBOL.) When arithmetic is performed in this "pure decimal" mode, these anomalies do not occur. You are no longer "rounding" or "encoding" anything: you are actually manipulating decimal digits.

The Python language, of course, includes a Decimal package which performs arithmetic in this way.

dedec0 · 02-08-2024, 01:15 PM

Editing finished.

Although I change the program input line to English, so everyone reading here can understand easily, I kept the output as I wrote for me and the book, in my language (Portuguese). There is no big secret: "cédula" means a paper of money, like one for 100 dollars; and "moeda" literally means coin. In my currency, the smaller paper is 5, and the biggest coin is 1.

I kept my code comments as I wrote them. If needed to understand my ideas in each part, translating them should suffice, I assume. But if you need, ask me, and I rewrite the ideas in English too.

Any ideas? I am kind of lost in what I should look for, in the "python world".

dedec0 · 02-08-2024, 01:52 PM

@ntubski, although I knew the theory of how the processors work and do calculations (since decades ago), I never had a practical problem like I described. The error did not appear, or I rounded it in the result, before showing or using it. And I played with a fairly good amount of problems from Project Euler website, which involve tricky ideas with a lot of math and computing, generally. I will check there again. I stopped waiting for the next available problem being released there, counting the days to it... 🙈😝 Maybe I have to recover my password...

@boughtonp, what an incredible thing to have in Python! From the things I skimmed in the documentation you point, would it be better to use this library instead of the default simple operations in Python? The code and all the output there has "Decimal( ... )" repeatedly written. It works like we (humans) prefer, but one basic subjective good aspect of the Python language syntax is lost:

Decimal( 70 ) / Decimal( 100 )

instead of simply:

70 / 100

A toggle option in command line or environment variable could make the latter work as the other? Precision can be an integer value, and if zero and negative numbers are valid and can be wanted in some context, a second environment variable for toggling "classic float" or "decimal fixed point and floating point arithmetic". What you think?

@sundialsvcs: I did not know this "detail" about Cobol! I heard its name several times, but just from a few people and teachers who studied, worked with or had computers as old as 386, or even a few ones before it. And these people talked about assembly, Fortran, C (which changed a lot in decades before and after I was born), Cobol, Pascal, Basic, QBasic, ... and maybe some other name I forgot to mention here. But my age is still less than 100 years, okay!! 😝😝😝 And I think the university I study today, still have many and important things written in Fortran... the studies of physics, mathematics, statistics and possibly other exact sciences (chemistry?) are having their algorithms and programs being rewritten in modern languages we have today, like R or possibly Python (a guess basic in a few rumours I hear around).

sundialsvcs · 02-08-2024, 03:46 PM

The COBOL language is still(!) in use and still quite interesting. It implemented both floating-point ("computational") and decimal types from the very start, and its default operating method is decimal. It allows you to have very precise control as to exactly how many digits are represented in any value that it manipulates.

"Dollars and Cents" was a key form of data that it was called upon to manipulate, and accountants do not like a column of figures to be "off by one." If you use this data type, the computer does the math exactly like you do. (But not like your pocket calculator does ... which is floating-point.)

rclark · 02-08-2024, 04:24 PM

Quote:

... many and important things written in Fortran

Absolutely and that isn't going to change (much). A few years ago I was helping plasma (like affects of re-entry of vehicles on materials into the atmosphere) researchers using Fortran applications. Those programs are not going to be rewritten any time soon.... Millennial kids think the newest languages on the block are what is needed to solve problems and to get more robust code ... but, for the most part, it's just more of the same -- as the wheel turns so to speak. Professional programmers still need to write correct code today, as programmers back in the '40s-50s' had to. You still can't be 'lazy/sloppy' programmers as much as some would wish it to be

. Ie. Let the compiler find your logic/memory/resource errors or OOP concepts are the salvation to robust code...

Python is now the glue with-in a lot of companies. Even Engineers can hack at it without having a CS degree. It is so easy to use to get useful results. In my company that is the way it is anyway. I use it all the time now and only drop to C/C++ when necessary. Most all the SBC/microcontroller users for robotics and other projects use Python (microPython, CircuitPython, etc.) .

ntubski · 02-08-2024, 06:22 PM

Quote:

Originally Posted by dedec0

one basic subjective good aspect of the Python language syntax is lost:

Decimal( 70 ) / Decimal( 100 )

instead of simply:

70 / 100

Strictly speaking, Decimal( 70 ) / 100 is also good enough (that is, only one of the operands needs to be a Decimal). You generally shouldn't have hardcoded values in the middle of your code anyway.

For a small program like you're writing here, I would probably go with integer number of cents though.

sundialsvcs · 02-09-2024, 07:34 PM

So far as I can recall now, only the COBOL language provided built-in support for the BCD data-type. Every other language used "float." Otherwise, they had to resort to external packages. FORTRAN does not (AFAIK) support decimal types natively.

(Disclosure: I have actually written code in all of them ... quite extensively. But "it has been a while," and languages change.)

Now – with regard to the Python examples shown above – when you do use the Decimal package, it is absolutely critical that you do so correctly. The documentation goes to great lengths to explain this. Read it very carefully. The "natively floating-point" core interpreter will easily lead you astray.

pan64 · 02-10-2024, 03:13 AM

bcd is slow in the CPU. I mean calculating anything with bcd is much slower than using binary or float. It was provided to simplify decimal calculations when there was no enough memory to convert binary to decimal (or was extremely slow). bcd for 8 bit was acceptable, but bcd on 64 bit is not really efficient.

additionally you may find it interesting (or useful):
https://www.askpython.com/python/exa...teger-handling

sundialsvcs · 02-10-2024, 09:13 AM

@pan64: Actually, I think, "slow" was never the consideration. If you are dealing with "vast columns of dollars-and-cents," you might need a decimal representation of the quantity. Your primary consideration is not "speed," but "to-the-penny precision" over "however many tens-of-millions of" mathematical operations.

This is also why database systems implement a currency data-type. Which "might not be decimal, but also will not be floating-point." For example, Microsoft Access implemented it as a very-long integer, "multiplied by 10,000." This gave "four digits to the right of the decimal point." It's not a "BCD" representation, but it solves the same problem. "No matter what database system you are working with, if it provides a 'currency' type (and no matter how they did it ...), use it."

In the OP situation, "accumulated imprecision" lead to an "off-by-one error." And, in the real world, that "one" is hugely important.

EdGr · 02-10-2024, 11:25 AM

dedec0 - A workaround for the inexactness of binary floating-point is to multiply by 100, round to the nearest integer, and then compare.
Ed

pan64 · 02-11-2024, 05:55 AM

Quote:

Originally Posted by sundialsvcs

@pan64: Actually, I think, "slow" was never the consideration. If you are dealing with "vast columns of dollars-and-cents," you might need a decimal representation of the quantity. Your primary consideration is not "speed," but "to-the-penny precision" over "however many tens-of-millions of" mathematical operations.

I meant low level support, the CPU itself cannot handle BCD very well. otherwise yes, it is definitely important and useful for us. The solution is to use a library/software which can do the conversion without problems.