Testers for inxi/pinxi redone -C CPU logic... huge internal changes
bw42, a PowerPC, you are making my day!! That's exactly where I would expect bugs to pop up, since I have no way to test that, very glad you have one of those. I will see if I can resolve that issue.
The Kaby Lake/Comet Lake is a real pain, that's actually why inxi says 'note: check': it knows that it can't actually tell them apart in some steppings. It basically comes down to Intel using product names as marketing devices, not engineering terms, since obviously if it was an actual different CPU it would have a different stepping, at least that's how I understand it. This issue also happened with Zen > Zen+, by the way, and has happened a few other times. But I'll double check that to make sure it can't be improved.
cpu-world.com is doing a really good job, they are my go-to now for all confirmations of data, what they are doing is a real service to the entire tech community. Unlike other sources that often contradicted each other, they really seem to be getting the data right, though nobody is perfect since it's quite empirical at times.
The Aarch server is another perfect example of a total fail. Anytime you see ERR-103 in the new CPU logic, it is actually a code that tells me that some logic that should have been able to execute failed to execute at all.
I'll work up a debugger for that one, I had it lying around but can't find it. Just a one liner.
I uploaded debug logs for the Aarch64 system, and for my RISC-V HiFive Unmatched in case you add support for RISC-V at some point.
The RISC-V was almost a complete failure at reading anything, which I expected.
Code:
Machine:
Smbios: No SMBIOS data for dmidecode to process
CPU:
Info: ERR-103
model: N/A
bits: 64
type: UP
arch: N/A
family: N/A
model-id: N/A
stepping: N/A
microcode: N/A
cache:
L1: 256 KiB
desc: d-4x32 KiB; i-4x32 KiB
L2: 2 MiB
desc: 1x2 MiB
flags: N/A
bogomips: 0
Speed: N/A
min/max: N/A
core: No per core speed data found.
Vulnerabilities: No CPU vulnerability/bugs data available.
pghvlaans, yeah, the drag is, those are Intel architectures in lspci strings, not actual data, that's always been annoying because you see the value you want in the string, but it's not actual data. In Intel systems, you'll also see devices from different architectures listed, like one in the GPU, another in the CPU, but not the same ones.
I'm going to take a look however, but if it's an ambiguous stepping inxi can't 'know' which it is, unfortunately, I beat my head against that one off and on during recent architecture updates. That's a matching table, sort of, empirical, you can't actually get the architecture name from the system, same as for vendors, those are also created by internal manually constructed matching table, same for ram vendors by the way. On rare occasions you can get ram vendors internally, but usually not, it has to be matched to the product string.
So far great stuff, far better than hoped for.
Debugger data sets from AArch and PPC should show me causes for sure, thanks a lot for those. The hex character may be slightly difficult to deal with. I had to handle an issue like that for weather, for Russian characters, and that was very tricky. I didn't do it globally, just in weather.
risc-v worked spectacularly well by the way, no errors!! Everything handled internally, that's actually very impressive, it suggests support can be added, though I've never seen a dataset before, so will be interesting to see what they look like.
pghvlaans, that may be a real bug with the stepping, I have to check that, carefully. It looks like stepping is converted to hex, but then is sent to the matching table, which is looking for integers. I have to make triple sure about this before fixing it, but you may have found a really significant bug which will impact a lot of Intel architecture detections, since a lot of them depend on stepping, not model number. The ID, if it was working, should have returned Comet/Whiskey Lake, but it returned the fallback case, Kaby Lake.
I'd honestly always been unclear if the stepping in cpuinfo was hex or decimal, but during research for this refactor, I finally came across something that was unambiguous about it. I'm going to run a quick test on this, and if you can confirm if it worked, I'll know that was the bug, which seems to be a bug in inxi, maybe one that's been there for a while. My problem has always been that all the Intel servers I have access to just happen to have steppings less than 10, so this issue never really was obvious, but I have to be very careful in this one. But it looks like a significant bug in pinxi/inxi for mainly Intel CPUs. Exactly what I was hoping to find.
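As a rough illustration of the mismatch described above, here is a minimal Python sketch (inxi itself is Perl, and this is not its code). The table contents, stepping numbers, and function names are all made up for the example. It only shows how a hex-formatted stepping can slip past a plain numeric test and land on the fallback.

```python
# Hypothetical matching table keyed by decimal stepping values.
# These entries are placeholders, not inxi's real Intel table.
ARCH_BY_STEPPING = {
    9: "Kaby Lake",
    12: "Comet/Whiskey Lake",
}
FALLBACK = "Kaby Lake"


def lookup_buggy(stepping_dec: int) -> str:
    """Mimics the suspected bug: hex string checked with a plain numeric test."""
    stepping_hex = format(stepping_dec, "x")  # 12 becomes "c"
    if stepping_hex.isdigit():
        # Steppings below 10 look identical in hex and decimal, so they work.
        return ARCH_BY_STEPPING.get(int(stepping_hex), FALLBACK)
    # "a" to "f" fail the numeric test, so the table is never consulted.
    return FALLBACK


def lookup_fixed(stepping_dec: int) -> str:
    """Converts the hex string back to decimal before the table lookup."""
    stepping_hex = format(stepping_dec, "x")
    return ARCH_BY_STEPPING.get(int(stepping_hex, 16), FALLBACK)


print(lookup_buggy(12))  # falls through to the fallback
print(lookup_fixed(12))  # finds the table entry for stepping 12
```

With steppings of 9 or lower both functions agree, which is consistent with the bug staying hidden on machines with small stepping values.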
bw42, I'll check your stuff and post if I see anything.
I have several more BOINC rigs that I could also install inxi/pinxi on if it'll help you. (I have one box running Slackware 14.2 whose motherboard reads "copyright 1998.")
bw42, pghvlaans, somehow, you both managed to find manifestations of the same bug, LOL. This is far better than I hoped for, and in particular, this area I had not even really double checked at all, since it wasn't part of the core refactor. The PPC error was also directly caused by failing to test that the number to be converted from hex to decimal was an actual hex number.
So Perl and inxi did not do well trying to convert: 2.2 (pvr 004e 1202) (2)
into decimal from hex, lol. MIPS, PPC and to some lesser extent, ARM, also tend to put various strings into the stepping or revision fields, so this was going to hit more people than just this single case.
The logic internally was correctly turning the stepping integer into hex, but the CPU architecture function had not been updated to check that it was dealing with a hex number and convert it back to decimal. It was also only doing a simple is-numeric test, which I think was from an older version of the logic where I was not converting the integer stepping value to hex the way I do with model id and family id.
So this was a really excellent start, very promising: several related bugs in an area where I did very little double checking of the logic, and which I had not refactored. None of my CPUs had steppings > 9, nor were they ever giving a non-integer value for stepping/revision/rev, so I never saw it.
pinxi 3.3.09-10 should have these two issues handled.
Code:
pinxi -U
to get fixed version.
This means the Intel arch has been wrong in all cases where the stepping was > 9, which is a drag.
Excellent so far, will keep digging into the data.
Many thanks for finding such failures so quickly.
Interestingly, a Manjaro person also found a stepping bug, but that was one where I'd forgotten to add a fix in to protect against the case of stepping 0 being treated as false, not a valid value. That is also fixed.
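The "stepping 0 treated as false" case is a common trap, so here is a small Python sketch of it. This is an illustration only, with hypothetical function names, not the actual inxi fix. In Perl both 0 and the string "0" are false; in Python only the integer 0 is, so the example uses an integer.

```python
from typing import Optional


def has_stepping_buggy(stepping: Optional[int]) -> bool:
    # Truthiness test: a real stepping of 0 is falsy and gets dropped.
    return bool(stepping)


def has_stepping_fixed(stepping: Optional[int]) -> bool:
    # Test for "no value" explicitly so that 0 counts as valid data.
    return stepping is not None


print(has_stepping_buggy(0), has_stepping_fixed(0))
```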
JayByrd, another excellent one, I was almost kicking myself because I had just converted my old Athlon X2 box to Ryzen about a month or two before starting this refactor, and I was wanting to see what happened on that specific CPU.
Any old hardware is really great to test on, and also any super new stuff. I think I have a guy on Monday who is going to shoot me data from his Alder Lake, that's the real test for this refactor. So far it should work exactly as intended unless there are bugs, keeping my fingers crossed on that one. But from what I am seeing so far, pretty much anything and everything is great to test on. Remember to update with pinxi -U to get the latest fixes each time. I'll try to remember to up the patch number each commit so you can tell you got the latest.
Many thanks, these are all extremely useful.
nobodino, sometimes I worry it's giving too much information. If you saw it when it was version 0.1 compared to now, it's a huge difference, but a lot of it is directly due to end user feature requests (or my own feature requests for my own needs), so it is what it is I guess. I hope that the extra data verbosity levels help control that, and I do always try to keep the non extra data levels as short and concise as possible, like -b for instance. But it's a challenge balancing clarity and non ambiguity with brevity. -a/--admin was actually my solution to that dilemma, that one is allowed to be as verbose and long as required for precision.
pghvlaans, I am super glad you and bw42 had hardware that exposed such a 'core' bug, instantly, this is exactly what I was hoping for, and even better, in both places where that bug could trigger. I hate doing a new inxi with such a glaring bug in existence, and I'm trying to not think of all the previous releases that have had that bug, that's very annoying. But this is a big reason I had to do the full CPU refactor, there were a lot of subtle and not so subtle bugs like that, many of which could not be fixed without rewriting all the logic.
The current fix is not perfect, but I think it will cover virtually all meaningful cases. It searches the stepping value for 1 to 3 hexadecimal characters, and only 1 to 3. If the value contains anything else, it doesn't use it for hex conversions: it either sets it to 0 for math tests in the stepping logic, or leaves it as a string and doesn't try to convert it, for the other logic that bw42 tripped the error on.
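A minimal Python sketch of that kind of guard, assuming the rule is "exactly 1 to 3 hex characters and nothing else". The function names and the regex are illustrative guesses, not the actual pinxi code, which is Perl.

```python
import re

# Accept only values made of 1 to 3 hex digits, with nothing else around them.
HEX_1_TO_3 = re.compile(r"[0-9a-fA-F]{1,3}")


def stepping_for_math(raw: str) -> int:
    """Decimal stepping for numeric tests; 0 when raw is not clean hex."""
    if HEX_1_TO_3.fullmatch(raw):
        return int(raw, 16)
    return 0


def stepping_for_display(raw: str) -> str:
    """Converted decimal value when raw is clean hex, otherwise raw untouched."""
    if HEX_1_TO_3.fullmatch(raw):
        return str(int(raw, 16))
    return raw


ppc_value = "2.2 (pvr 004e 1202) (2)"  # the PPC string quoted earlier in the thread
print(stepping_for_math("c"))           # clean hex, converted
print(stepping_for_math(ppc_value))     # not clean hex, set to 0
print(stepping_for_display(ppc_value))  # left as the original string
```

Using a full match rather than a search means a string that merely contains hex-looking characters, like the PPC revision above, is never partially converted.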