New inxi / pinxi features: user RAM reports! And much more! Testers?
rizitis, sorry, no, it was right to say 64 GiB is an estimate, because this was a case of the DMI data for the array being wrong: logically impossible, since it listed the capacity as 32 GiB when the occupied total is 64 GiB. Most of the array values are, I believe, filled in manually by someone, and those types of errors come from bad copy-pastes of the wrong data into the board's DMI table.
One thing that puzzles me about how udevadm implemented this is how many values they left off their report; one assumes they get the whole DMI table at boot and then decide what to print of it, so in a sense they are deciding not to print things like max module size.
It's hard to believe the claims that AI is, first, a real thing and, second, will take over the world, when such comically absurd and all-too-human errors are a constant in the data driving these systems. A case of putting the cart before the horse, I think.
One thing that strikes me about the output, with the new Report: arrays: item, is that the array report should contain the type, like DDR4, and that then doesn't need to be in the Device item since it will always be the same; RAM module type is defined by the array, not the module. I don't believe different DDR types even fit into the same slots, so it never made sense to list that per Device; the Device item should carry only info specific to the module. This probably applies to voltage as well, unless different arrays can be clocked differently, which might be a thing?
I'd never thought of this before, since the type is listed in the module (DMI type 17) records from dmidecode, not the array (type 16) records, so I'd just not thought about it.
Is this wrong, can they be different in theory? I don't see how.
But the type can definitely be moved to the Array line to get rid of some redundancy.
I think I'll future-proof the udevadm > 1 memory array handler to assume they number the arrays at some point in the future, so it will look for numbered or un-numbered fields. That's not guaranteed to work, since there's no telling how it will be presented, but as a guess, one would assume they'd follow the syntax of their per-slot output field names.
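The numbered-or-not lookup described above can be sketched with a single regex. This is a minimal illustration, not inxi's actual Perl code; the MEMORY_ARRAY_ field names are real udevadm output, but the numbered variant is the guessed future syntax.

```python
import re

# Match udevadm memory-array keys whether or not a future udevadm release
# numbers them, mirroring the existing MEMORY_DEVICE_<n>_<FIELD> style.
# The numbered form (MEMORY_ARRAY_1_LOCATION) is an assumption, not observed.
ARRAY_KEY = re.compile(r'^MEMORY_ARRAY_(?:(\d+)_)?([A-Z_]+)$')

def parse_array_key(key):
    """Return (array_index, field) or None; index defaults to 0 when absent."""
    m = ARRAY_KEY.match(key)
    if not m:
        return None
    index = int(m.group(1)) if m.group(1) else 0
    return (index, m.group(2))
```

Both forms then resolve to the same (index, field) pair, so the rest of the handler doesn't care which syntax udevadm emits.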
amikoyan, great, I think that's the first login manager (lm) output I've seen in the wild; I added that a few weeks ago, along with seatd, logind, and one or two others.
That was the main goal of the 3.3.32 cycle: to get all kinds of small loose ends wrapped up, refactored, etc. Then a few big features like this new user udevadm RAM report popped in at the last minute.
Also the relatively recent EGL/OpenGL/Vulkan reports; nice to see all that working. Those were in a previous release, but are still relatively new.
It's cool to see the sysvinit version method work; it only works occasionally (it uses strings to read the binary, and sometimes that contains the version, but not always).
I think there's almost enough evidence to say that this udevadm feature is new, arriving somewhere between v245 and v249, which explains why I'd never heard of it before. RAM data for non-root users has been one of the longest-standing features I've wanted in inxi, maybe 10 years now, so for it to just pop up like this is surprising.
I still have to tweak the errors a bit now that udevadm is a valid option; some cases assume dmidecode or nothing, and those are tricky conditions to get right.
After I tweak the multi-RAM-array logic a bit, that will be as good as I can get it, though I'm aware it may not work for complex systems with newer udevadm; hard to speculate there.
Oh, I learned a new thing: if the data width and total width are both 64, the RAM is not error-correcting (ECC), but if it's 64/72 or 64/80, it is ECC. I didn't know that's what those numbers meant for data width.
Apparently DDR5 ECC uses 80 bits total and 64 data.
So technically, if I understand this stuff right, one can deduce ECC or not just from the data/total widths. I always thought those numbers were referring to something else.
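That deduction is a one-line comparison. A minimal sketch, assuming the widths come in as the bit counts dmidecode/udevadm report (the helper name is hypothetical, not inxi's):

```python
def is_ecc(data_width, total_width):
    """Guess whether a module is ECC from its DMI bus widths.

    ECC modules carry extra check bits, so total width exceeds data width
    (e.g. 72 vs 64 for DDR4, 80 vs 64 for DDR5); 64/64 means non-ECC.
    Returns None when either width is missing, since udevadm sometimes
    omits fields entirely.
    """
    if not data_width or not total_width:
        return None  # can't tell from missing data
    return total_width > data_width
```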
chrisretusn: Oops, sorry, that one slipped by. Now corrected in 3.3.31-49
The per array active modules fix works, that's good, though it looks like something is getting trimmed from Error Correction Type.
[another bug, sorry, error correction type passed the field name, not the value, to the cleaner]
I'd still like to know what the udevadm output for > 1 array setups looks like. To me it looks like they made a mistake and assumed only one array, so they did not number the MEMORY_ARRAY_[field] items the way they do with: MEMORY_DEVICE_(\d+)_[field name]
I'm leaning towards guessing that MEMORY_ARRAY_LOCATION is always first, and that the ordered sequence is per array, then adding a check for a possible array number alongside the known un-numbered case, which is so far the standard. But there's no way to know, since the way they did it appears to be a mistake, a subtle one; I can see why it hasn't been caught so far, but I have to assume it will be.
I may have to redo that logic to actually build the structure that is missing.
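The rebuild amounts to two passes over the flat key/value list: collect the per-device MEMORY_DEVICE_<n>_<FIELD> keys into one dict per device, and gather the un-numbered MEMORY_ARRAY_<FIELD> keys into a single array record. A minimal sketch; the field names follow real udevadm output, but the function itself is an illustration, not inxi's code:

```python
import re

# Per-device keys are numbered; array keys currently are not.
DEVICE_KEY = re.compile(r'^MEMORY_DEVICE_(\d+)_([A-Z_]+)$')

def rebuild(udevadm_pairs):
    """Group flat udevadm (key, value) pairs into one array dict plus an
    ordered list of device dicts. udevadm gives no handle linking devices
    to arrays, so device order is the only structure available."""
    array, devices = {}, {}
    for key, value in udevadm_pairs:
        m = DEVICE_KEY.match(key)
        if m:
            devices.setdefault(int(m.group(1)), {})[m.group(2)] = value
        elif key.startswith('MEMORY_ARRAY_'):
            array[key[len('MEMORY_ARRAY_'):]] = value
    return array, [devices[i] for i in sorted(devices)]
```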
This is similar to a current active bug report where, on an AMD Threadripper 2950X, 16-core, I assume 2-die, each die is counting its cores independently, which no other AMD Zen I've seen does, leading to a reported core count of 1/2 the real one. It's impossible to debug without the debugger data required to emulate it; I can't actually guess what the data looks like.
The good side is that very few users will have > 1 array systems, so any potential issues will be limited to a small set of people, and I'll probably only need one udevadm data sample from such a system to fix it; as logic goes, this is pretty easy overall.
Thanks for confirming the fix, and sorry for the typo, I should have run some data through it with EC active, but I forgot.
Note another, actually important, difference I don't like between dmidecode output and udevadm output: in many cases udevadm does not output values that are not active, whereas dmidecode reports the same fields for modules whether they are occupied or not, and for arrays, EC whether active or not.
This is mostly over my head. Not sure what the oops refers to; still glad to have helped. The output of 'pinxi -SIGmaz --vs' looks the same to me. From one of your posts above I tried 'pinxi -maz --fake udevadm'; my result is below. I get the same result as user and root (the root one surprised me):
Code:
$ ./pinxi -maz --fake udevadm
Memory:
System RAM: total: 8 GiB available: 7.76 GiB used: 2.96 GiB (38.1%)
RAM Report: message: No RAM data found using udevadm.
# ./pinxi -maz --fake udevadm
Memory:
System RAM: total: 8 GiB available: 7.76 GiB used: 2.96 GiB (38.1%)
RAM Report: message: No RAM data found using udevadm.
With 'pinxi -SIGmaz --vs', I see "For most reliable report, use superuser + dmidecode.", so I gave that a try (just the "Memory" section):
I'm an amateur too, lol, I'm learning this stuff as I go along.
If those two samples are from the same system, there is a definite quality problem with the udevadm data: it missed EC RAM, and it missed double bank. Though it doesn't look like EC if the data and total widths are both 64 bits, so that's strange.
So far it looks like, except for RAM voltages, which are clearly wrong with udevadm, the data is similar, with one big problem: unlike dmidecode data, where you can link the array handle to the RAM stick, which lists that handle, with udevadm you're just hoping things stack in order.
I've added an attempt to at least guess ok on that, but it's a hack, I'm hoping udevadm fixes its output for complicated situations like > 1 ram arrays on board.
I was looking at a 24-slot Supermicro that looks like it has two separate banks of RAM, one per CPU, but I couldn't find whether it has > 1 array; stuff like that is really hard to find documented, in my experience.
The oops were a few bugs that were exposed: one, of course, the wrong function name; second, passing the wrong value to the cleaner, the key name instead of the value. Those are all corrected now.
Another corner case is also handled now, and I think the logic that checks and corrects the dmidecode data is now also working on the udevadm data, though dmidecode is more accurate because I can reliably connect the array handle to the modules that belong to it. So far that can't happen with udevadm, which is unfortunate, but it only impacts real servers with multiple memory arrays, which are not common. That's why the message says to confirm with root and dmidecode, though it only shows for -mxx and greater verbosity levels, since for most systems, if no voltage shows, the info is pretty much the same.
Keeping fingers crossed; this was/is a big new feature to add right before releasing the new inxi, but I wanted to get it in because it's nice finally having an OK RAM report for regular users, not just sudo/root.
I'm going to have to adjust some more things; unlike dmidecode, which doesn't usually have these weak spots, it seems udevadm is not very sophisticated in terms of how it creates its data.
So maybe that message should always show, but hard to know, ram data from dmi is always very messy.
So it looks like the user report (using udevadm) dumps all the memory into 1 array. Although, reviewing the udevadm report, maybe MEMORY_DEVICE_*_BANK_LOCATOR indicates the array?
This motherboard has 2 physical CPU die slots (of which both are used). I don't know why there are 4 memory arrays instead of just 2. If you only have 1 CPU installed you can only install memory for that CPU (so 6 slots instead of 12).
Code:
live@darkstar:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 88
On-line CPU(s) list: 0-87
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Gold 6238 CPU @ 2.10GHz
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 22
Socket(s): 2
Stepping: 7
CPU(s) scaling MHz: 31%
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 4200.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss h
t tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_ts
c cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca s
se4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_f
ault epb cat_l3 cdp_l3 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt c
lwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_
mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req vnmi pku ospke avx512_vnni md_clear fl
ush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 1.4 MiB (44 instances)
L1i: 1.4 MiB (44 instances)
L2: 44 MiB (44 instances)
L3: 60.5 MiB (2 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0-21,44-65
NUMA node1 CPU(s): 22-43,66-87
Vulnerabilities:
Gather data sampling: Vulnerable: No microcode
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Retbleed: Mitigation; Enhanced IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Srbds: Not affected
Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
This confirms what I was suspecting; only I do not have root on that remote server, and I had not noticed the bank locator item in the device.
So I will need to synthesize the data completely. Clearly the array report is for one of the arrays, the assumption then being that all the arrays are the same, which I don't know is always the case.
This is verified by the data I got from the multi cpu servers, which also had bank locator values.
As design goes, this is terrible output; whoever added this feature to udevadm was not very good at the task, probably one of those developers who only uses VMs and a laptop, is my guess.
It was clear they needed to add at least numbering for arrays, array handles, and per-device handle connections, like the source data has in dmidecode, assuming the raw DMI data has that; I don't know how it works.
The good news is, first, I had a strong suspicion something was wrong, even if I didn't spot what it was, and second, with a few samples I can emulate this.
I will have to build this from the most complex scenario and synthesize the array-device connectors, which means grabbing all the data first, then looping over it again to assemble it into the correct data structure, then passing that to the post-processor.
I'm surprised this passed inspection; it's quite shoddy how they did this, unfortunately, but something in me raised a red flag about how the output was working. Thanks for noticing the BANK_LOCATOR, which we will have to assume means the array. I had checked the specs on the Supermicro board with 2 CPUs and 24 RAM slots, and the wiring diagram made it pretty obvious there were at least two memory arrays.
Looks like a 2-step process is in order: one, build the array data; two, build the module data; then look for the bank locator on the devices and, if present, generate the actual data structure. This should not be necessary, but that's how it goes when a task is handed to someone who didn't actually understand how the hardware works; that's my guess anyway.
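The 2-step assembly can be sketched as follows, assuming the device dicts and single array record have already been collected from udevadm's flat output. It clones the one array record per node found in the devices' BANK_LOCATOR strings and attaches each module to its node's array. All names here are illustrative, not inxi's actual code; the Node parsing is the guessed heuristic, nothing guaranteed:

```python
import re

# Guessed heuristic: 'Node <n>' in BANK_LOCATOR identifies the memory array.
NODE = re.compile(r'\bNode\s*(\d+)\b', re.IGNORECASE)

def assemble(array, devices):
    """Synthesize per-array structure that udevadm fails to provide:
    replicate the lone array record for each node seen, and group each
    module under its node. Devices with no node hint fall into node 0."""
    nodes = {}
    for dev in devices:
        m = NODE.search(dev.get('BANK_LOCATOR', ''))
        node = int(m.group(1)) if m else 0
        nodes.setdefault(node, dict(array, devices=[]))['devices'].append(dev)
    return [nodes[n] for n in sorted(nodes)]
```

This is exactly the non-robust part: if the vendor never writes "Node" into BANK_LOCATOR, everything collapses into a single array again.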
The risk here of course is that they realize they did something really silly, and fix it in a later release.
I had guessed that in the future they would add MEMORY_ARRAY_[number]_ to fix this obvious oversight, but that may never happen, so I anticipate this feature will need updates once people realize how undesirable the output is. udevadm makes a lot of real errors; for example, they don't show negative values. For EC they simply don't show the field, which means you can't _know_ for sure it's a negative except by guessing: if there's no positive value, then it's negative, which is a poor way to do this type of processing. I have to do the same with the absence of size, though in that case they did add a PRESENT field with a 0 or 1 value, which is fine. Except that it doesn't always appear, sigh; your data doesn't use it, except to show... sigh, a negative value.
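The absence-means-negative guessing described above reduces to a small helper. A hedged sketch, not inxi's actual logic; the field names are real udevadm output, the function is hypothetical:

```python
def module_present(device):
    """Decide whether a RAM slot is occupied from a udevadm device dict.

    udevadm omits fields rather than emitting negative values, so absence
    has to be interpreted: trust the explicit PRESENT flag when it exists,
    otherwise guess from whether a SIZE field was emitted at all.
    """
    if 'PRESENT' in device:          # explicit 0/1 flag, when udevadm emits it
        return device['PRESENT'] == '1'
    return 'SIZE' in device          # otherwise: no size, assume no module
```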
I'm curious who developed this udevadm feature; if I had to guess, I'd guess someone at Red Hat. It has that sort of poorly-thought-out feel to it, and it doesn't build on what dmidecode has shown for ages, which is the obvious model to emulate, unless NIH syndrome is active and in control.
So to start, I think I'm going to always show the Message: unreliable data if > 1 arrays are detected once this is patched and updated.
One of the things that makes this udevadm output very low grade is the bad decision to show or not show specific field names based on circumstances, instead of always showing them; the bank locator, for example, should always be there, and apply to the set of 1 to many board memory arrays. dmidecode definitely did a far superior job creating its output: far more usable and, in particular, machine-parseable, with clear links between connected items.
Actually no, it's worse than I thought; looking at my udevadm samples, bank locator is in fact randomly used or not used, regardless of whether it's 1 or > 1 arrays.
Basically the udevadm dev tossed the array id in there randomly. This is really poor data handling by udevadm, extremely poor; I was afraid the actually complex scenarios would generate issues.
This is almost certainly because the person who whipped this feature together did not understand the physical hardware.
This is very similar, to me, to systemd development, where from the start you could see what the developers used for hardware and testing: VMs and laptops, because that was the only stuff systemd actually worked on for maybe the first 5 years. Anyone running complex hardware or servers had big issues, and the devs kept saying there were no issues, because they did not use real hardware, workstations, servers, etc. This udevadm feature has the same smell to me, which is why it works ideally in simple scenarios and starts to fall apart in complex ones.
This is a big problem, for example, I have a 4 slot 1 array system where udevadm notes in bank locator that it's bank 1, bank 2, bank 3, bank 4, 1 per slot.
Which means I can't use that method to determine the actual array.
This may not have a workaround until they actually fix their output to handle real hardware situations, but my fear is, this was made by someone paid to do it, and doesn't actually care that they did it poorly. That's what it looks like to me.
I was already using the BANK_LOCATOR for some fallback tests, but it is basically random junk strings that sometimes refer to channels, sometimes arrays plus channels, and sometimes slot names; it's largely random, not real data, that is. I believe it's largely up to the vendor how that data is entered per slot.
Given the samples here, the 4 array one just happened to have corresponding node 1-4, but looking at some other supermicro high slot count multi cpu, no such predictable syntax occurs.
I think I can roughly say:
'Channel [A-Z]' is not an array, that's normal channels in one array
'Bank [0-9]' has no definite meaning at all, you can largely deduce nothing from it, it might mean it's not dual channel ram on board, that is, each slot acts as one channel, or bank, or it might mean something else.
'Node [0-9]' I had some hopes for; it seems to offer the best bet. I now have 4 samples, 3 Supermicro, and the array capacity x node count is correct for each board: 2 have 1 array, 1 has 2 arrays, and the 4-array 12-slot one has 4 nodes, so all of these would be 'right' once adjusted.
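The three rules above can be written as a small classifier. A sketch only, since the very next sample could break it, as noted below; the category names are made up for illustration:

```python
import re

def classify_bank_locator(s):
    """Rough interpretation of a DMI BANK_LOCATOR string:
    'Node N'    - the only pattern that seems to map to a memory array,
    'Channel X' - just a channel within one array, not an array,
    'Bank N'    - no definite meaning, nothing can be deduced,
    anything else (P0, slot names, ...) is vendor-specific noise."""
    if re.search(r'\bNode\s*\d+\b', s, re.IGNORECASE):
        return 'array-node'
    if re.search(r'\bChannel\s*[A-Z]\b', s, re.IGNORECASE):
        return 'channel'
    if re.search(r'\bBank\s*\d+\b', s, re.IGNORECASE):
        return 'indeterminate'
    return 'unknown'
```

Note the order matters: a string like 'Node 0 Channel A' must classify as an array node, not a channel.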
However relying on such sloppy string parsing to get real data is rarely reliable and almost certain to fail with the next sample I get.
However, checking an AMD 2-CPU, 16-RAM-slot Supermicro with dmidecode, it shows only Node0, and the array capacity is correct, even though the specs look like a 2-CPU board with 2 sets of RAM slots. dmidecode data shows only one array, so that may be OK, though I doubt it's actually right; DMI data is just not very reliable.
It's pretty clear the DMI data is also defective in some of these cases; for example, a board with a CPU that supports 8 RAM channels, with 16 slots and 2 CPUs, is to me obviously running 2 memory arrays, yet dmidecode says it is running 1.
But dmidecode has always been a very poor data source so this is not surprising, the only thing of concern here is that at least, Node X corresponds to the memory array handle, which so far it seems to.
And the absence of Node x in the BANK_LOCATOR string seems to consistently indicate 1 array only.
So with this, I can get some systems handled, without anything close to robust, but the data source of udevadm is simply too low grade to allow for proper connections, and dmi data itself is too poor quality to actually trust.
I think the rule here probably should be to always show the check with dmidecode for slot total > 4.
I don't know, and have never understood, why Linux, and in particular /sys data, has consistently not exposed RAM data in any real way. I waited years for it to appear, since it was so logical to add, then finally and grudgingly decided to use dmidecode as a sort of desperate fallback, which has always created big problems because its data, particularly in device type 16, the array data, has been such poor quality.
This wasn't the easiest puzzle to solve, and the solution is not robust at all, but...
It now shows the "Message: For most reliable..." for all systems with more than 4 RAM slots, since those can't be relied on due to the weak array handling issue.
This is unfortunate, but at least in some cases the Node [x] method works. Trusting random strings like that is unlikely to always work; it's about as non-robust as you can get. This would have been a lot easier if udevadm provided handles connecting the arrays and devices, and if it provided the full array reports, not just the one case.
This test only runs with the Node x syntax, and only if > 1 node was detected, at which point I believe it will work, unless someone uses the term 'node' for something other than an array.
I'm pretty sure other servers will have no Node syntax with > 1 array however, but I already know from data that things like P0, P1, Bank 1, Bank 2, Channel A, Channel B, aren't reliable to detect arrays, and then there's a bunch of other random stuff as well.
But at least the udevadm reports will roughly match up with the dmidecode reports, but very low grades for the udevadm implementation, it's like half done at best, and was designed wrong in several ways in terms of being reliable for data source.
But this is now sort of weakly working.
I did not have to refactor it, just insert a data structure rebuilder when nodes were detected.
Note further that some systems have a Node 0, but only one, even though they are almost certainly > 1 array systems; yet those are reported with the right capacity for the array, the right total slots, etc., so those are wrong somewhere else. All quite vague.
whew!
But at least there's a mechanism to inject this logic when required, so there may be other syntaxes that can be added there in the future if they show up. Seems to work on supermicro boards so far, and that system76, which is a start.
chrisretusn, sorry, wanted to ask this earlier, can you show:
Code:
sudo dmidecode --type 16,17
for your above sample. Is that the same udevadm sample you posted earlier? If not, can you also provide:
Code:
udevadm info -p /devices/virtual/dmi/id
There's some odd discrepancies there between the dmidecode version and the udevadm version, want to make sure there's nothing wrong in the logic.
One showing EC, the other non-EC, is particularly odd, especially when the data/total widths are 64/64, which I believe means non-EC.
But if dmidecode is just lying, that's fine, nothing I can do about that.
Re the --fake udevadm, that only works if you have the test environment and the debugger data files, which is what I use to emulate the various samples here. Those files are available from codeberg.org/smxi/pinxi/data/ram/udevadm/ and I'm trying to upload the real data files used during development in case someone other than me wants to work on it. I used to keep most of those non-open because it was a pain to go through, clean up, and organize that stuff, but it's mostly done now, so I just add them as I go along.
The paths for those, if anyone cares, can be set using --fake-data-dir, though it's hard-coded to what I use locally. It all has to be set up to use, but it's pretty easy once you get the idea.