Use /proc/cpuinfo for frequency measurement #266
Comments
Relying on Here an example for a VIM2S relying on Amlogic S912: https://github.com/ThomasKaiser/sbc-bench/blob/master/results/1iJ7.txt#L51 With the boot BLOB Khadas got, the cpufreqs of the 'bigger' cluster will be faked as 1.5 GHz while it's 1.4 in reality. Android TV boxes relying on this SoC will often fake 2 GHz and with for example S905W it's even more funny since this SoC fakes 2.0 GHz too but only runs at 1.2 GHz in reality. Amlogic/Allwinner cheating was the main reason to integrate https://github.com/wtarreau/mhz in |
This is really interesting! Thanks for sharing.
|
Just walking through
I have no fallback, am just building it ( |
Okay, it does not use |
Related to Apple SoCs (#230), where I basically hardcode the max frequency (bad), I wonder if an approach similar to |
Mhz is not OS-specific. I've used it on AIX, *BSD, Linux, OS-X, Solaris etc. It only needs to find a relatively trustable clock source (i.e. the venerable gettimeofday() which every OS has) and that's all. The operations are extremely simple: it just creates a long sequence of dependent single-cycle instructions that the CPU cannot optimize away, so that it's effectively able to count the time it takes to perform N operations, hence N cycles. There's a compensation for the cond jump at the end by comparing two distinct loops, but overall it's quite accurate and variations are around 1-to-2 / 1000, which is not bad.

As @ThomasKaiser said, it served us at a time when there was a race to be the biggest liar between CPU vendors. By then I was really fed up with not knowing what I was buying and wanted to make something trustable to assess the hardware and point the finger at the liars. I didn't have much time to work on that beyond a few basic tools, and when I saw Thomas come with an already fairly complete sbc-bench, I said it was exactly what I had in mind.

I humbly think that together we managed to force a little bit of cleanup in this domain so that it's now more difficult to cheat without being noticed. BTW we haven't caught amlogic nor rockchip cheating anymore after the tools became popular enough to be run by reviewers to verify they were not losing their time ;-) |
Though I've no idea how to measure the efficiency cores since something like But there would be a different approach: Firing up something in N threads (corresponding to number of cores) and then using
But there is another problem (not tested with M3 but with M1 and M2): once there are more than N cores busy the maximum cpufreq will be decreased automagically. At least true for the performance cores. E.g. 3200 MHz became 3000 MHz on the MacBook Air M1 back then when more than 2 cores were busy though this may depend on power capping and may differ on different models. |
Yeah I remember about this difficulty or even impossibility to bind to specific cores on this OS, it's pretty annoying. They probably consider it as a feature to prevent the user from helping the scheduler make the right decisions... I've even found a question about this which was roughly replied to as "simple, you just don't need to do that, period". |
WRT MacBook Pro M1:
MacBook Air M3:
The 'load generator' was a silly But no idea with which macOS version this started and not able to test on anything prior to 14.6.1/23G93 (we don't do 'patch management' here but instead patch everything always immediately). Did also a parallel run of 8
|
It's great to have you here @wtarreau! Very nice tool. I also tried doing something similar (and I integrated it into cpufetch here), but instead of the RAW operations you are using, I'm using nops. However, the biggest challenge I found is the number of cycles. In my experiments I found that the number of cycles cannot be predicted, e.g., in my Regarding the inability to set the thread affinity in macOS, I share your annoyance. I also needed this for cpufetch and I had to implement this: make a loop and constantly check the current core until the scheduler decides to move the process to the core I want. Extremely dirty, inefficient and unpleasant, but it works for my use case. Well, I don't know what is worse, this hack or macOS not giving developers the tools needed to do proper work.

Also, I think I still don't get what you guys mean when you say vendors (amlogic, rockchip) were lying about max frequency. Does it mean that they modified the kernel to report higher frequencies than the actual ones? |
The difference is that modern CPUs use instruction fusion in the decoding stage and will merge most NOPs and even eliminate them. 35 years ago I was using NOP on 8088, it was OK (and even allowed to distinguish 8088 from 8086 by overwriting them, one had a 4-byte prefetch queue while the other had 6). But NOPs are totally unusable nowadays. I'm not surprised by your random measurements. They could even depend on code alignment, depending on how instructions are fetched and merged together.

Regarding vendors cheating: yes, that's it, but not just that. The Rockchip kernel in the era of RK3288 (kernels 3.10 and 4.4 IIRC) would indeed enforce a hard limitation in the cpufreq driver to silently ignore higher frequencies. Usually the limit was set to 1.608 GHz, but unscrupulous (or sometimes unsuspecting) board vendors would advertise 1.8 GHz (probably after verifying that it still worked once changing it in the device tree, without realizing that if it was stable, it was because it wasn't 1.8 GHz).

For amlogic, it was worse: they didn't modify the driver, it looks like it was an MCU inside the SoC that was enforcing a hard limit. Hardkernel used to advertise and sell their Odroid-C2 as a quad-2.0 GHz CPU except that it was a quad-1.536 one. After this was disclosed, they apologized for not noticing it. Given how they annoyed amlogic to get different blobs to try to set higher, stable frequencies, I really think they were honest and didn't notice that scam by themselves. That was too much, really. When vendors cheat at the hardware level, you have to find other ways to let users figure out by themselves what they're buying. Nowadays the situation has significantly improved on this point! |
In some systems, measuring the frequency is not possible because perf_event_open is unavailable (as in #260). Thus, having a fallback that relies on /proc/cpuinfo would improve the resiliency of the method.