Samsung Laptop Bug Is Not Linux Specific
First time accepted submitter YurB writes "Matthew Garrett, a Linux kernel developer who was investigating the recent Linux-on-Samsung-in-UEFI-mode problem, has bricked a Samsung laptop using a test userspace program in Windows. The most fascinating part of the story is on what is actually causing the firmware boot failure: 'Unfortunately, it turns out that some Samsung laptops will fail to boot if too much of the [UEFI] variable storage space is used. We don't know what "too much" is yet, but writing a bunch of variables from Windows is enough to trigger it. I put some sample code here — it writes out 36 variables each containing a kilobyte of random data. I ran this as an administrator under Windows and then rebooted the system. It never came back.'"
Embrace Linux as an additional test suite for your hardware.
it writes out 36 variables each containing a kilobyte of random data
36k clearly isn't enough for anyone.
Does Windows crash if it has zero temp space, or zero free RAM (real plus virtual)?
Or at least in older versions, or on systems with very low RAM?
It's not irrecoverably bricked. All he needs to do is open the laptop and disconnect the battery that refreshes the CMOS storage memory and wait a few seconds.
30-day hassle-free return policy.
So installing too many operating systems will result in a brick. Windows in particular uses a lot of NV storage for its boot entries, so be careful when using BCDEDIT.exe...
Unfortunately not in this case.
These guys are intentionally trying to brick their laptops? I understand what they're trying to do, but don't they care about their money going down the drain, or are they getting free laptops from Samsung somehow?
We all know perfectly well that malware makers will start including a module that purposefully bricks Samsung laptops so that extortionists can threaten to wipe out a batch of corporate-owned laptops in one blow if the company refuses to cough up a substantial amount. No matter how this affair plays out, I can't see it ending well for Samsung.
A truly excellent pizza parlor is a delight unto the heavens. Treasure the sauce and the toppings!
I might be confused, but don't kernel devs normally destroy their instruments at the end of each show?
How can I believe you when you tell me what I don't want to hear?
He forgot the line, "Try it yourself and see." :)
Reminds me of the old IRC days when n00bs would ask what the command was to get channel admin privileges. "+++ATH" was the normal answer. :)
-Charlie
That's what swap space is for (aka the pagefile on Windows).
Your system will try to dump memory into swap space. If you don't have swap space, on Linux at least, processes will fail to run and you'll get some messages in dmesg that you're running out of available RAM.
Depending on the application, the application that is trying to allocate memory may crash.
I have yet to see a full system halt because of it though.
Removing the CMOS battery didn't recover this system, which is pretty much what I'd expect - UEFI variables are typically stored in the same hardware as the firmware itself, and unplugging batteries doesn't kill your firmware.
The system doesn't fail to boot. The system doesn't even complete its power-on self checks. The screen is never turned on. It never responds to keyboard input. It's bricked. This machine's not coming back to life without an SPI programmer.
UEFI data is apparently stored in NAND. Non-volatile.
No idea if there is some way to flash it, but if it's sufficiently hardwired into the board then it's entirely possible you're SOL and have to buy new hardware. Yes, this is idiotic.
Sorry, but if removing the battery or otherwise resetting the NVRAM to factory defaults resolves the issue, that's not even remotely "bricked".
Non-Volatile Random Access Memory
Look up the first part and you'll figure out why removing the battery won't fix it.
I've seen Windows machines run out of handles. First you see applications not drawing properly, or missing buttons, then you see windows failing to be created. When it tries to create the window, it fails, then you hear the "Critical Stop" sound played instead of a dialog appearing.
Sometimes, it won't even create menus, so you can't right-click on a program in your taskbar and close it, but you can still activate the window and press Alt+F4 to close the program.
Once your system gets into that state, start closing programs (Calc, Explorer windows, etc. ) until you can use your computer again. Once you've closed enough programs, your computer works again. Don't even need to reboot.
The code writes 48 1kb vars. The summary is wrong.
(char)rand();
Extremely minor nitpick, but converting an out-of-range value to a signed integral type causes implementation-defined behaviour (which could include raising a signal).
It's pretty safe to say that Microsoft will never release a compiler that breaks this, but portability could be maximised by making 'testdata' be 'unsigned char' and removing the cast in the quoted code (out-of-range conversions to unsigned integral types cause the value to be reduced using modular arithmetic - no cast is required or desired).
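A minimal sketch of the suggested change, assuming the buffer is an array of 'unsigned char'. The helper names fill_random and to_byte are made up for illustration and are not taken from the original sample code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Convert an int to a byte. Conversion to an unsigned type is defined
 * as reduction modulo UCHAR_MAX + 1, so no cast is needed. */
unsigned char to_byte(int value)
{
    unsigned char b = value;
    return b;
}

/* Fill buf with len pseudo-random bytes (hypothetical helper). */
void fill_random(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = to_byte(rand());
}
```

Calling fill_random(testdata, sizeof testdata) on an 'unsigned char testdata[1024]' would produce a kilobyte of random data without relying on implementation-defined behaviour.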
I have yet to see a full system halt because of it though.
Last time I discussed it, Linux would kill a heuristically chosen process (the "OOM Killer": it avoids killing a process owned by root, balanced against killing something using lots of RAM and maybe CPU, I can't remember). Windows will crash.
Both behaviours are acceptable. Arguably, the Linux one is worse in some cases -- it might leave the system in an unnoticed but inconsistent state.
This just goes to show that UEFI is top-heavy, fragile, and not ready for prime time.
Don't attribute to malice what can be explained by stupidity, at least if no lawyers are involved.
It's okay, kernel developers and heavy metal bands are easily mistaken for each other.
That's not what the OOM killer is for. Linux will allow over-commitment of memory (programs can malloc more memory (RAM plus swap) than is available). If all the malloc'ed memory is actually used, this can lead to more memory having been allocated than is available. This is when the OOM killer starts work killing tasks.
This behavior can be modified by changing the values in /proc/sys/vm/overcommit_ratio and /proc/sys/vm/overcommit_memory.
As an experiment, I wrote a little program that malloc'ed 200MB chunks of memory. I ran this on a Linux box with 2GB of RAM and all the swap disabled. The program could malloc 3GB of RAM before the allocation requests failed.
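The experiment can be approximated with something like the following. This is a reconstruction under assumptions, not the poster's actual program: the function name and the cap on the number of chunks are invented here, and the memory is never touched, only requested.

```c
#include <stddef.h>
#include <stdlib.h>

/* Request up to max_chunks blocks of chunk_size bytes without writing
 * to them, and return how many requests succeeded. Every block is
 * freed again before returning. */
size_t count_successful_mallocs(size_t chunk_size, size_t max_chunks)
{
    void **chunks = calloc(max_chunks, sizeof *chunks);
    if (chunks == NULL)
        return 0;

    size_t n = 0;
    while (n < max_chunks) {
        void *p = malloc(chunk_size);
        if (p == NULL)
            break;              /* allocator refused the request */
        chunks[n++] = p;
    }

    size_t succeeded = n;
    while (n > 0)
        free(chunks[--n]);
    free(chunks);
    return succeeded;
}
```

Calling it with a chunk size of 200MB and a generous cap shows how much the kernel is willing to hand out before malloc starts returning NULL. The result depends on the overcommit settings and on the address space available to the process.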
The real "Libtards" are the Libertarians!
That's often a case of running out of desktop heap rather than handles.
Sorry, but if removing the battery or otherwise resetting the NVRAM to factory defaults resolves the issue, that's not even remotely "bricked".
Non-Volatile Random Access Memory
Look up the first part and you'll figure out why removing the battery won't fix it.
"Or otherwise". Even though they were wrong about it not being bricked in this instance, to be fair the AC wasn't completely clueless. Some hardware includes procedures to recover from bad NVRAM data.
---
Some systems have a small reset switch on the motherboard for this purpose that is supposed to reset it to factory default when removing the battery [which might only be used for the system time-of-day clock chip] wouldn't work.
Whether the UEFI data is being stored in the CMOS area, or, as some other posts indicate, being stored in the same flash memory as the BIOS executable code is unclear. Or, there might be a special NAND memory, just for UEFI data, but this would add extra cost. But, if the BIOS executable flash is being co-opted to store UEFI data, this would make either updating the UEFI or the BIOS code a dicey proposition at best.
This is dicey because the BIOS flash is dual banked. When reflashing the BIOS, [which is running out of bank A], the new code is loaded into bank B, a flag gets set, the system is rebooted, and if the [new] bank B code runs correctly, the system will mark bank B as the new default bank to boot from. In the interim mode, if the bank B code fails, the system is reverted to the older but still good bank A code. The process ping pongs on each new BIOS update.
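The ping-pong scheme described above can be modelled as a tiny state machine. This is only a toy simulation; the type and function names are invented for illustration and do not correspond to any real firmware interface.

```c
#include <stdbool.h>

enum bank { BANK_A = 0, BANK_B = 1 };

struct firmware_state {
    enum bank active;    /* bank currently marked as the good default */
    bool trial_pending;  /* set after flashing, cleared by the next boot */
};

/* Model a reflash: new code goes to the inactive bank and a flag
 * requests a trial boot from it. */
void flash_update(struct firmware_state *fw)
{
    fw->trial_pending = true;
}

/* Model one reboot. new_code_ok says whether the freshly flashed bank
 * runs correctly. Returns the bank the system ends up booting from. */
enum bank trial_boot(struct firmware_state *fw, bool new_code_ok)
{
    if (fw->trial_pending) {
        fw->trial_pending = false;
        if (new_code_ok)
            fw->active = (fw->active == BANK_A) ? BANK_B : BANK_A;
        /* otherwise stay on the older but still good bank */
    }
    return fw->active;
}
```

Starting from bank A, a successful update moves the default to bank B, a failed one leaves it where it was, and the next successful update swings back to A.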
Trying to lay UEFI key data in the same memory space seems ripe for problems.
Like a good neighbor, fsck is there
Interesting. Does this mean that before too long there's going to be a nice glut of Samsung laptops being sold as refurbs? Replace, reflash, resell?
fencepost
just a little off
Depends. Some motherboard vendors will include methods of reflashing a BIOS in the event the boot EEPROM code is hosed. Obviously this process is hard coded in ROM someplace. So perhaps Samsung has such a method in place for unbricking the units.
Life is not for the lazy.
I might be confused, but don't kernel devs normally destroy their instruments at the end of each show?
Well, when on the Ed Sullivan Show, they have been known to pack explosives into the drum memory.
Why, without your clothes, you're naked, Miss Dudley!
You can almost certainly re-program it using a JTAG interface... Samsung can do this at the factory if you return it to them. JTAG is not intended for consumer use, though. My old university had a JTAG probe and several adapters to interface with various hardware vendors proprietary interfaces - without this we would have had several multi-thousand dollar bricks in our hardware lab :)
I would hope that Samsung would have the decency to admit a flaw in their design and provide the reprogramming free of charge, but ...
As an experiment, I wrote a little program that malloc'ed 200MB chunks of memory. I ran this on a Linux box with 2GB of RAM and all the swap disabled. The program could malloc 3GB of RAM before the allocation requests failed.
You were running on 32 bit? You will hit the same limit whether you have 256MB or 16GB then.
Finally! A year of moderation! Ready for 2019?
Windows 7 and 8 give you messages that you are running low on memory. I'm not 100 percent sure, but I think they kill the largest userspace program (though this just might be the program dying from lack of RAM). Running out of (disk) space is generally the bigger problem: Linux doesn't like to log you in if it cannot syslog the attempt to disk.
Sorry, what? You are saying that a 32-bit machine with no swap and 256MB of RAM would allow 3GB of memory to be malloc'ed? I don't think so. My point was that with a total of 2GB of memory in the system (2GB RAM and zero swap), a program can malloc 3GB.
The real "Libtards" are the Libertarians!
You can malloc() as much as you want until you run out of address space. That's 3GB on 32-bit systems, no matter how much RAM you have. Things will only go wrong if you attempt to use it for anything.
Sorry, but no. My tests showed that the amount of memory you can malloc() is dependent on the values in /proc/sys/vm/overcommit_ratio and /proc/sys/vm/overcommit_memory
The real "Libtards" are the Libertarians!
That's definitely a resource leak that I've hit before. It doesn't seem to be cleaned up completely by closing the offending process. Sure you can prolong the reboot for a while. But eventually you can only keep a couple of application windows open before hitting the limit and you'll need to reboot anyway to actually get some work done.
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
I'm sorry, you're right - I was under the impression that overcommit_memory defaulted to permissive rather than heuristic allocation. However, overcommit_ratio is also ignored by default.
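For anyone wanting to check what their own box is set to, here is a small sketch that reads the two files mentioned in this thread. The helper names are invented; the mode numbers follow the kernel's documented meaning (0 heuristic, 1 always overcommit, 2 strict), and overcommit_ratio only takes effect in mode 2.

```c
#include <stdio.h>

/* Read a single integer from a file such as
 * /proc/sys/vm/overcommit_memory. Returns -1 if it cannot be read. */
long read_proc_long(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Map the overcommit_memory value to a human-readable name. */
const char *overcommit_mode_name(long mode)
{
    switch (mode) {
    case 0:  return "heuristic";
    case 1:  return "always overcommit";
    case 2:  return "strict";
    default: return "unknown";
    }
}
```

Printing overcommit_mode_name(read_proc_long("/proc/sys/vm/overcommit_memory")) next to the value of /proc/sys/vm/overcommit_ratio tells you which of the behaviours argued about above applies to a given machine.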
Fix the bug for Linux, let M$ rot in hell.
As I see it, he left the problem to Samsung. I mean, how can you be more blatant in saying 'Fix this, or else....' than by posting (supposedly) working exploit code for the majority of the installed OS base?
Buffer overrun? Maybe when the storage gets full, it starts to overwrite other config data with junk.
That reminds me of an assembly language course I took at the University, where we had to implement a mathematical algorithm in x86 assembly. My implementation bricked the PC, leaving it with a BIOS unable to boot or to enter setup. I never understood how it did it, but I now suspect that removing the battery for a while would have cured the disease.
Which is far better than this UEFI issue, as Windows, while misbehaving, doesn't crash and can actually be recovered (and if you don't know how, a simple reboot will fix it).
Samsung was claiming this is a Linux problem. This needs to be shown to be either a problem with UEFI directly or Samsung's implementation.
The important thing here is that this bug may exist in other hardware with different thresholds. UEFI is just like BIOS in that the only difference between a Phoenix BIOS on Dell and a Phoenix BIOS on Samsung is a couple of minor changes because of hardware differences. By releasing the code into the wild they are FORCING the parties responsible to fix the problem or face a public relations nightmare of potentially thousands of bricked machines.
The Linux folks actually read and understand the documentation and then use the mechanisms described. The Windows-folks are usually not so capable.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Given that MS has tried this type of "marketing" before, I would say it is a safe bet it is not stupidity this time either. Just a complete lack of morals.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I think it was The Who who popularized that style. Not exactly Heavy Metal...
And I thought that uefi was supposed to make this a better and more secure world.
Now this ...
And UEFI cannot boot ISO 9660 file systems. So no booting from a CD or DVD without jumping through hoops.
And UEFI relies exclusively on the FAT filesystem for its EFI partition. That seems kinda backward to me.
Looks to me like we're losing functionality by using UEFI.
https://en.wikipedia.org/wiki/Extensible_Firmware_Interface
The UEFI specification explicitly requires support for FAT32 for system partitions, and FAT12/FAT16 for removable media;
It's just a matter of time until this exploit is included in malware, so Samsung better start pumping out those firmware upgrades that guarantee enough space will be left to boot.
When the copyright term is "forever minus a day", live every day like it's the last.
Actually, Bill (Gates) got this one right when he (I have the audio recording) stated that no one needs more than 640k on any computer.
Right. Release the recording or it didn't happen!
When the copyright term is "forever minus a day", live every day like it's the last.
I bought a Samsung laptop at the start of January :( Not only that...I bought it specifically to install Ubuntu :( Bah.
Now what do I do?
Next you'll be suggesting the Samsung bug is triggered by someone driving a car into their memory pool.
P.S. The boom was at the Smothers Brothers Comedy Hour.
Given that MS has tried this type of "marketing" before, I would say it is a safe bet it is not stupidity this time either. Just a complete lack of morals.
Of course, aside from the minor detail that your safe bet turned out to be wrong.
Your suggestion is a good idea to avoid this code being copied to another platform and breaking there. Microsoft does specify what happens here though, and the program as written does the right thing. Both char and unsigned char casts get "Preserve low-order byte" when you start with a larger integer.
if that test was covered by warranty. That is an honest question; if I try to reproduce a bug which bricks a device, I do something which is partially intentional. It's like dropping the device from a height which it should survive but does not.
Apparently, my memory was also packed with explosives. Of course it was the Smothers Brothers. Ed would have had a stroke if someone had blown up a drum (or anything else) on his show.
Why, without your clothes, you're naked, Miss Dudley!
Running out of (disk) space is generally the bigger problem: Linux doesn't like to log you in if it cannot syslog the attempt to disk.
Rubbish. There is no way a syslog caller can know whether the log worked or not.
login might fail if it can't write the utmp/wtmp record, if it still does that.
Watch this Heartland Institute video
Many flash parts are set up so that if you short two adjacent pins, the flash chip will zero itself out.
Certainly non-trivial, though, and if the parts have any kind of water-proof coating, even more difficult. Once in a while a manufacturer will be kind enough to provide a surface-mount pushbutton momentary switch to make this easier.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
If you set overcommit aggressively enough or use a sufficiently old Linux kernel, you will be able to malloc() 3GB on any 32 bit system. Assuming you stay with the default 3/1GB memory split.
Any testing of overcommit done on 32 bit systems is a bit useless. 32 bit systems are pretty much embedded-only.
Finally! A year of moderation! Ready for 2019?
Oh? And what evidence do you have for that?
MS manages to be malicious and stupid at the same time...
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Really, so every Linux distro works with every piece of hardware right out of the box then?
I haven't thought of anything clever to put here, but then again most of you haven't either.