Windows on ARM Benchmarked: Qualcomm Snapdragon 835 vs. Intel Celeron N3450

Julio Franco

I actually enjoyed my Surface RT. Compared to tablets at the time, it had a lot more capability for me out of the box. The performance of the Tegra 3 was lacking, and by the Surface 2 the improvement on the Tegra 4 was noticeable, but then Intel released Bay Trail. Bay Trail killed the RT series more than anything; full Windows in the same power envelope couldn't be beat.

I get why Microsoft is trying so hard to get Windows running on ARM. Out of all the architectures it's the one that seems to be moving the fastest, especially in the ultramobile space. Since Intel abandoned their Atom line and AMD never really competed there, ARM offers good price to performance where Intel and AMD no longer do with x86-64.
 
Waaait! Did you re-run the benchmarks a couple of times when doing emulation? The use case is that emulation is translated once to arm instructions and then saved for faster subsequent runs. It looks to me like you benchmarked the translation process, which is only done once. The actual speed on repeated re-runs should be faster.
 
Well, the big cores on the S835 are actually A73-based, so they handle two instructions per clock; even the Atom used in these tests is 3 instructions wide, and other x86 cores are usually 8 instructions per clock. This is more of a tragicomedy than anything else. Even so, the A73 is faster in integer work and not far behind in floating point (VFP is good; NEON less so, as it is almost equal to VFP performance) compared to the Atom (the current-gen, 3-wide OoO ones).

On something that is both natively written and, thank God, better than Windows, things look different. Linux would actually run well and give a much better user experience (as it always has on slower hardware), all native ARMv8-A 64-bit; even smaller A53 SoCs are useful with Linux. Besides, the power consumption advantage on Linux would be several times better since it's all native, plus there is very good support for ARM scheduling on Linux, including the HPC scheduler and the energy-aware scheduler (HPC is complicated to configure, but done properly it significantly outperforms EAS while getting really close on power/performance metrics).

Qualcomm does have cores that can compete even with Intel's best efforts (and so does Apple). Falkor cores can go up against Intel's Y-series parts on most tasks. Naturally, ARM loses a lot of its magic regarding power efficiency when an equally wide design is used, but it retains at least some of it. The best part is that ARM can still mix performance OoO cores with efficient in-order ones, so it would remain fabulous for always-on use even if/when wider OoO cores are employed.

So Microsoft had to be kidding with this; it can't be considered anything but a bad joke. This is even worse than the Windows RT attempt. As Linux is becoming the dominant development platform, we will probably never see a wide ecosystem of native ARM apps on Windows, and as RISC (today, first and foremost ARM) is the dominant architecture, this also means Microsoft's days are numbered. When and if a desktop OS switch will happen I really don't have a clue, but Microsoft never even got serious in the RT embedded and server space, where its OS attempts were also sloppy...

Just my 10¢ on the topic.
 
Waaait! Did you re-run the benchmarks a couple of times when doing emulation? The use case is that emulation is translated once to arm instructions and then saved for faster subsequent runs. It looks to me like you benchmarked the translation process, which is only done once. The actual speed on repeated re-runs should be faster.
Proper benchmarks are always done using multiple passes and excluding the extreme results. TechSpot never does single-pass benchmarks, and if they do, they say so.
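
For anyone who wants to check the "translate once, then run faster" theory themselves, here is a minimal Python sketch of a multi-pass benchmark that drops the extreme results. It is only an illustration, not the methodology TechSpot used; the names `trimmed_mean` and `benchmark`, the pass count and the sample workload are all made up for the example. It reports the first pass separately, since that is where a one-time translation cost would show up.

```python
import statistics
import time


def trimmed_mean(samples, trim=1):
    """Average the samples after dropping the `trim` lowest and highest values."""
    if len(samples) <= 2 * trim:
        raise ValueError("need more samples than the number being trimmed")
    ordered = sorted(samples)
    kept = ordered[trim:len(ordered) - trim]
    return statistics.mean(kept)


def benchmark(workload, passes=7, trim=1):
    """Run `workload` several times; return (first-pass time, trimmed mean)."""
    timings = []
    for _ in range(passes):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    # The first pass is reported on its own because any one-time cost
    # (such as cached translation) would land there.
    return timings[0], trimmed_mean(timings, trim)


if __name__ == "__main__":
    # Hypothetical stand-in workload; swap in whatever you want to measure.
    cold, typical = benchmark(lambda: sum(i * i for i in range(200_000)))
    print(f"first pass: {cold:.4f}s, trimmed mean: {typical:.4f}s")
```

If the first-pass time is much larger than the trimmed mean, a one-time startup cost is being measured; if the two are close, re-running would not change the picture much.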
 
What great battery life indeed! All it took was removing ALL THE PERFORMANCE in any meaningful workload. We don't need a glorified Windows "web tablet", we've already got Android for that. All iPad users will continue to be iPad users, and the world will keep on spinning.

On the bright side it's not as gimped as Windows RT was, but that's not saying much. Perhaps the third attempt will be the... chARM.
 
I might have liked it if it had more native apps. But as it stands it's just a half-assed product that is neither a good tablet nor a good laptop.
 
Looks like an okay start. Nobody is going to run content creation on these any more than they run it on an Atom derivative. If more expected common operations like decompression and Excel work better than on the low end Intel, that matters more.

I wonder how much of that is optimising for specific cases, and if we'll see improved emulation performance in the future.

Still, no matter how you look at it, that's disappointing performance for the price. It would be acceptable for a much lower price.
 
Well, I was taking into consideration the target market for this device, but it's a big shame that you can't even run full MS Office on it.
Full Office would be an advantage for many, but without it things get complicated.
 
I never understood why people try to wedge an OS engineered for a desktop onto a mobile framework, or vice versa. Aside from saying "just to see if we could," the results have almost always been the same as this. Unless it's about money. Oh wait, never mind, I retract the question. ;)
 
The whole concept of emulating x86 Windows on ARM is utterly pointless if it's not 100% compatible with 32-bit, 64-bit and all the OpenGL/DirectX/etc. needed to run every application and game.

I'm not going to buy a computer on which your applications being compatible depends on the whim of some developers at Microsoft.
 
The Spectre and Meltdown patches increase the efficiency of x86.

x86 CPUs may then be slower but more efficient.

If you further slow down the clocks on a core i3 to match the "Average" productivity of the Qualcomm chip, which one do you think would then have the best battery life once Intel switches to a 10nm process?

Even if you get 30% better battery life with the 835, the Intel CPU would provide 80-90% better usability (compatibility)

Is slightly better battery life worth all the misery of using one of these abominations?
Add your thoughts......
 
Interesting but we all really know that CPU will always be best in a server, workstation, desktop, laptop or netbook. Anything else is just not going to cut it.

MOBO = Motherboard vs PCB = printed circuit board. That's basically what we are really talking about here. The power of the motherboard vs printed circuit-board low power.

CPU 32-bit vs CPU 32/64-bit. Again, you can't compare the two, as we all know what the final result is. Software has to support both 32-bit and 64-bit to take advantage of the extra bus speed or pipes.

ARM SoCs are just not going to be the same as a full-blown CPU or APU. For my money the keyword is power, more RAM, and that's it. I really don't need a high-end quad, 8 cores or better on a cell phone. Of course today the cell phone is not only a phone; it's a PDA, it's a camera, it's a small handheld computer, and it can also talk back to you as well.
 
Remember this video?
It's a promotional video from Microsoft, showing how well the Snapdragon runs x86 apps, including footage of Photoshop running a blur filter in under 1 second. The benchmarks run here seem to paint a different picture: the blur filter runs 23 times slower than on an i7. Do you think that Microsoft edited their video to make it look better?
 
Well... for now these have to be available in the sub-$200 range, and once more basic ARM builds hit, maybe even up to $250. But $500 and sometimes even $1k? That's just insane.
 
I've been saying this for ages. Running emulated x86 code on an ARM with native performance comparable to a Celeron was never going to be a good experience.

And in fact, even though a Snapdragon 835 has pretty decent SIMD performance natively, Intel have threatened to sue if it runs SSE instructions via emulation.

https://arstechnica.com/information...t-claims-x86-emulation-is-a-patent-minefield/

So you've got a low-power chip with performance comparable to a Celeron, running emulated code with an amazingly terrible performance penalty, and the emulated x86 can't support SSE, which is pretty much the only way to get halfway decent multimedia performance on a low-end chip.

So unless you use Edge and UWP apps the performance will be awful. Of course this probably suits Microsoft pretty well. Windows RT machines failed because they couldn't run any non Microsoft Win32 or Win64 apps. This machine can run x86 Win32 applications but very slowly or ARM Win64 apps which no one will bother to build. Which pushes users to Edge and UWP and away from Chrome and Win32/Win64. Which of course is what Microsoft wants to do.

Of course like RT devices I think this will fail. No one is going to pay $500+ for one of these Windows on ARM devices when they can get a pretty decent i5 machine for about the same price or less with much more performance. Hell if you want battery life and don't care about performance there are loads of Celeron machines available for incredibly low prices.
 
It's nice to see a mainstream commercial OS trying to function on a non-x86 platform. x86 is fundamentally unchanged since the 1980s. The advancements are in additions to the instruction set and extensions to the memory addressing that become more complex as we go, all in the name of backward compatibility. That I can still load a 25 year old (or older) operating system on a modern computer is ridiculous.

Although this might be a poor result, at least someone is thinking about new architecture in terms of commercial computing. The only powerful new tech we have is in mainframes (Itanium, Z-Processors, etc...) and everyone else is stuck using a nearly 40 year old base architecture on to which we keep gluing new stuff and pretending that makes the whole thing new.

Emulation and migration will always be an issue if we want to truly advance our technology. The difference is that we used to do this all of the time. In the '80s I easily had 4 different architectures at home and regularly used about twice that. The difference is that nobody released a new machine without significant native applications. Hopefully we'll see this as a move towards opening the door to real hardware processing innovation.

Considering the price point I'm wondering how this machine behaves under a non-Windows OS. We have a number of 'nix kernels compiled for ARM that might make this a beautiful machine for someone wanting a portable workstation.
 
Waaait! Did you re-run the benchmarks a couple of times when doing emulation? The use case is that emulation is translated once to arm instructions and then saved for faster subsequent runs. It looks to me like you benchmarked the translation process, which is only done once. The actual speed on repeated re-runs should be faster.
But... normally you wouldn't run the same thing twice just to see it go faster...
 
It's nice to see a mainstream commercial OS trying to function on a non-x86 platform. x86 is fundamentally unchanged since the 1980s. The advancements are in additions to the instruction set and extensions to the memory addressing that become more complex as we go, all in the name of backward compatibility. That I can still load a 25 year old (or older) operating system on a modern computer is ridiculous.

Back in the '80s, RISC chips seemed to have a performance advantage over x86. That changed when NexGen worked out how to decode x86 instructions into RISC-like uops and execute those in a RISC-like pipeline. Then came superscalar and out-of-order execution. At that point RISC ceased to have an advantage.
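
To make the decode idea concrete, here is a toy Python sketch of splitting a register-memory instruction into load/ALU/store steps. It is purely illustrative: the `Uop` type, the `decode` function, the temporary register names and the instruction syntax are all invented for the example and do not describe how NexGen or any real decoder is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    op: str      # "load", "store", or an ALU operation such as "add"
    dst: str     # destination register, or address register for a store
    src: tuple   # source operands


def decode(instr):
    """Split a toy 'op dst, src' instruction into RISC-like uops."""
    mnemonic, operands = instr.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    uops = []
    if src.startswith("["):
        # Memory source: load it into a temporary first.
        uops.append(Uop("load", "tmp0", (src.strip("[]"),)))
        src = "tmp0"
    if dst.startswith("["):
        # Memory destination: read, modify, write back.
        addr = dst.strip("[]")
        uops.append(Uop("load", "tmp1", (addr,)))
        uops.append(Uop(mnemonic, "tmp1", ("tmp1", src)))
        uops.append(Uop("store", addr, ("tmp1",)))
    else:
        # Register destination: a single ALU uop is enough.
        uops.append(Uop(mnemonic, dst, (dst, src)))
    return uops


if __name__ == "__main__":
    for text in ("add rax, rbx", "add [rbx], rax"):
        print(text, "->", [u.op for u in decode(text)])
```

The register-only form stays a single operation, while the memory-destination form becomes three simple steps that a RISC-style pipeline can schedule independently.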

It makes sense really. Look at a typical die photo - it is mostly cache memory. If 90% of your die is cache then it's hard to see how the overhead of decoding x86 instructions is all that significant.

And x64 gets rid of segment limits for CS and DS. Even FS and GS have a segment base but not a limit. So you can execute those instructions with just an additional adder. Or an additional uop.
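
A rough Python sketch of that point, assuming the simplified model described above (zero base for CS and DS, a programmable base for FS and GS, no limit check); the function name `linear_address` and its parameters are made up for illustration:

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits


def linear_address(segment, offset, fs_base=0, gs_base=0):
    """Return base + offset for a simplified 64-bit mode address calculation."""
    # Only FS and GS contribute a base in this model; everything else is flat.
    base = {"fs": fs_base, "gs": gs_base}.get(segment.lower(), 0)
    # No limit check: the whole translation is a single addition.
    return (base + offset) & MASK64


if __name__ == "__main__":
    print(hex(linear_address("ds", 0x1000)))
    print(hex(linear_address("fs", 0x28, fs_base=0x7F00_0000_0000)))
```

The whole segment step collapses to one add, which is why it costs so little in hardware.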

Although this might be a poor result, at least someone is thinking about new architecture in terms of commercial computing. The only powerful new tech we have is in mainframes (Itanium, Z-Processors, etc...) and everyone else is stuck using a nearly 40 year old base architecture on to which we keep gluing new stuff and pretending that makes the whole thing new.

Emulation and migration will always be an issue if we want to truly advance our technology. The difference is that we used to do this all of the time. In the '80s I easily had 4 different architectures at home and regularly used about twice that. The difference is that nobody released a new machine without significant native applications. Hopefully we'll see this as a move towards opening the door to real hardware processing innovation.

I've got a fair few architectures in my office: x86, ARM, MIPS (in a router), PIC, AVR (dev boards). And if you look at WD, for example, they're going to move from ARM to RISC-V to cut down on the IP fees they pay. I've worked on projects that use cores from even more obscure sources. VHDL and Verilog make it easy to develop new architectures. Broadcom have a completely novel architecture designed by Sophie Wilson, the original ARM ISA designer.

Considering the price point I'm wondering how this machine behaves under a non-Windows OS. We have a number of 'nix kernels compiled for ARM that might make this a beautiful machine for someone wanting a portable workstation.

In a non-Windows OS the performance of a Snapdragon 835 will be pretty comparable to a Celeron N3450.

There's an interesting paper here that shows that ISAs do not matter very much - an ARM implemented in a 5W power budget will have comparable performance to an x86 in a 5W power budget.

https://www.extremetech.com/extreme...-or-mips-intrinsically-more-power-efficient/2

Which makes sense, right? If a CPU die is 90% cache then it's hard to see how changing the other 10% is going to cause much to change. That definitely wasn't true back in the 80386 days or the 68000 days, when I remember reading that a SPARC in a gate array could outperform an 80386 in a highly optimised custom implementation, and an ARM1 with 25000 transistors would far outperform a 68000 (the 68000 was so named because it had 68000 transistors). Both the ARM and the 68000 had an elegant instruction set, but the ARM executed every instruction in a single cycle because it was pipelined. Pipelining was possible because of the way the ARM ISA was designed.

However once you start converting x86 instructions to uops with a bunch of hardwired logic that difference disappears. Sure that custom logic was a significant part of the die area back in the 586 days, but it's not now. And you can see that by looking at a die photo and seeing it's mostly cache.

Incidentally, one of the advantages x86 and x64 have over RISC is code density. So you can look at that instruction-to-uop decode logic as being a decompressor.

And x86/x64 has an advantage over ARM. Historically the fastest ARM has been about as powerful as the slowest x86/x64 chip. That may change in future but right now x86/x64 pretty much owns the notebook/desktop/server market and ARM owns the phone/tablet one. People aren't running the same code on those two platforms. A Snapdragon 835 would perform very well in a phone but would suck in a notebook as these results show.
 
Battery life is good because it can't do much of anything worthwhile with this processor.

Windows software is designed for Windows on the x86 platform. (Duh)
 