Testing a Chinese x86 CPU: A Deep Dive into Zen-based Hygon Dhyana Processors

by Dr. Ian Cutress & Wendell Wilson on February 27, 2020 9:00 AM EST

Posted in: CPUs, AMD, x86, Zen, Dhyana, Hygon, Sugon, China

133 Comments

Hygon CPUs: Chinese Crypto, Different Performance

The big overriding question is what exactly has changed in these processors compared to the standard Ryzen and EPYC CPUs. To say that they are rebadged processors, as some have suspected, is completely incorrect: the different cryptography engines described in the Linux kernel updates are enough to show that. But we also detected other differences.

By and large, as far as we could determine, the core layout is identical, with the same cache sizes, TLB sizes, and port allocations; there were no differences at this fundamental level. The CPU still offered 64 KB 4-way for the L1 instruction cache, 32 KB 8-way for the L1 data cache, 512 KB 8-way for the L2 cache, and 8 MB 16-way for the L3 cache, identical to the Zen 1 core. TLB entries are as follows:

L1D + L1I: 4K/2M/1G 64-entry
L2D: 4K 1536-entry 6-way, 2M 1536-entry 3-way, no 1G
L2I: 4K/2M 1024-entry 8-way, no 1G
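
As a rough aid for reading the list above, TLB reach is simply the number of entries multiplied by the page size. The sketch below works that out for the entry counts quoted; the helper name `tlb_reach_bytes` is ours, and it assumes every entry maps a distinct page of the stated size.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Memory covered when every TLB entry maps a distinct page."""
    return entries * page_size

# Entry counts and page sizes as quoted in the list above.
l1_4k = tlb_reach_bytes(64, 4 * KIB)      # L1 TLB with 4K pages
l2d_4k = tlb_reach_bytes(1536, 4 * KIB)   # L2 data TLB with 4K pages
l2d_2m = tlb_reach_bytes(1536, 2 * MIB)   # L2 data TLB with 2M pages
l2i_4k = tlb_reach_bytes(1024, 4 * KIB)   # L2 instruction TLB with 4K pages

print(f"L1  4K reach: {l1_4k // KIB} KiB")
print(f"L2D 4K reach: {l2d_4k // MIB} MiB")
print(f"L2D 2M reach: {l2d_2m // GIB} GiB")
print(f"L2I 4K reach: {l2i_4k // MIB} MiB")
```

With 4K pages the L2 data TLB covers 6 MiB, which is smaller than the 8 MB L3 quoted above.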

Memory access times are 4 cycles for L1, 12 cycles for L2, and 37-40 cycles for L3. Memory latency was measured at 284-307 cycles.

L1 read speeds were measured around 32 bytes per clock (805 GB/s total, ~100 GB/s per core), while write speeds were measured around 16 bytes per clock (408 GB/s total, ~51 GB/s per core). DDR4 Memory speed for the 8-core gave 38.5 GB/s for reads and 35.8 GB/s for writes.
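
The per-clock and GB/s figures above can be cross-checked against each other: dividing per-core bandwidth by bytes per clock gives the clock speed those numbers imply. This is only a sanity check, assuming 1 GB = 10^9 bytes and that the totals come from the 8-core Dhyana (805 GB/s total against ~100 GB/s per core suggests eight cores); the function name is ours.

```python
def implied_clock_ghz(gb_per_s_per_core: float, bytes_per_clock: float) -> float:
    """Clock in GHz needed to sustain the bandwidth, taking 1 GB as 1e9 bytes."""
    return gb_per_s_per_core / bytes_per_clock

CORES = 8  # assumption: the totals were measured on the 8-core Dhyana

read_clock = implied_clock_ghz(805 / CORES, 32)   # L1 reads, 32 bytes per clock
write_clock = implied_clock_ghz(408 / CORES, 16)  # L1 writes, 16 bytes per clock

print(f"implied clock from reads:  {read_clock:.2f} GHz")
print(f"implied clock from writes: {write_clock:.2f} GHz")
```

Both values land between 3.1 and 3.2 GHz, so the read and write figures are at least consistent with one another.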

Cryptography Changes

For the cryptography changes, these are detailed in the Linux kernel updates. The updates revolve around AMD’s Secure Encrypted Virtualization feature, or SEV. Normally with an EPYC processor, SEV relies on the cryptography algorithms defined by AMD, in this case RSA, ECDSA, ECDH, SHA, and AES, to generate the right keys. However, in the Hygon Dhyana designs, SEV is built to use algorithms known as SM2, SM3, and SM4.

As stated in the updates, SM2 is based on elliptic curve cryptography, and requires additional private/public key exchange. SM3 is a hashing algorithm, similar to SHA-256, and SM4 is a block cipher algorithm, similar to AES-128. Additional commands are placed into the Linux kernel in order to support the extra functions these algorithms need. In the notes it states that these algorithms were successfully tested on Hygon Dhyana Plus (presumably the big CPU) processors but they were also successfully tested on AMD’s EPYC CPUs.
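
To make the SM3 and SHA-256 comparison concrete, the sketch below hashes the same message with both and compares digest sizes. It uses Python's standard `hashlib`; SM3 is only available when the underlying OpenSSL build provides it, so the code falls back gracefully. This is a software illustration only and says nothing about how the SEV firmware implements these algorithms.

```python
import hashlib

message = b"Hygon Dhyana"

sha256_digest = hashlib.sha256(message).digest()
print("SHA-256 digest bits:", len(sha256_digest) * 8)

# hashlib.new raises ValueError when the OpenSSL build has no SM3 support.
try:
    sm3_digest = hashlib.new("sm3", message).digest()
    print("SM3 digest bits:    ", len(sm3_digest) * 8)
except ValueError:
    sm3_digest = None
    print("SM3 is not available in this Python/OpenSSL build")
```

Where SM3 is available, both digests are 256 bits long, which is the sense in which the two hashes are comparable.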

Slowing Down Some Instructions

The biggest update to the design we were able to determine is in the instruction throughput. We don’t think that this difference between Dhyana Plus and EPYC has been mentioned before, and we did extra checks to make sure our software was displaying the right data, but put simply some instructions have been purposefully made slower. This has some rather serious implications, especially depending on when it occurred in the pipeline.

What we think is the case is that in order for AMD to export its SoC design, it had to also share microcode relating to how the CPU interprets instructions, and it was told to slow down certain key instructions (or disable them altogether) in order to make the arrangement with the joint venture and China work.

In our testing, we found that while integer performance is similar between Hygon and EPYC, certain floating point instructions, namely DIV and SQRT, are not pipelined in the Hygon CPU. This means throughput drops and latency gets worse. A lot of simple MMX/SSE instructions also have reduced throughput:

Instruction Throughput Differences
AnandTech	EPYC Naples	Hygon Dhyana
ADD/SUB	2 per clock	1 per clock
CMP/MULP*	2 per clock	1 per clock
ADDSUBP*	2 per clock	1 per clock
RCP/RSQRT	1 per clock	0.5 per clock
BLENDW	3 per clock	2 per clock
PMIN/MAX*	3 per clock	2 per clock
PAND/ANDN/OR/XOR	4 per clock	2 per clock
MOVs	4 per clock	2 per clock

All of these instructions are pretty important for even basic tasks. With fewer of them able to execute per clock, these CPUs cannot run parallelizable code as fast, ultimately decreasing performance.
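
To put the table in relative terms, the short sketch below divides the Hygon figure by the EPYC figure for each row. The numbers are copied from the table; the dictionary and function names are ours.

```python
# (EPYC Naples, Hygon Dhyana) in instructions per clock, copied from the table.
THROUGHPUT = {
    "ADD/SUB": (2, 1),
    "CMP/MULP*": (2, 1),
    "ADDSUBP*": (2, 1),
    "RCP/RSQRT": (1, 0.5),
    "BLENDW": (3, 2),
    "PMIN/MAX*": (3, 2),
    "PAND/ANDN/OR/XOR": (4, 2),
    "MOVs": (4, 2),
}

def relative_throughput(table):
    """Hygon throughput as a fraction of EPYC throughput, per instruction group."""
    return {name: hygon / epyc for name, (epyc, hygon) in table.items()}

ratios = relative_throughput(THROUGHPUT)
for name, ratio in ratios.items():
    print(f"{name:18s} {ratio:.0%} of EPYC")
```

Six of the eight rows are halved, while BLENDW and PMIN/MAX drop to two thirds of the EPYC rate.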

Perhaps the biggest change however was one that even differed between the server ‘Dhyana Plus’ processor and the consumer ‘Dhyana’ version. Random number generation, which is a key backbone in a lot of stochastic and financial processes, is severely reduced on the Dhyana Plus. The key instructions, RDRAND and RDSEED, have various reasons for being slow/fast. Here’s the comparison:

Instruction Latency Differences
AnandTech	Zen 1 Desktop	Hygon Dhyana	Hygon Dhyana Plus
RDRAND
16-bit	1200 clocks	1100 clocks	800 clocks
32-bit	1200 clocks	1100 clocks	800 clocks
64-bit	2365 clocks	2125 clocks	1520 clocks
RDSEED
16-bit	1200 clocks	1100 clocks	12000 clocks
32-bit	1200 clocks	1100 clocks	12000 clocks
64-bit	2365 clocks	2125 clocks	27100 clocks

That’s quite a difference, especially in RDSEED. We saw that RDSEED, the seed generation instruction used to initialize random number algorithms, is over 10x slower on the server chip, while RDRAND, used for actually generating hardware-based random numbers, is faster than standard Ryzen, more so on the server chip. Interestingly enough, the same RDSEED delays seen on the server chips are also seen on Ryzen Mobile and Ryzen Desktop APUs.
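
The size of the gap is easier to see as a ratio. The sketch below divides the Dhyana Plus latency by the Zen 1 desktop latency for each row of the table; the values are copied from the table, and the names are ours.

```python
# Latency in clocks: (Zen 1 Desktop, Hygon Dhyana, Hygon Dhyana Plus), from the table.
LATENCY_CLOCKS = {
    ("RDRAND", 16): (1200, 1100, 800),
    ("RDRAND", 32): (1200, 1100, 800),
    ("RDRAND", 64): (2365, 2125, 1520),
    ("RDSEED", 16): (1200, 1100, 12000),
    ("RDSEED", 32): (1200, 1100, 12000),
    ("RDSEED", 64): (2365, 2125, 27100),
}

def plus_vs_zen1(table):
    """Dhyana Plus latency divided by Zen 1 desktop latency (above 1 means slower)."""
    return {key: plus / zen1 for key, (zen1, _dhyana, plus) in table.items()}

latency_ratios = plus_vs_zen1(LATENCY_CLOCKS)
for (instruction, width), ratio in latency_ratios.items():
    print(f"{instruction} {width}-bit: {ratio:.2f}x the Zen 1 latency")
```

On these numbers RDSEED on Dhyana Plus takes between 10x and roughly 11.5x as long as on Zen 1 desktop, while RDRAND takes about two thirds as long.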

For RDRAND, having a faster random number generator can be indicative of two things: either it is actually faster, or the random algorithm has a lower periodicity, i.e. the point at which an algorithm wraps on itself. The best pseudo-RNGs have the largest periodicity, so the fact that RDRAND is fast here leads us to conclude that the periodicity is low, leading to lower quality random numbers.
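
For readers unfamiliar with the term, periodicity can be shown with a toy generator. The sketch below uses a tiny linear congruential generator, which is purely our illustration and has nothing to do with how RDRAND is built; it only shows how two generators of the same form can wrap around after different numbers of outputs.

```python
def lcg(a: int, c: int, m: int):
    """Return the step function x -> (a*x + c) mod m of a toy generator."""
    def step(x: int) -> int:
        return (a * x + c) % m
    return step

def period(step, seed: int) -> int:
    """Length of the cycle the generator eventually falls into from this seed."""
    seen = {}
    state, index = seed, 0
    while state not in seen:
        seen[state] = index
        state = step(state)
        index += 1
    return index - seen[state]

full_period = period(lcg(5, 3, 16), seed=1)   # visits all 16 states
short_period = period(lcg(3, 3, 16), seed=1)  # wraps after 8 states

print(full_period, short_period)  # prints "16 8"
```

The second generator repeats twice as often as the first, which is the kind of quality loss the paragraph above is worried about.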

For RDSEED, the fact that this is 10x slower is a little different. RDSEED is meant to take information from the various sensors on board and output a random value to initialize the RDRAND – it should only get called once per periodicity. A slower RDSEED either means it’s taking data from more sources (a good thing), or it’s being slowed down on purpose (a bad thing).

In actual fact, RDRAND and RDSEED can be enabled/disabled in the BIOS of our Dhyana Plus system.

It’s amusing that this menu is called ‘Moksha Common Options’, Moksha being a word commonly associated with ‘enlightenment’ or ‘release’. This is either clever word play, or someone digging out an old Chinese-to-English dictionary and translating without context.

When it comes to AVX and AVX2 performance, even though the CPUs were able to identify themselves as having AVX and AVX2 support, trying to actually measure these instructions failed – in our instruction dumps, they were listed as ‘supported, disabled’. When it comes to supported features, Zen 1 typically lists AESNI, SHA, CLMUL, FMA4, BMI and BMI2 as supported instructions - none of these instructions are supported on the Hygon CPUs.
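
One simple way to see which features a CPU advertises is to read the flags the Linux kernel reports in `/proc/cpuinfo`. The sketch below is not the tool we used for our instruction dumps; it is a minimal example, and the sample text in it is hypothetical rather than output from a Hygon system. Note that a reported flag only says the feature is advertised, not that the instruction runs at a useful speed.

```python
# Linux flag names for the features discussed above.
WANTED = ["avx", "avx2", "aes", "sha_ni", "pclmulqdq", "bmi1", "bmi2", "rdrand", "rdseed"]

def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

def report(flags: set) -> dict:
    """Map each feature of interest to whether it is advertised."""
    return {name: name in flags for name in WANTED}

# Hypothetical sample text, for illustration only.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 rdrand rdseed\n"

sample_report = report(parse_cpu_flags(SAMPLE))
for name, present in sample_report.items():
    print(f"{name:10s} {'advertised' if present else 'missing'}")
```

On a Linux machine, passing the contents of `/proc/cpuinfo` to `parse_cpu_flags` instead of `SAMPLE` gives the real list.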

For things like AES, we have a direct benchmark for these, and the fact that these CPUs do not support AES means that we get a tanking in performance:

AES Encoding

It should also be noted that the typical methods for finding the power consumption on AMD CPUs by probing registers also failed here. These seem to be removed from the CPU altogether.

133 Comments

Lord of the Bored - Friday, February 28, 2020 - link
It is still pretty dang clear: AMD Zen core goes in, China Zen core comes out. I don't think anyone didn't realize what AMD was doing, particularly as they crippled the processor.

The convoluted part-ownership scheme is because AMD can't sublicense their x86 license due to an ancient settlement with Intel. So they need to maintain >half ownership to keep Intel from suing them.
sing_electric - Friday, February 28, 2020 - link
I know people that review these deals. They are smart and they aren't naive. That doesn't make them infallible, of course, but they are competent and take their job seriously.

It seems to me that you've got it backwards: This ownership structure was set up *specifically* to meet Chinese requirements for "made in China" certification (which the central government is really pushing pretty hard) while actually making as little in China as possible - even shipping dies to China for packaging.
Mikewind Dale - Thursday, February 27, 2020 - link
Also, even if AMD *did* give IP to China, it's AMD's IP to give. It's not stealing if the owner gives it away. And it's not a betrayal of the USA, since the USA is supposed to be a free-market country that protects individual rights and liberties. If anyone is betraying the USA, it's the US federal government, since the US federal government would be violating Americans' rights and liberties by prohibiting voluntary market transactions (such as exports).
sonny73n - Friday, February 28, 2020 - link
“ Lisa Su committed treason by exporting high tech IP to an enemy country“

I don’t understand why the US has so many sick psychopaths who see everyone as enemy. I wonder if they also see you as an enemy, do you think you would still be here spouting hate right now?
TheinsanegamerN - Friday, February 28, 2020 - link
The chinese see themselves as above all other cultures, and are not only extremely racist towards those who are not chinese, but also employ methods such as concentration camps and "reeducation" facilities against those of certain religious and ethnic backgrounds.

but sure, prattle on about how hateful the US is.
PeachNCream - Friday, February 28, 2020 - link
Don't assign the actions of the Chinese government to the people living in China in a broader sense. That is unfair to average people that are, just by nature of birth and physical location, part of the population you are giving the blanket title of racist. It's just as bad as calling everyone from Alabama a pickup truck owning, gun-brandishing redneck when we know quite well that there are reasonable, decent people living there that have to spend time contending with that label on a daily basis.
yannigr2 - Friday, February 28, 2020 - link
Trying to find the wrongs to others, is just an excuse to hide your wrongs.

Look, they (Chinese) are worse than us (Americans).

Don't expect much support with that kind of logic. The lesser evil is not good.
sonny73n - Saturday, February 29, 2020 - link
“ The chinese see themselves as above all other cultures, and are not only extremely racist towards those who are not chinese, ”

The first letter of any national should be capital. You’re a disrespectful person.
China has thousands of years old culture. If I’m a Chinese, I would be proud of it. And there’s nothing wrong if they see themselves above other cultures because it’s the fact.
They might be racists but who isn’t? If you’re talking about extreme racist, you should talk about Americans.

“ employ methods such as concentration camps and "reeducation" facilities against those of certain religious and ethnic backgrounds.”

The US don’t need re-education facilities. MSM have been feeding BS to you since the day you were born. And it seems you did very good gobbling down all the BS. Now you’re only spouting fouls since you’re incapable of having any critical thinking.

You’re trying to say that you like most of the psychopaths in the US aren’t hateful but your comment shows that you are. Accusation without proof is the worst form of hatred you can give. Or maybe you’re just too stupid and lazy to find out the truth. So it’s much easier to repeat what CNN says.
Lord of the Bored - Saturday, February 29, 2020 - link
Comrade sonny73n, you say "accusation without proof is the worst form of hatred" immediately after calling all americans psychopaths without proof. You undermine your own argument, and the party expects better of you.
s.yu - Sunday, March 1, 2020 - link
>And there’s nothing wrong if they see themselves above other cultures because it’s the fact.
Wow, I can't imagine anybody other than a modern Red Guard(they're popping up here and there under Xi in case some people haven't noticed) saying this.
These people hail the Party's words as gospel, blatantly disregarding the massive rift between the Party's words and actions, made possible only by the sore lack of accountability, in turn an inevitability stemming from the authoritarian and dictatorial nature of the regime. One proof of this is that in China you almost never win a lawsuit against the government, and you literally never win a lawsuit that leads to reforms in legislation, regardless of how the government tramples over its own laws to infringe on your rights. "Legal order" in China largely exists as excuses to prosecute the insubordinate. There's even statistics on the citation of legal code during lawsuits in China, and a conclusion was that ~90% of clauses have almost never been cited in practice, because they're effectively inapplicable.
>If I’m a Chinese, I would be proud of it.
This isn't outright denying that he's Chinese though.¯\_(ツ)_/¯

March 1st is the first day of a new round of crackdowns on flow of information in China, just more excuses to frame you with.
https://translate.google.com/translate?hl=en&s...
Yesterday though I personally witnessed a new mindfuck trick of the Party before my own eyes: if you try to upload an image to Wechat chat groups, it will be OCR'ed simultaneously, and if for whatever reason that meets whatever censorship standard your image is determined to be "sensitive", then it will meet one of at least four fates:
1. Fake upload: You thought you uploaded it, Wechat tells you that it's successfully sent, yet nobody in the group but you could see it.
2. Repost ban: The long press option of repost would be removed from an image.
3. Fake repost: People in the group you uploaded to could see it, but if you repost it to another group(which does not require another upload), it becomes invisible to people in that second group.
4. This is the good one: Within the same chat group, an image uploaded is only visible to certain people, and hidden from others!!
I'd known the first two for quite a while, and discovered the last two yesterday, but the last one really throws a wrench into any serious discussion, especially regarding content that's wiped out from within the wall in the first place.
I'm 100% certain of what I saw because regarding the fake repost issue, I was able to upload, and successfully repost an image(screenshot of a banned article) with a heavy gaussian blur, but the original was only visible to the initial group. Regarding the last issue, a guy uploaded three images, while I only saw two, and he tried uploading the missing one again, but it was still invisible to me, but all three were visible to the uploader himself and another guy in the chat group, so it's no glitch, it's intentional.
I don't believe this was related to the March 1st regulations, it was most likely in place before that. The exact impact of this new legislation remains to be seen.
