Ethereum update: Let’s talk GPU differences between Nvidia and AMD, and ProgPoW. Why does AMD suffer under ProgPoW? Is Nvidia the king?
ProgPoW is being hotly debated, so I think we all need to sit down with some coffee (or your favorite beverage) and take a look at the differences in GPU architecture between Nvidia and AMD. This will focus purely on GPUs. For other questions on ProgPoW and Ethereum, see my write-up [HERE](https://www.reddit.com/r/gpumining/comments/adv764/what_is_progpow_why_ethereum_needs_it_moving/)
*WARNING: to explain everything properly, this will be long!*
**What am I covering?**
Why was the Polaris (RX 400/500) series so good at Ethash?
* A breakdown of memory controllers
* DAG doom
AMD/Nvidia power consumption:
* TDP breakdown
* Gaming power levels
Why the power difference between the two algorithms?
First up: the GPU reviews that I’ll be quoting throughout, worth a read-through.
**Why did the RX 400/500 stomp the competition in Ethash?**
First, some quick info on Ethash. It leans heavily on memory because of the DAG (directed acyclic graph). The DAG file sits directly in your GPU’s memory. The Ethereum DAG is a key component of the proof-of-work algorithm and is regenerated each epoch, every 30,000 blocks, slowly increasing the amount of memory required to run Ethash PoW.
AMD has always kept its mid-range on a 256-bit bus, and with Polaris paired it with 7–8 Gbps GDDR5 memory:
>RX 480’s VRAM…Once again common for mainstream AMD cards, AMD has stuck with a 256-bit GDDR5 memory bus here. Attached to this bus is either 4GB or 8GB of VRAM…Officially, 7Gbps GDDR5 is the minimum speed for both RX 480 capacities… However for their 8GB reference card, AMD has opted to ship the card with faster 8Gbps memory in order to further boost performance.
This, along with BIOS modifications that gave access to tighter memory timings, allowed them to perfectly match Ethash. As Kristy/OhGodAGirl says:
>An efficient algorithm for hardware needs to match the access patterns and available space of that hardware. This is why AMD GPUs with firmware edits saw large performance gains on Ethereum — because the access patterns of memory chips were matched to the access patterns of Ethash.
Nvidia’s GDDR5X cards, the 9 Gbps GDDR5 GTX 1060, and GTX 1060s with Hynix memory gave terrible hashrates (until the ETH-PILL for the 1080/1080 Ti) because they did not match the access patterns of Ethash. AnandTech states on Nvidia’s GDDR5X implementation:
>…new GDDR5X memory controllers are also backwards compatible with traditional GDDR5, which in turn is used to drive the GTX 1070 with its 8Gbps GDDR5. The difference in operation between GDDR5 and GDDR5X does make the ROP situation a bit trickier overall for NVIDIA’s architects – now they need to be able to handle two different memory access patterns.
The GTX 1060, the RX 580’s direct price/gaming competitor, only used a 192-bit GDDR5 bus, resulting in 192 GB/s of bandwidth. Nvidia used special compression techniques so this wasn’t noticeable in games, but it erased any Ethash performance advantage. The GTX 1070 performed on par with the RX 400/500 because of its similar GDDR5 speed and 256-bit bus.
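As a back-of-the-envelope sketch (my own math, not from the reviews): Ethash fetches a 128-byte DAG page 64 times per hash, so memory bandwidth alone puts a hard ceiling on hashrate, and those ceilings line up neatly with the real-world numbers you’ll see below.

```python
# Rough upper bound on Ethash hashrate from memory bandwidth alone.
# Ethash does 64 DAG accesses of 128 bytes per hash = 8 KiB read per hash,
# so a perfectly efficient miner is capped at bandwidth / 8 KiB.

BYTES_PER_HASH = 64 * 128  # 8192 bytes per hash

def ethash_ceiling_mhs(bandwidth_gbs: float) -> float:
    """Theoretical Ethash ceiling in MH/s for a given bandwidth in GB/s."""
    return bandwidth_gbs * 1e9 / BYTES_PER_HASH / 1e6

# GTX 1060: 192-bit bus @ 8 Gbps -> 192 GB/s
print(ethash_ceiling_mhs(192))  # 23.4375 -> matches the ~20-24 MH/s 1060s really got
# RX 480/580: 256-bit bus @ 8 Gbps -> 256 GB/s
print(ethash_ceiling_mhs(256))  # 31.25 -> matches a bios-modded RX 480/580
```

Real cards land a bit under the ceiling depending on timings, which is exactly why the BIOS timing mods mattered so much.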
*The AMD ticking clock of doom (that was avoided)*
AMD GPUs suffer from what is known as DAG-thrashing. This happens when the DAG size starts reaching higher memory limits. The R9 280/380 (2/4GB) and R9 270/370 cards were the very first affected, when the DAG hit the 2GB mark. Genoil (creator of Genoil’s Miner) stated that since the DAG takes 100% of the first memory bank (2GB), once it grows beyond 2GB it spills into the 2nd memory bank. The GPU then has to access two banks instead of one, leading to a significant hashrate drop. This is because AMD’s GPU designs access memory in 2GB banks, 256 bits at a time (on 2GB/4GB/8GB cards). For example, to read a 2.5GB DAG, the controller works through the first 2GB bank in 256-bit chunks, then has to switch to the next 2GB bank for the rest. The RX 500/400s were set to suffer in 2017 until AMD released a special update, the “[Blockchain drivers](https://www.legitreviews.com/amds-new-mining-block-chain-optimized-driver-tested_197095)”, that fixed the issue. This fix did not apply to older AMD GPUs such as the 200/300 series.
The Nvidia GTX 1060/1070 did not suffer from this issue. I cannot find an exact reason, but I surmise it has to do with differences in how the memory controllers read memory in the two architectures.
>NVIDIA has reorganized the memory controllers to ensure that each memory controller still operates on the same amount of data. With GDDR5 they teamed up two GDDR5 channels to get 64B operations, whereas with GDDR5X this can be accomplished with a single memory channel.
We are now beyond epoch 199 (currently 234), and even R9 290/390s are suffering from DAG-thrashing, with users reporting about 27 MH/s where they formerly achieved around 30–32 MH/s.
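To put the timeline in numbers, here’s a hedged sketch of DAG growth. The init/growth constants are from the Ethash spec; the real spec trims each epoch’s size down to a prime number of 128-byte pages, so this slightly overestimates.

```python
# Approximate Ethash DAG size per epoch: ~1 GiB at epoch 0, growing
# by ~8 MiB per epoch (slight overestimate vs. the prime-trimmed spec).

DATASET_BYTES_INIT = 2**30    # ~1 GiB at epoch 0
DATASET_BYTES_GROWTH = 2**23  # ~8 MiB added per epoch

def approx_dag_gib(epoch: int) -> float:
    """Approximate DAG size in GiB at a given epoch."""
    return (DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch) / 2**30

# First epoch where the DAG spills past a 2 GiB memory bank:
first_spill = next(e for e in range(1000) if approx_dag_gib(e) > 2.0)
print(first_spill)                    # 129
print(approx_dag_gib(234))            # ~2.83 GiB at the current epoch 234
```

That 2 GiB crossing at epoch 129 lands around mid-2017, roughly when AMD shipped the Blockchain drivers, which is no coincidence.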
(I’ve personally owned and run a GTX 1060, GTX 1070, and AMD RX 580 on Ethereum.)
The RX 400/500 performed on Ethash anywhere from 27–32 MH/s with BIOS modifications; without a BIOS mod, anywhere from 22–25 MH/s. On average the GTX 1060 managed about 20 MH/s, upwards of 24 MH/s with overclocking; Hynix versions performed significantly worse at 16–18 MH/s. The GTX 1070 achieved 26–34 MH/s without a BIOS mod. This put AMD’s $200/$230 class of mid-range card up against Nvidia’s high-end $400 class of card.
Without BIOS modification, AMD would have been seriously hampered here. It seems Nvidia’s memory timings already use access patterns similar to Ethash without needing a BIOS mod. In a world where both BIOSes were locked down, Nvidia would actually have been the Ethash king, not AMD.
**AMD/Nvidia power consumption**
It gets tricky here. AMD’s power readings from GPU-Z are *not accurate*, because the reading only takes into account parts of the GPU rather than the card as a whole, whereas Nvidia’s is measured for the whole card. How much power is saved comes down to how diligent the miner wants to be about reducing voltage on their GPUs. Additionally, GPUs are not created equal and vary from vendor to vendor, some better, some worse. I’ve personally experienced this with MSI vs. XFX, with MSI being better as a whole and XFX being hit and miss.
So for this I’m going off my personal experience of what I’ve seen. My RX 580s read about ~80W in GPU-Z, but the actual power draw was more like 120W, with voltages set to 850mV (core) / 900mV (memory). RX 480/470s achieved better power savings (you’ll see why later on), although I cannot attest to this since I never owned one. My Nvidia GTX 1070s drew about 90–100W at 713mV (using Nvidia Inspector). GTX 1060s drew 65–70W at 700mV.
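Putting my rough wall-power estimates and typical hashrates on one scale (these are my own estimates, not lab measurements), the watts-per-MH picture looks like this:

```python
# Watts per MH/s on Ethash, using my rough wall-power estimates and
# typical tuned hashrates from above (estimates, not lab measurements).

cards = {
    "RX 580 (bios mod)": (120, 30),  # (watts at the wall, MH/s)
    "GTX 1070":          (95, 30),
    "GTX 1060":          (68, 22),
}

for name, (watts, mhs) in cards.items():
    print(f"{name}: {watts / mhs:.2f} W per MH/s")
```

All three land in the same ~3–4 W/MH ballpark once undervolted, which is why price, not efficiency, was the deciding factor back then.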
As an aside, Linux users will see higher power usage compared to Windows (about 10% higher).
The Nvidia P104-100 was a card designed with Ethereum mining as its selling point. It was equipped with GDDR5X that matched Ethash access patterns, which allowed it to achieve 40 MH/s. However, it was not sold at retail everywhere, being mainly available in Asia, so I’m just going to treat this card as an outlier, a unicorn.
*Winners and Losers*
Because of how AMD designed the RX 400/500 series, this generation ruled Ethash mining with excellent price, performance, and low power consumption. Nvidia, on the other hand, took a different approach with the GTX 1060 that allowed it to compete in games and power consumption, but not in Ethash mining. The GTX 1070 was too expensive for most miners even though it was competitive in hashrate/power/performance, just not price. The DAG-thrashing doom almost killed the RX 400/500, but AMD stepped in and saved them.
**Claymore’s Genius, Dual-mining!**
Since Ethash only required small amounts of Keccak-f1600 and under-utilized the GPU core, Claymore took advantage of this and created the first ever dual miner. For a while it was very profitable to dual-mine; today, not at all. This was possible on AMD, and speeds were astonishing because AMD GPUs have excellent parallel processing. This is why they’ve always been the miner’s choice. Nvidia suffered here: the Pascal line could not dual-mine at similar speeds without losing significant hashrate. Turing seems to turn that around, but it’s a moot point now. The take-away is that an under-utilized core made this possible, at the cost of extra power consumption.
**The eternal battle, power efficiency.**
(Note all quotes were taken from Anandtech articles listed above.)
The main crux of ProgPoW, which hits AMD cards harder, is power. ProgPoW is set to increase power consumption on Ethereum across the board. But why? Why does AMD take such a huge hit compared to Nvidia? Is this favoritism? The short answer is no. To answer why, let’s explore what Polaris brings in terms of power efficiency.
The RX 470/480 were well known for their power savings. AMD took an excellent leap forward here to bring back better power consumption alongside performance. I personally applauded AMD back then; I still have nightmares of the R9 295X2 I used for gaming consuming 600W!
>…let’s talk about power consumption. As AMD has made clear over the last several months, one of the major goals of Polaris was power efficiency, and this is where we see some of the first payoffs from that decision. RX 480’s official Typical Board Power (TBP) is 150W, over 20% lower than the last-generation R9 380, and 45% lower than the otherwise performance-comparable R9 390
As time moved on, AMD upped the core clocks to fight Nvidia’s offerings competitively. Doing so increased power consumption.
>RX 580 is a 185W card, while RX 570 starts at 150W. This is a 30-35W increase in TBPs over the RX 400 series, and given the expected prevalence of factory overclocked cards, the TBP of the average retail SKU is probably a bit higher still.
AMD losing power efficiency as we advance?
>So as we noted at the RX 500 launch, instead of fighting an efficiency battle that they can’t win, AMD has opted for raw performance and competing on value.
With the just-released RX 590, reducing power consumption goes straight out the window.
>For all the gaming performance gains that the RX 590 has made, it came with the higher clockspeeds, and to bring those higher clockspeeds came more power. Already, TBPs have notably increased from the RX 480’s 150W to the RX 580’s 185W, and now to the RX 590’s 225W. Which is already past RX Vega 56’s 210W reference board power spec
AMD has now surely lost its 2016 vision of reducing power consumption, and we’re back to R9 290X levels of insane power draw once again.
AMD has, with the RX 590 and Vega, opted to forgo any power savings in favor of raw performance to compete with Nvidia.
>Naturally, the tradeoffs between power efficiency and raw performance is a classic problem of silicon engineering, and it’s been somewhat of a sticking point for Polaris since the beginning. Even though Polaris improved on efficiency over older GCN products, NVIDIA’s top-to-bottom Pascal series brought its own power and clockspeed improvements.
Nvidia, meanwhile, has brought excellent power savings to its Pascal offerings.
Speaking of power consumption and gaming performance, let’s do a quick dive into power draw while gaming. The assumption here is that maxing out the GPU cores means fully utilizing the GPU architecture, so the results of the power design show through.
A quick shoot over to [Gamers Nexus](https://youtu.be/mIpOx8YUXuU) and we see the RX 580’s power usage during games shows a significant disadvantage compared to the GTX 1060, let alone the 1070. Mind you, the GTX 1060 is the direct comparison to the RX 580. The GTX 1060 has a 120W TDP, the GTX 1070 a 150W TDP. This is based on official TDPs, not the AIB cards that sometimes go overboard.
AnandTech uses Crysis 3 to stress their GPUs for the power results I’ll quote below. It doesn’t get pretty for AMD.
>GTX 1060 holds a similarly impressive lead over AMD’s Radeon RX 480. Against the 8GB card, NVIDIA’s mainstream competitor draws 37W less for 14% better gaming performance. Since Maxwell NVIDIA has enjoyed a significant power efficiency advantage, and while AMD’s recent Polaris architecture has helped to close the gap, GTX 1060 proves that NVIDIA continues to execute well here
The GTX 1060 manages to use less power than the RX 480, AMD’s most efficient GPU, while keeping up with it in games. The RX 480 uses GTX 1070 levels of power, but not GTX 1070 levels of performance.
The RX 480 starts off well, but as we progress to the 580 we gain another 30W. Moving on to the RX 590 (they changed the test suite for power), you can still see the take-away, and it’s mind-blowing.
Going from an RX 480 to an RX 590 results in a 60W increase in power, let alone AIB partner models that push power even further. As you can see in both tests, the GTX 1070 uses similar power in Crysis 3/BF1.
**So what’s the take-away from this with ProgPoW?**
ProgPoW utilizes the GPU cores (or SMs) from both AMD and Nvidia as effectively as possible. However, the architectural differences determine the power each consumes. RX 480s are set to use the least power on the AMD side, though their ProgPoW performance will suffer; as we step up in performance, we drastically increase power as well. GTX 1060s look set to use the least power under ProgPoW while achieving RX 400/500-series performance.
It would be my assumption that an RX 590 could reach GTX 1070 levels of ProgPoW performance. AMD and Nvidia cards both achieved excellent power savings on Ethash because it never fully loaded the SMs (shader modules) on either GPU. When the SMs are loaded, we start reaching TDPs. I’m obviously not looking at power savings that could be achieved with tuning/undervolting, just a general overall look.
Take note that all GPUs will take a performance hit when ProgPoW is implemented; the theory is half the Ethash rate.
>The general expectation is that ProgPoW should have around half the hashrate of Ethash since it accesses twice as much memory per hash. This holds true for GPUs that utilize GDDR5 memory — the RX 580 and the GTX 1070.
[Why did she choose this](https://www.reddit.com/r/ethereum/comments/ag0bgu/opinion_asic_resistance_is_a_state_of_mind_not/ee3ez6o)? Because it was the “happy” medium between Nvidia’s and AMD’s architectures (utilizing v_mul_lo_u32). In layman’s terms, ProgPoW targets the GPU core using a reduced Keccak (f800) plus other random math, while doubling the memory accesses that Ethash used. Because ProgPoW exercises more of the GPU core than Ethash, we see a huge increase in power, similar to when gaming. By my estimate, RX 400/500-series power draw will increase by about 60% over Ethash. This obviously leaves a lot of unhappy RX 400/500 miners, because increased power means reduced profit, plus more spent on upgrading the electrical in their farms.
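To make that profit squeeze concrete, here’s a hedged sketch. The revenue rate and electricity price are made-up inputs, and the ProgPoW figures follow the half-hashrate expectation plus my ~60% power estimate; none of this is a prediction.

```python
# Hedged sketch of why a power increase hurts: compare an RX 580 on Ethash
# vs. ProgPoW (half the hashrate per the design goal, ~60% more wall power
# by my estimate). Revenue per MH and electricity price are made-up inputs.

def daily_profit(mhs: float, watts: float,
                 usd_per_mh_day: float = 0.05,
                 usd_per_kwh: float = 0.10) -> float:
    revenue = mhs * usd_per_mh_day              # what the pool pays per day
    electricity = watts / 1000 * 24 * usd_per_kwh  # what the wall costs per day
    return revenue - electricity

ethash = daily_profit(mhs=30, watts=120)         # tuned Ethash settings
progpow = daily_profit(mhs=15, watts=120 * 1.6)  # halved rate, +60% power

print(f"Ethash:  ${ethash:.2f}/day")
print(f"ProgPoW: ${progpow:.2f}/day")
```

With these toy inputs the daily margin drops hard, and that’s before factoring in any electrical upgrades to the farm itself.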
I relate ProgPoW to how Claymore achieved dual-mining. Although the underlying mechanisms differ, Claymore’s dual-mining saturated more of the GPU core, leading to an increase in power consumption. So if you’re wondering why AMD sees a power increase under ProgPoW, take a look at how Claymore achieved dual-mining on Ethash.
I want to specifically point out that the fault for such a power increase lies in AMD’s designs pushing power limits higher and higher. Nvidia, meanwhile, has driven power down while increasing performance: Maxwell, Pascal, and finally Turing have all increased performance without drastically increasing power consumption. *The design of ProgPoW does not aggressively target AMD and force higher power. It’s simply AMD’s design.*
**The king will be dethroned.**
This leaves RX 500/400 owners in a sticky spot. Most of the Ethereum network was built on AMD because of its excellent watt/hash/cost ratio. The switch to a more GPU-intensive algorithm leaves them with less powerful cards than they originally had for Ethereum. Additionally, ProgPoW takes away any BIOS-modification advantage they had in Ethash, leaving them at stock speeds.
GTX 1060s and RX 480/580s achieve similar ProgPoW speeds, around 9–10 MH/s; they’re the direct competitors. The wattage favors the GTX 1060, as I’ve shown, because the GTX 1060 is a 120W-TDP card while the RX 480 is 150W and the 580 is 185W. Remember, an RX 480 at $230 MSRP was achieving what Nvidia’s GTX 1070 at $400 MSRP was. That’s a hell of a feat, and I know why it was so loved by all miners.
Nvidia has recently been dominating GPU-intensive algorithms. If it wasn’t for the DAG savior, they probably would have gone on to dominate Ethereum as well. One only needs to look at X16R/Zhash/X22i/Equihash 150/5 to understand that on performance per watt, Nvidia’s architecture wins the day. Not that AMD’s cannot keep up; the problem lies in AMD’s design and power usage.
ProgPoW does what it was designed to do: use the full capacity of the GPU. If the rumors of Navi are true, then we may get our RX 480-alike for ProgPoW with excellent power consumption. With the Radeon VII we may get excellent hashrates, but at the expense of power; we won’t know for sure until it releases in February. If ProgPoW brings back decentralized GPU mining at the expense of some power, I’m still for it. It will be up to AMD miners to find ways to achieve a good watt/hash ratio.
***TL;DR: Differences in GPU architecture mean differences in hashrate/power. ProgPoW levels the playing field for both Nvidia and AMD.***