[OLD] IOTA PoW Hardware Accelerator FPGA for Altera DE1


This is the page for the obsolete Altera DE1 proof-of-concept. Please look here for the PiDiver.

Introduction

IOTA PoW needs a lot of computational power, which makes sending transactions on smaller processors (like ARM) very slow. One of the main reasons is that the innermost loop of Curl-P81 can't be computed very efficiently on general-purpose CPUs. Even modern CPUs with SIMD extensions (like SSE or AVX) are heavily restricted when it comes to truly parallel calculations.

This is a port of IOTA IRI's Pearl-Diver PoW computation to FPGAs. It speeds up Proof-of-Work significantly, by a factor of more than 140 compared to, e.g., a Raspberry Pi.

The core concept is that an FPGA is able to calculate one round of Curl-P81 in a single clock cycle and one complete hash in about 85 clock cycles (including the test for a valid nonce). The core is 5-fold parallel, which means that every 85 clock cycles 5 hashes are calculated in parallel; this gives about 12.87 MHash/s at a clock frequency of 220 MHz. Moreover, the degree of parallelism can easily be increased to make the core even faster on larger FPGAs.
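The throughput figure follows directly from the clock frequency, the number of parallel hashes, and the cycles per hash. The following is a minimal sketch of that arithmetic; the function name `hash_rate` is made up for illustration, and the 8-fold row assumes the same 220 MHz clock could still be reached, which is not guaranteed on a real device.

```python
def hash_rate(clock_hz: float, parallel: int, cycles_per_hash: float) -> float:
    """Hashes per second when `parallel` hashes complete every `cycles_per_hash` cycles."""
    return clock_hz * parallel / cycles_per_hash


if __name__ == "__main__":
    # 220 MHz and "about 85" cycles are the figures quoted in the text.
    for parallel in (1, 5, 8):
        rate = hash_rate(220e6, parallel, 85)
        print(f"{parallel}-fold: {rate / 1e6:.2f} MHash/s")
```

With exactly 85 cycles the 5-fold result comes out at roughly 12.94 MHash/s, slightly above the quoted 12.87 MHash/s, which is consistent with one hash taking "about" 85 cycles rather than exactly 85.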

For instance, finding the nonce of a single transaction takes about 90 s on a Raspberry Pi. With hardware acceleration by this core, the time drops to ~350 ms.
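The ~350 ms figure can be sanity-checked with a short calculation. This sketch assumes a minimum weight magnitude of 14 (a value not stated in this article), meaning a hash is valid when its last 14 trits are zero, so on average 3^14 hashes have to be tried. The helper names are illustrative only.

```python
def expected_attempts(min_weight_magnitude: int) -> int:
    """Average number of hashes to try: each trailing trit is zero with probability 1/3."""
    return 3 ** min_weight_magnitude


def expected_seconds(min_weight_magnitude: int, hashes_per_second: float) -> float:
    """Average search time at a given hash rate."""
    return expected_attempts(min_weight_magnitude) / hashes_per_second


if __name__ == "__main__":
    mwm = 14           # assumption, not taken from the article
    rate = 12.87e6     # hash rate quoted in the article
    print(f"expected attempts: {expected_attempts(mwm)}")
    print(f"expected time: {expected_seconds(mwm, rate):.2f} s")
```

Under that assumption the average comes out at about 0.37 s, in line with the ~350 ms quoted above. The search time is random, so individual runs can be noticeably shorter or longer than the average.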

This core can also be used by IRI, given a modified version of IRI that allows dcurl to be used as an external hashing library.

So it is possible to build a full node on a Raspberry Pi with decent hashing power for PoW calculations.

The project aims to be completely open source, including all source code, schematics, and layouts.

VHDL-Core

The IOTA PoW Pearl-Diver core was implemented in VHDL.

Apart from an Altera PLL, no additional cores or unusual VHDL libraries are used, which makes the design simple to port to other FPGA platforms.

Moreover, the core is customizable, so the 5-fold parallelization can be increased or reduced (currently up to 8, but that could be changed with little work) depending on the resources of the target FPGA.

The core implements a high-speed SPI interface which can be driven directly by the hardware SPI of a Raspberry Pi.

The electrical connections between the Altera DE1 and the Raspberry Pi:

[Screenshot: electrical connections between Altera DE1 and Raspberry Pi]

Here is the synthesis report for the EP2C20:

[Screenshot: synthesis report for the EP2C20]

Maximum clock frequency:

[Screenshot: maximum clock frequency report]

The first clock drives the I/O logic, such as SPI and the command decoder (running at 110 MHz in the design). The second clock drives the actual Pearl-Diver PoW state machine, which reaches 220 MHz.

The design is still functional despite all of the optimizations 🙂

Pearl-Diver Core Repository

The following repository contains not only the VHDL source for the Altera DE1 and the "Pi-Diver" prototype but also the project files for Quartus (Prime).

The Cyclone 2 on the Altera DE1 is supported by Quartus 13.0.1, and the Cyclone 10 is supported by Quartus 17.x. In the first case, you can directly synthesize the project and upload it to the Altera DE1.

Link to Github Repository

Hashing-Library dcurl with FPGA support

dcurl is a very fast Curl hashing library which not only supports graphics cards (OpenCL) but also provides highly optimized variants for SSE- and AVX-capable CPUs.

I forked the library and added support for the VHDL Pearl-Diver.

The advantage is that any software that works with the dcurl library can make use of the FPGA version of Curl (on a Raspberry Pi; for other targets, the low-level SPI control has to be replaced).

Link to Github Repository

Compiling and Testing with dcurl

1. Download and install BCM2835 library

# download the latest version of the library, say bcm2835-1.xx.tar.gz, then:
tar zxvf bcm2835-1.xx.tar.gz
cd bcm2835-1.xx
./configure
make
sudo make check
sudo make install

2. Run raspi-config and enable "SPI" under "Interfacing Options".

sudo raspi-config

3. Load kernel module with modprobe

sudo modprobe spi_bcm2835

4. Check Permissions

pi@raspi:~ $ ls /dev/spidev0.0 -al
crw-rw---- 1 root spi 153, 0 May 3 15:17 /dev/spidev0.0

pi@raspi:~ $ groups
pi adm dialout cdrom sudo audio video plugdev games users input netdev gpio i2c spi

5. Clone and Compile dcurl library

git clone https://github.com/shufps/dcurl
cd dcurl
make BUILD_FPGA=1

6. Test the library (for a reason I still don't know, SPI access is only possible with "sudo").

cd build
sudo ./test-pow_fpga

parallel level detected: 5
Found nonce: 000c9b9c (mask: 00000008)
Time: 321ms  -  MH/s: 12.870

7. If there is no error then everything worked 🙂
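The permission check from step 4 can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `describe_device` is made up for illustration, and the device path is the one shown in step 4. It only reports what the current user may do with the device node and does not talk to the FPGA.

```python
import grp
import os
import stat


def describe_device(path: str) -> dict:
    """Report mode, owning group, and read/write access for the current user."""
    st = os.stat(path)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        # The group ID has no name on this system; fall back to the number.
        group = str(st.st_gid)
    return {
        "mode": stat.filemode(st.st_mode),
        "group": group,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    device = "/dev/spidev0.0"  # path taken from step 4
    try:
        print(describe_device(device))
    except FileNotFoundError:
        print(f"{device} not found; is SPI enabled and the kernel module loaded?")
```

On a setup like the one in step 4, the mode would be expected to read crw-rw---- with group spi, and both access flags would be true for a user in the spi group.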

Licence

This project is licensed under the MIT License.