All Projects → preusser → Q27

preusser / Q27

Licence: agpl-3.0
27-Queens Puzzle: Massively Parellel Enumeration and Solution Counting

Labels

Projects that are alternatives of or similar to Q27

Vhdl Mips Pipeline Microprocessor
VHDL-Mips-Pipeline-Microprocessor
Stars: ✭ 12 (-80%)
Mutual labels:  vhdl
Open Source Fpga Bitcoin Miner
A completely open source implementation of a Bitcoin Miner for Altera and Xilinx FPGAs. This project hopes to promote the free and open development of FPGA based mining solutions and secure the future of the Bitcoin project as a whole. A binary release is currently available for the Terasic DE2-115 Development Board, and there are compile-able projects for numerous boards.
Stars: ✭ 989 (+1548.33%)
Mutual labels:  vhdl
Vexriscv
A FPGA friendly 32 bit RISC-V CPU implementation
Stars: ✭ 1,041 (+1635%)
Mutual labels:  vhdl
Lxp32 Cpu
A lightweight, open source and FPGA-friendly 32-bit CPU core based on an original instruction set
Stars: ✭ 27 (-55%)
Mutual labels:  vhdl
Ophidian
Ophidian's Mirror Repository on github. https://gitlab.com/eclufsc/eda/ophidian
Stars: ✭ 32 (-46.67%)
Mutual labels:  vhdl
Vhdl
VHDL Samples
Stars: ✭ 40 (-33.33%)
Mutual labels:  vhdl
Rewire
Experimental compiler for a subset of Haskell to VHDL
Stars: ✭ 10 (-83.33%)
Mutual labels:  vhdl
Haddoc2
Caffe to VHDL
Stars: ✭ 57 (-5%)
Mutual labels:  vhdl
Flearadio
Digital FM Radio Receiver for FPGA
Stars: ✭ 36 (-40%)
Mutual labels:  vhdl
Reonv
ReonV is a modified version of the Leon3, a synthesisable VHDL model of a 32-bit processor originally compliant with the SPARC V8 architecture, now changed to RISC-V ISA.
Stars: ✭ 47 (-21.67%)
Mutual labels:  vhdl
Clash Compiler
Haskell to VHDL/Verilog/SystemVerilog compiler
Stars: ✭ 958 (+1496.67%)
Mutual labels:  vhdl
Vhdl Mode
A package for Sublime Text that aids coding in the VHDL language.
Stars: ✭ 31 (-48.33%)
Mutual labels:  vhdl
Scaffold
Donjon hardware tool for circuits security evaluation
Stars: ✭ 43 (-28.33%)
Mutual labels:  vhdl
Fpga Bbc
Acorn BBC Micro on an Altera DE1 FPGA board
Stars: ✭ 14 (-76.67%)
Mutual labels:  vhdl
Spi Fpga
SPI master and slave for FPGA written in VHDL
Stars: ✭ 50 (-16.67%)
Mutual labels:  vhdl
Fpganes
Stars: ✭ 12 (-80%)
Mutual labels:  vhdl
Hdmi2usb Numato Opsis Sample Code
Example code for the Numato Opsis board, the first HDMI2USB production board.
Stars: ✭ 40 (-33.33%)
Mutual labels:  vhdl
Sublime Vhdl
VHDL Package for Sublime Text
Stars: ✭ 58 (-3.33%)
Mutual labels:  vhdl
Aws Fpga
Official repository of the AWS EC2 FPGA Hardware and Software Development Kit
Stars: ✭ 1,091 (+1718.33%)
Mutual labels:  vhdl
Fpga Fft
A highly optimized streaming FFT core based on Bailey's 4-step large FFT algorithm
Stars: ✭ 45 (-25%)
Mutual labels:  vhdl

Mission Accomplished

This FPGA computation has raised the bar:

Q(27) = 234907967154122528

Seven years after the FPGAs prevail again.


The Q27 Project

The Q27 Project is a showcase for a massively parallel computation to tackle an otherwise infeasible problem. It demonstrates the tremendous computational power of designated circuit designs implemented on FPGA accelerators at the example of enumerating and counting all the valid solutions of the 27-Queens Puzzle.

Further Reading

Table of Contents

  1. License
  2. News
  3. Purpose
  4. How to Contribute
  5. Used Hardware
  6. Things To Do Alongside

License

This project is licensed under the terms of the GNU Affero General Public License, Version 3. A verbatim copy of the license document is contained within this repository.

The VHDL implementation builds upon the PoC Library, which is published under the terms of the Apache 2.0 license.

News

Sep 19, 2016: Mission completed - there are 234907967154122528 solutions of the 27-Queens Puzzle. 29363791967678199, i.e. very slightly more than an eighth, of them had actually to be discovered by the computation running for slightly more than a year.

Jan 22, 2016: The queens within the pre-placements advanced out of the overlap of west and south pre-placing region. This virtually broke a sound barrier! While, indeed, expecting an accelaration of the computation due to moving to more constrained pre-placements now typically comprising eight (8) queens, the observed speedup of 10 was still an exciting surprise. Given the behavior observed for the 26-Queens Puzzle as well as the 27-Queens Puzzle so far, a gradual slowdown must, however, be expected from here: subtaks seem to yield more solutions and take longer to compute as the pre-placed queens move closer to the center of their column.

Jan 11, 2016: Triggered by the extremely different synthesis results obtained for essentially the same FPGA devices found on the KC705 and the DNK7 FPGAs boards, Vivado was given a chance for all Xilinx Gen-7 platforms. The results are just unbelievably amazing:

  • KC705: 67% more solvers, 7% faster clock, and
  • VC707: 56% more solvers, 2% faster clock.

This is not just a different trade-off but truly tremendously better! The only shade of a cloud that Vivado leaves is that it required a workaround for dealing with physical VHDL types properly.

Dec 2015: With our new multi-FPGA cluster setup and running, we hope to make considerable computational progress over the holidays. (Yes, we let them labor hard while we relax.) Check out the performance figures to appreciate their contribution.

Purpose

Q27 demonstrates the tremendous power of compute accelerators provided through FPGA resources.

  1. Training: As the design virtually scales arbitrarily, it can be used:
  • to practice the performance tuning of large circuits cramped on filled FPGA devices, and
  • to practice trading off size (more solver slices) against clock speed (performance of a single solver).
  1. Engineering: As the problem is heavily compute- rather than communication-bound, further optimization potential may be explored in:
  • moving the solver communication from the compute clock domain into the slow clock domain (which is currently only used for system services such as the fan control), and
  • using independent regional clock resources for subsets of the solver slices.
  1. Benchmarking: Being able to tune the design size at a relatively fine granularity establishes a case for evaluating the resource balance of a device. For instance, it can be determined if the further growth of a design fails on routing or on logic resources or how fast the routing performance drops as the device is being cramped.

  2. Vanity: Top our own world record from 2009 obtained by computing Q(26)=22.317.699.616.364.044 using a similar distributed FPGA infrastructure.

How to Contribute

Since we are through with the computation, the interesting remaing question is whether or not we are right. An independent confirmation of the result would be great. While Q(26) was confirmed less than two months later by a competing second effort using to Russian supercomputers, it appears more patience is needed this time. Nonetheless, you are invited to:

  1. check the implemented algorithm, and
  2. verify individual subtotals.

The full computation log is available for reference.

Hardware Used by the Q27 Project

Count Board Family Device Solvers Clock SE
1x VC707 Xilinx Virtex-7 XC7VX485T-2 325 250.0 MHz 812
(1x) (VC707) (Xilinx Virtex-7) (XC7VX485T-2) (360) (248.0 MHz) (892)
1x KC705 Xilinx Kintex-7 XC7K325T-2 250 284.4 MHz 711
1x ML605 Xilinx Virtex-6 XC6VLX240T-1 127 171.4 MHz 217
2x DE4 Altera Stratix IV GX EP4SGX230KF40C2 125 250.0 MHz 312
4x DNK7_F5_PCIe Xilinx Kintex-7 5x XC7K325T 5x240 220.0 MHz 2640
4x SDRC4 Xilinx Virtex-4 4x XC4VLX160-10 4x 90 128.0 MHz 460

SE (Solver Equivalent) - The performance of one solver slice running at 100 MHz.

It appears the power supply on the VC707 board is failing us on the cramped Q27 design although computation picks up quite gently as problems arrive over the slow UART. This is why the maximum performance design is actually not in use.

Things To Do Alongside

  1. Use Machine Learning on the available solution counts of subproblems to build a predictor for the solution counts of subproblems yet to deal with. Such a predictor might help to describe and understand the patterns in the solution space.
  2. Build a high-performance, highly-parallel solver using GPGPU technology as an alternative acceleration approach to contribute to this computation.
  3. Design an interface to present and browse the completed subsolutions graphically.
  4. Ultimately, find the Formula that directly computes the valid solution count from the problem size N.
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].