blarney-lang / blarney

Licence: other
Haskell library for hardware description

Blarney is a Haskell library for hardware description that builds a range of HDL abstractions on top of a small set of pure functional circuit primitives. It is a modern variant of Lava with Verilog and SMT backends. Some aspects of the library are also inspired by Bluespec, such as first-class actions and method-based interfaces. Applications of Blarney include SIMTight (a CHERI-enabled RISC-V GPGPU) and Actora (a stack machine with a compiler for an Erlang-like language). Below, we introduce the library by example, supplementing the Haddock docs.

1. Prerequisites

We’ll need Verilator and GHC 9.2.1 or later.

On Ubuntu 20.04, we can do:

$ sudo apt install verilator

For GHC 9.2.1 or later, ghcup can be used.

2. Quick start

Recursively clone the repo and set the BLARNEY_ROOT environment variable to point to it, and add the Scripts directory to your PATH:

$ git clone --recursive https://github.com/blarney-lang/blarney
$ export BLARNEY_ROOT=$(pwd)/blarney
$ export PATH=$PATH:$BLARNEY_ROOT/Scripts

To run an example using Blarney’s in-Haskell simulator:

$ cd Examples/Sorter
$ make
$ ./Sorter --simulate

You should see the output:

sort [3,4,1,0,2] = [0,1,2,3,4]

To generate Verilog for an example, and then simulate the Verilog using Verilator, omit the --simulate option and then run the generated makefile:

$ ./Sorter             # Run the Verilog generator
$ cd Sorter-Verilog
$ make                 # Compile the generated Verilog using Verilator
$ ./Sorter             # Simulate the generated Verilog

You should see the same output as before. Using Verilator is the recommended approach for simulating Blarney designs.

To run the regression test suite:

$ cd Test
$ ./test.sh --run-all

3. Blarney by example

3.1. Two-sort

Sorting makes for a good introduction to the library. Let’s start with the simplest kind of sorter possible: given a pair of 8-bit values, the function twoSort returns the sorted pair.

import Blarney

twoSort :: (Bit 8, Bit 8) -> (Bit 8, Bit 8)
twoSort (a, b) = a .<. b ? ((a, b), (b, a))

This definition makes use of three Blarney constructs: the Bit type for bit vectors (parameterised by the size of the vector); the comparison operator .<.; and the ternary conditional operator ?. A quick test bench to check that it works:

top :: Module ()
top = always do
  display "twoSort (1,2) = " (twoSort (1,2))
  display "twoSort (2,1) = " (twoSort (2,1))
  finish

We use Blarney’s always construct to perform the given action on every clock cycle. Blarney actions include statements for displaying values during simulation (display), terminating the simulator (finish), and mutating state (see below). All statements in an Action execute in parallel, within a single cycle of an implicit clock. We can generate Verilog for the test bench as follows.

main :: IO ()
main = writeVerilogTop top "top" "/tmp/twoSort/"

Assuming the above code is in a file named Sorter.hs, it can be compiled at the command-line using

$ blc Sorter.hs

where blc stands for Blarney compiler. This is just a script (from Blarney’s Scripts directory) that invokes GHC with the appropriate compiler flags. Running the resulting executable ./Sorter will produce Verilog in the /tmp/twoSort directory, including a makefile to build a Verilator simulator. The simulator can be built and run as follows.

$ cd /tmp/twoSort
$ make
$ ./top
twoSort (1,2) = (1,2)
twoSort (2,1) = (1,2)

3.2. In-Haskell simulation

Sometimes it can be convenient to skip Verilog generation, and use the in-Haskell simulator.

main :: IO ()
main = simulate top

Now after running ./Sorter we see the test bench output directly.

$ ./Sorter
twoSort (1,2) = (1,2)
twoSort (2,1) = (1,2)

The in-Haskell simulator is much slower than Verilator, but can be more convenient for small designs. It is currently an experimental feature.

3.3. Bubble sort

We can build a general N-element sorter by connecting together multiple two-sorters. One of the simplest ways to do this is the bubble sort network. The key component is a function bubble that takes a list of inputs and returns a new list in which the smallest element comes first.

bubble :: [Bit 8] -> [Bit 8]
bubble [] = []
bubble [x] = [x]
bubble (x:y:rest) = bubble (small:rest) ++ [large]
  where (small, large) = twoSort (x, y)

If we repeatedly call bubble then we end up with a sorted list.

sort :: [Bit 8] -> [Bit 8]
sort [] = []
sort xs = smallest : sort rest
  where smallest:rest = bubble xs

Running the test bench

top :: Module ()
top = always do
  let inputs = [3, 4, 1, 0, 2]
  display "sort " inputs " = " (sort inputs)
  finish

in simulation yields:

sort [3,4,1,0,2] = [0,1,2,3,4]

To see that the sort function really is describing a circuit, let’s draw the circuit diagram for a 5-element bubble sorter.

        -->.
           |
        -->+---.
           |   |
Inputs  -->+---+---.
           |   |   |
        -->+---+---+---.
           |   |   |   |
        -->+---+---+---+---.
           |   |   |   |   |
           v   v   v   v   v

                Outputs

The input list is supplied on the left, and the sorted output list is produced at the bottom. Each + denotes a two-sorter that takes inputs from the top and the left, and produces the smaller value to the bottom and the larger value to the right. See The design and verification of a sorter core for a more in-depth exploration of sorting circuits in Haskell.
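
The structure of the network can also be checked in software. The following is a minimal plain-Haskell sketch, not Blarney code: Int stands in for Bit 8, and the names sortNet and twoSorters are hypothetical helpers introduced only for this illustration. It shows that a single bubble pass only guarantees the position of the smallest element, and that a 5-element sorter uses 10 two-sorters, matching the number of + symbols in the diagram.

```haskell
-- Plain-Haskell model of the bubble sorter (no Blarney required).
-- Int stands in for Bit 8; the structure mirrors the definitions above.

-- Two-sorter: smaller value first
twoSort :: (Int, Int) -> (Int, Int)
twoSort (a, b) = if a < b then (a, b) else (b, a)

-- One pass: the smallest element ends up at the head of the list
bubble :: [Int] -> [Int]
bubble []  = []
bubble [x] = [x]
bubble (x:y:rest) = bubble (small:rest) ++ [large]
  where (small, large) = twoSort (x, y)

-- Repeated passes yield a sorted list (hypothetical name, avoids
-- clashing with Data.List.sort)
sortNet :: [Int] -> [Int]
sortNet [] = []
sortNet xs = case bubble xs of
  smallest:rest -> smallest : sortNet rest
  []            -> []

-- Number of two-sorters used for an n-element input:
-- a pass over k elements uses k-1 of them
twoSorters :: Int -> Int
twoSorters n = n * (n - 1) `div` 2

main :: IO ()
main = do
  print (bubble  [3, 4, 1, 0, 2])  -- [0,2,1,3,4]
  print (sortNet [3, 4, 1, 0, 2])  -- [0,1,2,3,4]
  print (twoSorters 5)             -- 10
```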

3.4. Polymorphism

For simplicity, we’ve made our sorter specific to lists of 8-bit values. But if we look at the types of the primitive functions it uses, we can see that it actually has a more general type.

(.<.) :: Cmp a  => a -> a -> Bit 1
(?)   :: Bits a => Bit 1 -> (a, a) -> a

So .<. can be used on any type in the Cmp (comparator) class. Similarly, ? can be used on any type in the Bits class (which allows packing to a bit vector and back again). So a more generic definition of twoSort would be:

twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a)
twoSort (a, b) = a .<. b ? ((a, b), (b, a))

Indeed, this would be the type inferred by the Haskell compiler if no type signature was supplied. Using Haskell’s rebindable syntax, we can also use an if-then-else expression instead of the ternary conditional operator:

twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a)
twoSort (a, b) = if a .<. b then (a, b) else (b, a)

3.5. Mutable registers

So far, we’ve only seen display and finish actions inside a Blarney module. Also supported are creation and assignment of registers. To illustrate, here is a module that creates a 4-bit cycleCount register, increments it on each cycle, stopping when it reaches 10.

top :: Module ()
top = do
  -- Create a register
  cycleCount :: Reg (Bit 4) <- makeReg 0

  always do
    -- Increment on every cycle
    cycleCount <== cycleCount.val + 1

    -- Display value on every cycle
    display "cycleCount = " cycleCount.val

    -- Terminate simulation when count reaches 10
    when (cycleCount.val .==. 10) do
      display "Finished"
      finish

This example introduces a number of new library functions: makeReg creates a register, initialised to the given value; the val field yields the current value of the register; and when allows conditional actions to be introduced. We can also use if-then-else in an Action context. For example, the final three lines above could have been written as:

  -- Terminate simulation when count reaches 10
  if cycleCount.val .==. 10
    then do
      display "Finished"
      finish
    else
      display "Not finished"

Running top in simulation gives

cycleCount = 0
cycleCount = 1
cycleCount = 2
cycleCount = 3
cycleCount = 4
cycleCount = 5
cycleCount = 6
cycleCount = 7
cycleCount = 8
cycleCount = 9
cycleCount = 10
Finished

3.6. Queues

Queues (also known as FIFOs) are a commonly used abstraction in hardware design. Blarney provides a range of different queue implementations, all of which implement the following interface available when importing Blarney.Queue.

-- Queue interface
data Queue a =
  Queue {
    notEmpty :: Bit 1           -- Is the queue non-empty?
  , notFull  :: Bit 1           -- Is there any space in the queue?
  , enq      :: a -> Action ()  -- Insert an element (assuming notFull)
  , deq      :: Action ()       -- Remove the first element (assuming canDeq)
  , canDeq   :: Bit 1           -- Guard on the deq and first methods
  , first    :: a               -- View the first element (assuming canDeq)
  }

The type Queue a represents a queue holding elements of type a, and provides a range of standard functions on queues. The enq method should only be called when notFull is true and the deq method should only be called when canDeq is true. Similarly, the first element of the queue is only valid when canDeq is true. Below, we present the simplest possible implementation of a one-element queue.

import Blarney.Queue

-- Simple one-element queue implementation
makeSimpleQueue :: Bits a => Module (Queue a)
makeSimpleQueue = do
  -- Register holding the one element
  reg :: Reg a <- makeReg dontCare

  -- Register defining whether or not queue is full
  full :: Reg (Bit 1) <- makeReg 0

  -- Methods
  return
    Queue {
      notFull  = full.val .==. 0
    , notEmpty = full.val .==. 1
    , enq      = \a -> do reg <== a
                          full <== 1
    , deq      = full <== 0
    , canDeq   = full.val .==. 1
    , first    = reg.val
    }

The following simple test bench illustrates how to use a queue.

-- Small test bench for queues
top :: Module ()
top = do
  -- Instantiate a queue of 8-bit values
  queue :: Queue (Bit 8) <- makeSimpleQueue

  -- Create an 8-bit count register
  count :: Reg (Bit 8) <- makeReg 0

  always do
    count <== count.val + 1

    -- Writer side
    when queue.notFull do
      queue.enq count.val
      display "Enqueued " count.val

    -- Reader side
    when queue.canDeq do
      queue.deq
      display "Dequeued " queue.first

    -- Terminate after 100 cycles
    when (count.val .==. 100) finish
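
To see what this test bench does with the one-element queue, here is a small plain-Haskell model of its cycle-by-cycle behaviour. It is a sketch under simplifying assumptions, not Blarney code: the names Event, step and runCycles are hypothetical, Int replaces Bit 8, and counter wrap-around is ignored. Because notFull and canDeq are never true on the same cycle for this implementation, the model alternates between enqueuing and dequeuing, so only every second count value passes through the queue.

```haskell
-- Plain-Haskell model of the test bench driving makeSimpleQueue.
data Event = Enqueued Int | Dequeued Int
  deriving (Eq, Show)

-- One clock cycle. State is (full flag, stored element); the second
-- argument is the current value of the count register.
step :: (Bool, Int) -> Int -> ((Bool, Int), Event)
step (False, _) count = ((True, count), Enqueued count)  -- writer side fires
step (True, x)  _     = ((False, x), Dequeued x)         -- reader side fires

-- Run the model for n cycles, starting with an empty queue
runCycles :: Int -> [Event]
runCycles n = go (False, 0) [0 .. n - 1]
  where
    go _ []       = []
    go s (c : cs) = let (s', e) = step s c in e : go s' cs

main :: IO ()
main = mapM_ print (runCycles 4)
-- Enqueued 0
-- Dequeued 0
-- Enqueued 2
-- Dequeued 2
```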

3.7. Mutable wires

Wires are a feature of the Action monad that offer a way for separate action blocks to communicate within the same clock cycle. Whereas assignment to a register becomes visible on the clock cycle after the assignment occurs, assignment to a wire is visible on the same cycle as the assignment. If no assignment is made to a wire on a particular cycle, then the wire emits its default value on that cycle. When multiple assignments to the same wire occur on the same cycle, the wire emits the bitwise disjunction of all the assigned values.
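
The resolution rule for wires can be written down as a one-line function. The sketch below is a plain-Haskell model of the rule as described in the paragraph above, not Blarney's implementation; resolveWire is a hypothetical name and Word8 stands in for an 8-bit wire.

```haskell
import Data.Bits ((.|.))
import Data.Word (Word8)

-- Value emitted by a wire in one cycle, given its default value and
-- the list of values assigned to it during that cycle.
resolveWire :: Word8 -> [Word8] -> Word8
resolveWire def []     = def                -- no assignment: default value
resolveWire _   values = foldr1 (.|.) values  -- otherwise: bitwise OR

main :: IO ()
main = do
  print (resolveWire 7 [])            -- 7
  print (resolveWire 0 [0x05])        -- 5
  print (resolveWire 0 [0x05, 0x03])  -- 7
```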

To illustrate, let’s implement an n-bit counter module that supports increment and decrement operations.

-- Interface for an n-bit counter
data Counter n =
  Counter {
    inc    :: Action ()
  , dec    :: Action ()
  , output :: Bit n
  }

We’d like the counter to support parallel calls to inc and dec. That is, if inc and dec are called on the same cycle then the counter’s output is unchanged. We’ll achieve this using wires.

makeCounter :: KnownNat n => Module (Counter n)
makeCounter = do
  -- State
  count :: Reg (Bit n) <- makeReg 0

  -- Wires
  incWire :: Wire (Bit 1) <- makeWire 0
  decWire :: Wire (Bit 1) <- makeWire 0

  always do
    -- Increment
    when (incWire.val .&&. inv decWire.val) do
      count <== count.val + 1

    -- Decrement
    when (inv incWire.val .&&. decWire.val) do
      count <== count.val - 1

  -- Interface
  return
    Counter {
      inc    = do incWire <== 1
    , dec    = do decWire <== 1
    , output = count.val
    }

3.8. Recipes

State machines are a common way of defining the control-path of a circuit. They are typically expressed by doing case-analysis of the current state and manually setting the next state. Quite often however, they can be expressed more neatly in a Recipe — a simple imperative language with various control-flow constructs.

data Recipe =
    Skip                         -- Do nothing (in zero cycles)
  | Tick                         -- Do nothing (in one cycle)
  | Action (Action ())           -- Perform action (in one cycle)
  | Seq [Recipe]                 -- Execute recipes in sequence
  | Par [Recipe]                 -- Fork-join parallelism
  | Wait (Bit 1)                 -- Block until condition holds
  | When (Bit 1) Recipe          -- Conditional recipe
  | If (Bit 1) Recipe Recipe     -- If-then-else recipe
  | While (Bit 1) Recipe         -- Loop
  | Background Recipe            -- Run recipe in background

To illustrate, here is a small state machine that computes the factorial of 10.

fact :: Module ()
fact = do
  -- State
  n   :: Reg (Bit 32) <- makeReg 0
  acc :: Reg (Bit 32) <- makeReg 1

  -- Compute factorial of 10
  let recipe =
        Seq [
          Action do
            n <== 10
        , While (n.val .>. 0) (
            Action do
              n <== n.val - 1
              acc <== acc.val * n.val
          )
        , Action do
            display "fact(10) = " acc.val
            finish
        ]

  runRecipe recipe

Blarney provides a lightweight compiler for the Recipe language (under 100 lines of code), which we invoke above through the call to runRecipe.
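
The loop body relies on both register assignments reading the values from the start of the cycle. The following plain-Haskell sketch models that, and contrasts it with a hypothetical sequential reading in which acc would see the already-decremented n. It is not Blarney code; State, stepParallel, stepSequential and runLoop are names introduced only for this illustration.

```haskell
import Data.Word (Word32)

-- Register state: (n, acc)
type State = (Word32, Word32)

-- Loop body with register semantics: both right-hand sides read the
-- values held at the start of the cycle.
stepParallel :: State -> State
stepParallel (n, acc) = (n - 1, acc * n)

-- Hypothetical sequential reading, for contrast only: acc is
-- multiplied by the already-updated n.
stepSequential :: State -> State
stepSequential (n, acc) = let n' = n - 1 in (n', acc * n')

-- Apply the loop body until n reaches 0, starting from n = 10, acc = 1
runLoop :: (State -> State) -> Word32
runLoop step = snd (until (\(n, _) -> n == 0) step (10, 1))

main :: IO ()
main = do
  print (runLoop stepParallel)    -- 3628800
  print (runLoop stepSequential)  -- 0
```

Under the parallel reading the accumulator ends at 10! = 3628800, which fits comfortably in 32 bits.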

A very common use of recipes is to define test sequences. For example, here is a simple test sequence for the Counter module defined earlier.

-- Test-bench for a counter
top :: Module ()
top = do
  -- Instantiate a 4-bit counter
  counter :: Counter 4 <- makeCounter

  -- Sample test sequence
  let test =
        Seq [
          Action do
            counter.inc
        , Action do
            counter.inc
        , Action do
            counter.inc
            counter.dec
        , Action do
            display "counter = " counter.output
            finish
        ]

  runRecipe test

Here, we increment counter on the first cycle, and then again on the second. On the third cycle, we both increment and decrement it in parallel. On the fourth cycle, we display the value and terminate the simulator.
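
As a rough cross-check of the expected result, the counter's update rule and this test sequence can be modelled in plain Haskell. This is a sketch, not Blarney code: nextCount and testSequence are hypothetical names, and each pair records whether inc and dec were called on a given cycle.

```haskell
-- Next value of a 4-bit counter, given whether inc and dec were
-- called during the current cycle.
nextCount :: Int -> (Bool, Bool) -> Int
nextCount c (True, False) = (c + 1) `mod` 16  -- inc only
nextCount c (False, True) = (c - 1) `mod` 16  -- dec only
nextCount c _             = c                 -- both or neither

-- (inc called, dec called) for the first three cycles of the test
testSequence :: [(Bool, Bool)]
testSequence = [(True, False), (True, False), (True, True)]

main :: IO ()
main = print (foldl nextCount 0 testSequence)  -- 2
```

Under this model, the value displayed on the fourth cycle is 2.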

3.9. Statements

For convenience, recipes can also be constructed using do notation. The Stmt monad is simply a wrapper around Recipe, which defines monadic bind as sequential composition. It is entirely syntactic sugar, providing no new functionality.

To illustrate, here’s the factorial example from earlier, rewritten using the Stmt monad.

fact :: Module ()
fact = do
  -- State
  n   :: Reg (Bit 32) <- makeReg 0
  acc :: Reg (Bit 32) <- makeReg 1

  -- Compute factorial of 10
  let stmt = do
        action do
          n <== 10
        while (n.val .>. 0) do
          action do
            n <== n.val - 1
            acc <== acc.val * n.val
        action do
          display "fact(10) = " acc.val
          finish

  runStmt stmt

We have found that some users prefer Recipe syntax, while others prefer Stmt syntax, so we offer both.

3.10. Block RAMs

Blarney provides a variety of block RAM modules commonly supported on FPGAs. They are all based around the following interface.

-- Block RAM interface
-- (Parameterised by the address width a and the data width d)
data RAM a d =
  RAM {
    load    :: a -> Action ()
  , store   :: a -> d -> Action ()
  , out     :: d
  }

When a load is issued for a given address, the value at that address appears on out on the next clock cycle. When a store is issued, the value is written to the RAM on the current cycle, and a load of the new value can be requested on the subsequent cycle. A parallel load and store should only be issued on the same cycle if the RAM has been created as a dual-port RAM (as opposed to a single-port RAM). To illustrate, here is a test bench that creates a single-port block RAM and performs a store followed by a load.

top :: Module ()
top = do
  -- Instantiate a 256 element RAM of 5-bit values
  ram :: RAM (Bit 8) (Bit 5) <- makeRAM

  -- Write 10 to ram[0] and read it back again
  runStmt do
    action do
      store ram 0 10
    action do
      load ram 0
    action do
      display "Got " ram.out
      finish
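
The one-cycle load latency can be illustrated with a small software model. The sketch below is plain Haskell, not Blarney code, and makes several assumptions that the text above does not state: memory initially reads as 0, out initially holds 0, and out keeps its previous value on cycles without a load. The names Cmd, RAMState, step and outTrace are hypothetical.

```haskell
-- Commands that can be issued to a single-port RAM in one cycle
data Cmd = Load Int | Store Int Int | Idle

-- RAM state: memory contents (as a function) and the value on out
type RAMState = (Int -> Int, Int)

-- One clock cycle: the returned state is what is visible next cycle
step :: RAMState -> Cmd -> RAMState
step (mem, _)   (Load a)    = (mem, mem a)
step (mem, out) (Store a d) = (\a' -> if a' == a then d else mem a', out)
step (mem, out) Idle        = (mem, out)

-- Value of out at the start of each cycle (and after the last one)
outTrace :: [Cmd] -> [Int]
outTrace = map snd . scanl step (const 0, 0)

main :: IO ()
main = print (outTrace [Store 0 10, Load 0, Idle])  -- [0,0,10,10]
```

The third entry of the trace corresponds to the cycle in which the test bench displays ram.out: the load was issued one cycle earlier, so the stored value 10 is visible.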

Somewhat related to block RAMs are register files. The difference is that a register file allows the value at an address to be determined within a clock cycle. It also allows any number of reads and writes to be performed within the same cycle. Register files have the following interface.

data RegFile a d =
  RegFile {
    index  :: a -> d                -- Read
  , update :: a -> d -> Action ()   -- Write
  }

To read from a register file, use the index method or the generic lookup operator !. Unlike block RAMs, register files (especially large ones) do not always map efficiently onto hardware, so use with care!

3.11. Sources, sinks, and streams

Sources and sinks are commonly-used flow-control abstractions in hardware description. They are often used to implement hardware modules that produce or consume data at a variable rate, depending on internal details of the module that the implementer does not wish to (or is unable to) expose. In Blarney, sources and sinks are captured by the following interfaces.

-- Data is consumed from a source
data Source t =
  Source {
    -- The next value being produced by the source
    peek :: t
    -- Invoke this action to consume the next value
  , consume :: Action ()
    -- Can the source currently be peeked or consumed?
  , canPeek :: Bit 1
  }

-- Data is injected into a sink
data Sink t =
  Sink {
    -- Can a value be injected into the sink?
    canPut :: Bit 1
    -- Inject the given value into the sink
  , put :: t -> Action ()
  }

-- A stream is another name for a source (discussed below)
type Stream t = Source t

A queue is both a source and a sink.

-- Convert a queue to a source
instance ToSource (Queue t) t where
  toSource :: Queue t -> Source t
  toSource q =
    Source {
      canPeek  = q.canDeq
    , peek     = q.first
    , consume  = q.deq
    }

-- Convert a queue to a sink
instance ToSink (Queue t) t where
  toSink :: Queue t -> Sink t
  toSink q =
    Sink {
      canPut = q.notFull
    , put    = q.enq
    }

-- Another name for toSource (discussed below)
toStream :: ToSource a b => a -> Stream b
toStream = toSource

Sources and sinks can be connected together.

Note that taking a sink as a function argument (input) is very similar to returning a source as a function result (output). Both allow the function to produce data at a variable rate. Is it therefore redundant to provide both Source and Sink? Not quite. When a function takes a sink as input, it knows when the caller is ready to consume before producing data; when a function returns a source as output, it knows when the caller does consume after producing data. This subtle difference can be important when the programmer wants to minimise buffering and latency between producer and consumer. Often though, we don’t mind buffering (it’s good for Fmax) so our convention is to use the Stream type in most circumstances.

As an example, here’s a function that increments each value in an input stream to produce an output stream.

inc :: Stream (Bit 8) -> Module (Stream (Bit 8))
inc xs = do
  -- Output buffer
  buffer <- makeQueue

  always do
    -- Incrementer
    when (xs.canPeek .&&. buffer.notFull) do
      xs.consume
      buffer.enq (xs.peek + 1)

  -- Convert buffer to a stream
  return (toStream buffer)

3.12. Modular compilation

So far we’ve seen examples of top-level modules, i.e. modules with no inputs or outputs, being converted to Verilog. In fact, any Blarney function whose inputs and outputs are members of the Interface class can be converted to Verilog (and the Interface class supports generic deriving). To illustrate, we can convert the function inc (defined above) into a Verilog module as follows.

main :: IO ()
main = writeVerilogModule inc "inc" "/tmp/inc"

The generated Verilog module /tmp/inc/inc.v has the following interface:

module inc(
  input  wire clock
, input  wire reset
, output wire [0:0] in0_consume_en
, input  wire [0:0] in0_canPeek
, input  wire [7:0] in0_peek
, input  wire [0:0] out_consume_en
, output wire [7:0] out_peek
, output wire [0:0] out_canPeek
);

Considering the definition of the Stream type, the correspondence between the Blarney and the Verilog is quite clear:

Signal          Description

in0_consume_en  Output asserted whenever the module consumes an element from the input stream.
in0_canPeek     Input signalling when there is data available in the input stream.
in0_peek        Input containing the next value in the input stream.
out_canPeek     Output asserted whenever there is data available in the output stream.
out_peek        Output containing the next value in the output stream.
out_consume_en  Input signalling when the caller consumes an element from the output stream.

It is also possible to instantiate a Verilog module inside a Blarney description. To illustrate, here is a function that creates an instance of the Verilog inc module shown above.

-- This function creates an instance of a Verilog module called "inc"
makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8))
makeInc = makeInstance "inc"

Notice that the interface of the Verilog module being instantiated is determined from the type signature. Here’s a sample top-level module that uses the makeInc function:

top :: Module ()
top = do
  -- Counter
  count :: Reg (Bit 8) <- makeReg 0

  -- Input buffer
  buffer <- makeQueue

  -- Create an instance of inc
  out <- makeInc (toStream buffer)

  always do
    -- Fill input
    when buffer.notFull do
      buffer.enq count.val
      count <== count.val + 1

    -- Consume
    when out.canPeek do
      out.consume
      display "Got " out.peek
      when (out.peek .==. 100) finish

Using the following main function we can generate both the inc module and a top-level module that instantiates it.

main :: IO ()
main = do
  let dir = "/tmp/inc"
  writeVerilogModule inc "inc" dir
  writeVerilogTop top "top" dir

Using this approach, we can maintain the module hierarchy of a Blarney design whenever we generate Verilog, rather than having to flatten it to a big monolithic netlist. This technique can also be used to instantiate any Verilog module within a Blarney design.

When simply marking netlist boundaries within a Blarney design, the makeInstance/writeVerilogModule combination is rather low-level and error-prone. In particular, there is no requirement for the type of the instance to match the type of the module, and it would be nice to specify a boundary in a backend-independent way. To solve these problems, Blarney provides a makeBoundary function. We can now define makeInc as:

makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8))
makeInc = makeBoundary "inc" inc

Unlike makeInstance, makeBoundary takes the module to instantiate as an argument. The type of the argument to makeBoundary must match the return type:

makeBoundary :: Modular m => String -> m -> m

This means that it is unnecessary to supply a type signature for makeInc now; it will be inferred. Furthermore, the top-level of our design no longer needs to call writeVerilogModule for the inc module because Blarney now knows how to generate a module for any instance that it encounters.

3.13. Master-slave pattern

This is a common pattern in hardware design. Suppose we wish to move a multiplier out of a module and into a separate slave module, where the slave takes requests (pairs of 32-bit integers to multiply) and produces responses (32-bit results).

type MulReq  = (Bit 32, Bit 32)
type MulResp = Bit 32

The slave component might be defined as:

slave :: Stream MulReq -> Module (Stream MulResp)
slave reqs = do
  resps <- makeQueue

  always do
    when (reqs.canPeek .&&. resps.notFull) do
      reqs.consume
      let (a, b) = reqs.peek
      resps.enq (a * b)

  return (toStream resps)

The master component produces requests for the slave, and consumes responses from the slave. In the example below, the master simply asks the slave to multiply 2 by 2, waits for the response, and then terminates the simulation.

master :: Stream MulResp -> Module (Stream MulReq)
master resps = do
  reqs <- makeQueue

  runStmt do
    wait reqs.notFull
    action do
      reqs.enq (2, 2)
    wait resps.canPeek
    action do
      resps.consume
      display "Result: " resps.peek
      finish

  return (toStream reqs)

The top-level module which connects the master and the slave needs to introduce a cycle, which can be achieved simply using Haskell’s recursive-do (mdo) notation:

top :: Module ()
top = mdo
  resps <- slave reqs
  reqs <- master resps
  return ()

3.14. Tagged unions

Sum types such as

data Either a b = Left a | Right b

do not permit generic deriving for the Bits class, so cannot be used for circuit-time values. (An elaboration-time value cannot be influenced by a circuit-time value, making the definition of unpack problematic for sum types, at least without resorting to language plugins). However, Blarney does support tagged unions, allowing the following definition.

import Blarney.TaggedUnion

type Either a b =
  TaggedUnion [
    "left"  ::: a
  , "right" ::: b
  ]

The API for tagged unions is illustrated by the sample functions below.

makeLeft :: Bits a => a -> Either a b
makeLeft x = tag #left x

isLeft :: Either a b -> Bit 1
isLeft x = x `is` #left

isRight :: Either a b -> Bit 1
isRight x = x `is` #right

getLeft :: Bits a => Either a b -> a
getLeft x = untag #left x

getLeftOrZero :: Bits a => Either a b -> a
getLeftOrZero x = untagDefault #left zero x

exampleAction :: Action ()
exampleAction = do
  let foo :: Either (Bit 2) (Bit 4) = tag #right 15
  whenTagged #right foo \r -> do
    display "Right val: " r

3.15. Bit selection and lookup

Bit selection operators are used to extract a subset of bits out of a bit-vector. There are different flavours, depending on whether the indices are type-level numbers, elaboration-time numbers, or circuit-level numbers.

For type-level indices, we provide functions at and slice, and use type application to specify the type-level indices:

-- Extract most-significant bit of a byte
msb :: Bit 8 -> Bit 1
msb x = at @7 x

-- Extract upper 4 bits of a byte
upperNibble :: Bit 8 -> Bit 4
upperNibble x = slice @7 @4 x

For elaboration-time indices of type Int, we provide unsafeAt and unsafeSlice:

-- Extract most-significant bit of a byte
msb :: Bit 8 -> Bit 1
msb x = unsafeAt 7 x

-- Extract upper 4 bits of a byte
upperNibble :: Bit 8 -> Bit 4
upperNibble x = unsafeSlice (7, 4) x

The argument to unsafeAt could be out of range, and the result of unsafeSlice could have a different width to that implied by the range. Such cases will lead to confusing error messages, hence the "unsafe" prefix on the function names.

Finally, for circuit-level indices of type Bit n, the generic lookup operator ! can be used:

-- Extract bit from byte at given index
getBit :: Bit 8 -> Bit 3 -> Bit 1
getBit x i = x!i

Blarney’s generic lookup operator x!i returns the element of x at index i, and works for many different types of x and i. See Lookup for more details.
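
For readers who want to check which bits msb and upperNibble select, the same selections can be computed on ordinary machine words. This is a plain-Haskell sketch using Data.Bits, not Blarney code; sliceBits is a hypothetical helper, and Word8 stands in for Bit 8.

```haskell
import Data.Bits (shiftR, testBit, (.&.))
import Data.Word (Word8)

-- Bits hi..lo (inclusive) of a byte, right-aligned
sliceBits :: Int -> Int -> Word8 -> Word8
sliceBits hi lo x = (x `shiftR` lo) .&. (2 ^ (hi - lo + 1) - 1)

-- Most-significant bit of a byte
msb :: Word8 -> Bool
msb x = testBit x 7

-- Upper 4 bits of a byte
upperNibble :: Word8 -> Word8
upperNibble = sliceBits 7 4

main :: IO ()
main = do
  print (msb 0x80)          -- True
  print (msb 0x7F)          -- False
  print (upperNibble 0xAB)  -- 10
```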

3.16. Bit-string pattern matching

Recent work on specifying and implementing ISAs led us to develop two libraries for doing bit-string pattern matching. The first, BitPat, is statically-typed and based on the paper Type-safe pattern combinators. The second, BitScan, is dynamically typed but more expressive. As an example, BitScan lets us define the following instruction decoder for a tiny subset of RISC-V.

import Blarney.BitScan

-- Semantics of add instruction
add :: Bit 5 -> Bit 5 -> Bit 5 -> Action ()
add rs2 rs1 rd = display "add r" rd ", r" rs1 ", r" rs2

-- Semantics of addi instruction
addi :: Bit 12 -> Bit 5 -> Bit 5 -> Action ()
addi imm rs1 rd = display "addi r" rd ", r" rs1 ", " imm

-- Semantics of store instruction
sw :: Bit 12 -> Bit 5 -> Bit 5 -> Action ()
sw imm rs2 rs1 = display "sw r" rs2 ", " imm "(r" rs1 ")"

top :: Module ()
top = always do
  -- Sample RISC-V store-word instruction
  let instr :: Bit 32 = 0b1000000_00001_00010_010_00001_0100011

  -- Dispatch
  match instr
    [
      "0000000   rs2[4:0]  rs1[4:0] 000 rd[4:0]  0110011" ==> add,
      "          imm[11:0] rs1[4:0] 000 rd[4:0]  0010011" ==> addi,
      "imm[11:5] rs2[4:0]  rs1[4:0] 010 imm[4:0] 0100011" ==> sw
    ]

  finish

The nice thing about this decoder is that the scattered immediate field imm in the sw instruction is automatically assembled by the library. That is, the imm[11:5] part of the immediate is combined with the imm[4:0] part to give the final 12-bit immediate value passed to the right-hand-side function. Scattered immediates appear a lot in the RISC-V specification. Thanks to Jon Woodruff for suggesting this feature! For a fuller example of the BitScan module, see the Pebbles RV32I instruction decoder.
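
To make the assembly of the scattered immediate concrete, the sketch below performs the same field extraction by hand on the sample instruction from the test bench. It is plain Haskell using Data.Bits, not the BitScan implementation; field and swImm are hypothetical names, and the bit positions follow the sw pattern string above (imm[11:5] in bits 31..25, imm[4:0] in bits 11..7).

```haskell
{-# LANGUAGE BinaryLiterals, NumericUnderscores #-}
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32)

-- Bits hi..lo (inclusive) of a 32-bit word, right-aligned
field :: Int -> Int -> Word32 -> Word32
field hi lo w = (w `shiftR` lo) .&. (2 ^ (hi - lo + 1) - 1)

-- 12-bit store immediate: imm[11:5] from bits 31..25,
-- imm[4:0] from bits 11..7
swImm :: Word32 -> Word32
swImm w = (field 31 25 w `shiftL` 5) .|. field 11 7 w

-- Sample store-word instruction from the test bench above
instr :: Word32
instr = 0b1000000_00001_00010_010_00001_0100011

main :: IO ()
main = do
  print (swImm instr)        -- 2049
  print (field 24 20 instr)  -- 1 (rs2)
  print (field 19 15 instr)  -- 2 (rs1)
```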

3.17. CPUs

A few processor cores have been implemented in Blarney:

  • Simple: 4-stage 8-bit CPU, with just 4 instructions, for learning.

  • Pebbles: RISC-V CPU+GPU using pluggable pipelines.

  • Actora: 3-stage stack machine that runs code written in a subset of Erlang.

3.18. Namer plugin

One of the classic limitations of Lava is that identifier names are lost when the netlist is generated. In particular, this is problematic when we want to analyse, say, the critical-path of our circuit using a third-party tool, but there is no way to map the netlist names reported by the tool back to the Lava names in the original description.

Blarney provides a solution to this problem in the form of the Namer plugin. This is a simple GHC plugin (around 150 lines of code) that looks for monadic bindings of the form

  x <- m

where m has type Module a for any a, and automatically rewrites the binding as

  x <- withName "x" m

where withName is a Blarney primitive that introduces name information inside m. This simple approach captures quite a lot of useful names.

The plugin is completely optional, and disabled by default. To enable it, first install using cabal

cd Haskell/BlarneyPlugins/Namer
cabal v1-install

and then pass the --enable-namer-plugin flag to blc.

To further improve the readability of generated code, we can also pass the --enable-name-prop and --enable-simplifier options to our circuit generator. This will enable the (experimental) name propagation and netlist simplification passes respectively.

4. Type class overview

4.1. Bits

Any type in the Bits class can be represented in hardware, e.g. stored in a wire, a register, or a RAM.

class Bits a where
  type SizeOf a :: Nat
  sizeOf        :: a -> Int
  pack          :: a -> Bit (SizeOf a)
  unpack        :: Bit (SizeOf a) -> a

The Bits class supports generic deriving. For example, suppose we have a simple data type for memory requests:

data MemReq =
  MemReq {
    memOp   :: Bit 1    -- Is it a load or a store request?
  , memAddr :: Bit 32   -- 32-bit address
  , memData :: Bit 32   -- 32-bit data for stores
  }
  deriving (Generic, Bits)

To make this type a member of the Bits class, we have suffixed it with deriving (Generic, Bits). The generic deriving mechanism for Bits does not support sum types: there is no way to convert a bit-vector (run-time circuit value) to a sum type (elaboration-time value) using the circuit primitives provided by Blarney (however, see tagged unions).
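As a rough intuition for what a Bits instance provides, the following plain-Haskell sketch models pack and unpack for a record shaped like MemReq, using Integer in place of Bit n. The names MemReqModel, packReq and unpackReq are hypothetical, and the field order (first field in the most significant bits) is an assumption made for illustration; the layout chosen by Blarney's generic deriving may differ.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))

-- Plain-Haskell stand-in for MemReq; field widths are tracked by hand.
data MemReqModel = MemReqModel
  { opBit    :: Integer  -- 1 bit: load or store
  , addrBits :: Integer  -- 32-bit address
  , dataBits :: Integer  -- 32-bit data for stores
  } deriving (Eq, Show)

-- An n-bit mask of ones
mask :: Int -> Integer
mask n = (1 `shiftL` n) - 1

-- Flatten the record into a single 65-bit value.
-- ASSUMPTION: first field goes in the most significant bits.
packReq :: MemReqModel -> Integer
packReq r =
      ((opBit r    .&. mask 1)  `shiftL` 64)
  .|. ((addrBits r .&. mask 32) `shiftL` 32)
  .|.  (dataBits r .&. mask 32)

-- Recover the record from the flat representation.
unpackReq :: Integer -> MemReqModel
unpackReq b = MemReqModel
  { opBit    = (b `shiftR` 64) .&. mask 1
  , addrBits = (b `shiftR` 32) .&. mask 32
  , dataBits = b .&. mask 32
  }

main :: IO ()
main = do
  let req = MemReqModel 1 0x1000 0xdeadbeef
  print (unpackReq (packReq req) == req)  -- prints True
```

The round-trip property shown in main (unpack after pack gives back the original value) is the essential contract; the derived instance saves writing this boilerplate for every record.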

4.2. Interface

Any type in the Interface class can be used as a module input or output when doing modular compilation. Furthermore, collections of interfaces can be indexed by circuit-time values using the ! operator. To illustrate, here is an example circuit to split a stream of MemReq into four streams, using the lower two bits of the address to decide which output stream to use.

split :: Stream MemReq -> Module [Stream MemReq]
split reqs = do
  -- Create a list of 4 queues
  queues :: [Queue MemReq] <- replicateM 4 makeQueue

  always do
    -- Consume request, and put into appropriate queue
    when reqs.canPeek do
      let i :: Bit 2 = truncate reqs.peek.memAddr
      when (queues!i).notFull do
        reqs.consume
        (queues!i).enq reqs.peek

  return (map toStream queues)

The Interface class supports generic deriving: just add Interface to the deriving clause for the datatype. In the above example, MemReq is an Interface, and so too is Queue a for any a that is also an Interface.

4.3. Lookup

The generic lookup operator ! is provided by the Lookup class.

-- Index a collection 'c' of elements 'e' using index 'i'
class Lookup c i e | c -> e where
  (!) :: c -> i -> e

A wide range of combinations of types are supported. The functional dependency c → e allows the return type to be inferred from the collection type.
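The effect of the functional dependency can be seen in a small self-contained model of the class. The class declaration below is copied from above, but the three instances are illustrative only (ordinary Haskell lists and pairs, with ordinary Int, Integer and Bool indices) and are not Blarney's actual instances.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}

-- Index a collection 'c' of elements 'e' using index 'i'
class Lookup c i e | c -> e where
  (!) :: c -> i -> e

-- Illustrative instances only: the same collection type can be
-- indexed by several different index types.
instance Lookup [e] Int e where
  xs ! i = xs !! i

instance Lookup [e] Integer e where
  xs ! i = xs !! fromInteger i

instance Lookup (e, e) Bool e where
  (a, b) ! sel = if sel then b else a

main :: IO ()
main = do
  let xs = [10, 20, 30] :: [Int]
  print (xs ! (1 :: Int))        -- prints 20
  print (xs ! (2 :: Integer))    -- prints 30
  print (("lo", "hi") ! True)    -- prints "hi"
```

Because c determines e, none of the calls in main needs an annotation on the result: once the collection type is known, the element type follows.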

4.4. FShow

Any value whose type is in the FShow class, or any value of type Format, can be passed as arguments to the variadic display function.

class FShow a where
  fshow     :: a -> Format
  fshowList :: [a] -> Format     -- Has default definition

-- Abstract data type for things that can be displayed
newtype Format

-- Format constructors
mempty :: Format                         -- Empty (from Monoid class)
(<>)   :: Format -> Format -> Format     -- Append (from Monoid class)

As an example, here is how the FShow instance for pairs is defined.

-- Example instance: displaying pairs
instance (FShow a, FShow b) => FShow (a, b) where
  fshow (a, b) = fshow "(" <> fshow a <> fshow "," <> fshow b <> fshow ")"

The FShow class supports generic deriving.
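To see how the pieces of this class fit together, here is a self-contained model in plain Haskell. It assumes Format is just a wrapper around String (in Blarney it is an abstract type describing what the circuit should display), and the render function and the Int and Char instances are hypothetical additions for the sketch. The fshowList method plays the same role as showList in the standard Show class: it lets string literals be displayed as text rather than as lists of characters.

```haskell
import Data.List (intersperse)

-- ASSUMPTION: model Format as a plain string
newtype Format = Format String

instance Semigroup Format where
  Format a <> Format b = Format (a ++ b)

instance Monoid Format where
  mempty = Format ""

class FShow a where
  fshow     :: a -> Format
  fshowList :: [a] -> Format
  -- Default: comma-separated elements in square brackets
  fshowList xs =
    Format "[" <> mconcat (intersperse (Format ",") (map fshow xs)) <> Format "]"

instance FShow Char where
  fshow c   = Format [c]
  fshowList = Format          -- strings are shown verbatim

instance FShow Int where
  fshow = Format . show

instance FShow a => FShow [a] where
  fshow = fshowList

-- The pair instance, as given in the text
instance (FShow a, FShow b) => FShow (a, b) where
  fshow (a, b) = fshow "(" <> fshow a <> fshow "," <> fshow b <> fshow ")"

-- Hypothetical helper to observe a Format in this model
render :: Format -> String
render (Format s) = s

main :: IO ()
main = do
  putStrLn (render (fshow (1 :: Int, "ab")))     -- prints (1,ab)
  putStrLn (render (fshow [1, 2, 3 :: Int]))     -- prints [1,2,3]
```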

The radix and padding used to display a bit vector can be specified using the following functions.

-- Display bit vector in binary with given amount of zero padding
formatBin :: Int -> Bit n -> Format

-- Display bit vector in decimal with given amount of zero padding
formatDec :: Int -> Bit n -> Format

-- Display bit vector in hex with given amount of zero padding
formatHex :: Int -> Bit n -> Format

The FShow instance for Bit n uses decimal format with no padding.
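The intended effect of radix and padding can be sketched on ordinary integers. The functions binStr, decStr and hexStr below are hypothetical plain-Haskell analogues that produce strings rather than Format values, and they assume that the Int argument is a minimum field width filled with leading zeros; consult the Blarney sources for the exact behaviour of formatBin, formatDec and formatHex.

```haskell
import Data.Char (intToDigit)
import Numeric (showHex, showIntAtBase)

-- Left-pad a string with zeros up to a minimum width
padZero :: Int -> String -> String
padZero w s = replicate (w - length s) '0' ++ s

-- Hypothetical string-producing analogues of the Format functions.
-- ASSUMPTION: the first argument is a minimum width; inputs are non-negative.
binStr, decStr, hexStr :: Int -> Integer -> String
binStr w n = padZero w (showIntAtBase 2 intToDigit n "")
decStr w n = padZero w (show n)
hexStr w n = padZero w (showHex n "")

main :: IO ()
main = do
  putStrLn (binStr 8 5)     -- prints 00000101
  putStrLn (decStr 4 42)    -- prints 0042
  putStrLn (hexStr 8 255)   -- prints 000000ff
```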

4.5. Cmp

The Cmp (comparator) class provides a range of familiar comparison operators, and supports generic deriving.

class Cmp a where
  (.<.)  :: a -> a -> Bit 1
  (.<=.) :: a -> a -> Bit 1
  (.==.) :: a -> a -> Bit 1
  (.>.)  :: a -> a -> Bit 1
  (.>=.) :: a -> a -> Bit 1
  (.!=.) :: a -> a -> Bit 1

Only the first three operators must be defined; the others have default definitions.
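The following self-contained model shows how such defaults can work. It uses Bool in place of Bit 1, and the default definitions and the Version type are assumptions made for illustration rather than a copy of Blarney's source.

```haskell
infix 4 .<., .<=., .==., .>., .>=., .!=.

-- Model of the comparator class, with Bool standing in for Bit 1
class Cmp a where
  (.<.), (.<=.), (.==.) :: a -> a -> Bool
  (.>.), (.>=.), (.!=.) :: a -> a -> Bool
  -- Assumed defaults: derive the rest from the first three
  a .>.  b = b .<.  a
  a .>=. b = b .<=. a
  a .!=. b = not (a .==. b)

-- Hypothetical example type: a major/minor version number
data Version = Version Int Int

-- Only the first three operators are defined here
instance Cmp Version where
  Version a b .<.  Version c d = (a, b) <  (c, d)
  Version a b .<=. Version c d = (a, b) <= (c, d)
  Version a b .==. Version c d = (a, b) == (c, d)

main :: IO ()
main = do
  print (Version 1 2 .>.  Version 1 0)   -- prints True
  print (Version 1 2 .>=. Version 2 0)   -- prints False
  print (Version 1 2 .!=. Version 1 2)   -- prints False
```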

4.6. Assign

The assignment operator is overloaded.

class Assign v where
  (<==) :: Bits a => v a -> a -> Action ()

Example instances are Reg, Wire, and WriteOnly.

4.7. ToSource and ToSink

Converting interfaces to sources and sinks may turn out to be common. For example, Queue and Stack are both sources and sinks. Therefore the following type classes are provided.

-- Convert to a source
class ToSource a b | a -> b where
  toSource :: a -> Source b

-- Convert to a sink
class ToSink a b | a -> b where
  toSink :: a -> Sink b

-- Another name for toSource
toStream :: ToSource a b => a -> Source b
toStream = toSource