Blarney is a Haskell library for hardware description that builds a range of HDL abstractions on top of a small set of pure functional circuit primitives. It is a modern variant of Lava with Verilog and SMT backends. Some aspects of the library are also inspired by Bluespec, such as first-class actions and method-based interfaces. Applications of Blarney include SIMTight (a CHERI-enabled RISC-V GPGPU) and Actora (a stack machine with a compiler for an Erlang-like language). Below, we introduce the library by example, supplementing the Haddock docs.
Contents
- 1. Prerequisites
- 2. Quick start
- 3. Blarney by example
- 3.1. Two-sort
- 3.2. In-Haskell simulation
- 3.3. Bubble sort
- 3.4. Polymorphism
- 3.5. Mutable registers
- 3.6. Queues
- 3.7. Mutable wires
- 3.8. Recipes
- 3.9. Statements
- 3.10. Block RAMs
- 3.11. Sources, sinks, and streams
- 3.12. Modular compilation
- 3.13. Master-slave pattern
- 3.14. Tagged unions
- 3.15. Bit selection and lookup
- 3.16. Bit-string pattern matching
- 3.17. CPUs
- 3.18. Namer plugin
- 4. Type class overview
1. Prerequisites
We’ll need Verilator and GHC 9.2.1 or later.
On Ubuntu 20.04, we can do:
$ sudo apt install verilator
For GHC 9.2.1 or later, ghcup can be used.
2. Quick start
Recursively clone the repo and set the BLARNEY_ROOT
environment
variable to point to it, and add the Scripts
directory to your
PATH
:
$ git clone --recursive https://github.com/blarney-lang/blarney
$ export BLARNEY_ROOT=$(pwd)/blarney
$ export PATH=$PATH:$BLARNEY_ROOT/Scripts
To run an example using Blarney’s in-Haskell simulator:
$ cd Examples/Sorter
$ make
$ ./Sorter --simulate
You should see the output:
sort [3,4,1,0,2] = [0,1,2,3,4]
To generate Verilog for an example, and then simulate the Verilog
using Verilator, omit the --simulate
option and then run the
generated makefile:
$ ./Sorter # Run the Verilog generator
$ cd Sorter-Verilog
$ make # Compile the generated Verilog using Verilator
$ ./Sorter # Simulate the generated Verilog
You should see the same output as before. Using Verilator is the recommended approach for simulating Blarney designs.
To run the regression test suite:
$ cd Test
$ ./test.sh --run-all
3. Blarney by example
3.1. Two-sort
Sorting makes for a good introduction to the library. Let’s start
with the simplest kind of sorter possible: given a pair of 8-bit
values, the function twoSort
returns the sorted pair.
import Blarney
twoSort :: (Bit 8, Bit 8) -> (Bit 8, Bit 8)
twoSort (a, b) = a .<. b ? ((a, b), (b, a))
This definition makes use of three Blarney constructs: the Bit
type
for bit vectors (parametised by the size of the vector); the
comparison operator .<.
; and the ternary conditional operator ?
.
A quick test bench to check that it works:
top :: Module ()
top = always do
display "twoSort (1,2) = " (twoSort (1,2))
display "twoSort (2,1) = " (twoSort (2,1))
finish
We use Blarney’s always
construct to perform the given action on
every clock cycle. Blarney actions include statements for displaying
values during simulation (display
), terminating the simulator
(finish
), and mutating state (see below). All statements in an
Action
execute in parallel, within a single cycle of an implicit
clock. We can generate Verilog for the test bench as follows.
main :: IO ()
main = writeVerilogTop top "top" "/tmp/twoSort/"
Assuming the above code is in a file named Sorter.hs
, it can be
compiled at the command-line using
$ blc Sorter.hs
where blc
stands for Blarney compiler. This is just a script
(from Blarney’s Scripts
directory) that invokes GHC with the
appropriate compiler flags. Running the resulting executable
./Sorter
will produce Verilog in the /tmp/twoSort
directory,
including a makefile to build a Verilator simulator. The simulator
can be built and run as follows.
$ cd /tmp/twoSort
$ make
$ ./top
twoSort (1,2) = (1,2)
twoSort (2,1) = (1,2)
3.2. In-Haskell simulation
Sometimes it can be convenient to skip Verilog generation, and use the in-Haskell simulator.
main :: IO ()
main = simulate top
Now after running ./Sorter
we see the test bench output directly.
$ ./Sorter
twoSort (1,2) = (1,2)
twoSort (2,1) = (1,2)
The in-Haskell simulator is much slower than Verilator, but can be more convenient for small designs. It is currently an experimental feature.
3.3. Bubble sort
We can build a general N-element sorter by connecting together
multiple two-sorters. One of the simplest ways to do this is the
bubble sort network. The key component is a function bubble
that
takes a list of inputs and returns a new list in which the smallest
element comes first.
bubble :: [Bit 8] -> [Bit 8]
bubble [] = []
bubble [x] = [x]
bubble (x:y:rest) = bubble (small:rest) ++ [large]
where (small, large) = twoSort (x, y)
If we repeatedly call bubble
then we end up with a sorted list.
sort :: [Bit 8] -> [Bit 8]
sort [] = []
sort xs = smallest : sort rest
where smallest:rest = bubble xs
Running the test bench
top :: Module ()
top = always do
let inputs = [3, 4, 1, 0, 2]
display "sort " inputs " = " (sort inputs)
finish
in simulation yields:
sort [3,4,1,0,2] = [0,1,2,3,4]
To see that the sort
function really is describing a circuit, let’s
draw the circuit digram for a 5-element bubble sorter.
-->. | -->+---. | | Inputs -->+---+---. | | | -->+---+---+---. | | | | -->+---+---+---+---. | | | | | v v v v v Outputs
The input list is supplied on the left, and the sorted output list is
produced at the bottom. Each +
denotes a two-sorter that takes
inputs from the top and the left, and produces the smaller value to
the bottom and the larger value to the right. See
The
design and verification of a sorter core for a more in-depth
exploration of sorting circuits in Haskell.
3.4. Polymorphism
For simplicity, we’ve made our sorter specific to lists of 8-bit values. But if we look at the types of the primitive functions it uses, we can see that it actually has a more general type.
(.<.) :: Cmp a => a -> a -> Bit 1
(?) :: Bits a => Bit 1 -> (a, a) -> a
So .<.
can be used on any type in the
Cmp (comparator)
class. Similarly, ?
can be used on any type in the Bits
class (which allows packing to a bit vector and back again). So a more generic
definition of twoSort
would be:
twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a)
twoSort (a, b) = a .<. b ? ((a, b), (b, a))
Indeed, this would be the type inferred by the Haskell compiler if no type signature was supplied. Using Haskell’s rebindable syntax, we can also use an if-then-else expression instead of the ternary conditional operator:
twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a)
twoSort (a, b) = if a .<. b then (a, b) else (b, a)
3.5. Mutable registers
So far, we’ve only seen display
and finish
actions inside a Blarney module.
Also supported are creation and assignment of registers. To illustrate, here
is a module that creates a 4-bit cycleCount
register, increments it on each
cycle, stopping when it reaches 10.
top :: Module ()
top = do
-- Create a register
cycleCount :: Reg (Bit 4) <- makeReg 0
always do
-- Increment on every cycle
cycleCount <== cycleCount.val + 1
-- Display value on every cycle
display "cycleCount = " cycleCount.val
-- Terminate simulation when count reaches 10
when (cycleCount.val .==. 10) do
display "Finished"
finish
This example introduces a number of new library functions: makeReg
creates a register, initialised to the given value; the val
field
yeilds the current value of the register; and when
allows
conditional actions to be introduced. We can use if-then-else in an
Action
context. For example, the final three lines above could have
been written as:
-- Terminate simulation when count reaches 10
if cycleCount.val .==. 10
then do
display "Finished"
finish
else
display "Not finished"
Running top
in simulation gives
cycleCount = 0 cycleCount = 1 cycleCount = 2 cycleCount = 3 cycleCount = 4 cycleCount = 5 cycleCount = 6 cycleCount = 7 cycleCount = 8 cycleCount = 9 cycleCount = 10 Finished
3.6. Queues
Queues (also known as FIFOs) are a commonly used abstraction in hardware
design. Blarney provides a
range of different queue implementations, all of which implement the following
interface available when importing Blarney.Queue
.
-- Queue interface
data Queue a =
Queue {
notEmpty :: Bit 1 -- Is the queue non-empty?
, notFull :: Bit 1 -- Is there any space in the queue?
, enq :: a -> Action () -- Insert an element (assuming notFull)
, deq :: Action () -- Remove the first element (assuming canDeq)
, canDeq :: Bit 1 -- Guard on the deq and first methods
, first :: a -- View the first element (assuming canDeq)
}
The type Queue a
represents a queue holding elements of type a
, and
provides a range of standard functions on queues. The enq
method should only
be called when notFull
is true and the deq
method should only be called
when canDeq
is true. Similarly, the first
element of the queue is only
valid when canDeq
is true. Below, we present the simplest possible
implementation of a one-element queue.
import Blarney.Queue
-- Simple one-element queue implementation
makeSimpleQueue :: Bits a => Module (Queue a)
makeSimpleQueue = do
-- Register holding the one element
reg :: Reg a <- makeReg dontCare
-- Register defining whether or not queue is full
full :: Reg (Bit 1) <- makeReg 0
-- Methods
return
Queue {
notFull = full.val .==. 0
, notEmpty = full.val .==. 1
, enq = \a -> do reg <== a
full <== 1
, deq = full <== 0
, canDeq = full.val .==. 1
, first = reg.val
}
The following simple test bench illustrates how to use a queue.
-- Small test bench for queues
top :: Module ()
top = do
-- Instantiate a queue of 8-bit values
queue :: Queue (Bit 8) <- makeSimpleQueue
-- Create an 8-bit count register
count :: Reg (Bit 8) <- makeReg 0
always do
count <== count.val + 1
-- Writer side
when queue.notFull do
queue.enq count.val
display "Enqueued " count.val
-- Reader side
when queue.canDeq do
queue.deq
display "Dequeued " queue.first
-- Terminate after 100 cycles
when (count.val .==. 100) finish
3.7. Mutable wires
Wires are a feature of the Action
monad that offer a way for separate
action blocks to communicate within the same clock cycle. Whereas assignment
to a register becomes visible on the clock cycle after the assigment occurs,
assignment to a wire is visible on the same cycle as the assignment. If no
assignment is made to a wire on a particular cycle, then the wire emits its
default value on that cycle. When multiple assignments to the same wire
occur on the same cycle, the wire emits the bitwise disjunction of all the
assigned values.
To illustrate, let’s implement an n-bit counter module that supports increment and decrement operations.
-- Interface for a n-bit counter
data Counter n =
Counter {
inc :: Action ()
, dec :: Action ()
, output :: Bit n
}
We’d like the counter to support parallel calls to inc
and dec
. That is,
if inc
and dec
are called on the same cycle then the counter’s output
is
unchanged. We’ll achieve this using wires.
makeCounter :: KnownNat n => Module (Counter n)
makeCounter = do
-- State
count :: Reg (Bit n) <- makeReg 0
-- Wires
incWire :: Wire (Bit 1) <- makeWire 0
decWire :: Wire (Bit 1) <- makeWire 0
always do
-- Increment
when (incWire.val .&&. inv decWire.val) do
count <== count.val + 1
-- Decrement
when (inv incWire.val .&&. decWire.val) do
count <== count.val - 1
-- Interface
return
Counter {
inc = do incWire <== 1
dec = do decWire <== 1
output = count.val
}
3.8. Recipes
State machines are a common way of defining the control-path of a circuit. They are typically expressed by doing case-analysis of the current state and manually setting the next state. Quite often however, they can be expressed more neatly in a Recipe — a simple imperative language with various control-flow constructs.
data Recipe =
Skip -- Do nothing (in zero cycles)
| Tick -- Do nothing (in one cycle)
| Action (Action ()) -- Perform action (in one cycle)
| Seq [Recipe] -- Execute recipes in sequence
| Par [Recipe] -- Fork-join parallelism
| Wait (Bit 1) -- Block until condition holds
| When (Bit 1) Recipe -- Conditional recipe
| If (Bit 1) Recipe Recipe -- If-then-else recipe
| While (Bit 1) Recipe -- Loop
| Background Recipe -- Run recipe in background
To illustrate, here is a small state machine that computes the factorial of 10.
fact :: Module ()
fact = do
-- State
n :: Reg (Bit 32) <- makeReg 0
acc :: Reg (Bit 32) <- makeReg 1
-- Compute factorial of 10
let recipe =
Seq [
Action do
n <== 10
, While (n.val .>. 0) (
Action do
n <== n.val - 1
acc <== acc.val * n.val
)
, Action do
display "fact(10) = " acc.val
finish
]
runRecipe recipe
Blarney provides a lightweight compiler for the Recipe
language (under 100
lines of code), which we invoke above through the call to runRecipe
.
A very common use of recipes is to define test sequences. For example, here is
a simple test sequence for the Counter
module defined earlier.
-- Test-bench for a counter
top :: Module ()
top = do
-- Instantiate an 4-bit counter
counter :: Counter 4 <- makeCounter
-- Sample test sequence
let test =
Seq [
Action do
counter.inc
, Action do
counter.inc
, Action do
counter.inc
counter.dec
, Action do
display "counter = " counter.output
finish
]
runRecipe test
Here, we increment counter
on the first cycle, and then again on the second.
On the third cycle, we both increment and decrement it in parallel. On the
fourth cycle, we display the value and terminate the simulator.
3.9. Statements
For convenience, recipes can also be constucted using do
notation. The
Stmt monad is simply a
wrapper around Recipe
, which defines monadic bind as sequential composition.
It is entirely syntatic sugar, providing no new functionality.
To illustrate, here’s the factorial example from earlier, rewritten using the
Stmt
monad.
fact :: Module ()
fact = do
-- State
n :: Reg (Bit 32) <- makeReg 0
acc :: Reg (Bit 32) <- makeReg 1
-- Compute factorial of 10
let stmt = do
action do
n <== 10
while (n.val .>. 0) do
action do
n <== n.val - 1
acc <== acc.val * n.val
action do
display "fact(10) = " acc.val
finish
runStmt stmt
We have found that some users prefer Recipe
syntax, while others prefer
Stmt
syntax, so we offer both.
3.10. Block RAMs
Blarney provides a variety of block RAM modules commonly supported on FPGAs. They are all based around the following interface.
-- Block RAM interface
-- (Parameterised by the address width a and the data width d)
data RAM a d =
RAM {
load :: a -> Action ()
, store :: a -> d -> Action ()
, out :: d
}
When a load
is issued for a given address, the value at that address appears
on out
on the next clock cycle. When a store
is issued, the value is
written to the RAM on the current cycle, and a load of the new value can be
requested on the subsequent cycle. A parallel load
and store
should only
be issued on the same cycle if the RAM has been created as a dual-port RAM (as
opposed to a single-port RAM). To illustrate, here is a test bench that
creates a single-port block RAM and performs a store
followed by a load
.
top :: Module ()
top = do
-- Instantiate a 256 element RAM of 5-bit values
ram :: RAM (Bit 8) (Bit 5) <- makeRAM
-- Write 10 to ram[0] and read it back again
runStmt do
action do
store ram 0 10
action do
load ram 0
action do
display "Got " ram.out
finish
Somewhat-related to block RAMs are register files. The difference is that a register file allows the value at an address to be determined within a clock cycle. It also allows any number of reads and writes to be performed within the same cycle. Register files have the following interface.
data RegFile a d =
RegFile {
index :: a -> d -- Read
, update :: a -> d -> Action() -- Write
}
To read from a register file, use the index
method or the generic lookup
operator !
. Unlike block RAMs, register files (especially large ones) do not
always map efficiently onto hardware, so use with care!
3.11. Sources, sinks, and streams
Sources and sinks are commonly-used flow-control abstractions in hardware description. They are often used to implement hardware modules that produce or consume data at a variable rate, depending on internal details of the module that the implementer does not wish to (or is unable to) expose. In Blarney, sources and sinks are captured by the following interfaces.
-- Data is consumed from a source
data Source t =
Source {
-- The next value being produced by the source
peek :: t
-- Invoke this action to consume the next value
, consume :: Action ()
-- Can the source currently be peeked or consumed?
, canPeek :: Bit 1
}
-- Data is injected into a sink
data Sink t =
Sink {
-- Can a value be injected into the sink?
canPut :: Bit 1
-- Inject the given value into the sink
, put :: t -> Action ()
}
-- A stream is another name for a source (discussed below)
type Stream t = Source t
A queue is both a source and a sink.
-- Convert a queue to a source
instance ToSource (Queue t) t where
toSource :: Queue t -> Source t
toSource q =
Source {
canPeek = q.canDeq
, peek = q.first
, consume = q.deq
}
-- Convert a queue to a sink
instance ToSink (Queue t) t where
toSink :: Queue t -> Sink t
toSink q =
Sink {
canPut = q.notFull
, put = q.enq
}
-- Another name for toSource (discussed below)
toStream :: ToSource a b => a -> Stream b
toStream = toSource
Sources and sinks can be connected together.
Note that taking a sink as a function argument (input) is very similar
to returning a source as a function result (output). Both allow the
function to produce data at a variable rate. Is it therefore
redundant to provide both Source
and Sink
? Not quite. When a
function takes a sink as input, it knows when the caller is ready to
consume before producing data; when a function returns a source as
output, it knows when the caller does consume after producing data.
This subtle difference can be important when the programmer wants
minimise buffering and latency between producer and consumer. Often
though, we don’t mind buffering (it’s good for Fmax) so our convention
is to use the Stream
type in most circumstances.
As an example, here’s a function that increments each value in an input stream to produce an output stream.
inc :: Stream (Bit 8) -> Module (Stream (Bit 8))
inc xs = do
-- Output buffer
buffer <- makeQueue
always do
-- Incrementer
when (xs.canPeek .&&. buffer.notFull) do
xs.consume
buffer.enq (xs.peek + 1)
-- Convert buffer to a stream
return (toStream buffer)
3.12. Modular compilation
So far we’ve seen examples of top-level modules, i.e. modules with no inputs or
outputs, being converted to Verilog. In fact, any Blarney function whose
inputs and outputs are members of the
Interface class can
be converted to Verilog (and the Interface
class supports generic deriving).
To illustrate, we can convert the function inc
(defined
above) into
a Verilog module as follows.
main :: IO ()
main = writeVerilogModule inc "inc" "/tmp/inc"
The generated Verilog module /tmp/inc/inc.v
has the following
interface:
module inc(
input wire clock
, input wire reset
, output wire [0:0] in0_consume_en
, input wire [0:0] in0_canPeek
, input wire [7:0] in0_peek
, input wire [0:0] out_consume_en
, output wire [7:0] out_peek
, output wire [0:0] out_canPeek
);
Considering the definition of the Stream
type, the correspondance between the
Blarney and the Verilog is quite clear:
Signal | Description |
---|---|
|
Output asserted whenever the module consumes an element from the input stream. |
|
Input signalling when there is data available in the input stream. |
|
Input containing the next value in the input stream. |
|
Output asserted whenever there is data available in the output stream. |
|
Output containing the next value in the output stream. |
|
Input signalling when the caller consumes an element from the output stream. |
It is also possible to instantiate a Verilog module inside a Blarney
description. To illustrate, here is a function that creates an instance of the
Verilog inc
module shown above.
-- This function creates an instance of a Verilog module called "inc"
makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8))
makeInc = makeInstance "inc"
Notice that interface of the Verilog module being instantiated is determined
from the type signature. Here’s a sample top-level module that uses the
makeInc
function:
top :: Module ()
top = do
-- Counter
count :: Reg (Bit 8) <- makeReg 0
-- Input buffer
buffer <- makeQueue
-- Create an instance of inc
out <- makeInc (toStream buffer)
always do
-- Fill input
when buffer.notFull do
buffer.enq count.val
count <== count.val + 1
-- Consume
when out.canPeek do
out.consume
display "Got " out.peek
when (out.peek .==. 100) finish
Using the following main
function we can generate both the inc
module and a
top-level module that instantiates it.
main :: IO ()
main = do
let dir = "/tmp/inc"
writeVerilogModule inc "inc" dir
writeVerilogTop top "top" dir
Using this approach, we can maintain the module hierarchy of a Blarney design whenever we generate Verilog, rather than having to flatten it to big monolithic netlist. This technique can also be used to instantiate any Verilog module within a Blarney design.
When simply marking netlist boundaries within a Blarney design, the
makeInstance
/writeVerilogModule
combination is rather low-level
and error-prone. In particular, there is no requirement for the type
of the instance to match the type of the module, and it would be nice
to specify a boundary in a backend-independent way. To solve these
problems, Blarney provides a makeBoundary
function. We can now
define makeInc
as:
makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8))
makeInc = makeBoundary "inc" inc
Unlike makeInstance
, makeBoundary
takes the module to instantiate
as an argument. The type of the argument to makeBoundary
must match
the return type:
makeBoundary :: Modular m => String -> m -> m
This means that it is unncessary to supply a type signature for
makeInc
now; it will be inferred. Furthermore, the top-level of our
design no longer needs to call writeVerilogModule
for the inc
module because Blarney now knows how to generate a module for any
instance that it encounters.
3.13. Master-slave pattern
This is a common pattern in hardware design. Suppose we wish to move a multiplier out of a module and into an separate slave module, where the slave takes requests (pairs of 32-bit integers to multiply) and produces responses (32-bit results).
type MulReq = (Bit 32, Bit 32)
type MulResp = Bit 32
The slave component might be defined as:
slave :: Stream MulReq -> Module (Stream MulResp)
slave reqs = do
resps <- makeQueue
always do
when (reqs.canPeek .&&. resps.notFull) do
reqs.consume
let (a, b) = reqs.peek
resps.enq (a * b)
return (toStream resps)
The master component produces requests for the slave, and consumes responses from the slave. In the example below, the master simply asks the slave to multiply 2 by 2, waits for the response, and then terminates the simulation.
master :: Stream MulResp -> Module (Stream MulReq)
master resps = do
reqs <- makeQueue
runStmt do
wait reqs.notFull
action do
reqs.enq (2, 2)
wait resps.canPeek
action do
resps.consume
display "Result: " resps.peek
finish
return (toStream reqs)
The top-level module which connects the master and the slave needs to introduce
a cycle, which can be achieved simply using Haskell’s recursive-do (mdo
)
notation:
top :: Module ()
top = mdo
resps <- slave reqs
reqs <- master resps
return ()
3.14. Tagged unions
Sum types such as
data Either a b = Left a | Right b
do not permit generic deriving for the Bits
class, so cannot be used
for circuit-time values. (An elaboration-time value cannot be
influenced by a circuit-time value, making the definition of unpack
problematic for sum types, at least without resorting to language
plugins). However, Blarney does support tagged unions, allowing the
following definition.
import Blarney.TaggedUnion
type Either a b =
TaggedUnion [
"left" ::: a
, "right" ::: b
]
The API for tagged unions is illustrated by the sample functions below.
makeLeft :: Bits a => a -> Either a b
makeLeft x = tag #left x
isLeft :: Either a b -> Bit 1
isLeft x = x `is` #left
isRight :: Either a b -> Bit 1
isRight x = x `is` #right
getLeft :: Bits a => Either a b -> a
getLeft x = untag #left x
getLeftOrZero :: Bits a => Either a b -> a
getLeftOrZero x = untagDefault #left zero x
exampleAction :: Action ()
exampleAction = do
let foo :: Either (Bit 2) (Bit 4) = tag #right 15
whenTagged #right foo \r -> do
display "Right val: " r
3.15. Bit selection and lookup
Bit selection operators are used to extract a subset of bits out of a bit-vector. There are different flavours, depending on whether the indices are type-level numbers, elaboration-time numbers, or circuit-level numbers.
For type-level indices, we provide functions at and slice, and use type application to specify the type-level indices:
-- Extract most-sigificant bit of a byte
msb :: Bit 8 -> Bit 1
msb x = at @7 x
-- Extract upper 4 bits of a byte
upperNibble :: Bit 8 -> Bit 4
upperNibble x = slice @7 @4 x
For elaboration-time indices of type Int
, we provide
unsafeAt and
unsafeSlice:
-- Extract most-sigificant bit of a byte
msb :: Bit 8 -> Bit 1
msb x = unsafeAt 7 x
-- Extract upper 4 bits of a byte
upperNibble :: Bit 8 -> Bit 4
upperNibble x = unsafeSlice (7, 4) x
The argument to unsafeAt
could be out of range, and the result of
unsafeSlice
could have a different width to that implied by the range. Such
cases will lead to confusing error messages, hence the "unsafe" prefix on the
function names.
Finally, for circuit-level indicies of type Bit n
, the generic lookup
operator !
can be used:
-- Extract bit from byte at given index
getBit :: Bit 8 -> Bit 3 -> Bit 1
getBit x i = x!i
Blarney’s generic lookup operator x!i
returns the element of x
at
index i
, and works for many different types of x
and i
. See
Lookup for more details.
3.16. Bit-string pattern matching
Recent work on specifying and implementing ISAs led us to develop two libraries
for doing bit-string pattern matching. The first,
BitPat, is statically-typed
and based on the paper Type-safe
pattern combinators. The second,
BitScan, is dynamically
typed but more expressive. As an example, BitScan
, let’s us define the
following instruction decoder for a tiny subset of RISC-V.
import Blarney.BitScan
-- Semantics of add instruction
add :: Bit 5 -> Bit 5 -> Bit 5 -> Action ()
add rs2 rs1 rd = display "add r" rd ", r" rs1 ", r" rs2
-- Semantics of addi instruction
addi :: Bit 12 -> Bit 5 -> Bit 5 -> Action ()
addi imm rs1 rd = display "addi r" rd ", r" rs1 ", " imm
-- Semantics of store instruciton
sw :: Bit 12 -> Bit 5 -> Bit 5 -> Action ()
sw imm rs2 rs1 = display "sw r" rs2 ", " imm "(r" rs1 ")"
top :: Module ()
top = always do
-- Sample RISC-V store-word instruction
let instr :: Bit 32 = 0b1000000_00001_00010_010_00001_0100011
-- Dispatch
match instr
[
"0000000 rs2[4:0] rs1[4:0] 000 rd[4:0] 0110011" ==> add,
" imm[11:0] rs1[4:0] 000 rd[4:0] 0010011" ==> addi,
"imm[11:5] rs2[4:0] rs1[4:0] 010 imm[4:0] 0100011" ==> sw
]
finish
The nice thing about this decoder is that the scattered immediate field imm
in the sw
instruction is automatically assembled by the library. That is,
the imm[11:5]
part of the immediate is combined with the imm[4:0]
part to
give the final 12-bit immediate value passed to the right-hand-side function.
Scattered immediates appear a lot in the RISC-V specification. Thanks to Jon
Woodruff for suggesting this feature!
For a fuller example of the BitScan
module, see the Pebbles RV32I
instruction decoder.
3.17. CPUs
A few processor cores have been implemented in Blarney:
3.18. Namer plugin
One of the classic limitations of Lava is that identifier names are lost when the netlist is generated. In particular, this is problematic when we want to analyse, say, the critical-path of our circuit using a third-party tool, but there is no way to map the netlist names reported by the tool back to the Lava names in the original description.
Blarney provides a solution to this problem in the form of the Namer plugin. This is a simple GHC plugin (around 150 lines of code) that looks for monadic bindings of the form
x <- m
where m
has type Module a
for any a
, and automatically rewrites the
binding as
x <- withName "x" m
where
withName is
a Blarney primitive that introduces name information inside m
This simple
approach captures quite a lot of useful names.
The plugin is completely optional, and disabled by default. To enable it, first install using cabal
cd Haskell/BlarneyPlugins/Namer
cabal v1-install
and then pass the --enable-namer-plugin
flag to blc
.
To further improve the readability of generated code, we can also pass the
--enable-name-prop
and --enable-simplifier
options to our circuit
generator. This will enable the (experimental) name propagation and netlist
simplification passes respectively.
4. Type class overview
4.1. Bits
Any type in the Bits class can be represented in hardware, e.g. stored in a wire, a register, or a RAM.
class Bits a where
type SizeOf a :: Nat
sizeOf :: a -> Int
pack :: a -> Bit (SizeOf a)
unpack :: Bit (SizeOf a) -> a
The Bits
class supports generic deriving. For example, suppose we have a
simple data type for memory requests:
data MemReq =
MemReq {
memOp :: Bit 1 -- Is it a load or a store request?
, memAddr :: Bit 32 -- 32-bit address
, memData :: Bit 32 -- 32-bit data for stores
}
deriving (Generic, Bits)
To make this type a member of the Bits
class, we have suffixed it with
derving (Generic, Bits)
. The generic deriving mechanism for Bits
does not
support sum types: there is no way to convert a bit-vector (run-time circuit
value) to a sum type (elaboration-time value) using the circuit primitives
provided by Blarney (however, see tagged unions).
4.2. Interface
Any type in the
Interface class can
be used as a module input or output when doing modular
compilation. Furthermore, collections of interfaces can be indexed by
circuit-time values using the !
operator. To illustrate, here is an example
circuit to split a stream of MemReq into four streams, using the
lower two bits of the address to decide which output stream to use.
split :: Stream MemReq -> Module [Stream MemReq]
split reqs = do
-- Create a list of 4 queues
queues :: [Queue MemReq] <- replicateM 4 makeQueue
always do
-- Consume request, and put into appropriate queue
when reqs.canPeek do
let i :: Bit 2 = truncate reqs.peek.memAddr
when (queues!i).notFull do
reqs.consume
(queues!i).enq reqs.peek
return (map toStream queues)
The Interface
class supports generic deriving: just add Interface
to the
deriving clause for the datatype. In the above example, MemReq
is an
Interface
, and so too is Queue a
for any a
that is also an Interface
.
4.3. Lookup
The generic lookup operator !
is provided by the
Lookup class.
-- Index a collection 'c' of elements 'e' using index 'i'
class Lookup c i e | c -> e where
(!) :: c -> i -> e
A wide range of combinations of types are supported. The functional dependency
c → e
allows the return type to be inferred from the collection type.
4.4. FShow
Any value whose type is in the
FShow class, or any
value of type Format
, can be passed as arguments to the variadic display
function.
class FShow a where
fshow :: a -> Format
fshowList :: [a] -> Format -- Has default definition
-- Abstract data type for things that can be displayed
newtype Format
-- Format constructors
mempty :: Format -- Empty (from Monoid class)
(<>) :: Format -> Format -> Format -- Append (from Monoid class)
As an example, here is how the FShow
instance for pairs is defined.
-- Example instance: displaying pairs
instance (FShow a, FShow b) => FShow (a, b) where
fshow (a, b) = fshow "(" <> fshow a <> fshow "," <> fshow b <> fshow ")"
The FShow
class supports generic deriving.
The radix and padding used to display a bit vector can be specified using the following functions.
-- Display bit vector in binary with given amount of zero padding
formatBin :: Int -> Bit n -> Format
-- Display bit vector in decimal with given amount of zero padding
formatDec :: Int -> Bit n -> Format
-- Display bit vector in hex with given amount of zero padding
formatHex :: Int -> Bit n -> Format
The FShow
instance for Bit n
uses decimal format with no padding.
4.5. Cmp
The Cmp
(comparator) class provides a range of familiar comparison
operators, and supports generic deriving.
class Cmp a where
(.<.) :: a -> a -> Bit 1
(.<=.) :: a -> a -> Bit 1
(.==.) :: a -> a -> Bit 1
(.>.) :: a -> a -> Bit 1
(.>=.) :: a -> a -> Bit 1
(.!=.) :: a -> a -> Bit 1
Only the first three operators must be defined; the others have default definitions.
4.6. Assign
The assignment operator is overloaded.
class Assign v where
(<==) :: Bits a => v a -> a -> Action ()
4.7. ToSource and ToSink
Converting interfaces to sources and sinks may turn out to be common. For example, Queue and Stack are both sources and sinks. Therefore the following type classes are provided.
-- Convert to a source
class ToSource a b | a -> b where
toSource :: a -> Source b
-- Convert to a sink
class ToSink a b | a -> b where
toSink :: a -> Sink b
-- Another name for toSource
toStream :: ToSource a b => a -> Source b
toStream = toSource