FPGA Design: Electronics has advanced remarkably over time. The VLSI industry has come a long way, from the transistor to the integrated circuit to the ASIC.
Programmable logic devices (PLDs) came next and helped pave the way for the fabless semiconductor industry. PLDs date back to the early 1970s, but it was not until Xilinx introduced the FPGA in the mid-1980s that programmable logic began to compete with ASICs, which was quite a big deal. With that said, let us get started.
Let us look at what an FPGA is and what it has to offer.
What is an FPGA?
FPGA stands for Field Programmable Gate Array. It is an integrated circuit that can be programmed or customized in the “field” to function according to the intended design, which means it can act as a microprocessor, an encryption unit, a graphics card, or even all three at once.
A chip built around an FPGA can be reprogrammed to work as, say, a graphics card in the field rather than in the semiconductor foundry. Given this flexibility, an obvious question arises: what are FPGAs made of that makes them so flexible?
An FPGA consists of thousands of Configurable Logic Blocks (CLBs) embedded in a sea of programmable interconnects.
A typical model of an FPGA chip is shown in the figure below. It also contains I/O blocks, which are designed and numbered according to function.
The programmable interconnects are made of horizontal and vertical routing channels and PSMs (Programmable Switch Matrices).
The CLBs are essentially made of Look-up Tables (LUTs) with a fixed number of inputs, built on simple memories (SRAM or Flash) that store Boolean functions. The LUTs implement the function generators in a CLB. In the architecture shown here, each of the two function generators receives four independent inputs (F1-F4 and G1-G4).
These function generators can implement any arbitrarily defined Boolean function of four inputs.
For the implementation of sequential circuits, each LUT is coupled with a multiplexer and a Flip-Flop.
Moreover, LUTs can be combined to implement more complex logic functions.
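To make the LUT-plus-flip-flop idea concrete, here is a minimal behavioral sketch of a single logic cell. It is not any vendor's actual primitive: the module name lut4_cell and the parameters INIT and REGISTERED are assumptions chosen for illustration.

```verilog
// Behavioral sketch of one logic cell: a 4-input LUT, a flip-flop, and an
// output multiplexer. All names here are hypothetical.
module lut4_cell #(
    parameter [15:0] INIT       = 16'h8000, // stored truth table; 16'h8000 is a 4-input AND
    parameter        REGISTERED = 0         // 0: combinational output, 1: registered output
) (
    input  wire       clk,
    input  wire [3:0] in,    // the four inputs select one bit of the truth table
    output wire       out
);
    wire [15:0] table_bits = INIT;          // the "memory" behind the LUT
    wire        lut_out    = table_bits[in];// look up the function value

    reg ff = 1'b0;                          // storage element for sequential logic
    always @(posedge clk)
        ff <= lut_out;

    // The multiplexer picks the registered or the combinational result
    assign out = REGISTERED ? ff : lut_out;
endmodule
```

Changing INIT changes the Boolean function without changing the circuit structure, which is the essence of what configuring a LUT means.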
Aside from CLBs and routing interconnects, many FPGAs also contain dedicated hard-silicon blocks for various functions, such as DSP blocks, block RAM, external memory controllers, and PLLs. Today’s FPGAs are powerful devices that can implement many interfaces such as I2C, SPI, CAN, and PCIe. I/Os in FPGAs are grouped in banks, where each bank can independently support a different I/O standard.
Moreover, a newer trend is to include a hard-silicon processor core (such as the ARM Cortex-A9 in Xilinx Zynq) inside the same FPGA. This offers the best of both worlds: the processor can take care of the mundane, non-critical tasks, leaving the FPGA fabric free for the work that needs hardware acceleration.
You may be wondering how all of this works in practice. To give you the bigger picture before going further, here is a brief explanation.
The number of CLBs determines the complexity of the designs an FPGA can hold. The functionality of the CLBs and PSMs is described in Verilog, VHDL, or another hardware description language. The vendor’s synthesis tool uses the hardware description written by the designer to find an optimized arrangement of the FPGA’s resources that implements the described functionality. After programming, the CLBs and PSMs are configured on-chip and connected to one another via the routing channels.
Three primary reasons to Choose FPGA Design over ASIC Design
Before getting to the core of this section, let us first understand the fundamental difference between an ASIC and an FPGA through an analogy.
Consider building a palace out of Lego blocks versus building one out of concrete. The former is analogous to an FPGA, the latter to an ASIC. You can reuse the Lego blocks to build another design, but the concrete palace is permanent.
There are three significant advantages of using an FPGA over an application-specific integrated circuit (ASIC) in a prototype or in limited-production designs, listed below.
- Flexibility & Performance - Where performance is king, FPGAs set themselves apart in highly parallelized tasks. Modern microprocessors run on many cores with out-of-order execution, which is not well suited to workloads such as massive image processing or digital signal processing.
Moreover, an FPGA may have several hard or soft microprocessors operating inside one package. Why make room for two separate devices that require a physical interconnection?
The FPGA also has a more straightforward design cycle and requires less manual intervention. The software handles much of the routing, placement, and timing automatically to match the programmed specifications.
- Reusability - Since FPGAs are reprogrammable, they are reusable, which makes them well suited to rapid prototyping, and mistakes are not expensive.
The reconfigurability of a standard FPGA leaves ASICs in the dust. Beyond the hard and soft IP cores designed for a particular application, the real value lies in being able to reconfigure (and reconfigure again) after deployment, something that ASICs cannot do.
- Quick acquisition - An FPGA usually has a faster time-to-market because it is not pre-designed to perform specific tasks. One can buy an off-the-shelf FPGA and then configure it for the required design.
Before going deeper into the FPGA, let us first look at its diverse applications to see what it offers. Without further ado, here is the list –
- Best-in-class security for secured data and connectivity
- Low power for optimal power efficiency
- Lowest cost of ownership
- Increased cybersecurity.
- Increasing automation in vehicles and weapons
- Battlefield portability and high mission life
- Lower physical and carbon footprint
- IoT growth with minimal energy consumption
- Delivers 4k video
- Rise of cloud services requiring decentralized, secure computing
- Increased networking of factory automation
- Portability becoming more prevalent
- Heterogeneous computing capabilities maintaining low latency
- Re-programmability and adaptability to sparse data and variable-precision weights
FPGA Applications Infographics:
Moreover, FPGAs hit a sweet spot for processing power between ASICs and GPUs. An ASIC provides incredible performance per watt but is much harder to design and is expensive. A GPU, on the other hand, is easy to program but less efficient.
The following are some of the features of an FPGA that make it a suitable choice for processing-intensive work.
- Parallel Processing – One advantage that makes FPGAs a good choice for measurement systems and other computing applications that process large amounts of data is that they can process in parallel. In contrast, CPUs largely work sequentially, which makes them a less practical choice for such workloads.
- FPGAs Perform Time-Critical Processing – Because they process data in parallel, FPGAs have low latency, making them suitable for applications requiring time-critical calculations, such as software-defined radio, medical devices, and mil-aero systems.
- FPGAs Have Optimal Performance per Watt – Compared with a CPU or GPU, one gets better performance per watt with an FPGA. Power consumption can be almost 3 to 4 times lower than that of a GPU. The operating cost of an ASIC is by far the best, but its high initial price does a lot to offset that.
FPGA Development Process:
FPGA programming, or the FPGA development process, is the planning, designing, and implementing of a solution on an FPGA. It all starts with developing an idea, then turning the idea into code. The RTL code is then simulated and debugged for functional verification. After all the checks pass, the code is ready to be downloaded into the FPGA.
The FPGA development process can be broadly divided into three stages: design, verification, and implementation.
In the design stage, we focus on turning our initial idea into an actual FPGA design. This typically involves architecting the chip, or breaking it down into smaller blocks that together form the entire design. We then implement each of these blocks using an HDL or another methodology.
In the implementation stage, the HDL design is converted into a programming file, which is then loaded onto the FPGA.
FPGA Design Flow
The FPGA design flow comprises several steps, including design entry, synthesis, implementation, and device programming. All these steps involved in the design flow are explained in detail below.
Design entry can be done through schematics or a Hardware Description Language (HDL). You can also combine the two and get the best of both, using tools that convert schematics into HDL and vice versa. Generally, HDL is the better option for complex designs. This quicker, language-based process frees you from having to design at the lower hardware level.
The schematic approach is a good choice for designers who come from a hardware background, because it gives them more visibility into the design.
Each approach has benefits and drawbacks. While a schematic-based design is easier to read and comprehend, it only works for smaller projects. HDL-based methods, on the other hand, tend to be fast and easy to implement, and HDL is today the most popular design entry method for FPGAs.
After design entry, the design is synthesized: the code is translated into a circuit that contains gates, flip-flops, multipliers, etc. The HDL is converted into a netlist, which includes the logic elements and the interconnects required, in the specified hierarchy.
The process begins with a syntax check of the HDL-based design. The design is then optimized by reducing logic, eliminating redundant logic, and reducing its size. In the last step, technology mapping, the design is mapped to the logic resources of the target technology.
Dedicated synthesis tools perform FPGA synthesis. Many EDA companies, such as Cadence, Synopsys, and Mentor Graphics, develop tools for FPGA synthesis.
In the implementation phase, the layout of the design is determined. It consists of three steps: translate, map, and place & route. The FPGA vendor provides the tool used in this phase, since the vendor knows how to translate the generated netlist onto its FPGA.
The first step for the tools is to gather the netlist files and all the constraints set by the user. These constraints contain the pin positions of the I/Os and can include timing requirements such as maximum delays, false paths, etc.
Then the tool maps the implementation by comparing the resource requirements specified in the files with the resources available on the FPGA being used. The circuit is divided into sub-blocks that fit the logic blocks or elements. As a result, your entire design is assigned to specific logic blocks and is ‘mapped’ onto the FPGA.
The next step is connecting and routing all the signals according to the constraints set by the user.
FPGA Program Memory
Once we have finished our FPGA design, we create a programming file that tells the FPGA how to configure itself.
The FPGA uses this file to configure the internal LUTs and interconnections, as well as other internal components such as PLLs, DSP cores, DDR controllers, BRAM, etc.
Modern FPGA devices use SRAM-type memory to store this information because of its high speed.
One drawback of SRAM-type memory is that an external memory chip is needed to store the program, because SRAM is volatile: it cannot retain its state once the power is off. To tackle this issue, some vendors offer FPGAs that use flash memory instead of SRAM, as this memory type is non-volatile.
However, SRAM-based FPGAs can typically run at higher clock speeds, which is why they are more popular.
5. FPGA Verification Process
After the HDL code is written, it needs to be tested. This process is called verification.
The first stage of verifying the design is simulation. To simulate the design, a testbench is created to generate the inputs to the design.
The outputs of the design are then checked against the expected functionality. If the outputs are as expected, the simulation is said to have passed.
Typically, simulation is the primary process in the verification of our design. We usually complement it with hardware testing to ensure that the FPGA interfaces work as expected with all external circuitry.
However, as FPGA designs have become more complex, other techniques have become popular.
This involves running code on our target device and feeding back data to simulation software in both cases. This allows us to run specific, structured tests on our device in near real-time.
This is advantageous because post-place-and-route netlists are extremely slow to simulate compared to functional code. A simulation that takes an hour to run with functional code can easily take a day or more with a post-place-and-route netlist.
6. FPGA Integration
The second-to-last step, before the actual design is deployed, is system integration. A typical digital system built with FPGAs does not consist of a single FPGA; it contains multiple FPGAs, glue logic such as PLDs or TTL parts, and other components such as EPROMs and SRAMs. The model is close to the real system, and a successful simulation indicates that the physical prototype should function well.
7. HDL Design Over to you
HDLs are applied at all levels of design. The initial step is system specification, after which the system is partitioned into two parts, software and hardware. The hardware can be an ASIC, an FPGA, or a PLD (Programmable Logic Device). The software, in turn, acts as the brain of the system and makes the hardware work properly.
The fundamental limitation of an HDL is that it cannot handle analog signals; it can describe only digital systems. Furthermore, HDL models can be separated into different levels of abstraction: HDL (the highest level of abstraction), then gate level, netlist, and layout (the lowest level of abstraction).
The HDL hierarchy can be divided into various levels, as shown in the figure below. On the left side, the top module is divided into two sub-modules, and each sub-module can be further divided into basic-level modules. This is easier to see when compared with the example (a FULL ADDER) on the right half of the figure.
Verilog or VHDL – The Usual Query
There are two main HDLs: Verilog and VHDL. Designers have different opinions on which language is better, but it is less about which language is better and more about which language YOU prefer. I should also mention that there is, of course, SystemVerilog, but it is very closely related to Verilog, so we will leave that for another day.
Let us look at these languages and see what the differences are.
VHDL stands for Very High-Speed Integrated Circuit Hardware Description Language. One of the key features of VHDL is that it is a strongly typed language with predefined data types (integer, character, etc.). Every value or variable defined in the language has one of these data types.
VHDL is more verbose than Verilog and has a non-C-like syntax. With VHDL, you will likely end up writing more lines of code.
Verilog is a more compact language, as it is more of a hardware modeling language. You will end up writing fewer lines of code, and it bears similarities to the C language. Verilog has a better grip on hardware modeling, yet it has fewer high-level programming constructs.
Verilog is not as verbose as VHDL, which is why it is more compact. Verilog differs from VHDL in many ways: there are a few similarities, but they are outweighed by the differences.
Looking at this example code, we can compare how a MUX can be described in VHDL and in Verilog.
The structure of the two programs is the same; you can reasonably follow what each version of the code is doing. The VHDL version is longer than the Verilog one, yet it tends to be easier to understand.
The final verdict in this contest cannot be reached easily. It comes down to personal preference. The choice rests with the readers: weigh the two and decide which you prefer.
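The side-by-side listings are not reproduced in this text. As a stand-in for the Verilog half only, a 2-to-1 MUX might be sketched as follows; the module name mux2 and the port names are assumptions, not the names used in the original comparison.

```verilog
// 2-to-1 multiplexer: y follows b when sel is 1, otherwise a.
// Module and port names are illustrative.
module mux2 (
    input  wire a,
    input  wire b,
    input  wire sel,
    output wire y
);
    assign y = sel ? b : a;
endmodule
```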
Nevertheless, since we need to discuss some basic HDL programming, let us stick to Verilog.
Learning Verilog itself is not a difficult task, but creating a good design can be. Here we focus on simple designs and will do our best to explain things as simply as possible.
If you are used to procedural languages such as C and C++, you will have to understand that not everything happens sequentially in the digital world. Many things happen in parallel as well. C programs run on microprocessors, which execute instructions one at a time, in sequence.
So it is easy to write a program in which things happen one step at a time.
This is the weak spot of microprocessors/microcontrollers. They can do just one thing at a time (I am, of course, talking about single-core devices!). Unlike microprocessors, however, digital circuits (FPGAs, CPLDs, and ASICs) can do many things simultaneously.
Furthermore, working with an HDL requires learning to think of many things happening at the same time rather than many things happening one after another.
Let’s look at an overview of Verilog and some design guidelines that can be followed while implementing a design.
Coding style significantly affects how an FPGA design is implemented and, ultimately, how it performs. Although many popular synthesis tools have significantly improved optimization algorithms for FPGAs, it is still the designer’s responsibility to produce HDL code that guides the synthesis tools and achieves the best result for a given design. This section gives Verilog HDL design guidelines for both novice and experienced designers.
Fundamentally, Verilog is all about creating modules, interconnecting them, and managing the timing of interactions.
An HDL design can be synthesized either as a flat module or as many small hierarchical modules. Each methodology has its advantages and disadvantages. Since a design in smaller blocks is simpler to keep track of, applying a hierarchical structure to large and complex FPGA designs is the best approach. The hierarchical coding methodology allows a group of engineers to work on one design simultaneously. It speeds up design compilation and shortens the design period by permitting the reuse of design modules in current and future designs.
Verilog can represent the hierarchy of a design. The Verilog structures that are used to build the hierarchy are:
A Verilog model is made up of modules. A module is the fundamental unit of the model, and it may be composed of instances of other modules. A module made up of other modules is known as a parent module, and the instances are called child modules.
In this example, there are four modules: system, comp_1, comp_2, and sub_3. The system is the parent of comp_1 and comp_2, and comp_2 is the parent of sub_3. comp_1 and comp_2 are the children of the system, and sub_3 is the child of comp_2.
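The example listing is not reproduced in this text. A minimal sketch of the hierarchy described above could look like the following; the module names come from the text, while the ports, instance names, and the logic inside each module are assumptions made up purely for illustration.

```verilog
// Leaf module: child of comp_2 (the logic is a placeholder)
module sub_3 (input wire a, output wire y);
    assign y = ~a;
endmodule

// Leaf module: child of system (the logic is a placeholder)
module comp_1 (input wire a, input wire b, output wire y);
    assign y = a & b;
endmodule

// comp_2 is a child of system and the parent of sub_3
module comp_2 (input wire a, output wire y);
    sub_3 u_sub_3 (.a(a), .y(y));
endmodule

// system is the parent of comp_1 and comp_2
module system (input wire a, input wire b, output wire y1, output wire y2);
    comp_1 u_comp_1 (.a(a), .b(b), .y(y1));
    comp_2 u_comp_2 (.a(a), .y(y2));
endmodule
```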
A module is defined as shown below:
The <module_name> is the type or name of this module. The <portlist> is the list of ports connected to the module.
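As a concrete instance of that template, here is a small sketch in which half_adder plays the role of <module_name> and (a, b, sum, carry) is the <portlist>. The half adder itself is just an illustrative choice and does not come from the original listing.

```verilog
module half_adder (a, b, sum, carry);   // <module_name> (<portlist>)
    input  a, b;                        // port directions are declared inside the module
    output sum, carry;

    assign sum   = a ^ b;               // module body
    assign carry = a & b;
endmodule
```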
Here are some tips for building hierarchical structures:
- The top level should only contain instantiation statements to call all major blocks.
- Any I/O instantiations should be at the top level.
- Any signals going into or out of the devices should be declared as input, output, or bidirectional pins at the top level.
Port Connections at Instantiations In Verilog
There are two ways of specifying connections among ports of instances.
A) By ordered list (positional association)
This is the more intuitive method. The connected signals must appear in the module instantiation in the same order as the ports listed in the module definition.
B) By name (named association)
When there are too many ports in a large module, it becomes difficult to track the order. Connecting the signals to the ports by the port names increases readability and reduces possible errors.
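Both connection styles can be sketched on the same small module. The names and_gate, top, and the signal names below are assumptions used only for illustration.

```verilog
module and_gate (input wire a, input wire b, output wire y);
    assign y = a & b;
endmodule

module top (input wire p, input wire q, output wire r1, output wire r2);
    // A) Ordered list: signals must follow the port order a, b, y
    and_gate u_ordered (p, q, r1);

    // B) By name: each port is named explicitly, so the order is free
    and_gate u_named (.y(r2), .a(p), .b(q));
endmodule
```

Both instances behave identically; the named form simply stays correct if the port order of and_gate is ever changed.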
Verilog Parameters and its Instantiations
In Verilog, parameters are constants and do not belong to any other data type, such as the register or net data types.
A parameter is assigned a constant expression, that is, a constant number or a previously defined parameter. We cannot modify parameter values at runtime, but we can override a parameter value using the defparam statement.
The defparam statement can modify parameters only at compilation time. Parameter values can also be overridden using the # parameter-value syntax in a module instantiation.
The values of parameters can be overridden during instantiation so that each instance can be customized separately. Alternatively, a defparam statement can be used for the same purpose.
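The two override mechanisms can be sketched as follows. The counter module, its WIDTH parameter, and all instance names are assumptions introduced for this example.

```verilog
// A counter whose width is set by a parameter (default 4 bits)
module counter #(parameter WIDTH = 4) (
    input  wire             clk,
    input  wire             rst,
    output reg  [WIDTH-1:0] count
);
    always @(posedge clk)
        if (rst) count <= {WIDTH{1'b0}};
        else     count <= count + 1'b1;
endmodule

module top (
    input  wire        clk,
    input  wire        rst,
    output wire [3:0]  c4,
    output wire [7:0]  c8,
    output wire [11:0] c12
);
    // Default value: WIDTH = 4
    counter u_default (.clk(clk), .rst(rst), .count(c4));

    // Override during instantiation with #( )
    counter #(.WIDTH(8)) u_byte (.clk(clk), .rst(rst), .count(c8));

    // Override with defparam, resolved at compilation time
    counter u_wide (.clk(clk), .rst(rst), .count(c12));
    defparam u_wide.WIDTH = 12;
endmodule
```

Many style guides prefer the #( ) form because the override sits next to the instance it affects.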
Terminologies and basic concepts
Before going deep down into the Verilog coding styles, let us look at some of the terminologies and basics related to HDL coding.
Reg and wire
Data types in Verilog are divided into nets and registers. These data types differ in how they are assigned and how they hold values, and they represent different hardware structures.
Net variables represent the physical connections between structural entities. These variables do not store values (except trireg); they take the value of their drivers, which changes continuously with the driving circuit. Some net data types are wire, tri, wor, trior, wand, triand, tri0, tri1, supply0, supply1, and trireg. Wire is the most frequently used type. A net data type must be used when a signal is:
- Driven by the output of some device.
- Declared as an input or in-out port.
- On the left-hand side of a continuous assignment.
Register variables are used in procedural blocks and hold their values from one assignment to the next. An assignment statement in a procedure acts as a trigger that changes the value of the data storage element. Some register data types are reg, integer, time, and real. Reg is used for describing logic, integer for loop variables and calculations, real in system modules, and time and realtime for storing simulation times in test benches.
- The reg variables are initialized to x at the start of the simulation. Any wire variable not connected to anything has the value z (high impedance).
- The size of a register or wire may be specified during the declaration.
- Registers and wires are declared as vectors when the reg or wire size is more than one bit.
- The Verilog HDL value set consists of four fundamental values: 0 (logic zero or false), 1 (logic one or true), x (unknown), and z (high impedance).
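A short simulation-only sketch of these points is shown below; the module and signal names are assumptions.

```verilog
module reg_wire_demo;
    reg        r;          // single-bit register: x until first assigned
    wire       w;          // single-bit net with no driver: z
    reg  [7:0] data;       // 8-bit vector register
    wire [7:0] data_inv;   // 8-bit vector net

    // A net sits on the left-hand side of a continuous assignment
    assign data_inv = ~data;

    initial begin
        // Nothing has been assigned yet, so r shows x and w shows z
        $display("r=%b w=%b", r, w);

        data = 8'h0F;      // registers are assigned inside procedures
        #1 $display("data=%h data_inv=%h", data, data_inv);
    end
endmodule
```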
There are two structured procedure statements, namely initial and always. They are the basic statements for behavioral modeling, within which all other behavioral statements are declared. They cannot be nested, but a module may contain many of them.
a) initial statement
The initial statement executes exactly once and becomes inactive when it finishes. If there are multiple initial statements, they all start to run concurrently at time 0.
b) always statement
The always statement continuously repeats itself throughout the simulation. If there are multiple always statements, they all start to execute concurrently at time 0. Always statements may be triggered by events using an event control list @( ).
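The following simulation sketch shows one initial block and two always blocks side by side; the names proc_demo, clk, and count are assumptions.

```verilog
module proc_demo;
    reg       clk;
    reg [3:0] count;

    // Runs once, starting at time 0
    initial begin
        clk   = 1'b0;
        count = 4'd0;
        #20 $finish;
    end

    // Repeats for the whole simulation: toggles clk every time unit
    always #1 clk = ~clk;

    // Triggered by an event in the @( ) list
    always @(posedge clk)
        count <= count + 4'd1;
endmodule
```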
Task and Function
Tasks and functions in Verilog closely resemble the procedures and functions in programming languages.
- Both tasks and functions are defined locally in the module in which they will be invoked.
- No initial or always statement may be defined within either tasks or functions.
- Tasks and functions are different: tasks may have 0 or more arguments of type input, output, or inout; a function must have at least one input argument.
- Tasks do not return a value but pass values through output and inout arguments; functions always return a single value and cannot have output or inout arguments.
- Tasks may contain delay, event, or timing control statements; functions may not.
- Tasks can invoke other tasks and functions; functions can only invoke other functions, but not tasks.
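A small sketch contrasting the two is given below. The names max8 and delayed_swap are hypothetical helpers written for this example.

```verilog
module task_func_demo;
    // Function: at least one input, returns one value, no timing controls
    function [7:0] max8;
        input [7:0] a, b;
        begin
            max8 = (a > b) ? a : b;
        end
    endfunction

    // Task: passes results back through inout arguments and may contain delays
    task delayed_swap;
        inout [7:0] x, y;
        reg   [7:0] tmp;
        begin
            #1 tmp = x;
            x = y;
            y = tmp;
        end
    endtask

    reg [7:0] p, q;

    initial begin
        p = 8'd3;
        q = 8'd9;
        $display("max = %0d", max8(p, q));   // prints "max = 9"
        delayed_swap(p, q);
        $display("p = %0d, q = %0d", p, q);  // prints "p = 9, q = 3"
    end
endmodule
```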
Sequential & Parallel Blocks
Block statements group multiple statements together. Block statements can be either sequential or parallel. They can be nested, and they can be named, which allows direct access to them and lets them be disabled.
a) Sequential block
Sequential blocks are delimited by the pair of keywords begin and end. The statements in sequential blocks are executed in the order they are specified, except for nonblocking assignments.
b) Parallel block
Parallel blocks are delimited by the pair of keywords fork and join. The statements in parallel blocks are executed concurrently. Hence, the order of the statements in parallel blocks is immaterial.
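The difference shows up most clearly when delays are involved. In this sketch (block and signal names are assumptions), the same two delayed assignments finish at different times depending on the block type.

```verilog
module block_demo;
    reg a, b, c, d;

    // Named sequential block: delays accumulate
    initial begin : seq_block
        #5 a = 1'b1;    // assigned at time 5
        #5 b = 1'b1;    // assigned at time 10
    end

    // Named parallel block: each delay is relative to the start of the block
    initial fork : par_block
        #5 c = 1'b1;    // assigned at time 5
        #5 d = 1'b1;    // also assigned at time 5
    join
endmodule
```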
a) Continuous assignment
Continuous assignments are always active — changes in RHS (right-hand side) expression are assigned to LHS (left-hand side) net. LHS must be a scalar or vector of nets, and assignment must be performed outside procedure statements.
A delay may be associated with the assignment, so that new changes in the expression are assigned to the net after the delay. Note, however, that this is an inertial delay: if the expression changes again within the delay, the earlier change is never assigned to the net, and only the last change takes effect.
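The following simulation sketch illustrates a continuous assignment with and without a delay, including a pulse that is shorter than the delay. The signal names and the 3-unit delay are assumptions.

```verilog
module cont_assign_demo;
    reg  a, b;
    wire y_now;
    wire y_late;

    assign    y_now  = a & b;   // follows a & b immediately
    assign #3 y_late = a & b;   // follows a & b after 3 time units

    initial begin
        a = 1'b1;
        b = 1'b0;
        #10 b = 1'b1;   // t=10: y_late rises at t=13
        #10 b = 1'b0;   // t=20: y_late falls at t=23
        #10 b = 1'b1;   // t=30: start of a 1-unit pulse
        #1  b = 1'b0;   // t=31: pulse is shorter than the delay, y_late stays 0
        #10 $finish;
    end
endmodule
```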
b) Procedural assignment
LHS must be a scalar or vector of registers, and assignment must be performed inside procedure statements (initial or always). The assignment is only active (evaluated and loaded) when control is transferred to it.
After that, the value of the register remains until another procedural assignment reassigns it.
There are two types of procedural assignments:
- Blocking assignment
Blocking assignments are executed in the order specified in the sequential block, i.e., a blocking assignment waits for the previous blocking assignment at the same time step to complete before executing.
- Nonblocking assignment
Nonblocking assignments are executed concurrently within the sequential block, i.e., a nonblocking assignment executes without waiting for other nonblocking assignments at the same time step to complete.
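The classic way to see the difference is an attempted swap. This is a simulation-only sketch with assumed signal names.

```verilog
module assign_demo;
    reg [3:0] a, b, c, d;

    initial begin
        a = 4'd1; b = 4'd2;
        c = 4'd1; d = 4'd2;

        // Blocking: the second statement sees the already updated a
        a = b;
        b = a;          // a and b both end up as 2

        // Nonblocking: both right-hand sides are sampled first,
        // then the assignments take effect together
        c <= d;
        d <= c;         // c and d are swapped

        #1 $display("a=%0d b=%0d c=%0d d=%0d", a, b, c, d);
        // prints "a=2 b=2 c=2 d=1"
    end
endmodule
```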
Execution of a statement can be delayed by a fixed-time period using the # operator.
With an intra-assignment delay, the RHS expression is evaluated immediately, but the assignment to the LHS, which must be a register, is delayed by a fixed period.
Execution of a statement can also be triggered by a change of value in a register or a net. The @ operator captures such a change of value within its event list. To allow multiple triggers, use or between the events.
The @ is edge-sensitive. To achieve level-sensitive behavior, use additional if statements to check the value of each signal.
Alternatively, the combination of always and wait can be used. However, note that wait is a blocking statement, i.e., wait blocks the following statement until the condition is true.
An event can also be declared by name; it is explicitly triggered (with the -> operator) and recognized (with the @ operator). Note that a named event cannot hold any data.
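The timing controls discussed above (#, @, wait, and named events) can be combined in one simulation sketch. All signal and event names below are assumptions.

```verilog
module timing_demo;
    reg       clk, rst, ready;
    reg [3:0] q;
    event     done;                  // named event: carries no data

    initial begin
        clk = 1'b0; rst = 1'b1; ready = 1'b0;
        #3 rst = 1'b0;               // regular delay: wait 3 units, then execute
        ready = #4 1'b1;             // intra-assignment delay: RHS evaluated now,
                                     // assignment happens 4 units later
        #20 $finish;
    end

    always #1 clk = ~clk;            // fixed delay between executions

    // Edge-sensitive: runs on a rising edge of clk or of rst;
    // the if statement then checks the level of rst
    always @(posedge clk or posedge rst)
        if (rst) q <= 4'd0;
        else     q <= q + 4'd1;

    // Level-sensitive: wait blocks until ready is true
    initial begin
        wait (ready);
        -> done;                     // trigger the named event
    end

    // Recognize the named event
    initial begin
        @(done);
        $display("done event seen at time %0t", $time);
    end
endmodule
```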
8. Conditional Statements
The body only allows a single statement. If multiple statements are desired, block statements may be used to enclose various statements in place of the body.
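A brief sketch of if and case statements, using begin/end where a body needs more than one statement, might look like this. The module cond_demo and its ports are assumptions.

```verilog
module cond_demo (
    input  wire [1:0] sel,
    input  wire       en,
    output reg  [3:0] y,
    output reg        valid
);
    always @(*) begin
        if (!en) begin               // begin/end makes two statements one body
            y     = 4'b0000;
            valid = 1'b0;
        end else begin
            valid = 1'b1;
            case (sel)               // each case item takes a single statement
                2'b00:   y = 4'b0001;
                2'b01:   y = 4'b0010;
                2'b10:   y = 4'b0100;
                default: y = 4'b1000;
            endcase
        end
    end
endmodule
```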
c) Loop Statements
The body only allows a single statement. If multiple statements are desired, block statements may be used to enclose multiple statements in place of the body.
In a repeat loop, iterations are based on a constant instead of a conditional expression.
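The loop forms can be sketched together in one simulation-only module. The names and the data value are assumptions.

```verilog
module loop_demo;
    integer   i;
    reg [7:0] data;
    reg [3:0] ones;

    initial begin
        data = 8'b1011_0010;         // contains four 1 bits
        ones = 4'd0;

        // for loop with a single-statement body: count the 1 bits
        for (i = 0; i < 8; i = i + 1)
            ones = ones + data[i];

        // while loop with a begin/end block as its body
        i = 0;
        while (i < 3) begin
            $display("while iteration %0d", i);
            i = i + 1;
        end

        // repeat loop: the iteration count is a fixed number
        repeat (2)
            $display("ones = %0d", ones);
    end
endmodule
```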
After such a detailed look at Verilog, I guess we are ready to get hands-on with it. Let us design a NOT gate in Verilog, simulate it, and make it ready to test on real hardware. A NOT gate (an inverter) is the simplest of all gates.
The output of an inverter is the negation of the input, i.e., B = !A, where A is the input and B is the output. The table below summarizes the behavior of the NOT gate as a truth table.
“NOT” gate can be considered a module with one input and one output with an internal behavior of B=!A. The graphical representation of the inverter module is shown below.
Let us see how we would represent this in Verilog.
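The listing itself is not reproduced in this text. A minimal sketch that matches the walkthrough that follows (a module named my_module, ports A and B declared first and given directions afterwards, and a single assign statement) could look like this:

```verilog
module my_module (A, B);
    input  wire A;      // port A is the input
    output wire B;      // port B is the output

    assign B = !A;      // the output is the negation of the input
endmodule
```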
Extremely basic, right? Let us go through it line by line and try to understand what is happening in this code snippet.
The module is named “my_module” and is declared using the “module” keyword, which defines the module and assigns two ports to it. Everything that belongs to the module is placed between the “module” and “endmodule” keywords. At this point, the size and direction of the two ports are not yet known.
In the next lines, port A and port B are declared as input and output, respectively. Finally, an assign statement is used to get the desired output.
Ok, so now we have a module ready, let’s run a simulation on the module and see if it works as expected. To simulate the module, we need to give input to the module. The input is given through a testbench.
The test bench is designed to generate the necessary inputs for the module under test (here, “my_module”). A test bench is nothing but another Verilog module that generates some signals and feeds them to the module under test.
During simulation, the test bench should be the “top module” (top-level module), with no I/O ports. But when it comes to implementation on a real FPGA, the “top module” can have I/O ports, and test benches won’t be the top modules there. So here goes the test bench code.
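The original test bench listing is not reproduced in this text. The sketch below is reconstructed from the description that follows: the wire out, the reg clock, the instance name notGate, the one-unit toggle, and the $finish after 10 time units all come from that description, while the test bench module name my_module_tb is an assumption. The module under test is repeated so that the block simulates on its own.

```verilog
// Module under test
module my_module (A, B);
    input  wire A;
    output wire B;
    assign B = !A;
endmodule

// Test bench: a top-level module with no I/O ports
module my_module_tb;
    wire out;
    reg  clock;

    // Invert clock after every one-time-unit delay
    always begin
        #1 clock = !clock;
    end

    initial begin
        clock = 0;       // initialize clock; otherwise it stays unknown
        #10 $finish;     // end the simulation after 10 time units
    end

    // clock drives port A, out is connected to port B
    my_module notGate (clock, out);
endmodule
```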
Let me break this down for you.
The test bench is just another module with no I/O ports, as mentioned earlier. A wire named “out” and a reg named “clock” are created. A clock is generated on reg “clock” by periodically inverting it, and it is fed to the input (port A) of my_module. The wire “out” is connected to the output port (port B) of my_module.
The result should appear on the wire “out” in the simulation.
The “always” block will keep on executing as long as the simulation runs.
In the “always” block, the reg “clock” is inverted after every one-time-unit delay. The symbol # is a way to specify a delay in Verilog. So the always block keeps executing for as long as the simulation is running, and the clock inside it is inverted continuously to generate a square wave. Remember that the # symbol is not a synthesizable element. We have to find another way (using flip-flops) if a delay is needed in the design. But it works just fine for simulation.
The next block is an initial block. As already explained above in the section “Terminologies and basic concepts”, this block is executed only once, at time t = 0. So anything that we need to initialize should go here. The initial block is also usually used only in test benches.
Initial blocks are rarely used in synthesizable code; reset logic is created if initialization is required. We initialize the reg “clock” to zero. This is very important. If we don’t initialize a register, its value is unknown, and no matter how many times an unknown is inverted, the result will always be unknown. That is, if we leave “clock” uninitialized, no clock will be generated.
The last part of the initial block is the $finish system task. It is placed after a 10-time-unit delay; this means that after simulating the design for 10 time units, the simulator will stop running.
Last but not least, the module instantiation. The statement “my_module notGate(clock, out)” creates an instance of the module “my_module” with the name “notGate.” You can create as many instances as you want from a module. One crucial thing here is the wiring. If you look at the code, you can see the reg “clock” placed as the first connection, and the wire “out” as the second.
The reg “clock” is connected to port A of the module instance, and wire “out” is connected to port B.
Now both the test bench and the module to be tested are ready. You need an HDL simulator to simulate them. Simulators such as ISE Simulator, ModelSim, and Questa are available.
While simulation can tell us many things about the correctness of our module, there is nothing like putting it on a piece of hardware and seeing it working. To run the design on the actual board, we need to synthesize it and implement it on the hardware.
As mentioned earlier, the test bench code is used only for simulation. To synthesize our module, we have to leave the test bench code out. As discussed earlier, HDL synthesis is the step where the HDL (Verilog, VHDL, or any other HDL for that matter) is interpreted, and an equivalent hardware topology is generated. This hardware topology will be particular to the target FPGA selected.
Let’s come back to our module and think about how we can implement it on the hardware. As we know, the output of a NOT gate is always the negation of the input. There are many possible hardware configurations to test this module. The easiest would be with a switch and an LED. See the proposed hardware configuration in the picture below.
In the above diagram, a switch is connected to an input that is pulled up to VCC through a resistor. The output is connected to an LED.
When the switch is open, there is a positive voltage, i.e., a logic “1”, at the input (A) of the NOT gate. That means the output (B) is at logic “0”, so the LED is OFF. When the switch is closed, the input of the NOT gate becomes logic “0”, the output switches to logic “1”, and the LED glows.
Now we know the basic hardware requirements.
We need the following in our prospective hardware.
- An input capable IO with a pull-up resistor and a switch attached.
- An output capable IO with an LED connected.
You can select any FPGA board to test this module’s functionality. We now have a Verilog module that we want to implement, and we have selected a hardware platform and decided which IOs to use for implementation.
The module to be tested has two ports: Port A, the input, and Port B, the output. Now an obvious question arises: how are we going to attach Port A to the specific pin (switch) of the hardware and Port B to the particular pin (LED) of the hardware?
It is done by defining user constraints. User constraints tell the placement and routing logic of the implementation tools which physical pins the module signals are connected to. A list of constraints is made, placed in a file, and included in the project. This file is called a User Constraints File.
For the Lattice Radiant tool, it is the .pdc file, and for Xilinx ISE, it is the .ucf file (the newer Vivado tool uses .xdc).
After the user constraints file is written, a bitstream file is generated and downloaded onto the selected board using the programmer tool. The functionality of the module is then tested on the hardware.
Below are the coding style rules that prove to be the most beneficial. Note that these are recommended for both VHDL and Verilog to keep consistency. There are three main benefits to adopting the coding style below.
- Increased readability of code
- Improved thoughtfulness of code writing
- Code is less error-prone
i_ and o_ prefix
An essential style you should adopt! Too many designers do not indicate if their signals are inputs or outputs from an entity/module.
It can be challenging and annoying to look through the code to determine the direction of a signal. Additionally, an output signal named “data” will be much harder to find in your code via a search than a signal named “o_data.” Examples: i_address, o_data_valid.
r_ and w_ prefix
This is the second most crucial style you need to use. Indicating whether your signal is a register or a wire is vital to writing good code.
Verilog is nice because it forces you to declare your signal as a reg or a wire, but VHDL has no such requirement! Therefore this style is essential for VHDL coders.
All signals declared with r_ should have initial conditions. Signals declared with w_ should never appear on the left-hand side of an assignment operator in a sequential process (in VHDL) or a clocked always block (in Verilog). Examples: r_Row_Count, w_Pixel_Done.
c_, g_, and t_ prefix
These are helpful indicators when coding. c_ indicates that you are referring to a constant in VHDL or a parameter in Verilog. g_ is used for all VHDL generics. t_ indicates that you are defining your own data type.
Examples: c_NUM_BYTES, t_MAIN_STATE_MACHINE.
All capital letters are used to define states in state machines. e.g., IDLE, DONE, CLEANUP.
Whether or not you want to capitalize your signal names is up to you. Whatever you choose, stay consistent! VHDL is not case-sensitive, so r_ROW_COUNT is the same as r_Row_Count, but this is not true in Verilog. Verilog is case-sensitive, so maintaining rules about capitalization is very important! You don’t want to accidentally create two different signals when you meant to make just one signal.
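As an illustration, here is a small sketch that applies the prefixes and the capitalized state names together. The module pixel_counter, its ports, and its widths are hypothetical and exist only to show the naming style.

```verilog
// Hypothetical module: all names and widths are made up for illustration.
module pixel_counter #(
    parameter c_NUM_COLS = 640          // c_ : parameter (constant)
) (
    input  wire i_Clk,                  // i_ : input
    input  wire i_Pixel_Valid,
    output wire o_Row_Done              // o_ : output
);
    // State names in all capital letters.
    localparam IDLE = 1'b0;
    localparam DONE = 1'b1;

    // r_ : registers, each with an initial condition.
    reg [9:0] r_Col_Count = 10'd0;
    reg       r_State     = IDLE;

    // w_ : wire, assigned only outside the clocked always block.
    wire w_Last_Col;
    assign w_Last_Col = (r_Col_Count == c_NUM_COLS - 1);
    assign o_Row_Done = (r_State == DONE);

    always @(posedge i_Clk) begin
        case (r_State)
            IDLE: begin
                if (i_Pixel_Valid) begin
                    if (w_Last_Col) begin
                        r_Col_Count <= 10'd0;
                        r_State     <= DONE;
                    end else begin
                        r_Col_Count <= r_Col_Count + 10'd1;
                    end
                end
            end
            DONE: r_State <= IDLE;      // flag the end of the row for one clock
        endcase
    end
endmodule
```

With this style, the direction and kind of every signal can be read from its name, and a search for o_Row_Done finds only the output.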
Initializing Signals Keynote:
There is a fairly widespread misconception that FPGAs need to have a reset signal into a register to set an initial condition.
This is not true; FPGA registers can have initial values, and they can be initialized to zero or non-zero values. It’s best practice to reset as few flip-flops as possible in your design and to rely on initializing the flip-flops instead. The reason for this is that each reset line you add to a flip-flop takes routing resources and power and makes it harder for your design to meet timing.
The rule you should be following is this: All registers (as identified by r_ prefix) should always have an initial condition applied to them. No wires (as identified by w_ prefix) should EVER have an initial condition applied to them.
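A minimal sketch of this rule follows. It assumes that the synthesis tool and target device honor initial values given in declarations, which is worth confirming in the vendor documentation. The module led_toggle and its signals are hypothetical.

```verilog
// Hypothetical example: module and signal names are illustrative.
module led_toggle (
    input  wire i_Clk,
    output wire o_Led
);
    // Registers (r_ prefix) receive their initial condition at declaration,
    // so no reset input is needed just to define the power-up state.
    reg [23:0] r_Count = 24'd0;
    reg        r_Led   = 1'b0;

    // Wire (w_ prefix): no initial condition, driven by a continuous assignment.
    wire w_Count_Max;
    assign w_Count_Max = (r_Count == 24'hFFFFFF);

    always @(posedge i_Clk) begin
        r_Count <= r_Count + 24'd1;     // wraps to zero after 24'hFFFFFF
        if (w_Count_Max)
            r_Led <= ~r_Led;
    end

    assign o_Led = r_Led;
endmodule
```

Note that the module has no reset port at all; the power-up state comes entirely from the declarations.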
Based on the design requirement, every design on board has to run at a certain specified speed. There are generally three types of speed requirements in any FPGA design:
Timing requirement – how fast or slow a design should run. A target clock period (or clock frequency) and other constraints are required to define the design speed.
Throughput – the average amount of valid output delivered per clock cycle.
Latency – the time between an input arriving and the corresponding valid output becoming available, usually measured in clock cycles.
Throughput and latency are usually related to the design architecture and application, and they need to be traded off based on the system requirement.
For example, high throughput usually means more pipelining, which increases the latency; low latency usually means removing pipeline stages, which lengthens the combinatorial paths and reduces the achievable clock speed and throughput.
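To make the trade-off concrete, here is a sketch of a two-stage pipelined multiply-add. The module name, ports, and bit widths are assumptions chosen for illustration. The design accepts a new input on every clock (throughput of one result per cycle), but each result appears two clock cycles after its inputs (latency of two cycles).

```verilog
// Hypothetical example: computes A * B + C in two pipeline stages.
module mac_pipelined (
    input  wire        i_Clk,
    input  wire [7:0]  i_A,
    input  wire [7:0]  i_B,
    input  wire [15:0] i_C,
    output wire [16:0] o_Result
);
    // Stage 1: multiply, and delay C so it stays aligned with the product.
    reg [15:0] r_Product = 16'd0;
    reg [15:0] r_C_Dly   = 16'd0;
    // Stage 2: add.
    reg [16:0] r_Sum     = 17'd0;

    always @(posedge i_Clk) begin
        r_Product <= i_A * i_B;             // stage 1
        r_C_Dly   <= i_C;
        r_Sum     <= r_Product + r_C_Dly;   // stage 2, uses last cycle's stage-1 values
    end

    assign o_Result = r_Sum;
endmodule
```

Removing the stage-1 registers would cut the latency to one cycle, but the multiplier and adder would then sit in one longer combinatorial path, which limits the clock speed.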
FPGA designers often struggle to meet the timing requirement, that is, to ensure that the design runs at the required clock speed. Closing timing on a high-speed design can require hard work, using various techniques (including the trade-off between throughput and latency, appropriate timing constraint adjustments, etc.) and running through multiple iterations of synthesis, MAP, and PAR. With that said, let’s jump to the last topic of the discussion.
How to Close Timing Violations
In FPGA design, logic synthesis and the related timing closure occur during compilation. Many things, including I/O cell structure, asynchronous logic, and timing constraints, can significantly affect the compilation process, so results can vary with each pass through the toolchain.
Follow these three steps to avoid timing violations.
- Do not let your synthesis tool guess at what you want. Clearly define all I/O pins and critical logic. Be sure to specify the electrical properties of your I/O pins.
- Make your logic 100 percent synchronous and reference all logic to your master clock domain.
- Apply timing constraints to ensure timing closure.
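One common way to follow the second step is to avoid derived clocks and use a clock enable instead, so that every flip-flop is still clocked by the master clock. The sketch below assumes a hypothetical module slow_counter with a divide ratio c_DIVIDE; the names and widths are illustrative only.

```verilog
// Hypothetical example: a counter that advances once every c_DIVIDE clocks,
// without creating a second (divided) clock.
module slow_counter #(
    parameter c_DIVIDE = 4
) (
    input  wire       i_Clk,     // master clock, the only clock in the design
    output wire [7:0] o_Count
);
    reg [15:0] r_Div   = 16'd0;
    reg        r_Tick  = 1'b0;   // one-clock-wide enable pulse
    reg [7:0]  r_Count = 8'd0;

    always @(posedge i_Clk) begin
        // Generate an enable pulse every c_DIVIDE master clock cycles.
        if (r_Div == c_DIVIDE - 1) begin
            r_Div  <= 16'd0;
            r_Tick <= 1'b1;
        end else begin
            r_Div  <= r_Div + 16'd1;
            r_Tick <= 1'b0;
        end

        // The slow logic is still clocked by i_Clk and merely gated by the enable.
        if (r_Tick)
            r_Count <= r_Count + 8'd1;
    end

    assign o_Count = r_Count;
endmodule
```

Because there is only one clock, a single clock period constraint on i_Clk covers every register in the module.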
If you have come this far, you are good to go and can start your FPGA design journey right now. Also, if this guide was useful to you, stay tuned with Logic Fruit Technologies for more blogs on FPGA design and get the ball rolling.