Electronics: VHDL SDRAM Memory Controller

I've completed a new, tested, fully working controller - have a look at Simple_SDRAM_Controller

Although aimed at 100MHz, all of the designs below can be adapted to other clock speeds. The only change needed is to adjust the number of NOPs in the refresh chain so that the refresh takes at least 70ns.
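As a rough sketch of that adjustment, the minimum length of the refresh chain can be worked out from the clock speed. The function name and the example clock speeds are only illustrative, and how many of those cycles are NOPs depends on the particular FSM:

```python
import math

REFRESH_NS = 70  # minimum refresh time quoted in the text


def refresh_cycles(clock_mhz, refresh_ns=REFRESH_NS):
    """Smallest whole number of clock cycles that covers refresh_ns."""
    # One clock period is 1000 / clock_mhz nanoseconds,
    # so the cycle count is refresh_ns * clock_mhz / 1000, rounded up.
    return math.ceil(refresh_ns * clock_mhz / 1000)


# Illustrative clock speeds only
for mhz in (50, 100, 133):
    print(mhz, "MHz needs at least", refresh_cycles(mhz), "cycles")
```

At 100MHz this gives 7 cycles, which matches the shortest refresh chain used in the designs below.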

Adapting to a CAS setting of 2 is only a little more difficult, as the data is available one cycle earlier. A CAS of 2 can only be used with a clock speed of 100MHz, and will make the biggest difference with the simple FSM, where it saves a cycle on every read, or in the most complex design, where it saves a cycle when flipping between reads and writes.

The first priority should be to perform any pending refresh, but whether reads take priority over writes depends on your target application. For example, if you are generating a VGA video signal, reads should take priority over writes, otherwise "tearing" of the picture could occur.
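The priority scheme can be sketched as a small selection function. This is only a behavioural model, not the controller's actual logic; the function name, the argument names and the reads_first switch are made up for illustration:

```python
def next_operation(refresh_pending, read_pending, write_pending, reads_first=True):
    """Pick the next operation: a pending refresh always wins."""
    if refresh_pending:
        return "refresh"
    # The read/write order is application dependent (e.g. reads first for video).
    order = ("read", "write") if reads_first else ("write", "read")
    pending = {"read": read_pending, "write": write_pending}
    for op in order:
        if pending[op]:
            return op
    return "idle"


# A video-style setup: reads win over writes, but never over a refresh
print(next_operation(False, True, True))
print(next_operation(True, True, True))
```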

FSM1 - Simple controller

Based on the information above, here is the design for a simple FSM to access the SDRAM with a burst length of 4 at 100MHz, CAS = 3:

(The blue circles indicate where data is transferred to/from the SDRAM)

Performance:

Read is 11 cycles for four words = 72MB/s @ 100MHz, excluding refresh overhead
Write is 11 cycles for four words = 72MB/s @ 100MHz, excluding refresh overhead
Refresh is 8 cycles.
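The throughput figures can be reproduced with a little arithmetic. This sketch assumes a 16-bit data bus (two bytes per word), which is not stated here but is what the figures imply; the function name is illustrative:

```python
def throughput_mb_s(cycles, words=4, bytes_per_word=2, clock_mhz=100):
    """MB/s for one burst of `words` words that takes `cycles` clock cycles."""
    # bytes per burst * bursts per second, with the clock given in MHz
    return words * bytes_per_word * clock_mhz / cycles


print(round(throughput_mb_s(11), 1))  # 11-cycle burst
print(round(throughput_mb_s(10), 1))  # 10-cycle burst
```

An 11-cycle burst works out to a little under 73MB/s, and a 10-cycle burst to 80MB/s.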

Pros:

Simple to implement
Fast logic - only two nodes have multiple exits, and the choice is simple.
Predictable performance.

Cons:

Poor performance

This can be slightly improved on with a few little changes. These focus around skipping the idle stage where possible.

FSM2 - Optimised simple controller

(The blue circles indicate where data is transferred to/from the SDRAM)

Performance:

Read is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead
Write is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead
Refresh is 7 cycles.
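To get a feel for the refresh overhead that these figures leave out, here is a rough estimate. The refresh requirement used (8192 refreshes every 64ms) is an assumption that is typical of many SDRAM parts but not taken from this page, so check the datasheet for your device:

```python
def refresh_overhead(cycles_per_refresh, refreshes=8192, period_ms=64, clock_mhz=100):
    """Fraction of all clock cycles spent on refresh (assumed refresh rate)."""
    cycles_in_period = period_ms * 1000 * clock_mhz  # ms -> us -> cycles
    return refreshes * cycles_per_refresh / cycles_in_period


print(f"{refresh_overhead(7):.2%} of cycles are spent refreshing")
```

Under those assumptions a 7-cycle refresh costs just under 1% of the available cycles.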

Pros:

Simple to implement
Fast logic - only three nodes have multiple exits, and the choice at these nodes is simple.
Predictable performance.

Cons:

Poor performance

Further improvements can be made by not activating and precharging the row every time.

FSM3 - With back-to-back reads or back-to-back writes

(The blue circles indicate where data is transferred to/from the SDRAM)

Performance:

Single read is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead. For back-to-back reads this gets close to 200MB/s
Single write is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead. For back-to-back writes this gets close to 200MB/s
For mixed read/write workloads performance can be as low as 72MB/s
Refresh is 7 cycles.
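The "close to 200MB/s" figure can be sanity-checked with a simple model. It assumes that the first burst costs the full 10 cycles and that each following back-to-back burst adds only the 4 cycles needed to move its four words, with two bytes per word; these are my assumptions rather than figures from the diagram:

```python
def back_to_back_mb_s(bursts, first_cycles=10, extra_cycles=4,
                      bytes_per_burst=8, clock_mhz=100):
    """MB/s for a run of `bursts` back-to-back bursts on one open row."""
    cycles = first_cycles + (bursts - 1) * extra_cycles
    return bursts * bytes_per_burst * clock_mhz / cycles


for n in (1, 2, 10, 100):
    print(n, "bursts:", round(back_to_back_mb_s(n), 1), "MB/s")
```

A single burst gives 80MB/s, and the figure climbs towards (but never quite reaches) 200MB/s as the run gets longer.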

Pros:

Much improved performance for back-to-back operations (as long as you don't mix reads and writes).
You can choose to allow back-to-back operations in only the write or read sections, optimising for the application.

Cons:

Logic is starting to get complex (and slow).
Unpredictable latency.

One major issue with this design is that it is possible to get stuck in a loop in either the 'read' or 'write' operations. In the unlikely case that this occurs, there is a chance that refresh operations will not be performed as needed. The easy solution would be to not perform back-to-back operations if a refresh operation is pending.
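That fix amounts to one extra condition on the decision to stay in the read or write loop. A minimal sketch, assuming the loop is only taken for a request of the same type on the currently open row; the function and argument names are hypothetical:

```python
def can_continue_burst(current_op, next_op, open_row, next_row, refresh_pending):
    """Decide whether the next request may run back-to-back with the current one."""
    if refresh_pending:
        # Leave the loop so the refresh can be serviced.
        return False
    # Same operation type and same row only; next_op is None when nothing is queued.
    return next_op == current_op and next_row == open_row


print(can_continue_burst("read", "read", 5, 5, refresh_pending=False))
print(can_continue_burst("read", "read", 5, 5, refresh_pending=True))
```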

To improve on this we have to start mixing the read and write operations, as long as they are on the same row. This is where things get complex!

FSM4 - With mixed back-to-back reads and writes

(The blue circles indicate where data is transferred to/from the SDRAM)

Performance:

Single read is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead. For back-to-back reads this gets close to 200MB/s
Single write is 10 cycles for four words = 80MB/s @ 100MHz, excluding refresh overhead. For back-to-back writes this gets close to 200MB/s
For mixed read/write workloads performance can be up to 145MB/s
Refresh is 7 cycles.

Pros:

Much improved performance for back-to-back operations, including mixed reads and writes.

Cons:

Logic is getting complex (and slow).
Unpredictable latency.
Only back-to-back operations are improved. If the requests to the same row are separated by a few clock cycles, the row gets precharged and opened again.

One other issue with this design is that it is possible to get stuck in a loop in either the 'read' or 'write' operations. In the unlikely case that this occurs, there is a chance that refresh operations will not be performed as needed. The easy solution is to not perform back-to-back operations if a refresh operation is pending.

Further improvements - FSM5

The FSM4 design can also be improved on by adding an "idle row activated" state, which would reduce latency for operations that are interspersed with a few idle cycles - down from 10 cycles to 7 for reads, and down from 7 cycles to 4 for writes. These are pretty big improvements.

As it involves a lot more complexity than the above designs, the diagram looks completely different:

(Blue circles are data transfers from the SDRAM, red circles are data transfers to the SDRAM)

Pros:

Nearly a full-featured design - everything but the ability to abort burst transfers is catered for.

Cons:

Very complex to code and test.
Complexity may reduce speed.
Large number of states to understand and manage.
In some cases the design is slower than the simpler designs. For example, going from an idle active row to a completed read in a different row takes three cycles longer.
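The trade-off for reads can be summarised in a small model. The 10 and 7 cycle figures come from the text above; the 13 cycle figure is my reading of "three cycles longer" as 10 + 3, and the function name is only illustrative:

```python
def fsm5_read_cycles(open_row, requested_row):
    """Approximate cycles for a read, given which row (if any) is idle-active."""
    if open_row is None:
        return 10  # no row open: activate, read, precharge as before
    if open_row == requested_row:
        return 7   # row already active: the activate step is skipped
    return 13      # wrong row open: assumed 10 + 3 extra cycles to close it first


print(fsm5_read_cycles(None, 4), fsm5_read_cycles(4, 4), fsm5_read_cycles(4, 9))
```

Whether the extra state pays off therefore depends on how often consecutive requests hit the same row.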

As long as priority is given to getting back to the idle state when a refresh is pending, this seems to be close to an optimal design.

Source code

This is the source code for the FSM.


Saturday, January 3, 2015
