C.R.I.S.P.Y Hardware is all about tools to help you debug hardware faster.
These are cloud-connected hardware debug tools that help you follow the C.R.I.S.P.Y. checklist.
Debugging digital hardware design is not the same as debugging software.
You may have very strong software debugging skills. But I have found that those skills assume that the design is behaving as a digital system. If your hardware design has bugs at the analog level, it can cause the digital behavior to appear unpredictable. This behavior can be very frustrating to debug.
I want to share a method that I teach and use to reduce this frustration. This is a checklist that you can remember using the cutesy mnemonic device 'CRISPY'.
The CRISPY checklist is:
- Clocks
- Reset
- Integrity of Signals
- Power
- You
As the mnemonic does not list the items in the best order, I will explain each item below in the order I have found most useful. I will also be relatively brief in this post. If you have questions about the terms I use or the tools needed, just ask @duppy on Twitter with the #crispyhw tag. If you have a better solution, or a problem whose fix did not fit into this checklist, please share it on Twitter @duppy #crispyhw. I'll bet somebody out there has an even better checklist than this one.
- Power
- Are power and ground shorted?
- I had to ask. It is a good habit to "buzz out" all power and ground signals after every circuit mod before applying power.
- Is your design plugged in?
- Seriously. Double check with a voltmeter or LED.
- If battery powered, is the battery near full? I recommend replacing the battery with a bench supply while debugging.
- What battery chemistry are you using? Remember that voltage levels and discharge curves vary: that 1.5V AAA may actually be only 1.2V by design.
- Are you using too much power?
- A good bench supply will tell you how much average current is being drawn. The average will catch some but not all issues.
- A good bench supply will also have a current limiting feature. Are you allowing sufficient power?
- Often your micro will boot fine and only fail when you start turning on more modules or clocks, or driving too many LEDs or motors. Radios (BLE, Wi-Fi) and displays may also cause brownouts. The failure level can vary greatly from chip to chip. Use a bench supply when possible during debugging, add extra LEDs on purpose to stress power, or write specific firmware that intentionally turns on as many of the micro's internal modules and clocks as possible to stress power.
- Is the resistance of your power cable too high (often because it is too long)?
- This is more common when breadboarding. Check power at the power pin of each IC, not at the power supply.
- If you are using a motor, add a diode!
- See this great tutorial for driving a motor.
- Are you using bypass capacitors of the correct value and more importantly placement?
- Bypass capacitors can be especially tricky because often modern ICs will work fine without them. Until they don't.
- Also, larger is not better with bypass caps and too small doesn't work either. The value and type needed can also change when going from breadboard to PCB.
- When in doubt, you can almost never go wrong with using 0.1uF caps near each IC power pin and 10uF at the main power source.
- Very rarely will you need to delve deeper than the above recommendation. If you do, start with these links.
- Do you have more than one power domain?
- Remember that a debug or programming cable or connection to a PC may count as your second power domain.
- Are all the power domains on?
- What order are the power domains turning on and off?
- Check for "backdrive". If an output from a powered IC or a pull-up resistor is driving a high input to an IC that is not powered, then you are backdriving that IC. The backdriven IC's behavior can range from appearing to work as if powered properly to releasing smoke. The releasing smoke behavior is the easy bug to find. The other behaviors are harder.
- Can someone recommend a good reference for more on backdrive?
- Are you interfacing 1.8V, 3.3V logic to 5V logic?
- Are power and ground shorted?
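The cable resistance question in the Power list above comes down to Ohm's law. Here is a minimal sketch of that arithmetic in Python. The function name and all the numbers are illustrative assumptions, not measurements: 0.2 ohm per meter is roughly what thin hookup wire (around 28 AWG) gives, and the 0.5 A load is made up.

```python
# Estimate the voltage that actually reaches an IC after the drop across
# the supply and return wires. All numbers are illustrative assumptions.

def voltage_at_load(supply_v, current_a, wire_ohms_per_m, length_m):
    """Return the voltage at the load for a two-wire (power + ground) run."""
    # Current flows out on the power wire and back on the ground wire,
    # so the resistive path is twice the cable length.
    round_trip_ohms = 2 * wire_ohms_per_m * length_m
    return supply_v - current_a * round_trip_ohms

# Hypothetical example: 1 m of thin wire at about 0.2 ohm per meter,
# with the board drawing 0.5 A from a 3.3 V supply.
v_load = voltage_at_load(3.3, 0.5, 0.2, 1.0)
print(f"{v_load:.2f} V at the IC")
```

A drop of a couple hundred millivolts can be enough to push a 3.3V part toward brownout during a current spike, which is why measuring at the IC power pin matters more than reading the supply display.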
- Reset
- Once your digital ICs have stable power, they still need a good Reset to start working.
- Many ICs have internal Power On Resets (POR) so you may have been lured into skipping this step. Don't ignore it because your debug cable needs reset to work well and it is also often very convenient to reset your system during testing without having to cycle power.
- Check that any external reset signals are meeting the timing and voltage level requirements.
- Check that all ICs that take a reset (whether explicit signal or implicit POR) are getting reset at the same time. If not, understand the consequences. Recently I have been working a lot with Electric Imp. In these designs it is tempting to 'reset' by removing and re-inserting the Electric Imp card. The problem is that this does not reset the other digital devices that the Imp talks to. Are you having this problem?
- Do all ICs get reset when the controller gets an internal soft reset? For example, some I2C connected peripherals require a reset command to be sent over the bus, but if the controller resets while the target is holding the data line low, the I2C bus may get hung.
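For the hung I2C bus case above, one commonly used recovery is to toggle the clock line up to nine times until the target releases the data line, then issue a STOP. The sketch below shows that sequence in Python. It is only a sketch: `read_sda`, `set_scl`, and `set_sda` are hypothetical callables you would have to supply for your own platform's open-drain GPIO access, and the timing value is an assumption.

```python
import time

def i2c_bus_clear(read_sda, set_scl, set_sda, half_period_s=5e-6, max_pulses=9):
    """Try to free a stuck I2C bus by pulsing SCL until SDA is released.

    read_sda, set_scl, and set_sda are hypothetical, platform-specific
    callables (assumed here, not part of any real library).
    Returns True if SDA reads high at the end, meaning the bus looks free.
    """
    set_sda(1)  # release SDA so only the target can be holding it low
    for _ in range(max_pulses):
        if read_sda():
            break  # the target has let go of SDA
        set_scl(0)
        time.sleep(half_period_s)
        set_scl(1)
        time.sleep(half_period_s)
    if not read_sda():
        return False  # still stuck; a power cycle or hardware reset is needed
    # Generate a STOP condition: SDA rises while SCL is high.
    set_scl(0)
    set_sda(0)
    time.sleep(half_period_s)
    set_scl(1)
    time.sleep(half_period_s)
    set_sda(1)
    time.sleep(half_period_s)
    return bool(read_sda())
```

If this returns False, the target is probably not just mid-byte, and you are back to checking its reset and power.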
- Clocks
- Even if Power and Reset are good, your ICs still don't work without a stable clock. Like reset, many ICs these days have internal oscillators so you might have also been lured into skipping this step.
- Read the datasheet and make sure you are providing everything necessary for the internal clock to work or your external crystal or oscillator to start up.
- If you have multiple ICs with high speed clocks then you need to read up on jitter, skew, setup, hold, metastability, transmission line termination techniques, crosstalk, duty cycle, and probably a few other topics.
- I don't mean to scare you. If you are using Arduino or similar class micro based designs, then high-speed clocks are probably not your problem. Learn enough to rule them out as a possible cause and look everywhere else first.
- Integrity of Signals
- This is more commonly called "Signal Integrity" when you aren't trying to make your checklist have a cutesy mnemonic.
- Know the speed of all your external signals and start with anything faster than 1MHz.
- Know which signals are edge vs. level sensitive. Level sensitive signals can look ugly as they change but must be stable around the setup and hold time of the receiving chip. Edge sensitive signals need to have clean changes and may need to be treated as Clocks.
- Look up VIH, VIL, VOH, and VOL for your chips and understand what they mean.
- Check both your internal and external pull-up/down resistors. Use a 100K ohm resistor connected to VCC or GND and poke it at your IOs to see whether they are being actively driven, floating, or pulled up/down.
- Are lows really low? Do you really meet GND level, or not quite?
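To make the VIH, VIL, VOH, VOL item above concrete, here is a small Python sketch that computes noise margins between a driver and a receiver. The class and function names are my own, and the voltage numbers are placeholders chosen to look like a 3.3V output driving a 5V CMOS input. Take the real values from your datasheets.

```python
from dataclasses import dataclass

@dataclass
class LogicLevels:
    voh_min: float  # lowest voltage the output guarantees for a high
    vol_max: float  # highest voltage the output guarantees for a low
    vih_min: float  # lowest voltage the input is guaranteed to read as high
    vil_max: float  # highest voltage the input is guaranteed to read as low

def noise_margins(driver, receiver):
    """Return (high_margin, low_margin) in volts; negative means no guarantee."""
    high_margin = driver.voh_min - receiver.vih_min
    low_margin = receiver.vil_max - driver.vol_max
    return high_margin, low_margin

# Placeholder values (assumptions, not from any specific datasheet).
driver_3v3 = LogicLevels(voh_min=2.4, vol_max=0.4, vih_min=2.0, vil_max=0.8)
receiver_5v = LogicLevels(voh_min=4.4, vol_max=0.5, vih_min=3.5, vil_max=1.5)

high, low = noise_margins(driver_3v3, receiver_5v)
print(f"high margin {high:+.1f} V, low margin {low:+.1f} V")
```

With these placeholder numbers the high margin comes out negative, so the receiver is not guaranteed to see a high even though it may appear to work on the bench. Note that this only compares thresholds. It says nothing about whether an input can tolerate the driver's voltage.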
- You
- Are you reading the docs right?
- Are you reading the right docs? Check for the latest version that matches your device. Check for separate errata.
- Do the pin numbers defined in your firmware match the schematic?
- Are you using the right register addresses? Are you accessing them as 16-bit when they are only 8-bit?
- Does the PCB as built match the schematic? Version control is not used as often in hobby hardware development as it is in software development. It should be.
- Is your scope or multimeter connected to the right pin? Pin one labeling can often be confusing and it can be hard to count fine pitch pins. Use labeled test points. Don't skimp on annotating your silk screen, and use different color breadboard wires in some consistent manner, such as red for power, black for ground, blue for IOs, green for clocks.
- Are you assuming a pin is active high when it is really active low?
- Are your clocks rising or falling edge?
- Have you set your LA/scope trigger levels right? With a mix of 1.1V, 1.8V, 3.3V and 5V technologies, is your trigger level all embracing, or did you set it too high, or too low? A 5V low can be higher than a 1.1V high.
- If you have a serial bus is it sending LSB or MSB first?
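The LSB vs. MSB question above is easy to test once you know what a bit-reversed byte looks like. Here is a minimal Python sketch; the function name is my own and the byte value is an arbitrary example.

```python
def reverse_bits8(value):
    """Return the 8-bit value with its bit order reversed."""
    result = 0
    for _ in range(8):
        # Shift the lowest bit of value into the bottom of result.
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

# Arbitrary example byte: 0x1D is 0b00011101, reversed is 0b10111000.
print(f"0x{reverse_bits8(0x1D):02X}")  # prints 0xB8
```

If your logic analyzer shows 0xB8 where the firmware sent 0x1D, suspect the bit order setting on one end before you suspect the wiring.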
I hope you find this checklist helpful. Share a story if it helps you find a bug. If you find a bug that doesn't fit into one of these categories, share that also.
To discuss: Reply on Twitter @duppy with #crispyhw tag.