# Single Event Upset Mitigation in Digital Circuits

Gökçe Aydos

#### **Graduate School System Design**

University of Bremen, German Research Center for Artificial Intelligence, German Aerospace Center

www.informatik.uni-bremen.de/syde









### **Agenda**

1 From radiation to single event effect

2 Mitigation techniques

3 Hardening of an FPGA design







#### **Definition**

Radiation: a process in which particles or waves travel through media



Source: Wikipedia







#### **Radiation Sources**

- Cosmic rays
- Solar particles
- Van Allen radiation belts

- Nuclear reactors
- Particle accelerators
- Chip packaging materials







## Single event effects (SEE)



effects caused by a single energetic particle



- Destructive
  - SEL, SEGR, SEB, SESB...
- Non-destructive
  - Single-event Upset (SEU)
  - Single-event Transient (SET)
  - SEFI, SEHE, MCU, SMU, SED...









## Current pulse caused by the particle

$$I_P(t) = I_0(\exp(-\frac{t}{\tau_\alpha}) - \exp(-\frac{t}{\tau_\beta}))$$
 [?]
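A minimal numerical sketch of the double-exponential pulse above. The function names and the parameter values (I0 = 1 mA, tau_alpha = 200 ps, tau_beta = 50 ps) are illustrative assumptions, not values from the slides. It assumes tau_alpha > tau_beta so that the pulse is positive. Integrating the formula from 0 to infinity gives the collected charge Q = I0 (tau_alpha - tau_beta).

```python
import math


def pulse_current(t, i0, tau_alpha, tau_beta):
    """I_P(t) = I0 * (exp(-t/tau_alpha) - exp(-t/tau_beta))."""
    return i0 * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def collected_charge(i0, tau_alpha, tau_beta):
    """Closed-form integral of I_P(t) over t = 0..infinity."""
    return i0 * (tau_alpha - tau_beta)


def peak_time(tau_alpha, tau_beta):
    """Time of the current maximum, obtained from dI_P/dt = 0."""
    return (tau_alpha * tau_beta / (tau_alpha - tau_beta)) * math.log(tau_alpha / tau_beta)


if __name__ == "__main__":
    # Assumed, purely illustrative parameters.
    i0, tau_a, tau_b = 1e-3, 200e-12, 50e-12
    q = collected_charge(i0, tau_a, tau_b)
    print(f"collected charge: {q * 1e15:.1f} fC")
    print(f"peak at: {peak_time(tau_a, tau_b) * 1e12:.1f} ps")
```

With these assumed values the pulse peaks after roughly 92 ps and deposits 150 fC. Whether that upsets a node depends on the critical charge of the struck circuit, which the sketch does not model.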









Gökçe Aydos Single Event Upset Mitigation in Digital Circuits April 11, 2014

## Upsets in combinational logic and memory







### **SET** in combinational logic











## SEU in memory circuits















## **Brief history of SEE**

- 1962: Eventual occurrence of SEUs in microcircuits forecast: "Minimum volume will be limited to 10 µm due to terrestrial cosmic rays" [?]
- 1975: First confirmed report of cosmic-ray-induced upsets in space: four upsets in 17 years of satellite operation [?]
- 1978: On-orbit error rate of one per day due to cosmic rays [?]
- 1979: Occurrence of SEUs in terrestrial microelectronics, especially due to radioactive contaminants in package materials [?]
- 1984: SEUs in combinational logic [?]







#### Brief history of SEE - 2000s

- SEE vulnerability becomes a mainstream reliability metric
- Screened COTS products sold as space products
- Hardening-by-design (HBD) techniques gain importance as the number of rad-hard foundries decreases







## State-of-the-art mitigation techniques

- Technology hardening
  - Reduce charge collection of substrate at sensitive nodes
  - Long-term effects are eliminated, but SEUs and SETs are still an issue
- Circuit-level hardening
  - Adding decoupling elements to the critical path[?]
  - Hardened-by-design (HBD)
    - Spatial and temporal redundancy
    - Hardened memory cell (more transistors)
- System-level hardening
  - Error detection and correction (EDAC)
  - Modular redundancy (DMR, TMR...)
  - Self-checker
- Recovery (only programmable logic)
  - Reconfiguration
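As one concrete instance of the EDAC item in the list above, here is a minimal sketch of a Hamming(7,4) single-error-correcting code. The choice of this particular code and the function names are assumptions for illustration; the slides do not state which EDAC scheme is meant.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7).

    Parity bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit; return (data bits, syndrome).

    A non-zero syndrome is the 1-based position of the corrected bit.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # emulate an SEU in one stored bit
    print(hamming74_decode(word))
```

The cost is three extra stored bits per four data bits, and two upsets in the same codeword are miscorrected, which is why such codes are usually combined with periodic scrubbing.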







#### Section 3

#### Hardening of an FPGA design

















### **Lowering redundancy constraints**

- Make use of already existing redundancy
  - State machine is already one-hot encoded
  - Packets already carry parity

| binary | one-hot |
|--------|---------|
| 00     | 0001    |
| 01     | 0010    |
| 10     | 0100    |
| 11     | 1000    |
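The table can be read as follows: a single bit flip in a one-hot state register always produces an illegal code (zero or two bits set), so it is detectable without extra storage, whereas a flip in the binary encoding silently yields another legal state. A minimal sketch, with hypothetical helper names:

```python
def binary_to_one_hot(value):
    """Map a binary state index to its one-hot code, e.g. 2 -> 0b0100."""
    return 1 << value


def is_valid_one_hot(state, width=4):
    """True if exactly one of the lowest `width` bits is set and no others."""
    mask = (1 << width) - 1
    in_range = state == (state & mask)
    single_bit = state != 0 and (state & (state - 1)) == 0
    return in_range and single_bit


if __name__ == "__main__":
    state = binary_to_one_hot(2)
    upset = state ^ 0b0001  # emulate an SEU in bit 0
    print(is_valid_one_hot(state), is_valid_one_hot(upset))
```

Detection alone does not recover the original state; the design still needs a defined reaction, such as returning to a safe state.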







# Interface FPGA diagram 2









#### **Goals**

- Evaluation of different HBD techniques on the existing design regarding
  - error-rate
  - max. frequency
  - design area
  - time-efficiency
- Creation of abstract models for the parameters







## Fault-injection-tool diagram









#### FI Simulation with IFF

- Using an IFF configuration with SpW, bus arbiter and RAM module
- Testbench with a dozen memory accesses
- Flip a random FF bit periodically
- Check for the expected responses
- Testbench either fails or succeeds
- Run the testbench for a significant number of cycles
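The steps above can be sketched as a small software model. This is not the actual fault-injection tool or the IFF design: `ToyRam`, `run_testbench` and `campaign` are hypothetical stand-ins, and the "design" is just a 6-word RAM whose bits are all modelled as flip-flops.

```python
import random


class ToyRam:
    """Hypothetical stand-in for the design under test: 6 words x 8 bits."""

    WORDS, WIDTH = 6, 8

    def __init__(self):
        self.ff = [0] * (self.WORDS * self.WIDTH)  # flat list of flip-flop bits

    def write(self, addr, value):
        for i in range(self.WIDTH):
            self.ff[addr * self.WIDTH + i] = (value >> i) & 1

    def read(self, addr):
        return sum(self.ff[addr * self.WIDTH + i] << i for i in range(self.WIDTH))

    def flip(self, index):
        self.ff[index] ^= 1  # emulate an SEU in one flip-flop


def run_testbench(rng, flip_period):
    """A dozen accesses (6 writes, then 6 reads); returns True if all reads match."""
    dut = ToyRam()
    data = [rng.randrange(256) for _ in range(ToyRam.WORDS)]
    ops = [("w", a) for a in range(ToyRam.WORDS)] + [("r", a) for a in range(ToyRam.WORDS)]
    for cycle, (op, addr) in enumerate(ops, start=1):
        if cycle % flip_period == 0:
            dut.flip(rng.randrange(len(dut.ff)))  # periodic random bit flip
        if op == "w":
            dut.write(addr, data[addr])
        elif dut.read(addr) != data[addr]:
            return False  # unexpected response: testbench fails
    return True


def campaign(runs, flip_period, seed=0):
    """Repeat the testbench and return the fraction of failed runs."""
    rng = random.Random(seed)
    failures = sum(not run_testbench(rng, flip_period) for _ in range(runs))
    return failures / runs


if __name__ == "__main__":
    print(campaign(runs=1000, flip_period=4))
```

Even in this toy model, some injected flips are masked (for example, a flip in a word that is overwritten before it is read), so the observed failure rate is lower than the injection rate. That is the reason for running many cycles before drawing conclusions.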







#### **Next steps**

- Implementation of other design variants
- Constrained-random simulation testbench
- Fault-injection on SRAM









#### References

[1] Fault-tolerance techniques for SRAM-based FPGAs. Springer, 2006.

[2] GC Messenger. Collection of charge on junction nodes from ion tracks. Nuclear Science, IEEE Transactions on, 29(6):2024–2031, 1982.

[3] JT Wallmark and SM Marcus. Minimum size and maximum packing density of nonredundant semiconductor devices. Proceedings of the IRE, 50(3):286–298, 1962.

[4] D Binder, EC Smith, and AB Holman. Satellite anomalies from galactic cosmic rays. Nuclear Science, IEEE Transactions on, 22(6):2675–2680, 1975.

[5] James C Pickel and James T Blandford. Cosmic ray induced errors in MOS memory cells. Nuclear Science, IEEE Transactions on, 25(6):1166–1171, 1978.

[6] Timothy C May and Murray H Woods. Alpha-particle-induced soft errors in dynamic memories. Electron Devices, IEEE Transactions on, 26(1):2–9, 1979.

[7] TC May, GL Scott, ES Meieran, P Winer, and VR Rao. Dynamic fault imaging of VLSI random logic devices. In Reliability Physics Symposium, 1984. 22nd Annual, pages 95–108. IEEE, 1984.

[8] JL Andrews, JE Schroeder, BL Gingerich, WA Kolasinski, R Koga, and SE Diehl. Single event error immune CMOS RAM. Nuclear Science, IEEE Transactions on, 29(6):2040–2043, 1982.

[9] Paul E Dodd and Lloyd W Massengill. Basic mechanisms and modeling of single-event upset in digital microelectronics. Nuclear Science, IEEE Transactions on, 50(3):583–602, 2003.

[10] Methods for the calculation of radiation received and its effects, and a policy for design margins. ECSS-E-ST-10-12C, 2008.







### **Further reading**

- Basic mechanisms and modeling of single-event upset[?]
- ECSS standard ECSS-E-ST-10-12C[?]
- Fault-tolerance techniques for SRAM-based FPGAs[?]







#### **Earth Orbits**

- Low Earth Orbit (LEO): up to 2000 km
- Medium Earth Orbit (MEO): 2000 km to ~36000 km
- High Earth Orbit (HEO): above MEO
- Geostationary Orbit (GEO): ~36000 km altitude (~42000 km from Earth's center)











#### Van-Allen Belt











#### Trapped electrons in Van-Allen Belt



Source: Sturesson, ESTEC







## Bar-magnet vs iron powder





Source: physicscentral.com







## Silicon struck by a charged particle



#### Direct Ionization



#### Indirect Ionization



Source: [?]







# Cross section vs LET

















#### Space Radiation and its Effects on EEE Components

# Single Event Effects - Summary

| Effect | Description | Affected devices |
|--------|-------------|------------------|
| Single Event Upset (SEU) | Corruption of the information stored in a memory element | Memories, latches in logic devices |
| Multiple Bit Upset (MBU) | Several memory elements corrupted by a single strike | Memories, latches in logic devices |
| Single Event Functional Interrupt (SEFI) | Corruption of a data path leading to loss of normal operation | Complex devices with built-in state machine/control sections |
| Single Hard Error (SHE) | Unalterable change of state in a memory element | Memories, latches in logic devices |
| Single Event Transient (SET) | Impulse response of certain amplitude and duration | Analog and mixed-signal circuits, photonics |
| Single Event Disturb (SED) | Momentary corruption of the information stored in a bit | Combinational logic, latches in logic devices |
| Single Event Latchup (SEL) | High-current conditions | CMOS, BiCMOS devices |
| Single Event Snapback (SESB) | High-current conditions | N-channel MOSFET, SOI devices |
| Single Event Burnout (SEB) | Destructive burnout due to high-current conditions | BJT, N-channel power MOSFET |
| Single Event Gate Rupture (SEGR) | Rupture of gate dielectric due to high electrical field conditions | Power MOSFETs, non-volatile NMOS structures, VLSIs, linear devices |

Non-exhaustive, more in ECSS-E-ST-10-12C

Source: Sturesson, ESA, EPFL Space Center, 9th June 2009 (III.A, page 30)


## Observable error dependent on the state











### Adding decoupling elements

- CMOS memory cell modified with R<sub>G</sub> resistors











## Modular redundancy
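A minimal sketch of the voting idea behind triple modular redundancy (TMR), with dual modular redundancy (DMR) for comparison. The function names are hypothetical, and each integer stands for the output word of one redundant module copy.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority over three redundant copies (TMR)."""
    return (a & b) | (a & c) | (b & c)


def dmr_mismatch(a, b):
    """DMR can only detect an error: True if the two copies disagree."""
    return a != b


if __name__ == "__main__":
    golden = 0b10110010
    upset = golden ^ 0b00001000  # emulate an SEU in one copy
    print(majority_vote(golden, upset, golden) == golden)
    print(dmr_mismatch(golden, upset))
```

The voter masks any upset confined to one copy, but identical upsets in two copies are voted through, and in this simple form the voter itself is a single point of failure.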









## SEU in configuration memory of FPGAs









