TRIM-Aligned Polymorphic Sweep (TAPS) wiping method for SSDs #642

Open
Knogle opened this issue Jan 13, 2025 · 1 comment

Comments

@Knogle
Contributor

Knogle commented Jan 13, 2025

I took some time to develop a mathematically sophisticated wiping method for SSDs.
Licensed under MIT license.
The formula formatting is an issue.


TRIM-Aligned Polymorphic Sweep (TAPS)

  1. Introduction

Why is SSD wiping challenging?

SSDs differ from HDDs in their behavior due to:

  1. Flash Translation Layer (FTL): Logical Block Addresses (LBAs) are dynamically mapped to physical flash memory blocks. Writing to an LBA does not guarantee that the corresponding physical block is overwritten.

  2. Wear-Leveling: SSD controllers distribute writes across physical blocks to prolong device lifespan. This can prevent specific physical blocks from being overwritten during sequential writes.

  3. Over-Provisioning: SSDs include extra blocks (typically 7–20% of the capacity) that are not exposed to the user. These blocks can retain data even after wiping the logical space.

  4. TRIM and Garbage Collection (GC): TRIM marks blocks as invalid at the logical level, signaling to the SSD that the data is no longer needed. However, actual erasure depends on the garbage collection routine.

  5. Compression/Deduplication: Some SSDs compress or deduplicate data internally, skipping writes for repetitive patterns like 0x00.

Traditional methods, such as multi-pass overwrites, fail to address these complexities. While Secure Erase (using ATA or NVMe commands) can sanitize the drive, it depends entirely on the SSD firmware's correctness, which may vary between manufacturers.


  2. Methodology

The TAPS method addresses the above challenges by combining:

  1. TRIM Commands to mark blocks invalid.

  2. Block-Saturating Random Writes to maximize physical block coverage.

  3. Polymorphic Overwrites to defeat compression and deduplication.

  4. Statistical Modeling to estimate the probability of residual data.

Phases of TAPS


Phase 1: Pre-TRIM

Objective: Invalidate all logical blocks to maximize the effectiveness of garbage collection.

Procedure:

Issue TRIM commands across the entire logical address space:

for LBA = 0 to MAX_LBA:
    TRIM(LBA)

Rationale:

TRIM informs the SSD firmware that specific blocks no longer contain valid data. If the SSD has m user-accessible blocks and n total physical blocks (including over-provisioning), the probability of a block being marked invalid after TRIM is:

P_{\text{invalid}} = \frac{m}{n}, \quad m \leq n

Although TRIM does not guarantee immediate erasure, it increases the likelihood that old data is removed during garbage collection.
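
As a rough illustration of Phase 1, the sketch below splits the logical address space into byte ranges and discards each one. It assumes a Linux host, where the `BLKDISCARD` ioctl takes a (start, length) pair of 64-bit byte values; the helper names `discard_ranges` and `trim_device` and the 1 GiB chunk size are made up for this example. Real devices discard in ranges rather than one LBA at a time, and ranges should be aligned to the device's sector size.

```python
import struct

BLKDISCARD = 0x1277  # Linux ioctl request from <linux/fs.h>: _IO(0x12, 119)


def discard_ranges(total_bytes, chunk_bytes=1 << 30):
    """Yield (offset, length) byte ranges that together cover [0, total_bytes)."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


def trim_device(fd, total_bytes, chunk_bytes=1 << 30):
    """Issue one BLKDISCARD per range.

    Destructive: fd must be an open block device and the caller needs
    sufficient privileges. Not exercised in the example below.
    """
    import fcntl  # Unix-only; imported lazily so discard_ranges works anywhere

    for offset, length in discard_ranges(total_bytes, chunk_bytes):
        fcntl.ioctl(fd, BLKDISCARD, struct.pack("QQ", offset, length))


# Example: a 10-byte "device" discarded in 4-byte chunks.
print(list(discard_ranges(10, 4)))
```

Whether the firmware actually erases the discarded blocks, and when, is still up to its garbage collection.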


Phase 2: Block-Saturating Random Sweep

Objective: Overwrite all logical blocks with cryptographically secure random data, maximizing physical coverage via wear-leveling.

Procedure:

Write random data sequentially across the entire logical address space:

for LBA = 0 to MAX_LBA, step = BLOCK_SIZE:
    WriteRandom(LBA, BLOCK_SIZE)
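
The sweep above could look roughly like the following in Python. This is only a sketch: `random_sweep` is a hypothetical helper, `os.urandom` stands in for whatever cryptographically secure source an implementation would use, and `target` is assumed to be any seekable binary file object (a block device opened for writing, or an in-memory buffer for testing).

```python
import os


def random_sweep(target, total_bytes, block_size=1 << 20):
    """Overwrite the first total_bytes of target with fresh random data.

    Data is generated one block at a time so memory use stays bounded.
    Returns the number of bytes written.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    target.seek(0)
    written = 0
    while written < total_bytes:
        chunk = min(block_size, total_bytes - written)
        target.write(os.urandom(chunk))
        written += chunk
    target.flush()
    return written
```

On a real device the caller would also need to bypass or flush the OS page cache (for example with `os.fsync`) so that the writes actually reach the drive.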

Mathematical Justification:

Each random write increases the chance of overwriting a physical block. Assuming uniform wear-leveling, the probability that a single physical block remains unwritten after one full random pass is:

P_{\text{unwritten, 1 pass}} = 1 - \frac{m}{n}

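After k full random passes, the probability that a specific physical block remains unwritten is:

P_{\text{unwritten, } k \text{ passes}} = \left(1 - \frac{m}{n}\right)^k

For large n and m, if each of the k \cdot m individual writes is instead modeled as landing on an independently chosen, uniformly random physical block, this probability becomes \left(1 - \frac{1}{n}\right)^{k \cdot m}, which can be approximated using the exponential function. Because writes within a pass may then hit the same block twice, this estimate is more conservative than the closed form above:

P_{\text{unwritten, } k \text{ passes}} \approx e^{-\frac{k \cdot m}{n}}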

Example:

Let m = 1,000,000, n = 1,200,000 (20% over-provisioning), and k = 2 passes. The probability of a physical block remaining unwritten is:

P_{\text{unwritten}} \approx e^{-\frac{2 \cdot 1{,}000{,}000}{1{,}200{,}000}} \approx e^{-1.67} \approx 0.188

Thus, ~18.8% of blocks might still remain untouched after two random passes, which necessitates further steps.
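
The power form and the exponential form above give noticeably different numbers when m/n is not small, so it may help to evaluate both side by side. The sketch below does that for the example values. The function names are hypothetical, and the interpretation in the docstrings (distinct blocks per pass versus independent random placement per write) is an assumption about what each formula models.

```python
import math


def unwritten_subset_model(m, n, k):
    """Each pass covers m distinct blocks out of n: (1 - m/n)^k."""
    return (1 - m / n) ** k


def unwritten_exponential_model(m, n, k):
    """k*m writes, each on a uniformly random block: about exp(-k*m/n)."""
    return math.exp(-k * m / n)


m, n, k = 1_000_000, 1_200_000, 2
print(f"power form:       {unwritten_subset_model(m, n, k):.4f}")
print(f"exponential form: {unwritten_exponential_model(m, n, k):.4f}")
```

For these values the power form comes out near 0.028 and the exponential form near 0.189, so the exponential figure used in the example is the more pessimistic of the two.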


Phase 3: Polymorphic Overwrites

Objective: Mitigate compression and deduplication effects by performing multiple overwrites with distinct patterns.

Procedure:

Perform k passes with varying patterns, such as:

Pass 1: Overwrite with a fixed pattern (e.g., 0x55).

Pass 2: Overwrite with random data from a PRNG.

Pass 3: Overwrite with another fixed pattern (e.g., 0xAA).

Pass 4: Overwrite with new random data.

Mathematical Justification:

Let P_{\text{overwrite}} be the probability that a block is physically overwritten during one pass. After k distinct passes, the probability of a block not being overwritten at all is:

P_{\text{residual}} = (1 - P_{\text{overwrite}})^k

Example:

Assume P_{\text{overwrite}} = 0.833 (83.3% chance of overwriting a block per pass) and k = 4. The probability of residual data is:

P_{\text{residual}} = (1 - 0.833)^4 = (0.167)^4 \approx 0.00078

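This implies a 0.078% chance that any given block retains residual data after four polymorphic passes, assuming the passes are independent. That is small per block, although for a drive with 1,200,000 physical blocks it still corresponds to roughly 930 blocks in expectation.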
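
A minimal sketch of the pass schedule and the residual estimate, assuming the four-pass cycle listed above repeats if more passes are requested. `pattern_block` and `residual_probability` are hypothetical names, and `os.urandom` is used here in place of the unspecified PRNG.

```python
import os


def pattern_block(pass_index, block_size):
    """Return the bytes to write for a 0-based pass index.

    Cycle: 0x55 fill, random, 0xAA fill, random, then repeat.
    """
    cycle = pass_index % 4
    if cycle == 0:
        return bytes([0x55]) * block_size
    if cycle == 2:
        return bytes([0xAA]) * block_size
    return os.urandom(block_size)


def residual_probability(p_overwrite, passes):
    """Chance a block is never overwritten, assuming independent passes."""
    return (1.0 - p_overwrite) ** passes


print(pattern_block(0, 4).hex())  # first pass: fixed 0x55 fill
print(f"{residual_probability(0.833, 4):.5f}")
```

Note that the fixed-pattern passes are themselves highly compressible, so on a compressing or deduplicating controller only the random passes are likely to force full physical writes.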


Phase 4: Verification & Statistical Estimation

Objective: Provide a statistical model for residual data or sample raw blocks if supported by the SSD firmware.

Statistical Model:

Residual data can be modeled using the hypergeometric distribution:

P(X = r) = \frac{\binom{n - k}{r} \cdot \binom{k}{m - r}}{\binom{n}{m}}

where:

n: Total physical blocks

k: Blocks overwritten (here k counts blocks rather than passes)

m: Total user-accessible blocks

r: Residual blocks after passes

If raw sampling is not possible, estimate the upper bound of residual data probability:

P_{\text{residual}} \leq \left(1 - \frac{m}{n}\right)^{k \cdot \text{\#Passes}}
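
The hypergeometric expression above can be evaluated directly with exact binomial coefficients. The sketch below is one way to do it. `residual_pmf` is a hypothetical name, and the reading of the parameters (n physical blocks, k of them overwritten, m blocks considered, r of those not overwritten) is an assumption based on the formula's shape.

```python
from math import comb


def residual_pmf(r, n, k, m):
    """P(X = r) = C(n - k, r) * C(k, m - r) / C(n, m).

    Returns 0.0 for combinations that cannot occur, for example more
    residual blocks than there are non-overwritten blocks.
    """
    if r < 0 or r > m or r > n - k or m - r > k:
        return 0.0
    return comb(n - k, r) * comb(k, m - r) / comb(n, m)


# Toy example: 12 physical blocks, 10 overwritten, 5 considered.
for r in range(3):
    print(r, round(residual_pmf(r, 12, 10, 5), 4))
```

Summing the values over all feasible r should give 1, which is a quick sanity check on any chosen parameters.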


  3. Comparison to Secure Erase

  4. Conclusion

Strengths of TAPS:

Does not depend on proprietary Secure Erase commands.

Combines TRIM, random fills, and polymorphic overwrites to maximize physical block coverage.

Provides a mathematical framework to estimate residual data probability.

Limitations:

Slower than Secure Erase, as multiple passes are required.

Full efficacy depends on SSD controller behavior, which remains a black box.

Conclusion:
TAPS is a robust, universally applicable alternative to Secure Erase for securely wiping SSDs. It is especially useful in scenarios where firmware-level commands cannot be trusted or are unavailable. Through a combination of techniques and statistical modeling, TAPS achieves a high level of assurance in data sanitization.

@PartialVolume
Collaborator

PartialVolume commented Jan 13, 2025

Nice work! I know this could prove expensive and time-consuming, but it would be nice to validate the theory: write the disk with a known repeating pattern, run the wipe, then remove the chips from the SSD and read them to see how effective it is for different makes and models of SSD.
