StatSQLi Documentation

dcbhupendra7/BlindSQLInjection

Methods and Results

Methodology overview

StatSQLi models time-based inference as a statistical decision process:

Collect baseline timing samples.
Choose delay candidates adaptively.
Compare delayed vs baseline distributions with Welch's t-test.
Use binary search to infer character values efficiently.

Statistical model used

Null hypothesis: the mean response time of the delayed samples is not greater than the baseline mean.
Alternative hypothesis: the mean response time of the delayed samples is greater than the baseline mean.
Test: one-tailed Welch t-test (equal_var=False, alternative='greater').
The confidence level defaults to 99% in code (confidence_level=0.99), which corresponds to a significance level of alpha = 1 - 0.99 = 0.01.
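
The decision rule above can be sketched with SciPy, whose `scipy.stats.ttest_ind` accepts the `equal_var=False` and `alternative='greater'` arguments quoted here (the `alternative` argument requires SciPy 1.6 or later). The function name `delay_detected` is an illustrative assumption and not necessarily how StatSQLi structures its own code.

```python
from scipy import stats


def delay_detected(delayed, baseline, confidence_level=0.99):
    """Return True if delayed timings are significantly greater than baseline.

    Hypothetical helper: runs a one-tailed Welch t-test and rejects the
    null hypothesis when the p-value falls below alpha.
    """
    alpha = 1.0 - confidence_level
    result = stats.ttest_ind(
        delayed,
        baseline,
        equal_var=False,        # Welch's t-test: variances may differ
        alternative="greater",  # one-tailed: delayed mean > baseline mean
    )
    return bool(result.pvalue < alpha)
```

A call such as `delay_detected(delayed_samples, baseline_samples)` then yields the boolean answer for one injected condition. Each sample list needs at least two timings for the test to be defined.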

Algorithmic shift

Traditional approach: linear search over printable ASCII (O(n) condition checks per character, where n is the number of candidate characters).
StatSQLi approach: binary search over the ASCII range (O(log n) condition checks per character).

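In practice, this reduces condition checks significantly for each character position: over the 95 printable ASCII characters, a binary search needs at most 7 checks, compared with up to 95 for a linear scan.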
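
A minimal sketch of the binary search, assuming a boolean oracle `is_greater(threshold)` that answers whether the target character code exceeds `threshold`. In a time-based attack that oracle would be the statistical delay test. The names `infer_char` and `is_greater`, and the 32..126 printable range, are assumptions made for illustration.

```python
def infer_char(is_greater, low=32, high=126):
    """Recover one character code in [low, high] by binary search.

    is_greater(threshold) must return True when the hidden character
    code is strictly greater than threshold (hypothetical oracle).
    Returns the recovered character and the number of oracle calls.
    """
    checks = 0
    while low < high:
        mid = (low + high) // 2
        checks += 1
        if is_greater(mid):
            low = mid + 1   # target lies in the upper half
        else:
            high = mid      # target lies in the lower half, mid included
    return chr(low), checks
```

With a simulated oracle such as `lambda t: ord("k") > t`, the function returns the character `k` together with the number of checks it used, which never exceeds 7 for this range.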

Adaptive delay strategy

The current implementation uses a baseline-driven heuristic:

compute the baseline mean and standard deviation,
estimate the detectable delay level,
pick the smallest candidate delay that is reliably distinguishable from the baseline.

This reduces manual tuning effort while keeping injected delays, and therefore extraction time, as short as the measured noise allows.
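
One way such a heuristic could look is sketched below. The detectability rule (a delay must exceed `k` baseline standard deviations and a fraction of the baseline mean), the candidate list, and the names `choose_delay`, `k`, and `min_fraction` are all assumptions for illustration. The actual thresholds used by StatSQLi are not stated here.

```python
import statistics


def choose_delay(baseline, candidates=(0.1, 0.25, 0.5, 1.0, 2.0),
                 k=3.0, min_fraction=0.5):
    """Pick the smallest candidate delay (seconds) that should stand out.

    Hypothetical rule: a delay counts as detectable when it is at least
    k baseline standard deviations and at least min_fraction of the
    baseline mean.
    """
    mean = statistics.mean(baseline)
    std = statistics.stdev(baseline)  # needs at least two samples
    detectable = max(k * std, min_fraction * mean)
    for delay in sorted(candidates):
        if delay >= detectable:
            return delay
    # No candidate clears the threshold: fall back to the largest one.
    return max(candidates)
```

On a quiet connection the function settles on a short delay, while a jittery baseline pushes it toward the larger candidates.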

Benchmark pipeline

The benchmark runner:

executes repeated runs per method,
records extraction duration and query counts,
computes summary statistics (mean, median, std, min, max, success rate),
saves JSON output for later charting.
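
The summary step can be sketched with the standard library alone. The record fields (`duration_s`, `queries`, `success`) and the function names are hypothetical and may not match the JSON schema the benchmark runner actually writes.

```python
import json
import statistics


def describe(values):
    """Summary statistics for one numeric series."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # stdev is undefined for a single run, so report 0.0 in that case
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


def summarize(runs):
    """Aggregate a list of per-run records for one extraction method."""
    return {
        "runs": len(runs),
        "duration_s": describe([r["duration_s"] for r in runs]),
        "queries": describe([r["queries"] for r in runs]),
        "success_rate": sum(1 for r in runs if r["success"]) / len(runs),
    }


def summary_json(runs):
    """Serialize the summary so it can be saved for later charting."""
    return json.dumps(summarize(runs), indent=2)
```

Writing the string returned by `summary_json` to a file gives one JSON document per method, which a plotting script can load later.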

Report-aligned figures

The report's figure set is included in the docs and can be viewed on the Figures page.

Key visuals include:

architecture overview (Fig. 1),
query efficiency and speedup plots (Fig. 2 and Fig. 3),
complexity and speedup trend comparison (Fig. 4).

Result interpretation guidance

When evaluating your own benchmark runs:

compare query counts first (algorithmic efficiency),
compare time second (environment-sensitive),
inspect run-to-run variation to assess jitter sensitivity.
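
Two simple quantities support these comparisons: the ratio of mean query counts between methods, and the coefficient of variation of run durations as a rough jitter indicator. The helper names below are assumptions for illustration and are not part of the benchmark runner.

```python
import statistics


def query_reduction(linear_queries, binary_queries):
    """Ratio of mean query counts: values above 1 favor the binary search."""
    return statistics.mean(linear_queries) / statistics.mean(binary_queries)


def coefficient_of_variation(durations):
    """Standard deviation relative to the mean; higher means more jitter."""
    return statistics.stdev(durations) / statistics.mean(durations)
```

The query ratio depends only on the algorithm, so it should stay stable across environments, whereas the coefficient of variation will change with network and server load.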

Use the generated charts to communicate both raw performance and complexity trends.