Big models get all the hype — but small models are becoming the real surprise.
When I started 6SigmaMind, I wanted to know:
How far can a tiny 1.7B model go when solving real Excel tasks?
To find out, I ran it through three challenges:
- SUMIFS (conditional math)
- XLOOKUP (modern lookups)
- T.TEST (statistics)
Then I asked readers like you to break it.
The results?
Surprisingly good — and occasionally hilarious.
👉 Try the model yourself: https://huggingface.co/spaces/benkemp/6SigmaMindv2
Let’s walk through the battle test.
🥊 Round 1 — SUMIFS: Everyday Excel Powerhouse
SUMIFS is one of the most widely used functions in Excel.
Test Prompt:
“Sum all values in C where B equals ‘Closed’.”
Expected Output:
=SUMIFS(C:C, B:B, "Closed")
6SigmaMind’s performance:
⭐⭐⭐⭐☆ (4/5)
- Almost always gets it right
- Clean formatting
- Works across most phrasings
- Rarely mixes up the argument order (but it can)
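To see why argument order matters so much here, this is a minimal Python sketch of what a single-condition SUMIFS computes. The `sumifs` helper and the sample data are my own illustration, not part of 6SigmaMind or Excel, and it only covers plain text criteria (no wildcards or comparison operators).

```python
def sumifs(sum_range, criteria_range, criterion):
    """Mimic a single-condition SUMIFS: add sum_range values where criteria_range matches."""
    if len(sum_range) != len(criteria_range):
        # Excel reports #VALUE! when the ranges differ in size
        raise ValueError("ranges must be the same size")
    return sum(
        value
        for value, key in zip(sum_range, criteria_range)
        # Text values in the sum range are ignored; text criteria are case-insensitive
        if isinstance(value, (int, float)) and str(key).lower() == criterion.lower()
    )

# Hypothetical sample data standing in for columns B and C
status = ["Closed", "Open", "closed", "Pending"]
amount = [100, 250, 40, 75]

correct = sumifs(amount, status, "Closed")   # like =SUMIFS(C:C, B:B, "Closed")
swapped = sumifs(status, amount, "Closed")   # like =SUMIFS(B:B, C:C, "Closed")
print(correct, swapped)  # 140 0
```

The swapped version does not raise an error. It quietly returns 0, which is exactly why an argument-order slip is easy to miss in a real sheet.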
Try these in the demo:
- “Sum D2:D200 where A2:A200 = ‘Active’.”
- “Add all values in column F only if column C is ‘Approved’.”
This is the model’s comfort zone.
🥊 Round 2 — XLOOKUP: The Modern Lookup Test
XLOOKUP is where small models often fail — but 6SigmaMind holds its own.
Test Prompt:
“Return the price in D for the SKU in A that matches H2.”
Expected Output:
=XLOOKUP(H2, A:A, D:D)
6SigmaMind’s performance:
⭐⭐⭐☆☆ (3/5)
- Often correct
- Sometimes swaps lookup + return arrays
- Occasionally reverts to VLOOKUP (funny, but it still works as long as the return column sits to the right of the lookup column)
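Here is a rough Python sketch of exact-match XLOOKUP, showing what happens when the lookup and return arrays get swapped. The `xlookup` helper and the SKU data are hypothetical stand-ins for illustration. The real function has more options (match modes, search modes) that are left out here.

```python
def xlookup(lookup_value, lookup_array, return_array, if_not_found="#N/A"):
    """Exact-match sketch: find lookup_value in lookup_array, return the aligned item."""
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:
            return result
    return if_not_found

skus = ["A-100", "B-200", "C-300"]   # column A (hypothetical data)
prices = [9.99, 24.50, 3.75]         # column D (hypothetical data)
h2 = "B-200"

print(xlookup(h2, skus, prices))   # correct order, like =XLOOKUP(H2, A:A, D:D) -> 24.5
print(xlookup(h2, prices, skus))   # swapped arrays search the prices for a SKU -> #N/A
```

For reference, the VLOOKUP fallback that matches the test prompt would be =VLOOKUP(H2, A:D, 4, FALSE).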
Try these variants:
- “Lookup the value in column C where A = G5.”
- “Give me an XLOOKUP formula for ID in A and name in B.”
Evaluating a small model on lookups is eye-opening: it shows whether the model actually tracks which range is searched and which one is returned.
🥊 Round 3 — T.TEST: Entering the Statistics Arena
Here’s where things get… unpredictable.
Test Prompt:
“Do a two-tailed t-test comparing C2:C50 with D2:D50.”
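Expected Output (the final 2 selects a two-sample, equal-variance test; the prompt does not name a type, so this is an assumption):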
=T.TEST(C2:C50, D2:D50, 2, 2)
6SigmaMind’s performance:
⭐⭐☆☆☆ (2/5)
- It knows T.TEST exists
- It understands two ranges
- It sometimes confuses arguments
- It sometimes produces a more complex formula than needed
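Since the confusion mostly lives in the last two arguments, here is a small Python sketch that decodes them. T.TEST takes (array1, array2, tails, type), where tails is 1 or 2 and type is 1 (paired), 2 (two-sample, equal variance), or 3 (two-sample, unequal variance). The `describe_t_test` helper is hypothetical, and its comma split is deliberately naive: it would break on nested function calls.

```python
import re

TAILS = {1: "one-tailed", 2: "two-tailed"}
TYPES = {1: "paired", 2: "two-sample, equal variance", 3: "two-sample, unequal variance"}

def describe_t_test(formula):
    """Parse =T.TEST(array1, array2, tails, type) and explain its last two arguments."""
    match = re.fullmatch(r"\s*=\s*T\.TEST\((.*)\)\s*", formula, flags=re.IGNORECASE)
    if not match:
        return "not a T.TEST formula"
    # Naive split: fine for plain ranges, not for nested functions
    args = [a.strip() for a in match.group(1).split(",")]
    if len(args) != 4:
        return f"expected 4 arguments, got {len(args)}"
    try:
        tails, kind = int(args[2]), int(args[3])
    except ValueError:
        return "tails and type must be whole numbers"
    if tails not in TAILS or kind not in TYPES:
        return "tails must be 1 or 2; type must be 1, 2, or 3"
    return f"{args[0]} vs {args[1]}: {TAILS[tails]}, {TYPES[kind]}"

print(describe_t_test("=T.TEST(C2:C50, D2:D50, 2, 2)"))
print(describe_t_test("=T.TEST(C2:C50, D2:D50, 2)"))
```

Running model outputs through a check like this makes it easier to tell a missing argument from a wrong one.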
This is where future fine-tuning will shine — especially when we introduce the Excel for Statistics Dataset.
Try pushing the model with:
- “Calculate the correlation between A and B.”
- “Give me the variance for C2:C100.”
- “Return the standard deviation for D2:D80.”
These tasks help map out the model’s statistical “baseline.”
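If you want reference numbers to check the model's formulas against, this short Python sketch computes the same quantities with the standard library. I am assuming the sample versions are intended, so the matching Excel functions would be CORREL, VAR.S, and STDEV.S. The data is made up for illustration.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient, the quantity CORREL returns."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data standing in for A1:A5 and B1:B5
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]

print(pearson(a, b))            # compare with =CORREL(A1:A5, B1:B5)
print(statistics.variance(a))   # sample variance, compare with =VAR.S(A1:A5)
print(statistics.stdev(a))      # sample standard deviation, compare with =STDEV.S(A1:A5)
```

If the model answers with VAR.P or STDEV.P instead, the numbers will differ, because those divide by n rather than n - 1.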
📊 What This Battle Test Really Shows
This comparison reveals the nature of tiny models:
✔ They can learn structured logic
Multi-argument functions like SUMIFS feel natural to them.
✔ They understand modern Excel functions
XLOOKUP support is impressive for their size.
✔ They have pockets of statistical knowledge
T.TEST, CORREL, STDEV.S — these aren’t trivial.
✔ Their biggest weakness is getting argument order exactly right
Not surprising for a 1.7B model.
✔ But they’re fast, responsive, and fun
And that makes experimentation addictive.
🧠 Why This Test Matters
6SigmaMind isn’t trying to beat GPT-4.
It’s demonstrating something different:
Tiny, specialized models can already handle real tasks —
and with the right fine-tuning, they get dramatically better.
The next steps:
- Build a dataset of 1,000+ Excel tasks
- Fine-tune the model on structured examples
- Evaluate accuracy across domains
- Compare against other small models: Phi-3 Mini, Qwen 3B, Gemma 2B
- Release benchmark results publicly
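As a starting point for the evaluation step, here is one way the scoring could be sketched in Python: normalize each formula, then count exact matches per category. This is not the project's actual harness. The function names and the sample outputs are hypothetical, and exact matching is strict, so a valid alternative such as a VLOOKUP answer would be scored as a miss.

```python
from collections import defaultdict

def normalize(formula):
    """Drop whitespace outside quoted strings and uppercase everything else."""
    out, in_quotes = [], False
    for ch in formula.strip():
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes:
            out.append(ch)          # keep string literals untouched
        elif not ch.isspace():
            out.append(ch.upper())
    return "".join(out)

def accuracy_by_category(results):
    """results: iterable of (category, expected, model_output) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for category, expected, got in results:
        totals[category] += 1
        hits[category] += normalize(expected) == normalize(got)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical model outputs, for illustration only
sample = [
    ("SUMIFS", '=SUMIFS(C:C, B:B, "Closed")', '=sumifs(C:C,B:B,"Closed")'),
    ("SUMIFS", '=SUMIFS(C:C, B:B, "Closed")', '=SUMIFS(B:B, C:C, "Closed")'),
    ("XLOOKUP", "=XLOOKUP(H2, A:A, D:D)", "=XLOOKUP(H2, A:A, D:D)"),
]
print(accuracy_by_category(sample))  # {'SUMIFS': 0.5, 'XLOOKUP': 1.0}
```

A stricter benchmark would probably evaluate the formulas against real data instead of comparing text, but exact match is a cheap first pass.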
This project is as much about education as performance.
🎯 Your Turn: Try the Battle Test Yourself
Here’s how to play along:
- Open the demo
- Copy/paste these three prompts:
Prompt 1 — SUMIFS
“Sum values in C where B equals ‘Closed’.”
Prompt 2 — XLOOKUP
“Return the value in column D for the ID in A matching H2.”
Prompt 3 — T.TEST
“Perform a two-tailed t-test for C2:C50 vs D2:D50.”
- Compare the output
- Share the results — good, bad, weird — it all helps
👉 Try the 6SigmaMind battle test:
https://huggingface.co/spaces/benkemp/6SigmaMindv2
🚀 Coming Next in the Series
The next articles will explore:
- How Small Models Understand Excel Logic
- Why Domain-Specific Fine-Tuning Changes Everything
- Excel for Statistics: Training a Mini-Model on Real Data
- How 6SigmaMind Compares to Qwen / Phi-3 / Gemma / SmolLM2
- The Road to 6SigmaMind v2
We’re building this project in the open — and you’re part of it.