Benchmarking Text-to-Excel Formula Accuracy Across SmolLM2-1.7B Small Language Model

Converting plain English instructions into correct Excel formulas is one of the simplest—but also most revealing—ways to evaluate a model’s reasoning, precision, and real-world usefulness. Unlike open-ended text generation, Excel formulas are binary: either the answer is correct, or it isn’t.
That makes this benchmark ideal for comparing Small Language Models (SLMs) and Large Language Models (LLMs) side by side.

To understand how different models perform, we created a large benchmark of unique Excel tasks, ranging from basic arithmetic and text manipulation to advanced lookups, dynamic arrays, and financial formulas. For each task, we provided a short English instruction and asked models to produce the exact Excel formula that satisfies it.

Below is the table showing:

  • The expected formula
  • Each model’s output formula
  • (Optionally) a Correct/Incorrect indicator or scoring column

This allows you to immediately see where models perform well—and where they fail.

Why This Benchmark Matters

Most AI benchmarks test abstract reasoning, math puzzles, or code challenges. Useful, yes—but not what everyday users actually need. Excel formulas are different:

✔ Real business use-case

Millions of people use Excel every day for reporting, analytics, budgeting, forecasting, and automation. An AI that understands Excel formulas delivers real value immediately.

✔ High signal, low noise

If a model outputs even a slightly wrong formula—wrong range, wrong delimiter, missing parentheses—it fails the task. This gives a very clean accuracy signal.

✔ Great for testing reasoning + structure

Excel syntax requires:

  • correct logic
  • correct function names
  • correct range references
  • correct separators
  • correct order of operations

Models that “hallucinate” show it instantly.

✔ Perfect stress test for Small Language Models

SLMs often struggle with structured output. Text-to-Excel conversion exposes exactly how well they handle deterministic tasks.

How the Benchmark Was Built

To ensure fairness and consistency:

  • All tasks use English Excel functions and comma separators.
  • No trick questions or ambiguous phrasing.
  • Tasks cover 80 different scenarios, including:
    • Basic math
    • Conditional logic
    • Text manipulation
    • Dates and times
    • Lookup functions
    • Dynamic arrays
    • Financial functions
    • Error handling
    • Reference manipulation
  • No duplicates and no overlapping instructions.
  • All formulas were validated manually before inclusion.

Each model received the same prompt and no additional context—just the instruction for the formula.

Excel Formula Basic Math & Cell References (1–10) Table

TaskExcel Formula SmolLM2-1.7BScore
1In cell D2, add up the values from A2 to C2.=SUM(A2:C2)✅OK
2In cell C2, calculate net price by subtracting discount in B2 from original price in A2.=A2-B2✅ OK
3In cell F2, calculate total amount by multiplying quantity in B2 with unit price in C2 and then adding 21% tax.=B2*C2*1.21✅ OK
4In cell E2, calculate the average of A2, B2, and C2.=AVERAGE(A2:C2) (reponse model)
=(A2+B2+C2)/3 (expected)
✅ OK
5In cell B2, return the square of value in A2.=A2^2✅ OK
6In cell C2, add 10 to the value in B2.=B2+10✅ OK
7In cell E2, compute the product of all values from B2 to B6.=B2*B3*B4*B5*B6 (response model)
=PRODUCT(B2:B6) (expected)
✅ OK
8In cell F2, return the smallest value from A2 to E2.=MIN(A2:E2) (response model)✅ OK
9In cell C101, return the largest value from C2 to C100.=MAX(C2:C100) (response model)✅ OK
10In cell D2, multiply sales in C2 by commission rate in cell B1, keeping B1 fixed when copied down.=C2*B1 (response model)
=C2*$B$1 (expected)
Not OK

Excel Formula Aggregation & Counting (11–20) Table

TaskExcel Formula SmolLM2-1.7BScore
11In cell B101, sum all values from B2 to B100.=SUM(B2:B100)✅ OK
12In cell D101, calculate the average of D2:D100.=AVERAGE(D2:D100)✅ OK
13In cell E2, count how many cells in C2:C50 contain numbers.=COUNTIF(C2:C50, ">0") (Model result)
=COUNT(C2:C50) (expected)
14In cell F2, count how many cells in A2:A50 are not empty.=COUNTIF(A2:A50,"<>") (model result)
=COUNTA(A2:A50) (expected)
15In cell G2, count how many cells in B2:B50 are blank.=COUNT(B2:B50) (model result)
=COUNTBLANK(B2:B50) (expected)
16In cell H2, sum the values in E2:E100 where the corresponding label in D2:D100 is “Completed”.'=SUMIF(D2:D100,"Completed",E2:E100)'
17In cell H3, count how many cells in F2:F100 are greater than 100.=COUNTIF(F2:F100,">100")
18In cell I2, sum the numbers in G2:G100 that are at least 500.=SUMIF(G2:G100, ">=500")
19In cell J2, return the sum of the largest 5 values in A2:A100.=SUM(LARGE(A2:A100,{1,2,3,4,5}))
20In cell K2, calculate the average of values in B2:B100 that are greater than 0.=AVERAGEIF(B2:B100, ">0")

Excel Formula Text & String Manipulation (21–30) Table

TaskExcel Formula SmolLM2-1.7BScore
21In cell C2, join first name in A2 and last name in B2 separated by a space.=CONCATENATE(A2, " ", B2)
22In cell B2, convert the text in A2 to uppercase.=UPPER(A2)
23In cell C2, convert the text in B2 to lowercase.=LOWER(B2)
24In cell D2, convert the text in C2 so that each word starts with a capital letter.=UPPER(C2)
=Proper(C2) (expected)
No — these two formulas are not the same.
They produce different text transformations.
25In cell B2, return the first 3 characters of the text in A2.=LEFT(A2,3)
26In cell C2, return the last 4 characters of the text in B2.=RIGHT(B2,4)
27In cell D2, return 5 characters from the text in C2 starting at position 3.=LEFT(C2,5)&” “&MID(C2,3,5)
=MID(C2,3,5) (expected)
No — these two formulas are not the same.
They return completely different text outputs.
28In cell E2, return the number of characters in the text in D2.=LEN(D2)
29In cell F2, remove leading, trailing, and extra spaces between words in E2.=TRIM(E2)
30In cell G2, replace the first 3 characters of F2 with “ID-“.=LEFT(F2,3)&"ID-"
=REPLACE(F2,1,3,"ID-") (expected)

Excel Formula Logic & Nested Conditions (31–40) Table

TaskExcel Formula SmolLM2-1.7BScore
31In cell D2, return “Pass” if C2 is at least 50, otherwise return “Fail”.=IF(C2>=50,"Pass","Fail")
32In cell E2, return “Domestic” if B2 equals “NL”, otherwise “International”.=IF(B2="NL", "Domestic", "International")
33In cell F2, return TRUE if A2 is greater than 10 and B2 is less than 5, otherwise FALSE.=IF(AND(A2>10,B2<5),TRUE,FALSE)
=AND(A2>10,B2<5) (expected)
These two formulas are exactly the same in every meaningful way
34In cell G2, return TRUE if C2 equals “Red” or C2 equals “Blue”, otherwise FALSE.=IF(OR(C2="Red",C2="Blue"),TRUE,FALSE)
=OR(C2="Red",C2="Blue") (expected)
35In cell H2, return “High” if G2 is at least 1000, “Medium” if G2 is at least 500 but less than 1000, otherwise “Low”.=IF(G2>=1000,"High",IF(G2>=500,"Medium","Low"))
36In cell I2, show “Eligible” if age in A2 is at least 18 and score in B2 is at least 70, otherwise show “Not eligible”.=IF(AND(A2>=18,B2>=70),"Eligible","Not eligible")
37In cell J2, divide A2 by B2, but if there is an error, return 0 instead.=IFERROR(A2/B2,0)
38In cell K2, return an empty string if A2 is blank; otherwise copy the value from A2.=IF(A2="", "", A2)
39In cell L2, return “A” if score in C2 is at least 90, “B” if at least 80, “C” if at least 70, otherwise “D”. Use the IFS function.=IFS(C2>=90,"A",C2>=80,"B",C2>=70,"C",TRUE,"D")
40In cell M2, show “Yes” if the value in D2 is TRUE, otherwise show “No”.=IF(D2, "Yes", "No")

Excel Formula Lookup & Reference (41–50) Table

TaskExcel Formula SmolLM2-1.7BScore
41In cell E2, use VLOOKUP to find the code in D2 in table A2:C100 and return the value from the 3rd column with exact match.=VLOOKUP(D2,A2:C100,3,0)
=VLOOKUP(D2,$A$2:$C$100,3,FALSE) (expected)
They are 100% equivalent in logic, behavior, and returned value.
42In cell F2, use VLOOKUP to find the score in E2 in range A2:B10 and return the corresponding grade from the 2nd column using approximate match.=VLOOKUP(E2,A2:B10,2,TRUE)
=VLOOKUP(E2,$A$2:$B$10,2,TRUE) (expected)
43In cell B6, use HLOOKUP to find the header in B1 inside A1:F1 and return the value from row 5. Use exact match.=HLOOKUP(B1,A1:F1,5,FALSE)
=HLOOKUP(B1,$A$1:$F$5,5,FALSE)
44In cell G2, return the value from column C where the ID in column A equals the ID in F2, using INDEX and MATCH. Data range is A2:C100.=INDEX(C:C,MATCH(F2,A:A,0))
=INDEX($C$2:$C$100,MATCH(F2,$A$2:$A$100,0))
45In cell H2, return the value from row 3 where the header in row 1 equals the value in G1. Use INDEX and MATCH on range A1:F3.=INDEX(A1:F3,MATCH(G1,A1:A3,0),3)
=INDEX($A$3:$F$3,1,MATCH(G1,$A$1:$F$1,0))
46In cell I2, use XLOOKUP to find the product ID in H2 in range A2:A100 and return the price from C2:C100. If not found, return “N/A”.=XLOOKUP(H2,A2:A100,C2:C100,0,"N/A")
=XLOOKUP(H2,$A$2:$A$100,$C$2:$C$100,"N/A")
47In cell J2, use XLOOKUP to find the email in J1 in range D2:D100 and return the corresponding name from B2:B100. If not found, return an empty string.=XLOOKUP(J1,D2:D100,B2:B100,0,1)
=XLOOKUP(J1,$D$2:$D$100,$B$2:$B$100,"")
48In cell K2, return the position of the value in J2 within range A2:A20.=MATCH(J2,A2:A20,0)
49In cell L2, if A2 is 1, return “Low”; if A2 is 2, return “Medium”; if A2 is 3, return “High”, using CHOOSE.
50In cell M2, return the sum of 5 cells starting at B2 and going down (B2:B6) using OFFSET with SUM.=SUM(OFFSET(B2,0,0,6,1))

Excel Formula Dates & Times (51–60) Table

TaskExcel FormulaScore
51In cell A2, return today’s date.=TODAY()
52In cell B2, return the current date and time.=NOW()
53In cell C2, return the date that is 30 days after the date in B2.=DATE(YEAR(B2),MONTH(B2)+3,DAY(B2))
=B2+30 (expected)
54In cell D2, return the number of days between end date in C2 and start date in B2.=DATEDIF(B2,C2,"d")
=C2-B2 (expected)

55In cell E2, calculate the number of full months between start date in B2 and end date in C2 using DATEDIF.=DATEDIF(B2,C2,"m")
56In cell F2, calculate the number of full years between birthdate in A2 and today.=DATEDIF(A2,TODAY(),”y”)
57In cell G2, return the year of the date in F2.=YEAR(F2)
58In cell H2, return the month number of the date in G2.=MONTH(G2)
59In cell I2, return the day of the month from the date in H2.=DAY(H2)
60In cell J2, return the weekday name (e.g., “Monday”) for the date in I2.=WEEKDAY(I2)
=TEXT(I2,”dddd”)

Excel Formula Dynamic Arrays & Ranges (61–70) Table

TaskExcel Formula SmolLM2-1.7BScore
61In cell E2, return all values from C2:C100 where the corresponding region in B2:B100 equals “West”.=IF(B2="West",C2,"")
=FILTER(C2:C100,B2:B100="West") (expected)
No — these two formulas are not the same at all.
They return different shapes, operate on different ranges, and behave completely differently.
62In cell F2, return all values from D2:D100 that are greater than 1000.=SUMIF(D2:D100, ">1000")
=FILTER(D2:D100,D2:D100>1000)
(expected)
No — these formulas are not the same.
They return completely different output types and perform different operations.
63In cell G2, return all rows of sales amounts from E2:E100 where region in B2:B100 is “North” and month in C2:C100 is “Jan”.'=SUMIF(B2:B100,"North",E2:E100)'
=FILTER(E2:E100,(B2:B100="North")*(C2:C100="Jan"))
No — these formulas are not even close to the same.
They return different types of outputs, use different logic, and produce different values.
64In cell H2, return a unique list of values from B2:B100.=UNIQUE(B2:B100)
65In cell I2, return a unique list of items from A2:A100 sorted alphabetically.=UNIQUE(A2:A100)
=SORT(UNIQUE(A2:A100)) (expected)
No — they are not the same.
They often look similar, but they do not guarantee the same output.
66In cell J2, return all rows of range A2:C100 sorted by the values in column C in descending order.=SORT(A2:C100,3,-1)
67In cell K2, return the first 10 rows from range A2:D100 as a dynamic array.=OFFSET(A2:D100,0,0,10,4)
=TAKE(A2:D100,10) (expected)
They give the same result in this exact case — 10 rows × 4 columns — but they are not the same function and not interchangeable.
68In cell L2, return all rows from A2:D100 except the first data row (drop 1 row).=OFFSET(A2:D100,0,0,1,4)
=DROP(A2:D100,1) (expected)
No — these two formulas are not the same.
And in fact, they return completely different rows.
69In cell M2, from range A2:E100, return only columns 1 and 5 as a dynamic array.=INDEX(A2:E100,1,5)
=CHOOSECOLS(A2:E100,1,5)
No — these formulas are not the same at all.
They return completely different shapes and behave differently.
70In cell N2, from range A2:A20, return only rows 1, 5, and 10 as a dynamic array.=ArrayFormula(IFERROR(INDEX(A2:A20,MATCH(ROW(A2:A20),ROW(A2:A20),0)),))
=CHOOSEROWS(A2:A20,1,5,10)

Excel Formula Miscellaneous & Advanced (71–80) Table

TaskExcel Formula SmolLM2-1.7BScore
71In cell E2, return the rank of value in D2 among the list D2:D20 (largest value should have rank 1).=RANK(D2,D2:D20)
=RANK(D2,$D$2:$D$20,0)
72In cell F2, return the 90th percentile of the values in E2:E100.=PERCENTILE.EXC(E2:E100,0.9)
=PERCENTILE.INC(E2:E100,0.9)
73In cell G2, return the median of the values in F2:F100.=MEDIAN(F2:F100)
74In cell H2, return the sample standard deviation of values in G2:G100.=STDEV(G2:G100)
75In cell I2, calculate the net present value of a series of cash flows in B2:B7 with a discount rate of 8% given in cell A1.=NPV(A1,B2:B7)
76In cell J2, calculate the monthly payment for a loan with annual interest rate in B1, number of months in B2, and loan amount in B3. Assume payments at end of period.=PMT(B1/12,B2,B3)
77In cell K2, join the text values from A2:A4 into one string separated by commas and a space.=CONCATENATE(A2,”, “,A3,”, “,A4)
=TEXTJOIN(“, “,TRUE,A2:A4) (expected)
78In cell L2, return the position of the substring “@” in the text in K2.=MATCH(FIND(“@”,K2),K2,0)
=SEARCH(“@”,K2)
79In cell M2, return TRUE if the value in L2 is a number, otherwise FALSE.=IF(ISNUMBER(L2),TRUE,FALSE)
=ISNUMBER(L2)
80In cell N2, read the cell address from M2 (for example “A5”) and return the value of that cell using INDIRECT.=INDIRECT(M2)

How to Interpret the Results Table

When you look at the table below, you can identify four typical model behaviors:

1. Perfect Formula Generation

The model returns exactly the expected formula.
These are counted as full matches.

2. Functionally Equivalent but Different Order

Some models output formulas that differ in structure but produce identical results (e.g., $B$1*A2 instead of A2*$B$1).
You may choose to count these as correct equivalents depending on your scoring rules.

3. Minor Syntax Errors

Models might output formulas with:

  • semicolons instead of commas
  • wrong range spellings
  • missing quotes
  • misplaced parentheses

These are incorrect, even if close.

4. Hallucinated or Unsupported Functions

Sometimes a model invents functions like “SUMIFX” or “FILTERROW,” especially smaller SLMs. These clearly demonstrate model limitations.

What This Benchmark Shows About Model Quality

From our testing across different sizes and architectures, a few trends emerge:

Large models (LLMs)

  • High accuracy across nearly all tasks
  • Strong handling of nested logic and lookup functions
  • Rarely hallucinate functions
  • Good at obeying output-only instructions

Mid-size models (1–3B parameters)

  • Often handle simple formulas well
  • Struggle with nested IF, XLOOKUP, and dynamic arrays
  • Occasional misuse of separators or mismatched parentheses
  • May hallucinate modern functions

Small models (<1B parameters)

  • Good at very simple tasks (SUM, AVERAGE, IF)
  • Performance drops sharply on lookup, MATCH, INDEX, and array formulas
  • Sensitive to phrasing and prompt format
  • Sometimes output explanatory text even when instructed not to

This makes Excel a surprisingly good complexity discriminator between SLMs.

Using the Results for SLM Development

If you are training or fine-tuning your own model, this benchmark can guide improvements:

  • Identify which function families need reinforcement (e.g., LOOKUP, DATE, TEXT).
  • Catch consistent syntax mistakes to refine tokenization or prompting strategies.
  • Evaluate whether the model follows strict output protocols.
  • Use this benchmark repeatedly to track progress across different training runs.

Because all tasks are deterministic, improvements are easy to measure and compare.

100 HARD Excel Formula Tasks (Advanced Benchmark Set SmolLM2-1.7B)

1. Advanced Logic & Nested Conditions (1–20)

TaskExcel FormulaScore
1Assign a risk level based on A2: “Critical” if ≥90, “High” if ≥70, “Medium” if ≥50, otherwise “Low.”=IF(A2>=90,”Critical”,IF(A2>=70,”High”,IF(A2>=50,”Medium”,”Low”)))
=IFS(A2>=90,”Critical”,A2>=70,”High”,A2>=50,”Medium”,TRUE,”Low”)
2Return “Approve” if (B2>100 AND C2>50) OR D2=”Manager”, otherwise “Deny”.=IF(AND(B2>100,C2>50),”Approve”,”Deny”)
=IF(OR(AND(B2>100,C2>50),D2=”Manager”),”Approve”,”Deny”) (expected)
3Divide A2 by B2; if error divide A2 by C2; if error return 0.=IFERROR(A2/B2,IFERROR(A2/C2,0))
4Return TRUE only if A2 is even AND divisible by 5.=IF(MOD(A2,5)=0,IF(MOD(A2,2)=0,”TRUE”,””))
5Commission = 5% for sales ≤1000, 10% for sales ≤5000, otherwise 15%.= IF(Sales ≤1000, 5%, IF(Sales ≤5000, 10%, 15%))
=IFS(A2<=1000,A20.05,A2<=5000,A20.10,TRUE,A2*0.15) (expected)
6Return “Found” if cell B2 contains substring in A2.
7Count cells in A2:A100 that contain “error” (case-insensitive).
8TRUE if B2 is between A2 and C2 (inclusive).
9If C2 < TODAY() return “Expired”, if =TODAY() return “Today”, otherwise “Future”.
10Return “Valid” if A2 contains “@” AND ends with “.com”.
11TRUE if A2 equals B2 ignoring case.
12TRUE only if exactly one of B2 or C2 is greater than 100.
13“Child” <13, “Teen” <18, “Adult” <65, “Senior” otherwise.
14TRUE if A2 contains “red” OR “blue” OR “green”.
15If A2/B2 > 10 return “High”, else “Low”; if division error, return “Invalid”.
16TRUE if date in A2 is Saturday or Sunday.
17If B2 blank return “Missing”, if <0 return “Negative”, else “OK”.
18TRUE if text in A2 is exactly 10 digits.
19Grade = A2*40% + B2*35% + C2*25%.
20If A2>0 and B2>0, return average; else return blank.

2. Advanced Lookup & Reference Tasks (21–40)

TaskExcel FormulaScore
21Look up value where row header = A2 and column header = B2 in table A10:F100.
22Return last matching value of A2 from B2:B100 using XLOOKUP.
23Return price from D2:D100 where quantity in C2:C100 is closest to value in B2.
24Find row where A2 matches Column A AND B2 matches Column B, return value in Column C.
25Find A2 in D2:D100 → return E2:E100; if not found return “Missing”.
26Return hyperlink from column C where ID in A2 matches column A.
27Return value from row A2 in the column whose header equals B2.
28Map 1→“Low”, 2→“Medium”, 3→“High”, else → “Unknown”.
29Return the nth occurrence (in B2) of value A2 from C2:C100.
30Lookup where col A = A2 and col B = B2, return col C.
31Approximate-match row (A2) and column (B2) in numerical grid C10:H20.
32Return last non-empty value in A2:A100.
33Find value in B2:B100 whose date in A2:A100 is closest to C2.
34Return all rows where Column A = A2.
35Return sales from D2:D100 where region B2:B100=A2 and month C2:C100=B2.
36Return the column header of the max value in row 2 (A2:Z2).
37Return columns 1, 4, and 7 from A2:H20.
38Count how many rows FILTER(A2:A100,A2:A100>0) returns.
39Return last value in column B using OFFSET.
40Lookup A2 in B2:B100 but skip blank matches.

4. Advanced Date/Time Challenges (41–55)

TaskExcel FormulaScore
41Business days between two dates excluding holidays. Holidays in H2:H10.
42First business day of the month.
Given date in A2.
43Last business day of the month.Given date in A3.
44Age in years, months, days. Given date in A4.
45Convert text “YYYYMMDD” to date. Given date in A4.
46ISO week number. Given date in A5.
47First Monday after a date. Given date in A6.
48Quarter number. Given date A7.
49Convert hours (decimal) to time.
50Sum hours greater than 8
51Add B2 business days to A2.
52Days until next birthday.
Birthdate in B2.
53Month name
54Determine fiscal year starting July
55Determine week start date (Monday)

4. Dynamic Arrays, Array Math, and Advanced Filtering (56–80)

TaskExcel FormulaScore
56Return top 10 values as array
57Return bottom 5 unique values
58Rows where amount > average
59All duplicates in a column.
60Remove top N rows
N in B1.
61Append N blank rows (N in B1)
62Build 1–100 array
63Create multiplication table 1–10
64Normalize values to 0–1 range
65Running total with SCAN
66Running maximum
67Return rows where column C is not blank and column B>0
68eturn unique combinations of two columns
69Count unique values
70Frequency distribution 0–9
71Mode of values
72Remove blank rows
73Sort by two columns
Sort A2:D100 by column B ascending then column C descending.
74TRANSPOSE table
75Replace all zeros with blanks
76Average by group using LET
Group in A2:A100, values in B2:B100.
77Sum by unique groups.
78Return first N even numbers.
N in B1.
79Highlight top 3 values via array (TRUE/FALSE)
80Remove the smallest value

5. Financial, Statistical & Engineering Formulas (81–100)

Latest Articles