Attention: As of January 2024, we have moved to counting-stuff.com. Subscribe there, not here on Substack, if you want to receive weekly posts.
Getting LLMs to do technical stuff, like write code or SQL, is one of the hot fads these days. As is commentary on that behavior.
Benn just last week wrote about a hypothetical “smol analyst”, a bot that you direct through an iterative design session, much as you would direct a junior analyst who is prone to making mistakes. In there, he explores the notion of how verifying the validity of a chart is very much different than verifying the behavior of a generated app. In one, if the requested button does the requested behavior, it’s verified. In the other, seeing a requested chart does not guarantee the numbers within the chart are actually correct. It’s a hard design problem filled with black boxes. There are tons of companies and people who believe such problems can be overcome and are willing to burn money trying to do so.
Meanwhile, on a number of occasions, I said the following to people: “People invented formal languages because we learned what a shitshow human language is. LLM to code/SQL tools is society collectively re-learning this lesson at great expense.” It’s really really hard to write clearly, and if I have to write a page of context and filters to describe the data I want, I might as well be writing SQL.
But as time progressed, I started wondering to myself… to what extent is my statement actually true? What did math and other logical endeavors look like before the relatively modern invention of formal languages and notation? While I don’t think my statement is wrong, I honestly don’t know what math was like before the modern era. I barely know what math is like today.
So I went looking for examples of math problems from ancient times, specifically from the Renaissance and earlier. I wanted to understand as best I could the amount of effort people went to in order to describe and solve problems without much of the modern notation and tools we use today. Obviously, mathematicians of that era were able to overcome the sloppy nature of natural language to arrive at, and describe, exact mathematical truths. As last week’s preview post alluded to, this week is a brief peek into what they did.
So this week is a trip through some of the very old, very famous math texts and some of the problems I found within (that didn’t take pages and pages to explain).
First, what’s a math problem?
On my journey to find old math problems, I eventually realized that I didn’t really understand what a math problem actually was to begin with. In my mind, the words “math problem” spun up images of the problem sets that we had for homework throughout school and into college. Here’s a set of symbols and formulae, manipulate them using the techniques we learned to arrive at an answer. Here’s a right triangle with these sized lengths, find the angle X. Solve for X in a system of equations, etc.
But… honestly that’s not math, just a reflection of the pedagogical methods with which I had been exposed to math for much of my life. I had often heard that math problems from the ancient era were all stated as word problems, but the more I’ve exposed myself to the notion, the more I’ve come to understand that all problems essentially start off as word problems — they’re descriptions of honest actual “problems that need to be solved in the real world”. Those problems then subsequently get abstracted into symbols, we start seeing patterns across how the symbols are laid out and solved, and then the abstraction lives on separated from the original words. We may at some point forget even where the original problem statement came from, but the symbols always can act as a shorthand that can be rehydrated into a problem described purely, if not awkwardly, with words.
Another thing to also keep in mind is that the books I found are famous treatises that aimed to not just show problems, but also the solutions to said problems. They’re effectively ancient textbooks, except they don’t have problem sections that are left as exercises to the reader. Oftentimes, I found it hard to separate the text of the problem from the solution — they’re often presented one right after another. Honestly, many of the explanations are difficult enough to understand that you have to sit down and work them out yourself to even follow along.
Book 1 — Euclid’s Elements
Euclid’s Elements, here referencing Richard Fitzpatrick’s translation from 2007, is the book that was used to teach geometry for millennia, and adaptations are still used to this day. Many students are probably familiar with the first book, which covers basic geometry.
Here’s an example from Book 2, which goes into geometric algebra. I assume that the diagram and the first paragraph are essentially the problem statement, with the remaining text being the proof. I honestly have a ton of trouble even understanding what that first paragraph even means.
One thing that takes some time to get used to is how all these texts talk about forming squares and cubes with numbers in a literal geometric sense.
The translator notes that this is a geometric version of the algebraic identity a*b + a*c = a² if a = b + c.
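To convince myself the identity holds, here is a minimal sketch that cuts a line of length a into two pieces b and c and compares the two rectangles against the square. The function name is my own invention and not anything from the translation.

```python
def rectangles_match_square(b: int, c: int) -> bool:
    """Cut a line a into pieces b and c, then compare areas.

    The rectangles a*b and a*c should together equal the square on a.
    """
    a = b + c  # the whole line is the sum of its two pieces
    return a * b + a * c == a * a


# Try every pair of small whole-number pieces.
print(all(rectangles_match_square(b, c)
          for b in range(1, 20)
          for c in range(1, 20)))
```

This only spot-checks whole numbers, of course. Euclid's proof covers every possible cut of the line.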
Book 2 — The Arithmetica by Diophantus of Alexandria
This book is another ancient Greek text, The Arithmetica by Diophantus of Alexandria — from which we get Diophantine equations.
This particular translation includes the modern algebraic notation to the problems, and thanks to that, the solutions seem obvious to us since we’re used to that kind of manipulation. But just imagine what kind of algorithm you would need to provide the solutions to the generalized question of “Divide a given number [100] into two having a given difference [40]”. And this is just the first problem in the entire book.
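For what it's worth, here is one way the generalized first problem could be written as a procedure today. This is a sketch in modern terms and not Diophantus's own method. The function and parameter names are hypothetical.

```python
from fractions import Fraction


def split_by_difference(total, difference):
    """Divide `total` into two numbers whose difference is `difference`.

    Take the difference off the total, halve what is left to get the
    smaller number, then add the difference back for the larger one.
    """
    smaller = Fraction(total - difference, 2)
    larger = smaller + difference
    return larger, smaller


larger, smaller = split_by_difference(100, 40)
print(larger, smaller)  # 70 30
```

Using Fraction keeps the answer exact when the halving does not come out to a whole number.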
Things just get progressively more confusing with “divide a given number [100] into two numbers such that given fractions (not the same) [1/3 and 1/5] of each number when added together produce a given number [30].” With the necessary condition of “The latter given number must be such that it lies within the numbers arising when the given fractions respectively are taken of the first given number”
I have the example in front of me, in modern symbolic notation… and I still am not sure what the problem even means.
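Here is my best reading of it as a sketch, assuming the first fraction applies to the first number and the second fraction to the second. The names are made up for illustration, and the elimination step is modern algebra, not the method in the book.

```python
from fractions import Fraction


def split_by_fractions(total, p, q, target):
    """Find x + y = total such that p*x + q*y = target.

    The necessary condition from the text: `target` must lie between
    the two fractions taken of `total`, or one part comes out negative.
    """
    low, high = sorted((p * total, q * total))
    if not low < target < high:
        raise ValueError("target must lie strictly between p*total and q*total")
    # Substitute y = total - x into p*x + q*y = target and solve for x.
    first = (target - q * total) / (p - q)
    return first, total - first


first, second = split_by_fractions(100, Fraction(1, 3), Fraction(1, 5), 30)
print(first, second)  # 75 25
```

With the book's numbers, a third of 75 is 25 and a fifth of 25 is 5, which add up to the required 30. The condition also makes more sense here: 30 sits between 100/5 = 20 and 100/3.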
A few more problems down and I am so completely lost, it’s pretty hilarious.
Book 3 — al-Khwarizmi's Al-jabr
An untranslated copy of al-Khwarizmi's Al-jabr, and another translation by Frederic Rosen. This book from the 9th century CE is considered the foundational text that established algebra as a field.
So one of the initial problems is “one square, and ten roots of the same amount to thirty-nine dirhems”. The book restates the question as “what’s the size of the square which when its side is increased 10 times, has an area equal to 39”. In modern notation, we’re to solve x² + 10x = 39 for x.
Many of us would have been taught to use the quadratic formula to solve this. But the given solution is … convoluted. You halve the number of the roots (the 10 in 10x, to get 5) and multiply that by itself to get 25. Then you add 25 + 39 = 64, take the square root of that to get 8, subtract half the number of roots again (5), and you get 3.
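The recipe is mechanical enough to write down as a short function. This is a sketch that follows the steps as I read them, and the function and variable names are mine, not the book's.

```python
import math


def square_and_roots(roots, number):
    """Solve x^2 + roots*x = number for the positive x.

    Follows the worded recipe: halve the roots, square that,
    add the number, take the square root, subtract the half.
    """
    half = roots / 2              # 10 -> 5
    squared = half * half         # 5 * 5 = 25
    total = squared + number      # 25 + 39 = 64
    return math.sqrt(total) - half  # 8 - 5


print(square_and_roots(10, 39))  # 3.0
```

Note that the recipe only ever produces the positive answer, which fits a problem about the side of a square.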
The translator’s modern notation explains it like this:
x² + 10x = 39
x = sqrt( (10/2)² + 39) - 10/2
x = sqrt(64) - 5
x = 8 - 5 = 3
Even with modern notation I’m not sure how everything got rewritten to get to the second line, so you have to go down into the proof. There’s a geometric proof on page 13 of the book, but the text is multiple pages long and too much for this post. You can take a look at the link to see for yourself.
The gist of the provided solution is that you start with the inner square AB, whose unknown side y is what you want, and you need to attach an additional 10*y worth of area to it. You take 10/4 = 2.5 and extend four rectangles out (C, G, K, T) from the sides of AB by 2.5 units each to get a cross shape. That leaves four empty 2.5*2.5 unit squares in the corners, which sum up to an area of 25. Filling them in gives a giant square with a known area of 39 plus the 25 extra units you created, which is 64. Its root (aka side) is 8, and removing the 2.5-unit strip on each side leaves y = 3. That gives the answer we’re seeking, and also explains why I couldn’t understand how they rewrote the formula algebraically.
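To check my reading of the diagram, here is a small sketch that adds up the pieces of the big square for the answer of 3. The labels for the pieces are my own shorthand, not the lettering in the book.

```python
def completed_square_pieces(side, roots):
    """Areas of the pieces in the geometric proof.

    `side` is the side of the inner square and `roots` is the
    number of roots (10 in the problem). Each of the four
    rectangles sticks out by roots/4 from a side of the square.
    """
    strip = roots / 4                # 10 / 4 = 2.5
    inner = side * side              # the unknown square
    rectangles = 4 * strip * side    # the four arms of the cross
    corners = 4 * strip * strip      # the four corner squares
    return inner, rectangles, corners


inner, rectangles, corners = completed_square_pieces(3, 10)
print(inner + rectangles)            # 39.0, the original problem
print(inner + rectangles + corners)  # 64.0, the completed square
print((3 + 2 * 2.5) ** 2)            # 64.0, the big side squared
```

The cross alone is the 39 from the problem statement, and the corners are exactly the 25 that the worded recipe tells you to add.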
Book 4 — Liber Abaci
Finally, we come to Fibonacci’s Liber Abaci, in L.E. Sigler’s English translation. This 13th century book introduced the Hindu-Arabic number system to Medieval Europe, replacing the Roman numeral system, which was extremely inconvenient for calculation purposes. The book is named after the procedures for calculating the values of things, what we now call algorithms and what was then called abaco.
The book is utterly filled with practical problems that a merchant would encounter, including figuring out the weight of silver and copper needed to alloy money, because money-making was apparently a thing that could be engaged in during the medieval period.
Since the book aims to teach doing calculations with a new number system, it covers a very wide range of topics starting from the extremely simple, basic multiplication and addition, to many operations with fractions and proportions (due to how important such work is for commerce). But it eventually touches on algebra and solving what we would call systems of equations.
What I find quite interesting is that many of the problems read similarly to a word problem you might find in a student’s math text. There is a huge array of currencies and weights and measures that need conversion. In one problem, we’re asked to figure out the value of a chunk of cheese.
A ton of cheese which weighs 22 hundredpounds, that is 2200 pounds, is sold for 24 pounds [the currency]; it is asked how much 86 pounds are worth? [p137]
In the excerpt above, not only is the answer tricky to understand, the notation is foreign to us. Mixed numbers are written with the fraction first, 1/4 10 instead of 10 1/4. They also use a system of composed fractions, the (8 1 9 18 / 10 11 12 20) is read right to left (taking a cue from Arabic), and is calculated as 18/20 + 9/(12*20) + 1/(11*12*20) + 8/(10*11*12*20) = 18/20 + 9/240 + 1/2640 + 8/26400 = 24768/26400 = 258/275.
Decimal expressions are a special case of the above composed fractions, 3.1416 would look something like ( 6 1 4 1 / 10 10 10 10 ) 3, though this usage doesn’t appear very often in the text. Supposedly this notation makes calculation a bit easier since the factors are more readily accessible.
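The right-to-left reading rule is easier for me to follow as code. Here is a minimal sketch, assuming the numerators and denominators are passed in the same left-to-right order they are printed in. The function name is hypothetical.

```python
from fractions import Fraction


def composed_fraction(numerators, denominators):
    """Evaluate a composed fraction written as (n... / d...).

    Reading right to left, each numerator sits over the product
    of its own denominator and every denominator to its right.
    """
    total = Fraction(0)
    running = 1  # product of the denominators seen so far
    for num, den in zip(reversed(numerators), reversed(denominators)):
        running *= den
        total += Fraction(num, running)
    return total


# The answer to the cheese problem.
print(composed_fraction([8, 1, 9, 18], [10, 11, 12, 20]))  # 258/275
# The same value computed directly: 86 pounds at 24 per 2200.
print(Fraction(86 * 24, 2200))                             # 258/275
# The decimal special case, with the whole number 3 added on.
print(composed_fraction([6, 1, 4, 1], [10, 10, 10, 10]) + 3)  # 3927/1250
```

The last value, 3927/1250, is exactly 3.1416, so decimals really are just composed fractions where every denominator is 10.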
But the book isn’t all about mundane fractions and basic arithmetic.
This problem below comes much later in the book and is an example of applying calculation methods to geometry. The solution requires you to diagram out the poles and their lengths, figure out the triangles involved using geometry, and then effectively using the Pythagorean theorem to work out where along the longer pole the intersection occurs.
A Problem on Two Poles
On a certain ground are standing two poles that are nearly 12 feet apart, and the lesser pole is in height 35 feet, and the greater 40 feet; it is sought, if the greater pole will lean on the lesser, then in what part of it will it touch? [p543]
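Here is a sketch of how I picture it, assuming the greater pole pivots at its base and comes to rest against the top of the lesser pole, and treating the poles as exactly 12 feet apart. The names are my own, and this is my interpretation of the setup, not a transcription of the book's solution.

```python
import math


def touch_point(gap, lesser_height, greater_height):
    """Where along the leaning greater pole it touches the lesser one.

    The base of the greater pole, the base of the lesser pole, and
    the top of the lesser pole form a right triangle. The leaning
    pole lies along the hypotenuse.
    """
    along = math.sqrt(gap ** 2 + lesser_height ** 2)
    if along > greater_height:
        raise ValueError("the greater pole is too short to reach")
    overhang = greater_height - along  # length sticking out past the touch
    return along, overhang


along, overhang = touch_point(12, 35, 40)
print(along, overhang)  # 37.0 3.0
```

Under those assumptions the poles touch 37 feet along the greater pole, leaving 3 feet poking out past the top of the lesser one. The numbers 12, 35 and 37 form a whole-number right triangle, which is presumably why they were picked.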
In this book, the problems themselves are surprisingly familiar and relatable despite it being written over 800 years ago. Who doesn’t have a ton of cheese to sell and need to sell an 86 pound chunk? Instead, the complexity is in the explanations. The detailed description needed to lead the reader into doing the calculation correctly sounds extremely alien to me. I find it hard to keep track of the various objects and numbers that need to be operated on.
I’ll end this week’s journey with… a description of algebra. Without modern notation, they had to refer to “things” and squares and roots and something that was translated as “census”, which seems to mean x². Just make an attempt at reading the paragraph and try to figure out what’s going on.
Yup, old math is really confusing
Obviously this newsletter is not math history themed. I don’t write much about math at all. So I thank you for joining me on this wild tangent to satisfy my curiosity about how humans tried very hard to write very precise things. It was about as painful as I expected, but also different from what I had expected. While there were plenty of convoluted wordings, especially surrounding algebra, it was still somewhat accessible.
A special thanks to the paid subscribers to the newsletter! I wouldn’t have splurged, mostly on impulse, on a giant 600 page book on calculation without their support. Next week we go back to something data science-y.
Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.
Guest posts: If you’re interested in writing a data-related post to either show off work or share an experience, or you need help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
randyau.com — Curated archive of evergreen posts.
Approaching Significance Discord, where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the discord.
Support the newsletter:
This newsletter is free and will continue to stay that way every Tuesday, so share it with your friends without guilt! But if you like the content and want to send some love, here are some options:
Share posts with other people
Consider a paid Substack subscription or a small one-time Ko-fi donation
Get merch! If shirts and stickers are more your style — There’s a survivorship bias shirt!
the x^2 + 10x = 39 thing: the solution to which you found the geometry more appealing is called "completing the square". Algebraically, what you do is to realize that (x+5)^2 is x^2 + 10x + something else, and then you write the left side with the something else and simplify, getting (x+5)^2 = something, and then take square roots. Specifically, x^2 + 10x = 39 => (x+5)^2 - 25 = 39 => (x+5)^2 = 64 => x + 5 = 8 => x = 3 (or -13, but the original problem couldn't have a negative answer).
The geometry makes it rather more obvious why the name is what it is.
“It’s really really hard to write clearly, and if I have to write a page of context and filters to describe the data I want, I might as well be writing SQL.” True, that. I spent years going in both directions working as a mortgage backed securities lawyer. (Guilty.) The drill was that bond traders came up with some new twist on cash flow slicing and dicing and sold it to buyers. The modeler in the back office expressed it in spreadsheet form to confirm that it worked mechanically and there would be a call to discuss it all in general terms. From there one of us lawyers would describe it in detail for the offering document and the other for the contractual document, in even greater detail. After closing, the cash flow administrator would program the whole process to determine who got what each month. At the insistence of our tax lawyer, who had a PhD in math, our work was all done with defined terms, making our work a Lego-like word problem. So, sort of like writing 10-20 pages of context and filters as the defined terms. It was fun, sort of like flipping burgers--hundreds of billions served. Until the music died.