Why I’m Learning more SQL

Sam Myres
5 min readMar 24, 2024

I think that Software Engineering as a whole is lacking on this particular skill.

When I got my start, it was still LAMP days. That is, the most common stack of technologies for web development (where I found my entrance into this industry) was still Linux, Apache, MySQL and PHP. For me, that MySQL part was nearly missing, as I could far too easily install PHPMyAdmin, an open-source PHP app that connects to MySQL servers for management, and outside of SELECT, UPDATE, INSERT and DELETE, I didn’t really have to know SQL for developing my own app.

A decade on and I’ve used a few more technologies. I even made it into a Perl shop still doing the whole license-Oracle-DB-by-CPU-core thing. I still wonder if they meant physical or logical cores 🤔. The role I find myself in now is using Go, an absolute godsend of a programming language imho, but that doesn’t matter here. We do our work in sort of micro-lithic apps all working against a set of rather beefy SSD-laden Postgres 🐘 Databases. For the first time the tables I’m working with are measured in the 10s to 100s of MILLIONS of rows. Yes, with SSDs and all the magic of RDBMS it works, but we’re feeling those row counts on queries that join more than two of these tables.

A lot of what I’ve learned up to this point has been just what was necessary to work on the features/projects I’ve been assigned along my career path. There was a very short stint as a contractor where at the suggestion of the client I learned a good bit from SQLZoo.net but aside from that, I never purposefully learned about just SQL. All of the knowledge was about making SQL work inside the procedural language I was working with directly.

SQL, if you think about it, is really quite magic. It’s so old that I can’t tell you without looking it up whether it or COBOL is older. (looked it up, COBOL has about 10 years on SQL, if I just believe Gemini). It’s declarative, meaning you don’t actually tell the system how to do what you want it to do. Instead you just tell it what data you’re looking for with a very specific format, and it’s able to analyze how to get that data while taking into account table sizes and index availability, etc. That cannot be easy. Every time I think about how I would start to implement software remotely similar to an RDBMS I hurt myself with confusion like a tired Pokemon.

Enter stage left, a Jira ticket: “The API our customers use to get info about money they’re earning, it’s super-duper slow.” Okay, maybe the ticket wasn’t worded like that, but you get the point. This query that was actually four JOINs decided it was going to traverse a whole lot of the database before determining that you might have 0 results. We’re talking 5+ seconds to get nothing back, 30+ seconds to get back a couple thousand rows. I was given the task of researching possible improvements for the view in this query, and I duly started researching information about indexing, approaches to the query and even partitioning. I presented some options to my team and I was fortunate to have the my VP of Eng. at my side looking into this, as he discovered that de-normalization of a column used for filtering made a huge difference for this particular query.

One small change was able to get put in production almost immediately: BRINs. Block-Range Indexes are min & max values from a page placed in metadata on the pages holding blocks of data from tables. In our case dates that were being used in the WHERE clause of the query were nearly “clustered”, ascending in value as you traverse down the rows. Because of this, BRIN indexes were able to speed up performance 1.5–3x. This is because it eliminated the need for checking many blocks where the range of dates didn’t overlap but Postgres couldn’t previously know.

For the larger improvement, after a couple weeks of development and a couple more of delays because this was not a critical change, this attempt at improvement was put into Prod. It did not have exactly the intended effect. It’s likely due to other loads/types of operation on the same table underlying the view and the fact that much more spartan API rate limits were put on the micro-lithic app querying this view when one client was hitting us very fast on a Friday evening. In fact, the latency improvement for one important client from that rate limit change alone was very close to what I was going to try to achieve fixing the filtering on my view.

My change did however noticeably change the speed of this API for other much smaller clients who were recently onboarded or are still waiting for their first earnings. In fact, it sped up the search about 50x from 3.2 seconds to 60 milliseconds. This may actually be more important as making a good first impressions with these clients through our tech can be make-or-break stuff for a lucrative relationship. End of the day, I’m happy I’ve been able to make such an obvious impact, but I’m still hungry for more DB Performance.

So I asked my boss. I said “hey, this DB optimization stuff has been really interesting and I like it, can I buy a book on Postgres and expense it?” and the answer was not only a yes; it was a directive to read the book with consideration for adding it to a list of candidates for a book club meeting we hold once a sprint. Since I received and started reading the book I have decided it’s very dry tutorial-like expansion on each feature is not exactly invigorating reading. Instead, I just appreciate the Postgres documentation online more now 😆

So that’s how I got to a point where SQL has taken dominance in what I’m learning about Software Engineering today. It’s been an interesting turn to take, as I got this job (I think) specifically because I’m a staunch advocate and enjoyer of Go, the programming language. I’m still learning about Go especially since it’s still actively in development like most programming languages, but SQL represents an almost entirely separate and expansive landscape of learning opportunity. And although it’s not something I would brag about to my current employer and I’m not looking for an exit, SQL is definitely still a more generalizable / applicable skill than Go. SQL and RDBMSes are still used almost everywhere in the industry and it’s more important than ever that we understand all the things it’s abstracting away from us in our everyday workflows so that we don’t go reinventing wheels when we just need to rewrite a query.

Don’t sleep 💤 on SQL 🗄, it’s not a job you can outsource entirely to ChatGPT just yet.

--

--

I'm a Software Engineer, FAA 107 Drone Pilot and Radio Amateur. I write about things related to SWE and Tech and my own projects.