
source: Max Pixel (http://maxpixel.freegreatpicture.com/home)
Whenever we think about tech, it’s easy to conjure images of coding, obscure languages, and terminology that is indecipherable. Before I got involved in the industry, the Matrix digital rainfall was analogous to tech for me.
While learning solid software development practices, version control, and staying up-to-date on the latest libraries is very important to be successful, many of the intricacies of coding can be broken down into more digestible flows of logic. Code should be thought of as a means for achieving an end, not the end itself. Imagine a carpenter who is building a desk. The first task is not to think about which hammer to use, but to envision the steps and design of the final product.
Before I got involved in tech, I knew very little about coding. I absorbed what I could at my first job and took some online courses in SQL and Python. These skills helped me to think about technical problems differently. However, after developing my skills for a few years, I believe it’s more important to think about the “what” and “why” to a problem instead of the “how”.
Working example
To illustrate this approach, let’s imagine you have a database with customer information. For simplicity, let’s say there is a customer order table and customer location table.
If you want to use SQL to see what was sold to customers in Minneapolis, how would you approach this? If you know some SQL, it’s tempting to jump right into writing the code to extract the data. However, even without any knowledge of SQL, we can still reason through the problem and arrive at some pseudo code that does the same thing.
Let’s solve the “what” and “why” questions first. What are we trying to do? We want to get the order information for customers from Minneapolis. This means we probably want to include all of the existing information from the customer order table and apply some filter on the city. Why are we doing this? We are looking to get a sense of our sales to customers in Minneapolis to potentially assess our performance in that market. Based on this it would probably be a good idea to include the city that each order came from in our final result.
Now that we have answered the what/why questions, we can think about design of our approach. The customer order table lists all the order details, including the CustomerId. This is the identifier to tell us which customer made the order. Then we have the customer location table which tells us where each customer is located. From a high-level perspective, we want all the data from the customer table, but we only want to keep the orders that are located in Minneapolis. This suggests we need to get the data from both tables, and then apply a condition on the city.
Let’s now summarize the steps we may take to achieve our goal.
1) Get all the data from the CustomerOrderTable
2) Get the city from the CustomerLocationTable
3) Join together the data from Step1 and Step2 based on the CustomerId
4) Only keep the resulting data from Step3 that is from Minneapolis
Below is some pseudo code that could be used to achieve this. The instructions are written in order from top to bottom (this is not in a known language, just notes about what I would do).
get Data from CustomerOrderTable
get City from CustomerLocationTable
join CustomerOrderTable and CustomerLocationTable as FinalTable by CustomerId
keep Data from FinalTable where City is "Minneapolis"
These instructions get the data from both tables, create a new table that includes all the data, and then applies a filter where the city is Minneapolis. Irrespective of whether you are using SQL or another language, this approach is universally applicable. Now that we know the what/why, we can finally go into our toolbox and apply SQL.
SELECT CustomerOrderTable.*, CustomerLocationTable.City
FROM CustomerOrderTable
INNER JOIN CustomerLocationTable
ON CustomerOrderTable.CustomerId == CustomerLocationTable.CustomerId
WHERE CustomerLocationTable.City == "Minneapolis";
The above script is valid SQL. If we look at this code, we can see the same type of pattern from our pseudo code. We are getting all the data from the CustomerOrderTable (denoted by the ‘*’ symbol) as well as the city from the CustomerLocationTable. We do what is called an inner join on the two tables, connecting them by their shared column of CustomerId. The inner join ensures that we only keep customer order records that have an associated record in the location table. Finally, we ensure that the city is equal to “Minneapolis”. Executing this code returns the following.
Design first, code later
The example illustrated above shows how intuition and design must come before any specific technical approaches. If you can reason about what task you are trying to solve and why it’s important, your solution will be informed by those answers. Even if you don’t know much about the specific language/tool you are using, understanding the problem and breaking it down into its pieces will enable you to thrive in the ever fast-paced tech environment. Coding languages, libraries, and frameworks all evolve. But core problem solving skills never go out of style.