Database Anti-Pattern: Recursive Network Select
Recursive Network Select is a pattern that occurs when a program makes a call to a database returning a resultset and then as it loops through that resultset it makes a call to the database for each record as it loops through the resultset. In extreme cases, the program will add additional levels of recursion while looping through the resultsets returned as a result of the calls made while looping through the initial resultset.
Here is a typical example: The program retrieves a list of customers in the database. Then to get a list of orders placed by each customer the program will loop through the customers and for each customer it will query the database to get a list of orders placed for that customer. In the extreme cases, the order itself doesn’t contain all of the necessary information, so the program will then loop through each order and send another query to the database to get the list of products in each order.
Why is a Recursive Network Select bad? Let’s look at some of the numbers.
One hundred customers, each with an average of three orders, and each order averaging three products.
- One query to get the customer list
- One hundred queries to get the orders for each customer
- Three hundred queries to get the product list for each order
That’s a total of four hundred and one queries. Each of those queries is composed of the following steps (simplified, there are actually many more steps than this):
- Open a connection to the server (some languages/frameworks retain persistent database connections)
- Transmit your query across the wire to the database server
- The database server parses and tokenizes the query
- After processing the query the database server comes up with a query plan to fetch the data
- Indexes are traversed, identifying rows that qualify for inclusion
- The server positions the read/write heads to gather the data from disk
- The server reads all of the data from disk assembling the resultset
- Criteria are used to exclude data from the resultset based on the where clause
- The final resultset is packaged up and shipped back across the wire to your program (optionally closing the connection when finished)
- Your program processes the resultset and makes it available to user level code for processing
If each query ran in 100 milliseconds, it would still take 41 seconds to process that resultset. The reality is that the total round trip time for those queries would be closer to 200 to 300 ms even if the query itself ran in 100 milliseconds. Each query comes with non-optimizable network overhead. That’s with only one hundred customers, the page load time increases with each customer and order added. Imagine this with system dealing with many thousands of customers.
This isn’t an idle speculative post, I have run across this pattern numerous times in the field dealing with database systems. In fact a system I am dealing with currently has this problem, which inspired this post.
There are a number of reasons programmers make this mistake. The biggest reason seems to be lack of understanding about SQL, or the total processing time required for this type of a solution. Unfortunately a large number of programmers do not properly understand how to join tables to get data. Sometimes programmers who know how to join tables, simply opt to do a Recursive Network Select because they view it as easier to do in their language than it is to use joins and get the data in a single call.
So we’ve established that the Recursive Network Select is bad practice, but how do you get around it? If a developer is working with a database, it’s important to understand the basic concepts involved, such as joining tables and/or limiting results with a properly formed where clause. Next, with the advent of good OO mappers, many languages provide framework facilities that will cleanly rip apart the resultset leaving you with a nice set of objects to loop through. Even if you don’t have a nice OO mapper to disassemble the dataset, it’s better to loop through the records and do it yourself than it is to make hundreds or thousands of network calls.
Finally, in most cases your program shouldn’t be returning hundreds or thousands of records to the client. This is why almost all libraries support pagination. People cannot consume hundreds or thousands of records at one time. Typically results should be limited to ten or twenty top level results. In some cases you may need to deal with hundreds or thousands of records, and where possible you should do a lot of the heavy lifting in Stored Procedures.
This article may be obvious to many developers, unfortunately it isn’t obvious to everyone. So the next time you encounter a Recursive Network Select, point the developer to this page, hopefully this will help them understand how performance damaging it can be.