Designing an Online Book Store
I’ve recently been given the opportunity to give a small tech talk at a Hackathon taking place at my alma mater. The goal is both to teach how software engineers build something and to demonstrate it live. I wanted to pick a topic that would be representative of what I think most engineers work on, but also something that I had experience with. Having worked in e-commerce for a long time, I decided that implementing a store using AWS would be the best way to do that. Maybe it’s generic, but it is an easy-to-understand use case: most college students have some experience using an online store, and building something with cloud tools is likely to help them in any kind of software engineering career (or, at least, in the hackathon). It also covers the kind of skills that people want to capture in a systems design interview question (for what that’s worth).
This is the corresponding blog post. It covers the same content, but focuses more on the experience of designing the system and my thought process as I make those decisions. The talk will focus more on the how-to aspect of setting up the project and using the associated tools. In this post, therefore, you’ll see the more disorganized process of me designing something, then maybe backtracking later and changing it a little.
Let’s start with what we want the store to be able to do.
- We should be able to browse products. We can implement ~3 products in our example.
- We should be able to add some products to a cart, and this cart should persist across browsing sessions. There are a lot of definitions of “session” in the front-end world, but I mean session in the layman’s sense here, e.g. closing your browser and revisiting the same URL.
- At the end of our “session”, we should be able to buy the items in our cart and that order should be logged somewhere.
I think those are the basic elements for a store. Of course, I’m skipping a lot of details for simplicity here. I’ve worked most of my career to this point in Order Validation and Fulfillment, both of which are conveniently ignored here.
I like to start my design with how we will store data. Given the requirements that we’ve identified, we should work out what kind of data we need to store.
- Product information. In order to browse products we need to display information about these products for the customer. Let’s call this the catalog.
- Session information. We need to know what products customers have in their cart for the checkout stage.
- Order information. We need to know what customers ordered and how to fulfill that order. We need this to actually ship the product to customers but also for legal and tax reasons.
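As a rough sketch of those three kinds of records, here is one way they might look as Python dataclasses. Every class and field name below is an assumption made up for illustration; the actual structure of the data is still undecided at this point in the design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    """One catalog entry (hypothetical fields)."""
    product_id: str
    name: str
    price_cents: int


@dataclass
class Cart:
    """What a customer currently has in their cart (hypothetical fields)."""
    customer_id: str
    product_ids: List[str] = field(default_factory=list)


@dataclass
class Order:
    """A record of what a customer bought (hypothetical fields)."""
    order_id: int
    customer_id: str
    product_ids: List[str]
    total_cents: int


# Build one of each to show how the records relate to one another.
book = Product(product_id="p-1", name="Example Book", price_cents=1999)
cart = Cart(customer_id="c-1", product_ids=[book.product_id])
order = Order(
    order_id=1,
    customer_id=cart.customer_id,
    product_ids=list(cart.product_ids),
    total_cents=book.price_cents,
)
```

The cart and the order both refer to products by id rather than copying the product data, which is one possible choice among several.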
There are lots of ways to store data, and data storage is a complicated topic in itself, but I am planning to do all of this with DynamoDB. The main reason is that I want to help students with their Hackathon, so it’s simplest to choose a serverless solution to decrease the complexity of the project. This leaves either DynamoDB or S3, and DynamoDB makes the most sense for these things. And although serverless SQL is starting to exist nowadays, I think that NoSQL makes the most sense for the Hackathon too, because we don’t have the time during a Hackathon to do the proper kind of table design that SQL really requires.
We still have to decide exactly how we want to structure our data, so the “data store” aspect of this design is not done, but one of the advantages of NoSQL is that we can decide this later or even as we implement, so I’m moving on to the API design.
API – Or, a brief detour into the definition of an API
At least when I was in college, I didn’t really know what API meant, so here’s a quick and dirty explanation. I see a lot of people trying to explain what an API is in light of the Oracle v. Google Supreme Court case, and they always just say API stands for Application Programming Interface. I feel that doesn’t really help at all, so I will try to describe it differently. In short, software tends to be architected in layers, and an API is the general catch-all term for the ways software layers interact.
So, for example, in Python, in order to write to a file, you could use the following code:
with open("text.txt", "w") as file_handler:
    file_handler.write("Some string")
Writing files is complicated and depends on your Operating System, your File System, and even the actual kind of disk you have. Generally speaking, you make a request to your Operating System, which in turn makes a request to your File System, which in turn makes a request to your actual hardware.
Something like this: Python program → Operating System → File System → Hardware.
But Python takes care of this for you. You write this same code regardless of whether you’re working with:
- macOS, APFS, NVMe SSD;
- Windows, NTFS, HDD;
- or Linux, ext4, SATA SSD.
That snippet of Python code uses what is described as the File API for Python. It’s how Python programs interact with files. This allows programmers to create complicated systems without necessarily understanding how each piece works, as long as they understand the concepts the API designer has created for them.
If the data storage section is the nouns in this project, the APIs are the verbs. What can we do with this data? Let’s break these down by the requirements again.
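To make that concrete, here is a small sketch that writes a file and reads it back using only Python’s built-in file functions. The helper name round_trip and the temporary directory are just for illustration; the point is that nothing in the code mentions the operating system, the file system, or the kind of disk.

```python
import os
import tempfile


def round_trip(text: str) -> str:
    """Write text to a file, then read it back, using only the file API."""
    # A temporary directory keeps the example from leaving files behind.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "text.txt")
        # Same call as in the post: open for writing, then write a string.
        with open(path, "w") as file_handler:
            file_handler.write(text)
        # Reading goes through the same API, with a different mode.
        with open(path, "r") as file_handler:
            return file_handler.read()


result = round_trip("Some string")
```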
Primarily, we need a way to read the catalog data. Let’s call this
getProduct: (productId: str) -> (Product)
It will take a productId, which will be a UUID, and return a JSON representation of the product.[1]
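Here is a minimal sketch of what getProduct could look like, assuming an in-memory dictionary in place of the real catalog. The CATALOG dictionary, the add_product helper, and the field names in the JSON are all hypothetical; only the idea of taking a UUID and returning JSON comes from the design.

```python
import json
import uuid
from typing import Dict

# Hypothetical in-memory catalog standing in for the real data store.
CATALOG: Dict[str, dict] = {}


def add_product(name: str, price_cents: int) -> str:
    """Helper for the example: store a product under a fresh random UUID."""
    product_id = str(uuid.uuid4())
    CATALOG[product_id] = {
        "productId": product_id,
        "name": name,
        "priceCents": price_cents,
    }
    return product_id


def get_product(product_id: str) -> str:
    """Return the product as a JSON string; raises KeyError if it is unknown."""
    return json.dumps(CATALOG[product_id])


book_id = add_product("Example Book", 1999)
product_json = get_product(book_id)
```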
Saving a customer’s cart session
For this one, I expect customers to need to add products to their cart;
addToCart: (customerId: str, productId: str) -> ()
be able to list the items in cart;
getCart: (customerId: str) -> List<Product>
and remove items from cart;
removeFromCart: (customerId: str, productId: str) -> ()
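The three cart operations can be sketched with a plain dictionary standing in for the data store. This is only an illustration under a few assumptions: get_cart takes a customerId like the other two operations, it returns product ids rather than full Product objects, and removing an item that is not in the cart is silently ignored.

```python
from typing import Dict, List

# Hypothetical in-memory store: customerId -> list of productIds.
CARTS: Dict[str, List[str]] = {}


def add_to_cart(customer_id: str, product_id: str) -> None:
    """Append a product to the customer's cart, creating the cart if needed."""
    CARTS.setdefault(customer_id, []).append(product_id)


def get_cart(customer_id: str) -> List[str]:
    """Return a copy of the cart so callers cannot modify the stored list."""
    return list(CARTS.get(customer_id, []))


def remove_from_cart(customer_id: str, product_id: str) -> None:
    """Remove one occurrence of the product; do nothing if it is absent."""
    items = CARTS.get(customer_id, [])
    if product_id in items:
        items.remove(product_id)


add_to_cart("c-1", "p-1")
add_to_cart("c-1", "p-2")
remove_from_cart("c-1", "p-1")
cart_now = get_cart("c-1")
```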
Note that you start to see the benefit of creating these APIs rather than using the database directly. You could just read from the database on the client side using its own API, but this way you limit the ways that a customer can interact with the data.
For checkout, we will need the same getCart that we created earlier. In addition to that, we’ll need a way to create orders:
makeOrder: (customerId: str) -> (orderId: int)
makeOrder takes a customerId and returns an int representing the orderId. The orderId will be visible to the customer, so I’ve chosen a simple int to display to them.
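A minimal sketch of makeOrder might look like the following, again with dictionaries standing in for the data store. Several details are my own assumptions for the example rather than part of the design: order ids count up from 1, ordering an empty cart is rejected, and the cart is emptied once the order is logged.

```python
import itertools
from typing import Dict, List

# Hypothetical in-memory stores standing in for the real tables.
CARTS: Dict[str, List[str]] = {"c-1": ["p-1", "p-2"]}
ORDERS: Dict[int, dict] = {}

# Simple increasing integers for customer-visible order ids (an assumption).
_next_order_id = itertools.count(1)


def make_order(customer_id: str) -> int:
    """Turn the customer's current cart into a logged order; return its id."""
    items = CARTS.get(customer_id, [])
    if not items:
        raise ValueError("cannot order an empty cart")
    order_id = next(_next_order_id)
    ORDERS[order_id] = {
        "orderId": order_id,
        "customerId": customer_id,
        "productIds": list(items),
    }
    # Empty the cart so the same items are not ordered twice.
    CARTS[customer_id] = []
    return order_id


first_order_id = make_order("c-1")
```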
For these APIs, I’m planning to use API Gateway backed by Lambda because I think serverless will be the easiest to use for the Hackathon (but also I’m biased).
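As a rough idea of what a Lambda function behind API Gateway might look like for getProduct, here is a sketch. It assumes the Lambda proxy integration, where the request arrives as an event dictionary and the function returns a dictionary with a statusCode and a string body. The path parameter name productId and the in-memory CATALOG are made up for illustration; a real version would read from the data store instead.

```python
import json

# Hypothetical stand-in for the catalog table.
CATALOG = {"p-1": {"productId": "p-1", "name": "Example Book"}}


def handler(event, context):
    """Look up one product and return it in the proxy-integration shape."""
    # pathParameters can be missing or None when the route has no parameters.
    path_params = event.get("pathParameters") or {}
    product = CATALOG.get(path_params.get("productId"))
    if product is None:
        return {
            "statusCode": 404,
            "body": json.dumps({"message": "product not found"}),
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(product),
    }


# The function can be called locally with a hand-built event for testing.
response = handler({"pathParameters": {"productId": "p-1"}}, None)
```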
Generally, in this part of the design, we want to imagine the way that customers would interact with the system. For a web application, this would be the web page or pages that they would use to shop. For a software engineer, this part of the design is usually done in advance. Luckily for the end users, Product Managers and/or UX Designers typically design the general interface, and engineers only implement it.
I have limited front-end experience, having only worked in the front end for a few projects here and there, and the front end generally lives on the client side and doesn’t have too much interaction with cloud architecture. So I’m probably going to just make a quick website for demo purposes but not really include it in this design process.[2]
My next post will be the how-to post on how to set this up using AWS.
[1] Why a UUID instead of just a number? It can be a problem if customers are able to guess your product ids, and sequential ids can also be used by competitors to estimate how many products you have. See https://en.wikipedia.org/wiki/German_tank_problem
[2] Plus it took me two hours to embed the html for that one graphic in this post, so maybe I shouldn’t be teaching front end development.