Here’s a quick guide to designing a medium-sized system.
Identify stakeholders
Identify all stakeholders: everyone whose well-being is impacted by the project, including customers who will use the product, technical staff who will develop and maintain it, management, investors, and so on.
Decide on interfaces
Decide what each stakeholder’s interface to the system will be: A UI? An API? Logs for your DevOps team? Analytics for your Product team? Weekly reports to your executives? Each of these interfaces represents a surface.
Work backwards from interfaces / surfaces
Work backwards from the surfaces. List the high-level requirements of each, then the tech and policies required to support it (independently of the others) in abstract terms: Does it need a UI? A web service? Persistent storage? Particular datasets? Authentication? A human being writing a weekly email?
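If you prefer a digital equivalent to pen and paper, the inventory from this step can be as light as a small structured list. The sketch below is only an illustration: the `Surface` class, the two example surfaces, and the `needs_by_surface` helper are all hypothetical names, not part of any standard tool.

```python
from dataclasses import dataclass, field


@dataclass
class Surface:
    """One stakeholder-facing interface, described in abstract terms."""
    name: str
    stakeholder: str
    requirements: list[str] = field(default_factory=list)
    needs: set[str] = field(default_factory=set)  # abstract tech and policies


# Hypothetical entries; each surface is listed independently of the others.
surfaces = [
    Surface(
        name="Security Dashboard",
        stakeholder="Security team",
        requirements=["See which geographies are accessing the website"],
        needs={"web UI", "authentication", "access-geography data"},
    ),
    Surface(
        name="Weekly Executive Report",
        stakeholder="Executives",
        requirements=["Summarize key metrics every week"],
        needs={"analytics data", "a human being writing a weekly email"},
    ),
]


def needs_by_surface(items: list[Surface]) -> dict[str, list[str]]:
    """Return each surface's needs as a sorted list, keyed by surface name."""
    return {s.name: sorted(s.needs) for s in items}
```

Keeping the needs abstract ("authentication", not a specific product) is the point: concrete technology choices come later.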
Identify dependencies and commonality
Identify dependencies and commonality. If two surfaces need access to the same data—for example, if your Security and Product teams both want to know what geographies are accessing your website, but need the information in separate dashboards—define a Single Source of Truth and an API that enforces invariants, such as a microservice with exclusive access to a database. This is all still conceptual, done with pen and paper or their digital equivalents, not code.
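The design itself stays on paper at this stage, but it can help to picture what "an API that enforces invariants" might eventually look like. Here is a minimal sketch, assuming a hypothetical `GeoAccessStore` that stands in for the microservice, with an in-memory list standing in for its database. The two-letter country code rule is an invented invariant, chosen only for illustration.

```python
from collections import Counter


class GeoAccessStore:
    """Hypothetical single source of truth for access-by-geography data.

    The private list stands in for a database that only this service touches.
    """

    def __init__(self) -> None:
        self._records: list[str] = []

    def record_access(self, country_code: str) -> None:
        """Store one access, enforcing the invariant on the way in."""
        code = country_code.strip().upper()
        # Invariant (an assumption for this sketch): two ASCII letters only.
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"invalid country code: {country_code!r}")
        self._records.append(code)

    def counts_by_country(self) -> dict[str, int]:
        """Read-only summary that any dashboard can consume."""
        return dict(Counter(self._records))


store = GeoAccessStore()
store.record_access("us")
store.record_access("DE")
store.record_access("US")

# Both dashboards read through the same API, so they cannot disagree.
security_view = store.counts_by_country()
product_view = store.counts_by_country()
```

Because every write passes through `record_access`, the Security and Product dashboards can present the data differently while still drawing on the same validated records.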
Diagram the complete logical architecture
Diagram the complete logical system architecture. This isn’t a “draw the rest of the owl” exercise, but a consolidation of your work from the previous steps into a single, cohesive document. Use it to get your entire team on the same page, and to help the folks you haven’t met yet who will be hired to maintain and expand the system in years to come.
Use stick figures or smiley faces to represent people or external systems (“actors”) that will interact with the system; boxes to represent surfaces, microservices, and batch jobs; and cylinders to represent data stores.
Name components, represent dependencies and data flow
Name every component. Add directed arrows in two different styles: one for logical dependencies and one for data flow. Make sure the dependency arrows describe an acyclic graph, and that every component and arrow has an obvious reason for existing.
Add a key explaining what each symbol and arrow style means. Use standard flowchart or UML symbols if convenient, but don’t stress over it: The simplicity and internal consistency of your design document are more important than compatibility with any external standard.
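If you keep a text version of the dependency arrows alongside the diagram, the acyclic requirement can be checked mechanically. The sketch below uses `graphlib` from Python's standard library (3.9 or later). The component names and their dependencies are hypothetical, and `find_cycle` is a helper invented for this example.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Optional

# Hypothetical components; each key depends on the components in its set.
dependencies = {
    "Marketing Site": {"Inventory Service"},
    "Inventory Service": {"User Database"},
    "User Database": set(),
}


def find_cycle(graph: dict[str, set[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    try:
        # static_order() raises CycleError if the dependencies contain a cycle.
        list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # The second exception argument is the detected cycle; its first
        # and last entries are the same node.
        return list(err.args[1])
    return None


cycle = find_cycle(dependencies)
print("No dependency cycles" if cycle is None else f"Cycle found: {cycle}")
```

A check like this only covers the dependency arrows. Data flow arrows may legitimately form loops, so they would need to be kept in a separate mapping.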
Review and incorporate feedback
Get reviews and seriously consider all feedback. Don’t incorporate bad advice, but do record any rejected suggestion in a “rationale” document, along with the reason you went a different way. Your team, taken as a whole, is probably smarter than you are.
Track updates
As the system evolves, produce updated iterations of the system architecture diagram. Keep a versioned history. You could be forgiven for thinking this step-by-step process smacks of Waterfall and Big Design Up Front, and that all system diagrams are immediately obsolete; but having a well-thought-out design is tremendously helpful. Go into this process understanding that the system architecture diagram is a living document whose maintenance cost will be dwarfed by its ongoing usefulness.
At some point, you may be tempted to get cutesy. You’ll have trouble naming something, so you’ll want to call it Wombat or Quicksilver or NeoFromTheMatrix. Resist that temptation. If there’s no obvious, short name for a component, rethink your design. Name each thing for what it does, and let it have a single purpose: Marketing Site, Inventory Service, User Database, etc.
Address cross-cutting concerns, like how many instances of each microservice you’ll run, or which relational database to use, separately from the logical architecture.
Establish trust and communication with PM
If you’re an engineering lead, establish trust and clear communication with your Product Manager early in the process. Respect the PM as the voice of the customer. Likewise, if you’re a PM, loop your engineering lead into product conversations. Make sure they understand your vision.
One particular gotcha is worth calling out: While we ordinarily focus our efforts on the most important aspects of a system, during the design phase we’re better off focusing on aspects that require foundational support. We may care a great deal about performance, but performance can best be optimized later, once we have profile data. Similarly, though we value feature-richness, we can add features later. Other aspects, like quality and security, are almost impossible to add reliably once the system is up and running. You can’t sprinkle security dust on a rickety old system.