The 7 Layers of a Production App
This article outlines the base elements that an application developer eventually has to consider while designing, creating and deploying contemporary software: software made for public or private distribution that processes data in some way and provides some form of user interface.
What is an Application?
Contemporary vernacular loosely lumps anything which provides some sort of user interface to a computer program into the term app, which is short for application. Ultimately, for all but the most particular of souls (such as myself), every piece of software is an 'app' now.
If you are writing tools, frameworks, APIs, or any other software which falls outside of the context as described, then only a subset of these layers may apply to your software. For our purposes, we are explicitly concerned with the design and implementation of a modern web and/or mobile application deployed for 24/7 multi-user service.
6 Critical Layers
- Application Stack
The technologies, components, APIs, packages and layers of your application, from project root inward exclusive of static assets and inclusive of all actionable code which shall, at some point, be compiled or interpreted. Some parts of your application stack will also be part of your server stack.
- Assets
Files (typically) that are not executed or compiled and are passed to the client without processing. File types such as CSS, fonts and images are almost always considered static assets, but an asset (or static asset if you prefer) has no restriction on file type. Any dynamically constructed element (as in on-the-fly or generated-per-request) ceases to be considered an asset for our purposes. A file which is generated by a pre-processor, for instance, becomes an asset, while the file processed by the pre-processor shall be considered actionable code. Technically, assets are a static subset of your application stack or data sets.
- Data Sets
All data is data, so to speak. Your program is, in fact, data. But in this context the reference is to actual data collected and processed by your program in some way and most likely stored somewhere. This may be a large distributed database, a collection of spreadsheets, a collection of files (like your database ultimately is), or a collection of bytes in some form of memory buffer (like your database, or file system, could be). The point is, any container which holds data-at-rest shall be considered a data set (or store, collection, table, etc), which therefore may be considered to have state.
- Infrastructure
Ultimately your program (application stack), assets and data all must live in some physical location(s). Even if it only exists in your mind. Whatever physical resources your program requires for one complete deployment is what shall be considered your application's infrastructure, regardless of the geo-physical boundaries it occupies.
- Program Flow
If all your program does is nothing, it is probably very boring. Program flow defines how a program sends bits of data from one piece, component or service to another. Whether that other is within the program itself or external to it is of little consequence. Any movement of data, signal emitted, or side-effect incurred shall be considered part of program flow. An application's state may be said to be a snapshot of the program's flow frozen in time.
- Server Stack
This is the system of software and technology that resides between your infrastructure and your application, enabling your software to be executed (or interpreted). It also generally provides the mechanisms for storing and retrieving your data and for managing processes and resources. Logging, monitoring and management tools are always available at this layer, although at times an interface may be provided by the provider of one's infrastructure, and one's infrastructure may be logged, monitored and managed from somewhere within one's server stack.
The Sweet 7th Layer, Automation
The icing on the cake, of course, is almost all sugar. Cake is still good without it, but a truly well-crafted cake has just the perfect amount.
Automation will always be found in highly scaled applications, although applications of any size may utilize it. There are many ways to integrate magic into your application, from automated code building and deployment, to infrastructure that scales on demand.
Many people want to jump right in and start developing their application infrastructure in a scalable way using automation. It sounds like a great idea, but in the real world, it can just make everything harder. It's not that hard, but if you haven't done everything manually first it can be hard to grasp what is really happening behind the scenes when something goes wrong.
You, your application and its layers should all grow together harmoniously. Don't try to do everything the way the big boys do before you even get off the ground or you never will.
You may think of Facebook as a mammoth corporate entity furthering the interests of Big Brother's Big Data, and in that you would not be wrong. However, it is also an application stack. It started off much like any other baby, with a smattering of PHP and some database. Facebook is still an application stack, although a highly distributed one that integrates many disparate technologies and is composed of many applications. It still just breaks down to a pile of code and assets that can be collectively viewed as its application stack.
Solely an application stack and not an application unto itself, Meteor provides all the tools for building an application from front-to-back and is therefore fairly rare in being a complete application stack without also being an application.
Perhaps the most well-known and freely available application stack (which I have an extremely strong distaste for), it is also an application. And, a stack. It has a variety of code and assets which can be used to create a diverse array of applications, even if in the end they are just web sites.
Asset Management Methodologies
The easiest and most frequently used method of serving up assets is to store and serve them directly alongside the rest of your application, from the same physical location. That means that ultimately, they are being served to clients directly by your web server daemon.
- Content delivery network
When your audience grows geographically, it can often reduce page load times for those located physically distant from your primary web server if your static assets are cached by someone with a large and geographically distributed infrastructure. We call those CDNs. Use wisely though - if you use 10 CDNs to deliver content, there are 10 additional points of latency (and potential failure) which can ultimately destroy your goal of reduced load time.
- Dynamically generated & cached
This method stores assets in a database (or on the fs before pre-processing) and then uses some form of caching mechanism to store those assets in memory or on disk until they are changed, at which time a new version will be generated and cached. Using this method in tandem with smart CDN choices can really lighten the load on your infrastructure, decrease load time for your visitors, and increase your application's resilience to failure.
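The generate-and-cache idea can be sketched in a few lines. This is only a minimal illustration, assuming an in-memory cache keyed by a hash of the asset's source; `AssetCache` and `minify_css` are hypothetical names invented for this example, and the "pre-processor" is a toy that just collapses whitespace.

```python
import hashlib
from typing import Callable, Dict, Tuple


class AssetCache:
    """Cache generated assets in memory until their source changes."""

    def __init__(self, build: Callable[[str], str]) -> None:
        self._build = build  # hypothetical pre-processor
        self._cache: Dict[str, Tuple[str, str]] = {}  # name -> (source hash, output)
        self.builds = 0  # how many times an asset was (re)generated

    def get(self, name: str, source: str) -> str:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._cache.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]  # source unchanged: serve the cached copy
        output = self._build(source)  # source changed: generate a new version
        self._cache[name] = (digest, output)
        self.builds += 1
        return output


def minify_css(source: str) -> str:
    """Toy stand-in for a real pre-processor: collapse all whitespace."""
    return " ".join(source.split())


cache = AssetCache(minify_css)
first = cache.get("site.css", "body {\n  color: red;\n}")
second = cache.get("site.css", "body {\n  color: red;\n}")  # served from cache
third = cache.get("site.css", "body {\n  color: blue;\n}")  # regenerated
```

Three requests cause only two builds here. A real setup would more likely keep the cache on disk or in a memory store and put a CDN in front of it.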
Data Management Techniques
- File system
Storing your data in flat files on the file system is super old-school. It's also, oddly, forever hip. It doesn't necessarily scale well, but it's not impossible either.
- Non-Relational databases
Black is the new black, so to speak. Dating back to the 70s or maybe even the stone age, now that storage is cheap and computation is expensive, NoSQL is back to being taken seriously. After decades of normalization mantra, it can be hard to start to accept denormalization. Dropping all the weight relationships bring does make for blazing fast queries though.
- Object stores
Object stores are essentially giant sets of key:value pairs stored in memory. Usually they have some sort of disk-based archival or backup system to maintain state in the event of failure, but some run at such large scale or with such live data that they do not. Useful in all but development or small-scale situations, keeping an application's data in RAM is obviously a massive performance gain. Of course, it may also incur quite a bit more cost in infrastructure.
- Relational databases
You would find it difficult to argue that a relational database suffers in performance in any way compared to a non-relational database. And, that may be true in some circumstances or a perfect world. The real advantage to relational databases is data integrity. There are others, but that is the most important. In the end, your choice of database is often determined by your application stack's standard choice.
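To make the data integrity point concrete, here is a small sketch using Python's built-in `sqlite3` module. The `users` and `posts` tables are made up for illustration, and a plain dict stands in for a key:value store. The relational side refuses a post that points at a user who does not exist, while the key:value side happily stores it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " title TEXT NOT NULL)"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'ada')")
conn.execute("INSERT INTO posts (user_id, title) VALUES (1, 'hello')")

# Relational: a post pointing at a user that does not exist is refused.
try:
    conn.execute("INSERT INTO posts (user_id, title) VALUES (99, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Plain key:value store: nothing stops the same orphan from being written.
kv_store = {"user:1": {"name": "ada"}}
kv_store["post:2"] = {"user_id": 99, "title": "orphan"}
orphan_accepted = "user:99" not in kv_store and "post:2" in kv_store
```

The trade-off is that the database does this checking on every write, which is part of the weight that non-relational stores drop.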
- Traditional shared hosting
The traditional way people go about launching their first web site, web app or web business is most often backed by cPanel and WHM and a LAMP server. Essentially your stuff lives on the same physical hardware as some other people's stuff, buried deep in a data center somewhere. Pretty limited on configuration options, but also pretty light on administrative duties.
- Traditional VPS/DVS
You can think of this as your own personal server-in-the-closet, but in a data center. Or someone else's closet. But on a virtual private server (or dedicated virtual server), you get root access and total control (or very close to it) over the software configuration. Hardware configuration, such as the number of processors and the amount of RAM, can still be costly in terms of time or labor to alter.
Services like Heroku, Modulus and Engine Yard are known as platform providers, which means they provide a pretty-yet-simplified interface combining your server stack and infrastructure into a neat point-and-click package. Most of them run on AWS, removing most of its complexity (and power). Most importantly perhaps, they simplify time-to-deployment while still offering the ability to scale vertically and horizontally without intervention by tech support.
- Direct Cloud
There are only a handful of cloud providers that are really operating at cloud scale (AWS, Microsoft Azure, Google Cloud, IBM SoftLayer, Rackspace Cloud), and everything else you can think of as either a limited interface to one of those, or a dinosaur. Operating directly in the cloud is definitely the most cost-effective, most resilient (cough S3 crash cough) and hippest. However, those benefits are directly offset by the complexity of operating at that scale. When you are talking about extreme scale, you must scale and manage every single tiny bit of every layer in this application paradigm, right down to network cards and routing tables. Of course, you can start out directly in the cloud with just one little micro-service. That's easy!
Program Flow Methodologies
Model-View-Controller is a method, often used in PHP, which separates components into three levels. The controller is essentially the middle-man which does all the leg work to get data from the model (database) layer into the view (presentation) layer, and vice versa.
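A bare-bones sketch of that separation follows. The to-do example and all of its names (`TodoModel`, `render_todos`, `TodoController`) are invented for illustration, and a list stands in for the database.

```python
from typing import Dict, List


class TodoModel:
    """Model: owns the data (a list stands in for the database)."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, title: str) -> None:
        self._items.append(title)

    def all(self) -> List[str]:
        return list(self._items)


def render_todos(items: List[str]) -> str:
    """View: turns data into presentation and nothing else."""
    return "\n".join(f"{i}. {title}" for i, title in enumerate(items, start=1))


class TodoController:
    """Controller: the middle-man between model and view."""

    def __init__(self, model: TodoModel) -> None:
        self._model = model

    def create(self, form: Dict[str, str]) -> str:
        self._model.add(form["title"].strip())  # input from the view goes to the model
        return self.index()

    def index(self) -> str:
        return render_todos(self._model.all())  # data from the model goes to the view


controller = TodoController(TodoModel())
controller.create({"title": " write article "})
page = controller.create({"title": "ship it"})
```

Neither the model nor the view knows the other exists; only the controller touches both.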
The Flux client-side application architecture, promoted by Facebook's development team, defines what they call a 'uni-directional data flow'. By uni-directional, they mean circular. It very closely parallels the W3C Document Object Model's principles, but in a React-oriented context using their own ontology. Data flows down and events flow up, they say. Essentially Flux was born of a need to define React's program-flow methodology.
This is my personal program-flow methodology developed over many years of coding. The document itself is in its infancy, however the concept is finally honed in my brain. Each component represents a single concern, components have types and those types send data to only one other type along a uni-directional hierarchy (direction one, top to bottom). Events may be triggered by any component at any time, sending messages back towards the top (direction two, bottom to top) that may be intercepted along the way (making a J rather than a full loop). A component receiving data from an upstream component generally returns a confirmation signal that the data was received as intended. Actions happen (actionable code is executed resulting in a change of application state) when a listening component receives a message or a high-level API method is called.
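The loop Flux describes can be sketched without any framework at all. This is an illustration of the pattern only, not Facebook's library API; `Dispatcher`, `CounterStore` and the `increment` action are names assumed for this example.

```python
from typing import Callable, Dict, List

Action = Dict[str, object]


class Dispatcher:
    """Single hub that every action passes through."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Action], None]] = []

    def register(self, callback: Callable[[Action], None]) -> None:
        self._callbacks.append(callback)

    def dispatch(self, action: Action) -> None:
        for callback in self._callbacks:
            callback(action)


class CounterStore:
    """Holds state; only changes in response to dispatched actions."""

    def __init__(self, dispatcher: Dispatcher) -> None:
        self.count = 0
        self._listeners: List[Callable[[], None]] = []
        dispatcher.register(self._on_action)

    def subscribe(self, listener: Callable[[], None]) -> None:
        self._listeners.append(listener)

    def _on_action(self, action: Action) -> None:
        if action["type"] == "increment":
            self.count += 1
            for listener in self._listeners:
                listener()  # tell the views that state changed


dispatcher = Dispatcher()
store = CounterStore(dispatcher)
rendered: List[str] = []

# View: re-renders from the store whenever the store changes (data flows down).
store.subscribe(lambda: rendered.append(f"count = {store.count}"))

# The view never edits the store; it dispatches an action (events flow up).
dispatcher.dispatch({"type": "increment"})
dispatcher.dispatch({"type": "increment"})
dispatcher.dispatch({"type": "unknown"})  # ignored by the store
```

Action, dispatcher, store, view, and back to action: that is the circle hiding inside 'uni-directional'.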
- Message passing
A message passing system does not mandate which components interact with which or in what way. Application state may always be considered transient. Messages are sent from one component to another, where the receiver may decide what code, if any, to execute as a result of that message. Flux uses message passing for sending events to the top of the model (thus, data only flows down; messages are not technically data, nor do they call any actionable code directly).
Reactive program flow creates a sub-conscious layer of interdependency within your application. Changing a value in one place automagically propagates that change throughout the program. It is often used in real-time hardware programming, but is becoming very popular in interface design as well. Meteor was the first major modern framework to utilize reactive methodologies, with Facebook following shortly thereafter with a client-side only implementation called React.
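As a rough sketch of message passing, assume a tiny in-process bus where components subscribe to topics. `MessageBus`, the topic names and the two handlers are all hypothetical; a production system would more likely use a queue or broker.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], None]


class MessageBus:
    """Routes messages by topic; senders never call receivers directly."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def send(self, topic: str, message: Message) -> int:
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)  # how many components received the message


bus = MessageBus()
log: List[str] = []


def mailer(message: Message) -> None:
    # The receiver decides what code, if any, to run for this message.
    if message.get("email"):
        log.append(f"welcome mail to {message['email']}")


def audit(message: Message) -> None:
    log.append(f"audit: signup {message['user']}")


bus.subscribe("user.signed_up", mailer)
bus.subscribe("user.signed_up", audit)

delivered = bus.send("user.signed_up", {"user": "ada", "email": "ada@example.com"})
ignored = bus.send("user.deleted", {"user": "ada"})  # nobody is listening
```

The sender has no idea who is listening, or whether anyone is, which is exactly the lack of mandate described above.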
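The automagic part can be demystified with a toy. This sketch assumes a hand-rolled `Reactive` wrapper (a hypothetical name, unrelated to how Meteor or React are actually implemented) whose dependents are re-run whenever the value changes.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Reactive(Generic[T]):
    """A value that notifies its dependents whenever it changes."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._dependents: List[Callable[[T], None]] = []

    def get(self) -> T:
        return self._value

    def set(self, value: T) -> None:
        if value == self._value:
            return  # no change, nothing to propagate
        self._value = value
        for notify in self._dependents:
            notify(value)

    def watch(self, dependent: Callable[[T], None]) -> None:
        self._dependents.append(dependent)
        dependent(self._value)  # run once so the dependent starts in sync


price = Reactive(10.0)
total_with_tax = Reactive(0.0)
label: List[str] = []

# Derived value: recomputed every time price changes.
price.watch(lambda p: total_with_tax.set(round(p * 1.2, 2)))
# Interface element: redrawn every time the derived value changes.
total_with_tax.watch(lambda t: label.append(f"Total: {t:.2f}"))

price.set(20.0)  # one assignment; the total and the label follow on their own
```

Only `price` is ever set by hand; the total and the label update as a side-effect, which is both the appeal and the hidden interdependency.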
- Serverless, Distributed Micro-Services
SDMS (did I just coin that?) are small bits and pieces of code that live somewhere in the cloud and talk to each other only when they have to. When they aren't talking, they aren't running or costing you money. An event anywhere in the real or digital world sends a message into one of your services and that service either changes state or executes code. Other services are listening or called, and a chain of events may or may not occur which ultimately controls your application. It's crazy weird, but extremely awesome.
- Linux, Apache, MySQL, PHP
The traditional application stack, aka old faithful. Time tested, mature and every bit as messy as when it started. It's stable and flexible beyond belief and at this point just about runs itself. Of course, most people associate it almost exclusively with WordPress now, so who wants that?
- Linux, Nginx, *
Linux and Nginx can serve up just about anything better than just about anything else, so this will often be running your stack even if your stack includes Apache or you think your stack is being run by your app stack (like Node, Ruby or Go). Linux provides the OS, Nginx answers the door for your services, and then your application is served up by either Nginx, Apache, PHP-FPM, Puma or some other process manager for your application (if you are going to deploy at scale you will not be letting your application stack manage your server processes).
- Microsoft Stack
I know it's hard to believe, but some people are actually still doing this. For everything you can think of, they make a thing you probably have to pay for: server OS, database, web server, programming language. They are all but completely proprietary and interdependent, of course.
Ansible is a sensible automation system which orchestrates the configuration and deployment of many machines from one controlling node.
- AWS CloudFormation
CloudFormation defines a way to represent infrastructure as code, allowing for extremely granular control of resources and configuration across your server and infrastructure layers. It is just for AWS, and you can use it with Chef, Ansible, Docker, etc. It does what they do, but also more. And you can use it to use them. It's all so magical.
Chef is similar to Ansible, but allows for extremely complex configurations at very large to massive scale. If you were to outgrow Ansible, which would be a challenge, then you might move to Chef.
Docker provides a unique way to manage and build containers. Containers are disk images with an operating system and server stack all configured for your application, which then usually load the latest version of your code when launched. It also has a way to compose and manage 'swarms' of containers that operate together to contain the entire declaration of your application's scalable infrastructure in a way that can be easily shared and replicated.
While the previous contenders in this stack are primarily proactive, Runbook is primarily reactive in nature and responds to events generated by your server or infrastructure layers using pre-programmed actions.
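Reactive automation of this kind boils down to a table of events and the actions they trigger. The sketch below illustrates the idea only; it is not Runbook's actual API, and the event kinds, `REACTIONS` table and action functions are all assumptions made up for this example.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
actions_taken: List[str] = []


def restart_service(event: Event) -> None:
    actions_taken.append(f"restart {event['service']}")


def add_capacity(event: Event) -> None:
    actions_taken.append(f"add 1 instance to {event['service']}")


# Pre-programmed reactions, keyed by the kind of event that triggers them.
REACTIONS: Dict[str, Callable[[Event], None]] = {
    "health_check_failed": restart_service,
    "cpu_high": add_capacity,
}


def handle(event: Event) -> bool:
    """Run the action registered for this event; report whether one existed."""
    action = REACTIONS.get(str(event["kind"]))
    if action is None:
        return False  # unknown events are left for a human
    action(event)
    return True


handled = [
    handle({"kind": "cpu_high", "service": "web"}),
    handle({"kind": "health_check_failed", "service": "db"}),
    handle({"kind": "disk_full", "service": "db"}),
]
```

In a real deployment the events would come from monitoring on the server or infrastructure layer, and the actions would call out to whatever manages those layers.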
Slightly Obscure Ontology
- Actionable Code
Any code in any language that will eventually be compiled or interpreted by a compiler, pre-processor or post-processor.
- API
Short for application programming interface, APIs are simply functions (aka methods) exposed by some source. This source could be a global variable in your program, an imported interface from an open-source module, or an external resource reached through HTTP GET/POST requests (often known as REST or API gateways).
- Data-at-Rest
A term used to express a certain set of values frozen in time. If your program updates a counter every nanosecond, then every one billionth of a second your data is at rest. Typically though, we are referring to data stored in a database or memory store that does not change until some aspect of your program intentionally changes it.
- Memory Store
A database that is only stored in RAM, as opposed to traditional databases which actually live on the file system.
- Project Root
The outermost directory of your application, typically containing a file named README and containing only files related to the application (i.e., not your downloads, documents, etc.).
- Stack
A collection of loosely related technologies and resources all used in cooperation to form something larger as a whole.