Going Cloud with Twelve-Factor Apps
Published: 2019-02-12
Updated: 2019-02-21
Web: https://fritzthecat-blog.blogspot.com/2019/02/going-cloud-with-twelve-factor-apps.html
Heroku is a cloud application platform that published the following twelve strategies and criteria for software-as-a-service (SaaS) applications to be run in the cloud:
This is a lot. Below you find a grid where you can rearrange the factors in case you consider their order to be wrong. Try to drag & drop the yellow rectangles to the place where you think they belong, and get familiar with the factors while playing with them. (This grid is best viewed on big landscape screens.)
Codebase
One codebase per application, tracked in version control. Use dependency management instead of sharing code by copying it between codebases.
Dependencies
Dependency declaration (pom.xml) plus dependency isolation (Maven build). Do not rely on external tools like curl or ImageMagick being installed on the system.
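As an illustration, such a dependency is declared in the pom.xml rather than copied into the codebase; the artifact below is just an example, not one mentioned in the article:

```xml
<!-- Declared, isolated dependency: Maven resolves the library at build
     time instead of its source being copied into the repository. -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.12.0</version>
</dependency>
```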
Configuration
Configuration must not live in the source code. Prefer environment variables over config files.
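A minimal Java sketch of this rule (the variable name DB_URL and the fallback URL are hypothetical, chosen only for illustration): read configuration from the environment, with a default reserved for local development.

```java
// Configuration comes from environment variables, not from files in the codebase.
public class Config {

    // Returns the environment variable's value, or the fallback if it is unset.
    public static String get(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // In the cloud the platform injects DB_URL; locally the fallback applies.
        String dbUrl = get("DB_URL", "jdbc:h2:mem:dev");
        System.out.println("Connecting to " + dbUrl);
    }
}
```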
Backing Services
Databases, message queues, caches, ... The source code must contain no hardcoded dependencies on them; each must be replaceable by another implementation or product.
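One common way to achieve this replaceability is an interface in front of the backing service. The names below are illustrative, not from the article; the point is that application code never mentions a concrete product:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The application depends only on this abstraction, never on Redis,
// Memcached or any other concrete cache product.
interface Cache {
    void put(String key, String value);
    Optional<String> get(String key);
}

// One replaceable implementation; a RedisCache or MemcachedCache would
// implement the same interface and could be swapped in via configuration.
class InMemoryCache implements Cache {
    private final Map<String, String> store = new HashMap<>();

    public void put(String key, String value) {
        store.put(key, value);
    }

    public Optional<String> get(String key) {
        return Optional.ofNullable(store.get(key));
    }
}
```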
Build, Release, Run
A build is created from the code of one specific commit. A release adds configuration to a build. The run stage allows no code changes, but can provide means to step back to the previous release to fix bugs.
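A hypothetical sketch of the three stages as immutable data: a build is tied to one commit, a release combines a build with configuration, and rolling back means running the previous release again, never patching code in place.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// A build captures exactly one commit of the code.
record Build(String commitId) { }

// A release is a build plus configuration; it is never edited afterwards.
record Release(Build build, Map<String, String> config) { }

class Releases {
    private final Deque<Release> history = new ArrayDeque<>();

    Release deploy(Build build, Map<String, String> config) {
        Release release = new Release(build, config);
        history.push(release);
        return release;
    }

    // Stepping back discards the current release and re-runs the previous one.
    Release rollback() {
        history.pop();
        return history.peek();
    }
}
```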
Processes
Application processes are stateless and do not assume a certain request order. They use stateful backing services to obtain state. They do not assume an exclusive connection to some client, because the next request could be dispatched to another process. They do not rely on local filesystems or caches, nor on session data or session stickiness.
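A small sketch of what statelessness means in code, under the assumption that the shared map stands in for a real backing service such as a database: the handler keeps no fields of its own, so any process instance can serve any request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A stateless process: all state lives in the injected backing store,
// never in instance fields of the handler itself.
class VisitCounter {
    private final Map<String, Integer> backingStore;

    VisitCounter(Map<String, Integer> backingStore) {
        this.backingStore = backingStore;
    }

    // Because the count is read and written through the shared store on every
    // call, a second process given the same store continues seamlessly.
    int handleRequest(String user) {
        return backingStore.merge(user, 1, Integer::sum);
    }
}
```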
Port Binding
A 12-factor web-app is runnable standalone, without a web or application server. It just requires a port, and it exposes its service through a protocol like HTTP, resulting in a URL.
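The JDK's built-in HttpServer is enough to sketch this: the app binds a port itself and speaks HTTP, with no application server around it. The PORT variable name follows Heroku's convention; 8080 is just an assumed local fallback.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// A self-contained web app: it binds its own port and exports HTTP directly.
public class StandaloneApp {

    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            byte[] body = "hello".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
        start(port); // the service is now reachable at http://localhost:<port>/
    }
}
```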
Concurrency
Applications should not try to manage their own lifecycle (for example by daemonizing themselves); that prevents scaling and handling concurrency from the outside. Instead use the "process formation": an array of process types (web, worker, ...) with a number of processes of each type.
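The process formation can be expressed as plain data, as in this sketch (the type names "web" and "worker" come from the factor's own examples): scaling means changing a number and letting the platform start or stop processes; the app never forks or daemonizes itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The "process formation": process types mapped to instance counts.
// The platform, not the application, turns these numbers into processes.
class Formation {
    private final Map<String, Integer> processes = new LinkedHashMap<>();

    Formation scale(String type, int count) {
        processes.put(type, count);
        return this;
    }

    int totalProcesses() {
        return processes.values().stream().mapToInt(Integer::intValue).sum();
    }
}
```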
Disposability
The processes of an application can be started and stopped at any time, which facilitates scaling and fast deployment. A process should start up fast, terminate gracefully on SIGTERM by releasing all its resources, and cause no problems on sudden death.
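In Java, graceful termination on SIGTERM can be sketched with a JVM shutdown hook; the class and method names here are illustrative:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Graceful disposability: on SIGTERM the JVM runs registered shutdown hooks,
// giving the process a chance to release its resources before exiting.
class GracefulWorker {
    final AtomicBoolean running = new AtomicBoolean(true);

    Thread shutdownHook() {
        return new Thread(() -> {
            running.set(false);   // stop accepting new work
            releaseResources();   // close connections, flush queues, etc.
        });
    }

    void releaseResources() {
        System.out.println("resources released");
    }

    void install() {
        Runtime.getRuntime().addShutdownHook(shutdownHook());
    }
}
```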
Dev / Prod Parity
The time between development and deployment is short (hours, not weeks). The developer also deploys. The development environment should be the same as the deployment environment.
Logs
No management of log files. Write behavior as an event stream to unbuffered STDOUT. Use a tool that can flexibly introspect the logs and actively alert admins.
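A minimal sketch of that rule: the app writes one event per line to STDOUT and leaves routing and storage to the platform. The key=value shape is just one assumed format, not prescribed by the factor.

```java
import java.io.PrintStream;

// Logs as an event stream on STDOUT: no files, no rotation, no log management
// inside the application.
class StdoutLogger {
    private final PrintStream out;

    StdoutLogger(PrintStream out) {
        this.out = out;
    }

    String format(String level, String message) {
        return "ts=" + System.currentTimeMillis()
                + " level=" + level
                + " msg=\"" + message + "\"";
    }

    void log(String level, String message) {
        out.println(format(level, message)); // one event per line, flushed line by line
    }
}
```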
Admin Processes
Administration processes and tools should be shipped with the application's codebase. Prefer programming languages that provide a REPL (read-eval-print loop) out of the box, like Python, Ruby or Scala.
Need for Speed
One could argue that these are just Heroku's factors, but cloud application requirements really do differ from those of a "traditional" application. Availability wins, provided by resilience against, and scaling for, massive request rates.
These factors remind me of UNIX environments and process models. Performance tuning should not happen inside one process running multiple threads; instead multiple processes should do the work, communicating over message queues and synchronizing through semaphores. Such a concept deprecates web and application servers, where one process drives several applications deployed to it. Instead every cloud app contains its own web server.
Healing by Scaling
- Horizontal scaling means running an application as multiple instances on different machines. This raises the need for data replication, mostly done through distributed caches.
- Vertical scaling means putting more processors and memory into a machine. But this helps only when the software can actually use multiple CPUs, which is not the case for old software that has never been modernized. Twelve-factor apps circumvent this by using processes instead of threads; nevertheless the software then has to support parallel processing.
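The multi-core point above can be illustrated in a few lines of Java: a parallel stream splits the work across the common fork-join pool, which is sized to the number of available processors, so extra CPUs actually get used.

```java
import java.util.stream.LongStream;

// Only code that is written for parallelism benefits from vertical scaling:
// this sum is spread across all cores the JVM can see.
public class ParallelSum {

    static long sum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("sum:   " + sum(1_000_000));
    }
}
```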
Generally it looks like computers (RAM/CPU), operating systems and applications are dissolving into a mystic fog of containers, ports, protocols, environments, nodes, pods (not bots!) and resources that condenses into what is called the cloud. Docker is virtualization at the operating-system level, Kubernetes is a deployment and runtime environment for containers, OpenShift is a platform built on Kubernetes - how many Matryoshka dolls will we need for our web shop?
Dev, DevOps, Ops
A new role called devops (developer + operator) has been created. It is expected to do all the deployment work after a software release. Devops is a developer because Kubernetes requires programming, as do build servers and containers of any kind. Devops is an operator because of machines, operating systems, application instances, and the resources they use.
Nevertheless dev (the developer) maintains the application that needs to go cloud. Developing twelve-factor apps looks like writing just small parts of logic, then encapsulating them carefully against things like concurrency and disposability. Applications need to be stateless, loading their state on startup from persistence; that is the only place where state is allowed to live. This is quite a programming paradigm shift! How micro can you go? Will we still need devs for that, or can we pass it to business experts now?
In case you intend to hire devops, it is advisable to define the responsibilities of dev and devops sharply in the context of the build server (Jenkins). Both groups need to know it and use it. Dev needs to release the application; devops needs to deploy the release to customers. Separate these two things, don't let them happen in just one "pipeline".
The twelve factors mandate that devs do deployment. They must be able to find the reason for a NoClassDefFoundError, typically thrown when some JAR file has not been deployed correctly. So why do we need devops? Most likely because devs have been overloaded with all kinds of responsibilities for decades, and it is time for that to get better. Cloud deployment environments are as complex as development environments.
May I Criticize her Majesty?
I don't agree with Dev / Prod Parity. Yes, devs need to use at least the same environment that users have, but don't restrict the project to that. Try to keep the product flexible by supporting additional databases, file stores, caches, whatever. The Backing Services abstraction is what we actually should stick to.
The twelve factors shouldn't include that source code must be versioned and must not be reused by copying it around. We know that; it has been state of the art for decades, so why blow up the factors instead of focusing on cloud requirements?
Ever heard of the magical 7 ± 2 rule? Much easier to remember, and it avoids overloaded, burnt-out, nervous-breakdown teams. Below I summarize the factors that focus on what the cloud demands, leaving out the dev responsibilities.
Configuration
Configuration must not live in the source code. Prefer environment variables over config files.
Backing Services
Databases, message queues, caches, ... The source code must contain no hardcoded dependencies on them; each must be replaceable by another implementation or product.
Build, Release, Run
A build is created from the code of one specific commit. A release adds configuration to a build. The run stage allows no code changes, but can provide means to step back to the previous release to fix bugs.
Processes
Application processes are stateless and do not assume a certain request order. They use stateful backing services to obtain state. They do not assume an exclusive connection to some client, because the next request could be dispatched to another process. They do not rely on local filesystems or caches, nor on session data or session stickiness.
Port Binding
A 12-factor web-app is runnable standalone, without a web or application server. It just requires a port, and it exposes its service through a protocol like HTTP, resulting in a URL.
Concurrency
Applications should not try to manage their own lifecycle (for example by daemonizing themselves); that prevents scaling and handling concurrency from the outside. Instead use the "process formation": an array of process types (web, worker, ...) with a number of processes of each type.
Disposability
The processes of an application can be started and stopped at any time, which facilitates scaling and fast deployment. A process should start up fast, terminate gracefully on SIGTERM by releasing all its resources, and cause no problems on sudden death.
Logs
No management of log files. Write behavior as an event stream to unbuffered STDOUT. Use a tool that can flexibly introspect the logs and actively alert admins.
ɔ⃝ Fritz Ritzberger, 2019-02-12