Two usual ways: the first is Kubernetes (K8s), which provides container orchestration.
Kubernetes Basics:
Cluster: A Kubernetes cluster is a group of machines (physical or virtual) that run containerized applications. It consists of a control plane (historically called the Master node) and one or more Worker nodes.
Pods: A Pod is the smallest deployable unit in Kubernetes. It can contain one or more containers. Containers within the same Pod share the same network namespace and can communicate with each other using localhost. Pods are the basic building blocks of applications in Kubernetes.
Auto-Scaling: Kubernetes provides features for auto-scaling Pods. You can define Horizontal Pod Autoscalers (HPAs) to automatically adjust the number of replica Pods based on CPU or memory usage.
Simple Express.js App on Kubernetes: You have a simple Express.js app that listens on port 4001. Here's how it works in Kubernetes:
You would create a Docker container image of your Express.js app.
You would define a Kubernetes Deployment, which is a higher-level resource that manages a set of identical Pods. You specify the desired number of replicas, the container image, and the ports to expose. For your case, you would specify one replica and port 4001.
You can set up an HPA to monitor the resource utilization of your Pod. If the CPU usage, for example, goes beyond a certain threshold, Kubernetes will automatically create more replicas of your app to handle the load.
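The steps above can be sketched as a pair of manifests. This is a minimal sketch, not a production setup: the name express-app, the image your-registry/express-app:1.0.0, the 100m CPU request, the 70% CPU target, and the 1 to 5 replica range are all placeholder assumptions. Only the single replica and port 4001 come from the description above.

```yaml
# Deployment: one replica of the Express.js container, exposing port 4001.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: express-app                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: express-app
  template:
    metadata:
      labels:
        app: express-app           # must match the selector above
    spec:
      containers:
        - name: express-app
          image: your-registry/express-app:1.0.0   # placeholder image
          ports:
            - containerPort: 4001
          resources:
            requests:
              cpu: 100m            # assumed value; CPU utilization is measured against this request
---
# HPA: scales the Deployment between 1 and 5 replicas on average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: express-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: express-app
  minReplicas: 1
  maxReplicas: 5                   # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed threshold
```

Saved to a file such as express-app.yaml (a hypothetical filename), both resources could be created with kubectl apply -f express-app.yaml. The HPA needs a metrics source such as metrics-server running in the cluster. It aims for roughly ceil(currentReplicas × currentUtilization / targetUtilization) replicas, so one Pod running at 140% of its CPU request against a 70% target would be scaled to two.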
The other way is AWS Elastic Beanstalk.
Elastic Beanstalk is essentially a service that automates most of the work of creating EC2 instances, a base image, a launch template, and Auto Scaling groups with target groups, along with a load balancer and more.
That is a lot of AWS jargon, but it all exists for one purpose: letting a backend scale with traffic, i.e. increasing or decreasing the number of running EC2 instances as incoming traffic changes.
At a high level, you create an EC2 instance and use it to build a base image (an AMI) for the other EC2 instances you will create. That image goes into a launch template, which holds configuration for the instances such as security groups (firewall rules) and more. You then create Auto Scaling groups that use the template to adjust the number of active instances. This can happen automatically, scaling up and down, or manually through the AWS API or the console UI.
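To make the chain of resources concrete, here is a rough sketch of the same pieces written as a CloudFormation template. CloudFormation is used only as a compact notation; the same resources can be created by hand in the console. Every ID below (AMI, security group, subnets) is a made-up placeholder, and the instance type, the size limits, and the 60% CPU target are assumptions.

```yaml
Resources:
  # Launch template: base image plus per-instance configuration.
  ApiLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        ImageId: ami-0123456789abcdef0      # placeholder: the base image built from the first instance
        InstanceType: t3.micro              # assumed instance type
        SecurityGroupIds:
          - sg-0123456789abcdef0            # placeholder security group

  # Auto Scaling group: keeps between MinSize and MaxSize instances running.
  ApiAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "1"
      MaxSize: "4"
      LaunchTemplate:
        LaunchTemplateId: !Ref ApiLaunchTemplate
        Version: !GetAtt ApiLaunchTemplate.LatestVersionNumber
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0          # placeholder subnets
        - subnet-0fedcba9876543210

  # Scaling policy: add or remove instances to hold average CPU near the target.
  ApiCpuScalingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref ApiAutoScalingGroup
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 60                     # assumed target, in percent
```

The load balancer and target group are left out to keep the sketch short. In a full setup the Auto Scaling group would also register its instances with a target group behind the load balancer.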
Elastic Beanstalk reduces these steps to a few simple ones. The workflow below, from the GitHub repository linked here, sets up a CI/CD pipeline that automates deployments to Elastic Beanstalk (EB). The service does the work of creating the load balancer and EC2 instances internally, so you only need to adjust the number of instances.
https://github.com/aneeshseth/EBS-cicd
name: deploy api
on:
  push:
    branches:
      - main
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Create zipfile
        run: "cd server && npm install && npm run build && zip -r deployx.zip *"
      - name: Deploy to EB
        uses: einaregilsson/beanstalk-deploy@v21
        with:
          aws_access_key: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws_secret_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          application_name: nodejsserver
          environment_name: Nodejsserver-env
          version_label: ${{ github.sha }}
          region: us-west-2
          deployment_package: ./server/deployx.zip
The code above is a GitHub Actions workflow that automates deployment to Elastic Beanstalk. It uses a GitHub action (einaregilsson/beanstalk-deploy@v21) that takes the zipped contents of what we're deploying, which in our case is a simple Express server. The zip is needed because Elastic Beanstalk takes the application as a source bundle (a zip file here). On the Node.js platform it then automatically looks for the appropriate way to start the app, for example the start script in package.json or an app.js or server.js file.