<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Shoin | Let's Share!]]></title><description><![CDATA[鯛も一人はうまからず | Even a sea bream loses its flavor when eaten alone.]]></description><link>https://shoin.cloudantler.com/</link><image><url>https://shoin.cloudantler.com/favicon.png</url><title>Shoin | Let&apos;s Share!</title><link>https://shoin.cloudantler.com/</link></image><generator>Ghost 3.42</generator><lastBuildDate>Mon, 09 Mar 2026 09:55:29 GMT</lastBuildDate><atom:link href="https://shoin.cloudantler.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Grafana OAuth SSO Integration with KeyCloak]]></title><description><![CDATA[<p>Single Sign-On integration is important. In my previous post, I demonstrated how to use Keycloak as the authentication layer in a FastAPI application. Now we will use Keycloak as identity provider for Grafana.</p><p>I'm going to run all those in my local env. So change the hostnames or FQDN for</p>]]></description><link>https://shoin.cloudantler.com/grafana-oauth-sso-integration-with-keycloak/</link><guid isPermaLink="false">68827d6ca5081e00017c5f12</guid><category><![CDATA[DevOps]]></category><category><![CDATA[Monitoring]]></category><category><![CDATA[Security]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 28 Jul 2025 12:58:36 GMT</pubDate><content:encoded><![CDATA[<p>Single Sign-On integration is important. In my previous post, I demonstrated how to use Keycloak as the authentication layer in a FastAPI application. Now we will use Keycloak as identity provider for Grafana.</p><p>I'm going to run all those in my local env. So change the hostnames or FQDN for your domain.</p><h3 id="create-keycloak-client">Create KeyCloak Client</h3><p>Properties of Keycloak client must be like following;</p><ul><li>Client Type: OpenID Connect<br>Client ID: grafana-oauth</li><li>Authentication: On<br>Authentication Flow: Standart Flow Enabled<br>Direct access grants Enabled</li><li>Root URL: Grafana's root url<br>Home URL: Grafana's root url<br>Valid Redirect URIs: Grafana's root url + <code>/login/generic_oauth</code><br>Web Origins: Grafana's root url</li></ul><p>After creating the client, go to client details page and click Credentials tab to obtain client secret.</p><h3 id="grafana-compose-file-and-oauth-configuration">Grafana Compose File and OAuth Configuration</h3><p>What to replace the file in below;</p><ul><li>Your realm name and Client secret</li><li>Your keycloak hostname. (There are two domains. Localhost is sent to browser. Host.docker.internal is for Grafana access to Keycloak.</li></ul><pre><code>version: '3.8'

services:
  grafana:
    image: docker.io/grafana/grafana-oss:12.0.2
    container_name: grafana
    restart: unless-stopped
    environment:
    - GF_SERVER_ROOT_URL=http://localhost:3000/
    - GF_PLUGINS_PREINSTALL=grafana-clock-panel
    - GF_SECURITY_ADMIN_USER=admin
    - GF_SECURITY_ADMIN_PASSWORD=admin
    - GF_AUTH_GENERIC_OAUTH_ENABLED=true
    - GF_AUTH_GENERIC_OAUTH_NAME=Keycloak-OAuth
    - GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP=true
    - GF_AUTH_GENERIC_OAUTH_CLIENT_ID=grafana-oauth
    - GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET=[SECRET]
    - GF_AUTH_GENERIC_OAUTH_SCOPES=openid email profile offline_access roles
    - GF_AUTH_GENERIC_OAUTH_EMAIL_ATTRIBUTE_PATH=email
    - GF_AUTH_GENERIC_OAUTH_LOGIN_ATTRIBUTE_PATH=username
    - GF_AUTH_GENERIC_OAUTH_NAME_ATTRIBUTE_PATH=full_name
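    # The AUTH URL below is opened by the user's browser, so it points at localhost;
    # the TOKEN and API URLs are called from inside the Grafana container, so they use
    # host.docker.internal to reach Keycloak (adjust all of them for your own domain)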
    - GF_AUTH_GENERIC_OAUTH_AUTH_URL=http://localhost:8080/realms/local/protocol/openid-connect/auth
    - GF_AUTH_GENERIC_OAUTH_TOKEN_URL=http://host.docker.internal:8080/realms/local/protocol/openid-connect/token
    - GF_AUTH_GENERIC_OAUTH_API_URL=http://host.docker.internal:8080/realms/local/protocol/openid-connect/userinfo
    - GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH=contains(roles[*], 'admin') &amp;&amp; 'Admin' || contains(roles[*], 'editor') &amp;&amp; 'Editor' || 'Viewer'
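    # Assumes a top-level "roles" claim in the token/userinfo response; by default Keycloak
    # exposes realm roles under realm_access.roles, so either add a client-scope mapper or
    # adjust the JMESPath expression above to match your token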

    ports:
     - '3000:3000'
    volumes:
     - 'grafana_storage:/var/lib/grafana'
volumes:
  grafana_storage: </code></pre><p>You should now be able to log in to Grafana with your Keycloak user. One important note: on localhost you may not be able to map Keycloak roles to Grafana roles. <code>OAUTH_ROLE_ATTRIBUTE_PATH</code> defines the mapping, so normally you would just assign the role to the user in Keycloak. The problem comes from the difference between the localhost and host.docker.internal hostnames: Grafana can't verify the JWT token, so it doesn't pick the role up. It works in production. Also make sure the path matches the claim key in your JWT token; adjust the path to fit the token, or modify the client scope details in Keycloak.</p>]]></content:encoded></item><item><title><![CDATA[FastAPI - KeyCloak OAuth2 Integration]]></title><description><![CDATA[<p>In this blog post, I'm going to integrate FastAPI with KeyCloak to authenticate the users through KeyCloak. Instead of implementing authentication layer in each service, we take the advantage of KeyCloak to handle it in a centralized point. We can achieve that by using JWT tokens with RSA signing.</p><h3 id="install-requirements-in-below">Install</h3>]]></description><link>https://shoin.cloudantler.com/fastapi-keycloak-oauth2-integration/</link><guid isPermaLink="false">685a9c12a5081e00017c5d9c</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Security]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Sat, 05 Jul 2025 18:36:12 GMT</pubDate><content:encoded><![CDATA[<p>In this blog post, I'm going to integrate FastAPI with KeyCloak to authenticate users through KeyCloak. Instead of implementing an authentication layer in each service, we take advantage of KeyCloak to handle it in a centralized place. We can achieve that by using JWT tokens with RSA signing.</p><h3 id="install-requirements-in-below">Install the requirements below</h3><pre><code>Requirements
- uvicorn, fastapi, pydantic, pydantic-settings, python-keycloak, pyjwt, python-multipart</code></pre><h3 id="keycloak-realm-and-client-configuration">KeyCloak Realm and Client Configuration</h3><ul><li>First create a realm in KeyCloak named <code>local</code> and select it.</li><li>Under <code>Clients</code>, create a new client with the specs below<br>Client type: OpenID Connect<br>Client ID: local-api</li><li>Click next and enable the capabilities below<br>Client Authentication: On<br>Direct Access Grants: On<br>Standard Flow: On</li><li>Click next and configure the login settings<br>Root URL: http://localhost:8000<br>Home URL: http://localhost:8000<br>Valid Redirect URIs: http://localhost:8000/*<br>Web Origins: http://localhost:8000</li><li>Click save to create the client</li></ul><h3 id="obtain-client-credentials">Obtain Client Credentials</h3><ul><li>Go to the client details page of the <code>local-api</code> client and select the Credentials tab.</li><li>Copy the client secret; it will be used by the FastAPI app</li></ul><h3 id="create-a-test-user-in-keycloak">Create a Test User in Keycloak</h3><ul><li>Click the Users tab, then click the Create new user (or Add user) button</li><li>Fill in the form and click the Create button.</li><li>On the user details page, open the Credentials tab and set a new password. Don't forget to disable the Temporary toggle so the password stays permanent.</li></ul><h3 id="create-settings-py">Create settings.py</h3><p>I'm going to use pydantic-settings to load environment variables based on the runtime environment. If <code>RUNTIME_ENV</code> is set to local or left unset, the application will look for <code>.local.env</code> in the app directory and load environment variables from it. You must set the KEYCLOAK variables in the <code>.local.env</code> file. An example env file would look like this.</p><pre><code class="language-config">RUNTIME_ENV=local
KEYCLOAK_SERVER_URL=http://localhost:8080/auth
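# Note: the /auth path suffix above applies to older Keycloak distributions;
# newer (Quarkus-based) servers serve realms at the root, without /auth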
KEYCLOAK_CLIENT_ID=local-api
KEYCLOAK_REALM_NAME=local
KEYCLOAK_CLIENT_SECRET_KEY=&lt;secret that you get from keycloak&gt;</code></pre><pre><code>from pydantic_settings import BaseSettings, SettingsConfigDict
import os 


class CommonSettings():
    VERSION: str = "0.1.0"
    
    KEYCLOAK_SERVER_URL: str
    KEYCLOAK_CLIENT_ID: str
    KEYCLOAK_REALM_NAME: str 
    KEYCLOAK_CLIENT_SECRET_KEY: str
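    # Later in this post the cached Keycloak public key is stored on settings as well,
    # e.g. JWT_KEY: Any = None  (with `from typing import Any`)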


class LocalSettings(CommonSettings, BaseSettings):
    model_config = SettingsConfigDict(env_file='.local.env', env_file_encoding='utf-8')
    RUNTIME_ENV: str = "local"


class ProdSettings(CommonSettings, BaseSettings):
    model_config = SettingsConfigDict(env_file='.env', env_file_encoding='utf-8')
    RUNTIME_ENV: str = "prod"


runtime_env = os.environ.get("RUNTIME_ENV", "local")
settings = LocalSettings() if runtime_env == "local" else ProdSettings()
</code></pre><h3 id="create-security-py">Create security.py</h3><p>In this module, we implement functions to use later and initiliaze and configure keycloak class for connecting to keycloak server.</p><pre><code class="language-Python">from keycloak import KeycloakOpenID
from .settings import settings as s


idp = KeycloakOpenID(
    s.KEYCLOAK_SERVER_URL,
    s.KEYCLOAK_REALM_NAME,
    s.KEYCLOAK_CLIENT_ID,
    s.KEYCLOAK_CLIENT_SECRET_KEY
)</code></pre><h3 id="create-auth-py-and-login-endpoint">Create auth.py and Login endpoint</h3><p>In this module we implement login endpoint and issue a token from Keycloak by using credentials provided by user. That's why you need to enable <code>Direct Access Grants</code> in client capabilities to exchange token with username and password. We return access and refresh tokens to user.</p><pre><code class="language-Python">from typing import Annotated
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import OAuth2PasswordRequestForm
from pydantic import BaseModel


from .settings import settings as s
from .security import idp
from keycloak.exceptions import KeycloakAuthenticationError, KeycloakError


class OpenIdToken(BaseModel):
    token_type: str
    access_token: str 
    refresh_token: str
    expires_in: int 
    refresh_expires_in: int 
    

app = FastAPI(title="Code Reference API",
              description="API for code reference",
              version=s.VERSION)


@app.post("/login")
async def login(form: Annotated[OAuth2PasswordRequestForm, Depends()]):
    try: 
        resp = idp.token(form.username, form.password, scope="openid profile email")
        return OpenIdToken(**resp)

    except KeycloakAuthenticationError as e:
        raise HTTPException(401, "Username or password is incorrect!") from e

    except KeycloakError as e:
        raise HTTPException(400, "Request malformed!") from e</code></pre><p>By here, we have managed to login through Keycloak. Now you must be able to get access token from /login endpoint. Give it a try!</p><h3 id="protect-an-endpoint-and-validate-access-tokens">Protect an Endpoint and Validate Access Tokens</h3><p>Now it's time to make JWT tokens mandatory for protected endpoints. Let's decode and validate JWT tokens. </p><p>Add below lines to <code>security.py</code> </p><ul><li><code>oauth_token</code> gives information where to obtain token from for swagger UI. </li><li><code>decode_token</code> calls keycloak decode_token function to decode and verify token by fetching public key from KeyCloak. (I'm going to show how to verify it without fetching key every time)<br>Function also handles the exceptions and returns proper error messages.</li></ul><pre><code class="language-Python">from fastapi.security import OAuth2PasswordBearer


# exceptions raised while decoding/validating the token (python-keycloak uses jwcrypto)
from fastapi import HTTPException
from jwcrypto.common import JWException
from jwcrypto.jws import InvalidJWSSignature
from jwcrypto.jwt import JWTExpired


oauth_token = OAuth2PasswordBearer("/login")


async def decode_token(token: str) -&gt; dict:
    try: 
        return await idp.a_decode_token(token, validate=True)
        
    except JWTExpired as e:
        raise HTTPException(401, "Token has expired") from e 
    
    except InvalidJWSSignature as e: 
        raise HTTPException(400, "Token signature couldn't be verified") from e 
    
    except JWException as e: 
        raise HTTPException(400, "Token is malformed") from e 
    
    except Exception as e:
        print(e)
        raise HTTPException(500, "Error while decoding token") from e </code></pre><p>Then we need to implement a dependency function that marks the Authorization header as required and injects the decoded token into the endpoint. </p><pre><code>from typing import Annotated
from fastapi import Depends
from .security import oauth_token, decode_token


async def get_user(token: Annotated[str, Depends(oauth_token)]):
    return await decode_token(token)</code></pre><pre><code>@app.post("/me")
async def me(user: Annotated[dict, Depends(get_user)]):
    return user</code></pre><p>When you visit the Swagger UI, you will see a lock icon next to the /me endpoint and a login button in the top-right corner. Log in and try to send a request to the /me endpoint. You should see the decoded token issued by Keycloak.</p><h3 id="how-to-verify-without-fetching-key-every-time">How to Verify without fetching key every time?</h3><p>On the first startup, we can load the key and store it in memory. While decoding the token, we use the key from memory instead of fetching it from Keycloak. That reduces load on Keycloak and latency, but it comes with consequences, such as having to handle key rotation. Keep that in mind.</p><p>We're going to use the lifespan feature in FastAPI. We load the key on app startup and save it to the <code>settings</code> object to make it available everywhere. <br>Don't forget to add <code>JWT_KEY</code> to your settings class.</p><pre><code>from contextlib import asynccontextmanager
from jwcrypto import jwk
from .settings import settings as s 
from .security import idp


@asynccontextmanager
async def lifespan(app: FastAPI):
    key = (
        "-----BEGIN PUBLIC KEY-----\n"
        + await idp.a_public_key()
        + "\n-----END PUBLIC KEY-----"
    )
    s.JWT_KEY = jwk.JWK.from_pem(key.encode("utf-8"))
    yield 


app = FastAPI(version=s.VERSION,
              lifespan=lifespan)
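
# Note: this caches the realm public key for the life of the process. If Keycloak rotates
# its keys, decoding will fail until the app restarts (or the key is refreshed); that is
# the key-rotation trade-off mentioned above.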
</code></pre><p>Now we should pass the cached key to the Keycloak client so it is used instead of being fetched. Find the <code>decode_token</code> function and change its body.</p><pre><code>return await idp.a_decode_token(token, validate=True, key=s.JWT_KEY)</code></pre>]]></content:encoded></item><item><title><![CDATA[Zot OCI Registry Review and Configuration]]></title><description><![CDATA[<p>While searching an alternative to Docker's private registry, found Zot OCI registry. And wanted to see the capabilities and limitations. It's an image registry which allows you to store and distribute container images. The difference is from other registry servers, it follows the OCI distribution specs published by Open container</p>]]></description><link>https://shoin.cloudantler.com/zot-registry-set-up-and-configuration/</link><guid isPermaLink="false">67321782a5081e00017c5b92</guid><category><![CDATA[Container]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 20 Jan 2025 10:23:32 GMT</pubDate><content:encoded><![CDATA[<p>While searching for an alternative to Docker's private registry, I found the Zot OCI registry and wanted to see its capabilities and limitations. It's an image registry that allows you to store and distribute container images. The difference from other registry servers is that it follows the OCI distribution spec published by the Open Container Initiative (OCI). The OCI is the governing body that sets the rules and defines the common structure that software developers and companies build their standards around.</p><h3 id="expectations">Expectations </h3><p>We usually expect the following from an image registry.</p><ul><li>Push and pull container images</li><li>Delete container images</li><li>Persistent storage for images, preferably object storage like S3</li><li>Authentication layer for security</li></ul><h3 id="extras">Extras</h3><ul><li>Mirroring of Docker Hub or another container registry</li><li>Security vulnerability scans on images</li><li>Container image signing and licence checking</li><li>Automatic image retention based on defined rules</li></ul><p>Zot supports all of the above features. </p><h3 id="configuration">Configuration</h3><p>This is the version I used: <a href="https://shoin.cloudantler.com/p/df4cb9bf-b4b9-4b0f-9658-f52b9f9cfc70/ghcr.io/project-zot/zot:v2.1.2-rc3">ghcr.io/project-zot/zot:v2.1.2-rc3</a><br>The entry point command is <code>zot</code> and I mount the configuration file below.</p><p><code>docker run -d -p 8000:8000 --name zot -v `pwd`/config.json:/etc/zot/config.json ghcr.io/project-zot/zot:v2.1.2-rc3 serve /etc/zot/config.json</code> </p><pre><code>{
    "distSpecVersion": "1.0.1",
    "storage": {
        "rootDirectory": "/tmp/zot",
        "commit": true,
        "dedupe": false,
        "gc": true,
        "gcDelay": "2h",
        "gcInterval": "1h",
        "storageDriver": {
            "name": "s3",
            "region": "eu-west-1",
            "bucket": "harver-zot-private-registry",
            "secure": true,
            "skipverify": false
        },
        "retention": {
            "dryRun": false,
            "delay": "24h",
            "policies": [
                {
                    "repositories": ["infra/**", "base/**"],
                    "keepTags": [{
                        "patterns": [".*"]
                    }]
                },
                {
                    "keepTags": [{
                        "patterns": [".*"]
                    }]
                }
            ]
        }
    },
    "http": {
        "address": "0.0.0.0",
        "port": "8000"
    },
    "extensions": {
        "metrics": {
            "enable": true,
            "prometheus": {
                "path": "/metrics"
            }
        },
        "sync": {
            "downloadDir": "/tmp/mirror",
            "enable": true,
            "registries": [
                {
                    "urls": ["https://docker.io"],
                    "content": [
                        {
                            "prefix": "**",
                            "destination": "/docker"
                        }
                    ],
                    "onDemand": true,
                    "tlsVerify": true
                }
            ]
        },
        "search": {
            "enable": true
        },
        "scrub": {},
        "lint": {},
        "trust": {},
        "ui": {
            "enable": true
        }
    },
    "log": {
        "level": "debug"
    }
}</code></pre>]]></content:encoded></item><item><title><![CDATA[How to Build Multi-Arch Container Images]]></title><description><![CDATA[<p>Each container image has processor architecture. That architecture describes where you can run them on. It's the similar, like in OS executable binaries.<br>If you want to run your application on <code>amd64</code>, you should build your code on <code>amd64</code> processor. Nowadays, toolkits and emulators allow you to build the code</p>]]></description><link>https://shoin.cloudantler.com/build-multi-arch-container-images/</link><guid isPermaLink="false">674860afa5081e00017c5c37</guid><category><![CDATA[DevOps]]></category><category><![CDATA[Container]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Tue, 03 Dec 2024 14:11:05 GMT</pubDate><content:encoded><![CDATA[<p>Each container image has a processor architecture. That architecture describes where you can run the image, similar to OS executable binaries.<br>If you want to run your application on <code>amd64</code>, you should build your code for an <code>amd64</code> processor. Nowadays, toolkits and emulators allow you to build the code anywhere. For Docker, this is <code>buildx</code>, a plugin that you can install by following the link: <a href="https://github.com/docker/buildx">GitHub</a>. Buildx has more capabilities than just multi-arch builds.</p><p>Container images can be built for different architectures and then combined under a single image tag. This is done with an image manifest list, a file that references the image for each architecture. </p><h3 id="build-image-from-dockerfile-buildx">Build Image from Dockerfile - buildx</h3><p>Let's build a container image for the <code>linux/amd64</code> and <code>linux/arm64</code> architectures. We can use <code>buildx</code> for that. First we need to install it.</p><blockquote>Buildx and BuildKit are already enabled in Docker Desktop</blockquote><p>Buildx uses builder instances for the builds; each build is executed inside one of them. Their advantage is that they provide isolated environments. You can even set up a build farm with a set of builders and switch between them. </p><p>To build multi-platform images, follow the steps below.</p><pre><code>docker run --privileged --rm tonistiigi/binfmt --install all

docker buildx create --use --name multi-arch node-amd64
docker buildx create --append --name multi-arch node-arm64
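
# Optional: boot the new builder and check that both nodes are reachable
# (node-amd64 / node-arm64 above are assumed to be existing docker contexts)
docker buildx inspect --bootstrap multi-arch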

# List builders and switch to the new one
docker buildx ls
docker buildx use multi-arch</code></pre><p>The builder instances are ready to use, and we can build container images for two architectures on the same host. <br>The <code>--push</code> parameter pushes the image to the repository. If you want to keep the image locally, pass the <code>--load</code> parameter instead.</p><h3 id="build-multi-arch-images">Build Multi-Arch images</h3><pre><code>docker buildx build \
	--platform linux/arm64,linux/amd64 \
	--push \
	--tag [image_tag] \
	.</code></pre>]]></content:encoded></item><item><title><![CDATA[SQLAlchemy - UUID and Mix-In Classes]]></title><description><![CDATA[<p></p><h3 id="uuid">UUID</h3><p>UUID (Universal Unique Identifier) is a 128-bit value used to uniquely identify an object or entity on the internet. <br>A real world example; When you visit a news website, you see something following in the address bar of the web browser.<br> <code>https://newssite.example/category/economy/1</code> <br>As you</p>]]></description><link>https://shoin.cloudantler.com/sqlalchemy-uuid-mix-in-classes/</link><guid isPermaLink="false">672f647ea5081e00017c5aa2</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 11 Nov 2024 14:12:48 GMT</pubDate><content:encoded><![CDATA[<p></p><h3 id="uuid">UUID</h3><p>A UUID (Universally Unique Identifier) is a 128-bit value used to uniquely identify an object or entity. <br>A real-world example: when you visit a news website, you see something like the following in the address bar of the web browser.<br> <code>https://newssite.example/category/economy/1</code> <br>As you can imagine, the <code>1</code> indicates that it's the first article; if you increase the number you view the second article. It's predictable, and people can copy your content easily just by incrementing the number. So hiding your content's identifier from visitors can be crucial for web services. To hide it, you can use UUIDs instead of sequential numbers. Predicting a UUID is difficult but implementing one is easy, because a UUID looks like this:<br><code>291ad1cd-c352-4220-b8f8-5ee876da5390</code> <br>If you can guess the next article's ID, go and play the lottery :)</p><p>The best place to put the UUID field is the Base class. By defining it there we ensure every resource in the database gets a UUID primary key, and it follows the DRY (don't repeat yourself) principle. </p><p>As the default parameter, we pass the <code>uuid.uuid4</code> function itself (not its result); SQLAlchemy calls it to generate a UUID for each new row. Both variants (the plain integer <code>id</code> and the UUID column) are shown below.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import as_declarative
from sqlalchemy import Column, Integer
from sqlalchemy.dialects.postgresql import UUID
import uuid

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base():
    # id = Column('id', Integer, primary_key=True, index=True)
    uuid = Column('uuid', UUID(as_uuid=True), primary_key=True, index=True, default=uuid.uuid4)</code></pre><h3 id="auto-generated-table-names">Auto-Generated Table Names</h3><p>Instead of typing <code>__tablename__</code> in each model class, you can follow the approach below in the Base class. You can use this decorator for more than that, of course, such as creating mix-in classes (explained further below).</p><pre><code>from sqlalchemy.orm import as_declarative, declared_attr
from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import UUID
import uuid

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base():
    @declared_attr
    def __tablename__(cls):
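        # e.g. class User -> table "users", class BlogPost -> "blogposts"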
    	return f"{cls.__name__.lower()}s"

    uuid = Column('uuid', UUID(as_uuid=True), primary_key=True, index=True, default=uuid.uuid4)</code></pre><h3 id="common-date-time-mix-in-classes">Common Date Time Mix-in Classes</h3><p>For auditing purposes and keeping everything on record, we should store the creation and update dates. The mix-in class below is our abstract class; we will use it in the classes that need date-time records. </p><p>The creation and update dates can be set automatically. Note that we use <code>server_default</code> with <code>func.now()</code> for <code>created_at</code>, <br>but <code>onupdate</code> with <code>func.now()</code> for <code>updated_at</code>. The reason is explained in this<a href="https://stackoverflow.com/questions/13370317/sqlalchemy-default-datetime"> thread. </a></p><pre><code>from sqlalchemy import Column, DateTime, String, func
from sqlalchemy.orm import declared_attr

class DateTimeMixin:
    @declared_attr
    def created_at(cls):
        return Column(DateTime(timezone=True), server_default=func.now())
        
    @declared_attr
    def updated_at(cls):
        return Column(DateTime(timezone=True), onupdate=func.now())
        

class User(Base, DateTimeMixin):
    __tablename__ = "users"
    
    username = Column(String, index=True)
    password = Column(String)</code></pre>]]></content:encoded></item><item><title><![CDATA[FastAPI - SQLAlchemy and Alembic Integration]]></title><description><![CDATA[<p>Almost in every project we store information in databases. It could be an SQL or a NoSQL database. Let's say you would like to go with an SQL database. You have plenty of options to choose. In SQL databases you need to define the structure such as columns and its</p>]]></description><link>https://shoin.cloudantler.com/fastapi-sqlalchemy-and-alembic-integration/</link><guid isPermaLink="false">6504b1b6a5081e00017c57a4</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Sat, 09 Nov 2024 13:31:11 GMT</pubDate><content:encoded><![CDATA[<p>In almost every project we store information in databases. It could be an SQL or a NoSQL database. Let's say you would like to go with an SQL database; you have plenty of options to choose from. In SQL databases you need to define the structure: columns and their data types, constraints, default values, etc. You may want to create indexes, views and transactions as well. SQLAlchemy allows you to define your tables in the form of Python classes. You can get the value of a column just by reading an attribute of the class, which lets you take advantage of the OOP paradigm. The ORM is capable of creating the tables itself, but that's not the preferred way for production workloads and it isn't sustainable: when you modify your structure, the ORM can't apply the change, and you don't keep track of the changes you made before. This is where Alembic comes to your help. It creates scripts that modify your structure, and it provides rollback scripts as well.</p><p>Install the required packages</p><pre><code class="language-bash">poetry add sqlalchemy alembic</code></pre><pre><code>fooapi/ # Main package
    fooapi/
    	__init__.py
        main.py
    tests/
    poetry.lock
    pyproject.toml
    db/
    	__init__.py
        base_class.py
        base.py
        session.py</code></pre><h3 id="sqlalchemy">SQLAlchemy</h3><p>We should create a new sub-package called <code>db</code>. We store the base classes, session objects and configurations for database connection. The structure is like below.</p><p>We define our base class first. We inherit new model classes from the base class. We can define the common columns in base class. It effects every model classes that is inherited from it. So that you don't have to type commun columns in each model classes.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import as_declarative
from sqlalchemy import Column, Integer

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base:
	id = Column('id', Integer, primary_key=True, index=True)</code></pre><p>The <code>class_registry</code> dictionary is used for tracking which classes are used to create database tables; it's the mapping. <code>as_declarative</code> means that we declare the attributes and structure of the table. We added an <code>id</code> column which is common to all of our tables. </p><p>So far we have only defined our Base class. We still need to initiate a Session to connect to the database.  SQLAlchemy is able to connect to different database engines with almost the same code, or with small changes. <br>We keep the database connection string in a variable and create an engine from it; the ORM works out the target engine from the connection string. We can pass some extra parameters here. For instance, SQLite connections can't be shared across threads by default, and FastAPI (built on Starlette, an async ASGI framework) may handle a request on a different thread, so we tell the driver not to check whether the connection is used from the same thread that created it.<br>The sessionmaker function returns a Session factory, which is what actually connects and initiates a session in the database. This object is callable: if you want to connect to the db, you call it, <code>SessionLocal()</code>.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine

SQL_URI = "sqlite:///app.db"

engine = create_engine(SQL_URI, connect_args={"check_same_thread": False})
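# check_same_thread=False only applies to SQLite; it allows the connection to be used from a
# thread other than the one that created it, which FastAPI's threaded request handling needs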
SessionLocal = sessionmaker(engine, autoflush=False, autocommit=False)</code></pre><p>We create another module called <code>base.py</code>; I'll explain it in the Alembic section. This is what it looks like: </p><pre><code class="language-Python 3.11">from .base_class import Base
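
# model modules will be imported here later (shown further below) so that
# Alembic can discover their tables on Base.metadata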
</code></pre><h3 id="alembic">Alembic</h3><p>We need to initate alembic in project. Go to directory next to <code>pyproject.toml</code> and run <code>alembic init alembic</code> It creates a directory named <code>alembic</code> and <code>alembic.ini</code> file next to <code>pyproject.toml</code> This is last directory setup.</p><pre><code>fooapi/ # Main package
    fooapi/
    	__init__.py
        main.py
    tests/
    poetry.lock
    pyproject.toml
    db/
    	__init__.py
        base_class.py
        base.py
        session.py
    # NEW files below
    alembic.ini 
    alembic/
    	versions/
        env.py</code></pre><p>Find and comment out the <code>sqlalchemy.url</code> line in alembic.ini, because we take this value from <code>session.py</code> instead. Make the changes below in <code>env.py</code>.</p><pre><code>from fooapi.db.base import Base
target_metadata = Base.metadata

from fooapi.db.session import SQL_URI
def get_url() -&gt; str:
	return SQL_URI</code></pre><p><code>target_metadata</code> is required to run Alembic properly. It gives Alembic the Base class metadata with the table-class mapping that we defined in <code>base_class.py</code>.<br>We imported SQL_URI to point at the target database.</p><p>In addition to that, we must change the <code>run_migrations_offline</code> function. Since we commented out <code>sqlalchemy.url</code> in alembic.ini, update the value of the url parameter in the <code>context.configure</code> call.</p><pre><code>context.configure(
    url=get_url(),
    target_metadata=target_metadata,
    literal_binds=True,
    dialect_opts={"paramstyle": "named"},
)</code></pre><p>We have to make some updates in the <code>run_migrations_online</code> function as well. Because we load the connection string from <code>session.py</code>, we need to set it in this configuration too.</p><pre><code>configuration = config.get_section(config.config_ini_section, {})
configuration['sqlalchemy.url'] = get_url()
connectable = engine_from_config(
    configuration,
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
    )</code></pre><h3 id="alembic-revision-and-apply">Alembic Revision and Apply</h3><p>Alembic creates migration scripts and they're versioned. To create a new revision and apply it, you can give this command</p><pre><code>alembic revision --autogenerate -m "first revision"
alembic upgrade head</code></pre><p>Most probably it isn't going to create anything in the database, because we haven't defined any model classes that map to real database tables yet. At least we have tested our integration in this step.</p><h3 id="create-db-model">Create DB Model</h3><p>Create a new sub-package called <code>models</code> under the fooapi package, and create a new module named <code>user.py</code> in it. We define the user model class in this module. <br>It must inherit from the <code>Base</code> class. We have only 2 columns for this table, and we have to set the table name as well. </p><pre><code>from fooapi.db.base_class import Base
from sqlalchemy import Column, String


class User(Base):
    __tablename__ = "users"
    
    username = Column(String, index=True)
    password = Column(String)</code></pre><p>Now go to <code>base.py</code> under the db sub-package and import this model class; otherwise Alembic isn't going to detect that the model class exists.</p><pre><code>from .base_class import Base

from fooapi.models.user import User</code></pre><p>This part is really important. You must have both <code>base.py</code> and <code>base_class.py</code>.<br>Alembic needs <code>base.py</code>. Model classes need <code>base_class.py</code>. <br>Alembic needs the model classes as well, which is why we import them in <code>base.py</code>. <br>If you defined your Base class in <code>base.py</code> instead of <code>base_class.py</code>, you would cause a <code>circular dependency</code>; that's why you must define them in separate files. If you get an error about that, ensure you imported the Base class from the correct module.</p><p>Model classes:<br>- Import the Base class from <code>base_class.py</code><br>- Import the model class within <code>base.py</code> for Alembic to discover it.</p><p>Alembic:<br>- Imports <code>base.py</code> <br>- <code>base.py</code> must import the Base class from <code>base_class.py</code><br>- Make sure you imported all defined model classes within <code>base.py</code></p><p>To create the table for the <code>User</code> model, run the commands below.</p><pre><code>alembic revision --autogenerate -m "User model added"
alembic upgrade head 


alembic downgrade -1  # To revert changes</code></pre><p>Now we have created a table with the help of Alembic. We store the changes in our repository and are able to revert and provision database versions with less effort.</p>]]></content:encoded></item><item><title><![CDATA[BigQuery Scheduled Query Monitoring]]></title><description><![CDATA[<p>We have created scheduled queries in the <a href="https://shoin.cloudantler.com/bigquery-scheduled-queries/">previous post</a>. The query execution results could be forwarded to a pub/sub topic. In this post we developed a function that reviews the execution results and takes action based on state.</p><p>I want to build the solution async and reusable. Therefore there</p>]]></description><link>https://shoin.cloudantler.com/bigquery-scheduled-query-monitoring/</link><guid isPermaLink="false">63a9f65e7a30a8000171c94c</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[Solution]]></category><category><![CDATA[Lambda]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:31:36 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/app-integrations.png" medium="image"/><content:encoded><![CDATA[<img src="https://shoin.cloudantler.com/content/images/2022/12/app-integrations.png" alt="BigQuery Scheduled Query Monitoring"><p>We created scheduled queries in the <a href="https://shoin.cloudantler.com/bigquery-scheduled-queries/">previous post</a>. The query execution results can be forwarded to a pub/sub topic. In this post we develop a function that reviews the execution results and takes action based on the state.</p><p>I want to build the solution async and reusable, so there are two functions.<br>The first function is responsible for reviewing the execution results.<br>The second function is responsible for sending emails.<br>I built it that way because I want to use the email-sender function project-wide. :)</p><h3 id="workflow">Workflow</h3><figure class="kg-card kg-image-card"><img src="https://shoin.cloudantler.com/content/images/2022/01/image.png" class="kg-image" alt="BigQuery Scheduled Query Monitoring"></figure><ol><li>BigQuery starts to process the query at the scheduled time.</li><li>Whether the query succeeded or not, the execution result is forwarded to the pub/sub topic specified during creation.</li><li>The pub/sub topic invokes the reviewer function.</li><li>The reviewer function checks the body of the event data to see whether the query succeeded.<br>If the state is <code>SUCCEEDED</code>, it does nothing.<br>If not, it publishes a new message to the target topic, which is the email relay.</li></ol><p>The reviewer really just checks whether the state is <code>SUCCEEDED</code>; if the query failed, it publishes a new message to the email-relay topic.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/xsetra/shoin-posts/tree/master/gcp-bigquery-scheduled-q-monitoring"><div class="kg-bookmark-content"><div class="kg-bookmark-title">shoin-posts/gcp-bigquery-scheduled-q-monitoring at master · xsetra/shoin-posts</div><div class="kg-bookmark-description">Contains devops scripts to re-use. 
Contribute to xsetra/shoin-posts development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg" alt="BigQuery Scheduled Query Monitoring"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">xsetra</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/69ca72e3b34d84272eaf377c94f57bf960c4605f57a410aff1348b87d11269be/xsetra/shoin-posts" alt="BigQuery Scheduled Query Monitoring"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[BigQuery Scheduled Query Management]]></title><description><![CDATA[<p>Sometimes you might want to run queries on recurring basis. The queries could be written in SQL.<br>We decided to run some queries for cleaning purposes the datasets, then developed the below scripts. If you have a lot of queries and you want to store a copy of them in</p>]]></description><link>https://shoin.cloudantler.com/bigquery-scheduled-query-management/</link><guid isPermaLink="false">63a9f5d17a30a8000171c93b</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:30:04 GMT</pubDate><content:encoded><![CDATA[<p>Sometimes you might want to run queries on recurring basis. The queries could be written in SQL.<br>We decided to run some queries for cleaning purposes the datasets, then developed the below scripts. If you have a lot of queries and you want to store a copy of them in a repository, the following script would be fit for you.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/xsetra/shoin-posts/blob/master/gcp-bigquery-scheduled-queries/scheduled_queries.py"><div class="kg-bookmark-content"><div class="kg-bookmark-title">shoin-posts/scheduled_queries.py at master · xsetra/shoin-posts</div><div class="kg-bookmark-description">Contains devops scripts to re-use. Contribute to xsetra/shoin-posts development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">xsetra</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/69ca72e3b34d84272eaf377c94f57bf960c4605f57a410aff1348b87d11269be/xsetra/shoin-posts"></div></a></figure><h3 id="workflow">Workflow</h3><ul><li>The script simply reads the query list from <code>query-catalog-file</code>. There is a file  in repo named <code>query-catalog.json</code> that shows the structure.<br><code>name</code> , <code>schedule</code> and <code>pubsub_topic</code> is optional fields.</li><li>It creates or deletes the queries according to <code>operation</code> parameter.</li><li>The query must be provided within the catalog file.</li><li>While deleting the queries, the <code>generate_name</code> function takes a big role. Because the query name is composed with a parent name that includes project, location, dataset_id. The script finds the correct name, then deletes the query.</li></ul><pre><code class="language-bash">usage: scheduled_queries.py 
    project_id 
    query-catalog-file 
    operation
    [-h] 
    [--service-account-name SERVICE_ACCOUNT_NAME] 
    [--pubsub-topic-id PUBSUB_TOPIC_ID] 
    [--default-schedule DEFAULT_SCHEDULE] 
    [--location LOCATION] </code></pre><h3 id="which-one-wins-the-race-file-or-parameter">Which one wins the race? File or Parameter?</h3><p>Answer is file. For instance if you don't specify the name in file, the script will generate a name for it. If you specify, the file name is chosen. You can find the name rules in <code>generate_name</code> function.</p><h3 id="optional-parameters">Optional Parameters</h3><ul><li>Service Account Name<br>It's associated to bigquery scheduled query. You have to grant correct permissions according to query requirements. For instance, bigquery user.</li><li>PubSub Topic Id<br>When scheduled query executed, the result of query will be send to this topic.<br>It's required to monitor the queries. Take a look monitoring solution.</li><li>Default Schedule<br>If you don't specify the schedule in catalog-file, this value will be used.</li><li>Location<br>What is the location of dataset or query execution location.</li></ul><h3 id="example-usage">Example Usage</h3><pre><code>./scheduled_queries.py gcp-project query-catalog.json create</code></pre>]]></content:encoded></item><item><title><![CDATA[GitOps Explained]]></title><description><![CDATA[<p>I heard the "GitOps" word first time while learning the terraform. A bunch of new words that have "Ops" suffix have appeared nowadays such as DevOps, NetDevOps, and DevSecOps. I decided to research that word and our hero today is GitOps :)</p><p>Initially I want to define <strong>what is GitOps?</strong><br>GitOps</p>]]></description><link>https://shoin.cloudantler.com/gitops-explained/</link><guid isPermaLink="false">63a9f3ae7a30a8000171c929</guid><category><![CDATA[DevOps]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:19:53 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/publishing-options.png" medium="image"/><content:encoded><![CDATA[<img src="https://shoin.cloudantler.com/content/images/2022/12/publishing-options.png" alt="GitOps Explained"><p>I heard the "GitOps" word first time while learning the terraform. A bunch of new words that have "Ops" suffix have appeared nowadays such as DevOps, NetDevOps, and DevSecOps. I decided to research that word and our hero today is GitOps :)</p><p>Initially I want to define <strong>what is GitOps?</strong><br>GitOps is an operational framework. I want to draw your attention to the framework word. If you're a programmer, you might be confused. Please don't compare with the web frameworks. You should think it's like a manifest or a process. It defines the policies, standards, quality metrics and governance.</p><h3 id="git-and-ops-combination">Git and Ops Combination</h3><p>It is composed the Git and Ops words. Actually the words itself describes a lot of things when you look at first. Let's inspect them individually.</p><p><strong>Git</strong> is a version control system. It provides developers the collaboration, integration and central management point for their applications and teams. Beside of that features todays providers such as GitHub and GitLab provides the CI/CD tooling services.</p><p><strong>Ops</strong> stands for Operations and it consists the set of processes and services that are administrated by an IT department. Ops teams setup the infrastructure that application needs. The IT department has different teams. The teams define different policies and standards than each others due to communication problems and special tasks. 
They create the servers, installs the operating systems and the services on that servers manually.</p><p>If we led to the conclusion, we could say that the developers has an automated development lifecycle thanks to VCS and its processes.<br>But the ops teams still have a manual process and specialized teams for different tasks. Infrastructure is more complex and larger than ever. Because the need of the modern infrastructure has been changed.</p><h3 id="modern-infrastructure-and-gitops">Modern Infrastructure and GitOps</h3><p>The demands of modern infrastructure is elasticity, easy maintenance and enhanced monitoring. Modern infrastructure needs to have automated provisioning. As DevOps engineers we work on prepare the declarative files of infrastructure. There is a missing point though. The provisioning the infrastructure is still a manual process and needs to have development lifecycle standards and policies. GitOps framework defines that how you could achieve that standards.</p><h3 id="gitops-framework">GitOps Framework</h3><p>Speed and scalability are the most important properties of modern applications.</p><p>The application can be deployed thousands of time per day, if you are adopted the DevOps culture in your company. DevOps practices allow your company to be ensure the application is passed the standards, policies.<br>The standards are code review, automated testing, no human interaction in deployment process. The policies are composed code coverage tests and accepted pull requests. It's the DevOps practices.</p><ul><li>GitOps is used to automate the process of provisioning infrastructure.</li><li>The developer teams store their source code in version control system (vcs). If you want to adopt the gitops in your company, the ops team should store their configuration files, declarative definitions (infra as code) in vcs.</li><li>GitOps allow your ops team to be ensure you will have the infrastructure as defined state at the end of the pipeline. The dev team already compiles and gets the binary file at the end of the pipeline. The output of the pipeline is named as artifact.</li><li>GitOps offers the versioning. The dev team tags their commits then have the versioned artifacts. Thanks to git, your ops team also will have versioned infrastructure.</li><li>If you have versioning that means also you have rollback scenario. When you revert the commit in vcs, the pipeline triggers and deploy the previous version to environment.</li><li>GitOps allows greater collaboration. The dev or ops team create the pull requests to deploy their changes to production environment. The team leaders review the code and configurations. If team leader was missed the errors, the pipeline test could reveal them.</li><li>GitOps = Infra-as-Code + Pull Requests + CI/CD</li></ul><p>We examined what is GitOps and advantages of it.  it requires an adoption process like DevOps. Because the team members are going to attempt to edit something in production environments manually. Don't be surprised :) GitOps will work better when you have less "cowboy engineering".<br>That's it another reason to have fast pipelines. Prevent the cowboy engineering. When you have fast pipelines, the team members would choose to alter the infrastructure or application in proper way. 
The proper way is GitOps :)</p>]]></content:encoded></item><item><title><![CDATA[VPC Endpoint and Peering]]></title><description><![CDATA[<h3 id="vpc-endpoints">VPC Endpoints</h3><ul><li>VPC endpoint provides access to public AWS services for resources that don't have public IP or where NAT gateway isn't deployed.</li><li>There are gateway (s3, dynamodb) and interface (most aws services) endpoints.</li></ul><p><strong>Gateway Endpoints</strong></p><ul><li>Uses routing.</li><li>They are present in route table via prefix lists which are represent</li></ul>]]></description><link>https://shoin.cloudantler.com/vpc-endpoint-and-peering/</link><guid isPermaLink="false">63a9f36c7a30a8000171c91f</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[VPC]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:18:36 GMT</pubDate><content:encoded><![CDATA[<h3 id="vpc-endpoints">VPC Endpoints</h3><ul><li>VPC endpoint provides access to public AWS services for resources that don't have public IP or where NAT gateway isn't deployed.</li><li>There are gateway (s3, dynamodb) and interface (most aws services) endpoints.</li></ul><p><strong>Gateway Endpoints</strong></p><ul><li>Uses routing.</li><li>They are present in route table via prefix lists which are represent the CIDR of service. Prefix lists are updated by AWS.</li><li>They can have associated policies which defines who can access.</li><li>They are highly available for all AZs in a Region.</li></ul><p><strong>Interface Endpoints</strong></p><ul><li>Provisions a networking object, ENI.</li><li>The provisioned ENI is your interface endpoint to connect the service.</li><li>That ENI has got ip/dns pair, Security group. NACL works in subnet level.</li><li>You can provision that ENI in multiple subnets, one per AZ. The applications will use DNS resolve. By doing that endpoints will be highly available.</li><li>Enable the Private DNS to override public default name of service. Also, you can give a different names for AWS services, because you will have private route 53 zone.</li></ul><h3 id="vpc-peering-layer-3">VPC Peering - Layer 3</h3><ul><li>VPC peering is a way to link or connect two VPCs together without using any additional non AWS services.</li><li>You can connect the services via private IP while VPC peering span AWS accounts, regions with limitations.</li><li>Data is encrypted and transits via the Global backbone with lower latency.</li><li>It's scalable and highly performant way.</li><li>Use Case:<br>- Sharing database with other VPC and access to Database.<br>- Security auditors can be connect your VPC and performs tests.<br>- Vendor provided service, it should be a web API<br>- Splitted application for blast radius</li><li>Peering connection is a gateway like NAT and IGW.</li><li>VPC overlap is the limitation for that. There is requester and accepter.</li><li>Adjust the route tables in both VPC side. Remote CIDR</li><li>NACLs and SGs can be used to control access because you will have an ENI in your VPC. If VPCs are in same region, you can reference SG id.</li><li>IPv6 support is available for cross-region</li><li>DNS Resolution to private IPs can be enabled. It's a setting needed to adjust both sides. It prevents the traffic leaves AWS.</li><li>Transitive Routing isn't supported. Let's say A-B and B-C are peered. It doesn't mean you can reach to C from A. 
You must create another peering between A-C</li></ul>]]></content:encoded></item><item><title><![CDATA[AWS Networking - VPC Summary]]></title><description><![CDATA[<p>VPC stands for Virtual Private Cloud. VPC lets you build up your own private network in AWS Cloud. You can define an isolated networks by building up VPC. The private network is the beginning point of robust, secure, reliable and fast infrastructure.</p><blockquote>Isolated Network Blast Radius<br>Let's say something happens</blockquote>]]></description><link>https://shoin.cloudantler.com/aws-networking-vpc-summary/</link><guid isPermaLink="false">63a9f2f97a30a8000171c911</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[VPC]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:16:51 GMT</pubDate><content:encoded><![CDATA[<p>VPC stands for Virtual Private Cloud. VPC lets you build up your own private network in AWS Cloud. You can define an isolated networks by building up VPC. The private network is the beginning point of robust, secure, reliable and fast infrastructure.</p><blockquote>Isolated Network Blast Radius<br>Let's say something happens in VPC-A such as malware infection or an attack. When you isolate your network, the other VPCs won't be affected from issue in VPC-A. It will decrease you blast radius.</blockquote><h3 id="vpc-and-subnets">VPC and Subnets</h3><p>If you have a fresh AWS account, the default VPC will be created for you. The properties of default VPC like following;</p><ul><li>172.31.0.0/16 and /20 subnet in each AZ with public IP enabled.</li><li>Internet Gateway with a configured main route table and DHCP option set</li><li>NACL: All allow in/out , Security Group: All from itself (ingress) , All egress</li></ul><p>VPC has some limitations such as max subnet range could be /16 and min /28 for ipv4. but ipv6 max is /56.</p><p>The VPC is software defined network, but these definitions would effect the hardware such as router, gateway, firewalls. AWS offers two way of tenancy to consume network devices.<br>Default: Shared network. The underlying devices are shared with other tenancies. It could be changed after creation.<br>Dedicated: Locks this VPC to dedicated hardware and can't be changed later.</p><p>In addition to above limitations; you must to be ensure that ip overlapping doesn't exist with other accounts and partners, while designing corporate network.</p><h3 id="vpc-routes">VPC Routes</h3><p>To deliver packets from a subnet to other or internet, we need routing.</p><ul><li>Internet gateway has public IP and performs Static NAT. It translates the private ip addresses with public IP.</li><li>Route table manages the VPC router. It can route propagation over BGP when you have DirectConnect or VPN connection.</li><li>Route table is associated with subnets. Specific IP in route table has high priority.</li></ul><h3 id="vpc-security">VPC Security</h3><p>AWS offers different services which they can work in different network levels.</p><p><strong>NACL - Layer 4</strong></p><ul><li>Controls data traffic across subnets. It is consisted list of rules.</li><li>Impacts only traffic crossing boundary of subnet.</li><li>It contains explicitly deny or allow rules. Protocol, IP Range, Port for source and destination.</li><li>Rules are processed in number order. Lowest first. When a match found, process stops.</li><li>* rule is default. 
Last processed and implicit deny.</li><li>NACLs are stateless, you must add your rule ingress and egress appropriately for ephemeral ports.</li></ul><p><strong>Security Groups - Layer 5</strong></p><ul><li>Think like firewall rules. Thanks to Layer 5 capabilities, it stores the session. That means you have stateful firewall.</li><li>Security groups can be associated with AWS resources such as EC2 instance, EFS mount point. You can associate with each resource that has ENI</li><li>SG can not deny traffic explicitly. Insert allowed sources and protocols. If you want to explicitly deny, use NACL.</li></ul><h3 id="nat-gateway">NAT Gateway</h3><p>If instances or resources inside a VPC don't need incoming internet access, don't give them IPv4 public ip. IPv4 pool is out of space. Use NAT Gateway.</p><p>NAT Gateway is used to access public world from private network with a static IP address. It manages the egress traffic. It's suit for your private subnet's resources.<br>NAT Gateway run in subnet. Subnet is present in AZ. You may provision NAT Gateways for each subnet. Create route tables for each subnet that has been configured to consume NAT Gateways. By doing that your system will be resilient and highly available.</p><h3 id="cross-vpc-access">Cross VPC Access</h3><p>VPC offers the ability of cross access or communication between your VPCs or 3rd part VPCs. I mentioned about the requirement the isolation of VPCs.</p><p>If you want to consume 3rd party service which is developed in AWS without public internet access, the only thing that you need is AWS PrivateLink. There will be another post.</p>]]></content:encoded></item><item><title><![CDATA[Summary of Network Fundamentals]]></title><description><![CDATA[<h3 id="what-is-a-network">What is a Network?</h3><p>A network consists of two or more computers that are linked in order to share resources (such as printers and CDs), exchange files, or allow electronic communications. The computers on a network may be linked through cables, telephone lines, radio waves, satellites, or infrared light beams.</p>]]></description><link>https://shoin.cloudantler.com/summary-of-network-fundamentals/</link><guid isPermaLink="false">63a9f1a17a30a8000171c901</guid><category><![CDATA[Cloud]]></category><category><![CDATA[Network]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:15:11 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/world-blue-1.jpeg" medium="image"/><content:encoded><![CDATA[<h3 id="what-is-a-network">What is a Network?</h3><img src="https://shoin.cloudantler.com/content/images/2022/12/world-blue-1.jpeg" alt="Summary of Network Fundamentals"><p>A network consists of two or more computers that are linked in order to share resources (such as printers and CDs), exchange files, or allow electronic communications. The computers on a network may be linked through cables, telephone lines, radio waves, satellites, or infrared light beams. <a href="https://fcit.usf.edu/network/chap1/chap1.htm">[source]</a></p><p>After the definition of network, I guess you've already known, we can take a look to OSI networking stack and describing the network layers.</p><h2 id="osi-network-stack">OSI Network Stack</h2><p>It stands for <strong>Open Systems Interconnection.  
This model describes seven layers that computer systems use to communicate over a network.</p><ul><li>Layer 1 - Physical: optical, radio frequency and cable as the transfer medium<br>It defines how to transmit and receive wavelengths, ones/zeroes, voltages and radio frequencies.</li><li>Layer 2 - Data Link: MAC address, frames, named connections<br>The transfer medium is shared, so this layer decides who can talk and when, avoiding cross-talk. It allows for backoff and retransmission. Think of a traffic cop or traffic lights.</li><li>Layer 3 - Network: public IP address, packets, a single stream per node<br>Packets are encapsulated and de-encapsulated at each hop.</li><li>Layer 4 - Transport: TCP/UDP, segments<br>TCP is reliable; UDP is fast but unreliable. TCP uses segments to ensure data is received in the correct order, provides error checking, and uses ports to allow different streams on the same host.</li><li>Layer 5 - Session: the session concept, security groups, stateful firewalls<br>Initiating traffic and response traffic are part of the same connection.</li><li>Layer 6 - Presentation: data conversion, encryption and compression<br>Standards that L7 can use. For HTTPS, the TLS encryption happens here.</li><li>Layer 7 - Application: application data, body and HTTP headers<br>Your application or protocol data is held here.</li></ul><h3 id="ip-addressing-and-cidr">IP Addressing and CIDR</h3><p>An IP address consists of a network part and a host part. The subnet mask or prefix tells you where the split occurs.<br>This <a href="https://avinetworks.com/glossary/subnet-mask/">[page]</a> describes the subnet mask and the details of CIDR.</p><p>CIDR stands for classless inter-domain routing. It allows more efficient allocation and subnetting.<br>10.0.0.0/24: the first 24 bits are the network and the last 8 bits address the nodes in that network. The number after the slash is the prefix length, so 32 - 24 = 8 bits are usable by nodes.<br>.0 and .255 are reserved for the network and broadcast addresses, and the gateway usually takes the first usable address. That leaves 253 usable IP addresses for your nodes.</p><h3 id="subnetting">Subnetting</h3><p>Subnetting is the process of breaking a network down into smaller subnetworks.<br>You split a VPC into individual subnets, and each subnet lives inside one availability zone.<br>By implementing subnets you can spread your infrastructure across availability zones, which lets you build high availability into it. The numbers in the examples below can be double-checked with the sketch after the list.</p><ul><li><strong>10.0.0.0/16</strong><br>32 - 16 = 16 -&gt; 2^16 -&gt; Available IP addresses: 65536 - 2 for reserved IPs<br>First IP: 10.0.0.0 , Last IP: 10.0.255.255</li><li><strong>10.0.0.0/17</strong><br>32 - 17 = 15 -&gt; 2^15 -&gt; Available IP addresses: 32768 - 2 for reserved IPs<br>First IP: 10.0.0.0 , Last IP: 10.0.127.255</li><li><strong>10.0.128.0/17</strong><br>32 - 17 = 15 -&gt; 2^15 -&gt; Available IP addresses: 32768 - 2 for reserved IPs<br>First IP: 10.0.128.0 , Last IP: 10.0.255.255</li></ul>
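<p>As a quick sanity check on the CIDR math above, here is a minimal sketch using Python's standard <code>ipaddress</code> module (the 10.0.0.0/16 range is simply the example from the list):</p><pre><code>import ipaddress

net = ipaddress.ip_network("10.0.0.0/16")
print(net.num_addresses)                            # 65536
print(net.network_address, net.broadcast_address)   # 10.0.0.0 10.0.255.255

# Split the /16 into two /17 subnets, as in the examples above.
for subnet in net.subnets(new_prefix=17):
    print(subnet, subnet.network_address, subnet.broadcast_address, subnet.num_addresses)
# 10.0.0.0/17 10.0.0.0 10.0.127.255 32768
# 10.0.128.0/17 10.0.128.0 10.0.255.255 32768</code></pre>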
<h3 id="routing">Routing</h3><p>Routing is getting packets from your location to a destination in another location and network. IP routing happens at three scales: LAN, MAN and WAN.</p><p><strong>LAN</strong></p><ul><li>Local; same network/subnet.</li><li>An ARP request is used to get the MAC address, because you don't strictly need the IP addresses of other devices to communicate within the same subnet; however, applications (and we) generally prefer IP or DNS.<br>ARP makes a broadcast: who has this IP? Please tell me its MAC address.</li><li>Send frames to the target using its MAC address.</li><li>This is Layer 2; no router is needed.</li><li>Peer-to-peer communication.</li></ul><p><strong>MAN</strong></p><ul><li>Known locations; communication between two subnets.</li><li>Is the target local? The host answers that question using the subnet mask and its own IP address. If the target is not local, it follows the steps below :)</li><li>The packet must be forwarded, and there is one option: the default gateway.</li><li>Find the MAC address of the gateway. The gateway is a router, usually on the first IP of the network.</li><li>Send the packet to the router/gateway.</li><li>The router tries to find the target node.</li><li>Find the MAC address of the next hop.</li><li>Deliver the packet.</li></ul><p><strong>WAN</strong></p><ul><li>Unknown locations; the internet.</li><li>There are extra steps compared to MAN.</li><li>In the MAN case we knew the target because it was in the other subnet (remember, we only had two subnets). In a WAN there are lots of networks.</li><li>We use the BGP protocol to find the location of the target on the internet backbone.</li><li>The backbone router works out how to reach the target and calculates the best path to deliver the packets.</li></ul><h3 id="firewalls">Firewalls</h3><p>Firewalls are the barrier of a network: security devices that analyze incoming and outgoing traffic. They are rule based; traffic is matched against the rules and a decision is made to allow or deny it.</p><p>Firewalls are classified by the network layers they can operate on: Layer 3, 4, 5 and 7 firewalls.</p><h3 id="proxy-server">Proxy Server</h3><p>A proxy server is another type of gateway that sits between the public and private networks.<br>The client makes a connection towards the public internet, and this request goes to the proxy server. The proxy server makes the request to the destination and delivers the response back to the client in the private network.</p><ul><li>A proxy server needs application support; it is configured in the OS, the browser or the app, as the sketch after this list illustrates.</li><li>Caching: when clients connect to the same destination, the proxy caches common large files and images and serves them from its cache instead of re-requesting them from the remote server. This uses bandwidth efficiently.</li><li>Filtering: clients access the remote side indirectly, so the proxy can filter out content, for example for child-safety reasons.</li><li>A proxy server can also perform authentication or validation, for example checking that the client has a valid corporate ID.</li></ul>
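<p>To make the "application support" point concrete, here is a minimal sketch with Python's standard <code>urllib</code>; the proxy address is a hypothetical host, for example a proxy running on an EC2 instance in your VPC:</p><pre><code>import urllib.request

# Hypothetical proxy endpoint; replace with your own proxy host and port.
proxy = urllib.request.ProxyHandler({
    "http": "http://10.0.1.10:3128",
    "https": "http://10.0.1.10:3128",
})
opener = urllib.request.build_opener(proxy)

# Every request made through this opener goes via the proxy,
# which can cache, filter or authenticate it.
with opener.open("http://example.com/") as resp:
    print(resp.status, len(resp.read()))</code></pre>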
<p>Inside AWS we have a lot of filtering products, such as NACLs and security groups, but they filter on network-level factors; nothing above Layer 5 is inspected.<br>If you want to filter based on profile, age or department by using a corporate ID, you should install a proxy server on EC2.</p>]]></content:encoded></item><item><title><![CDATA[Datastore Delete Entities in Bulk with Dataflow]]></title><description><![CDATA[<p>I've written this post because I would like to draw your attention to some important points about deleting Datastore entities in bulk through the Dataflow service. I think the documentation page of the Dataflow template for deleting entities isn't very helpful. It also isn't up to date.</p><h3 id="job-common-properties">Job Common Properties</h3><ul><li>First</li></ul>]]></description><link>https://shoin.cloudantler.com/datastore-delete-entities-in-bulk-with-dataflow/</link><guid isPermaLink="false">63a9f14f7a30a8000171c8f3</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[DataFlow]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:09:47 GMT</pubDate><content:encoded><![CDATA[<p>I've written this post because I would like to draw your attention to some important points about deleting Datastore entities in bulk through the Dataflow service. I think the documentation page of the Dataflow template for deleting entities isn't very helpful. It also isn't up to date.</p><h3 id="job-common-properties">Job Common Properties</h3><ul><li>First of all, visit the Dataflow service page and click the "Create job from template" button. The template name is <code>Bulk Delete Entities in Datastore</code>.</li><li>Give a name to your job.</li><li>Select a regional endpoint; the job metadata is stored there. Pick the same region as your Datastore.</li></ul><h3 id="required-parameters">Required Parameters</h3><ul><li>Type your GQL query, simply <code>SELECT * FROM [kind_name]</code>.</li><li>Type your read_project_id.</li><li>Type your delete_project_id. At this point you might ask why there are two project IDs: Dataflow reads the entities from read_project_id and then deletes them from delete_project_id.</li><li>A temporary location is required to store metadata and various logs. Type a bucket path, e.g. <code>gs://temp-dataflow-delete/path/</code>.</li></ul><p>In theory you shouldn't need to define any additional parameters, since you defined the required ones, right? But there is more.</p><ul><li>For example, we haven't defined the namespace; it will delete, but from where?<br>Click <code>Show Optional Parameters</code> to define it.</li><li>Type your Datastore namespace into the first field.</li><li>The UDF GCS path isn't required at this point; leave it blank.</li><li>Same for the UDF function name; leave it blank.</li><li>Max workers is the most important point, although GCP doesn't pay it much attention. As you can imagine, it limits the worker count.<br>Dataflow provisions VM instances in your project to perform the read and delete operations, and this parameter sets the maximum number of workers in the pool. Specify it deliberately.</li><li>Number of workers defines the initial number of workers, for example 1. The pool scales the instances up to Max workers.</li><li>Select the worker region and zone. Choosing the same region as your Datastore is a good idea, so you don't pay inter-zone or inter-region data transfer prices.</li><li>If you would like to associate a service account with the workers, type its email address.</li><li>Machine type is another important point, although as usual it doesn't get enough attention.</li><li>Additional experiments aren't required; leave the field blank.</li><li>Worker IP address configuration is important. Unless you have a special case, I suggest choosing private. If you select public, you will be charged, because GCP assumes it is serving a customer application or a third party. If you enabled <code>Private Google Access</code> in your VPC, you will have a secure and high-performance connection.
And you won't be charged either. Yes, again, this didn't get much attention :) I wonder why? :)</li><li>You can specify the VPC network.</li><li>And also the subnetwork.</li></ul><p>That's it. You can run your job safely from a price perspective. I also suggest using the burstable instance types f1-micro and g1-small if you don't have performance concerns, because they are cheaper :)</p>]]></content:encoded></item><item><title><![CDATA[DNS Failover]]></title><description><![CDATA[<p>DNS failover helps application or network services remain accessible in the event of an outage. It updates the DNS records according to the services' availability.<br>It gives you high availability.</p><h3 id="how-works">How does it work?</h3><ul><li>It checks the availability of services regularly by using health checks.</li><li>The health checks look like the regular</li></ul>]]></description><link>https://shoin.cloudantler.com/dns-failover/</link><guid isPermaLink="false">63a9f1027a30a8000171c8e4</guid><category><![CDATA[Network]]></category><category><![CDATA[DNS]]></category><category><![CDATA[HowItWorks]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:08:25 GMT</pubDate><content:encoded><![CDATA[<p>DNS failover helps application or network services remain accessible in the event of an outage. It updates the DNS records according to the services' availability.<br>It gives you high availability.</p><h3 id="how-works">How does it work?</h3><ul><li>It checks the availability of services regularly by using health checks.</li><li>The health checks look like the regular requests coming from clients.</li><li>In addition, you can specify the sources of the health checks. If your customers are based in Europe, you can run the health checks from European cities.</li><li>When it detects an outage, it updates the DNS records to point to another working service, a predefined IP or host.</li><li>Simply put, you define a primary/secondary system; see the sketch after this list.</li></ul>
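<p>As one concrete implementation, AWS Route 53 supports failover routing. The following sketch (assuming boto3, a placeholder hosted zone, placeholder IPs and an existing health check) upserts a primary/secondary record pair for a hypothetical <code>app.example.com</code>:</p><pre><code>import boto3

route53 = boto3.client("route53")

HOSTED_ZONE_ID = "Z0123456789EXAMPLE"                      # placeholder
HEALTH_CHECK_ID = "11111111-2222-3333-4444-555555555555"   # placeholder, created beforehand

def failover_change(set_id, role, ip, health_check_id=None):
    """Build an UPSERT for one half of the primary/secondary pair."""
    record = {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,                      # "PRIMARY" or "SECONDARY"
        "TTL": 60,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}

route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={"Changes": [
        failover_change("primary", "PRIMARY", "203.0.113.10", HEALTH_CHECK_ID),
        failover_change("secondary", "SECONDARY", "203.0.113.20"),
    ]},
)</code></pre><p>When the health check attached to the primary record fails, Route 53 starts answering queries with the secondary value, which is exactly the primary/secondary behaviour described above.</p>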
<h3 id="advantages">Advantages</h3><ul><li>It forwards traffic to the system that is still running.</li><li>Thanks to DNS failover, your customers face no issue, or only a minimal one, while connecting to your services, so you don't lose customers.</li><li>You can meet the SLA as much as possible. And the business is happy :)</li></ul><h3 id="use-cases">Use Cases</h3><ul><li>The master servers of SQL database management systems.</li><li>Web services for a given location.</li><li>Network services like DNS, SMTP and LDAP.</li></ul>]]></content:encoded></item><item><title><![CDATA[AWS Storage Services - EBS]]></title><description><![CDATA[<h3 id="what-is-that">What is that?</h3><p>EBS - Elastic Block Store is a service that provides <strong>persistent</strong> block storage.<br>EBS volumes are presented over the network and used by EC2 instances.<br><br>As always, every service in AWS can run separately from the others.<br>So EBS volumes can be used with EC2 instances or</p>]]></description><link>https://shoin.cloudantler.com/aws-storage-services-ebs/</link><guid isPermaLink="false">63a9ef377a30a8000171c8c7</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EBS]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:04:22 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/geo-orange.jpeg" medium="image"/><content:encoded><![CDATA[<h3 id="what-is-that">What is that?</h3><img src="https://shoin.cloudantler.com/content/images/2022/12/geo-orange.jpeg" alt="AWS Storage Services - EBS"><p>EBS - Elastic Block Store is a service that provides <strong>persistent</strong> block storage.<br>EBS volumes are presented over the network and used by EC2 instances.<br><br>As always, every service in AWS can run separately from the others.<br>So EBS volumes can be used with EC2 instances, or you can keep them after an EC2 instance is terminated. That means volumes are persistent and can be attached to and detached from EC2 instances.</p><h3 id="durability-concern">Durability Concern</h3><ul><li>EBS volumes are replicated across multiple servers managed by AWS in a single AZ to prevent data loss.</li><li>In addition to that, you can increase the durability of a volume by taking snapshots. The snapshots are stored in S3 buckets within the region, which allows us to keep the data for the <strong>long term</strong>.</li><li>You can create <strong>point-in-time</strong> and <strong>incremental</strong> snapshots from volumes.<br>Incremental snapshots are the quickest method. They are <strong>storage friendly</strong> because they don't need to store unchanged data again and again.</li><li>AWS incremental snapshots are more powerful than <strong>traditional snapshots</strong>: they don't need the whole chain of incremental snapshots to recreate a volume. With a traditional chain, if one snapshot is missing you lose the data in that snapshot. A minimal snapshot-and-restore flow is sketched after this list.</li></ul>
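<p>A minimal sketch of the snapshot-and-restore flow, assuming boto3, a placeholder volume ID and an example region/AZ pair; the waiter simply blocks until the snapshot is complete:</p><pre><code>import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Take a point-in-time snapshot of an existing volume (placeholder ID).
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="pre-maintenance snapshot",
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# Restore the snapshot as a new volume in another AZ of the same region,
# e.g. for disaster recovery or to launch an identical instance.
vol = ec2.create_volume(
    SnapshotId=snap["SnapshotId"],
    AvailabilityZone="eu-west-1b",
    VolumeType="gp3",
)
print(vol["VolumeId"])</code></pre>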
<h3 id="what-is-the-ability-of">What can it do?</h3><ul><li>You can format an EBS volume with a filesystem, and you can mount multiple EBS volumes to an instance.<br>But note that an <strong>EBS volume can be mounted to only one instance.</strong><br>If you want to mount a single filesystem to multiple instances, you should use the <strong>EFS - Elastic File System</strong> service.</li><li>EBS also lets you create new volumes from snapshots. That lets us create the volume in another AZ for <strong>disaster recovery</strong> purposes, or to launch an identical instance.</li></ul><blockquote><strong>The Important Note:</strong> While creating a new EBS volume from a snapshot, all of the content needs to be copied from S3 to the EBS volume in the target AZ.<br>The copy keeps running in the background even though the volume is shown as created. Be aware of this in production.<br>Make sure all the content is present on the EBS volume to avoid performance issues. To speed up the process, try to access the mount points in the filesystem; by doing that, EBS prioritizes fetching that data from S3.</blockquote><h3 id="ebs-volume-types">EBS Volume Types</h3><p>The size of an EBS volume can range from 1 GiB to 16 TiB, depending on the volume type. Each volume type has a dominant performance attribute.<br>The dominant performance attribute of <strong>SSD is IOPS</strong> and of <strong>HDD is throughput.</strong></p><blockquote>Storage Performance Measurement<br><strong>IOPS</strong>: number of input/output operations per second<br><strong>Throughput</strong>: data rate, expressed in megabytes per second<br><br>[Block size of operation] multiplied by [number of operations]<br>256 KB * 400 = ~100 MB per second throughput</blockquote><p>You should select the appropriate volume type according to your workload.<br>Does your workload demand IOPS or throughput?</p><h3 id="gp2-general-purpose">gp2 - General Purpose</h3><ul><li>It offers general purpose SSD volumes and is the default type. There is also a newer generation, gp3; details below.</li><li>Recommended for most workloads.</li><li>1 GiB - 16 TiB size.</li><li>It has a balance of IOPS and throughput.</li><li>Generally fast in terms of throughput: max throughput 250 MiB/s.<br>Exceptionally fast in terms of the number of IOPS: range 100 - 16,000 IOPS.</li></ul><p>There is an important point with gp2: it gains IOPS per gigabyte, so performance is linked to size. You get <strong>3 IOPS per GiB</strong>.</p><blockquote>If you have a 100 GiB volume, then you have 300 IOPS.<br>But it can burst up to 3000 IOPS.<br><br>500 GiB * 3 = 1500 IOPS baseline. Burstable up to 3000 IOPS<br>1024 GiB * 3 = 3072 IOPS baseline. Not burstable (already above 3000)</blockquote><p>While provisioning the volume, you should consider IOPS.<br>Smaller sizes can <strong>hit performance ceilings</strong>. Anytime you need to go <strong>above the baseline</strong>, you can burst up to 3000 IOPS. The small sketch below encodes these rules.</p>
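<p>A tiny sketch of the gp2 baseline and burst rules described above (these are my own helper functions for illustration, not an AWS API):</p><pre><code>def gp2_baseline_iops(size_gib):
    """3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000 IOPS."""
    return min(max(3 * size_gib, 100), 16_000)

def gp2_effective_peak_iops(size_gib):
    """Volumes below the 3000 IOPS baseline can burst up to 3000 IOPS."""
    return max(gp2_baseline_iops(size_gib), 3_000)

for size in (100, 500, 1024):
    print(size, gp2_baseline_iops(size), gp2_effective_peak_iops(size))
# 100  300  3000
# 500  1500 3000
# 1024 3072 3072</code></pre>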
<h3 id="io1-provisioned-iops">io1 - Provisioned IOPS</h3><ul><li>It offers the highest performance SSD volumes.</li><li>Suited for critical business applications.</li><li>4 GiB - 16 TiB size.</li><li>You can adjust size and IOPS separately.</li><li>Max: 64,000 IOPS / max: 1,000 MiB/s throughput.</li><li>EBS also has per-instance limits. You can have up to:<br>1,750 MiB/s throughput<br>80,000 IOPS<br>Do you find that limiting?<br>Then there is a solution: use <strong>instance store volumes</strong> :)</li></ul><h3 id="st1-throughput-optimized">st1 - Throughput Optimized</h3><ul><li>Offers low-cost HDD volumes.</li><li>500 GiB - 16 TiB size.</li><li>Throughput up to 500 MiB/s.</li><li>HDD volumes can't be boot volumes.</li><li>Suited for <strong>throughput intensive</strong> workloads such as <strong>streaming, big data, data warehouses and log processing</strong>.</li></ul><h3 id="sc1-cold-hdd">sc1 - Cold HDD</h3><ul><li>Offers the lowest cost HDD volumes.</li><li>Suited for infrequently accessed workloads that still require throughput.</li><li>500 GiB - 16 TiB size.</li><li>Throughput up to 250 MiB/s.</li><li>HDD volumes can't be boot volumes.</li></ul><h3 id="gp3-general-purpose">gp3 - General Purpose</h3><ul><li>The next generation general purpose SSD volume type, released in December 2020. gp3 volumes let you provision IOPS and size separately, like io1.</li><li>It offers a price up to 20% lower per GiB than gp2.</li><li>You can scale IOPS and throughput without needing to provision new volumes. Pay only for your usage.</li><li>Min IOPS: 3,000 - max IOPS: 16,000.</li><li>Min throughput: 125 MiB/s - max throughput: 1,000 MiB/s.</li><li>Suits transaction-intensive, low-latency workloads: database clusters and web applications.</li><li>You can easily migrate from gp2 to gp3 without interrupting the EC2 instances by using the Elastic Volumes feature.</li></ul>]]></content:encoded></item></channel></rss>