<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Shoin | Let's Share!]]></title><description><![CDATA[鯛も一人はうまからず | Even a sea bream loses its flavor when eaten alone.]]></description><link>https://shoin.cloudantler.com/</link><image><url>https://shoin.cloudantler.com/favicon.png</url><title>Shoin | Let&apos;s Share!</title><link>https://shoin.cloudantler.com/</link></image><generator>Ghost 3.42</generator><lastBuildDate>Mon, 09 Mar 2026 09:55:29 GMT</lastBuildDate><atom:link href="https://shoin.cloudantler.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Grafana OAuth SSO Integration with KeyCloak]]></title><description><![CDATA[<p>Single Sign-On integration is important. In my previous post, I demonstrated how to use Keycloak as the authentication layer in a FastAPI application. Now we will use Keycloak as identity provider for Grafana.</p><p>I'm going to run all those in my local env. So change the hostnames or FQDN for</p>]]></description><link>https://shoin.cloudantler.com/grafana-oauth-sso-integration-with-keycloak/</link><guid isPermaLink="false">68827d6ca5081e00017c5f12</guid><category><![CDATA[DevOps]]></category><category><![CDATA[Monitoring]]></category><category><![CDATA[Security]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 28 Jul 2025 12:58:36 GMT</pubDate><content:encoded><![CDATA[<p>Single Sign-On integration is important. In my previous post, I demonstrated how to use Keycloak as the authentication layer in a FastAPI application. Now we will use Keycloak as identity provider for Grafana.</p><p>I'm going to run all those in my local env. So change the hostnames or FQDN for your domain.</p><h3 id="create-keycloak-client">Create KeyCloak Client</h3><p>Properties of Keycloak client must be like following;</p><ul><li>Client Type: OpenID Connect<br>Client ID: grafana-oauth</li><li>Authentication: On<br>Authentication Flow: Standart Flow Enabled<br>Direct access grants Enabled</li><li>Root URL: Grafana's root url<br>Home URL: Grafana's root url<br>Valid Redirect URIs: Grafana's root url + <code>/login/generic_oauth</code><br>Web Origins: Grafana's root url</li></ul><p>After creating the client, go to client details page and click Credentials tab to obtain client secret.</p><h3 id="grafana-compose-file-and-oauth-configuration">Grafana Compose File and OAuth Configuration</h3><p>What to replace the file in below;</p><ul><li>Your realm name and Client secret</li><li>Your keycloak hostname. (There are two domains. Localhost is sent to browser. Host.docker.internal is for Grafana access to Keycloak.</li></ul><pre><code>version: '3.8'

services:
  grafana:
    image: docker.io/grafana/grafana-oss:12.0.2
    container_name: grafana
    restart: unless-stopped
    environment:
    - GF_SERVER_ROOT_URL=http://localhost:3000/
    - GF_PLUGINS_PREINSTALL=grafana-clock-panel
    - GF_SECURITY_ADMIN_USER=admin
    - GF_SECURITY_ADMIN_PASSWORD=admin
    - GF_AUTH_GENERIC_OAUTH_ENABLED=true
    - GF_AUTH_GENERIC_OAUTH_NAME=Keycloak-OAuth
    - GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP=true
    - GF_AUTH_GENERIC_OAUTH_CLIENT_ID=grafana-oauth
    - GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET=[SECRET]
    - GF_AUTH_GENERIC_OAUTH_SCOPES=openid email profile offline_access roles
    - GF_AUTH_GENERIC_OAUTH_EMAIL_ATTRIBUTE_PATH=email
    - GF_AUTH_GENERIC_OAUTH_LOGIN_ATTRIBUTE_PATH=username
    - GF_AUTH_GENERIC_OAUTH_NAME_ATTRIBUTE_PATH=full_name
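    # The AUTH URL below is opened by the user's browser, so it points at localhost;
    # the TOKEN and API URLs are called from inside the Grafana container, so they use
    # host.docker.internal to reach Keycloak (adjust all of them for your own domain)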
    - GF_AUTH_GENERIC_OAUTH_AUTH_URL=http://localhost:8080/realms/local/protocol/openid-connect/auth
    - GF_AUTH_GENERIC_OAUTH_TOKEN_URL=http://host.docker.internal:8080/realms/local/protocol/openid-connect/token
    - GF_AUTH_GENERIC_OAUTH_API_URL=http://host.docker.internal:8080/realms/local/protocol/openid-connect/userinfo
    - GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH=contains(roles[*], 'admin') &amp;&amp; 'Admin' || contains(roles[*], 'editor') &amp;&amp; 'Editor' || 'Viewer'
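    # Assumes a top-level "roles" claim in the token/userinfo response; by default Keycloak
    # exposes realm roles under realm_access.roles, so either add a client-scope mapper or
    # adjust the JMESPath expression above to match your token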

    ports:
     - '3000:3000'
    volumes:
     - 'grafana_storage:/var/lib/grafana'
volumes:
  grafana_storage: </code></pre><p>You should now be able to log in to Grafana with your Keycloak user. One important note: on localhost you may not be able to map Keycloak roles to Grafana roles. <code>OAUTH_ROLE_ATTRIBUTE_PATH</code> defines the mapping, so normally you would just assign the role to the user in Keycloak. The problem comes from the difference between the localhost and host.docker.internal hostnames: Grafana can't verify the JWT token, so it doesn't pick the role up. It works in production. Also make sure the path matches the claim key in your JWT token; adjust the path to fit the token, or modify the client scope details in Keycloak.</p>]]></content:encoded></item><item><title><![CDATA[FastAPI - KeyCloak OAuth2 Integration]]></title><description><![CDATA[<p>In this blog post, I'm going to integrate FastAPI with KeyCloak to authenticate the users through KeyCloak. Instead of implementing authentication layer in each service, we take the advantage of KeyCloak to handle it in a centralized point. We can achieve that by using JWT tokens with RSA signing.</p><h3 id="install-requirements-in-below">Install</h3>]]></description><link>https://shoin.cloudantler.com/fastapi-keycloak-oauth2-integration/</link><guid isPermaLink="false">685a9c12a5081e00017c5d9c</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Security]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Sat, 05 Jul 2025 18:36:12 GMT</pubDate><content:encoded><![CDATA[<p>In this blog post, I'm going to integrate FastAPI with KeyCloak to authenticate users through KeyCloak. Instead of implementing an authentication layer in each service, we take advantage of KeyCloak to handle it in a centralized place. We can achieve that by using JWT tokens with RSA signing.</p><h3 id="install-requirements-in-below">Install the requirements below</h3><pre><code>Requirements
- uvicorn, fastapi, pydantic, pydantic-settings, python-keycloak, pyjwt, python-multipart</code></pre><h3 id="keycloak-realm-and-client-configuration">KeyCloak Realm and Client Configuration</h3><ul><li>First create a realm in KeyCloak named <code>local</code> and select it.</li><li>Under <code>Clients</code>, create a new client with the specs below<br>Client type: OpenID Connect<br>Client ID: local-api</li><li>Click next and enable the capabilities below<br>Client Authentication: On<br>Direct Access Grants: On<br>Standard Flow: On</li><li>Click next and configure the login settings<br>Root URL: http://localhost:8000<br>Home URL: http://localhost:8000<br>Valid Redirect URIs: http://localhost:8000/*<br>Web Origins: http://localhost:8000</li><li>Click save to create the client</li></ul><h3 id="obtain-client-credentials">Obtain Client Credentials</h3><ul><li>Go to the client details page of the <code>local-api</code> client and select the Credentials tab.</li><li>Copy the client secret; it will be used by the FastAPI app</li></ul><h3 id="create-a-test-user-in-keycloak">Create a Test User in Keycloak</h3><ul><li>Click the Users tab, then click the Create new user (or Add user) button</li><li>Fill in the form and click the Create button.</li><li>On the user details page, open the Credentials tab and set a new password. Don't forget to disable the Temporary toggle so the password stays permanent.</li></ul><h3 id="create-settings-py">Create settings.py</h3><p>I'm going to use pydantic-settings to load environment variables based on the runtime environment. If <code>RUNTIME_ENV</code> is set to local or left unset, the application will look for <code>.local.env</code> in the app directory and load environment variables from it. You must set the KEYCLOAK variables in the <code>.local.env</code> file. An example env file would look like this.</p><pre><code class="language-config">RUNTIME_ENV=local
KEYCLOAK_SERVER_URL=http://localhost:8080/auth
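# Note: the /auth path suffix above applies to older Keycloak distributions;
# newer (Quarkus-based) servers serve realms at the root, without /auth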
KEYCLOAK_CLIENT_ID=local-api
KEYCLOAK_REALM_NAME=local
KEYCLOAK_CLIENT_SECRET_KEY=&lt;secret that you get from keycloak&gt;</code></pre><pre><code>from pydantic_settings import BaseSettings, SettingsConfigDict
import os 


class CommonSettings():
    VERSION: str = "0.1.0"
    
    KEYCLOAK_SERVER_URL: str
    KEYCLOAK_CLIENT_ID: str
    KEYCLOAK_REALM_NAME: str 
    KEYCLOAK_CLIENT_SECRET_KEY: str
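    # Later in this post the cached Keycloak public key is stored on settings as well,
    # e.g. JWT_KEY: Any = None  (with `from typing import Any`)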


class LocalSettings(CommonSettings, BaseSettings):
    model_config = SettingsConfigDict(env_file='.local.env', env_file_encoding='utf-8')
    RUNTIME_ENV: str = "local"


class ProdSettings(CommonSettings, BaseSettings):
    model_config = SettingsConfigDict(env_file='.env', env_file_encoding='utf-8')
    RUNTIME_ENV: str = "prod"


runtime_env = os.environ.get("RUNTIME_ENV", "local")
settings = LocalSettings() if runtime_env == "local" else ProdSettings()
</code></pre><h3 id="create-security-py">Create security.py</h3><p>In this module, we implement functions to use later and initiliaze and configure keycloak class for connecting to keycloak server.</p><pre><code class="language-Python">from keycloak import KeycloakOpenID
from .settings import settings as s


idp = KeycloakOpenID(
    s.KEYCLOAK_SERVER_URL,
    s.KEYCLOAK_REALM_NAME,
    s.KEYCLOAK_CLIENT_ID,
    s.KEYCLOAK_CLIENT_SECRET_KEY
)</code></pre><h3 id="create-auth-py-and-login-endpoint">Create auth.py and Login endpoint</h3><p>In this module we implement login endpoint and issue a token from Keycloak by using credentials provided by user. That's why you need to enable <code>Direct Access Grants</code> in client capabilities to exchange token with username and password. We return access and refresh tokens to user.</p><pre><code class="language-Python">from typing import Annotated
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import OAuth2PasswordRequestForm
from pydantic import BaseModel


from .settings import settings as s
from .security import idp
from keycloak.exceptions import KeycloakAuthenticationError, KeycloakError


class OpenIdToken(BaseModel):
    token_type: str
    access_token: str 
    refresh_token: str
    expires_in: int 
    refresh_expires_in: int 
    

app = FastAPI(title="Code Reference API",
              description="API for code reference",
              version=s.VERSION)


@app.post("/login")
async def login(form: Annotated[OAuth2PasswordRequestForm, Depends()]):
    try: 
        resp = idp.token(form.username, form.password, scope="openid profile email")
        return OpenIdToken(**resp)

    except KeycloakAuthenticationError as e:
        raise HTTPException(401, "Username or password is incorrect!") from e

    except KeycloakError as e:
        raise HTTPException(400, "Request malformed!") from e</code></pre><p>By here, we have managed to login through Keycloak. Now you must be able to get access token from /login endpoint. Give it a try!</p><h3 id="protect-an-endpoint-and-validate-access-tokens">Protect an Endpoint and Validate Access Tokens</h3><p>Now it's time to make JWT tokens mandatory for protected endpoints. Let's decode and validate JWT tokens. </p><p>Add below lines to <code>security.py</code> </p><ul><li><code>oauth_token</code> gives information where to obtain token from for swagger UI. </li><li><code>decode_token</code> calls keycloak decode_token function to decode and verify token by fetching public key from KeyCloak. (I'm going to show how to verify it without fetching key every time)<br>Function also handles the exceptions and returns proper error messages.</li></ul><pre><code class="language-Python">from fastapi.security import OAuth2PasswordBearer


# exceptions raised while decoding/validating the token (python-keycloak uses jwcrypto)
from fastapi import HTTPException
from jwcrypto.common import JWException
from jwcrypto.jws import InvalidJWSSignature
from jwcrypto.jwt import JWTExpired


oauth_token = OAuth2PasswordBearer("/login")


async def decode_token(token: str) -&gt; dict:
    try: 
        return await idp.a_decode_token(token, validate=True)
        
    except JWTExpired as e:
        raise HTTPException(401, "Token has expired") from e 
    
    except InvalidJWSSignature as e: 
        raise HTTPException(400, "Token signature couldn't be verified") from e 
    
    except JWException as e: 
        raise HTTPException(400, "Token is malformed") from e 
    
    except Exception as e:
        print(e)
        raise HTTPException(500, "Error while decoding token") from e </code></pre><p>Then we need to implement a dependency function that marks the Authorization header as required and injects the decoded token into the endpoint. </p><pre><code>from typing import Annotated
from fastapi import Depends
from .security import oauth_token, decode_token


async def get_user(token: Annotated[str, Depends(oauth_token)]):
    return await decode_token(token)</code></pre><pre><code>@app.post("/me")
async def me(user: Annotated[dict, Depends(get_user)]):
    return user</code></pre><p>When you visit the Swagger UI, you will see a lock icon next to the /me endpoint and a login button in the top-right corner. Log in and try to send a request to the /me endpoint. You should see the decoded token issued by Keycloak.</p><h3 id="how-to-verify-without-fetching-key-every-time">How to Verify without fetching key every time?</h3><p>On the first startup, we can load the key and store it in memory. While decoding the token, we use the key from memory instead of fetching it from Keycloak. That reduces load on Keycloak and latency, but it comes with consequences, such as having to handle key rotation. Keep that in mind.</p><p>We're going to use the lifespan feature in FastAPI. We load the key on app startup and save it to the <code>settings</code> object to make it available everywhere. <br>Don't forget to add <code>JWT_KEY</code> to your settings class.</p><pre><code>from contextlib import asynccontextmanager
from jwcrypto import jwk
from .settings import settings as s 
from .security import idp


@asynccontextmanager
async def lifespan(app: FastAPI):
    key = (
        "-----BEGIN PUBLIC KEY-----\n"
        + await idp.a_public_key()
        + "\n-----END PUBLIC KEY-----"
    )
    s.JWT_KEY = jwk.JWK.from_pem(key.encode("utf-8"))
    yield 


app = FastAPI(version=s.VERSION,
              lifespan=lifespan)
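
# Note: this caches the realm public key for the life of the process. If Keycloak rotates
# its keys, decoding will fail until the app restarts (or the key is refreshed); that is
# the key-rotation trade-off mentioned above.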
</code></pre><p>Now we should pass the cached key to the Keycloak client so it is used instead of being fetched. Find the <code>decode_token</code> function and change its body.</p><pre><code>return await idp.a_decode_token(token, validate=True, key=s.JWT_KEY)</code></pre>]]></content:encoded></item><item><title><![CDATA[Zot OCI Registry Review and Configuration]]></title><description><![CDATA[<p>While searching an alternative to Docker's private registry, found Zot OCI registry. And wanted to see the capabilities and limitations. It's an image registry which allows you to store and distribute container images. The difference is from other registry servers, it follows the OCI distribution specs published by Open container</p>]]></description><link>https://shoin.cloudantler.com/zot-registry-set-up-and-configuration/</link><guid isPermaLink="false">67321782a5081e00017c5b92</guid><category><![CDATA[Container]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 20 Jan 2025 10:23:32 GMT</pubDate><content:encoded><![CDATA[<p>While searching for an alternative to Docker's private registry, I found the Zot OCI registry and wanted to see its capabilities and limitations. It's an image registry that allows you to store and distribute container images. The difference from other registry servers is that it follows the OCI distribution spec published by the Open Container Initiative (OCI). The OCI is the governing body that sets the rules and defines the common structure that software developers and companies build their standards around.</p><h3 id="expectations">Expectations </h3><p>We usually expect the following from an image registry.</p><ul><li>Push and pull container images</li><li>Delete container images</li><li>Persistent storage for images, preferably object storage like S3</li><li>Authentication layer for security</li></ul><h3 id="extras">Extras</h3><ul><li>Mirroring of Docker Hub or another container registry</li><li>Security vulnerability scans on images</li><li>Container image signing and licence checking</li><li>Automatic image retention based on defined rules</li></ul><p>Zot supports all of the above features. </p><h3 id="configuration">Configuration</h3><p>This is the version I used: <a href="https://shoin.cloudantler.com/p/df4cb9bf-b4b9-4b0f-9658-f52b9f9cfc70/ghcr.io/project-zot/zot:v2.1.2-rc3">ghcr.io/project-zot/zot:v2.1.2-rc3</a><br>The entry point command is <code>zot</code> and I mount the configuration file below.</p><p><code>docker run -d -p 8000:8000 --name zot -v `pwd`/config.json:/etc/zot/config.json ghcr.io/project-zot/zot:v2.1.2-rc3 serve /etc/zot/config.json</code> </p><pre><code>{
    "distSpecVersion": "1.0.1",
    "storage": {
        "rootDirectory": "/tmp/zot",
        "commit": true,
        "dedupe": false,
        "gc": true,
        "gcDelay": "2h",
        "gcInterval": "1h",
        "storageDriver": {
            "name": "s3",
            "region": "eu-west-1",
            "bucket": "harver-zot-private-registry",
            "secure": true,
            "skipverify": false
        },
        "retention": {
            "dryRun": false,
            "delay": "24h",
            "policies": [
                {
                    "repositories": ["infra/**", "base/**"],
                    "keepTags": [{
                        "patterns": [".*"]
                    }]
                },
                {
                    "keepTags": [{
                        "patterns": [".*"]
                    }]
                }
            ]
        }
    },
    "http": {
        "address": "0.0.0.0",
        "port": "8000"
    },
    "extensions": {
        "metrics": {
            "enable": true,
            "prometheus": {
                "path": "/metrics"
            }
        },
        "sync": {
            "downloadDir": "/tmp/mirror",
            "enable": true,
            "registries": [
                {
                    "urls": ["https://docker.io"],
                    "content": [
                        {
                            "prefix": "**",
                            "destination": "/docker"
                        }
                    ],
                    "onDemand": true,
                    "tlsVerify": true
                }
            ]
        },
        "search": {
            "enable": true
        },
        "scrub": {},
        "lint": {},
        "trust": {},
        "ui": {
            "enable": true
        }
    },
    "log": {
        "level": "debug"
    }
}</code></pre>]]></content:encoded></item><item><title><![CDATA[How to Build Multi-Arch Container Images]]></title><description><![CDATA[<p>Each container image has processor architecture. That architecture describes where you can run them on. It's the similar, like in OS executable binaries.<br>If you want to run your application on <code>amd64</code>, you should build your code on <code>amd64</code> processor. Nowadays, toolkits and emulators allow you to build the code</p>]]></description><link>https://shoin.cloudantler.com/build-multi-arch-container-images/</link><guid isPermaLink="false">674860afa5081e00017c5c37</guid><category><![CDATA[DevOps]]></category><category><![CDATA[Container]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Tue, 03 Dec 2024 14:11:05 GMT</pubDate><content:encoded><![CDATA[<p>Each container image has a processor architecture. That architecture describes where you can run the image, similar to OS executable binaries.<br>If you want to run your application on <code>amd64</code>, you should build your code for an <code>amd64</code> processor. Nowadays, toolkits and emulators allow you to build the code anywhere. For Docker, this is <code>buildx</code>, a plugin that you can install by following the link: <a href="https://github.com/docker/buildx">GitHub</a>. Buildx has more capabilities than just multi-arch builds.</p><p>Container images can be built for different architectures and then combined under a single image tag. This is done with an image manifest list, a file that references the image for each architecture. </p><h3 id="build-image-from-dockerfile-buildx">Build Image from Dockerfile - buildx</h3><p>Let's build a container image for the <code>linux/amd64</code> and <code>linux/arm64</code> architectures. We can use <code>buildx</code> for that. First we need to install it.</p><blockquote>Buildx and BuildKit are already enabled in Docker Desktop</blockquote><p>Buildx uses builder instances for the builds; each build is executed inside one of them. Their advantage is that they provide isolated environments. You can even set up a build farm with a set of builders and switch between them. </p><p>To build multi-platform images, follow the steps below.</p><pre><code>docker run --privileged --rm tonistiigi/binfmt --install all

docker buildx create --use --name multi-arch node-amd64
docker buildx create --append --name multi-arch node-arm64
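
# Optional: boot the new builder and check that both nodes are reachable
# (node-amd64 / node-arm64 above are assumed to be existing docker contexts)
docker buildx inspect --bootstrap multi-arch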

# List builders and switch to the new one
docker buildx ls
docker buildx use multi-arch</code></pre><p>The builder instances are ready to use, and we can build container images for two architectures on the same host. <br>The <code>--push</code> parameter pushes the image to the repository. If you want to keep the image locally, pass the <code>--load</code> parameter instead.</p><h3 id="build-multi-arch-images">Build Multi-Arch images</h3><pre><code>docker buildx build \
	--platform linux/arm64,linux/amd64 \
	--push \
	--tag [image_tag] \
	.</code></pre>]]></content:encoded></item><item><title><![CDATA[SQLAlchemy - UUID and Mix-In Classes]]></title><description><![CDATA[<p></p><h3 id="uuid">UUID</h3><p>UUID (Universal Unique Identifier) is a 128-bit value used to uniquely identify an object or entity on the internet. <br>A real world example; When you visit a news website, you see something following in the address bar of the web browser.<br> <code>https://newssite.example/category/economy/1</code> <br>As you</p>]]></description><link>https://shoin.cloudantler.com/sqlalchemy-uuid-mix-in-classes/</link><guid isPermaLink="false">672f647ea5081e00017c5aa2</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 11 Nov 2024 14:12:48 GMT</pubDate><content:encoded><![CDATA[<p></p><h3 id="uuid">UUID</h3><p>A UUID (Universally Unique Identifier) is a 128-bit value used to uniquely identify an object or entity. <br>A real-world example: when you visit a news website, you see something like the following in the address bar of the web browser.<br> <code>https://newssite.example/category/economy/1</code> <br>As you can imagine, the <code>1</code> indicates that it's the first article; if you increase the number you view the second article. It's predictable, and people can copy your content easily just by incrementing the number. So hiding your content's identifier from visitors can be crucial for web services. To hide it, you can use UUIDs instead of sequential numbers. Predicting a UUID is difficult but implementing one is easy, because a UUID looks like this:<br><code>291ad1cd-c352-4220-b8f8-5ee876da5390</code> <br>If you can guess the next article's ID, go and play the lottery :)</p><p>The best place to put the UUID field is the Base class. By defining it there we ensure every resource in the database gets a UUID primary key, and it follows the DRY (don't repeat yourself) principle. </p><p>As the default parameter, we pass the <code>uuid.uuid4</code> function itself (not its result); SQLAlchemy calls it to generate a UUID for each new row. Both variants (the plain integer <code>id</code> and the UUID column) are shown below.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import as_declarative
from sqlalchemy import Column, Integer
from sqlalchemy.dialects.postgresql import UUID
import uuid

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base():
    # id = Column('id', Integer, primary_key=True, index=True)
    uuid = Column('uuid', UUID(as_uuid=True), primary_key=True, index=True, default=uuid.uuid4)</code></pre><h3 id="auto-generated-table-names">Auto-Generated Table Names</h3><p>Instead of typing <code>__tablename__</code> in each model class, you can follow the approach below in the Base class. You can use this decorator for more than that, of course, such as creating mix-in classes (explained further below).</p><pre><code>from sqlalchemy.orm import as_declarative, declared_attr
from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import UUID
import uuid

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base():
    @declared_attr
    def __tablename__(cls):
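        # e.g. class User -> table "users", class BlogPost -> "blogposts"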
    	return f"{cls.__name__.lower()}s"

    uuid = Column('uuid', UUID(as_uuid=True), primary_key=True, index=True, default=uuid.uuid4)</code></pre><h3 id="common-date-time-mix-in-classes">Common Date Time Mix-in Classes</h3><p>For auditing purposes and keeping everything on record, we should store the creation and update dates. The mix-in class below is our abstract class; we will use it in the classes that need date-time records. </p><p>The creation and update dates can be set automatically. Note that we use <code>server_default</code> with <code>func.now()</code> for <code>created_at</code>, <br>but <code>onupdate</code> with <code>func.now()</code> for <code>updated_at</code>. The reason is explained in this<a href="https://stackoverflow.com/questions/13370317/sqlalchemy-default-datetime"> thread. </a></p><pre><code>from sqlalchemy import Column, DateTime, String, func
from sqlalchemy.orm import declared_attr

class DateTimeMixin:
    @declared_attr
    def created_at(cls):
        return Column(DateTime(timezone=True), server_default=func.now())
        
    @declared_attr
    def updated_at(cls):
        return Column(DateTime(timezone=True), onupdate=func.now())
        

class User(Base, DateTimeMixin):
    __tablename__ = "users"
    
    username = Column(String, index=True)
    password = Column(String)</code></pre>]]></content:encoded></item><item><title><![CDATA[FastAPI - SQLAlchemy and Alembic Integration]]></title><description><![CDATA[<p>Almost in every project we store information in databases. It could be an SQL or a NoSQL database. Let's say you would like to go with an SQL database. You have plenty of options to choose. In SQL databases you need to define the structure such as columns and its</p>]]></description><link>https://shoin.cloudantler.com/fastapi-sqlalchemy-and-alembic-integration/</link><guid isPermaLink="false">6504b1b6a5081e00017c57a4</guid><category><![CDATA[Development]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Sat, 09 Nov 2024 13:31:11 GMT</pubDate><content:encoded><![CDATA[<p>In almost every project we store information in databases. It could be an SQL or a NoSQL database. Let's say you would like to go with an SQL database; you have plenty of options to choose from. In SQL databases you need to define the structure: columns and their data types, constraints, default values, etc. You may want to create indexes, views and transactions as well. SQLAlchemy allows you to define your tables in the form of Python classes. You can get the value of a column just by reading an attribute of the class, which lets you take advantage of the OOP paradigm. The ORM is capable of creating the tables itself, but that's not the preferred way for production workloads and it isn't sustainable: when you modify your structure, the ORM can't apply the change, and you don't keep track of the changes you made before. This is where Alembic comes to your help. It creates scripts that modify your structure, and it provides rollback scripts as well.</p><p>Install the required packages</p><pre><code class="language-bash">poetry add sqlalchemy alembic</code></pre><pre><code>fooapi/ # Main package
    fooapi/
    	__init__.py
        main.py
    tests/
    poetry.lock
    pyproject.toml
    db/
    	__init__.py
        base_class.py
        base.py
        session.py</code></pre><h3 id="sqlalchemy">SQLAlchemy</h3><p>We should create a new sub-package called <code>db</code>. We store the base classes, session objects and configurations for database connection. The structure is like below.</p><p>We define our base class first. We inherit new model classes from the base class. We can define the common columns in base class. It effects every model classes that is inherited from it. So that you don't have to type commun columns in each model classes.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import as_declarative
from sqlalchemy import Column, Integer

class_registry: dict = {}

@as_declarative(class_registry=class_registry)
class Base:
	id = Column('id', Integer, primary_key=True, index=True)</code></pre><p>The <code>class_registry</code> dictionary is used for tracking which classes are used to create database tables; it's the mapping. <code>as_declarative</code> means that we declare the attributes and structure of the table. We added an <code>id</code> column which is common to all of our tables. </p><p>So far we have only defined our Base class. We still need to initiate a Session to connect to the database.  SQLAlchemy is able to connect to different database engines with almost the same code, or with small changes. <br>We keep the database connection string in a variable and create an engine from it; the ORM works out the target engine from the connection string. We can pass some extra parameters here. For instance, SQLite connections can't be shared across threads by default, and FastAPI (built on Starlette, an async ASGI framework) may handle a request on a different thread, so we tell the driver not to check whether the connection is used from the same thread that created it.<br>The sessionmaker function returns a Session factory, which is what actually connects and initiates a session in the database. This object is callable: if you want to connect to the db, you call it, <code>SessionLocal()</code>.</p><pre><code class="language-Python 3.11">from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine

SQL_URI = "sqlite:///app.db"

engine = create_engine(SQL_URI, connect_args={"check_same_thread": False})
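# check_same_thread=False only applies to SQLite; it allows the connection to be used from a
# thread other than the one that created it, which FastAPI's threaded request handling needs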
SessionLocal = sessionmaker(engine, autoflush=False, autocommit=False)</code></pre><p>We create another module called <code>base.py</code>; I'll explain it in the Alembic section. This is what it looks like: </p><pre><code class="language-Python 3.11">from .base_class import Base
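
# model modules will be imported here later (shown further below) so that
# Alembic can discover their tables on Base.metadata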
</code></pre><h3 id="alembic">Alembic</h3><p>We need to initate alembic in project. Go to directory next to <code>pyproject.toml</code> and run <code>alembic init alembic</code> It creates a directory named <code>alembic</code> and <code>alembic.ini</code> file next to <code>pyproject.toml</code> This is last directory setup.</p><pre><code>fooapi/ # Main package
    fooapi/
    	__init__.py
        main.py
    tests/
    poetry.lock
    pyproject.toml
    db/
    	__init__.py
        base_class.py
        base.py
        session.py
    # NEW files below
    alembic.ini 
    alembic/
    	versions/
        env.py</code></pre><p>Find and comment out the <code>sqlalchemy.url</code> line in alembic.ini, because we take this value from <code>session.py</code> instead. Make the changes below in <code>env.py</code>.</p><pre><code>from fooapi.db.base import Base
target_metadata = Base.metadata

from fooapi.db.session import SQL_URI
def get_url() -&gt; str:
	return SQL_URI</code></pre><p><code>target_metadata</code> is required to run Alembic properly. It gives Alembic the Base class metadata with the table-class mapping that we defined in <code>base_class.py</code>.<br>We imported SQL_URI to point at the target database.</p><p>In addition to that, we must change the <code>run_migrations_offline</code> function. Since we commented out <code>sqlalchemy.url</code> in alembic.ini, update the value of the url parameter in the <code>context.configure</code> call.</p><pre><code>context.configure(
    url=get_url(),
    target_metadata=target_metadata,
    literal_binds=True,
    dialect_opts={"paramstyle": "named"},
)</code></pre><p>We have to make some updates in the <code>run_migrations_online</code> function as well. Because we load the connection string from <code>session.py</code>, we need to set it in this configuration too.</p><pre><code>configuration = config.get_section(config.config_ini_section, {})
configuration['sqlalchemy.url'] = get_url()
connectable = engine_from_config(
    configuration,
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
    )</code></pre><h3 id="alembic-revision-and-apply">Alembic Revision and Apply</h3><p>Alembic creates migration scripts and they're versioned. To create a new revision and apply it, you can give this command</p><pre><code>alembic revision --autogenerate -m "first revision"
alembic upgrade head</code></pre><p>Most probably it isn't going to create anything in the database, because we haven't defined any model classes that map to real database tables yet. At least we have tested our integration in this step.</p><h3 id="create-db-model">Create DB Model</h3><p>Create a new sub-package called <code>models</code> under the fooapi package, and create a new module named <code>user.py</code> in it. We define the user model class in this module. <br>It must inherit from the <code>Base</code> class. We have only 2 columns for this table, and we have to set the table name as well. </p><pre><code>from fooapi.db.base_class import Base
from sqlalchemy import Column, String


class User(Base):
    __tablename__ = "users"
    
    username = Column(String, index=True)
    password = Column(String)</code></pre><p>Now go to <code>base.py</code> under the db sub-package and import this model class; otherwise Alembic isn't going to detect that the model class exists.</p><pre><code>from .base_class import Base

from fooapi.models.user import User</code></pre><p>This part is really important. You must have both <code>base.py</code> and <code>base_class.py</code>.<br>Alembic needs <code>base.py</code>. Model classes need <code>base_class.py</code>. <br>Alembic needs the model classes as well, which is why we import them in <code>base.py</code>. <br>If you defined your Base class in <code>base.py</code> instead of <code>base_class.py</code>, you would cause a <code>circular dependency</code>; that's why you must define them in separate files. If you get an error about that, ensure you imported the Base class from the correct module.</p><p>Model classes:<br>- Import the Base class from <code>base_class.py</code><br>- Import the model class within <code>base.py</code> for Alembic to discover it.</p><p>Alembic:<br>- Imports <code>base.py</code> <br>- <code>base.py</code> must import the Base class from <code>base_class.py</code><br>- Make sure you imported all defined model classes within <code>base.py</code></p><p>To create the table for the <code>User</code> model, run the commands below.</p><pre><code>alembic revision --autogenerate -m "User model added"
alembic upgrade head 


alembic downgrade -1  # To revert changes</code></pre><p>Now we have created a table with the help of Alembic. We store the changes in our repository and are able to revert and provision database versions with less effort.</p>]]></content:encoded></item><item><title><![CDATA[BigQuery Scheduled Query Monitoring]]></title><description><![CDATA[<p>We have created scheduled queries in the <a href="https://shoin.cloudantler.com/bigquery-scheduled-queries/">previous post</a>. The query execution results could be forwarded to a pub/sub topic. In this post we developed a function that reviews the execution results and takes action based on state.</p><p>I want to build the solution async and reusable. Therefore there</p>]]></description><link>https://shoin.cloudantler.com/bigquery-scheduled-query-monitoring/</link><guid isPermaLink="false">63a9f65e7a30a8000171c94c</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[Solution]]></category><category><![CDATA[Lambda]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:31:36 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/app-integrations.png" medium="image"/><content:encoded><![CDATA[<img src="https://shoin.cloudantler.com/content/images/2022/12/app-integrations.png" alt="BigQuery Scheduled Query Monitoring"><p>We created scheduled queries in the <a href="https://shoin.cloudantler.com/bigquery-scheduled-queries/">previous post</a>. The query execution results can be forwarded to a pub/sub topic. In this post we develop a function that reviews the execution results and takes action based on the state.</p><p>I want to build the solution async and reusable, so there are two functions.<br>The first function is responsible for reviewing the execution results.<br>The second function is responsible for sending emails.<br>I built it that way because I want to use the email-sender function project-wide. :)</p><h3 id="workflow">Workflow</h3><figure class="kg-card kg-image-card"><img src="https://shoin.cloudantler.com/content/images/2022/01/image.png" class="kg-image" alt="BigQuery Scheduled Query Monitoring"></figure><ol><li>BigQuery starts to process the query at the scheduled time.</li><li>Whether the query succeeded or not, the execution result is forwarded to the pub/sub topic specified during creation.</li><li>The pub/sub topic invokes the reviewer function.</li><li>The reviewer function checks the body of the event data to see whether the query succeeded.<br>If the state is <code>SUCCEEDED</code>, it does nothing.<br>If not, it publishes a new message to the target topic, which is the email relay.</li></ol><p>The reviewer really just checks whether the state is <code>SUCCEEDED</code>; if the query failed, it publishes a new message to the email-relay topic.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/xsetra/shoin-posts/tree/master/gcp-bigquery-scheduled-q-monitoring"><div class="kg-bookmark-content"><div class="kg-bookmark-title">shoin-posts/gcp-bigquery-scheduled-q-monitoring at master · xsetra/shoin-posts</div><div class="kg-bookmark-description">Contains devops scripts to re-use. 
Contribute to xsetra/shoin-posts development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg" alt="BigQuery Scheduled Query Monitoring"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">xsetra</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/69ca72e3b34d84272eaf377c94f57bf960c4605f57a410aff1348b87d11269be/xsetra/shoin-posts" alt="BigQuery Scheduled Query Monitoring"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[BigQuery Scheduled Query Management]]></title><description><![CDATA[<p>Sometimes you might want to run queries on recurring basis. The queries could be written in SQL.<br>We decided to run some queries for cleaning purposes the datasets, then developed the below scripts. If you have a lot of queries and you want to store a copy of them in</p>]]></description><link>https://shoin.cloudantler.com/bigquery-scheduled-query-management/</link><guid isPermaLink="false">63a9f5d17a30a8000171c93b</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[Database]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:30:04 GMT</pubDate><content:encoded><![CDATA[<p>Sometimes you might want to run queries on recurring basis. The queries could be written in SQL.<br>We decided to run some queries for cleaning purposes the datasets, then developed the below scripts. If you have a lot of queries and you want to store a copy of them in a repository, the following script would be fit for you.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/xsetra/shoin-posts/blob/master/gcp-bigquery-scheduled-queries/scheduled_queries.py"><div class="kg-bookmark-content"><div class="kg-bookmark-title">shoin-posts/scheduled_queries.py at master · xsetra/shoin-posts</div><div class="kg-bookmark-description">Contains devops scripts to re-use. Contribute to xsetra/shoin-posts development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">xsetra</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/69ca72e3b34d84272eaf377c94f57bf960c4605f57a410aff1348b87d11269be/xsetra/shoin-posts"></div></a></figure><h3 id="workflow">Workflow</h3><ul><li>The script simply reads the query list from <code>query-catalog-file</code>. There is a file  in repo named <code>query-catalog.json</code> that shows the structure.<br><code>name</code> , <code>schedule</code> and <code>pubsub_topic</code> is optional fields.</li><li>It creates or deletes the queries according to <code>operation</code> parameter.</li><li>The query must be provided within the catalog file.</li><li>While deleting the queries, the <code>generate_name</code> function takes a big role. Because the query name is composed with a parent name that includes project, location, dataset_id. The script finds the correct name, then deletes the query.</li></ul><pre><code class="language-bash">usage: scheduled_queries.py 
    project_id 
    query-catalog-file 
    operation
    [-h] 
    [--service-account-name SERVICE_ACCOUNT_NAME] 
    [--pubsub-topic-id PUBSUB_TOPIC_ID] 
    [--default-schedule DEFAULT_SCHEDULE] 
    [--location LOCATION] </code></pre><h3 id="which-one-wins-the-race-file-or-parameter">Which one wins the race? File or Parameter?</h3><p>Answer is file. For instance if you don't specify the name in file, the script will generate a name for it. If you specify, the file name is chosen. You can find the name rules in <code>generate_name</code> function.</p><h3 id="optional-parameters">Optional Parameters</h3><ul><li>Service Account Name<br>It's associated to bigquery scheduled query. You have to grant correct permissions according to query requirements. For instance, bigquery user.</li><li>PubSub Topic Id<br>When scheduled query executed, the result of query will be send to this topic.<br>It's required to monitor the queries. Take a look monitoring solution.</li><li>Default Schedule<br>If you don't specify the schedule in catalog-file, this value will be used.</li><li>Location<br>What is the location of dataset or query execution location.</li></ul><h3 id="example-usage">Example Usage</h3><pre><code>./scheduled_queries.py gcp-project query-catalog.json create</code></pre>]]></content:encoded></item><item><title><![CDATA[GitOps Explained]]></title><description><![CDATA[<p>I heard the "GitOps" word first time while learning the terraform. A bunch of new words that have "Ops" suffix have appeared nowadays such as DevOps, NetDevOps, and DevSecOps. I decided to research that word and our hero today is GitOps :)</p><p>Initially I want to define <strong>what is GitOps?</strong><br>GitOps</p>]]></description><link>https://shoin.cloudantler.com/gitops-explained/</link><guid isPermaLink="false">63a9f3ae7a30a8000171c929</guid><category><![CDATA[DevOps]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:19:53 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/publishing-options.png" medium="image"/><content:encoded><![CDATA[<img src="https://shoin.cloudantler.com/content/images/2022/12/publishing-options.png" alt="GitOps Explained"><p>I heard the "GitOps" word first time while learning the terraform. A bunch of new words that have "Ops" suffix have appeared nowadays such as DevOps, NetDevOps, and DevSecOps. I decided to research that word and our hero today is GitOps :)</p><p>Initially I want to define <strong>what is GitOps?</strong><br>GitOps is an operational framework. I want to draw your attention to the framework word. If you're a programmer, you might be confused. Please don't compare with the web frameworks. You should think it's like a manifest or a process. It defines the policies, standards, quality metrics and governance.</p><h3 id="git-and-ops-combination">Git and Ops Combination</h3><p>It is composed the Git and Ops words. Actually the words itself describes a lot of things when you look at first. Let's inspect them individually.</p><p><strong>Git</strong> is a version control system. It provides developers the collaboration, integration and central management point for their applications and teams. Beside of that features todays providers such as GitHub and GitLab provides the CI/CD tooling services.</p><p><strong>Ops</strong> stands for Operations and it consists the set of processes and services that are administrated by an IT department. Ops teams setup the infrastructure that application needs. The IT department has different teams. The teams define different policies and standards than each others due to communication problems and special tasks. 
They create the servers, installs the operating systems and the services on that servers manually.</p><p>If we led to the conclusion, we could say that the developers has an automated development lifecycle thanks to VCS and its processes.<br>But the ops teams still have a manual process and specialized teams for different tasks. Infrastructure is more complex and larger than ever. Because the need of the modern infrastructure has been changed.</p><h3 id="modern-infrastructure-and-gitops">Modern Infrastructure and GitOps</h3><p>The demands of modern infrastructure is elasticity, easy maintenance and enhanced monitoring. Modern infrastructure needs to have automated provisioning. As DevOps engineers we work on prepare the declarative files of infrastructure. There is a missing point though. The provisioning the infrastructure is still a manual process and needs to have development lifecycle standards and policies. GitOps framework defines that how you could achieve that standards.</p><h3 id="gitops-framework">GitOps Framework</h3><p>Speed and scalability are the most important properties of modern applications.</p><p>The application can be deployed thousands of time per day, if you are adopted the DevOps culture in your company. DevOps practices allow your company to be ensure the application is passed the standards, policies.<br>The standards are code review, automated testing, no human interaction in deployment process. The policies are composed code coverage tests and accepted pull requests. It's the DevOps practices.</p><ul><li>GitOps is used to automate the process of provisioning infrastructure.</li><li>The developer teams store their source code in version control system (vcs). If you want to adopt the gitops in your company, the ops team should store their configuration files, declarative definitions (infra as code) in vcs.</li><li>GitOps allow your ops team to be ensure you will have the infrastructure as defined state at the end of the pipeline. The dev team already compiles and gets the binary file at the end of the pipeline. The output of the pipeline is named as artifact.</li><li>GitOps offers the versioning. The dev team tags their commits then have the versioned artifacts. Thanks to git, your ops team also will have versioned infrastructure.</li><li>If you have versioning that means also you have rollback scenario. When you revert the commit in vcs, the pipeline triggers and deploy the previous version to environment.</li><li>GitOps allows greater collaboration. The dev or ops team create the pull requests to deploy their changes to production environment. The team leaders review the code and configurations. If team leader was missed the errors, the pipeline test could reveal them.</li><li>GitOps = Infra-as-Code + Pull Requests + CI/CD</li></ul><p>We examined what is GitOps and advantages of it.  it requires an adoption process like DevOps. Because the team members are going to attempt to edit something in production environments manually. Don't be surprised :) GitOps will work better when you have less "cowboy engineering".<br>That's it another reason to have fast pipelines. Prevent the cowboy engineering. When you have fast pipelines, the team members would choose to alter the infrastructure or application in proper way. 
The proper way is GitOps :)</p>]]></content:encoded></item><item><title><![CDATA[VPC Endpoint and Peering]]></title><description><![CDATA[<h3 id="vpc-endpoints">VPC Endpoints</h3><ul><li>VPC endpoint provides access to public AWS services for resources that don't have public IP or where NAT gateway isn't deployed.</li><li>There are gateway (s3, dynamodb) and interface (most aws services) endpoints.</li></ul><p><strong>Gateway Endpoints</strong></p><ul><li>Uses routing.</li><li>They are present in route table via prefix lists which are represent</li></ul>]]></description><link>https://shoin.cloudantler.com/vpc-endpoint-and-peering/</link><guid isPermaLink="false">63a9f36c7a30a8000171c91f</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[VPC]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:18:36 GMT</pubDate><content:encoded><![CDATA[<h3 id="vpc-endpoints">VPC Endpoints</h3><ul><li>VPC endpoint provides access to public AWS services for resources that don't have public IP or where NAT gateway isn't deployed.</li><li>There are gateway (s3, dynamodb) and interface (most aws services) endpoints.</li></ul><p><strong>Gateway Endpoints</strong></p><ul><li>Uses routing.</li><li>They are present in route table via prefix lists which are represent the CIDR of service. Prefix lists are updated by AWS.</li><li>They can have associated policies which defines who can access.</li><li>They are highly available for all AZs in a Region.</li></ul><p><strong>Interface Endpoints</strong></p><ul><li>Provisions a networking object, ENI.</li><li>The provisioned ENI is your interface endpoint to connect the service.</li><li>That ENI has got ip/dns pair, Security group. NACL works in subnet level.</li><li>You can provision that ENI in multiple subnets, one per AZ. The applications will use DNS resolve. By doing that endpoints will be highly available.</li><li>Enable the Private DNS to override public default name of service. Also, you can give a different names for AWS services, because you will have private route 53 zone.</li></ul><h3 id="vpc-peering-layer-3">VPC Peering - Layer 3</h3><ul><li>VPC peering is a way to link or connect two VPCs together without using any additional non AWS services.</li><li>You can connect the services via private IP while VPC peering span AWS accounts, regions with limitations.</li><li>Data is encrypted and transits via the Global backbone with lower latency.</li><li>It's scalable and highly performant way.</li><li>Use Case:<br>- Sharing database with other VPC and access to Database.<br>- Security auditors can be connect your VPC and performs tests.<br>- Vendor provided service, it should be a web API<br>- Splitted application for blast radius</li><li>Peering connection is a gateway like NAT and IGW.</li><li>VPC overlap is the limitation for that. There is requester and accepter.</li><li>Adjust the route tables in both VPC side. Remote CIDR</li><li>NACLs and SGs can be used to control access because you will have an ENI in your VPC. If VPCs are in same region, you can reference SG id.</li><li>IPv6 support is available for cross-region</li><li>DNS Resolution to private IPs can be enabled. It's a setting needed to adjust both sides. It prevents the traffic leaves AWS.</li><li>Transitive Routing isn't supported. Let's say A-B and B-C are peered. It doesn't mean you can reach to C from A. 
You must create another peering between A-C</li></ul>]]></content:encoded></item><item><title><![CDATA[AWS Networking - VPC Summary]]></title><description><![CDATA[<p>VPC stands for Virtual Private Cloud. VPC lets you build up your own private network in AWS Cloud. You can define an isolated networks by building up VPC. The private network is the beginning point of robust, secure, reliable and fast infrastructure.</p><blockquote>Isolated Network Blast Radius<br>Let's say something happens</blockquote>]]></description><link>https://shoin.cloudantler.com/aws-networking-vpc-summary/</link><guid isPermaLink="false">63a9f2f97a30a8000171c911</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[VPC]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:16:51 GMT</pubDate><content:encoded><![CDATA[<p>VPC stands for Virtual Private Cloud. VPC lets you build up your own private network in AWS Cloud. You can define an isolated networks by building up VPC. The private network is the beginning point of robust, secure, reliable and fast infrastructure.</p><blockquote>Isolated Network Blast Radius<br>Let's say something happens in VPC-A such as malware infection or an attack. When you isolate your network, the other VPCs won't be affected from issue in VPC-A. It will decrease you blast radius.</blockquote><h3 id="vpc-and-subnets">VPC and Subnets</h3><p>If you have a fresh AWS account, the default VPC will be created for you. The properties of default VPC like following;</p><ul><li>172.31.0.0/16 and /20 subnet in each AZ with public IP enabled.</li><li>Internet Gateway with a configured main route table and DHCP option set</li><li>NACL: All allow in/out , Security Group: All from itself (ingress) , All egress</li></ul><p>VPC has some limitations such as max subnet range could be /16 and min /28 for ipv4. but ipv6 max is /56.</p><p>The VPC is software defined network, but these definitions would effect the hardware such as router, gateway, firewalls. AWS offers two way of tenancy to consume network devices.<br>Default: Shared network. The underlying devices are shared with other tenancies. It could be changed after creation.<br>Dedicated: Locks this VPC to dedicated hardware and can't be changed later.</p><p>In addition to above limitations; you must to be ensure that ip overlapping doesn't exist with other accounts and partners, while designing corporate network.</p><h3 id="vpc-routes">VPC Routes</h3><p>To deliver packets from a subnet to other or internet, we need routing.</p><ul><li>Internet gateway has public IP and performs Static NAT. It translates the private ip addresses with public IP.</li><li>Route table manages the VPC router. It can route propagation over BGP when you have DirectConnect or VPN connection.</li><li>Route table is associated with subnets. Specific IP in route table has high priority.</li></ul><h3 id="vpc-security">VPC Security</h3><p>AWS offers different services which they can work in different network levels.</p><p><strong>NACL - Layer 4</strong></p><ul><li>Controls data traffic across subnets. It is consisted list of rules.</li><li>Impacts only traffic crossing boundary of subnet.</li><li>It contains explicitly deny or allow rules. Protocol, IP Range, Port for source and destination.</li><li>Rules are processed in number order. Lowest first. When a match found, process stops.</li><li>* rule is default. 
Last processed and implicit deny.</li><li>NACLs are stateless, you must add your rule ingress and egress appropriately for ephemeral ports.</li></ul><p><strong>Security Groups - Layer 5</strong></p><ul><li>Think like firewall rules. Thanks to Layer 5 capabilities, it stores the session. That means you have stateful firewall.</li><li>Security groups can be associated with AWS resources such as EC2 instance, EFS mount point. You can associate with each resource that has ENI</li><li>SG can not deny traffic explicitly. Insert allowed sources and protocols. If you want to explicitly deny, use NACL.</li></ul><h3 id="nat-gateway">NAT Gateway</h3><p>If instances or resources inside a VPC don't need incoming internet access, don't give them IPv4 public ip. IPv4 pool is out of space. Use NAT Gateway.</p><p>NAT Gateway is used to access public world from private network with a static IP address. It manages the egress traffic. It's suit for your private subnet's resources.<br>NAT Gateway run in subnet. Subnet is present in AZ. You may provision NAT Gateways for each subnet. Create route tables for each subnet that has been configured to consume NAT Gateways. By doing that your system will be resilient and highly available.</p><h3 id="cross-vpc-access">Cross VPC Access</h3><p>VPC offers the ability of cross access or communication between your VPCs or 3rd part VPCs. I mentioned about the requirement the isolation of VPCs.</p><p>If you want to consume 3rd party service which is developed in AWS without public internet access, the only thing that you need is AWS PrivateLink. There will be another post.</p>]]></content:encoded></item><item><title><![CDATA[Summary of Network Fundamentals]]></title><description><![CDATA[<h3 id="what-is-a-network">What is a Network?</h3><p>A network consists of two or more computers that are linked in order to share resources (such as printers and CDs), exchange files, or allow electronic communications. The computers on a network may be linked through cables, telephone lines, radio waves, satellites, or infrared light beams.</p>]]></description><link>https://shoin.cloudantler.com/summary-of-network-fundamentals/</link><guid isPermaLink="false">63a9f1a17a30a8000171c901</guid><category><![CDATA[Cloud]]></category><category><![CDATA[Network]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:15:11 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/world-blue-1.jpeg" medium="image"/><content:encoded><![CDATA[<h3 id="what-is-a-network">What is a Network?</h3><img src="https://shoin.cloudantler.com/content/images/2022/12/world-blue-1.jpeg" alt="Summary of Network Fundamentals"><p>A network consists of two or more computers that are linked in order to share resources (such as printers and CDs), exchange files, or allow electronic communications. The computers on a network may be linked through cables, telephone lines, radio waves, satellites, or infrared light beams. <a href="https://fcit.usf.edu/network/chap1/chap1.htm">[source]</a></p><p>After the definition of network, I guess you've already known, we can take a look to OSI networking stack and describing the network layers.</p><h2 id="osi-network-stack">OSI Network Stack</h2><p>It stands for <strong>Open Systems Interconnection.  
This model describes seven layers that computer systems use to communicate over a network.</p><ul><li>Layer 1 - Physical: optical, radio frequency and cable as the transfer medium<br>It defines how to transmit and receive wavelengths, ones/zeroes, voltages and radio frequencies.</li><li>Layer 2 - Data Link: MAC address, frames, named connections<br>The transfer medium is shared, so this layer decides who can talk and when, avoiding cross-talk. It allows for backoff and retransmission. Think of a traffic cop or traffic lights.</li><li>Layer 3 - Network: public IP address, packets, a single stream per node<br>Packets are encapsulated and de-encapsulated at each hop.</li><li>Layer 4 - Transport: TCP/UDP, segments<br>TCP is reliable; UDP is fast but unreliable. TCP uses segments to ensure data is received in the correct order, provides error checking, and uses ports to allow different streams on the same host.</li><li>Layer 5 - Session: the session concept, security groups, stateful firewalls<br>Initiating traffic and response traffic are part of the same connection.</li><li>Layer 6 - Presentation: data conversion, encryption and compression<br>Standards that L7 can use. For HTTPS, the TLS encryption happens here.</li><li>Layer 7 - Application: application data, body and HTTP headers<br>Your application or protocol data is held here.</li></ul><h3 id="ip-addressing-and-cidr">IP Addressing and CIDR</h3><p>An IP address consists of a network part and a host part. The subnet mask or prefix tells you where the split occurs.<br>This <a href="https://avinetworks.com/glossary/subnet-mask/">[page]</a> describes the subnet mask and the details of CIDR.</p><p>CIDR stands for classless inter-domain routing. It allows more efficient allocation and subnetting.<br>10.0.0.0/24: the first 24 bits are the network and the last 8 bits address the nodes in that network. The number after the slash is the prefix length, so 32 - 24 = 8 bits are usable by nodes.<br>.0 and .255 are reserved for the network and broadcast addresses, and the gateway usually takes the first usable address. That leaves 253 usable IP addresses for your nodes.</p><h3 id="subnetting">Subnetting</h3><p>Subnetting is the process of breaking a network down into smaller subnetworks.<br>You split a VPC into individual subnets, and each subnet lives inside one availability zone.<br>By implementing subnets you can spread your infrastructure across availability zones, which lets you build high availability into it. The numbers in the examples below can be double-checked with the sketch after the list.</p><ul><li><strong>10.0.0.0/16</strong><br>32 - 16 = 16 -&gt; 2^16 -&gt; Available IP addresses: 65536 - 2 for reserved IPs<br>First IP: 10.0.0.0 , Last IP: 10.0.255.255</li><li><strong>10.0.0.0/17</strong><br>32 - 17 = 15 -&gt; 2^15 -&gt; Available IP addresses: 32768 - 2 for reserved IPs<br>First IP: 10.0.0.0 , Last IP: 10.0.127.255</li><li><strong>10.0.128.0/17</strong><br>32 - 17 = 15 -&gt; 2^15 -&gt; Available IP addresses: 32768 - 2 for reserved IPs<br>First IP: 10.0.128.0 , Last IP: 10.0.255.255</li></ul>
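<p>As a quick sanity check on the CIDR math above, here is a minimal sketch using Python's standard <code>ipaddress</code> module (the 10.0.0.0/16 range is simply the example from the list):</p><pre><code>import ipaddress

net = ipaddress.ip_network("10.0.0.0/16")
print(net.num_addresses)                            # 65536
print(net.network_address, net.broadcast_address)   # 10.0.0.0 10.0.255.255

# Split the /16 into two /17 subnets, as in the examples above.
for subnet in net.subnets(new_prefix=17):
    print(subnet, subnet.network_address, subnet.broadcast_address, subnet.num_addresses)
# 10.0.0.0/17 10.0.0.0 10.0.127.255 32768
# 10.0.128.0/17 10.0.128.0 10.0.255.255 32768</code></pre>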
<h3 id="routing">Routing</h3><p>Routing is getting packets from your location to a destination in another location and network. IP routing happens at three scales: LAN, MAN and WAN.</p><p><strong>LAN</strong></p><ul><li>Local; same network/subnet.</li><li>An ARP request is used to get the MAC address, because you don't strictly need the IP addresses of other devices to communicate within the same subnet; however, applications (and we) generally prefer IP or DNS.<br>ARP makes a broadcast: who has this IP? Please tell me its MAC address.</li><li>Send frames to the target using its MAC address.</li><li>This is Layer 2; no router is needed.</li><li>Peer-to-peer communication.</li></ul><p><strong>MAN</strong></p><ul><li>Known locations; communication between two subnets.</li><li>Is the target local? The host answers that question using the subnet mask and its own IP address. If the target is not local, it follows the steps below :)</li><li>The packet must be forwarded, and there is one option: the default gateway.</li><li>Find the MAC address of the gateway. The gateway is a router, usually on the first IP of the network.</li><li>Send the packet to the router/gateway.</li><li>The router tries to find the target node.</li><li>Find the MAC address of the next hop.</li><li>Deliver the packet.</li></ul><p><strong>WAN</strong></p><ul><li>Unknown locations; the internet.</li><li>There are extra steps compared to MAN.</li><li>In the MAN case we knew the target because it was in the other subnet (remember, we only had two subnets). In a WAN there are lots of networks.</li><li>We use the BGP protocol to find the location of the target on the internet backbone.</li><li>The backbone router works out how to reach the target and calculates the best path to deliver the packets.</li></ul><h3 id="firewalls">Firewalls</h3><p>Firewalls are the barrier of a network: security devices that analyze incoming and outgoing traffic. They are rule based; traffic is matched against the rules and a decision is made to allow or deny it.</p><p>Firewalls are classified by the network layers they can operate on: Layer 3, 4, 5 and 7 firewalls.</p><h3 id="proxy-server">Proxy Server</h3><p>A proxy server is another type of gateway that sits between the public and private networks.<br>The client makes a connection towards the public internet, and this request goes to the proxy server. The proxy server makes the request to the destination and delivers the response back to the client in the private network.</p><ul><li>A proxy server needs application support; it is configured in the OS, the browser or the app, as the sketch after this list illustrates.</li><li>Caching: when clients connect to the same destination, the proxy caches common large files and images and serves them from its cache instead of re-requesting them from the remote server. This uses bandwidth efficiently.</li><li>Filtering: clients access the remote side indirectly, so the proxy can filter out content, for example for child-safety reasons.</li><li>A proxy server can also perform authentication or validation, for example checking that the client has a valid corporate ID.</li></ul>
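<p>To make the "application support" point concrete, here is a minimal sketch with Python's standard <code>urllib</code>; the proxy address is a hypothetical host, for example a proxy running on an EC2 instance in your VPC:</p><pre><code>import urllib.request

# Hypothetical proxy endpoint; replace with your own proxy host and port.
proxy = urllib.request.ProxyHandler({
    "http": "http://10.0.1.10:3128",
    "https": "http://10.0.1.10:3128",
})
opener = urllib.request.build_opener(proxy)

# Every request made through this opener goes via the proxy,
# which can cache, filter or authenticate it.
with opener.open("http://example.com/") as resp:
    print(resp.status, len(resp.read()))</code></pre>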
<p>Inside AWS we have a lot of filtering products, such as NACLs and security groups, but they filter on network-level factors; nothing above Layer 5 is inspected.<br>If you want to filter based on profile, age or department by using a corporate ID, you should install a proxy server on EC2.</p>]]></content:encoded></item><item><title><![CDATA[Datastore Delete Entities in Bulk with Dataflow]]></title><description><![CDATA[<p>I've written this post because I would like to draw your attention to some important points about deleting Datastore entities in bulk through the Dataflow service. I think the documentation page of the Dataflow template for deleting entities isn't very helpful. It also isn't up to date.</p><h3 id="job-common-properties">Job Common Properties</h3><ul><li>First</li></ul>]]></description><link>https://shoin.cloudantler.com/datastore-delete-entities-in-bulk-with-dataflow/</link><guid isPermaLink="false">63a9f14f7a30a8000171c8f3</guid><category><![CDATA[Cloud]]></category><category><![CDATA[GCloud]]></category><category><![CDATA[DataFlow]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:09:47 GMT</pubDate><content:encoded><![CDATA[<p>I've written this post because I would like to draw your attention to some important points about deleting Datastore entities in bulk through the Dataflow service. I think the documentation page of the Dataflow template for deleting entities isn't very helpful. It also isn't up to date.</p><h3 id="job-common-properties">Job Common Properties</h3><ul><li>First of all, visit the Dataflow service page and click the "Create job from template" button. The template name is <code>Bulk Delete Entities in Datastore</code>.</li><li>Give a name to your job.</li><li>Select a regional endpoint; the job metadata is stored there. Pick the same region as your Datastore.</li></ul><h3 id="required-parameters">Required Parameters</h3><ul><li>Type your GQL query, simply <code>SELECT * FROM [kind_name]</code>.</li><li>Type your read_project_id.</li><li>Type your delete_project_id. At this point you might ask why there are two project IDs: Dataflow reads the entities from read_project_id and then deletes them from delete_project_id.</li><li>A temporary location is required to store metadata and various logs. Type a bucket path, e.g. <code>gs://temp-dataflow-delete/path/</code>.</li></ul><p>In theory you shouldn't need to define any additional parameters, since you defined the required ones, right? But there is more.</p><ul><li>For example, we haven't defined the namespace; it will delete, but from where?<br>Click <code>Show Optional Parameters</code> to define it.</li><li>Type your Datastore namespace into the first field.</li><li>The UDF GCS path isn't required at this point; leave it blank.</li><li>Same for the UDF function name; leave it blank.</li><li>Max workers is the most important point, although GCP doesn't pay it much attention. As you can imagine, it limits the worker count.<br>Dataflow provisions VM instances in your project to perform the read and delete operations, and this parameter sets the maximum number of workers in the pool. Specify it deliberately.</li><li>Number of workers defines the initial number of workers, for example 1. The pool scales the instances up to Max workers.</li><li>Select the worker region and zone. Choosing the same region as your Datastore is a good idea, so you don't pay inter-zone or inter-region data transfer prices.</li><li>If you would like to associate a service account with the workers, type its email address.</li><li>Machine type is another important point, although as usual it doesn't get enough attention.</li><li>Additional experiments aren't required; leave the field blank.</li><li>Worker IP address configuration is important. Unless you have a special case, I suggest choosing private. If you select public, you will be charged, because GCP assumes it is serving a customer application or a third party. If you enabled <code>Private Google Access</code> in your VPC, you will have a secure and high-performance connection.
And you won't be charged either. Yes, again, this didn't get much attention :) I wonder why? :)</li><li>You can specify the VPC network.</li><li>And also the subnetwork.</li></ul><p>That's it. You can run your job safely from a price perspective. I also suggest using the burstable instance types f1-micro and g1-small if you don't have performance concerns, because they are cheaper :)</p>]]></content:encoded></item><item><title><![CDATA[DNS Failover]]></title><description><![CDATA[<p>DNS failover helps application or network services remain accessible in the event of an outage. It updates the DNS records according to the services' availability.<br>It gives you high availability.</p><h3 id="how-works">How does it work?</h3><ul><li>It checks the availability of services regularly by using health checks.</li><li>The health checks look like the regular</li></ul>]]></description><link>https://shoin.cloudantler.com/dns-failover/</link><guid isPermaLink="false">63a9f1027a30a8000171c8e4</guid><category><![CDATA[Network]]></category><category><![CDATA[DNS]]></category><category><![CDATA[HowItWorks]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:08:25 GMT</pubDate><content:encoded><![CDATA[<p>DNS failover helps application or network services remain accessible in the event of an outage. It updates the DNS records according to the services' availability.<br>It gives you high availability.</p><h3 id="how-works">How does it work?</h3><ul><li>It checks the availability of services regularly by using health checks.</li><li>The health checks look like the regular requests coming from clients.</li><li>In addition, you can specify the sources of the health checks. If your customers are based in Europe, you can run the health checks from European cities.</li><li>When it detects an outage, it updates the DNS records to point to another working service, a predefined IP or host.</li><li>Simply put, you define a primary/secondary system; see the sketch after this list.</li></ul>
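<p>As one concrete implementation, AWS Route 53 supports failover routing. The following sketch (assuming boto3, a placeholder hosted zone, placeholder IPs and an existing health check) upserts a primary/secondary record pair for a hypothetical <code>app.example.com</code>:</p><pre><code>import boto3

route53 = boto3.client("route53")

HOSTED_ZONE_ID = "Z0123456789EXAMPLE"                      # placeholder
HEALTH_CHECK_ID = "11111111-2222-3333-4444-555555555555"   # placeholder, created beforehand

def failover_change(set_id, role, ip, health_check_id=None):
    """Build an UPSERT for one half of the primary/secondary pair."""
    record = {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,                      # "PRIMARY" or "SECONDARY"
        "TTL": 60,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}

route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={"Changes": [
        failover_change("primary", "PRIMARY", "203.0.113.10", HEALTH_CHECK_ID),
        failover_change("secondary", "SECONDARY", "203.0.113.20"),
    ]},
)</code></pre><p>When the health check attached to the primary record fails, Route 53 starts answering queries with the secondary value, which is exactly the primary/secondary behaviour described above.</p>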
<h3 id="advantages">Advantages</h3><ul><li>It forwards traffic to the system that is still running.</li><li>Thanks to DNS failover, your customers face no issue, or only a minimal one, while connecting to your services, so you don't lose customers.</li><li>You can meet the SLA as much as possible. And the business is happy :)</li></ul><h3 id="use-cases">Use Cases</h3><ul><li>The master servers of SQL database management systems.</li><li>Web services for a given location.</li><li>Network services like DNS, SMTP and LDAP.</li></ul>]]></content:encoded></item><item><title><![CDATA[AWS Storage Services - EBS]]></title><description><![CDATA[<h3 id="what-is-that">What is that?</h3><p>EBS - Elastic Block Store is a service that provides <strong>persistent</strong> block storage.<br>EBS volumes are presented over the network and used by EC2 instances.<br><br>As always, every service in AWS can run separately from the others.<br>So EBS volumes can be used with EC2 instances or</p>]]></description><link>https://shoin.cloudantler.com/aws-storage-services-ebs/</link><guid isPermaLink="false">63a9ef377a30a8000171c8c7</guid><category><![CDATA[Cloud]]></category><category><![CDATA[AWS]]></category><category><![CDATA[EBS]]></category><category><![CDATA[EN]]></category><dc:creator><![CDATA[Abdullah Caliskan]]></dc:creator><pubDate>Mon, 26 Dec 2022 19:04:22 GMT</pubDate><media:content url="https://shoin.cloudantler.com/content/images/2022/12/geo-orange.jpeg" medium="image"/><content:encoded><![CDATA[<h3 id="what-is-that">What is that?</h3><img src="https://shoin.cloudantler.com/content/images/2022/12/geo-orange.jpeg" alt="AWS Storage Services - EBS"><p>EBS - Elastic Block Store is a service that provides <strong>persistent</strong> block storage.<br>EBS volumes are presented over the network and used by EC2 instances.<br><br>As always, every service in AWS can run separately from the others.<br>So EBS volumes can be used with EC2 instances, or you can keep them after an EC2 instance is terminated. That means volumes are persistent and can be attached to and detached from EC2 instances.</p><h3 id="durability-concern">Durability Concern</h3><ul><li>EBS volumes are replicated across multiple servers managed by AWS in a single AZ to prevent data loss.</li><li>In addition to that, you can increase the durability of a volume by taking snapshots. The snapshots are stored in S3 buckets within the region, which allows us to keep the data for the <strong>long term</strong>.</li><li>You can create <strong>point-in-time</strong> and <strong>incremental</strong> snapshots from volumes.<br>Incremental snapshots are the quickest method. They are <strong>storage friendly</strong> because they don't need to store unchanged data again and again.</li><li>AWS incremental snapshots are more powerful than <strong>traditional snapshots</strong>: they don't need the whole chain of incremental snapshots to recreate a volume. With a traditional chain, if one snapshot is missing you lose the data in that snapshot. A minimal snapshot-and-restore flow is sketched after this list.</li></ul>
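<p>A minimal sketch of the snapshot-and-restore flow, assuming boto3, a placeholder volume ID and an example region/AZ pair; the waiter simply blocks until the snapshot is complete:</p><pre><code>import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Take a point-in-time snapshot of an existing volume (placeholder ID).
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="pre-maintenance snapshot",
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# Restore the snapshot as a new volume in another AZ of the same region,
# e.g. for disaster recovery or to launch an identical instance.
vol = ec2.create_volume(
    SnapshotId=snap["SnapshotId"],
    AvailabilityZone="eu-west-1b",
    VolumeType="gp3",
)
print(vol["VolumeId"])</code></pre>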
<h3 id="what-is-the-ability-of">What can it do?</h3><ul><li>You can format an EBS volume with a filesystem, and you can mount multiple EBS volumes to an instance.<br>But note that an <strong>EBS volume can be mounted to only one instance.</strong><br>If you want to mount a single filesystem to multiple instances, you should use the <strong>EFS - Elastic File System</strong> service.</li><li>EBS also lets you create new volumes from snapshots. That lets us create the volume in another AZ for <strong>disaster recovery</strong> purposes, or to launch an identical instance.</li></ul><blockquote><strong>The Important Note:</strong> While creating a new EBS volume from a snapshot, all of the content needs to be copied from S3 to the EBS volume in the target AZ.<br>The copy keeps running in the background even though the volume is shown as created. Be aware of this in production.<br>Make sure all the content is present on the EBS volume to avoid performance issues. To speed up the process, try to access the mount points in the filesystem; by doing that, EBS prioritizes fetching that data from S3.</blockquote><h3 id="ebs-volume-types">EBS Volume Types</h3><p>The size of an EBS volume can range from 1 GiB to 16 TiB, depending on the volume type. Each volume type has a dominant performance attribute.<br>The dominant performance attribute of <strong>SSD is IOPS</strong> and of <strong>HDD is throughput.</strong></p><blockquote>Storage Performance Measurement<br><strong>IOPS</strong>: number of input/output operations per second<br><strong>Throughput</strong>: data rate, expressed in megabytes per second<br><br>[Block size of operation] multiplied by [number of operations]<br>256 KB * 400 = ~100 MB per second throughput</blockquote><p>You should select the appropriate volume type according to your workload.<br>Does your workload demand IOPS or throughput?</p><h3 id="gp2-general-purpose">gp2 - General Purpose</h3><ul><li>It offers general purpose SSD volumes and is the default type. There is also a newer generation, gp3; details below.</li><li>Recommended for most workloads.</li><li>1 GiB - 16 TiB size.</li><li>It has a balance of IOPS and throughput.</li><li>Generally fast in terms of throughput: max throughput 250 MiB/s.<br>Exceptionally fast in terms of the number of IOPS: range 100 - 16,000 IOPS.</li></ul><p>There is an important point with gp2: it gains IOPS per gigabyte, so performance is linked to size. You get <strong>3 IOPS per GiB</strong>.</p><blockquote>If you have a 100 GiB volume, then you have 300 IOPS.<br>But it can burst up to 3000 IOPS.<br><br>500 GiB * 3 = 1500 IOPS baseline. Burstable up to 3000 IOPS<br>1024 GiB * 3 = 3072 IOPS baseline. Not burstable (already above 3000)</blockquote><p>While provisioning the volume, you should consider IOPS.<br>Smaller sizes can <strong>hit performance ceilings</strong>. Anytime you need to go <strong>above the baseline</strong>, you can burst up to 3000 IOPS. The small sketch below encodes these rules.</p>
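<p>A tiny sketch of the gp2 baseline and burst rules described above (these are my own helper functions for illustration, not an AWS API):</p><pre><code>def gp2_baseline_iops(size_gib):
    """3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000 IOPS."""
    return min(max(3 * size_gib, 100), 16_000)

def gp2_effective_peak_iops(size_gib):
    """Volumes below the 3000 IOPS baseline can burst up to 3000 IOPS."""
    return max(gp2_baseline_iops(size_gib), 3_000)

for size in (100, 500, 1024):
    print(size, gp2_baseline_iops(size), gp2_effective_peak_iops(size))
# 100  300  3000
# 500  1500 3000
# 1024 3072 3072</code></pre>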
<h3 id="io1-provisioned-iops">io1 - Provisioned IOPS</h3><ul><li>It offers the highest performance SSD volumes.</li><li>Suited for critical business applications.</li><li>4 GiB - 16 TiB size.</li><li>You can adjust size and IOPS separately.</li><li>Max: 64,000 IOPS / max: 1,000 MiB/s throughput.</li><li>EBS also has per-instance limits. You can have up to:<br>1,750 MiB/s throughput<br>80,000 IOPS<br>Do you find that limiting?<br>Then there is a solution: use <strong>instance store volumes</strong> :)</li></ul><h3 id="st1-throughput-optimized">st1 - Throughput Optimized</h3><ul><li>Offers low-cost HDD volumes.</li><li>500 GiB - 16 TiB size.</li><li>Throughput up to 500 MiB/s.</li><li>HDD volumes can't be boot volumes.</li><li>Suited for <strong>throughput intensive</strong> workloads such as <strong>streaming, big data, data warehouses and log processing</strong>.</li></ul><h3 id="sc1-cold-hdd">sc1 - Cold HDD</h3><ul><li>Offers the lowest cost HDD volumes.</li><li>Suited for infrequently accessed workloads that still require throughput.</li><li>500 GiB - 16 TiB size.</li><li>Throughput up to 250 MiB/s.</li><li>HDD volumes can't be boot volumes.</li></ul><h3 id="gp3-general-purpose">gp3 - General Purpose</h3><ul><li>The next generation general purpose SSD volume type, released in December 2020. gp3 volumes let you provision IOPS and size separately, like io1.</li><li>It offers a price up to 20% lower per GiB than gp2.</li><li>You can scale IOPS and throughput without needing to provision new volumes. Pay only for your usage.</li><li>Min IOPS: 3,000 - max IOPS: 16,000.</li><li>Min throughput: 125 MiB/s - max throughput: 1,000 MiB/s.</li><li>Suits transaction-intensive, low-latency workloads: database clusters and web applications.</li><li>You can easily migrate from gp2 to gp3 without interrupting the EC2 instances by using the Elastic Volumes feature.</li></ul>]]></content:encoded></item></channel></rss>