Google Workspace Setup
Setting up Google Workspace is different from setting up a personal Gmail account. Your domain is likely registered with a provider like GoDaddy or Namecheap, and email traffic must be routed to Google’s servers.
Example scenario:
- Domain abc.ai registered with GoDaddy
- Cloudflare as DNS provider (optional but recommended for easier setup)
- Goal: Create Google Workspace emails like pavan@abc.ai
Setup Steps
1. Configure DNS Provider (Optional - Cloudflare)
If using Cloudflare, update the nameservers in the GoDaddy domain settings to point to the pair of nameservers Cloudflare assigns to your account (for example, elias.ns.cloudflare.com and megan.ns.cloudflare.com).
Who's Watching ??
After reducing the RDS size, I was eager to see the monthly bill drop. Earlier, RDS was costing around $11/day, and now it should have cost $3/day. But after a few days of monitoring, it did not budge below $7/day. Splitting the cost by usage type showed that some RDS storage cost had increased. On inspecting, I found that this was due to the snapshots I had created before deleting the previous RDS instance. AWS provides free backup storage only up to the size of the provisioned RDS storage (200GB in my case). But the previous snapshots were of 2TB. Hence, they were beyond the free storage and were costing around $4/day. After I deleted these snapshots, the RDS bill came down to the expected $3/day.
Downscaling RDS
So, once we found the root cause of the storage filling up, the next step was to reduce the storage. After cleaning up the replication slot, the WAL size had come down drastically. Now, 3GB of storage was enough to hold the current databases. But we were still paying for 2TB. So, we had to downscale RDS to reduce the monthly bill.
Upscaling is easy in AWS, but downscaling is tedious. RDS storage is based on EBS volumes, which can only expand and don’t allow shrinking. Upscaling is a one-click job (increase storage), but to downscale you need to create another database and switch traffic to it. In the current setup, the Blue-Green deployment provided by AWS seemed the most convenient option (compared to DMS or restoring from storage).
Memory Full
On Sunday morning, I got a lot of messages that production was down. Users were not able to log in. Naturally, I went to Sentry to check whether any alerts had been raised, and there seemed to be a connection issue with the Postgres database. I went to the DB dashboard, and it flagged that the RDS storage was exhausted: all 1TB of provisioned storage had been used up. To resolve the problem immediately, I provisioned more storage.
Redis Valkey
Redis used to be open source under the BSD license. The Berkeley Software Distribution (BSD) license gives broad permission to use the software. It is like saying: do whatever you want with this code, just don't blame us if it doesn't work.
Cloud providers used Redis and made a lot of money from it, but nothing was shared back with Redis Ltd. Hence, Redis Ltd. moved to more restrictive licensing in March 2024. The source code is still available, but not for commercial hosting or service use; for that, you need their permission.
ECS Pipeline
If you have a Docker image of your service, you can deploy it scalably using ECS.
AWS handles all the orchestration for you. You provide a task definition, and once a deployment runs, ECS creates tasks (running containers) as per the task definition. An ECS service ensures that the desired number of tasks is always running.
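To make the "desired number of tasks" idea concrete, here is a minimal toy model of that reconciliation behaviour. It is only a sketch: `ToyService`, its fields, the image name, and the task IDs are hypothetical and are not part of the real ECS API.

```python
from dataclasses import dataclass, field
from itertools import count

_task_ids = count(1)  # hypothetical task ID generator for this toy model


@dataclass
class ToyService:
    """Toy stand-in for an ECS service; not the real ECS API."""
    image: str            # container image named in the task definition
    desired_count: int    # how many tasks should be running
    running: list = field(default_factory=list)

    def reconcile(self) -> None:
        """Start or stop tasks until len(running) == desired_count."""
        while len(self.running) < self.desired_count:
            self.running.append(f"task-{next(_task_ids)}")
        while len(self.running) > self.desired_count:
            self.running.pop()


service = ToyService(image="example/web:latest", desired_count=3)
service.reconcile()       # three tasks are started
service.running.pop(0)    # simulate one task crashing
service.reconcile()       # a replacement task is started
```

The real service scheduler does far more (placement, health checks, rolling deployments), but the core contract is the same: compare what is running with what is desired and close the gap.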
For standard services, container images are publicly available. But if you are using ECS, you will mostly be building your own Docker image to run a server or some other workload. How, then, does ECS know about your Docker image?
SSH vs HTTPS
Before we can SSH into a server, we need to (manually) provide our public key to the server. The server stores this key in the authorized_keys file under the ~/.ssh directory. Similarly, when we SSH to a server for the first time, our laptop asks whether we want to add the server to the known hosts.
If we select yes, an entry is added to the known_hosts file on the local system. This entry contains the server’s public key.
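The known_hosts check described above can be sketched as a small trust-on-first-use lookup. This is an illustration only: `check_host`, the in-memory dictionary, and the sample keys are hypothetical stand-ins for what the SSH client does with the real known_hosts file. The fingerprint helper mimics the SHA256 format that OpenSSH prints in the prompt.

```python
import base64
import hashlib


def fingerprint(public_key_blob: bytes) -> str:
    """SHA-256 fingerprint in the style OpenSSH prints (base64, no padding)."""
    digest = hashlib.sha256(public_key_blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


def check_host(known_hosts: dict, host: str, presented_key: bytes,
               accept_new: bool) -> str:
    """Return 'trusted', 'added', 'rejected', or 'mismatch' for a host key."""
    stored = known_hosts.get(host)
    if stored is None:
        # First connection: the user is asked whether to trust this key
        if accept_new:
            known_hosts[host] = presented_key
            return "added"
        return "rejected"
    # Later connections: the presented key must match the stored one
    return "trusted" if stored == presented_key else "mismatch"


known_hosts = {}  # stands in for ~/.ssh/known_hosts
first = check_host(known_hosts, "example.com", b"key-A", accept_new=True)
second = check_host(known_hosts, "example.com", b"key-A", accept_new=True)
third = check_host(known_hosts, "example.com", b"key-B", accept_new=True)
```

Here `first` is the "do you want to add this host" case, `second` is a normal later login, and `third` is the case where the server presents a different key than the one remembered, which is when the SSH client warns about a changed host key.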
VPN
We generally use a VPN (Virtual Private Network) to test whether a piece of content is accessible from another country. With a VPN, our servers are tricked into treating the request as if it came from another country. How does this happen?
At a simple level, when the VPN app is running on my device, it sends all my requests (app or browser requests) to a VPN server located in another country. The VPN server then forwards these requests to the destination server. Since the request appears to come from that country, the destination server responds. The response is then forwarded back to me. Thus, the VPN acts as a middleman, or proxy, for my requests.
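The forwarding step can be pictured with a toy model in which the destination only ever sees the address of whoever handed it the request. Everything here is hypothetical (the `Request` class, the function names, and the placeholder IP addresses), and real VPNs also encrypt the tunnel between the device and the VPN server, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Request:
    source_ip: str   # address the receiver sees as the sender
    path: str


def destination_server(request: Request) -> str:
    """Toy destination that reports which address it thinks sent the request."""
    return f"served {request.path} to {request.source_ip}"


def vpn_server(request: Request, vpn_ip: str) -> str:
    """Forward the request under the VPN server's own address, relay the reply."""
    forwarded = Request(source_ip=vpn_ip, path=request.path)
    return destination_server(forwarded)


# Placeholder addresses from the documentation ranges
client_request = Request(source_ip="192.0.2.10", path="/video")
direct = destination_server(client_request)
via_vpn = vpn_server(client_request, vpn_ip="203.0.113.5")
```

In the direct case the destination sees my own address; through the VPN it sees only the VPN server's address, which is why any country check is evaluated against the VPN server's location.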
Slack Integration
At scale, time is of the essence when resolving an issue, so identifying issues quickly matters. Collating data from multiple sources to debug an issue is not desirable in such cases. If your workspace is on Slack, getting alerts on Slack increases visibility and saves time in a crisis (when it matters). Slack can also be used to send alerts from background jobs such as Celery tasks, Jenkins deployments, …
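One common way to push such alerts is a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below uses only the standard library and assumes a webhook has already been created for the workspace. The URL shown is a placeholder, and the helper names `build_slack_alert` and `send_slack_alert` are made up for this example.

```python
import json
import urllib.request


def build_slack_alert(webhook_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_alert(webhook_url: str, message: str, timeout: float = 5.0) -> int:
    """Send the alert and return the HTTP status code."""
    request = build_slack_alert(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder URL; the real one comes from the Slack app's webhook settings
request = build_slack_alert(
    "https://hooks.slack.com/services/PLACEHOLDER", "Celery task failed"
)
```

A Celery failure handler or a Jenkins post-build step could call `send_slack_alert` with the task name and error summary. Keeping the request building separate from the sending makes the payload easy to check without touching the network.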
Gunicorn
During development, we run a Django application using python manage.py runserver. But that is not recommended for production use. Why so?
First, in development mode, a lot of details are returned that expose the internals of the system, which is a security risk in production. Second, the application doesn’t auto-restart if it crashes with an error. Memory leaks in the application accumulate over time, eventually crashing the server. Load can’t be balanced across CPU cores. Static files are served synchronously, which results in queuing of requests. The development server is not designed to handle many concurrent requests efficiently, which increases overall response time. There is no connection pooling for database connections. Overall, it is not a reliable setup for production.
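As a contrast, a minimal Gunicorn configuration might look like the sketch below. It is a starting point under assumptions, not a tuned setup: the project module `myproject.wsgi` is hypothetical, and the numbers are illustrative defaults to adjust for your workload.

```python
# gunicorn.conf.py (hypothetical file for a project named "myproject")
import multiprocessing

# WSGI entry point; "myproject" is an assumed Django project name
wsgi_app = "myproject.wsgi:application"

# Address and port the server listens on
bind = "0.0.0.0:8000"

# Several worker processes so load is spread across CPU cores;
# (2 x cores) + 1 is the rule of thumb suggested in the Gunicorn docs
workers = multiprocessing.cpu_count() * 2 + 1

# Recycle each worker after a number of requests to contain memory leaks;
# the jitter keeps workers from all restarting at the same moment
max_requests = 1000
max_requests_jitter = 100

# Workers silent for longer than this many seconds are killed and restarted
timeout = 30
```

With this file in place, the server can be started with `gunicorn -c gunicorn.conf.py`. The master process replaces workers that die, which covers the auto-restart, memory-leak, and multi-core points above. Static files and database connection pooling still need to be handled separately.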