The Swiss Army Knife
FFmpeg is often called the Swiss Army knife of video and audio. Whatever you want to do with media (compressing, scaling, editing, encoding, decoding, transcoding, streaming, …), FFmpeg can most likely do it. If it’s that special, let’s start using it. You can install FFmpeg on your system.
Let’s download some video, video_1.mp4. I want to get details about this video. I can do so using:
ffprobe -v quiet -print_format json -show_format -show_streams video_1.mp4
Here we used ffprobe (which usually comes with ffmpeg).
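If you would rather read these details from a script than from the terminal, the same command can be wrapped in Python. This is only a minimal sketch: the function names are made up for illustration, it assumes ffprobe is on your PATH, and it picks out just a few fields (format.duration, codec_type, codec_name) that ffprobe normally includes in its JSON output.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build the same ffprobe invocation as above for a given file."""
    return [
        "ffprobe", "-v", "quiet",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def summarize(probe_json):
    """Pick a few fields out of ffprobe's JSON output (given as a string)."""
    data = json.loads(probe_json)
    fmt = data.get("format", {})
    streams = data.get("streams", [])
    return {
        # ffprobe reports the duration as a string of seconds
        "duration_sec": float(fmt["duration"]) if "duration" in fmt else None,
        "streams": [(s.get("codec_type"), s.get("codec_name")) for s in streams],
    }


def probe(path):
    """Run ffprobe on a file and summarize the result (needs ffprobe installed)."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize(result.stdout)
```

Calling probe("video_1.mp4") would then return a small dictionary with the duration and one (type, codec) pair per stream.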
DRM
Production companies lose billions of dollars every year to piracy. Someone can download your content and distribute it on a free channel, and you lose the revenue those users might have paid.
How do you solve this problem? Or can we even solve it? If someone plays a movie on a TV and records it again with a high-quality camera (thus recreating it), you can’t block that. This is called the analog hole. So, you can’t protect against piracy 100%.
Vector Graphics
If you zoom in on any image on your phone, it starts breaking up after 2x - 3x zoom, and you can see square pixel blocks. But what happens when you zoom in on Google Maps? No matter how much you zoom (until you hit the last level), it doesn’t seem to break. I observed a similar thing with Figma.
What is the magic here? Why doesn’t it break? Let’s try to reverse engineer the problem: how would I build such Google Maps-like functionality myself?
Video Scrubbing
While watching a YouTube video, if you hover over the seek bar you can see a preview image of that position. You can see images even at the 10 min mark when only 30 sec of video has buffered. So, if the video has not loaded up to 10 min, how is the video player able to display an image at that position? Isn’t that interesting?
This happens with a feature called video scrubbing. We need to process the uploaded video to support it. When a video is uploaded (say of 10 min), we can take snapshots at regular intervals (say every 3 sec). So, for a 10 min video there will be 10*60/3 = 200 snapshots. We can generate these snapshots using ffmpeg.
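The arithmetic above, plus one possible way to ask ffmpeg for the snapshots, can be sketched in Python. Treat it as a sketch under assumptions: the function names and the thumb_%04d.jpg naming pattern are made up for illustration, and the fps=1/3 video filter is just one common way to pull a frame every 3 seconds, not necessarily what any particular video site does. The exact number of files ffmpeg writes can differ slightly from the computed count.

```python
def snapshot_timestamps(duration_sec, interval_sec):
    """Timestamps (in seconds) at which a snapshot is taken, starting at 0."""
    return list(range(0, duration_sec, interval_sec))


def thumbnail_index(position_sec, interval_sec, count):
    """Map a seek-bar position to the snapshot that covers it."""
    index = int(position_sec // interval_sec)
    # Clamp so that hovering at the very end still shows the last snapshot.
    return min(index, count - 1)


def ffmpeg_snapshot_command(video_path, interval_sec, out_pattern="thumb_%04d.jpg"):
    """Build an ffmpeg command that writes one frame every interval_sec seconds.

    out_pattern is a hypothetical naming scheme chosen for this example.
    """
    return [
        "ffmpeg", "-i", video_path,
        "-vf", f"fps=1/{interval_sec}",
        out_pattern,
    ]


timestamps = snapshot_timestamps(10 * 60, 3)
print(len(timestamps))  # 200 snapshots for a 10 min video
```

Because the snapshots are small images generated ahead of time, the player can fetch them independently of how much of the video itself has buffered.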
HAProxy
When you go to the HAProxy site, the lowercase fonts and the bare-minimum UI make you wonder: do they even maintain this site?
Are HAProxy’s days done? But if you make the effort to read along, you stumble on this line: “Most users report having never ever faced any single crash and claim that HAProxy is the most solid part of their infrastructure. Finding machines with HAProxy processes being up for more than 3 years is not exceptional at all!” Phew!! Flexing of the highest caliber.
Caddy
Just as a golf caddie makes the game easier for golfers by carrying their equipment and offering guidance, the Caddy web server aims to make web hosting easier by handling complicated tasks (like SSL/TLS certificates) automatically. With Caddy, you don’t need to worry about your certificates expiring. It started back when Let’s Encrypt was just getting off the ground and enabling HTTPS on your website was both a headache and costly.
Nginx
Nginx can be used for multiple purposes like web serving, load balancing or reverse proxying. It was designed to solve the C10k problem, so it can handle a large number of connections simultaneously. Besides, it also excels at serving static content. But how does it achieve this?
Nginx follows an asynchronous, event-driven architecture. There is a master process that manages all worker processes. Typically there are as many workers as there are CPU cores. Each worker has an event loop running inside it. The worker registers interest in events on its sockets and moves ahead instead of waiting. When something happens on a socket, the OS notifies the worker. For example, when data is received from a client, a read event is triggered, which is then processed by the worker. Thus, the worker continues with its events without being blocked.
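The register-then-get-notified pattern can be tried out with Python’s standard selectors module. This is not how Nginx is implemented (Nginx is written in C and runs one such loop per worker process); it is a toy, single-worker sketch in which socket pairs stand in for client connections and the handler name echo_upper is made up for the example.

```python
import selectors
import socket


def echo_upper(conn):
    """Read handler: only called once conn has data, so recv does not block."""
    data = conn.recv(1024)
    conn.sendall(data.upper())
    return data


def run_event_loop(sel, max_events, timeout=1.0):
    """Handle ready events until max_events have been processed."""
    handled = []
    while len(handled) < max_events:
        # Ask the OS which registered sockets are ready.
        ready = sel.select(timeout=timeout)
        if not ready:
            break  # nothing happened within the timeout; stop the demo loop
        for key, _mask in ready:
            handler = key.data
            handled.append(handler(key.fileobj))
    return handled


def demo():
    sel = selectors.DefaultSelector()
    # Each socketpair plays the role of one client connection.
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    for server_side in (server_a, server_b):
        server_side.setblocking(False)
        # Register interest in read events, then move on.
        sel.register(server_side, selectors.EVENT_READ, data=echo_upper)
    client_b.sendall(b"second")
    client_a.sendall(b"first")
    handled = run_event_loop(sel, max_events=2)
    replies = (client_a.recv(1024), client_b.recv(1024))
    sel.close()
    for sock in (client_a, server_a, client_b, server_b):
        sock.close()
    return sorted(handled), replies
```

One loop serves both connections, and it never sits waiting on a socket that has nothing to say, which is the property that lets a single worker juggle many clients.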
Iron Beneath the Rust
After working with Python for 6+ years, learning Rust took me back to my C/C++ days in college: compile the code before running it. With that came the nostalgic fear of pointers. Rust put all of those fears to rest.
There are two types of memory: stack and heap. As the name suggests, things in stack memory are put one above the other and popped in reverse order. Since you always know where you are putting things and where you are removing them from, stack memory is extremely fast.
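The push and pop order is easy to see in a few lines of Python. This only models the last-in, first-out behaviour with a list; it says nothing about how Rust or the OS actually lays out stack memory, and the Stack class is a made-up helper for the illustration.

```python
class Stack:
    """A tiny last-in, first-out container backed by a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Always add at the top.
        self._items.append(item)

    def pop(self):
        # Always remove from the top.
        return self._items.pop()


def demo():
    stack = Stack()
    for name in ("a", "b", "c"):
        stack.push(name)
    # Items come back in the reverse of the order they went in.
    return [stack.pop() for _ in range(3)]
```

Both operations touch only the top of the stack, so there is never any searching for a free slot.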
Rustic
I have been hearing a lot about Rust lately: how it is fast and achieves near C/C++ performance. So, I was thinking of use cases where I could use it.
I have been using Python (at work) for many years now. Python is a good starting point to embody business ideas as it significantly reduces development time. Libraries are readily available for almost all use cases.
But when your product is built and is handling traffic at scale, you need to go deeper. Python can sometimes be a barrier too. We migrated one of our high-traffic modules from Python (Django) to Java (Spring WebFlux) and saw a 5x improvement in response times. That got me thinking about other options too, and eventually led to my interest in Rust.
Memory Overload
An issue was reported that videos were not being uploaded from Google Drive to S3. We have an asynchronous task that transfers files from Google Drive to S3 (for further processing).
My first assumption was that the tasks might have been queued. Also, there weren’t any Sentry alerts, so I ignored the issue. Even after a few hours the issue was not resolved, so it seemed something was cooking.
I tried to upload a 2MB file from Drive and it got processed instantaneously. So, the pipeline was working fine. I looked at the file that they were trying to upload. It was a whopping 8.3GB file. That raised some doubts. Is Google Drive blocking such a file? Is some network connection getting reset? Is S3 limiting the file upload?