Video Conference in R
After making a game engine in R, I was hard-pressed to come up with a crazier idea.
But I prevailed. What if I made a video conference client to let people see each other in realtime, running all on R? Surprisingly, R provides almost everything we need.
Gesticulation is particularly effective to make up for the low frame rate.
Here I’m streaming to localhost (my own computer) to demonstrate the latency.
This gave me quite a fun challenge for the semester. Things were quite a bit more finicky to get working than in previous projects, and several times all hope of it working seemed lost. I had to push through and persevere quite a bit throughout the project.
1. Details
1.1 Big Picture
Webcam video is captured by OBS, processed into images in R and sent to the client, and then displayed on the console.
Pipeline
1.2 OBS
Unfortunately, R can’t natively read from the webcam. To get webcam input, we’ll use OBS, the standard screen-recording and streaming application. OBS gives a lot of useful control over how it outputs video, which we’ll take full advantage of.
We need to turn the live webcam feed into a video format that we can read in R. We’ll need to read it as it’s being written, which restricts what file formats we can use, but thankfully, old reliable .mkv still does the job. OBS has to use a video encoder (which we get to choose) to write live camera data into a video file, and the choice of encoder affects how long it takes before the most recent frame can be read. (Lower-framerate videos also take longer, so we use a relatively high framerate and only save images every few frames.)
After testing quite a few encoders and filetypes, I found that the Apple VT ProRes Software Encoder permitted the fastest reading of recent frames (from my Apple laptop webcam).[1]
Anyway, all of this means that we end up with a .mkv file that OBS writes to live. Next, we’ll use FFmpeg (via R) to access this file and extract its frames as it’s being written.
Why not just use OBS to generate a real livestream?
Uhh… I was particularly interested in learning to work with these video and image formats. Using an actual streaming protocol would indeed work much better.
1.3 FFmpeg
FFmpeg is a standard video-processing tool, used by R packages like av.[2] In our case, we want to use it to generate a JPEG for every frame of video.
This turns out to be tricky. FFmpeg is very convenient to use for regular video files, and is even quite capable of reading incomplete files as they’re written,[3] but reading that incomplete video live takes some effort and jank.
Mainly, the issue is that FFmpeg helpfully stops running when it reaches the end of the video, but we want it to just pause and wait for new data to be added. Amazingly, this happens to be possible through some jank suggested in a forum email 10 years ago.
So we actually manage to read a video file and write its frames as JPEGs almost as soon as they’re written by OBS. (The filetype and encoder determine how fast we can read a frame after OBS writes it, as mentioned in the previous section.)
We save these images to a temporary folder, from which the server will send images to the client.
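To make the shape of this step concrete, here is a minimal sketch of how such an FFmpeg call could be assembled and launched from R. It is an illustration under assumptions, not necessarily the command this project runs: the helper name `build_ffmpeg_args`, the paths, the output rate of 12 frames per second, and the JPEG quality value are all made up for the example. It uses the file protocol's `-follow 1` option, which makes FFmpeg retry at the end of a file that is still growing, as one plausible way to get the waiting behaviour.

```r
# Hypothetical helper: build the argument vector for an FFmpeg call that
# follows a growing video file and writes numbered JPEG frames.
build_ffmpeg_args <- function(video_path, frame_dir, out_fps = 12, quality = 5) {
  c(
    "-re",                                    # read input at its native framerate
    "-follow", "1",                           # retry at end of file (file protocol option)
    "-i", shQuote(video_path),                # the .mkv that OBS is still writing
    "-vf", paste0("fps=", out_fps),           # keep only out_fps frames per second
    "-q:v", as.character(quality),            # JPEG quality (lower number = better)
    shQuote(file.path(frame_dir, "%d.jpeg"))  # assumed naming: 1.jpeg, 2.jpeg, ...
  )
}

# Assumed locations, for illustration only.
frame_dir <- file.path(tempdir(), "frames")
dir.create(frame_dir, showWarnings = FALSE)
ffmpeg_args <- build_ffmpeg_args("recordings/live.mkv", frame_dir)

# Show the command that would be run.
cat("ffmpeg", ffmpeg_args, "\n")

# Uncomment to launch FFmpeg in the background (requires ffmpeg on the PATH):
# system2("ffmpeg", ffmpeg_args, wait = FALSE)
```

Building the arguments separately from the `system2()` call makes it easy to print and inspect the command before running it, which helps when experimenting with encoders and frame rates.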
1.4 Memory Leak
The process as described so far introduces a bit of an elephant in the room that can be easy to miss. If we just keep writing to the video file… we just keep writing to the video file, and video files are big and take up a lot of storage space. It would be rather bad practice to rely on a constantly-growing file, and this would prevent us from running the video conference indefinitely.
The solution is to break up the video every so often and get rid of the old parts we don’t need anymore. Luckily, OBS provides a built-in option to start writing to a new video file every X minutes, so we can just wait for this and then delete the first file (and the old image frames as well!).
Getting everything to mesh nicely with this file switch is actually a bit of a headache (my current implementation has to pause for 5-10 seconds to do it), but since it only happens every few minutes, it’s not a huge concern.
Doing all this means we never take up more space than a couple of minutes’ worth of video, which addresses any concerns of a memory leak.
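The cleanup idea can be sketched in a few lines of base R. This is only an illustration, assuming that "old" simply means "not the most recently modified file"; the helper name `prune_old_files` and the demo filenames are hypothetical, and the real implementation also has to coordinate with FFmpeg during the file switch, which this sketch ignores.

```r
# Hypothetical helper: delete all but the `keep` most recently modified
# files in `dir` whose names match `pattern`. Returns the deleted paths.
prune_old_files <- function(dir, pattern = "\\.mkv$", keep = 1) {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  newest_first <- files[order(file.info(files)$mtime, decreasing = TRUE)]
  old <- newest_first[seq_along(newest_first) > keep]
  unlink(old)
  old
}

# Demo on throwaway files with explicitly staggered modification times.
demo_dir <- tempfile("recordings")
dir.create(demo_dir)
paths <- file.path(demo_dir, c("part1.mkv", "part2.mkv", "part3.mkv"))
file.create(paths)
now <- Sys.time()
ages <- c(240, 120, 0)  # seconds since each file was "written"
for (i in seq_along(paths)) Sys.setFileTime(paths[i], now - ages[i])

deleted <- prune_old_files(demo_dir)
remaining <- list.files(demo_dir)
print(remaining)
```

The same helper could be pointed at the frame folder with a different `pattern` and a larger `keep`, so that only the most recent frames stay on disk.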
1.5 Server and Client
The simplest and neatest way to serve our photos is to have a static server serving the whole directory containing the images, so the client can request whatever frame it wants, whenever it wants. This takes a lot less effort than trying to manually serve the right frame at the right time, and works just about as well.
To get a frame, the client just sends an HTTP request with the image filename; the filenames are numbered in order (frame 1.jpeg, 2.jpeg, etc.). The server responds with the image, or with a 404 error if the image doesn’t exist yet.
This makes it simple to create a system that robustly keeps up with the server— the client asks for an image every frame and just tries again if it gets a 404. This ensures that it will keep running and catch up if the server or client experience lag or get desynced.
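The request-and-retry loop described above might look roughly like this in base R. It is a sketch under assumptions rather than the project's client code: the names `try_download` and `poll_frames`, the `<n>.jpeg` URL pattern, and the retry limits are all hypothetical. The fetch function is passed in as an argument so the loop can be exercised with a fake server that "404s" on every other request.

```r
# Hypothetical fetcher: TRUE if the frame was downloaded, FALSE on a 404
# or any other failure.
try_download <- function(url, dest) {
  tryCatch(
    suppressWarnings(download.file(url, dest, mode = "wb", quiet = TRUE)) == 0,
    error = function(e) FALSE
  )
}

# Request frames in order; when a frame is not available yet, wait and retry.
# Returns the number of frames successfully fetched.
poll_frames <- function(base_url, max_frames, fetch = try_download,
                        on_frame = function(path) invisible(NULL),
                        retry_delay = 0.05, max_retries = 200) {
  frame <- 1
  fetched <- 0
  while (fetched < max_frames) {
    url <- sprintf("%s/%d.jpeg", base_url, frame)  # assumed naming scheme
    dest <- tempfile(fileext = ".jpeg")
    retries <- 0
    while (!fetch(url, dest)) {
      retries <- retries + 1
      if (retries > max_retries) return(fetched)   # give up: server is gone
      Sys.sleep(retry_delay)
    }
    on_frame(dest)                                 # e.g. hand off to the renderer
    frame <- frame + 1
    fetched <- fetched + 1
  }
  fetched
}

# Demo with a fake fetcher that fails on every odd-numbered attempt.
attempts <- 0
fake_fetch <- function(url, dest) {
  attempts <<- attempts + 1
  attempts %% 2 == 0
}
n_fetched <- poll_frames("http://127.0.0.1:8080", max_frames = 3,
                         fetch = fake_fetch, retry_delay = 0)
print(c(fetched = n_fetched, attempts = attempts))
```

Because the client never skips ahead on a failure, a slow server just makes it wait, and it resumes from the same frame once the file appears.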
1.6 Rendering Engine
Once the client gets the image, it’s time to prepare it to draw. This uses an engine I made previously to render video in the console.[4]
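For readers unfamiliar with console rendering, here is a generic sketch of the basic idea: shrink the image to a character grid, then map each brightness value to a character. This is not the engine mentioned above. The function names, the character ramp, and the halved row count (to compensate for characters being taller than they are wide) are all assumptions made for illustration, and the input is a plain numeric matrix with values in [0, 1] rather than a decoded JPEG.

```r
# Nearest-neighbour downsample of a grayscale matrix to `width` columns.
# Rows are halved as well, assuming characters are about twice as tall as wide.
downsample <- function(gray, width) {
  width <- min(width, ncol(gray))
  height <- max(1, round(nrow(gray) * width / ncol(gray) / 2))
  rows <- round(seq(1, nrow(gray), length.out = height))
  cols <- round(seq(1, ncol(gray), length.out = width))
  gray[rows, cols, drop = FALSE]
}

# Map brightness in [0, 1] to characters; returns one string per image row.
matrix_to_ascii <- function(gray, ramp = " .:-=+*#%@") {
  chars <- strsplit(ramp, "")[[1]]
  clamped <- pmin(pmax(as.vector(gray), 0), 1)
  idx <- floor(clamped * (length(chars) - 1)) + 1
  apply(matrix(chars[idx], nrow = nrow(gray)), 1, paste, collapse = "")
}

# Tiny example: dark and bright pixels on the first row, mid-gray on the second.
gray <- matrix(c(0,   1,
                 0.5, 0), nrow = 2, byrow = TRUE)
art <- matrix_to_ascii(gray)
cat(art, sep = "\n")

# Larger example: an 80 x 160 gradient rendered at 40 characters wide.
gradient <- matrix(seq(0, 1, length.out = 80 * 160), nrow = 80)
frame <- matrix_to_ascii(downsample(gradient, 40))
cat(frame, sep = "\n")
```

The `width` argument plays the same role as a rendering resolution: fewer characters per frame means less work per frame and a higher achievable frame rate.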
2. Guide
This section will describe how to try the project out yourself.
Note: this has only been tested on my MacBook. The project may work on Windows but might require some fiddling.
2.1 Setup
Download and install OBS and FFmpeg. Launch OBS and skip the autoconfig wizard (click Cancel).
Change the settings with OBS Studio -> Preferences (or File -> Settings). All the following steps are critical.
- Go to Output on the sidebar.
- Change Output Mode to Advanced.
- Go to the Recording tab.
- Set Recording Path to a folder where video files will go; I recommend making a new folder for this on your desktop. R will also have to read from this folder.
- Enable Generate File Name without Space.
- Change Recording Format to Matroska Video (.mkv).
- Change Video Encoder to Apple VT ProRes Software Encoder (or AOM AV1).
- Enable Automatic File Splitting, Split by Time, 2 minutes.
- Go to Video on the sidebar.
- Set Common FPS Values to 48.
- Click OK at the bottom to leave the settings.
- Add a new video source with the plus button in the Sources panel.
- Select Video Capture Device, OK, and select your webcam for the Device.
- Change the Preset to the lowest possible resolution and click OK.
- Right-click the new source and select Resize output, and click Yes.

OBS should now be ready.
2.2 Usage
Download the files from r-bites’ releases page, or copy the code from the GitHub repository. You’ll need webcam.R and client.R.
2.2.1 R Startup
You and the person you’ll be conferencing with should both follow these steps.
Open an R window which will serve as the host for your webcam stream. Load webcam.R and run:
    stream.webcam(
      <OBS recording path>,
      <your IP address, including port>
    )

Now open a new R window to act as the client to receive the other person’s stream. Load client.R and run:

    stream.client(
      <the other person's IP address, including port>,
      <desired rendering resolution (default 200)>
    )

The rendering resolution has a huge impact on the framerate of the console stream. I recommend a value of 100 or 64 if the default is too slow and choppy.
If you want to test this by streaming to yourself, use 127.0.0.1:8080 for both IP addresses.
Note: streaming between devices will require port forwarding to be set up on your network. If you don’t know what that means, it’s not worth trying to set up.
2.2.2 OBS
Now, both of you can open OBS and click “Start Recording”. The console streams should start after a few seconds; you may have to resize the window and/or zoom out with cmd - to get it to render properly.
To stop, just end the recording. To completely clean up, you can also remove the OBS recording folder.
3. Next Steps
There’s a lot to be improved here. For one, using a real streaming protocol would work infinitely better than the duct tape solution going on here, and OBS already facilitates this.
More importantly, though, the console rendering isn’t the best: even on powerful computers there’s a strong tradeoff between resolution and framerate. I don’t believe there’s a solution for this… but there may be a way to increase visual resolution without changing the character resolution. I’ve experimented a little with compositing and marching squares to improve contrast and visibility, but I haven’t found anything yet that’s better than the default, naive approach that this guide (and my generic video renderer) uses.
Footnotes
[1] .avi files written with certain encoders also appear to work, but I haven’t explored them fully (and .avi files take up a lot more storage).
[2] av is the standard package for processing videos in R, but unfortunately doesn’t give us as much control over FFmpeg as we need.
[3] Using the -re option makes FFmpeg process the video at its native framerate, so it won’t catch up to the end if the video is continually added to.