Video Conference in R
After making a game engine in R, I was hard-pressed to come up with a crazier idea.
But I prevailed. What if I made a video conference client to let people see each other in real time, running entirely in R? Surprisingly, R provides almost everything we need.
Gesticulation is particularly effective to make up for the low frame rate.
Here I’m streaming to localhost (my own computer) to demonstrate the latency.
This gave me quite a fun challenge for the semester. Things were quite a bit more finicky to get working than in previous projects, and there were several times when all hope of it working seemed lost— I had to push through and persevere quite a bit.
1. Details
1.1 Big Picture
Webcam video is captured by OBS, processed into images in R and sent to the client, and then displayed on the console.
Pipeline
1.2 OBS
Unfortunately, R can’t natively read from the webcam. To get webcam input, we’ll use OBS, the standard screen-recording and streaming application. OBS gives a lot of useful control over how it outputs video, which we’ll take full advantage of.
We need to turn the live webcam feed into a video format that we can read in R. We’ll need to read it as it’s being written, which restricts which file formats we can use— but thankfully, old reliable .mkv still does the job. OBS has to use a video encoder (which we get to choose) to write live camera data into a video file, and different encoders take longer before the most recent frame can be read. (Lower-framerate videos also take longer, so we use a relatively high framerate and only save images every few frames.)
After testing quite a few encoders and filetypes, the Apple VT ProRes Software Encoder permitted the fastest reading of recent frames (from my apple laptop webcam).1
Anyway, all of this means that we end up with a .mkv file that OBS writes to live. Next, we’ll use FFmpeg (via R) to access this file and extract its frames as it’s being written.
Why not just use OBS to generate a real livestream?
Uhh… I was particularly interested in learning to work with these video and image formats. Using an actual streaming protocol would indeed work much better.
1.3 FFmpeg
FFmpeg is a standard video processing software, used by R packages like av2. In our case, we want to use it to generate a JPEG for every frame of video.
This turns out to be tricky. FFmpeg is very convenient to use for regular video files, and is even quite capable of reading incomplete files as they’re written3, but reading that incomplete video live takes some effort and jank.
Mainly, the issue is that FFmpeg helpfully stops running when it reaches the end of the video, but we want it to just pause and wait for new data to be added. Amazingly, this happens to be possible through some jank suggested in a forum email 10 years ago: the -re flag throttles FFmpeg to the video’s native framerate, so it never catches up to the end while OBS keeps writing.
So we actually manage to read a video file and write its frames as JPEGs almost as soon as they’re written by OBS. (The filetype and encoder determine how fast we can read a frame after OBS writes it, as mentioned in the previous section.)
We save these images to a temporary folder, from which the server will send images to the client.
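As a concrete sketch of this step (the paths here are made up, and ffmpeg must be installed separately and on the PATH), the extraction could be launched from R along these lines:

```r
# Sketch: extract every frame of the growing recording as numbered
# JPEGs. -re throttles FFmpeg to the video's native framerate, so it
# waits for new data instead of stopping at the end of the file.
# (recording.mkv and frames/ are hypothetical paths.)
args <- c("-re",                   # read at native framerate
          "-i", "recording.mkv",   # the file OBS is writing to
          "-qscale:v", "2",        # JPEG quality (lower = better)
          "frames/%d.jpeg")        # numbered output frames
system2("ffmpeg", args, wait = FALSE)  # run in the background
```

Running it with wait = FALSE leaves FFmpeg extracting frames in the background while the R session goes on to serve them.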
1.4 Memory Leak
The process as described so far introduces a bit of an elephant in the room that can be easy to miss. If we just keep writing to the video file… we just keep writing to the video file, and video files are big and take up a lot of storage space. It would be rather bad practice to rely on a constantly-growing file, and this would prevent us from running the video conference indefinitely.
The solution is to break up the video every so often and get rid of the old parts we don’t need anymore. Luckily, OBS provides a built-in option to start writing to a new video file every X minutes— we can just wait for this and then delete the first file. (and the old image frames as well!)
Getting everything to mesh nicely with this file switch is actually a bit of a headache—my current implementation has to pause for 5-10 seconds to do it—but since the switch only happens every few minutes, it’s not a huge concern.
Doing all this means we never take up more space than a couple of minutes’ worth of video, fixing any concerns of a memory leak.
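The cleanup step could be sketched like this (the function, folder layout, and frame-numbering cutoff are my own invention for illustration, not the project’s actual code):

```r
# Sketch: once OBS has split off a new .mkv, delete the older
# recording(s) and any extracted frames before a cutoff number.
rotate <- function(video_dir, frame_dir, keep_after) {
  # OBS's timestamped filenames sort chronologically, so the last
  # entry after sorting is the file currently being written.
  vids <- sort(list.files(video_dir, pattern = "\\.mkv$", full.names = TRUE))
  if (length(vids) > 1) file.remove(head(vids, -1))  # keep only the newest video
  frames <- list.files(frame_dir, pattern = "\\.jpeg$", full.names = TRUE)
  nums <- as.integer(sub("\\.jpeg$", "", basename(frames)))
  file.remove(frames[nums < keep_after])             # drop frames we've moved past
}
```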
1.5 Server and Client
The simplest and neatest way to serve our photos is to have a static server serving the whole directory containing the images, so the client can request whatever frame it wants, whenever it wants. This takes a lot less effort than trying to manually serve the right frame at the right time, and works just about as well.
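For instance, the servr package can serve a directory statically in one line (a sketch of one option, not necessarily what the project uses; the folder name and port are assumptions):

```r
# Serve the frame directory over HTTP so clients can request any
# frame by filename. Requires install.packages("servr") first;
# "frames" and 8080 are hypothetical.
servr::httd("frames", port = 8080)
```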
To get a frame, the client just sends an HTTP request with the image filename; the filenames are sequential (1.jpeg, 2.jpeg, etc.). The server serves the image, or a 404 error if the image doesn’t exist yet.
This makes it simple to create a system that robustly keeps up with the server— the client asks for an image every frame and just tries again if it gets a 404. This ensures that it will keep running and catch up if the server or client experience lag or get desynced.
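A sketch of that client loop (the function names and URL layout are assumptions, not the project’s real code):

```r
# Sketch, assuming the server exposes frames as /1.jpeg, /2.jpeg, ...
fetch_frame <- function(host, i, dest) {
  url <- sprintf("http://%s/%d.jpeg", host, i)
  tryCatch({
    download.file(url, dest, quiet = TRUE, mode = "wb")  # 404 => error
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
}

# Keep asking for the next frame; on a 404 just retry shortly, so the
# client naturally catches up after any lag or desync.
stream_frames <- function(host) {
  i <- 1
  repeat {
    if (fetch_frame(host, i, sprintf("%d.jpeg", i))) i <- i + 1
    else Sys.sleep(0.02)
  }
}
```

stream_frames isn’t invoked here since it loops forever; in the real client, each fetched frame would be handed to the renderer before asking for the next one.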
1.6 Rendering Engine
Once the client gets the image, it’s time to prepare it to draw. This uses an engine I made previously to render video in the console4.
2. Guide
This section will describe how to try the project out yourself.
Note: this has only been tested on my MacBook. The project may work on Windows but might require some fiddling.
2.1 Setup
Download and install OBS. Launch OBS and skip the autoconfig wizard (click Cancel).
Change the settings with OBS Studio -> Preferences (or File -> Settings). All the following steps are critical.
- Go to Output on the sidebar.
- Change Output Mode to Advanced.
- Go to the Recording tab.
- Set Recording Path to a folder where video files will go; I recommend making a new folder for this on your desktop. R will also have to read from this folder.
- Enable Generate File Name without Space.
- Change Recording Format to Matroska Video (.mkv).
- Change Video Encoder to Apple VT ProRes Software Encoder (or AOM AV1).
- Enable Automatic File Splitting, Split by Time, 2 minutes.
- Go to Video on the sidebar.
- Set Common FPS Values to 48.
- Click OK at the bottom to leave the settings.
- Add a new video source with the plus button in the Sources panel.
- Select Video Capture Device, OK, and select your webcam for the Device.
- Change the Preset to the lowest possible resolution and click OK.
- Right-click the new source and select Resize output, and click Yes.

OBS should now be ready.
2.2 Usage
2.2.1 Server
Open R, set the directory to the recording folder from setup, and run stream.webcam with that folder and your IP (or localhost if testing by yourself):
TODO
Note: streaming between devices will require port forwarding to be set up on your network. If you don’t know what that means, it’s not worth trying to set up.
2.2.2 Client
Run stream.client() with the IP of the person you’re conferencing with (or localhost).
Footnotes
1. .avi files written with certain encoders also appear to work, but I haven’t explored them fully (and .avi files take up a lot more storage).↩︎
2. av is the standard package for processing videos in R, but unfortunately doesn’t give us as much control over FFmpeg as we need.↩︎
3. Using the -re option makes FFmpeg process the video at its native framerate, so it won’t catch up to the end if the video is continually added to.↩︎