Learning WebRTC and its different applications
December 04, 2017
This post discusses the key points and advantages of WebRTC, what it brings to the modern web platform, and my experience of learning to use it from day one until now. I won’t go into great technical detail here, but will instead focus on the breadth and nuances of the WebRTC landscape.
WebRTC has been around since May 2011 and is a collection of protocols and APIs which enable real-time communication between devices. It’s commonly used for fast streaming of video, audio and data between browsers over the web. This makes software such as VoIP (internet phone calls) and teleconferencing apps much easier to develop.
It is supported in Chrome, Firefox, Edge and Opera, but not in IE. More recently (June 2017), Apple announced that it would support WebRTC in the latest Safari desktop and mobile apps. This is great news because, until recently, fast P2P communication between Android and desktop browsers, both of which support WebRTC, was easy, but Apple devices needed a native app solution which would require something like OpenCV or other native iOS development. Now, though, a fully cross-browser/mobile solution is possible using just a browser, or React Native!
For me, the fact that WebRTC is now supported on all major desktop and mobile platforms means investing time in learning it is a no-brainer, and there are plenty of resources out there to help with, or abstract away, the confusing steps needed to get things working.
If you’d like to know more about WebRTC and its communications architecture, please take a look at some of the great resources which I’ll add to the post below.
It’s not all about live video chat
Earlier I mentioned that the most common use is for VoIP and teleconferencing apps, and this is evident from the countless WebRTC tutorials on the subject. I don’t believe this large set of tutorials shows WebRTC in the best light though, and I think it obscures the nature of the WebRTC protocols by not explaining their true or potential ability. Essentially, WebRTC provides the means for real-time communication between browsers, and there are many other fun and useful applications which can take advantage of this.
One great example I have come across is AR.js, which provides high-performance augmented reality on mobile and desktop devices with the help of WebRTC and WebGL. It is a completely different use of the WebRTC protocols, but one which is immensely powerful and fun to play with.
Alternatives to getUserMedia
Another point I wanted to mention is that there are other ways to get a feed from a device’s camera into a browser, such as HTML5 media capture, which is a great way to take photos and videos in the browser. It can be used to take photos in websites or web/hybrid apps and share them via the web.
These alternatives have nowhere near the flexibility and power of WebRTC though, as they are constrained by their own API and functionality. WebRTC allows you to fully manipulate the media stream and do pretty much anything you want with it.
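To give a feel for what manipulating the stream can look like, here is a minimal sketch in TypeScript. The helper names (buildCameraConstraints, setTracksEnabled, openCamera) and the small structural types are my own assumptions for illustration, not part of any library; they only mirror the shape of the browser's getUserMedia call and its media tracks so the logic can be read and run outside a browser.

```typescript
// Minimal structural types (hypothetical) so the sketch does not depend on DOM typings.
interface TrackLike {
  kind: string;
  enabled: boolean;
  stop(): void;
}

interface StreamLike {
  getTracks(): TrackLike[];
}

interface CameraConstraints {
  audio: boolean;
  video: {
    width: { ideal: number };
    height: { ideal: number };
    facingMode: string;
  };
}

interface MediaDevicesLike {
  getUserMedia(constraints: CameraConstraints): Promise<StreamLike>;
}

// Build a constraints object asking for a preferred resolution and camera direction.
function buildCameraConstraints(
  width: number,
  height: number,
  useRearCamera: boolean
): CameraConstraints {
  return {
    audio: false,
    video: {
      width: { ideal: width },
      height: { ideal: height },
      facingMode: useRearCamera ? "environment" : "user",
    },
  };
}

// Enable or disable every track of one kind; returns how many tracks changed.
function setTracksEnabled(
  stream: StreamLike,
  kind: "audio" | "video",
  enabled: boolean
): number {
  let changed = 0;
  for (const track of stream.getTracks()) {
    if (track.kind === kind && track.enabled !== enabled) {
      track.enabled = enabled;
      changed++;
    }
  }
  return changed;
}

// Ask the supplied media-devices object for a stream matching the constraints.
async function openCamera(
  devices: MediaDevicesLike,
  constraints: CameraConstraints
): Promise<StreamLike> {
  return devices.getUserMedia(constraints);
}
```

In a browser, the idea would be to pass navigator.mediaDevices as the devices argument, then call setTracksEnabled on the returned stream to pause the video without tearing the stream down. Treat this as a starting point rather than a finished implementation.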
It’s also not all about getUserMedia
Again, most examples, when you dig a little deeper, use the getUserMedia feature of WebRTC, but this is not needed to make an application which takes advantage of WebRTC. In fact, I think that because most examples use getUserMedia, the focus is taken away from the difficult and important part of WebRTC, which is signalling: exchanging the session details needed to set up the connection, with STUN/TURN servers helping the peers reach each other. I’m not saying it’s a bad thing, but it means that people are less likely to truly learn WebRTC and its nuanced signalling and communication protocols. WebRTC isn’t the easiest aspect of software development to understand fully anyway.
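Signalling can feel abstract, so here is a small TypeScript sketch of the idea under some loud assumptions. WebRTC leaves the signalling transport up to the application, so InMemorySignalingHub, SignalMessage and runHandshake are hypothetical names of my own, and the hub simply relays messages in memory. In a real app the messages would typically travel over something like a WebSocket, and the payloads would be the offer, answer and ICE candidates produced by RTCPeerConnection rather than placeholder strings.

```typescript
type SignalKind = "offer" | "answer" | "candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  payload: string; // in a real app: a serialised session description or ICE candidate
}

type SignalHandler = (message: SignalMessage) => void;

// Hypothetical in-memory stand-in for a signalling server.
class InMemorySignalingHub {
  private readonly handlers = new Map<string, SignalHandler>();

  join(peerId: string, handler: SignalHandler): void {
    this.handlers.set(peerId, handler);
  }

  leave(peerId: string): void {
    this.handlers.delete(peerId);
  }

  // Delivers the message to its recipient; returns false if nobody is listening.
  send(message: SignalMessage): boolean {
    const handler = this.handlers.get(message.to);
    if (handler === undefined) {
      return false;
    }
    handler(message);
    return true;
  }
}

// Walk through one offer/answer round trip and record what each peer received.
function runHandshake(hub: InMemorySignalingHub): string[] {
  const log: string[] = [];

  hub.join("alice", (m) => {
    log.push(`alice got ${m.kind} from ${m.from}`);
  });

  hub.join("bob", (m) => {
    log.push(`bob got ${m.kind} from ${m.from}`);
    if (m.kind === "offer") {
      // The callee replies to an offer with an answer.
      hub.send({ kind: "answer", from: "bob", to: m.from, payload: "placeholder-answer" });
    }
  });

  // The caller starts with an offer, then trickles a candidate.
  hub.send({ kind: "offer", from: "alice", to: "bob", payload: "placeholder-offer" });
  hub.send({ kind: "candidate", from: "alice", to: "bob", payload: "placeholder-candidate" });

  return log;
}
```

Calling runHandshake(new InMemorySignalingHub()) records the order in which the peers see the offer, the answer and the candidate. None of this touches media, which is the point: the connection setup is a separate concern from getUserMedia.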
There are some excellent examples of usage in some of the strangest applications at the WebRTC-Experiment GitHub, and they really help get a feel for the full breadth of what can be done with WebRTC.
After a while of looking around at some great tutorials and walk-throughs from the likes of Google Codelabs, MDN and this random one, I realised it was a lot more complicated than learning the average framework or browser API. This is because, as discussed earlier, there are so many uses of WebRTC. It’s not simply about displaying a camera feed or recording from a microphone; it’s about understanding an entire communications architecture: signalling, STUN/TURN servers, traversing firewalls, cross-browser compatibility and so on. Even Google Codelabs says that, to get stuck in, there’s pretty much no other way than doing a straight copy and paste.
One positive note is that many of the tutorials out there have pretty much the same content, often line for line (as shown in the official WebRTC repo and the Google Codelabs one). I feel this makes the breadth of WebRTC a little more understandable and palatable, as there aren’t hundreds of commonly used ways to reach the same outcome. It’s just a little different from learning the average API as there’s so much involved.
The complex nature has led me to rethink how I approach WebRTC. I feel a great option is to abstract the core API by using libraries such as PeerJS and EasyRTC. They massively simplify the process of getting up and running by eliminating the headache of cross-browser differences and repetitive copying and pasting from online resources.
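As one concrete piece of that architecture, the STUN/TURN servers are handed to a peer connection as a plain configuration object. The TypeScript sketch below shows roughly what that object looks like. buildPeerConfig and its types are hypothetical helpers of my own, and the TURN URL and credentials in the usage note are placeholders, not a real service.

```typescript
// Hypothetical types mirroring the shape of the configuration a peer connection accepts.
interface IceServerEntry {
  urls: string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy: "all" | "relay";
}

interface TurnServer {
  url: string;
  username: string;
  credential: string;
}

// True when the URL starts with one of the given schemes, e.g. "stun:".
function hasScheme(url: string, schemes: string[]): boolean {
  return schemes.some((scheme) => url.startsWith(scheme + ":"));
}

// Assemble the ICE configuration, rejecting obviously malformed input.
function buildPeerConfig(
  stunUrls: string[],
  turn?: TurnServer,
  relayOnly: boolean = false
): PeerConfig {
  for (const url of stunUrls) {
    if (!hasScheme(url, ["stun", "stuns"])) {
      throw new Error(`Not a STUN URL: ${url}`);
    }
  }
  if (turn !== undefined && !hasScheme(turn.url, ["turn", "turns"])) {
    throw new Error(`Not a TURN URL: ${turn.url}`);
  }
  if (relayOnly && turn === undefined) {
    throw new Error("Relay-only mode needs a TURN server");
  }

  const iceServers: IceServerEntry[] = [];
  if (stunUrls.length > 0) {
    // STUN lets a peer discover its public address.
    iceServers.push({ urls: [...stunUrls] });
  }
  if (turn !== undefined) {
    // TURN relays traffic when a direct path cannot be established.
    iceServers.push({
      urls: [turn.url],
      username: turn.username,
      credential: turn.credential,
    });
  }

  return { iceServers, iceTransportPolicy: relayOnly ? "relay" : "all" };
}
```

In a browser, the result of something like buildPeerConfig(["stun:stun.l.google.com:19302"], { url: "turn:turn.example.org:3478", username: "demo", credential: "secret" }) could be passed to new RTCPeerConnection(...). Setting relayOnly forces all traffic through the TURN server, which can be handy for checking that the relay actually works.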
Of course, I’m speaking from the point of view of someone who wants to integrate WebRTC into a project and move on to the next project. If WebRTC is something you want to truly understand and make use of without any third-party abstractions, it’s definitely worth spending the time fully understanding the entire WebRTC landscape.
For me, after playing around with libraries which enabled me to have fun with WebRTC, I felt compelled to go back and try to learn the ins and outs of RTCPeerConnection, data streaming, signalling etc., although I have to say I have not yet run into a situation where I have had to use WebRTC without a library.
As discussed at the start of this post, WebRTC is now in a much stronger position because of its recent addition to Safari. This helps tremendously with app development, as it can be incorporated into hybrid/native app frameworks built with web technology, such as Ionic or React Native, and be fully supported on all major platforms. Previously, this was not possible in a “write once, deploy anywhere” fashion without the use of native integration.
Integration with Ionic/Cordova apps is pretty simple: just include it like you would in a normal website and the basic features are supported. Integration with React Native is a little more interesting as there are two options. WebRTC can be used either in an embedded web view or through react-native-webrtc, which looks to be a great solution, although I’ve never actually used it. It’ll be interesting to see the performance differences between the two solutions, which I may cover in a future post.
Simple setup on my GitHub
There are a few really good resources for learning WebRTC, and like anything it’s good to see solutions used in different situations. I have my own repo called web-rtc-playground which I use to show how different WebRTC implementations can be used with a Node Express server. It’s not anything new and other implementations go into way more detail, but feel free to take a look. I’ll be adding to it now and then, hopefully.
The number of important parts involved in learning WebRTC was a shock when I was first confronted with it, and I feel it’s harder to jump straight into developing with the core WebRTC protocols to do something such as data or media transfer. Having said that, after using third-party libraries and eventually going back to the WebRTC fundamentals, it did become clearer to me.
My advice is not to give up on WebRTC, as its usefulness for out-of-the-box thinking in app development is incredible, and I believe real-time communication between users will have a big influence on the future of the web.
Senior Engineer at Haven