Illusions in the Web: a real-time video editor built in HTML5

November 2, 2011

I am excited to blog about the project I am currently working on and deeply enjoying. At Collabora, we have developed a tech demo of a simple video editor application running in a WebKit-based web browser using HTML5 video technology. Well, to be honest, not just plain HTML5 video, but we will get to that point later on. Keep in mind that it is not a fully functional application, just a demo with some basic functionality such as:

  • Media library browser for images and videos.
  • Drag and drop clips to build the media timeline.
  • Set the inpoint and outpoint for each video clip.
  • Real-time seamless preview of the media timeline.
  • Basic touch based interface, although there is room for a lot of improvement here.

The main advantage is that all processing is done locally and all the data is kept locally too, so there is no need to upload the source material to a server or download the final video from it, which might take some time, especially when working with HD resolutions. Another nice feature is that you can actually preview the final video in real time, instead of waiting for the server to process it.
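To make the inpoint and outpoint idea more concrete, here is a minimal JavaScript sketch of how a timeline of trimmed clips could be modeled and how a preview position could be mapped back to a source file. The clip objects, the file names and the helpers `clipDuration`, `timelineDuration` and `locate` are all hypothetical, made up for illustration; they are not taken from the demo code.

```javascript
// Hypothetical clip model: inpoint and outpoint are positions (in seconds)
// inside the source file; only the range between them ends up in the timeline.
const clips = [
  { src: "intro.webm", inpoint: 2, outpoint: 7 },
  { src: "beach.webm", inpoint: 0, outpoint: 4 },
  { src: "credits.webm", inpoint: 10, outpoint: 13 }
];

// Length of the part of the clip that is actually played.
function clipDuration(clip) {
  return clip.outpoint - clip.inpoint;
}

// Total length of the timeline: the sum of all trimmed clip lengths.
function timelineDuration(timeline) {
  return timeline.reduce((total, clip) => total + clipDuration(clip), 0);
}

// Map a position on the timeline to the clip playing at that moment and
// the matching position inside its source file. Returns null past the end.
function locate(timeline, position) {
  let start = 0;
  for (const clip of timeline) {
    const end = start + clipDuration(clip);
    if (position >= start && position < end) {
      return { src: clip.src, sourceTime: clip.inpoint + (position - start) };
    }
    start = end;
  }
  return null;
}
```

With the sample clips above, `locate(clips, 6)` lands one second into the second clip, because the first clip only contributes the five seconds between its inpoint and outpoint.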

Some leftover Halloween (eye) candy

All this stuff was shown by Collabora last week in Prague, during LinuxCon Europe and the GStreamer Conference. Thanks to Collabora for sponsoring my visit to the conferences, by the way.

Let me show a screenshot and a screencast of what the demo looks like.

You can even try this out if you wish. Install the Ubuntu package named witivi located in zdra/prague-demo, kindly packaged by Xavier Claessens. Additionally, you will need to get some videos to enjoy it. Just contact me if you need instructions on how to make it run.

Current trends in the web

As the title of this blog post might suggest, all this is not yet possible with the current state of the HTML5 video spec, so we had to cheat a little bit by adding some extensions to WebKit in order to make it real.
However, there is a lot of effort going into multimedia on the web nowadays, so evolution could certainly make it possible in the hopefully near future. Some of the current hot spots in WHATWG and W3C seem to be WebRTC, the WebAudio API and even Augmented Reality. Things are moving really fast to make web apps more and more powerful, especially regarding multimedia features.

Just in case you haven’t heard of them before, let me briefly introduce these technologies. WebRTC aims to provide real-time communication built into HTML5 without additional plugins, so imagine audio and video calls being possible from any HTML application with a simple JavaScript API. WebAudio allows sample processing and synthesis from JavaScript.

An interesting proposal that I discovered recently is the MediaStream API, which is being promoted by Robert O’Callahan from Mozilla. It looks like it could lead to video editing functionality in HTML5, and it tries to integrate nicely with other existing specs like WebRTC and HTML5 video.

General architecture

There are two main parts in this demo: the extension to WebKit for video editing and the video editor webapp (the user interface). The WebKit extension has been fairly easy to implement by using the GStreamer Editing Services (GES) module, so adding video editing features was a breeze. The video playback of the media timeline has been accomplished through a special WebKit GStreamer sink using a fake URL, although we are checking how to do it with blob URLs, similarly to how it is done in WebRTC. Even though the current implementation is not very intrusive in the WebKit codebase, it would be better to keep the changes separate from WebKit for now, since there is no standard for this yet. We still have to investigate the best way to achieve this, but it depends on how the whole thing evolves.

The video editor web application has been implemented using the jQuery, jQuery UI and jQuery Layout libraries. Implementing the demo webapp has not exactly been a bed of roses, but good documentation and lots of examples from the aforementioned libraries helped a lot.

Show me the code

Here are the git repos in case you want to take a peek at how we did it:

* GES WebKit. To compile GES WebKit, make sure you pass the --enable-mediatimeline option to configure. GES WebKit depends on GES and gnonlin.

* Web Video editor demo, also known as Witivi in reference to our beloved pitivi. Here you can find all the HTML, JavaScript and CSS magic used to build the user interface.

Future extensions

Here is a list of things that could potentially be done in future versions:

* Transitions and effects on video clips (in progress).
* Adapt to MediaStream API.
* Move to BLOB URLs.
* Create text and apply it to videos.
* Render to a local file.
* Push video to server.
* Media timeline support for multiple layers.
* Integrate with WebRTC to do some collaborative video editing.
* And many many more …

Related work [Update]

Apart from this demo, which Gustavo Boiko and I have been working on (with contributions from Alvaro Soliverez and Abner Silva), there are some very cool related demos developed by Collabora, all of which were shown at LinuxCon in Prague. We hope you enjoy them as much as we do:

* IM client running in the WebKit browser. An IM client with chat and video calls running in the browser, mostly done in JavaScript and HTML using the Telepathy framework.

* Video call plugin for Media Explorer. A Telepathy plugin for Media Explorer that lets you call your contacts, nicely integrated into the Media Explorer UI.

Stay tuned.


Open Collaboration Services and libattica on MeeGo

October 16, 2010

I am going to write about my most recent contributions to libattica. Libattica is a library in KDE implementing a client for the Open Collaboration Services (OCS) protocol. OCS is a freedesktop.org specification whose purpose is to integrate web communities and web-based services into desktop applications. That is very general, though; more specifically, it allows users to browse and share content like applications, wallpapers, etc. All this comes with the typical social features: a user can write comments, rate content, contact other people, access a knowledge base, etc.

Thanks to Intel, which hired Collabora, for sponsoring all the work I have done on libattica so far. The efforts have been mainly directed at making use of the OCS protocol in the MeeGo Garage project. The MeeGo Garage Client uses libattica to connect to the OCS servers. OCS is actually quite large, and we are using just a small part of it (the content and comments modules). I would even say we are using it in a way it was not originally meant to be used, but in the end it matched the requirements we had.

I also want to thank Collabora, the company I work for, for sponsoring my attendance at Akademy 2010, my first Akademy actually, and I hope to be there every year :). In Finland I met a lot of nice people from the KDE world, and we had several meetings about OCS with the people interested in it: Frank Karlitschek, Frederik Gladhorn, Daniel Wilms, Henri Bergius and Cornelius Schumacher, among many others. In particular, we made good progress on drafting the next version of the OCS spec, with several updates needed by the MeeGo Garage project and other projects. The OCS draft spec 1.6 is online. Most of the features are already implemented and tested, and will be ready to use in the upcoming release of libattica 0.2.0. Just to mention some features I put some effort into:

  • Comments (for content and other items)
  • New scale for voting
  • GPG fingerprint and signatures
  • New download type to describe a content specified by a package name and repository
  • Summary field in content
  • Icons in content items
  • Licenses
  • Video files in content items
  • Home page entries
  • Distributions

Not only that, but a lot of ideas were generated that had to be postponed to OCS version 2.0, which I hope we can start working on soon.

Frank has been very helpful in making test servers available for me to test the client, and in implementing on the server the parts of the spec I needed to test. Frederik was very helpful in guiding me to become a KDE contributor.

Finally, it seems libattica will be part of MeeGo, since it is a Qt-only library with no real dependency on other KDE stuff. In case you want to dig deeper into OCS and/or libattica, you can check the following places:


10 secret Ninja weapons for Qt Quick QML developers

October 5, 2010

You might be asking yourself what Qt Quick or QML has to do with Ninjas. Well, let me introduce you to the coding Ninja, our cool mascot at Collabora. As you know, real ninjas are almost invisible and use very lethal weapons. But let’s get to the point and see what secret stuff coding Ninjas are using. Let me show you some useful tricks discovered while playing with Qt Quick and QML at Collabora. They might not really be secret, but there is little documentation about them, so they could easily be missed.

  1. QML Viewer scripts. This is a really great tool for demos and as a testing aid. Unfortunately there is no documentation on it yet, except for the -help and -scriptopts help command line options. Basically, when you record a script, it generates a QML file which contains the input events (keys, mouse, …) and some frame output information (PNG images and frame output hashes). You can record test scripts and replay them later. The tool automatically tests whether the recorded output images are still the same and checks for errors during the script run. It actually checks every frame. But don’t panic, it will not store an image for every frame; it stores a combination, roughly an image every second and a hash for every frame. When you test the image output, it tells you which frames don’t match and even saves a copy of the rejected frames. Of course, in most apps you are not guaranteed repeatable results across different runs of the program, due to animations, small timing differences, events happening at different times, use of random functions, etc. Nevertheless it is a useful tool that can be used for error reporting, automated tests, coverage tests, demo runs of applications, etc.
  2. Use F5 to reload in QML Viewer. You can have qmlviewer running your QML scripts and you can edit the source code at the same time, just use F5 to reload the script. Of course this restarts the application.
  3. Use F3 to take snapshots in QML Viewer. Be careful with F3, since it will overwrite existing snapshots.
  4. Use F9 to start / stop video recording in QML Viewer. It is recommended to tweak the video format and other settings in the video settings dialog for better performance. You can also use the -recordfile option from the command line to record from application startup. You can record to multiple PNG files, to an animated GIF file, or to any video format supported by your ffmpeg installation.
  5. Slow down animations in QML Viewer. If you do not have the sight and powers of a Ninja, you might need this to slow down animations, so you can see the bullets coming just like in The Matrix. Unfortunately it does not work very well with my Zij Lost Tetris-like game, especially in game mode. The reason is that the game update function checks the real elapsed time between ticks to provide a better simulation. The game update function is called from the Timer QML object, which is affected by the slow down animations feature (as is any QML Animation object). This can be worked around by assuming the time between game updates is fixed, but even then it does not work properly: the Timer callback does not seem to get called at regular intervals. The slowdown looks to be about 5 times, but sometimes it stops working for a few seconds. By the way, I noticed that the QML Viewer released with the SDK and the one from Qt Creator have some differences, but that is probably because I’m using a Qt Creator snapshot.
  6. Change orientation in QML Viewer with Ctrl-T or F10. You can change from portrait to landscape.
  7. Check the command line options of QML Viewer with -help. There are some useful ones like -borderless and -fullscreen, among others.
  8. Check the warnings window in QML Viewer. There you get your console.log output, but you can also find useful messages that might help you when something is wrong in your application.
  9. QML dump. qmldump is a tool included in the SDK Qt bin directory, which gives you information about properties and signals of the classes available in QML. The information is provided in XML format, which might be useful to create other tools or to see the guts of some QML classes which are not fully documented.
  10. Press F1 for help text about debugging features in QML Viewer. It shows some interesting debug options like F2 save test script, F4 show time and state, F6 show object tree and F7 show timing, although they don’t seem to work in my version. They are either not fully implemented or broken in my Qt Creator snapshot, but it is worth knowing that they might be implemented in future versions.
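The problem described in tip 5 can be illustrated with a small JavaScript sketch. It compares a game update that uses the real elapsed time between ticks with one that assumes a fixed time per tick. Everything here is hypothetical: the constants, the state object and the function names are made up for illustration and are not the Zij Lost code, and the slowdown is simply modeled as Timer ticks arriving less often in wall-clock time.

```javascript
// Assumed values, for illustration only.
const FALL_SPEED = 4; // rows per second
const TICK_MS = 250;  // nominal Timer interval in milliseconds

// Variant A: advance the piece by the real time elapsed since the last tick.
function updateElapsed(state, nowMs) {
  const dtSeconds = (nowMs - state.lastTickMs) / 1000;
  return { y: state.y + FALL_SPEED * dtSeconds, lastTickMs: nowMs };
}

// Variant B: assume every tick lasts exactly TICK_MS, whatever the clock says.
function updateFixed(state) {
  return { y: state.y + FALL_SPEED * (TICK_MS / 1000), lastTickMs: state.lastTickMs };
}

// Simulate `ticks` Timer callbacks. With slowdown = 5 the callbacks are
// modeled as arriving every 5 * TICK_MS of wall-clock time.
function simulate(update, ticks, slowdown) {
  let state = { y: 0, lastTickMs: 0 };
  for (let i = 1; i <= ticks; i++) {
    state = update(state, i * TICK_MS * slowdown);
  }
  return state.y;
}
```

In this model, variant A compensates for the slower ticks by taking bigger steps, so the piece still falls at full speed and the slow motion is defeated; variant B moves one row per tick and therefore slows down together with the Timer. That is only a sketch of the reasoning, and it does not explain the irregular Timer intervals mentioned in the tip.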

How to mingle QML, Tetrominos and Ninjas

September 30, 2010

Zij Lost is a simple Tetris-like game done entirely in QML and JavaScript, written to learn the Qt Quick technology.

While waiting for a new and exciting project to land on my desk after finishing my assignments on the last project, Collabora gave me the opportunity to start tinkering with the hot Qt Quick, recently released in Qt 4.7 by the Trolltech people. I had already read a bit about it in the past, and even attended some talks and demos at Akademy 2010 in Tampere, Finland. So I spent the first days mainly reading docs, and running and reading examples. But to really learn something I have to get my hands dirty with it, so I decided to create a small Tetris-like game with QML, just for learning purposes, just for fun. After a few days, I had a basic implementation of the game, which I am using as a testbed to experiment with QML.

The QML Advanced Tutorial was really useful, since it also implements a game, “Same Game”, using just QML and JavaScript. No C++ required. My goal was similar: to create the whole game logic with just QML and JavaScript, without any C++ involved. Note that this might not be the recommended design pattern for real-world QML applications, but I just wanted to push the limits of this new technology, get used to it and see if I would hit any boundaries in expressive power with just QML and JavaScript.

I started by creating the tetrominos in QML. I did not use images for them; I just built them with rectangles. This way it is easier to apply animations and special effects to them. At the moment I have effects for piece rotation and line completion. Then I added a Timer QML object which calls the game loop. All the game logic is of course written in JavaScript: things like collision checking, movement, the game loop, game rules, etc. So in the end it has been useful for refreshing my rusty JavaScript skills a bit. Afterwards, I added some states to main.qml in order to have a start screen, a game mode and a game over screen. I found the QML State abstraction very useful, and it helps to create cleaner and more maintainable code.
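To give an idea of the kind of JavaScript game logic mentioned above, here is a minimal sketch of collision checking and piece rotation on a grid. It is not the Zij Lost source: the board layout, the way a piece is stored as a list of cell offsets, and the names `createBoard`, `collides` and `rotate` are all assumptions made for this example.

```javascript
// Hypothetical board: an array of rows, where 0 is empty and 1 is occupied.
function createBoard(rows, cols) {
  return Array.from({ length: rows }, () => new Array(cols).fill(0));
}

// A piece is a list of [row, col] offsets from its top-left corner.
// This one is a T tetromino with the stem pointing down.
const T_PIECE = [[0, 0], [0, 1], [0, 2], [1, 1]];

// True if the piece, placed with its corner at (row, col), would leave
// the board or overlap a cell that is already occupied.
function collides(board, piece, row, col) {
  return piece.some(([dr, dc]) => {
    const r = row + dr;
    const c = col + dc;
    if (r < 0 || r >= board.length || c < 0 || c >= board[0].length) {
      return true;
    }
    return board[r][c] !== 0;
  });
}

// Rotate 90 degrees clockwise (rows grow downwards on screen), then shift
// the offsets so that the smallest row and column are 0 again.
function rotate(piece) {
  const turned = piece.map(([r, c]) => [c, -r]);
  const minRow = Math.min(...turned.map(([r]) => r));
  const minCol = Math.min(...turned.map(([, c]) => c));
  return turned.map(([r, c]) => [r - minRow, c - minCol]);
}
```

A game loop along these lines would call `collides` before every move or rotation and only apply the change when it returns false.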

Well, it has been fun. QML is fun. I have enjoyed programming with it. Scripting languages are nice when they come with good tools, and Qt Creator did not disappoint me.

You can find the source code in the git repository at:

git://git.collabora.co.uk/git/user/mbatle/zijlost.git

You can also find a release as a zip file at:

http://people.collabora.co.uk/~mbatle/zijlost/zijlost.zip

Note: I named the game “Zij Lost”, just a mix of all the tetromino letters, which happens to mean something in Dutch, even though it does not make much sense without context.

New link: just found a guide to game creation in QML which is interesting.

Some images and videos.