Configure whisperer

See also: Run whisperer

General

whisperer can manage a set of media sources and expose them to whisperCast in a variety of formats and characteristics. As whisperCast cannot encode media, whisperer is the component that provides the encoding services, too.

The media data can either be pulled from whisperer by whisperCast, or pushed by whisperer to whisperCast. In both cases, the communication between whisperer and whisperCast takes place over HTTP.

Typically, whisperer works as an HTTP server: it gathers the media data, prepares it and exposes it through URLs that specify the actual source and the desired media characteristics (such as encoding type, audio sample rate, frame rate, video dimensions, etc).

So, a typical "pull" setup would be something like:

                                                           / -> clients
                           /-- HTTP GET -- WHISPERCAST 0 --  
                          /                    .           \ -> clients
camera --> WHISPERER < - -                     .
                          \                    .           / -> clients
                           \-- HTTP GET -- WHISPERCAST N --  
                                                           \ -> clients

In this case, each whisperCast instance pulls the media data from whisperer through HTTP GET requests.

When the actual media data source is not permanently available (live event broadcasting, for instance), whisperer can instead use HTTP POST requests to push the media data to whisperCast whenever that data becomes available.

A possible "push" setup:

camera 0 --> WHISPERER --- HTTP POST -\
                 .                     \                  / -> clients
                 .                      --> WHISPERCAST --  
                 .                     /                  \ -> clients
camera N --> WHISPERER --- HTTP POST -/

whisperer is heavily based on gstreamer (http://gstreamer.net). It is essentially an HTTP server/client that exposes the data provided by the internal gstreamer pipelines it manages.

Understanding whisperer requires a fair knowledge of how gstreamer actually operates.

Gstreamer is similar to other media processing frameworks (notably Microsoft's DirectShow) and is based on the concept of a "processing pipeline". A pipeline performs a series of operations on the media data, in sequence. The actual processing is done by the pipeline's building block, the "element". Each element is designed to perform a specific function (encode, decode, etc) and, by chaining different elements in a pipeline, a large variety of processing can be done on virtually any kind of media data.

For instance, a simple gstreamer pipeline:

v4l2src ! ffenc_flv ! ffmux_flv ! filesink location="output.flv"

The elements are linked by "!" and the media flows from left to right: it is first acquired by the "v4l2src" element, using Video4Linux2, then encoded to FLV video (Sorenson Spark) by the "ffenc_flv" element, then wrapped as valid FLV data by the "ffmux_flv" element and, finally, written to the output.flv file by the "filesink" element.
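To make the left-to-right reading concrete, here is a minimal Python sketch that lists the elements of a linear pipeline description in flow order. It is only an illustration, not part of whisperer or gstreamer: the helper name element_names is hypothetical, and the sketch assumes a simple chain in which "!" never appears inside a quoted property value.

```python
def element_names(description: str) -> list[str]:
    """Return the element names of a linear pipeline description, in flow order."""
    names = []
    # Each "!" links two elements; the first word of a segment is the element
    # name, anything after it is a property such as location="output.flv".
    for segment in description.split("!"):
        tokens = segment.split()
        if tokens:
            names.append(tokens[0])
    return names


pipeline = 'v4l2src ! ffenc_flv ! ffmux_flv ! filesink location="output.flv"'
print(" -> ".join(element_names(pipeline)))
# prints: v4l2src -> ffenc_flv -> ffmux_flv -> filesink
```

Real pipeline descriptions can also contain named elements, branches and caps filters, which this sketch does not attempt to parse.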

For further details on gstreamer, please check out the gstreamer documentation at:
http://gstreamer.net
http://gstreamer.freedesktop.org/data/doc/gstreamer/head/manual/html/index.html

whisperer creates and manages three different kinds of gstreamer pipelines:

  • Sources:
    These are user-defined pipelines that acquire the media data from the external sources.

    The textual descriptions of the source pipelines are configured through the standard RPC interface, available at http://<hostname>:<port>/rpc-config/MediaProviderService/__forms, where hostname is the host on which whisperer runs and port is the HTTP serving port.

    The source pipelines are called providers in the rest of the documentation.
  • Encoders:
    These are created internally by whisperer to encode the media data to the desired format. The encoders can and will be shared internally if the same source is encoded with the same parameters into different outputs.
  • Muxers:
    These are created internally by whisperer to multiplex the output of a video and an audio encoder into the container format that will be delivered to whisperCast.

Typically, for interleaved audio/video streams, the output media stream is linked to a muxer. If the media is just audio, the output stream is linked to the corresponding audio encoder. One important feature is the reuse of encoders. For instance, if a specific source is delivered as FLV (320x240, 44 kHz/128 kbps MP3) and FLV (160x120, 44 kHz/128 kbps MP3), only one MP3 encoding pipeline is instantiated, and it is shared by both encoded streams.
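The sharing rule can be pictured as a cache keyed by the source and the encoding parameters. The Python sketch below only illustrates that idea; it is not whisperer's actual implementation, and the names AudioSettings and EncoderPool are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical set of parameters that identify an audio encoding."""
    codec: str
    samplerate: int
    bitrate: int


class EncoderPool:
    """Hands out one encoder per (source, settings) pair and reuses it."""

    def __init__(self):
        self._encoders = {}
        self.created = 0

    def get(self, source, settings):
        key = (source, settings)
        if key not in self._encoders:
            # A real implementation would build a gstreamer pipeline here;
            # a descriptive string stands in for it in this sketch.
            self.created += 1
            self._encoders[key] = f"encoder#{self.created}:{source}:{settings.codec}"
        return self._encoders[key]


pool = EncoderPool()
mp3 = AudioSettings("mp3", 44100, 128000)
a = pool.get("live", mp3)  # audio for the 320x240 FLV output
b = pool.get("live", mp3)  # audio for the 160x120 FLV output
print(a is b, pool.created)
# prints: True 1
```

Both outputs ask for the same source with the same audio parameters, so the second request gets the encoder that already exists. Changing any parameter (a different sample rate, for instance) produces a different key and therefore a new encoder.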

Configuration

Configuration, at this point, can be done through the standard RPC interface, available at http://<hostname>:<port>/rpc-config/MediaProviderService/__forms, where hostname is the host on which whisperer runs and port is the HTTP serving port.

Alternatively, the configuration can be edited directly in the whisperer.config file, which is located in the directory given by the --media_config_dir flag when starting whisperer.

We do not recommend this method: it is error-prone, and whisperer needs to be restarted in order to load the new configuration.

The file is useful, however, if you want to save or back up a configuration. To restore a backup, copy the backup over the live config file and restart whisperer.
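The backup and restore steps could be scripted along these lines. This is a minimal Python sketch under stated assumptions: the helper names backup_config and restore_config are hypothetical, the directories are whatever you pass to --media_config_dir and wherever you keep backups, and restarting whisperer afterwards is still a manual step.

```python
import shutil
from pathlib import Path

CONFIG_NAME = "whisperer.config"


def backup_config(config_dir, backup_dir):
    """Copy the live config file into backup_dir and return the copy's path."""
    source = Path(config_dir) / CONFIG_NAME
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir / CONFIG_NAME))


def restore_config(backup_file, config_dir):
    """Copy a backup over the live config file; whisperer must then be restarted."""
    return Path(shutil.copy2(backup_file, Path(config_dir) / CONFIG_NAME))
```

For example, backup_config("/path/to/media_config_dir", "/path/to/backups") saves a copy, and restore_config on the returned path puts it back in place.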

Please check the Run whisperer document in order to see how to properly start whisperer.

Providers

The main task in configuring whisperer is defining providers. The providers are the actual sources of media, which is then encoded and delivered to whisperCast through HTTP. Each provider is based on a gstreamer pipeline description that must provide raw video data on an element named "source_video" and raw audio data on an element named "source_audio". Of course, if the stream is audio only, the "source_video" element can (and should!) be omitted.

When requested, the raw media that the provider pipeline provides is encoded, packed and delivered, based on the parameters parsed from the actual HTTP request's URL.

A simple test provider pipeline description could be the following (this is actually the test provider used in the provided samples):

audiotestsrc ! capsfilter caps="audio/x-raw-int,rate=44100,width=16" name=source_audio videotestsrc ! capsfilter caps="video/x-raw-yuv,width=640,height=480" name=source_video
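Before submitting a provider description, it can help to check that it names the required elements. The following Python sketch does a simple textual check; it is not something whisperer provides, the helper names are hypothetical, and it only looks for name=... properties rather than parsing the pipeline properly.

```python
import re

TEST_PROVIDER = (
    'audiotestsrc ! capsfilter caps="audio/x-raw-int,rate=44100,width=16" '
    'name=source_audio videotestsrc ! capsfilter '
    'caps="video/x-raw-yuv,width=640,height=480" name=source_video'
)


def named_elements(description):
    """Return the set of values given to name=... properties."""
    return set(re.findall(r'\bname="?(\w+)"?', description))


def check_provider(description, audio_only=False):
    """Return a list of problems; an empty list means the names look right."""
    names = named_elements(description)
    problems = []
    if "source_audio" not in names:
        problems.append("missing element named source_audio")
    if audio_only and "source_video" in names:
        problems.append("audio-only provider should omit source_video")
    if not audio_only and "source_video" not in names:
        problems.append("missing element named source_video")
    return problems


print(check_provider(TEST_PROVIDER))
# prints: []
```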

URLs

The media is requested from whisperer over HTTP, using URLs that specify the source of the media and the format in which it should be delivered. The format of the URLs is:
http://<hostname>:<port>/media/<provider>?param1=value1&..., where hostname is the host and port is the HTTP port on which whisperer runs, provider is the name of the configured media provider and the parameters are detailed below.

All the parameters related to the format and encoding of the media data delivered by whisperer are conveyed through standard URL parameters, and are listed below:

  • encoder - string
    The format of the encoded stream, possible values being:
    • flv
      The output will be FLV (Flash Video), interleaved video and audio.
    • mp3
      The output will be MP3.
    • aac
      The output will be HE-AAC.

  • audio_encoder - string
    If the output is set to FLV, this determines the kind of audio delivered in the stream (defaults to mp3).
    Possible values:
    • mp3
      The audio will be MP3.
    • aac
      The audio will be HE-AAC.

  • video_encoder - string
    If the output is set to FLV, this determines the kind of video delivered in the stream (defaults to vp6).
    Possible values:
    • vp6
      The video will be On2 VP6.
    • h264
      The video will be H.264.

  • audio_bitrate - integer
    This parameter specifies the bitrate set on the audio encoder, in bits-per-second.
    The range, valid values and the default value depend on the actual audio encoder that is used.

  • audio_samplerate - integer
    This parameter specifies the sample rate of the encoded audio, in samples-per-second.
    The range, valid values and the default value depend on the actual audio encoder that is used.

  • video_bitrate - integer
    This parameter specifies the bitrate set on the video encoder.
    The range, valid values and the default value depend on the actual video encoder that is used.

  • video_width - integer
    The width of the encoded video, in pixels (defaults to the acquired video width).

  • video_height - integer
    The height of the encoded video, in pixels (defaults to the acquired video height).

  • video_framerate_n - integer
    The numerator of the delivered framerate (defaults to the acquired video framerate).

  • video_framerate_d - integer
    The denominator of the delivered framerate (defaults to the acquired video framerate).

  • video_gop_size - integer
    This determines, approximately, the spacing of the I-frames that the video encoder delivers, in frames.
    The range, valid values and the default value depend on the actual video encoder that is used.

If a parameter is not specified, a reasonable default will be used (as specified above). The only required parameter is encoder.
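The rules listed above can be checked on the client side before a request is sent. The Python sketch below encodes only what this page states (the allowed encoder names, the integer parameters and the required encoder parameter); the function name check_params is hypothetical, and the encoder-specific ranges are not checked because they depend on the encoder in use.

```python
ENCODERS = {"flv", "mp3", "aac"}
AUDIO_ENCODERS = {"mp3", "aac"}
VIDEO_ENCODERS = {"vp6", "h264"}
INTEGER_PARAMS = {
    "audio_bitrate", "audio_samplerate", "video_bitrate", "video_width",
    "video_height", "video_framerate_n", "video_framerate_d", "video_gop_size",
}


def check_params(params):
    """Return a list of problems found in a dict of URL parameters."""
    problems = []
    encoder = params.get("encoder")
    if encoder is None:
        problems.append("encoder is required")
    elif encoder not in ENCODERS:
        problems.append(f"unknown encoder: {encoder}")
    for name, allowed in (("audio_encoder", AUDIO_ENCODERS),
                          ("video_encoder", VIDEO_ENCODERS)):
        value = params.get(name)
        if value is not None and value not in allowed:
            problems.append(f"unknown {name}: {value}")
    # Integer parameters are checked in alphabetical order for stable output.
    for name in sorted(params.keys() & INTEGER_PARAMS):
        try:
            int(str(params[name]))
        except ValueError:
            problems.append(f"{name} must be an integer")
    return problems


print(check_params({"encoder": "flv", "video_bitrate": 300000}))
# prints: []
```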

Some URL samples:

http://hostname:port/media/live?encoder=flv
http://hostname:port/media/live?encoder=flv&video_bitrate=300000
http://hostname:port/media/live?encoder=mp3&audio_bitrate=131072&audio_samplerate=44100
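URLs like the samples above can be assembled programmatically. The Python sketch below follows the URL format given in this section; the helper names media_url and framerate_params are hypothetical, and the port 8080 is only a placeholder for whatever HTTP port your whisperer instance uses.

```python
from fractions import Fraction
from urllib.parse import urlencode


def framerate_params(rate):
    """Split a Fraction framerate into the numerator/denominator parameters."""
    return {
        "video_framerate_n": rate.numerator,
        "video_framerate_d": rate.denominator,
    }


def media_url(hostname, port, provider, **params):
    """Build a whisperer media URL; encoder is the only required parameter."""
    if "encoder" not in params:
        raise ValueError("the encoder parameter is required")
    return f"http://{hostname}:{port}/media/{provider}?{urlencode(params)}"


print(media_url("hostname", 8080, "live", encoder="flv", video_bitrate=300000))
# prints: http://hostname:8080/media/live?encoder=flv&video_bitrate=300000

ntsc = framerate_params(Fraction(30000, 1001))
print(media_url("hostname", 8080, "live", encoder="flv", **ntsc))
# prints: http://hostname:8080/media/live?encoder=flv&video_framerate_n=30000&video_framerate_d=1001
```

Passing the framerate as a Fraction keeps rates such as 30000/1001 exact, which a floating point value like 29.97 would not.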