Using MSML in SIP

Krishnakumar PG

Media Server Markup Language (MSML) is a powerful tool designed to control media servers in IP environments. It allows for dynamic control over media sessions, enabling functionalities such as conferencing, announcements and more. This article will guide you through the basics of using MSML with SIP, including its setup, use cases, and implementation.

Understanding MSML

The Media Server Markup Language (MSML) is an XML-based language used to control and manage media streams on media servers, enabling a wide range of services in IP-based networks. It is designed to work independently of the transport mechanism, although it is commonly used with SIP for signaling. MSML facilitates various media operations, such as establishing conferences, DTMF detection, handling interactive voice response (IVR) sessions, and applying media transformations.

MSML provides a comprehensive set of commands and constructs for defining and managing complex media interactions. Here is a list of the primary commands and constructs used in MSML:

1. Dialogs: MSML dialogs are used to create user interaction sessions.

  • <dialogstart>: Initiates a media processing dialog.
  • <dialogend>: Terminates a specific dialog.
  • <exit>: Ends a dialog from within its own script while keeping the call connected.

2. Media Elements:

  • <play>: Plays a specified media file or stream.
  • <record>: Records media to a specified location.
  • Pause, resume, and stop: RFC 5707 does not define these as separate elements. A running <play> is controlled by sending it an event such as pause, resume, or terminate with <send>.

3. Conferencing:

  • <createconference>: Creates a new conference.
  • <modifyconference>: Modifies an existing conference.
  • <destroyconference>: Deletes a conference.
  • <join>: Joins a participant to a conference.
  • <unjoin>: Removes a participant from a conference.
  • <audiomix>: Configures audio mixing within a conference; <videolayout> plays the same role for video.

4. Announcements:

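  • RFC 5707 defines no dedicated announcement element. An announcement is a dialog started with <dialogstart> whose body contains a <play>.
  • To stop an announcement that is still playing, end its dialog with <dialogend>.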

5. DTMF Handling:

  • <dtmf>: Specifies DTMF handling within a dialog.
  • <collect>: Collects DTMF input from the user.

6. Event Notifications:

  • <event>: Carries an event notification from the media server to the application server.
  • <send>: Sends an event to a target such as a conference, connection, or dialog.

7. Audit and Status:

  • <audit>: Requests status information from the media server.
  • <auditresult>: Returns the requested status information about connections, conferences, and dialogs.
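
To make the conferencing constructs concrete, here is a minimal Python sketch that builds two MSML requests: one that creates a conference and one that joins a connection to it. The element and attribute names (<createconference name>, <audiomix>, <n-loudest n>, <join id1 id2>) follow my reading of RFC 5707; the function names, the conference name "example", and the connection identifier "1" are hypothetical, and a real media server may require more configuration than this.

```python
import xml.etree.ElementTree as ET


def create_conference_request(conf_name, n_loudest=3):
    """Build an MSML request that creates an audio conference."""
    msml = ET.Element("msml", {"version": "1.1"})
    conf = ET.SubElement(msml, "createconference", {"name": conf_name})
    # <audiomix> configures the audio mixer; <n-loudest> limits how many
    # of the loudest speakers are mixed (the value 3 is illustrative).
    mix = ET.SubElement(conf, "audiomix")
    ET.SubElement(mix, "n-loudest", {"n": str(n_loudest)})
    return ET.tostring(msml, encoding="unicode")


def join_request(conn_id, conf_name):
    """Build an MSML request that joins a connection to a conference."""
    msml = ET.Element("msml", {"version": "1.1"})
    # Identifiers are prefixed with their object type: conn: or conf:.
    ET.SubElement(
        msml, "join", {"id1": f"conn:{conn_id}", "id2": f"conf:{conf_name}"}
    )
    return ET.tostring(msml, encoding="unicode")


# Hypothetical identifiers, for illustration only.
create_xml = create_conference_request("example")
join_xml = join_request("1", "example")
print(create_xml)
print(join_xml)
```

Building the documents with ElementTree rather than string concatenation keeps attribute values escaped and the output well-formed.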

Integration with SIP

MSML commands are carried in the body of SIP messages, usually INVITE or INFO requests. The interaction between SIP and MSML typically involves the following steps:

  1. Session Initiation: A SIP INVITE request is sent to establish a session. The INVITE can include an MSML script in its body or reference an external MSML script.
  2. MSML Execution: The media server parses and executes the MSML script, controlling the media streams as specified.
  3. Event Notifications: The media server sends MSML <event> messages in SIP INFO requests to report events and status changes back to the controlling application.
  4. Session Termination: A SIP BYE request is sent to terminate the session, and the media server stops the execution of the MSML script.
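
The steps above can be sketched from the controller's side as follows: once the INVITE dialog exists, an MSML document is placed in the body of a SIP INFO request. This is only a framing sketch, not a SIP stack. The function name wrap_in_info, the example.com addresses, tags, branch, and Call-ID are all hypothetical, and the application/msml+xml content type is my understanding of what RFC 5707 registers; in practice a SIP library would supply the dialog headers.

```python
MSML_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<msml version="1.1">\n'
    '  <dialogstart target="conn:1" type="application/moml+xml">\n'
    '    <play>\n'
    '      <audio uri="http://yourdomain.com/announcement.wav"/>\n'
    '    </play>\n'
    '  </dialogstart>\n'
    '</msml>\n'
)


def wrap_in_info(request_uri, dialog_headers, msml_body,
                 content_type="application/msml+xml"):
    """Frame an MSML document as the body of a SIP INFO request."""
    body = msml_body.encode("utf-8")
    lines = [f"INFO {request_uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in dialog_headers.items()]
    lines += [
        f"Content-Type: {content_type}",
        # Content-Length counts bytes of the body, not characters.
        f"Content-Length: {len(body)}",
        "",
    ]
    # Header lines end with CRLF; one empty line separates headers and body.
    return "\r\n".join(lines).encode("utf-8") + b"\r\n" + body


# Hypothetical dialog headers; a real SIP stack generates these.
headers = {
    "Via": "SIP/2.0/UDP as.example.com:5060;branch=z9hG4bK776asdhds",
    "Max-Forwards": "70",
    "From": "<sip:app@as.example.com>;tag=1928301774",
    "To": "<sip:ms@mediaserver.example.com>;tag=a6c85cf",
    "Call-ID": "a84b4c76e66710@as.example.com",
    "CSeq": "2 INFO",
}
message = wrap_in_info("sip:ms@mediaserver.example.com", headers, MSML_BODY)
print(message.decode("utf-8"))
```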

Example MSML Message Body

Here’s an example of an MSML script embedded in a SIP INVITE message to play an announcement.
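
A minimal sketch of such a request, assembled in Python from the elements named in the command breakdown further down (<msml>, <dialogstart>, <play>, <audio>). The function name build_announcement is hypothetical. The type value is a parameter: the breakdown shows announcement, while RFC 5707 describes type as a MIME type, so the default here is application/moml+xml, which is an assumption about what a given media server expects.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def build_announcement(target, audio_uri, dialog_type="application/moml+xml"):
    """Return an MSML request that plays one audio file on `target`."""
    # quoteattr() escapes the value and adds the surrounding quotes.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<msml version="1.1">\n'
        f'  <dialogstart target={quoteattr(target)} type={quoteattr(dialog_type)}>\n'
        '    <play>\n'
        f'      <audio uri={quoteattr(audio_uri)}/>\n'
        '    </play>\n'
        '  </dialogstart>\n'
        '</msml>\n'
    )


request = build_announcement("conn:1", "http://yourdomain.com/announcement.wav")
print(request)

# Parse the result back to confirm it is well-formed XML.
parsed = ET.fromstring(request.encode("utf-8"))
```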

Response indicating the end of an announcement:
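
As an illustration only, the sketch below shows what such a notification might look like and how a controlling application could read it in Python. It assumes the media server reports the end of the dialog with an <event> named msml.dialog.exit carrying <name>/<value> pairs, as I understand RFC 5707. The dialog identifier conn:1/dialog:d1, the play.end pair and its value, and the function name parse_events are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical event body; identifiers and values are illustrative.
EVENT_BODY = """<?xml version="1.0" encoding="UTF-8"?>
<msml version="1.1">
  <event name="msml.dialog.exit" id="conn:1/dialog:d1">
    <name>play.end</name>
    <value>play.complete</value>
  </event>
</msml>
"""


def parse_events(body):
    """Extract every <event> in an MSML document as a plain dict."""
    root = ET.fromstring(body.encode("utf-8"))
    events = []
    for ev in root.findall("event"):
        # <name> and <value> children appear in matching order.
        names = [n.text for n in ev.findall("name")]
        values = [v.text for v in ev.findall("value")]
        events.append({
            "name": ev.get("name"),
            "source": ev.get("id"),
            "params": dict(zip(names, values)),
        })
    return events


events = parse_events(EVENT_BODY)
for ev in events:
    if ev["name"] == "msml.dialog.exit":
        print("dialog finished:", ev["source"], ev["params"])
```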

MSML Command Breakdown:

1. <msml version="1.1">: The root element for MSML commands; it specifies the MSML version.

2. <dialogstart target="conn:1" type="announcement">

  • dialogstart: Initiates a new dialog session.
  • target="conn:1": Points to the connection (e.g., a call leg or a session) where the announcement will be played.
  • type="announcement": Specifies that this dialog is for making an announcement. Note that RFC 5707 defines type as the MIME type of the dialog language (for example, application/moml+xml), so a media server may expect that form.

3. <play>: Encloses the play action to be executed in the dialog.

4. <audio uri="http://yourdomain.com/announcement.wav"/>: Identifies the audio file to play by its URI.

Conclusion

MSML enhances SIP by providing sophisticated media server control. It supports various functionalities like announcements, conferencing, and DTMF detection, improving your VoIP and real-time communication services. By understanding these fundamentals, you can leverage MSML in your SIP-based applications for a richer communication experience.

For further details and advanced concepts, you may refer to the official RFC 5707 documentation and other relevant resources.
