0020755 Abram Hindle Aug.11 2002


Abram's Unix Sound System


Purpose: AUSS will provide an easy-to-use API whose purpose is to hasten the development of audio applications. AUSS will use pipes and sockets to move digital audio between processes and computers, enabling interprocess and intercomputer patching of audio. This sort of patching lets software modules emulate the black boxes of analog electronic music systems, in which boxes are linked by patch cables to form a network for audio processing. AUSS brings this kind of configuration to software processes and networked computers.


Format: All audio will be transmitted raw: no headers or identifying information will be sent with the streams, so the receiver is expected to already know what format is being sent. Sample formats could be mono float, stereo float, mono short, stereo short, mono byte, or stereo byte. The system will initially focus on mono and stereo short. Multichannel audio (e.g. stereo) will be interleaved (a left sample followed by a right sample, as in the glossary entry for Interleave).
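
As an illustration of the stereo short format, here is a minimal sketch of interleaving two mono buffers into one stereo buffer. It assumes the left-first ordering given in the glossary entry for Interleave and uses int16_t for "short"; the function name interleave_s16 is hypothetical, and byte order on the wire is left unspecified.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave two mono buffers of 16-bit samples into one stereo buffer.
 * Assumed ordering: left sample first, then right (L R L R ...).
 * `out` must have room for 2 * frames samples. */
void interleave_s16(const int16_t *left, const int16_t *right,
                    int16_t *out, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
}
```

Because the stream carries no header, both ends have to agree on this layout ahead of time.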


Timing: Timing can be enforced through blocking reads and writes on the streams. A write blocks until the end device can accept more data, so data flows as fast as the system and the end device can handle. A realtime system that outputs to a soundcard is therefore paced by the soundcard. A non-realtime system that writes to a file blocks far less, because file writing is generally faster than soundcard output, and so it can output a lot of data very quickly.
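
A minimal sketch of how a utility might rely on blocking for pacing, using only the POSIX write() call. The helper name write_all is hypothetical. The loop simply keeps writing until every byte has been accepted, so the caller runs no faster than whatever sits at the other end of the descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly `len` bytes to a blocking descriptor.
 * Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* blocks until the device takes data */
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                          /* a short write is not an error */
        len -= (size_t)n;
    }
    return 0;
}
```

The same loop works whether fd is a pipe, a socket, a file or a sound device; only the amount of time spent blocked differs.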


Latency: Latency will be an issue with the system, especially when more than one computer is used. Latency increases as more applications buffer audio, and can be lowered by minimizing the amount of patching between applications in a system. Some operating systems, such as Linux, have low-latency patches available.
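
To make the cost of buffering concrete, the sketch below computes how much audio a buffer of a given size holds. The function name buffer_latency_ms is hypothetical, and the sample rate is an assumption, since this document does not fix one.

```c
#include <stddef.h>

/* Milliseconds of audio held in a buffer of `bytes` bytes.
 * A frame is one sample for every channel. */
double buffer_latency_ms(size_t bytes, unsigned channels,
                         unsigned bytes_per_sample, unsigned sample_rate)
{
    size_t frame_bytes = (size_t)channels * bytes_per_sample;
    double frames = (double)bytes / (double)frame_bytes;
    return 1000.0 * frames / (double)sample_rate;
}
```

For example, a 4096-byte buffer of stereo short audio holds 1024 frames, which at an assumed 44100 Hz is roughly 23 ms. Each additional application that buffers the same amount adds about that much again.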


Licensing: AUSS is intended to be open source. There is no reason someone should not have access to AUSS source code. AUSS requires community support to survive and Open Source would provide that community. A BSD, LGPL, or GPL license would be acceptable. Libraries could be LGPL while the applications could be GPL.


Target Platform: Linux x86 and, hopefully, other Unices on other platforms. Development should be done in C/C++ and Java. The smaller utilities will be written in C. The configuration manager will be GUI driven and thus written in cross-platform Java. The C utilities can be ported to Java to further enable cross-platform support.



Implementation: Provide helpful tools for developers and users of AUSS such that AUSS will be usable.

Demuxer: Demuxer is a server which mixes the audio streams of its incoming connections and outputs the mixed audio to STDOUT. Each connection is expected to stream raw audio. All streams are expected to be synchronized based on when they connect. If a new stream connects, the output is altered to mix the new stream in. If one disconnects, the output compensates and mixes the old stream out. Buffering will be optional; without it, only a small buffer of a few bytes will be kept for mixing purposes. Demuxer also has to deal with mixing interleaved sound. Demuxer will not write to STDOUT until it has data to write.


Example use:

./demux 8888 | play - #play the mixing of connections on port 8888
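
The mixing step itself could look something like the sketch below. This document does not say how streams are combined, so summing with clamping to the 16-bit range is an assumption, and the name mix_s16 is hypothetical. Interleaved stereo needs no special handling as long as every stream uses the same layout, because left samples then line up with left samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Mix `n` samples of `in` into `acc`, clamping to the 16-bit range
 * so that loud streams saturate instead of wrapping around. */
void mix_s16(int16_t *acc, const int16_t *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t sum = (int32_t)acc[i] + (int32_t)in[i];
        if (sum > INT16_MAX) sum = INT16_MAX;
        if (sum < INT16_MIN) sum = INT16_MIN;
        acc[i] = (int16_t)sum;
    }
}
```

Calling this once per connected stream against a zeroed accumulator buffer gives the block to write to STDOUT.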


Muxer: The Muxer is an application that sends the data piped in on STDIN across sockets to various hosts and ports. Muxer will not output data to STDOUT or STDERR. Muxer will not read from STDIN until it has connected to as many of the hosts as possible. The Muxer is meant to copy one stream across the network into many audio streams.


soundcommand | ./mux localhost:8888 otherhost:9999 localhost:8889
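
A minimal sketch of how the Muxer might parse one host:port argument before connecting. The function name parse_target and its signature are hypothetical; only standard C library calls are used.

```c
#include <stdlib.h>
#include <string.h>

/* Split "host:port" into its parts.
 * Returns 0 on success, -1 if the argument is malformed. */
int parse_target(const char *spec, char *host, size_t host_size, int *port)
{
    const char *colon = strrchr(spec, ':');
    if (colon == NULL || colon == spec)
        return -1;                      /* no colon, or empty host */
    size_t host_len = (size_t)(colon - spec);
    if (host_len >= host_size)
        return -1;                      /* host does not fit in the buffer */
    char *end;
    long value = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
        return -1;                      /* port missing, trailing junk, or out of range */
    memcpy(host, spec, host_len);
    host[host_len] = '\0';
    *port = (int)value;
    return 0;
}
```

An argument without a colon is rejected, so a host and port separated by a space would be reported as two malformed targets rather than silently ignored.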

Pipe Splitter: The Pipe Splitter duplicates a stream, or divides it by channel, to STDOUT and STDERR, allowing each channel to be processed separately. All input from STDIN would be either mirrored (the whole stream copied to both outputs) or split (one channel to each output).
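
The split mode could be sketched as below for stereo short input. The left-first layout follows the glossary entry for Interleave; the name split_s16, and the choice of which channel goes to STDOUT and which to STDERR, are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Separate an interleaved stereo buffer (L R L R ...) into two mono buffers.
 * `in` holds 2 * frames samples; `left` and `right` hold `frames` each. */
void split_s16(const int16_t *in, int16_t *left, int16_t *right, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        left[i]  = in[2 * i];
        right[i] = in[2 * i + 1];
    }
}
```

The splitter would then write the left buffer to one descriptor and the right buffer to the other.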


Socket Connector: The Socket Connector is a server which takes commands from other programs telling it where to send its signal. Basically, programs connect to this server by pipes or sockets, and the server passes their audio streams on to other sockets. This would enable dynamic configuration of modules in the system. Only one socket connector should be needed, since one process could listen on the necessary ports and connect to the necessary ports. For instance, you could mux to the socket connector; the socket connector could then be told through a configuration program how and where to route your audio. Potentially the Socket Connector could have the demux and mux built right into it for efficient demuxing and muxing of data.
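
One way to picture the routing state inside the Socket Connector is a small table of source-to-destination patches. Everything in this sketch is hypothetical: the integer stream ids, the fixed table size, and the function names are illustrations only, and the command protocol that would drive them is not defined here.

```c
#include <stddef.h>

#define MAX_ROUTES 32                 /* arbitrary limit for this sketch */

struct route { int src; int dst; };   /* hypothetical stream ids */

struct route_table {
    struct route routes[MAX_ROUTES];
    size_t count;
};

/* Patch src to dst. Returns 0 on success (or if already patched),
 * -1 if the table is full. */
int route_connect(struct route_table *t, int src, int dst)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->routes[i].src == src && t->routes[i].dst == dst)
            return 0;
    if (t->count == MAX_ROUTES)
        return -1;
    t->routes[t->count].src = src;
    t->routes[t->count].dst = dst;
    t->count++;
    return 0;
}

/* Remove the patch src -> dst. Returns 0 if removed, -1 if not found. */
int route_disconnect(struct route_table *t, int src, int dst)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->routes[i].src == src && t->routes[i].dst == dst) {
            t->count--;
            t->routes[i] = t->routes[t->count];   /* fill the gap with the last entry */
            return 0;
        }
    }
    return -1;
}

/* Copy up to `max` destinations of `src` into `out`; returns how many were copied. */
size_t route_targets(const struct route_table *t, int src, int *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i < t->count && n < max; i++)
        if (t->routes[i].src == src)
            out[n++] = t->routes[i].dst;
    return n;
}
```

A configuration program would change this table at run time, and the connector would forward each incoming block to every destination returned by route_targets.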


Configuration Manager: The configuration manager maintains and manages the services that are provided and running. The configuration manager would talk to the socket connector and provide an interface to easily hook streams together. It will show a visual or graphical representation of the connections formed.


AUSS LIB: AUSS LIB is a library which provides easy-to-use interfaces so that programs can transmit audio over the network as well as through STDOUT. The library will also provide easy-to-use interfaces for connecting to the Socket Connector. Possible languages for the library:

C

C++

Java

Perl


Example Programs: These programs will help educate the user of AUSS on how they can use the system:

White Noise Generator

Sine Wave Generator

CrossFader – This provides a GUI to enable realtime cross fading of streams.

Volume Control – This provides a GUI to enable realtime volume control of streams.
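
The signal processing behind the Volume Control and CrossFader examples is small enough to sketch here, without the GUI. The function names and the linear fade law are assumptions; samples are 16-bit shorts and results are clamped to that range.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a value to the 16-bit sample range (fraction is truncated). */
static int16_t clamp_s16(double v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Volume control: scale `n` samples in place by `gain` (1.0 = unchanged). */
void apply_gain_s16(int16_t *buf, size_t n, double gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = clamp_s16(buf[i] * gain);
}

/* Linear cross fade: pos = 0.0 gives only `a`, pos = 1.0 gives only `b`. */
void crossfade_s16(const int16_t *a, const int16_t *b, int16_t *out,
                   size_t n, double pos)
{
    for (size_t i = 0; i < n; i++)
        out[i] = clamp_s16((1.0 - pos) * a[i] + pos * b[i]);
}
```

In a realtime program the GUI would update gain or pos while a loop reads a block from STDIN, applies the function, and writes the block to STDOUT.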


Available similar projects:

JackIt – Written in C for POSIX systems. Not multi-language compatible. Meant for low latency writing to the sound card. Not meant for patching audio applications together.

EsounD – EsounD is not meant for patching audio applications. It is meant to demux audio and mix it before sending audio out to the soundcard.

NAS – The Network Audio System is a stable system intended for playing to the soundcard, not for patching.

PD – Patches are patched and linked together INSIDE of the program to produce sound. PD cannot mix the audio of external applications easily.

Most projects are concerned with mixing audio so that multiple applications can play audio at one time. Systems which allow patching only allow it internally and do not allow external programs to be used without conforming to an API.


Documentation: The project will be documented from a developer and user perspective. There will be a user manual, a reference, an API specification and probably some form of development documentation (requirements analysis and design).


GLOSSARY:

Channel – one monaural stream of audio.

Float – Floating point number, usually 32 bits in size.

Interleave – Combine two or more channels of audio into one stream by having each successive sample belong to a different channel. E.g. LEFT – RIGHT – LEFT – RIGHT – ...

Mono / Monaural – Single channel audio.

Multichannel – Two or more streams of audio like stereo or 5.1. Generally interleaved on output.

Patch – To connect two units together, much like audio patch cables. Patches provide a transport for audio between units.

Raw – Headerless data.

Realtime – A system in which the audio is immediately played to a sound output device at the rate at which it is heard.

Short – 16-bit integer number.

Stereo – Dual channel audio, usually interleaved.