C++ API

Have a look at the Doxygen documentation.

 

The API is divided into two sections: Machine Learning and Signal Processing. Machine Learning provides algorithms to train models that can help interpret complex multimodal data. Signal Processing can be used upstream of Machine Learning, to turn raw sensor, audio, or video data into something more usable or to extract features. It can also be used downstream, on the output of a model, for example to smooth out a jittery classifier.
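As an illustration of post-processing model output, the sketch below smooths a jittery stream of class labels with a sliding majority vote. It is a minimal standalone example and not part of the rapidmix API: the class name LabelSmoother and its process() method are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>

// Hypothetical helper (not part of rapidmix): smooths a stream of class
// labels by returning the most frequent label among the last N inputs.
class LabelSmoother {
public:
    explicit LabelSmoother(std::size_t windowSize) : windowSize_(windowSize) {
        assert(windowSize > 0);
    }

    // Push the newest raw label and return the smoothed label.
    int process(int rawLabel) {
        window_.push_back(rawLabel);
        if (window_.size() > windowSize_) {
            window_.pop_front();
        }

        // Count how often each label occurs in the current window.
        std::map<int, std::size_t> counts;
        for (int label : window_) {
            ++counts[label];
        }

        // Pick the label with the highest count; ties go to the smallest label
        // because std::map iterates in ascending key order.
        int best = rawLabel;
        std::size_t bestCount = 0;
        for (const auto& entry : counts) {
            if (entry.second > bestCount) {
                best = entry.first;
                bestCount = entry.second;
            }
        }
        return best;
    }

private:
    std::size_t windowSize_;
    std::deque<int> window_;
};
```

With a window of 3, the input sequence 1, 1, 2, 1 yields 1 at every step, so the single stray 2 is suppressed. A longer window gives steadier output at the cost of slower reaction to genuine changes.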

The API is contained in the rapidmix namespace.

Machine Learning Classes

Machine learning classes are oriented towards four basic use cases, formed by combining two kinds of input (static or temporal) with two kinds of output (classification or regression). These terms are defined on the General Concepts page.

Basic algorithm classes:

  • rapidmix::staticClassification
  • rapidmix::staticRegression
  • rapidmix::temporalClassification
  • rapidmix::temporalRegression

For any machine learning algorithm class, there are at least three methods:

train(trainingData): takes a training data object as an argument and trains the model or models on those data.

run(inputVector): takes a vector of doubles as input, processes it with a trained model, and returns a vector of doubles as output.

reset(): resets any model to a default state.
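The train / run / reset pattern can be sketched with a toy model. The code below is a standalone illustration, not the rapidmix implementation: Example and NearestNeighbourModel are hypothetical names, and nearest-neighbour lookup is used only because it is short enough to read at a glance.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins, not the rapidmix types.
struct Example {
    std::vector<double> input;
    std::vector<double> output;
};

class NearestNeighbourModel {
public:
    // Store the examples; returns false if there is nothing to learn from.
    bool train(const std::vector<Example>& trainingSet) {
        if (trainingSet.empty()) {
            return false;
        }
        examples_ = trainingSet;
        return true;
    }

    // Return the output of the stored example closest to the input
    // (squared Euclidean distance). An untrained model returns an empty vector.
    std::vector<double> run(const std::vector<double>& input) const {
        std::vector<double> result;
        double bestDistance = std::numeric_limits<double>::infinity();
        for (const Example& example : examples_) {
            assert(example.input.size() == input.size());
            double distance = 0.0;
            for (std::size_t i = 0; i < input.size(); ++i) {
                const double diff = example.input[i] - input[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                result = example.output;
            }
        }
        return result;
    }

    // Forget everything learned so far.
    void reset() { examples_.clear(); }

private:
    std::vector<Example> examples_;
};
```

A caller would typically train once on recorded examples, call run() for every new input vector, and call reset() before training on a different data set.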

A trainingData object can hold data for any of the machine learning classes, and the same object can be used with several of them.

  • rapidmix::trainingData

It has the following methods:

startRecording() and startRecording("label")

stopRecording()

addElement(input, output)
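One plausible reading of these methods is that startRecording() opens a new (optionally labelled) phrase, addElement() appends an input/output pair to it, and stopRecording() closes it. The sketch below implements that reading in a standalone container. Everything in it is an assumption made for illustration: PhraseRecorder, Phrase, and Element are hypothetical names, and the behaviour of rapidmix::trainingData should be checked against the Doxygen documentation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a training-data container
// (not rapidmix::trainingData).
struct Element {
    std::vector<double> input;
    std::vector<double> output;
};

struct Phrase {
    std::string label;
    std::vector<Element> elements;
};

class PhraseRecorder {
public:
    // Open a new phrase, optionally tagged with a label.
    void startRecording(const std::string& label = "") {
        phrases_.push_back(Phrase{label, {}});
        recording_ = true;
    }

    // Close the current phrase; elements are rejected until the next start.
    void stopRecording() { recording_ = false; }

    // Append one input/output pair to the phrase being recorded.
    // Returns false if no recording is in progress.
    bool addElement(const std::vector<double>& input,
                    const std::vector<double>& output) {
        if (!recording_) {
            return false;
        }
        assert(!phrases_.empty());  // invariant: recording implies an open phrase
        phrases_.back().elements.push_back(Element{input, output});
        return true;
    }

    const std::vector<Phrase>& phrases() const { return phrases_; }

private:
    std::vector<Phrase> phrases_;
    bool recording_ = false;
};
```

Grouping elements into phrases keeps temporal examples (one gesture per phrase) separate, while static examples can simply be recorded into a single phrase.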

Signal Processing Classes

The rapidStream class provides basic signal processing methods. Users such as audio programmers, who are concerned with timing, locking, threading, and similar issues, should look at the PiPo section of the API. Maximilian also provides some functions for FFT-based feature extraction.

rapidStream

rapidStream is a class that implements a circular buffer, plus a collection of methods that compute a variety of common signal features on that buffer. It is driven by a pushToWindow(double input) method, and each feature is computed by calling the corresponding method. The available features include:

  • First-order difference: velocity()
  • Second-order difference: acceleration()
  • maximum() or minimum() value
  • sum(), mean(), and standard deviation of the values in the buffer
  • Maximum or minimum velocity or acceleration

rapidStream does not incorporate a notion of timetags or sample rate. For timing-critical processing, users will need pipoHost.
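To make the circular-buffer idea concrete, here is a minimal standalone sketch of a windowed feature extractor. It is not the rapidStream source: the class name WindowSketch is hypothetical, the buffer is assumed to start filled with zeros, and the differences are per-sample because no sample rate is involved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a windowed feature extractor
// (not rapidStream itself).
class WindowSketch {
public:
    explicit WindowSketch(std::size_t size) : buffer_(size, 0.0) {
        assert(size >= 3);  // acceleration needs three samples
    }

    // Overwrite the oldest sample with the newest one.
    void pushToWindow(double input) {
        buffer_[head_] = input;
        head_ = (head_ + 1) % buffer_.size();
    }

    // First-order difference between the two most recent samples.
    double velocity() const { return recent(0) - recent(1); }

    // Second-order difference over the three most recent samples.
    double acceleration() const {
        return (recent(0) - recent(1)) - (recent(1) - recent(2));
    }

    // Arithmetic mean of all values currently in the buffer.
    double mean() const {
        double total = 0.0;
        for (double v : buffer_) {
            total += v;
        }
        return total / static_cast<double>(buffer_.size());
    }

    // Population standard deviation of the values in the buffer.
    double standardDeviation() const {
        const double m = mean();
        double sumSquares = 0.0;
        for (double v : buffer_) {
            sumSquares += (v - m) * (v - m);
        }
        return std::sqrt(sumSquares / static_cast<double>(buffer_.size()));
    }

private:
    // Sample pushed `age` steps ago (0 = newest).
    double recent(std::size_t age) const {
        const std::size_t n = buffer_.size();
        return buffer_[(head_ + n - 1 - age) % n];
    }

    std::vector<double> buffer_;
    std::size_t head_ = 0;  // index of the next slot to overwrite
};
```

Pushing 1, 2, 4, 8 into a window of four samples gives a velocity of 4 and an acceleration of 2. Pushing a fifth value overwrites the oldest one, which is what keeps memory use constant.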

pipoHost

pipoHost is a minimal implementation of a PiPo host, based on the now open-source PiPo library. It comes with a collection of 23 native PiPo modules and allows easy integration of third-party modules through its SDK. The pipoHost class can instantiate a graph of PiPo modules (as a single PiPo module) from a description string, and it exposes the parameters of that graph through setAttr() and getAttr() methods. It also has a getLastFrameOut() method and a virtual callback method, onNewFrameOut(), which classes inheriting from pipoHost are intended to override to suit their own needs.

It must be initialized by its setInputStreamAttributes(pipoStreamAttributes sa) method and driven by its frames(double time, double weight, PiPoValue *values, unsigned int size, unsigned int num) method.
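The "drive with frames, receive results in an overridden callback" pattern can be sketched without the PiPo library. The code below is a deliberately simplified, hypothetical host: FrameHostSketch, setGain(), and the reduced frames() signature are assumptions made for illustration and do not match the real pipoHost signatures, which are given above and in the Doxygen documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host (not the real pipoHost): applies a gain to each incoming
// frame, remembers the result, and notifies subclasses via a virtual callback.
class FrameHostSketch {
public:
    virtual ~FrameHostSketch() = default;

    void setGain(double gain) { gain_ = gain; }

    // Feed one frame of `size` values stamped with `time`.
    void frames(double time, const float* values, std::size_t size) {
        assert(values != nullptr || size == 0);
        lastFrame_.assign(size, 0.0f);
        for (std::size_t i = 0; i < size; ++i) {
            lastFrame_[i] = static_cast<float>(values[i] * gain_);
        }
        onNewFrameOut(time, lastFrame_);
    }

    // Polling alternative to the callback.
    const std::vector<float>& getLastFrameOut() const { return lastFrame_; }

protected:
    // Default does nothing; subclasses override to receive each output frame.
    virtual void onNewFrameOut(double /*time*/,
                               const std::vector<float>& /*frame*/) {}

private:
    double gain_ = 1.0;
    std::vector<float> lastFrame_;
};

// Example subclass: counts frames and keeps the latest timestamp.
class CountingHost : public FrameHostSketch {
public:
    int frameCount = 0;
    double lastTime = 0.0;

protected:
    void onNewFrameOut(double time, const std::vector<float>&) override {
        ++frameCount;
        lastTime = time;
    }
};
```

The callback suits code that must react to every output frame as it is produced, while polling the last frame suits code that only needs the most recent value, such as a UI refresh.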

Maximilian feature extraction

A set of audio feature extractors is included as part of the Maximilian library.

The maxiFFT class is exposed as FFT. Here’s a short example:

Once you’ve run an FFT, several useful functions are available:

  • myFFT.spectralCentroid() returns a float representing the frequency of the “centre of mass” of the spectrum. Higher values mean a brighter spectrum.
  • myFFT.spectralFlatness() returns a float measuring how flat the spectrum is. Flat, noise-like spectra typically score higher than peaky, tonal ones.
  • myFFT.magsToDB() returns a pointer to an array holding the magnitude of each FFT bin, in decibels.
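For readers who want to see what the centroid and flatness measure, the sketch below computes both from an array of bin magnitudes using the common textbook definitions. It is a standalone illustration and not Maximilian's code: the free functions and the binWidthHz parameter are assumptions, doubles are used instead of floats, and Maximilian's own scaling may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Textbook spectral centroid: the magnitude-weighted mean of the bin
// frequencies. binWidthHz is sampleRate / fftSize (an assumed parameter).
// Returns 0 for a silent spectrum.
double spectralCentroid(const std::vector<double>& magnitudes,
                        double binWidthHz) {
    double weighted = 0.0;
    double total = 0.0;
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        weighted += static_cast<double>(k) * binWidthHz * magnitudes[k];
        total += magnitudes[k];
    }
    return total > 0.0 ? weighted / total : 0.0;
}

// Textbook spectral flatness: geometric mean divided by arithmetic mean.
// Equals 1 for a perfectly flat spectrum and approaches 0 for a peaky one.
double spectralFlatness(const std::vector<double>& magnitudes) {
    assert(!magnitudes.empty());
    double logSum = 0.0;
    double sum = 0.0;
    for (double m : magnitudes) {
        if (m <= 0.0) {
            return 0.0;  // any empty bin makes the geometric mean zero
        }
        logSum += std::log(m);
        sum += m;
    }
    const double n = static_cast<double>(magnitudes.size());
    return std::exp(logSum / n) / (sum / n);
}
```

With 100 Hz bins, a spectrum whose only energy is in bin 1 has a centroid of 100 Hz, while equal energy in bins 0 to 3 moves the centroid to 150 Hz and gives a flatness of 1.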

Documentation TODO:

  • maxiIFFT
  • maxiFFTOctaveAnalyser

The maxiMFCC class is exposed as MFCC. Here’s a short example: