Camera automation based on microphone location data

A high-level overview of the components required for camera automation based on microphones that provide speaker location data

The goal of this article is to provide a high-level overview of the possibilities for automation based on microphones that provide positional data for the active speaker.

As each automation setup is unique, we don’t include ready-made code here: the code you write depends entirely on your intended goals.

The examples below are not necessarily limited to Shure’s microphone panels. In reality, any microphone system with an API that provides some kind of speaker location can be used.

Shure is used in these examples because it is the hardware we have worked with most often, and in our experience it provides the most accurate location data.

Basic Setup

Before you start, decide on a central service that will hold all the logic for connecting to APIs, parsing information, and sending control commands to the relevant devices (the “brain” of the automation). At Seervision, we mostly use Node-RED for this, and all Seervision servers offer a Node-RED instance by default. If you have a Seervision server, you can access this Node-RED interface at the IP address of the Seervision server, on port 1880 (as an example, on the LAN at our office, it would be

Next, you should configure your microphone array correctly. In the case of the Shure MXA920, this will include configuring it via its web interface (e.g. microphone height, speaker height), but this varies between hardware. It’s best to contact your microphone manufacturer’s representative to make sure you get the configuration right.

Once your hardware setup is complete, your first step should be to write the logic to access the microphone and start receiving its data. For Shure’s MXA920, the documentation is available here.
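As a rough illustration of that first step, here is a minimal Python sketch of a listener. It assumes the microphone sends ASCII messages framed by "<" and ">" over a TCP connection, in the style of Shure's command strings, and that the control port is 2202. Both should be checked against the documentation for your device. The function names and the LOBE_GATE parameter name in the comments are hypothetical and only serve the example.

```python
import socket
from typing import Iterator, List, Optional, Tuple


def extract_messages(buffer: str) -> Tuple[List[str], str]:
    """Split a receive buffer into complete '< ... >' messages.

    Returns the message bodies plus whatever incomplete tail is left over,
    so the tail can be prepended to the next chunk read from the socket.
    """
    messages = []
    while True:
        start = buffer.find("<")
        if start == -1:
            return messages, ""  # no message start: drop any noise
        end = buffer.find(">", start + 1)
        if end == -1:
            return messages, buffer[start:]  # message not complete yet
        messages.append(buffer[start + 1:end].strip())
        buffer = buffer[end + 1:]


def parse_report(message: str) -> Optional[dict]:
    """Parse a 'REP [channel] PARAMETER value' message (assumed format)."""
    tokens = message.split()
    if len(tokens) < 3 or tokens[0] != "REP":
        return None
    if tokens[1].isdigit() and len(tokens) >= 4:
        # e.g. "REP 1 LOBE_GATE ON" (LOBE_GATE is a made-up parameter name)
        return {"channel": int(tokens[1]), "parameter": tokens[2],
                "value": " ".join(tokens[3:])}
    return {"channel": None, "parameter": tokens[1],
            "value": " ".join(tokens[2:])}


def listen(host: str, port: int = 2202) -> Iterator[dict]:
    """Yield parsed reports until the microphone closes the connection."""
    with socket.create_connection((host, port)) as conn:
        pending = ""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            text = pending + chunk.decode("ascii", errors="replace")
            messages, pending = extract_messages(text)
            for message in messages:
                report = parse_report(message)
                if report is not None:
                    yield report
```

Keeping the framing and parsing separate from the socket loop means they can be tested without any hardware connected. The same logic can be ported to a Node-RED function node if that is where your “brain” lives.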


The last step in the basic setup is to decide what you want your automation to do, i.e. to come up with a couple of automation scenarios. Try to write them down as bullet points in the form of If This Then That. For example: If Lobe 1 on my microphone activates, then switch to Input 1 in my vision mixer. Having these scenarios clearly defined will make it easier to convert them to code later on.
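One possible way to carry such If This Then That bullet points over into code is a small rule table. This is only a sketch: the event fields ("lobe", "active") and the action names are hypothetical, and your own events will look different depending on your microphone and its API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    description: str                    # the bullet point, in plain words
    condition: Callable[[dict], bool]   # "if this"
    action: str                         # "then that" (a name you act on later)


# Hypothetical scenarios; replace them with your own bullet points.
RULES = [
    Rule("If Lobe 1 activates, switch to Input 1",
         lambda event: event.get("lobe") == 1 and bool(event.get("active")),
         "switch_to_input_1"),
    Rule("If Lobe 2 activates, switch to Input 2",
         lambda event: event.get("lobe") == 2 and bool(event.get("active")),
         "switch_to_input_2"),
]


def actions_for(event: dict, rules: List[Rule] = RULES) -> List[str]:
    """Return the actions of every rule whose condition matches the event."""
    return [rule.action for rule in rules if rule.condition(event)]
```

For instance, actions_for({"lobe": 1, "active": True}) returns ["switch_to_input_1"], while an event with "active" set to False matches no rule.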

Automated Camera Switching

If you wish to automatically switch the active camera based on microphone input, you will need a way to interact with your vision mixer, which usually offers an API as well. At Seervision, we use vMix (their API documentation is available here), but most vision mixers have some kind of API (we’ve also done it with Blackmagic ATEM Minis, for example).

Once your automation “brain” can communicate with the vision mixer, it is a matter of writing your logic on top of the data from both APIs. For example: if lobe 1 on the MXA920 activates (Shure API), switch to Input 1 on vMix (vMix API).
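A minimal sketch of the vision mixer side of that example, in Python. It assumes the vMix Web Controller is enabled and reachable on its default port 8088, and that a "Cut" function call with an Input number is what you want. Check the vMix API documentation for the function that best fits your workflow. The lobe-to-input mapping and the helper names are hypothetical.

```python
from typing import Callable, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical mapping: which vision mixer input shows which microphone lobe.
LOBE_TO_INPUT = {1: 1, 2: 2}


def vmix_cut_url(input_number: int, host: str = "127.0.0.1",
                 port: int = 8088) -> str:
    """Build the HTTP API URL that asks vMix to cut to the given input."""
    query = urlencode({"Function": "Cut", "Input": input_number})
    return f"http://{host}:{port}/api/?{query}"


def http_get(url: str) -> None:
    """Send the request; a short timeout avoids blocking the automation."""
    with urlopen(url, timeout=2) as response:
        response.read()


def switch_for_lobe(lobe: int,
                    send: Callable[[str], None] = http_get) -> Optional[str]:
    """Switch to the input mapped to the lobe; return the URL that was sent."""
    input_number = LOBE_TO_INPUT.get(lobe)
    if input_number is None:
        return None  # no camera is assigned to this lobe
    url = vmix_cut_url(input_number)
    send(url)
    return url
```

Passing the sender in as a parameter makes it possible to try the logic without a running vision mixer, for example with switch_for_lobe(1, send=print).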

Microphone Speaker Tracking with Seervision

This is the most advanced use case, and Seervision will need to work with you to set it up. Before we can do so, you must already have:

  • An active Node-RED instance that is connected to your microphone and is receiving data on the current active speaker
  • Logic that tells us which pan/tilt/zoom values the PTZ camera should move to

Once both of these are set up, we will provide you with the relevant interface, which consumes your pan/tilt/zoom inputs and sends them to the Seervision Suite to be executed as a movement that tracks the speaker.
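To give an idea of what the pan/tilt/zoom logic in the second prerequisite might involve, here is a geometric sketch in Python. It rests on several assumptions that will not hold for every installation: the microphone reports the speaker position in metres, the camera position is known in the same coordinate frame, pan is measured from the +x axis, tilt from the horizontal plane, and zoom is expressed as the horizontal field of view needed to frame the speaker. The units and ranges your camera or the Seervision interface expects may well differ, and all names here are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Point:
    """A position in a shared room coordinate frame, in metres (assumed)."""
    x: float
    y: float
    z: float


def pan_tilt_zoom(camera: Point, speaker: Point,
                  frame_width_m: float = 1.5) -> Tuple[float, float, float]:
    """Return (pan, tilt, field of view) in degrees to frame the speaker.

    frame_width_m is the width of the scene that should be visible at the
    speaker's distance; 1.5 m is an arbitrary illustrative default.
    """
    dx = speaker.x - camera.x
    dy = speaker.y - camera.y
    dz = speaker.z - camera.z

    ground_distance = math.hypot(dx, dy)
    distance = math.hypot(ground_distance, dz)

    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, ground_distance))
    # Angle subtended by frame_width_m at the speaker's distance.
    field_of_view = math.degrees(2 * math.atan2(frame_width_m / 2, distance))
    return pan, tilt, field_of_view
```

With the camera at the origin and a speaker at (1, 1, 0), this gives a pan of 45 degrees and a tilt of 0 degrees. Translating the field of view into an actual zoom value depends on the lens of your camera.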