Understanding audio_policy_configuration.xml

Think of it as a wiring map for Android’s audio: it lists every “kind” of stream the HAL can produce/consume, every physical/virtual device, and which stream types are allowed to go to which devices.

What the big blocks mean

  • <globalConfiguration …>
    Global switches. Here:
    • speaker_drc_enabled="true": enable dynamic range control on speaker.
    • call_screen_mode_supported="true": the platform supports call‑screen mode.
  • <modules>
    One block per audio HAL module:
    • primary: the main on‑board codec/DSP (earpiece, speaker, headset, mics, BT SCO, USB, HDMI, FM, telephony).
    • a2dp: Bluetooth A2DP input (for BT source devices streaming into the phone).
    • usb: USB accessory mode (the phone is attached as a USB accessory and sends audio out to the USB host).
    • remote_submix: included from another file (virtual device for audio capture/playback between apps, e.g., cast/mirroring).
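
As a rough sketch, the top level of the file has approximately this shape. The two globalConfiguration attributes come from the description above; the version attribute, the halVersion values, and the name of the included remote submix file are assumptions based on typical AOSP-style configs and may differ on a given device.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<audioPolicyConfiguration version="1.0"
                          xmlns:xi="http://www.w3.org/2001/XInclude">
    <!-- Global switches that apply to the whole policy -->
    <globalConfiguration speaker_drc_enabled="true"
                         call_screen_mode_supported="true"/>

    <modules>
        <!-- One <module> per audio HAL; contents elided in this sketch -->
        <module name="primary" halVersion="2.0">
            <!-- attachedDevices, mixPorts, devicePorts, routes -->
        </module>
        <module name="a2dp" halVersion="2.0">
            <!-- Bluetooth A2DP input -->
        </module>
        <module name="usb" halVersion="2.0">
            <!-- USB accessory output -->
        </module>

        <!-- remote_submix is pulled in from its own file (file name assumed) -->
        <xi:include href="r_submix_audio_policy_configuration.xml"/>
    </modules>
</audioPolicyConfiguration>
```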

Each module contains:

  • attachedDevices: always‑present hardware endpoints.
  • defaultOutputDevice: fallback route when nothing else matches (Speaker).
  • mixPorts: logical streams the HAL exposes.
    • role="source" = an output stream (to be played out).
    • role="sink" = an input stream (to record into).
    • Each profile gives the allowed format, sample rates, and channel masks for that stream.
    • flags tell Android how to treat the stream (low‑latency, deep buffer, direct, offload, VoIP, etc.).
  • devicePorts: the actual devices: speakers, mics, BT, USB, HDMI, etc., with their capabilities.
  • routes: the routing rules: which mixPorts (sources) may connect to which devices (sinks), or which devices (sources) may feed which input mixPorts (sinks).
  • Volume tables are included at the end to define per‑use‑case volume curves.
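
To make the pieces concrete, here is a minimal, illustrative module with one output path and one input path. The device types, formats, and rates are assumptions chosen for the example, not a copy of any real device's file.

```xml
<module name="primary" halVersion="2.0">
    <!-- Endpoints that exist from boot, referenced by devicePort tagName -->
    <attachedDevices>
        <item>Speaker</item>
        <item>Built-In Mic</item>
    </attachedDevices>
    <!-- Fallback output when nothing else matches -->
    <defaultOutputDevice>Speaker</defaultOutputDevice>

    <mixPorts>
        <!-- role="source": an output stream (audio to be played out) -->
        <mixPort name="primary output" role="source"
                 flags="AUDIO_OUTPUT_FLAG_FAST|AUDIO_OUTPUT_FLAG_PRIMARY">
            <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                     samplingRates="48000"
                     channelMasks="AUDIO_CHANNEL_OUT_STEREO"/>
        </mixPort>
        <!-- role="sink": an input stream (audio being recorded) -->
        <mixPort name="primary input" role="sink">
            <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                     samplingRates="8000,16000,48000"
                     channelMasks="AUDIO_CHANNEL_IN_MONO,AUDIO_CHANNEL_IN_STEREO"/>
        </mixPort>
    </mixPorts>

    <devicePorts>
        <!-- Output devices are sinks; input devices are sources -->
        <devicePort tagName="Speaker" type="AUDIO_DEVICE_OUT_SPEAKER"
                    role="sink"/>
        <devicePort tagName="Built-In Mic" type="AUDIO_DEVICE_IN_BUILTIN_MIC"
                    role="source"/>
    </devicePorts>

    <routes>
        <!-- Playback: mixPort (source) into device (sink) -->
        <route type="mix" sink="Speaker" sources="primary output"/>
        <!-- Capture: device (source) into mixPort (sink) -->
        <route type="mix" sink="primary input" sources="Built-In Mic"/>
    </routes>
</module>
```

Note how the names tie together: each route refers to a mixPort by its name and to a devicePort by its tagName, so a typo in either silently breaks the path.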

Primary module, in simple terms

Outputs you can open (mixPorts, role="source")

  • primary output: the default fast/primary PCM 48 kHz stereo path for most media/UI.
  • raw: fast path with minimal processing (low latency).
  • deep_buffer: power‑efficient path with larger buffers (higher latency, better battery).
  • hifi_playback: placeholder for Hi‑Fi playback via USB (no profiles here; the device decides).
  • compress_passthrough: send compressed bitstreams directly (e.g., to HDMI) without decoding.
  • direct_pcm: open a “direct” high‑quality PCM stream with many rates (8 kHz to 384 kHz) and channel counts (up to 7.1).
  • compressed_offload: hardware offload for compressed formats (MP3/AAC/FLAC/ALAC/APE/DTS/WMA/Vorbis, etc.). Saves CPU/battery.
  • dsd_compress_passthrough: DSD bitstream passthrough (SACD style) to a capable sink.
  • voice_tx: the uplink stream to the modem/telephony.
  • voip_rx: a special direct output used for VoIP playback (e.g., 8/16/32/48 kHz mono/stereo).
  • incall_music_uplink: inject music into a call uplink (karaoke/tones during call).

Inputs you can record (mixPorts, role="sink")

  • primary input: generic recording path (mono/stereo/front‑back up to 48 kHz).
  • usb_surround_sound: multichannel USB capture (up to 7.1 and up to 192 kHz/float/32‑bit).
  • voip_tx: special input for VoIP transmit (mono, 8–48 kHz).
  • surround_sound: device mic array capture supporting index masks/5.1.
  • record_24: higher‑resolution capture (24‑bit packed, 8_24, or float up to 192 kHz).
  • voice_rx: capture downlink from telephony (to tap the far‑end audio).
  • hifi_input: USB capture handed to the Hi‑Fi path.

Output devices (devicePorts, role="sink")

Earpiece, Speaker, Wired Headset/Headphones, Line‑out, BT SCO (mono narrow/wideband), Telephony Tx (to modem), HDMI, Proxy (virtual), FM out, BT A2DP (out/headphones/speaker with SBC/AAC/APTX/*/LDAC), USB Device/Headset Out.

Input devices (devicePorts, role="source")

Built‑in mic(s), FM Tuner, Wired Headset Mic, BT SCO Headset Mic, Telephony Rx (from modem), USB Device/Headset In.

The routing rules (how streams connect to devices)

  • Music‑type outputs (primary output, raw, deep_buffer, direct_pcm, compressed_offload, sometimes dsd…, voip_rx) are allowed to go to: Earpiece, Speaker, Wired (HS/HP), Line, BT SCO (for telephony/VoIP), HDMI, Proxy, FM, USB outs, and A2DP outs. The exact set for each device is whatever is listed under its <route sink="…">.
  • Call paths:
    • To modem (uplink): voice_tx and incall_music_uplink → Telephony Tx.
    • From modem (downlink): Telephony Rx → voice_rx (so the system can consume it).
  • Recording:
    • primary input can pull from Built‑In Mics, Wired HS Mic, BT SCO HS Mic, FM Tuner.
    • High‑res/array inputs (record_24, surround_sound) pull from built‑in mics as defined.
    • USB multichannel: usb_surround_sound pulls from USB Device/Headset In.
    • VoIP transmit: voip_tx from Built‑In/Back/BT SCO/USB mics.
    • Hi‑Fi input: hifi_input from USB Device/Headset In.
  • Default device: Speaker if nothing else forces a choice.
  • attachedDevices: always present on boot (earpiece, speaker, telephony, built‑in mics, FM, etc.).
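
The rules above map onto route entries along these lines. This is a hedged sketch: the mixPort names follow the lists earlier in the article, but the device tagNames (for example "USB Device In") and the exact source lists are assumptions and will vary per device.

```xml
<routes>
    <!-- Playback: several output mixPorts may share one device sink -->
    <route type="mix" sink="Speaker"
           sources="primary output,raw,deep_buffer,direct_pcm,compressed_offload,voip_rx"/>

    <!-- Call uplink: streams that feed the modem -->
    <route type="mix" sink="Telephony Tx"
           sources="voice_tx,incall_music_uplink"/>

    <!-- Call downlink: modem audio made available through voice_rx -->
    <route type="mix" sink="voice_rx" sources="Telephony Rx"/>

    <!-- Recording: the input mixPort is the sink, devices are the sources -->
    <route type="mix" sink="primary input"
           sources="Built-In Mic,Wired Headset Mic,BT SCO Headset Mic,FM Tuner"/>
    <route type="mix" sink="hifi_input"
           sources="USB Device In,USB Headset In"/>
</routes>
```

Each route names exactly one sink and a comma-separated list of sources, so "which outputs may reach the speaker" is answered by reading the single route whose sink is the speaker.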

Other modules

  • A2DP module
    Provides a2dp input (recording from a BT source device). The devicePort is BT A2DP In, and a route connects it to the a2dp input mixPort.
  • USB module
    Provides usb_accessory output at 44.1 kHz stereo to USB Host Out (used when the phone is attached as a USB accessory and the external USB host receives the audio).
  • Remote Submix
    Brought in via xi:include. Used for virtual capture/playback between apps (e.g., screen cast).
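
For comparison with the primary module, the two small modules might look roughly like this. The 44.1 kHz stereo USB profile comes from the description above; the halVersion values, the A2DP input rates, and the 16-bit formats are assumptions for illustration.

```xml
<modules>
    <module name="a2dp" halVersion="2.0">
        <mixPorts>
            <!-- Input stream: records what a BT source device sends -->
            <mixPort name="a2dp input" role="sink">
                <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                         samplingRates="44100,48000"
                         channelMasks="AUDIO_CHANNEL_IN_MONO,AUDIO_CHANNEL_IN_STEREO"/>
            </mixPort>
        </mixPorts>
        <devicePorts>
            <devicePort tagName="BT A2DP In"
                        type="AUDIO_DEVICE_IN_BLUETOOTH_A2DP" role="source"/>
        </devicePorts>
        <routes>
            <route type="mix" sink="a2dp input" sources="BT A2DP In"/>
        </routes>
    </module>

    <module name="usb" halVersion="2.0">
        <mixPorts>
            <!-- Output stream sent to the USB host in accessory mode -->
            <mixPort name="usb_accessory output" role="source">
                <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                         samplingRates="44100"
                         channelMasks="AUDIO_CHANNEL_OUT_STEREO"/>
            </mixPort>
        </mixPorts>
        <devicePorts>
            <devicePort tagName="USB Host Out"
                        type="AUDIO_DEVICE_OUT_USB_ACCESSORY" role="sink"/>
        </devicePorts>
        <routes>
            <route type="mix" sink="USB Host Out" sources="usb_accessory output"/>
        </routes>
    </module>
</modules>
```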

Flags you’ll see (why they matter)

  • AUDIO_OUTPUT_FLAG_FAST: low‑latency mixer path (small buffers, strict formats).
  • AUDIO_OUTPUT_FLAG_PRIMARY: the “main” stream the system prefers.
  • AUDIO_OUTPUT_FLAG_DEEP_BUFFER: power‑saving playback (bigger buffers).
  • AUDIO_OUTPUT_FLAG_DIRECT: bypass the mixer; app must match device/profile exactly (used for hi‑res or VoIP).
  • AUDIO_OUTPUT_FLAG_COMPRESS_OFFLOAD: hardware decodes compressed audio to save CPU/battery.
  • AUDIO_OUTPUT_FLAG_NON_BLOCKING: offload writes are asynchronous.
  • AUDIO_OUTPUT_FLAG_VOIP_RX / AUDIO_INPUT_FLAG_VOIP_TX: dedicated VoIP play/rec paths.
  • AUDIO_OUTPUT_FLAG_INCALL_MUSIC: inject audio into call uplink.
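
In the file, flags sit on the mixPort element, and multiple flags are combined with "|". The following sketch shows how a few of the streams described earlier might carry them; the specific flag combinations and profiles are assumptions modeled on common vendor configs, not taken from this particular file.

```xml
<mixPorts>
    <!-- Power-saving playback with large buffers -->
    <mixPort name="deep_buffer" role="source"
             flags="AUDIO_OUTPUT_FLAG_DEEP_BUFFER">
        <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                 samplingRates="48000"
                 channelMasks="AUDIO_CHANNEL_OUT_STEREO"/>
    </mixPort>

    <!-- Hardware decode: direct, offloaded, with asynchronous writes -->
    <mixPort name="compressed_offload" role="source"
             flags="AUDIO_OUTPUT_FLAG_DIRECT|AUDIO_OUTPUT_FLAG_COMPRESS_OFFLOAD|AUDIO_OUTPUT_FLAG_NON_BLOCKING">
        <profile name="" format="AUDIO_FORMAT_MP3"
                 samplingRates="44100,48000"
                 channelMasks="AUDIO_CHANNEL_OUT_MONO,AUDIO_CHANNEL_OUT_STEREO"/>
    </mixPort>

    <!-- Dedicated VoIP playback path -->
    <mixPort name="voip_rx" role="source"
             flags="AUDIO_OUTPUT_FLAG_DIRECT|AUDIO_OUTPUT_FLAG_VOIP_RX">
        <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                 samplingRates="8000,16000,32000,48000"
                 channelMasks="AUDIO_CHANNEL_OUT_MONO,AUDIO_CHANNEL_OUT_STEREO"/>
    </mixPort>

    <!-- Dedicated VoIP capture path (input flag on a sink mixPort) -->
    <mixPort name="voip_tx" role="sink"
             flags="AUDIO_INPUT_FLAG_VOIP_TX">
        <profile name="" format="AUDIO_FORMAT_PCM_16_BIT"
                 samplingRates="8000,16000,32000,48000"
                 channelMasks="AUDIO_CHANNEL_IN_MONO"/>
    </mixPort>
</mixPorts>
```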

How to read/extend this file quickly

  • To allow a new path, you need three things to line up:
    1. A mixPort that produces/consumes the needed format/rate/channels.
    2. A devicePort that supports those formats (or channelMasks="dynamic" etc.).
    3. A route that connects that mixPort to that devicePort.
  • Typical edits you might do:
    • Add a new USB DAC rate: extend the USB Device Out profile’s samplingRates.
    • Enable hi‑res direct output at 192 kHz over headphones: ensure direct_pcm supports it and add/keep a route to “Wired Headphones.”
    • Allow VoIP RX to speaker only: restrict the voip_rx routes to the Speaker sink.
    • Add a new mic array mode: add/adjust surround_sound/record_24 profiles and routes from the correct mic devicePorts.
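
As a worked example of the "three things line up" rule, here is a sketch of the hi‑res headphone case. The tagName "Wired Headphones", the 24‑bit packed format, and the other sources on the route are assumptions for illustration; whether 192 kHz actually plays also depends on the HAL and codec.

```xml
<module name="primary" halVersion="2.0">
    <mixPorts>
        <!-- 1. A mixPort whose profile advertises the format, rate, channels -->
        <mixPort name="direct_pcm" role="source"
                 flags="AUDIO_OUTPUT_FLAG_DIRECT">
            <profile name="" format="AUDIO_FORMAT_PCM_24_BIT_PACKED"
                     samplingRates="48000,96000,192000"
                     channelMasks="AUDIO_CHANNEL_OUT_STEREO"/>
        </mixPort>
        <!-- other mixPorts elided -->
    </mixPorts>

    <devicePorts>
        <!-- 2. A devicePort that accepts the same format, rate, channels -->
        <devicePort tagName="Wired Headphones"
                    type="AUDIO_DEVICE_OUT_WIRED_HEADPHONE" role="sink">
            <profile name="" format="AUDIO_FORMAT_PCM_24_BIT_PACKED"
                     samplingRates="48000,96000,192000"
                     channelMasks="AUDIO_CHANNEL_OUT_STEREO"/>
        </devicePort>
        <!-- other devicePorts elided -->
    </devicePorts>

    <routes>
        <!-- 3. A route that lists the mixPort among the device's sources -->
        <route type="mix" sink="Wired Headphones"
               sources="primary output,deep_buffer,direct_pcm"/>
        <!-- other routes elided -->
    </routes>
</module>
```

If any one of the three is missing, the stream either fails to open or lands on a different mixPort than intended.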

Sanity checks when things don’t route

  • Is the route present for that sink?
  • Do the formats/rates/channels of the opened stream match any profile on both ends?
  • For DIRECT/OFFLOAD streams, a profile mismatch makes the stream fail to open. Try dropping to 48 kHz stereo 16‑bit to confirm that the route itself works.
  • Is the device considered “attached” (or plugged) and recognized by the HAL?

The volume part

At the bottom, the file includes audio_policy_volumes.xml and default_volume_tables.xml. Those define how loudness scales per usage (ring, media, call, alarm, etc.) on different devices.
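
In the file this is typically just two include lines after the modules block. A minimal sketch, assuming the standard XInclude namespace declared on the root element:

```xml
<audioPolicyConfiguration version="1.0"
                          xmlns:xi="http://www.w3.org/2001/XInclude">
    <!-- globalConfiguration and modules elided -->

    <!-- Volume section: per-stream, per-device-category curves -->
    <xi:include href="audio_policy_volumes.xml"/>
    <xi:include href="default_volume_tables.xml"/>
</audioPolicyConfiguration>
```

Because the curves live in separate files, tuning loudness usually means editing those files rather than this one.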
