Listen to audio examples and learn more about our algorithms.
Balances levels between speakers, music and speech – no compressor knowledge required.
The Adaptive Leveler corrects level differences between speakers and between music and speech, and applies dynamic range compression to achieve a balanced overall loudness. We use classifiers to carefully process music segments and prevent amplification of unwanted noises.
Based on over five years of training with audio files from our web service, the algorithm keeps learning
and adapting to new data every day.
It is most suitable for programs
where dialog or speech is the most prominent content, such as podcasts, radio, broadcast,
lecture and conference recordings, film and videos, screencasts, etc.
Define whether Auphonic should remove only static or also fast-changing noises, and whether we should keep or eliminate music.
Do you want to eliminate any ambient sounds from your audio to get a clean speech file? Or do you just want to get rid of static background noises while keeping the singing bird outside your window on record? With our advanced AI denoising algorithms, you have the choice!
Our classifiers also detect segments containing music or breath sounds, so you can decide whether
music is part of the content and should remain in your production or
whether you want to completely isolate the spoken content and
remove all breaths and mouth noises.
For even greater speech intelligibility control, it is possible to separately adjust the amount of
noise, reverb or breath reduction to strike the perfect balance between clarity and ambiance.
Removes unwanted frequencies and sibilance (De-Esser) and creates a clear, warm, and pleasant sound.
Our AutoEQ algorithm automatically analyzes and optimizes the frequency spectrum of a voice recording, removing sibilance (De-Esser) and creating a clear, warm, and pleasant sound.
The equalization of multi-speaker audio can be complex and time-consuming, as each voice requires its own unique frequency spectrum equalization. Our AutoEQ simplifies this process by creating separate, time-dependent EQ profiles for each speaker, ensuring a consistent and pleasant sound output despite any changes in the voices during the recording.
Automatically cut silent segments, pauses, and filler words like "ah", "uhm", "mh", or "ähm" in multiple languages.
A few seconds of silence can easily creep in when equipment is readjusted or when speakers pause briefly to breathe or think. Many speakers also tend to fill these thinking pauses with "ah", "uhm", "mh", etc. to avoid awkward silence.
Whether it is silence or filler words, listeners usually do not enjoy listening to audio that carries no information.
Our automatic cutting algorithms reliably detect and remove silent segments and filler words. Simply enable the algorithms in your production without further settings to cut redundant filler content and achieve a high-quality listening experience. If you want to check and apply the cuts manually, we provide cut lists that you can import into your favorite audio/video editor.
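The effect of applying a cut list can be sketched as follows. This is only an illustration: the `(start, end)` tuples in seconds are a hypothetical representation, not Auphonic's actual cut list export format, and `kept_segments` is a made-up helper name. The sketch merges overlapping cuts and returns the ranges of audio that remain.

```python
def kept_segments(duration, cuts):
    """Return the (start, end) ranges that remain after removing the cuts.

    duration: total length of the recording in seconds.
    cuts: iterable of (start, end) ranges to remove (hypothetical format).
    """
    kept = []
    position = 0.0  # end of the last removed range seen so far
    for start, end in sorted(cuts):
        # Clamp each cut to the bounds of the recording.
        start = min(max(start, 0.0), duration)
        end = min(end, duration)
        if start > position:
            kept.append((position, start))
        position = max(position, end)
    if position < duration:
        kept.append((position, duration))
    return kept


# Hypothetical cuts: one filler word, then two overlapping silent ranges.
cuts = [(2.0, 2.5), (10.0, 13.0), (12.0, 14.0)]
segments = kept_segments(20.0, cuts)
remaining = sum(end - start for start, end in segments)
print(segments, remaining)
```

An audio or video editor that imports a cut list does essentially the same bookkeeping, which is why the cuts can be reviewed and adjusted before they are applied.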
Process multiple tracks to create an optimized mixdown - featuring automatic ducking, noise gate and crosstalk removal.
Auphonic multitrack leverages multiple input audio files to produce a balanced,
high-quality final mixdown.
The algorithm processes individual and combined tracks, including speech tracks from multiple
microphones, music tracks, and remote speakers via phone or Skype.
This allows for a balanced loudness between tracks, with dynamic range compression applied
only to speech segments and automatic ducking of music/FX tracks.
Denoising per track, adaptive noise gates and a crossgate decrease noise,
crosstalk and reverb in the final mixdown by identifying
when and in which track a speaker is active.
Auphonic's multitrack algorithm produces exceptional results, making it the go-to solution for audio professionals.
Define a target loudness, true peak limit, MaxLRA and more for consistency across files and compliance with audio specs.
With Auphonic, you never again have to worry about the submission requirements
for different platforms (Audible, Netflix, Spotify, podcasts, etc.) or
broadcasters (EBU R128, ATSC A/85, radio and mobile, commercials).
You can define a set of target parameters
(integrated loudness, true peak level, dialog normalization, MaxLRA, MaxM, MaxS),
like -16 LUFS for podcasts,
and we will produce the audio accordingly in one click.
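To make the idea of a loudness target concrete, here is a minimal sketch of the arithmetic behind a static loudness normalization: the gain to apply is the target loudness minus the measured integrated loudness, and the resulting peak has to be checked against the true peak limit. The function name, the parameter names and the -1 dBTP default are illustrative assumptions and do not describe Auphonic's implementation.

```python
def normalization_gain(measured_lufs, measured_peak_dbtp,
                       target_lufs=-16.0, max_peak_dbtp=-1.0):
    """Return (gain_db, needs_limiter) for a static loudness normalization.

    gain_db is the gain that moves the measured integrated loudness to the
    target. needs_limiter is True when that gain would push the measured
    true peak above the allowed limit (assumed default: -1 dBTP).
    """
    gain_db = target_lufs - measured_lufs
    peak_after_gain = measured_peak_dbtp + gain_db
    return gain_db, peak_after_gain > max_peak_dbtp


# Example: a recording measured at -23 LUFS with a true peak of -6 dBTP,
# normalized to the -16 LUFS podcast target mentioned above.
gain, needs_limiter = normalization_gain(-23.0, -6.0)
print(gain, needs_limiter)
```

In this example the required gain of 7 dB would lift the peak to +1 dBTP, so a true peak limiter is needed to stay within the limit; this is why a loudness target and a peak limit are specified together.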
Multilingual speech-to-text with auto-generated shownotes and chapters displayed in a shareable transcript editor.
Auphonic uses a multilingual Whisper model by OpenAI as a self-hosted speech recognition engine, including a shareable transcript editor that can easily be integrated into your post-production workflow at no extra cost. In addition to our Whisper engine, we have also integrated a wide range of popular external speech recognition services, including Amazon Transcribe, Google Cloud Speech API, wit.ai and Speechmatics.
Our Automatic Shownotes and Chapters feature gives you AI-generated summaries in multiple levels of detail and timestamped thematic sections that you can use as shownotes and chapters to boost your podcast's accessibility and search engine visibility.
For multitrack productions, each track is processed separately, so you get a detailed transcript showing
which speaker is active at what exact time.
We produce enhanced audio or video podcasts with chapters and waveform audiograms in all output formats you need.
Chapter marks or enhanced podcasts are used for quick navigation within audio files
and can be entered directly in our web interface or imported from various sources
such as text files or audio editors (DAWs).
Auphonic supports all common audio and video file formats, offers customized encoding settings,
maps metadata tags to multiple output files and exports them to platforms such as
SoundCloud, YouTube, and Spreaker.
Audiograms are videos generated from audio files, showing a dynamically generated waveform with your cover image or chapter images as the background, so you can easily create shareable videos from audio-only productions.
In video productions, Auphonic extracts the audio track, processes it, and merges it back with the original video track without any loss of image quality. You can export the processed video to YouTube or create an audio-only version for your podcast platform automatically.
You can use Auphonic for free for up to 2 hours of audio each month. That way, you can fully test our algorithms and services without any commitment.
Yes, certain features like batch productions or watch folders for workflow automation are only available for premium users. You can find out more about our premium features on our pricing page.
Yes, we integrate with several tools which can be used for file transfers, to automatically publish your productions or to automate your workflow. Find out more about our integrations here.
Yes, our software has a developer API available. The API allows you to build custom integrations and applications that can interact with the software. You can check it out here.
The time it takes for the AI algorithms to improve your audio file depends on the size and complexity of the file, but on average it is about 10% of the length of the audio file. For example, if your audio file is one hour long, it would take about 6 minutes for our algorithms to improve it.
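The rule of thumb above can be written as a one-line estimate. This is only a sketch of the stated average; the function name and the `percent` parameter are illustrative, and actual processing times vary with file size and complexity.

```python
def estimated_processing_minutes(audio_minutes, percent=10):
    """Estimate processing time as a percentage of the audio length."""
    return audio_minutes * percent / 100


# A one-hour file, using the average of about 10% quoted above.
print(estimated_processing_minutes(60))
```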