ifnspaml/P.808-Localization

P.808 ACR Localization Instructions

This project contains instructions and scripts to localize subjective absolute category rating (ACR) tests conducted with Amazon Mechanical Turk (MTurk). The instructions and methodology build on, and refer to, the Microsoft P.808 repository.

Prerequisites

Before starting, ensure you have the following:

  • The P.808 repository codebase, included as a submodule at /P.808.
  • The required Python packages listed in the P.808 submodule's src/requirements.txt.
  • A Python environment in which further requirements can be installed.
  • AWS credentials (only if you want to use the automated Amazon Polly TTS generation for localization).

Localization Steps

You will need to manually translate, adapt, and generate new audio files for your required target language. Use the provided Jupyter Notebook (localization.ipynb) as a step-by-step guide.

1. Data Selection and Preparation

Gather the following audio clips in your target language:

  • ~16 clips: Mixed noisy, processed, and clean files for trapping and training phases. Make sure they cover the full range of tested qualities.
  • At least 1 clip: Clean speech for qualification. More may be used.
  • ~16 clips: Noisy and clean files for the gold standard set.

2. Audio Generation (via localization.ipynb)

Follow the interactive steps in localization.ipynb to generate the required localized content:

  • Bandwidth Tests: Generates filtered noisy audio (at specific cutoff frequencies) overlaid with your target language's clean speech.
  • Qualification Files: Mixes clean speech and noise at specific SNRs (e.g., 35 dB, 45 dB).
  • TTS Instructions & Trappings: Uses Amazon Polly (or another TTS engine) to generate math problems, digits in noise, and trapping message interruptions ("Please select the answer..."). Make sure you manually translate the tts_translation_dict variables into your target language.
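
The qualification-file step can be illustrated with a minimal sketch of mixing speech and noise at a target SNR. It assumes a plain RMS-based SNR definition and equal-length signals; the notebook may use a different level measure, and the function names `rms` and `mix_at_snr` are hypothetical. Plain Python lists are used here to keep the example dependency-free; real clips would more likely be NumPy arrays.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`, then add it.

    Both inputs are equal-length sequences of floats at the same sampling rate.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent; cannot scale to a target SNR")
    # SNR_dB = 20 * log10(rms_speech / rms_scaled_noise)
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy sine signals standing in for real clips (hypothetical data).
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [0.5 * math.sin(2 * math.pi * 1234 * t / 16000) for t in range(16000)]
mixed = mix_at_snr(speech, noise, snr_db=35)
```

The speech level is left unchanged and only the noise is scaled, so the same clean clip can be reused for several SNR conditions.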

3. File Hosting

Once the localized audio files have been generated, upload and host them with a cloud hosting provider of your choice (e.g., Azure Blob Storage, AWS S3) so that Amazon MTurk workers can access them publicly. This step is outside the scope of this project.
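
As a rough illustration of what public access means for the later mapping step, the sketch below derives a public URL for each hosted file from a base URL. The base URL, file names, and the helper `public_url` are all hypothetical; the actual URL layout depends on your hosting provider.

```python
from urllib.parse import quote


def public_url(base_url, filename):
    """Join a hosting base URL and a file name, percent-encoding the name."""
    return base_url.rstrip("/") + "/" + quote(filename)


# Hypothetical hosting location and file names.
BASE_URL = "https://example.blob.core.windows.net/p808-de"
urls = [public_url(BASE_URL, name) for name in ["clip 01.wav", "gold_03.wav"]]
```

Percent-encoding matters mainly for file names containing spaces or non-ASCII characters, which can otherwise produce broken links in the generated HTML.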

Generating the HTML File

HTML generation has two stages: running the main P.808 master script, then injecting the localized content via localization.ipynb:

  1. Mapping Localized Files: In localization.ipynb, configure your hosting domain and generate the mapping CSV files (general.csv, gold_clips.csv, rating_clips.csv, training_clips.csv, and trapping_clips.csv).
  2. Updating general.csv: Overwrite the default general.csv located at src/assets_master_script/general.csv in the P.808 repository with the newly generated general.csv. Remember to keep a backup of the original.
  3. Running the Master Script: In the P.808 src/ directory, run the master script to scaffold the project files (replace NAME with your actual project name), as described in localization.ipynb:
    python master_script.py --project NAME --method acr --cfg P.808/configurations/master.cfg \
        --clips ./localization_instructions/rating_clips.csv \
        --training_clips ./localization_instructions/training_clips.csv \
        --gold_clips ./localization_instructions/gold_clips.csv \
        --trapping_clips ./localization_instructions/trapping_clips.csv
  4. HTML File Modifications: Copy the newly generated .html file from the P.808 repository to your localization_instructions folder.
  5. Injecting localizations: In localization.ipynb, set the html_file variable to the name of your copied HTML file. Execute the remaining cells to automatically:
    • Inject the URLs of your newly hosted, localized audio files via BeautifulSoup.
    • Replace the English template strings with your target language strings.
    • Update form controls (e.g., replacing the text input for mother tongue with an explicit radio button selection).
    • Generate and embed a translated instructional image (converting process_no_text.png back into base64 to embed directly in the HTML).
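
The string replacement and image embedding in step 5 can be sketched as follows. This is not the notebook's code: the notebook uses BeautifulSoup, whereas this sketch uses plain string operations from the standard library to stay dependency-free. The HTML fragment, the `PLACEHOLDER` marker, the translation table, and both helper names are hypothetical.

```python
import base64


def png_to_data_uri(png_bytes):
    """Encode raw PNG bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


def translate_strings(html, translations):
    """Replace each English template string with its target-language string."""
    for english, localized_text in translations.items():
        html = html.replace(english, localized_text)
    return html


# Hypothetical template fragment and translation table (German as an example).
html = '<p>Please rate the quality.</p><img id="process" src="PLACEHOLDER">'
translations = {"Please rate the quality.": "Bitte bewerten Sie die Qualität."}
# PNG signature only, standing in for the bytes of a real image file.
png_bytes = b"\x89PNG\r\n\x1a\n"

localized = translate_strings(html, translations)
localized = localized.replace("PLACEHOLDER", png_to_data_uri(png_bytes))
```

Embedding the image as a data URI keeps the HTML file self-contained, so the instructional image does not need to be hosted separately.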

After completing these steps, the resulting .html file (P.808_acr_translated.html) is fully localized and ready to be used in your Amazon MTurk UI configuration.

If you find this repository useful or perform your localization according to our paper, please cite:

@inproceedings{Sach2025,
  title={2025 Urgent Speech Enhancement Challenge: Multilingual P.808 Listening Tests: Approach and Results},
  author={Sach, Marvin and Fu, Yihui and Saijo, Kohei and Zhang, Wangyou and Cornell, Samuele and Scheibler, Robin and Li, Chenda and Ni, Zhaoheng and Kumar, Anurag and Wang, Wei and Qian, Yanmin and Watanabe, Shinji and Fingscheidt, Tim},
  booktitle={Proc.\ of ICASSP},
  year={2026}
}
