USAGE

Web Application Interface

The web application features a user-friendly interface divided into five interactive regions, designed to facilitate seamless control and customization of the synesthetic audio experience. All camera processing is performed locally on your device, ensuring privacy. No frames are transmitted externally, though the browser will request camera access permission to enable this local processing for audio generation.

Interface Regions

  • Center Rectangle: Audio Enabler
    A touch-sensitive area that activates the webpage’s audio output, allowing sound generation to begin.
  • Top Border Rectangle: Settings SHIFTer Button
    Toggles the settings mode to reveal advanced configuration options.
  • Bottom Rectangle: Start/Stop Button
    Initiates or pauses the audio generation and camera processing.
  • Left Rectangle: Day/Night Switch
    Inverts light logic to optimize visibility and processing for different lighting conditions.
  • Right Rectangle: Language Switcher
    Changes the interface language for improved accessibility.
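
The Day/Night switch is described only as inverting the light logic. As a rough illustration, and not the web app's actual code, the idea can be sketched as flipping an 8-bit brightness value before it is mapped to sound. The helper name `apply_light_mode` and the 0-255 range are assumptions made for this example.

```python
def apply_light_mode(brightness, night_mode):
    """Return the brightness (0-255) that would drive the audio mapping.

    Hypothetical helper: in night mode the value is inverted, so dark
    regions drive the sound instead of bright ones.
    """
    if not 0 <= brightness <= 255:
        raise ValueError("brightness must be in 0..255")
    return 255 - brightness if night_mode else brightness


row = [0, 64, 200, 255]  # made-up 8-bit brightness samples
day = [apply_light_mode(b, night_mode=False) for b in row]
night = [apply_light_mode(b, night_mode=True) for b in row]
print(day)
print(night)
```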

Settings Mode (SHIFTed Interface)

When settings are enabled via the SHIFTer button:

  • SHIFTed Left Rectangle: Grid Selector
    Adjusts the camera’s framing or "gridding" of the environment, allowing users to customize how the visual input is segmented for audio mapping.
  • SHIFTed Right Rectangle: Audio Engine Selector
    Modifies the sound synthesizer’s response to the selected grid, enabling users to tailor the audio output to their preferences.
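
To make the grid and audio engine idea more concrete, here is a minimal sketch, assuming the frame is a 2D list of 8-bit brightness values with at least as many pixels as grid cells in each direction. The functions `grid_means` and `cell_to_sound`, the frequency range, and the row-to-pitch and column-to-pan mapping are all hypothetical; the real grids and synth engines may work differently.

```python
def grid_means(frame, rows, cols):
    """Split a 2D brightness frame into rows x cols cells and average each cell."""
    height, width = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row_means = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row_means.append(sum(cell) / len(cell))
        means.append(row_means)
    return means


def cell_to_sound(r, c, mean, rows, cols, low_hz=220.0, high_hz=880.0):
    """Map one cell to (frequency, pan, gain). Illustrative mapping only."""
    # Assumption: top rows get a higher pitch than bottom rows.
    freq = high_hz - (high_hz - low_hz) * (r / (rows - 1) if rows > 1 else 0.0)
    # Assumption: left columns pan left (-1.0), right columns pan right (+1.0).
    pan = -1.0 + 2.0 * (c / (cols - 1)) if cols > 1 else 0.0
    # Assumption: brighter cells are louder.
    gain = mean / 255.0
    return freq, pan, gain


# A made-up 4x4 frame: bright on the left half, dark on the right half.
frame = [[255, 255, 0, 0] for _ in range(4)]
means = grid_means(frame, rows=2, cols=2)
sounds = [cell_to_sound(r, c, means[r][c], 2, 2) for r in range(2) for c in range(2)]
for s in sounds:
    print(s)
```

Under these assumptions, choosing a different grid only changes `rows` and `cols`, while choosing a different audio engine would amount to swapping out `cell_to_sound`.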

The latest stable version is hosted at:

https://mamware.github.io/acoustsee/present

Browser compatibility list:

| Browser | Minimum Version for Full Support | Notes |
|---|---|---|
| Chrome for Android | Chrome 47 (December 2015) | Full support for getUserMedia, AudioContext, and createStereoPanner. |
| Safari on iOS | iOS 14.5 (Safari 14.1, April 2021) | Supports unprefixed AudioContext and createStereoPanner. No vibration support. |
| Firefox for Android | Firefox 50 (November 2016) | Full support for all APIs, though SpeechSynthesis may be inconsistent. |
| Samsung Internet | Samsung Internet 5.0 (2017) | Based on Chromium, full support for all APIs. |
| Opera Mobile | Opera 36 (2016) | Based on Chromium, full support for all APIs. |
| Edge for Android | Edge 79 (January 2020) | Based on Chromium, full support for all APIs. |

To try our first commit, a Python script, whether out of curiosity or for educational purposes, follow the instructions below.

The first iteration is a simple proof of concept that processes a static image file and outputs a basic left/right panned audio file.

Setup

Clone the Repo:

git clone https://github.com/MAMware/acoustsee.git
cd acoustsee

Set Up Virtual Environment:

python3 -m venv acoustsee_env
source acoustsee_env/bin/activate

Install Dependencies:

pip install opencv-python-headless numpy scipy pyo

Run the MVP:

For local machines:

python src/main.py

For headless environments (e.g., Codespaces):

python src/main_codespaces.py

Try it with examples/wall_left.jpg to hear a basic left/right audio split!
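
For readers curious about the idea behind the left/right split, here is a minimal sketch. It is not the code in src/main.py: it uses only the Python standard library instead of OpenCV and pyo, a small made-up brightness grid stands in for the image, and the names `half_means` and `write_panned_tone`, the 440 Hz tone, and the output file name are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave


def half_means(image):
    """Average brightness (0-255) of the left and right halves of a 2D image."""
    mid = len(image[0]) // 2
    left = [px for row in image for px in row[:mid]]
    right = [px for row in image for px in row[mid:]]
    return sum(left) / len(left), sum(right) / len(right)


def write_panned_tone(path, left_gain, right_gain, freq=440.0, seconds=1.0, rate=44100):
    """Write a 16-bit stereo WAV whose channel levels follow the two gains (0..1)."""
    frames = bytearray()
    for n in range(int(seconds * rate)):
        sample = math.sin(2.0 * math.pi * freq * n / rate)
        # Scale by 0.5 to leave headroom below full scale.
        left = int(32767 * 0.5 * left_gain * sample)
        right = int(32767 * 0.5 * right_gain * sample)
        frames += struct.pack("<hh", left, right)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))


# Made-up stand-in for an image with a bright wall on the left.
image = [[230, 220, 30, 20] for _ in range(4)]
left_mean, right_mean = half_means(image)
out_path = os.path.join(tempfile.gettempdir(), "panned_demo.wav")
write_panned_tone(out_path, left_mean / 255.0, right_mean / 255.0)
print(left_mean, right_mean, out_path)
```

With this stand-in image the left channel is much louder than the right, which is the kind of effect the wall_left.jpg example is meant to demonstrate.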

Troubleshooting the Python Version Installation

  • Windows pyo Installation:
    • Use Python 3.11 or 3.12 for best compatibility.
    • Install Microsoft Visual C++ Build Tools: Download.
    • Ensure PortAudio is installed and in your PATH.
    • Example:
      py -3.11 -m venv acoustsee_env
      .\acoustsee_env\Scripts\activate
      pip install opencv-python numpy scipy pyo
  • Linux pyo Installation (e.g., GitHub Codespaces):
    • Use a virtual environment:
      python3 -m venv acoustsee_env
      source acoustsee_env/bin/activate
    • Install development libraries:
      sudo apt update
      sudo apt install -y libportaudio2 portaudio19-dev libportmidi-dev liblo-dev libsndfile1-dev libasound-dev libjack-dev build-essential libgl1-mesa-glx
    • Install Python dependencies:
      pip install opencv-python-headless numpy scipy pyo
    • If opencv-python fails with libGL.so.1 errors, use opencv-python-headless:
      pip uninstall -y opencv-python
      pip install opencv-python-headless
    • If Python 3.12 fails, try Python 3.11:
      sudo apt install -y python3.11 python3.11-venv
      python3.11 -m venv acoustsee_env
      source acoustsee_env/bin/activate
      pip install opencv-python-headless numpy scipy pyo
  • Headless Environments (e.g., Codespaces):
    • Codespaces lacks audio output. Use main_codespaces.py to generate WAV files:
      python src/main_codespaces.py
    • Download examples/output.wav via the Codespaces file explorer and play locally.
    • Example WAV test:
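      from pyo import *
      # Boot the server in offline mode so it renders to a file instead of a sound card.
      s = Server(audio="offline").boot()
      # Render 2 seconds of audio into test.wav.
      s.recordOptions(dur=2, filename="test.wav")
      # A 440 Hz sine at half amplitude, routed to the output.
      sine = Sine(freq=440, mul=0.5).out()
      # In offline mode, start() renders the audio to the file.
      s.start()
      s.stop()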
  • WxPython/Tkinter Warning:
    • pyo may warn about missing WxPython, falling back to Tkinter. This is harmless for WAV generation.
  • SetuptoolsDeprecationWarning:
    • A warning about License :: OSI Approved :: GNU General Public License is harmless (it’s a pyo packaging issue).
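
Because a headless environment cannot play audio, it can help to sanity-check a generated WAV before downloading it. The sketch below uses only the standard library and assumes the file is 16-bit PCM. `describe_wav` is a hypothetical helper that is not part of the repository, and the demo writes its own tiny file so it runs anywhere.

```python
import os
import struct
import tempfile
import wave


def describe_wav(path):
    """Return channel count, sample rate, duration, and peak level of a 16-bit WAV."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        width = wav.getsampwidth()
        n_frames = wav.getnframes()
        raw = wav.readframes(n_frames)
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    peak = max((abs(s) for s in samples), default=0) / 32768.0
    return {"channels": channels, "rate": rate,
            "seconds": n_frames / rate, "peak": peak}


# Self-contained demo: write a tiny known file, then inspect it.
demo_path = os.path.join(tempfile.gettempdir(), "describe_demo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(struct.pack("<4h", 0, 16384, -16384, 0) * 2000)

info = describe_wav(demo_path)
print(info)
```

To check real output, point `describe_wav` at examples/output.wav after running main_codespaces.py. A peak of 0.0 would suggest the file is silent.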

Privacy and Processing

The application processes all camera data locally on your device, so no visual information leaves it. On launch, the browser requests camera access to perform this local processing, which is needed to generate the real-time audio cues used for navigation.

  • Still stuck? Open an issue on GitHub or ping us on X.