This tool records audio or voice input and converts it to text with timestamps. It uses OpenAI's Whisper model through the Hugging Face serverless API for transcription. The output includes the full transcript and per-segment timing information, making it easy to locate specific parts of the audio.
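As a rough illustration of the timestamped output, the sketch below turns Whisper-style segments into readable lines. It assumes each segment is a dictionary with a `text` field and a `timestamp` pair of start and end seconds, which is the shape the Hugging Face automatic-speech-recognition pipeline produces when timestamps are requested. The exact response of this tool's backend may differ, and `format_seconds` and `format_chunks` are hypothetical helper names.

```python
from typing import List, Optional


def format_seconds(seconds: Optional[float]) -> str:
    """Render seconds as MM:SS.cc; a missing timestamp becomes '??:??'."""
    if seconds is None:
        return "??:??"
    # Work in whole centiseconds to avoid float rounding surprises.
    total_cs = round(seconds * 100)
    minutes, rem = divmod(total_cs, 6000)
    secs, cs = divmod(rem, 100)
    return f"{minutes:02d}:{secs:02d}.{cs:02d}"


def format_chunks(chunks: List[dict]) -> List[str]:
    """Build one '[start - end] text' line per transcribed segment."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()  # segments often carry a leading space
        lines.append(f"[{format_seconds(start)} - {format_seconds(end)}] {text}")
    return lines


# Assumed example data, not real output from the tool.
sample = [
    {"timestamp": [0.0, 3.2], "text": " Hello there."},
    {"timestamp": [3.2, 75.5], "text": " This is a test."},
]
for line in format_chunks(sample):
    print(line)
```

The end of the final segment can be missing when the audio is cut off mid-sentence, which is why `format_seconds` accepts `None`.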
Note: The backend Flask application that runs the AI model is deployed as a Docker container on Render's free tier. If the service has been inactive, the first request may take approximately 50 seconds while the container boots.
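A client can absorb this start-up delay by polling the backend until it responds before sending audio. The following is a minimal sketch under stated assumptions, not the tool's actual client code: `backend_is_up` and `wait_until_ready` are hypothetical helpers, the 90-second limit and 5-second interval are arbitrary choices, and the URL in the usage comment is a placeholder.

```python
import time
import urllib.request
from typing import Callable


def backend_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError, timeouts, and refused connections.
        return False


def wait_until_ready(
    check: Callable[[], bool],
    max_wait: float = 90.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() repeatedly until it succeeds or max_wait seconds pass."""
    deadline = clock() + max_wait
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage (placeholder URL, replace with the real backend address):
# ready = wait_until_ready(lambda: backend_is_up("https://example.invalid/"))
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real waiting.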