Fish Audio - AI Audio Generation SaaS Platform

AI Audio Generation

USA

fish.audio

Deliverables

Desktop Web app

Mobile Responsive

Icon & Illustration

Project timeline

2024

What we do

UX Research

UI/UX Design

Icon & Illustration

QC slicing

Fish.audio is a platform and project focused on cutting-edge text-to-speech (TTS) and conversational AI technologies. It provides tools and models for voice synthesis and natural language processing, allowing developers to create applications such as voice assistants, chatbots, and multilingual speech systems.

Service

Designing for MVP

Objective

The client wanted to realize their idea of developing a user-friendly, web-based platform that leverages advanced text-to-speech (TTS) technology. The platform aims to make AI-powered audio creation accessible to a wide audience. Users convert text into realistic speech, which opens the door to applications such as educational resources, audiobooks, and marketing materials.

The Challenge

We faced three challenges:

New Industry: The project sits in the relatively new field of AI audio generation.
Unique Selling Proposition (USP): It was difficult to identify a unique selling point that would set the platform apart from competitors.
Visual and UX Differentiation: The visual design and user experience needed a distinct look and feel that differentiates the platform from its competitors.

The Solution

Building a successful AI audio generation platform requires a deep understanding of the industry. We conducted thorough research to identify emerging trends, challenges, and market opportunities.

We also analyzed competitors to understand their strengths, weaknesses, and unique selling points. Aligning the platform's vision with the client's goals keeps the solution both innovative and market-relevant.

Workspace to create voice models

This screen serves as a workspace for users to create and manage voice models. Users can upload details such as title, description, tags, and audio samples, and set a privacy level (public, unlisted, or private). The privacy settings let users choose which samples to make public, giving them greater control over the visibility of their creative work.

On the right, a Latest Activity section displays the status of ongoing tasks, such as drafts, models in progress, or completed ones, providing users with real-time updates on their projects.

Overview of API usage

The API details screen gives developers an overview of their API usage and information on how to use the API. It shows the current quota usage as a monetary value (e.g., $4.60) and lets users securely generate and manage API tokens. A detailed bar chart visualizes API activity, helping users track their usage patterns.

Additionally, the screen links to the API documentation and presents a usage-based plan (e.g., $15 per million UTF-8 bytes). This helps users optimize their API usage and, for marketing purposes, makes the pricing options clear.
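
To make the per-byte pricing concrete, here is a minimal sketch of how a developer might estimate the cost of a piece of text. It assumes the example rate shown on the screen ($15 per million UTF-8 bytes) and that billing counts the bytes of the input text; the function name `estimate_cost_usd` is hypothetical and not part of any Fish Audio API.

```python
from decimal import Decimal

# Assumption: the example rate from the design screen, not a confirmed price.
RATE_USD_PER_MILLION_BYTES = Decimal("15")


def estimate_cost_usd(text: str) -> Decimal:
    """Estimate the cost of `text`, assuming billing per UTF-8 byte (hypothetical helper)."""
    # Non-ASCII characters take more than one byte in UTF-8,
    # so the byte count can be larger than the character count.
    byte_count = len(text.encode("utf-8"))
    return RATE_USD_PER_MILLION_BYTES * byte_count / Decimal(1_000_000)


if __name__ == "__main__":
    for sample in ["Hello, world!", "こんにちは"]:
        size = len(sample.encode("utf-8"))
        print(f"{sample!r}: {size} bytes, about ${estimate_cost_usd(sample)}")
```

Because the unit is bytes rather than characters, the same number of characters can cost more in languages that use multi-byte characters, which matters for a multilingual speech platform.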

Highlights the leaderboard of the platform's top users

This screen highlights the leaderboard of the platform's top users, ranking them based on engagement metrics such as Total Likes, Total Share, Total Saved, and Total Use. It prominently displays user badges for top-ranking individuals (gold, silver, bronze) and emphasizes the number of models created by each user.

Emphasizes discoverability and engagement

This user profile screen has a clean, intuitive UI that focuses on simplicity and functionality, and it is designed to facilitate sharing and interaction. It emphasizes discoverability and engagement by letting users share what they like, so that fans and followers can see and connect with their interests.

The Result

The platform is designed to empower users to create, share, and manage audio content using advanced AI text-to-speech (TTS) technology. A thoughtfully designed, intuitive interface enhances its appeal by prioritizing smooth workflows, clear navigation, and engaging visual elements.     

This approach has a noticeable impact, making the platform both innovative and user-friendly, and helping it stand out from competitors in the growing AI audio generation market.