SoundSeg

Featured

An AI-powered sound segmentation tool that separates songs from YouTube into downloadable components (vocals, drums, etc.)

React

TypeScript

Material UI

Firebase

Python


What problem does this solve?

Musicians, producers, and audio enthusiasts often need isolated instrument tracks from existing songs—whether to learn a guitar riff, remix a track, practice drumming along to a song, or create karaoke versions. Traditionally, this required access to the original multi-track recordings, which are almost never publicly available. Professional-grade source separation software exists but is expensive and complex.

SoundSeg uses Convolutional Neural Networks and spectral signal processing to automatically separate any song into its component parts: drums, bass, vocals, and other instruments. Users simply paste a YouTube URL or upload an audio file, and within about a minute, they receive downloadable isolated tracks.

Who is it for?

Musicians who want to learn specific parts of a song by ear
Producers & DJs creating remixes, mashups, or samples
Content creators who need instrumental or a cappella versions
Music teachers isolating parts for students
Anyone wanting a karaoke-style experience

Why did I build it?

This was a collaborative project with a friend who developed the ML model and serverless backend and co-authored a research paper on spectral normalization techniques. My role was to design and build the entire frontend, turning a research tool into a polished, accessible web application. I built the React UI with Material UI, implemented Google authentication, crafted a custom dark theme with subtle animations, and focused on creating an intuitive experience that makes advanced audio processing feel effortless for non-technical users.

Stack

React 18 + TypeScript with Vite
Material UI v5 with a heavily customized theme
Firebase for Google auth and storage
React Hook Form for form handling
React Context for auth and modal state

Architecture Decisions

Custom Theme System

Extended MUI's palette with a custom accent color via TypeScript module augmentation
Dark glassmorphism aesthetic—semi-transparent cards, backdrop blur, layered gradient glows
Centralized component overrides keep styling consistent across the app
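
As a rough illustration of the theme setup described above, the sketch below uses MUI v5's documented module augmentation pattern to add an `accent` palette entry, plus one centralized component override. The color, opacity, and blur values are placeholders chosen for the example, not the values used in SoundSeg.

```typescript
import { createTheme } from "@mui/material/styles";

// Module augmentation: teach MUI's types about an extra palette entry.
declare module "@mui/material/styles" {
  interface Palette {
    accent: Palette["primary"];
  }
  interface PaletteOptions {
    accent?: PaletteOptions["primary"];
  }
}

// All values below are placeholders, not the app's real theme.
export const theme = createTheme({
  palette: {
    mode: "dark",
    accent: { main: "#7c4dff" },
  },
  components: {
    // Centralized override: every Card picks up the translucent, blurred look.
    MuiCard: {
      styleOverrides: {
        root: {
          backgroundColor: "rgba(255, 255, 255, 0.06)",
          backdropFilter: "blur(12px)",
        },
      },
    },
  },
});
```

With the augmentation in place, `theme.palette.accent.main` type-checks inside `styled()` callbacks like any built-in palette color.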

Styled Components Pattern

All styled components defined at the top of each file, keeping JSX clean
Used MUI's styled() API with full theme access

Context Over Redux

Auth and modals managed via lightweight React Context—right-sized for this app's scope

Interesting Challenges

Deferred Authentication

Users can fill the form before signing in to reduce friction
On submit, unauthenticated users see a login modal
Form data is captured in a closure—after sign-in, the submission continues automatically
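
The deferred flow above can be sketched without any framework code. This is a minimal illustration, assuming a hypothetical `AuthGate` interface and a one-field `SeparationRequest`; the real form shape and Firebase calls are not shown in this write-up.

```typescript
// Hypothetical shapes for illustration only.
type SeparationRequest = { url: string };

type AuthGate = {
  isSignedIn: () => boolean;
  // Opens the login modal and calls onSuccess once sign-in completes.
  promptSignIn: (onSuccess: () => void) => void;
};

function createSubmitHandler(
  auth: AuthGate,
  submit: (data: SeparationRequest) => void,
) {
  return (data: SeparationRequest): void => {
    if (auth.isSignedIn()) {
      submit(data);
      return;
    }
    // `data` is captured by this closure, so nothing has to be re-entered.
    auth.promptSignIn(() => submit(data));
  };
}

// In-memory stand-in for the auth layer and login modal.
const state = {
  signedIn: false,
  completeLogin: null as (() => void) | null,
};

const fakeAuth: AuthGate = {
  isSignedIn: () => state.signedIn,
  promptSignIn: (onSuccess) => {
    state.completeLogin = () => {
      state.signedIn = true;
      onSuccess();
    };
  },
};

const submitted: SeparationRequest[] = [];
const onSubmit = createSubmitHandler(fakeAuth, (data) => {
  submitted.push(data);
});

onSubmit({ url: "https://example.com/song" });
const sentBeforeLogin = submitted.length; // 0: the request waits behind the login modal
state.completeLogin?.(); // the user finishes signing in
const sentAfterLogin = submitted.length; // 1: the captured request went through
```

Passing the continuation to the login step keeps the form itself unaware of authentication; it only ever calls `onSubmit`.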

Queueable Modal System

Custom modal context supporting a queue with prepend/append positioning
Needed for potential sequential modals (error → login → success)
Each modal has a unique ID for targeted closing
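
One way to model such a queue is a pure reducer, which could sit behind a React Context via `useReducer`. The sketch below is an assumption about the shape, not the app's actual code: the `Modal`, `Position`, and `ModalAction` types and the `modalReducer` name are all hypothetical.

```typescript
// Hypothetical types: the write-up only states that the queue supports
// prepend/append positioning and that each modal has a unique ID.
type Modal = { id: string; kind: "error" | "login" | "success" };
type Position = "prepend" | "append";

type ModalAction =
  | { type: "open"; modal: Modal; position: Position }
  | { type: "close"; id: string };

// Pure reducer: the first item in the queue is the modal currently shown.
function modalReducer(queue: readonly Modal[], action: ModalAction): Modal[] {
  switch (action.type) {
    case "open":
      return action.position === "prepend"
        ? [action.modal, ...queue]
        : [...queue, action.modal];
    case "close":
      // Targeted closing: remove one modal by ID, wherever it sits.
      return queue.filter((m) => m.id !== action.id);
  }
}

const actions: ModalAction[] = [
  { type: "open", modal: { id: "m1", kind: "login" }, position: "append" },
  { type: "open", modal: { id: "m2", kind: "success" }, position: "append" },
  { type: "open", modal: { id: "m0", kind: "error" }, position: "prepend" },
];

// Resulting order: m0 (error), m1 (login), m2 (success).
const afterOpen = actions.reduce<Modal[]>(modalReducer, []);

// Closing m1 by ID leaves m0 and m2 in place.
const afterClose = modalReducer(afterOpen, { type: "close", id: "m1" });
```

Prepending lets an urgent modal such as an error jump ahead of whatever is already waiting, while appending preserves the normal sequence.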

File Uploads

Audio converted to base64 client-side before sending to Cloud Functions
Client-side size validation (10MB) to fail fast
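
The two steps above might look roughly like this. The helper names are hypothetical, and the sketch assumes the 10MB limit means binary megabytes; in the browser the bytes would typically come from `new Uint8Array(await file.arrayBuffer())`.

```typescript
// Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Returns an error message for oversized files, or null when the size is acceptable.
function validateSize(byteLength: number): string | null {
  return byteLength > MAX_UPLOAD_BYTES
    ? "File exceeds the 10MB limit"
    : null;
}

// Encodes raw bytes as base64 using the built-in btoa.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  // btoa expects a string with one character per byte.
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Validate first so an oversized file is rejected before any encoding work.
function prepareUpload(bytes: Uint8Array): { base64: string } | { error: string } {
  const error = validateSize(bytes.length);
  return error !== null ? { error } : { base64: bytesToBase64(bytes) };
}

const sample = prepareUpload(new Uint8Array([104, 105])); // the bytes of "hi"
```

Base64 output is roughly a third larger than the input, which is one more reason to check the size before encoding rather than after.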

Animations

CSS keyframe animations for floating logo, pulsing loader, glowing icons, staggered fade-in reveals

Audio Source Separation

Upload any audio file or paste a YouTube URL
AI separates the track into individual stems: drums, bass, vocals, and other instruments
Download each isolated track directly from the browser

Flexible Input Options

Supports direct audio file uploads (MP3, WAV, etc.)
Accepts YouTube and other video platform URLs—automatically extracts and processes the audio
Built-in demo mode to preview the output without uploading anything

Frictionless Authentication

No sign-in required to explore the interface
Google authentication only prompted when submitting a request
After sign-in, the original action continues automatically—no re-entry needed

Polished Dark UI

Custom glassmorphism theme with layered gradients and subtle animations
Responsive design adapts to mobile and desktop
Real-time loading feedback with progress indicators

Instant Playback & Download

Processed stems are available for immediate playback in-browser
One-click download for each isolated track

🏆 Achievements

Backdrop Build v4 Winner

Built and shipped a product during a competitive 4-week builder cohort focused on rapid iteration and public feedback

Collaborating Across Disciplines

Worked closely with a friend focused on ML research—learned to bridge the gap between a research tool and a user-facing product
Gained appreciation for API contract design and clear communication when building on top of someone else's backend

Handling Long-Running Async Operations

Audio processing takes 30–60 seconds; learned to design around that with clear loading states and user feedback
Reinforced the importance of managing user expectations when you can't make something faster

Shipping a Complete Product

Small project, but end-to-end: design, build, deploy, and actually put it in front of users
Reinforced that "done" beats "perfect"
