Audio Version of Riya
Dan Housman put up an interesting conjecture today about creating an audio version of Riya here. Can you do voice recognition inside of streams? I spent quite a bit of time looking at this problem. It seemed to me that the most important stuff you wanted to search was proper nouns (Jane, Wall Street, IBM, etc.). Although I am not an expert in voice recognition, the folks who are tell me that finding proper nouns is very, very difficult. However, I do think there is a revolution occurring in searching inside of major media file formats.

I put a more verbose comment on Dan's blog but I'll do a little recap. The trend I've spotted is the move towards phonetic-based indexes, where the index is built on a lattice (a set of phoneme transition hypotheses) rather than on a single transcript. Nexidia, Aurix, and TVEyes/SAIL Labs (podscope.com) are all taking this approach. I've heard some great numbers from Nexidia but haven't tried out their system.
1-best transcription isn't enough because we simply can't build dictionaries with a large enough vocabulary, and there are too many ambiguities. We can do rough transcriptions and augment them with a richer phonetic index. The various systems I've played with are pretty damn good; they even pick up my name ;). Add language, accent, and speaker identification with a little bit of topic extraction and you have something huge.
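To make the idea concrete, here is a minimal sketch of a phonetic index, not how Nexidia or any of the other vendors actually do it. It assumes a simplified lattice where each time slot keeps a few competing phoneme hypotheses, and it indexes every phoneme trigram that can be read through those slots. The clip data, the phoneme symbols, and the names `build_index` and `search` are all made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy data: each clip is a list of time slots, and each slot
# holds the competing phoneme hypotheses the recognizer kept for that slot.
CLIPS = {
    "clip_a": [["p", "b"], ["r"], ["ah", "aa"], ["s"], ["aa"], ["n"], ["ah"]],
    "clip_b": [["ay"], ["b"], ["iy"], ["eh"], ["m"]],
}

N = 3  # length of the phoneme n-grams stored in the index


def build_index(clips, n=N):
    """Map every phoneme n-gram readable through the slots to (clip, start slot)."""
    index = defaultdict(set)
    for clip_id, slots in clips.items():
        for start in range(len(slots) - n + 1):
            # Expand all combinations of hypotheses across n adjacent slots.
            for gram in product(*slots[start:start + n]):
                index[gram].add((clip_id, start))
    return index


def search(index, phonemes, n=N):
    """Return (clip, start slot) pairs where every n-gram of the query lines up."""
    if len(phonemes) < n:
        raise ValueError("query must contain at least n phonemes")
    grams = [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]
    hits = set(index.get(grams[0], set()))
    for offset, gram in enumerate(grams[1:], start=1):
        later = index.get(gram, set())
        # Keep a candidate only if the next n-gram occurs at the expected slot.
        hits = {(clip, start) for clip, start in hits if (clip, start + offset) in later}
    return sorted(hits)
```

Because the query is a phoneme sequence and not a dictionary word, a proper noun that no vocabulary contains can still be found: `search(build_index(CLIPS), ["p", "r", "ah", "s", "aa", "n", "ah"])` locates the made-up pronunciation in `clip_a`, and the alternative hypotheses let `["b", "r", "aa", "s", "aa"]` match the same spot.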
Sudeep on Dan's blog mentioned combining face with speech recognition. Multi-modal search will become really useful with the rise of IPTV. If we can analyze video to pick out faces, do semantic scene analysis, and also index the sound, that would be killer. Then people can do searches like "find me a show with Angelina Jolie where she mentions string theory".
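A query like that boils down to joining two indexes on show and time. The sketch below is only a rough illustration under stated assumptions: the face and speech hits are invented sample data, `find_shows` is a hypothetical helper, and "face on screen within a few seconds of the phrase" is a crude stand-in for "she mentions it" (a real system would want speaker identification as well).

```python
# Hypothetical outputs of two separate indexes, keyed by show:
# face detections as (person, seconds) and spoken-phrase hits as (phrase, seconds).
FACES = {
    "show_1": [("angelina jolie", 61.0), ("angelina jolie", 305.5)],
    "show_2": [("angelina jolie", 12.0)],
    "show_3": [("brian greene", 40.0)],
}
SPEECH = {
    "show_1": [("string theory", 308.0)],
    "show_2": [("box office", 15.0)],
    "show_3": [("string theory", 42.0)],
}


def find_shows(faces, speech, person, phrase, window=10.0):
    """Return shows where the person is on screen within `window` seconds of the phrase."""
    results = []
    # Only shows present in both indexes can satisfy the combined query.
    for show in sorted(faces.keys() & speech.keys()):
        face_times = [t for who, t in faces[show] if who == person]
        phrase_times = [t for what, t in speech[show] if what == phrase]
        pairs = [(f, p) for f in face_times for p in phrase_times if abs(f - p) <= window]
        if pairs:
            results.append((show, pairs))
    return results
```

With the sample data, `find_shows(FACES, SPEECH, "angelina jolie", "string theory")` keeps only `show_1`, where the face at 305.5 seconds sits next to the phrase at 308.0 seconds; `show_2` has the face but not the phrase, and `show_3` has the phrase but a different face.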
Posted by: prasanna | April 05, 2006 at 03:45 PM
Hi there,
Congratulations on the beta, and even better to see Riya sponsor us folks at BarCamp in Chennai, among other cities.
I just thought I'd add to the conversation with a mention of the projects that I'm working on, which have got a pretty good response from the developer community (got listed on Ajaxian recently!).
Check them out at http://bosky101.blogspot.com/2006/04/im-in-news.html
Keep Clicking,
Bhasker
Posted by: Bhasker V K | April 23, 2006 at 01:58 AM