Monday Jan 10, 2005

Over the past few months, I've been surveying the music information retrieval (MIR) community about the tools its researchers use. MIR researchers have been very generous with their time and information, so I've been able to build quite a good list of tools. The results of this survey are posted on the music-ir website. Check out the "The tools we use" survey for an overview of the tools used by MIR researchers.

Friday Jan 07, 2005

Willie Walker, champion of Open Source speech recognition and synthesis technology at Sun Labs, has transferred to Sun's Accessibility team. research.sun.com has an article about our fearless leader Willie's transfer. I worked for Will for almost 5 years. He's a super guy, incredibly smart, and was a great leader. I'll certainly miss working for Will.

Samsung's latest phone has the ability to convert speech into text. According to the article, a user can dictate a message and the phone will convert it to text, so the user doesn't have to type it in (which is awkward on a numeric keypad). The article doesn't mention anything about vocabulary size. I must admit that I'm a bit skeptical about how well it works. Time will tell.

Thursday Jan 06, 2005

Matt Quail, of Totally Gridbag fame, has discovered the preview for JarWars - Revenge of the <T>. Duke Vader is on a mission to destroy the casting knights. See James Gosling and Guy Steele in their last stand against the forces of generic evil.

Wednesday Jan 05, 2005

During the holiday break the Lamere clan spends lots of time around the kitchen table playing all sorts of games. I make sure that there are plenty of new games under the Christmas tree. This year was no exception, with 5 new games for us to while away the hours. We do have some gaming requirements: there are 5 of us who play (sadly, mom is not a game player (apparently I'm too competitive ...sigh)) with an age range of 10 to adult. This year's favorite game by far was 'Citadels'. Citadels is a card game where the goal is to build a city before your opponents do. There are assassins, thieves, warlords and others to watch out for. There's great player interaction, some luck, bluffing and even a bit of role playing. It's a lot of fun (I can tell for sure because the kids dragged out the game even when I wasn't around). Worth checking out during these upcoming indoor winter evenings.

Many pattern classification problems (including speech recognition and music classification) are solved by using a mixture model: a weighted set of simple probability distributions (often Gaussians) that together represent a single, more complex statistical distribution. The EM (expectation-maximization) algorithm is often used to estimate the parameters of those component distributions given the raw data and the desired number of components. S. Akaho has a Java applet that demonstrates how the EM algorithm works. First you draw your data, next you tell it how many probability distributions you want, and then you tell it to go and it will use the EM algorithm to calculate the best-fitting set of distributions.

A good paper on the EM algorithm is A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models by Jeff A. Bilmes.
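To make the procedure a bit more concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture, written in Python. It is not the applet's code and it is not taken from the tutorial. The function names (gaussian_pdf, fit_gmm_em), the quantile-based starting guess, the fixed iteration count and the variance floor are all assumptions made for illustration. The E-step computes how responsible each component is for each data point, and the M-step re-estimates each component's weight, mean and variance from those responsibilities.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density at x of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k, iterations=50, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture to data with plain EM.

    Hypothetical helper for illustration. Returns (weights, means, variances),
    each a list of length k, ordered by increasing mean.
    """
    n = len(data)
    ordered = sorted(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, min_var)

    # Starting guess (an assumption): equal weights, means spread over the
    # data quantiles, and the overall data variance for every component.
    weights = [1.0 / k] * k
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [overall_var] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            scores = [w * gaussian_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances)]
            total = sum(scores)
            if total == 0.0:
                # Densities underflowed; fall back to equal responsibilities.
                resp.append([1.0 / k] * k)
            else:
                resp.append([s / total for s in scores])

        # M-step: re-estimate each component from its weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            if nj < 1e-12:
                continue  # component owns no data; keep its old mean and variance
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed component

    order = sorted(range(k), key=lambda j: means[j])
    return ([weights[j] for j in order],
            [means[j] for j in order],
            [variances[j] for j in order])


if __name__ == "__main__":
    # Synthetic data standing in for the points you would draw in the applet.
    rng = random.Random(1)
    points = ([rng.gauss(0.0, 1.0) for _ in range(200)]
              + [rng.gauss(6.0, 1.5) for _ in range(200)])
    for w, m, v in zip(*fit_gmm_em(points, k=2)):
        print(f"weight={w:.2f} mean={m:.2f} variance={v:.2f}")
```

A real implementation would normally stop when the log-likelihood stops improving rather than after a fixed number of iterations, and would work in log space to avoid underflow. Both are left out here to keep the two steps easy to see.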

Monday Jan 03, 2005

I was playing with my brother-in-law's iPod the other day. He has about 10GB of songs from about 1,000 or so artists. The iPod really is a nice device, worthy of the techno-lust that seems to surround it. Still, it is not a perfect interface. I wanted to listen to some Weezer, which is an alphabetically-challenged band. I had to scroll through the 950 or so artists until I could get to the 'W' bands. This took me about a minute of making my thumb do the circle-dance. I could see no better way of doing it. Now I could select a song by album, which would be faster if Weezer made albums like "Abbey Road", but the only album I knew by them was called 'Weezer', sigh. So my best option was to just hit shuffle play and hope that I would get lucky. Maybe I should just start listening to Aerosmith, AC/DC and Abba. Oy!

Sunday Jan 02, 2005

A new year ... a new look and feel. This one looks rather nice. I think I'll try it out for a while.

Monday Dec 13, 2004

Ken Streeter is the co-coach for the Mindstorms Mayhem. During the 2003-2004 season, the Mayhem won the Granite State Director's Award and went on to win the International Director's Award. Just this last weekend, the Mindstorms Mayhem clinched their second NH State Director's Award and are now preparing for their second trip to the FLL International competition. Earlier this year, before the FLL season got under way, Ken agreed to subject himself to this interview. Showing extreme Gracious Professionalism, Ken gives us some insight into what makes the Mayhem so successful (and he gives away a couple of secrets too!). Read the interview here.

Sunday Nov 21, 2004

Researchers Ge Wang and Perry Cook have developed ChucK, a new audio programming language for real-time synthesis, composition, and performance. ChucK presents a new time-based concurrent programming model, which supports multiple, simultaneous, dynamic control rates, and the ability to add, remove, and modify code, on-the-fly, while the program is running, without stopping or restarting. It offers composers, researchers, and performers a powerful and flexible programming tool for building and experimenting with complex audio synthesis programs, and real-time interactive control.

The paper ChucK: A Concurrent, On-the-fly, Audio Programming Language won the International Computer Music Association best paper award in 2003.

ChucK is quite new, and needs a bit more work, but it has some real advantages over some of the other synthesis languages. The way it treats time as a first-class entity is quite slick. Ge and Perry are working hard to make sure that ChucK can be used during live performances.

Tuesday Nov 16, 2004

Sun Labs is sponsoring a talk at the Computer History Museum called Music Meets the Computer. John Chowning (the father of FM synthesis, inventor and composer), Max Mathews (father of computer music) and Curtis Roads (composer and music historian) will be discussing the world of computer music. The talk is on December 14, 2004. Mark your calendars.

Ah, I wish I were on the west coast for this.

I coach the local middle school FIRST LEGO League team. Last weekend the team competed in the regional competition, where 18 teams fight for 8 tickets to the state finals. I was worried that we wouldn't do well, since the regionals come very early in the season and we were quite underprepared. Add to that the quality of the other teams: last year's defending world champion team would be there.

Well, the kids on the team really pulled together and did everything right. It was just great to see them rise to the occasion as a team. They were rewarded for their efforts with the second highest overall score for the day (as well as the programming trophy) and a ticket to the state tournament. Congratulations, Spoink! Now back to work, there's only one month until the state tournament ...

Here's the team:

And here's an early version of the team's robot:

Monday Nov 15, 2004

Last month I wrote about the Yankee Farmer who could toss a pumpkin a thousand feet. Well, this same farmer packed up his 20-ton weapon and trucked it down to Delaware for the world punkin chunkin championship. His team's first toss in Delaware set a world record, and their next toss went even further, sending a pumpkin 1,394.29 feet. You can read more about it in the Nashua Telegraph, our local newspaper.

Wednesday Oct 20, 2004

There's a nice, quick overview of VoiceXML at NewsForge. The article mentions PublicVoiceXML, an open source implementation of VoiceXML. So far PublicVoiceXML supports TTS (via SAPI, ViaVoice or Festival) and quite a bit of telephony hardware, but has no recognition support yet. Oh yeah ...go Red Sox.

Monday Oct 18, 2004

The speech team at Sun has used the JavaSound API quite a bit over the last few years. We use it extensively for sound output in FreeTTS, our speech synthesizer, and for sound input in Sphinx-4, our speech recognizer. The JavaSound API has the unenviable task of providing a standard sound API that works in the same way across all operating systems, audio systems and sound cards. That's not an easy task given the wide variety of sound resources available. Due to this complexity, there are subtle differences in JavaSound behavior across the different platforms. A JavaSound program that works fine on one platform may work a bit differently on another platform. For instance, Phil and Will recently spent some time chasing down a latency issue that only occurred on one of our target platforms.

One excellent resource for anyone working with JavaSound is the Java Sound Resources site. This site has numerous JavaSound examples, an extensive FAQ, and a number of tutorials to guide a developer in writing good JavaSound programs. The site is put together by Florian Bomers and Matthias Pfisterer. Both of these guys have worked with or on the JavaSound API for years and years, so they know what they are talking about. If you are doing anything with JavaSound, I strongly recommend this site.

This blog copyright 2010 by plamere