Tuesday Oct 12, 2004

This week, research.sun.com is profiling (pun intended) Misha Dmitriev, the project lead for JFluid, the Java profiler. JFluid is one of my favorite Java tools. It will profile a Java application without the usual inaccuracies. JFluid uses a modified JVM that can add instrumentation to your class files when you wish to profile them. This allows you to easily enable and disable profiling on a method-by-method basis, thus avoiding the problem, common to some profilers, of vastly slowing down the execution of your program. JFluid directly instruments your methods instead of relying on sampling (as most other profilers do). Sampling profilers are about as accurate as the current presidential polls. Since they only take periodic samples of your application, sampling profilers are likely to misrepresent the overall profile. By directly adding instrumentation to your classes, JFluid can give you a 100% accurate representation of where your program is spending its time.
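
To make the sampling problem concrete, here is a toy simulation in Java. It is not JFluid code and it does not profile anything real: the workload, the method names "parse" and "flush", and the class `ProfilerComparison` are all made up for illustration. It is also a deliberately contrived worst case, in which the sampling interval happens to match the period of the workload.

```java
/** Toy comparison of sampling and instrumentation. All names here are hypothetical. */
final class ProfilerComparison {

    /** Simulated workload: every 10 ms cycle runs method 0 ("parse") for 9 ms, then method 1 ("flush") for 1 ms. */
    static int methodRunningAt(int timeMs) {
        return (timeMs % 10) < 9 ? 0 : 1;
    }

    /** Exact accounting, as instrumentation would give: charge every millisecond to the method that ran. */
    static int[] instrumented(int totalMs) {
        int[] millis = new int[2];
        for (int t = 0; t < totalMs; t++) {
            millis[methodRunningAt(t)]++;
        }
        return millis;
    }

    /** Sampling: look at the running method only once every intervalMs milliseconds. */
    static int[] sampled(int totalMs, int intervalMs) {
        int[] hits = new int[2];
        for (int t = 0; t < totalMs; t += intervalMs) {
            hits[methodRunningAt(t)]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] exact = instrumented(1000);
        int[] samples = sampled(1000, 10);
        System.out.println("instrumented ms: parse=" + exact[0] + " flush=" + exact[1]);
        System.out.println("sample hits:     parse=" + samples[0] + " flush=" + samples[1]);
    }
}
```

With a 10 ms sampling interval, every sample lands at the start of a cycle, so "flush" never shows up at all, even though it accounts for a tenth of the run. The exact count charges 900 ms to "parse" and 100 ms to "flush". Real sampling profilers are rarely this unlucky, but the sketch shows why periodic samples can skew a profile.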

JFluid is being incorporated into the next version of NetBeans. There's a good overview of JFluid at JDJ here: JFluid: A New Way to Profile Java Applications

Thursday Oct 07, 2004

Kumar has integrated FreeTTS with ant and describes it in his blog entry: Adding sound to boring build. With about a dozen lines of Java code, Kumar created a new ant logger that uses FreeTTS to announce the status of the build. Euxx did something similar a while back.

Wednesday Oct 06, 2004

The Advanced Interfaces Group in the School of Computer Science at the University of Manchester has developed a system called Kekule that they use to experiment with ways of exploring graphs non-visually. Kekule reads CML data (CML is a molecule description language) and allows the user to explore it with input via the keyboard and output via the FreeTTS speech synthesizer.

AIG researchers will be presenting a paper on this research at ASSETS 2004, the Sixth International ACM SIGACCESS Conference on Computers and Accessibility.

Lokesh Shah has combined FreeTTS with JavaHelp. With less than a page of code, Lokesh now has a help system that reads the help text to you. Very nice.

Tuesday Oct 05, 2004

Yes, I know Hinkmond is supposed to do all of the wireless stories, but I haven't seen him mention this one. Curitel has just released this two-megapixel TTS Phone in Korea that has speech synthesis built into it. The idea is that when you receive a text message, not only can you read it but you can hear it as well. Now, I'm not sure what the target demographic is for this ... it could be for users who are driving, or perhaps for those who can't read. (?!) Anyway, the phone is only available in Korea for now, but a deal with Verizon may be in the works, so something like this may be in the USA before too long. So I guess that is something else we have to look forward to hearing while sitting in a movie theater, or the concert hall or church or wherever.

Monday Oct 04, 2004

Tom Erbe has compiled a good list of Java Music Projects. This includes programs that can be used for audio/music analysis, synthesis, composition and I/O (including MIDI). The page is currently hosted by SoftSynth, which makes JSyn, a Java-based system for synthesizing sounds and music.

Sunday Oct 03, 2004

Our release of Sphinx-4 received a mention on Slashdot this week. As usual, the Slashdot hive descended on our site in droves. Here are some stats for our SourceForge web page:

And the stats for this blog.

It looks like Slashdot was responsible for an extra 30,000 hits on the SourceForge site and about 5,000 hits for this blog. That's quite a bit of traffic. Of course, the downside is that you have to put up with some of the typical Slashdot comments like this one, but there were some good ones too.

Friday Oct 01, 2004

One of my last tasks in preparing for the recent release of Sphinx-4 was to improve the performance of the DynamicFlatLinguist. The DynamicFlatLinguist is used to generate the search graph used by the decoder during speech recognition. It generates the search graph on the fly as the recognizer's search algorithm explores the graph. This allows very large and perplex grammars to be used without incurring long grammar compile times or enormous memory footprints. Although the DynamicFlatLinguist worked well in terms of reducing startup time and overall memory footprint, it made recognition about 3 times slower than with the non-dynamic flat linguist. So my final task in preparing for the release was to reduce the speed penalty of the DynamicFlatLinguist.
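
The idea of building a search graph on the fly can be sketched in a few lines of Java. This is not the DynamicFlatLinguist and it borrows nothing from the Sphinx-4 API. The class `LazyGraphSketch`, its binary-tree shaped graph, and the "always take the first successor" search are assumptions made up purely to show the trade-off.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy search graph whose states are created only when the search asks for them. Hypothetical names. */
final class LazyGraphSketch {
    private final int maxDepth;
    private int statesCreated = 0;

    LazyGraphSketch(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    /** A state that materializes its two successors on first request and remembers them afterwards. */
    final class State {
        final int depth;
        private List<State> successors; // null until the search first asks

        State(int depth) {
            this.depth = depth;
            statesCreated++;
        }

        List<State> successors() {
            if (successors == null) {
                successors = new ArrayList<>();
                if (depth < maxDepth) {
                    successors.add(new State(depth + 1));
                    successors.add(new State(depth + 1));
                }
            }
            return successors;
        }
    }

    int statesCreated() {
        return statesCreated;
    }

    /** Stand-in for a search: follow the first successor until a state has none; return the final depth. */
    int followFirstPath() {
        State current = new State(0);
        while (!current.successors().isEmpty()) {
            current = current.successors().get(0);
        }
        return current.depth;
    }

    public static void main(String[] args) {
        LazyGraphSketch graph = new LazyGraphSketch(20);
        int depth = graph.followFirstPath();
        long fullSize = (1L << 21) - 1; // states in the complete tree with levels 0 through 20
        System.out.println("reached depth " + depth + " after creating "
                + graph.statesCreated() + " of " + fullSize + " possible states");
    }
}
```

Following a single path touches 41 states, while compiling the whole graph up front would create more than two million. The cost is that successors are generated in the middle of the search, which is where a speed penalty like the one described above can come from.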

I had some idea where the bottlenecks in the linguist might be, but I wasn't exactly sure, and oftentimes my intuition on such things is just plain wrong, so I fired up JFluid to profile Sphinx-4 while using the DynamicFlatLinguist. One really powerful aspect of JFluid is that it can profile a small subset of the code, and this subset can easily be changed on the fly. So with JFluid, I could get an overall profile to see where the recognizer was spending its time, and then focus in directly on the hotspots related to the DynamicFlatLinguist. JFluid showed me that my intuition was partially correct about where the bottlenecks were, but it also showed a few problem areas that I hadn't anticipated. Based upon the JFluid results, I was able to focus my optimization efforts on the proper methods. A cache here, a code hoist there, and soon the DynamicFlatLinguist was running 3 times faster.
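
For readers who haven't met the terms, here is a minimal Java sketch of "a cache here, a code hoist there". It is not taken from Sphinx-4. The class `CacheAndHoistSketch`, the `expensiveScore` stand-in, and the sample data are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy before/after pair showing memoization and loop-invariant hoisting. Hypothetical names. */
final class CacheAndHoistSketch {
    int expensiveCalls = 0; // counts how often the costly routine actually runs
    private final Map<String, Double> cache = new HashMap<>();

    /** Stand-in for a costly computation whose result depends only on its argument. */
    double expensiveScore(String unit) {
        expensiveCalls++;
        double sum = 0;
        for (int i = 0; i < unit.length(); i++) {
            sum += Math.log(1 + unit.charAt(i));
        }
        return sum;
    }

    /** The cache: compute each distinct unit once, then reuse the stored result. */
    double cachedScore(String unit) {
        return cache.computeIfAbsent(unit, this::expensiveScore);
    }

    /** Before: the normalizer is recomputed on every pass and nothing is cached. */
    double slowTotal(String[] units, double[] weights) {
        double total = 0;
        for (String unit : units) {
            double norm = 0;
            for (double w : weights) {
                norm += w; // loop-invariant work repeated for each unit
            }
            total += expensiveScore(unit) / norm;
        }
        return total;
    }

    /** After: the invariant normalizer is hoisted out of the loop and scores come from the cache. */
    double fastTotal(String[] units, double[] weights) {
        double norm = 0;
        for (double w : weights) {
            norm += w;
        }
        double total = 0;
        for (String unit : units) {
            total += cachedScore(unit) / norm;
        }
        return total;
    }

    public static void main(String[] args) {
        CacheAndHoistSketch sketch = new CacheAndHoistSketch();
        String[] units = {"AH", "B", "AH", "AH", "B"};
        double[] weights = {0.5, 1.5};
        double slow = sketch.slowTotal(units, weights);
        int slowCalls = sketch.expensiveCalls;
        double fast = sketch.fastTotal(units, weights);
        int fastCalls = sketch.expensiveCalls - slowCalls;
        System.out.println("same answer: " + (slow == fast)
                + ", expensive calls: " + slowCalls + " vs " + fastCalls);
    }
}
```

Both versions return the same total, but the second one runs the costly routine once per distinct unit instead of once per occurrence. A cache only pays off when the same inputs really do recur, which is exactly the sort of thing a profiler can confirm before you add one.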

JFluid continues to be one of my favorite tools, but soon JFluid will be no more. Not because it is going away, but because it is now being incorporated into NetBeans. As of NetBeans 3.6, an early access version of JFluid technology can be incorporated directly into the IDE. JFluid will become the NetBeans Profiler.

Wednesday Sep 29, 2004

What is over 50 feet tall, weighs about 20 tons and can throw a 50 lb pumpkin 1000 feet? It's a trebuchet, of course. This trebuchet was built by a Yankee Farmer as a hobby. During the fall (when there is a ready supply of ammunition), he fires off two shots an hour at the castle up on the hill. Oh, did I mention he has a castle? What else would you aim a trebuchet at? The shot is quite amazing: the pumpkin is in the air for eight seconds. At its peak of over 500 feet you can barely see the pumpkin.

There's no charge to watch the shot, and last Saturday when I arrived with the kids there were probably 25 spectators hanging out. Of course the half-hour between shots gives you plenty of time to check out the farm stand and purchase some pumpkins or apples.

Here's a picture of the trebuchet as it is being loaded.

And one more just as they prepare to load the pumpkin.

Willie was the one who clued me in on this. Willie was so inspired that he decided to dig into our Lego collection and build his own trebuchet, just on a smaller scale. This instrument of warfare works just as well as its big brother. It can throw the little Lego man about 15 feet.

Ah ... the power of legos (and pumpkins).

Tuesday Sep 28, 2004

Willie, our fearless leader, just gave us the final go!, so Sphinx-4 1.0 beta is officially released. We've made quite a number of improvements over the last few months based upon user feedback. We are all quite proud of the work we've done on this system, building a world-class, state-of-the-art speech recognition system on the Java platform.

Here's the official announcement:

It is with great pleasure that we announce the 1.0 beta release of
Sphinx-4:

    http://cmusphinx.sourceforge.net/sphinx4

In this release, we have provided the following new features and 
improvements over the 0.1 alpha release:

    - Confidence scoring
    - Dynamic grammar support
    - JSGF limitations removed
    - Improved performance for large, perplex JSGF grammars
    - Filler support for JSGF Grammars
    - Out-of-grammar utterance rejection
    - Narrow bandwidth acoustic model
    - WSJ5K Language model
    - More demonstration programs
    - Better control over microphone selection
    - Lots of bug fixes

Sphinx-4 is a state-of-the-art, speaker-independent, continuous speech
recognition system written entirely in the Java programming language.
It was created via a joint collaboration between the Sphinx group at
Carnegie Mellon University, Sun Microsystems Laboratories, Mitsubishi
Electric Research Labs (MERL), and Hewlett Packard (HP), with
contributions from the University of California at Santa Cruz (UCSC)
and the Massachusetts Institute of Technology (MIT).

The design of Sphinx-4 is based on patterns that have emerged from the
design of past systems as well as new requirements based on areas that
researchers currently want to explore.  To exercise this framework,
and to provide researchers with a "research-ready" system, Sphinx-4
also includes several implementations of both simple and
state-of-the-art techniques.  The framework and the implementations
are all freely available via open source under a very generous
BSD-style license.

With the 1.0 beta release, you get the complete Sphinx-4 source tree
along with several acoustic and language models capable of handling a
variety of tasks ranging from simple digit recognition to large
vocabulary n-Gram recognition.

Because it is written entirely in the Java programming language,
Sphinx-4 can run on a variety of platforms without requiring any
special compilation or changes.  We've tested Sphinx-4 on the
following platforms with success: the Solaris 9 Operating System
on the SPARC platform, Mac OS X 10.3.5, RedHat 9.0, Fedora Core 1, 
Microsoft Windows XP, and Microsoft Windows 2000. 

Please give Sphinx-4 1.0 beta a try and post your questions,
comments, and feedback to one of the CMU Sphinx Forums:

    https://sourceforge.net/forum/?group_id=1904

We can also be reached at [email protected]

Sincerely,

The Sphinx-4 Team:  Peter Gorniak, MIT (developer)
(in alph. order)    Evandro Gouvea, CMU (developer and speech advisor)
                    Philip Kwok, Sun Labs (developer)
                    Paul Lamere, Sun Labs (design/technical lead)
                    Beth Logan, HP (speech advisor)
                    Pedro Moreno, Google (speech advisor)
                    Bhiksha Raj, MERL (design lead)
                    Mosur Ravishankar, CMU (speech advisor)
                    Bent Schmidt-Nielsen, MERL (speech advisor)
                    Rita Singh, CMU/MIT (design/speech advisor)
                    JM Van Thong, HP (speech advisor)
                    Willie Walker, Sun Labs (overall lead)
                    Manfred Warmuth, UCSC (speech advisor)
                    Joe Woelfel, MERL (developer and speech advisor)
                    Peter Wolf, MERL (developer and speech advisor)

Monday Sep 27, 2004

We are gearing up for the 1.0 Beta release of Sphinx-4, a speech recognizer written entirely in the Java programming language. One component of this release is this new paper called Sphinx-4: A Flexible Open Source Framework for Speech Recognition. This paper provides a good overview of the architecture of Sphinx-4, as well as a discussion of the accuracy and speed of the system. The paper makes good reading if you are interested in how a speech recognizer works, or if you are interested in seeing how the Java platform performs for a domain such as speech recognition that requires a large amount of computing power.

Wednesday Sep 22, 2004

Code reviews are good. Reviewers find bugs in your code, point out inconsistencies, areas where you are misusing the Java language, deviations from good style, and so forth. The problem with code reviewers is that they usually get tired after half an hour or so of reviewing code, and the quality of the review drops off.

The program FindBugs is a bug pattern detector for Java. FindBugs is just like a code reviewer that never gets tired. It scours through your code, looks for potential bugs and problems, and gives you a report. And unlike a human code reviewer, FindBugs doesn't give you that self-satisfied smirk when it finds a bug.

In preparation for a major release of Sphinx-4, I ran FindBugs against our code base (about 100K commented lines of code). I'd rather not say exactly how many bugs FindBugs reported. Suffice it to say that it was more than I expected. Constants that were not final, unused fields, inconsistent synchronization: all the types of flaws that a good, tireless code reviewer would find.
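
For anyone who hasn't run into these three kinds of flaws, here is a small Java sketch of what each one looks like and one possible fix. It is not Sphinx-4 code and it is not actual FindBugs output. The class `FrameCounter` and its fields are made up for illustration, with the flawed versions described in comments.

```java
/** Hypothetical class showing fixes for three common flaws. The flawed forms appear in the comments. */
final class FrameCounter {

    // Flaw: "static int MAX_FRAMES = 100;" is a constant in name only, since any code can reassign it.
    // Fix: declare it final.
    static final int MAX_FRAMES = 100;

    // Flaw: a field such as "private int lastError;" that is never read or written is dead weight
    // and often marks unfinished code. Fix: delete it (it has been removed here).

    private int frames;

    // Flaw: if increment() were synchronized but getFrames() were not, a reader thread
    // could see a stale count. Fix: guard every access to 'frames' with the same lock.
    synchronized boolean increment() {
        if (frames >= MAX_FRAMES) {
            return false; // already at the limit, nothing counted
        }
        frames++;
        return true;
    }

    synchronized int getFrames() {
        return frames;
    }

    public static void main(String[] args) {
        FrameCounter counter = new FrameCounter();
        for (int i = 0; i < 150; i++) {
            counter.increment();
        }
        System.out.println("frames counted: " + counter.getFrames());
    }
}
```

None of these flaws necessarily breaks a program on the day it is written, which is why they tend to slip past a tired human reviewer.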

I was surprised and impressed by the types of bugs FindBugs can find. It has proven to be extremely useful in improving my code quality. I've added FindBugs to my set of must-use Java tools, next to ant and JFluid.

Tuesday Sep 21, 2004

My 14-year-old daughter is diagramming sentences in her English class. She asked me to give her a few sentences to diagram so she could prepare for tomorrow's quiz. So I gave her the sentence:

The old man the boat.

Well, of course, she argued with me about how that was not a proper sentence, that I was doing it wrong, and really I wasn't that good of a father after all. After about 10 minutes of my insisting that it was indeed a proper sentence and she just had to look a bit harder, I saw the light dawn on her face. "Oh, I see, 'man' can be a verb."

If she asks me for another sentence (which I rather doubt), I'll give her this one:

The horse raced past the barn fell.

Update: Chris argues that 'raced past' is better than 'ran past'; I agree.

Monday Sep 20, 2004

I enjoy programming puzzles, and as such I really like MaryMary's weekly Java puzzler. This weekend, while sorting out an attic bookcase, I came across my old copy of The C Puzzle Book by Alan Feuer. I got this book way back in the early eighties when I was first learning C. This book contains the memorable puzzle called Pointer Stew. If you could work out what Pointer Stew output, you really knew your pointers, arrays and operator precedence.

#include <stdio.h>

char *c[]={
	"ENTER",
	"NEW",
	"POINT",
	"FIRST"
};
char **cp[]={c+3,c+2,c+1,c};
char ***cpp=cp;

int main(void)
{
	printf("%s",**++cpp);
	printf("%s ",*--*++cpp+3);
	printf("%s",*cpp[-2]+3);
	printf("%s\n",cpp[-1][-1]+1);
	return 0;
}

You just couldn't come up with such a complicated puzzle in Java (and that's a good thing).

Saturday Sep 18, 2004

NASA has been working on a system to recognize sub-vocal speech. With this system, a person need only mouth the words without actually uttering anything. For very small vocabulary tasks, they were able to achieve 92% accuracy. Someday, all of our cellphones will have this built in, allowing you to converse while in the movie theater without bothering anyone around you (except maybe your date). In the meantime, it's probably best if you just turn the phone off.

This blog copyright 2010 by plamere