Friday, May 30, 2008

A minor change

I found a bug (sort of) in Concord a couple of days ago. It's a minor one and it only happens when you use the database mode in Concord. Well, it's more of a memory leak. I had implemented a forced garbage collection that runs when the full text is displayed in the context view of Concord, but somehow the memory was not released. So I changed the way the text is read from a database file. Now it should not keep using additional memory when you select a different concordance line to show its full text.

I use the same technique to read data from a database file when CasualConc searches for a string, but when I applied the same change to the search function, it used more memory because a search returns more hits. What this means is that if you search for word(s)/phrase(s) in any of the tools many times, CasualConc keeps using more memory. I haven't tested whether it uses up all the available memory and starts using virtual memory, or whether Ruby starts GC once all the available physical memory is used up. In any case, until I can find a way to solve this problem, you might want to quit CasualConc after a while and restart it.

Tuesday, May 20, 2008

A few more fixes again

I fixed a few more bugs this weekend, mainly related to lemmatization and collocation statistics. I also added some more documentation to the main site (some of it in Japanese). The latest version is still 0.9.7 but the date is 05192008.

Now, most of the features I wanted to include in CasualConc are there and mostly functioning. I don't have time to improve the Japanese kwic feature now, so that will have to wait until sometime in summer or fall. And unless I find or someone reports any major bugs, I will try not to spend too much time on this for a while. I don't know how many people have actually downloaded CasualConc and are using it, but I guess there aren't many. If you happen to be one of them, I'd like to hear what you think about it.

Well, I might need to publicize this a bit more, so I might start trying to get more beta testers somewhere.

Sunday, May 18, 2008

Another bug fix and minor update

I highly doubt that anyone has downloaded CasualConc recently, but anyway, I found a few bugs and also made some changes that I had wanted to make for a while. Now the latest version is 0.9.7 beta.

First, I found a bug in Japanese Concordance, which I'm sure nobody has ever used. When I dropped the text only mode, I forgot to make the same change in Japanese concordance mode. Now it should be fixed. I also fixed some other bugs that relate to the recent feature changes.

The changes I made are mainly to Collocation. Now, if you search for multiple words or use a wildcard search and multiple words are found, collocation information will be displayed for each keyword. This change affected the statistics calculation, so I made what I think are the necessary changes to it.
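To illustrate what "collocation information for each keyword" means, here is a minimal sketch in Python. This is not CasualConc's actual code; the function name collocates_by_keyword, the whitespace tokenization, and the default span of five words on each side are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def collocates_by_keyword(tokens, keywords, span=5):
    """Count collocates separately for each matched keyword.

    `span` is the number of words taken on each side of a hit
    (an assumed default, not a confirmed CasualConc setting).
    """
    wanted = {k.lower() for k in keywords}
    table = defaultdict(Counter)
    for i, token in enumerate(tokens):
        node = token.lower()
        if node not in wanted:
            continue
        left = tokens[max(0, i - span):i]
        right = tokens[i + 1:i + 1 + span]
        # Each keyword gets its own counter instead of one merged table.
        for word in left + right:
            table[node][word.lower()] += 1
    return table


tokens = "the cat sat on the mat and the dog sat on the rug".split()
result = collocates_by_keyword(tokens, ["cat", "dog"], span=1)
for keyword, counts in result.items():
    print(keyword, dict(counts))
```

Keeping one counter per keyword is also why the statistics need adjusting: the node frequency used in each calculation has to be that of the individual keyword, not of all hits together.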

I also made a minor change to the Export result function of Concord. Originally, a CSV file exported from Concord only included kwic results and file paths. Now there is an option to include context words (L5 - R5). To include them, go to Preferences -> Concord and check the box Include context words (L5 - R5) in CSV output.
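As a rough picture of what the exported file looks like with and without the option, here is a small Python sketch. The column order and header names (kwic, file, L5 ... R5), and the helper names context_columns and export_csv, are assumptions for illustration; the real CasualConc output may be laid out differently.

```python
import csv
import io


def context_columns(tokens, index, span=5):
    """Return the `span` words left and right of tokens[index], padded with ''."""
    left = tokens[max(0, index - span):index]
    right = tokens[index + 1:index + 1 + span]
    # Pad so that L5..L1 and R1..R5 always occupy the same columns.
    left = [""] * (span - len(left)) + left
    right = right + [""] * (span - len(right))
    return left, right


def export_csv(hits, include_context=False, span=5):
    """hits: list of (file_path, tokens, keyword_index) tuples (hypothetical shape)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = ["kwic", "file"]
    if include_context:
        header += [f"L{n}" for n in range(span, 0, -1)]
        header += [f"R{n}" for n in range(1, span + 1)]
    writer.writerow(header)
    for path, tokens, index in hits:
        left, right = context_columns(tokens, index, span)
        line = " ".join(w for w in left + [tokens[index]] + right if w)
        row = [line, path]
        if include_context:
            row += left + right
        writer.writerow(row)
    return out.getvalue()


hits = [("corpus/a.txt", "I saw the cat on the mat".split(), 3)]
print(export_csv(hits, include_context=True))
```

Having each context word in its own column makes it easy to sort or filter by, say, the first word to the right in a spreadsheet.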

As always, if you happen to find this blog or the main page and try CasualConc, any feedback (including bug reports) is welcome. Especially if you find it useful, I'd like to know.

Thursday, May 15, 2008

Another quick fix

I found a minor (maybe major to someone, if anyone ever uses CasualConc) bug and fixed it today. It only affects you if you use Concord with files other than plain text as your corpus files, and it only happens in the paragraph mode (the default mode). Now you should be able to use other file types as your corpus files with Concord in the paragraph mode.

I found this bug when I was testing .odt files. After the fix, I was able to use .odt files as corpus files, so this confirms CasualConc can read .odt files!

I don't know how many people are affected (not many, I know), but if you downloaded CasualConc in the last couple of weeks, please go to the site and download the latest version. It has the same version number (0.9.6), but a different date (05142008).

And if you find any other bugs, please let me know. The email address is on the main site.

Tuesday, May 13, 2008

Some details of last update

As I mentioned in the last post, I added/activated a couple of new features in CasualConc. One is based on the lemmatizing function and the other is something for Concord.

The first, which is based on the lemmatizing function, is keyword grouping, or whatever name I settle on (it has a tentative label). To use it, you first prepare a text file (UTF-8) in the same format the lemmatizer accepts. The default is:

keyword -> word,word,word,...

The keyword is a grouping label, so if you want to group the days of the week, it looks like:

week -> Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday

Once you have prepared as many groups as you want, save the file as plain text with UTF-8 encoding. Then go to Preferences in CasualConc -> Lemma, and check Grouped Keywords. Next, select the file you just saved by clicking the Select Grouping File button. Now everything is set.
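For anyone who wants to check a grouping file before loading it, the default format above is simple enough to read with a few lines of Python. This is only a sketch of the format as described here, not CasualConc's own parser; the function name parse_groups is made up, and skipping lines without "->" is my assumption about how malformed lines might be handled.

```python
def parse_groups(text):
    """Parse lines of the form 'keyword -> word,word,word,...' into a dict."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines without '->' are simply ignored.
        if not line or "->" not in line:
            continue
        label, _, words = line.partition("->")
        groups[label.strip()] = [w.strip() for w in words.split(",") if w.strip()]
    return groups


sample = "week -> Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday\n"
print(parse_groups(sample))
```

To read an actual file, open it with encoding="utf-8" and pass its contents to the function.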

If this works as intended, you should be able to use this function in Concord, Cluster, and Collocation/Cooccurrence. All you do is add @@ at the beginning of your search word. So if you want to search for all the days of the week, as specified above, you enter @@week, then search. This should find all the words in the group. Technically, you should be able to search multiple groups, but that is not fully tested and might not work, and I don't know what will happen if you combine this feature with wildcard search. I might change the behavior of this feature if I ever get any feedback.
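Conceptually, a search term starting with @@ is replaced by all the words in that group. The Python sketch below shows one way this could work, by building a single regular expression; it is not how CasualConc does it internally. The function name build_pattern, the case-insensitive matching, and treating several space-separated terms as alternatives are all assumptions.

```python
import re


def build_pattern(query, groups):
    """Expand '@@label' terms into their group words and compile one regex."""
    alternatives = []
    for term in query.split():
        if term.startswith("@@"):
            label = term[2:]
            if label not in groups:
                raise KeyError(f"no such group: {label}")
            alternatives.extend(groups[label])
        else:
            alternatives.append(term)
    body = "|".join(re.escape(word) for word in alternatives)
    # \b keeps 'Monday' from matching inside 'Mondays'.
    return re.compile(rf"\b(?:{body})\b", re.IGNORECASE)


groups = {"week": ["Monday", "Tuesday", "Wednesday", "Thursday",
                   "Friday", "Saturday", "Sunday"]}
pattern = build_pattern("@@week", groups)
print(pattern.findall("See you on Monday or friday, but not on Mondays."))
```

Because every word is passed through re.escape, wildcard characters would be matched literally in this sketch, which is one reason combining groups with wildcard search needs separate handling.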

Another somewhat major addition, which is not documented at this time, is a function for Concord. You can now open a concordance result in a new window. This might be useful if you want to compare several concordance results. To use this function, search for any word(s) in Concord, and then go to Menu -> Misc -> Open Concordance Result in New Window. This is experimental. I added it because I found a way to add a multiple window function to a program (I just wanted to have something so that I remember how to do it). You should be able to re-sort the results in a new window, just like in the main window. But beware: if the concordance result is huge (say, 10000 hits), using this might eat a lot of memory because CasualConc keeps all the information in memory. If you have at least 2GB of memory, this should be less of a problem, though.

Finally, I have something that is not related to CasualConc. I posted a couple of weeks ago that I wrote a simple utility program that helps with typing IPA characters. I wrote a similar program(?) in JavaScript and added it to the IPATypist page. I highly doubt many people read this post, especially people who don't use Leopard, but this version is written for those people. It should run on Tiger with Firefox, Safari and Camino. I haven't tested it on IE on Windows and I have no intention to support it, but it might work. It is also available for download, so if you are interested, you can download it and use it on your computer or put it on your course site or wherever you want to use it, though I can't guarantee it will work.

As always, if you ever use any of the programs, I'd appreciate your feedback. That will motivate me to improve them.