Wednesday, September 26, 2012

CasualConc bug fixes and more

Since I last posted here, I have made a few changes and fixed some bugs.

- some minor interface changes (sort, context word, etc)
- you can now save the Concord result table as PDF (Print or Command + P)
- more accurate keyword coloring in the context view when tags are suppressed

Word Count
- Phrase-frames list (e.g. in * of): Preferences -> Others, Word Count, Advanced Mode
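
To give a rough idea of what a phrase-frame list contains, here is a minimal Python sketch that counts three-word sequences with the middle word replaced by `*`. It is not how CasualConc builds its list; the function name `phrase_frames`, the whitespace tokenization, and the default frame length are all assumptions made for illustration.

```python
from collections import Counter

def phrase_frames(tokens, n=3):
    """Count n-word frames in which one interior word is replaced by '*'."""
    frames = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        # Open each interior position in turn; the first and last words stay fixed.
        for slot in range(1, n - 1):
            frame = gram[:slot] + ["*"] + gram[slot + 1:]
            frames[" ".join(frame)] += 1
    return frames

# Hypothetical sample text, split on whitespace only.
text = "in front of the house in need of help in spite of it"
counts = phrase_frames(text.split())
print(counts.most_common(1))  # [('in * of', 3)]
```

Here "in front of", "in need of" and "in spite of" all collapse into the single frame "in * of".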

- the minimum window size is set to 800 x 550px
- some minor interface changes to accommodate the minimum window size

Bug fixes

- saving the results should work now
- keyword coloring in the context view in the Database mode

- should now run in non-Japanese environments (it didn't before)

I also started to update the documentation. The pages marked (updated) on the How to Use page have been partially updated.

Thursday, June 28, 2012

CasualConc beta bug fix

With the recent changes, I introduced a serious bug in the Concord data save function, which I have now fixed. It should work fine now.

I also made a few more changes to the Concord Plot function.

Concord Plot
- you can divide the plots by a fixed number of units (characters/words)
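
One possible reading of this option is that each plot is cut into segments of a fixed size, and hits are grouped by segment. The Python sketch below shows that idea only; the function name `bin_hits`, the 0-based positions, and the sample numbers are assumptions, not CasualConc's actual code.

```python
def bin_hits(hit_positions, total_units, unit_size):
    """Count hits per segment, where each segment spans unit_size units.

    Positions are assumed to be 0-based offsets (characters or words)
    smaller than total_units.
    """
    n_segments = -(-total_units // unit_size)  # ceiling division
    counts = [0] * n_segments
    for pos in hit_positions:
        counts[pos // unit_size] += 1
    return counts

# A 500-word file divided into 100-word segments.
print(bin_hits([3, 120, 130, 480], total_units=500, unit_size=100))
# [1, 2, 0, 0, 1]
```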

Let me know if you find any problems.

Saturday, June 16, 2012

CasualConc beta minor update

Upon request, I added a few minor features to CasualConc beta.

- If you enable lemmatization and apply lemmatization to Concord search words, you can sort the Concord results by lemmas of the key.
- When you export Concord results, you can insert characters before and after the key.

Concord Plot
- export plot data (File -> Save Table)
- import to the plots of the same corpus/files (File -> Open Saved Data)
- variable plot widths for each file
- fixed widths for the plot view
- refresh to reflect changes to the Concord results (by deletion)

Bug fix
- context tag handling in Concord

With the changes to Concord, you can now sort by lemma, and then by each word. This is not available for the keyword group function at the moment. I'll see if I can apply it there if I get requests.

If you use the lemmatization function with word family lists, you can sort by the word family first, and then by each word.
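
The two-level sort described above (lemma first, then the word itself) can be sketched in Python with a tuple sort key. The small `lemma_of` table and the sample hits are hypothetical; CasualConc takes lemmas from the lemma list you import.

```python
# Hypothetical lemma table: word form -> lemma.
lemma_of = {
    "went": "go", "goes": "go", "going": "go",
    "ran": "run", "runs": "run",
}

def sort_key(keyword):
    """Sort by lemma first, then by the word form itself."""
    word = keyword.lower()
    # Words missing from the table are treated as their own lemma.
    return (lemma_of.get(word, word), word)

hits = ["runs", "went", "ran", "goes", "go"]
print(sorted(hits, key=sort_key))
# ['go', 'goes', 'went', 'ran', 'runs']
```

Using a word family list instead of a lemma list would only change the contents of the lookup table, not the sorting logic.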

If you want to insert characters before and after the key when you export Concord results, go to Preferences -> Concord, check Insert markers, and type any characters before and after KEY.

With the Concord Plot changes, you can now put three different search results on one plot. First, search for any word(s)/phrase(s) in Concord with plotting on, then save the data on the Plot view. Do the same with another search and save that data too. Then, with the same corpus/files, search for new word(s)/phrase(s) and, on the Plot view, open the saved data one by one. You can set the color of plots in Preferences -> Others.

To make the plot widths relative to the file length, go to Preferences -> Others -> Concordance Plot and select Relative (or Same Width for the same plot width for all the files).
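
As an illustration of the difference between the two settings, the Python sketch below scales each plot to the longest file when widths are relative. The function name `plot_widths`, the 400-pixel maximum, and the rounding are assumptions for the example, not CasualConc's actual calculation.

```python
def plot_widths(file_lengths, max_width=400, relative=True):
    """Return one plot width (in pixels) per file.

    relative=True  -> width proportional to file length (longest file = max_width)
    relative=False -> every file gets max_width
    """
    if not file_lengths:
        return []
    if not relative:
        return [max_width] * len(file_lengths)
    longest = max(file_lengths)
    # Keep at least 1 pixel so very short files remain visible.
    return [max(1, round(max_width * n / longest)) for n in file_lengths]

print(plot_widths([1000, 500, 250]))                  # [400, 200, 100]
print(plot_widths([1000, 500, 250], relative=False))  # [400, 400, 400]
```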

If you want to make the plots wider, set Width (print) to the number of pixels you want and check Apply to On-screen.

I might move the settings to a new tab later, but for now, the settings are under Others.

The new beta version is 1.9.5.

If you find any problems, please let me know.

Monday, April 23, 2012

Bug fix to CasualConc beta

With a few minor changes I made in the last couple of weeks, I introduced yet another bug.

If you tried to search for anything in Concord, you got a warning message, though you should still have been able to search for words/phrases. I have addressed this bug in the latest build (20120423).

The change I made was to allow wildcard-only searches in Concord in Word(s) mode and non-word character searches in the Character/Regular Expression search modes.

Also, I made one minor feature addition upon request. If you add (x is any string of your choice) to your corpus files and search Concord with Concord Plot, you will see red lines where the tag(s) are inserted. So if you have any section breaks in your file(s), you can mark them on the plots. This is still an experimental feature, and hopefully I can make it a little more flexible as soon as I have time to work on it more.

In any case, if you have downloaded CasualConc in the last couple of weeks, please get the latest beta build.

Wednesday, April 4, 2012

Well, another bug fix...

I thought I fixed a bug in the Database mode last time, but it turned out I introduced another bug. Last time I fixed the process of converting search strings with wild card characters to SQL query strings, but the change I implemented caused SQL query errors. It should be fixed now unless your search string is full of wild card characters.
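
For readers curious about what such a conversion involves, here is a minimal Python sketch that turns a wildcard search string into an SQL LIKE pattern and runs it against an in-memory SQLite table. It is not the code CasualConc uses. The wildcard meanings (`*` for any run of characters, `?` for one character), the function name `wildcard_to_like`, the table layout, and the use of SQLite are all assumptions for the example.

```python
import sqlite3

def wildcard_to_like(pattern, escape="\\"):
    """Convert '*' and '?' wildcards to SQL LIKE syntax.

    Literal '%', '_' and the escape character are escaped so that they
    are not misread as LIKE wildcards.
    """
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch == "?":
            out.append("_")
        elif ch in ("%", "_", escape):
            out.append(escape + ch)
        else:
            out.append(ch)
    return "".join(out)

# Hypothetical one-column word table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("color",), ("colour",), ("colors",), ("col_r",)],
)

# The pattern is passed as a bound parameter, never spliced into the SQL text.
rows = conn.execute(
    "SELECT word FROM words WHERE word LIKE ? ESCAPE '\\' ORDER BY word",
    (wildcard_to_like("colo*r"),),
).fetchall()
matches = [r[0] for r in rows]
print(matches)  # ['color', 'colour']
```

Passing the converted pattern as a parameter avoids the kind of query errors that can occur when special characters end up inside the SQL string itself.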

So, if you use the Database mode with wild card characters, please download the latest beta (1.9.2, 20120404).

And please let me know if you find any other bugs.

Wednesday, March 28, 2012

CasualConc beta bug fix

I've been fixing some minor bugs and adding a few minor features in the past few weeks, but I found a somewhat more serious bug, so I fixed it.

This only applies to searching in the Database mode with wild card characters. When you searched in Concord (and probably in Cluster and Collocation) using wild card characters, a search could take a very long time, depending on the combination of words and wild card characters. This was a bug and I thought I had fixed it a long time ago, but it looks like I only did it for the File mode. I applied the same fix to the Database mode, so this shouldn't be an issue any more.

Since I last updated the blog, I have made a few minor changes and various minor bug fixes. A couple of notable features are in the Corpus File Information. I think I posted either here or on the main site that I added a new frequency table feature to count groups of words for each corpus/database. Now you can choose whether to count the words across a whole corpus/database or in each file of a corpus/database. I also added a function to save the Corpus File Information table results as a file (not export as a CSV file) and later import it back into CasualConc.

Anyway, if you use the Database mode frequently, you might want to update to this version (1.9.2, 20120328 or later).

If you find any other problems, please let me know.

Sunday, March 11, 2012

CasualConc beta update

It has been a known issue that CasualConc crashes when viewing results on tables. It is actually not a bug in CasualConc itself, but a bug in RubyCocoa, which CasualConc depends on. To address this issue, I decided to turn off garbage collection on the Ruby side when not processing texts, that is, while you are viewing results. This might increase memory usage, but it should provide a much better experience (or at least I hope so).

Also, I fixed a few minor bugs related to the lemmatization, spelling variation, and stop word processing.

Please try this new beta version 1.9.1 and let me know if you encounter any problems. It is available on the CasualConc site. If this turns out to be much more problematic than the table view bug, I will revert to the previous build.

Sunday, February 26, 2012

CasualConc beta update and more

I haven't updated this blog for a while, but I have made a few bug fixes and a few feature additions. I also added a small utility program that accompanies CasualConc.

Bug fixes
- Tag search mode should work now

Word Count
- The counts of files in which particular words appear are now correctly displayed in the Database mode with the lemmatize option on
- Keyness statistics are correctly calculated in the Database mode with lemmatize option on
- Specified string search mode is functional in the Database mode
- Tag list creation should work now

- Corpus/database switching in the Advanced Corpus Handling mode is now available in Concord and Collocation; you can switch them directly on each tool
- When exporting results in CSV or Tab-delimited format, you can select .txt in addition to .csv, though the default is still .csv.
- You can specify context tags (or any strings) to limit the search only to specific section(s) (See Preferences -> Tags)
- Enhancements on Context Tags to Ignore Settings in Preferences
- You can now specify ** to ignore any character in brackets (i.e. <>)
- You can now add files to a selected corpus/database by drag and drop when files are shown in Advanced Mode

- You can edit the preset sort orders in Preferences -> Concord (this might have been introduced before)

Word Count
- Specified string search has search history
- Search function on the result table is enhanced

Corpus File Information
- You can count sets of words/phrases for each selected corpus/database in Word Group Freq Table; the format is as follows: Group Name->word1,word2,...
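
To show how a line in this format can be read and counted, here is a minimal Python sketch. It handles single words only (not phrases) and is case-insensitive; the function names, the sample groups, and the sample text are assumptions, and CasualConc's own counting may differ.

```python
from collections import Counter

def parse_group(line):
    """Split 'Group Name->word1,word2,...' into (name, [words])."""
    name, _, words = line.partition("->")
    return name.strip(), [w.strip().lower() for w in words.split(",") if w.strip()]

def group_frequencies(lines, tokens):
    """Sum the frequencies of each group's words over a token list."""
    counts = Counter(t.lower() for t in tokens)
    result = {}
    for line in lines:
        name, words = parse_group(line)
        result[name] = sum(counts[w] for w in words)
    return result

# Hypothetical groups and text.
groups = ["Modals->can,could,may,might", "Articles->a,an,the"]
tokens = "The cat could see a bird and the bird might see the cat".split()
freqs = group_frequencies(groups, tokens)
print(freqs)  # {'Modals': 2, 'Articles': 4}
```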

CasualConc Viewer
It's been reported that CasualConc crashes when scrolling fast on result tables. This is not a bug in this particular program but a bug in the program environment CasualConc depends on (RubyCocoa). I've asked the maintainer of RubyCocoa to fix the bug, but it hasn't been resolved yet, so I decided to create a viewer app. This viewer is written in MacRuby and its table view is much more stable.

To use the viewer, after you create KWIC results or any lists on a table, go to Misc -> Open with Viewer. If you want to view the results on the right table on Cluster and Word Count, go to Misc -> Open with Viewer (Right).

The viewer is just a viewer, so you can't do much. I might add a few more functions, but if you want to export results or calculate statistics, you should do it on CasualConc.

e-lemma file
With the current beta (1.9.0), the e-lemma file is included in the disk image (with permission). The e-lemma file is a lemma list file created by Prof. Yasumasa Someya at Kansai University. You can import the list for lemmatization in CasualConc.

Another file, a-e spelling differences, is a list of American/British spelling pairs. You can also import this list to CasualConc.

You can use the lemmatization function and the spelling variation function. When they are applied to a search word, you can search for words of the same lemma as well as their spelling variants.

If you find any other bugs, please let me know.