I've been fixing minor bugs and adding a few minor features over the past few weeks, but I found a more serious bug, so I fixed it.
This only applies to wildcard searches in the Database mode. When you searched in Concord (and probably in Cluster and Collocation) using wildcard characters, a search could take a very long time, depending on the combination of words and wildcard characters. This was a bug and I thought I had fixed it a long time ago, but it looks like I only did so for the File mode. I have applied the same fix to the Database mode, so this shouldn't be an issue any more.
Since I last updated the blog, I have made a few minor changes and various minor bug fixes. A couple of notable features are in Corpus File Information. I think I posted either here or on the main site that I added a new frequency table feature to count groups of words for each corpus/database. Now you can choose whether to count all the words in a corpus/database or the words in each file in a corpus/database. I also added a function to save the Corpus File Information table results as a file (not export as a CSV file) and later import it back into CasualConc.
Anyway, if you use the Database mode frequently, you might want to update to this version (1.9.2, 20120328 or later).
If you find any other problems, please let me know.
Wednesday, March 28, 2012
Sunday, March 11, 2012
CasualConc beta update
It has been a known issue that CasualConc crashes when viewing results in tables. It is actually not a bug in CasualConc itself, but a bug in RubyCocoa, which CasualConc depends on. To address this issue, I decided to turn off garbage collection on the Ruby side when CasualConc is not processing texts, that is, while you are viewing results. This might increase memory usage, but should provide a much better experience (or at least I hope so).
Also, I fixed a few minor bugs related to lemmatization, spelling variation, and stop word processing.
Please try this new beta version 1.9.1 and let me know if you encounter any problems. It is available on the CasualConc site. If this turns out to be much more problematic than the table view bug, I will revert to the previous build.
Sunday, February 26, 2012
CasualConc beta update and more
I haven't updated this blog for a while, but I have made a few bug fixes and a few feature additions. I also added a small utility program that accompanies CasualConc.
Bug fixes
Concord
- Tag search mode should work now
Word Count
- The counts of files in which particular words appear are now correctly displayed in the Database mode with the lemmatize option on
- Keyness statistics are correctly calculated in the Database mode with lemmatize option on
- Specified string search mode is functional in the Database mode
- Tag list creation should work now
Enhancements
General
- Corpus/database switching in the Advanced Corpus Handling mode is now available in Concord and Collocation; you can switch them directly on each tool
- When exporting results in CSV or Tab-delimited format, you can select .txt in addition to .csv, though the default is still .csv.
- You can specify context tags (or any strings) to limit the search only to specific section(s) (See Preferences -> Tags)
- Enhancements on Context Tags to Ignore Settings in Preferences
- You can now specify ** to ignore any character in brackets (i.e. <>)
- You can now add files to a selected corpus/database by drag&drop when files are shown in Advanced Mode
Concord
- You can edit the preset sort orders in Preferences -> Concord (this might have been introduced before)
Word Count
- Specified string search has search history
- Search function on the result table is enhanced
Corpus File Information
- You can count sets of words/phrases for each selected corpus/database in Word Group Freq Table; the format is as follows: Group Name->word1,word2,...
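As a rough illustration of the Group Name->word1,word2,... format, here is a minimal Python sketch of how such lines could be parsed and counted. This is not CasualConc's code; the function names, the sample groups, and the case-insensitive matching are my assumptions, and it only handles single words, not phrases.

```python
from collections import Counter


def parse_word_groups(lines):
    """Parse 'Group Name->word1,word2,...' lines into {group name: [words]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank or malformed lines
        name, _, members = line.partition("->")
        # Lowercasing is an assumption made for this sketch.
        words = [w.strip().lower() for w in members.split(",") if w.strip()]
        groups[name.strip()] = words
    return groups


def count_groups(tokens, groups):
    """Sum the frequencies of each group's member words in a token list."""
    counts = Counter(t.lower() for t in tokens)
    return {name: sum(counts[w] for w in words) for name, words in groups.items()}


# Hypothetical group definitions and a toy text.
groups = parse_word_groups(["Modals->can,could,may,might", "Articles->a,an,the"])
tokens = "The cat could see a bird and the dog might chase it".split()
result = count_groups(tokens, groups)
print(result)
```

Running the same counting function once per file instead of once per corpus would give a per-file table.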
CasualConc Viewer
It's been reported that CasualConc crashes when scrolling fast on result tables. This is not a bug in CasualConc itself but a bug in the programming environment CasualConc depends on (RubyCocoa). I've asked the maintainer of RubyCocoa to fix the bug, but it hasn't been resolved yet, so I decided to create a viewer app. This viewer is written in MacRuby and its table view is much more stable.
To use the viewer, after you create KWIC results or any lists on a table, go to Misc -> Open with Viewer. If you want to view the results on the right table on Cluster and Word Count, go to Misc -> Open with Viewer (Right).
The viewer is just a viewer, so you can't do much. I might add a few more functions, but if you want to export results or calculate statistics, you should do it on CasualConc.
e-lemma file
With the current beta (1.9.0), the e-lemma file is included in the disk image (with permission). The e-lemma file is a lemma list file created by Prof. Yasumasa Someya at Kansai University. You can import the list for lemmatization in CasualConc.
Another file, a-e spelling differences, is a list of American/British spelling pairs. You can also import this list into CasualConc.
With these lists imported, you can use the lemmatization function and the spelling variation function. When they are applied to a search word, you can search for words of the same lemma as well as spelling variants.
If you find any other bugs, please let me know.
Sunday, December 4, 2011
CasualConc beta bug fix
I got a bug report, so I fixed it.
The problem was in File Info. When exporting a Word Freq Info result, low frequency counts for individual files were sometimes omitted. This was because the cells with no numbers were not skipped, so when it reached the number of types in a file, CasualConc stopped handling the data for that file for exporting. Internally, the frequency counts were stored (you could see them in the window), so I made sure that CasualConc handles all the data properly.
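To show the kind of export this is about, here is a minimal Python sketch that writes a per-file word frequency table and fills empty cells with 0, so that no row is dropped. It is not the actual CasualConc code; the function name, the data layout, and the file names are assumptions for illustration only.

```python
import csv
import io


def export_word_freq(per_file_counts):
    """Return CSV text with one row per word and one column per file."""
    files = sorted(per_file_counts)
    # Build the row list from the full vocabulary, not from any single file,
    # so words missing from one file are still exported for the others.
    vocab = sorted({w for counts in per_file_counts.values() for w in counts})
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["word"] + files)
    for word in vocab:
        # An empty cell (word absent from a file) becomes an explicit 0.
        writer.writerow([word] + [per_file_counts[f].get(word, 0) for f in files])
    return buf.getvalue()


# Hypothetical counts for two files.
counts = {"a.txt": {"the": 5, "cat": 1}, "b.txt": {"the": 3, "dog": 1}}
csv_text = export_word_freq(counts)
print(csv_text)
```

The point of the sketch is that the loop runs over every word in the table, so low-frequency words that appear in only one file still get a row.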
If you find any problem, please let me know.
Sunday, November 27, 2011
CasualTranscriber alpha
Since the current version of CasualTranscriber is quite buggy because of the programming language I use, I decided to rewrite it in another language. Now, the new version has most of the basic functions for transcribing, so I decided to release it as an alpha version. It does not have all the functions that the current version has and is more likely to have bugs, but it is, in a sense, more stable (at least in my environment).
If you are using CasualTranscriber, especially on Lion, please try it and let me know what you think.