<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-626625090667668862</id><updated>2012-01-26T00:20:59.183+09:00</updated><category term='ruby'/><category term='casualconc'/><category term='japanese'/><category term='utility program'/><category term='mecab'/><title type='text'>CasualConc - a concordancer for Mac OS X</title><subtitle type='html'>This blog is mostly about a concordance program I have been developing for Mac OS X and some related stuff.  CasualConc is designed for not-so-serious corpus analysis (text analysis).  You can download &lt;b&gt;CasualConc&lt;/b&gt; from the main site (&lt;a href="http://sites.google.com/site/casualconc/"&gt;English&lt;/a&gt; or &lt;a href="http://sites.google.com/site/casualconcj/"&gt;Japanese&lt;/a&gt;).</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>86</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7344210174058535921</id><published>2011-12-04T09:50:00.001+09:00</published><updated>2011-12-04T09:54:27.910+09:00</updated><title type='text'>CasualConc beta bug fix</title><content type='html'>I got a bug report, so I fixed it.&lt;br /&gt;&lt;br /&gt;The problem was in File Info. When exporting a Word Freq Info result, low frequency counts of individual files were sometimes omitted. This was because the cells with no numbers were not skipped. So when it reached the number of types in a file, CasualConc stopped processing the data for that file for export. Internally, the frequency counts were stored (you could see them in the window), so I made sure that CasualConc handles all the data properly.&lt;br /&gt;&lt;br /&gt;If you find any problems, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7344210174058535921?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7344210174058535921/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7344210174058535921' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7344210174058535921'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7344210174058535921'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2011/12/casualconc-beta-bug-fix.html' title='CasualConc beta bug 
fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7131316596443759331</id><published>2011-11-27T03:05:00.001+09:00</published><updated>2011-11-27T03:10:40.385+09:00</updated><title type='text'>CasualTranscriber alpha</title><content type='html'>Since the current version of CasualTranscriber is quite buggy because of the programming language I use, I decided to rewrite it in another language. Now, the new version has most of the basic functions for transcribing, so I decided to release it as an alpha version. It does not have all the functions that the current version has and is more likely to have bugs, but it is, in a sense, more stable (at least in my environment).&lt;br /&gt;&lt;br /&gt;If you are using CasualTranscriber, especially on Lion, please try it and let me know what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7131316596443759331?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7131316596443759331/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7131316596443759331' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7131316596443759331'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7131316596443759331'/><link rel='alternate' type='text/html' 
href='http://casualconc.blogspot.com/2011/11/casualtranscriber-alpha.html' title='CasualTranscriber alpha'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7437819978496886952</id><published>2011-09-30T23:16:00.000+09:00</published><updated>2011-09-30T23:16:24.688+09:00</updated><title type='text'>Lion compatibility and ...</title><content type='html'>It's been a couple of months since Lion was released.&amp;nbsp; Today, I made some changes to the applications so that they run on Lion (at least).&amp;nbsp; I don't have time to test all the features, so if you use them on Lion, please let me know whether they work fine or have problems.&lt;br /&gt;&lt;br /&gt;And finally, I decided to drop Leopard support.&amp;nbsp; The applications might run, but I will not test compatibility any more (I finally upgraded my Leopard machine to Snow Leopard).&lt;br /&gt;&lt;br /&gt;If you really need to use them on a Leopard machine, let me know.&lt;br /&gt;&lt;br /&gt;Also, I've added a few minor features to CasualConc.&amp;nbsp; It now has a function to test your regular expressions.&amp;nbsp; Go to Main Menu -&amp;gt; Window -&amp;gt; Regex Test Panel.&amp;nbsp; This is an experimental feature, so let me know what you think.&lt;br /&gt;&lt;br /&gt;Another minor feature is support for multi-line regular expression search.&amp;nbsp; You can enable this in the Preferences window.&lt;br /&gt;&lt;br /&gt;I've also started to rewrite CasualTranscriber.&amp;nbsp; I labeled it as alpha, but most of the basic features of the current version have been added.&amp;nbsp; It only runs on 64-bit Macs with Snow Leopard or Lion.&amp;nbsp; If you have a chance to try it, let me know.&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7437819978496886952?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7437819978496886952/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7437819978496886952' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7437819978496886952'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7437819978496886952'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2011/09/lion-compatibility-and.html' title='Lion compatibility and ...'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2717045112157968429</id><published>2011-07-28T00:26:00.000+09:00</published><updated>2011-07-28T00:26:11.471+09:00</updated><title type='text'>CasualConc on Lion update</title><content type='html'>I figured out how to include sqlite3-ruby in CasualConc, so I made changes to the latest beta build and labeled it as CasualConc beta for Lion.&amp;nbsp; It is available on the CasualConc download page.&amp;nbsp; Please try it and let me know if it doesn't work (or it does).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2717045112157968429?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2717045112157968429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2717045112157968429' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2717045112157968429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2717045112157968429'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2011/07/casualconc-on-lion-update.html' title='CasualConc on Lion update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1357394678723603201</id><published>2011-07-25T00:13:00.003+09:00</published><updated>2011-07-28T00:31:15.484+09:00</updated><title type='text'>CasualConc on Lion</title><content type='html'>I haven't been able to check the compatibility personally, but I got a  couple of reports that CasualConc 1.0.x does not run on Lion. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;UPDATE&lt;/b&gt;: As you can see above (at least for now), I figured out how to include sqlite3-ruby, so you don't have to follow this process unless the beta for Lion doesn't run.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;To use CasualConc beta on Lion, you need to install 'sqlite3-ruby' and run a beta build 2011/07/21 or later.&lt;br /&gt;&lt;/b&gt;1. open Mac &lt;b&gt;App Store.app&lt;/b&gt; and download/install Xcode (free) [you need Mac App Store account]&lt;br /&gt;2. open &lt;b&gt;Terminal.app&lt;/b&gt;&lt;br /&gt;3. 
type &lt;b&gt;sudo gem install sqlite3-ruby --version "= 1.2.5"&lt;/b&gt; and hit the enter key (and enter the password of your account on Mac)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I will check this as soon as I can access Lion (hopefully within a  couple of weeks), but I heard the beta ran after installing  sqlite3-ruby.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I will also check other apps and see if I can make them run on Lion (if they don't).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1357394678723603201?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1357394678723603201/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1357394678723603201' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1357394678723603201'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1357394678723603201'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2011/07/casualconc-on-lion.html' title='CasualConc on Lion'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-720622331320764430</id><published>2011-03-10T13:41:00.000+09:00</published><updated>2011-03-10T13:41:30.749+09:00</updated><title type='text'>Cumulative bug fixes and feature additions to CasualConc beta</title><content type='html'>I haven't posted anything here for a 
while because I've been busy developing a new version of CasualConc and preparing for the classes I start teaching next month.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;In the last few months, I added a few new features and fixed bugs (mostly the ones I introduced when I added these new features).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug fixes&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;General&lt;/b&gt;&lt;br /&gt;- searching multiple words/phrases in Word (wildcard) mode using a slash (/) matched words/phrases that included searched words/phrases&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Concord&lt;/b&gt;&lt;br /&gt;- keyword was not correctly colored in the context view when searched in the database mode with File as Scope of Context&lt;br /&gt;- keyword was not colored correctly in the context view when used in the database mode (introduced with the above change)&lt;br /&gt;- saved results did not display correctly&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Feature additions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Concord&lt;/b&gt;&lt;br /&gt;- added position in a file as a sort option when File is selected as Scope of Context&lt;br /&gt;&lt;br /&gt;&lt;b&gt;File info&lt;/b&gt;&lt;br /&gt;- added standardized TTR (TTR in every 1000 words)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Concordance Plot&lt;/b&gt;&lt;br /&gt;- you can reflect the changes you made to Concord results&lt;br /&gt;- you can export selected/all plots as a single PDF file&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If you find any other bugs or want to see some other features, please let me know.&amp;nbsp; I'll try to fix bugs as soon as possible and add as many requested features as I can (unless they are technically too difficult).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-720622331320764430?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://casualconc.blogspot.com/feeds/720622331320764430/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=720622331320764430' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/720622331320764430'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/720622331320764430'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2011/03/cumulative-bug-fixes-and-feature.html' title='Cumulative bug fixes and feature additions to CasualConc beta'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2090335565665050405</id><published>2010-11-04T17:03:00.000+09:00</published><updated>2010-11-04T17:03:51.248+09:00</updated><title type='text'>Bug fixes to CasualConc Beta 1.8</title><content type='html'>I found a few bugs related to Corpus/Database file handling in the Advanced mode.&amp;nbsp; Also, a serious bug was reported, so I fixed them and uploaded the latest beta to the site.&amp;nbsp; If you have downloaded beta 1.8, please visit the site and download the latest beta.&amp;nbsp; The latest beta is also version 1.8, but if you run it and check About CasualConc, it should say Version 1.8 (20101104).&lt;br /&gt;&lt;br /&gt;If you still cannot run this version or find another bug, please report it to me.&amp;nbsp; You can email me, post a comment on the blog, or send a message on Twitter.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/626625090667668862-2090335565665050405?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2090335565665050405/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2090335565665050405' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2090335565665050405'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2090335565665050405'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/11/bug-fixes-to-casualconc-beta-18.html' title='Bug fixes to CasualConc Beta 1.8'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5232638882999523913</id><published>2010-11-01T12:59:00.002+09:00</published><updated>2010-11-01T15:00:04.781+09:00</updated><title type='text'>A minor update to CasualConc Beta</title><content type='html'>I added a few features (some new and some enhanced) to CasualConc Beta.&amp;nbsp; I also enabled an experimental gapped n-gram list feature.&lt;br /&gt;&lt;br /&gt;The documentation has not been updated yet, so you might need to figure out how to use some of the features.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;General&lt;/b&gt;&lt;br /&gt;- Spelling variation feature.&amp;nbsp; You can register spelling variations (e.g. 
analyze-analyse) and use them in Concord/Cluster/Collocation searches as well as in Word List.&lt;br /&gt;- You can assign a different corpus/database file to the left and right tables in Cluster/Word Count in Advanced Corpus Handling Mode.&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Concord&lt;/b&gt;&lt;br /&gt;- Sorting now has a 4th element.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Cluster&lt;/b&gt;&lt;br /&gt;- longer cluster search (up to 8 words)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Word List&lt;/b&gt;&lt;br /&gt;- Gapped n-gram (3-5 gram)&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Concordance Plot&lt;/b&gt;&lt;br /&gt;- You can export selected concordance plots as JPEG files (individually) or print them (= export as PDF).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The spelling variation feature is meant to accommodate word search/word list creation in languages that have spelling variations (e.g. American/British English).&amp;nbsp; Once you create a spelling variation list, you can use the information in it when searching for a word or creating a word list.&amp;nbsp; There is one problem with this feature.&amp;nbsp; Unless your corpus and spelling variation list are POS-tagged, CasualConc cannot distinguish identically spelled words of different word classes (e.g. 
analyses [v] vs analyses [n]).&amp;nbsp; A sample list of spelling variations is included in the disk image.&amp;nbsp; You can import it to CasualConc to see how this feature works.&amp;nbsp; You might want to create your own list (there might be some errors in my list and it is far from complete).&lt;br /&gt;&lt;br /&gt;For corpus handling, I added a feature to assign a different corpus/database file to the left/right tables in Cluster and Word Count.&amp;nbsp; This is available in Advanced Corpus Handling Mode.&amp;nbsp; If you have more than one corpus or database file registered on the table and check more than one corpus or database file, you can select one corpus/database file or All for each of the two tables in Cluster and Word Count.&amp;nbsp; You need to do this in File view.&lt;br /&gt;&lt;br /&gt;In Concord, you can select a 4th sorting position.&amp;nbsp; I don't know how useful this feature is, but I wanted this from time to time, so I added it.&amp;nbsp; Also related to Concord is Concordance Plot export.&amp;nbsp; Now you can export Concordance Plots.&amp;nbsp; There are two types.&amp;nbsp; One is to export selected plots individually as JPEG files.&amp;nbsp; You can select the ones you want to export on the plots.&amp;nbsp; The other is to print the selected plots.&amp;nbsp; Thanks to OS X's "Save as PDF" feature, you can save the plots as a single PDF file.&amp;nbsp; Now that you can export plots, you might want to change the size of plots, I guess.&amp;nbsp; So I added a feature to change the height and width of plot boxes.&amp;nbsp; You can set them in Preferences -&amp;gt; Others.&lt;br /&gt;&lt;br /&gt;In Cluster, you can create a Cluster list of up to 8 words (= 7 + search word).&lt;br /&gt;&lt;br /&gt;In Word List, you can search gapped n-grams (2-5 grams).&amp;nbsp; This feature was already introduced as an experimental feature.&amp;nbsp; I enhanced it a bit and enabled it.&amp;nbsp; I'm thinking about enhancing this feature, but I decided to 
release it before that hoping to get some feedback (to decide how to enhance this feature).&lt;br /&gt;&lt;br /&gt;There might be bugs related to these new features.&amp;nbsp; Any feedback and/or bug report is welcome.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5232638882999523913?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5232638882999523913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5232638882999523913' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5232638882999523913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5232638882999523913'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/11/minor-update-to-casualconc-beta.html' title='A minor update to CasualConc Beta'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1028905535046760566</id><published>2010-10-01T02:11:00.000+09:00</published><updated>2010-10-01T02:11:26.058+09:00</updated><title type='text'>CasualConc bug fix</title><content type='html'>I found a bug on CasualConc, so I fixed it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Concord&lt;/b&gt;&lt;br /&gt;- The number of files count was incorrect when used with the context word search option ON&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Also on CasualConc beta, a little more 
serious bug was found in the database mode.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;General&lt;/b&gt;&lt;br /&gt;- Results of Cluster, Collocation, and Word Count were incorrect when the context word search option was ON and a context word(s) was specified in Concord.&amp;nbsp; This happened ONLY in the database mode.&amp;nbsp; There was a bug in creating an SQL query.&lt;br /&gt;&lt;br /&gt;If you find any other bugs, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1028905535046760566?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1028905535046760566/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1028905535046760566' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1028905535046760566'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1028905535046760566'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/10/casualconc-bug-fix.html' title='CasualConc bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1296698212345727928</id><published>2010-09-28T10:03:00.000+09:00</published><updated>2010-09-28T10:03:27.908+09:00</updated><title type='text'>CasualConc beta another minor update</title><content type='html'>I found a bug in the process to add files to an existing database 
file, so I fixed it and made some minor changes to corpus file handling in advanced mode.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;File&lt;/b&gt; (in Advanced Corpus Handling mode)&lt;br /&gt;- you can now add files to an existing database file (this feature was broken)&lt;br /&gt;- you can delete multiple files from a corpus/database file&lt;br /&gt;- you can find duplicate files in a corpus/database file -&amp;gt; identify files with the same path&lt;br /&gt;- you have an option to move a database file to Trash when you delete one from the table&lt;br /&gt;&lt;br /&gt;The bug I found was in the process of adding files to an existing database file.&amp;nbsp; This worked a long time ago and I somehow broke it when I made changes in the process.&amp;nbsp; But since I haven't got any bug reports, I guess no one really uses this function.&lt;br /&gt;&lt;br /&gt;And when I fixed this, I also made some changes in corpus handling in the advanced corpus handling mode.&amp;nbsp; Now you can select multiple files on the corpus/database content table (lower left).&amp;nbsp; This means you can delete multiple files in one go.&amp;nbsp; Related to this, I added a feature to identify duplicate files in the selected corpus/database file.&amp;nbsp; If CasualConc finds duplicate files, the ones added first (older ones) are selected.&amp;nbsp; Once files are selected, you can delete them at once.&lt;br /&gt;&lt;br /&gt;Also you can now move a database file to Trash when you remove it from the database list table. 
&lt;br /&gt;&lt;br /&gt;Finally, I added alert messages related to the corpus handling.&amp;nbsp; When a corpus/database file was not selected or files are not added to a table to process, CasualConc simply ignored button clicking.&amp;nbsp; Now, it gives you a message explaining why CasualConc doesn't process your request in the file view (in most of the cases).&lt;br /&gt;&lt;br /&gt;As always, if you find any bug, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1296698212345727928?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1296698212345727928/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1296698212345727928' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1296698212345727928'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1296698212345727928'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/09/casualconc-beta-another-minor-update.html' title='CasualConc beta another minor update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6982290384105231942</id><published>2010-09-08T23:26:00.000+09:00</published><updated>2010-09-08T23:26:14.686+09:00</updated><title type='text'>CasualConc beta minor update</title><content type='html'>I haven't updated this blog for a 
long time.&amp;nbsp; Since I last posted here, I made a few changes to CasualConc beta.&lt;br /&gt;&lt;br /&gt;- Exporting Concord results as RTF&lt;br /&gt;&lt;br /&gt;You can now export Concord results as an RTF document.&amp;nbsp; With this format, you can choose to keep the coloring of sort words as well as the font style of context words.&lt;br /&gt;&lt;br /&gt;- Searching Cluster/Collocation from other tools&lt;br /&gt;&lt;br /&gt;Though it was not documented on the site, you could run Concord search from the Cluster, Collocation, and Word Count tables.&amp;nbsp; Now you can search Cluster and Collocation from the Word Count table.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If you are using CasualConc beta or want to try the beta, please go to the CasualConc site and download the latest beta.&lt;br /&gt;&lt;br /&gt;I'd appreciate any feedback on the beta.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6982290384105231942?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6982290384105231942/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6982290384105231942' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6982290384105231942'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6982290384105231942'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/09/casualconc-beta-minor-update.html' title='CasualConc beta minor update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' 
height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8166419791601672505</id><published>2010-06-16T10:23:00.001+09:00</published><updated>2010-06-16T10:23:41.211+09:00</updated><title type='text'>CasualTextractor documentation and a few minor bug fixes</title><content type='html'>I finally updated the documentation of CasualTextractor.&amp;nbsp; I added quite a lot of features to the PDF mode a while ago, but hadn't had time to document them.&amp;nbsp; The new page is now linked from the CasualTextractor page.&lt;br /&gt;&lt;br /&gt;I also fixed a few minor bugs and made a few minor changes while I was documenting the new features.&amp;nbsp; The current version is 0.7.1.&amp;nbsp; If you use CasualTextractor and want to know how to use it in the PDF mode, please check the site.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8166419791601672505?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8166419791601672505/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8166419791601672505' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8166419791601672505'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8166419791601672505'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/06/casualtextractor-documentation-and-few.html' title='CasualTextractor documentation and a few minor bug 
fixes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8488064708893587262</id><published>2010-04-30T13:25:00.001+09:00</published><updated>2010-05-06T18:59:23.283+09:00</updated><title type='text'>A CasualConc bug fix</title><content type='html'>I got a bug report and fixed it (thank you, Adriano).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug fix&lt;/b&gt;&lt;br /&gt;- crashed when searching with wildcard in the Database mode.&lt;br /&gt;&lt;b&gt; &lt;/b&gt;&lt;br /&gt;The report was about Concord, but I think it also happened in Cluster and Collocation/Cooccurrences.&lt;br /&gt;&lt;br /&gt;This is not a problem on the beta version.&amp;nbsp; I fixed this on the beta when I found this bug, but forgot to apply it to the working version.&lt;br /&gt;&lt;br /&gt;You can download CasualConc from the &lt;a href="http://sites.google.com/site/casualconc/download"&gt;CasualConc site&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;If you find any other bugs on the working version (1.0.2), please send a bug report to me.&amp;nbsp; Currently, I'm only using the beta version, so I will only fix bugs which are in common with both the working version and the beta version unless I receive a report.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8488064708893587262?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8488064708893587262/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8488064708893587262' 
title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8488064708893587262'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8488064708893587262'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/casualconc-bug-fix.html' title='A CasualConc bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1526564874579817035</id><published>2010-04-21T21:43:00.002+09:00</published><updated>2010-07-03T23:49:49.495+09:00</updated><title type='text'>The current status of CasualConc beta</title><content type='html'>Since it's hard to follow what I've written in the series of posts on the current status of CasualConc beta, I will put the link on this post.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta.html"&gt;The current status of CasualConc Beta&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_09.html"&gt;The current status of CasualConc beta - General/Global&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_3281.html"&gt;The current status of CasualConc beta - General/Global part 2&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_8653.html"&gt;The current status of CasualConc beta - Concord&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_10.html"&gt;The current status of CasualConc beta - 
Cluster/Collocation/Cooccurrence&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta-word.html"&gt;The current status of CasualConc beta - Word Count&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_4748.html"&gt;The current status of CasualConc beta - Corpus File Information&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_7874.html"&gt;The current status of CasualConc beta - Interface&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_4372.html"&gt;The current status of CasualConc beta - experimental features&lt;/a&gt;&lt;br /&gt;&lt;a href="http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_5418.html"&gt;The current status of CasualConc beta - experimental features 2&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I might update these pages or add a new post if I find other features I've added (but couldn't remember when I wrote these).&lt;br /&gt;&lt;br /&gt;You can download the beta and the current working version from the CasualConc site (English and Japanese).&amp;nbsp; Please follow the link on the right.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1526564874579817035?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1526564874579817035/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1526564874579817035' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1526564874579817035'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/626625090667668862/posts/default/1526564874579817035'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_21.html' title='The current status of CasualConc beta'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2709693704607912676</id><published>2010-04-10T11:36:00.000+09:00</published><updated>2010-04-10T11:36:32.954+09:00</updated><title type='text'>The current status of CasualConc beta - experimental features 2</title><content type='html'>This post is the last in the series 'The current status of CasualConc beta'.&amp;nbsp; The last feature is also an experimental one.&amp;nbsp; It's gap n-gram list creation (for lack of a better term).&amp;nbsp;&lt;br /&gt;&lt;br /&gt;What this does is simple: you can create an n-gram list in which one of the words in each n-gram (3-5) is treated as a gap, or wildcard, or whatever you call it.&amp;nbsp; In the experimental beta version, when you select 3-gram, 4-gram, or 5-gram in Word Count, a check box appears.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_d6a8wFbI/AAAAAAAAAPE/yFPza1hkMtY/s1600/wc_gap_checkbox.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_d6a8wFbI/AAAAAAAAAPE/yFPza1hkMtY/s320/wc_gap_checkbox.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Check this box and click Count.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: 
center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S7_eGStHA5I/AAAAAAAAAPU/uZir6LeB-nc/s1600/wc_gap_warning.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S7_eGStHA5I/AAAAAAAAAPU/uZir6LeB-nc/s320/wc_gap_warning.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Because this process can take a long time and need a lot of memory if your corpus is big, a warning message appears.&amp;nbsp; When I tried this with a corpus size of 500,000, the process took almost 10 minutes.&lt;br /&gt;&lt;br /&gt;If you are brave enough, this is what you get.&amp;nbsp; The corpus I used is the Inaugural Addresses of the Presidents of the United States corpus, prepared by Prof. Tabata at Osaka University for a workshop I attended.&amp;nbsp; As you can see, the gap is represented by * and the words that appear in that slot are listed in the Gap words column with frequency information.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S7_eoNsgN_I/AAAAAAAAAPc/a7BKhcCAVkM/s1600/wc_gap_result4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="172" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S7_eoNsgN_I/AAAAAAAAAPc/a7BKhcCAVkM/s400/wc_gap_result4.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can select one of the lines and see the entire list of gap words in a table.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_fau8_HbI/AAAAAAAAAPk/P6HYFdwh0ro/s1600/wc_gap_result_selected1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="47" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_fau8_HbI/AAAAAAAAAPk/P6HYFdwh0ro/s400/wc_gap_result_selected1.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Select a line and right click the 
table.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7_fjlv1StI/AAAAAAAAAPs/aYQMj4r9ObU/s1600/wc_gap_openlist.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7_fjlv1StI/AAAAAAAAAPs/aYQMj4r9ObU/s320/wc_gap_openlist.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;A panel with a table appears with the list.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_fqn_ntpI/AAAAAAAAAP0/GJyu-Kn1Gc8/s1600/wc_gap_wordlist1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_fqn_ntpI/AAAAAAAAAP0/GJyu-Kn1Gc8/s320/wc_gap_wordlist1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can copy the list and paste it on other applications.&amp;nbsp; In this case all the gap words appeared on the Word Count table, but this is basically designed to see all the words when they are not displayed.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7_gGzZoHZI/AAAAAAAAAP8/x_6tUbDE8Ks/s1600/wc_gap_result_selected2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="32" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7_gGzZoHZI/AAAAAAAAAP8/x_6tUbDE8Ks/s400/wc_gap_result_selected2.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can see all of them on the table.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_gMktrPDI/AAAAAAAAAQE/lHhtPgM3n4c/s1600/wc_gap_wordlist2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_gMktrPDI/AAAAAAAAAQE/lHhtPgM3n4c/s320/wc_gap_wordlist2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;OK, that's it.&amp;nbsp; I think I covered almost all the new and enhanced features for the next version (current beta) of CasualConc.&amp;nbsp; The current beta has all these new features except for the last two experimental features.&amp;nbsp; You can download the beta version from &lt;a href="http://sites.google.com/site/casualconc/download/download-casualconc-beta"&gt;Download CasualConc Beta page&lt;/a&gt; (&lt;a href="http://sites.google.com/site/casualconcj/download/casualconc-daunrodo--beta"&gt;Japanese page&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;If you are interested in the experimental beta build, please contact me directly at casualconc (at) gmail.com.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Since this is a beta version, CasualConc can be unstable and might have bugs related to the changes I have made.&amp;nbsp; If you ever try this beta version, I'd appreciate your feedback/bug reports.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2709693704607912676?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2709693704607912676/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2709693704607912676' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2709693704607912676'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2709693704607912676'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_5418.html' title='The current 
status of CasualConc beta - experimental features 2'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_aDr7cQQzDwI/S7_d6a8wFbI/AAAAAAAAAPE/yFPza1hkMtY/s72-c/wc_gap_checkbox.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6857977396732176927</id><published>2010-04-10T04:35:00.001+09:00</published><updated>2010-04-10T10:41:35.600+09:00</updated><title type='text'>The current status of CasualConc beta - experimental features</title><content type='html'>In this post (and possibly the next) I will present two new experimental features of CasualConc.&amp;nbsp; These two features are still too experimental to be included in a beta on the site.&amp;nbsp; So if you get interested after reading this post, please contact me directly.&amp;nbsp; The email address is on the CasualConc main site.&lt;br /&gt;&lt;br /&gt;The first one is visualization of collocations and the second is n-gram search with a gap.&amp;nbsp; I will explain what these mean.&lt;br /&gt;&lt;br /&gt;The visualization of collocations is an idea proposed by Prof. 
Tabata at Osaka University, Japan.&amp;nbsp; This is simply a realization of his idea (not mine).&amp;nbsp; Now, let me show you the current implementation.&lt;br /&gt;&lt;br /&gt;To use this function, you need a word list and collocation results from the same corpus.&amp;nbsp; So run Word Count and Collocation first.&amp;nbsp; The Visual button (the name is tentative) is on the Collocation tool.&amp;nbsp; Click it to display the Visualizer panel.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79pBnSfKLI/AAAAAAAAAMs/4-Z01pgIFLg/s1600/colloc_visualizerpanel.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="331" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79pBnSfKLI/AAAAAAAAAMs/4-Z01pgIFLg/s400/colloc_visualizerpanel.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The settings of the Visualizer are as follows:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79pM8H1GVI/AAAAAAAAAM0/p1MEUhIaZzE/s1600/colloc_visualizersetting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79pM8H1GVI/AAAAAAAAAM0/p1MEUhIaZzE/s320/colloc_visualizersetting.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The top one selects which statistic to use for visualization.&amp;nbsp; The choices are shown below.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79pig-KYSI/AAAAAAAAAM8/f6JFVwJQKIk/s1600/colloc_visualizer_statschoice.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79pig-KYSI/AAAAAAAAAM8/f6JFVwJQKIk/s320/colloc_visualizer_statschoice.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Then select the span/position of 
collocated words.&amp;nbsp; The upper choice is a position from L5 to R5.&amp;nbsp; If you select R1, only the frequency information for the R1 position is used to calculate the selected statistic.&amp;nbsp; If you enable Span, the calculation will be based on the total frequency up to the selected position.&amp;nbsp; For example, if you select R3 and check Span, the information in the R1, R2, and R3 positions will be used to calculate the statistic.&amp;nbsp; The lower choice is a span to the left and right of the keyword.&amp;nbsp; You can select from L1~R1 to L5~R5.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79pyzFCWgI/AAAAAAAAANE/054ITiQkbqQ/s1600/colloc_visualizer_spanchoice.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79pyzFCWgI/AAAAAAAAANE/054ITiQkbqQ/s320/colloc_visualizer_spanchoice.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Take xxx words means the first xxx words on the Collocation table will be used for visualization.&amp;nbsp; So if you sort the results on the Collocation table, the words taken from the list will change accordingly.&lt;br /&gt;&lt;br /&gt;Now let's see what this does.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79rcUSlQTI/AAAAAAAAANU/1MXyTvXlv2U/s1600/visualization_freqsettings.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79rcUSlQTI/AAAAAAAAANU/1MXyTvXlv2U/s320/visualization_freqsettings.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With this setting, the result will look like this.&amp;nbsp; The larger the number, the bigger the font size.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79rS0BKPQI/AAAAAAAAANM/t2wC4tQGS-U/s1600/visualization_freq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79rS0BKPQI/AAAAAAAAANM/t2wC4tQGS-U/s320/visualization_freq.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This simply reflects the frequency of words in the L1 position.&amp;nbsp; Our (80) is the most frequent and American (40) follows.&amp;nbsp; But the picture is quite different with MI (mutual information).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79sAvk9iNI/AAAAAAAAANc/zRLeF7FVep4/s1600/visualization_mi.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79sAvk9iNI/AAAAAAAAANc/zRLeF7FVep4/s320/visualization_mi.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can also combine frequency information with the result of another statistic.&amp;nbsp; If you enable Include Freq Info, the frequency information will be added as a gray scale.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79sRT9SWMI/AAAAAAAAANk/iJMjSVa8iuA/s1600/visualization_includefreqinfo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79sRT9SWMI/AAAAAAAAANk/iJMjSVa8iuA/s320/visualization_includefreqinfo.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7932M_hrDI/AAAAAAAAANs/29LFo7yORDo/s1600/visualization_mifreq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7932M_hrDI/AAAAAAAAANs/29LFo7yORDo/s320/visualization_mifreq.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you click the Stats button, you can see the actual numbers in a table.&amp;nbsp; You can sort the table alphabetically by word or by stats value.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S794ZvclF0I/AAAAAAAAAN0/rnSJh0CW3DY/s1600/visualzation_statstable.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="338" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S794ZvclF0I/AAAAAAAAAN0/rnSJh0CW3DY/s400/visualzation_statstable.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you check Ignore zero occurrence, words with zero frequency will be removed from the display.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7943jrHcGI/AAAAAAAAAN8/e0QJwnrzGF0/s1600/visualization_excludezero.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7943jrHcGI/AAAAAAAAAN8/e0QJwnrzGF0/s320/visualization_excludezero.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S795FAuXgSI/AAAAAAAAAOE/f36MRD8Upyc/s1600/visualization_zeroexcludecmi.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="180" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S795FAuXgSI/AAAAAAAAAOE/f36MRD8Upyc/s320/visualization_zeroexcludecmi.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;If you choose Log-Likelihood, the higher values can become extreme, so you can convert the LL values with log(10).&amp;nbsp; To enable this, click Convert LL val to log.&lt;br /&gt;&lt;br 
/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S796C17SOEI/AAAAAAAAAOM/Z9imfVT66k0/s1600/visualization_convertll.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S796C17SOEI/AAAAAAAAAOM/Z9imfVT66k0/s320/visualization_convertll.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With the original LL values, the image will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79667H0G_I/AAAAAAAAAOU/By6HINkOrEE/s1600/visualization_ll_orig.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79667H0G_I/AAAAAAAAAOU/By6HINkOrEE/s320/visualization_ll_orig.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With the conversion, it will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S797CSam3RI/AAAAAAAAAOc/fRILv-ZgyXs/s1600/visualization_ll_convert.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S797CSam3RI/AAAAAAAAAOc/fRILv-ZgyXs/s320/visualization_ll_convert.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The choice is up to you.&lt;br /&gt;&lt;br /&gt;The most experimental part is visualization with 4 statistics.&amp;nbsp; By clicking Use Multiple info, you can incorporate three additional statistics values.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S797-2d39SI/AAAAAAAAAOk/eMLsRiHYOj4/s1600/visualization_multinfo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S797-2d39SI/AAAAAAAAAOk/eMLsRiHYOj4/s320/visualization_multinfo.png" /&gt;&lt;/a&gt;&lt;/div&gt;The current implementation is highly experimental and not tuned to display the most effective color scheme, but the basic idea is that the value of each statistic is assigned to one of the RGB channels.&amp;nbsp; The higher the value, the lower the color value.&amp;nbsp; So if the value for a certain word is high in all three, the font color should be closer to black.&amp;nbsp; If, in the above example, the Log-Likelihood value is very low and the z-score and Log-log values are very high, the font color should be close to red.&amp;nbsp; These are the additive primary colors, so 100% on all of them means white and 0% on all of them means black.&lt;br /&gt;&lt;br /&gt;When the above three statistics are applied with MI as the primary statistic, the image will look something like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S7992I76s-I/AAAAAAAAAOs/63AwlI57WYE/s1600/visualizer_threecolors.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S7992I76s-I/AAAAAAAAAOs/63AwlI57WYE/s320/visualizer_threecolors.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Bluish or greenish font colors mean that the relative values of z-score and Log-log are low compared to the relative value of Log-Likelihood.&amp;nbsp; But because the actual values of each statistic can vary a lot, the displayed color scheme may not reflect the true relationships among the statistics.&amp;nbsp; I need to figure out the best way to visualize the relationships among the statistic values.&amp;nbsp; If you have any suggestions, I'd very much appreciate them.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Finally, the statistic values of all four indicators can be checked on the stats value table.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: 
center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7-A3_oeWmI/AAAAAAAAAO0/4nzm_n9hsCI/s1600/visualzation_statstablewith3_alphabetorder.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="338" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S7-A3_oeWmI/AAAAAAAAAO0/4nzm_n9hsCI/s400/visualzation_statstablewith3_alphabetorder.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;You can sort the items by clicking the column headers.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S7-BFibkKBI/AAAAAAAAAO8/shRuSvYtnFI/s1600/visualzation_statstablewith3_orderd.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="338" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S7-BFibkKBI/AAAAAAAAAO8/shRuSvYtnFI/s400/visualzation_statstablewith3_orderd.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;That's about it.&amp;nbsp; As I mentioned at the beginning, this feature is not available in the current beta.&amp;nbsp; If you'd like to try this, please contact me directly.&amp;nbsp; My email address is on the CasualConc sites (the links are in the right-side column of this blog).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6857977396732176927?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6857977396732176927/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6857977396732176927' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6857977396732176927'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/626625090667668862/posts/default/6857977396732176927'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_4372.html' title='The current status of CasualConc beta - experimental features'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_aDr7cQQzDwI/S79pBnSfKLI/AAAAAAAAAMs/4-Z01pgIFLg/s72-c/colloc_visualizerpanel.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8970698233880350187</id><published>2010-04-10T02:35:00.000+09:00</published><updated>2010-04-10T02:35:27.013+09:00</updated><title type='text'>The current status of CasualConc beta - Interface</title><content type='html'>The most salient difference in the new version for Japanese users is the interface.&amp;nbsp; If you use CasualConc in a Japanese-language environment, interface items and messages will be displayed in Japanese.&amp;nbsp; Here is an example from Concord.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79jLtcLDII/AAAAAAAAAMM/RSMe1mspz3U/s1600/concord_japanese.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="86" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79jLtcLDII/AAAAAAAAAMM/RSMe1mspz3U/s400/concord_japanese.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Messages are also in Japanese (unless I forgot to change them).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79jZDg0CVI/AAAAAAAAAMU/XLUR8ObXlZY/s1600/message_japanese1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79jZDg0CVI/AAAAAAAAAMU/XLUR8ObXlZY/s320/message_japanese1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79jb0NfTvI/AAAAAAAAAMc/enoG5ndNuwk/s1600/message_japanese2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79jb0NfTvI/AAAAAAAAAMc/enoG5ndNuwk/s320/message_japanese2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Of course, Preferences are also in Japanese.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79jiMxj3zI/AAAAAAAAAMk/b8azddAg10k/s1600/pref_japanese.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="301" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79jiMxj3zI/AAAAAAAAAMk/b8azddAg10k/s400/pref_japanese.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;When this new version is out of beta, I will include help files in Japanese (the current beta does not have help files included).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;OK, this is the end of the new feature showcase.&amp;nbsp; These should be available in the most up-to-date beta.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;For the next couple of posts, I will show you some experimental features.&amp;nbsp; Those are not enabled in the beta on the site, but if you are interested in testing the features, please contact me directly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8970698233880350187?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8970698233880350187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8970698233880350187' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8970698233880350187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8970698233880350187'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_7874.html' title='The current status of CasualConc beta - Interface'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_aDr7cQQzDwI/S79jLtcLDII/AAAAAAAAAMM/RSMe1mspz3U/s72-c/concord_japanese.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3778190984587612039</id><published>2010-04-10T02:18:00.001+09:00</published><updated>2010-04-12T01:00:46.422+09:00</updated><title type='text'>The current status of CasualConc beta - Corpus File Information</title><content type='html'>Perhaps Corpus File Information gets the most enhancement.&lt;br /&gt;&lt;br /&gt;In the current working version, Corpus File Info only creates basic file information and the number of &lt;i&gt;n&lt;/i&gt; letter words, which is not very interesting.&amp;nbsp; In the new version, you can do a lot more with Corpus File Info.&lt;br /&gt;&lt;br /&gt;Corpus File Info now has three modes.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; 
text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79TnTLm40I/AAAAAAAAAJ0/6l15sDMQq80/s1600/fileinfo_modes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79TnTLm40I/AAAAAAAAAJ0/6l15sDMQq80/s320/fileinfo_modes.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Basic Info is what the current working version does: basic frequency stats with frequencies of n-letter words.&amp;nbsp; Word Freq Info creates a frequency matrix (frequencies of specified words in each file).&amp;nbsp; TF-IDF is a measure of the prominence or importance of words in a file.&amp;nbsp; For more information, read &lt;a href="http://en.wikipedia.org/wiki/Tf%E2%80%93idf"&gt;this Wikipedia entry&lt;/a&gt;. &amp;nbsp; To run a TF-IDF analysis, you need to run Word Count or import a word list that includes the number of files in the corpus in which each word appears.&lt;br /&gt;&lt;br /&gt;Let me start with Word Freq Info.&amp;nbsp; If you select Word Freq Info, the following items appear in the window.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79VXfOMReI/AAAAAAAAAJ8/3IjinRTX3ds/s1600/fileinfo_wordfreq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="22" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79VXfOMReI/AAAAAAAAAJ8/3IjinRTX3ds/s400/fileinfo_wordfreq.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;First, select how to create the list of words to count in the selected corpus.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79WNcoFuHI/AAAAAAAAAKE/iqMZL1JBf8E/s1600/fileinfo_wordfreq_import.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79WNcoFuHI/AAAAAAAAAKE/iqMZL1JBf8E/s320/fileinfo_wordfreq_import.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can select one of the three sources.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79WSLX9EfI/AAAAAAAAAKM/Pg20jYCXFDQ/s1600/fileinfo_wordfreq_importlist.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79WSLX9EfI/AAAAAAAAAKM/Pg20jYCXFDQ/s320/fileinfo_wordfreq_importlist.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Once you select the source, click the Import button.&amp;nbsp; You can limit the number of words you import in Preferences -&amp;gt; File Info.&amp;nbsp; If you uncheck this, CasualConc tries to import all the words available in the source. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79XBHzna8I/AAAAAAAAAKU/OWfse9rFn5E/s1600/fileinfo_wordfreq_importlimit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79XBHzna8I/AAAAAAAAAKU/OWfse9rFn5E/s320/fileinfo_wordfreq_importlimit.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select From Word Count, the words in the left table of Word Count are imported, from the first one up to the number specified in the Limit (or all of them if no limit is set).&amp;nbsp; The words are imported in whatever order they appear in Word Count.&amp;nbsp; So if you sort the list alphabetically, the imported list is in that order.&lt;br /&gt;&lt;br /&gt;If you select From File and click Import, you can import a word list from a file; you will be prompted to specify the format of the word list.&amp;nbsp; You can only import a plain text file with a certain format (CSV or tab-delimited).&amp;nbsp; You can specify how many columns from the left or rows from the beginning.&amp;nbsp; 
You can click the Check button to check what will be imported.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Yj4tZTGI/AAAAAAAAAKc/J62ExGXVjkc/s1600/fileinfo_wordfreq_importfile.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Yj4tZTGI/AAAAAAAAAKc/J62ExGXVjkc/s320/fileinfo_wordfreq_importfile.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select From Import Panel and click Import, the import panel appears.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79ZUmQPWnI/AAAAAAAAAKk/LjBUf7LMSWU/s1600/fileinfo_wordfreq_importtext.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79ZUmQPWnI/AAAAAAAAAKk/LjBUf7LMSWU/s320/fileinfo_wordfreq_importtext.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can directly enter or copy/paste a list (one item/word per line).&amp;nbsp; Once you finish creating a list, click the Read button.&lt;br /&gt;&lt;br /&gt;You can check what was imported by clicking the Check button on the main window.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79Z_LfJhDI/AAAAAAAAAKs/QL_S1iDNg4M/s1600/fileinfo_wordfreq_check.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79Z_LfJhDI/AAAAAAAAAKs/QL_S1iDNg4M/s320/fileinfo_wordfreq_check.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can sort words in alphabetical order by clicking the column header.&amp;nbsp; The last column is the original order of the imported words.&amp;nbsp; You can delete words from the list if you want.&lt;br /&gt;&lt;br /&gt;Once you are sure about the 
list, you can specify the range of words to count.&amp;nbsp; If you want to count only the first 20, enter 1 and 20 in the boxes. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79afpumHdI/AAAAAAAAAK0/blTGyOlBPl0/s1600/fileinfo_wordfreq_span.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79afpumHdI/AAAAAAAAAK0/blTGyOlBPl0/s320/fileinfo_wordfreq_span.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Then click Get File Info.&amp;nbsp; With the default settings, the result will look like this.&amp;nbsp; The headers are the words that were counted, and the numbers are the frequencies of each word in each file (and Total).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79a7WFQSMI/AAAAAAAAAK8/Wpvb_nvhn-s/s1600/fileInfo_wordfreq_default_result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="156" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79a7WFQSMI/AAAAAAAAAK8/Wpvb_nvhn-s/s400/fileInfo_wordfreq_default_result.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you check Normalize word freq in Preferences -&amp;gt; File Info, you can convert the frequency to percent or per xxx words.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79bwrHHm9I/AAAAAAAAALE/IHNY5m6aV_s/s1600/fileInfo_wordfreq_normalizeset.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="35" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79bwrHHm9I/AAAAAAAAALE/IHNY5m6aV_s/s400/fileInfo_wordfreq_normalizeset.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select %, the result will look like this.&lt;br /&gt;&lt;br /&gt;&lt;div 
class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79cJkhtusI/AAAAAAAAALM/zaNyOca4GDY/s1600/fileInfo_wordfreq_normalizeresult.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="132" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79cJkhtusI/AAAAAAAAALM/zaNyOca4GDY/s400/fileInfo_wordfreq_normalizeresult.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you enable Sort frequency list of each file by word frequency, you can sort the result for each file by the order of frequency index.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79c7UwxpqI/AAAAAAAAALU/UAjtpwYKdlY/s1600/fileInfo_wordfreq_sortset.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79c7UwxpqI/AAAAAAAAALU/UAjtpwYKdlY/s320/fileInfo_wordfreq_sortset.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you create a list with percent and sort the result by frequency, you will get a result like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79dLvFHxBI/AAAAAAAAALc/aJG5jQsWpkk/s1600/fileInfo_wordfreq_normalizeandsortbyfreq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="121" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79dLvFHxBI/AAAAAAAAALc/aJG5jQsWpkk/s400/fileInfo_wordfreq_normalizeandsortbyfreq.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The process of importing a word list is the same for TF-IDF.&amp;nbsp; But to run TF-IDF, you need to have a word list on Word Count, and the list should include the number of files in the corpus in which each word appears.&amp;nbsp; If you run Word Count, this information is on the table.&amp;nbsp; Here is the result of the default 
setting.&amp;nbsp; A value of .00 means the word appears in all the files in the corpus (its inverse document frequency is then zero).&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79eflttX1I/AAAAAAAAALk/VVoKYyf5K40/s1600/fileInfo_tfidf_defaultresult.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="153" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79eflttX1I/AAAAAAAAALk/VVoKYyf5K40/s400/fileInfo_tfidf_defaultresult.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can select how to sort the results in Preferences -&amp;gt; File Info.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79fUHw7DMI/AAAAAAAAALs/yNf_WUdlVKM/s1600/fileInfo_tfidf_sortoption.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S79fUHw7DMI/AAAAAAAAALs/yNf_WUdlVKM/s320/fileInfo_tfidf_sortoption.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select Sum of all files, the TF-IDF values of a word in each file are added up, and the words are sorted by that sum.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79fpqcG7FI/AAAAAAAAAL0/oXygoBoU40s/s1600/fileInfo_tfidf_allresult.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="112" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79fpqcG7FI/AAAAAAAAAL0/oXygoBoU40s/s400/fileInfo_tfidf_allresult.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select Each file, the sorting will be done for each file based on the TF-IDF values.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79f3kBTS2I/AAAAAAAAAL8/upR7lR5-Mgg/s1600/fileInfo_tfidf_eachresult.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="113" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79f3kBTS2I/AAAAAAAAAL8/upR7lR5-Mgg/s400/fileInfo_tfidf_eachresult.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Personally, I've never used this for my research, but it seems to be a well-known indicator in text mining.&lt;br /&gt;&lt;br /&gt;In any case, if you want to calculate TF-IDF values for all the words, but only display a limited number of words, uncheck Limit the number of words to import to and import all the words from Word List.&amp;nbsp; Then set Limit result table columns to a reasonable number.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79giBCczWI/AAAAAAAAAME/Ak5Ek8G8Eqk/s1600/fileInf_tfidf_limitationsettings.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="67" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79giBCczWI/AAAAAAAAAME/Ak5Ek8G8Eqk/s400/fileInf_tfidf_limitationsettings.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I once tried this with no limit on either setting, using a corpus of about one hundred thousand tokens that had several thousand unique words.&amp;nbsp; This means the table had several thousand columns.&amp;nbsp; When I tried to scroll the table, even scrolling only one row took several minutes.&amp;nbsp; So don't try it!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3778190984587612039?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3778190984587612039/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3778190984587612039' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3778190984587612039'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3778190984587612039'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_4748.html' title='The current status of CasualConc beta - Corpus File Information'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_aDr7cQQzDwI/S79TnTLm40I/AAAAAAAAAJ0/6l15sDMQq80/s72-c/fileinfo_modes.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8075498123464607880</id><published>2010-04-10T00:40:00.002+09:00</published><updated>2010-04-10T01:17:51.820+09:00</updated><title type='text'>The current status of CasualConc beta - Word Count</title><content type='html'>There is only one new feature on Word Count (although creating n-gram list should be a little faster).&amp;nbsp;&lt;br /&gt;&lt;br /&gt;You can now specify what to count in Regular Expression.&amp;nbsp; Go to Preferences -&amp;gt; Others and check Regular Expression Mode for Word Count.&amp;nbsp; You can count whatever you want as long as you can write a regular expression for it.&amp;nbsp;&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79THQsHw6I/AAAAAAAAAJs/B7C89_hz_WU/s1600/wc_regex_count.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="25" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79THQsHw6I/AAAAAAAAAJs/B7C89_hz_WU/s400/wc_regex_count.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Now, let me show how the tag modes work.&lt;br /&gt;&lt;br /&gt;If you run Word Count in Tag(s) mode with Separate Word and Tag in WC NOT checked, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79JQNty-uI/AAAAAAAAAIc/P1Ez1jgLpv8/s1600/wc_tagmode_together.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="126" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79JQNty-uI/AAAAAAAAAIc/P1Ez1jgLpv8/s400/wc_tagmode_together.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Separate Word and Tag in WC IS checked, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79JdSC5C5I/AAAAAAAAAIk/Xfxm19ulIQQ/s1600/wc_tagmode_separate.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="137" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79JdSC5C5I/AAAAAAAAAIk/Xfxm19ulIQQ/s400/wc_tagmode_separate.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Tag Only mode is selected, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Jn5jxCtI/AAAAAAAAAIs/w5l5RM8jtrA/s1600/wc_tagonly.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="122" 
src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Jn5jxCtI/AAAAAAAAAIs/w5l5RM8jtrA/s400/wc_tagonly.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;If you run N-gram in Tag(s) mode with Separate Word and Tag in WC NOT checked, the result will look like this (4-gram): &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79KHCMUwdI/AAAAAAAAAI0/_GwIN2xs1_0/s1600/ngram_tagmode_together.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="122" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79KHCMUwdI/AAAAAAAAAI0/_GwIN2xs1_0/s400/ngram_tagmode_together.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Separate Word and Tag in WC IS checked, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79KN8zBlZI/AAAAAAAAAI8/jcVRD6gj9dI/s1600/ngram_tagmode_separate.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="126" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79KN8zBlZI/AAAAAAAAAI8/jcVRD6gj9dI/s400/ngram_tagmode_separate.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Tag Only mode is selected, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79KVrkXR8I/AAAAAAAAAJE/gdy_Hi6ALE4/s1600/ngram_tagonly.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="115" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79KVrkXR8I/AAAAAAAAAJE/gdy_Hi6ALE4/s400/ngram_tagonly.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;That's it for Word Count.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/626625090667668862-8075498123464607880?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8075498123464607880/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8075498123464607880' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8075498123464607880'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8075498123464607880'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta-word.html' title='The current status of CasualConc beta - Word Count'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_aDr7cQQzDwI/S79THQsHw6I/AAAAAAAAAJs/B7C89_hz_WU/s72-c/wc_regex_count.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6448015671864327478</id><published>2010-04-10T00:24:00.003+09:00</published><updated>2010-04-10T01:09:13.767+09:00</updated><title type='text'>The current status of CasualConc beta - Cluster/Collocation/Cooccurrence</title><content type='html'>Only one minor change is made to Cluster.&lt;br /&gt;&lt;br /&gt;When you select Left Only in Span, the result will be aligned to the left.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S788Yq-2dXI/AAAAAAAAAGM/cYBEnB_7uNY/s1600/cluster_leftonly.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="135" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S788Yq-2dXI/AAAAAAAAAGM/cYBEnB_7uNY/s400/cluster_leftonly.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select Tag(s) mode in search word, the result of searching 'jj nn1' (adjective + singular noun) will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S789HxLFjaI/AAAAAAAAAGc/6pCBqT6EKzA/s1600/cluster_tagsearch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="140" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S789HxLFjaI/AAAAAAAAAGc/6pCBqT6EKzA/s400/cluster_tagsearch.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Suppress tags in context is on, the result will look like this.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S789JltECII/AAAAAAAAAGk/dMHbhypJkbU/s1600/cluster_tagsearch_suppress.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="135" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S789JltECII/AAAAAAAAAGk/dMHbhypJkbU/s400/cluster_tagsearch_suppress.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And Tag Only search will return results like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S789ie9071I/AAAAAAAAAGs/X7rz8fBqSMo/s1600/cluster_tagonly.png" 
imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="160" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S789ie9071I/AAAAAAAAAGs/X7rz8fBqSMo/s400/cluster_tagonly.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Now, Collocation.&amp;nbsp; In the current working version, the total frequency includes the frequency in the keyword position.&amp;nbsp; In the new version, the total counts only the words in the context.&amp;nbsp; So the keyword no longer comes at the top of the list.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78-ewKfGPI/AAAAAAAAAG0/Y4rGO4sVc_Y/s1600/collocation_lrtotal.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="122" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78-ewKfGPI/AAAAAAAAAG0/Y4rGO4sVc_Y/s400/collocation_lrtotal.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Another change is variable span.&amp;nbsp; You can now set the span of context words separately for the left and the right, up to 5 words each.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78-9ic6FcI/AAAAAAAAAG8/UC63cTWweEc/s1600/collocation_span.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78-9ic6FcI/AAAAAAAAAG8/UC63cTWweEc/s320/collocation_span.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78_GhsA2rI/AAAAAAAAAHE/fbf7rPhU-V8/s1600/collocation_span_result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" 
src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78_GhsA2rI/AAAAAAAAAHE/fbf7rPhU-V8/s400/collocation_span_result.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Collocation Stats calculations should reflect this to some extent.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now with Tag(s) selected, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79D-JioxrI/AAAAAAAAAHk/laGfIcdhwW0/s1600/colloc_tagsearch_withtag_ind.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="136" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79D-JioxrI/AAAAAAAAAHk/laGfIcdhwW0/s400/colloc_tagsearch_withtag_ind.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;If Treat Keywords As One Word is checked, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79DeyYHlTI/AAAAAAAAAHc/eMRYlGwjREU/s1600/colloc_tagsearch_withtag.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="116" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79DeyYHlTI/AAAAAAAAAHc/eMRYlGwjREU/s400/colloc_tagsearch_withtag.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Suppress tags in context is on, the results with or without Treat Keywords As One Word checked will look like this:&lt;br /&gt;&lt;br /&gt;NOT Checked&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79C5BdNl9I/AAAAAAAAAHM/Vt6nqhVSj58/s1600/colloc_tagserch_ind.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="181" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79C5BdNl9I/AAAAAAAAAHM/Vt6nqhVSj58/s400/colloc_tagserch_ind.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br 
/&gt;Checked&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79DGSzgVYI/AAAAAAAAAHU/IZ4OX4Le-_A/s1600/colloc_tasearch_one.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="157" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79DGSzgVYI/AAAAAAAAAHU/IZ4OX4Le-_A/s400/colloc_tasearch_one.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Finally, Tag Only search will return the result like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79EjMvt0EI/AAAAAAAAAHs/XHuXxuJ7HQw/s1600/colloc_tagonlysearch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79EjMvt0EI/AAAAAAAAAHs/XHuXxuJ7HQw/s320/colloc_tagonlysearch.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;The next is Cooccurrence.&lt;br /&gt;&lt;br /&gt;The new feature of Cooccurrence is sorting.&amp;nbsp; You can now sort words in each position based on collocation statistics.&amp;nbsp; To use this feature, you need to run Word Count first.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79Ff1gO3HI/AAAAAAAAAH0/8SkXcvZ1g3k/s1600/cooccur_sort.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79Ff1gO3HI/AAAAAAAAAH0/8SkXcvZ1g3k/s320/cooccur_sort.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In the normal frequency order, the result looks like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79FzFINTUI/AAAAAAAAAH8/-par_QeDjQk/s1600/cooccur_freqsort.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img 
border="0" height="115" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79FzFINTUI/AAAAAAAAAH8/-par_QeDjQk/s400/cooccur_freqsort.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;With MI (Mutual Information) sort, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79F70SK1dI/AAAAAAAAAIE/QWCWydxARFc/s1600/cooccur_misort.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="111" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79F70SK1dI/AAAAAAAAAIE/QWCWydxARFc/s400/cooccur_misort.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;With Tag(s) mode, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79GMyRlrFI/AAAAAAAAAIM/4Lkwu21GXrU/s1600/cooccur_tagsearch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="131" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79GMyRlrFI/AAAAAAAAAIM/4Lkwu21GXrU/s400/cooccur_tagsearch.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Now, you can export Cooccurrence results with frequency information.&amp;nbsp; Check Include frequency info in the Cooccurrence export in Preferences -&amp;gt; Others.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79QPD36hPI/AAAAAAAAAJM/sJvYdo8_saE/s1600/cooccur_exportfreq.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79QPD36hPI/AAAAAAAAAJM/sJvYdo8_saE/s320/cooccur_exportfreq.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The exported CSV file with frequency info will look like this in Excel.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79Q_xcbnbI/AAAAAAAAAJU/Um0CsGOX3xo/s1600/cooccur_export_result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="136" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S79Q_xcbnbI/AAAAAAAAAJU/Um0CsGOX3xo/s400/cooccur_export_result.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now with tag handling on Cooccurrence.&amp;nbsp; With Tag Only mode, the result will look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79GWLdBPnI/AAAAAAAAAIU/Y0dJe9v3g_g/s1600/cooccur_tagonly.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="178" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S79GWLdBPnI/AAAAAAAAAIU/Y0dJe9v3g_g/s400/cooccur_tagonly.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;These are the new features of Cluster/Collocation/Cooccurrence.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6448015671864327478?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6448015671864327478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6448015671864327478' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6448015671864327478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6448015671864327478'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_10.html' title='The current status of CasualConc beta - 
Cluster/Collocation/Cooccurrence'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_aDr7cQQzDwI/S788Yq-2dXI/AAAAAAAAAGM/cYBEnB_7uNY/s72-c/cluster_leftonly.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2113474722312224245</id><published>2010-04-09T23:33:00.002+09:00</published><updated>2010-04-18T11:52:54.421+09:00</updated><title type='text'>The current status of CasualConc beta - Concord</title><content type='html'>Now the Concord.&lt;br /&gt;&lt;br /&gt;It now has a few minor feature additions.&amp;nbsp; First is independent left/right span.&amp;nbsp; You can set the span of context texts on the right and left independently.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78jT2oqc-I/AAAAAAAAACk/B-my4gpNpIQ/s1600/concord_span.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78jT2oqc-I/AAAAAAAAACk/B-my4gpNpIQ/s320/concord_span.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So if you want to see more on the right of the keyword, you can set the span on the left to a smaller number of characters and set the right one to a larger number.&amp;nbsp; So the result would look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78jX34Zl2I/AAAAAAAAACs/XKKhKHT0t68/s1600/concord_span_example.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="137" 
src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78jX34Zl2I/AAAAAAAAACs/XKKhKHT0t68/s640/concord_span_example.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The second is a wider context word span.&amp;nbsp; In the current working version, you can only search context words up to 5 words to the right and left, and you can only search single words.&amp;nbsp; With the new version, you can search up to 10 words on the right and left of the keyword, and you can search a phrase.&amp;nbsp; To enable wider context words, go to Preferences -&amp;gt; Concord and check &lt;b&gt;Wide Context&lt;/b&gt;.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78kMN1iabI/AAAAAAAAAC0/-czHcpDEsSU/s1600/concord_widecontext.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="33" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78kMN1iabI/AAAAAAAAAC0/-czHcpDEsSU/s400/concord_widecontext.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Now you can select up to 10 words on the right or left.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78kaMMtRHI/AAAAAAAAAC8/OEQ2msDbwwg/s1600/concord_widenedcontext.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78kaMMtRHI/AAAAAAAAAC8/OEQ2msDbwwg/s320/concord_widenedcontext.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In Preferences, you can select Words Only or Words and Phrases for context word search.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78kp23ukgI/AAAAAAAAADE/8O7eDUJ94L0/s1600/concord_contextword_modes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="59" 
src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78kp23ukgI/AAAAAAAAADE/8O7eDUJ94L0/s640/concord_contextword_modes.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This means you can search like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78lVnjN5SI/AAAAAAAAADM/d2Cdc3P3V4E/s1600/concord_widercontextphrase.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="179" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78lVnjN5SI/AAAAAAAAADM/d2Cdc3P3V4E/s640/concord_widercontextphrase.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can also specify words you don't want to see in the context.&amp;nbsp; This means if the specified word(s) appear in the specified span, that line will be excluded from the result.&amp;nbsp; To enable this, check Exclude right next to the Context Word Mode setting (see above).&amp;nbsp; A text box and span settings for Excluding words appear on the main window.&amp;nbsp; Check the box next to Exclude and search the keyword.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78mJa9k4EI/AAAAAAAAADU/eYlxYkAPr4E/s1600/concord_exclude.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="47" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78mJa9k4EI/AAAAAAAAADU/eYlxYkAPr4E/s400/concord_exclude.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The result would look like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78my47FM7I/AAAAAAAAADc/bKJCLldllss/s1600/concord_exclude_example.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="160" 
src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78my47FM7I/AAAAAAAAADc/bKJCLldllss/s640/concord_exclude_example.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In this example, the word 'the' should not appear within 10 words to the left or right of the keyword (if this is working correctly).&lt;br /&gt;&lt;br /&gt;Another feature is font selection.&amp;nbsp; With the current working version, you can only select Courier or Courier New (or an appropriate font might be chosen for non-alphabetic languages).&amp;nbsp; Now you can select any font on your system.&amp;nbsp; To do this, go to Preferences -&amp;gt; Concord and click Manage in the Display setting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78nezhV5TI/AAAAAAAAADk/hN8qf9mJnv8/s1600/concord_font.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="110" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78nezhV5TI/AAAAAAAAADk/hN8qf9mJnv8/s400/concord_font.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;A font panel appears.&amp;nbsp; You can select a font from the pop-up menu and click Add to add it to the list.&amp;nbsp; To remove a font from the list, just select a font and click Remove.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78rRoTeFII/AAAAAAAAAD0/u0ecPiXxVI4/s1600/concord_fontManage.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78rRoTeFII/AAAAAAAAAD0/u0ecPiXxVI4/s320/concord_fontManage.png" /&gt;&lt;/a&gt;&lt;/div&gt;You can now select this added font.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78rskp8IdI/AAAAAAAAAD8/hZ2-1OyjLpk/s1600/concord_fontadded.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78rskp8IdI/AAAAAAAAAD8/hZ2-1OyjLpk/s320/concord_fontadded.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Another feature is related to copy/paste or export kwic results.&amp;nbsp; If you want to paste copied lines on Pages or MSWord or any application that accepts Rich Text Format text, you can paste the lines with the keyword in bold.&amp;nbsp; Check Keep text style when copying the results.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78uj3d4B8I/AAAAAAAAAEE/RAiYf2AeIi4/s1600/concord_copykeepstyle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78uj3d4B8I/AAAAAAAAAEE/RAiYf2AeIi4/s320/concord_copykeepstyle.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you paste the text on a document, the copied text preserves the font, font size and bold text on the keywords.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78vGDBhLcI/AAAAAAAAAEU/ovfgKqyggTI/s1600/concord_copy_boldkey.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="110" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78vGDBhLcI/AAAAAAAAAEU/ovfgKqyggTI/s640/concord_copy_boldkey.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;You can also insert a tab before and after the keywords.&amp;nbsp; This works with a plain text format and rich text format.&amp;nbsp; Check Insert Tab before/after Keyword.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78vT-LxAOI/AAAAAAAAAEc/WxCqHmy0bJY/s1600/concord_copy_inserttab.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78vT-LxAOI/AAAAAAAAAEc/WxCqHmy0bJY/s320/concord_copy_inserttab.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The pasted text should look like these:&lt;br /&gt;&lt;br /&gt;Plain Text&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78vdc7LhsI/AAAAAAAAAEk/6WD5-YzgDpM/s1600/concord_copy_tabspace.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78vdc7LhsI/AAAAAAAAAEk/6WD5-YzgDpM/s320/concord_copy_tabspace.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Rich Text&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78vgwGEGVI/AAAAAAAAAEs/zopx9Mqk-Yo/s1600/concord_copy_boldkey_tabspace.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="131" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78vgwGEGVI/AAAAAAAAAEs/zopx9Mqk-Yo/s400/concord_copy_boldkey_tabspace.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;This setting (tab) should work when you export the results.&lt;br /&gt;&lt;br /&gt;Another feature is editing the original/database text.&amp;nbsp; First, you need to enable this in Preferences.&amp;nbsp; You can allow editing of the original file and editing of database  files.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S783zTCq1BI/AAAAAAAAAFc/T7d2L0Rg6n4/s1600/allowediting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="60" 
src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S783zTCq1BI/AAAAAAAAAFc/T7d2L0Rg6n4/s640/allowediting.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;For Allow editing of original file, you can select Only in File Mode or In Both Modes.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S784C9pcd_I/AAAAAAAAAFk/BqjPUs1Xd7M/s1600/allowediting_mode.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S784C9pcd_I/AAAAAAAAAFk/BqjPUs1Xd7M/s320/allowediting_mode.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If Allow editing of original file is on, run Concord, select a line in the table, and right-click the table (or go to Main Menu -&amp;gt; Text Data).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S784ZLU31YI/AAAAAAAAAFs/Oej3dFNG5b4/s1600/concord_contextmenu.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="104" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S784ZLU31YI/AAAAAAAAAFs/Oej3dFNG5b4/s320/concord_contextmenu.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you select Open Displayed Text, an editor panel appears.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S784lfBsz9I/AAAAAAAAAF0/Z_VEU2UXZZE/s1600/concord_edittext.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S784lfBsz9I/AAAAAAAAAF0/Z_VEU2UXZZE/s400/concord_edittext.png" width="382" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can edit the original text file and save the changes.&amp;nbsp; This comes in handy when you find errors in the kwic result.&amp;nbsp; This has a basic tagging helper 
function.&amp;nbsp; You specify tags on Tag Panel and then open Tag Drawer and click the number next to the tag text to insert a tag.&lt;br /&gt;&lt;br /&gt;If you select Open Displayed Text with Application, the file opens with a specified application (the default is TextEdit).&lt;br /&gt;&lt;br /&gt;In Database Mode, you can edit a database entry.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S785nhp1iaI/AAAAAAAAAF8/TEmEswRuo88/s1600/concord_contextmenu_db.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S785nhp1iaI/AAAAAAAAAF8/TEmEswRuo88/s320/concord_contextmenu_db.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Select Edit Database Entry of the selected Line.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7853w5gLKI/AAAAAAAAAGE/SoXjh5MurzI/s1600/concord_dbedit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="379" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S7853w5gLKI/AAAAAAAAAGE/SoXjh5MurzI/s640/concord_dbedit.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can make changes and update the entry or delete the entry.&lt;br /&gt;&lt;br /&gt;If the original text file is in the same directory when you created the database, you can select Open Displayed Text with Application to open the file with a specified application.&lt;br /&gt;&lt;br /&gt;And in both modes, if you select Show File of the selected Line in Finder, the original text file will be displayed in Finder (of course if the file is still in the same directory in Database mode).&lt;br /&gt;&lt;br /&gt;Another minor addition is searching text in the Context view.&amp;nbsp; You could just copy and paste to search any word you find in the Context view, but you can directly do it in 
the Context view.&amp;nbsp; Just select a word (or phrase) in the Context view and right-click to select Search in Concord.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S8ADj5ZRPOI/AAAAAAAAAQM/dwviSFNMvyU/s1600/concord_searchcontextext.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S8ADj5ZRPOI/AAAAAAAAAQM/dwviSFNMvyU/s320/concord_searchcontextext.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Finally, I will briefly go over how tag-handling works.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;In Preferences, you can select Tag(s).&amp;nbsp; Tag Only mode does not work in Concord because looking at kwic lines of just tags does not make much sense.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78ym_rv8LI/AAAAAAAAAE0/IXdRaPAmGdI/s1600/tagmodes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S78ym_rv8LI/AAAAAAAAAE0/IXdRaPAmGdI/s320/tagmodes.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Once you select the Tag(s) mode, select a tag type.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78zQWqIfLI/AAAAAAAAAFE/BiGR8EFJiME/s1600/tagtype_select.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S78zQWqIfLI/AAAAAAAAAFE/BiGR8EFJiME/s320/tagtype_select.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In Tag(s) mode, you can run kwic just by typing tags.&amp;nbsp; This example searches for the 'jj nn' (adjective + noun) combination.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78y2Du046I/AAAAAAAAAE8/7_n_jHEJGdQ/s1600/concord_tagsearch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="214" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78y2Du046I/AAAAAAAAAE8/7_n_jHEJGdQ/s640/concord_tagsearch.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can suppress tags in the context.&amp;nbsp; In Preferences, check Suppress tags in context.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78zvzthOMI/AAAAAAAAAFU/w7FI67z6FKU/s1600/tagformat.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="27" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S78zvzthOMI/AAAAAAAAAFU/w7FI67z6FKU/s400/tagformat.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Then the results should look like this: &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78zkN6XNlI/AAAAAAAAAFM/k_Tok9g9MTc/s1600/conord_tagsearch_supresstag.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="220" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S78zkN6XNlI/AAAAAAAAAFM/k_Tok9g9MTc/s640/conord_tagsearch_supresstag.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;But suppressing tags in context has one issue.&amp;nbsp; When you click a kwic result line, the displayed context text and the keyword coloring in the context text are not correct.&amp;nbsp; I will fix this if I can find a good way, but until then, this remains as is.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2113474722312224245?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://casualconc.blogspot.com/feeds/2113474722312224245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2113474722312224245' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2113474722312224245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2113474722312224245'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_8653.html' title='The current status of CasualConc beta - Concord'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_aDr7cQQzDwI/S78jT2oqc-I/AAAAAAAAACk/B-my4gpNpIQ/s72-c/concord_span.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1954493526231984978</id><published>2010-04-09T17:25:00.000+09:00</published><updated>2010-04-09T17:25:13.464+09:00</updated><title type='text'>The current status of CasualConc beta - General/Global part 2</title><content type='html'>Lemma and Keyword grouping.&amp;nbsp; The changes are mostly with how to deal with lemma/keyword group lists.&amp;nbsp; In the current working version, you can just select a lemma or keyword group file.&amp;nbsp; In the next version, you can manage them on CasualConc.&lt;br /&gt;&lt;br /&gt;To use lemma/keyword group function, go to Preferences -&amp;gt; Lemma (this process is the same).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77YY-OkKjI/AAAAAAAAABk/Y3mwG0ZcsCc/s1600/lemma_prefrence.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77YY-OkKjI/AAAAAAAAABk/Y3mwG0ZcsCc/s320/lemma_prefrence.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The difference is that you select a group you manage on the lemma/keyword group panel.&amp;nbsp; If you check Apply Lemmatization to Search Word, searching a word returns all the words with the same stem.&lt;br /&gt;&lt;br /&gt;To manage a lemma list, go to Main Menu -&amp;gt; Window -&amp;gt; Lemma List Panel.&amp;nbsp; There are separate panels for Lemma and Keyword Group.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77Y9amCpII/AAAAAAAAABs/cS_Wx6oEa9Q/s1600/lemma_menu.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77Y9amCpII/AAAAAAAAABs/cS_Wx6oEa9Q/s320/lemma_menu.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;A panel appears.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77ZGirEsYI/AAAAAAAAAB0/AhQBaIPZVws/s1600/lemma_panel.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="196" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77ZGirEsYI/AAAAAAAAAB0/AhQBaIPZVws/s400/lemma_panel.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;On the left table, you manage groups just like the stop word/skip character lists.&amp;nbsp; You can enter a lemma set or remove/modify them.&amp;nbsp; You can also import an existing lemma file such as an e-lemma file.&amp;nbsp; The lemma/keyword group lists can be exported in the same format.&amp;nbsp; You can create multiple lists and apply any of them to your analysis.&lt;br /&gt;&lt;br /&gt;&lt;br 
/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;In file handling, you can specify any file extensions by checking Allow Other File Types in Preferences -&amp;gt; Files.&amp;nbsp; Or you can allow any file types (even files with no extension) by checking Any File Types.&amp;nbsp; If there is any readable text in the files, CasualConc tries to read it.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77iTk5gBGI/AAAAAAAAACM/m6gUvSwxMCQ/s1600/new_file_types.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="25" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77iTk5gBGI/AAAAAAAAACM/m6gUvSwxMCQ/s400/new_file_types.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Ignore tag/section has a small improvement.&amp;nbsp; You can select either xml/html-type tags or underscore-type tags (&amp;lt;*&amp;gt; or ~_*).&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77i4IJe1EI/AAAAAAAAACU/Nk3TGLLgwLA/s1600/ignore_tags.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="41" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77i4IJe1EI/AAAAAAAAACU/Nk3TGLLgwLA/s400/ignore_tags.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can also specify any string to ignore in the analysis.&amp;nbsp; To enable this, check Ignore Specified Strings.&amp;nbsp; You can specify whatever strings you want as regular expressions.&amp;nbsp; The blank column with a check box determines whether the entry is used or not, C is case sensitivity (check to make the regular expression case sensitive), and M applies the regular expression across line break characters (\r\n,\n,\r).&amp;nbsp; You can edit regular expressions in the text box or on the table.&lt;br /&gt;&lt;br 
/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77jWmbjuYI/AAAAAAAAACc/385uhBaD3E8/s1600/ignore_strings.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="70" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77jWmbjuYI/AAAAAAAAACc/385uhBaD3E8/s400/ignore_strings.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Now, I will focus on tools from the next post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1954493526231984978?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1954493526231984978/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1954493526231984978' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1954493526231984978'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1954493526231984978'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_3281.html' title='The current status of CasualConc beta - General/Global part 2'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_aDr7cQQzDwI/S77YY-OkKjI/AAAAAAAAABk/Y3mwG0ZcsCc/s72-c/lemma_prefrence.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6668810308043886785</id><published>2010-04-09T16:28:00.003+09:00</published><updated>2010-07-03T23:51:09.907+09:00</updated><title type='text'>The current status of CasualConc beta - General/Global</title><content type='html'>I will start with some features that are common across tools.&lt;br /&gt;&lt;br /&gt;First is Search Word Choice.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77MLnsRmWI/AAAAAAAAAAk/mAFvuz7oRIE/s1600/search_word_choice.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77MLnsRmWI/AAAAAAAAAAk/mAFvuz7oRIE/s320/search_word_choice.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Experimental tag search options have been added.&amp;nbsp; This feature is only supported in European Language A/B mode and works in Concord, Cluster, Collocation, and Word Count.&amp;nbsp; What this does in each tool will be explained (if I have time) in the post for each tool.&amp;nbsp; But generally, if you select Tag(s) in Concord, Cluster, or Collocation, you can search a tagged corpus just by tag.&amp;nbsp; In Word Count, you can choose to display words and tags in separate columns.&amp;nbsp; For example, if your corpus is POS-tagged and jj is used for adjectives and nn is used for nouns, searching 'jj nn' returns all the sequences of 'jj nn' in your corpus, such as 'beautiful_jj day_nn'.&amp;nbsp; Tag Only does not work in Concord, but in Cluster, Collocation, and Word Count, you can create lists only with tags.&lt;br /&gt;&lt;br /&gt;If you select Regular Expression, you can set case sensitivity.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77QCwrkJZI/AAAAAAAAAAs/rPVtvfv3FTs/s1600/regex_case.png" 
imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77QCwrkJZI/AAAAAAAAAAs/rPVtvfv3FTs/s320/regex_case.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Another new feature is setting character replacement.&amp;nbsp; In the current official version (1.0.2), some of the characters (mostly symbols and other non-alphabetic characters) are automatically replaced.&amp;nbsp; For example, smart quotes (“”), which are usually used in word processing applications, are treated as regular characters (like letters) because of their assignments in Unicode (double byte?).&amp;nbsp; Also, if you copy/paste or convert text from MSWord or read MSWord documents directly, there may be many multi-byte characters.&amp;nbsp; In the new version (or beta), you can specify (in other words, you need to specify) the replacements.&lt;br /&gt;&lt;br /&gt;To enable this feature, in Preferences -&amp;gt; General, check &lt;b&gt;Replace Characters&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S77Rv4QhwtI/AAAAAAAAAA0/l4NFdtNfxpA/s1600/replace_char.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S77Rv4QhwtI/AAAAAAAAAA0/l4NFdtNfxpA/s320/replace_char.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Then, click Show List to open the Replace Characters panel.&amp;nbsp; Click the Add button to add a new entry and enter the characters directly in the table.&amp;nbsp; To apply any of the replacement pairs, check the box on the left before you run the analysis.&amp;nbsp; This feature is not fully functional, so if you enter a weird character in the table, it might not accept it (this happened when I entered a garbage character [a half of the Unicode code of a character]).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77R5t0kv3I/AAAAAAAAAA8/nfEVkoS2J1I/s1600/replace_char_example.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77R5t0kv3I/AAAAAAAAAA8/nfEVkoS2J1I/s320/replace_char_example.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Next is Include Words.&amp;nbsp; You can specify sequences of characters to be treated as words.&amp;nbsp; In the current official version, you can specify these in Preferences, but they are always applied to all files.&amp;nbsp; With this new function, you can create different groups of characters.&amp;nbsp; To enable this, check Include Items on 'Include Words' list in Preferences -&amp;gt; General.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77S-J81KGI/AAAAAAAAABE/Yx9ypTnewxM/s1600/include_word.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="31" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77S-J81KGI/AAAAAAAAABE/Yx9ypTnewxM/s400/include_word.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Clicking the Include Words List button will display a panel, but this is integrated with the Stop Word/Skip Character list, so I will explain it together with them below.&lt;br /&gt;&lt;br /&gt;You can also specify any character to be treated as a part of a word (if this works properly).&amp;nbsp; Check Other characters to be included and specify any character you want.&amp;nbsp; If you check Includes word initial, a word that starts with the specified character should be treated as a word including that character (I'm not sure how well this works, though).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77U4s7ru6I/AAAAAAAAABM/x6ST-eRmDTU/s1600/include_character.png" imageanchor="1" 
style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="20" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S77U4s7ru6I/AAAAAAAAABM/x6ST-eRmDTU/s400/include_character.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Stop Word/Skip Character function.&amp;nbsp; This is a totally new function.&amp;nbsp; You can create and manage stop word/skip character lists.&amp;nbsp; The latter is for multi-byte character languages, such as Japanese, to ignore punctuation characters.&amp;nbsp; In these languages, punctuation characters are also multi-byte, so they are treated as regular characters (like letters).&amp;nbsp; This function is meant to avoid that.&lt;br /&gt;&lt;br /&gt;To use this function, go to Main Menu -&amp;gt; Window -&amp;gt; Stop Word/Skip Character List Panel.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_aDr7cQQzDwI/S77VyzT9RhI/AAAAAAAAABc/Zi5p5X7hxtQ/s1600/stop_list_menu.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_aDr7cQQzDwI/S77VyzT9RhI/AAAAAAAAABc/Zi5p5X7hxtQ/s320/stop_list_menu.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;A panel appears.&amp;nbsp; On the left, you manage groups.&amp;nbsp; First, enter a group name and click the Add button to create a new group.&amp;nbsp; Then select a group in the left table.&amp;nbsp; You can delete or rename a group.&amp;nbsp; On the right, you have three choices: Stop Words, Skip Characters, and Include Words.&amp;nbsp; Select the appropriate tab, enter a word/character, and click the Add button.&amp;nbsp; You can remove any word by clicking the Remove button.&amp;nbsp; You can also import/export the list.&amp;nbsp; Import accepts a plain text file, and you can select an encoding.&amp;nbsp; The format is one word/character per line.&amp;nbsp; Exported text will be in the same format (one word/character per line).&lt;br /&gt;&lt;br /&gt;&lt;div 
class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77VrJWORXI/AAAAAAAAABU/610hA0Uoxzk/s1600/stop_word_list.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/S77VrJWORXI/AAAAAAAAABU/610hA0Uoxzk/s320/stop_word_list.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You can choose which tools apply stop words.&amp;nbsp; Go to Preferences -&amp;gt; Others and check the tools that should apply stop word deletion.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Rm-WR8iI/AAAAAAAAAJc/c9y-51DiuRg/s1600/stopword_application.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="45" src="http://2.bp.blogspot.com/_aDr7cQQzDwI/S79Rm-WR8iI/AAAAAAAAAJc/c9y-51DiuRg/s400/stopword_application.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;The language/group choice and stop/skip word application, as well as the current search word mode, are now displayed on the main window.&amp;nbsp; An indicator is shown when stop words are enabled, and SK is shown when Skip Characters is enabled.&amp;nbsp; You can switch the Search Word mode and the stop word/skip character group on the main window.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77h71BorRI/AAAAAAAAACE/1aaih4ccrSA/s1600/status_on_window.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="21" src="http://4.bp.blogspot.com/_aDr7cQQzDwI/S77h71BorRI/AAAAAAAAACE/1aaih4ccrSA/s400/status_on_window.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Well, this post is getting long, so I'll stop here.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/626625090667668862-6668810308043886785?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6668810308043886785/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6668810308043886785' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6668810308043886785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6668810308043886785'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta_09.html' title='The current status of CasualConc beta - General/Global'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_aDr7cQQzDwI/S77MLnsRmWI/AAAAAAAAAAk/mAFvuz7oRIE/s72-c/search_word_choice.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-601010150550035088</id><published>2010-04-09T15:27:00.000+09:00</published><updated>2010-04-09T15:27:04.826+09:00</updated><title type='text'>The current status of CasualConc beta</title><content type='html'>It's been a while since my last post...&amp;nbsp; I've been busy with annotating and analyzing my own corpus for my research.&amp;nbsp; While I was working on my own corpus, I've made a lot of changes to CasualTagger and some on CasualConc.&amp;nbsp; The analysis of my corpus text is almost done and now I need to 
concentrate on writing.&amp;nbsp; But before that, I decided to make note of the changes I've made to CasualConc beta.&amp;nbsp; I'm not sure if anyone is following this blog, so this is mainly for my own purposes.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;I will make multiple posts today (and maybe tomorrow) to cover most of the changes and how-tos for new features, so that those who are (bravely) using the beta version will know what some of the new buttons, choices, etc. do.&lt;br /&gt;&lt;br /&gt;I will also try to present some experimental features that will appear in the coming beta.&amp;nbsp; These features are still under development, but if you are interested, I'm more than happy to share them with you.&amp;nbsp; Just email me (the address is on the main site) so that I can send you the most up-to-date beta.&lt;br /&gt;&lt;br /&gt;Also, if you ever try the beta version, please send me any feedback or post comments on this blog.&amp;nbsp; I can read Japanese and English, so both are OK.&lt;br /&gt;&lt;br /&gt;Now let me start with general or global features in the next post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-601010150550035088?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/601010150550035088/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=601010150550035088' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/601010150550035088'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/601010150550035088'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2010/04/current-status-of-casualconc-beta.html' 
title='The current status of CasualConc beta'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7142153874615270704</id><published>2009-12-18T02:36:00.001+09:00</published><updated>2009-12-20T02:24:12.495+09:00</updated><title type='text'>CasualTagger 0.8</title><content type='html'>I'm not sure if there is anyone who has even tried CasualTagger, but because I've been using it to tag my own corpus, I've been adding features to it.&amp;nbsp; I don't have time to work on the documentation with screenshots, so I'll describe new features and how to use them (to some extent) on this blog.&amp;nbsp; But because I don't remember all the changes I've made since the last update, I only mention the more significant ones here.&lt;br /&gt;&lt;br /&gt;General feature&lt;br /&gt;- memo panel&lt;br /&gt;&lt;br /&gt;You can keep notes of what you are doing and CasualTagger keeps your memo.&amp;nbsp; The format of this might change in the future, though.&lt;br /&gt;&lt;br /&gt;In Editor mode&lt;br /&gt;- coloring of specified strings in kwic context (up to 2)&lt;br /&gt;- specifying left/right kwic context span separately&lt;br /&gt;- adding xml type tag in addition to pos-tags (with shortcuts)&lt;br /&gt;- more versatile tag coloring&lt;br /&gt;- search word/coloring string history &lt;br /&gt;&lt;br /&gt;The coloring is simple.&amp;nbsp; You just specify a word, phrase, or whatever you want to color in the kwic context.&amp;nbsp; You can specify whether to add colors to the left, right, or both contexts.&amp;nbsp; The mode of the search word (wildcard/character/regular expressions) also applies to context string coloring.&lt;br /&gt;&lt;br /&gt;For kwic span, you can now specify it for left and right context 
separately.&lt;br /&gt;&lt;br /&gt;The xml tagging feature has been added.&amp;nbsp; So now you can add two different types of tag formats (one for pos-tags and the other for xml type tags).&amp;nbsp; Both can be done with shortcuts.&amp;nbsp; For example, you can add pos-tags in _XXX format and at the same time, you can work on xml type tags &lt;xxx type=""&gt;~&lt;/xxx&gt;. &lt;br /&gt;&lt;br /&gt;Tag coloring is more versatile now.&amp;nbsp; You can specify different types of tags including xml type tags.&lt;br /&gt;&lt;br /&gt;Search word and context coloring strings have history features just like search word/context word history in CasualConc.&lt;br /&gt;&lt;br /&gt;New modes&lt;br /&gt;- Item Counter&lt;br /&gt;- Custom File Info&lt;br /&gt;&lt;br /&gt;Item Counter simply counts occurrences of strings in your corpus that match a regular expression.&amp;nbsp; To use this, first add files to the file list table on the left.&amp;nbsp; Then open the Option panel (Menu -&amp;gt; Window -&amp;gt; Counter Option Panel).&amp;nbsp; You can specify the end of the file information part (just like you can in CasualConc).&amp;nbsp; You can also specify any strings to ignore in counting (with regular expressions).&amp;nbsp; If the files have any strings that match the specified regular expression, they will be deleted before CasualTagger counts what you want to count.&amp;nbsp; You can have any number of items on a table and check the ones you want to apply.&amp;nbsp; You can also specify them for each table.&amp;nbsp; If you use () for back references, only the text in the brackets will appear on the table.&lt;br /&gt;&lt;br /&gt;Custom File Info is basically multiple Item Counters.&amp;nbsp; To use this, add files to the file table on the left.&amp;nbsp; Then click the Settings button in the top right corner.&amp;nbsp; You can specify the end of the file info part and any string to ignore in all the counts.&lt;br /&gt;&lt;br /&gt;On the setting table, add items to count.&amp;nbsp; Label is a 
label for table columns.&amp;nbsp; Check "U" if you want to count only unique occurrences.&amp;nbsp; "C" is case sensitivity for regular expression.&amp;nbsp; Check "M" to allow multiple line matches.&amp;nbsp; Then specify items to count and items to ignore in regular expressions.&amp;nbsp; You can specify multiple items for Items to ignore.&amp;nbsp; Just use a comma [,] to separate regular expressions.&amp;nbsp; You can export/import that list for later use.&amp;nbsp; Drag and drop to change the order.&lt;br /&gt;&lt;br /&gt;Once you set everything, close the setting window and click Run.&amp;nbsp; You can export the result as tab-delimited text (in UTF-8) to open in Excel/Numbers or any spreadsheet application.&lt;br /&gt;&lt;br /&gt;You can use Item Counter to check how your regular expressions work in Custom File Info, though it's not perfect (Item Counter doesn't have the ignore case/multiple line options for ignore items).&lt;br /&gt;&lt;br /&gt;Anyway, I don't know if anyone will try CasualTagger, but if you are interested, you can download it from the CasualConc site.&amp;nbsp; If you ever use it, please let me know what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7142153874615270704?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7142153874615270704/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7142153874615270704' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7142153874615270704'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7142153874615270704'/><link rel='alternate' type='text/html' 
href='http://casualconc.blogspot.com/2009/12/casualtagger-08.html' title='CasualTagger 0.8'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1179153415067725402</id><published>2009-11-08T01:53:00.002+09:00</published><updated>2009-11-08T01:55:06.856+09:00</updated><title type='text'>CasualConc update and CasualTranscriber</title><content type='html'>Well, I stated here that there would be no more new features in version 1.0, but I changed my mind...&amp;nbsp; This is due to a couple of reasons.&amp;nbsp; One is that I found a couple of bugs and had to fix them.&amp;nbsp; Another is that adding the features was not very time-consuming.&amp;nbsp; I added them to the current beta first and mostly just copied and pasted the items and scripts into version 1.0.&amp;nbsp; In particular, one of the new features is something I wanted to include in version 1.0 but hadn't figured out how.&lt;br /&gt;&lt;br /&gt;Anyway, the current version is 1.0.2 and here's the list of bug fixes and added features.&lt;br /&gt;&lt;br /&gt;CasualConc Version 1.0.2&lt;br /&gt;&lt;br /&gt;Bug fix&lt;br /&gt;- In Cluster, the same cluster was counted twice if a search word/phrase appeared twice in a cluster (such as 'is that is')&lt;br /&gt;- Related to the above one: in Cluster, if a search word appeared twice in a cluster, only one word was colored&lt;br /&gt;- In Cluster and Collocation with Lemma search on, not all words with the same lemma appeared in the list (rightmost column)&lt;br /&gt;&lt;br /&gt;The bug in Cluster was a little serious.&amp;nbsp; In the original script, clusters were collected on every search word.&amp;nbsp; This means that if there is a sequence 'is that is' and you search 'is', 'is that is' was 
counted twice (once with the first 'is' and then with the second 'is' in the cluster).&amp;nbsp; Now, when this happens, duplicates are eliminated, so 'is that is' is counted only once.&amp;nbsp; And because it was assumed that the search word appears only once in a cluster, only one of them was colored.&amp;nbsp; Now if there are two occurrences of the search word in a cluster, both should be colored.&lt;br /&gt;&lt;br /&gt;The bug with Lemma is a minor one, in the sense that I don't expect many people to use this function.&amp;nbsp; When you run Cluster or Collocation with the Lemma feature on, CasualConc shows the frequencies of words in the same lemma (or clusters with them) in the rightmost column.&amp;nbsp; But when a lemma contained only one word, it wasn't displayed correctly.&amp;nbsp; Now all the words should be displayed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;New features&lt;br /&gt;- Concordance Plot -- you need to set 'Scope of Context' to 'File' and check 'Create Concordance Plot'&lt;br /&gt;- Search word/context search word history&lt;br /&gt;&lt;br /&gt;Concordance Plot shows where in a file the search word appears.&amp;nbsp; The plots are generated when you run Concord with 'Scope of Context' set to 'File' and 'Create Concordance Plot' checked in Preferences.&lt;br /&gt;&lt;br /&gt;Another new feature is search word/context search word history.&amp;nbsp; CasualConc remembers the words you searched in Concord/Cluster/Collocation and you can select one from the pull-down menu.&amp;nbsp; You can set how many search/context search words to remember in Preferences (in General [Search Word] and Concord [Context Word]).&lt;br /&gt;&lt;br /&gt;I hope these new features didn't introduce new bugs.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I also updated CasualTranscriber, a transcription helper application.&amp;nbsp; Now it should run on Mac OS X 10.6 Snow Leopard.&amp;nbsp; I also fixed some bugs and added new features.&amp;nbsp; It should be a little more stable.&lt;br 
/&gt;&lt;br /&gt;The new/enhanced features are insert tag function and much more powerful regular expression search/replace.&lt;br /&gt;&lt;br /&gt;To use tag function, go to &lt;b&gt;Menu&lt;/b&gt; -&amp;gt; &lt;b&gt;Window&lt;/b&gt; -&amp;gt; &lt;b&gt;Tag Panel&lt;/b&gt;.&amp;nbsp; Then enter tags under Tag.&amp;nbsp; To insert a tag, type &lt;b&gt;Command&lt;/b&gt; + &lt;b&gt;Option&lt;/b&gt; + (the number on the left).&amp;nbsp; So to insert the tag in 1, type &lt;b&gt;Command&lt;/b&gt; + &lt;b&gt;Option&lt;/b&gt; + &lt;b&gt;1&lt;/b&gt;.&amp;nbsp; If Tag is &lt;b&gt;laugh&lt;/b&gt;, the tag will be inserted as &lt;b&gt;&lt;laugh&gt;&lt;/laugh&gt;&lt;/b&gt;.&amp;nbsp; If any text is selected on the text view, it appears between the tags.&amp;nbsp; If you check the box and enter options, they appear like attributes in XML.&amp;nbsp; Options will be divided by a comma (,) so if you enter &lt;b&gt;option1,option2&lt;/b&gt; in the Options box, the inserted tag will be &lt;b&gt;&lt;laugh option1="" option2=""&gt;&lt;/laugh&gt;&lt;/b&gt;.&amp;nbsp; The selected text appears between tags.&lt;br /&gt;&lt;br /&gt;If you don't want to type the combination, you can click a button to insert a tag.&amp;nbsp; On the main window, you will see a button on the top right corner (the one to show tool bar on any Cocoa application).&amp;nbsp; Click it to show an icon like a gear.&amp;nbsp; Clicking the gear icon shows a drawer on the right.&amp;nbsp; Click a button next to the tag you want to insert.&amp;nbsp; You can change the tag on the drawer (the change will be reflected on the panel).&amp;nbsp; But if you want to change the options, you need to do that on the panel.&lt;br /&gt;&lt;br /&gt;I fixed some bugs, but I can't tell which ones.&amp;nbsp; This is because so many things broke in Snow Leopard and I couldn't tell which bugs were in the previous version and which ones were due to change in the OS.&lt;br /&gt;&lt;br /&gt;Anyway, if you try either of them and find any bug, 
please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1179153415067725402?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1179153415067725402/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1179153415067725402' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1179153415067725402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1179153415067725402'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/11/casualconc-update-and-casualtranscriber.html' title='CasualConc update and CasualTranscriber'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1509054909869893482</id><published>2009-11-01T10:48:00.001+09:00</published><updated>2009-11-01T10:50:05.654+09:00</updated><title type='text'>Another CasualConc bug fix</title><content type='html'>In the development of the next version, I found a bug which affects the version 1.0 and fixed it.&lt;br /&gt;&lt;br /&gt;Bug&lt;br /&gt;- the context text (in context view) is not properly displayed when the search word appears in the first paragraph of a file.&lt;br /&gt;&lt;br /&gt;This only affected if you use Database mode and the searched word appears in the very first paragraph of a file (original file, not the database file). 
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The development of the next version is slow.&amp;nbsp; Because I started to change the fundamentals, only a part of functions work now (file management and a part of Concord).&amp;nbsp; Anyway, here's the list of tentative new/revised features (might not show up in the next version).&lt;br /&gt;&lt;br /&gt;- stop word list&lt;br /&gt;- skip character list&lt;br /&gt;- experimental pos tag search/count (only in European Language modes)&lt;br /&gt;&lt;br /&gt;If you have any suggestions, please let me know.&amp;nbsp; I'll see if I can include them in the next version.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1509054909869893482?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1509054909869893482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1509054909869893482' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1509054909869893482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1509054909869893482'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/11/another-casualconc-bug-fix.html' title='Another CasualConc bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3196379777574405788</id><published>2009-10-25T02:15:00.000+09:00</published><updated>2009-10-25T02:15:01.679+09:00</updated><title type='text'>CasualConc minor bug fix</title><content type='html'>I found a bug in CasualConc when I was working on the beta, which has the same script.&amp;nbsp; It's a minor one (a feature I believe not a lot of people use). &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug fix&lt;/b&gt;&lt;br /&gt;- crashed when creating a database file in Advanced Corpora Database mode using tag deletion.&lt;br /&gt;&lt;br /&gt;If you are one of the rare people who uses this function, please download the latest version.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3196379777574405788?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3196379777574405788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3196379777574405788' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3196379777574405788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3196379777574405788'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/10/casualconc-minor-bug-fix.html' title='CasualConc minor bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6678871536686697726</id><published>2009-10-19T02:15:00.000+09:00</published><updated>2009-10-19T02:15:06.903+09:00</updated><title type='text'>CasualConc 1.0 and more...</title><content type='html'>This post will be a long one.&lt;br /&gt;&lt;br /&gt;I've decided to make the latest build of CasualConc version 1.0, mostly because I didn't get any feedback or bug reports (well, I guess not a lot of people ever tried the latest beta, or people don't bother to report bugs...).&amp;nbsp; Anyway, I made a few changes to it and here's the list.&amp;nbsp; The bug fixes are very minor.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CasualConc version 1.0&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug Fix&lt;/b&gt;&lt;br /&gt;- timer for File Info now working&lt;br /&gt;- move tables in Cluster moves everything including span and type &lt;br /&gt;- word list import now functions&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Enhancement&lt;/b&gt;&lt;br /&gt;- creating File Info table is much faster&lt;br /&gt;- progress bar in File mode progresses based on the number of files processed&lt;br /&gt;- now including help files (the same content as you find on the site; English only)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;From now on, this version will only be maintained if I get bug reports.&amp;nbsp; I've already started adding new features and I'm planning to make more fundamental changes to it.&amp;nbsp; I'll release it as version 1.x some time in the future, but there is no time frame.&amp;nbsp; If you have any feature requests, please send them to me.&amp;nbsp; I'll try to add them once I add what I want, though whether I can add what you want depends on my time and scripting skills.&amp;nbsp; I might release it as a beta (or alpha) even before it becomes stable if people are interested.&amp;nbsp; If you are, please let me know.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In addition to 
this, I've also updated some of my other applications.&amp;nbsp; I don't know if anyone ever uses them, but I really started using some of them personally, so I've been trying to include what I want.&amp;nbsp; Well, the reasons I started to write the applications vary.&amp;nbsp; Some were written upon request and some were just experimental (to try what I learned in scripting).&amp;nbsp; Now, I want to make them more like real applications.&lt;br /&gt;&lt;br /&gt;Anyway, the updated applications and the details of updates are below.&amp;nbsp; I don't think people are interested, but these are for the record, so I can keep track of what I've done.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CasualMecab&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;- Aozora Bunko Kanji substitute handling&lt;br /&gt;- experimental Word List function using Mecab output&lt;br /&gt;- Snow Leopard Support (separate build)&lt;br /&gt;&lt;br /&gt;Aozora Bunko Kanji substitute handling picks out Kanji substitutes, such as ※［＃「てへん＋劣」、第3水準1-84-77］, and replaces them with the real Kanji (now this is possible with Unicode characters).&amp;nbsp; The Word List function uses Mecab format output and creates a word list with any of the info available (not just with the word in the text).&amp;nbsp; You can create a word list of base form and part-of-speech combinations, etc.&amp;nbsp; Snow Leopard Support is just a workaround.&amp;nbsp; If you use Snow Leopard, you need to download the build for it.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CasualTagger&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;- support rbtagger if installed&lt;br /&gt;- better regex search/replace&lt;br /&gt;- delete punctuation tags&lt;br /&gt;- ignore header section (info part?)&lt;br /&gt;- skip bracketed tags from tagging&lt;br /&gt;- progress bar in batch process&lt;br /&gt;- run tagger in editor mode&lt;br /&gt;&lt;br /&gt;CasualTagger now supports &lt;b&gt;rbtagger&lt;/b&gt;.&amp;nbsp; You can find more information about rbtagger &lt;a 
href="http://rbtagger.rubyforge.org/"&gt;here&lt;/a&gt;.&amp;nbsp; &lt;b&gt;rbtagger&lt;/b&gt;, by Todd A. Fisher, is a tagger based on Eric Brill's tagger.&amp;nbsp; You need to install it yourself, but it's very simple.&amp;nbsp; Just type &lt;b&gt;sudo gem install rbtagger&lt;/b&gt; in Terminal.app.&amp;nbsp; It still has some issues, but it's good to have alternatives.&lt;br /&gt;&lt;br /&gt;Regex replace was supported before, but now you can use it for search.&amp;nbsp; Delete punctuation tags deletes tags put on punctuation characters (not words).&amp;nbsp; Ignore header section is for my own purposes.&amp;nbsp; Some of my corpus files have a header section &amp;lt;info&amp;gt;~&amp;lt;/info&amp;gt; and I don't want to add tags to the text in this section.&amp;nbsp; So now CasualTagger can ignore this part (keep the original text).&amp;nbsp; Skip bracketed tags is to ignore section tags I have on some files, such as &amp;lt;text&amp;gt;~&amp;lt;/text&amp;gt;, etc.&amp;nbsp; And the progress bar is added to batch processing.&amp;nbsp; Finally, you can apply tags (engtagger/rbtagger) on a single file in Editor mode.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CasualTextractor&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;- PDF mode&lt;br /&gt;&amp;nbsp; - search in PDF&lt;br /&gt;&amp;nbsp; - enlarge/reduce in PDF&lt;br /&gt;&amp;nbsp; - go to selected text (from PDF to extracted text)&lt;br /&gt;&amp;nbsp; - delete selected text (on PDF from extracted text)&lt;br /&gt;&amp;nbsp; - split word search and replace (PDF artifacts)&lt;br /&gt;&amp;nbsp; - replace character list&lt;br /&gt;&lt;br /&gt;- Web mode&lt;br /&gt;&amp;nbsp; - open web files (html/htm/webarchive)&lt;br /&gt;&amp;nbsp; - clear open page/file&lt;br /&gt;&amp;nbsp; - web history&lt;br /&gt;&amp;nbsp; - source view&lt;br /&gt;&lt;br /&gt;- Document&lt;br /&gt;&amp;nbsp; - split word search (for PDF)&lt;br /&gt;&amp;nbsp; - replace character list&lt;br /&gt;&lt;br /&gt;- Overall&lt;br /&gt;&amp;nbsp; - open recent files&lt;br /&gt;&amp;nbsp; - 
regular expression search&lt;br /&gt;&amp;nbsp; - simple tagging support&lt;br /&gt;&amp;nbsp; - text format options (replace certain text/characters)&lt;br /&gt;&lt;br /&gt;I've made so many changes that I'll only make notes on some of them.&lt;br /&gt;&lt;br /&gt;In PDF mode, with delete selected text on PDF, you select text in the PDF view and delete the section.&amp;nbsp; The text will be deleted from the text view and the text on the PDF will be struck through.&amp;nbsp; This is handy if you want to delete a header or footer from the PDF text.&amp;nbsp; Split word search is to find words that were split when the PDF file was created, such as interest- ing due to a line break.&lt;br /&gt;&lt;br /&gt;In Web mode, you can open web files (not just drag&amp;amp;drop) and clear the page to allow you to drag&amp;amp;drop another file.&amp;nbsp; Web history is what you usually see in a browser, though it's limited.&amp;nbsp; You can see the source of the page and make changes to it (you can see the result of the change).&lt;br /&gt;&lt;br /&gt;Overall, it has regex search/replace.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The information on the site is still based on the previous beta.&amp;nbsp; I probably won't update it until I'm certain the features are set.&amp;nbsp; But if you try it and wonder how a function works, please feel free to contact me.&amp;nbsp; Also, any bug report is welcome.&amp;nbsp;&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6678871536686697726?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6678871536686697726/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6678871536686697726' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/626625090667668862/posts/default/6678871536686697726'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6678871536686697726'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/10/casualconc-10-and-more.html' title='CasualConc 1.0 and more...'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7408737449597953191</id><published>2009-09-28T22:12:00.000+09:00</published><updated>2009-09-28T22:12:45.814+09:00</updated><title type='text'>CasualPConc 0.7 and a new application</title><content type='html'>I made a few changes and fixed a few bugs in CasualPConc, a simple parallel concordancer for Mac OS X.&amp;nbsp; It is a little more stable now.&amp;nbsp; I also worked on the documentation.&amp;nbsp; Now it covers most of the features.&lt;br /&gt;&lt;br /&gt;The new application is based on CasualPConc.&amp;nbsp; When I first released CasualPConc, someone asked if I would make it handle more than two corpora.&amp;nbsp; This is kind of my answer to that.&amp;nbsp; CasualMultiPConc has limited features, but it can handle up to 5 parallel corpora.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;This new application simply does kwic concordance of up to 5 parallel corpora.&amp;nbsp; The future of this application is up to users (if there are any).&amp;nbsp; I don't have any experience in parallel corpus analysis and I don't need to use it right now, so I'm not even sure how well this works.&amp;nbsp; I only used small corpora to test it.&amp;nbsp; If you are interested in testing it, any feedback is welcome.&amp;nbsp; This one is also for Mac OS X 10.5 Leopard or 
later, though I only tested it on 10.6 Snow Leopard.&lt;br /&gt;&lt;br /&gt;This application and CasualPConc are only on the English site (Main Site link on the right).&amp;nbsp; Both of them are under Other Applications.&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7408737449597953191?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7408737449597953191/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7408737449597953191' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7408737449597953191'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7408737449597953191'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualpconc-07-and-new-application.html' title='CasualPConc 0.7 and a new application'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8521572917283097868</id><published>2009-09-23T01:42:00.001+09:00</published><updated>2009-09-23T01:43:45.030+09:00</updated><title type='text'>CasualConc updated, but still not 1.0</title><content type='html'>I fixed a couple of bugs and added a few features.&amp;nbsp; Yes, I wrote that I wouldn't spend time on East Asian language support, but I somehow figured out how to handle coloring in Concord with 2-byte character modes (Japanese (plain) and 
Japanese (wakachi)).&amp;nbsp; Well, it's the middle of a 5-day weekend in Japan...&lt;br /&gt;&lt;br /&gt;Bug fixes&lt;br /&gt;- crashes in Word Count when n-gram list is created in File Mode.&lt;br /&gt;- File Information treated word lengths in bytes not in characters&lt;br /&gt;&lt;br /&gt;Feature Improvements&lt;br /&gt;- added two new Word Count sort options: Word Length and Reverse Word Length&lt;br /&gt;- added Character as a search word choice&lt;br /&gt;- full regular expression search&lt;br /&gt;- a progress bar is added at the bottom of the main window, though it doesn't indicate the progress (it shows CasualConc is processing your request)&lt;br /&gt;- much better East Asian Language support (Japanese (plain) and Japanese (wakachi))&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Along with better East Asian Language support, I made a few other changes.&amp;nbsp; The two new Word Count sort options go with File Information.&amp;nbsp; Now you can get information about the number of words with a certain number of characters/letters.&amp;nbsp; So this new feature is to check which words are the longest/shortest.&lt;br /&gt;&lt;br /&gt;The new search word mode is for searching for the characters used in wildcard search.&amp;nbsp; Now you can search for * ? ! 
and other non-word characters.&lt;br /&gt;&lt;br /&gt;The change in regular expression search is that before this change, all the regular expressions were word level.&amp;nbsp; In other words, the actual regular expression processed inside CasualConc was wrapped in \b~\b (word boundaries).&amp;nbsp; Now this limitation is lifted.&amp;nbsp; So if you want the same results as before, simply put \b before and after the regular expression (for example, \bcat\b to match only the whole word cat).&lt;br /&gt;&lt;br /&gt;The progress bar added to the main window only shows that CasualConc is processing.&amp;nbsp; It doesn't show how much processing has been done.&lt;br /&gt;&lt;br /&gt;East Asian Language support is much better now.&amp;nbsp; Context word coloring is added and now you can use the database mode.&amp;nbsp; But because of the nature of the texts (no spaces between words), some of the functions behave differently.&amp;nbsp; More detailed information about East Asian Language support is documented on the site (only on the English site at this moment).&lt;br /&gt;&lt;br /&gt;Now I will finally focus on bug fixes and minor changes.&amp;nbsp; I won't make any more major changes before 1.0, or so I think...&lt;br /&gt;&lt;br /&gt;The documentation for CasualConc and CasualTagger is updated.&amp;nbsp; It should reflect the latest versions.&lt;br /&gt;&lt;br /&gt;I'd appreciate it if you could report any bugs as soon as you find them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8521572917283097868?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8521572917283097868/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8521572917283097868' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/626625090667668862/posts/default/8521572917283097868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8521572917283097868'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualconc-updated-but-still-not-10.html' title='CasualConc updated, but still not 1.0'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2828350155122419241</id><published>2009-09-20T03:05:00.000+09:00</published><updated>2009-09-20T03:05:37.248+09:00</updated><title type='text'>CasualConc update</title><content type='html'>Just because I wanted to get file information, I added it to CasualConc.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;New feature&lt;br /&gt;- File Information&lt;br /&gt;&lt;br /&gt;It is very basic and only returns the type and token counts, the type-token ratio, and the number of n-letter words.&lt;br /&gt;&lt;br /&gt;Improved&lt;br /&gt;- Fisher's Exact Test calculation speed&lt;br /&gt;&lt;br /&gt;I also rewrote the algorithm for Fisher's Exact Test and it is much, much faster, though I'm sure no one has used this since I added it last time.&amp;nbsp; But I haven't fully tested this, so I'd appreciate it if anyone can test its accuracy.&lt;br /&gt;&lt;br /&gt;Now the version is 0.9.9.9.&amp;nbsp; Well, it's almost 1.&amp;nbsp; If I can't find any other bugs when I use it in the next couple of weeks and I don't get any bug reports, I'll simply make it 1.0.&amp;nbsp; I'm sure it will still have bugs even if I make it 1.0, but that's the nature of computer programs.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;So, my decision for now is that I won't spend any more time on East Asian Language 
support because I haven't heard from anyone who uses CasualConc with East Asian languages.&amp;nbsp; I personally don't use CasualConc even with Japanese, so I don't see any necessity.&amp;nbsp; I'll work on this later if I have time, but I'll focus more on other programs once this hits 1.0.&amp;nbsp; So if you use this with East Asian languages and would like to have better support for them, please let me know.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;If I think of other nice features, I'll probably add them to other programs and see if they would work with CasualConc before I add them to it.&amp;nbsp; In fact, I added File Info to CasualTagger and it wasn't too complicated, so I decided to add it to CasualConc (well, I wanted this feature, but hadn't tried to write the scripts). &lt;br /&gt;&lt;br /&gt;Anyway, if you use CasualConc or other programs, please, please let me know what you think.&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2828350155122419241?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2828350155122419241/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2828350155122419241' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2828350155122419241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2828350155122419241'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualconc-update.html' title='CasualConc update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5625138586017546236</id><published>2009-09-17T02:15:00.001+09:00</published><updated>2009-09-17T02:18:29.865+09:00</updated><title type='text'>CasualConc and CasualTagger updates</title><content type='html'>Tonight, I uploaded newer versions of CasualConc and CasualTagger.&lt;br /&gt;&lt;br /&gt;CasualConc's update is minor.&lt;br /&gt;&lt;br /&gt;New features:&lt;br /&gt;- Fisher's Exact Test in Collocation Stats Calculators (experimental)&lt;br /&gt;- Calculator for 2x2 contingency table&lt;br /&gt;&lt;br /&gt;The first one was added at the request of someone who kindly checked the accuracy of the stats calculations.&amp;nbsp; Thank you, Sebastian!&amp;nbsp; It looks like most of the stats are reasonably accurate.&amp;nbsp; Anyway, because the calculation of Fisher's p-value is CPU intensive (esp. 
with large N), I made it an option.&amp;nbsp; To include Fisher's p-value, go to Preferences -&amp;gt; Other and check 'Include Fisher's Exact Test'.&amp;nbsp; I haven't gotten any reports on the accuracy of this, so I'm not sure how well it works.&amp;nbsp; If anyone can test it, I'd appreciate it.&amp;nbsp; The contingency table calculator is based on the same formulas as the other stats calculations.&amp;nbsp; It returns Log-Likelihood, chi-square and Fisher's Exact Test (optional).&amp;nbsp;&amp;nbsp; I hope this is useful for someone.&lt;br /&gt;&lt;br /&gt;The update to CasualTagger also consists of feature enhancements.&lt;br /&gt;&lt;br /&gt;Enhanced features:&lt;br /&gt;- word count now works with untagged text ('None' is added to options)&lt;br /&gt;- kwic search for specified word(s)/phrase(s) is available (it was only possible from a word list)&lt;br /&gt;- simple sort in kwic &lt;br /&gt;- word count and kwic with multiple files (optional)&lt;br /&gt;- editor now accepts text files in encodings other than UTF-8 (set in Preferences)&lt;br /&gt;- ignore specified tags or file information (by specifying an end marker/tag) in word count and kwic&lt;br /&gt;&lt;br /&gt;Because I made so many changes, there might be many bugs.&amp;nbsp; Now I'm trying to tag my own corpus, so I've been making changes to suit my needs.&amp;nbsp; I'll make more changes as I need them, but if you ever try CasualTagger and have nice ideas, please let me know.&amp;nbsp; I'll try to include them if they are not too complicated or they look useful for my work.&amp;nbsp; Also, I'll try to update the documentation.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Also, I checked CasualMecab on Snow Leopard, but it doesn't work.&amp;nbsp; I installed MeCab and MeCab-Ruby on Snow Leopard and they work fine from Ruby scripts (I tested the exact same script).&amp;nbsp; But somehow MeCab-Ruby doesn't work in an application.&amp;nbsp; I'll try to fix it if I can find a solution.&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5625138586017546236?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5625138586017546236/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5625138586017546236' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5625138586017546236'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5625138586017546236'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualconc-and-casualtagger-updates.html' title='CasualConc and CasualTagger updates'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-102213946390127788</id><published>2009-09-13T01:36:00.000+09:00</published><updated>2009-09-13T01:36:10.779+09:00</updated><title type='text'>CasualPConc update</title><content type='html'>Somehow, CasualPConc didn't run well in Snow Leopard.&amp;nbsp; This could be because Ruby in Snow Leopard is updated to 1.8.7 from 1.8.6 and this change might have caused errors.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Anyway, I fixed some major bugs.&amp;nbsp; There might be some other bugs which are caused by the same source (related to Array Controller).&amp;nbsp; I also updated the how-to on the site.&lt;br /&gt;&lt;br /&gt;If you find any bugs, please let me know.&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-102213946390127788?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/102213946390127788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=102213946390127788' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/102213946390127788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/102213946390127788'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualpconc-update.html' title='CasualPConc update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8819856094562590006</id><published>2009-09-05T00:38:00.002+09:00</published><updated>2009-09-05T00:59:27.848+09:00</updated><title type='text'>CasualConc bug fix</title><content type='html'>I found a bug in Word Count when I was cleaning up the code for it.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug fix&lt;/b&gt;&lt;br /&gt;- crashed when creating n-gram list in the Database mode.&lt;br /&gt;&lt;br /&gt;This bug was introduced when I added a warning message for missing files in the last update.&lt;br /&gt;&lt;br /&gt;As for the clean-up, the problem was that in Cluster and Word Count, separate code was written for each table (right and left).&amp;nbsp; This was because of my lack of scripting skill (I still 
don't have much).&amp;nbsp; I couldn't think of a good way to identify which button was clicked and process them accordingly.&amp;nbsp; If you know how Cocoa works, this should be obvious, but when I started this project, I had no experience in Cocoa.&lt;br /&gt;&lt;br /&gt;The new version is 0.9.9.7.&amp;nbsp; It's almost 1.0, so I'll try to wrap up to make it 1.0 soon.&amp;nbsp; This means no more major feature before 1.0 and I'll focus on bug fixes.&amp;nbsp; But unless I hear a lot from users whether it is mostly bug free or still has many bugs, I'm not confident enough to make it out of beta, though beta simply means (to me) it's not tested enough.&amp;nbsp; Computer programs will never be bug-free.&lt;br /&gt;&lt;br /&gt;Anyway, if you find any bug, esp. in Word Count and Cluster, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8819856094562590006?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8819856094562590006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8819856094562590006' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8819856094562590006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8819856094562590006'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/09/casualconc-bug-fix.html' title='CasualConc bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3639718275241428075</id><published>2009-08-29T00:21:00.000+09:00</published><updated>2009-08-29T00:25:56.623+09:00</updated><title type='text'>Snow Leopard</title><content type='html'>Apple's new OS, Mac OS 10.6 Snow Leopard was released yesterday.  I installed it on my Mac mini and did some tests with CasualConc.  All the basic functions seem to work fine.  I haven't checked every single feature, but I don't expect to see any serious issues with this upgrade.&lt;br /&gt;&lt;br /&gt;Yet, if you find any broken feature or any other bugs, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3639718275241428075?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3639718275241428075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3639718275241428075' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3639718275241428075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3639718275241428075'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/08/snow-leopard.html' title='Snow Leopard'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2532530414603667642</id><published>2009-08-27T01:14:00.000+09:00</published><updated>2009-08-27T01:28:00.077+09:00</updated><title type='text'>CasualConc minor update</title><content type='html'>I uploaded a newer version of CasualConc last week.  I haven't had time to work on this for a while, but I got one bug report and one minor feature request, so I fixed the bug and added the feature.  The new version is 0.9.9.6.&lt;br /&gt;&lt;br /&gt;Bug fix&lt;br /&gt;- crashed when a lemma file is selected and the Lemma mode is on but CasualConc can't find the selected lemma file (it was moved, deleted, etc.). &lt;br /&gt;&lt;br /&gt;New feature&lt;br /&gt;- file name (FN) can be selected in sorting Concord results&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If you find any other bugs, send it to me at casualconc (at) gmail.com.  If the bug is very serious, I'll try to fix it as soon as possible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2532530414603667642?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2532530414603667642/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2532530414603667642' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2532530414603667642'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2532530414603667642'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/08/casualconc-minor-update.html' title='CasualConc minor 
update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2838019037140324961</id><published>2009-05-08T20:17:00.000+09:00</published><updated>2009-05-08T20:30:41.452+09:00</updated><title type='text'>CasualConc update</title><content type='html'>I got a couple of bug reports and a feature request, so I fixed the bugs and added the feature.  I also found some other bugs related to the reported ones and fixed them too.  In addition to these, I made one minor change.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug fixes&lt;/b&gt;&lt;br /&gt;- crash in Concord with Scope of Context set as Sentence in the Database mode.&lt;br /&gt;- corrupted export CSV files from Collocation&lt;br /&gt;- crash in saving Collocation table&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Addition&lt;/b&gt;&lt;br /&gt;- Reverse Alphabetical sorting in Word Count&lt;br /&gt;&lt;br /&gt;What this does (if it works) is sort words in alphabetical order, but from the last letter to the first letter.  
So in normal alphabetical order, &lt;span style="font-weight: bold;"&gt;a&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;an&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;that&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;the&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;this&lt;/span&gt; appear in that order, but in Reverse Alphabetical order, the order will be &lt;span style="font-weight: bold;"&gt;a&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;the&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;an&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;this&lt;/span&gt;, &lt;span style="font-weight: bold;"&gt;that&lt;/span&gt; (sorted by their final letters: a, e, n, s, t).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Change&lt;/b&gt;&lt;br /&gt;- settings of minimum frequency in Cluster, Collocation/Cooccurrence, and Word Count are moved to Preferences&lt;br /&gt;&lt;br /&gt;Now you can't set Min Freq. for each table in Cluster and Word Count, but CasualConc remembers the Min Freq. for each tool.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I also got a request to support exporting results in Excel format.  I've been experimenting with this and it is on the to-do list (I can't tell you when I will add it because I need to figure out how to implement it first).  This would probably require you to install a Ruby module in Terminal (a single command).  
Are there any other people who are interested?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2838019037140324961?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2838019037140324961/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2838019037140324961' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2838019037140324961'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2838019037140324961'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/05/casualconc-update.html' title='CasualConc update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-88847544670008478</id><published>2009-03-30T19:56:00.000+09:00</published><updated>2009-03-30T20:00:22.859+09:00</updated><title type='text'>CasualConc quick bug fix</title><content type='html'>I just found a bug in CasualConc.  When it opens a kwic result in a new window, it crashes.  If you use this function, please download version 0.9.9.3 from the site.  I think this was introduced when I made a few changes last time.&lt;br /&gt;&lt;br /&gt;If you find any other bugs, please let me know.  
If they are minor and easy to fix, I'll try to fix them in a day or two.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-88847544670008478?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/88847544670008478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=88847544670008478' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/88847544670008478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/88847544670008478'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/03/casualconc-quick-bug-fix.html' title='CasualConc quick bug fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-9222053028641742283</id><published>2009-03-30T19:08:00.000+09:00</published><updated>2009-03-30T19:21:02.519+09:00</updated><title type='text'>CasualPConc more updates</title><content type='html'>Today, I learned that at least one person in the world other than myself knows CasualPConc exists.  I'm really glad about that.&lt;br /&gt;&lt;br /&gt;I added a few more features to CasualPConc today.  Now almost all the functions I can think of and wanted to add are there.  I might add a function to export results if anyone is interested.  Or if anyone has a good idea, I might consider that.  
But from now on, I'll focus on bug fixing and documentation.  I'll update the CasualPConc page on the CasualConc main site in the coming weeks. &lt;br /&gt;&lt;br /&gt;I got one request to make CasualPConc able to handle more than two parallel corpora.  But I think it's hard to add that function to CasualPConc.  It would probably be easier to write a new program based on CasualPConc.  I might work on this once I finalize CasualPConc and if I have time to focus on its development.&lt;br /&gt;&lt;br /&gt;Anyway, if you happen to be reading this blog and are interested, please try it and give me some feedback.  Using the basic functions should not be difficult.  Or you can wait for a few days or weeks until I update the documentation (how to use).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-9222053028641742283?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/9222053028641742283/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=9222053028641742283' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/9222053028641742283'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/9222053028641742283'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/03/casualpconc-more-updates.html' title='CasualPConc more updates'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-974184592379251845</id><published>2009-03-29T18:59:00.000+09:00</published><updated>2009-03-29T19:32:59.518+09:00</updated><title type='text'>CasualPConc update</title><content type='html'>I'm almost certain no one has tried it yet, but I spent a little more time adding some more features to CasualPConc, a new parallel concordancer.  This application is available at the CasualConc main site under Utility Programs, but the documentation is not up-to-date. &lt;br /&gt;&lt;br /&gt;I don't think I'm going to make this as fancy as CasualConc, but I'm trying to use many more RubyCocoa (or Cocoa) features (I'm learning...).&lt;br /&gt;&lt;br /&gt;CasualPConc originally had just kwic and word frequency count features.  Now it has word cluster and collocation features just like CasualConc.  One feature specific to CasualPConc is finding keywords in the matched corpus after running a kwic search.  When you run a kwic search, you get the matched portion of the matched corpus (paragraphs or sentences), which includes words that are equivalent or similar to the one you searched for.  CasualPConc goes through the matched portion of text and calculates the keyness of words against the entire corpus.  I'm not sure if I've explained this clearly, and I'm also not sure if it works as intended, but it's there.&lt;br /&gt;&lt;br /&gt;I also added stop word/skip character functions.  My understanding is that stop words are words that are very frequent in a language, and eliminating them helps people see what they are looking for more clearly.  You can create stop word lists for any number of languages or corpora.  The skip characters function is for two-byte languages, like East Asian languages or more specifically Japanese, because the Japanese characters for period, comma, brackets, etc. are not treated as punctuation by regular expressions.  
They are treated as regular characters like letters of the alphabet, so they end up in word lists and context words and contaminate the results.  Both of these functions are experimental and not fully tested, and they are separate at this moment, but I might combine them into a single function or list.&lt;br /&gt;&lt;br /&gt;If you are interested, please try it and give me some feedback.  I personally don't use parallel concordancers much and I don't have good parallel corpora, so I can't really test it.  Any feedback is welcome (functionality, usability, bug report, etc.).  The current version is 0.3, but this simply means I have made two major changes/enhancements with some testing and bug fixing since version 0.1.&lt;br /&gt;&lt;br /&gt;Also if you use any of my applications, I'd really appreciate your feedback on them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-974184592379251845?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/974184592379251845/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=974184592379251845' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/974184592379251845'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/974184592379251845'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/03/casualpconc-update.html' title='CasualPConc update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5664215040112135594</id><published>2009-03-26T17:36:00.000+09:00</published><updated>2009-03-29T18:59:19.779+09:00</updated><title type='text'>A new project</title><content type='html'>As I didn't get much feedback and I was kind of busy, I didn't touch any of the programs (scripting) for a while.  But I recently did a small translation job and thought a parallel concordancer might help in that situation.  So I spent the last few days starting a new project.  It is a simple parallel concordancer for Mac OS X Leopard and I named it CasualPConc.&lt;br /&gt;&lt;br /&gt;Currently it doesn't do much (and possibly has many bugs), and because I don't really use parallel corpora, I don't have a good idea about how to develop it.  So I'd really appreciate any feedback.  I used to work as a translator for a short period of time, so if I just follow my intuition, it will be more like a database program for a translator or language learner.  The program is available on the CasualConc main site (&lt;a href="http://sites.google.com/site/casualconc/utility-programs/casualpconc"&gt;direct link&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;I don't expect many people to use it, so if you give me feedback, it is likely that the functions you request will be added (as long as I can handle them).  
Please email me directly, leave a comment here, or post on the Discussion Board.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5664215040112135594?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5664215040112135594/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5664215040112135594' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5664215040112135594'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5664215040112135594'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2009/03/new-project.html' title='A new project'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5739581068910935538</id><published>2008-12-28T16:56:00.000+09:00</published><updated>2008-12-28T17:00:45.820+09:00</updated><title type='text'>A few minor changes</title><content type='html'>This will probably be the last update of CasualConc this year.&lt;br /&gt;&lt;br /&gt;I made a few minor cosmetic changes and fixed a few minor bugs.  The latest version is 0.9.9.2.  If you don't have any problem with 0.9.9.1, you don't need to update to this version.  
I just wanted to make the changes before I forget.&lt;br /&gt;&lt;br /&gt;Happy new year!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5739581068910935538?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5739581068910935538/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5739581068910935538' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5739581068910935538'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5739581068910935538'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/12/few-minor-changes.html' title='A few minor changes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2373000693510453392</id><published>2008-12-10T10:47:00.000+09:00</published><updated>2008-12-28T17:01:45.943+09:00</updated><title type='text'>Bug fix and few final touch to CasualTranscriber</title><content type='html'>I found a bug when I was showing it to someone today.  When I added a function to dynamically change menu items based on Preference settings (App Mode), I set some of them to turn off when they should be on.  I think I fixed this.  I also changed the time stamp insertion.  I finally figured out how to override link clicking behavior (links open with a browser by default).  
So now you can click a time stamp in the Editor to go to that time in the movie/sound clip.  You can still select the time code and use a shortcut (you don't need to use a mouse this way).  The latest version is 0.9.9.1.&lt;br /&gt;&lt;br /&gt;But today, my friend showed me there IS software just like CasualTranscriber...  It is called &lt;span style="font-weight: bold;"&gt;InqScribe&lt;/span&gt;.  Well, what was shocking was that it looks almost the same as CasualTranscriber.  I mean the layout, functions, etc.  I haven't tried it (it's $99 with a 30-day free trial, $39 for students), so I don't know how good it is, but it is probably better considering its price and how little time I spent on CasualTranscriber (less than three weeks).  So if you are more interested in adding subtitles, you might want to take a look at it.   Probably the only advantage of CasualTranscriber is the cost...&lt;br /&gt;&lt;br /&gt;Anyway, if you try CasualTranscriber, I'd really like to hear from you.  And because there aren't many users (probably only a few), your feature request might be taken seriously unless it's too complicated for me to handle.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2373000693510453392?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2373000693510453392/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2373000693510453392' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2373000693510453392'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2373000693510453392'/><link rel='alternate' type='text/html' 
href='http://casualconc.blogspot.com/2008/12/bug-fix-and-few-final-touch-to.html' title='Bug fix and few final touch to CasualTranscriber'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-4499759211386664583</id><published>2008-12-07T18:36:00.001+09:00</published><updated>2008-12-07T18:42:34.676+09:00</updated><title type='text'>CasualTranscriber Development is wrapping up</title><content type='html'>I worked on CasualTranscriber a little more and I finally found a major source of crashes.  It crashes less now (I hope).  I also added a separate player in case someone wants to use a controller for movie playback (such as in class).  I also found that OS X can handle the MS Word format (text only), so I added a function to handle MS Word documents.&lt;br /&gt;&lt;br /&gt;I personally think this is a program to help with transcription and that adding subtitles is a secondary feature, but it looks like my friend uses it primarily for adding subtitles.  So I spent a little more time adding subtitles in a different format (not as a text track). &lt;br /&gt;&lt;br /&gt;I hope this program is useful for more people.  
Intended users are language teachers and conversation analysts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-4499759211386664583?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/4499759211386664583/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=4499759211386664583' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4499759211386664583'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4499759211386664583'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/12/casualtranscriber-development-is.html' title='CasualTranscriber Development is wrapping up'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1475013750542496309</id><published>2008-12-05T16:41:00.000+09:00</published><updated>2008-12-05T16:49:36.978+09:00</updated><title type='text'>Bug fixes to CasualTranscriber</title><content type='html'>Thanks to my friend who is testing CasualTranscriber, I was able to fix many bugs.  The program is still not totally stable, but it sounds like it is usable.  I implemented almost all the features I can think of (or I can manage) now and from now on I'll fix bugs and might add some features if I get feedback.  
The current version is 0.6, but it just means I've made some significant changes 6 times since I started working on this. &lt;br /&gt;&lt;br /&gt;In any case, if you ever use it and find any bugs or think of something very cool, please let me know.  I'm not sure what I'll do next, and since I don't get much feedback, I don't know what I should do with CasualConc.  I want to make it handle 2-byte languages better and possibly add parallel concordancing, but it takes time to think about how to implement them.  I have a few more ideas about small programs, but maybe I have to get more serious about my own work...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1475013750542496309?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1475013750542496309/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1475013750542496309' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1475013750542496309'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1475013750542496309'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/12/bug-fixes-to-casualtranscriber.html' title='Bug fixes to CasualTranscriber'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3971659901404614673</id><published>2008-12-01T17:05:00.000+09:00</published><updated>2008-12-01T17:24:53.741+09:00</updated><title type='text'>CasualTranscriber</title><content type='html'>I know not many people (probably only a few, mostly people I know personally) have tried CasualTranscriber.  It's still very early in development, but I've basically rewritten the program from a window-based to a document-based application.  Now it deals with a rich text or plain text file just like a text editor or word processor.  When you close a window, that means you close the document in it.&lt;br /&gt;&lt;br /&gt;In addition to the original features (shortcut control of movie/sound clip and extraction of a selected part), you can add chapters, extract a frame image, add subtitles, use the application just as a player, etc.  I can't list all the features, but they are all on the new CasualConc site under Utility Programs.  Some of the features (esp. adding subtitles) are still experimental and the application itself is not as stable as I want it to be (so now it has an autosave function).  This is partly because I'm still new to QTKit (QuickTime Kit) and not all QuickTime functions are available via QTKit.  I'm using RubyCocoa, which is a bridge between Ruby, a scripting language, and Cocoa, Mac's application framework, so there is another layer of issues there.&lt;br /&gt;&lt;br /&gt;I want to make this program easy to use for teachers and researchers (to-be) who need to transcribe conversations, speech, movie clips, etc. for teaching/researching.  
And if you ever use this, I'd really appreciate your feedback.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3971659901404614673?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3971659901404614673/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3971659901404614673' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3971659901404614673'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3971659901404614673'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/12/casualtranscriber.html' title='CasualTranscriber'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2557903721088504274</id><published>2008-11-20T21:02:00.000+09:00</published><updated>2009-03-06T03:21:30.322+09:00</updated><title type='text'>New Utility</title><content type='html'>I started another experiment and started to write a new utility program.  It's named CasualTranscriber.  It's a simple utility program to assist transcription of movie/sound files (text).  I googled and found some good free ones, but because I wanted to learn a bit about QTKit (QuickTime Kit) with RubyCocoa, I wrote it.  It's very simple at the moment and at a very early stage of development.  
But if you are interested, please check it out and let me know what you think.  If enough people are interested, I will develop it further.  Here's the &lt;a href="http://sites.google.com/site/casualconc/utility-programs/casualtranscriber"&gt;direct link&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;As I wrote in the previous post, I started a new CasualConc site on Google Sites.  And I want to introduce it here, though only the English site is available at this moment (no Japanese site yet).  All the program files are still on the old site, but the new site has much more information, especially about utility programs (with screenshots).  You can access the new site from &lt;a href="http://sites.google.com/site/casualconc/Home"&gt;this link&lt;/a&gt;.  The CasualTranscriber page has much more information.  So please check it out there.&lt;br /&gt;&lt;br /&gt;As always, I'd appreciate any feedback on any of the programs and also the sites.  If you have any suggestion/comment/bug report, please leave a comment on this blog, add a new topic to the Discussion Board, or send an email to casualconc (at) gmail.com.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2557903721088504274?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2557903721088504274/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2557903721088504274' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2557903721088504274'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2557903721088504274'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/11/new-utility.html' 
title='New Utility'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6126087638774667758</id><published>2008-11-12T18:01:00.000+09:00</published><updated>2008-11-12T19:03:09.147+09:00</updated><title type='text'>Bug fixes</title><content type='html'>As you might have noticed, the current CasualConc site is hosted on Google Page Creator.  But Google decided to stop this service and focus on Google Sites.  So I've been transferring site contents to a new site on Google Sites.  In the process, I've been updating the content and adding a little more information to some pages.  So far, I've created the English site, but haven't started the Japanese site.  I'm not sure how many people prefer to have Japanese pages, but I will eventually create a Japanese site (I personally prefer reading in Japanese).&lt;br /&gt;&lt;br /&gt;While updating the content, I used the basic functions of all the current programs and found many bugs.  Most of them were minor; some were major, but did not affect the main features.  I fixed so many things in the last couple of weeks (as well as adding new functions) that I can't track all the fixes/changes, but here's a list of a few of them.&lt;br /&gt;&lt;br /&gt;CasualConc:&lt;br /&gt;- Context words function in Concord was broken, but now it should be working.&lt;br /&gt;- Keyword grouping function was fixed&lt;br /&gt;- Keyword grouping only worked when the search was for a group of words.  Now keyword groups can be used in a phrase search&lt;br /&gt;- Lemma in search word should work with wild card/phrase search now&lt;br /&gt;- You can import a word list not created by CasualConc now.  
It accepts a CSV or tab-delimited file with words in one column and frequencies in the other.  This allows you to import a word list created by another program or script.&lt;br /&gt;&lt;br /&gt;CasualTextractor:&lt;br /&gt;- In PDF/Web/Document, most changes should be undoable.  I changed the function to draw text in the text area.&lt;br /&gt;- Batch process was not working, but it should now.&lt;br /&gt;&lt;br /&gt;I also made minor changes to CasualTagger and CasualMecab, but I forgot what I did.  Most of them are bug fixes.&lt;br /&gt;&lt;br /&gt;If you downloaded any of the programs and found bugs, they might have been fixed by now.  If not, please report them to me.  You can add your comment to the post on this blog, email me, or post on the Discussion Board (Google Groups).  If you have any good ideas/suggestions for any of the programs, I'd appreciate your feedback.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6126087638774667758?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6126087638774667758/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6126087638774667758' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6126087638774667758'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6126087638774667758'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/11/bug-fixes.html' title='Bug fixes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3331471968470080521</id><published>2008-11-11T09:56:00.000+09:00</published><updated>2008-11-11T10:00:57.221+09:00</updated><title type='text'>New experiment (UPDATE)</title><content type='html'>This is an update to the post, &lt;a href="http://casualconc.blogspot.com/2008/11/new-experiment.html"&gt;New experiment&lt;/a&gt;.  I implemented a function to automatically update CasualConcData.ccdb file.  So I dropped the beta beta version.  The official beta version is 0.9.9.1.  Please check the &lt;a href="http://casualconc.blogspot.com/2008/11/new-experiment.html"&gt;previous post&lt;/a&gt; for the details of the last update.&lt;br /&gt;&lt;br /&gt;If you try this new function, please let me know what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3331471968470080521?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3331471968470080521/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3331471968470080521' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3331471968470080521'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3331471968470080521'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/11/new-experiment-update.html' title='New experiment (UPDATE)'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7486499214616878086</id><published>2008-11-09T18:13:00.000+09:00</published><updated>2009-03-19T01:42:48.404+09:00</updated><title type='text'>Link Grammar</title><content type='html'>I found this syntactic parser (&lt;a href="http://www.link.cs.cmu.edu/link/"&gt;site&lt;/a&gt;) the other day and I also found there is a Ruby binding called Ruby LinkParser (&lt;a href="http://deveiate.org/projects/Ruby-LinkParser"&gt;site&lt;/a&gt;).  The instructions on the Ruby LinkParser site looked pretty simple and I thought I would be able to install it and start using Link Grammar from Ruby in no time.&lt;br /&gt;&lt;br /&gt;Well, I was wrong.  Maybe because the instructions were written for Unix/Linux users, they didn't work on Mac OS X Leopard.  I spent half the day somehow managing to install it on my Mac.  But the process is ugly.  I don't recommend it to anyone who is not very familiar with the Mac OS X system.  I hope the original authors can fix the problem and/or write a patch for the latest version of Link Grammar.&lt;br /&gt;&lt;br /&gt;Anyway, I put step-by-step instructions with a lot of screen captures on the CasualConc site.  If you are interested, please check &lt;a href="http://sites.google.com/site/casualconc/utility-programs/install-ruby-linkparser"&gt;this page&lt;/a&gt;.  If you know or figure out a better way to install them, please, please let me know.  In fact, I'm not sure if it's working as it should.&lt;br /&gt;&lt;br /&gt;In the future, I want to add a function to CasualTagger to do some simple syntactic parsing using Link Grammar/Ruby LinkParser.  But first I need to figure out which functions to add and how. 
Yoichiro Hasebe wrote a program (a port of phpSyntaxTree), RSyntaxTree, that draws a tree diagram from a syntactically parsed sentence (&lt;a href="http://yohasebe.com/rsyntaxtree/"&gt;check this page&lt;/a&gt;), such as [S [NP RSyntaxTree][VP [V generates][NP multilingual syntax trees]]].  So I want to add a function to output this format (I'm not even sure if Ruby LinkParser does it or not).&lt;br /&gt;&lt;br /&gt;By the way, for those who are not sure what a syntactic parser is, here's a sample.&lt;br /&gt;&lt;br /&gt;The original sentence is:&lt;br /&gt;&lt;br /&gt;Ruby is a dynamic, open source programming language.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_aDr7cQQzDwI/SRau_0JXD6I/AAAAAAAAAAU/BmgILvp3MuM/s1600-h/linkparser_sample.png"&gt;&lt;img style="cursor: pointer; width: 320px; height: 205px;" src="http://3.bp.blogspot.com/_aDr7cQQzDwI/SRau_0JXD6I/AAAAAAAAAAU/BmgILvp3MuM/s320/linkparser_sample.png" alt="" id="BLOGGER_PHOTO_ID_5266589225609269154" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If you try to install this, please let me know if it works for you.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7486499214616878086?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7486499214616878086/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7486499214616878086' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7486499214616878086'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7486499214616878086'/><link rel='alternate' 
type='text/html' href='http://casualconc.blogspot.com/2008/11/link-grammar.html' title='Link Grammar'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_aDr7cQQzDwI/SRau_0JXD6I/AAAAAAAAAAU/BmgILvp3MuM/s72-c/linkparser_sample.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8177691482401299965</id><published>2008-11-06T17:49:00.000+09:00</published><updated>2008-11-11T09:50:02.804+09:00</updated><title type='text'>New experiment</title><content type='html'>I added a new experimental feature to CasualConc.  It is a very limited form of XML information tag handling.  What I mean is that if you have XML files or XML-formatted plain text files that carry information as tag attributes or elements, you can filter the files by that information.&lt;br /&gt;&lt;br /&gt;The current implementation can handle two types:&lt;br /&gt;&lt;br /&gt;&amp;lt;header attr1="~" attr2="~"&amp;gt;&amp;lt;/header&amp;gt;&lt;br /&gt;&lt;br /&gt;or&lt;br /&gt;&lt;br /&gt;&amp;lt;header&amp;gt;&amp;lt;attr1&amp;gt;~&amp;lt;/attr1&amp;gt;&amp;lt;attr2&amp;gt;~&amp;lt;/attr2&amp;gt;&amp;lt;/header&amp;gt;&lt;br /&gt;&lt;br /&gt;So if your files have:&lt;br /&gt;&lt;br /&gt;&amp;lt;info date="11052008" title="Presidential Election"&amp;gt;&lt;br /&gt;&lt;br /&gt;or&lt;br /&gt;&lt;br /&gt;&amp;lt;info&amp;gt;&lt;br /&gt;&amp;lt;date&amp;gt;11052008&amp;lt;/date&amp;gt;&lt;br /&gt;&amp;lt;title&amp;gt;Presidential Election&amp;lt;/title&amp;gt;&lt;br /&gt;&amp;lt;/info&amp;gt;&lt;br /&gt;&lt;br /&gt;CasualConc can preselect the files based on your query.  
Check &lt;a href="http://casualconc.googlepages.com/howtouse-xmlinfotaghandling"&gt;this page&lt;/a&gt; for more information.&lt;br /&gt;&lt;br /&gt;&lt;strike&gt;Because of the other changes I made to SQLite database handling, CasualConc is incompatible with the corpora/databases you created in the Advanced File Handling Mode (it shuts down when it tries to read the CasualConcData.ccdb file in ~/Library/Application Support/CasualConc).  If you have used a prior version, you need to delete/move/rename it.  So this beta of beta is not linked to any pages on the site.  If you are interested in testing this highly experimental beta-beta version, please go to this download page directly.&lt;/strike&gt;  (I implemented a function to automatically update the CasualConcData.ccdb file.  Now this version is downloadable from the regular download page.)  I would really appreciate it if you could give me feedback, especially on this new function.  If you have any suggestions, please make them detailed.  I personally haven't used XML files, so I'm not sure if this is useful.  
The more detailed your suggestion is, the more likely it is to be added to CasualConc (no guarantee, though).&lt;br /&gt;&lt;br /&gt;Apart from this beta-beta, I fixed one bug in collocation coloring in the regular beta version of CasualConc (sounds weird...).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8177691482401299965?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8177691482401299965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8177691482401299965' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8177691482401299965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8177691482401299965'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/11/new-experiment.html' title='New experiment'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1827202936564365663</id><published>2008-10-22T17:17:00.000+09:00</published><updated>2008-10-22T17:58:05.095+09:00</updated><title type='text'>CasualConc Update</title><content type='html'>I've been working on CasualConc and uploaded 0.9.9 to the site.  Well, I'm not sure when to put 1.0, so I might go with 0.9.9.1...&lt;br /&gt;&lt;br /&gt;Most of the changes I implemented are internal, along with many bug fixes.  
I didn't really touch the core tools, so most of the work was done on file handling.&lt;br /&gt;&lt;br /&gt;Here are some of the changes you might (or might not) notice:&lt;br /&gt;&lt;br /&gt;Plain Text File encoding&lt;br /&gt;You can now set a default text encoding on the File view (no need to open Preferences).  You can specify a default encoding before you open files, which applies to all the files you add to the file list table.  But now you can also change it on the table.  This means you can select a different text encoding for each file.  I also added ISO Latin 1 and ISO Latin 2 to the encoding list.&lt;br /&gt;&lt;br /&gt;Open/Save panel&lt;br /&gt;This change probably will not make a difference to most people.  I just wanted to change it to Genie panel(?) because I learned how to.&lt;br /&gt;&lt;br /&gt;Collocation&lt;br /&gt;The most frequent position for each context word is now colored in red.&lt;br /&gt;&lt;br /&gt;Bug fixes&lt;br /&gt;- Exporting/Saving results should work fine now, though I'm not sure how many people have ever tried to use this function.  I also made changes to accommodate the changes in Collocation.&lt;br /&gt;- Fixed crashes when you hit space (or maybe some other keys) on the blank table.  You might have never done such a stupid thing, but I found this when I accidentally hit the space bar in Advanced Corpus File Handling mode.&lt;br /&gt;&lt;br /&gt;Also, as I noted on the site, your preferences settings will be lost if you have used the previous versions.  If you want to keep them, change the name of the preference file "com.apple.rubycocoa.CasualConcApp.plist" in your home -&gt; Library -&gt; Preferences folder to "CasualConcApp.plist".  
Except for tag ignore settings, your preferences settings should be safe.&lt;br /&gt;&lt;br /&gt;Along with this change, I also added ISO Latin 1 and ISO Latin 2 to the list of encodings (open/save) in CasualTextractor.&lt;br /&gt;&lt;br /&gt;If you find any of these attractive or are bothered by bugs, please try the latest version and let me know what you think about it.  Bug reports are also welcome.&lt;br /&gt;&lt;br /&gt;By the way, I haven't updated all the documentation yet.  Some of it is quite old.  I guess I have to find time and update it (or rearrange it).  I read somewhere that Google is moving the content of Google Page Creator to Google Sites.  That might be a good time to update documentation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1827202936564365663?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1827202936564365663/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1827202936564365663' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1827202936564365663'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1827202936564365663'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/10/casualconc-update.html' title='CasualConc Update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1367523280465191295</id><published>2008-10-18T17:14:00.000+09:00</published><updated>2008-10-18T17:37:33.932+09:00</updated><title type='text'>Utility Programs updates</title><content type='html'>I've been experimenting with some Cocoa UIs and bindings, and adding the features I learned to the utility programs.  I made minor changes to CasualTextractor and CasualMecab.  I also made a few more changes and added some new functions to CasualTagger.  I now need to manually tag a lot of texts, so I'm trying to make it a tool to help with manual tagging.  I copied regular expression replace from CasualConc (hidden feature) and added simple tag coloring.  Now it also has a simple word/tag count and kwic concordance of a single file.&lt;br /&gt;&lt;br /&gt;I'm also working on CasualConc.  As I wrote in the last post, I will probably clean up some old code.  Tag handling might take some time to implement because I have to think about how to handle tags in CasualConc.  Any ideas?  What I'll probably do first is change/fix file handling elements.  In utility programs, you can now change the character encoding of plain text files after you add them to the table.  This will allow you to use text files with different encodings.&lt;br /&gt;&lt;br /&gt;Another minor change is coloring in Collocation.  Now the most frequent position for each context word will be colored in red.  This seems to be working fine, so you will see this feature in the next update.&lt;br /&gt;&lt;br /&gt;In addition to RubyCocoa programs, I wrote a simple javascript-based parallel concordancer, which I was asked to write.  I based it on my old javascript-based concordancer, so not much scripting was involved.  
I did this because I'm thinking about writing a parallel concordancer for Mac, as I wrote on this blog before, so I wanted to know what the most fundamental features of a parallel concordancer are.  I googled and wrote it based on some parallel concordancers out there.  It only creates a table with matched texts based on the search.  It also creates kwic results and you can select one to show the matching text.  But what else is necessary for a parallel concordancer for Mac?  If you have any suggestions, please leave a comment here, send me an email, or post on the CasualConc Google Discussion Board.  If you can give me enough information to figure out how to implement your requests, you will have a better chance of seeing them, though it also depends on my scripting skills.&lt;br /&gt;&lt;br /&gt;Anyway, please check out the utility programs and let me hear what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1367523280465191295?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1367523280465191295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1367523280465191295' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1367523280465191295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1367523280465191295'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/10/utility-programs-updates.html' title='Utility Programs updates'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1586087837358121746</id><published>2008-10-13T17:49:00.000+09:00</published><updated>2008-10-13T18:24:29.485+09:00</updated><title type='text'>A few bug fixes</title><content type='html'>I made a few more bug fixes and some internal changes.  The somewhat major bug I fixed was in the aligning and coloring of text on the Concord table and text view.  If you use CasualConc only with English, you probably didn't see any problem, but if you deal with text with a lot of non-standard alphabet characters, the display was ugly.  Now it's better (not perfect).  There is still a problem with languages that combine more than one character to display a single character on screen.  Other than that, displaying text is less problematic.&lt;br /&gt;&lt;br /&gt;The major internal change I made is using Shared User Defaults Controller to save Preference settings.  This saved a lot of code, but at the same time, it is not perfect.  Somehow it doesn't remember the changes made by scripts, so for some text data, I have to use a script to save the data properly.  But I might have done something wrong, so if you find any bugs related to Preferences, please let me know.&lt;br /&gt;&lt;br /&gt;I also made a major change to CasualTagger.  CasualConc had hidden functions to help manual tagging, which I have never turned on officially.  This was because I didn't have time to finalize them and fix bugs, so I decided not to make them available.  Now I have taken some of those features and added them to CasualTagger.  I haven't documented them, but I included simple instructions in the application as a help file.  If you are interested, please take a look at it.  
CasualTagger is on the main site under Utility Programs.&lt;br /&gt;&lt;br /&gt;I also made more changes to IPATypist, which not a lot of people use.  And I guess those people who have downloaded it might not read this, so they don't know if it's updated or not (though I'm not sure if they keep using it).&lt;br /&gt;&lt;br /&gt;That's about it for now.  I'm also thinking about adding tag handling features to CasualConc, but it doesn't look promising.  I once wrote experimental scripts to handle some types of tags, but they don't work very well.  Now if I want to seriously add this feature, I have to get rid of the old ones first.  It's not very easy...  Also there are a lot of weird scripts in CasualConc because it includes some code I wrote when I was just starting to learn Ruby.  I guess I have to clean up old messes first before I add any significant new features.&lt;br /&gt;&lt;br /&gt;In any case, if you use CasualConc and/or other utility programs, please let me know what you think.  The current priority is adding tag handling features.  East Asian language support might be dropped.  It would be a separate program.  Some people asked about a parallel concordancer, so I might write a separate one for it, but I still don't have enough information to go ahead.  If you'd like to see a parallel concordancer for Mac, please give me information.  You can email me directly or make a comment on this blog or post on the CasualConc Discussion Board.  
I need to know what are the most fundamental features and how they should be implemented.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1586087837358121746?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1586087837358121746/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1586087837358121746' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1586087837358121746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1586087837358121746'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/10/few-bug-fixes.html' title='A few bug fixes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7055645296691346010</id><published>2008-10-10T17:52:00.000+09:00</published><updated>2008-10-10T18:00:08.880+09:00</updated><title type='text'>IPATypist update</title><content type='html'>This is not related to corpus analysis, so I guess almost no one is interested, but as my memo, I write down what I did.&lt;br /&gt;&lt;br /&gt;IPATypist is a very simple utility program to type IPA phonetic alphabets.  With this update, I added a database function to it.  Now phonetic transcriptions can be stored so you don't have to type them again.  
But this is mainly my experiment in using CoreData, an OS X framework to easily handle database type programs.  I just started looking at it, so I'm still not sure if I did this right or wrong, but it looks like it's working. &lt;br /&gt;&lt;br /&gt;If you happen to be one of the rare people who are interested in this program, please check it out and let me know what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7055645296691346010?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7055645296691346010/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7055645296691346010' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7055645296691346010'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7055645296691346010'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/10/ipatypist-update.html' title='IPATypist update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8276884349455180809</id><published>2008-10-09T16:08:00.000+09:00</published><updated>2008-10-09T16:33:09.724+09:00</updated><title type='text'>Utility Programs</title><content type='html'>I updated a couple of utility programs, including changing names, and added a new utility program to the main site.&lt;br /&gt;&lt;br /&gt;Two of the utilities 
are TextExtractor and jparser.  These two names are already used for program/module names, so I decided to change them.  I use the same Casual~ naming for them: TextExtractor is now CasualTextractor and jparser is now CasualMecab.  Well, it's obvious I didn't spend much time on this...  Both utilities had a few bug fixes and minor feature changes, which are not obvious.&lt;br /&gt;&lt;br /&gt;And a new addition is a POS tagger.  I finally found an English POS tagging module for Ruby.  The module is EngTagger by &lt;a href="http://yohasebe.com/2008/05"&gt;Yoichiro Hasebe&lt;/a&gt;, which is a Ruby port of Perl's Lingua::En::Tagger module.  I simply added a GUI and a function to process multiple files.  You can also select a tag type (the default of EngTagger is xml format).  I named it CasualTagger and you can download it from the CasualConc main site (here's the direct link to the page: &lt;a href="http://casualconc.googlepages.com/casualtagger"&gt;English&lt;/a&gt;/&lt;a href="http://casualconc.googlepages.com/casualtagger2"&gt;Japanese&lt;/a&gt;).  To use CasualTagger, you need to install EngTagger, but it's very easy from Terminal.app (a single command line).  For more information about EngTagger (tag sets, etc.), check &lt;a href="http://engtagger.rubyforge.org/"&gt;this page&lt;/a&gt;.  As always, any feedback is welcome.&lt;br /&gt;&lt;br /&gt;Now I need to seriously think about adding a tag-handling feature to CasualConc.  But how??  If you have any suggestions, leave a comment on this blog or email me (my email address is on the main site).&lt;br /&gt;&lt;br /&gt;Also a couple of people asked for a parallel concordancer (for Mac and with Javascript).  But what are the most fundamental functions?  
Any suggestion/comment about a parallel concordancer is also welcome.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8276884349455180809?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8276884349455180809/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8276884349455180809' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8276884349455180809'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8276884349455180809'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/10/utility-programs.html' title='Utility Programs'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6026478400002307459</id><published>2008-09-19T09:24:00.001+09:00</published><updated>2009-09-09T10:41:51.989+09:00</updated><title type='text'>TextExtractor</title><content type='html'>I haven't had time to write any script at all for a while, but I've found some time to experiment with RubyCocoa.  The stuff I'm working on won't change anything on the surface of CasualConc.  The changes will be mostly internal and slow.&lt;br /&gt;&lt;br /&gt;The very few feature requests I got are for XML file handling and parallel concordancing.  But I don't have any experience in these, so I need more information.  
I added two threads to the Google Groups discussion board about these two.  If people ever read this and give me information about these two, I might add them or write a separate program for them (parallel concordancer).&lt;br /&gt;&lt;br /&gt;Another thing I'm considering about CasualConc is dropping East Asian Languages support.  I don't hear from anybody who uses CasualConc for Japanese/Korean/Chinese, so I don't know if I really want to keep trying to accommodate these functions in CasualConc.  It would probably be easier for me to maintain if I separated the East Asian Languages concordancer (esp. kwic) into another program.  I'll think about this more, but if you have any suggestions, let me know.&lt;br /&gt;&lt;br /&gt;Anyway, as a part of my experiment with RubyCocoa, I updated TextExtractor, a utility program to extract text data from various text-embedded files and to convert the text encoding of plain text files to UTF-8.  In case you haven't looked at the utility program section of the CasualConc site, I have a few utility programs that deal with text files.  I combined two of them (PDF to Text and HTML to Text converters) and added a few extra functions.  The first version (0.1) of TextExtractor had a function of jparser (a Japanese parsing program using MeCab), but it didn't run without MeCab and MeCab-Ruby.  So I dropped this function.&lt;br /&gt;&lt;br /&gt;Instead, I made that part simply convert non-UTF-8 text files (.txt) to UTF-8 text files and MS Word, PDF, HTML, OpenOffice documents to UTF-8 text files or Rich Text Format files.  All other parts (PDF to Text, Web file to Text, and batch process) can save files as RTF files.  When you convert files to RTF files, you can either keep text/font information of the original files (fonts/font style/etc.) or throw away this info and save as plain text in an RTF file.&lt;br /&gt;&lt;br /&gt;I also added basic instructions in English (not translated to Japanese yet).  
So if you are interested, please try it and let me know what you think.&lt;br /&gt;&lt;br /&gt;EDIT: This program is renamed as CasualTextractor&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6026478400002307459?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6026478400002307459/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6026478400002307459' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6026478400002307459'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6026478400002307459'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/09/textextractor.html' title='TextExtractor'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-4331103231907722172</id><published>2008-09-07T14:25:00.000+09:00</published><updated>2008-09-07T14:35:01.515+09:00</updated><title type='text'>A few very minor changes</title><content type='html'>Well, obviously, I haven't done anything with CasualConc for the last three months.  I finally announced this on Corpora List and some people got interested and started testing CasualConc.  But I heard from only a few people.  Still it's good to know someone uses it and likes it. &lt;br /&gt;&lt;br /&gt;I made a few minor changes/bug fixes to CasualConc.  
The only one that's worth mentioning here is that now CasualConc remembers the files you selected in File Mode when you quit the program.  The next time you start CasualConc, the files you selected last time should be on the file list.  Now the version number is 0.9.8 beta. &lt;br /&gt;&lt;br /&gt;As always, I'd like to know what you think about the program.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-4331103231907722172?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/4331103231907722172/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=4331103231907722172' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4331103231907722172'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4331103231907722172'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/09/few-very-minor-changes.html' title='A few very minor changes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7440121818755483617</id><published>2008-06-08T01:32:00.000+09:00</published><updated>2008-06-08T02:08:34.019+09:00</updated><title type='text'>Bug fixes</title><content type='html'>I finally got a bug report.  
Now I know at least a few more people are using CasualConc.&lt;br /&gt;&lt;br /&gt;The bugs are related to the recent changes I made to Lemmatization and Collocation.&lt;br /&gt;&lt;br /&gt;The bug related to lemmatization was that when lemmatization was activated without specifying a lemma file, CasualConc crashed.  This was because CasualConc looked for a lemma file when it started or returned from the preferences and if the file was not found, it crashed. &lt;br /&gt;&lt;br /&gt;The two bugs related to collocation were 1) it didn't run in file mode, and 2) search in concord didn't work when 'Treat Keywords as One Word' option is activated in preferences.  These should be fixed and work properly now.&lt;br /&gt;&lt;br /&gt;I would appreciate any report of bugs.  And I'd like to know how you like CasualConc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7440121818755483617?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7440121818755483617/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7440121818755483617' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7440121818755483617'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7440121818755483617'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/06/bug-fixes.html' title='Bug fixes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7338221449842933240</id><published>2008-05-30T10:31:00.000+09:00</published><updated>2008-05-30T10:41:24.405+09:00</updated><title type='text'>A minor change</title><content type='html'>I found a bug (sort of) in Concord a couple of days ago.  It's a minor bug that happens only when you use the database mode in Concord.  Well, it's more of a memory leak.  I had implemented a forced garbage collection when full text is displayed in the context view of Concord, but somehow the memory was not released.  So I changed the way the text is read from a database file.  Now it should not keep using additional memory when you select a different concordance line to show full text.&lt;br /&gt;&lt;br /&gt;I use the same technique to read data from a database file when CasualConc searches a string, but when I applied the same change to the search function, it used more memory because a search returns more hits.  What this means is that if you search for word(s)/phrase(s) in any of the tools many times, CasualConc keeps using more memory.  I haven't tested whether it uses up all the available memory and starts using virtual memory, or whether Ruby starts GC when it uses up all the available physical memory.  
In any case, until I can find a way to solve this problem, you might want to quit CasualConc after a while and restart it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7338221449842933240?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7338221449842933240/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7338221449842933240' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7338221449842933240'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7338221449842933240'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/minor-change.html' title='A minor change'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6961721115279455098</id><published>2008-05-20T11:46:00.000+09:00</published><updated>2008-05-20T11:55:29.172+09:00</updated><title type='text'>A few more fixes again</title><content type='html'>I fixed a few more bugs this weekend that are mainly related to lemmatization and collocation statistics.  I also added some more documentation to the main site (some in Japanese).  The latest version is still 0.9.7 but the date is 05192008.&lt;br /&gt;&lt;br /&gt;Now, most of the features I wanted to include in CasualConc are there and mostly functioning.  
I don't have time to improve the Japanese kwic feature now, so that will have to wait until sometime in summer or fall.  And unless I find or someone reports any major bugs, I will try not to spend too much time on this for a while.  I don't know how many people actually downloaded CasualConc and are using it, but I guess there aren't many.  If you happen to be one of them, I'd like to hear what you think about it. &lt;br /&gt;&lt;br /&gt;Well, I might need to publicize this a bit more, so I might start trying to get more beta testers somewhere.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6961721115279455098?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6961721115279455098/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6961721115279455098' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6961721115279455098'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6961721115279455098'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/few-more-fixes-again.html' title='A few more fixes again'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-914035463366479600</id><published>2008-05-18T06:33:00.000+09:00</published><updated>2008-05-18T06:45:15.576+09:00</updated><title type='text'>Another bug fix and 
minor update</title><content type='html'>I highly doubt anyone has downloaded CasualConc recently, but anyway, I found a few bugs and also made some changes that I had wanted to make for a while.  Now the latest version is 0.9.7 beta.&lt;br /&gt;&lt;br /&gt;First, I found a bug in Japanese Concordance, which I'm sure nobody has ever used.  When I dropped the text only mode, I forgot to change it in Japanese concordance mode.  Now it should be fixed.  I also fixed some other bugs that relate to the recent feature changes.&lt;br /&gt;&lt;br /&gt;The changes I made are mainly with Collocation.  Now, if you search for multiple words or use wildcard search and multiple words are found, collocation information will be displayed for each keyword.  This change affected statistics calculation, so I made the changes to it that I think are necessary.&lt;br /&gt;&lt;br /&gt;I also made a minor change to the Export result function of Concord.  Originally, a CSV file exported from Concord only included kwic results and file paths.  Now it has an option to include context words (L5 - R5).  To include them, go to Preferences -&gt; Concord and check the box Include context words (L5 - R5) in CSV output.&lt;br /&gt;&lt;br /&gt;As always, if you happen to find this blog or the main page and try CasualConc, any feedback (including bug reports) is welcome. 
Especially if you find it useful, I'd like to know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-914035463366479600?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/914035463366479600/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=914035463366479600' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/914035463366479600'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/914035463366479600'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/another-bug-fix-and-minor-update.html' title='Another bug fix and minor update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2533963221709764178</id><published>2008-05-15T11:55:00.000+09:00</published><updated>2008-05-15T12:06:25.925+09:00</updated><title type='text'>Another quick fix</title><content type='html'>I found a minor (maybe major to someone if anyone ever uses CasualConc) bug and fixed it today.  This only affects you if you use Concord with non-plain text files as your corpus files.  And this only happens in the paragraph mode (the default mode).  Now you should be able to use other file types as your corpus files with Concord and in the paragraph mode.&lt;br /&gt;&lt;br /&gt;I found this bug when I was testing .odt files.  
After the fix, I was able to use .odt files as corpus files, so this confirms CasualConc can read .odt files!&lt;br /&gt;&lt;br /&gt;I don't know how many people are affected (I know not many people), but if you downloaded CasualConc in the last couple of weeks, please go to the site and download the latest version.  It has the same version number (0.9.6), but different date (05142008).&lt;br /&gt;&lt;br /&gt;And if you find any other bugs, please let me know.  The email address is on the main site.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2533963221709764178?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2533963221709764178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2533963221709764178' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2533963221709764178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2533963221709764178'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/another-quick-fix.html' title='Another quick fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2077814045501117335</id><published>2008-05-13T05:15:00.000+09:00</published><updated>2008-05-13T05:47:11.288+09:00</updated><title type='text'>Some details of last update</title><content type='html'>As I 
mentioned in the last post, I added/activated a couple of new features on CasualConc.  One is based on the lemmatizing function and the other is something with Concord.&lt;br /&gt;&lt;br /&gt;The first, which is based on the lemmatizing function, is keyword grouping, or whatever name I settle on (it has a tentative label).  To use it, first prepare a text file (UTF-8) in the same format the lemmatizer accepts.  The default is:&lt;br /&gt;&lt;br /&gt;keyword -&gt; word,word,word,...&lt;br /&gt;&lt;br /&gt;The keyword is a grouping label, so if you want to group the days of the week, it looks like:&lt;br /&gt;&lt;br /&gt;week -&gt; Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday&lt;br /&gt;&lt;br /&gt;Once you have prepared as many groups as you want, save the file as plain text with UTF-8 encoding.  Then, go to &lt;span style="font-weight: bold;"&gt;Preferences&lt;/span&gt; on CasualConc -&gt; &lt;span style="font-weight: bold;"&gt;Lemma&lt;/span&gt;, and check &lt;span style="font-weight: bold;"&gt;Grouped Keywords&lt;/span&gt;.  Next, select the file you just saved by clicking the Select Grouping File button. Now everything is set.&lt;br /&gt;&lt;br /&gt;If this works as intended, you should be able to use this function on Concord, Cluster, and Collocation/Cooccurrence.  What you do is add &lt;span style="font-weight: bold;"&gt;@@&lt;/span&gt; at the beginning of your search word(s).  So if you want to search for all the days of the week, as specified above, you type &lt;span style="font-weight: bold;"&gt;@@week&lt;/span&gt;, then search.  You should be able to search for all the words in this group.  Technically, you should be able to search multiple groups, but it is not fully tested and might not work, and I don't know what will happen if you combine this feature and wildcard search.  
I might change the behavior of this feature if I ever get any feedback.&lt;br /&gt;&lt;br /&gt;Another somewhat major addition, which is not documented at this time, is a function for Concord.  You can now open a concordance result in a new window.  This might be useful if you want to compare several concordance results.  To use this function, search any word(s) in Concord, and then go to Menu -&gt; &lt;span style="font-weight: bold;"&gt;Misc&lt;/span&gt; -&gt; &lt;span style="font-weight: bold;"&gt;Open Concordance Result in New Window&lt;/span&gt;.  This is experimental.  I added this because I found a way to add a multiple-window function to a program (I just wanted to have something so that I remember how to do it).  You should be able to re-sort the results even in a new window, just like in the main window.  But beware: if the concordance result is huge (say, 10000 hits), using this might eat a lot of memory because CasualConc keeps all the info in memory.  If you have at least 2GB of memory, this should be less of a problem, though.&lt;br /&gt;&lt;br /&gt;Finally, I have something that is not related to CasualConc.  I posted a couple of weeks ago that I wrote a simple utility program that helps typing IPA characters.  I wrote a similar program(?) with Javascript and added it to the IPATypist page.  I highly doubt many people read this post, especially people who don't use Leopard, but this is written for those people.  It should run on Tiger with Firefox, Safari and Camino.  I haven't tested it on IE on Windows and I have no intention to support it, but it might work.  It is also available for download, so if you are ever interested, you can download it and use it on your computer or put it on your course site or wherever you want to use it, though I can't guarantee it will work.&lt;br /&gt;&lt;br /&gt;As always, if you ever use any of the programs, I'd appreciate your feedback.  
That will motivate me to improve them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2077814045501117335?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2077814045501117335/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2077814045501117335' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2077814045501117335'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2077814045501117335'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/some-details-of-last-update.html' title='Some details of last update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-910132341234635917</id><published>2008-05-06T16:35:00.000+09:00</published><updated>2008-05-07T17:03:43.334+09:00</updated><title type='text'>Minor update</title><content type='html'>Over the weekend, I fixed a few bugs and made a few changes to some of the existing functions of CasualConc.  But these changes might have introduced other bugs...  Now CasualConc is 0.9.6 (beta).&lt;br /&gt;&lt;br /&gt;The bugs, or more precisely legacy features, that were fixed or updated were mostly in file handling.  When I first started writing the program, I didn't know anything about RubyCocoa (or Cocoa).  
So when corpus files or a database file were selected, only one file/folder was selectable.  This was simply because of the original Ruby script.  In that script, I simply specified a directory or a file to analyze.  And I simply added a Cocoa interface to it.  Eventually, I learned how to receive multiple file names as an array from the open panel, and I made it available to some of the new features.  Now you should be able to select multiple files/folders when you choose your corpus files/folders in File Mode.  In Database Mode, only one file can be selected.  If you need to select multiple database files, please use the advanced file handling mode.&lt;br /&gt;&lt;br /&gt;Another bug fix was drag and drop of files.  I'm not sure if I mentioned this feature in any of the documentation, but you can actually drag and drop files to the table in File View.  So if you have a Finder window open with files you want to add, you can drag and drop them (or the folder that contains the files) to the table.  This should work with files to analyze in File Mode and files to add to a database in Database Mode.  If these don't work, please let me know.&lt;br /&gt;&lt;br /&gt;The other changes are very minor and I'm almost certain no one has ever used the features they affect.  But anyway, I dropped Text Only mode.  So now all you need to do is check/uncheck the file types you want to use.  You still need to specify text encoding if you use text files.  This is because the auto-detection of text encoding in Objective-C is not usable.  Related to this change is the addition of OpenDocument Text (.odt) support.  But because I've never used Open Office, I haven't tested it.  I implemented this a while ago when I added others but didn't activate it because I don't use it.  And now I decided to activate it. Because I simply use a built-in Objective-C function, it should work as other files do (no guarantee).&lt;br /&gt;&lt;br /&gt;Oh, one, kind of, major fix is the lemma function.  
I implemented the lemmatization function at a very early stage.  But I've made a lot of changes to most of the tools since then, so it seemed like I broke it.  Now it's fixed and I also added a function to use lemma grouping in kwic search.  I mean, if you turn on this feature and search a word that is on the lemma file you provide, you can search all the words grouped under the same lemma, though I'm not sure if this works as intended.&lt;br /&gt;&lt;br /&gt;In addition to these mostly fixes, I added a couple of new features.  One is based on the lemmatization function and the other is something very experimental.  But this post is getting long, so I'll post them in the next few days when I have time.&lt;br /&gt;&lt;br /&gt;As always, if you happen to find this blog or the CasualConc site, please let me know what you think.  You can leave your comment on this blog or email me (the address is on the main site).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-910132341234635917?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/910132341234635917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=910132341234635917' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/910132341234635917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/910132341234635917'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/minor-update.html' title='Minor update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8848945846416633000</id><published>2008-05-03T16:14:00.000+09:00</published><updated>2008-05-06T07:43:57.736+09:00</updated><title type='text'>IPATypist</title><content type='html'>This is a small utility program I wrote for an ESL instructor at my school (yes, this is just written for you, Janet!), but I made a few changes to it so that this can be also useful for other people.&lt;br /&gt;&lt;br /&gt;Originally, she told me she was having trouble typing IPA phonetic alphabets in Unicode.  There is a keyboard mapping to type phonetic alphabets, but it is cumbersome.  So I simply put a lot of buttons to enter IPA characters.  Because this was written specifically to serve her purpose, the characters are the ones used for English and some special ones that are used for the book she and her colleagues are working on.&lt;br /&gt;&lt;br /&gt;What you can do with this utility is type phonetic alphabets by simply clicking buttons.  Once you are done, copy/paste them to whatever the program you are working on.  You can either go to Menu to copy or Command + C to copy the string, which keeps all the font information (font type/size).  Or you can click Copy button, which only keeps the character information, so when you paste the string, whatever the font setting (type/size) on your document will be applied.&lt;br /&gt;&lt;br /&gt;The latest version (0.3) supports key (button) mapping (if it functions as intended).  Now, any character can be assigned to any of the buttons, so users/teachers of languages other than English could use it (I hope).&lt;br /&gt;&lt;br /&gt;The system requirements are the same as all the program/utilities I wrote: Mac OS X 10.5.2 (Leopard) or later.  
I think this works on 10.5, but now all my machines are running 10.5.2, so I can't check the version prior to this (but at least I'm sure this won't run on Tiger).  You also need Doulos SIL font, which can be downloaded freely from the SIL site.  The link to their site is on the download page of this utility.&lt;br /&gt;&lt;br /&gt;If you find any bugs or have any feature request, I will try to fix them/add them as much as I can (if they are minor).  I don't have time to spend much time on this now (or I should say I should not spend time on this).  But any feedback is welcome.  Especially, I'd like to know if this helps someone.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8848945846416633000?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8848945846416633000/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8848945846416633000' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8848945846416633000'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8848945846416633000'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/05/ipatypist.html' title='IPATypist'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-4756531728874072017</id><published>2008-04-29T11:13:00.001+09:00</published><updated>2008-04-29T12:11:35.792+09:00</updated><title type='text'>Documentation</title><content type='html'>Over the last couple of weeks, I've worked on the documentation of CasualConc.  Now, it covers most of the basic functions.  I also started a more step-by-step set of instructions with a lot of images and named it Getting Started with CasualConc.  So far, I have only finished basic file management and database creation along with the kwic concordance function, which I think will be the most frequently used function (it is for me).&lt;br /&gt;&lt;br /&gt;Now, my only hope is that someone will find the site or this blog and start using it.  Somehow, I can't find the CasualConc main site on Google.  It doesn't show up in the results.  When I add a post to this blog, it shows up for the next 20 hours or so and then disappears.  Well, maybe I should add one post per day until some more people find this blog and CasualConc.&lt;br /&gt;&lt;br /&gt;If you happen to find this blog, please try it (if you use Leopard) or tell your friends who use Leopard to try it.  I know it still has bugs and a lot of limitations, but I really want other people's opinions to improve it (it serves most of my current uses, so I don't have much motivation to make a lot of changes).  Well, even if I hear from people, I might not be able to work on it for a while, but at least it's good to hear esp. 
if people like it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-4756531728874072017?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/4756531728874072017/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=4756531728874072017' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4756531728874072017'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4756531728874072017'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/documentation.html' title='Documentation'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1739102420951091540</id><published>2008-04-28T02:55:00.000+09:00</published><updated>2008-04-28T02:59:26.888+09:00</updated><title type='text'>A quick fix</title><content type='html'>I'm almost certain nobody has downloaded CasualConc since my last post.  But anyway, accidentally, I introduced a bug to database creation function.  This was caused by implementing a new tag-deletion code, which I forgot to apply to database creation part.  So if you find the database creation function does not work properly (this crashes CasualConc), please go to the site and download the latest beta.&lt;br /&gt;&lt;br /&gt;If you find any other bugs, please report it to me.  
The email address is on the main site.  Or you can leave a comment on this blog.&lt;br /&gt;&lt;br /&gt;Thanks!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1739102420951091540?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1739102420951091540/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1739102420951091540' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1739102420951091540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1739102420951091540'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/quick-fix.html' title='A quick fix'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-4804064635411948755</id><published>2008-04-27T01:57:00.000+09:00</published><updated>2008-04-27T02:05:18.072+09:00</updated><title type='text'>A few more fixes</title><content type='html'>I found a few minor bugs which I introduced with the last changes, so I fixed them.  
I also found that the default font of CasualConc, Courier, is not monospace in Greek, which is the language my very first user tests CasualConc with (I guess), so I added a function to select Courier or Courier New, which is monospace in Greek.&lt;br /&gt;&lt;br /&gt;I didn't mention this in the last post, but I also made some changes to the code of Concord, which only improved speed by about 2-3%.&lt;br /&gt;&lt;br /&gt;Now I hope more people find this blog or the main site and test CasualConc.  So if you happen to find this blog or the main site and you know someone who uses Leopard and is interested in corpus analysis, please tell him/her to test CasualConc, even if you don't use Mac OS X Leopard.  If you do, please try it!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-4804064635411948755?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/4804064635411948755/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=4804064635411948755' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4804064635411948755'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/4804064635411948755'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/few-more-fixes.html' title='A few more fixes'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-8003382446260550070</id><published>2008-04-26T02:03:00.000+09:00</published><updated>2008-04-26T02:29:12.030+09:00</updated><title type='text'>A minor update</title><content type='html'>As I reported a couple of days ago, at least a few people around the world are testing CasualConc and I've already got a report of a bug...  This very minor update is partly based on the report, which I don't really know the source of, and a minor change to a setting.&lt;br /&gt;&lt;br /&gt;What I found was an inconsistency in the handling of special characters, such as curly quotes or curly apostrophes.  These look like single-byte characters on web pages or in Word documents, but they are in fact multi-byte characters in Unicode (three bytes each in UTF-8), so CasualConc replaces them with a single-byte quote or apostrophe.  Recently I added .doc and .pdf support, and these documents often contain their own special characters (arrows, etc.).  I only replaced these in some parts and not others, which caused the inconsistency.  Now I think I applied the same rule to all the tools, but I'm not sure.&lt;br /&gt;&lt;br /&gt;Another change is the removal of the ASCII mode in concordance. In the Concordance Preferences, CasualConc has 4 ways to handle texts.  Originally, the two European Language modes were ASCII and with Accented Characters.  The former assumed the corpus files do not contain any multi-byte characters (in UTF-8).  The latter only assumed a few accented (multi-byte) characters in a context (the context words shown in the concordance result table). But then I realized, after the very first person downloaded CasualConc, that he uses Greek, which, I think, consists almost entirely of multi-byte characters in UTF-8.  So the new two modes for European Languages are A and B.  
A is the same as the previous with Accented Characters mode and B is for languages written entirely in multi-byte characters, but it still assumes that the text does not contain many of the 3-byte characters used in East Asian Languages. If the text contains many 3-byte characters, such as Japanese characters, which are displayed as double-width characters on the screen but processed as 3-byte characters in UTF-8, CasualConc might not be able to display the concordance result or the full context view properly. If there are any languages that are full of both 2- and 3-byte characters, let me know.  I'll see what I can do.&lt;br /&gt;&lt;br /&gt;By the way, I decided to add a 'Getting Started' section to the CasualConc site.  The current 'How to use' is more like a manual or a list of the functions CasualConc has, so it's not really a how-to.  So far the site only has a basic file handling entry of the 'how to select files for your analysis' type.  I'll try to add more when I can find time.&lt;br /&gt;&lt;br /&gt;Anyway, the current version is 0.9.4.  If it gets up to 0.9.9 and is still not ready for version 1, I might go to 0.9.9.1..., but if enough people use it and it doesn't have any major problems, I might release it as Version 1.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-8003382446260550070?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/8003382446260550070/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=8003382446260550070' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8003382446260550070'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/8003382446260550070'/><link rel='alternate' type='text/html' 
href='http://casualconc.blogspot.com/2008/04/minor-update.html' title='A minor update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-669146693549994090</id><published>2008-04-24T23:04:00.000+09:00</published><updated>2008-04-24T23:23:10.405+09:00</updated><title type='text'>Old stuff</title><content type='html'>Now that I've learned I can add an HTML page with JavaScript to the Google Page Creator site by simply uploading it as a file and linking to it, I added an old JavaScript-based concordancer/word counter to the CasualConc site.  This is probably useless for most people and I'm not sure if I need it on the site, but I just wanted to keep it somewhere, and because this old script is the basis of CasualConc, I think it's the right place (for me). &lt;br /&gt;&lt;br /&gt;I wrote this script about 2 years ago when I was playing with JavaScript. At that time, I wanted to learn JavaScript, which I had just started to learn a few months earlier. I only knew MS-BASIC before that. When I started to play with JavaScript, I figured the best way to learn it was to write something with it.  I first wrote a few scripts for a colleague at work to automate a repetitive task.  Then, I wanted to do something for myself.  I had always wanted to do something with corpus linguistics. I found a few sites that did it with JavaScript and many scripts in Perl.  With trial and error, and a lot of revisions, this JavaScript page was written.  The page says it's version 6, but the script version is 61 (it's in the file name of the script file).  But then I learned the limitations of JavaScript as a tool for corpus analysis.  
Then I tried Perl because that seemed to be what everyone used (and a lot of people are using it for text analysis), but somehow, it didn't appeal to me (or I wasn't/am not smart enough to learn it).  Then a year later, I used Ruby for something at work and somehow I liked it (I still like it).  I didn't know Python, which I only learned about while I was learning Ruby. Another big plus was that because Ruby was originally and is still developed mainly by Japanese people, I found a lot of documentation in Japanese. This and the inclusion of RubyCocoa in Leopard are why CasualConc exists now.  I think I wrote something like this in the very first post on this blog, but anyway, it's fun to use Ruby though my scripts are still primitive. I hope I can learn more about Ruby and improve CasualConc.  What I need is time, but right now I have to spend more time on other, more important stuff...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-669146693549994090?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/669146693549994090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=669146693549994090' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/669146693549994090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/669146693549994090'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/old-stuff.html' title='Old stuff'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-3063851584675279989</id><published>2008-04-23T16:17:00.001+09:00</published><updated>2008-04-23T16:28:57.069+09:00</updated><title type='text'>CasualConc launched!!</title><content type='html'>I finally found someone who got interested in using CasualConc!!  I was just surfing the web looking for info on concordancing on the Mac.  Though I'm developing CasualConc, if I could find a better, more flexible concordancer for the Mac, I'd be happy to use it.  The only problem would be that I couldn't make the changes I want.  Anyway, I found a blog that described the poor state of concordance software on the Mac, so I posted a comment, and he replied to it and wrote that he had downloaded CasualConc.  He wrote that he would post his impressions of CasualConc on his blog, so I'm really looking forward to it and at the same time I'm a bit nervous.  I think CasualConc in its beta state works ok for my casual use right now (mostly searching for collocations of words I want to use in my paper).  With database mode, it's fast enough to use regularly.  And because I wrote the program, I know how to use it, but I'm wondering how easy or difficult CasualConc is for others.  I've been adding content to the documentation.  But at some point, I might need to work on step-by-step instructions on how to use it.  
Well, this will only happen if more people are interested and start using CasualConc.&lt;br /&gt;&lt;br /&gt;If you ever find CasualConc and use it, any comments are welcome!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-3063851584675279989?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/3063851584675279989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=3063851584675279989' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3063851584675279989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/3063851584675279989'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/casualconc-launched.html' title='CasualConc launched!!'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1909616040030067358</id><published>2008-04-21T09:03:00.000+09:00</published><updated>2008-04-21T11:08:32.929+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='casualconc'/><title type='text'>Japanese Support</title><content type='html'>This will probably be the last update for a while.  While I was stuck on the idea for my dissertation, I simply spent some time here and there over the last few days making minor fixes and feature enhancements to CasualConc.  
As I posted in the last couple of days, I finally made the download page available to the public, though I'm not sure how many people will find it, and added support for several different character encodings and file types.  Finally, I added very limited Japanese support.&lt;br /&gt;&lt;br /&gt;Now &lt;span style="font-weight: bold;"&gt;CasualConc&lt;/span&gt; can read Japanese (and possibly other East Asian Languages) files in two formats.  One is a plain format without any spaces between words.  The other is wakachi-gaki, which has a 1-byte space between words.  Wakachi-gaki files can be created with the &lt;span style="font-weight: bold;"&gt;jparser&lt;/span&gt; utility program.  To analyze Japanese texts, the proper mode must be selected in the preferences.  Select &lt;span style="font-weight: bold;"&gt;Japanese (plain)&lt;/span&gt; under Concord options in the preferences for the former and &lt;span style="font-weight: bold;"&gt;Japanese (wakachi)&lt;/span&gt; for the latter.  If the proper mode is not selected, CasualConc cannot search for words/characters.  Wildcard search is implemented, but not tested thoroughly.  Because of the way wakachi-gaki is written, a 1-byte space should be inserted between words in a phrase search.  Because this is also experimental, CasualConc might crash when you try to analyze Japanese text.  Japanese is only available in Text Mode.  
Once the features are settled, I will add database file support.&lt;br /&gt;&lt;br /&gt;If you happen to find this blog or the CasualConc page and are willing to try it, please do so and let me hear what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1909616040030067358?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1909616040030067358/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1909616040030067358' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1909616040030067358'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1909616040030067358'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/japanese-support.html' title='Japanese Support'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5550510873696716495</id><published>2008-04-20T04:08:00.001+09:00</published><updated>2008-04-20T04:22:48.697+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='casualconc'/><title type='text'>CasualConc open to public</title><content type='html'>Well, I finally decided to make CasualConc public.  This just means I added a link to the download page, which was already active, to the home page.  
I highly doubt anyone is visiting the CasualConc site, so this doesn't make much difference, but I'm hoping somehow someone might find the page and try it. When I googled it, it didn't come up, so the only way to find the page is from a bbs (or usergroup?) post I wrote a while ago (which I mistakenly posted multiple times because I thought I was editing my post, but it turned out I was posting it again each time...) or from a link on my personal schedule page at my work.  Or possibly, from this blog, if it is googlable.&lt;br /&gt;&lt;br /&gt;Anyway, because I don't get any feedback on existing features, I decided to work on something not currently implemented: Japanese (or Asian Languages) support.  This is going to be highly experimental and I don't have much time now, so I can't tell when I will release it.  So far, I can display KWIC results of Japanese text in plain format (no spaces) and wakachi-gaki format (space-separated).   The former can be sorted by L5-R5 context characters and the latter can be sorted by L5-R5 context words (or whatever the separated units are). In the future (only for Japanese), I want to include MeCab (which you need to install following the instructions on the CasualConc page) to process plain texts, but this won't happen in the near future.&lt;br /&gt;&lt;br /&gt;If you ever find this blog and use Mac OS X Leopard and are interested in corpus analysis, check out CasualConc and let me know what you think. 
The link to the CasualConc site is on the right, or click this &lt;a href="http://casualconc.googlepages.com/home"&gt;link&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5550510873696716495?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5550510873696716495/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5550510873696716495' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5550510873696716495'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5550510873696716495'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/casualconc-open-to-public.html' title='CasualConc open to public'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7937521954487698608</id><published>2008-04-19T03:14:00.000+09:00</published><updated>2008-04-19T03:20:35.088+09:00</updated><title type='text'>Another file format support update in CasualConc</title><content type='html'>After I updated CasualConc last night, I realized I could add HTML and WebKit Webarchive support.  So I added these two to the supported file formats.  Now it can read various files that contain text.  But processing them will be slower than processing plain text files.  So if speed is important, convert the files to plain text.  
If you want faster searches, then create a database file from plain text files.  I will try to write a utility program to convert CasualConc-supported files to UTF-8 plain text files.&lt;br /&gt;&lt;br /&gt;Well, it's kind of sad to keep writing blog posts knowing nobody is reading...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7937521954487698608?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7937521954487698608/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7937521954487698608' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7937521954487698608'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7937521954487698608'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/another-file-format-support-update-in.html' title='Another file format support update in CasualConc'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5290034410351075100</id><published>2008-04-18T18:56:00.000+09:00</published><updated>2008-04-18T19:12:46.805+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='casualconc'/><title type='text'>CasualConc update</title><content type='html'>With my recent discoveries about file handling on the Objective-C side (or RubyCocoa side?), I decided to add a new feature to 
CasualConc.  Originally, CasualConc was only able to handle UTF-8 or ASCII encoded files because I used Ruby's file handling methods. Now I have switched to Objective-C methods, so it can handle a few more encodings.  The added encodings are UTF-16, Windows Latin 1, Windows Latin 2, Mac Roman, Shift-JIS, EUC-JP, and ISO-2022-JP.  The last three encodings are all for Japanese.  These are limited to the ones Objective-C can handle by default. I wasn't able to find Chinese or Korean encoding settings, so they are not included.  But CasualConc cannot handle 2-byte characters properly (in Concordance), so this shouldn't be an issue.  I haven't really tested all the encodings, so if someone happens to find this blog and would like to try, let me know.  I might add a link to the download page to the CasualConc site soon (hopefully).&lt;br /&gt;&lt;br /&gt;Also experimentally implemented is support for other file formats.  This still returns an error from time to time. Personally, all my corpus files are in plain text, so this is not for myself, but I just thought someone might be interested.  The problem is no one is checking this blog or the CasualConc site, so I highly doubt anyone even uses this function.  
By the way, the added file formats are .doc, .docx, .rtf, .rtfd, and .pdf.&lt;br /&gt;&lt;br /&gt;Well, I really hope someone will try CasualConc, though...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5290034410351075100?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5290034410351075100/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5290034410351075100' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5290034410351075100'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5290034410351075100'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/casualconc-update.html' title='CasualConc update'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-5963882811847446297</id><published>2008-04-13T16:17:00.002+09:00</published><updated>2009-09-09T10:40:02.285+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='utility program'/><title type='text'>WtoTconvUtil</title><content type='html'>This is another utility program I wrote in Ruby + RubyCocoa. What you can do with this program is very simple: extract text from a web page.  In fact, I'm not sure how useful this is, but I just wanted to experiment. This is based on PtoTconvUtil, a PDF to text converter utility program. 
After I wrote it, I wondered how I could extract text from a web page, so I experimented for a while, but couldn't figure it out. But last night, when I was not able to think about my paper, I found a clue on a web site, and then spent 15-20 min. figuring out how.&lt;br /&gt;&lt;br /&gt;This program still has many issues because the browser part is simply made of Cocoa bindings, which means no scripting. I simply wrote scripts to extract text and save it as a text file.  But thanks to built-in Cocoa functions, it recognizes a web address in a text box (though you always need to type "http://"), the reload, forward, and back buttons work, and it accepts Safari Webarchive and HTML files by drag&amp;amp;drop. And I also found that this program can extract text from a PDF file which is displayed in a browser with a plug-in.  So if you know the web address of a text-embedded PDF file, you can show it in the browser box and extract the text. Yes, this is very good, but because I just use built-in functions, it's not flexible. I want to add a function to read bookmarks from other browsers, so you don't have to type an address every time. It might be easier to read Safari bookmarks, so I might try that first, though I'm not sure when that will happen.&lt;br /&gt;&lt;br /&gt;Anyway, this program helps you compare the original and the extracted text in one window.  
So if you build a corpus from web pages, you can either extract the entire text or simply copy and paste a part of it.&lt;br /&gt;&lt;br /&gt;So again, if you somehow found this page and are reading this, AND if you use Leopard, try it and let me know what you think.&lt;br /&gt;&lt;br /&gt;EDIT: This program is discontinued and integrated into CasualTextractor, which is available on the CasualConc site under Utility Programs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-5963882811847446297?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/5963882811847446297/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=5963882811847446297' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5963882811847446297'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/5963882811847446297'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/wtotconvutil.html' title='WtoTconvUtil'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1437700001629152916</id><published>2008-04-12T17:19:00.001+09:00</published><updated>2009-10-19T09:01:51.933+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='utility program'/><category scheme='http://www.blogger.com/atom/ns#' term='japanese'/><category 
scheme='http://www.blogger.com/atom/ns#' term='mecab'/><title type='text'>CasualMecab</title><content type='html'>is the name I gave to a utility program that is based on MeCab. What this program does is POS/morphological analysis of Japanese text. At this moment, it simply produces MeCab output. The choices are MeCab output, Chasen-like output, wakachi-gaki (words with spaces in between), and yomi (in katakana). The output can be saved as a text file. I want to add other output formats, but probably not in the near future.  This program can also handle batch processing, although I haven't tested it extensively.  The output file is encoded in UTF-8, mainly because that's what CasualConc can handle.  I want to add a Japanese concordancing feature to CasualConc in the future.  If anyone ever finds this blog and is interested, please go to the CasualConc site and download it.  By the way, this program requires MeCab and MeCab-Ruby. The instructions for installing these are also on the CasualConc site. The installation is not simple (you need to use Terminal and the command line to install), but the instructions are step-by-step.  I hope anyone can understand them. 
As always, this is a Leopard-only program and free.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1437700001629152916?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1437700001629152916/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1437700001629152916' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1437700001629152916'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1437700001629152916'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/jparser.html' title='CasualMecab'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7520762432990258362</id><published>2008-04-11T13:14:00.000+09:00</published><updated>2008-04-11T13:25:32.537+09:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ruby'/><category scheme='http://www.blogger.com/atom/ns#' term='mecab'/><title type='text'>MeCab-Ruby</title><content type='html'>I finally found a way to successfully install MeCab (a Japanese parser) and MeCab-Ruby, the Ruby binding for MeCab, on Leopard.  I added a page about this to the CasualConc web site.  
It's only in Japanese at this moment because I'm not sure how many people actually check the site and how many of the very limited number of visitors are interested in installing MeCab-Ruby on their Leopard machine.  If anyone is interested, I can translate the page into English, but there are probably many better sites somewhere.&lt;br /&gt;&lt;br /&gt;But now that I have installed it, I might add Japanese concordancing features to CasualConc, if I ever have time.  At least, I can try it now.  Also, if people can understand how to install MeCab-Ruby on their computers, I might add a parsing feature (Japanese) to CasualConc, assuming they are willing to install it on their own.  But I'll probably first work on a GUI for MeCab-Ruby to create wakachi-gaki files or syntactically parsed files.  But when will I have time???&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7520762432990258362?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7520762432990258362/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7520762432990258362' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7520762432990258362'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7520762432990258362'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/04/mecab-ruby.html' title='MeCab-Ruby'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-9094992669575391453</id><published>2008-03-14T06:10:00.000+09:00</published><updated>2008-03-14T06:38:00.812+09:00</updated><title type='text'>Garbage Collection again</title><content type='html'>After some experiments, some of my efforts paid off, but not all.  Then, I realized it was not just Ruby that used memory.  Because it's written in RubyCocoa, the Cocoa or Objective-C part must use some memory too.  I believe Objective-C 2.0 has GC, but OS X uses as much memory as it has and manages it.  I might be wrong, but using more memory itself might not be that bad.&lt;br /&gt;&lt;br /&gt;Concord, Cluster, and Collocation might be usable with a modest amount of memory, but Word Count (n-gram) requires a lot of memory.  This is because it creates a huge array (all of them create arrays, though).  I know my current implementation is not ideal, but maybe I have to improve Word Count first.  When I was testing the original Ruby scripts, I only used smaller corpora (far less than 100 mil.).  Now I need to figure out a way to reduce memory usage, but how?  Does anyone have a good idea?  My implementation uses a hash to count, just as any basic Ruby book shows.  But I tweaked it a bit to increase processing speed. &lt;br /&gt;&lt;br /&gt;Anyway, this is partly why I wrote that CasualConc can handle a 1 mil. corpus at reasonable speed.  
Well, I need time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-9094992669575391453?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/9094992669575391453/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=9094992669575391453' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/9094992669575391453'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/9094992669575391453'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/garbage-collection-again.html' title='Garbage Collection again'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2953551730976257205</id><published>2008-03-12T11:44:00.000+09:00</published><updated>2008-03-12T12:02:59.446+09:00</updated><title type='text'>Garbage Collection</title><content type='html'>I use CasualConc regularly to look up how certain words are used in context.  While I was using it, I realized CasualConc is a memory hog.  I knew Word List, especially when used for an n-gram list, needs a lot of memory to process because it keeps counting new ones while it stores counted ones (not exactly, though).  I knew Ruby has garbage collection built in, but it seemed like it wasn't working when I wanted it to work (maybe because there still was a lot of unused memory).  
So I decided to force GC to start at certain points (GC.start).&lt;br /&gt;&lt;br /&gt;But when?&lt;br /&gt;&lt;br /&gt;I've been trying several different points for each tool and its associated methods and monitoring the differences.  But because I've never seriously studied programming (I'm not and have never been in computer science), I don't think I understand how GC works (in fact, I'm still not sure what exactly an OO language entails).  If you are brave enough to take a look at the Ruby/RubyCocoa source code of CasualConc, you can see my scripts are not written in the Ruby way.  I hope to have some time to learn to program a little more seriously someday, but for now, CasualConc works OK (at least for me).&lt;br /&gt;&lt;br /&gt;Anyway, I'm not sure if anyone ever reads this entry or any entry on this blog, but I'll try to keep a record of this here.  I want to add some memos on Ruby/RubyCocoa code to this blog if I can.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2953551730976257205?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2953551730976257205/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2953551730976257205' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2953551730976257205'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2953551730976257205'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/garbage-collection.html' title='Garbage Collection'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-1730765215146826337</id><published>2008-03-09T05:12:00.001+09:00</published><updated>2009-09-09T10:38:58.063+09:00</updated><title type='text'>PDF to Text converter</title><content type='html'>In the last post, I wrote that I had found a way to extract embedded text from PDF files.  I wanted to do something with it before I forgot, so I wrote a simple utility program in Ruby+RubyCocoa and posted it to the CasualConc site.  The system requirement is a Mac with Leopard.  I named it simply PDFtoTextConverter.  It opens a PDF and shows its embedded text in the text box in the same window.  The extracted text can be saved as a .txt file.  It also has a batch processing mode: you can add PDF files to the list and either select a folder in which to save the text as .txt files or save the .txt files to the same folder where the original PDF files are stored.  If you are interested, please try it.  
You can go to the CasualConc site by following the link on the right.&lt;br /&gt;&lt;br /&gt;EDIT: This program has been discontinued and integrated into CasualTextractor, which is available on the CasualConc main site under Utility Programs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-1730765215146826337?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/1730765215146826337/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=1730765215146826337' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1730765215146826337'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/1730765215146826337'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/pdf-to-text-converter.html' title='PDF to Text converter'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-2348161707787160652</id><published>2008-03-07T02:37:00.001+09:00</published><updated>2008-03-07T02:53:43.389+09:00</updated><title type='text'>PDF</title><content type='html'>I finally found a way to extract text from text-embedded PDF files in RubyCocoa.  I personally don't care about this much, but I guess it might be useful for some people.  The problem with handling PDF text is not the extracting part.  
I mean, the real issues with implementing it in CasualConc are:&lt;br /&gt;&lt;br /&gt;1. each line of text is terminated by a line feed character LF (\n, or possibly \r\n)&lt;br /&gt;2. page headers, footers, etc. that are not part of the main text are also included&lt;br /&gt;3. embedded text often includes extra spaces, garbled characters (often from ligatures), etc.&lt;br /&gt;&lt;br /&gt;Issue 1 is probably the main one.  Currently, the basic unit of analysis in CasualConc is the paragraph, which means text separated by LF characters.  So it cannot handle text files that break every line with an LF character, such as Brown Corpus files.  Supporting them requires some coding (more than just adding a few lines), and I can't find time to do it now.  I'll try to implement this feature in the future, but I don't know when.&lt;br /&gt;&lt;br /&gt;Issues 2 and 3 cannot be avoided, I guess.  So I might try to add a feature to extract text from PDF files within CasualConc, but this also requires a certain amount of time.&lt;br /&gt;&lt;br /&gt;But at least I know how to extract text from PDF files, so the feature will be included in a future version of CasualConc.  
If many people are interested, I might prioritize this (but it probably won't happen until summer at the earliest).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-2348161707787160652?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/2348161707787160652/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=2348161707787160652' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2348161707787160652'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/2348161707787160652'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/pdf.html' title='PDF'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-7524173436112595024</id><published>2008-03-06T18:19:00.000+09:00</published><updated>2008-03-06T18:38:01.806+09:00</updated><title type='text'>Google Search</title><content type='html'>I've been adding documentation to the CasualConc site, although I haven't yet added a download page. Now it has pages for Concordance and Word Cluster along with Basic File Handling.&lt;br /&gt;&lt;br /&gt;But now I'm wondering how Google works.  I mean, the Google Page Creator Help says a page created with it "can be crawled by Google within a few hours of publication".  Well, it says "can be", so the actual time might be longer than a few hours.  
In fact, the CasualConc site was searchable on Google a couple of days after I published it.  BUT now it's not in the search results.  It disappeared!!&lt;br /&gt;&lt;br /&gt;Maybe I should tell my friends to check this first...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-7524173436112595024?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/7524173436112595024/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=7524173436112595024' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7524173436112595024'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/7524173436112595024'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/google-search.html' title='Google Search'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-626625090667668862.post-6720738002330437094</id><published>2008-03-03T17:31:00.001+09:00</published><updated>2009-04-09T03:40:00.130+09:00</updated><title type='text'>CasualConc</title><content type='html'>I started this blog to keep track of what I do for CasualConc, experimental concordancing software for Mac OS X 10.5 Leopard (and possibly later versions of OS X).&lt;br /&gt;&lt;br /&gt;I started to learn a scripting language called Ruby, which is similar to Perl or Python, last summer.  
The main reason I chose Ruby was that there is a lot of documentation in Japanese.  I don't know if I made the right decision, but I tried both Perl and Ruby and liked Ruby better for no particular reason (Perl simply didn't appeal to me when I tried it).  Another reason was that I read somewhere that Apple had decided to include software (?) that bridges Ruby and Cocoa, Mac OS X's GUI framework (?), in Leopard.  It's called RubyCocoa, and it allows users to add a Mac GUI to Ruby scripts (btw, there's a similar one for Python).  Isn't this cool?&lt;br /&gt;&lt;br /&gt;At first, I used Ruby for my work (I'm working as an Instructional Technology Consultant at my school), but later I decided to learn it more seriously.  I'm interested in corpus linguistics and want to do some corpus-based/driven research, so I decided to write some scripts for basic corpus analyses.  When OS X 10.5 Leopard came out, I had a few simple scripts for kwic, word count, etc., so I tried to add a GUI to them.  It wasn't very easy because there isn't much documentation for RubyCocoa, so I had to learn both Ruby and Cocoa and combine them to make the GUI work.&lt;br /&gt;&lt;br /&gt;Now I have added some more features to kwic and word count and named the result CasualConc.  It is Mac GUI-based software written in Ruby+RubyCocoa.  Because the development environment is OS X 10.5 Leopard, it only runs on Leopard.  There might be a way to make it run on Tiger, but I don't want to spend time on it simply because I don't have the time (or the expertise).  The current version is 0.9 and still in beta (well, beta simply means I call it so).  I don't have much time to make a lot of changes now.  From now on, I will try to fix major bugs and write up some documentation.  And now I would like to have someone test it.&lt;br /&gt;&lt;br /&gt;There is no guarantee that this works for you, but if you are interested, I'm happy to have you as a beta tester.  
Here's the basic info:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;System requirements&lt;/span&gt;: a Mac with a lot of memory (at least 1GB) running Mac OS X 10.5 Leopard (Universal; well, this is mostly written in Ruby...), optimized for a screen at least 1280px wide (13.3 inches or larger on a notebook, or 17 inches or larger on a desktop LCD)&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Acceptable file format&lt;/span&gt;: text files (.txt) encoded in ASCII or UTF-8 (Ruby is not good at handling character encodings)&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Acceptable languages&lt;/span&gt;: any single-byte character language (double-byte character languages, i.e. East Asian languages, can be analyzed except for kwic concordancing, as long as words are separated by single-byte spaces)&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Target user&lt;/span&gt;: Mac users who don't want to start up a Windows machine, switch to Boot Camp, or run Virtual PC/Parallels/VMware for simple concordancing for preliminary analysis, preparing teaching materials, learning, etc. (CasualConc is probably not good enough as your primary research tool)&lt;br /&gt;&lt;br /&gt;I use CasualConc on my Mac mini (1.86GHz Core 2 Duo) and have used it on a G4 (1.5GHz) machine.  It works fine for me, but performance is better with a faster CPU and more memory.  With a 1 million word corpus, it works at reasonable speed (not as fast as WordSmith Tools).  With a corpus larger than that, well, you can try.&lt;br /&gt;&lt;br /&gt;If you are interested, check out the &lt;a href="http://sites.google.com/site/casualconc/"&gt;&lt;span style="text-decoration: underline;"&gt;CasualConc site&lt;/span&gt;&lt;/a&gt;.  Documentation is not complete (far from it), so if you have never used any concordancer, you might find it difficult to use.  
But if you have, you can probably use most of the basic features.&lt;br /&gt;&lt;br /&gt;By the way, this is freeware.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/626625090667668862-6720738002330437094?l=casualconc.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://casualconc.blogspot.com/feeds/6720738002330437094/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=626625090667668862&amp;postID=6720738002330437094' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6720738002330437094'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/626625090667668862/posts/default/6720738002330437094'/><link rel='alternate' type='text/html' href='http://casualconc.blogspot.com/2008/03/casualconc.html' title='CasualConc'/><author><name>Yasu</name><uri>http://www.blogger.com/profile/08489030458578691142</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
