Word Count has only one new feature (although creating an n-gram list should be a little faster).
You can now specify what to count with a regular expression. Go to Preferences -> Others and check Regular Expression Mode for Word Count. You can count whatever you want as long as you can write a regular expression for it.
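To give a rough idea of what counting by regular expression means, here is a minimal Python sketch. It is not CasualConc's own code, and the function name `count_matches`, the sample sentence, and the pattern are all made up for illustration.

```python
import re
from collections import Counter

def count_matches(text, pattern, ignore_case=True):
    """Count every non-overlapping match of `pattern` in `text`."""
    flags = re.IGNORECASE if ignore_case else 0
    counts = Counter()
    for m in re.finditer(pattern, text, flags):
        token = m.group(0)
        # Fold case so that "Walking" and "walking" are counted together.
        counts[token.lower() if ignore_case else token] += 1
    return counts

# Hypothetical sample text; count only word forms ending in -ing.
sample = "Walking and talking; she walked, he talks. Walking again."
ing_counts = count_matches(sample, r"\b\w+ing\b")
```

Anything the pattern matches becomes an item in the frequency list, so the same approach would work for numbers, hyphenated words, or any other unit you can describe with a pattern.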
Next, here is how the tag modes work in Word Count.
In Tag(s) mode with Separate Word and Tag in WC NOT checked, the Word Count result will look like this:
If Separate Word and Tag in WC IS checked, the result will look like this:
If Tag Only mode is selected, the result will look like this:
For N-gram in Tag(s) mode with Separate Word and Tag in WC NOT checked, the result will look like this (4-gram):
If Separate Word and Tag in WC IS checked, the result will look like this:
If Tag Only mode is selected, the result will look like this:
That's it for Word Count.
Saturday, April 10, 2010
The current status of CasualConc beta - Cluster/Collocation/Cooccurrence
Only one minor change has been made to Cluster.
When you select Left Only in Span, the result will be aligned to the left.
If you select Tag(s) mode for the search word, the result of searching 'jj nn1' (adjective + singular noun) will look like this:
If Suppress tags in context is on, the result will look like this:
And Tag Only search will return results like this:
Now Collocation. In the current working version, the total frequency includes the frequency of the keyword position. In the new version, the total covers only the words in the context, so the keyword no longer comes at the top of the list.
Another change is variable span. You can now set the span of context words separately for the left and the right, up to 5 words each.
The result will look like this:
Collocation Stats calculations should reflect this to some extent.
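The two Collocation changes (totals that leave out the keyword position, and separate left/right spans) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `collocates`, its parameters, and the sample sentence are hypothetical and do not come from CasualConc.

```python
from collections import Counter

def collocates(tokens, keyword, left=5, right=5):
    """Count words within `left`/`right` positions of each keyword hit.

    The keyword position itself is skipped, so the totals cover
    context positions only.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        start = max(0, i - left)
        counts.update(tokens[start:i])               # left context
        counts.update(tokens[i + 1:i + 1 + right])   # right context
    return counts

# Hypothetical sample: 2 words to the left, 1 word to the right.
tokens = "the old cat sat on the mat and the young cat sat near the door".split()
result = collocates(tokens, "cat", left=2, right=1)
```

Because the hit position is never added to the counts, the keyword shows up in the list only if another occurrence of it happens to fall inside the context span of a hit.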
Now with Tag(s) selected, the result will look like this:
If Treat Keywords As One Word is checked, the result will look like this:
If Suppress tags in context is on, the results with or without Treat Keywords As One Word checked will look like this:
NOT Checked
Checked
Finally, a Tag Only search will return results like this:
Next is Cooccurrence.
The new feature of Cooccurrence is sorting. You can now sort words in each position based on collocation statistics. To use this feature, you need to run Word Count first.
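Sorting by a collocation statistic needs the overall frequency of each word, which is presumably why Word Count has to run first. As a hedged sketch, here is one common form of the MI score, log2 of observed over expected co-occurrence. The exact formula CasualConc uses is not given here, so treat this as an assumption. All names and numbers below are invented for illustration.

```python
import math

def mutual_information(co_freq, node_freq, word_freq, corpus_size):
    """Pointwise MI: log2 of observed over expected co-occurrence."""
    expected = node_freq * word_freq / corpus_size
    return math.log2(co_freq / expected)

def sort_by_mi(position_counts, word_freqs, node_freq, corpus_size):
    """Order the words seen in one position by descending MI score."""
    scored = [
        (word, mutual_information(freq, node_freq, word_freqs[word], corpus_size))
        for word, freq in position_counts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical numbers: overall word frequencies (from a word count)
# and the words observed in one position next to the keyword.
word_freqs = {"the": 5000, "strong": 40}
position_counts = {"the": 20, "strong": 8}
ranked = sort_by_mi(position_counts, word_freqs, node_freq=100, corpus_size=100000)
```

In this made-up example 'the' co-occurs more often in raw frequency, but 'strong' ranks first by MI because it is rare in the corpus as a whole.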
In the normal frequency order, the result looks like this:
With MI (Mutual Information) sort, the result will look like this:
With Tag(s) mode, the result will look like this:
You can now export Cooccurrence results with frequency information. Check Include frequency info in the Cooccurrence export in Preferences -> Others.
The exported CSV file with frequency info will look like this in Excel:
Now for tag handling in Cooccurrence. With Tag Only mode, the result will look like this:
These are the new features of Cluster/Collocation/Cooccurrence.
Friday, April 9, 2010
The current status of CasualConc beta - Concord
Now Concord.
It has a few minor feature additions. The first is independent left/right span: you can set the span of context text on the left and right independently.
So if you want to see more to the right of the keyword, you can set the left span to a smaller number of characters and the right span to a larger number. The result would look like this:
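For readers who want to see the idea in code, here is a small Python sketch of a kwic line builder with separate left and right spans measured in characters. It is an illustration only, not how CasualConc is implemented, and `kwic`, `left_chars`, and `right_chars` are names I made up.

```python
import re

def kwic(text, keyword, left_chars=20, right_chars=40):
    """Return (left, keyword, right) triples for each whole-word hit."""
    lines = []
    for m in re.finditer(r"\b" + re.escape(keyword) + r"\b", text):
        # Each side is cut independently, so the spans can differ.
        left = text[max(0, m.start() - left_chars):m.start()]
        right = text[m.end():m.end() + right_chars]
        lines.append((left, m.group(0), right))
    return lines

# Hypothetical sample: short left span, longer right span.
text = "The quick brown fox jumps over the lazy dog near the river bank."
hits = kwic(text, "fox", left_chars=6, right_chars=15)
```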
The second is wider context word span. In the current working version, you can only search context words up to 5 words to the right and left, and you can only search single words. With the new version, you can search up to 10 words to the right and left of the keyword, and you can search for a phrase. To enable wider context words, go to Preferences -> Concord and check Wide Context.
Now you can select up to 10 words on the right or left.
In Preferences, you can select Words Only or Words and Phrases for context word search.
This means you can search like this:
You can also specify words you don't want to see in the context. If any of the specified words appears in the specified span, that line will be excluded from the result. To enable this, check Exclude right next to the Context Word Mode setting (see above). A text box and span settings for excluded words appear on the main window. Check the box next to Exclude and search the keyword.
The result would look like this:
In this example, the word 'the' should not appear within 10 words to the left or right of the keyword (if this is working correctly).
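The exclusion logic can be pictured as a simple filter over the hits. The Python below is a minimal sketch under my own assumptions (whitespace tokenization, case-insensitive matching, hypothetical function names), not CasualConc's actual implementation.

```python
def has_word_in_span(tokens, index, words, left=10, right=10):
    """True if any of `words` occurs within the span around tokens[index]."""
    start = max(0, index - left)
    window = tokens[start:index] + tokens[index + 1:index + 1 + right]
    return any(tok.lower() in words for tok in window)

def search_excluding(tokens, keyword, excluded, left=10, right=10):
    """Return hit positions whose context contains none of `excluded`."""
    excluded = {w.lower() for w in excluded}
    return [
        i for i, tok in enumerate(tokens)
        if tok.lower() == keyword.lower()
        and not has_word_in_span(tokens, i, excluded, left, right)
    ]

# Hypothetical sample: drop hits with 'the' within 3 words on either side.
tokens = "a cat sat on the mat while another cat slept in a box".split()
kept = search_excluding(tokens, "cat", ["the"], left=3, right=3)
```

Here the first 'cat' is dropped because 'the' falls inside its right span, and only the second one is kept.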
Another feature is font selection. With the current working version, you can only select Courier or Courier New (or an appropriate font may be chosen for non-alphabetic languages). Now you can select any font on your system. To do this, go to Preferences -> Concord and click Manage in the Display setting.
A font panel appears. You can select a font from the pop-up menu and click Add to add it to the list. To remove a font from the list, just select it and click Remove.
You can now select this added font.
Another feature relates to copying/pasting or exporting kwic results. If you want to paste copied lines into Pages, MS Word, or any other application that accepts Rich Text Format text, you can paste the lines with the keyword in bold. Check Keep text style when copying the results.
If you paste the text into a document, the copied text preserves the font, the font size, and the bold text on the keywords.
You can also insert a tab before and after the keywords. This works with both plain text and rich text. Check Insert Tab before/after Keyword.
The pasted text should look like this:
Plain Text
Rich Text
The tab setting should also work when you export the results.
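One reason tabs around the keyword are handy is that tab-delimited text splits cleanly into left context, keyword, and right context. The short Python sketch below shows the idea. The function `format_kwic_line` is hypothetical, and trimming the spaces next to the keyword is my assumption, not necessarily what CasualConc does.

```python
def format_kwic_line(left, keyword, right, insert_tabs=True):
    """Join one kwic line, optionally with tabs around the keyword."""
    sep = "\t" if insert_tabs else " "
    return f"{left.strip()}{sep}{keyword}{sep}{right.strip()}"

# Hypothetical kwic line; splitting on tabs recovers the three parts.
line = format_kwic_line("the quick brown ", "fox", " jumps over")
columns = line.split("\t")
```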
Another feature is editing the original/database text. First, you need to enable this in Preferences. You can allow editing of the original file and editing of database files.
For Allow editing of original file, you can select Only in File Mode or In Both Modes.
If Allow editing of original file is on, run Concord, select a line in the table, and right-click the table (or go to Main Menu -> Text Data).
If you select Open Displayed Text, an editor panel appears.
You can edit the original text file and save the changes. This comes in handy when you find errors in the kwic result. The editor has a basic tagging helper function. You specify tags on the Tag Panel, then open the Tag Drawer and click the number next to the tag text to insert a tag.
If you select Open Displayed Text with Application, the file opens with a specified application (the default is TextEdit).
In Database Mode, you can edit a database entry.
Select Edit Database Entry of the selected Line.
You can make changes and update the entry or delete the entry.
If the original text file is still in the same directory as when you created the database, you can select Open Displayed Text with Application to open the file with a specified application.
In both modes, if you select Show File of the selected Line in Finder, the original text file will be displayed in Finder (in Database mode, only if the file is still in the same directory).
Another minor addition is searching text in the Context view. You could already copy and paste any word from the Context view to search for it, but now you can search directly from the Context view. Just select a word (or phrase) in the Context view and right-click to select Search in Concord.
Finally, I will briefly go over how tag-handling works.
In Preferences, you can select Tag(s). Tag Only mode does not work in Concord because looking at kwic lines of just tags does not make much sense.
Once you select the Tag(s) mode, select a tag type.
In Tag(s) mode, you can run kwic just by typing tags. This example searches for the 'jj nn' (adjective + noun) combination.
You can suppress tags in the context. In Preferences, check Suppress tags in context.
Then the results should look like this:
However, suppressing tags in context has one issue. When you click a kwic result line, the displayed context text and the keyword coloring in it are not correct. I will fix this if I can find a good way, but until then, this remains as is.
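To make the tag search and tag suppression described above a bit more concrete, here is a rough Python sketch. It assumes the text is tagged in a word_tag format (for example big_jj), which is only one of several possible tag types, and the function names are made up. It does not show how CasualConc handles tags internally.

```python
def split_token(token, sep="_"):
    """Split 'word_tag' into (word, tag); the separator is an assumption."""
    word, _, tag = token.rpartition(sep)
    return word, tag.lower()

def find_tag_sequence(tokens, tag_query):
    """Return start indices where consecutive tags match `tag_query`."""
    wanted = tag_query.lower().split()
    tags = [split_token(t)[1] for t in tokens]
    n = len(wanted)
    return [i for i in range(len(tags) - n + 1) if tags[i:i + n] == wanted]

def suppress_tags(tokens):
    """Drop the tags and keep only the word forms."""
    return " ".join(split_token(t)[0] for t in tokens)

# Hypothetical tagged sample: search 'jj nn', then hide the tags.
tagged = "the_at big_jj dog_nn chased_vvd a_at small_jj cat_nn".split()
hits = find_tag_sequence(tagged, "jj nn")
plain = suppress_tags(tagged)
```

The search runs on the tags, while the suppressed version is only for display. The two no longer line up character by character, which may be part of why showing the right context and coloring is tricky once tags are hidden.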