Training the Thulika Keyboard

Before you read:

Indian languages use complex scripts, unlike English. Each English letter is a single Unicode code point, but a letter in an Indian language may be composed of more than one Unicode code point.

The decomposition is as follows: LetterCode + modifyingSymbolCode

Eg: ക്രൊ = ( ക + ്ര + ൊ )

As you can see, the code that comes first (ക) need not be the first symbol that the user writes (െ). Also, some symbols (like the one that comes after െ) have no direct Unicode equivalent; it should be written as ്ര.
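To make the stored order concrete, here is a small sketch that lists the code points of the example syllable. The string is built from explicit escapes so the order is unambiguous; note that ്ര is itself two code points (the virama followed by ര). The helper name `describe` is only for illustration.

```python
import unicodedata

# The syllable from the example, spelled out as explicit code points:
# KA + VIRAMA + RA + VOWEL SIGN O
syllable = "\u0d15\u0d4d\u0d30\u0d4a"

def describe(text):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code, name in describe(syllable):
    print(code, name)
```

What looks like one letter on screen is stored as four code points, and the first of them is ക even though the user draws െ first.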

Also, the code that’s already written might have to be modified according to the symbol that the user writes next. For example:

If the user writes ാ and the symbol just before it is െ, then we’ll have to replace െ with ൊ.

 

Now on to the training part:

This post describes the steps needed to configure a new language or new symbols for the Thulika Keyboard app. For this you have to download the ThulikaTrainer app.

Training the Thulika keyboard on new inputs requires three steps:

  1. Save handwriting images of all the symbols of the language you want the keyboard to recognise (this is done with the ThulikaTrainer app on your Android phone)
  2. Run ThulikaMaker on your PC
  3. Load the engine in ThulikaTrainer

Step 1: ThulikaTrainer app

1. Run the app for the first time. It’ll create a directory structure on your sdcard. Make sure the following directory has been created: sdcard/Android/data/me.sinu.thulika.train/files. There’ll be two folders under it: engines and letters.
2. Create a text file alpha.txt in sdcard/Android/data/me.sinu.thulika.train/files. This file should contain the symbols that the keyboard should recognize, separated from one another by commas (,).

For example, if you want to train the symbols a, b, c, 1 and 2, then alpha.txt should contain:

a, b, c, 1, 2

You can create the alpha.txt file on a PC and then copy it to that folder on your sdcard.
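If you would rather generate alpha.txt with a script than type it by hand, a minimal sketch could look like the one below. The post does not say which text encoding the app expects, so UTF-8 is an assumption here, and `write_alpha` / `read_alpha` are hypothetical helper names.

```python
from pathlib import Path

def write_alpha(path, symbols):
    """Write the symbols as one comma-separated line (UTF-8 is assumed)."""
    Path(path).write_text(", ".join(symbols), encoding="utf-8")

def read_alpha(path):
    """Read the file back into a list, dropping spaces around each symbol."""
    text = Path(path).read_text(encoding="utf-8")
    return [s.strip() for s in text.split(",") if s.strip()]

if __name__ == "__main__":
    write_alpha("alpha.txt", ["a", "b", "c", "1", "2"])
    print(read_alpha("alpha.txt"))
```

Reading the file back before copying it to the phone is a cheap way to check that no symbol was lost or merged with its neighbour.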

3. Now if you run the app, each symbol that you specified in alpha.txt will appear on its own button at the bottom of the app.

4. Draw a symbol on the screen and then press the button that corresponds to the symbol you have drawn. It’ll add a file under sdcard/Android/data/me.sinu.thulika.train/files/letters.

If you have made a mistake, you can press the Clear button. If you accidentally pressed the wrong symbol button, press ‘Delete Last’; it will delete the last file you created.

Please make sure you press the correct button corresponding to the symbol you have drawn. This is what ensures the correctness of the keyboard.

You can draw more than one sample for each symbol button, but it’s better if one person provides only one set of symbols. You can then ask another person to provide images for the entire set again, so that there is more than one handwriting for each symbol. To begin with, make sure you have at least one set of handwriting images for the entire symbol set.

 

Step 2: Run ThulikaMaker on your PC

1. Click here to download ThulikaMaker.jar

2. Copy the folder sdcard/Android/data/me.sinu.thulika.train/files/letters from your phone to your PC.

3. Run ThulikaMaker.jar.

4. Enter a width and height. For South Indian languages, width=13 and height=10 should work. Finding the right width and height is usually a matter of trial and error. For scripts like English, give width=7, height=7. For numbers, give width=5, height=5. Try to keep the product width × height as low as possible.
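A rough sketch may help show what width and height control. ThulikaMaker's actual code is not shown in this post, so everything below is an assumption: it supposes (in line with the author's reply in the comments) that each drawing is reduced to a grid of on/off cells, and it invents a hypothetical input format, a list of (x, y) points touched by the stroke.

```python
def grid_features(points, width, height):
    """Map stroke points onto a width x height grid of 0/1 cells.

    points: iterable of (x, y) pixel coordinates (hypothetical input format).
    Returns a flat list of width * height values, row by row.
    """
    points = list(points)
    cells = [0] * (width * height)
    if not points:
        return cells
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Scale to the drawing's bounding box (an assumption of this sketch).
    span_x = max(max(xs) - min_x, 1)
    span_y = max(max(ys) - min_y, 1)
    for x, y in points:
        col = min(int((x - min_x) * width / span_x), width - 1)
        row = min(int((y - min_y) * height / span_y), height - 1)
        cells[row * width + col] = 1  # this cell is "on"
    return cells

# A diagonal stroke on a 5 x 5 grid switches on the five diagonal cells.
diagonal = [(i, i) for i in range(100)]
print(grid_features(diagonal, 5, 5))
```

Under this reading, each symbol is described by width × height cells: 130 for 13 × 10, 49 for 7 × 7 and 25 for 5 × 5. A smaller product means less to compare per symbol, while too small a grid can no longer tell similar shapes apart.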

5. Click ‘Load Bundle’ and select the ‘letters’ folder that you copied to the PC. Click Open.

6. Wait a minute for the files to load. You can then select any entry to view its file. Press ‘Delete File’ to delete a file (for example, if you find that its image is wrong).

7. The Alternatives text is not used at present (skip it).

8. Align value (the default is 0):

If the image is a symbol that comes on the left side of a letter (like െ, േ in Malayalam), set Align=-1

If the image is a symbol that comes on the right side of a letter (like ു, ൂ, ൃ in Malayalam), set Align=1

If it is a usual symbol, set Align=0. For Latin letters and numerals, Align will be 0.

If you modify align/rules for a symbol, make sure you click ‘Save Symbol’
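The post does not say how the engine uses the Align value. One plausible reading, sketched below purely as an assumption, is that a symbol with Align=-1 is drawn before its letter but has to be stored after it, so it is held back until the next symbol arrives. The `ALIGN` table and `to_stored_order` are hypothetical names.

```python
# Hypothetical align table, using the values given in the text above.
ALIGN = {
    "\u0d46": -1,  # vowel sign E, drawn to the left of the letter
    "\u0d47": -1,  # vowel sign EE, drawn to the left of the letter
    "\u0d41": 1,   # vowel sign U, drawn to the right
    "\u0d42": 1,   # vowel sign UU, drawn to the right
    "\u0d43": 1,   # vowel sign vocalic R, drawn to the right
}

def to_stored_order(drawn):
    """Reorder symbols from drawing order to stored order.

    A symbol with align -1 is held back and emitted after the next symbol.
    Symbols missing from the table are treated as align 0.
    """
    out = []
    pending = []
    for symbol in drawn:
        if ALIGN.get(symbol, 0) == -1:
            pending.append(symbol)
        else:
            out.append(symbol)
            out.extend(pending)
            pending.clear()
    out.extend(pending)  # a left-side sign with nothing drawn after it
    return "".join(out)

# The vowel sign is drawn first, but the letter is stored first.
print(to_stored_order(["\u0d46", "\u0d15"]) == "\u0d15\u0d46")
```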

9. Rules: Give a rule if writing the current symbol should change the result depending on the letter that comes before it. This applies only to complex scripts such as those of Indian languages. For English and numerals, there will not be any rules.

To change what happens when symbol X is written after a particular letter, give rules for symbol X. If the letter before it is Y and you want the result to be Z, give the rule as Y:Z. If the letter before it is B and you want the result to be BX, give the rule as B:BX. Separate multiple rules with semicolons.

For example: symbol:ാ align:1 rules:ഒ:ഓ; െ:ൊ; േ:ോ

If you modify align/rules for a symbol, make sure you click ‘Save Symbol’
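The rule format can be illustrated with a short sketch. It parses a rule string in the Y:Z; Y:Z form shown above and applies it when a symbol is written. This is not ThulikaMaker's code; `parse_rules` and `write_symbol` are hypothetical names, and the sketch assumes that the letter before is a single code point.

```python
def parse_rules(text):
    """Parse 'Y:Z; Y2:Z2' into a dict {Y: Z}, ignoring surrounding spaces."""
    rules = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        before, after = part.split(":", 1)
        rules[before.strip()] = after.strip()
    return rules

def write_symbol(text, symbol, rules):
    """Append symbol, or rewrite the previous character if a rule matches."""
    if text and text[-1] in rules:
        return text[:-1] + rules[text[-1]]
    return text + symbol

# Rules for vowel sign AA (U+0D3E), as in the example above:
# letter O -> letter OO; sign E -> sign O; sign EE -> sign OO.
AA = "\u0d3e"
rules = parse_rules("\u0d12:\u0d13; \u0d46:\u0d4a; \u0d47:\u0d4b")

# KA + sign E, then AA is written: sign E is replaced by sign O.
print(write_symbol("\u0d15\u0d46", AA, rules) == "\u0d15\u0d4a")
# KA alone, then AA is written: no rule matches, AA is simply appended.
print(write_symbol("\u0d15", AA, rules) == "\u0d15\u0d3e")
```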

10. Finally, give a language ID. It can be any name; usually we use the name of the language. Click ‘Save Engine’, give the file the name E_engine and click Save.

Step 3: ThulikaTrainer Load Engine

1. Copy the E_engine file from the PC to sdcard/Android/data/me.sinu.thulika.train/files/engines/ on your phone.

2. Run the ThulikaTrainer app on the phone.

3. Press the menu key and select Load Engine. The E_engine file will be loaded.

4. Draw a symbol and press the Recognize button. Test all the letters and make sure every one of them is recognized.

Example: Click here to see the details of all the letters for the Malayalam language.

 

Mail the letter files folder and E_engine to sjsuperapps at gmail dot com. If you have any questions, please mail this address. I’ll do my best to help you.

 


13 comments

  1. Whatever language I try to put into the application, it does not work.
    Even English appears as strange symbols!
    I don’t know exactly where the problem is.
    Any help?

  2. Can you tell me which technique is used for handwritten gesture acquisition? Also, did you use image processing here?

    1. It’s unsupervised learning. A bunch of images is given for training. No image processing is done. The image is just divided into a series of rows and columns, and each cell is checked to see whether it is on or off.

  3. Can you please tell me what technique is used for handwritten character acquisition, and did you use preprocessing here?
