
RIP Thulika Keyboard.. (sob).. (sob)

Just got to know that Google has come up with a handwriting recognition app – Google Handwriting Input.

And the good part is it has got Indian language support. Yay!!

But the bad part is that it means the demise of my very own Thulika Keyboard. I created it about 2 years back, and at that time there was no handwriting recognition app for any Indian language. I also open-sourced it, and I have had requests from many people to translate it to their languages (Arabic, Hausa and Tamil, to name a few). Some have even used it as part of their school/college projects.

RIP Thulika Keyboard, it was a pleasure to have you around (it’s still out there, so you can download it and play with it, but next to Google’s new app, mine seems a bit primitive 😦 ).


Google’s new app is awesome though: it recognises full words (mine could only do a letter at a time), and speaking of recognition, it’s very good at that.

So, here is me (Thulika), the old outdated tech giving way to the new stylish Google app 😉


Thulika Keyboard

Download the app from PlayStore
Early in the morning of January 6th (at about 3 AM), I created my Google Developer account and published my first Android app – Thulika Keyboard – in the Play Store.
Thulika Keyboard is a handwriting recognition keyboard. It is the first app that brings handwriting recognition to an Indian language (Malayalam for the time being!). The user writes a single symbol; the app recognizes it and may offer a list of possible suggestions. Writing a keyboard for a language like Malayalam is not simple, as Malayalam has a complex script compared to languages written in the Latin alphabet. Some symbols need more than one Unicode character to represent them. And in some cases more than one symbol makes up a single Unicode character!
Thulika uses a machine learning/AI technique called SOM (Self-Organising Map), via the Encog library. Encog is a great library and is quite easy to use. I trained it using my own handwriting and that of some of my friends. I started writing the app in October, took a break of nearly 2 weeks at the end of November, and completed the app by mid-December. Then I took it easy, waited till January to add some final polish, and published it 🙂
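For readers who have not met a SOM before, here is a minimal sketch of the idea in plain Python. It is not Encog's API and not the code Thulika uses; the function names, the 1-D line of neurons and all parameter values are assumptions made for illustration. Each neuron holds a weight vector, the neuron closest to an input "wins", and the winner and its neighbours are pulled towards that input.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the neuron whose weights are closest to x."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def train_som(samples, n_neurons, epochs=100, lr0=0.5, radius0=1.0, seed=0):
    """Train a tiny SOM whose neurons sit on a 1-D line (illustrative only)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        lr = lr0 * frac                        # learning rate shrinks over time
        radius = max(radius0 * frac, 1e-3)     # so does the neighbourhood
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for j in range(n_neurons):
                # Gaussian neighbourhood: neurons near the winner move more.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                weights[j] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[j], x)]
    return weights
```

To use something like this for recognition, one would turn each handwriting image into a fixed-length vector, train the map on those vectors, and then note which symbol each winning neuron corresponds to.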
The entire application consists of Thulika Keyboard (the actual keyboard for the end user), ThulikaTrainer (an Android app that helps in gathering handwriting samples for a particular language) and ThulikaMaker (a Java application that helps in providing additional details required for the keyboard and also creates the language recognition engine). Anyone can easily help me add new languages to the Thulika Keyboard. Read this post to understand how.

Training the Thulika Keyboard

Before you read:

Indian languages use complex scripts, unlike English. Each English letter is a single Unicode character, but a letter in an Indian language may be composed of more than one Unicode character.

The decomposition is as follows: LetterCode + modifyingSymbolCode

Eg: ക്രൊ = ( ക + ്ര + ൊ )

So, as you can see, the code that comes first (ക) need not be the first symbol that the user writes (െ). Also, some symbols (like the one the user writes after െ) have no single Unicode code point of their own (it has to be written as ്ര).
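To see this at the code point level, the short Python sketch below lists what the cluster is made of. It uses only the standard `unicodedata` module; the variable names are mine. Note that in Unicode the sign ്ര is stored as two code points (virama followed by ര), and the vowel sign ൊ itself decomposes canonically into െ followed by ാ.

```python
import unicodedata

# The cluster from the example above, written with escapes so that the
# code points are explicit: KA, VIRAMA, RA, VOWEL SIGN O.
cluster = "\u0d15\u0d4d\u0d30\u0d4a"

# Print each stored code point with its official Unicode name.
for ch in cluster:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Canonical decomposition (NFD) splits the two-part vowel sign into the
# part drawn on the left and the part drawn on the right.
decomposed = unicodedata.normalize("NFD", cluster)
print([f"U+{ord(ch):04X}" for ch in decomposed])
```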

Also, the code that’s already written might have to be modified according to the symbol that the user writes next. For example:

If the user writes ാ and the symbol just before it is െ, then we have to replace െ with ൊ.
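This particular replacement happens to match a canonical equivalence defined by Unicode itself, which the sketch below demonstrates with Python's standard `unicodedata` module. This is only an illustration of the equivalence: `append_symbol` is a hypothetical helper, and Thulika uses its own per-symbol rules (described later in this post) rather than Unicode normalization.

```python
import unicodedata

SIGN_E = "\u0d46"   # െ  (written to the left of the letter)
SIGN_AA = "\u0d3e"  # ാ  (written to the right of the letter)
SIGN_O = "\u0d4a"   # ൊ  (the combined two-part sign)


def append_symbol(text, symbol):
    """Append a symbol, then let NFC merge any pair Unicode treats as one sign."""
    return unicodedata.normalize("NFC", text + symbol)


# The text so far ends in െ; the user now writes ാ.
result = append_symbol("\u0d15" + SIGN_E, SIGN_AA)
print(result == "\u0d15" + SIGN_O)
```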

 

Now onto the training part:

This post describes the steps needed to configure new languages/symbols for the Thulika Keyboard app. For this, you have to download the ThulikaTrainer app.

Training the Thulika keyboard on new inputs requires three steps:

  1. Save handwriting images of all the symbols of the language you want the keyboard to recognise (this is done using the ThulikaTrainer app on your Android phone)
  2. Run ThulikaMaker on your PC
  3. Load the engine in ThulikaTrainer

Step 1: ThulikaTrainer app

1. Run the app for the first time. It’ll create a directory structure on your sdcard. Make sure the following directory has been created: sdcard/Android/data/me.sinu.thulika.train/files. There’ll be two folders under it: engines and letters.
2. Create a text file alpha.txt in sdcard/Android/data/me.sinu.thulika.train/files. This file should contain the symbols that the keyboard should recognize. Separate each symbol from the next with a comma (,).

For example, if you want to train the symbols a, b, c, 1 and 2, then alpha.txt should contain:

a, b, c, 1, 2

You can create the alpha.txt file on a PC and then copy it to the respective folder on your sdcard.

3. Now if you run the app, the symbols that you have specified in alpha.txt will be present on each of the buttons at the bottom of the app.

4. Draw a symbol on the screen and then press the button that corresponds to the symbol you have drawn. It’ll add a file under sdcard/Android/data/me.sinu.thulika.train/files/letters

If you have made a mistake while drawing, you can press the Clear button. If you accidentally pressed the wrong symbol button, press ‘Delete Last’; it will delete the last file you created.

Please make sure you press the correct button corresponding to the symbol you have drawn. This is what ensures the correctness of the keyboard.

You can draw more than one sample for each symbol button. It doesn’t matter, but it’ll be better if one person provides only one set of symbols. You can ask another person to provide images for the entire set again, so that there will be more than one handwriting for each symbol. But in the beginning, make sure you have at least one set of handwriting images for the entire symbol set.
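If you would rather generate alpha.txt on your PC than type it by hand, a small script along these lines should do. It is a sketch: the helper names are hypothetical, and UTF-8 encoding and the space after each comma are assumptions (the latter simply copies the example above).

```python
from pathlib import Path


def format_alpha(symbols):
    """Join symbols into the comma-separated form used in alpha.txt."""
    cleaned = [s.strip() for s in symbols]
    if any(not s for s in cleaned):
        raise ValueError("empty symbol")
    if any("," in s for s in cleaned):
        raise ValueError("a symbol cannot contain the separator ','")
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("duplicate symbol")
    return ", ".join(cleaned)


def parse_alpha(text):
    """Split alpha.txt content back into a list of symbols."""
    return [part.strip() for part in text.split(",") if part.strip()]


def write_alpha(path, symbols):
    """Write alpha.txt; UTF-8 is assumed, the post does not name an encoding."""
    Path(path).write_text(format_alpha(symbols), encoding="utf-8")
```

Calling `write_alpha("alpha.txt", ["a", "b", "c", "1", "2"])` produces a file with the same content as the example in step 2, which you can then copy to the phone.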

 

Step 2: Run ThulikaMaker on your PC

1. Click here to download ThulikaMaker.jar

2. Copy the folder sdcard/Android/data/me.sinu.thulika.train/files/letters from your phone to your PC.

3. Run ThulikaMaker.jar.


4. Enter a width and height. For south Indian languages, width=13 and height=10 should work. It’s usually a matter of trial and error to find a good width and height. For letters like English, give width=7, height=7. For numbers, give width=5, height=5. Try to keep the product width × height as low as possible.

5. Click ‘Load Bundle’ and select the ‘letters’ folder that you have copied to the PC. Click open.

6. Wait a minute for the files to load. You can then select any entry to view its image. Press ‘Delete File’ if you want to delete a file (for example, if you find that its image is wrong).

7. The Alternatives text is not used at present (skip it).

8. Align value (the default is 0):

If the image is a symbol that comes on the left side of a letter (like െ, േ in Malayalam), set Align=-1

If the image is a symbol that comes on the right side of a letter (like ു, ൂ, ൃ in Malayalam), set Align=1

If it is a usual symbol, set Align=0. For Latin letters and numerals, Align will be 0.

If you modify align/rules for a symbol, make sure you click ‘Save Symbol’

9. Rules: Give a rule if writing this symbol should change the output depending on the letter that comes before it. This applies only to languages with complex scripts, such as Indian languages. For English and numerals, there will not be any rules.

For a symbol X, the rules describe what happens when X is written right after a particular letter. If the letter before is Y and you want it to turn into Z, give the rule as Y:Z. If the letter before is B and you want it to turn into BX, give the rule as B:BX. Separate multiple rules with semicolons.

For example: symbol:ാ align:1 rules:ഒ:ഓ; െ:ൊ; േ:ോ

If you modify align/rules for a symbol, make sure you click ‘Save Symbol’

10. Finally, give a language ID. It can be any name; usually we give the name of the language. Click ‘Save Engine’, give the name E_engine and click Save.
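To make the rule format from step 9 concrete, here is a rough Python sketch of how a rules string such as the one in the example could be parsed and applied. This is my reading of the format, not ThulikaMaker's actual code: `parse_rules` and `apply_symbol` are hypothetical names, and the behaviour when no rule matches (simply append the symbol) is an assumption.

```python
def parse_rules(text):
    """Parse 'Y:Z; B:BX' into a dict mapping the letter before to its replacement."""
    rules = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        before, sep, after = part.partition(":")
        before, after = before.strip(), after.strip()
        if not sep or not before or not after:
            raise ValueError(f"malformed rule: {part!r}")
        rules[before] = after
    return rules


def apply_symbol(text, symbol, rules):
    """Apply the rules of a newly written symbol to the text written so far."""
    # Check longer 'before' strings first so the most specific rule wins.
    for before in sorted(rules, key=len, reverse=True):
        if text.endswith(before):
            return text[: len(text) - len(before)] + rules[before]
    # Assumption: with no matching rule, the symbol is just appended.
    return text + symbol


# Rules for the symbol ാ from the example in step 9, written with escapes:
# ഒ:ഓ; െ:ൊ; േ:ോ
AA_RULES = parse_rules("\u0d12:\u0d13; \u0d46:\u0d4a; \u0d47:\u0d4b")
```

With these rules, writing ാ after text ending in െ replaces that െ with ൊ, while writing ാ after a plain letter just appends it.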

Step 3: ThulikaTrainer Load Engine

1. Copy the E_engine file from your PC to sdcard/Android/data/me.sinu.thulika.train/files/engines/ on your phone.

2. Run the ThulikaTrainer app on the phone.

3. Press the menu key and select Load Engine. The E_engine file will be loaded.

4. Draw a symbol and press the Recognize button. Test all the letters, and make sure every one of them is recognized correctly.

Example: Click here to see the details of all the letters for Malayalam language.

 

Mail the letters folder and the E_engine file to sjsuperapps at gmail dot com. If you have any questions, please mail me at this address. I’ll do my best to help you.