This document extracts the behavioral logic of LimeIME's input method service from the Android Java sources and presents it as a platform-independent specification. It covers the complete pipeline from key event to composing to search to candidates to commit to learning.
Source files analyzed: LIMEService.java, SearchServer.java, LIMEKeyboardSwitcher.java, CandidateView.java, LimeDB.java, Mapping.java, LIME.java, LIMEPreferenceManager.java
IMService is the bridge between the Virtual Keyboard, the SearchServer, and the operating system's text input API. It receives key events from the keyboard, builds composing code sequences, queries SearchServer for candidates, manages candidate selection, commits text to the active text field, and coordinates the learning flow.
Virtual Keyboard
│ key events (onKey)
▼
IMService
│ composing code → query
▼
SearchServer
│ candidate list
▼
IMService
│ display candidates
▼
CandidateView
│ user picks candidate
▼
IMService
│ commit text → OS text API
│ learn score/phrase → SearchServer
▼
OS Text Input (Android: InputConnection / iOS: textDocumentProxy)
When a text field gains focus, initOnStartInput() configures the IMService based on the text field type.
| Input Type | mEnglishOnly |
mPredictionOn |
Keyboard Mode |
|---|---|---|---|
| NUMBER, DATETIME | true | true | MODE_TEXT (symbol keyboard) |
| PHONE | true | true | MODE_PHONE |
| PASSWORD, WEB_PASSWORD, VISIBLE_PASSWORD | true | false | MODE_EMAIL |
| EMAIL_ADDRESS, WEB_EMAIL_ADDRESS | true | false | MODE_EMAIL |
| URI | true | false | MODE_URL |
| SHORT_MESSAGE | false | true | MODE_IM (Chinese) |
| TEXT (default) | from persistent setting | true | MODE_TEXT or Chinese IM |
| TEXT with NO_SUGGESTIONS | — | false | — |
| TEXT with AUTO_COMPLETE | — | false | completion mode in fullscreen |
- Check keyboard theme change — recreate input view if changed
- Load IM keyboard config list from SearchServer
- Reset keyboards if arrow key or split keyboard setting changed
- Load user settings (
hasVibration,hasSound,mPersistentLanguageMode) - Build activated IM list from preferences
- Reset composing state:
mPredictionOn=true,mCompletionOn=false,mCapsLock=false - Clear
tempEnglishWordandtempEnglishList - Apply text field type configuration (table above)
- If English-only and no prediction → force hide candidate view
- Else → clear composing
iOS UIInputViewController receives textDocumentProxy.keyboardType (.default, .emailAddress, .URL, .numberPad, .phonePad, etc.) and textDocumentProxy.returnKeyType. Map these to the same configuration table. iOS does not have composing region styling, so mPredictionOn controls whether composing simulation is active.
iOS Porting Note — Auto-capitalization: LimeIME manually implements auto-capitalization via updateShiftKeyState() checking EditorInfo flags. iOS provides textDocumentProxy.autocapitalizationType (.none, .words, .sentences, .allCharacters) which the keyboard extension can read directly. Use this instead of manual implementation.
| Variable | Type | Purpose |
|---|---|---|
mComposing |
StringBuilder | Accumulates typed code characters |
LDComposingBuffer |
String | Tracks original composing string during continuous typing (LD) |
mPredictionOn |
boolean | Whether composing/prediction is active |
mCompletionOn |
boolean | Whether app-provided completions are active |
| Variable | Type | Purpose |
|---|---|---|
selectedCandidate |
Mapping | Currently highlighted candidate |
committedCandidate |
Mapping | Last committed candidate (for related phrase lookup) |
mCandidateList |
LinkedList<Mapping> | Current displayed candidate list |
hasCandidatesShown |
boolean | Whether candidate view is visible |
hasChineseSymbolCandidatesShown |
boolean | Whether Chinese punctuation candidates are shown |
hasMappingList |
boolean | Whether current list has database mappings |
| Variable | Type | Purpose |
|---|---|---|
mEnglishOnly |
boolean | true = English mode, false = Chinese IM mode |
mEnglishFlagShift |
boolean | Shift key flag for English/Chinese toggle |
activeIM |
String | Current input method code (e.g., "phonetic", "cj", "dayi") |
hasSymbolMapping |
boolean | Whether current IM accepts symbol-key codes |
hasNumberMapping |
boolean | Whether current IM accepts number-key codes |
currentSoftKeyboard |
String | Current soft keyboard configuration name |
mCapsLock |
boolean | Caps lock state |
mAutoCap |
boolean | Auto-capitalization enabled |
| Variable | Type | Purpose |
|---|---|---|
tempEnglishWord |
StringBuffer | Accumulates English characters for prediction |
tempEnglishList |
List<Mapping> | English prediction suggestions |
| Variable | Type | Purpose |
|---|---|---|
auto_commit |
int | Auto-commit threshold (0 = off) |
Entry point: onKey(int primaryCode, int[] keyCodes, int x, int y)
If mCapsLock is true and code is lowercase letter (97–122), convert to uppercase (subtract 32).
| Key Code | Action |
|---|---|
KEYCODE_DELETE |
handleBackspace() |
KEYCODE_SHIFT |
handleShift() |
KEYCODE_DONE |
handleClose() |
KEYCODE_UP/DOWN/LEFT/RIGHT |
Arrow key via keyDownUp() |
KEYCODE_OPTIONS |
handleOptions() |
KEYCODE_SPACE_LONGPRESS |
showIMPicker() — show input method picker |
KEYCODE_SWITCH_TO_SYMBOL_MODE |
switchKeyboard() — switch to symbol keyboard |
KEYCODE_SWITCH_SYMBOL_KEYBOARD |
switchKeyboard() — cycle symbol keyboards (1/2/3) |
KEYCODE_NEXT_IM |
switchToNextActivatedIM(true) — next Chinese IM |
KEYCODE_PREV_IM |
switchToNextActivatedIM(false) — previous Chinese IM |
KEYCODE_SWITCH_TO_ENGLISH_MODE |
switchKeyboard() — Chinese → English |
KEYCODE_SWITCH_TO_IM_MODE |
switchKeyboard() — English → Chinese |
Conditions for candidate picking via Space:
- Space key AND not English-only AND not Phonetic IM
- OR: Space key AND not English-only AND Phonetic IM AND (composing ends with space OR composing is empty)
- OR: Enter key
If candidates are shown → pickHighlightedCandidate(). If pick fails and composing is empty → hide candidate view and send the key character. If no candidates shown → sendKeyChar().
All other keys → handleCharacter(primaryCode), followed by auto-commit check:
- If
auto_commit > 0AND not English-only AND composing length ==auto_commitAND current keyboard is phonetic →commitTyped().
handleCharacter() determines whether to add a character to the composing buffer based on the current IM's mapping capabilities. The character is accepted into composing if it matches one of these conditions:
| Condition | Accepted Characters |
|---|---|
| No symbol mapping, no number mapping | Letters, Space (phonetic only), comma, period |
| No symbol mapping, has number mapping | Letters, digits |
| Has symbol mapping, no number mapping | Letters, symbols, Space (phonetic only) |
| Has symbol mapping, no number mapping, Array IM | Digits after "w" prefix (Array special code) |
| Has symbol mapping, has number mapping | Letters, digits, symbols, Space (phonetic only) |
When a character is accepted:
mComposing.append((char) primaryCode)ic.setComposingText(mComposing, 1)— display composing in text fieldupdateCandidates()— query SearchServer for candidates
When a character is NOT accepted (doesn't match any condition):
pickHighlightedCandidate()— commit current candidate if anyic.commitText(String.valueOf((char) primaryCode), 1)— send character directlyfinishComposing()— clear composing state
When mEnglishOnly is true:
- Apply shift if keyboard is shifted →
Character.toUpperCase(primaryCode) - If English prediction is enabled and character is a letter → append to
tempEnglishWord, callupdateEnglishPrediction() - If not a letter →
resetTempEnglishWord(), callupdateEnglishPrediction() commitText(String.valueOf((char) primaryCode), 1)— commit character immediately (no composing buffer in English mode)
handleBackspace() behavior depends on composing length:
| Composing Length | Action |
|---|---|
| > 1 | Delete last character from mComposing, update composing text, re-query candidates |
| == 1 | clearComposing(true) — clear composing and force-clear system buffer |
| == 0, candidates shown (not Chinese symbol) | clearComposing(false) — clear candidates |
| == 0, candidates shown (Chinese symbol) | Hide candidate view |
| == 0, English mode with prediction | Delete last char from tempEnglishWord, re-query English predictions |
| == 0, no candidates | Send KEYCODE_DEL to delete in text field |
For Stroke5 input method, composing is truncated to maximum 5 characters. If the user types a 6th character, it is discarded and composing is reset to the first 5.
After querying candidates, keyToKeyname(code) converts the typed code sequence to display-friendly symbol names. If the converted string differs from the original code, it is shown in the composing bar above the candidate list.
Examples:
- Phonetic: "bp" → "ㄅㄆ"
- Cangjie: "ab" → "日月"
- Dayi: typed shifted keys → Dayi radical names
This is cached per code in keynamecache for performance.
Different phonetic keyboard layouts map the same QWERTY key to different Zhuyin (Bopomofo) symbols. preProcessingRemappingCode() in LimeDB translates the typed key sequence to the canonical phonetic code before database lookup.
Standard Phonetic: No remapping. Keys map directly to Zhuyin (defined in keyboard XML labels).
ETEN 41-key: Single remap table. Each QWERTY key maps to one Zhuyin symbol regardless of position.
- Remap:
ETEN_KEY→ETEN_KEY_REMAP(character-by-character substitution)
ETEN 26-key: Dual remap tables — INITIAL and FINAL — because some keys map to different Zhuyin depending on whether they appear at a non-final or final position in the syllable.
- Non-final position:
ETEN26_KEY→ETEN26_KEY_REMAP_INITIAL - Final position:
ETEN26_KEY→ETEN26_KEY_REMAP_FINAL - Exception: keys "q", "w", "d", "f", "j", "k" always use INITIAL remap even when single character
- Position detection: if preceding characters match
[dfjk ]$, the next character uses INITIAL (new syllable starting)
HSU: Dual remap tables with similar logic.
- Non-final position:
HSU_KEY→HSU_KEY_REMAP_INITIAL - Final position:
HSU_KEY→HSU_KEY_REMAP_FINAL - Exception: keys "a", "e", "s", "d", "f", "j" always use INITIAL remap when single character
- Position detection: if preceding characters match
[sdfj ]$, the next character uses INITIAL
Shifted Key Remapping (also applies to other IMs):
- Dayi, EZ, standard Phonetic:
SHIFTED_NUMBERIC_KEY("!@#$%^&*()") →SHIFTED_NUMBERIC_KEY_REMAP("1234567890") andSHIFTED_SYMBOL_KEY("<>?_:+"") →SHIFTED_SYMBOL_KEY_REMAP(",./-;='") - Array:
SHIFTED_SYMBOL_KEY→SHIFTED_SYMBOL_KEY_REMAPonly
Remap tables are cached per (tableName + phoneticKeyboardType) combination in a HashMap<String, HashMap<String, String>>.
Checked after every handleCharacter() call:
- Condition:
auto_commit > 0AND not English-only ANDmComposing.length() == auto_commitAND current keyboard is phonetic (keyboard name contains "phone") - Action:
commitTyped(ic)— automatically commits the best candidate
When the user types more code characters than needed for the first matched word, the system supports continuous typing without requiring the user to explicitly select each candidate:
- After
commitTyped(), check:mComposing.length() > selectedCandidate.getCode().length() - If true →
composingNotFinish = true - Determine real code length via
getRealCodeLength(selectedCandidate, mComposing):- For phonetic: strips tone symbols [3467 ] from the matched code to find the boundary
- For ETEN: applies
preProcessingRemappingCode()before boundary detection - If code is dual-mapped: uses full composing length (abandons LD)
- Trim committed code from composing:
mComposing.delete(0, committedCodeLength) - Strip leading space if present
- Buffer the mapping for LD learning:
SearchSrv.addLDPhrase(selectedCandidate, false) - If remaining composing > 0:
ic.setComposingText(mComposing, 1)andupdateCandidates()— re-query for remaining code - Track full original composing in
LDComposingBuffer
When composing finally finishes (no remaining code):
- If
LDComposingBufferis not empty →SearchSrv.addLDPhrase(selectedCandidate, true)— final mapping, triggers LD learning - If
LDComposingBufferis empty (LD interrupted) →SearchSrv.addLDPhrase(null, true)— signal end without learning
updateCandidates() runs on a background thread:
- Stroke5 length check: If WB keyboard, truncate composing to 5 characters
- Main query:
SearchSrv.getMappingByCode(code, isSoftKeyboard, getAllRecords)code: current composing stringisSoftKeyboard: true for soft keyboard (affects candidate ordering)getAllRecords: false for initial query, true for "show all" request
- Thread yield: Short sleep to allow interruption by newer query
- Selection key setup:
- Get base selkey from
SearchSrv.getSelkey()(typically "1234567890" or IM-specific) - Determine
mixedModeSelkey: space (" ") for symbol-mapping IMs (except Dayi and standard phonetic), backtick ("`") for others - Apply
selkeyOption: 0 = no prepend, 1 = prepend mixedModeSelkey, 2 = prepend mixedModeSelkey + space
- Get base selkey from
- Emoji injection (if enabled):
- Try English emoji lookup on first candidate word
- If no English match, try Traditional Chinese (TW) lookup on second candidate
- If no TW match, try Simplified Chinese (CN) lookup
- Deduplicate emoji by word
- Insert at configurable position (
getEmojiDisplayPosition(), default: 6)
- Display candidates:
setSuggestions(list, hasPhysicalKeyPressed, selkey) - Keyname display: Convert code to display symbols via
keyToKeyname(), show in composing bar if different from raw code
If query returns empty → clearSuggestions() (which may trigger Chinese punctuation display if autoChineseSymbol is enabled).
In setSuggestions():
- If candidate at index 1 exists and
isExactMatchToCodeRecord()→ select index 1 (skip the composing code at index 0) - Else → select index 0
When smartChineseInput is enabled, makeRunTimeSuggestion() builds phrase candidates incrementally:
- Maintains
suggestionLoL— a list of lists ofPair<Mapping, code>representing candidate phrase chains - As user types more characters, extends existing suggestions by combining previous best matches with new exact matches for the remaining code
- Validates combinations against the
relatedtable (phrase must exist as a related word pair) - Scores combinations by base score, averaged over word length
- Best suggestion list is kept at the last position of
suggestionLoL - Results are tagged as
RECORD_RUNTIME_BUILT_PHRASE - On backspace: removes suggestions that used the deleted character's code
- On composing restart (length == 1): clears all suggestions
Each updateCandidates() call checks if a previous query thread is alive and interrupts it. The query thread checks for interruption at yield points and terminates early if interrupted. This prevents stale results from overwriting newer queries.
A parallel candidate pipeline active when mEnglishOnly is true and englishPrediction is enabled.
- Letter character → append to
tempEnglishWord, callupdateEnglishPrediction() - Non-letter character →
resetTempEnglishWord(), callupdateEnglishPrediction() - Backspace → delete last char from
tempEnglishWord, callupdateEnglishPrediction()
updateEnglishPrediction() runs on a background thread:
- Cursor context validation:
- Check text after cursor: must be empty or non-alphanumeric
- Check text before cursor: must match
tempEnglishWord(prevents stale predictions when cursor moved)
- Query:
SearchSrv.getEnglishSuggestions(tempEnglishWord) - Build candidate list: Insert
tempEnglishWorditself as first item (composing code record), then add suggestions - Emoji injection: Same logic as Chinese flow but English-only lookup (
EMOJI_EN) - Store in
tempEnglishList: Used later for commit handling - Display:
setSuggestions(list, hasPhysicalKeyPressed, "1234567890")
If tempEnglishWord is empty or query returns empty → clearSuggestions().
iOS provides built-in English word completion that keyboard extensions can use, eliminating the need to port LimeIME's custom English dictionary and prediction logic.
UITextChecker — Apple's word completion and spell checking API:
completions(forPartialWordRange:in:language:)— returns word completions for a partial stringguesses(forWordRange:in:language:)— returns spelling correction suggestions- Available to keyboard extensions without "Allow Full Access"
- Supports multiple languages
UILexicon — personalized vocabulary (via requestSupplementaryLexicon() on UIInputViewController):
- Contains words from user's contacts, text shortcuts, and paired device vocabulary
- Returns
UILexiconEntryobjects withuserInput→documentTextmappings - Supplements
UITextCheckerwith personalized vocabulary
What to drop on iOS:
- English dictionary data in
lime.db SearchServer.getEnglishSuggestions()implementationtempEnglishWord/tempEnglishListstate tracking- Cursor context validation logic (
getTextBeforeCursor/getTextAfterCursormatching)
What to keep:
- Emoji injection on top of
UITextCheckerresults (wrap results asMappingobjects withRECORD_ENGLISH_SUGGESTIONtype, then apply the same emoji lookup) - Candidate display integration (feed
UITextCheckercompletions into the candidate bar)
What iOS does NOT provide: The system's full predictive text engine (the one powering the built-in keyboard). Third-party extensions cannot access that. UITextChecker provides word completion (prefix matching) and spell correction, which is functionally equivalent to getEnglishSuggestions().
pickHighlightedCandidate(): Called by Space/Enter key. Delegates tomCandidateView.takeSelectedSuggestion()which calls backpickCandidateManually(index).pickCandidateManually(int index): Called when user taps a candidate or via selection key. SetsselectedCandidate = mCandidateList.get(index), then dispatches to one of the three commit paths below.
Path 1 — Completion Suggestion (fullscreen mode with app completions):
- Condition:
mCompletionOnand candidateisPartialMatchToCodeRecord() - Action:
ic.commitCompletion(mCompletions[index])
Path 2 — Chinese / Related Phrase Candidate:
- Condition:
mComposing.length() > 0OR candidate is not a composing code record (i.e., it's a related phrase or database match); AND not English-only mode - Action:
commitTyped(ic)(see detailed flow below) - After commitTyped returns,
updateRelatedPhrase(false)is called from insidecommitTyped()post-commit block (not frompickCandidateManuallydirectly) — see step 7 of commitTyped() Detailed Flow.
Path 3 — English Prediction:
- Condition: English-only mode with prediction
- If emoji:
ic.commitText(word + " ", 0) - If regular word:
ic.commitText(word.substring(tempEnglishWord.length()) + " ", 0)— commits only the suffix not yet typed, plus a space - Then:
resetTempEnglishWord(),clearSuggestions()
Related Phrase Chaining: When the user selects a candidate from the related phrase list, the same commit path (Path 2) runs again, because the candidate is not a composing code record and mComposing is empty (condition holds). updateRelatedPhrase(false) fires again after each related phrase commit, displaying the next level of associated words. This chaining continues as long as the related table has entries for each successive committed word.
- Null check: If
selectedCandidateis null, return - Surrogate check: If word contains Unicode surrogate (emoji), commit directly and clear composing
- Han conversion: If
hanConvertOption!= 0, applySearchSrv.hanConvert(word)before committing.- iOS Porting Note: LimeIME uses
hanconvertv2.dbfor simple 1-to-1 character-by-character lookup (no context-dependent conversion). iOS providesCFStringTransformwithkCFStringTransformToSimplifiedChinese/kCFStringTransformToTraditionalChinesewhich is functionally equivalent or better — uses Apple's maintained mapping, no need to shiphanconvertv2.db. Drop:LimeHanConverterclass andhanconvertv2.db. Replace with a singleCFStringTransform()call.
- iOS Porting Note: LimeIME uses
- Commit text:
ic.commitText(wordToCommit, 1) - Special clear: If Stroke5 (wb), emoji, or Chinese punctuation →
clearComposing(true) - Continuous typing check (LD composing):
- Get
committedCodeLengthviaSearchSrv.getRealCodeLength(selectedCandidate, mComposing) - If
mComposing.length() > selectedCandidate.getCode().length()→ composing not finished:- Buffer mapping:
SearchSrv.addLDPhrase(selectedCandidate, false) - Trim composing:
mComposing.delete(0, committedCodeLength) - Strip leading space if present
- If remaining composing > 0 →
ic.setComposingText(mComposing, 1)+updateCandidates()
- Buffer mapping:
- If composing finished:
- If
LDComposingBuffernot empty →SearchSrv.addLDPhrase(selectedCandidate, true)(final) - Else →
SearchSrv.addLDPhrase(null, true)(interrupted)
- If
- Get
- Post-commit (only if composing finished, no continuous typing):
committedCandidate = new Mapping(selectedCandidate)— save for related phraseclearComposing(false)updateRelatedPhrase(false)— display related phrases as next candidatesSearchSrv.learnRelatedPhraseAndUpdateScore(committedCandidate)— learn score + relatedSearchSrv.getCodeListStringFromWord(committedCandidate.getWord())— reverse lookup
When a RUNTIME_BUILT_PHRASE candidate is selected, getRealCodeLength() spawns a background thread to learn all sub-phrases from bestSuggestionList:
- For each sub-phrase where the selected word starts with the sub-phrase word:
- If word length > 8 → stop learning
dbadapter.addOrUpdateMappingRecord(code, word)— save as new mappingremoveRemappedCodeCachedMappings(code)— invalidate cache
updateRelatedPhrase(boolean getAllRecords) runs after commit on a background thread:
- Check
committedCandidatehas a non-empty word — return if null/empty - Check
hasMappingListis true — return if no mapping results were ever displayed (nothing to relate against) - Skip query if committed candidate is emoji or Chinese punctuation — call
clearSuggestions()and return - Query:
SearchSrv.getRelatedByWord(committedCandidate.getWord(), getAllRecords)withgetAllRecords = false(top results only) - If results non-empty: Display as candidate list with selkey
"1234567890"viasetSuggestions() - If results empty: Set
committedCandidate = null, callclearSuggestions()— candidate bar returns to empty state
getAllRecords parameter: Always false when called from commitTyped(). The true variant is only used for "show all" expansion requests, not after normal commits.
Learning has two timing tiers:
- Immediate (per-commit): Score cache update — fires a background thread on every candidate selection to update in-memory ordering and persist to DB.
- Deferred (session-end): Related phrase (RP) learning and LD phrase learning — processed in a single background thread when
postFinishInput()is called (text field loses focus).
learnRelatedPhraseAndUpdateScore(committedCandidate) is called from commitTyped() step 7 immediately after every commit:
- Thread-safe enqueue: Adds a copy of
committedCandidateto the synchronizedscorelist(used later bypostFinishInput()). - Spawn background thread to call
updateScoreCache(mapping):- Calls
dbadapter.addScore(mapping)— persists the incremented score to the IM table in DB. - For related phrase records (
isRelatedPhraseRecord()): Removes the cache entry for that code, forcing a fresh DB query next time. - For partial match records (
isPartialMatchToCodeRecord()): Removes the whole cache entry for that code. - For exact match records (
isExactMatchToCodeRecord()): IfsortSuggestionsis enabled, finds the matching entry in the in-memory cache list, increments its score, and moves it up if its new score exceeds the preceding entry (insertion-sort step). IfsortSuggestionsis disabled, still increments the cached score but doesn't reorder. Then callsupdateSimilarCodeCache(code)to rebuild related-code (similar-code prefix) caches.
- Calls
postFinishInput() is called when the text field loses focus (input session ends). It spawns a single background thread that runs two learners in sequence, operating on snapshots to avoid locking the input thread:
- Snapshot
scorelist: Thread-safely copiesscorelistintoscorelistSnapshotand clearsscorelist. - Run
learnRelatedPhrase(scorelistSnapshot)— see RP Learning below. - Snapshot
LDPhraseListArray: Copies into a local list and clears the global array. - Run
learnLDPhrase(localLDPhraseListArray)— see LD Learning below.
learnRelatedPhrase(localScorelist) runs inside postFinishInput()'s background thread:
- Guard:
mLIMEPref.getLearnRelatedWord()must be enabled ANDlocalScorelist.size() > 1(need at least two consecutive selections to form a pair). - Iterates consecutive pairs
(unit at i, unit2 at i+1)in the scorelist. - Pair acceptance conditions:
- Both
unit.wordandunit2.wordare non-empty. unitmust be: exact match, partial match, or related phrase record.unit2must be: exact match, partial match, related phrase, Chinese punctuation, or emoji record.
- Both
- Action:
dbadapter.addOrUpdateRelatedPhraseRecord(unit.word, unit2.word)— incrementsscorein therelatedtable and returns the new score. - LD trigger: If returned score > 20 AND
mLIMEPref.getLearnPhrase()is enabled →addLDPhrase(unit, false)+addLDPhrase(unit2, true)— feeds the pair as a completed phrase into the LD buffer.
Note on related phrase records in the scorelist: When the user selects a candidate from the related phrase list (after a previous commit), learnRelatedPhraseAndUpdateScore() is called again for that selection, so it too goes onto the scorelist. This means the related-word chain (commit A → select from related → commit B → select from related → commit C) is fully captured as consecutive pairs: (A,B) and (B,C) are both learned as related associations.
addLDPhrase(mapping, ending) buffers mappings into LDPhraseListArray throughout the session:
- If
mappingis not null → append to currentLDPhraseList. - If
endingis true: ifLDPhraseList.size() > 1→ save it toLDPhraseListArrayand start a new empty list; if size ≤ 1 → discard it (single word is not a learnable phrase). - If
mappingis null ANDendingis true → discard current list (interrupted sequence — e.g., LD composing was interrupted mid-sequence).
addLDPhrase call sites:
- From
commitTyped()during continuous typing (LD composing) — oneaddLDPhrase(mapping, false)per intermediate commit, thenaddLDPhrase(mapping, true)at the final commit. - From
learnRelatedPhrase()when RP score exceeds 20 —addLDPhrase(unit, false)+addLDPhrase(unit2, true)for high-frequency pairs.
learnLDPhrase(localLDPhraseListArray) runs inside postFinishInput()'s background thread. For each accumulated phrase list (max 4 characters):
- Get first unit's code: If the unit has no code (selected from related phrase list), do reverse lookup via
dbadapter.getMappingByWord()to find the canonical code. - Build codes character by character:
- LDCode: Full concatenation of each character's code (e.g., "su3" + "cl3" = "su3cl3").
- QPCode: First character of each character's code (e.g., "s" + "c" = "sc").
- For multi-character words within the phrase: Break down into individual characters, look up each one's code.
- For phonetic table: Strip tone symbols
[3467 ]from LDCode → e.g., "su3cl3" becomes "sucl". - Write to database:
- If phonetic and LDCode length > 1:
addOrUpdateMappingRecord(LDCode, baseWord) - If phonetic and QPCode length > 1:
addOrUpdateMappingRecord(QPCode, baseWord) - If non-phonetic and baseCode length > 1:
addOrUpdateMappingRecord(baseCode, baseWord)
- If phonetic and LDCode length > 1:
- Invalidate caches:
removeRemappedCodeCachedMappings()+updateSimilarCodeCache()
QPCode is part of LD learning. It creates a first-letter shortcut for learned phrases:
- QPCode = first character of each constituent word's code, concatenated.
- Example: 你好 with codes "su3" + "cl3" → QPCode = "sc".
- User can later type "sc" and "你好" appears as a candidate.
- Only generated for phonetic table (other IMs use the full concatenated code as LDCode).
LimeIME has two independent levels of keyboard/IM switching:
- OS-level switching — switch between LimeIME and other system keyboards (e.g., iOS built-in, other 3rd-party keyboards). LimeIME registers as a single keyboard with the OS.
- LIME-internal switching — switch between different input methods (Phonetic, Cangjie, Dayi, English, Symbol, etc.) within LimeIME itself. The OS is not aware of these internal IMs.
| Key Action | Function |
|---|---|
| Keyboard key — single press | Hide the soft keyboard |
| Keyboard key — long press | Show options menu (switch to other system keyboards, Han conversion settings, etc.) |
| Space key — slide left/right | Switch to next/previous IM within LIME |
| Space key — long press | Show LIME IM picker (list of activated IMs) |
| NEXT_IM / PREV_IM keys | Switch to next/previous IM within LIME (if keyboard layout includes these keys) |
| SWITCH_TO_ENGLISH / SWITCH_TO_IM keys | Toggle between Chinese IM and English mode |
| SWITCH_TO_SYMBOL keys | Toggle to symbol keyboard |
iOS keyboard extensions must implement OS-level keyboard switching, but the globe key does not need its own dedicated bottom-row position. Apple's contract (stable since iOS 11):
- Check
UIInputViewController.needsInputModeSwitchKeyon every layout pass (e.g., inviewWillLayoutSubviews) - If
false(modern iPhone X and later — virtually all current iPhones): The system draws its own globe/dictation row below the keyboard's input view. The user taps the system's globe and UIKit handles the switch — LIME does nothing. - If
true(iPad in some configurations, older Home-button iPhones, certain embedded contexts): LIME must provide an in-keyboard control that callsadvanceToNextInputMode(). Apple does not require this control to be a dedicated, always-visible globe button — it can be any easily-accessible affordance. - Re-check on trait/size changes — the value can change (e.g., iPad split view, external keyboard attach/detach)
LIME design — long-press the keyboard key:
LIME does not show an independent globe key. The design mirrors the existing LIME Android behavior, with one visual difference for iOS:
- Keyboard key — single press: Hide the soft keyboard (same on Android and iOS)
- Keyboard key — long press:
- Visual preview: Android shows a keyboard icon preview; iOS shows a globe icon (🌐) preview to satisfy Apple's "clearly visible globe affordance" expectation
- Action: Pop up an options menu containing:
- Switch to next system keyboard → calls
advanceToNextInputMode()(one entry in the menu, not the only action) - Show keyboard picker →
handleInputModeList(from:with:) - Han conversion settings
- LIME preferences
- Other options
- Switch to next system keyboard → calls
This long-press-on-keyboard-key pattern (with globe icon as the long-press preview) is used by existing App Store–approved third-party keyboards. The globe icon preview is what makes it acceptable to Apple's review — the user sees a globe affordance when they long-press, even though the actual switch is one menu item among several.
When needsInputModeSwitchKey == false (modern iPhones), the OS-switch entries in the menu can either be hidden (system row handles it) or kept for parity with the Android behavior.
Independent of OS switching — LIME-internal IM switching:
- Space key — slide left/right: Switch to next/previous IM within LIME (Phonetic ↔ Cangjie ↔ Dayi ↔ ...)
- Space key — long press: Show LIME IM picker (list of activated IMs)
- These are completely invisible to iOS — no API calls, just internal state changes via
switchToNextActivatedIM()
Do NOT delete the advanceToNextInputMode() code path — it must exist conditionally inside the long-press menu for the cases where needsInputModeSwitchKey == true.
switchChiEng():
clearComposing(false)— clear current composingmKeyboardSwitcher.toggleChinese()— switch keyboard layoutmEnglishOnly = !mKeyboardSwitcher.isChinese()— update mode flagmLIMEPref.setLanguageMode(mEnglishOnly)— persist if persistent mode enabledclearSuggestions()— clear candidate list
Chinese/English mode switching intentionally does not show a toast.
switchToNextActivatedIM(boolean forward):
- Cycles through the activated IM list (from
keyboard_statepreference) - Updates
activeIM, callsSearchSrv.setTableName()to switch the query table - Updates keyboard layout via
mKeyboardSwitcher.setKeyboardMode() - Resets composing and candidates
- Triggered by: space key slide/long-press, NEXT_IM/PREV_IM key codes
switchKeyboard(keycode) handles:
KEYCODE_SWITCH_TO_SYMBOL_MODE: Toggle to symbol keyboardKEYCODE_SWITCH_SYMBOL_KEYBOARD: Cycle through symbol keyboards 1/2/3KEYCODE_SWITCH_TO_ENGLISH_MODE: Switch from Chinese IM to English keyboardKEYCODE_SWITCH_TO_IM_MODE: Switch from English back to Chinese IM
When autoChineseSymbol is enabled and clearSuggestions() is called in Chinese mode with candidates visible:
hasChineseSymbolCandidatesShown = true- Load Chinese symbol list via
ChineseSymbol.getChineseSymbolList() - Display with selkey "1234567890"
- User can pick a punctuation symbol like a regular candidate
The Chinese symbol list includes: 。,、;:?!「」『』【】〈〉《》()—…·&*#@ and more.
Android InputConnection |
iOS textDocumentProxy |
Context in IMService |
|---|---|---|
setComposingText(text, cursor) |
insertText() + manual tracking |
Composing display — iOS has no native composing region. Must insert text and track length to delete-and-replace on update. |
finishComposingText() |
No-op (text already inserted) | End composing — on iOS the text is already committed inline. |
commitText(text, cursor) |
insertText(text) |
Commit selected candidate |
deleteSurroundingText(before, after) |
deleteBackward() × count |
Delete characters (used sparingly) |
sendKeyEvent(KeyEvent) |
insertText() or deleteBackward() |
Arrow keys, DEL key — iOS textDocumentProxy has adjustTextPosition(byCharacterOffset:) for cursor movement |
getTextBeforeCursor(n, flags) |
documentContextBeforeInput |
Read preceding text for English prediction validation and related phrase context |
getTextAfterCursor(n, flags) |
documentContextAfterInput |
Read following text for English prediction validation |
commitCompletion(CompletionInfo) |
N/A | App-provided completions — not available on iOS keyboard extensions |
clearMetaKeyStates(states) |
N/A | Physical keyboard meta state — not applicable on iOS |
Since iOS textDocumentProxy has no setComposingText():
- Track composing length internally (e.g.,
composingLengthcounter) - On first composing character:
insertText(char); setcomposingLength = 1 - On subsequent characters:
insertText(char); incrementcomposingLength - On composing update (backspace within composing):
deleteBackward()×composingLength, theninsertText(newComposingText); updatecomposingLength - On commit: composing text is already in place — just reset
composingLength = 0 - On cancel:
deleteBackward()×composingLength; reset
All methods are called by IMService. Thread safety: query/learning methods are called from background threads; configuration methods are called from the main thread during IM switching.
getMappingByCode(code: String, isSoftKeyboard: Bool, getAllRecords: Bool) → [Mapping]
| Parameter | Type | Description |
|---|---|---|
code |
String | Current composing string |
isSoftKeyboard |
Bool | true for soft keyboard (affects candidate ordering) |
getAllRecords |
Bool | false = top results only; true = full result set ("show all") |
Returns: Mappings matching code from the active IM table, sorted by score.
Side effect: If smartChineseInput is enabled, calls makeRunTimeSuggestion() internally to build phrase candidates.
getRelatedByWord(word: String, getAllRecords: Bool) → [Mapping]
| Parameter | Type | Description |
|---|---|---|
word |
String | The word just committed |
getAllRecords |
Bool | false = top results; true = full set |
Returns: Words that commonly follow word, from the related table. Empty list if none found.
getEnglishSuggestions(word: String) → [Mapping]
| Parameter | Type | Description |
|---|---|---|
word |
String | Partial English word typed so far |
Returns: Words starting with word from the English dictionary, as RECORD_ENGLISH_SUGGESTION mappings.
iOS replacement: Drop this method. Use UITextChecker.completions(forPartialWordRange:in:language:) instead. See §7 for full English prediction porting notes.
getSelkey() → String
Returns: The selection key string for the current IM (e.g., "1234567890"). Used to build the candidate selection key row.
keyToKeyname(code: String) → String
| Parameter | Type | Description |
|---|---|---|
code |
String | Raw typed code sequence |
Returns: Display-friendly symbol names for the code (e.g., "bp" → "ㄅㄆ", "ab" → "日月"). Result is cached per code. If the returned value equals code, nothing extra is shown in the composing bar.
getRealCodeLength(selectedMapping: Mapping, currentCode: String) → Int
| Parameter | Type | Description |
|---|---|---|
selectedMapping |
Mapping | The candidate the user just selected |
currentCode |
String | Current full composing string |
Returns: Number of characters from the front of currentCode consumed by selectedMapping. Used to trim the composing buffer in continuous typing (LD) mode.
Behaviour details:
- Phonetic: strips tone symbols
[3467 ]from the matched code to find the boundary. - ETEN: applies
preProcessingRemappingCode()before boundary detection. - Dual-mapped code: returns full composing length (abandons LD).
Side effect: IfselectedMappingisRECORD_RUNTIME_BUILT_PHRASE, spawns a background thread to persist all sub-phrases viaaddOrUpdateMappingRecord().
learnRelatedPhraseAndUpdateScore(committedCandidate: Mapping)
| Parameter | Type | Description |
|---|---|---|
committedCandidate |
Mapping | The mapping just committed |
Called from: commitTyped() step 7 (main thread), after every successful commit.
Behaviour:
- Thread-safely appends a copy of
committedCandidatetoscorelist(consumed later bypostFinishInput()). - Spawns a background thread to call
updateScoreCache(mapping):dbadapter.addScore(mapping)— persists incremented score to DB.- Related phrase record → invalidates its code cache entry.
- Partial match record → invalidates the whole code cache entry.
- Exact match record → increments cached score; if
sortSuggestionsenabled, re-positions entry in cache (insertion-sort step); callsupdateSimilarCodeCache().
addLDPhrase(mapping: Mapping?, ending: Bool)
| Parameter | Type | Description |
|---|---|---|
mapping |
Mapping? | Mapping to buffer; nil signals an interrupted sequence |
ending |
Bool | true = this is the final mapping in the phrase |
Behaviour:
mapping != nil→ append to currentLDPhraseList.ending == trueANDLDPhraseList.count > 1→ save list toLDPhraseListArray, start new list.ending == trueANDLDPhraseList.count ≤ 1→ discard (single word is not a learnable phrase).mapping == nilANDending == true→ discard current list (LD sequence was interrupted).
Call sites: commitTyped() during continuous typing (one false per intermediate commit, one true at final); learnRelatedPhrase() when RP score exceeds 20.
setTableName(table: String, numberMapping: Bool, symbolMapping: Bool)
| Parameter | Type | Description |
|---|---|---|
table |
String | IM table name to switch to (e.g., "cj", "phonetic") |
numberMapping |
Bool | Whether the IM accepts digit keys as code input |
symbolMapping |
Bool | Whether the IM accepts symbol keys as code input |
Behaviour: Switches the active IM table. Updates hasNumberMapping and hasSymbolMapping flags used by handleCharacter(). Triggers cache prefetch for the new table.
getTablename() → String
Returns: The currently active IM table name.
resetCache()
Clears all in-memory candidate caches. Called when switching IMs or after user data is cleared.
hanConvert(input: String) → String
| Parameter | Type | Description |
|---|---|---|
input |
String | Word to convert |
Returns: Converted word per hanConvertOption (0=off, 1=T→S, 2=S→T).
iOS replacement: Drop LimeHanConverter and hanconvertv2.db. Replace with a single CFStringTransform() call:
- T→S:
CFStringTransform(str, nil, kCFStringTransformToSimplifiedChinese, false) - S→T:
CFStringTransform(str, nil, kCFStringTransformToTraditionalChinese, false)
emojiConvert(code: String, type: Int) → [Mapping]
| Parameter | Type | Description |
|---|---|---|
code |
String | Word to look up |
type |
Int | EMOJI_EN / EMOJI_TW / EMOJI_CN |
Returns: Emoji candidates matching the word, as RECORD_EMOJI_WORD mappings.
iOS implementation: No built-in word-to-emoji API on iOS. Convert emoji.db to a bundled read-only JSON dictionary. Wrap results as Mapping objects with RECORD_EMOJI_WORD type and feed into the standard candidate pipeline. Eliminates SQLite overhead for a small static dataset.
getCodeListStringFromWord(word: String) → String
| Parameter | Type | Description |
|---|---|---|
word |
String | Word to reverse-look up |
Returns: Comma-separated string of all codes that produce word. Used to build the reverse-lookup notification shown after commit when the active IM's reverse-lookup table is not none.
postFinishInput()
Called when the text field loses focus (input session ends). Flushes all pending learning in a single background thread:
- Snapshot
scorelist→scorelistSnapshot; clearscorelist. - Run
learnRelatedPhrase(scorelistSnapshot)— RP learning (see §9). - Snapshot
LDPhraseListArray→ local copy; clear global array. - Run
learnLDPhrase(localLDPhraseListArray)— LD phrase learning (see §9).
Ensures all queued learning is processed even if the user switches away mid-session.
getAllImKeyboardConfigList() → [ImConfig]
Returns: All IM-to-keyboard configuration mappings from the im table. Used during IM switching to determine which keyboard layout to load for each IM.
getKeyboardConfigList() → [Keyboard]
Returns: All keyboard layout configurations from the keyboard table. Used by LIMEKeyboardSwitcher to map IM codes to keyboard resource files.
| Field | Type | Purpose |
|---|---|---|
id |
String | Database record ID |
code |
String | Input code (lowercase) |
codeorig |
String | Original input code (preserves case) |
code3r |
String | Code without tone keys (phonetic only) |
word |
String | Output word / candidate text |
pword |
String | Parent word (for related phrases) |
related |
String | Related phrase info |
score |
int | User score (learned frequency) |
basescore |
int | Base score from preloaded data |
highLighted |
Boolean | From highlighted list vs exact match |
recordType |
int | Type classifier (see constants below) |
| Constant | Value | Meaning |
|---|---|---|
RECORD_COMPOSING_CODE |
1 | The input code itself (first item in list) |
RECORD_EXACT_MATCH_TO_CODE |
2 | Exact code match from database |
RECORD_PARTIAL_MATCH_TO_CODE |
3 | Prefix match |
RECORD_RELATED_PHRASE |
4 | Related phrase suggestion |
RECORD_ENGLISH_SUGGESTION |
5 | English word suggestion |
RECORD_RUNTIME_BUILT_PHRASE |
6 | Generated phrase from LD/QP learning |
RECORD_CHINESE_PUNCTUATION_SYMBOL |
7 | Chinese punctuation |
RECORD_HAS_MORE_RECORDS_MARK |
8 | "More..." placeholder |
RECORD_EXACT_MATCH_TO_WORD |
9 | Reverse lookup: word→code exact |
RECORD_PARTIAL_MATCH_TO_WORD |
10 | Reverse lookup: word→code partial |
RECORD_COMPLETION_SUGGESTION_WORD |
11 | Word completion suggestion |
RECORD_EMOJI_WORD |
12 | Emoji character |
Each record type has a corresponding is*Record() method (e.g., isExactMatchToCodeRecord(), isRelatedPhraseRecord(), isRuntimeBuiltPhraseRecord()). These are used throughout the commit and learning flows to determine handling logic.
All user-adjustable settings from LIMEPreferenceManager, grouped by category with the logic each setting affects.
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Active IM | keyboard_list |
String | "phonetic" | Which IM table SearchServer queries |
| IM activated state | keyboard_state |
String | all enabled | Which IMs appear in NEXT_IM/PREV_IM cycle |
| Phonetic keyboard type | phonetic_keyboard_type |
String | "phonetic" | Code remapping in preProcessingRemappingCode(): Standard / ETEN / ETEN26 / HSU |
| Auto-commit threshold | auto_commit |
int | 0 (off) | Composing auto-commit when length reaches threshold (phonetic only) |
| Han conversion | han_convert_option |
int | 0 (off) | 0=off, 1=T→S, 2=S→T. Applied in commitTyped() before commitText() |
| Selection key option | selkey_option |
int | 0 | Candidate selkey prepend: 0=none, 1=prepend mixedModeSelkey, 2=prepend with space |
| Smart Chinese input | smart_chinese_input |
boolean | false | Enables runtime phrase suggestion (makeRunTimeSuggestion) in SearchServer |
| Auto Chinese symbol | auto_chinese_symbol |
boolean | false | Shows Chinese punctuation candidates when candidate list cleared |
| Persistent language mode | persistent_language_mode |
boolean | false | Remembers Chinese/English mode across text fields |
| Language mode | language_mode |
String | "no" | Stored Chinese/English state ("yes"=English, "no"=Chinese) |
| Swipe candidate bar | candidate_switch |
boolean | true | CandidateView swipe gesture: true = dragging scrolls the candidate list; false = swipe left/right commits previous/next candidate. Global — applies to all IMs |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Learn related words | candidate_suggestion |
boolean | true | Enables RP learning in learnRelatedPhrase() |
| Learn phrase | learn_phrase |
boolean | true | Enables LD phrase learning when RP score > 20 |
| Sort suggestions | learning_switch |
boolean | true | Score-based candidate reordering in search results |
| Similar code enable | similiar_enable |
boolean | true | Show candidates with similar (prefix-matching) codes |
| Similar code limit | similiar_list |
int | 20 | Max number of similar-code candidates |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Emoji display position | enable_emoji_position |
int | 6 | 0 disables inline emoji candidates; 2–10 inserts after that candidate index |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| English prediction | english_dictionary_enable |
boolean | true | updateEnglishPrediction() activation in English mode |
| Auto-capitalization | auto_cap |
boolean | true | updateShiftKeyState() after key handling |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Font size | font_size |
float | 1.0 | Candidate text size multiplier |
| Keyboard size | keyboard_size |
float | 1.0 | Key size scale multiplier |
| Show arrow keys | show_arrow_key |
int | 0 | Arrow key row visibility mode |
| Split keyboard | split_keyboard_mode |
int | 0 | Split keyboard layout mode |
| Keyboard theme | keyboard_theme |
int | 6 | Theme index; 6 follows system light/dark and triggers input view recreation on change |
| Number row in English | number_row_in_english |
boolean | true | Number row in English keyboard |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Vibrate on keypress | vibrate_on_keypress |
boolean | true | Haptic feedback on key press. iOS: UIImpactFeedbackGenerator |
| Vibrate level | vibrate_level |
int | 40 | Vibration duration in ms. iOS: map to UIImpactFeedbackGenerator style — e.g., ≤20ms→.light, ≤50ms→.medium, >50ms→.heavy. Note: iOS does not allow custom duration; only style selection. |
| Sound on keypress | sound_on_keypress |
boolean | false | Audio feedback. iOS: use UIInputViewController.playInputClick() — the proper API for keyboard click sounds, respects system sound settings |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Reverse lookup table | {table}_im_reverselookup |
String | "none" | Which IM to use for reverse lookup display |
| Setting | Pref Key | Type | Default | Affects |
|---|---|---|---|---|
| Allow digit keys as IM codes | accept_number_index |
boolean | false | hasNumberMapping flag in initializeIMKeyboard() — custom IM only; all built-in IMs hardcode their own value regardless of this pref |
| Allow symbol keys as IM codes | accept_symbol_index |
boolean | false | hasSymbolMapping flag in initializeIMKeyboard() — custom IM only; all built-in IMs hardcode their own value regardless of this pref |
These settings are not applicable to iOS keyboard extensions:
physical_keyboard_enable, disable_physical_selkey, disable_physical_selkey_option, english_dictionary_physical_keyboard, physical_keyboard_sort, switch_english_mode, switch_english_mode_shift, hide_software_keyboard_typing_with_physical, three_rows_remapping
physical_keyboard_type is a legacy Android phone hardware-keyboard value retained only as a runtime default/backward-compatibility key, not a visible preference.
{table}total_record, {table}mapping_version, {table}mapping_file, {table}mapping_file_temp, total_userdict_record, searchsrv_reset_cache