Usage
Some example use cases.
Obtain Morpheus Analytic blocks for a Greek word
# convert Unicode Greek to Beta Code
import beta_code
import morphkit

bc_word = beta_code.greek_to_beta_code(u'του')
# host:port of a running Morpheus API server (adjust to your setup)
api_endpoint = "10.0.1.156:1315"
print(morphkit.get_word_blocks(bc_word, api_endpoint))
This will output three Morpheus Analytic blocks (one shown):
:raw tou
:workw tou=
:lem o(
:prvb
:aug1
:stem tou= indeclform
:suff
:end masc/neut gen sg indeclform article
Get the compact analysis results
print(morphkit.get_word_blocks(bc_word,api_endpoint,output="compact"))
This prints (for the same word as the previous example) the compact notation:
tou
<NL>N tou=,o( masc/neut gen sg indeclform article</NL><NL>N tou=,ti/s gen sg attic indeclform indecl</NL><NL>N tis gen sg attic enclitic indeclform indef</NL>
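If you want to handle the compact analyses one at a time, the line can be split on its `<NL>...</NL>` markers. The following is a minimal sketch using only the standard library; `split_compact_analyses` is a hypothetical helper, not part of morphkit, and it assumes that each analysis is wrapped in exactly one `<NL>...</NL>` pair as in the output above.

```python
import re

def split_compact_analyses(compact_line):
    """Return the text of each <NL>...</NL> entry (hypothetical helper, not a morphkit function)."""
    return [entry.strip() for entry in re.findall(r"<NL>(.*?)</NL>", compact_line)]

# The compact line copied from the example output above.
compact_line = (
    "<NL>N tou=,o( masc/neut gen sg indeclform article</NL>"
    "<NL>N tou=,ti/s gen sg attic indeclform indecl</NL>"
    "<NL>N tis gen sg attic enclitic indeclform indef</NL>"
)

entries = split_compact_analyses(compact_line)
for entry in entries:
    print(entry)
```

The helper only separates the entries; it does not attempt to interpret the fields inside each one.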
Perform a full analysis
The following will analyse a word and produce a full dictionary of Morpheus data augmented with a Part of Speech and a morphological tag.
result=morphkit.analyse_word_with_morpheus("mo/non",api_endpoint)
This returns a Python dictionary like the one below (truncated):
{'raw_bc': 'mo/non',
'raw_uc': 'μόνον',
'blocks': 2,
'analyses': [{'raw_bc': 'mo/non',
'raw_uc': 'μόνον',
'workw_bc': 'mo/non',
'workw_uc': 'μόνον',
'lem_full_bc': 'mo/nos',
'lem_full_uc': 'μόνος',
'lem_base_bc': 'mo/nos',
'lem_base_uc': 'μόνος',
'stem_bc': 'mon',
'stem_uc': 'μον',
'stem_codes': ['os_h_on'],
'end_bc': 'on',
'end_uc': 'ον',
'gender': 'masc',
'case': 'acc',
'number': 'sg',
'end_codes': ['os_h_on'],
'pos': 'noun',
'morph': 'N-ASM'},
{'raw_bc': 'mo/non',
'raw_uc': 'μόνον',
'workw_bc': 'mo/non',
...
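Once you have such a dictionary, the individual analyses can be reduced to the fields you need. The sketch below is illustrative only: `sample_result` is a hand-made stand-in containing a few of the keys shown above (first analysis only), and `summarise` is a hypothetical helper, not a morphkit function.

```python
# Stand-in for the dictionary returned by analyse_word_with_morpheus,
# reduced to a few of the fields shown above (first analysis only).
sample_result = {
    "raw_bc": "mo/non",
    "raw_uc": "μόνον",
    "blocks": 2,
    "analyses": [
        {"lem_base_uc": "μόνος", "pos": "noun", "morph": "N-ASM"},
    ],
}

def summarise(result):
    """Return one (lemma, pos, morph) tuple per analysis (hypothetical helper)."""
    return [
        (a.get("lem_base_uc"), a.get("pos"), a.get("morph"))
        for a in result.get("analyses", [])
    ]

summary = summarise(sample_result)
for lemma, pos, morph in summary:
    print(f"{sample_result['raw_uc']}: {lemma} {pos} {morph}")
```

Using `.get()` keeps the helper tolerant of analyses that lack one of these keys, which is an assumption about how the real output may vary.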
Limited Latin support
Latin words can be analysed and the results stored in a Python dictionary:
import pprint as pp
raw_text=morphkit.get_word_blocks("dico",api_endpoint,language="latin")
blocks=morphkit.split_into_raw_blocks(raw_text)
all_parses = []
for block in blocks:
    raw_beta, parses = morphkit.parse_word_block(block,"latin")
    all_parses.append(parses)
    pp.pprint(parses)
This produces a list of dictionaries for each block, like:
[{'end': 'o_',
'end_codes': ['conj1'],
'lem_base': 'dico#',
'lem_full': 'dico#1',
'lem_homonym': 1,
'mood': 'indicative',
'number': 'sg',
'person': '1',
'raw': 'dico',
'stem': 'dic',
'stem_codes': ['conj1', 'are_vb'],
'tense': 'present',
'voice': 'active',
'workw': 'dico_'}]
[{'end': 'o_',
'end_codes': ['conj3'],
'lem_base': 'dico#',
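Each Latin parse can be turned into a one-line description by joining its feature values. This is a minimal sketch: `first_block_parses` is copied from the first list in the output above, and `describe` is a hypothetical helper, not part of morphkit. It assumes the keys `raw` and `lem_full` are always present and skips any grammatical feature that is missing.

```python
# First list from the example output above, copied by hand.
first_block_parses = [{
    'end': 'o_',
    'end_codes': ['conj1'],
    'lem_base': 'dico#',
    'lem_full': 'dico#1',
    'lem_homonym': 1,
    'mood': 'indicative',
    'number': 'sg',
    'person': '1',
    'raw': 'dico',
    'stem': 'dic',
    'stem_codes': ['conj1', 'are_vb'],
    'tense': 'present',
    'voice': 'active',
    'workw': 'dico_',
}]

def describe(parse):
    """Format one parse as 'raw (lemma): features' (hypothetical helper)."""
    keys = ("person", "number", "tense", "mood", "voice")
    features = " ".join(parse[k] for k in keys if k in parse)
    return f"{parse['raw']} ({parse['lem_full']}): {features}"

descriptions = [describe(p) for p in first_block_parses]
for line in descriptions:
    print(line)
```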
To see these examples in action, you can download this Jupyter Notebook.