Home

Welcome to the documentation for Morphkit, a Python research tool for processing the output of the Morpheus morphological analyser.

You are currently reading the |docs_label| documentation for Morphkit |release_version|. Use the version selector in the sidebar to switch between the stable release, development docs, and older tagged versions.

This package was developed as part of a research project to build a Text-Fabric dataset containing the Morpheus analytical data for each word of the Nestle1904 Greek New Testament. A number of its functions are specific to this use case.

In 1.0.0, Morphkit is best understood as a semantic translation layer between two incompatible morphological systems: the raw Morpheus analyses and the Sandborg–Petersen (SP) / N1904-TF tagging conventions used in that project. The initial release is therefore tightly bound to the N1904-TF environment. It is packaged and documented so that the research workflow can be reproduced, not because it has already become a fully general standalone package.

Features

Research-oriented middleware around Morpheus output.
Translation of Morpheus analyses into SP / N1904-TF-style tags.
Intended primarily for Nestle1904 Text-Fabric scripts, notebooks, and exports.
Basic support for Latin within the same architecture.

Using this package

Installation: How to install the reproducible research snapshot
Usage: How to use this tool in its intended research setting
Architecture: How the 1.0.0 Morphkit translation layer is structured internally
License: How code and non-code materials in this repository are licensed

GitHub

You can find the project’s source code on GitHub and report issues or suggestions at the issue tracker.

Summary of functions

`morphkit.analyse_pos`	Analyse a single Morpheus parse record and determine its part of speech.
`morphkit.analyse_morph_tag`	Compute the Sandborg–Petersen morphological tag for a single Morpheus analyses block.
`morphkit.analyse_word_with_morpheus`	Query the Morpheus morphological analyser for a Greek word in Betacode and parse its analyses.
`morphkit.annotate_and_sort_analyses`	Annotate and sort analyses in a morphkit-compatible structure, grouping by base lemma and appending homonym suffixes extracted from `lem_full_bc` minus `lem_base_bc`.
`morphkit.compare_tags`	Compare two morphological parsing tags by decoding them into features and computing a weighted similarity score.
`morphkit.decode_tag`	Decode a morphological tag into a set of human-readable features.
`morphkit.get_word_blocks`	Retrieve the raw word-block data for a given Betacode word from a Morpheus endpoint.
`morphkit.init_compare_tags`	Factory that initializes and returns a fully-configured `compare_tags()` function.
`morphkit.parse_word_block`	Parse a single Morpheus output block of Betacode lines into structured morphological data.
`morphkit.split_into_raw_blocks`	Split the input text into blocks at each ':raw' header using a multiline regex.
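
To make the last entry concrete, the general technique of splitting text at ':raw' headers with a multiline regex can be sketched as follows. This is a minimal illustration only, not Morphkit's own implementation: the helper name `split_at_raw_headers`, the sample text, and the ':lem' lines in it are assumptions made for the example, and the real function's signature and return type may differ.

```python
import re

# Hypothetical helper for illustration; NOT the Morphkit implementation.
# With re.MULTILINE, '^' matches at the start of every line, so this
# pattern finds each line that begins with the ':raw' header.
RAW_HEADER = re.compile(r"^:raw\b", flags=re.MULTILINE)


def split_at_raw_headers(text: str) -> list[str]:
    """Return one stripped block per ':raw' header found in the text."""
    starts = [match.start() for match in RAW_HEADER.finditer(text)]
    # Each block runs from one header to the next (or to the end of the text).
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


# Illustrative sample only; real Morpheus output contains more fields.
sample = (
    ":raw lo/gos\n"
    ":lem lo/gos\n"
    ":raw kai/\n"
    ":lem kai/\n"
)

blocks = split_at_raw_headers(sample)
for block in blocks:
    # Print the header line of each block.
    print(block.splitlines()[0])
```

Text that appears before the first ':raw' header is discarded in this sketch, and an input without any header yields an empty list.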