Features for Nestle1904GBI Text-Fabric Corpus (by feature type)
In Text-Fabric, a “feature” is an attribute associated with nodes, which represent linguistic elements of the text such as words, word groups, sentences, and verses. Features hold additional information about these nodes, which supports linguistic analysis and data extraction.
This page is a key to the meaning of the features in the Nestle 1904 GBI dataset. The available features are grouped by type as follows:
Grid features
| Name | Description | Examples |
|---|---|---|
| oslots | Slot containment | 1 1-11 2010-2015,2020-2030 |
| otext | Configuration for sections, structure, and text formats (textapi) | no data, only specifications |
| otype | Object definitions of nodes | book verse clause phrase word |
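The `oslots` examples above use a compact notation of single slots, ranges, and comma-separated combinations. The following is a minimal sketch of how such a value could be expanded into individual slot numbers. The function name `parse_slot_spec` is hypothetical and not part of Text-Fabric, and the sketch covers only the range notation shown in the examples, not the full `.tf` file format.

```python
def parse_slot_spec(spec: str) -> list[int]:
    """Expand a compact slot specification such as '2010-2015,2020-2030'."""
    slots: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # A range 'a-b' covers both end points
            start, end = part.split("-", 1)
            slots.extend(range(int(start), int(end) + 1))
        else:
            # A single slot number
            slots.append(int(part))
    return slots


print(parse_slot_spec("1"))
print(parse_slot_spec("1-11"))
print(parse_slot_spec("2010-2015,2020-2030"))
```

The third example shows why the notation matters: a node may occupy slots that are not contiguous, so a single start and end point would not be enough to describe it.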
Sectional features
| Name | Description | Examples |
|---|---|---|
| book | Full book name | Matthew Mark … Revelation |
| booknum | Book number | 1 2 … 27 |
| bookshort | Short book name | Matt Mark … Rev |
| chapter | Chapter number inside book | 1 2 … |
| clause | Clause number (counted per chapter) | 1 2 … |
| monad | Monad | |
| nodeID | Node ID (as in the XML source data) | |
| phrase | Phrase number (counted per chapter) | |
| sentence | Sentence number (counted per chapter) | |
| verse | Verse number inside chapter | 1 2 |
Lexical features
| Name | Description | Examples |
|---|---|---|
| gloss | English gloss | |
| lemma | Lexical lemma (cf. BDAG) | |
| lex_dom | Lexical domain according to SDBG | |
| ln | Louw-Nida lexical classification | |
| strongs | Strong's number | |
Orthographic features
| Name | Description | Examples |
|---|---|---|
| after | Space or punctuation after word | ` ` . ; - |
| normalized | Surface word stripped of punctuation | |
| word | Word as it appears in the text | |
Morphological features
| Name | Description | Examples |
|---|---|---|
| case | Grammatical case | Nominative Genitive Dative |
| degree | Degree of a comparative or superlative adjective | Comparative Superlative |
| formaltag | Formal tag (Sandborg-Petersen morphology) | N-GSM CONJ |
| functionaltag | Functional tag (Sandborg-Petersen morphology) | V-AAI-3S |
| gn | Grammatical gender | Masculine Feminine Neuter |
| mood | Grammatical mood of a verb | Indicative Optative |
| nu | Grammatical number of a noun | Singular Plural |
| number | Grammatical number of a verb | Singular Plural |
| person | Grammatical person of the verb | first second third |
| tense | Grammatical tense of the verb | Present Aorist |
| type | Grammatical type of noun or pronoun | Common Personal |
| voice | Grammatical voice of the verb | Active Middle |
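The tag examples above (`N-GSM`, `CONJ`, `V-AAI-3S`) pack several morphological values into one string. The sketch below shows how such a tag might be unpacked, assuming the tags follow the Robinson-style layout that these three examples suggest. The function `decode_tag` and the code tables are hypothetical and deliberately incomplete; they are not part of the dataset or of Text-Fabric, and unknown codes are passed through unchanged.

```python
# Illustrative code tables (assumption: Robinson-style abbreviations)
PART_OF_SPEECH = {"N": "Noun", "V": "Verb", "CONJ": "Conjunction"}
CASE = {"N": "Nominative", "G": "Genitive", "D": "Dative",
        "A": "Accusative", "V": "Vocative"}
NUMBER = {"S": "Singular", "P": "Plural"}
GENDER = {"M": "Masculine", "F": "Feminine", "N": "Neuter"}
TENSE = {"P": "Present", "I": "Imperfect", "F": "Future", "A": "Aorist"}
VOICE = {"A": "Active", "M": "Middle", "P": "Passive"}
MOOD = {"I": "Indicative", "S": "Subjunctive", "O": "Optative"}
PERSON = {"1": "first", "2": "second", "3": "third"}


def decode_tag(tag: str) -> dict[str, str]:
    """Split a tag such as 'V-AAI-3S' into named morphological values."""
    parts = tag.split("-")
    pos = parts[0]
    result = {"pos": PART_OF_SPEECH.get(pos, pos)}
    if pos == "N" and len(parts) == 2 and len(parts[1]) == 3:
        # Noun: case, number, gender
        case, number, gender = parts[1]
        result["case"] = CASE.get(case, case)
        result["number"] = NUMBER.get(number, number)
        result["gender"] = GENDER.get(gender, gender)
    elif (pos == "V" and len(parts) == 3
          and len(parts[1]) == 3 and len(parts[2]) == 2):
        # Finite verb: tense, voice, mood, then person and number
        tense, voice, mood = parts[1]
        person, number = parts[2]
        result["tense"] = TENSE.get(tense, tense)
        result["voice"] = VOICE.get(voice, voice)
        result["mood"] = MOOD.get(mood, mood)
        result["person"] = PERSON.get(person, person)
        result["number"] = NUMBER.get(number, number)
    return result


print(decode_tag("N-GSM"))
print(decode_tag("V-AAI-3S"))
print(decode_tag("CONJ"))
```

In practice the separate features (`case`, `gn`, `tense`, and so on) already provide these values per word, so decoding a tag is mainly useful for cross-checking or for working with the tags outside Text-Fabric.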
Syntactic features
Relational features
| Name | Description |
|---|---|
| subj_ref | Subject reference (to nodeID in XML source data, not yet post-processed) |
The concept ‘features’ in Text-Fabric
Text-Fabric, true to its name, borrows the concepts of ‘warp’ and ‘weft’ from textile weaving to represent its data. The ‘warp’ denotes the foundational structured data, encompassing linguistic units such as words and phrases, while the ‘weft’ refers to the additional layers of information, known as features. These features encompass linguistic data, annotations, and metadata woven into the ‘warp’ data, resulting in a clear separation between structure and content. This separation lets Text-Fabric handle complex linguistic datasets efficiently and flexibly.
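One way to picture this separation is as a set of plain mappings: two that define the structure (node types and slot containment, as in the grid features `otype` and `oslots`) and any number of others that attach values to nodes. The sketch below is a toy model under that assumption, not the Text-Fabric implementation or its API. The node numbers, the tiny three-word corpus, and the helper names `feature_value` and `words_in` are made up for illustration.

```python
# Warp: the structure of the corpus
otype = {1: "word", 2: "word", 3: "word", 4: "phrase"}  # node -> node type
oslots = {4: [1, 2, 3]}  # non-slot node -> the word slots it contains

# Weft: each feature is an independent node -> value mapping
features = {
    "word": {1: "ἐν", 2: "ἀρχῇ", 3: "ἦν"},
    "case": {2: "Dative"},  # only nodes for which the feature applies have a value
}


def feature_value(name, node):
    """Return the value of a feature for a node, or None if it has none."""
    return features.get(name, {}).get(node)


def words_in(node):
    """Return the surface words covered by a node (a slot covers itself)."""
    slots = oslots.get(node, [node])
    return [features["word"][slot] for slot in slots]


print(otype[4], words_in(4))
print(feature_value("case", 2))
print(feature_value("case", 1))
```

Because every feature lives in its own mapping, a new annotation layer can be added without touching the structure or the existing features, which is the practical benefit of keeping warp and weft apart.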