Create_penntree_feature_for_TF

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

License: CC BY 4.0

Create penntree feature for Text-Fabric

This repository contains the Jupyter Notebook used to create three new Text-Fabric features.

The final feature files will be added to the package available at the tonyjurg/N1904addons repository.

Example

The following example shows a Penn-style tree generated by passing the feature data to NLTK:

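from nltk import Tree

# Assumes the dataset is already loaded in Text-Fabric (so `F` is available)
# and that `sentenceNode` holds the node number of a sentence.
ptbString = F.penntree.v(sentenceNode)
tree = Tree.fromstring(ptbString)
tree.pretty_print()  # console-style print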
                                               S                                 
                                               |                                  
                                               WG                                
            ___________________________________|_________________                 
           CL                                                    |               
    _______|________________                                     |                
   |                      NP-SBJ                                 |               
   |        ________________|_____________                       |                
   |       |       |                      CL                     |               
   |       |       |         _____________|____                  |                
  VP-V     |       |       VP-V              PP-ADV              CL              
   |       |       |        |         _________|______       ____|____________    
  VERB    NOUN   PUNCT     VERB     PREP      NOUN  PUNCT  NOUN PRON  NOUN  PUNCT
   |       |       |        |        |         |      |     |    |     |      |   
ἐγένετο ἄνθρωπος   ,   ἀπεσταλμένος παρὰ      θεοῦ    ,   ὄνομα αὐτῷ ἰωάνης   ·  
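
The bracketed string behind such a tree can also be inspected without NLTK. The following is a minimal sketch in plain Python, assuming the feature value is a standard Penn-style bracketed string. The `SAMPLE` string is hand-transcribed from the tree printed above and may differ from the actual feature value; `parse_penn` and `tagged_leaves` are hypothetical helper names, not part of Text-Fabric or NLTK.

```python
import re

# Hand-transcribed from the tree shown above (an assumption, not dataset output).
SAMPLE = (
    "(S (WG (CL (VP-V (VERB ἐγένετο)) (NP-SBJ (NOUN ἄνθρωπος) (PUNCT ,) "
    "(CL (VP-V (VERB ἀπεσταλμένος)) (PP-ADV (PREP παρὰ) (NOUN θεοῦ) (PUNCT ,))))) "
    "(CL (NOUN ὄνομα) (PRON αὐτῷ) (NOUN ἰωάνης) (PUNCT ·))))"
)


def parse_penn(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    # Tokens are "(", ")", or any run of characters without spaces or brackets.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume ")"
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the root node")
    return tree


def tagged_leaves(tree):
    """Return (word, tag) pairs in sentence order."""
    label, children = tree
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((child, label))
        else:
            pairs.extend(tagged_leaves(child))
    return pairs


for word, tag in tagged_leaves(parse_penn(SAMPLE)):
    print(f"{tag:6} {word}")
```

With NLTK installed, `tree.pos()` on the `Tree` object from the example above returns the same kind of (word, tag) pairs.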

Production notebook

You can view the production notebook on nbviewer.org.

Alternatively, you can download it from the GitHub repository.

Attribution and footnotes

The following resources were consulted for creating the notebook:

The Greek base text is from the Nestle 1904 Greek New Testament, edited by Eberhard Nestle and published in 1904 by the British and Foreign Bible Society:

Nestle, Eberhard. Η Καινή Διαθήκη Novum Testamentum Graece (New York: Fleming H. Revell Company, 1904). The 1913 reprint, which was transcribed by Diego Santos, is available here. All of this material is in the public domain.

The N1904-TF dataset is available under the MIT license. Formal reference:

Tony Jurg, Saulo de Oliveira Cantanhêde, & Oliver Glanz. (2024). CenterBLC/N1904: Nestle 1904 Text-Fabric data. Zenodo. DOI: 10.5281/zenodo.13117911.

License

This notebook is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Citation

If you use this repository in your academic work, please cite it.