lingpy.data package

Subpackages

lingpy.data.ipa package

Submodules

lingpy.data.derive module

Module for the derivation of sound class models.

The module provides functions for the customized compilation of sound-class models. All models are defined in simple text files. In order to guarantee their quick access when loading the library, the models are compiled and stored in binary files.

lingpy.data.derive.compile_dvt(path='')

Function compiles diacritics, vowels, and tones.

Notes

A model is defined by a folder placed in the data/models directory of the LingPy package. The name of the folder reflects the name of the model. It contains three files: the file converter, the file INFO, and the optional file scorer. The format requirements for these files are as follows:

INFO

The INFO file serves as a reference for a given sound-class model. It can contain arbitrary information, and it may also be empty. To record specific characteristics such as the source, the compiler, the date, or a description of the model, use a key-value structure: the key is preceded by an @ and followed by a colon, and the value follows the key on the same line, e.g.:

@source: Dolgopolsky (1986)

This information is then read from the INFO file and rendered when the model is printed to the screen with the print() function.
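The key-value convention described above can be illustrated with a minimal sketch. The function name `parse_info` and the sample text are hypothetical and written only for this illustration; this is not LingPy's own reader.

```python
def parse_info(text):
    """Collect '@key: value' pairs from INFO-style text; other lines are ignored."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Free-form text is allowed in an INFO file, so skip anything
        # that does not follow the '@key: value' pattern.
        if not line.startswith("@") or ":" not in line:
            continue
        key, value = line[1:].split(":", 1)
        info[key.strip()] = value.strip()
    return info


# Hypothetical sample; only the @source line is taken from the text above.
sample = """Free-form notes may appear anywhere.
@source: Dolgopolsky (1986)
@description: An example description
"""
info = parse_info(sample)
print(info["source"])
```

Splitting on the first colon only keeps values that themselves contain colons intact.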

converter

The converter file contains all sound classes, each matched with its respective sound values. Each line is reserved for one class and is preceded by the key (preferably an ASCII letter) representing the class:

B : ɸ, β, f, p͡f, p͜f, ƀ
E : ɛ, æ, ɜ, ɐ, ʌ, e, ᴇ, ə, ɘ, ɤ, è, é, ē, ě, ê, ɚ
D : θ, ð, ŧ, þ, đ
G : x, ɣ, χ
...
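As a rough sketch of how such lines could be turned into a lookup table from sounds to classes, assuming the `KEY : sound, sound, ...` layout shown above: the name `parse_converter` is hypothetical and this is not the code LingPy uses to compile its models.

```python
def parse_converter(text):
    """Map every sound listed in converter-style text to its class key."""
    sound_to_class = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and the '...' placeholder
        key, sounds = line.split(":", 1)
        for sound in sounds.split(","):
            sound = sound.strip()
            if sound:
                sound_to_class[sound] = key.strip()
    return sound_to_class


# Three of the example lines from the text above.
sample = """B : ɸ, β, f, p͡f, p͜f, ƀ
D : θ, ð, ŧ, þ, đ
G : x, ɣ, χ
"""
converter = parse_converter(sample)

# Convert an already tokenized sequence into a sound-class string.
tokens = ["θ", "x", "f"]
classes = "".join(converter[t] for t in tokens)
print(classes)  # DGB
```

The lookup direction (sound to class) matches the conversion dictionary described in this section: IPA tokens are the keys and sound-class characters are the values.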

matrix

A scoring matrix indicating the alignment scores of all sound-class characters defined by the model. The matrix is stored as a simple tab-delimited text file. The first cell of each line contains the character name, and the following cells contain the scores in redundant form (with both triangles being filled):

B  10.0 -10.0   5.0 ...
E -10.0   5.0 -10.0 ...
F   5.0 -10.0  10.0 ...
...
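A minimal sketch of reading such a table into a dictionary keyed by pairs of characters is shown below. It assumes that the columns follow the same order as the rows, which the format description implies but does not state outright, and it treats the three-line excerpt as if it were a complete matrix. The name `parse_matrix` is hypothetical.

```python
def parse_matrix(text):
    """Turn a square score table into a {(row, column): score} dictionary."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    labels = [row[0] for row in rows]  # assumption: column order == row order
    scores = {}
    for row in rows:
        if len(row) != len(labels) + 1:
            raise ValueError("row %r does not match the number of classes" % row[0])
        for label, cell in zip(labels, row[1:]):
            scores[row[0], label] = float(cell)
    return scores


# The excerpt from the text above, written with explicit tab delimiters.
sample = "B\t10.0\t-10.0\t5.0\nE\t-10.0\t5.0\t-10.0\nF\t5.0\t-10.0\t10.0\n"
scores = parse_matrix(sample)
print(scores["B", "F"], scores["F", "B"])
```

Because both triangles are filled, a lookup works in either direction without sorting the pair first.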

scorer

The scorer file (which is optional) contains the graph of class transitions that is used to calculate the scoring dictionary. Each class is listed on a separate line, followed by one of the symbols v, c, or t (indicating whether the class represents vowels, consonants, or tones) and by the classes it is directly connected to. The strength of each connection is indicated by a digit (the smaller the value, the shorter the path between the classes):

A : v, E:1, O:1
C : c, S:2
B : c, W:2
E : v, A:1, I:1
D : c, S:2
...

The information in such a file is automatically converted into a scoring dictionary (see List2012b for details).
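To make the line format concrete, the sketch below reads a few of the example lines into a class-type table and a weighted adjacency table. It only parses the graph; how LingPy derives scores from it is described in the cited reference and is not reproduced here. The name `parse_scorer_graph` is hypothetical.

```python
def parse_scorer_graph(text):
    """Read 'X : type, Y:weight, ...' lines into two dictionaries."""
    kinds = {}  # class -> 'v', 'c', or 't'
    edges = {}  # class -> {connected class: weight}
    for line in text.splitlines():
        if " : " not in line:
            continue  # skip blank lines and the '...' placeholder
        name, rest = line.split(" : ", 1)
        name = name.strip()
        fields = [field.strip() for field in rest.split(",")]
        kinds[name] = fields[0]
        edges[name] = {}
        for field in fields[1:]:
            target, weight = field.split(":")
            edges[name][target.strip()] = int(weight)
    return kinds, edges


# Three of the example lines from the text above.
sample = """A : v, E:1, O:1
C : c, S:2
E : v, A:1, I:1
"""
kinds, edges = parse_scorer_graph(sample)
print(kinds["A"], edges["A"])
```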

Based on the information provided by the files, a dictionary for the conversion of IPA-characters to sound classes and a scoring dictionary are created and stored as a binary. The model can be loaded with help of the Model class and used in the various classes and functions provided by the library.

lingpy.data.model module

Module for handling sequence models.

class lingpy.data.model.Model(model, path=None)

Bases: object

Class for the handling of sound-class models.

Parameters

model : { ‘sca’, ‘dolgo’, ‘asjp’, ‘art’, ‘_color’ }

A string indicating the name of the model which shall be loaded. Select between:

‘sca’ - the SCA sound-class model (see List2012a),

‘dolgo’ - the DOLGO sound-class model (see Dolgopolsky1986),

‘asjp’ - the ASJP sound-class model (see Brown2008 and Brown2011),

‘art’ - the sound-class model which is used for the calculation of sonority profiles and prosodic strings (see List2012), and

‘_color’ - the sound-class model which is used for the coloring of sound-tokens when creating html-output.

Module contents

LingPy comes along with many different kinds of predefined data. When loading the library, the following dictionary is automatically loaded and employed by all LingPy modules:

rcParams : dict
As an alternative to all global variables, this dictionary contains all these variables, and additional ones. This dictionary is used for internal coding purposes and stores parameters that are globally set (if not defined otherwise by the user), such as

specific debugging messages (warnings, messages, errors)

default values, such as “gop” (gap opening penalty), “scale” (scaling factor by which extended gaps are penalized), or “figsize” (the default size of figures if data is plotted using matplotlib).

These default values can be changed with the rc function, which takes any keyword and any value as input and adds or modifies the corresponding key of the rcParams dictionary. It also provides more complex operations that change whole sets of variables, such as the following statement:
>>> rc(schema="asjp")
which switches the variables “asjp”, “dolgo”, etc. to the ASCII-based transcription system of the ASJP project.

If you want to change the content of rcParams directly, you need to import the dictionary explicitly:
>>> from lingpy.settings import rcParams
However, changing the values in the dictionary arbitrarily can produce unexpected behavior, and we recommend using the regular rc function for this purpose.
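The pattern described here, a single function that guards a shared settings dictionary and can switch whole groups of values at once, can be sketched as follows. Everything in this block is a hypothetical stand-in written for illustration: `params`, `SCHEMAS`, and `set_params` are not LingPy names, the numeric values are placeholders rather than LingPy defaults, and only the 'asjp' and 'fuzzy' orthography labels are taken from the examples on this page.

```python
# Hypothetical stand-in for a global settings dictionary (placeholder values).
params = {"gop": -1, "scale": 0.5, "basic_orthography": "fuzzy"}

# Hypothetical bundles: one keyword that rewrites several entries at once.
SCHEMAS = {
    "ipa": {"basic_orthography": "fuzzy"},
    "asjp": {"basic_orthography": "asjp"},
}


def set_params(rval=None, **keywords):
    """Return params[rval] if rval is given; otherwise update params from keywords."""
    if rval is not None:
        return params[rval]
    for key, value in keywords.items():
        if key == "schema":
            params.update(SCHEMAS[value])  # switch a whole set of values
        else:
            params[key] = value  # add or modify a single key


set_params(schema="asjp")
set_params(gop=-3)
print(set_params("basic_orthography"), set_params("gop"))
```

Routing changes through one function keeps related settings consistent, which is the reason given above for preferring rc over editing the dictionary by hand.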

lingpy.settings.rc(rval=None, rcParams_=None, **keywords)

Function changes parameters globally set for LingPy sessions.

Parameters

rval : string (default=None)

Use this keyword to specify a return-value for the rc-function.

schema : {“ipa”, “asjp”}

Change the basic schema for sequence comparison. When switching to “asjp”, this means that sequences will be treated as sequences in ASJP code, otherwise, they will be treated as sequences written in basic IPA.

rcParams_ : Allow passing in a plain dict for testing.

Notes

This function is the standard way to communicate with the rcParams dictionary, which is not imported by default. If you want to see which parameters there are, you can load the rcParams dictionary directly:

>>> from lingpy.settings import rcParams

However, be careful when changing the values. They might produce some unexpected behavior.

Examples

Import LingPy:

>>> from lingpy import *

Switch from IPA transcriptions to ASJP transcriptions:

>>> rc(schema="asjp")

You can check which “basic orthography” is currently loaded by passing the key as a string:

>>> rc('basic_orthography')
'asjp'
>>> rc(schema='ipa')
>>> rc('basic_orthography')
'fuzzy'

Attributes of lingpy.data.model.Model:

converter	dict	A dictionary with IPA tokens as keys and sound-class characters as values.
scorer	dict	A scoring dictionary with tuples of sound-class characters as keys and similarity scores as values.
info	dict	A dictionary storing the key-value pairs defined in the INFO file.
name	str	The name of the model, which is identical to the name of the folder from which the model is loaded.