PROJECT EUREKA GENELEX

 

Report on the

SEMANTIC LAYER

GENELEX Consortium

 

Version 2.1

September 1994

 

 

TABLE OF CONTENTS

A - GENERAL POINTS: situating the problem

1. Genericity and flexibility

2. Ensuring the suitability of the model

3. Description axes

3.1. Componential or analytic axis

3.2. Relational or differential axis

B - OVERALL ARCHITECTURE OF THE LEXICON: THE THREE LAYERS

1. Semantic unit - Semantic unit of an affix

2. Articulation with the other layers

3. Lexical unit

C - MODEL

1. Overview of the model of the semantic layer

2. Semantic unit

3. Valued semantic features

3.1. General features (property)

3.2. Semantic class features

3.3. Distinctive features

3.4. Domain features

3.5. Language level features

3.6. Connotation and evaluation features

3.7. Pragmatic features

3.8. Miscellaneous features

4. Semantic relations between Semantic Units

4.1. Paradigmatic relations between substitutable units

4.2. Semantic relations of derivation type

4.3. Semantic relations of collocation type

5. Predicate

5.1. Highlighting the notion of predicate

5.2. Description

5.3. Consequences on the previously described points of the model

5.3.1. At the componential level

5.3.2. At the level of semantic relations

5.3.3. At the coding level and definition of consistency criteria

6. Concept

7. The "conceptual" level of description

7.1. Predicate

7.2. Relations at the conceptual level

7.2.1. Relation between concepts

7.2.2. Relation between predicates

7.2.3. Relation between predicates and concepts

7.3. Contribution of this level of representation

7.4. Development of this level

D - CORRESPONDENCE BETWEEN SYNTAX AND SEMANTICS

1. The notion of predicate: a reminder

2. Syntactic position/Predicative argument

3. Constraints on the base description

3.1. Filtering scope

3.2. Filtering through features

3.2.1. Filtering through syntactic features

3.2.2. Filtering through semantic features

3.2.2.1. Filtering through verification-attestation of the presence of information

3.2.2.2. Filtering through verification-enrichment of information

3.2.2.3. Filtering through compulsory enrichment of information

4. Realization of arguments

4.1. Correspondence between argument and position

4.2. Floating argument

4.3. Default values

5. Optionality management

6. A few more complex cases

6.1. Semantic unit including, for one of its arguments, a predicate and one (or more) of its arguments

6.2. The case of Compound syntactic units

6.2.1. Internal/external part of a syntactic compound

6.2.2. Insertion of modifier

7. Mechanisms implemented

E - REFERENCE BIBLIOGRAPHY

F - USER'S MANUAL

1. General points

2. Usem

2.1. General points

2.2. Predicative representation (RepresentationPredicative)

2.3. Weighted Conceptual Representation (RepresentationConceptuelle_Pond)

3. Semantic unit of an affix (Usem_Aff)

4. Predicate

4.1. General Points

4.2. Argument

4.3. SelectEtPreciseArg, InformeArg

4.4. Instantiated predicate (PredInstancie)

4.5. List of predicates (ListePred)

4.5.1. Variable

4.5.2. SelectPredArg

5. Concept

6. Valued semantic feature

6.1. Weighted valued semantic feature (Trait_Sem_ValPond)

6.2. General Points

6.3. Semantic feature (Trait_Sem)

7. Weighted valued relation

7.1. General Points

7.2. Semantic relation

7.3. Semantic relation between Usems (R_Usem)

7.4. Semantic relation between the other objects (R_Pred, R_Concept, R_Pred_Concept, R_Concept_Pred)

7.5. Correspondence between arguments (Corresp_Arg_Arg)

8. Correspondence between syntax and semantics (Corresp_Usyn_Usem)

8.1. Correspondence

8.2. Simple and floating correspondence

8.3. ContraintDescription

8.3.1. ContraintIntervConst

8.3.2. ContraintConstruction

8.3.3. ContraintStructInterne

8.3.4. Contraint_mdc

8.3.5. ContraintPosition

8.3.6. ContraintSyntagme

G - ENTITY-RELATION DIAGRAMS

1. Semantic unit, Predicative representation, Conceptual representation

2. Affix Usem (Usem_Aff)

3. Predicate, List of predicates, Argument, Variable, Semantic role

4. Concept

5. Weighted valued feature, Valued feature, Feature value, Semantic feature

6. Weighted valued relations (-pred, -concept, -Usem)

7. Weighted valued relations (-pred-concept, -concept-pred)

8. Semantic relations (Usem-Usem, pred-pred, concept-concept, pred-concept, concept-pred)

9. SelectEtPréciseArg, Informe argument, Instantiated argument

10. Correspondence between syntax and semantics

11. Contraint description, Contraint IntervConst, Contraint mdc, Contraint construction, Contraint struct interne, Contraint position, Contraint syntagme

H - SGML DTD

I - Introduction - Translation of the Conceptual Model

II - Commented GENELEX DTD

1. DTD genelex.dtd

2. DTD semant.dtd

3. Constraints semant.ctr

4. Entities semant.ent

5. Entities custom.ent

6. DTD syntaxe.dtd

7. Constraints syntaxe.ctr

8. Entities syntaxe.ent

9. DTD morpho.dtd

10. Constraints morpho.ctr

11. Entities morpho.ent

This report presents the work carried out by all of the partners of the GENELEX France consortium to define a model describing the semantic part of an electronic dictionary.

It follows the presentation documents on morphology called "MORPHOLOGICAL LAYER Version 3.2" and "SYNTACTIC LAYER Version 4.0" written by the same consortium. The general introduction and preliminary statements of those two reports remain valid for this document. The notions and conceptual choices explained in those reports are not specified again here, but are sometimes referred to.

A - GENERAL POINTS: situating the problem

Unlike morphology or even syntax, lexical semantics is still a relatively new and unformalized subject. Morphology and syntax are less open to debate and controversy, and they are characterized by main trends and notions common to various theoretical approaches, which facilitates the definition of a generic model. At present, there is no such consensus concerning lexical semantics.

We will therefore treat the subject cautiously and unassumingly and will limit our unifying ambitions concerning the semantic model.

The development of a model must integrate existing experience and know-how regarding lexical semantics, and rely on the lexicographic tradition of dictionaries as well as on current work in computational linguistics.

On the one hand, there is a lexicographic tradition (surrounding paper dictionaries) which we must take advantage of, even though it is not subject to the same constraints as those which govern the development of an electronic dictionary.

In fact, the description of lexical entries in paper dictionaries complies, for a given dictionary, with standards or coding recommendations designed to avoid excessive coding discrepancies or inconsistencies within a single dictionary. However, the (partly subjective) assessment of the writer is always necessary, and the human reader accepts these characteristics with relatively little difficulty. For example, circularity, which is a feature of these dictionaries, is undoubtedly acceptable for a human user because it partly reflects our perception of the meaning of words. Moreover, the reader corrects inaccuracies, ambiguities, presuppositions and tacit elements within definitions by calling upon his/her general linguistic and extra-linguistic knowledge to a large degree.

However, a computer system that processes data extracted from an electronic dictionary (whether of the GENELEX type or not) cannot cope with these characteristics so easily. A systematic and consistent structuring of data, together with a set of additional descriptive tools, seems to be required for a model describing lexical semantics within the framework of electronic dictionary development. The data extracted from this type of dictionary must be usable by automatic language processing applications, without specifying which ones. Consequently, very different requirements will have to be met.

On the other hand, a good deal of research on lexical semantics has been carried out, but few generic full-scale realizations developed for industrial use have been completed as yet, apart from the EDR and CYC projects. The work of I. Mel'cuk and his team should be mentioned here [Mel'cuk 1984-1988].

Indeed, the most elaborate studies in lexical semantics often focus on a specific topic (motion, temporality, aspect, connotations...) but their modelling remains generally unselective. The global framework that could integrate these partial modellings, and which is required for a generic model, remains to be defined.

First, we will define our approach, and the goals we have set for a model of the semantic layer. Then, we will describe the global architecture of the lexicon before considering more specifically the model of the semantic layer.

1. Genericity and flexibility

Just as for morphology and syntax, the semantic approach, within the framework of GENELEX, must guarantee format genericity to ensure the reusability of a dictionary database corresponding to the GENELEX model. By genericity, we mean that automatic language processing systems, integrated into different theoretical frameworks and designed for different applications, must be able to extract significant lexical data, and that human users must be able to express or consult lexical data easily.

This implies that the model must simultaneously:

- represent, without too much distortion, the various theories or approaches (given a certain state of the art to date), without being related to a specific theory itself.

- be easy to develop through enhancements and without overall re-examination. (This implies that a flexible frame be provided to support subsequent developments).

The semantic layer of the GENELEX model represents the semantic data which are specific to the lexicon. Therefore, it deals with general lexical semantics. The model of the lexicon is not targeted at specific applications, nor does it adapt its description to any special field. It must be usable for concrete applications, but the description provided must not exceed certain limits: data related to a specific application which could be regarded as partly pragmatic in the field, and partly encyclopedic, whether it involves general knowledge or information related to the needs of the application itself, must not be integrated.

However, the limits between these categories of data are relatively vague, and one may consider that there is a certain continuity between them, or even that lexical semantics can be enriched by integrating data which go beyond pure linguistic fields and reflect the world in general. That is why each instantiation of the model must define which data the GENELEX model is to include.

This way, it should represent various levels of information which may seem incompatible, but which are necessary for different approaches and families of applications.

This comprises:

A level of linguistic semantic representation.

This representation, which is closely related to the language, is mainly created on the basis of the lexicon studied in context (in the utterances actually produced in a situation and not structured for that purpose) and of the semantic relations between the elements of the lexicon. (This includes the precise semantic data expected of a paper dictionary derived from a GENELEX dictionary and which are required for quality automatic translation or generation, for automatic text comprehension and summary generation ...).

A level of conceptual semantic representation.

This representation, resulting at least in part from the trends within artificial intelligence and some other reflections, is more "abstract". It relies on primitives associated with a formalism of knowledge representation.

Extra-linguistic data concerning the representation of the world can be added at this level.

Consequently, the information coming into play often refers not only to the purely linguistic intensional content but also partly to the properties of the denoted object, and it may include a partial representation of a given world.

This level of representation will most probably be associated, at an earlier stage (outside the GENELEX model), with a system which makes it possible to reason about this knowledge.

It is recommended to combine both levels and to specify the articulation which will make it possible to reflect the purely linguistic semantic content and to interface it with the conceptual level. One of the options of the GENELEX semantic model is to provide the opportunity to articulate both levels.

The great variety of approaches and theories that must be accounted for obliges us to adopt a very elaborate model providing the opportunity to express a given theory within a sub-set of the model.

Consequently, the approach selected must be multi-theoretical rather than non-theoretical, while remaining consistent. So, it must provide a very comprehensive descriptive framework allowing for the expression of connected data in various ways.

This high flexibility and open framework certainly imply a risk of redundancy and even inconsistency, which the lexicographer must take into account. As for the morphological and syntactic layers, he/she will have to make global choices and stick to them, and the software managing the GENELEX lexicographic base will make it possible to verify certain integrity constraints, according to the subset of the model used.

Within this scope, it is obvious that the global choices of data coding and structuring constitute a kind of "parameterization" of the model, and that this parameterization must be compatible between two dictionaries that are intended to communicate; therefore, relating the instantiations of the model to each other is a prerequisite to merging them.

2. Ensuring the suitability of the model

The GENELEX project is particularly oriented to industries and the model must support a great number and wide variety of data, having varying degrees of specificity. It must neither deter dictionary editors, nor lexicographers of the "classical" school, the school of "computational linguistics" or the main trends of artificial intelligence, nor users.

An overly complex model would disconcert many lexicographers and would risk being difficult to use.

It is also recommended to provide varying degrees of subtlety of expression, or at least to select a representation of the semantic level that allows for easy extraction from dictionaries which are more or less precise from the semantic point of view.

The modelling of semantics relies on the morphological and syntactic layers, and must take data from these levels into account. The approach must therefore be consistent and, in particular, describe the connections between data at different levels, e.g. specify the realization of semantic arguments with regard to the data of the syntactic level.

Although it is undesirable to give greater importance to a particular theory, it is no doubt necessary to provide a minimum of methodology for describing semantic entries, i.e. defining criteria for splitting entries and structuring "polysemic" entries if necessary.

It is important to limit the quantity of data by factorizing wherever possible and by providing the means of circulating certain data derived from these factorizations. Consequently, inheritance or sharing mechanisms must be applied between various levels of representation and between units linked by certain relations.

3. Description axes

A semantic unit represents the unit of meaning (sense and explanation) associated with one or more syntactic units derived from the same morphological unit, in the case of simple Usyn, or the unit of meaning associated with a compound Usyn. The data it comprises must provide details about the behavior of this unit as far as meaning is concerned, i.e. describe its specific semantic contribution in relation to its interaction with the elements combined according to the syntax. Naturally, we assume that the general rules applied to the composition of meaning in order to build the semantic representation of utterances, or even texts, are defined outside the GENELEX dictionary.

The description of a semantic unit can be made according to two main complementary axes:

3.1. Componential or analytic axis

This axis consists in describing the semantic unit "from within" by "breaking it down" into elementary semantic data (for example, by features and references to basic classes or predicates); the data related to this axis provide a definition through components of meaning.

They also make it possible to relate the meaning of the unit to units integrated in the conceptual abstract level while providing an indication of the specific content.

This axis is well adapted to the conceptual description using several possible abstraction or generalization levels (possibly up to the primitives used in Artificial Intelligence).

3.2. Relational or differential axis

This axis consists in defining the position of semantic units in relation to each other by specifying the nature of their sense relations.

It acknowledges the proximity of sense relations between units and also makes it possible to specify the relations of semantic derivation and collocation or preference.

It will be possible to express relations between semantic units (located at the linguistic level) but also relations between units at the conceptual abstract level to which the semantic units point.

Moreover, it is not possible to describe the senses of lexical entries individually, without also describing the interactions of senses specific to the context of occurrence of the entry described, and therefore, it is also necessary to take into account the repercussions of the syntagmatic organization on the semantic level. In other words, we must also describe the transition from the syntactic level (context of occurrence of the entry described) to the semantic level and the consequences that the syntactic context of this connection and its instantiation can have on the semantic representation of the entry.

These predicative entries will have to comprise a number of specific data concerning arguments and their connections, not only within the description of the transition from syntax to semantics, but also at each level of the semantic model bringing into play the semantic relations, between predicative entries.

Certain data required for the description of the semantic unit partly depend on the category of the morphological unit from which it is drawn. For example, one usually distinguishes nouns, which correspond to entities; verbs, which describe a state, a process or a transition; and adjectives or adverbs, which modify the meaning of elements belonging to the previous categories by specifying some of their attributes...

This distinction is only partly relevant. That is why the GENELEX model will not fix the type of data required or allowed according to category, and it is obvious, for example, that the content of the lexical units belonging to the verb category is not the only one which can be described in a predicative way.

Besides, the model does not exclude any grammatical category or sub-category from the semantic description: a determiner or a preposition can be described perfectly well in semantics.

B - OVERALL ARCHITECTURE OF THE LEXICON: THE THREE LAYERS

1. Semantic unit - Semantic unit of an affix

A semantic unit concentrates all the semantic data corresponding to the sense of an entry in the dictionary. It carries these data itself, or gives access to them via other associated objects. One may distinguish the semantic units representing the meaning of autonomous units, called Usem, from the semantic units associated with affixes called Usem_Aff.

An autonomous semantic unit (Usem) describes the meaning of a morphological unit in a given syntactic context: in this case, it corresponds to a simple Usyn in semantics. It can also correspond to a compound Usyn considered as a whole in semantics, even if it integrates components (Um or Usyn) into the syntactic description.

A Usem does not intend to give the full meaning of the syntactic structure. The grammar will also provide semantic data, particularly concerning the type of compositionality, and these data will make it possible to calculate the semantic representation by combining the semantic data associated with each structural element and the data drawn from the grammar.

The semantic unit of an affix carries, or gives access to, all the semantic data corresponding to the specific semantic contribution of an affix to the formation of a word, with the method of formation being defined in the morphological layer. It is directly associated with the affix Um, for which it provides a minimum description of its sense or senses.

2. Articulation with the other layers

A semantic unit, which represents one sense, can be accessed through one or more syntactic units; in the case of simple Usyns, these syntactic units are all associated with a single morphological unit. A semantic unit is therefore indirectly but unambiguously related to one and only one Um in the case of a simple Usyn. The semantic unit describes the meaning associated with SELF in the Basic Description (BD) which characterizes the syntactic unit. When the semantic unit is described as predicative, it can also provide information about the predicate arguments and their surface origin.

Moreover, a syntactic unit, and implicitly, the morphological unit from which it is drawn, can be associated with several semantic units, i.e. a single syntactic construction can correspond to one or more senses.

This can be represented by the following entity-relation model:

which may, for a given morphological unit, have the following instantiation:

If we integrate Compound syntactic units, this model becomes:

For the affix semantics, we have:

It is obvious that, within the framework of the GENELEX model, the lexicographer will have to make various choices concerning the coding of the syntactic level and that these choices will have consequences for the overall structure of the dictionary.

In fact, many factorizations within the description of position occupants, the intensive use of optionalities within the BDs and the non-application of denotational conditions will lead to an instantiation of the GENELEX dictionary which usually includes few syntactic units for a morphological unit, and many semantic units for a syntactic unit. It is therefore necessary to specify many filterings on the BD when establishing a relation between a syntactic unit and a semantic unit.

The more syntactic groupings are intensified, the more filtering on syntax in semantics is required. Conversely, the reduction of syntactic groupings requires a lesser degree of filtering on syntax from semantics. The implementation of the GENELEX model must therefore be between the two extremes, i.e. either a highly "factorizing" splitting into few syntactic units, with numerous filtering data, or a splitting into more syntactic units requiring very few (or even no) filtering data.

Conversely, the splitting of syntactic units, by factorizing little data and using denotational constraints and thematic roles will reveal, from the syntactic level, differences in construction that are usually reflected at the semantic level since they correspond to different semantic units (and therefore to different senses).

A semantic unit can be associated with various syntactic contexts described in many Usyn.

In general, a dictionary with the GENELEX format comprises a large number of semantic units, more than the number of morphological units, even if certain semantic units share data (they can have common features or predicates, or point to the same concept...).
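The links between the three layers described in this section can be illustrated by a minimal sketch. It assumes that each simple Usyn is drawn from exactly one Um and that Usyn-Usem links are many-to-many, as stated above; all identifiers (voler_1, voler:fly, etc.) are hypothetical and serve only as an illustration, not as GENELEX coding. The check verifies that each Usem reached through simple Usyns is related to one and only one Um.

```python
from collections import defaultdict

# Hypothetical identifiers: each simple Usyn is drawn from exactly one Um.
USYN_TO_UM = {
    "voler_1": "voler",
    "voler_2": "voler",
    "vol_1": "vol",
}

# Usyn -> Usem links are many-to-many: one Usyn may have several senses,
# and one sense may be reached through several syntactic contexts.
USYN_TO_USEM = [
    ("voler_1", "voler:fly"),
    ("voler_2", "voler:fly"),
    ("voler_2", "voler:steal"),
    ("vol_1", "vol:flight"),
]


def um_of_each_usem(usyn_to_um, links):
    """Map each Usem to the set of Ums reachable through its Usyns."""
    reached = defaultdict(set)
    for usyn, usem in links:
        reached[usem].add(usyn_to_um[usyn])
    return dict(reached)


def ambiguous_usems(usyn_to_um, links):
    """Return the Usems related to more than one Um (constraint violated)."""
    by_usem = um_of_each_usem(usyn_to_um, links)
    return sorted(usem for usem, ums in by_usem.items() if len(ums) > 1)


print(ambiguous_usems(USYN_TO_UM, USYN_TO_USEM))
```

A tool managing the lexicographic base could run this kind of verification as one of its integrity constraints; how the actual GENELEX software does so is not specified here.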

3. Lexical unit

A fully identified simple lexical unit is virtually represented in the GENELEX model, through the association of a morphological unit, a syntactic unit, and a semantic unit. It therefore represents a specific flow through the various layers, with an accumulation of data related to each layer. Naturally, this flow can be incomplete if no semantic information has been considered necessary, in which case, the lexical unit will be limited to the morphological unit-syntactic unit pair. Certain units may only comprise morphological data, in which case, the lexical unit is fully identified by the morphological unit.

A "compound" lexical unit can be described at different levels:

- in the morphological layer. In this case, it is described as a compound Um which relies on the morphological data provided by its components; syntactically and semantically, it is described in the same way as a simple Um: it also represents a flow through the three layers.

- in the syntactic layer. It is then described by a compound Usyn relying on component Usyns and/or Ums. Consequently, its morphological data are provided by its component units. This Usyn is related to Usems in the same way as simple Usyns. It therefore represents a specific flow of the syntactic layer (pointing to the morphological layer) and the semantic layer.

- in the semantic layer. This semantic "composition" is described by the fact that certain Usems have a collocation relationship. These Usems each present a specific flow through the three layers.

The Morphological units of affixes are described separately. They are not related to any real syntactic behavior since their combinatorial behavior at the word level is defined in the morphological layer. However, the model provides semantic elements by associating them directly with one or more Semantic units of affixes. It then represents a flow from the morphological layer to the semantic layer, without passing through the syntactic layer.
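The notion of a lexical unit as a flow through the layers can be sketched as follows. This is only an illustration restricted to simple units and affixes: the class and field names are assumptions, not objects of the model, and compound units are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Um:
    """Morphological unit; `is_affix` marks affix Ums."""
    lemma: str
    is_affix: bool = False


@dataclass(frozen=True)
class Usyn:
    """Syntactic unit; `label` is a hypothetical identifier."""
    label: str


@dataclass(frozen=True)
class Usem:
    """Semantic unit (a Usem, or a Usem_Aff when attached to an affix)."""
    gloss: str


@dataclass(frozen=True)
class LexicalUnit:
    """One flow through the layers; a layer not reached is None."""
    um: Um
    usyn: Optional[Usyn] = None
    usem: Optional[Usem] = None

    def layers(self) -> list:
        path = ["morphological"]
        if self.usyn is not None:
            path.append("syntactic")
        if self.usem is not None:
            path.append("semantic")
        return path


def is_well_formed(unit: LexicalUnit) -> bool:
    """Affixes skip the syntactic layer; other units reach semantics via a Usyn."""
    if unit.um.is_affix:
        return unit.usyn is None
    return unit.usem is None or unit.usyn is not None


full = LexicalUnit(Um("boire"), Usyn("boire_transitive"), Usem("ingest a liquid"))
partial = LexicalUnit(Um("boire"), Usyn("boire_transitive"))
affix = LexicalUnit(Um("re-", is_affix=True), usem=Usem("repetition"))

for unit in (full, partial, affix):
    print(unit.um.lemma, unit.layers(), is_well_formed(unit))
```

The three instances correspond to a complete flow, a flow limited to the morphological unit-syntactic unit pair, and an affix flow going directly from the morphological layer to the semantic layer.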

We are now going to describe in detail the model of the semantic layer. Then, we will describe the relationship between the syntactic and semantic layers.

C - MODEL OF THE SEMANTIC LAYER

This part will deal with the main notions identified in the model. The modelling of the semantic layer must be regarded in terms of the two complementary axes of componential and relational analysis. It is also intended to present, by linking them together, two different viewpoints: lexical semantics (linguistic) and conceptual representation of knowledge (AI-oriented). First, we will deal with the data that do not concern the predicative aspect of entries. The predicative aspect will be described at a later stage.

Most of the examples given in this report concern the semantics of open classes (which correspond in morphology to entries belonging to the grammatical categories noun, verb, adjective, adverb) as opposed to the closed classes (which correspond in morphology to entries belonging to the grammatical categories determiner, pronoun, preposition, conjunction). This does not imply that these elements cannot be semantically described according to the GENELEX model. On the contrary, determiners as well as prepositions will have their place in the semantic layer of a dictionary built according to the GENELEX model.

1. Overview of the model of the semantic layer

 

 

2. Semantic unit

The semantic unit (Usem) is the entry point of the semantic layer, just as the Usyn is the entry point of the syntactic layer, and the Um that of the morphological layer.

The Usem describes the meaning, the sense of a Usyn, whether it is simple or compound, and in the case of a simple Usyn, whether it is related to a simple Um or a compound one. Insofar as the Usem is related to one or more Usyns, the meaning is described in relation to one or more syntactic contexts.

The description of the Usem is based on various objects: Predicate and Argument, Concept, and Valued semantic feature. Additionally, it carries certain attributes that complete its characterization. These attributes make it possible to give an unrestricted definition, that of the lexicographer or the paper dictionary, as well as the "formal" definition. The characteristics of its use are also specified through the combination of use values (CombVE), as for syntax and morphology.

3. Valued semantic features

A valued semantic feature is the association of a named semantic feature with one of its values. It therefore represents a kind of attribute-value pair. We will see, later in the document, that there are different types of semantic features, which may be binary (with a value +/-) or comprise a list of open or closed values, and that the model provides the opportunity to express information concerning these semantic features.
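The attribute-value nature of a valued semantic feature can be sketched as below. This is a minimal illustration under stated assumptions: the class names are ours, the feature name "domaine" and its value are hypothetical, and an open list of values is represented simply by the absence of a closed list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SemanticFeature:
    """A named feature; `closed_values` set to None means an open list."""
    name: str
    closed_values: Optional[frozenset] = None

    def accepts(self, value: str) -> bool:
        return self.closed_values is None or value in self.closed_values


@dataclass(frozen=True)
class ValuedFeature:
    """Association of a named semantic feature with one of its values."""
    feature: SemanticFeature
    value: str

    def __post_init__(self):
        # Reject a value that is outside a closed list.
        if not self.feature.accepts(self.value):
            raise ValueError(
                f"{self.value!r} is not a value of {self.feature.name}"
            )

    def __str__(self) -> str:
        return f"[{self.feature.name} : {self.value}]"


BINARY = frozenset({"+", "-"})
anime = SemanticFeature("animé", BINARY)   # binary feature
domaine = SemanticFeature("domaine")       # hypothetical open-list feature

print(ValuedFeature(anime, "+"))
print(ValuedFeature(domaine, "médecine"))
```

Since the model does not impose any feature name or list of values, the features declared in such a structure would belong to a given instantiation of the model rather than to the model itself.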

A large part of this semantic information can be expressed in a componential way by using valued semantic features. This represents a rather flexible and certainly very powerful breakdown instrument. The approach adopted by F.Rastier [Rastier 1987], which seeks to unify the representations of different levels (word, phrase, text) seems to constitute a very interesting basis for reflecting on the organization of the lexicon.

However, we must not forget that the major problem encountered when applying this method to the description of the lexicon is:

where does the breakdown stop?

Indeed, a precise description of the meaning of words using this type of approach on a full scale, across the whole general lexicon (and not within a small restricted field), leads to the identification of an increasing number of features whose generality in language, and even pertinence to it, are sometimes questionable.

The multiplication of features leads us to create a kind of metalanguage of features that is very elaborate and almost impossible to process automatically; those features can be associated with a paraphrase that may be very complex, and which indicates the meaning of the feature.

We do not want to recommend such an approach for GENELEX, even if the use of recurrent features in the general lexicon seems interesting and even necessary.

Defining a minimum set of features used for the generic description of the complete lexicon of a language remains a problem with a somewhat arbitrary solution.

Semantic features do not all have the same status and they express different families of data. The model makes it possible to characterize the semantic features used, and in particular, provides a number of feature types that will be described later in the text. In using the model, each type of feature may be associated with one or more semantic features, or none at all.

Whatever the types of features used, the model does not impose any feature name or list of values. This means that an instantiation of the model is defined by all of its acquired features (a priori or in an incremental way, during the coding of the lexicon), which constitute a sort of instantiation of the feature language of the model.

3.1. General features (property)

General semantic features (or properties) are generally used to express semantic restrictions regarding selection. These restrictions can be expressed in the form of such features at the syntactic layer if the user desires it. In that case, they are applied to the syntactic and semantic constraints on position occupants.

These features are general in the sense that they are not assigned to any specific semantic class or field. They often take the value + or -.

The features animé, humain, comptable, masse, abstrait, concret, action, processus, état usually belong to this category of general features.

Note that the only data indicated at the lexicon level corresponds to a standard or typical use without intended effect. This way, figures (metaphors or metonymies...) will be detected during the analysis process due to the fact that certain restrictions on the selection of general semantic features will be transgressed.

Ex : Ma voiture boit de l'essence (my car is heavy on petrol)

refers to a predicate with two semantically constrained arguments:

boire (Arg0 [animé : +], Arg1 [liquide : +] [comestible : +])

these constraints are deliberately transgressed here.

Naturally, the lexicon will not take such tropes into account, unless they are lexicalized, in which case, a specific semantic entry can be assigned to the description of this use.

It must be possible to specify that certain features are "compulsory" (and that the transgression of a restriction concerning the selection of one of them must be taken into account by specific processing, either to identify the intended effect, or to reject the related semantic interpretation in favour of another interpretation which is compatible with the morphological and syntactic data).

Other weaker features could be called "preferential" features.

The model expresses these different "weightings" through an attribute specifying the weighting associated with the valued feature information.
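As an illustration only, the difference between compulsory and preferential restrictions can be sketched in Python. The names Constraint and check_argument, the two weighting labels and the weighting chosen for the argument of boire are assumptions made for this sketch; the model does not prescribe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    feature: str    # e.g. "animé"
    value: str      # "+" or "-"
    weighting: str  # "compulsory" or "preferential" (assumed labels)


def check_argument(features, constraints):
    """Return the names of transgressed constraints, grouped by weighting.

    Only an explicit mismatch counts as a transgression; a feature that is
    not specified on the argument is ignored in this sketch.
    """
    report = {"compulsory": [], "preferential": []}
    for c in constraints:
        if c.feature in features and features[c.feature] != c.value:
            report[c.weighting].append(c.feature)
    return report


# Arg0 of boire, as in the example above; the weighting is an assumption.
boire_arg0 = [Constraint("animé", "+", "preferential")]
voiture = {"animé": "-", "concret": "+"}
result = check_argument(voiture, boire_arg0)
```

A transgressed preferential constraint would typically be treated as a possible figure, whereas a transgressed compulsory one would trigger the specific processing mentioned above.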

From the standpoint of the power of expression, it is interesting to offer the possibility of structuring knowledge; that is why the model makes it possible to integrate features into lattices. The factorization and implementation of inheritance mechanisms are then made possible.

Ex :

The presence of [animal : +] would imply:

[humain : -]

[animŽ : +]

[concret : +]

Note: The structuring of valued features varies according to the type of values. "Binary" features (with value + or -) will necessarily be structured on the basis of the value +, with the two values being exclusive of each other. Features with a list of values, whether it is included in the model (closed list) or not (open list), will have an inheritance system in which one value excludes other values only if a feature is defined as monovalued.
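The inheritance made possible by such a structure can be sketched as follows. This is a minimal illustration, not the mechanism of the model: the table IMPLIES and the functions closure and consistent are hypothetical names, and routing [concret : +] through [animé : +] is an assumption made only to show a two-step inheritance.

```python
# Implications between valued features, taken from the example above.
# Deriving [concret : +] from [animé : +] is an assumption of this sketch.
IMPLIES = {
    ("animal", "+"): [("humain", "-"), ("animé", "+")],
    ("animé", "+"): [("concret", "+")],
}


def closure(valued_feature):
    """All valued features implied, directly or indirectly, by the given one."""
    seen = []
    stack = [valued_feature]
    while stack:
        current = stack.pop()
        for implied in IMPLIES.get(current, []):
            if implied not in seen:
                seen.append(implied)
                stack.append(implied)
    return set(seen)


def consistent(valued_features):
    """A binary feature cannot carry both values + and - at the same time."""
    names = [name for name, _ in valued_features]
    return len(names) == len(set(names))


inherited = closure(("animal", "+"))
```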

3.2. Semantic class features

The feature characterizing the semantic unit in terms of semantic class is particularly important. The notion of semantic class, although intuitive, is relatively vague. A class groups non-trivial semantic information in hopes of identifying and designating it without necessarily breaking it down into a set of elementary semantic features; it represents a conceptual and generic notion that is functionally pertinent within the linguistic system of the described language and to which a semantic unit is supposed to be related.

Depending on the theoretical or methodological approach adopted, the list of semantic classes used in the semantic layer will either be given in advance, as a kind of a priori description, or emerge from the lexicon itself.

Semantic classes can be paradigmatically highlighted by certain predicates which are relatively constrained at the semantic level (acting as operators) and which can adopt an element of this class as an argument of rank n. The approach of G. Gross [Gross-1994] makes it possible to highlight semantic classes.

For example, one may consider that the semantic class vêtement can be highlighted by certain specialized predicates (porter, essayer, aller à...) (to wear, to try on, to suit...). Other semantic classes could be, for example: profession, véhicule, aliment, moyen de transport...

Whatever the methodology selected for identifying classes, classification is bound to be partly arbitrary: even if semantic classes are determined according to the role played by their elements as actants of certain specialized predicates, the selection of the list of predicates to be related to a semantic class will determine the class, and vice versa.

Although semantic classes are named and have a specific "existence" within the model, it seems interesting to describe them with a set of semantic features. The description therefore retains its componential character at this level, if desired, and also highlights the class that shares a set of pertinent semantic features in the language concerned. Therefore, the elements of a semantic class implicitly carry all the features related to this class.

The notion of semantic class can also be regarded as taxonomic; it would then be possible to describe a lattice or tree structure integrating classes. Two sister classes within this structure diverge (at least implicitly, or explicitly whenever possible) for the value of at least one feature among all the valued semantic features implied in the semantic class. This divergent feature may be general, distinctive or specific.

The various elements belonging to a single semantic class (i.e. carrying the same valued feature of semantic class) can also carry certain diverging features that will play the role of distinctive features within this class.

According to a certain theoretical approach (to be defined for a dictionary instantiation conforming to the GENELEX model, and specified through the properties attributed to the class features used), a semantic unit can belong to different semantic classes; in a way, a semantic class feature then represents a viewpoint on the semantic unit. Ex : avion can then belong to the classes véhicule and objet volant according to this type of approach.

It is also possible to apply these semantic class features in a more restrictive way which can then rely on a tree structure integrating the valued features of a semantic class, in which entries described are positioned by these classification features (monovalued and therefore exclusive in that case).

Remark concerning the representation of semantic classes within the model

The GENELEX model makes it possible to represent semantic classes with the valued semantic features associated with features called "semantic class features". Representation within the model remains the same, whatever the linguistic method used to identify semantic classes.

Depending on the options selected, one or more classes will be assigned to the units; this is a characteristic of the class feature (mono/multi-valued). The model does not intend to keep track of the set of predicates revealing the class (if such a method has been used). It makes it possible to associate the valued feature of a semantic class with the valued semantic features implied by this class (which may be, if this model instantiation was selected, the set of valued features that make it possible to identify the class elements) and also with a set of valued features which are incompatible with the class. It also makes it possible to attach to this type of valued class feature a set of features that should be (or must be) indicated.

Moreover, a taxonomic structuring of classes (through an ISA relation between the valued features) makes it possible to structure the whole set of valued class features, and therefore obtain a varying degree of specificity.
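The representation just described (implied features, incompatible features, ISA structuring) can be sketched as a small data structure. The class SemanticClass and its attribute names are hypothetical, and the ISA link between véhicule and moyen de transport, as well as the feature values, are assumptions chosen for the illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SemanticClass:
    name: str
    parent: Optional["SemanticClass"] = None          # ISA link
    implied: dict = field(default_factory=dict)       # features carried by members
    incompatible: dict = field(default_factory=dict)  # values members may not carry

    def inherited_features(self):
        """Implied features of this class and of all its ISA ancestors."""
        features = self.parent.inherited_features() if self.parent else {}
        features.update(self.implied)
        return features

    def accepts(self, unit_features):
        """False if the unit carries a value incompatible with this class
        (ancestors are not consulted in this sketch)."""
        return all(unit_features.get(f) != v
                   for f, v in self.incompatible.items())


# Assumed taxonomy and feature values, for illustration only.
moyen_de_transport = SemanticClass("moyen de transport", implied={"concret": "+"})
vehicule = SemanticClass("véhicule", parent=moyen_de_transport,
                         implied={"animé": "-"},
                         incompatible={"abstrait": "+"})
```

Moving up the ISA links gives a less specific view of the same unit, which corresponds to the varying degree of specificity mentioned above.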

3.3. Distinctive features

In order to differentiate the various elements of a single semantic class based on a componential approach, specific distinctive features will be used. These features differ from general semantic features in the sense that they are usually related to a certain semantic class or domain, and are not used to describe semantic units across the lexicon as a whole.

The features urbain and collectif provide an example of specific distinctive features within the semantic class of means of transport.

3.4. Domain features

The meaning of a lexical unit is often related to a specific domain of activity. It is therefore necessary to be able to specify this domain with a feature. The set of values associated with this feature represents a sort of description of the activities encountered in a given world.

This is a piece of information that makes it possible to relate the semantic unit to the domain in which it is used, and the link with knowledge from a more specific domain (generally not included in the generic dictionary) could be established, at least in part, by means of this feature.

The values of the feature domaine are hierarchical. For example, médecine is one possible value, but so is chirurgie, psychiatrie... which hierarchically depend on médecine. We do not recommend adopting overly precise descriptions of domains, but it is interesting to describe a two- or three-level hierarchy, for example, within the GENELEX model. The values of the feature domaine could be drawn from any one of these levels.
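A two-level hierarchy of domain values can be sketched with a simple parent table. The names DOMAIN_PARENT, domain_path and within are hypothetical; only the three values come from the example above.

```python
# Each domain value points to the value it depends on (None at the top level).
DOMAIN_PARENT = {
    "médecine": None,
    "chirurgie": "médecine",
    "psychiatrie": "médecine",
}


def domain_path(value):
    """The value followed by its ancestors, from most to least specific."""
    path = []
    while value is not None:
        path.append(value)
        value = DOMAIN_PARENT[value]
    return path


def within(value, ancestor):
    """True if `value` is `ancestor` itself or depends on it hierarchically."""
    return ancestor in domain_path(value)
```

A unit marked [domaine : chirurgie] can then be retrieved by a query on médecine, whichever level was used for coding.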

 

Note: The types of semantic features that we have presented up to now make it possible to describe the objective or inherent content of the semantic unit. It is also important to be able to describe a set of partly subjective, evaluative or expressive data, in relation to the situational context of the units described. These peripheral or non-essential data are useful not only for providing a correct interpretation of an utterance or a text, but also for generating it. The features that we are going to describe include this type of data.

The types of features described above can be used at the purely lexical level of the Usem, or in the characterization of more abstract descriptive objects (concepts, lexical or non-lexical predicates).

The following features are intended more for the precise description of the lexical level (but they can also be used at any level of description).

3.5. Language level features

A semantic unit can carry a certain level of language. In fact, it is the sense that can be characterized in terms of level of use, not the morphological entry (which is related to different constructions and different senses) nor the construction describing the context of occurrence of the semantic unit.

Ex: One sense of the word canard (journal) (duck - newspaper) belongs to the familier level of language.

This information which, from the semantic point of view, is similar to a valued feature, is represented uniformly in the GENELEX model: a COMBVE (combination of use values) reflects the level of language as well as the frequency and date of use. For the semantic layer, this information is necessarily carried by the Usem.

3.6. Connotation and evaluation features

Phenomena which are commonly called connotation are strongly marked by a cultural context. They are related to the description of highly prototypical data and they are used for providing a precise description of the semantics of a lexical unit, which is required for certain applications and from a purely lexicographic point of view. For that reason, the model offers the possibility of integrating them with connotation features.

Example: force with the value plus for a sense of homme and the value moins for the entry femme.

Certain semantic units serve as references in the language; they have an aspect of neutral or generic use, which is not the case for other semantically related units.

This is true for certain units associated with adjectives. For example, in the set {bon, mauvais, excellent, nul } (good, bad, excellent, worthless) bon seems to represent the neutral entry; the other entries are somehow relative to it. Therefore, one could ask: c'est bon ? (is it good?) without necessarily expecting a positive answer, but it is more difficult to interpret the question: C'est mauvais ? (is it bad?) in a neutral way.

A specific feature will make it possible to identify these as "reference" entries.

Representations of pragmatic knowledge go beyond the objectives of the GENELEX semantic layer. However, the limits are relatively vague and the model must make it possible to add pragmatic knowledge to the semantic layer. This is the purpose of the following features.

3.7. Pragmatic features

This type of feature may include data relative to the referential anchoring of the units described, depending on the various conditions of enunciation.

We can mark, for example: papa : [déictique : +]

père : [relationnel : +]

homme : [absolu : +]

Data concerning the argumentative use of the unit described may also be marked.

A reference to a level of pragmatic data is also provided in the model by means of lists of Predicates.

The GENELEX model is meant to be open, more at the semantic level than at the morphological or syntactic levels. That is why it provides a number of features, but also gives users the opportunity to express and structure the information rather differently, while remaining within the general framework of the valued feature formalism.

3.8. Miscellaneous features

The user is entirely free to use the language of expression of valued features within a larger scope than is specified above.

It will be possible to use various features whose status is not constrained by the model and is left entirely to users. They will include, for example, data relative to aspect and temporality, or the marking of the main semantic categories related to a given morpho-syntactic category.

The identification with a lexical field can be marked here.

It will also be possible to assign features of semantic categorization to units, for example: relation humaine for the units (ami, frère, proche...) (friend, brother, relative...). This can also be applied to verbs.

Relations of lexicalization between Semantic features and Usem, and between Valued semantic feature and Usem make it possible to introduce the meta-language of semantic features into the language, i.e. the lexicon.

On the basis of certain valued features (mentioned here as examples), one could have:

- from the standpoint of meaning, the semantic unit vêtement corresponds to the designation of the valued feature [classe : vêtement]. It can therefore have a relation of lexicalization with this valued feature.

- from the standpoint of meaning, the semantic unit couleur corresponds to the designation of the feature [couleur : ....]. It can therefore have a relation of lexicalization with this feature.

- from the standpoint of meaning, the semantic unit rouge (red) corresponds to the designation of the valued feature [couleur : rouge]. It can therefore have a relation of lexicalization with this valued feature.

4. Semantic relations between Semantic Units

The preceding point dealt with the analytic approach, which makes it possible to describe the semantic unit, or more abstract descriptive objects of semantics (predicate, concept), with valued features, i.e. semantic components of the described unit. This approach can be pushed to the extreme and then be sufficient in itself: relations between semantic units are deduced from the presence of such and such feature with such and such value, with the set of features then representing a metalanguage.

The model of the GENELEX semantic layer is not designed to impose this approach in its entirety, and is intended to also give the possibility of describing the semantic unit from outside by considering its relations with other semantic units, or with the predicates or concepts which a set of units may share.

Semantic relations make it possible to describe, from a global point of view, the structure of the lexicon for a group of relations, on the one hand, and collocations or local semantic preferences, on the other.

It is interesting to consider the lexicon as a set of lexical elements whose position inside the set (and consequently the specific semantic contribution) is defined according to the web of relations connecting them to other elements of this set.

We will now look at the large family of semantic relations at the lexical level. These relations have been closely studied by I. Mel'cuk and his colleagues [Mel'cuk 1984-1988], who regard them as a main axis for describing a lexical unit. This study as a whole is a very valuable lexicographic reference. Without adopting its global approach, it is certainly worthwhile to use this study as a reference and to take into account what it calls "lexical functions".

A number of relations are traditionally identified by the lexicographer. This is an essential contribution to our reflection and must be taken into account. The model will thus make it possible to represent various types of relations.

Moreover, the level of "conceptual knowledge representation" can rely on the identification of concepts or predicates and their ontological relations with other concepts or predicates; this aspect will be described later.

Semantic relations between the Semantic units of the model are of various kinds: paradigmatic, derivation (in the broad or restricted sense), or collocation. We will see later in the document that there are also semantic relations between objects which are somewhat abstract in relation to the Usems, i.e. semantic relations between predicates, between concepts, and between a predicate and a concept. Moreover, the link between a Usem and its predicate carries information; this link is described in what is called the predicative representation.

We are now going to study semantic relations between Usems.

4.1. Paradigmatic relations between substitutable units

It is essential to represent the semantic relations usually included in the dictionaries.

These are relations linking two semantic units associated with morphological units of the same morpho-syntactic category. The units linked by this set of relations are substitutable in their contexts of occurrence; the substitution modifies the meaning of the context in a way that naturally depends on the relation concerned.

Without intending to make a precise and exhaustive list, we will include the following relations:

- synonymy

- antonymy (encompassing contrary, reciprocal, and complementary)

- opposition

- taxonomy

- part-whole (meronymy)

These relations are very evident throughout the lexicon, irrespective of the morpho-syntactic categories (open) and the semantic domains or classes. These are fundamental general relations concerning the structure of the lexicon.

We will not describe these relations in detail, although they are certainly essential for the description of the lexicon.

These semantic relations should be described according to their type, but also according to the properties they have or do not have: reflexivity, transitivity, symmetry (for synonymic relations, for example), antisymmetry, or the fact that they are order relations or not.

Relations themselves can be interrelated by means of relations of generalization (ISA), incompatibility, or implication. Therefore, it is possible to describe various degrees of synonymy, and to have both a more precise view of these various relations and, through the structuring of this family of relations, an overall view virtually comprising only the broad relation of synonymy encompassing the whole set. This allows us to view the semantic data of the lexicon with varying specificity.

Semantic relations are grouped in large families or "types", specified by a "sub-type", and some of these relations can be selected as being "pivots", which means that they are common to various languages.

The semantic model also makes it possible to attach a modality, comprising a weighting factor and a viewpoint, to the information associating two Usems linked by a semantic relation.
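The points above (properties of relations, ISA structuring between relations, weighting and viewpoint) can be sketched together. All names here (RelationType, Relation, holds) are hypothetical, the sub-type near-synonymy is an assumed example of a "degree of synonymy", and the link between Exorbitant and Exagéré, with its weighting, is given purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationType:
    name: str
    family: str                           # the "type", e.g. paradigmatic
    symmetric: bool = False
    isa: Optional["RelationType"] = None  # generalization link between relations

    def generalizes_to(self, other):
        """True if `other` is this relation type or one of its ISA ancestors."""
        node = self
        while node is not None:
            if node == other:
                return True
            node = node.isa
        return False


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    rel_type: RelationType
    weighting: float = 1.0
    viewpoint: Optional[str] = None


def holds(relations, source, target, rel_type):
    """Check a relation, using ISA between relation types and symmetry."""
    for r in relations:
        if not r.rel_type.generalizes_to(rel_type):
            continue
        if (r.source, r.target) == (source, target):
            return True
        if r.rel_type.symmetric and (r.target, r.source) == (source, target):
            return True
    return False


synonymy = RelationType("synonymy", "paradigmatic", symmetric=True)
near_synonymy = RelationType("near-synonymy", "paradigmatic",
                             symmetric=True, isa=synonymy)
antonymy = RelationType("antonymy", "paradigmatic", symmetric=True)

# Illustrative data only; the weighting value is an assumption.
relations = [Relation("Exorbitant", "Exagéré", near_synonymy, weighting=0.8)]
```

Querying with the broad type synonymy gives the overall view, while querying with near_synonymy gives the more precise one.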

4.2. Semantic relations of derivation type

Semantic derivations make up a large part of the set of relations which must be looked at here. These relations, like the former ones, are both general and fundamental, because they provide structure for the lexicon as a whole, and they link two semantic units.

Let us recall that the syntactic layer makes it possible to describe the syntactic transformations between Usyns, whether these Usyns are associated with the same MUs or not. A family of transformations can concern what could be called syntactic derivation, and for certain pairs of Usems, the relation of semantic derivation which binds them will be reflected in the syntactic layer by means of a syntactic transformation/derivation relation linking the associated Usyns.

Moreover, these relations link lexical units which belong to distinct morpho-syntactic categories, the category being that of the MU associated with the Usem via one or more Usyns.

These relations depend on the categories they link. Many of them stem from predicative semantic units; we will provide a more detailed description of the predicative aspect in the following point. But we can already see that it involves, for example:

- Relations pointing to the nth typical predicate argument associated with the entry, be it as a semantic unit associated with a noun (noun of the actant) or an adjective (qualifying this actant according to various modalities).

Usem Vendre --> Usem Client (Nom, Arg2)

Usem Lire --> Usem Livre (Nom, Arg1)

Usem Ecrire1 --> Usem Livre (Nom, Arg1)

Usem Ecrire1 --> Usem Ecrivain (Nom, Arg0)

Usem Croire1 --> Usem Crédule (Adjectif, Arg0)

Usem Croire1 --> Usem Incrédule (Adjectif, Arg0)

Usem Croire2 --> Usem Croyant (Adjectif, Arg0)

- Relations pointing to the typical circumstant of an action: instrument, location, result, means, purpose ...

Usem Tennis --> Usem Court (lieu)

Usem Découper --> Usem Ciseaux (moyen)

- Relations pointing to the adjective qualifying the possibility/impossibility of being the nth argument.

Usem Manger --> Usem Comestible (Arg1)

Usem Croire1 --> Usem Crédible (Adjectif, Arg1)

Usem Boire --> Usem Potable (Arg1)

- Relations of semantic derivation in the restricted sense of the word. The core of meaning of the entries linked by this type of relation is somehow the same, but the related elements belong to different categories, and therefore have different contexts of insertion.

Usem Prison --> Usem Carcéral

Usem Jeu --> Usem Ludique

Usem Territoire --> Usem Territorial

These relations are particularly important in the case of paraphrastic relations between synonymous utterances. At the time of generation, they make it possible to make a lexical selection according to the global context, i.e. to choose, for a given meaning, the semantic unit of the category required by that context and by grammatical and stylistic constraints.

- Relation pointing to the typical activities associated with a nominal lexical unit (allowing for the integration of the Telic role as defined by Pustejovsky)

Usem Livre --> Usem Lire (Arg1)

Usem Livre --> Usem Ecrire (Arg1)

 

Remark 1: Relations between Usems are represented here simply in the form of "-->", but these semantic relations are described in the model.

Remark 2: Some of these relations between Usems may seem redundant if one considers the power of expression provided by the model around the predicate, and its knowledge concerning its arguments (see following point). However, in order to see how the various means of description are articulated, it is necessary to understand that:

- the relations described here are intentionally placed at the level of the lexicon; they may have a far-reaching scope, and a low level of abstraction with respect to the lexicon.

- most data carried by the Predicate have a much higher level of abstraction (valued semantic features, concepts, instantiated predicates). They can be expressed by Usems, at the lexical level, for default values; these are the "defining" data of the Predicate, and what can be expressed on the Predicate and its arguments is therefore more limited. It is true that certain relations between Usems are deducible from the data carried by the Predicate and by its relations with other descriptive objects. In this case, the coding strategy, for an application of the model, must specify whether certain data should be coded both between the Usems and, where possible, at the level of the Predicate, or whether only the relations between Usems that cannot be recomputed from the more abstract levels of description should be coded.

- In any case, the model must allow coding at the level of the Usem. In fact, the "abstract" level is not necessarily used by all the instantiations of the model.
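To make the coding of derivation relations at the level of the Usem more concrete, here is a minimal sketch in which each "-->" becomes a typed link. The record Derivation, the sub-type labels and the function targets are hypothetical; the pairs of Usems are taken from the examples of this section.

```python
from collections import namedtuple

# One derivation link between two Usems; sub-type labels are assumed names.
Derivation = namedtuple("Derivation", "source target subtype category argument")

DERIVATIONS = [
    Derivation("Ecrire1", "Livre", "typical_argument", "Nom", "Arg1"),
    Derivation("Ecrire1", "Ecrivain", "typical_argument", "Nom", "Arg0"),
    Derivation("Croire1", "Crédule", "typical_argument", "Adjectif", "Arg0"),
    Derivation("Découper", "Ciseaux", "circumstant:moyen", None, None),
    Derivation("Prison", "Carcéral", "restricted_derivation", None, None),
]


def targets(source, subtype=None, argument=None):
    """Usems reached from `source`, optionally filtered by sub-type or argument."""
    return [d.target for d in DERIVATIONS
            if d.source == source
            and (subtype is None or d.subtype == subtype)
            and (argument is None or d.argument == argument)]
```

For instance, targets("Ecrire1", argument="Arg0") retrieves the noun of the typical first actant without consulting the predicate level.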

4.3. Semantic relations of collocation type

The morphological and syntactic layers of the model make it possible to represent a phenomenon called "composition" which covers a great variety of cases. Composition in morphology is reserved for "fossilized" composition which includes a few formal or a-syntactic characteristics. Composition in syntax makes it possible to describe compounds ranging from the quasi-fossilized to the quasi-free and, as we have seen, certain "compounds" can be coded in morphology or syntax, and others may or may not be identified in the syntactic layer. One may consider that the collocation phenomenon corresponds to what could also be called composition in semantics. The lexical units linked by these collocation relations are generally found together, on the surface, in the same syntagm; they select each other and, usually, each one provides its semantic part, allowing for compositionality of meaning.

In most cases, it seems that the pairs of Usems put into play by these relations give greater importance to one of the two units, i.e. a sort of basis of the collocation that selects the unit with which it has a collocation relation (its privileged modifier, its support verb, etc.) by a given lexical function.

These relations are particularly critical for applications of automatic translation and generation; they often make it possible, during the analysis phase, to disambiguate the two entries together.

The following relations can be integrated into this family:

- Relations between a lexical unit belonging to the noun category and another unit from the adjective category, a modifier (an intensifier, for example) whose relation with the noun takes priority.

Usem Prix --> Usem Exorbitant

(preferred by collocation to Usem Exagéré)

Usem Peur --> Usem Panique

Usem Pluie --> Usem Diluvien

- Relations between a lexical unit belonging to the noun category and another unit belonging to the verb category, wherein the noun takes priority as actant:

Usem Fièvre --> Usem Monter

Usem Prix --> Usem Monter

- Relations between a nominal and predicative lexical unit, and the support verb unit with which it is associated for a syntagmatic realization.

The support verb is almost empty, but it is not entirely devoid of semantic value because it generally contains at least the aspect, which can justify its association with a semantic unit.

Usem Attention--> Usem porter (inchoatif)

Usem Attention--> Usem maintenir (duratif)

Usem Attention--> Usem faire (neutre)

 

- Relations between two lexical units belonging to the noun category in which one of them has the specific role of specifying the relationship between the two.

Usem Loup--> Usem Meute

(groupe de)

Usem Pain--> Usem Miette

(petite partie de)

Usem Lycée --> Usem Proviseur

(responsable de)
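From the point of view of generation, a collocation relation lets the base select its collocate for a given function. The sketch below assumes a simple table keyed by (base, function); the names COLLOCATIONS and collocate are hypothetical and the entries are the examples given above.

```python
# (base Usem, function or value) -> collocate selected by the base
COLLOCATIONS = {
    ("Attention", "inchoatif"): "porter",
    ("Attention", "duratif"): "maintenir",
    ("Attention", "neutre"): "faire",
    ("Loup", "groupe de"): "Meute",
    ("Pain", "petite partie de"): "Miette",
    ("Lycée", "responsable de"): "Proviseur",
}


def collocate(base, function):
    """Collocate selected by `base` for `function`, or None if none is coded."""
    return COLLOCATIONS.get((base, function))
```

The table is deliberately oriented from the base to the collocate, which reflects the greater importance given to one of the two units.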

 

5. Predicate

The notion of predicate, which has already been mentioned throughout this document, is essential for the semantic modelling of many lexical units. Although the term Predicate is rather widespread, it is sometimes used with a different sense. We are going to define what we mean by this term, how this notion is represented in the GENELEX model, and how it is articulated with other objects of the model.

5.1. Highlighting the notion of predicate

Let us consider the following utterances:

Le Mexique achète deux usines atomiques à la France pour une somme modique. (Mexico purchases two atomic power stations from France for a modest price.)

L'achat de deux usines atomiques à la France par le Mexique renforce les liens d'amitié entre les deux pays. (Mexico's purchase of two atomic power stations from France reinforces the bonds of friendship between the two countries.)

Les nouveaux achats du Mexique seront mis en service au début du mois prochain. (Mexico's new purchases will be put into service at the beginning of next month.)

L'acheteur des usines a payé comptant. (The purchaser of the stations paid cash.)

Intuitively, we feel that these utterances share something special that we want to highlight: a relationship tied to an actant structure which is realized in various lexical ways within the surface structure. What the terms acheter (to purchase), achat (action) (purchase-action), achat (résultat) (purchase-result), acheteur (purchaser) have in common seems to be a given element of the language which we hope to identify and describe in the notion of predicate, in this case the predicate acheter(acheteur, objet, vendeur, prix).

A lexical predicate is a relationship which describes a situation and which is identified in the described language. It includes a number of participants playing a certain role; they are the actants or semantic arguments associated with the situation described. The lexical predicate has one or more lexicalizations which can be integrated into syntactic contexts that do not systematically name all the participants in the situation; some may remain implicit.

The notion of predicate is highlighted through the study of what the language itself names. It is generalized in the model, whether or not the language identifies the relation which links the various semantic predicate arguments. The model makes a distinction between lexical predicates (identified by the language) and non-lexical predicates.

It should be noted that the notion of lexical predicate is an attempt to distinguish and represent predicates identified by language and that the list of lexical predicates identified in this way is specific to each language, and not predefined a priori. The more abstract (non-lexical) predicates can be defined a priori or as generalizations of lexical predicates. A type attribute will be assigned to the predicates so as to describe their lexical or non-lexical nature. Formally, there is only one difference between a lexical predicate and a non-lexical predicate: at least one Usem directly points to the former via a Predicative Representation.
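The formal criterion just stated can be sketched as follows. The class Predicate and its attributes are hypothetical; in particular, computing the type from the presence of Usems (rather than storing it as an attribute) is a simplification of this sketch, and the non-lexical predicate transfert is an invented example.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    name: str
    arguments: tuple
    # Usems pointing to the predicate via a Predicative Representation
    usems: list = field(default_factory=list)

    @property
    def type(self):
        """Lexical if at least one Usem points directly to the predicate."""
        return "lexical" if self.usems else "non-lexical"


acheter = Predicate(
    "acheter",
    ("acheteur", "objet", "vendeur", "prix"),
    usems=["acheter", "achat (action)", "achat (résultat)", "acheteur"],
)

# Hypothetical, more abstract predicate with no lexicalization.
transfert = Predicate("transfert", ("source", "objet", "destinataire"))
```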

5.2. Description

Therefore, the GENELEX semantic model explicitly identifies the predicates "extracted" from the semantic units for which they provide an important part of the meaning and to which they are related.

A predicate is characterized by its actant structure which is defined as follows:

- A predicate is associated with a number of arguments (or actants); the number of arguments is the number of participants in the situation or the action described by the predicate. It corresponds to a saturated structure which may be only partly filled in the contexts associated with the lexicalizations of the predicate. However, choosing the list of arguments is not always easy, particularly when one must decide whether an element called circumstantial modifier must be considered as an argument or not; each instantiation of the dictionary will define its own criteria.

Each argument plays a specific semantic role in the situation described. This is expressed through the fact that it is assigned one or more semantic roles. These roles are rather similar to the thematic roles used in syntax. However, one may require, at this level, a set of more precise values making it possible to describe more accurately the semantic role played by the argument in the situation described. Consequently, the values of roles will have to be compatible with the values drawn from the syntax. By default, it will be possible to decide to use the same list of values as that assigned to the thematic roles, if they are included in the instance of the dictionary concerned.

The model also makes it possible to structure semantic roles in a hierarchical order, so as to point at a more or less precise level and associate more or less precise roles according to the predicate arguments.

Within the structure of arguments associated with the predicate, each argument carries more or less complex semantic data associated with various statuses, which increases the description power of arguments.

The statuses carried by the semantic data related to the arguments are the following:

- DEFAUT: this is a default datum concerning the argument, used when the argument is not instantiated by the context.

- VERIF: this is a verification datum; the semantic representation of the instantiated argument (if it is instantiated) must check these data.

- ENRICHIT: this is a datum which enriches the representation of the argument, whether it is instantiated or not.

- DEFAUT_VERIF: this datum acts as a verification datum (or as a compatible enriching datum) when the argument is instantiated, and as a default datum when it is not, depending on whether the context realizes the argument or not.

The nature of these semantic data relative to the arguments varies: it can involve valued semantic features, concepts, possibly an instantiated predicate (in the case of a complex argument) and a Usem.

The semantic features used for these constraints are those we have described as being correspondents of an objective datum, as well as the unrestricted user features (miscellaneous).

The Usem is used to provide each argument with a default value which is assigned when the semantic structure extracted from a surface structure does not assign any value to this argument. This default value is usually related to the predicate, and must therefore be described at the predicate level. However, when certain correspondences are established at the syntactic level, the default value is different from the value assigned by the predicate, in which case it is defined by the correspondence.

The predicate is thus defined, at least in part, by this actant structure which represents its semantic "signature".
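The actant structure and its four statuses can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the encoding of a datum as a single valued feature, and the rule that a verification fails only when the context carries a conflicting value are all hypothetical and are not prescribed by the model. The roles given to manger and the choice of DEFAUT_VERIF for its object are likewise assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    """The four statuses listed in the report for argument-level data."""
    DEFAUT = "DEFAUT"              # used when the argument is not instantiated
    VERIF = "VERIF"                # checked against an instantiated argument
    ENRICHIT = "ENRICHIT"          # added whether the argument is instantiated or not
    DEFAUT_VERIF = "DEFAUT_VERIF"  # checked if instantiated, default otherwise


@dataclass
class ArgDatum:
    status: Status
    feature: str  # hypothetical encoding: one valued feature per datum
    value: str


@dataclass
class Argument:
    roles: List[str]
    data: List[ArgDatum] = field(default_factory=list)


@dataclass
class Predicate:
    name: str
    arguments: List[Argument]


def interpret(arg: Argument, observed: Optional[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Build the representation of one argument, or return None if a check fails.

    `observed` holds the features found in context, or None when the
    argument is not instantiated.
    """
    result = dict(observed) if observed is not None else {}
    for datum in arg.data:
        if datum.status is Status.ENRICHIT:
            result.setdefault(datum.feature, datum.value)
        elif observed is None:
            if datum.status in (Status.DEFAUT, Status.DEFAUT_VERIF):
                result.setdefault(datum.feature, datum.value)
        elif datum.status in (Status.VERIF, Status.DEFAUT_VERIF):
            # Assumption: an absent feature is compatible, a different value is not.
            if observed.get(datum.feature, datum.value) != datum.value:
                return None
    return result


# Hypothetical coding of manger: the object is edible by default and by check.
manger = Predicate(
    "manger",
    [
        Argument(["agent"]),
        Argument(["objet"], [ArgDatum(Status.DEFAUT_VERIF, "comestible", "+")]),
    ],
)
```

With this coding, an unexpressed object receives the default feature, while an object explicitly marked as non-edible is rejected and could be flagged as a non-standard use.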

5.3. Consequences on the previously described points of the model

The predicate-related data will be described at the predicate level and consequently be inherited by each semantic unit sharing the predicate. A good part of the data included in the predicative semantic units are carried by the lexical predicate which shares the data with other semantic units. This has an effect on the content to be associated with the semantic units.

5.3.1. At the componential level

The predicate itself may carry certain semantic features, and it comprises a number of semantic data concerning its arguments.

The semantic features carried by the predicate can be "general" features or predicate classifications.

Data (with a verification status) concerning the predicate arguments allow for a filtering associated with the syntactic-semantic correspondence (particularly concerning features) or a detection of non-standard uses of lexical units. In any case, the data relative to the correspondence between syntax and semantics and the data concerning the actant structure of the predicate must be consistent and compatible. The features defined by the correspondence between syntax and semantics are present if they specify or complete the features carried by the predicate argument structure.

5.3.2. At the level of semantic relations

A number of relations can be described at the predicate level, which makes it possible to factorize relations between semantic units that would then only be implicitly related at the semantic level, since these relations can be calculated on the basis of relations between predicates. The more concise expression they allow justifies the use of these relations between predicates. However, they do not exclude relations between Usems, even if this involves the risk of redundancy. Here as well, the lexicographic strategy of a dictionary instance sets the limits of what must be coded on the Usems and on the predicates, and is responsible for managing the problems of redundancy.

Concision can rely on two types of links within the model:

1 - One may consider the link between a semantic unit and its lexical predicate (strictly speaking, this is not what we call a semantic relation; this link is represented in the model by the Predicative Representation object).

Indeed, the way a predicate is integrated into the Usem carries meaning. For each semantic unit pointing to the predicate, the nature of the link between them can be defined. For example, the semantic unit of the verb acheter (to buy) will be associated with the predicate acheter (to purchase) as a main (master) lexicalization, emphasizing the situation and its conditions of realization. The noun acheteur (purchaser) will be a nominal lexicalization integrating the argument 0 of the predicate. A sense of achat (purchase) will be represented by the lexicalization of the predicate integrating the argument 1 (object). Another sense will concern the predicate as the name of the action.

The Predicative Representation specifies whether an argument is concerned in the association of a Predicate with a Usem and, if so, which one, and whether this argument is integrated by the Usem or not. Moreover, the link is named, so the links between different Usems and the same Predicate can be characterized differently.

2 - Semantic relations between lexical predicates. These relations will define the correspondence between the arguments of each related predicate. The relation between them also makes it possible to specify the semantic data relative to the arguments, in addition to the data provided by the predicate concerning its arguments.

Ex: Clouer (to nail) is an example of a predicate for which the number of arguments required for its description is unclear. One may decide that the means of fastening is a predicate argument, or not.

One may consider that the predicate Clouer has three arguments, in which case it is described as Clouer(agent, objet, lieu) (agent, object, location). Its relation with the predicate Fixer (to fasten) will be described as follows:

Clouer(agent, objet, lieu)

<---Specialization/Generalization--->

fixer(agent, objet, lieu, moyen)

The semantic relation between both predicates then specifies the value of argument 3 of Fixer, which is compulsory in Clouer: the means of fastening is a nail.

 

One could also say that the predicate Clouer has four arguments; in that case, it is described as Clouer(agent, objet, lieu, moyen) (agent, object, location, means). Then, its relation with the predicate Fixer will be described as follows:

Clouer(agent, objet, lieu, moyen)

<---Specialization/Generalization--->

fixer(agent, objet, lieu, moyen)

The description of the predicate then forces the value of argument 3 of Clouer to the Usem Clou (or the concept Clou, depending on the choice made), and in that case a simple correspondence is established between both Arg3.
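Both analyses of Clouer can be sketched as a relation object that carries an argument correspondence and, optionally, values fixed on the more general predicate. This is a minimal sketch: the class name, the field names and the sample fillers (Pierre, planche, mur) are hypothetical, and arguments are numbered from 0 as in the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PredicateRelation:
    kind: str
    source: str                  # the more specific predicate
    target: str                  # the more general predicate
    arg_map: Dict[int, int]      # source argument index -> target argument index
    fixed_target_args: Dict[int, str] = field(default_factory=dict)


# Three-argument analysis: the relation itself fixes argument 3 of fixer.
THREE_ARG = PredicateRelation(
    "specialization", "clouer", "fixer", {0: 0, 1: 1, 2: 2}, {3: "clou"}
)

# Four-argument analysis: clouer already carries "clou" on its own argument 3,
# so the relation is a plain one-to-one correspondence.
FOUR_ARG = PredicateRelation(
    "specialization", "clouer", "fixer", {0: 0, 1: 1, 2: 2, 3: 3}
)


def project(relation: PredicateRelation, source_values: Dict[int, str]) -> Dict[int, str]:
    """Re-express the arguments of the specific predicate on the general one."""
    target = {
        relation.arg_map[index]: value
        for index, value in source_values.items()
        if index in relation.arg_map
    }
    for index, value in relation.fixed_target_args.items():
        target.setdefault(index, value)
    return target
```

Under either coding, projecting an instance of clouer onto fixer yields the same four filled arguments; only the place where the value "clou" is recorded differs.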

 

Another example of relation between predicates

Ex: aimer(affecté, objet) (to love; affected, object) <--- synonyme converse large ---> plaire(agent, affecté) (to suit; agent, affected)

Part of the description of the non-paradigmatic semantic relations described above can be reformulated on the basis of the relations between the predicate and the semantic units which comprise it. For example, the relations of semantic derivation can often be expressed as relations between units that share the same predicate and have different relations with it. The relations between semantic units are then deduced from those which link the semantic units to the predicate.

 

Ex: Let us suppose that the predicates vendre (to sell) (Arg0, Arg1, Arg2, Arg3) and acheter (to purchase) (Arg0, Arg1, Arg2, Arg3) shared by the senses of semantic units acheter (to buy), acheteur (purchaser), achat1 (purchase1), achat2 (purchase2), achetable (purchasable) and vendre (to sell), vendeur (salesman), vente (sale), vendable (saleable) have been identified and described.

It could be particularly interesting to establish a relationship between both predicates which also specifies how the arguments correspond with each other.

We will illustrate the various possibilities of the model with the following diagram which shows certain simplifications and certain choices at the coding level, but which is designed to be representative of the model.

 

R1, R2, R3, R4, R5 represent associations between Usems and predicates which specify a certain modality of association

R1 : master, no argument concerned

R2 : no argument concerned

R3 : argument 0(agent) concerned and integrated

R4 : argument 1(object) concerned and not integrated; a quality of what this argument can be

R5 : argument 1(object) concerned and integrated

These links of association (Predicative representations) are explicitly present.

RS1 is a semantic relation between Usems; it links a (verbal) Usem to its prototypical agent and is explicitly present.

The arcs relating Semantic units represent (miscellaneous) semantic relations which can be deduced from the explicitly present data; they are not explicit in the dictionary. These are relations of semantic derivation of units related to the same predicate, on the one hand, and of relations deducible from explicitly present semantic relations which are associated with argument data, on the other.

Note that this represents a certain choice at the level of coding and organization of the semantic layer, but that other options are possible. For instance, one could have decided to code the link between "acheter" (to purchase) and "client" (customer), or between "vendre" (to sell) and "marchand" (tradesman), as default information attached to the agent argument, by pointing the argument to its default value Usem.
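The way implicit derivation relations follow from explicit Predicative Representations can be sketched as below. The assignment of each Usem to a link R1 to R5 is inferred from the descriptions of the links and from the earlier discussion of acheter, acheteur and achat; which of achat1 and achat2 is the object sense, and the uppercase predicate identifiers, are assumptions made for the illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PredicativeRepresentation:
    usem: str
    predicate: str
    link: str                       # named link, here R1..R5
    argument: Optional[int] = None  # the argument concerned, if any
    integrated: bool = False        # whether the Usem integrates that argument


REPRESENTATIONS = [
    PredicativeRepresentation("acheter", "ACHETER", "R1"),
    PredicativeRepresentation("achat2", "ACHETER", "R2"),
    PredicativeRepresentation("acheteur", "ACHETER", "R3", argument=0, integrated=True),
    PredicativeRepresentation("achetable", "ACHETER", "R4", argument=1),
    PredicativeRepresentation("achat1", "ACHETER", "R5", argument=1, integrated=True),
    PredicativeRepresentation("vendre", "VENDRE", "R1"),
    PredicativeRepresentation("vendeur", "VENDRE", "R3", argument=0, integrated=True),
]


def implicit_derivations(
    representations: List[PredicativeRepresentation],
) -> List[Tuple[str, str, str]]:
    """Pair the Usems that share a predicate; each pair is a deducible relation."""
    by_predicate: Dict[str, List[PredicativeRepresentation]] = {}
    for rep in representations:
        by_predicate.setdefault(rep.predicate, []).append(rep)
    pairs = []
    for predicate, members in by_predicate.items():
        for first, second in combinations(members, 2):
            label = f"{first.link}/{second.link} via {predicate}"
            pairs.append((first.usem, second.usem, label))
    return pairs
```

Only the seven associations are stored; the pairwise relations between Usems of the same predicate are computed on demand, which is the factorization described above. Relations across predicates (for example acheteur and vendeur) would additionally require the explicit relation between the two predicates.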

 

5.3.3. At the coding level and definition of consistency criteria

From the viewpoint of the lexicographic methodology, the predicative structure can be sufficiently specific and restrictive so as to either provide indications concerning semantic proximities (according to the proximity of the argument structure) or even to define semantic classes, or on the contrary, to reject certain relations between semantic units or between predicates. It is therefore very useful at the time of the coding and for verifying the consistency of a dictionary instance.

The number of arguments, their own semantic role, and their associated semantic data (semantic features, in particular) make it possible to determine classes of predicative entries, whether one intends to code these data at the level of semantic entries or at the level of predicates. This information can also be used for "intellectually" verifying the consistency within relations.

For example, a relation of synonymy implies that the predicative structures are the same or very similar (since pure synonymy does not exist in the language, certain differences must be accepted), so the discrepancy between predicative structures can provide a "measure" of the degree of synonymy. In the same way, the sense relation between two predicates with similar predicative structures is probably not nil; this applies, in particular, when the number of arguments increases.
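The report does not define how such a "measure" would be computed. One very simple possibility, given here only as an assumption, is to count the argument positions whose roles differ and add the difference in the number of arguments. The role lists are those used for the predicates earlier in this section.

```python
from typing import List, Set


def structure_discrepancy(a: List[Set[str]], b: List[Set[str]]) -> int:
    """Count role mismatches position by position, plus the arity difference.

    Hypothetical measure: 0 means identical predicative structures.
    """
    shared = min(len(a), len(b))
    role_mismatches = sum(1 for i in range(shared) if a[i] != b[i])
    return role_mismatches + abs(len(a) - len(b))


CLOUER = [{"agent"}, {"objet"}, {"lieu"}]
FIXER = [{"agent"}, {"objet"}, {"lieu"}, {"moyen"}]
AIMER = [{"affecté"}, {"objet"}]
PLAIRE = [{"agent"}, {"affecté"}]
```

A purely positional comparison penalizes converse pairs such as aimer and plaire, whose arguments correspond crosswise; a usable consistency check would therefore compare structures through the argument correspondence carried by the relation rather than by position.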

6. Concept

A semantic unit can be associated with a concept which is usually common to several semantic units. The notion of concept must then be regarded as a generalization of the objective or cognitive content of a semantic unit described from a non-predicative point of view.

It may be interesting to highlight the class of equivalence of semantic units described from a non-predicative point of view and which are linked by a relation of synonymy. This may be done by identifying the class with a common concept that would be related to its corresponding semantic units.

A number of valued semantic features which are common to these semantic units may therefore be assigned to the concept. The semantic units belonging to this class inherit or explicitly carry these same valued features, together with other features specific to each unit (for example: level of language and connotation). Consequently, the semantic units of chien (dog), clebs (hound), clébard (mutt), cabot (cur), chien-chien (doggie), toutou (bow-wow) may all point at the same concept that we call chien (dog). Such a concept, identified by the lexicographer on the basis of the semantic units, is drawn from the lexicon and may be considered as a lexical concept since it factorizes what is named by the described language (usually in different ways). The least "marked" semantic unit can be selected as the privileged representative of the concept with which it is associated; it is then the "master" representative, related to the concept through the Conceptual Representative.
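The sharing of features through a concept can be sketched as follows. The feature names and values ("animate", "level") are hypothetical placeholders, and the rule that a unit's own features take precedence over inherited ones is an assumption of this sketch, not a statement of the model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    name: str
    features: Dict[str, str] = field(default_factory=dict)
    master_usem: Optional[str] = None  # least "marked" lexicalization


@dataclass
class Usem:
    name: str
    concept: Optional[Concept] = None
    features: Dict[str, str] = field(default_factory=dict)

    def all_features(self) -> Dict[str, str]:
        """Features inherited from the concept, completed by the unit's own."""
        merged = dict(self.concept.features) if self.concept else {}
        merged.update(self.features)
        return merged


# Hypothetical feature values, for illustration only.
CHIEN = Concept("chien", {"animate": "+"}, master_usem="chien")
chien = Usem("chien", CHIEN)
clebs = Usem("clebs", CHIEN, {"level": "familiar"})
toutou = Usem("toutou", CHIEN, {"level": "childish"})
```

The objective content is written once on the concept, and each semantic unit only records what distinguishes it, such as its level of language.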

In an approach of knowledge representation, it will also be possible to rely on non-lexical concepts (which cannot be expressed in the language other than by a periphrase). For instance, it is possible to rely on this conceptual level to describe a taxonomy comprising lexical gaps. We will come back to this subject when we look at relations between concepts.

Other concepts which are not drawn directly from the lexicon can also be used to provide an increasingly abstract and general representation. They factorize the common elements of a group of concepts with a lower degree of abstraction.

It is therefore virtually possible to have a semantic representation with variable depth and level of detail.

The lexical level of semantic units provides the greatest precision; it makes it possible to account for the meaning of an entry in detail. This level points to more abstract representations. One may also consider that one abstract level is enough and only take the given data into account.

Consequently, different levels of dictionaries can coexist (and communicate through the projection of one level to another) within the GENELEX dictionary. A given application can then easily utilize the application dictionary comprising the information of the level required by its semantic needs.

7. The "conceptual" level of description

Most semantic data described until now are mainly found in a lexical semantic context where descriptive objects emerge from the described language and remain close to it. These descriptive objects do not deviate from the lexicon through generalization and significant abstraction relying on primitives. The descriptive tools we are now going to describe are designed to allow for such abstraction.

Note that the GENELEX model does not impose the use of primitives or their mode of definition: it allows the description of predefined primitives on which semantic units are projected, as well as the emergence of more abstract primitives from the lexical descriptions. In both approaches, concepts and predicates, together with their associated data, will account for a more abstract level of semantic description, possibly oriented towards Artificial Intelligence.

7.1. Predicate

The notion of an abstract concept for the entities described in a non-predicative way is parallel to the notion of a generalized, primitive or non-primitive predicate.

These predicates are described by the same formal object as the lexical predicates; the difference in their statuses results from a type attribute valued differently and from the fact that they are not directly pointed to a Usem.

A first level of these predicates can be identified by the classes of equivalence of lexical predicates linked by a relation of synonymy (which is more or less strict depending on the level of generality required for the primitive predicate).

For example, the lexical predicates acheter (to purchase) and acquérir (to buy) can also point to the same generalized (or even primitive) predicate that we will call achat (purchase), and the lexical predicates vendre (to sell), bazarder (to sell off), liquider (to liquidate), céder (to dispose of) can point to another generalization: the generalized (or primitive) predicate vente (sale).

An additional level of generalization makes it possible to describe predicates by relying on other generalized or primitive predicates with a higher degree of generality. As a result, both predicates identified above could be expressed according to the same primitive predicate, called a transaction.
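The successive levels of generalization can be sketched as a simple upward chain. The predicate names are those of the example; the assumption that each predicate has a single generalization, and the function names, are simplifications introduced here.

```python
from typing import Dict, List, Optional

# Each predicate points to its immediate generalization (single parent assumed).
GENERALIZES_TO: Dict[str, str] = {
    "acheter": "achat",
    "acquérir": "achat",
    "vendre": "vente",
    "bazarder": "vente",
    "liquider": "vente",
    "céder": "vente",
    "achat": "transaction",
    "vente": "transaction",
}


def ancestors(predicate: str) -> List[str]:
    """List the generalizations of a predicate, from nearest to most general."""
    chain = []
    while predicate in GENERALIZES_TO:
        predicate = GENERALIZES_TO[predicate]
        chain.append(predicate)
    return chain


def common_generalization(a: str, b: str) -> Optional[str]:
    """Return the most specific predicate that generalizes both a and b."""
    candidates_a = [a] + ancestors(a)
    for candidate in [b] + ancestors(b):
        if candidate in candidates_a:
            return candidate
    return None
```

Two lexical predicates of the same class meet at their generalized predicate, while a buying predicate and a selling predicate only meet at the primitive transaction.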

The most general predicates also include the greatest number of arguments. In fact, a more specialized predicate comprises partly implicit data (and may therefore absorb or integrate one or more explicit arguments of the general predicate). This must be specified in the description of a predicate by another, more general predicate. Consequently, verbs of movement can be projected on a primitive predicate of movement identifying various arguments: origin, destination, location crossed, means...

For all these reasons, the relations between predicates must make it possible to define the correspondence between argument structures, as is the case at the level of the lexical predicates described earlier.

Primitive predicates can also be carriers of valued semantic features.

7.2. Relations at the conceptual level

A number of the relations introduced as lexical semantic relations can also be applied, in interesting ways, at the level of conceptual objects. This concerns, in particular, the relations that reflect general knowledge and which have an encyclopedic character.

7.2.1. Relation between concepts

Relations describing taxonomies and meronymies (part-whole relations) between non-predicative units may well be expressed by relations between concepts.

This makes it possible to distinguish the level of semantic units that comprises only units drawn from the language from the level of knowledge representation-concepts where concepts without lexicalization can be part of the hierarchies. The problem of lexical gaps can therefore be solved without introducing any inconsistency into the model.

Relations between concepts will be considered as implicitly inherited by their lexicalizations.
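The inheritance of concept relations by lexicalizations, including across a lexical gap, can be sketched as below. The taxonomy is entirely hypothetical: the unlexicalized concept ANIMAL_DOMESTIQUE and the parent concept animal are placeholders invented for the illustration, and only the lexicalizations of chien come from the report.

```python
from typing import Dict, List

# Hypothetical taxonomy between concepts; ANIMAL_DOMESTIQUE has no lexicalization.
IS_A: Dict[str, str] = {
    "chien": "ANIMAL_DOMESTIQUE",
    "ANIMAL_DOMESTIQUE": "animal",
}

LEXICALIZATIONS: Dict[str, List[str]] = {
    "chien": ["chien", "clebs", "toutou"],
    "ANIMAL_DOMESTIQUE": [],  # lexical gap: concept without a semantic unit
    "animal": ["animal"],
}


def inherited_hypernyms(usem: str) -> List[str]:
    """Semantic units reachable by climbing the concept taxonomy from `usem`."""
    result: List[str] = []
    for concept, usems in LEXICALIZATIONS.items():
        if usem not in usems:
            continue
        parent = IS_A.get(concept)
        while parent is not None:
            result.extend(LEXICALIZATIONS.get(parent, []))
            parent = IS_A.get(parent)
    return result
```

The gap concept takes part in the hierarchy without forcing an artificial entry into the lexical level: the climb simply passes through it and collects the lexicalized ancestors.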

7.2.2. Relation between predicates

We have already mentioned the existence of semantic relations between lexical predicates. These relations can be applied to all predicates. However, it is recommended to retain, at the "generalized" and not necessarily lexical level, the role of general knowledge representation.

It is possible to use relations of generalization between predicates to define more general predicates on the basis of specific predicates as one or more generalizations usually lead to primitive predicates.

In this way, hierarchical or taxonomic relations can be defined between the predicates.

Part-whole relations can also link predicates (not necessarily lexical ones). Note that the model allows for a more precise and potentially more elaborate expression of the presupposition, implication or reference to scenarios. However, under minimal use, part-whole relations between predicates can make it possible (in a slightly roundabout way) to code relations of presupposition or implication, as well as sets of predicates which would constitute the core of a scenario.

7.2.3. Relation between predicates and concepts

Here as well, certain relations can be transferred from the lexical level (between semantic units) to the level of the concept, or the lexical or primitive predicate.

More general relations can also relate predicates to concepts.

A predicate can be related to one of its typical circumstants (not described in the predicative structure) through a relation between predicate and concept. This information can be carried by the concept, which can be lexicalized in different ways, rather than by a specific Usem; the link to the concept then "carries" implicit links to the various Usems associated with the concept. As a result, the concepts associated, for example, with a typical instrument, a typical location or a typical means can be associated with the predicates for which they represent typical circumstants.

It will be necessary to adopt an elaborate strategy allowing for the homogeneous coding of the data relative to the typical arguments and/or circumstants of a predicate at the level of an instance of the GENELEX dictionary. Predicates describe their arguments rather precisely, and can associate them with concepts. Moreover, the predicate arguments can be more numerous than the positions named in syntax; their realization in syntax is then described on the basis of semantics. Finally, the lexicographic team has to make choices and define its coding strategy in order to identify the syntactic "circumstants" representing predicate arguments (and described as such) and possibly describe the "real" circumstants through specific semantic relations between predicates and concepts or between Semantic units.

7.3. Contribution of this level of representation

The level of conceptual representation constituted by concepts and non-lexical predicates makes it possible to express knowledge that is more or less general and more or less independent of the language, according to the choices made for a given dictionary at a given time.

The level of representation closely related to the language (semantic unit, lexical predicates) is projected into the conceptual level. The projection is more or less simplistic depending on the degree of abstraction selected for the primitives.

Moreover, a dictionary sometimes includes a level of description which comprises few generalizations; a level of additional primitives may be added later to this existing level.

In the same way, a dictionary can include very few data regarding semantic units and lexical predicates, and these can be used almost exclusively to reach an abstract conceptual level correctly described at a given time. During a later stage, the lexical level can be enriched without bringing into question the description of the conceptual level.

The structuring of data therefore makes it possible to:

- enrich and develop a GENELEX dictionary without necessarily calling into question the existing information.

- virtually obtain a variable depth by considering only the data included in a certain level or which can be projected from another level to the lexical level.

Ex: One may wish to work only at the lexical level (semantic unit and lexical predicates). In this case, attention will be focused exclusively on the data of this level and on the data inherited from the conceptual levels into which they are integrated. This will provide the most precise description of the dictionary.

One may also decide to focus on semantic features called objective features and ignore features of evaluation, connotation or level of language. The description will then be limited but will still remain at the "lexical" level.

On the other hand, one may focus on data formulated in terms of primitives and completely ignore the data belonging to the lexical level, in which case projection is immediate.

The structuring of data into increasingly abstract levels makes it possible to obtain in a rather simple way the expression of variable depth and meet very different needs. Applications of indexing or data base inquiry will probably rely on a conceptual level; the "lexical" level will certainly be very useful to translation or generation applications.

7.4. Development of this level

We have already frequently mentioned the various approaches of lexicographic methodology applicable to the description of the lexicon. These two approaches consist in:

- Highlighting description elements by observing the language, and providing indications at the "conceptual" level after giving a precise description of the lexicon, in order to distinguish generalities and then to describe them within the formal objects provided by the model: predicates, concepts, relations between these objects.

- Relying on predefined elements of description.

The "conceptual" level also makes it possible to apply both approaches. One may wish to identify primitives by observing the objects of a less abstract level of language and generalizations gradually made. There can also be a battery of primitives (concepts, predicates and relations, or even features) which is then represented in the model, and to which the semantic units are projected.

D - CORRESPONDENCE BETWEEN SYNTAX AND SEMANTICS

1. The notion of predicate: a reminder

The notion of predicate is essential in the semantic model; the correspondence between syntax and semantics relies on this object. The following description recalls how it is used.

In a given situation, a predicate expresses the relation which links arguments (for example, the relation of donation). It is partially defined by its argumentary structure which is a kind of "signature". Each predicate argument plays a semantic role within this structure. A predicate may semantically constrain its arguments, i.e. impose feature values on them, determine their belonging to a semantic class...

At the level of the model, predicates are identified as such, and represent objects on their own. A predicate can be shared by several semantic units.

The correspondence between a semantic unit and a syntactic unit is therefore, in terms of predicates, the correspondence between a syntactic construction and a predicative structure, when the semantic unit is described as predicative.

2. Syntactic position/Predicative argument

A syntactic unit is characterized by a Base Description (BD) associated with a SELF and a construction. The Construction comprises positions held by syntagms related to various data. The BD of the syntactic level will correspond, at the semantic level, to data relating to the predicate and the predicative structure associated with the semantic unit, if this is the case. (It usually corresponds to the cases where the unit described has a HEAD function, for example a head verb of the complementation structure described by the Construction).

A correspondence must be established between both structures, and the model must at least make it possible to express, on the predicate arguments, data concerning their surface origin at the syntactic level.

Naturally, this correspondence does not systematically represent a bijection associating a position with a predicate argument at the semantic level, even if this can sometimes be the case (for various verbal constructions, for example).

The relation of correspondence between Usyn and Usem must therefore carry data which make it possible to identify, at the syntactic level, within the base construction characterizing the syntactic unit, the position occupants corresponding to the arguments of the predicative entry.

It must also include data which make it possible to filter, within the family of possible realizations of syntactic structures associated with the BD (Construction + SELF carrying the particularities of the entry for the construction), those with which the semantic unit is to be associated.

In the correspondence with a given semantic unit, the optionality of positions, as described in syntax on the Construction, must sometimes be constrained. It is then necessary to manage a discrepancy on the optional character of certain positions of the Construction by specifying that a given optional position, at the level of the Usyn, becomes compulsory or, on the contrary, prohibited for this sense. Finally, if a position retains its optional character for a given sense, it must be possible to describe the semantic data assigned by default to an argument when the optional position is not occupied. (This is a phenomenon of internal complement, or of absolute use of transitive verbs, where the syntactic context can force the semantics of the argument which is not expressed on the surface).

Ex : Pierre nage (la brasse) (Pierre swims - breast-stroke -)

nager(arg0, arg1[defaut = nage])

Pierre mange (une pomme) (Pierre eats - an apple -)

manger(arg0, arg1[defaut : [comestible : +]])
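The two examples can be sketched as a correspondence object that maps positions to arguments and supplies the default when an optional position is left empty. The class and field names are hypothetical, and the position labels P0 and P1 are assumed to be those of the Construction.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Correspondence:
    predicate: str
    position_to_arg: Dict[str, int]  # e.g. "P0" -> argument 0
    defaults: Dict[int, Any] = field(default_factory=dict)

    def arguments(self, occupied: Dict[str, str]) -> Dict[int, Any]:
        """Fill arguments from occupied positions, falling back on defaults."""
        args: Dict[int, Any] = dict(self.defaults)
        for position, filler in occupied.items():
            args[self.position_to_arg[position]] = filler
        return args


# Default given as a Usem for nager, as a valued feature for manger.
NAGER = Correspondence("nager", {"P0": 0, "P1": 1}, {1: "nage"})
MANGER = Correspondence("manger", {"P0": 0, "P1": 1}, {1: {"comestible": "+"}})
```

For "Pierre nage", argument 1 receives the default Usem nage; for "Pierre nage la brasse", the surface occupant replaces the default.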

The correspondence between Usyn and Usem can be represented in a simplified way, as follows:

3. Constraints on the base description

BDs are associated with a construction that comprises several surface realizations which, through optionalities and alternatives concerning position occupants, do not only differ at the level of their lexicalization. It must be possible to associate a syntactic unit (characterized, among other things, by its BD) with a semantic unit through the filtering of all the possible realizations of the construction represented by the BD, and consequently, of the position instantiations of the Construction and of the SELF realizations.

3. 1 Filtering scope

These restrictions through filtering concern

-- An element of the position distribution concerning a given syntagmatic label position;

For example, one may want to select the NS occupant during the association of a semantic unit with a syntactic unit characterized by a BD which comprises a position likely to be occupied by an NS, a P [mode : inf] or a P[sscat : complétive][Conj : que]. This filtering is made by inhibiting the various elements of distribution which are not allowed in the correspondence, i.e. they are incompatible with the meaning attributed by the pointed Usem.

-- Certain features intended to be added to distribution elements which are not inhibited but simply more constrained than they are in the description.

For example, one may want to restrict a position occupant P[Conj : que][sscat : complétive], whose mode is not specified in the BD, to the subjunctive mode: an additional feature is then "added" by the correspondence between syntax and semantics.

-- The optionality of a position:

For example, one may want to make an optional position of the BD compulsory in order to associate a semantic unit with the syntactic unit described by the BD.

Ex:

manger (to eat)

CatGram : V

db : Self cb

cb : P0 IntervConst (P1)

P0 : SN

Self : IntervConst : V

P1 : SN

Let us suppose that a unique description is assigned to manger (to eat), without any denotational constraint, which covers the usual sense and the metaphoric sense: "manger les virgules ou manger les mots" (to suppress commas or to swallow words).

It is intended to impose, at the level of the correspondence with the metaphoric sense, the surface realization of position P1, i.e. to make compulsory, for a given sense, the position described as optional in the description.

-- The semantic interpretation of a position realization:

Ex:

croire (to believe)

CatGram : V

db : Self cb

cb : P0 IntervConst P1

P0 : SN

Self : IntervConst : V

P1 : SN

P[Conj : que][sscat : complétive]

P[mode : infinitif]

may be associated with a Semantic unit for the following sense:

croire quelqu'un (to believe someone), predicate with two arguments

with a filtering of the syntagm (restriction on the distribution of the position) and of the semantic feature described as follows :

P1 : SN[humain : +]

and with another one for the following sense:

considérer comme vrai (to consider to be true), predicate with two arguments

with the filtering: (of the distribution and semantic interpretation associated with the position)

P1 : SN[humain : -] [abstrait : +]

P[Conj : que][sscat : complétive]

and probably other senses filtering other position instantiations.
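The optionality constraint on manger and the distribution filtering on croire can be sketched together. The labels "compulsory" and "prohibited", the function names and the strict rule that a required feature must be present with the stated value are assumptions of this sketch; the positions, syntagms and feature values are those of the two examples.

```python
from typing import Dict, List, Set

# Optionality in the BD of manger: cb : P0 IntervConst (P1)
BASE_OPTIONAL: Dict[str, bool] = {"P0": False, "P1": True}


def realization_allowed(realized: Set[str], overrides: Dict[str, str]) -> bool:
    """Check realized positions against the optionality overrides of one sense."""
    for position, optional in BASE_OPTIONAL.items():
        mode = overrides.get(position, "optional" if optional else "compulsory")
        if mode == "compulsory" and position not in realized:
            return False
        if mode == "prohibited" and position in realized:
            return False
    return True


QUE_CLAUSE = "P[Conj : que][sscat : complétive]"

# For each sense of croire: the P1 occupants kept, with the features they must carry.
CROIRE_FILTERS: Dict[str, Dict[str, Dict[str, str]]] = {
    "croire quelqu'un": {"SN": {"humain": "+"}},
    "considérer comme vrai": {
        "SN": {"humain": "-", "abstrait": "+"},
        QUE_CLAUSE: {},
    },
}


def matching_senses(syntagm: str, features: Dict[str, str]) -> List[str]:
    """Return the senses whose filter accepts this occupant of position P1."""
    matches = []
    for sense, allowed in CROIRE_FILTERS.items():
        required = allowed.get(syntagm)
        if required is None:
            continue  # this element of the distribution is inhibited for the sense
        if all(features.get(name) == value for name, value in required.items()):
            matches.append(sense)
    return matches
```

The metaphoric sense of manger would carry the override {"P1": "compulsory"}, so a realization with only P0 is rejected for that sense while remaining valid for the usual one. The infinitive occupant of croire matches neither of the two senses listed, which leaves it available for the other senses mentioned.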

Remark 1:

As we have noted before, the choice of coding in syntax will have consequences on the correspondence between syntax and semantics.

High factorization at the level of the syntactic units will require a large volume of filtering data during the shift to semantics; conversely, few filtering data are required in the case of numerous splittings at the syntactic level, especially when precise data concerning thematic roles and denotative constraints are used.

One should remember that the correspondence between syntax and semantics relies on the base description of the Usyn concerned by the correspondence. Actually, according to the choices made at the syntactic level, the syntactic behavior of a Usyn must be mainly described on the basis of the descriptive content of its base description (BD), but also according to the descriptions associated with the Usyn and derived from the BD or from one of the transformed descriptions. They are considered as the different syntactic expressions of a single "deep" syntactic behavior, and the sense modifications associated with the various syntactic contexts represented by the various Descriptions seem to be limited, with regard to the sense of the unit described, and seem to concern a different thematization. Syntactic transformations are therefore exclusively integrated into the data of the syntactic layer, and the correspondence between syntax and semantics does not "see" them. This means that it is not possible to filter a transformed description during correspondence, and that it is also impossible to inhibit certain transformations from the semantics. As a result, these choices provide a criterion for the splitting into different Usyns: if an intervention, for certain associated senses, is required at the level of transformations (to limit or inhibit them), it certainly implies that there are two underlying syntactic behaviors, and that it is therefore recommended to identify two Usyns.

As a matter of interest, let us remember that two Usyns can also be linked by a relation of syntactic transformation. If they are drawn from the same MU, they may be associated with a single Usem.

3. 2 Filtering through features

3. 2. 1 Filtering through syntactic features

The expression of construction filtering relies on the same descriptive objects and the same language as the one applied to describe position occupants, i.e. syntagms, together with a set of restrictive features as for the syntactic level (see Report on the syntactic layer). The correspondence relies on the objects of the base description and provides additional data which act as a filter: inhibition or compulsory presence of an optional position within the construction, inhibition of the syntagm of a position distribution, or addition of features, syntactic or semantic features projected on the semantic interpretation of the syntagm.

A certain language is provided for filtering position occupants; it is based on the descriptive elements of the syntactic level, especially the features. Filtering can also be made on semantic features which are not necessarily present at the syntactic level (by means of free features associated with semantic features), and, depending on whether they are present at the syntactic level or not, filtering will have a slightly different meaning.

The semantic features used for filtering must be regarded as an addition of constraints on the base description, and consequently the definition of a sub-set of realizations in relation with the realizations included in the base description.

3. 2. 2 Filtering through semantic features

Valued semantic features can also be used for filtering. It is necessary to define the role played by these valued semantic features since several mechanisms can be applied in an instantiated syntactic framework and each of them provides interesting possibilities of expression.

3. 2. 2. 1 Filtering through verification-attestation of the presence of information

The use of semantic features during the correspondence process consists in requiring the explicit presence of the valued feature in the semantic representation associated with the syntagm described; in this case, filtering through valued features implies a verification of the presence of the feature and of the expected value.

The absence of the expected feature, like the presence of the (monovalued) semantic feature with an incompatible value (the hierarchy of features makes it possible to structure certain values and to calculate compatibilities), means that the semantic unit pointed to by the correspondence is not among the senses to be associated with the syntactic structure when its positions are instantiated in this unsatisfactory way. This sense is therefore incompatible with a semantic interpretation of certain instantiations of the syntactic context described by the BD. The valued semantic feature provides useful information, during the analysis, that is applicable to disambiguation.

In this interpretation, the filtering constraint does not enrich the semantic representation of the argument; it merely plays a role of verification. If semantic features are already used at the level of the syntax, filtering must be compatible with the requirements of the syntactic unit. If this is not the case, filtering will concern, a posteriori, the semantic representation of the occupant of the filtered position. These features can be identified as "constrained features".

These features will be expressed in the model in the form of AjouteTraitSem with the status: FILTRE (see the User's manual).
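The compatibility calculation over a hierarchy of values, mentioned above, can be sketched as follows. This is only an illustration: the model does not impose a list of values, so the hierarchy below (PARENT) and the helper name is_compatible are assumptions made for the example.

```python
# Hypothetical value hierarchy (child -> parent); not imposed by the model.
PARENT = {
    "humain": "animé",
    "animal": "animé",
    "animé": "concret",
    "liquide": "concret",
}

def is_compatible(actual, expected):
    """True if `actual` equals `expected` or is one of its descendants."""
    while actual is not None:
        if actual == expected:
            return True
        actual = PARENT.get(actual)  # climb one level; None at the root
    return False

print(is_compatible("humain", "animé"))
print(is_compatible("liquide", "animé"))
```

Under this reading, a more specific value satisfies a filter on a more general one, but not the reverse.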

3. 2. 2. 2 Filtering through verification-enrichment of information

Another possible use consists in checking the compatibility of the filtering information with the semantic representation related to the syntagm described. This means that an existing and differently valued feature should make it possible to exclude the sense associated with the pointed Usem, but that the absence of a feature allows the correspondence to this Usem, and that this feature is consequently added to the semantic representation of the argument associated with the position constrained by the feature.

The constraint expressed then plays a double role: it ensures the verification of compatibility and the enrichment of the semantic representation. The correspondence between Usyn and Usem forces, "if necessary", the semantic representation of arguments. In this case, they are described as "projected features".

These features will be expressed in the model in the form of an AjouteTraitSem with the following status: FILTRE_AJOUTE (see the User's manual).

3. 2. 2. 3 Filtering through compulsory enrichment of information

A third possibility consists in "forcing" the semantic feature, regardless of its compatibility with the semantic representation of the syntagm. In this case, there is no filtering but rather a forced enrichment. As an example, let us take the verb boire (to drink) (usual sense) and project on the subject position a semantic feature [animé = +]; this forces the semantic interpretation of the position to carry the feature, which would then correspond, for instance, to the interpretation of the following sentence: Ma voiture boit de l'essence (my car is heavy on petrol).

These features of semantic filtering applied to the syntax will be expressed in the model in the form of an AjouteTraitSem with the status: FORCE (see the User's manual).
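The three statuses can be contrasted in a short sketch. The function apply_feature and the dictionary representation of a syntagm's semantic features are assumptions made for the illustration; compatibility is reduced to plain equality, and the hierarchy of values is not modelled here.

```python
from enum import Enum

class Status(Enum):
    FILTRE = "FILTRE"
    FILTRE_AJOUTE = "FILTRE_AJOUTE"
    FORCE = "FORCE"

def apply_feature(representation, feature, value, status):
    """Return (accepted, new_representation) for one valued semantic feature.

    `representation` maps feature names to values for the syntagm that
    occupies the constrained position (hypothetical encoding).
    """
    current = representation.get(feature)
    if status is Status.FORCE:
        # No filtering: the feature is imposed whatever was there before.
        return True, {**representation, feature: value}
    if current is None:
        if status is Status.FILTRE:
            # Verification-attestation: an absent feature rejects the sense.
            return False, representation
        # Verification-enrichment: an absent feature is added.
        return True, {**representation, feature: value}
    # Feature present: both filtering statuses require a compatible value.
    return current == value, representation

car = {"animé": "-"}  # e.g. the subject of "Ma voiture boit de l'essence"
for status in Status:
    print(status.value, apply_feature(car, "animé", "+", status))
```

With a subject valued [animé = -], the first two statuses reject the sense, whereas FORCE accepts it and overwrites the value.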

Remark: Note that the semantic features used in the correspondence between syntax and semantics make it possible to filter certain semantic interpretations of the syntactic context and to enrich a semantic representation of the context, independently of the fact that the unit subjected to the correspondence is described as predicative or not, and independently of the fact that positions correspond or not to semantic arguments. This illustrates the importance of having a precise description of the syntactic context and of a fine filtering during the correspondence, independently of the predicative representations that can be associated with the Usems.

4. Realization of arguments

4. 1. Correspondence between argument and position

In many cases, the correspondence between the Syntactic position and the Semantic argument can be established in a simple way: the SELF of the BD characterizing the Usyn corresponds to the element described (noun, verb, adjective...) which occupies the HEAD function, and its essential complements are described in the associated Construction. In this case, defining the realization of arguments usually consists in assigning a position to each of them. The Semantic unit which represents the sense associated with SELF points to the predicate according to a certain modality that may constrain the instantiation of the associated structure of arguments.

However, note that the correspondence between argument and position only concerns the units described as predicative, and that not everything present at the semantic level needs to be described at the syntactic level (e.g. floating arguments). Likewise, not all syntactic positions are required to point to arguments. The general grammar will have to build the global interpretation according to its choices, and will have to define the role played by the interpretation of a position which is not associated with any argument.

Ex :

croire (to believe)

CatGram : V

db : Self cb

cb : P0 IntervConst P1

P0 : SN

Self : IntervConst : V

P1 : SN

PRO

P[Conj : que]

[sscat : complétive]

P[mode : infinitif]

will be associated with several predicative Usems, by means of various filterings, but correspondences between argument and position may be similar:

 

 

Remark: In this case, Contr_d represents all the data which constrain the Syntactic description (ContraintDescription) in the model.

4. 2. Floating argument

A predicative semantic unit will be associated with a lexical predicate comprising a set of arguments. This set of arguments, duly characterized in terms of features, semantic roles and sometimes default values, is one of the characteristics of the lexical predicate: it represents its "signature" and corresponds to the maximal saturated structure of the predicate.

Certain arguments of the semantic predicate are not included in the surface of the associated reference syntactic structure since they are "adjunct" or "circumstant" complements that are not necessarily described as essential complements at the level of the Usyn. However, it is possible to describe their syntactic realization. They are described as floating arguments, i.e., integrated in semantics: their realization in syntax will be defined on the basis of the semantics.

Consequently, a floating argument points to a syntactic position that is fully described (function, thematic roles, distribution...), as well as to a CheminSyntagme which specifies the level into which the "floating" position is integrated. The absence of a CheminSyntagme only means that the floating position is integrated into the highest level (of the BC, BS, or the Usyn according to the value of the attribute "carried"). In any case, the "floating" position is not specified within the list of positions. In fact, a position associated with a floating argument often belongs to adjuncts or circumstants, and the canonical order of these elements is not meant to be specified. The general syntax (outside the dictionary) will be used to describe the various insertion points of these "floating positions".

Several semantic units may share the same lexical predicate. These semantic units are extracted from syntactic units characterized by different BDs which do not necessarily include the same number of positions and the same optionalities as those accepted in the correspondence between Usyn and Usem.

4.3 Default values

The surface of the described construction (even including adjunct complements) may not express one or more of the predicate arguments that are intended to be associated with it at the semantic level. This situation may also occur when a position is optional (the filtering constraints do not require the presence of this position for the correspondence concerned); in that case, it is necessary to specify what becomes of the corresponding argument.

An argument which is "absent" on the surface can have a default value or valued features; in most cases, these default or implicit data do not depend on the characteristic construction of the syntactic unit but on the predicate. The default values carried by the predicate are therefore generally sufficient.

Remark:

They may depend on the corresponding pair and may be more precise than the default values related to the predicate. The default information is then carried by the correspondence between Usyn and Usem.

Ex:

Max fume (Max smokes)

Max fume la pipe (Max smokes a pipe)

Max fume le jambon (Max smokes ham)

Max fume le champ de son voisin (Max manures the field of his neighbor)

fumer the Usyn of this BV is characterized by

db : P0 IntervConst (P1)

P0 : SN

IntervConst : V

P1 : SN

(for this coding, the denotative constraints have not been used and this BD accounts for the four preceding contexts).

It is intended to associate this Usyn with several Usems; each one is related to a predicate:

fumer_tabac(arg0, arg1) (tobacco)

fumer_aliment(arg0, arg1) (food)

fumer_terrain(arg0, arg1) (field)

Only one sense supports the optionality of P1. The default value does not differ from the value extracted from the "signature" of the predicate, and is therefore not comprised in the correspondence.

 

 

Remark: One refers to the semantic class feature, as in the syntax report. The list of possible values for this feature is not imposed by the model.
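The selection among the three Usems of fumer can be sketched as follows. The filters are hypothetical: the class values "tabac", "aliment" and "terrain" are placeholders (the model does not impose the list of values), and the sketch assumes that the tobacco sense is the one that tolerates the omission of P1, which the text does not state explicitly.

```python
# Hypothetical correspondence data for the Usyn of fumer.
CORRESPONDENCES = [
    {"usem": "fumer_tabac", "p1_required": False, "p1_class": "tabac"},
    {"usem": "fumer_aliment", "p1_required": True, "p1_class": "aliment"},
    {"usem": "fumer_terrain", "p1_required": True, "p1_class": "terrain"},
]

def candidate_usems(p1_class=None):
    """Return the Usems compatible with the observed P1 (None: P1 absent)."""
    selected = []
    for corr in CORRESPONDENCES:
        if p1_class is None:
            # P1 is not realized: keep only senses that accept its omission.
            if not corr["p1_required"]:
                selected.append(corr["usem"])
        elif p1_class == corr["p1_class"]:
            selected.append(corr["usem"])
    return selected

print(candidate_usems())           # Max fume
print(candidate_usems("aliment"))  # Max fume le jambon
```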

Note that valued semantic features can be carried either by AjouteTraitSem or by SelectEtPréciseArg. The two do not have exactly the same status: FILTRE, FILTRE_AJOUTE or FORCE can be specified only on AjouteTraitSem objects, which are more oriented towards the semantic "filtering-enrichment" of the syntactic context associated with the Usyn. Default values will only be specified by SelectEtPréciseArg, which can carry information concerning valued semantic features, as well as a concept, a predicate or even a Usem.

Ex:

Le chat boit du lait (the cat drinks some milk)

Max boit (Max drinks) (ambiguous)

Max boit tout son salaire (Max spends his entire salary on drinking)

boire (to drink) (Verb) has a Usyn with the following characteristics

cb : P0 IntervConst (P1)

P0 : SN

IntervConst : V

P1 : SN

(for this coding, the denotative constraints have not been used and this BD accounts for the three contexts of occurrence)

It is intended to associate this Usyn with three semantic units; two of them share the same predicate:

boire1(Arg0 : [animé : +], Arg1 : [liquide : +]).

The correspondence specifies larger restrictions on arguments for one of the correspondences, with a default value on the optional position P1.

The sense to spend (one's money) on drinking introduces a non-trivial correspondence between argument and position and leads us to broaden the language applied to argument calculation. When one argument, at the highest level, introduces a predicate (in this case dépenser_1(Arg0 : [animé : +], Arg1, Arg2)), it is possible to specify, with a SelectEtPréciseArg, the predicate "contained" by the correspondence, which forces a predicate argument at the highest level. For example, one can provide, as the Arg2 of the associated predicate, a predicate whose arguments are also specified, at the level of their realization, by a correspondence between argument and position. (See below the point dealing more specifically with complex cases.)

 

Remark: Arg2.Arg0<=P0 means that the first argument of argument 2 in dépenser_1 (predicate associated with the semantic unit at the first level) comprises the elements associated with P0 in syntax. The second argument has no correspondent on the surface. By default, it inherits the feature [alcoolisé : +], through the path: Arg2.Arg1.
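The path notation can be illustrated with a small sketch that fills a nested argument structure. Only the link Arg2.Arg0<=P0 and the default carried by Arg2.Arg1 come from the remark; the links Arg0<=P0 and Arg1<=P1, the dictionary encoding, and the helper names set_path and instantiate are assumptions made for the example.

```python
def set_path(structure, path, value):
    """Store `value` at a dotted argument path such as "Arg2.Arg0"."""
    node = structure
    *parents, last = path.split(".")
    for name in parents:
        node = node.setdefault(name, {})  # create the embedded predicate level
    node[last] = value
    return structure

# Hypothetical encoding of the correspondence for "Max boit tout son salaire".
links = {"Arg0": "P0", "Arg1": "P1", "Arg2.Arg0": "P0"}
defaults = {"Arg2.Arg1": {"alcoolisé": "+"}}

def instantiate(links, defaults, occupants):
    """Compute the arguments from position occupants, then add defaults."""
    args = {}
    for path, position in links.items():
        set_path(args, path, occupants[position])
    for path, features in defaults.items():
        set_path(args, path, features)
    return args

result = instantiate(links, defaults, {"P0": "Max", "P1": "tout son salaire"})
print(result)
```

In the result, the occupant of P0 appears both as Arg0 of the first-level predicate and as Arg0 of the embedded predicate, while Arg2.Arg1 carries only the default feature.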

5. Optionality management

A Usyn describes a family of syntactic constructions, and may comprise Positions with optional realizations. When correspondence is established with the semantic level, it must be possible to specify that such and such sense prohibits or makes compulsory such and such optional Position within the Construction.

If this occurs, the correspondence will specify the default values of the positions whose optionality is maintained in the shift to semantics, i.e. the value of the argument drawn from an optional position when that position is not realized. This is needed if the default value related to this construction differs from the default value specific to the predicate, or if the predicate carries no default value. These default values will be expressed by a SelectEtPréciseArg; they can consist of valued semantic features, a concept, a predicate or a Usem.
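The order of precedence between the two sources of default values can be sketched as follows. The function name argument_value and the feature values used in the calls are assumptions made for the illustration.

```python
def argument_value(occupant, correspondence_default=None, predicate_default=None):
    """Pick the value of an argument drawn from an optional position."""
    if occupant is not None:
        return occupant                # the position is realized
    if correspondence_default is not None:
        return correspondence_default  # default tied to the Usyn-Usem pair
    return predicate_default           # default from the predicate signature

# Hypothetical values: the pair-specific default is more precise.
print(argument_value(None, {"alcoolisé": "+"}, {"liquide": "+"}))
print(argument_value(None, None, {"liquide": "+"}))
print(argument_value("du lait", {"alcoolisé": "+"}, {"liquide": "+"}))
```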

 

Ex : capable AJ (able)

Cet individu est capable de mentir pour arriver à ses fins (This individual is capable of lying to achieve his aims)

de mensonge pour que tu viennes

(so that you'll come)

pour une rŽcompense

(for a reward)

Cet individu est capable de voler (This individual is able to fly)

Cet enfant est capable de comprendre (This child is able to understand)

db : P0 P1 P2

P0 : SN[animé +]

P1 : V[Lex : être]

P2 : SADJ : P0 IntervConst P1 (P2)

P0 : SADV

IntervConst : ADJ[Lex : SELF]

P1 : SP[Prep : de]

P[mode : infinitif]

[Prep : de]

P2 : P[mode : infinitif]

[Prep : pour]

SP[Prep : pour]

P[mode : subjonctif]

[Conj : pour que]

 

If one intends to associate it with two semantic units corresponding to the predicates: capable_ose (Arg0, Arg1, Arg2) and capable_apte(Arg0, Arg1) respectively associated with three or two arguments, then we would have:

 

 

Remark:

The description of positions in a BD (tree structure) implies the possibility to describe the paths through these tree structures in order to associate the position of any level with an argument. That is the meaning of P1.S1.P2: position P2 of the list of positions associated with the first syntagm of position P1. The representation here is a classical one.

The formal model will rely on the descriptive elements of the syntax, and in particular, on the CheminPosition and CheminSyntagme of this layer.
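A path such as P2.S1.P2 can be resolved over a tree of positions and syntagms as in the sketch below. The classes Syntagm and Position and the function resolve are hypothetical stand-ins, not the CheminPosition and CheminSyntagme objects of the model; the tree reproduces the BD given above for capable.

```python
from dataclasses import dataclass, field

@dataclass
class Syntagm:
    label: str
    positions: dict = field(default_factory=dict)  # position name -> Position

@dataclass
class Position:
    syntagms: list = field(default_factory=list)   # the distribution
    optional: bool = False

def resolve(positions, path):
    """Follow a path such as "P2.S1.P2"; syntagm indices are 1-based."""
    steps = path.split(".")
    position = positions[steps[0]]
    # After the first position, steps alternate: syntagm index, position name.
    for syntagm_step, position_step in zip(steps[1::2], steps[2::2]):
        syntagm = position.syntagms[int(syntagm_step[1:]) - 1]
        position = syntagm.positions[position_step]
    return position

sadj = Syntagm("SADJ", {
    "P0": Position([Syntagm("SADV")]),
    "P1": Position([Syntagm("SP[Prep : de]"), Syntagm("P[mode : infinitif]")]),
    "P2": Position([Syntagm("P[mode : infinitif]"), Syntagm("SP[Prep : pour]"),
                    Syntagm("P[mode : subjonctif]")], optional=True),
})
bd = {"P0": Position([Syntagm("SN")]), "P1": Position([Syntagm("V")]),
      "P2": Position([sadj])}

target = resolve(bd, "P2.S1.P2")
print(target.optional, [s.label for s in target.syntagms])
```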

6. A few more complex cases

In some cases, one wishes to express even more complex correspondences in terms of argument structure.

6.1. Semantic unit including, for one of its arguments, a predicate and one (or more) of its arguments.

Let us consider the adjective meilleur (better) (comparative).

Neither the morphological model nor the syntactic model establishes a relation with bon (good). It is therefore desirable to demonstrate that the semantic unit meilleur (better), at the semantic level, comprises the sense plus (more) and the sense bon (good) at the same time.

The predicate of superiority should be referred to:

plus(Arg0, Arg1)

(where both arguments can naturally be structured in the form of predicate-arguments)

in order to obtain homogeneous representations of comparative structures, whether they are regular or not. It is also interesting to demonstrate that both arguments have the predicate bon (good) (in fact, the number of Usems associated with the comparative "better" must be equal to the number of senses of "bon" supporting the comparative).

It must be possible to express this in the correspondence between Usyn and Usem, if one wants to describe meilleur, at the predicative level, as plus(bon(Arg0),bon(Arg1)). It is intended to "provide" the semantic unit with the data required for the calculations of arguments on plus and, as the argument of plus, on bon as well. The Usem meilleur is therefore directly related to the predicate plus and indirectly related to the predicate bon. This correspondence is established at the semantic level, not at the morphological or syntactic level.

In this perspective, it must be possible to point the semantic unit at the predicate plus by forcing the arguments of plus to represent a predicative structure (a predicate specified by the correspondence and possibly by its arguments).

Ex: meilleur (Um of the category adjective) has a Usyn (corresponding to the comparative behavior of bon with one argument) that presents the following characteristics

cb : P0 P1 P2

P0 : SN

P1 : V[souscat : copule]

P2 : SADJ : (P0) IntervConst P1 P2

P0 : SADV

IntervConst : Adj

[sscat : comparatif]

P1 : Conj[Lex : que]

P2 : SN

Let us suppose that bon (bon_1 and bon_2) has two senses related to two unit predicates; this Usyn would be represented as follows:

The calculation specifies the predicate arguments associated with the semantic unit (at the first level), in this case the binary predicate plus. This predicate includes two predicative structures that represent its arguments, which are also specified in the calculation. In this case, Arg0<=bon_1(P0) means that the argument Arg0 of bon_1 corresponds to P0. The calculation of this argument is then also implicitly described.
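The resulting structure plus(bon(Arg0),bon(Arg1)) can be sketched with nested tuples. The link of the second argument to P2.P2, the Usem names meilleur_1 and meilleur_2, and the helper names are assumptions made for the example; only Arg0<=bon_1(P0) is given explicitly in the text.

```python
def build_comparative(base_predicate, occupants):
    """Build plus(base(P0), base(P2.P2)) as nested tuples."""
    return ("plus",
            (base_predicate, occupants["P0"]),
            (base_predicate, occupants["P2.P2"]))

def usems_for_meilleur(base_predicates, occupants):
    """One Usem per sense of bon that supports the comparative."""
    return {f"meilleur_{i}": build_comparative(pred, occupants)
            for i, pred in enumerate(base_predicates, start=1)}

structures = usems_for_meilleur(["bon_1", "bon_2"], {"P0": "Luc", "P2.P2": "Max"})
print(structures["meilleur_1"])
```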

 

Remark:

The data concerning filtering through P0 and P2.P2 features depend on the lexical predicates bon_1 and bon_2 and on the coding of the correspondences making it possible to reach these predicates. They must therefore be coded in a homogeneous way. We will not provide any specification here.

We do not intend to describe here the syntactic behavior corresponding to the superlative meilleur which would naturally correspond to one or more other Usyns associated with Usems.

The same principle (other Usyn-Usem pairs) will be applied to describe the structures associated with:

Luc est meilleur en maths que Max en français (Luc is better at mathematics than Max is at French)

Luc est meilleur en maths qu'en français (Luc is better at mathematics than at French)

Il est meilleur en maths que Max (He is better at mathematics than Max)

which are related to a predicate bon_3 with two arguments. With the set of co-references associated with the position occupants carrying the label e (not expressed on the surface), it will be possible to describe these three contexts on the same Usyn.

Ex: meilleur AJ

db : P0 P1 P2

P0 : SN [coref : i]

P1 : V[souscat : copule]

P2 : SADJ : P0 IntervConst P1 P2 P3 P4

P0 : SADV[souscat : degré]

IntervConst : Adj [souscat : comparatif]

P1 : SP [prep : en]

[coref : j]

P2 : Conj[Lex : que]

P3 : SN

e [coref : i]

P4 : SP [prep : en]

e [coref : j]

If P3 = e[coref : i],

then ! P4 = e[coref : j]

 

6.2. The case of Compound syntactic units

In theory, the association of a compound Usyn with a Usem does not pose any specific problem. In fact, the correspondence is established in the same way as for simple Usyns by specifying the filtering, the origin of the arguments, the default values... As with simple Usyns, the Usem must generally describe the sense of the BD self. However, there is a significant difference which affects the correspondence between syntax and semantics: these syntactic compounds do not represent one form, but a set of possible realizations. Within this set, certain variations are merely variations of form with no repercussion on the sense, while other variations are fossilized or semi-fossilized and have a repercussion on the sense. For example, all the forms associated with a syntactic compound Fil-de-fer-barbelé (barbed wire) {fil de fer barbelé, fil barbelé, barbelé} correspond to the same sense, whereas the forms {en connaissance de cause, en toute connaissance de cause, en parfaite connaissance de cause} (with full knowledge of the facts) associated with another compound Usyn correspond to the same core of sense which is modified by "toute" or "parfaite". The "modifier" position, although it is partially fossilized at the lexical level by a finite list of lexicalizations, then acts as a "free" intensifier from the semantic point of view. Other similar problems may arise, and will be illustrated in the following cases.

6.2.1. Internal/external part of a syntactic compound

The association of a compound Usyn with a predicative Usem can be used as a guide to distinguish the internal part from the external part as follows: the predicate corresponds to the internal part of the compound which has no semantic compositionality, and the arguments must be associated with the positions of the external BD. For the compound X met Y en œuvre pour Z (X implements Y for Z), mettre and en œuvre will represent the internal part of the compound, and will be associated with the ternary predicate mettre_en_œuvre(Arg0, Arg1, Arg2). Consequently, when there is no sense compositionality, the criterion is simple.

But usually, sense compositionality is not entirely absent. In this case, the lexicographer must decide whether he/she wants to emphasize compositionality by associating the compound with a predicative Usem which includes arguments drawn from what is identified as the internal part of the compound.

Remark:

The phenomenon of sense compositionality is similar to the fossilization of complex expressions: it can be studied in terms of continuum, i.e., exclusively compositional, partly compositional, non-compositional. However, it is not simplistic, at the semantic level, to associate a single Usem to the whole internal structure, although this presents a certain character of compositionality: the semantic relations between the semantic units will reflect sense proximities related to the compositionality of the internal structure.

Ex: the following entries can be considered as a single syntactic compound or as 4 different syntactic compounds :

filtre à air (air filter)

filtre à eau (water filter)

filtre à huile (oil filter)

filtre à café (coffee filter)

depending on whether one wishes to associate the same predicative Usem filtre(Arg0) including an argument that will be described during the correspondence as being drawn respectively from air, eau, huile, café, or four different Usems filtre_air, filtre_eau, filtre_huile, filtre_café. In the latter case, the relation with filtre will be highlighted at the semantic level only, by means of a taxonomic relation. Depending on the lexicographic choice, the correspondence between syntax and semantics will need to access positions of the internal structure of the compound or not.

All of the filtre à X (X filters) can be coded on a single compound in syntax, and be associated through filtering with the n different semantic units (depending on X).

 

Remark:

These compounds could also be coded in morphology, in which case, the problem of compositionality highlighting would no longer exist. The compound Um would be associated with a simple Usyn associated with a Usem. Semantic relations then represent the only way to show proximities of sense.

One may also decide to describe these structures as the syntactic behavior of filtre and to associate it with the predicative Usem filtre(Arg0). This coding may seem to be the most economical, and therefore the best, if one wishes to highlight filtre à air, filtre à eau as autonomous units at the semantic level. "Composition in semantics" will then be represented by relations of collocations between Usem filtre and Usem eau, Usem air...

When the internal-external division of the syntactic compound is not evident or is arbitrary, and since the coding of the syntactic layer must be carried out without taking into consideration the choices that will be made at the semantic level, correspondence data will have to be carried by the construction describing the external syntactic structure, and possibly also by the construction describing the internal structure of the compound. However, the approach of the GENELEX model consists, whenever possible, in highlighting, within the external construction, the positions associated with arguments at the semantic level.

Moreover, the coding of syntactic compounds is also made through a "calculation" on the "Composition" describing the compound on the basis of the component units and the different interactions between component units. Therefore, the correspondence between syntax and semantics will also rely on the description of the compounds through "Composition".

The language of expression of the correspondence must therefore explicitly specify, when it refers to a position, whether the position belongs to the external Construction or the internal structure, and in the case of a description through Composition, must indicate the nature of the component.

6.2.2. Insertion of modifier

Compounds in syntax have another specificity, concerning the insertion of modifiers, whether they are totally, partially or not lexicalized at all. Actually, when they are lexicalized, they are generally comprised in the "internal" description of the compound, whether it concerns the internal structure or the Composition. The case where the modifier concerns (according to the general syntax) the HEAD verb of the internal structure is probably the most frequent; this is a simple case where general compositionality will modify the semantic representation associated with the compound Usyn.

Ex : Il met fréquemment en marche le chauffage (he frequently turns on the heating)

il met fréquemment le chauffage en marche (he frequently turns the heating on)

could be represented in the following way:

Correspondence at the semantic level is then not required to provide any precision concerning the scope of the modifier, it complies with the general rules.

The insertion of the modifier can also be represented in the BS as follows:

It would then be recommended to adopt the implicit rule (at the grammatical level) which defines the scope of the modifier of the Construction HEAD as the modifier of the compound entry described. (The modifier of mettre (turn) would then be interpreted in this context as the modifier of mettre en marche (to turn on).)

However, this poses a problem because, within the compound, certain optional components must be regarded as modifiers, and others merely play a role at the surface level without modifying the sense of the compound. The solution which consists in inserting modifiers at the level of the BD and specifying their insertion point in the BS seems to be the best when it is possible and when the contribution of these modifiers at the semantic level is the same as for simple units.

Certain cases are more complex, and they pose a problem that cannot be solved so directly. This is especially true for the optional insertion of an adjective modifying a noun in the compound, or an adjective that must be interpreted as a compound modifier.

These are usually compound intensifiers. It is possible, at the level of the syntactic representation, to assign them, in the BS, a thematic role that indicates their semantic interpretation.

Ex: En toute/parfaite/très bonne connaissance de cause (with full knowledge of the facts)

However, since the application of thematic roles depends on the approach selected for describing the lexicon, it cannot be imposed for solving the problem.

The modifier function assigned to these positions can also be interpreted as desired by the general semantic interpretation managed by the grammar. The GENELEX model can recommend such a convention, which must then be adopted for an instantiation of the model in a given GENELEX dictionary.

Remark:

Basic structures comprising optional positions do not necessarily include modifiers.

The following compound is provided as an example

(fil (de fer)) barbelé.

It is therefore essential to be able to specify insertions corresponding to modifiers.

7. Mechanisms implemented

In view of the requirements expressed above, and in order to account for the various data necessary for establishing precise correspondence between two layers, the correspondence carries the following data: Correspondence Argument-position, SelectEtPréciseArg, Contraint_description.

The Correspondence Argument-position (Correspondence between Argument and position) makes it possible to describe the syntactic correspondent of an argument. When the argument is "represented" in the syntactic layer, this consists in associating an argument with a position by pointing to each of them. The predicate may also include deep arguments which are absent on the surface. In that case, the correspondence specifies their values when they are related to the Usyn-Usem pair and not to the predicate itself (see the following point).

The semantic layer has the possibility of identifying a number of arguments with a surface expression that does not correspond to any position; these arguments are associated with "floating positions" taken from the semantics. These positions (that can be regarded as adverbial or adjunct complements in syntax and are not included in the base description) are described as the syntax positions (a single object makes it possible to represent them: these "floating positions" can be associated with a function, a thematic role, and a distribution); their insertion level (but not their insertion point in the Position list) within the BD Construction is specified.

SelectEtPréciseArg provides information concerning the predicate arguments associated with the Usem. It specifies the default values of arguments drawn from optional positions, or information depending on the specific syntactic context associated with the Usyn concerned by the correspondence between syntax and semantics. This object also makes it possible to specify that a predicate argument at the highest level (i.e. an argument of the predicate directly pointed to by the Usem taking part in the correspondence) is itself a predicate specified by the correspondence.

Contraint_description adds further constraints on the base description of the Usyn. These constraints are expressed by adding (syntax) features on the distribution elements of a position, by inhibiting certain elements, and prohibiting or forcing the realization of a position considered as optional in the syntactic layer.

In the example above, the Usem36 prohibits the realization of position P2.S1.P2 (optional in the BD). It is marked as such.

The optional positions with compulsory realizations would be specified in the same way.

E - REFERENCE BIBLIOGRAPHY

Following is a list of some reference works. They include more comprehensive bibliographies which could provide elements for the detailed study of one specific aspect of the model.

 

Cruse, D.A. 1986. Lexical Semantics. Cambridge: Cambridge University Press

Evens, M.W., editor. 1988. Relational Models of the Lexicon. Cambridge University Press

Gross, G. in the journal Langages, "SELECTION ET SEMANTIQUE, classes d'objets, compléments appropriés, compléments analysables". September 1994. Larousse

Kerbrat Orecchioni, C. 1979. De la sémantique lexicale à la sémantique de l'énonciation. Tome 1. Service de Reproduction des thèses. Université de Lille III

Lecomte, A., Baschung K. & Bès G. 1991. Une modélisation des entrées lexicales. (Rapport Projet Eureka GENELEX)

Mel'cuk I., & al. (1984 & 1988) Dictionnaire Explicatif et Combinatoire du français contemporain. Recherches Lexico-sémantiques I & II. Montréal, Presses de l'Université de Montréal

Pustejovsky, J. 1991. The generative Lexicon. Computational Linguistics. vol 17

Pustejovsky, J. 1989. Current issues in computational lexical semantics, Proceedings of the 4th European ACL, Manchester, pp xvii-xxv

Rastier, F. 1987. Sémantique interprétative. Paris: Presses Universitaires de France.

Rastier, F. 1987. Représentation du contenu lexical et formalismes de l'intelligence artificielle. Langages 87, 77-102

The following documents give an overall view on the various problems of lexical semantics:

Actes du premier Séminaire de Sémantique Lexicale du GDR-GRECO Communication Homme-Machine, IRIT, Université Paul Sabatier, Toulouse, January 1991.

Proceedings of the 2nd Seminar on Computational Lexical Semantics, IRIT, Toulouse, January 1992

F - USER'S MANUAL

1. General points

The model provides a set of descriptive elements that allows for a very precise description of lexical semantics based on a componential analysis or a very comprehensive set of lexical semantic relations, or more probably on a combination of both descriptive axes.

The model also provides a set of abstract description elements. These are generalizations or abstractions made on the basis of the lexicon since certain elements are no longer directly related to the lexicon but constitute elements of knowledge representation.

As in the case of the morphological and syntactic layers, the semantic model avoids, as much as possible, using descriptive elements that are not explained. This accounts for the large number of intermediate objects (some have a "meta-knowledge" status) which intervene in the semantic layer.

These two modes of description coexist with no problems in the same dictionary, and the "meta-descriptive" level derives from the lexical level, since both possible "viewpoints" concerning the lexicon are connected.

Most descriptive data are associated with the objects of the semantic layer in consideration of the modality of the knowledge they express. A symbolic weighting is assigned, and a set of weighting values is provided in the model, which can be increased according to the needs of the lexicographic team building an instance of the GENELEX dictionary.

Here are the symbolic weighting values:

- DÉFINITOIRE (DEFINING)

- PROTOTYPIQUE (PROTOTYPICAL)

- ACCESSOIRE (ACCESSORY)

- EXCEPTIONNEL (EXCEPTIONAL)

- CONNOTATIF (CONNOTATIVE)

In addition to the symbolic weighting, it is possible to provide a numerical weighting of this knowledge. Besides, the knowledge modality also comprises a viewpoint that gives the opportunity to gather various viewpoints on the same object and to describe it according to this viewpoint. This field is entirely free.
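The three parts of a knowledge modality described above (symbolic weighting, numerical weighting, viewpoint) can be pictured as a small record. The following is only a sketch: the class names `SymbolicWeighting` and `Modality`, the unaccented member names, and the 0 to 1 scale used for the numerical weighting are assumptions made for illustration, not part of the GENELEX model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SymbolicWeighting(Enum):
    """Symbolic weighting values listed in the model (accents dropped)."""
    DEFINITOIRE = "defining"
    PROTOTYPIQUE = "prototypical"
    ACCESSOIRE = "accessory"
    EXCEPTIONNEL = "exceptional"
    CONNOTATIF = "connotative"


@dataclass(frozen=True)
class Modality:
    """Hypothetical container for a knowledge modality."""
    symbolic: SymbolicWeighting
    numeric: Optional[float] = None    # optional numerical weighting (scale assumed)
    viewpoint: Optional[str] = None    # entirely free field, never checked


# The same piece of knowledge weighted differently under two viewpoints.
by_default = Modality(SymbolicWeighting.PROTOTYPIQUE, numeric=0.8)
for_specialists = Modality(SymbolicWeighting.ACCESSOIRE, viewpoint="zoology")
```

A closed enumeration is used here for brevity; since the model lets a lexicographic team extend the set of weighting values, a real instance would more likely load that set from configuration.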

2. Usem

2.1. General points

The Semantic Unit (Usem) is the entry point in the semantic layer of the model.

The Usem is a unit that is closely related to the language and allows for a precise lexical semantic description. Its description relies on objects which represent various abstractions on the sense components that must be assigned to the Usems; some of them have a rather decompositional or analytic status, others are generalizations or abstractions made on the basis of the Usems.

The Usems can be pointed to by several Correspondences between Syntax and Semantics (Corresp_Usyn_Usem) provided these Correspondences belong to simple Usyns associated with the same Um.

In most cases, a Usem describes the sense (or one of the senses) of a morphological unit (simple or compound Um) in the syntactic contexts described by one or more of its simple Usyns which may be filtered.

A Usem may also represent one of the senses of a compound described in syntax, in which case, there is no associated Um, but a Usyn which gathers one or more Usyns or Ums in a composition that can be considered as a whole from the linguistic point of view.

 

All Usems can carry a CombVE, i.e. a quadruplet of usage values:

Level of language (attribute niveaulgue) with the values FAMILIER (COLLOQUIAL), VULGAIRE (VULGAR), ARGOTIQUE (SLANG), POPULAIRE (VERNACULAR), LITTERAIRE (LITERARY), SAVANT (LEARNED), STANDARD. If the expressive power provided by this attribute does not suffice to meet the requirements of a certain semantic approach, it will also be possible to complete the data with one or more weighted valued features expressing more precise information concerning the level of language.

Frequency (attribute fréquence) with the values RARE, COURANT (COMMON)

Geographical variations (attribute vargeog) with free values

Dating (attribute dating) with the values ARCHAIQUE (ARCHAIC), VIEILLI (DATED), MODERNE (MODERN)
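As an illustration of the quadruplet of usage values, here is a minimal sketch of a CombVE record that checks the three closed value sets and leaves geographical variation free. Treating every attribute as optional, and writing the frequency attribute without its accent, are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Closed value sets quoted from the text; vargeog is free and not listed.
ALLOWED = {
    "niveaulgue": {"FAMILIER", "VULGAIRE", "ARGOTIQUE", "POPULAIRE",
                   "LITTERAIRE", "SAVANT", "STANDARD"},
    "frequence": {"RARE", "COURANT"},
    "dating": {"ARCHAIQUE", "VIEILLI", "MODERNE"},
}


@dataclass
class CombVE:
    niveaulgue: Optional[str] = None
    frequence: Optional[str] = None
    vargeog: Optional[str] = None      # free values, never checked
    dating: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject any value outside the closed sets given by the model.
        for attr, allowed in ALLOWED.items():
            value = getattr(self, attr)
            if value is not None and value not in allowed:
                raise ValueError(f"{attr}: unexpected value {value!r}")


usage = CombVE(niveaulgue="FAMILIER", frequence="COURANT", vargeog="Belgique")
```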

The Predicate and the Concept are two descriptive elements representing a certain (more or less important) abstraction in relation to the lexicon. The Usem relies on these objects and uses them to access increasingly abstract levels of description (which are also increasingly independent of lexical specificities).

The Usems have 0 or one Predicative Representation (RepresentationPredicative), and between 0 and N associated Conceptual Representations (RepresentationConceptuelle_Pond). A Usem described by a predicative structure can enrich the semantic representations of the predicate arguments. This semantic enrichment is specified by the various SelectEtPreciseArg that may be included in the Predicative Representation.

Usems are associated with 0 or n Weighted Semantic Features (Trait_Sem_ValPond), representing decompositional (or componential) characteristics of the sense. If one describes the Usem by relying on more abstract generalized objects, the only data to be noted here will be the precise lexical data that are not shared by the various Usems relying on the same abstractions; the other data will be implicitly present through the inheritance of abstract descriptive objects.

A Usem is inserted into a network of semantic (lexical) relations between Usems, which constitutes one description axis of the Usem. It has therefore a number of semantic relations with other Usems, which is reflected by the fact that it is the source or target of Weighted Valued Relations linking Usems (R_ValPond_Usem). According to a coding convention, the Usem represents the source of embedded R_ValPond_Usem; R_ValPond_Usem specifies their target as well as the semantic relation linking both Usems and the relation modality of both units.

A Usem includes a free definition, i.e., it is a completely free field intended for a human reader. It comprises, for example, the definition of the paper dictionary which is useful to the lexicographer. This field is not controlled, and the model does not impose any rules regarding the structure or syntax of the definition.

A formal definition field is also attributed to the Usem. It may be useful, for example, to associate determiners with a logical formula, which may include, if required, the lambda-calculus expression associated with the entry. Like the preceding field, the content of this field is not controlled.

2.2. Predicative representation (RepresentationPredicative)

A predicative Usem describes its predicative component with one and only one Predicative Representation in which the Usem may constitute a master element or not. The unique master Usem, in its association with a predicate, is the privileged lexicalization of this predicate, the most neutral one, which is described by this predicate and a minimum of data. In many cases, when the predicate is shared by a set of Usems building a semantic network around a nucleus verb, the nucleus verb is the simplest expression of the predicate, and its Usem is its master element (ex: the Usem of the verb "acheter" (to buy) is the master element of the predicate acheter shared by the Usems "acheteur", "achat1", "achat2", "achetable" ...). The Predicative Representation is associated with a Usem in an essential (or defining) way; in particular, it explains the Usem. That is why it is not necessary (or possible within the model) to give a modality to this association, and why a Usem can only be associated with one (or zero) Predicative Representation.

The relation between the Usem and the predicate is freely named by the attribute type of link, which characterizes the "function" that shifts from the Usem to the Predicate. The semantics of the link can thus be specified precisely (for example: "capacity to be argument1" between the Usem lavable (washable) and the predicate laver (to wash)).

chemin_arg_concerne is used to specify the rank of the argument concerned by the relation (ex: acheteur: predicate acheter, argument0). arg_inclus will indicate whether the Usem has integrated the argument (ex: acheteur) or not (ex: achetable "characteristic of what can be an argument," chemin_arg_concerne: 1 arg_inclus: NON_I).
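The acheter examples given in this section can be written down as data. The sketch below assumes a hypothetical record type with one field per attribute named in the text; `arg_inclus` is modelled as a boolean in which `False` stands for the value NON_I, and the helper `master_usem` is an illustrative name, not a model object.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RepresentationPredicative:
    predicate: str
    master: bool                                # privileged lexicalization?
    type_of_link: str = ""                      # free text naming the link
    chemin_arg_concerne: Optional[int] = None   # rank of the argument concerned
    arg_inclus: Optional[bool] = None           # False stands for NON_I


# One (and only one) Predicative Representation per predicative Usem.
representations: Dict[str, RepresentationPredicative] = {
    "acheter": RepresentationPredicative("acheter", master=True),
    "acheteur": RepresentationPredicative(
        "acheter", False, chemin_arg_concerne=0, arg_inclus=True),
    "achetable": RepresentationPredicative(
        "acheter", False, "characteristic of what can be an argument",
        chemin_arg_concerne=1, arg_inclus=False),
}


def master_usem(reps: Dict[str, RepresentationPredicative],
                predicate: str) -> Optional[str]:
    """Return the unique master Usem of a predicate, or None if there is none."""
    masters = [u for u, r in reps.items() if r.predicate == predicate and r.master]
    if len(masters) > 1:
        raise ValueError(f"predicate {predicate!r} has several master Usems")
    return masters[0] if masters else None
```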

2.3. Weighted Conceptual Representation (RepresentationConceptuelle_Pond)

A Usem can be associated with zero, one, or more concepts (independently of its representation through the Predicative Representation) by specifying the modalities of this association (symbolic weighting, numerical weighting and viewpoint). The modality of the Conceptual Representation makes it possible to associate different concepts (with different modalities) with the same Usem without creating any inconsistency. Moreover, it will be possible to associate a unit, described as predicative in precise lexical semantics, with a concept according to a certain viewpoint that will then be defined.

The Usems associated with a predicate by integrating an argument may have a predicative representation and also a conceptual representation that will account for the nature of the integrated argument, especially by entering a taxonomic network.

Ex: It will be possible to associate the Usem "filtre" (filter) to the predicate "filtrer" (to filter) if the instrument has been considered, within the description, as an integral part of the predicate arguments. It can also be associated with a concept "filtre" that is integrated into the network of "tools".

The Usem can be the master element or not in its relation with a concept, which determines whether it stands as the privileged lexical representative of the concept, and which also means that this concept is named or lexicalized (see Concept).

The relation between the Usem and the Concept is freely named by the attribute type of link which characterizes the "function" relative to the shifting from the Usem to the Concept. The semantics of the link can be specified in a precise way.

3. Semantic unit of an affix (Usem_Aff)

The semantic unit of an affix (Usem_Aff) comes directly from an affix Um (Um_Aff): it provides a minimal description of its core of meaning, and a semantic characterization. This information makes it possible to minimally predict the sense for regular and productive form derivations not included in the dictionary.

A Usem_Aff can be described with the Weighted valued features (preferably of a particular type specified by the concerne attribute of Trait_semantique through the value AFFIXE), with the Weighted concept (RepresentationConceptuelle_Pond_Aff) or the Predicate. These elements can be combined as follows: a Predicate and a list comprising Weighted valued features or not, a Weighted concept and a list comprising Weighted valued features or not, or a list comprising Weighted valued features. Actually, a Usem_Aff including no semantic information would have no grounds for existence.

As in the case of the Usems, the definition_libre and definition_formelle fields make it possible to provide a textual or more formal definition. Combinations of use values are also specified (combve).

Ex: a mis- affix (mis-anthrope, miso-gyne) (mis-anthropist, miso-gynist)

could have a Usem_Aff pointing to the predicate: "détester" (to hate)

(the calculation of the predicate arguments is not detailed)

Ex: a pisc- affix (pisci-culture)

could have a Usem_Aff pointing to the concept: "poisson" (fish)

Ex: a -tion affix (fabric-ation)

could have a Usem_Aff pointing to the Trait_Sem Value "action = +".

(related to the Trait_Sem action with concerne = AFFIXE)
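The admissible combinations for a Usem_Aff can be expressed as a small validity check, applied here to the three affix examples above. The class `UsemAff` and its method are hypothetical names; reading the list of combinations as excluding a Predicate together with a Concept is an interpretation of the text, not something it states outright.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UsemAff:
    predicate: Optional[str] = None
    concept: Optional[str] = None
    features: List[str] = field(default_factory=list)

    def is_valid(self) -> bool:
        # Assumed reading: Predicate and Concept are alternatives.
        if self.predicate is not None and self.concept is not None:
            return False
        # A Usem_Aff with no semantic information has no grounds for existence.
        return (self.predicate is not None
                or self.concept is not None
                or bool(self.features))


mis = UsemAff(predicate="détester")       # mis- affix
pisc = UsemAff(concept="poisson")         # pisc- affix
tion = UsemAff(features=["action = +"])   # -tion affix
```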

4. Predicate

4.1. General points

A predicate is one of the main elements of the semantic model: a Usem described as Predicative will rely on a LEXICAL type predicate for describing the Predicative nucleus.

A predicate can be of different types, according to the model:

-LEXICAL: drawn directly from a Usem, whether this is a master element or not.

-GÉNÉRALISÉ: not drawn directly from a Usem, introduced into the generalization relation of another Predicate (that may be lexicalized), and also generalized by another Predicate.

-PRIMITIF: not drawn directly from a Usem, does not allow any generalization relation. A PRIMITIVE Predicate can be provided a priori as a pre-existing or pre-defined description element, that may be independent of the language (the pivot attribute specifies this). If the whole set of PRIMITIVE Predicates represents a datum a priori, the Usems and intermediate descriptive objects will be projected on them. The whole set of PRIMITIVE Predicates can also be drawn from the descriptions of the lexicon; in which case, the list of them will be provided a posteriori.

-LEXICAL_PRIMITIF: simultaneously drawn directly from the lexicon, without generalization, and possibly also a pivot.

-TROU_LEXICAL: drawn between two lexical Predicates: non-lexicalized generalization of one of them, and generalized by another LEXICAL Predicate.

A predicate can have a pivot status, which means that it has been selected as a pivot description element that is independent of the languages and as a descriptive element shared by several languages.

A predicate can have a free definition: this is an entirely free field intended for a human reader. The textual definition or the entire expression which is useful to the lexicographer is recorded therein. This field is not controlled, and the model does not impose any rules regarding the structure or syntax of the definition.

A predicate can have a formal definition: this is another entirely free field which can comprise a formal definition (lambda-calculus formula, for example). Like the preceding field, it is not controlled.

A predicate can be a source or target of weighted valued relations between Predicates (R_ValPond_Pred); it can also be a source or target of the weighted valued relations between Predicate and Concept (R_ValPond_Pred_Concept and R_ValPond_Concept_Pred). Among the relations between Predicates, the relation of generalization type plays a specific role, since it provides a structure among Predicates. The Predicate carries the weighted valued relations between Predicates (R_ValPond_Pred) and between Predicate and Concept (R_ValPond_Pred_Concept).

A predicate knows the list of its arguments; this is an essential characteristic of the Predicate; the two cannot be disassociated. (Arguments are to Predicates what, approximately, Positions are to Constructions.)

The list of referred Arguments follows a pertinent order in the model because, for a given Predicate, the rank of an Argument in the list will be used outside the Predicate to refer to its nth argument. By convention, Arguments are numbered from 0. Arguments do not explicitly show their rank as an attribute, because, from the linguistic point of view, the rank is more a practical requirement for description (to access an Argument from the outside) than a pertinent piece of information. Since the rank is not a characteristic of the Argument, the Predicates can share Arguments independently of the rank assigned to them by these Predicates.
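The point that rank belongs to the Predicate's list rather than to the Argument can be illustrated as follows. This is a sketch only: the classes, the predicate recevoir, and the role labels are hypothetical, chosen to show one Argument object shared by two Predicates at different ranks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=False)            # identity comparison: an Argument is a shared object
class Argument:
    roles: List[str]            # no rank attribute, as in the model


@dataclass
class Predicate:
    name: str
    arguments: List[Argument]   # the order of this list defines the ranks

    def arg(self, rank: int) -> Argument:
        """Access the nth argument from outside the Predicate (ranks start at 0)."""
        return self.arguments[rank]

    def rank_of(self, argument: Argument) -> int:
        return self.arguments.index(argument)


giver = Argument(["donateur"])
thing = Argument(["objet"])
recipient = Argument(["destinataire"])

donner = Predicate("donner", [giver, thing, recipient])
recevoir = Predicate("recevoir", [recipient, thing, giver])
```

The same `recipient` object sits at rank 2 in `donner` and at rank 0 in `recevoir`, which is only possible because the rank is not stored on the Argument itself.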

A Predicate can point at weighted valued features, semantic characteristics related to the decompositional axis of the semantic layer.

4.2. Argument

An argument is closely related to the definition of the notion of predicate, and represents one of its totally essential elements. The presence of at least one argument is what makes a predicate a predicate.

A predicate has a number of data concerning each of its arguments. As description elements, arguments may be shared by several Predicates, and they are present at different ranks within the list of arguments of these different Predicates (like Positions in syntax that can be shared by several Constructions, in which case, Predicate must be paralleled with Construction).

An argument is characterized by at least ONE or perhaps several Semantic roles (attribute role_sém_l which points to Role_Sem) and by semantic data expressed in the list of InformeArg informe_arg_decrit_l. The InformeArg directly referenced by the Argument does not carry the rank information, because it is totally inoperative in that case (this information is useful when an argument is accessed from outside the Predicate as a Predicate argument).

Semantic roles (Role_Sem) are hierarchically described with an "ISA" relation that is fixed in the model. This is represented by the attribute ISA_l that points a Role_Sem at its "father(s)" in the hierarchy.

The connection with the thematic roles used in syntax is made by associating a thematic role of the syntactic layer with certain Role_Sem of the semantic layer. Consequently, the list of semantic roles may be different from the list of roles provided in syntax; the only constraint for ensuring consistency between both layers is that the roles of syntax must be "embedded" in a hierarchy of Role_Sem; semantic roles can then be more precise. To remain consistent, the thematic role (when applied in syntax) will have to be compatible with the semantic role corresponding to the argument drawn from the position that carries the thematic role: the semantic role must be the correspondent of the thematic role, or one of its descendants in the hierarchy of roles.
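The consistency condition between thematic and semantic roles amounts to an ancestor test in the ISA hierarchy. In the sketch below the role names and the hierarchy are invented for illustration, and the thematic role is represented directly by the Role_Sem it corresponds to.

```python
from typing import Dict, List, Set

# Hypothetical Role_Sem hierarchy: each role points at its "father(s)" (ISA_l).
ISA_L: Dict[str, List[str]] = {
    "ROLE": [],
    "AGENT": ["ROLE"],
    "PATIENT": ["ROLE"],
    "ACHETEUR": ["AGENT"],
}


def self_and_ancestors(role: str, isa_l: Dict[str, List[str]]) -> Set[str]:
    """Collect a role together with all of its ancestors."""
    seen: Set[str] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(isa_l.get(current, []))
    return seen


def compatible(thematic_counterpart: str, semantic_role: str,
               isa_l: Dict[str, List[str]]) -> bool:
    """True if the semantic role is the counterpart itself or a descendant of it."""
    return thematic_counterpart in self_and_ancestors(semantic_role, isa_l)
```

With this hierarchy, `compatible("AGENT", "ACHETEUR", ISA_L)` holds because the more precise semantic role descends from the counterpart of the thematic role, while the reverse direction does not.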

4.3. SelectEtPreciseArg, InformeArg

SelectEtPreciseArg points to a predicate argument at any level of depth; when the predicate arguments are predicates themselves, arguments can be accessed through the series of argument ranks at each level. The attribute chemin_arg includes a series of Argument ranks, and, in any case, makes it possible to reach the Argument concerned. This Argument is enriched with semantic data of various statuses through the pointed InformeArg.
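Following a chemin_arg is a simple walk through nested argument lists. The sketch below assumes a hypothetical node type in which an argument that is itself a predicate carries its own arguments; the predicate vouloir and the labels are illustrative, and ranks are counted from 0 as in the numbering convention for Arguments.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Arg:
    label: str
    args: List["Arg"] = field(default_factory=list)   # non-empty if predicative


def follow(predicate: Arg, chemin_arg: Sequence[int]) -> Arg:
    """Follow a series of argument ranks, one rank per level of depth."""
    node = predicate
    for rank in chemin_arg:
        node = node.args[rank]
    return node


# vouloir(sujet, acheter(acheteur, objet)): the second argument is a predicate.
acheter = Arg("acheter", [Arg("acheteur"), Arg("objet")])
vouloir = Arg("vouloir", [Arg("sujet"), acheter])

deep = follow(vouloir, [1, 0])   # argument 0 of the predicate found at rank 1
```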

InformeArg plays a part (directly or via SelectEtPreciseArg) at different levels of the descriptions in order to add the data gathered on the argument of a predicate. The Predicate itself knows its Arguments and these data on its Arguments are expressed through InformeArg carried by the Arguments themselves. A Usem is described with a Predicate and possibly some added information on the Arguments. The correspondence between Usyn and Usem can also add data on the Predicate arguments associated with the Usem. All these data are gathered together; each level provides specific elements, but contribution is not compulsory.

The correspondences between Arguments (in the weighted valued relations between predicative Usems or between predicates) may provide data on the arguments (of the source or target) in the form of SelectEtPreciseArg.

The status of the data relative to an Argument makes it possible to give them various modalities, mainly related to the management of defaults, to verification, or to the construction of the semantic representation of the Argument.

An InformeArg carries a set of data on an argument; all these data share the same status (DEFAUT, VERIF, ENRICHIT, or DEFAUT_VERIF).

DEFAUT: information on the argument, in the absence of its semantic representation: absent on the surface, or occupied by a semantically "empty" element. The value concerned may even be a Usem.

VERIF: when the existing argument has a semantic representation, this must comply with the constraints imposed on this status. For example, it must carry the features trait_sem_valpond_l pointed to here, or take the Concepts pointed to by concept_l as generalization.

ENRICHIT adds semantic data to the argument data, whether there are any or not.

DEFAUT_VERIF combines default contribution, if there are no data, with verification, if arguments are semantically "full".

pred_instancie points to an instantiated predicate PredInstancie which may be associated with a DEFAUT or VERIF status (particularly if the predicate is not lexical) or a DEFAUT_VERIF status; ENRICHIT will be used if the data are added by an instantiated Predicate on one of its Arguments.

concept_l points to a list of Concepts, usually 1 or 0; the case >1 is used to insert a polyhierarchy of Concepts which the argument should inherit. The various statuses are compatible with this information.

usem points to a Usem; the associated status is most often DEFAUT, owing to the strict lexicalization constraint imposed by the information carried on a Usem, or ENRICHIT, especially when the information is added by an instantiated Predicate PredInstancie.

trait_sem_valpond_l points to a list of weighted valued semantic features; the various statuses are also compatible with this information.

Several InformeArg or SelectEtPreciseArg can concern the same Argument, as the information associated with various statuses is gathered.
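One possible reading of the four statuses is sketched below. It is a simplification under stated assumptions: the semantic representation of an argument is reduced to a set of feature strings, `None` stands for an argument with no semantic representation, and a VERIF on an absent representation is assumed to succeed trivially. The function name is hypothetical.

```python
from typing import Optional, Set, Tuple


def apply_informe_arg(status: str, info: Set[str],
                      existing: Optional[Set[str]]) -> Tuple[Optional[Set[str]], bool]:
    """Return the resulting representation and whether verification passed."""
    if status == "DEFAUT_VERIF":
        # Default contribution if there are no data, verification otherwise.
        status = "DEFAUT" if existing is None else "VERIF"
    if status == "DEFAUT":
        # Only used in the absence of a semantic representation.
        return (set(info) if existing is None else existing), True
    if status == "VERIF":
        # The existing representation must carry the required features.
        return existing, existing is None or info <= existing
    if status == "ENRICHIT":
        # Added whether or not there are data already.
        return (existing or set()) | info, True
    raise ValueError(f"unknown status {status!r}")
```

Since several InformeArg can concern the same Argument, a caller would apply them in turn and combine the verification results; the order of application is left open here.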

4.4. Instantiated predicate (PredInstancie)

PredInstancie represents the association of a Predicate and more or less precise and comprehensive data concerning the semantic representation of its Arguments. Arguments can be fully described, "instantiated" by Usems, or only partially described through Concepts, or Trait_Sem_ValPond. The list informe_arg_l may even be completely empty, in which case, this element will merely point at a predicate (and implicitly at the SelectEtPreciseArg of its description). Formally, the data gathered on the arguments at various stages of the description are expressed in the same language of InformeArg (via SelectEtPreciseArg or not). Therefore, an instantiated predicate (PredInstancie) is merely a Predicate that integrates a little or a lot of data specific to its Predicate definition on its arguments.

4.5. List of predicates (ListePred)

A list of instantiated predicates makes it possible to gather these instantiated predicates, to give a status to all of them, and to make the list play the role of an autonomous description element. For a list of predicates, it is possible to associate Variables with certain sets of arguments in order to specify the distribution of arguments among the various Predicates (like Prolog variables).

A list of instantiated predicates makes it possible to define a predicate by inserting it or not into the list; it makes it possible to explain the presuppositions related to a predicate, their implications, and even to describe a scenario to which the predicate and concept can refer.

A list of predicates has a status which specifies the status of the group of enriched predicates: a number of values are provided in the model:

- SCÉNARIO

- DÉFINITION

if necessary, users can add their own values.

A list of predicates also has a type specifying the meaning of the order of enriched predicates within the list. A number of values are provided in the model:

- ORDRE_TEMP

- ORDRE_SPATIAL

- SIMULTANEITE

if necessary, users can add their own values.

The order of this list is pertinent; its meaning is specified by the type, and the rank within the list is also used to describe variables by accessing predicate arguments in this list.

A list of predicates is associated with a predicate or a concept through an Assoc_Liste_Préd that specifies the modality of this association.

The role of the list of predicates in relation to the predicate or the related concept is specified. A number of values are provided in the model:

- A_PRESUPPOSITION

- A_IMPLICATION

- A_DÉFINITION

- A_EXPLICITATION

- PARTICIPE_A

if necessary, users can add their own values.

A field intervient specifies, for a Predicate, whether it intervenes directly into the ListePred, i.e. whether it intervenes at the level of one instantiated predicate included in this list (it will then be mentioned expressly, since the list can be referenced by other semantic objects).

4.5.1. Variable

Within a list of predicates, a variable associates the arguments of several Predicat_instancie which are "unified" for a List of related predicates.

This association is made by pointing to a list of SelectPredArg, the paths leading to Predicate arguments that are intended to be unified.

4.5.2. SelectPredArg

A path leading to an argument (SelectPredArg) makes it possible to reach the predicate argument included in a list of predicates.

Its rank within the list (nieme_pred) (nth pred) refers to an instantiated predicate of the list.

The argument concerned in this predicate is reached through the attribute chemin_arg. In most cases, this list of ranks will comprise a single element. In rare cases, however, the predicate has a complex argument, i.e. an argument that is itself predicative; the argument sought is then found by following the predicative structure that realizes the complex argument, pointing recursively to the rank of the next argument at each level. This complex case is why chemin_arg is a list of NUMBERS.

The following example illustrates the notion of a list of instantiated predicates, of variables in such lists, and of a path leading to an argument:

The predicate acheter (to purchase) (X : acheteur (purchaser), Y : objet (object), Z : vendeur (seller), W : argent (money)) could be associated with a list of enriched predicates (with defining status, temporal order):

- posséder(Z, Y) (to own): instantiated predicate 1

- désirer(X, Y) (to wish): instantiated predicate 2

- posséder(X, W): instantiated predicate 3

- donner(X, W, Z) (to give): instantiated predicate 4

- donner(Z, Y, X): instantiated predicate 5

- posséder(X, Y): instantiated predicate 6

- ...

Note that the same Predicate (posséder) appears three times in the list, within three different enriched predicates, and that the Predicate "donner" appears twice. Within the model, the variable represented here by X would be associated with 5 paths leading to arguments (predicates and arguments are counted from 1 in this example):

- predicate 2, argument 1

- predicate 3, argument 1

- predicate 4, argument 1

- predicate 5, argument 3

- predicate 6, argument 1

in the same way, variable Y will be associated with:

- predicate 1, argument 2

- predicate 2, argument 2

- predicate 5, argument 2

- predicate 6, argument 2

the same mechanism is naturally applied to W and Z.
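The paths listed in this example can be recomputed mechanically. The sketch below encodes the six instantiated predicates as plain tuples and derives, for each variable letter, the (nieme_pred, argument rank) pairs, counting from 1 as the example does. In the model itself a Variable points to its SelectPredArg paths explicitly; deriving them from letters is a shortcut of this illustration, and the function name is hypothetical.

```python
from typing import List, Tuple

# The list of enriched predicates associated with acheter(X, Y, Z, W).
liste_pred: List[Tuple[str, Tuple[str, ...]]] = [
    ("posséder", ("Z", "Y")),
    ("désirer", ("X", "Y")),
    ("posséder", ("X", "W")),
    ("donner", ("X", "W", "Z")),
    ("donner", ("Z", "Y", "X")),
    ("posséder", ("X", "Y")),
]


def paths_for(variable: str) -> List[Tuple[int, int]]:
    """Return (predicate rank, argument rank) pairs, both counted from 1."""
    return [
        (pred_rank, arg_rank)
        for pred_rank, (_, args) in enumerate(liste_pred, start=1)
        for arg_rank, name in enumerate(args, start=1)
        if name == variable
    ]


x_paths = paths_for("X")   # five paths, as stated in the text
y_paths = paths_for("Y")   # four paths
```

Applying the same function to Z and W gives the paths that the text leaves implicit.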

5. Concept

A concept is one of the main elements of the semantic model. This is a "cognitive" abstraction on Usems or a generalization of such abstractions.

There are different types of concepts, according to the model:

-LEXICAL: drawn directly from a Usem, whether it is a master element or not.

-GÉNÉRALISÉ: not drawn directly from a Usem, introduced into the generalization relation of another Concept that may be lexicalized, and also generalized by another Concept (and therefore not PRIMITIF).

-PRIMITIF: not drawn directly from a Usem, and does not allow any generalization relation. A PRIMITIVE concept can be provided a priori as a pre-existing or pre-defined description element, that may be independent of the language (the pivot attribute specifies this). If the whole set of PRIMITIVE concepts represents a datum a priori, the Usems and intermediate descriptive objects will be projected on them. The whole set of PRIMITIVE concepts can also be drawn from the descriptions of the lexicon, in which case, the list of them will be provided a posteriori.

-LEXICAL_PRIMITIF: drawn directly from the lexicon, without generalization, and also possibly a pivot.

-TROU_LEXICAL: drawn between two lexical Concepts: non-lexicalized generalization of one of them, and generalized by the other one (concepts of this type will be particularly useful to describe a number of taxonomies including "gaps" at the lexical level).

A Concept can have a pivot status, which means that it has been selected as a pivot description element that is independent of the languages and as a descriptive element shared by several languages.

A Concept can have a free definition: this is an entirely free field intended for a human reader. This is where the definition used by the lexicographer is recorded. This field is not controlled, and the model does not impose any rules regarding the structure or syntax of the definition.

A Concept can have a formal definition: this is another entirely free field which can comprise a formal definition (lambda-calculus formula, for example). Like the preceding field, it is not controlled.

A Concept can be a source or target of weighted valued relations between Concepts (R_ValPond_Concept). It can also be a source or target of the weighted valued relations between predicates and concepts (R_ValPond_Concept_Pred, R_ValPond_Pred_Concept). Among the relations between Concepts, a relation of generalization type plays a specific role, since it provides a structure among Concepts. By convention, Concept will carry the deriving R_ValPond_Concept and R_ValPond_Concept_Pred.

A Concept can have a number of specifically related semantic features because they are particularly pertinent. Consequently, their description is either relevant (trait_sem_pertinent_l) or compulsory (trait_sem_obligatoire_l) for every associated Usem or concept that may be less general. This can allow for the association of a Concept with particularizing features that must be or could be described by the associated Usems or Concepts (directly or through successive generalizations).

Weighted valued features can be assigned to a concept. They are semantic characteristics related to the decompositional axis of the semantic layer. The mechanisms of inheritance must be used to avoid specifying every feature at each level. (A relation of the PARTICULARISATION type can imply the inheritance of data carried by the "father"; this depends on the choice of instantiation of the model.)

A concept can be associated with a list of predicates (usually of the SCENARIO type...) by means of Assoc_Liste_Pred which specify the modality of this association.

6. Valued semantic feature

The notion of a valued feature is commonly used in the representation of knowledge and in computational linguistics. (It is often called "attribute-value pair".) It makes it possible to assign "elementary" knowledge to objects, and this knowledge is rather easy to process computationally. Moreover, object modelling is adapted to the use of such valued features.

Componential semantics is expressed through semantic features.

The model of the semantic layer provides a language of valued semantic features that can carry the main descriptive objects of the model (Usem, Predicate, Concept).

The valued semantic features of the GENELEX model are structured, and there are a number of relations between them. Whenever possible, the abstruseness of a meta-language of features is avoided by establishing a connection with the language (used for the description).

6.1. Weighted valued semantic feature (Trait_Sem_ValPond)

A weighted valued semantic feature is the association of a valued semantic feature and a symbolic weighting with a list of given values that the user can adapt to his/her needs. It is also possible to indicate a numerical weighting and a viewpoint.

6.2. General Points

A valued semantic feature is the association of a semantic feature with a value belonging to this feature.

It is possible to specify whether this feature has a pivot status, i.e. whether it is independent of the language, and if it can appear with the same definition in the descriptions of linguistic objects in various languages.

A valued feature refers to the Semantic feature associated with the value; this semantic feature also has properties, and is described as a specific object.

The value associated with the feature is specified by the valued feature (valeurtrait or valeurbin).

The value of a valued feature can be named in the described language; hence, the field usem_lexicalise_val may refer to the Usem that names the valued feature. In this case, valued features, although they constitute a meta-language, are connected to the language. Of course, not every valued feature is associated with a lexicalization.

Valued features are organized into a hierarchy or a lattice through an ISA relation: each valued feature refers to a list of "father" valued features. This organization makes it possible to apply the system of value inheritance.

The ISA relation (ISA_l) makes it possible to establish a hierarchy between the valued features of the same feature, i.e. the values of this feature (or when features have a binary value, to establish a hierarchy between valued features with the value +). This way, it will be possible to express that [humain = +] is integrated into a hierarchical relation with [animé = +] and that the assignment of [humain = +] to a unit makes it possible to conclude that [animé = +] also applies.
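
The deduction described above amounts to following the ISA links upwards. The following Python sketch illustrates it under stated assumptions: valued features are represented as (feature, value) pairs, the dictionary ISA plays the role of ISA_l, and the feature "concret" is a hypothetical addition used only to show a second level.

```python
def implied_by_isa(valued_feature, isa):
    """Return every valued feature reachable by following ISA links upwards."""
    reached = set()
    pending = [valued_feature]
    while pending:
        current = pending.pop()
        for father in isa.get(current, ()):
            if father not in reached:
                reached.add(father)
                pending.append(father)
    return reached


# Hypothetical ISA links: each valued feature lists its "father" valued features.
ISA = {
    ("humain", "+"): [("animé", "+")],
    ("animé", "+"): [("concret", "+")],
}

deduced = implied_by_isa(("humain", "+"), ISA)
```

With these links, assigning [humain = +] to a unit allows both [animé = +] and the hypothetical [concret = +] to be deduced.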

Note that valued features with a binary value, although they are formally described within the same entity, have a slightly different semantics from features with a potentially high number of values, such as a semantic class feature. The semantics of [humain = +] is: has the "human" characteristic; [humain = -] means: does not have the "human" characteristic. Hierarchical inheritance makes it possible to deduce a set of "positive" properties on a branch, and to reject the properties belonging to the other branches. [humain = +] implies that [humain = -] is impossible.

A relation of incompatibility links certain valued features. This is expressed by the field incompatible_l which refers to a set of valued features that are incompatible with the Valued feature referring to it.
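
A consistency check based on this relation could look like the sketch below. It is only an illustration: the dictionary INCOMPATIBLE stands for the content of incompatible_l, valued features are represented as (feature, value) pairs, and the function name is hypothetical.

```python
from itertools import combinations


def find_conflicts(valued_features, incompatible):
    """Return the pairs of carried valued features declared incompatible."""
    conflicts = []
    for first, second in combinations(sorted(valued_features), 2):
        if second in incompatible.get(first, ()) or first in incompatible.get(second, ()):
            conflicts.append((first, second))
    return conflicts


# Hypothetical content of incompatible_l for one valued feature.
INCOMPATIBLE = {("humain", "+"): {("humain", "-")}}

consistent = find_conflicts({("humain", "+"), ("animé", "+")}, INCOMPATIBLE)
inconsistent = find_conflicts({("humain", "+"), ("humain", "-")}, INCOMPATIBLE)
```

The check is symmetric: a pair is reported whichever of the two valued features declares the incompatibility.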

A relation of implication (implique_l) links certain valued features. According to the type of Trait_sem, the sense of this relation can vary slightly; let us look at why.

A valued feature is associated with a Trait_Semantique that carries a field "concerne" indicating whether this feature appears in the description of the semantic unit of an affix (semantic characterization of affixes) or in the various elements included in the semantic description of Usems.

If the feature carrying implique_l is a Semantic feature concerning an affix, the valued features listed in implique_l are interpreted as follows. Certain Usems involve this affix in the mode of derivation of their Um, and the sense of the affix within such a Usem corresponds to the Usem_Aff; the Valued features included in the Valued feature D_AFFIXE are therefore projected onto the sense of the element (the Usem, if it exists) that involves the affix. This information is of little use for derivatives that are fully described in the three layers. Its main purpose is to allow a minimal prediction of the semantic representation of derivatives that are not described in the dictionary but are formed in a regular way. The implication thus holds between two different objects. (Valued feature carried by a Usem_Aff ==> Valued feature carried by a Usem, if it is present, or by a "fictitious" Usem, if it is not explicitly present. Such Usems involve, at the morphological level, an affix Um associated with the Usem_Aff.)

When the relation of implication is carried by a Trait_sémantique concerning "AUTRE", a descriptive object that carries this Trait_valué does not have to carry the implied Valued features explicitly: they are carried implicitly. The implication thus holds within a single object, which explicitly carries one valued feature and implicitly carries the features that it implies.

Correspondence between features with semantic value in syntax and valued features in semantics is established through a relation of correspondence. The field trait_synt_corresp refers to the "semantic" feature used in syntax to which it corresponds, and makes it possible to provide its full definition at the semantic level. A correspondence is thus explicitly established between the semantic features of restriction in syntax and the features used for the semantic representation.

A valued feature can be associated with a list of Semantic features whose description can be pertinent (trait_sem_pertinent_l) or compulsory (trait_sem_obligatoire_l). This can make it possible to associate differentiating features that must be described with a valued feature belonging to the semantic class. This serves as an excellent guide for the lexicographer, and this clarifies the use of certain features.

A (typical) example illustrates these pertinent features:

Let us take the following valued feature:

classe sémantique = siège

(semantic class = seat)

it can be associated with the following pertinent semantic features:

a_un_dossier (has_a_back)

a_des_bras (has_arms)

pour_une_personne (for_a_person)

confortable (comfortable)

...

that will make it possible to characterize the Usems associated with chair, armchair, bench, sofa, stool...
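
To show how such lists can guide the lexicographer, here is a small Python sketch of a completeness check. It rests on assumptions: the choice of a_un_dossier as a compulsory feature, the dictionary layout, and the feature values given for the two Usems are all invented for the example.

```python
SEAT = ("classe sémantique", "siège")

# Pertinent features taken from the example above.
PERTINENT = {SEAT: ["a_un_dossier", "a_des_bras", "pour_une_personne", "confortable"]}
# Hypothetical: we assume here that one of them is declared compulsory.
OBLIGATORY = {SEAT: ["a_un_dossier"]}


def missing_obligatory(usem_features, obligatory):
    """List compulsory features that a Usem description leaves unspecified."""
    missing = []
    for name, value in usem_features.items():
        for required in obligatory.get((name, value), ()):
            if required not in usem_features:
                missing.append(required)
    return missing


# Illustrative descriptions, not taken from an actual dictionary.
stool = {"classe sémantique": "siège", "a_un_dossier": "-", "pour_une_personne": "+"}
bench = {"classe sémantique": "siège", "pour_une_personne": "-"}
```

In this sketch the description of stool is complete with respect to the compulsory list, whereas bench would be reported as lacking a value for a_un_dossier.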

6.3. Semantic feature (Trait_Sem)

A Semantic feature is associated with a value to create a Valued feature (that has a number of properties described above).

A Semantic feature can be identified in the language, in which case, it refers to its lexicalizing Usem through the attribute usem_lexicalise.

It can be specific to the description of one language, or be a pivot, i.e. selected as a description element shared by several languages; this is the meaning of the attribute pivot.

A Trait_Semantique comprises a field "concerne" which indicates whether the feature is included in the description of the semantic unit of an affix (value AFFIXE: semantic characterization of affixes) or in the various elements integrated into the semantic description of the Usems (value AUTRE).

A semantic feature is characterized by a certain type. Following are the values provided (but the user can add elements to this list):

- PROPRIETE (general semantic feature)

- CLASSE (semantic class feature)

- DOMAINE (domain of activity or special field)

- DISTINCTIF (usually related to a class or a domain)

- PRAGMATIQUE

- DIVERS

- CONNOTATION

The status of a Trait_sémantique can be MONOVALUED (which means that an element can carry only one Trait_valué associated with this Trait_sémantique, because values exclude one another) or MULTIVALUED (which means that several Valued features associated with the same Semantic feature can be carried by the same object describing the semantic layer).
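
The status can be checked mechanically. The Python sketch below is an illustration under assumptions: valued features are (feature, value) pairs, the status table and the feature names are hypothetical, and the status labels are those used in the text above.

```python
from collections import defaultdict


def monovalued_violations(valued_features, status_by_feature):
    """Return the MONOVALUED features for which an object carries several values."""
    values_seen = defaultdict(set)
    for feature, value in valued_features:
        values_seen[feature].add(value)
    return sorted(
        feature
        for feature, values in values_seen.items()
        if status_by_feature.get(feature) == "MONOVALUED" and len(values) > 1
    )


# Hypothetical status table and description.
STATUS = {"classe": "MONOVALUED", "domaine": "MULTIVALUED"}
carried = [
    ("classe", "siège"),
    ("classe", "animal"),
    ("domaine", "ameublement"),
    ("domaine", "commerce"),
]
```

Only the feature "classe" is reported here, since "domaine" is allowed to carry several values on the same object.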

A semantic feature accepts certain types of values, specified by the attribute type_liste_valeurs. Here are the different types of values:

BINAIRE (+/-),

LISTE_FERMEE (list of values known in extension and specified),

LISTE_OUVERTE.

The list of values (when it is of the type LISTE_FERMEE) must be specified by valeurtrait_l which refers to the entities Valeur_trait.

The set of valued features associated with the same semantic feature can be structured. The structuring of the different Valued features associated with a Trait_sémantique with non-binary values is indicated by the attribute structure_liste_valeurs. This structuring is expressed locally through the ISA relation between Valued features.

(Binary features have a specific status; they are also structured in relation to each other. Certain valued features are inherited, and others are excluded: a value + makes it possible to exclude the value - on the same feature).

Trait_sémantique is partly characterized by the associated structure:

- TREILLIS_TOTAL: the set of associated Trait_valués forms a lattice.

- TREILLIS_PARTIEL: Valued features are integrated locally into lattice structures, and several lattices are allowed (consequently, certain Valued features can be isolated).

- HIÉRARCHIE_TOTALE: the set of associated Valued features is included in a tree that provides a structure for all the values.

- HIÉRARCHIE_PARTIELLE: the set of associated Valued features is included in one or more trees that provide a local structure for the values. Not all Valued features are necessarily integrated into a structuring relation.
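
The two hierarchy cases can be told apart by inspecting the ISA links. The sketch below is one possible reading, not a definition given by the model: it assumes that a tree means at most one "father" per value and no cycle, that a total hierarchy has exactly one root, and it does not attempt to verify the lattice cases. All value names are hypothetical.

```python
def classify_tree_structure(values, fathers):
    """Return a HIÉRARCHIE_* label, or None when the ISA links are not tree-shaped."""
    for value in values:
        if len(fathers.get(value, [])) > 1:
            return None  # several fathers: lattice-like, not a tree
    for value in values:
        seen = {value}
        current = value
        while fathers.get(current):
            current = fathers[current][0]
            if current in seen:
                return None  # a cycle is not a hierarchy
            seen.add(current)
    roots = [value for value in values if not fathers.get(value)]
    return "HIÉRARCHIE_TOTALE" if len(roots) == 1 else "HIÉRARCHIE_PARTIELLE"


# Hypothetical values of a semantic class feature and their ISA fathers.
FATHERS = {"siège": ["meuble"], "chaise": ["siège"], "table": ["meuble"]}
single_tree = classify_tree_structure(["meuble", "siège", "chaise", "table"], FATHERS)
with_isolated = classify_tree_structure(["meuble", "siège", "chaise", "table", "outil"], FATHERS)
```

In the second call the isolated value "outil" forms a root of its own, so the structure is only partial.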

A semantic feature can be associated with a list of Semantic features whose description can be pertinent (trait_sem_pertinent_l) or compulsory (trait_sem_obligatoire_l).

7. Weighted valued relation

7.1. General Points

The GENELEX model makes it possible to relate various objects through semantic relations. The selected modelling has been designed to provide a systematic weighting of relations between objects and to introduce the notion of weighted valued relation.

A weighted valued relation is therefore a description object carried by its source object, including a target object, a semantic relation between source and target, and the modality for establishing this relation.

Weighted valued relations are of different types depending on the nature of their source and target:

- a Usem source and a Usem target

- a Predicate source and a Predicate target

- a Concept source and a Concept target

- a Concept source and a Predicate target

- a Predicate source and a Concept target

As we have already seen, shifting from the Usem level to more abstract levels is carried out through the predicative structure that relates the Usem and the predicate, on the one hand, and through the conceptual representations that relate the Usem and the Concept, on the other. Apart from these objects, the semantic relations of the model do not associate the levels of representation; semantic relations are therefore applied at the purely lexical level between Usems or between objects of a more abstract nature.
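
The typing of relations by the nature of their source and target can be pictured as a lookup table. The Python sketch below is illustrative: it uses the semantic relation names of the model (R_Usem, R_Pred, R_Concept, R_Pred_Concept, R_Concept_Pred) and assumes that, in the two mixed names, the source is named before the target. The function name and the kind labels are hypothetical.

```python
# (source kind, target kind) -> relation entity.
# Assumption: mixed names read source first, then target.
RELATION_BY_ENDPOINTS = {
    ("Usem", "Usem"): "R_Usem",
    ("Predicat", "Predicat"): "R_Pred",
    ("Concept", "Concept"): "R_Concept",
    ("Predicat", "Concept"): "R_Pred_Concept",
    ("Concept", "Predicat"): "R_Concept_Pred",
}


def relation_kind(source_kind, target_kind):
    """Return the relation entity for a source/target pair, or fail if none exists."""
    try:
        return RELATION_BY_ENDPOINTS[(source_kind, target_kind)]
    except KeyError:
        raise ValueError(
            f"no semantic relation links {source_kind} to {target_kind}"
        ) from None
```

A request such as relation_kind("Usem", "Concept") fails, which reflects the point made above: semantic relations do not cross from the lexical level to the more abstract levels.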

Depending on the nature of the source and target objects, semantic relations will be distinct, and they will sometimes have different properties as well.

We will not study the various types of weighted valued relations here, nor the various types of semantic relations; we only need to be aware that these types are characterized by the nature of their source and target (semantic relations comply with the same typing). The SGML DTD can serve as a reference for examining the model in detail. The entity-relation diagrams are also informative.

7.2. Semantic relation

The nature of semantic relations varies according to the objects they relate and the level of their content.

7.3. Semantic relation between Usems (R_Usem)

Here we will describe semantic relations linking Usems (through Weighted valued relations between Usems) in more detail.

These relations can specify:

* the morpho-syntactic category of the source Usem

* the morpho-syntactic category of the target Usem

* the type of relation

The following set of types is provided by the GENELEX model (this set can be extended by the user):

- PARADIGMATIQUE

- DERIVATION

- COLLOCATION

* the relation sub-type, in order to characterize the semantic relation more precisely.

The following values are provided (the user can extend this list):

- SYNONYMIE

- CONTRAIRE

- OPPOSITION

- CONVERSE

- TAXINOMIE

- PARTIE_TOUT

(sub-types associated with the type PARADIGMATIQUE)

- STRICTE

- NON_STRICTE

(sub-types associated with the type DERIVATION)

A semantic relation can have a number of the properties that characterize binary relations: reflexivity, transitivity, symmetry, antisymmetry, order relation.

A semantic relation between Usems can identify its reverse semantic relation (source => target, target => source).

Certain semantic relations are incompatible; these are identifiable.

Semantic relations can be organized into a hierarchical structure (partial structure) that is expressed by the ISA relation established between them. The attribute ISA_l specifies the hierarchical structuring between relations by referring to the "mother" R_Usem of the described relation.
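
As an illustration of how a declared reverse relation can be exploited, the following Python sketch completes a set of links with their reverses. The relation names, the triple representation, and the function name are hypothetical and serve the example only.

```python
def with_reverse_links(links, reverse_of):
    """Add, for each link whose relation declares a reverse, the reversed link."""
    completed = set(links)
    for source, relation, target in links:
        reverse = reverse_of.get(relation)
        if reverse is not None:
            completed.add((target, reverse, source))
    return completed


# Hypothetical relations: each one names its reverse relation.
REVERSE_OF = {
    "HYPERONYME_DE": "HYPONYME_DE",
    "HYPONYME_DE": "HYPERONYME_DE",
}

links = {("siège", "HYPERONYME_DE", "chaise")}
completed = with_reverse_links(links, REVERSE_OF)
```

Coding a link in one direction is then sufficient; the link in the other direction can be derived rather than described twice.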

7.4. Semantic relation between the other objects (R_Pred, R_Concept, R_Pred_Concept, R_Concept_Pred)

The other semantic relations have a number of properties in common with the relations between Usems: reverse relation, incompatible relations, hierarchical relations through the ISA relation, and equivalence.

They can be independent of the language or not; this is the meaning of the pivot attribute.

They also have a type with the following values (the list can be extended by the user):

- GÉNÉRALISATION

- PARTICULARISATION

(in the cases when the source and target have the same nature)

- ESSENTIEL

and a sub-type with totally free values.

The relations of generalization/particularization are important because they partly structure objects from the most lexical to the most abstract; in that case, the system of inheritance of weighted valued features must be applicable (if this represents a choice of lexicographic strategy).

7.5. Correspondence between arguments (Corresp_Arg_Arg)

When a semantic relation is established between Predicates or predicative Usems, two predicative structures are in relation with each other, and it is important to establish a correspondence between the predicate arguments implied as the source or target of a weighted valued relation. This is the role of the correspondence between arguments, which associates two arguments in the following way:

the paths leading to the source and target Arguments are indicated: chemin_arg_source and chemin_arg_cible

Certain additional data, which can be added into the correspondence, concern the source argument (informe_arg_precise_source_l) or the target argument (informe_arg_precise_cible_l) in the form of InformeArg. It is therefore possible to constrain the acceptable semantic interpretations of certain arguments in the correspondence, and to specify the values of arguments that are not integrated into the correspondence of predicates (when the predicates do not have the same number of arguments).
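
The following Python sketch illustrates the principle with two converse predicates. It is a simplification under assumptions: each path is reduced to a single rank, ranks are counted from 1, each correspondence is a dictionary, and the predicates and their argument order are invented for the example.

```python
def transfer_arguments(source_args, correspondences, target_arity):
    """Place each source argument at the target rank given by its correspondence."""
    target_args = [None] * target_arity
    for corresp in correspondences:
        source_rank = corresp["chemin_arg_source"][0]  # assumed 1-based
        target_rank = corresp["chemin_arg_cible"][0]   # assumed 1-based
        target_args[target_rank - 1] = source_args[source_rank - 1]
    return target_args


# Hypothetical converse pair: acheter(buyer, goods, seller) / vendre(seller, goods, buyer)
CORRESPONDENCES = [
    {"chemin_arg_source": [1], "chemin_arg_cible": [3]},
    {"chemin_arg_source": [2], "chemin_arg_cible": [2]},
    {"chemin_arg_source": [3], "chemin_arg_cible": [1]},
]

sold = transfer_arguments(["Marie", "un livre", "Paul"], CORRESPONDENCES, 3)
```

Target arguments that receive no correspondence remain unfilled (None in this sketch); in the model, this is where the additional InformeArg data described above can specify their values.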

8. Correspondence between syntax and semantics (Corresp_Usyn_Usem)

The correspondence between syntax and semantics is established by means of correspondence elements (Corresp_Usyn_Usem) carried by the Usyns. For a given Usyn, the number of Corresp_Usyn_Usem is equal to the number of Usems pointed to by this Usyn. The number of Corresp_Usyn_Usem in which a Usem participates corresponds to the number of associated syntactic contexts (each one is described by the Usyn that includes the Corresp_Usyn_Usem).

A Usyn corresponds to one or more Usems and a Usem can also correspond to one or more Usyns. When a correspondence is established between a Usyn and a Usem, one may need additional data concerning the modalities of this correspondence: this is what the Correspondence makes it possible to explain.

8.1. Correspondence

Correspondence gives access to these additional data:

- data which makes it possible to specify the expression on the surface of a predicate argument at the semantic level (in short, it consists in establishing a correspondence between predicate arguments and construction positions). This information is carried by Correspondences between Argument and Position which are simple (Corresp_Arg_Pos_Simple) when the positions are present in the description of the syntactic layer, or associated with "floating positions" (Corresp_Arg_Pos_Flottant). The number of Corresp_Arg_Pos_(Simple or Floating) is equal to the number of arguments to be specified.

- filtering data concerning the syntactic description, which make it possible to reduce, from the syntactic or semantic point of view, the sub-set of surface realizations allowed by the Usem. This is the meaning of the object ContraintDescription which describes the restrictions imposed on the base description of the Usyn.

- data concerning the "implicit" semantic contribution specific to the construction. This contribution concerns the predicate arguments associated with the predicative Usem. These data are expressed with InformeArg.

8.2. Simple and floating correspondence between Argument and Position (Corresp_Arg_Pos_Simple, Corresp_Arg_Pos_Flottant)

Corresp_Arg_Pos_Simple makes it possible for a predicate argument to know what its surface correspondence is, based on the syntactic description of the Usyn that comprises the realization of the argument.

Corresp_Arg_Pos_Simple associates an argument of the semantic layer with a syntax position by giving access to each of these two elements.

Corresp_Arg_Pos_Simple makes it possible to point at a syntax Position, whether this Position belongs to the external construction, the internal structure, or a composition. The attribute portee specifies which of these applies.

The CheminPosition is traced from the external construction, the internal structure, or the construction of a component. In the case of a composition, the composition concerned (nieme_composition, "nth composition") and, within this composition, the component concerned (nieme_composante) are specified; the path must then be followed from the external construction of the component. The attributes nieme_composition and nieme_composante are used only in the case of a composition.

The attribute chemin_arg makes it possible to access the Argument corresponding to a Position. It is identified by a series of ranks (usually only one) which makes it possible, at each stage, to access an nth argument that may be simple (the path necessarily stops), or a predicative structure itself (the nth argument of this structure is accessed through the following rank in the list).
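
The way a series of ranks is followed can be sketched in Python as below. The representation is an assumption made for the example: a predicative structure is a list of arguments, a simple argument is a string, and ranks are counted from 1; the argument labels are hypothetical.

```python
def follow_chemin_arg(arguments, chemin_arg):
    """Follow a series of ranks through nested predicative structures."""
    if not chemin_arg:
        raise ValueError("chemin_arg must contain at least one rank")
    current = arguments
    for rank in chemin_arg:
        if not isinstance(current, list):
            # The previous rank reached a simple argument: the path must stop there.
            raise ValueError("path continues past a simple argument")
        current = current[rank - 1]  # ranks assumed to start at 1
    return current


# Hypothetical structure: the second argument is itself a predicative structure.
ARGUMENTS = ["agent", ["cause", "theme"], "goal"]

first = follow_chemin_arg(ARGUMENTS, [1])
nested = follow_chemin_arg(ARGUMENTS, [2, 1])
```

A single rank is the usual case; a second rank is meaningful only when the first one designates a predicative structure.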

Corresp_Arg_Pos_Flottant makes it possible to associate an essential Predicate argument with its family of syntactic realizations when it corresponds to an element that is not described at the syntactic level in the base description, owing to the fact that it is not an essential element of the syntax. The Position that is absent from the base construction (and therefore from the description in syntax) must be semantically described, and included in the syntactic description.

Certain predicate Arguments are associated with "adjunct complements" (circumstants, modifiers, etc.), which are not specified for the Usyn but which, according to semantics, must be described. Their description is made possible by Corresp_Arg_Pos_Flottant. This allows greater independence between the coding of the syntactic and semantic layers.

The attribute position points to a Position that describes the family of realizations in syntax (and may also specify, where applicable, the function, thematic roles, and a distribution).

CheminSyntagme makes it possible to specify, if rewriting, the level of visual insertion of the Position associated with the Arg_flottant by pointing to the rewritten syntagm.

The floating position is then inserted into the rewriting list of this syntagm, and the function and thematic roles are specified in relation to the head of the syntagm. The rank of insertion into this list will not be specified; this will be a floating position.

The absence of CheminSyntagme implies that the insertion is made at the highest level, i.e., depending on the case (specified by the attribute portee), in the list of Positions of the external Construction (the most common case), of the internal structure (in the case of a composition described by a structure syntagm), or of one composition element (specified by nieme_composition and nieme_composante, which are used in the case of a composition described by a calculation of composition; in this case, there can be no CheminSyntagme). It will be possible to point at the right level by selecting the desired component.

8.3. ContraintDescription

A ContraintDescription makes it possible to restrict the set of syntactic realizations among those virtually described by the base description of the Usyn and to retain only those which are compatible with the sense identified by the semantic unit.

This filtering is carried out by constraining the realizations of Self (0 or 1 ContraintIntervConst), the external construction (0 or 1 ContraintConstruction), or the internal structure (0 or 1 ContraintStructInterne and 0 or 1 Contraint_mdc), depending on the choice of model instantiation. Several of these elements can be present at the same time; however, a ContraintStructInterne and a Contraint_mdc can be present simultaneously only if the constraints they express are compatible and consistent.

Moreover, at least one of these elements must be present: a ContraintDescription cannot be completely void of information (in fact, it is an optional element of the Correspondence).
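
The presence rule stated above lends itself to a simple check, sketched here in Python. The dictionary representation and the function name are assumptions; the compatibility of a ContraintStructInterne with a Contraint_mdc is not verified by this sketch.

```python
PARTS = (
    "ContraintIntervConst",
    "ContraintConstruction",
    "ContraintStructInterne",
    "Contraint_mdc",
)


def check_contraint_description(description):
    """Return the problems found in a ContraintDescription given as a dict."""
    problems = []
    unknown = sorted(set(description) - set(PARTS))
    if unknown:
        problems.append("unknown element(s): " + ", ".join(unknown))
    present = [part for part in PARTS if description.get(part) is not None]
    if not present:
        problems.append("at least one constraint element must be present")
    return problems


# Hypothetical contents; only presence is examined here.
valid = check_contraint_description({"ContraintConstruction": "..."})
empty = check_contraint_description({})
```

Using one dictionary key per element also enforces the "0 or 1" cardinality of each element by construction.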

8.3.1. ContraintIntervConst

Through ContraintIntervConst, it is possible to restrict the possible realizations of Self to 1 or n syntagms (among the syntagms included in the Self as participant in the construction).

ContraintIntervConst points to a number of ContraintSyntagme that is equal to the number of syntagms to be constrained by inhibiting or adding restrictive features (belonging to the model of the syntactic layer) or semantic characteristics.

When ContraintIntervConst is absent, it means that all the realizations of the Self, taken as participant in the construction, are allowed.

8.3.2. ContraintConstruction

ContraintConstruction constrains the external construction by constraining one or more of its positions, by making a position compulsory or prohibited (if the position is optional in the Construction), or by constraining its distribution. There are as many ContraintPosition as positions constrained in the Construction, whether they are inhibited or made compulsory, or whether their distribution is constrained, and whatever their level of depth; the CheminPosition of ContraintPosition must be followed from the base construction.

8.3.3. ContraintStructInterne

The role of ContraintStructInterne is similar to the role of ContraintConstruction; the only difference lies in the fact that the positions refer to the internal structure. There are as many ContraintPosition as constrained positions in the internal structure; the CheminPosition of ContraintPosition must be followed from the Internal Structure.

8.3.4. Contraint_mdc

Contraint_mdc is used to constrain correspondence with the semantics of compounds described by the Mode of Composition. Contraint_Position_mdc specifies, for each constrained position, the component of the composition for which it represents a position. The position, once it is identified, is constrained by a ContraintPosition like the positions of the internal Construction or Structure. There are as many ContraintPosition_mdc as constrained positions in the mode of composition.

8.3.5. ContraintPosition

ContraintPosition makes it possible to access a position (from the external construction, the internal structure, or the composition depending on the element involving the ContraintPosition) by means of a CheminPosition (of the syntax) which is followed from the base construction, the internal structure, or a component (in the case of the description of Usyn by Composition).

modif_optionnalité makes it possible to specify whether an optional position, within the description, becomes compulsory, or is prohibited for the pointed sense (usem).

A list of constrained syntagms (ContraintSyntagme) makes it possible to specify additional associated syntagms and constraints at the distribution level of the constrained position. There will be as many ContraintSyntagme as syntagms on which restrictions are intended to be imposed. Distribution is never modified by relaxing constraints or by adding new syntagms, but always by imposing an increasing number of constraints.

Besides, a syntagm can be filtered or enriched from the semantic point of view with semantic features.

8.3.6. ContraintSyntagme

ContraintSyntagme makes it possible to select a syntagm (in a distribution) and to specify whether it is inhibited and is therefore absent from the distribution associated with the sense. If it is not inhibited, it may be restricted by adding restrictive features of the syntax. It may also be constrained with respect to the semantic information associated with its semantic interpretation (through AjouteTrait_Sem).

AjouteTrait_Sem makes it possible to filter at the level of the semantic interpretation associated with the syntagm by verifying the presence of a feature that must necessarily be present in order to allow correspondence (status FILTRE) or that may be present or added, provided it is compatible with the semantic representation (status FILTRE_AJOUTE). AjouteTrait_Sem can also systematically enrich (without verification) the associated semantic representation (status FORCE).
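
The three statuses can be contrasted with a short Python sketch. It is an illustration under assumptions: the semantic representation is reduced to a dictionary of feature values, "compatible" is simplified to "no different value already present for the same feature", and a rejected correspondence is signalled by None.

```python
def apply_ajoute_trait_sem(representation, feature, value, status):
    """Return the resulting representation, or None if the correspondence is rejected."""
    current = representation.get(feature)
    if status == "FILTRE":
        # The feature must already be present with this value.
        return dict(representation) if current == value else None
    if status == "FILTRE_AJOUTE":
        # Present with the same value, or absent and then added.
        if current is not None and current != value:
            return None
        return {**representation, feature: value}
    if status == "FORCE":
        # Enrichment without verification.
        return {**representation, feature: value}
    raise ValueError(f"unknown status: {status}")


# Hypothetical semantic representation associated with a syntagm.
REPRESENTATION = {"humain": "+"}
```

The function never modifies its input: each call returns a new dictionary, so that several alternative constraints can be tried against the same representation.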

G - ENTITY-RELATION DIAGRAMS

1. Semantic unit, Predicative representation, Conceptual representation

 

 

 

 

2. Affix Usem (Usem_Aff)

 

 

3. Predicate, List of predicates, Argument, Variable, Semantic role

 

4. Concept

 

5. Weighted valued feature, Valued feature, Feature value, Semantic feature

 

6. Weighted valued relations (-pred, -concept, -Usem)

7. Weighted valued relations (-pred-concept, -concept-pred)

 

8. Semantic relations (Usem-Usem, pred-pred, concept-concept, pred-concept, concept-pred)

9. SelectEtPréciseArg, Informe argument, Instantiated argument

10. Correspondence between syntax and semantics

 

 

 

11. Contraint description, Contraint IntervConst, Contraint mdc, Contraint construction, Contraint struct interne, Contraint position, Contraint syntagme

H - DTD SGML

I - Introduction - Translation of the Conceptual Model

The conceptual model GENELEX has been expressed to a great extent in terms of Entity-Attribute-Relation models (Merise).

Many integrity constraints are expressed in this formalism: type of objects, type of relations, cardinality of relations, etc. However, since this formalism was not designed to express rules (doing so gives rise to extreme complications), certain constraints had to be expressed in the accompanying document (restrictions on the combinations of values). It follows that the conceptual model of GENELEX is a combination of the Entity-Attribute-Relation (EAR) formalism and of natural language comments.

An SGML DTD (Document Type Definition) is a physical model, in the form of a grammar, that describes the markup of data.

When shifting from the Conceptual Model to the GENELEX DTD we have attempted to systematically translate the EAR models and to formally express most of the integrity constraints described in natural language.

Certain rules have been applied for the translation from EAR formalism to SGML. These are:

(i) EAR entities become SGML elements.

(ii) Attributes of EAR entities become Attributes of Elements.

When the values of an Attribute are mutually exclusive and constitute a closed vocabulary, they are represented as an enumerated list of SGML attribute values.

(iii) Non-attributed relations pointing to a non-shared EAR entity are expressed by hierarchical links between the Elements of the DTD. Their cardinality is expressed by SGML occurrence indicators: ? + *

(iv) Non-attributed Relations pointing to a shared EAR entity are expressed by reference links between Elements.

(v) Attributed relations are expressed in the form of SGML attributed elements connected (by means of hierarchy or reference relation) with the Elements translating the EAR Entities.
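
Rule (iii) can be made concrete with a small Python sketch that maps a cardinality to the corresponding SGML occurrence indicator. The function name and the (minimum, maximum) representation, in which None stands for an unbounded maximum, are assumptions made for the example.

```python
def occurrence_indicator(minimum, maximum):
    """Map an EAR cardinality to an SGML occurrence indicator ('' means exactly one)."""
    if (minimum, maximum) == (1, 1):
        return ""   # required, not repeatable: no indicator
    if (minimum, maximum) == (0, 1):
        return "?"  # optional
    if minimum == 1 and maximum is None:
        return "+"  # required and repeatable
    if minimum == 0 and maximum is None:
        return "*"  # optional and repeatable
    raise ValueError("cardinality not expressible with a single occurrence indicator")


# Example: a content token for an entity that may occur zero or more times.
token = "Usem_Aff" + occurrence_indicator(0, None)
```

Cardinalities with other bounds (for instance, exactly two) cannot be written with a single indicator and would have to be expressed differently, for example in the constraints file.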

A file of constraints ("sémant.ctr") has been created to facilitate the reading of cross-references. These constraints appear as comments and are therefore not taken into account by an SGML parser; they show the reference types in an intuitive syntax.

 

II - DTD GENELEX commented

1. DTD genelex.dtd

<!--Consortium GENELEX @(#) genelex.dtd 3.1@(#) 94/02/25 18:26:42 -->

<!-- **************FOR THE BENEFIT OF USERS ******************

Your comments concerning the DTD will be studied by the GENELEX

consortium which will ensure the circulation of any new version.

*************************************************************** -->

<!DOCTYPE Genelex [

<!ENTITY % ISOlat1 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN">

%ISOlat1

<!ENTITY % CustEnt PUBLIC "-//GLX-TEAM//ENTITIES Custom Entity Set//FR">

<!ENTITY % MorpEnt PUBLIC "-//GLX-TEAM//ENTITIES Morphologie Entity Set//FR">

<!ENTITY % SyntEnt PUBLIC "-//GLX-TEAM//ENTITIES Syntaxe Entity Set//FR">

<!ENTITY % SemEnt PUBLIC "-//GLX-TEAM//ENTITIES Semantique Entity Set//FR">

%CustEnt

%MorpEnt

%SyntEnt

%SemEnt

<!--

A Genelex document is made of several parts:

- the morphological description

- the syntactic description

- ...

To select a part, simply specify the appropriate

keyword (INCLUDE or IGNORE) in the following

entity declarations:

-->

<!ENTITY % isMor "INCLUDE" >

<!ENTITY % isSyn "INCLUDE" >

<!ENTITY % isSem "INCLUDE" >

<!ELEMENT Genelex - O (GenelexMorpho? & GenelexSyntaxe?

& GenelexSemant? & CombVE*)>

<!ATTLIST Genelex

nom CDATA #REQUIRED

langue CDATA #REQUIRED

version CDATA #IMPLIED

date_creation1 CDATA #IMPLIED

date_creationglx CDATA #IMPLIED

date_modif CDATA #IMPLIED

propriete CDATA #IMPLIED

copyright CDATA #IMPLIED

integrite (SANS_B|%pBooleen) SANS_B

certification CDATA #IMPLIED>

<!-- *********************************************************** -->

<!ENTITY % pGlose

"appellation CDATA #IMPLIED

exemple CDATA #IMPLIED

commentaire CDATA #IMPLIED">

<!-- As a general rule, throughout the file:

- appellation : makes it possible to name the object in a comprehensible

and, if possible, univocal way

- exemple : makes it possible to illustrate its usage (a quotation from an author or an example

taken from a dictionary or used by a linguist)

- commentaire : free user field

-->

<!-- *********************************************************** -->

<!ELEMENT CombVE - O EMPTY>

<!ATTLIST CombVE

id ID #REQUIRED

datation (SANS_D|%pDatation) SANS_D

niveaulgue (SANS_NL|%pNiveauLgue) SANS_NL

frequence (SANS_F|%pFrequence) SANS_F

vargeog CDATA #IMPLIED>

 

<![ %isMor [

<!ENTITY % GLXmor PUBLIC

"-//GLX-TEAM//DTD Description Morphologie//FR">

<!ENTITY % MorpCtr PUBLIC

"-//GLX-TEAM//DTD Contraintes Morphologie//FR">

%GLXmor

%MorpCtr

]]>

<![ %isSyn [

<!ENTITY % GLXsyn PUBLIC

"-//GLX-TEAM//DTD Description Syntaxe//FR">

<!ENTITY % SyntCtr PUBLIC

"-//GLX-TEAM//DTD Contraintes Syntaxe//FR">

%GLXsyn

%SyntCtr

]]>

<![ %isSem [

<!ENTITY % GLXsem PUBLIC

"-//GLX-TEAM//DTD Description Semantique//FR">

<!ENTITY % SemaCtr PUBLIC

"-//GLX-TEAM//DTD Contraintes Semantique//FR">

%GLXsem

%SemaCtr

]]>

]>

 

2. DTD semant.dtd

 

<!--Consortium GENELEX @(#) semant.dtd 2.1@(#) 94/09/21 17:30:20 -->

<!-- **************FOR THE BENEFIT OF USERS ******************

Your comments concerning the DTD will be studied by the GENELEX

consortium which will ensure the circulation of any new version.

*************************************************************** -->

 

<!ELEMENT GenelexSemant - O (

Usem+

& Usem_Aff*

& Predicat*

& Argument*

& InformeArg*

& PredInstancie*

& Role_Sem*

& Concept*

& Trait_Sem_ValPond*

& Trait_Sem_Value*

& ValeurTrait*

& Trait_Sem*

& R_Usem*

& R_Pred*

& R_Concept*

& R_Pred_Concept*

& R_Concept_Pred*

& ListePred*

& Correspondance*

& ContraintDescription*

& ContraintIntervConst*

& ContraintSyntagme*

& ContraintConstruction*

& ContraintStructInterne*

& ContraintPosition*

& Corresp_Arg_Pos_Simple*

& Corresp_Arg_Pos_Flottant* )>

<!ENTITY % pModalite

"ponderation (SANS_POND

|%pPonderation

|%pPonderation_cust) SANS_POND

ponderation_num NUMBER #IMPLIED

point_de_vue CDATA #IMPLIED">

<!-- Modalite makes it possible to express the modality of the element carried by the semantic description: this will be mainly

a Trait_Sem_ValPond or a R_ValPond_Sem_...

Modality is expressed by various attributes:

symbolic weighting with certain values provided by the

model, whose list of values can be customized,

numerical weighting,

point of view that makes it possible to express a specific viewpoint

concerning the dictionary: general point of view or specific domain.

-->

<!-- ********************************************************* -->

<!-- ***** SEMANTIC UNIT ***** -->

<!-- ********************************************************* -->

<!ENTITY % pUsemAtt

"%pGlose

definition_libre CDATA #IMPLIED

definition_formelle CDATA #IMPLIED

combve IDREF #IMPLIED

trait_sem_valpond_l IDREFS #IMPLIED">

<!-- Groups various attributes common to the Usems and Usem_Aff:

definition_libre is an entirely free field designed for a

human reader. It includes the definition of the paper dictionary,

which is useful to the lexicographer. This field is not controlled, and the model does not impose any rules regarding the structure or syntax of the definition,

definition_formelle is another entirely free field that

can include the formal definition. It is useful, for example, for associating

a logical formula with determiners. It can also include the

lambda-calculus expression associated with the entry, if desired.

As for the preceding field, its content is not controlled.

combve points to a CombVE (combination of usage values). This element is also present in the morphological and syntactic

layers. It groups information concerning dating, language level,

frequency and geographical variation.

trait_sem_valpond_l refers to the associated

Trait_Sem_ValPond. These are the semantic characteristics that belong to the decompositional axis of the semantic layer. Lexical data that are specific to a Usem, rather than shared with other Usems through the shared abstract descriptive objects, should preferably be recorded here.

-->

<!ELEMENT Usem - O (RepresentationPredicative?,

RepresentationConceptuelle_Pond*,

R_ValPond_Usem*) >

<!ATTLIST Usem

id ID #REQUIRED

%pUsemAtt>

<!-- In most cases, the Usem (Semantic unit) describes one

sense of a simple or compound Um (Morphological unit), in the

syntactic contexts described by one or more of its Usyns

that may be filtered.

The Usem may also describe one sense of a compound described in

syntax. In this case, it is not drawn from one Um, but from one or

more families of Usyns or Ums (compositions), and does not describe

the sense separately, but considers it as a whole from the linguistic

point of view.

A Usem (like a Um or a Usyn) is related to a given language,

and makes it possible to specify a lexical semantic description.

It is the entry point of the semantic layer. Its description

relies on objects representing various abstractions on the sense

components that must be assigned to it; some of them have a

decompositional or analytic status, others are generalizations or

abstractions made on the basis of the Usems.

The Predicate and the Concept are the description elements on which

the Usem relies and through which it can access the abstract

description level (partially independent of lexical particularities).

Connection with the "abstract" sub-layer is made by means of embedded

entities

RepresentationConceptuelle_Pond and

RepresentationPredicative,

that associate the Usem with a predicate (with certain modalities) and/or one or more concepts (with certain modalities and a weighting). The predicative Usem carries its specific Trait_Sem_ValPond. The associated predicate can be shared; it has its own description and also carries Trait_Sem_ValPond. Everything that can be factorized must be factorized: data shared by various Usems that share the same predicate should preferably be carried by the predicate.

Likewise, several Usems pointing to the same Concept with the

same weighting (DEFINITORY, for example) preferably carry data specific to them, and factorize what they have in common on the unique

description of the Concept.

The Usem can be a master element in its association with a predicate or a concept.

Note that the data carrying a DEFINITORY or PROTOTYPICAL weighting (the use of these weightings is not

compulsory) make it possible to build a "definition" of the Usem.

-->

<!ENTITY % pRepresentation

"vedette (SANS_B|%pBooleen) SANS_B

type_de_lien CDATA #IMPLIED">

<!-- This entity gathers two attributes:

type_de_lien specifies the association between Usem and

Concept or Predicate. In the case of the Predicate, the semantics of the link can be specified more precisely (for example: capacity to be Arg0).

vedette (master) makes it possible to identify the Usem which takes priority as the lexicalization of a concept or a predicate.

-->

<!ELEMENT RepresentationPredicative - O (SelectEtPreciseArg*)>

<!ATTLIST RepresentationPredicative

%pRepresentation

chemin_arg_concerne NUMBERS #IMPLIED

arg_inclus (SANS_I|%pArgInclus) SANS_I

predicat IDREF #REQUIRED>

<!-- Makes it possible to associate a Usem with a predicate by specifying the modalities of this association. The Usem may be the master element (vedette), which means that it is the privileged lexical representative of the Predicate, and that this is a named or lexicalized Predicate (see the element Predicat).

chemin_arg_concerne specifies whether an argument is

particularly concerned by a Usem pointing to the predicate

and indicates the access path to this argument. This is a path,

because a predicative structure can be the argument of a predicate.

The chemin_arg_concerne is an ordered series of NUMBERS which must be

interpreted as follows: the first number i designates the argument of rank

i of the predicate pointed to; the following number, if there is one,

designates an argument of the predicate that occupies that argument position,

and so on.

arg_inclus specifies whether the argument concerned is included.

(ex: acheteur: predicate acheter, argument 0 included)

SelectEtPreciseArg makes it possible to add information on the arguments of the associated predicate.

-->
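<!-- Illustrative sketch, not part of the DTD: one possible coding of the
example above (acheteur, predicate acheter, argument 0). The identifiers
USEM_acheteur and PRED_acheter are hypothetical. arg_inclus is left at its
default here; to state that the argument is included it would take one of the
values of the entity pArgInclus, which is defined elsewhere.

<Usem id="USEM_acheteur" appellation="acheteur">
<RepresentationPredicative predicat="PRED_acheter" chemin_arg_concerne="0">
</Usem>

chemin_arg_concerne="0" designates the argument of rank 0 in the argument_l
list of the predicate pointed to. A longer path such as "2 0" would designate
the argument of rank 0 of the predicate that fills the argument of rank 2.
-->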

<!ELEMENT RepresentationConceptuelle_Pond - O EMPTY>

<!ATTLIST RepresentationConceptuelle_Pond

%pRepresentation

%pModalite

commentaire CDATA #IMPLIED

concept IDREF #REQUIRED>

<!-- Makes it possible to associate a Usem with a concept by specifying the modalities of this association. The Usem can be the master element (vedette), which means that it takes priority as the lexical representative of the concept, and that this is a named or lexicalized concept (see the element Concept).

This association carries symbolic and/or numeric weightings and a viewpoint.

-->

<!-- ********************************************************* -->

<!-- ***** SEMANTIC UNIT OF AN AFFIX ***** -->

<!-- ********************************************************* -->

<!ELEMENT Usem_Aff - O (RepresentationConceptuelle_Pond_Aff*)>

<!ATTLIST Usem_Aff

id ID #REQUIRED

%pUsemAtt

predicat IDREF #IMPLIED>

<!-- Usem_Aff (or Semantic unit of an affix) comes directly from

an affix Um; it provides the minimum description of its meaning core or characterizes it from the semantic point of view. This information allows for a minimum prediction of the sense of regular derivations

or of produced forms that are not included in the dictionary.

A Usem_Aff can be described with Trait_Sem_ValPond, Concepts and/or a

Predicate; these elements can be combined as follows:

- a Predicate with or without a list of Trait_Sem_ValPond,

and/or - Concepts with or without a list of Trait_Sem_ValPond,

or - a list of Trait_Sem_ValPond

It can also be assigned a definition (free and/or formal) and a CombVE.

-->

<!ELEMENT RepresentationConceptuelle_Pond_Aff - O EMPTY>

<!ATTLIST RepresentationConceptuelle_Pond_Aff

%pModalite

commentaire CDATA #IMPLIED

concept IDREF #REQUIRED>

<!-- Associates a Concept with a Usem_Aff in order to describe it, together with a set

of data describing the modality of the association: symbolic or numeric weighting,

and point of view. -->

<!-- ********************************************************* -->

<!-- ***** PREDICATE AND CONCEPT ***** -->

<!-- ********************************************************* -->

<!ENTITY % pPredOuConcept

"%pGlose

type (SANS_TPC

|%pTypePredOuConcept) SANS_TPC

pivot (%pBooleen) NON

definition_libre CDATA #IMPLIED

definition_formelle CDATA #IMPLIED

trait_sem_valpond_l IDREFS #IMPLIED">

<!-- Entity that groups the attributes common to Predicates and Concepts.

type specifies the type of Predicate or Concept; several values are provided in the model:

- LEXICAL: drawn directly from a Usem, whether this is a master element or not

- GENERALISE: not drawn directly from a Usem; introduced

as the generalization of another Concept or Predicate, which may be

lexicalized, and itself generalized by another Concept or Predicate

(and therefore not PRIMITIF)

- PRIMITIF: not drawn directly from a Usem, and does not allow for a generalization relation. It may be provided as a pre-existing or pre-defined description element, which may be independent of the language (the pivot attribute specifies this)

- LEXICAL_PRIMITIF: drawn directly from the lexicon, without

generalization (and possibly a pivot)

- TROU_LEXICAL: situated between two lexical Concepts or

Predicates: a non-lexicalized generalization of one of them, itself generalized

by a LEXICAL concept or predicate (concepts of this type will be

particularly useful to describe a number of taxonomies including

"gaps" at the lexical level).

pivot specifies whether the Concept or Predicate has been

selected as pivot description element, i.e. descriptive element shared

independently of languages.

definition_libre is an entirely free field designed for a

human reader. It includes a textual definition or every expression

that may be useful to the lexicographer. This field is not controlled,

and the model does not impose any rules regarding the structure or syntax of the definition,

definition_formelle is another entirely free field that

can include the formal definition. As in the preceding field, its content is not controlled.

trait_sem_valpond_l refers to the Trait_Sem_ValPond associated

with the Predicate or Concept; these are semantic characteristics

related to the decompositional axis of the semantic layer.

Inheritance mechanisms will be implemented to avoid specifying all

features at every level (a relation of type PARTICULARISATION

will imply the inheritance of the data carried by the "father").

-->

<!ELEMENT Predicat - O (R_ValPond_Pred* & R_ValPond_Pred_Concept*

& Assoc_ListePred*)>

<!ATTLIST Predicat

id ID #REQUIRED

%pPredOuConcept

argument_l IDREFS #REQUIRED>

<!-- A predicate is one of the main elements of the semantic model:

a Usem described as predicative will rely on a Predicate of LEXICAL

type for its predicative nucleus.

argument_l refers to the Predicate arguments, which

represent its essential descriptive elements (Arguments represent for the Predicates what Positions represent for Constructions...).

The list of Arguments (argument_l) is a list with a pertinent order;

for a given Predicate, it associates an argument with its rank in the list (numbered from 0).

From the outside, for a certain predicate, one of its arguments is

pointed to according to its rank.

Arguments do not explicitly bear their rank as an attribute,

because the rank is a coding requirement (to access an argument from the

outside) rather than a pertinent datum from the linguistic point of

view.

In the list, two elements may be identical, which means that they

point at the same Argument: this is a list, not a set.

R_ValPond_Pred and R_ValPond_Pred_Concept carry the weighted

valued relations with a predicate as a source; the target is respectively a Predicate or a Concept.

Assoc_ListePred points to the lists of instantiated arguments

that may be associated with the predicate.

-->

<!ELEMENT Argument - O EMPTY>

<!ATTLIST Argument

id ID #REQUIRED

%pGlose

role_sem_l IDREFS #REQUIRED

informe_arg_decrit_l IDREFS #IMPLIED>

<!-- An argument is closely related to the definition of the notion of predicate, and represents one of its totally essential elements: a predicate is a predicate if it includes at least one Argument.

A predicate carries a certain amount of data concerning each of its arguments. As description elements, Arguments may be shared by several Predicates and may appear at different ranks in the argument lists of these Predicates (just as, in syntax, Positions can be shared by several Constructions; the Predicate is to its Arguments what the Construction is to its Positions).

role_sem_l points to one or more Role_Sem (at least one)

informe_arg_decrit_l points to InformeArg that describe or constrain the argument at the semantic level.

-->

<!ELEMENT InformeArg - O EMPTY>

<!ATTLIST InformeArg

id ID #REQUIRED

commentaire CDATA #IMPLIED

statut (SANS_S

|%pStatutInfoArg

|%pStatutInfoArg_cust) SANS_S

usem IDREF #IMPLIED

pred_instancie IDREF #IMPLIED

concept_l IDREFS #IMPLIED

trait_sem_valpond_l IDREFS #IMPLIED>

<!-- This element intervenes to add data gathered on a predicate argument. The Predicate knows its Arguments and these data

on its Arguments are expressed through InformeArg carried by the

Arguments themselves. A Usem is described with a Predicate and

possibly some added information on the Arguments (via

SelectEtPreciseArg). The correspondence between Usyn and Usem can also

add data to the Predicate arguments associated with the Usem

(also via SelectEtPreciseArg), when the syntactic context "forces"

one or more arguments at the semantic level. All these data are

gathered; each level provides specific elements, but contribution is

not compulsory.

The status of the Argument-related data makes it possible to assign

various modalities, mainly related to the management of defaults, and

verifications or construction of the semantic representation of the

Argument.

statut: an InformeArg carries a set of data on an argument;

all these data have the same "status" according to the value of this

attribute:

- DEFAUT: supplies information about the argument in the absence of its semantic representation: the argument may be absent from the surface, or occupied by a semantically "empty" element.

- VERIF: when the existing argument has a semantic representation, this representation must comply with the constraints stated here. For example, it must carry the features pointed to by trait_sem_valpond_l, or have the Concepts pointed to by concept_l as generalizations.

- ENRICHIT: adds semantic data to the argument data, whether there are any or not.

- DEFAUT_VERIF: combines default contribution, if there are no data, with verification, if arguments are semantically "full".

usem points to a Usem.

pred_instancie points to an instantiated predicate

PredInstancie.

concept_l points to a list of Concepts, usually a unique Concept; the case >1 is used to insert a polyhierarchy of Concepts which the argument should inherit.

trait_sem_valpond_l points to a list of weighted valued semantic features.

Several InformeArg can concern the same argument.

Among the various optional data, at least one (that must be different from the comment) must be indicated: an InformeArg necessarily brings information on the argument.

-->

<!ELEMENT SelectEtPreciseArg - O EMPTY>

<!ATTLIST SelectEtPreciseArg

chemin_arg NUMBERS #REQUIRED

informe_arg_precise IDREF #REQUIRED>

<!-- Makes it possible to add information to the predicate arguments, from the outside by passing through the predicate.

chemin_arg is the path leading to the argument: this is a series of ranks, usually one, but sometimes more if the argument pointed to at the first level is itself a predicate that includes an argument that must be pointed to.

informe_arg_precise points to the InformeArg comprising

the semantic information which specifies the argument by adding to the data provided by the argument itself.

-->

<!ELEMENT PredInstancie - O (SelectEtPreciseArg*)>

<!ATTLIST PredInstancie

id ID #REQUIRED

%pGlose

predicat IDREF #REQUIRED>

<!-- Association of a predicate with more or less precise and complete

data concerning the semantic representation of its arguments.

Arguments can be fully described, "instantiated", or only partially

described. The list of SelectEtPreciseArg can even be completely

empty; in this case, this element will merely point at a predicate.

-->
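<!-- Illustrative sketch, not part of the DTD: an instantiated predicate in
which one argument is filled in. It borrows the fixer/clou example given in the
comment on Corresp_Arg_Arg, fixer(X,Y,clou,Z), where clou occupies rank 2
(ranks are numbered from 0). All identifiers (INF_clou, USEM_clou,
PI_fixer_clou, PRED_fixer) are hypothetical, and the choice of the status
DEFAUT is an assumption made for the illustration.

<InformeArg id="INF_clou" statut="DEFAUT" usem="USEM_clou">

<PredInstancie id="PI_fixer_clou" predicat="PRED_fixer">
<SelectEtPreciseArg chemin_arg="2" informe_arg_precise="INF_clou">
</PredInstancie>

The InformeArg carries at least one datum other than the comment (here usem),
as required above. A PredInstancie with no SelectEtPreciseArg would merely
point at PRED_fixer.
-->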

<!ELEMENT Role_Sem - O EMPTY>

<!ATTLIST Role_Sem

id ID #REQUIRED

%pGlose

nom CDATA #REQUIRED

roleth_assoc IDREF #IMPLIED

est-un_l IDREFS #IMPLIED>

<!-- Role_Sem (semantic roles) are hierarchically described by the

ISA relation.

roleth_assoc makes it possible to associate (if necessary) a thematic role of the syntactic layer with its equivalent Role_Sem of the semantic layer.

est-un_l (ISA) points from a Role_Sem to its "fathers" in the hierarchy.

-->
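<!-- Illustrative sketch, not part of the DTD: a two-argument predicate with
its Arguments and Role_Sem. All identifiers and role names are hypothetical;
only the element and attribute names and the type value LEXICAL come from the
declarations above.

<Role_Sem id="ROLE_participant" nom="participant">
<Role_Sem id="ROLE_agent" nom="agent" est-un_l="ROLE_participant">
<Role_Sem id="ROLE_objet" nom="objet" est-un_l="ROLE_participant">

<Argument id="ARG_acheter_0" role_sem_l="ROLE_agent">
<Argument id="ARG_acheter_1" role_sem_l="ROLE_objet">

<Predicat id="PRED_acheter" type="LEXICAL"
          argument_l="ARG_acheter_0 ARG_acheter_1">

The order of argument_l is significant: ARG_acheter_0 has rank 0 and
ARG_acheter_1 has rank 1. The Arguments themselves carry no rank attribute, so
the same Argument could appear at a different rank in another Predicat.
-->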

<!ELEMENT Concept - O (R_ValPond_Concept* & R_ValPond_Concept_Pred*

& Assoc_ListePred*)>

<!ATTLIST Concept

id ID #REQUIRED

%pPredOuConcept

trait_sem_pertinent_l IDREFS #IMPLIED

trait_sem_obligatoire_l IDREFS #IMPLIED>

<!-- A concept is one of the main elements of the semantic model: this is a "cognitive" abstraction on Usems or a generalization of such abstractions.

trait_sem_pertinent_l associates a Concept with a list of Trait_Sem that is relevant to describe. This, for example, makes it possible to associate a Concept with particularizing features that must be or could be described by the associated Usems or Concepts (directly or through successive generalizations).

This information makes it possible to ensure that the information coded on the entries associated with the Concept (directly or by generalization) is relatively consistent and comprehensive.

trait_sem_obligatoire_l plays a role that is similar to the preceding one, but it is semantically more constrained: the features pointed to must be described.

A Concept is the source of weighted valued relations with concepts or predicates; that is the meaning of the embedded elements:

R_ValPond_Concept and R_ValPond_Concept_Pred.

A Concept may also be associated with one or more ListePred

(typical example in AI: the concept of restaurant will be associated

with a certain scenario); this accounts for the presence of the embedded element Assoc_ListePred.

-->

<!-- ********************************************************* -->

<!-- ***** SEMANTIC FEATURES ***** -->

<!-- ********************************************************* -->

<!ELEMENT Trait_Sem_ValPond - O EMPTY>

<!ATTLIST Trait_Sem_ValPond

id ID #REQUIRED

%pModalite

commentaire CDATA #IMPLIED

trait_sem_value IDREF #REQUIRED>

<!-- Trait_Sem_ValPond is the association of:

a valued feature,

a symbolic weighting with a list of values (this list may be customized),

a numeric weighting,

and the expression of a viewpoint.

-->

<!ELEMENT Trait_Sem_Value - O EMPTY>

<!ATTLIST Trait_Sem_Value

id ID #REQUIRED

%pGlose

pivot (%pBooleen) NON

valeurtrait IDREF #IMPLIED

valeurbin (SANS_B|%pBin) SANS_B

 

trait_sem IDREF #REQUIRED

usem_lexicalise_val IDREF #IMPLIED

trait_synt_corresp IDREF #IMPLIED

est_un_l IDREFS #IMPLIED

incompatible_l IDREFS #IMPLIED

implique_l IDREFS #IMPLIED

trait_sem_pertinent_l IDREFS #IMPLIED

trait_sem_obligatoire_l IDREFS #IMPLIED>

<!-- Trait_Sem_Value is the association of a semantic feature with a value of this feature.

pivot makes it possible to specify whether this feature has a pivot

status, i.e. is independent of the language, and can therefore intervene in the same definition in descriptions of Usems in different languages.

valeurtrait, valeurbin: either the value of the feature is

provided by a ValeurTrait element (referenced by valeurtrait), or it is binary and is provided by the valeurbin attribute.

trait_sem refers to the Trait_Sem associated with the value; this feature also has its own properties and description.

usem_lexicalise_val may refer to the Usem that lexicalizes the valued feature. Valued features constitute a metalanguage that can be connected with the language in this way, although not all valued features are necessarily associated with a lexicalization.

trait_synt_corresp points to the "semantic" feature used in syntax, to which it corresponds, and makes it possible to give the full definition at the semantic level.

est_un_l makes it possible to insert the Trait_Sem_Value into a hierarchy and even into a lattice; the Trait_Sem_Value referred to in this list are the "father" nodes of the valued feature that carries this attribute.

incompatible_l refers to a set of Trait_Sem_Value that are semantically incompatible with the Trait_Sem_Value that carries this attribute.

implique_l refers to a set of implied Trait_Sem_Value.

Depending on the type of Trait_Sem (its attribute "concerne"), the

sense of the relation may vary slightly:

- If it deals with a Trait_Sem of AFFIXE, this means that the

Usems including this affix in the mode of derivation of their Um will

carry the Trait_Sem_Value implied by the Trait_Sem_Value

of AFFIXE if the semantics of the integrated affix correspond to the

semantics described. This information, in particular, makes it possible to make a minimum prediction on the semantic representation of derivatives that are not described in the dictionary, but are formed in a regular way. Here the implication holds between two conceptually different objects.

- If it deals with a Trait_Sem AUTRE (= other than Affix),

this means that a descriptive object carrying this Trait_Sem_Value

will not need to specify the implied Trait_Sem_Value that it will

implicitly carry. The implication here concerns objects of the same

nature that carry the "implicating" feature and the implied

features.

trait_sem_pertinent_l associates a Trait_Sem_Value

with a list of Trait_Sem that are pertinent to describe. This makes it possible to associate differentiating features to be described with

a valued feature of semantic class.

trait_sem_obligatoire_l is semantically analogous to the

preceding attribute, but adds a constraint on the coding of elements carrying this valued feature: they will have to carry weighted valued features associated with these compulsory features.

-->

<!ELEMENT ValeurTrait - O EMPTY>

<!ATTLIST ValeurTrait

id ID #REQUIRED

libelle CDATA #REQUIRED>

<!-- ValeurTrait represents the value of the non-binary valued semantic features Trait_Sem_Value. These values can be referenced by many different semantic features (Trait_Sem).

-->

<!ELEMENT Trait_Sem - O EMPTY>

<!ATTLIST Trait_Sem

id ID #REQUIRED

%pGlose

nom CDATA #REQUIRED

pivot (%pBooleen) NON

concerne (SANS_C|%pConcerne) SANS_C

type (SANS_TYPE

|%pTypeTrait_Sem

|%pTypeTrait_Sem_cust) SANS_TYPE

statut (%pValuation) MONOVALUE

type_liste_valeurs (%pTypeListe) BINAIRE

structure_liste_valeurs (SANS_STRUCT

|%pStructureListe) SANS_STRUCT

usem_lexicalise IDREF #IMPLIED

valeurtrait_l IDREFS #IMPLIED

trait_sem_pertinent_l IDREFS #IMPLIED

trait_sem_obligatoire_l IDREFS #IMPLIED>

<!-- Trait_Sem is associated with a value to produce a Trait_Sem_Value

(which has a number of properties, see above).

pivot: the Trait_Sem can be linked to the description of a

language or pivot, i.e. be selected as an element shared by several

languages, or a shared description element.

concerne makes it possible to specify whether the feature concerns affixes exclusively (AFFIXE) or other elements of the semantic layer (AUTRE).

type specifies the type or family of the Trait_Sem (a list of

values is provided, but it can be extended by the user).

statut makes it possible to specify whether the Trait_Sem is:

- MONOVALUE: which means that an element carries only

one Trait_Sem_Value associated with this Trait_Sem, because values

exclude one another.

- MULTIVALUE: which means that several Trait_Sem_Value

associated with the same Trait_Sem can be carried by the same

descriptive object of the semantic layer.

type_liste_valeurs specifies the nature of the values

associated with the Trait_Sem:

- BINAIRE: + or -

- LISTE_FERMEE: list of values known in extension.

- LISTE_OUVERTE: open list of values.

structure_liste_valeurs specifies the structuring of the

various Trait_Sem_Value associated with the Trait_Sem; this

structuring is expressed locally through the ISA relation between

Trait_Sem_Value. Trait_Sem is partly characterized by the associated structure:

- TREILLIS_TOTAL: the set of associated Trait_Sem_Value

forms a lattice.

- TREILLIS_PARTIEL: valued features are locally integrated into lattice structures, and several lattices are allowed. (Consequently, certain Valued features can be isolated.)

- HIERARCHIE_TOTALE: the set of associated Trait_Sem_Value

is integrated into a tree structure.

- HIERARCHIE_PARTIELLE: the set of associated Trait_Sem_Value

is integrated into one or more tree structures. All

Trait_Sem_Value are not necessarily integrated into a structuring relation.

usem_lexicalise makes it possible to identify the Trait_Sem in the language, by referring to a lexicalizing Usem.

valeurtrait_l makes it possible to specify the list of

values (when its type is LISTE_FERMEE) in extended form by referring to the elements ValeurTrait.

trait_sem_pertinent_l associates a Trait_Sem with a list of

Trait_Sem that is pertinent to describe for the elements that carry

a Trait_Sem_ValPond associated with this Trait_Sem, independently of

their value.

trait_sem_obligatoire_l is semantically analogous to the

preceding feature, but has an additional constraint on the coding of

elements that carry this feature: they will have to carry

Trait_Sem_ValPond associated with these compulsory features.

-->
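<!-- Illustrative sketch, not part of the DTD: the chain that leads from a
feature to a weighted valued feature carried by a Usem. The feature name
couleur, its values and all identifiers are hypothetical; the weighting
attributes of Trait_Sem_ValPond are left at their defaults.

<Trait_Sem id="TS_couleur" nom="couleur" statut="MONOVALUE"
           type_liste_valeurs="LISTE_FERMEE"
           valeurtrait_l="VAL_rouge VAL_vert">

<ValeurTrait id="VAL_rouge" libelle="rouge">
<ValeurTrait id="VAL_vert" libelle="vert">

<Trait_Sem_Value id="TSV_couleur_rouge" trait_sem="TS_couleur"
                 valeurtrait="VAL_rouge">

<Trait_Sem_ValPond id="TSVP_couleur_rouge"
                   trait_sem_value="TSV_couleur_rouge">

<Usem id="USEM_exemple" trait_sem_valpond_l="TSVP_couleur_rouge">

Because TS_couleur is MONOVALUE, USEM_exemple should not also carry a
Trait_Sem_ValPond whose Trait_Sem_Value associates TS_couleur with VAL_vert.
-->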

<!-- ********************************************************* -->

<!-- ***** WEIGHTED SEMANTIC RELATIONS ***** -->

<!-- ********************************************************* -->

<!ENTITY % pR_ValPond_Sem

"%pModalite

commentaire CDATA #IMPLIED">

<!ELEMENT (R_ValPond_Usem|R_ValPond_Pred) - O (Corresp_Arg_Arg*)>

<!ELEMENT R_ValPond_Concept - O EMPTY>

<!ATTLIST (R_ValPond_Usem|R_ValPond_Pred|R_ValPond_Concept)

%pR_ValPond_Sem

cible IDREF #REQUIRED

r_sem IDREF #REQUIRED>

<!-- These elements are embedded in the source description: weighted valued relation between two Usems (R_ValPond_Usem), two

predicates (R_ValPond_Pred), two concepts (R_ValPond_Concept),

according to a certain modality:

- a symbolic weighting with a list of values provided

(this list may be customized),

- a numeric weighting,

- and the expression of a viewpoint.

Corresp_Arg_Arg specifies how the arguments correspond to one another when Usems are predicative, or when the relation links two

Predicates.

cible points to the target of the weighted valued relation:

Usem, Predicate or Concept depending on the nature of the element.

r_sem points to the R_Usem, R_Pred or R_Concept linking both

objects.

An R_ValPond_Usem/Pred/Concept must therefore be considered

as the entity that describes the fact that the semantic relation R_Usem/Pred/Concept links the source Usem/Pred/Concept and the target Usem/Pred/Concept according to a certain modality.

-->

<!ELEMENT (R_ValPond_Pred_Concept|R_ValPond_Concept_Pred) - O EMPTY>

<!ATTLIST (R_ValPond_Pred_Concept|R_ValPond_Concept_Pred)

%pR_ValPond_Sem

cible IDREF #REQUIRED

r_sem IDREF #REQUIRED

chemin_arg_concerne NUMBERS #IMPLIED>

<!-- These elements are embedded in the source description:

weighted valued relation between a source/Predicate and a

target/Concept (R_ValPond_Pred_Concept), or a source/Concept and a

target/Predicate (R_ValPond_Concept_Pred) according to a certain

modality:

- a symbolic weighting with a list of values provided

(that may be customized),

- a numeric weighting,

- and the expression of a viewpoint.

cible points to the target Predicate or Concept of the

relation.

r_sem refers to the R_Pred_Concept/R_Concept_Pred linking the

Predicate and the Concept.

chemin_arg_concerne provides the access path, if necessary, from the predicate to the argument particularly concerned by the relation; the same semantic relation can therefore act independently of the argument concerned.

An R_ValPond_Pred_Concept/Concept_Pred must therefore be considered

as the entity that specifies that the semantic relation R_Pred_Concept

or R_Concept_Pred links the source Predicate/Concept to the target

Concept/Predicate according to a certain modality.

-->

<!ELEMENT Corresp_Arg_Arg - O EMPTY>

<!ATTLIST Corresp_Arg_Arg

chemin_arg_source NUMBERS #IMPLIED

chemin_arg_cible NUMBERS #IMPLIED

informe_arg_precise_source_l IDREFS #IMPLIED

informe_arg_precise_cible_l IDREFS #IMPLIED>

<!-- Corresp_Arg_Arg establishes a correspondence between the

Predicate arguments implied as the source or target of a weighted valued relation. The paths leading to source and target arguments are indicated:

chemin_arg_source (usually a single rank giving access to the first level).

chemin_arg_cible (usually a single rank giving access to the first level).

Certain semantic data on the arguments can be added in the correspondence:

informe_arg_precise_source_l: information on the source

argument (InformeArg).

informe_arg_precise_cible_l: information on the target argument (InformeArg).

The acceptable semantic interpretations of certain arguments can thus be constrained in the correspondence. It is also possible to specify the "default" values related to the correspondence of arguments in the correspondence of predicates with several different arguments.

(Ex: fixer(X,Y,clou,Z) (to fasten X,Y,nail,Z) <=> clouer (to nail)(X,Y,Z))

-->
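<!-- Illustrative sketch, not part of the DTD: one possible coding of the
fixer/clouer example above as a weighted valued relation between two
Predicates. Ranks are numbered from 0, so in fixer(X,Y,clou,Z) the ranks are
X=0, Y=1, clou=2, Z=3, and in clouer(X,Y,Z) they are X=0, Y=1, Z=2. All
identifiers (PRED_fixer, PRED_clouer, RPRED_equivalence, ARG_fixer_0 to
ARG_fixer_3, INF_clou) are hypothetical, and INF_clou is assumed to be an
InformeArg declared elsewhere that supplies clou as a default.

<Predicat id="PRED_fixer"
          argument_l="ARG_fixer_0 ARG_fixer_1 ARG_fixer_2 ARG_fixer_3">
<R_ValPond_Pred cible="PRED_clouer" r_sem="RPRED_equivalence">
<Corresp_Arg_Arg chemin_arg_source="0" chemin_arg_cible="0">
<Corresp_Arg_Arg chemin_arg_source="1" chemin_arg_cible="1">
<Corresp_Arg_Arg chemin_arg_source="2"
                 informe_arg_precise_source_l="INF_clou">
<Corresp_Arg_Arg chemin_arg_source="3" chemin_arg_cible="2">
</R_ValPond_Pred>
</Predicat>

The third Corresp_Arg_Arg has no chemin_arg_cible: the source argument of
rank 2 has no counterpart in clouer and is only specified by INF_clou.
-->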

<!-- ********************************************************* -->

<!-- ***** SEMANTIC RELATIONS ***** -->

<!-- ********************************************************* -->

<!ENTITY % pR_SemPropriete

"reflexivite (REFLEXIF

|NON_REFLEXIF) NON_REFLEXIF

transitivite (TRANSITIF

|NON_TRANSITIF) NON_TRANSITIF

symetrie (SYMETRIQUE

|NON_SYMETRIQUE) NON_SYMETRIQUE

antisymetrie (ANTISYMETRIQUE

|NON_ANTISYMETRIQUE) NON_ANTISYMETRIQUE

ordre (PARTIEL|TOTAL

|NON_PERTINENT) NON_PERTINENT">

<!-- Properties of semantic relations. -->

<!ELEMENT R_Usem - O (CatGram_Select? & CatGram_Result?)>

<!ATTLIST R_Usem

id ID #REQUIRED

%pGlose

r_sem_inverse IDREF #IMPLIED

incompatible_l IDREFS #IMPLIED

est_un_l IDREFS #IMPLIED

type (SANS_T

|%pTypeR_Usem

|%pTypeR_Usem_cust) SANS_T

sstype (SANS_ST

|%pSsTypeR_Usem

|%pSsTypeR_Usem_cust) SANS_ST

%pR_SemPropriete>

<!-- R_Usem is the formal entity that describes a semantic relation between two Usems.

CatGram_Select: for filtering the grammatical category of the source.

CatGram_Result: for filtering the grammatical category of the target.

r_sem_inverse refers to the reverse relation R_Usem

(source=>target, target=>source).

incompatible_l refers to the R_Usem semantic relations that

are incompatible with the described relation, and that cannot be part of a R_ValPond linking elements that are already linked through it.

est_un_l specifies the hierarchical structuring among relations by referring to the "mother" R_Usem of the described relation.

type specifies the type of relation: a set of types is provided in the GENELEX model. This set can be extended by the user.

sstype makes it possible to characterize the semantic relation more precisely. Values are provided, but the user can extend the list.

Finally, a semantic relation can have a number of the properties of binary relations: reflexivity, transitivity, symmetry, antisymmetry, and the fact that it is an order relation.

-->
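
<!-- Illustrative sketch, not part of the GENELEX DTD. A pair of inverse R_Usem relations might be declared in a lexicon instance as shown below. The identifiers RU_HYPER and RU_HYPO are hypothetical, the pairing of type PARADIGMATIQUE with sstype TAXINOMIE is an assumption, and the attributes brought in by %pGlose are left out because they are defined elsewhere.

<R_Usem id="RU_HYPER" r_sem_inverse="RU_HYPO" type="PARADIGMATIQUE" sstype="TAXINOMIE" transitivite="TRANSITIF" antisymetrie="ANTISYMETRIQUE" ordre="PARTIEL">

<R_Usem id="RU_HYPO" r_sem_inverse="RU_HYPER" type="PARADIGMATIQUE" sstype="TAXINOMIE" transitivite="TRANSITIF" antisymetrie="ANTISYMETRIQUE" ordre="PARTIEL">

Both CatGram_Select and CatGram_Result are optional, so the elements are left empty here; the properties that are not listed keep their declared defaults (NON_REFLEXIF, NON_SYMETRIQUE).

-->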

<!ELEMENT (R_Pred|R_Concept) - O EMPTY>

<!ATTLIST (R_Pred|R_Concept)

id ID #REQUIRED

%pGlose

r_sem_inverse IDREF #IMPLIED

incompatible_l IDREFS #IMPLIED

est_un_l IDREFS #IMPLIED

pivot (%pBooleen) NON

type (SANS_T

|%pTypeR_PredOuConcept

|%pTypeR_PredOuConcept_cust) SANS_T

sstype CDATA #IMPLIED

%pR_SemPropriete>

<!-- R_Pred (resp. R_Concept) is the formal entity that describes a semantic relation between two Predicates (resp. two Concepts).

r_sem_inverse refers to the reverse semantic relation

R_Pred (R_Concept) (source=>target, target=>source).

incompatible_l refers to the R_Pred (resp. R_Concept) relations that are incompatible with the described relation, and that therefore cannot be part of an R_ValPond relating elements that are already linked through it.

est_un_l specifies the hierarchical structuring among relations by referring to the "mother" R_Pred (R_Concept) of the described relation.

pivot makes it possible to mark "primitive" relations which are relied upon, independently of languages: networks of relations can then be transposed from one language to the other on the basis of (bilingual) pairs of predicates or concepts.

type specifies the type of relation: a set of types is provided by the GENELEX model and this set can be extended by the user.

sstype makes it possible to characterize the relation more precisely.

Finally, a semantic relation can carry the properties that characterize binary relations in general: reflexivity, transitivity, symmetry, antisymmetry, and whether it is an order relation (partial or total).

-->

<!ELEMENT (R_Pred_Concept|R_Concept_Pred) - O EMPTY>

<!ATTLIST (R_Pred_Concept|R_Concept_Pred)

id ID #REQUIRED

%pGlose

r_sem_inverse IDREF #IMPLIED

incompatible_l IDREFS #IMPLIED

est_un_l IDREFS #IMPLIED

type CDATA #IMPLIED>

<!-- R_Concept_Pred (R_Pred_Concept) is the formal entity that describes a semantic relation between a Concept (resp. a Predicate) and a Predicate (resp. a Concept).

r_sem_inverse refers to the reverse semantic relation

(source=>target, target=>source): the reverse of an R_Concept_Pred is an R_Pred_Concept and vice versa.

incompatible_l refers to the R_Concept_Pred (resp. R_Pred_Concept) relations that are incompatible with the described relation, and that therefore cannot be part of an R_ValPond relating elements that are already linked through it.

est_un_l specifies the hierarchical structuring among relations by referring to the "mother" R_Concept_Pred (R_Pred_Concept) of the described relation.

type specifies the type of relation: this field is entirely free.

-->

<!ELEMENT Assoc_ListePred - O EMPTY>

<!ATTLIST Assoc_ListePred

role (ASSOC|%pRoleListePred

|%pRoleListePred_cust)

ASSOC

listepred IDREF #REQUIRED

intervient (%pBooleen) NON>

<!-- Assoc_ListePred makes it possible to associate a Predicate or a Concept with a ListePred by specifying the modality of this association:

role specifies the role that the ListePred plays in the description of the Predicate or the Concept. A number of values are provided in the model (A_PRESUPPOSITION, A_IMPLICATION, A_DEFINITION, A_EXPLICITATION, PARTICIPE_A) and, if necessary, users can add their own values.

listepred points to the ListePred concerned.

intervient specifies, for a given Predicate, whether it intervenes directly in the ListePred, i.e. whether it is an element of this list.

-->

<!ELEMENT ListePred - O (Variable*)>

<!ATTLIST ListePred

id ID #REQUIRED

%pGlose

statut (%pStatutListePred

|%pStatutListePred_cust) DEFINITION

type (SANS_TL|%pTypeListePred

|%pTypeListePred_cust) SANS_TL

pred_instancie_l IDREFS #REQUIRED>

<!-- ListePred describes an ordered list of PredInstancie; variables can be associated with their arguments in order to specify how arguments are shared between Predicates.

statut specifies the status of the collected PredInstancie: a list of values is provided in the model (SCENARIO, DEFINITION) but, if necessary, users can add their own values.

type specifies the meaning to give to the order of the PredInstancie within the list: a number of values are provided in the model (ORDRE_TEMP, ORDRE_SPATIAL, SIMULTANEITE: temporal order, spatial order, simultaneity) but, if necessary, users can add their own values.

pred_instancie_l refers to the PredInstancie that constitute a list with a pertinent order.

-->

<!ELEMENT Variable - O (SelectPredArg+) >

<!-- The embedded entity Variable is meaningful only for the ListePred that contains it; it associates, on a single variable, the arguments of several PredInstancie, which thereby become "unified" for the ListePred that involves them. This association is achieved by pointing to a list of SelectPredArg, i.e. the paths leading to the arguments being unified.

For example, in a restaurant scenario, the waiter carries then serves food to a customer who then eats it: three SelectPredArg

make it possible to reach the arguments of three instantiated predicates in the list of "restaurant" scenario predicates, and these three arguments become unified within the same Variable.

-->
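
<!-- Illustrative sketch, not part of the GENELEX DTD. The restaurant scenario above might be encoded as follows. All identifiers are hypothetical. It is assumed that the three PredInstancie stand for carry(waiter, food), serve(waiter, food, customer) and eat(customer, food), and that ranks are counted from 1; the numbering base should be checked against the model.

<ListePred id="LP_RESTAURANT" statut="SCENARIO" type="ORDRE_TEMP" pred_instancie_l="PI_PORTER PI_SERVIR PI_MANGER">

<Variable>

<SelectPredArg nieme_pred="1" chemin_arg="2">

<SelectPredArg nieme_pred="2" chemin_arg="2">

<SelectPredArg nieme_pred="3" chemin_arg="2">

</Variable>

</ListePred>

Under these assumptions the Variable unifies the "food" argument of the three predicates. Further Variable elements could be added in the same way for the waiter and the customer.

-->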

<!ELEMENT SelectPredArg - O EMPTY>

<!ATTLIST SelectPredArg

nieme_pred NUMBER #REQUIRED

chemin_arg NUMBERS #REQUIRED>

<!-- SelectPredArg determines a path that makes it possible to reach the argument of a Predicate of ListePred.

nieme_pred refers to the nth PredInstancie in the list ListePred.

chemin_arg designates the series of argument ranks of the Predicate (usually a single rank). In the rare cases where the predicate has a complex argument, i.e. an argument that is itself a predicate, the path may continue into the predicative structure that realizes the complex argument, and the next rank in the list then points, recursively, to an argument of that structure. This complex case is why the attribute is declared as a list (NUMBERS).

-->

<!-- ********************************************************* -->

<!-- ***** CORRESPONDENCE BETWEEN SYNTAX AND SEMANTICS ***** -->

<!-- ********************************************************* -->

<!ELEMENT Correspondance - O (SelectEtPreciseArg*)>

<!ATTLIST Correspondance

id ID #REQUIRED

%pGlose

contraint_description IDREF #IMPLIED

corresp_arg_pos_l IDREFS #IMPLIED>

<!-- Correspondance specifies, for a Corresp_Usyn_Usem, its various aspects:

contraint_description points to an object

ContraintDescription that makes it possible to filter a sub-set of syntactic realizations associated with the Usyn by increasing the constraints on the syntactic description.

corresp_arg_pos_l associates each argument of the predicate on which the Usem relies (one could also say: the predicate that the Usem lexicalizes with its own particularities) with its realization in syntax. There are, in general, as many corresp_arg_pos as arguments present on the surface, but several corresp_arg_pos may alternate for the same argument in the case of a syntactic compound with lexicalization variants in different MDC.

SelectEtPreciseArg makes it possible to specify on arguments, if necessary, the semantic information (added to the data present on the predicate arguments in its definition and in the modality of lexicalization of the predicate by the Usem) specific to the syntactic context of the Usyn, and therefore to the Usyn-Usem pair. The semantic data specific to the predicate or the way the Usem is associated with it will be expressed elsewhere.

It is not compulsory to describe these attributes. Furthermore, an element may be empty or merely comprise the Glose in the case of an association between Usyn and Usem, where nothing needs to be specified.

-->

<!ELEMENT ContraintDescription - O (Contraint_mdc?)>

<!ATTLIST ContraintDescription

id ID #REQUIRED

%pGlose

contraint_intervconst IDREF #IMPLIED

contraint_struct_interne IDREF #IMPLIED

contraint_construction IDREF #IMPLIED>

<!-- This element makes it possible to restrict the set of syntactic realizations among those which are associated with the base description of the Usyn:

Contraint_mdc constrains the mode of composition of a compound

contraint_intervconst constrains the realizations of the Self

contraint_struct_interne constrains the realizations of the positions of the internal structure of a compound

contraint_construction constrains the realizations of the positions associated with the base construction.

Several of these elements can be present at the same time. However, if contraint_struct_interne and Contraint_mdc, which correspond to two competing ways of describing composition, are both present, the consistency of the information must be ensured.

-->

<!ELEMENT ContraintIntervConst - O EMPTY>

<!ATTLIST ContraintIntervConst

id ID #REQUIRED

contraint_syntagme_l IDREFS #REQUIRED>

<!-- Makes it possible to constrain one or more syntagms of the intervconst, either by inhibiting them or by adding syntactic features.

-->

<!ELEMENT ContraintSyntagme - O (AjouteTrait_Sem*)>

<!ATTLIST ContraintSyntagme

id ID #REQUIRED

syntagme IDREF #REQUIRED

inhibe (%pBooleen) NON

ajoute_trait_synt_l IDREFS #IMPLIED>

<!-- This element makes it possible to select and constrain a syntagm.

syntagme makes it possible to select the syntagm concerned (considered as belonging to the distribution of a position or to the intervconst).

inhibe specifies whether it is inhibited and is therefore absent from the distribution for the given sense.

ajoute_trait_synt_l restricts a syntagm that is not inhibited by the addition of restrictive syntactic features.

AjouteTrait_Sem makes it possible to constrain it on the criterion of the semantic information associated with its semantic interpretation.

-->

<!ELEMENT AjouteTrait_Sem - O EMPTY>

<!ATTLIST AjouteTrait_Sem

statut (%pStatutAjoute) FILTRE_AJOUTE

trait_sem_valpond IDREF #REQUIRED>

<!-- Makes it possible to filter on the semantic interpretation associated with the syntagm by forcing/verifying the presence of a feature.

statut specifies:

- that it must be present to allow for correspondence (value FILTRE),

- that it must be present or be added, provided it is compatible with the semantic representation (value FILTRE_AJOUTE),

- or that it must enrich the associated semantic representation (value FORCE) anyway.

trait_sem_valpond points to the weighted valued feature concerned.

-->
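
<!-- Illustrative sketch, not part of the GENELEX DTD. A ContraintSyntagme that keeps a syntagm only when its semantic interpretation already carries a given feature might look as follows. The identifiers CS_1, SN_1 and TSVP_HUMAIN are hypothetical and stand for a ContraintSyntagme, a syntagm of the syntactic layer and a Trait_Sem_ValPond.

<ContraintSyntagme id="CS_1" syntagme="SN_1">

<AjouteTrait_Sem statut="FILTRE" trait_sem_valpond="TSVP_HUMAIN">

</ContraintSyntagme>

With statut="FORCE" instead, the feature would enrich the semantic representation rather than merely be checked; when statut is omitted, the declared default FILTRE_AJOUTE applies.

-->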

<!ELEMENT ContraintConstruction - O EMPTY>

<!ATTLIST ContraintConstruction

id ID #REQUIRED

contraint_position_l IDREFS #REQUIRED>

<!-- Constrains the external construction of a Usyn by constraining one or more of its positions:

- by making it compulsory or prohibited, in the case of an optional position,

- or by constraining its distribution.

There are as many ContraintPosition pointed to by contraint_position_l as there are constrained positions, whether they are inhibited, made compulsory, or constrained in their distribution.

-->

<!ELEMENT ContraintStructInterne - O EMPTY>

<!ATTLIST ContraintStructInterne

id ID #REQUIRED

contraint_position_l IDREFS #REQUIRED>

<!-- Constrains the internal structure of a compound Usyn by constraining one or more of its positions:

- by making it compulsory or prohibited, in the case of an optional position,

- or by constraining its distribution.

There are as many ContraintPosition pointed to by contraint_position_l as there are constrained positions, whether they are inhibited, made compulsory, or constrained in their distribution.

-->

<!ELEMENT Contraint_mdc - O (ContraintPosition_mdc+)>

<!-- Constrains the mode of composition of a Usyn by constraining one or more positions of one or more compound Usyns. There are as many ContraintPosition_mdc as external constructions of components to be constrained.

-->

<!ELEMENT ContraintPosition_mdc - O EMPTY>

<!ATTLIST ContraintPosition_mdc

nieme_composition NUMBER #REQUIRED

nieme_composante NUMBER #REQUIRED

contraint_position_l IDREFS #REQUIRED>

<!-- For a given composition (nieme_composition, which matters when alternative compositions exist), constrains one of its components (nieme_composante) by constraining one or more of its positions:

- by making it compulsory or prohibited, in the case of an optional position,

- or by constraining its distribution.

There are as many ContraintPosition pointed to by contraint_position_l as there are positions constrained for a single nieme_composition/nieme_composante pair, whether they are inhibited, made compulsory, or constrained in their distribution.

The ContraintPosition_mdc must specify nieme_composition and nieme_composante so that the position to be constrained is a position of the external construction of the Usyn pointed to. A Um can be constrained only if it belongs to the distribution of the calling Usyn position.

-->

<!ELEMENT ContraintPosition - O (CheminPosition)>

<!ATTLIST ContraintPosition

id ID #REQUIRED

modif_optionnalite (SANS_MODIF|OBLIGATOIRE

|INTERDITE) SANS_MODIF

contraint_syntagme_l IDREFS #IMPLIED>

<!-- ContraintPosition makes it possible to access a position (of the external construction, the internal structure, or a composition, depending on the element that integrates the ContraintPosition) by means of a CheminPosition.

CheminPosition is followed from the base construction, the internal structure, or the external construction of a compound in the case of a description of the Usyn by Composition.

modif_optionnalite makes it possible to specify that an optional position in the description becomes either compulsory or prohibited for the sense (Usem) pointed to.

contraint_syntagme_l makes it possible to specify, within the distribution of the constrained position, the syntagms concerned and the additional constraints associated with them. There are as many ContraintSyntagme as syntagms to be restricted. Moreover, a syntagm can be filtered or enriched semantically by means of semantic features.

-->

<!-- ********************************************************* -->

<!-- ***** CORRESPONDENCE BETWEEN ARGUMENT AND POSITION ***** -->

<!-- ********************************************************* -->

<!ENTITY % pCorresp_Arg_Pos

"chemin_arg NUMBERS #REQUIRED

portee (%pPorteeCorresp) EXTERNE

nieme_composition NUMBER #IMPLIED

nieme_composante NUMBER #IMPLIED">

<!-- The set of attributes common to the various cases of correspondence between argument and position, that makes it possible to select an argument and specify its expression within the syntactic structure.

chemin_arg makes it possible to select the argument concerned by a series of ranks (usually one, except in the case of a predicate as predicate argument).

portee specifies whether the syntactic realization is carried

- on a position of the external construction (in the most common case, portee = EXTERNE),

- on a position of the internal structure (in case of a composition described by a structural syntagm, portee =

STRUCT_INTERNE),

- or on one composition element (portee =

COMP_INTERNE).

In the last case:

nieme_composition specifies the composition concerned,

nieme_composante specifies the component of the composition specified.

-->

<!ELEMENT Corresp_Arg_Pos_Simple - O (CheminPosition)>

<!ATTLIST Corresp_Arg_Pos_Simple

id ID #REQUIRED

%pCorresp_Arg_Pos>

<!-- Associates an argument of the semantic layer with a syntax position.

portee, nieme_composition and nieme_composante make it possible to know what the CheminPosition applies to.

chemin_arg points to the argument concerned; ranks are counted from the predicate that is directly pointed to by the Usem concerned by the syntax-semantics correspondence that calls on Corresp_Arg_Pos_Simple.

-->

<!ELEMENT Corresp_Arg_Pos_Flottant - O (CheminSyntagme?)>

<!ATTLIST Corresp_Arg_Pos_Flottant

id ID #REQUIRED

%pGlose

%pCorresp_Arg_Pos

position IDREF #REQUIRED>

<!-- Makes it possible to associate an argument of the essential predicate with its family of syntactic realizations, when this argument corresponds to an element that is not described in the base description because it is considered a non-essential element of syntax ("adjunct complements": circumstants, modifiers... not specified for the Usyn).

position points to a Position that provides the description of the family of realizations in syntax (and therefore specifies a distribution and possibly a function and thematic role as well).

CheminSyntagme makes it possible to specify, in the case of rewriting, the level at which the position associated with the Arg_flottant is virtually inserted, by pointing to the rewritten syntagm. The insertion is then made into the rewriting list of this syntagm. The function and thematic roles are specified in relation to the head of the syntagm.

The absence of CheminSyntagme implies that the insertion is made at the highest level, i.e., depending on the case (specified by the attribute portee):

- in the list of positions of the external construction (the most common case),

- in the list of positions of the internal structure (in the case of a composition described by a structural syntagm),

- or at the level of one composition element (specified by nieme_composition and nieme_composante which play a role in the case of a composition described by a calculation of composition. In this case, there cannot be any CheminSyntagme; the correct level will be pointed to by selecting the desired component).

-->

 

3. Constraints semant.ctr

<!--Consortium GENELEX @(#) semant.ctr 2.1 -->

 

 

<!--CONTRAINTE Usem

combve TYPE CombVE

trait_sem_valpond_l TYPE Trait_Sem_ValPond -->

<!--CONTRAINTE RepresentationPredicative

predicat TYPE Predicat -->

 

<!--CONTRAINTE RepresentationConceptuelle_Pond

concept TYPE Concept -->

<!--CONTRAINTE Usem_Aff

predicat TYPE Predicat

combve TYPE CombVE

trait_sem_valpond_l TYPE Trait_Sem_ValPond -->

 

<!--CONTRAINTE RepresentationConceptuelle_Pond_Aff

concept TYPE Concept -->

<!--CONTRAINTE Predicat

trait_sem_valpond_l TYPE Trait_Sem_ValPond

argument_l TYPE Argument -->

<!--CONTRAINTE Argument

role_sem_l TYPE Role_Sem

informe_arg_decrit_l TYPE InformeArg -->

<!--CONTRAINTE InformeArg

usem TYPE Usem

pred_instancie TYPE PredInstancie

concept_l TYPE Concept

trait_sem_valpond_l TYPE Trait_Sem_ValPond -->

<!--CONTRAINTE SelectEtPreciseArg

informe_arg_precise TYPE InformeArg -->

<!--CONTRAINTE PredInstancie

predicat TYPE Predicat -->

<!--CONTRAINTE Role_Sem

roleth_assoc TYPE RoleTh

est_un_l TYPE Role_Sem -->

<!--CONTRAINTE Concept

trait_sem_valpond_l TYPE Trait_Sem_ValPond

trait_sem_obligatoire_l TYPE Trait_Sem

trait_sem_pertinent_l TYPE Trait_Sem -->

<!--CONTRAINTE Trait_Sem_ValPond

trait_sem_value TYPE Trait_Sem_Value -->

<!--CONTRAINTE Trait_Sem_Value

valeurtrait TYPE ValeurTrait

trait_sem TYPE Trait_Sem

usem_lexicalise_val TYPE Usem

trait_synt_corresp TYPE Trait_Aspect|Trait_Bin

|Trait_Libre

est_un_l TYPE Trait_Sem_Value

incompatible_l TYPE Trait_Sem_Value

implique_l TYPE Trait_Sem_Value

trait_sem_pertinent_l TYPE Trait_Sem

trait_sem_obligatoire_l TYPE Trait_Sem -->

<!--CONTRAINTE Trait_Sem

usem_lexicalise TYPE Usem

valeurtrait_l TYPE ValeurTrait

trait_sem_pertinent_l TYPE Trait_Sem

trait_sem_obligatoire_l TYPE Trait_Sem -->

<!--CONTRAINTE R_ValPond_Usem

cible TYPE Usem

r_sem TYPE R_Usem -->

<!--CONTRAINTE R_ValPond_Pred

cible TYPE Predicat

r_sem TYPE R_Pred -->

<!--CONTRAINTE R_ValPond_Concept

cible TYPE Concept

r_sem TYPE R_Concept -->

<!--CONTRAINTE R_ValPond_Pred_Concept

cible TYPE Concept

r_sem TYPE R_Pred_Concept -->

<!--CONTRAINTE R_ValPond_Concept_Pred

cible TYPE Predicat

r_sem TYPE R_Concept_Pred -->

<!--CONTRAINTE Corresp_Arg_Arg

informe_arg_precise_source_l TYPE InformeArg

informe_arg_precise_cible_l TYPE InformeArg -->

<!--CONTRAINTE R_Usem

r_sem_inverse TYPE R_Usem

incompatible_l TYPE R_Usem

est_un_l TYPE R_Usem -->

<!--CONTRAINTE R_Pred

r_sem_inverse TYPE R_Pred

incompatible_l TYPE R_Pred

est_un_l TYPE R_Pred -->

<!--CONTRAINTE R_Concept

r_sem_inverse TYPE R_Concept

incompatible_l TYPE R_Concept

est_un_l TYPE R_Concept -->

<!--CONTRAINTE R_Pred_Concept

r_sem_inverse TYPE R_Concept_Pred

incompatible_l TYPE R_Pred_Concept

est_un_l TYPE R_Pred_Concept -->

<!--CONTRAINTE R_Concept_Pred

r_sem_inverse TYPE R_Pred_Concept

incompatible_l TYPE R_Concept_Pred

est_un_l TYPE R_Concept_Pred -->

<!--CONTRAINTE Assoc_ListePred

listepred TYPE ListePred -->

<!--CONTRAINTE ListePred

pred_instancie_l TYPE PredInstancie -->

<!--CONTRAINTE Corresp_Usyn_Usem

usem_cible TYPE Usem

correspondance TYPE Correspondance -->

<!--CONTRAINTE Correspondance

contraint_description TYPE ContraintDescription

corresp_arg_pos_l TYPE Corresp_Arg_Pos_Simple

|Corresp_Arg_Pos_Flottant -->

<!--CONTRAINTE ContraintDescription

contraint_intervconst TYPE ContraintIntervConst

contraint_struct_interne TYPE ContraintStructInterne

contraint_construction TYPE ContraintConstruction -->

<!--CONTRAINTE ContraintIntervConst

contraint_syntagme_l TYPE ContraintSyntagme -->

<!--CONTRAINTE ContraintSyntagme

syntagme TYPE Syntagme

ajoute_trait_synt_l TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne

|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux

|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif

|Trait_Tournure

|Trait_Coref

|Trait_Aspect

|Trait_Bin|Trait_Libre) -->

<!--CONTRAINTE AjouteTrait_Sem

trait_sem_valpond TYPE Trait_Sem_ValPond -->

<!--CONTRAINTE ContraintConstruction

contraint_position_l TYPE ContraintPosition -->

<!--CONTRAINTE ContraintStructInterne

contraint_position_l TYPE ContraintPosition -->

<!--CONTRAINTE ContraintPosition_mdc

contraint_position_l TYPE ContraintPosition -->

<!--CONTRAINTE ContraintPosition

contraint_syntagme_l TYPE ContraintSyntagme -->

<!--CONTRAINTE Corresp_Arg_Pos_Flottant

position TYPE Position -->

4. Entities semant.ent

<!--Consortium GENELEX @(#) semant.ent 2.1@(#) 94/09/07 14:35:28 -->

<!ENTITY % pPonderation

"PROTOTYPIQUE|DEFINITOIRE|ACCESSOIRE|EXCEPTIONNEL

|CONNOTATIF">

<!ENTITY % pArgInclus

"INCLUS|PAS_INCLUS">

<!ENTITY % pPorteeCorresp

"EXTERNE|STRUCT_INTERNE|COMP_INTERNE">

<!ENTITY % pStatutInfoArg

"DEFAUT|VERIF|ENRICHIT|DEFAUT_VERIF">

<!ENTITY % pTypeR_Usem

"PARADIGMATIQUE|DERIVATION|COLLOCATION">

<!ENTITY % pSsTypeR_Usem

"SYNONYMIE|CONTRAIRE|OPPOSITION|CONVERSE

|TAXINOMIE|PARTIE_TOUT|STRICTE|NON_STRICTE">

<!ENTITY % pTypeR_PredOuConcept

"GENERALISATION|PARTICULARISATION|ESSENTIEL">

<!ENTITY % pRoleListePred

"A_PRESUPPOSITION|A_IMPLICATION|A_DEFINITION

|A_EXPLICITATION|PARTICIPE_A">

<!ENTITY % pStatutListePred

"SCENARIO|DEFINITION">

<!ENTITY % pTypeListePred

"ORDRE_TEMP|ORDRE_SPATIAL|SIMULTANEITE">

<!ENTITY % pConcerne

"AFFIXE|AUTRE">

<!ENTITY % pTypeTrait_Sem

"PROPRIETE|CLASSE|DOMAINE|DISTINCTIF|PRAGMATIQUE

|DIVERS|CONNOTATION">

<!ENTITY % pValuation

"MONOVALUE|MULTIVALUE">

<!ENTITY % pTypeListe

"BINAIRE|LISTE_FERMEE|LISTE_OUVERTE">

<!ENTITY % pStructureListe

"TREILLIS_TOTAL|TREILLIS_PARTIEL|HIERARCHIE_TOTALE

|HIERARCHIE_PARTIELLE">

<!ENTITY % pTypePredOuConcept

"LEXICAL|GENERALISE|PRIMITIF|LEXICAL_PRIMITIF

|TROU_LEXICAL">

<!ENTITY % pStatutAjoute

"FILTRE_AJOUTE|FILTRE|FORCE">

 

5. Entities custom.ent

<!--Consortium GENELEX @(#) custom.ent 3.3@(#) 94/09/08 13:26:36 -->

<!-- This file is accessible to the user, who can define additional attribute values in it by replacing the text "_VALEURS_A_DEFINIR_XXX" of an entity with the list of values required: "VALEUR1|VALEUR2|...|VALEURn".

For a number of the elements that use these entities, a minimal list of values is defined elsewhere, in the files "syntaxe.ent" and "semant.ent".

Extending this mechanism to other attributes could be considered.

-->
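
<!-- Illustrative sketch, not part of the distributed file. To make two extra relation types available, the placeholder of pTypeR_Usem_cust could be replaced as shown below; the values CAUSALITE and INSTRUMENT_DE are hypothetical.

<!ENTITY % pTypeR_Usem_cust "CAUSALITE|INSTRUMENT_DE" >

The type attribute of R_Usem would then accept SANS_T, the values of pTypeR_Usem (PARADIGMATIQUE, DERIVATION, COLLOCATION) and these two values.

-->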

<!ENTITY % pEtiquetteSynt_cust "_VALEURS_A_DEFINIR_ES" >

<!ENTITY % pSsCatSynt_cust "_VALEURS_A_DEFINIR_SCS" >

<!ENTITY % pFonction_cust

"TETE|SUJET|OBJET_DIRECT|OBJET_INDIRECT|ATTRIBUT_SUJET

|ATTRIBUT_OBJET|EPITHETE_GAUCHE|EPITHETE_DROIT

|SPECIFIEUR|MODIFIEUR|GENITIF" >

<!ENTITY % pRoleTh_cust

"AGENT|PATIENT|DESTINATAIRE|SOURCE|BUT|CAUSE|MANIERE

|LOCATIF|TEMPS|INSTRUMENT|THEME" >

<!ENTITY % pPonderation_cust "_VALEURS_A_DEFINIR_PO">

<!ENTITY % pStatutInfoArg_cust "_VALEURS_A_DEFINIR_SIA">

<!ENTITY % pTypeTrait_Sem_cust "_VALEURS_A_DEFINIR_TTS">

<!ENTITY % pTypeR_Usem_cust "_VALEURS_A_DEFINIR_TRU" >

<!ENTITY % pSsTypeR_Usem_cust "_VALEURS_A_DEFINIR_SRU" >

<!ENTITY % pTypeR_PredOuConcept_cust "_VALEURS_A_DEFINIR_TR" >

<!ENTITY % pRoleListePred_cust "_VALEURS_A_DEFINIR_RLP">

<!ENTITY % pStatutListePred_cust "_VALEURS_A_DEFINIR_SLP" >

<!ENTITY % pTypeListePred_cust "_VALEURS_A_DEFINIR_TLP" >

6. DTD syntaxe.dtd

<!--Consortium GENELEX @(#) syntaxe.dtd 4.2@(#) 94/06/23 14:14:36 -->

 

<!-- ************** TO THE ATTENTION OF USERS ******************

Your remarks concerning the DTD will be examined by the GENELEX consortium, which will distribute any new version that may result.

**************************************************************** -->

<!ELEMENT GenelexSyntaxe - O (

Usyn+ &

Description+ &

Self+ &

IntervConst* &

ComportAppele* &

Optionnalite* &

Construction* &

Position_C* &

Position_S* &

Insertion* &

Syntagme_T* &

Syntagme_NT_C* &

Syntagme_NT_S* &

MdC* &

TransfDescription* &

ModifConstruction* &

ModifPosition* &

TransfSyntagme* &

ModifSyntagme_T* &

ModifSyntagme_NT* &

ModifIntervConst* &

Trait_Lex* &

Trait_Introd* &

Trait_Prep* &

Trait_Conj* &

Trait_ProRel* &

Trait_ProIntrog* &

Trait_RefLex* &

Trait_RefIntrod* &

Trait_RefPrep* &

Trait_RefConj* &

Trait_RefProRel* &

Trait_RefProIntrog* &

Trait_Mode* &

Trait_Temps* &

Trait_Personne* &

Trait_Genre* &

Trait_Nombre* &

Trait_NombrePosseur* &

Trait_SsCatMorph* &

Trait_SsCatSynt* &

Trait_Aux* &

Trait_Pronominal* &

Trait_Neg* &

Trait_Accord* &

Trait_Passif* &

Trait_Tournure* &

Trait_Coref* &

Trait_Aspect* &

Trait_Bin* &

Trait_Libre* &

RoleTh* &

Fonction*)>

<!-- ********************************************************* -->

<!-- ***** SYNTACTIC UNIT AND DESCRIPTION ***** -->

<!-- ********************************************************* -->

<!ELEMENT Usyn - O ((Composition* & TransfUsyn*), Corresp_Usyn_Usem*)>

<!ATTLIST Usyn

id ID #REQUIRED

%pGlose

attestation CDATA #IMPLIED

combve IDREF #IMPLIED

description IDREF #REQUIRED

description_l IDREFS #IMPLIED

transfdescription_l IDREFS #IMPLIED>

<!-- The attestation field specifies the source of the recorded usage (name or title of the dictionary, of the author's text, or of the linguistics article).

The description attribute records the base description;

the description_l list records the transformed descriptions;

the transfdescription_l list records the transformations between the descriptions associated with the Usyn; these transformations can operate between the base description and the transformed ones, but also among the transformed ones themselves. -->

<!ELEMENT Corresp_Usyn_Usem - O EMPTY>

<!ATTLIST Corresp_Usyn_Usem

usem_cible IDREF #REQUIRED

correspondance IDREF #IMPLIED>

<!-- These elements are embedded in the Usyn; they serve to connect the syntactic and semantic layers. For a given Usyn, there are as many Corresp_Usyn_Usem as Usems pointed to by this Usyn. A Usem can be pointed to by several Usyns, each time with the correspondence information held in the Corresp_Usyn_Usem.

usem_cible designates the Usem pointed to.

correspondance specifies all the information that completes this correspondence (filtering, argument-position correspondence, semantic enrichment due to the context) and can be empty if nothing needs to be specified.

A Usem is pointed to via the usem_cible attribute by as many Corresp_Usyn_Usem as it has possible syntactic contexts (described by the descriptions associated with the Usyns).

-->

 

<!ELEMENT Description - O (Condition*) >

<!ATTLIST Description

id ID #REQUIRED

%pGlose

um_representante CDATA #IMPLIED

self IDREF #REQUIRED

construction IDREF #IMPLIED>

<!-- ********************************************************* -->

<!-- ***** SELF ***** -->

<!-- ********************************************************* -->

<!ELEMENT Self - O EMPTY>

<!ATTLIST Self

id ID #REQUIRED

syntagme_nt_s IDREF #IMPLIED

syntagme_nt_s_l IDREFS #IMPLIED

transfsyntagme_l IDREFS #IMPLIED

IntervConst IDREF #IMPLIED

comportappele_l IDREFS #IMPLIED>

<!-- The syntagme_nt_s field is instantiated only for compound Usyns; it then expresses their internal structure, possibly reduced to the syntagmatic label, by means of a Syntagme_NT_S with or without rewriting.

Ex: mettre en marche (to start up) SV

Likewise, the syntagme_nt_s_l and transfsyntagme_l fields concern only compound Usyns and are used to record possible transformations of the internal structure.

The IntervConst field gives the realizations of Self that take part in the external construction:

- as the occupant of the construction, if the latter describes a context of occurrence into which the Self fits,

- as the predicate associated with a construction that describes a complementation pattern.

The comportappele_l list makes it possible to indicate the alternative behaviors of Self when it is called by an element not described in its construction. -->

<!ELEMENT IntervConst - O EMPTY>

<!ATTLIST IntervConst

id ID #REQUIRED

fonction IDREF #IMPLIED

roleth_l IDREFS #IMPLIED

syntagme_t_l IDREFS #REQUIRED>

<!-- The IntervConst element contains terminal syntagms.

Variations in the realization of Self can be expressed in it:

Ex: N + [Nombre : PLURIEL]

Ex: V

V + [Pronominal : SE]

V + [Pronominal : SE_EN]

For simple words, a discrepancy is allowed between the functional category and the grammatical category of the morphological unit.

Ex: description of the adjectival behavior of the Morphological Unit of the NOUN abricot (apricot)

For syntactic compounds, the (external) functional category of the compound is recorded; it may differ from the label of its internal structural syntagm.

Ex: mettre en œuvre (to implement) (VERBE / SV)

In addition, a function and thematic roles can be indicated on this element: these are the values carried by the Self when it is inserted into the construction. -->

<!ELEMENT ComportAppele - O EMPTY>

<!ATTLIST ComportAppele

id ID #REQUIRED

fonction IDREF #IMPLIED

roleth_l IDREFS #IMPLIED

syntagme_t IDREF #REQUIRED>

<!-- ComportAppele records a behavior of Self when it is called by an element not described in its construction.

A behavior groups together:

- a terminal syntagmatic label and a list of features, expressing the functional category as a called element and the associated restrictive features; this association is equivalent to a terminal syntagm

Ex: PREP + [SsCatSynt : LIEU]

(if the Self carries an IntervConst, the syntagmatic label of the syntagm of the ComportAppele must be the same as the label of one of the syntagms of that IntervConst)

- a function as a called element

- a list of thematic roles -->

<!-- ********************************************************* -->

<!-- ***** CONSTRUCTION ***** -->

<!-- ********************************************************* -->

<!ENTITY % pConstSyntNT

"%pGlose

etiquettesynt (SANS_E

|%pEtiquetteSynt_NT

|%pEtiquetteSynt_cust) SANS_E

solidarite CDATA #IMPLIED

optionnalite IDREF #IMPLIED">

<!-- The solidarite attribute indicates, by means of hyphens, the pairs of positions that are bound together;

Ex: P0 SELF-P1-P2 -->

<!ELEMENT Optionnalite - O (ConditionOpt*)>

<!ATTLIST Optionnalite

id ID #REQUIRED

%pGlose

libelle CDATA #REQUIRED>

<!ELEMENT ConditionOpt - O (SiOpt+, AlorsOpt+)>

<!ELEMENT (SiOpt|AlorsOpt) - O EMPTY>

<!ATTLIST (SiOpt|AlorsOpt)

negation (%pBooleen) NON

nieme_position NUMBER #REQUIRED>

<!-- Optionality is expressed:
- on the one hand by a label (libelle) showing, for each position,
  by means of parentheses whether it can be deleted in a
  realisation of the construction; all the positions of the
  construction or phrase are listed in this field, but the
  insertion point of the Self does not appear in it
  Ex: P0 (P1) P2 (P3)
- on the other hand by conditions expressing possible
  interdependencies between positions -->
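
<!-- Illustrative sketch, not part of the GENELEX model text: how the
     label P0 (P1) P2 (P3) might be encoded as an Optionnalite
     instance. The id value is hypothetical. Two points are
     assumptions: that negation="OUI" reads as "the position is not
     realised", and that ranks start at 0 as they do for position
     lists, so that P1 has rank 1 and P3 has rank 3. Under those
     assumptions the condition says: if P1 is erased, then P3 is
     realised.

     <Optionnalite id="OPT1" libelle="P0 (P1) P2 (P3)">
       <ConditionOpt>
         <SiOpt negation="OUI" nieme_position="1">
         <AlorsOpt negation="NON" nieme_position="3">
       </ConditionOpt>
     </Optionnalite>
-->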

<!ELEMENT Construction - O EMPTY>

<!ATTLIST Construction

id ID #REQUIRED

%pConstSyntNT

squelettique (%pBooleen) NON

listepositions (%pTypeListPos) FERMEE

insereself NUMBER #IMPLIED

trait_l IDREFS #IMPLIED

position_c_l IDREFS #REQUIRED>

<!-- The squelettique attribute indicates whether the element is a
construction skeleton, which must then be paired with a
ModifConstruction to obtain a full Construction.
The listepositions attribute states whether the list position_c_l
is to be read as a closed list (FERMEE: all positions are
given) or an open one (OUVERTE).
The insereself attribute, when filled in, gives the insertion point
of Self in the list of positions: before the position whose rank
equals its value.
The list position_c_l refers to Position_C elements.
It is an ordered list: the order of its elements is the canonical
order (first rank = 0), and the elements can later be referenced
by their rank in this list -->
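
<!-- Illustrative sketch, not part of the GENELEX model text: a closed
     Construction with three positions in canonical order, Self being
     inserted before the position of rank 1, which gives the order
     P0 SELF P1 P2. All id values are hypothetical; the label "P" is
     the one used in the Syntagme_NT_C example of this DTD, and its
     membership in the label entities is assumed here.

     <Position_C id="POS0">
     <Position_C id="POS1">
     <Position_C id="POS2">
     <Construction id="CONST1"
                   etiquettesynt="P"
                   listepositions="FERMEE"
                   insereself="1"
                   position_c_l="POS0 POS1 POS2">
-->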

<!-- ********************************************************* -->

<!-- ***** POSITION AND INSERTION ***** -->

<!-- ********************************************************* -->

<!ENTITY % pPosition

"%pGlose

repetable (SANS_B|%pBooleen) SANS_B

fonction IDREF #IMPLIED

roleth_l IDREFS #IMPLIED">

<!-- The repetable attribute indicates whether a position can be
realised several times -->

<!ELEMENT (Position_C|Position_S) - O EMPTY>

<!ATTLIST Position_C

id ID #REQUIRED

%pPosition

syntagme_c_l IDREFS #IMPLIED

transfsyntagme_l IDREFS #IMPLIED>

<!ATTLIST Position_S

id ID #REQUIRED

%pPosition

syntagme_s_l IDREFS #REQUIRED

transfsyntagme_l IDREFS #IMPLIED>

<!-- The syntagme_c_l attribute references the possible occupants
of a Position_C:
- terminal phrase (Syntagme_T)
- non-terminal phrase whose rewriting may or may not be
  described (Syntagme_NT_C)
The syntagme_s_l attribute references the possible occupants
of a Position_S:
- terminal phrase (Syntagme_T)
- non-terminal phrase whose rewriting may or may not be
  described (Syntagme_NT_S) -->

<!ELEMENT Insertion - O (CheminPosition)>

<!ATTLIST Insertion

id ID #REQUIRED

obligatoire (SANS_B|%pBooleen) SANS_B

retire_syntagme_c_l IDREFS #IMPLIED

retire_transfsyntagme_l IDREFS #IMPLIED>

<!-- Insertion into a structural phrase is used only to represent
an insertion that refers to a position described in the external
syntactic construction: the CheminPosition element gives access
to that position.
The attributes of the Insertion state the modifications, if any,
to apply to the referenced position: removal of phrases and
transformations between phrases. Repeatability remains that of
the referenced external position.
Ex: the compound "mettre en œuvre" has in its external
construction a direct-object position containing, for instance,
a noun phrase and a personal pronoun:
    mettre en œuvre un processus, le mettre en œuvre
here insertion is possible only for the SN:
    mettre un processus en œuvre
The obligatoire attribute indicates whether, when the referenced
external position is realised, insertion is obligatory or
optional -->

<!-- ********************************************************* -->

<!-- ***** CONDITION ***** -->

<!-- ********************************************************* -->

<!ELEMENT Condition - O (Si+, Alors+)>

<!ATTLIST Condition

appellation CDATA #IMPLIED>

<!-- Conditions express mutual constraints between the
realisations of positions.
Ex: if P0 == P[SsCatSynt : COMPLETIVE]
    then P1 != P[SsCatSynt : COMPLETIVE]
The lists of predicates (Si and Alors) express conjunctions
over those predicates.
Disjunctions are expressed by means of the list of Conditions
carried by the Description ("and" list). -->

<!ELEMENT (Si|Alors) - O (CheminPosition|CheminSyntagme

|SelectIntervConst)>

<!ATTLIST (Si|Alors)

portee (%pPortee) EXTERNE

negation (%pBooleen) NON>

<!-- A predicate selects:
- a phrase or a position of the external construction
  (portee field EXTERNE),
- a phrase or a position of the structural phrase of the Self
  (portee field INTERNE),
- a realisation of the Self as a construction participant
  (portee field INTERVENANT).
Selection is done
- for a phrase, through CheminSyntagme,
- for a position, through CheminPosition,
- for a realisation as a participant, through
  SelectIntervConst.
A predicate may carry a negation, which expresses the
inhibition of the phrase or position pointed to. -->

<!ELEMENT CheminSyntagme - O (CheminSyntagme?)>

<!ATTLIST CheminSyntagme

nieme_position NUMBER 0

syntagme IDREF #REQUIRED>

<!-- This element selects a particular phrase.
Recursion is used to descend into a possible rewriting.
The element always ends on a phrase.
Positions are referenced by nieme_position, which gives their
rank in the list where they appear;
the value 0 references the first element of the list. -->
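
<!-- Illustrative sketch, not part of the GENELEX model text: one
     possible encoding of the example constraint given under
     Condition (if P0 is a completive, P1 is not a completive),
     combining Condition, Si/Alors and CheminSyntagme. The ids
     SYNT_COMPL0 and SYNT_COMPL1 are hypothetical; they are assumed
     to identify the completive phrases listed as occupants of the
     positions of rank 0 and rank 1 of the external construction.

     <Condition>
       <Si portee="EXTERNE" negation="NON">
         <CheminSyntagme nieme_position="0" syntagme="SYNT_COMPL0">
       </Si>
       <Alors portee="EXTERNE" negation="OUI">
         <CheminSyntagme nieme_position="1" syntagme="SYNT_COMPL1">
       </Alors>
     </Condition>
-->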

<!ELEMENT CheminPosition - O (CheminSyntagme?,PositionBut)>

<!-- For a given construction or phrase, this element selects one
of its positions, possibly one that appears in the rewriting of
a phrase.
The PositionBut element indicates the selected position;
if that position appears in the rewriting of a phrase, the
CheminSyntagme element is used to reach that phrase -->

<!ELEMENT PositionBut - O EMPTY>

<!ATTLIST PositionBut

nieme_position NUMBER 0>
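
<!-- Illustrative sketch, not part of the GENELEX model text: a
     CheminPosition that descends into the rewriting of a phrase.
     The id SNT1 is hypothetical and is assumed to name a
     non-terminal phrase occupying the position of rank 2; the path
     then selects the position of rank 0 inside the rewriting of
     that phrase.

     <CheminPosition>
       <CheminSyntagme nieme_position="2" syntagme="SNT1">
       <PositionBut nieme_position="0">
     </CheminPosition>

     When the target position belongs directly to the construction,
     the CheminSyntagme element is simply left out.
-->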

<!ELEMENT SelectIntervConst - O EMPTY>

<!ATTLIST SelectIntervConst

syntagme_t IDREF #REQUIRED>

<!-- *********************************************************** -->

<!-- ***** TERMINAL AND NON-TERMINAL PHRASES ****** -->

<!-- *********************************************************** -->

<!ELEMENT Syntagme_T - O EMPTY>

<!ATTLIST Syntagme_T

id ID #REQUIRED

%pGlose

etiquettesynt (SANS_E

|%pEtiquetteSynt_T

|%pEtiquetteSynt_cust) SANS_E

trait_l IDREFS #IMPLIED>

<!ELEMENT Syntagme_NT_C - O EMPTY>

<!ATTLIST Syntagme_NT_C

id ID #REQUIRED

%pConstSyntNT

listepositions (%pTypeListPos) FERMEE

insereself NUMBER #IMPLIED

trait_l IDREFS #IMPLIED

position_c_l IDREFS #IMPLIED>

<!-- The list position_c_l refers to Position_C elements.
It is an ordered list: the order of its elements is the canonical
order, and the elements can later be referenced by their rank in
this list. -->

<!ELEMENT Syntagme_NT_S - O EMPTY>

<!ATTLIST Syntagme_NT_S

id ID #REQUIRED

%pConstSyntNT

listepositions (%pTypeListPos) FERMEE

insereinsertion_l NUMBERS #IMPLIED

trait_l IDREFS #IMPLIED

position_s_l IDREFS #IMPLIED

insertion_l IDREFS #IMPLIED>

<!-- The list position_s_l refers to Position_S elements;
the list insertion_l refers to Insertion elements.
Both are ordered lists: their elements can later be referenced
by their rank in their list.
The list of integers insereinsertion_l gives the points at which
the Insertions of insertion_l are inserted among the Positions -->

<!-- Syntagme_T is a terminal position occupant (label reduced to
the grammatical category, or to "e" for a trace).
Syntagme_NT_C/S is a non-terminal position occupant; its list of
positions (#IMPLIED) is used when its rewriting is to be
specified.
Syntagme_NT_S is also used to describe the internal structure of
a compound unit (syntagme_nt_s field of the Self).
The list trait_l refers to restrictive features and thus makes it
possible to specify a set of constraints on a phrase: lexical,
morphological, morpho-syntactic, syntactico-semantic or even
semantic constraints.
The appellation field records the usual name of the phrase.
Ex: Syntagme_NT_C
      etiquettesynt = "P"
      trait_l -> [Mode : INFINITIF]
      appellation = "phrase infinitive"
The category "e" lets proponents of generative grammar record
traces and treat them as phantom phrases to which the
necessary restrictions can be attached.
Ex: Syntagme_T
      etiquettesynt = "e"
      trait_l -> [Personne : 3][Nombre : SINGULIER]
      appellation = "elt vide" -->

<!-- ********************************************************* -->

<!-- ***** COMPOSITION ***** -->

<!-- ********************************************************* -->

<!ELEMENT Composition - O (R_ComposeUm|R_ComposeUsyn)+>

<!-- The Composition elements carried by the syntactic unit record
the alternative lists of lexicalisations:
Ex: (avoir admiration pour)
    (éprouver admiration pour)
    (éprouver admiration envers)
    (porter admiration à) -->

<!-- The content list of R_ComposeUm and R_ComposeUsyn gives the
list of components for one given composition alternative. -->

<!-- Components are referenced below by RefLex features carrying
two index values:
- the index of the composition in the list of compositions
- the index of the component in the list of components
Ex: [RefLex : 1,2]
This mechanism is used:
- either in the internal structural phrase of Self,
- or in the composition mode MdC. -->

<!ENTITY % pCompose

"type (%pTypeComposant) APPELE">

<!ELEMENT R_ComposeUm - O (RestrictUm*)>

<!ATTLIST R_ComposeUm

%pCompose

um IDREF #REQUIRED>

<!ELEMENT R_ComposeUsyn - O EMPTY>

<!ATTLIST R_ComposeUsyn

%pCompose

usyn IDREF #REQUIRED

mdc IDREF #IMPLIED>

<!-- When the formation of the compound is to be recorded with
respect to the component Usyn:
- lexicalisation of its positions by other components
  (only in the case of a caller behaviour)
and/or
- inheritance of its positions, possibly with added
  restrictions,
a MdC describing all of these phenomena is associated with it.
NB: The lexicalisation of a position of a caller Usyn by another
component is done by introducing a phrase carrying a feature
[RefLex : nieme_cposition,nieme_cposant] into the MdC of the
Usyn. This phrase may also carry other restrictive features;
in that case one must make sure that these features are
compatible with the MdC and the Self of the Usyn acting as the
called component.
Ex: The MdC of the caller states:
      SN[RefLex : 3,1][SsCatSynt : DET_VIDE]
    The called component (3,1) must then not impose, in its
    MdC, a lexicalisation of its Determiner position. -->

<!ELEMENT MdC - O (HeritePosition* & FiltreSelf?)>

<!ATTLIST MdC

id ID #REQUIRED

%pGlose>

<!-- The MdC describes the composition mode of a compound Usyn with
respect to one of its component Usyn. It specifies:
- the constraints placed by the compound Usyn on the component
  Usyn,
- the mutual organisation of the component Usyn, by stating
  which component occupies which position of another component.
The MdC applies to the component Usyn and makes it possible to
- inherit positions of its construction
- filter its Self
A position not referenced by the MdC is inhibited. When no filter
is specified on the Self, the Self is inherited by default -->

<!ELEMENT HeritePosition - O (CheminPosition)>

<!ATTLIST HeritePosition

destination (%pDestination) EXTERIEUR

optionnel (HERITAGE

|%pBooleen) HERITAGE

modifposition IDREF #IMPLIED>

<!-- The destination attribute indicates whether the inherited
position ends up outside or inside the compound.
The optionnel attribute records possible changes to the
optionality of the inherited position. -->

<!ELEMENT FiltreSelf - O EMPTY>

<!ATTLIST FiltreSelf

modifIntervConst IDREF #IMPLIED

modifsyntagme_nt IDREF #IMPLIED>

<!-- A FiltreSelf must perform at least one of the two operations:
- modification of the IntervConst
- modification of the internal structural phrase. -->

<!-- ********************************************************** -->

<!-- ***** TRANSFORMATIONS ***** -->

<!-- ********************************************************** -->

<!-- There are three types of transformations:
- TransfUsyn: transformations between two syntactic units
  - derived from the same Um
    Ex: neutrality
  - derived from different Um
    Ex: syntactic derivation
- TransfDescription: transformations between two Descriptions
  (that is, two Self/Construction pairs).
  Ex: passivisation
- TransfSyntagme: transformations between occupants of a
  Position, that is, phrases, terminal or not.
  Ex: pronominalisation -->

<!ELEMENT TransfUsyn - O (ModifDescription?)>

<!ATTLIST TransfUsyn

%pGlose

usyn_resultat IDREF #REQUIRED>

<!-- TransfUsyn elements are pointed to by the source Usyn and
indicate the result Usyn.
They apply between the base Description of the source Usyn and
the base Description of the result Usyn -->

<!ELEMENT TransfDescription - O (ModifDescription?)>

<!ATTLIST TransfDescription

id ID #REQUIRED

%pGlose

description_origine IDREF #REQUIRED

description_resultat IDREF #REQUIRED>

<!-- TransfDescription elements are carried by the Usyn and operate
between the Descriptions of that Usyn.
The source of a TransfDescription may be the base Description
but also a transformed Description of the Usyn. -->

<!ELEMENT ModifDescription - O EMPTY>

<!ATTLIST ModifDescription

%pGlose

modifconstruction IDREF #IMPLIED

modifIntervConst IDREF #IMPLIED

modifsyntagme_nt IDREF #IMPLIED>

<!-- ModifDescription elements can account for three phenomena:
- transformation of the construction
- transformation of the realisations of Self associated with
  the external construction as occupant or predicate
  (IntervConst field of Self)
  Ex: V -> V[Passif : PLUS]
- transformation of the internal structural phrase describing
  a compound
  Ex: pour les beaux yeux de SN
      -> pour ses beaux yeux
In the first case there is always a choice between a
computational mode and a descriptive mode:
- in computational mode, the Construction pointed to by the
  result Description is declared skeletal (squelettique): the
  result construction itself is to be built from this skeleton
  by applying the modifications given on ModifConstruction
- in descriptive mode, the Construction pointed to by the
  result Description is fully described: the
  ModifConstruction then records correspondences, at varying
  levels of detail, between the source Construction and the
  result Construction.
In the other two cases the results (IntervConst and
Syntagme_NT_S) are fully described: the mode is descriptive -->
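
<!-- Illustrative sketch, not part of the GENELEX model text: a
     passivisation recorded as a TransfDescription between two
     Descriptions of the same Usyn. All id values are hypothetical.
     MC1 is assumed to name a ModifConstruction and MIC1 a
     ModifIntervConst (for instance one recording
     V -> V[Passif : PLUS]).

     <TransfDescription id="TD1"
                        description_origine="DESC_ACTIF"
                        description_resultat="DESC_PASSIF">
       <ModifDescription modifconstruction="MC1"
                         modifIntervConst="MIC1">
     </TransfDescription>

     In computational mode, the Construction pointed to by
     DESC_PASSIF would carry squelettique="OUI"; in descriptive
     mode it would be fully described.
-->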

<!ELEMENT ModifConstruction - O (TransfPosition*)>

<!ATTLIST ModifConstruction

id ID #REQUIRED

%pGlose

etiquettesynt (HERITAGE

|%pEtiquetteSynt_NT

|%pEtiquetteSynt_cust) HERITAGE

insereself NUMBER #IMPLIED

solidarite CDATA #IMPLIED

optionnalite IDREF #IMPLIED

retire_trait_l IDREFS #IMPLIED

ajoute_trait_l IDREFS #IMPLIED>

<!-- In descriptive mode, this element records, possibly only
partially, information about the passage from the source
construction to the result construction; at the very least this
may consist of correspondences between positions given by the
TransfPosition* content.
In computational mode, this element makes it possible to build
the result construction from the source construction and the
skeleton: the skeleton is filled out with elements or attributes
taken from the source construction or specified on the
ModifConstruction.
If the attributes etiquettesynt, insereself, solidarite,
optionnalite and those grouped under pGlose
- are filled in: they fill (or even overwrite) the
  corresponding attributes of the skeleton
- are not filled in: the corresponding attributes that are
  also not filled in on the skeleton construction are
  inherited from the source construction
The features of the source Construction are inherited, subject to
the removals and additions recorded here.
The content list of TransfPosition elements expresses how the
positions of the result construction are formed from the
positions of the skeleton and those of the source construction. -->

<!ELEMENT TransfPosition - O (CheminPosition, CheminPosition?)>

<!ATTLIST TransfPosition

%pGlose

modifposition IDREF #IMPLIED>

<!-- Selection of the source Position
The first CheminPosition content element points to a position
of the construction at the source of the transformation.
Selection of the result Position
The result position is optionally selected by the second
CheminPosition content element.
The modifposition attribute indicates the modifications to make
in order to go from the source position to the result position -->

<!ELEMENT ModifPosition - O EMPTY>

<!ATTLIST ModifPosition

id ID #REQUIRED

%pGlose

repetable (HERITAGE

|%pBooleen) HERITAGE

fonction IDREF #IMPLIED

roleth_l IDREFS #IMPLIED

retire_syntagme_l IDREFS #IMPLIED

retire_transfsyntagme_l IDREFS #IMPLIED

ajoute_syntagme_l IDREFS #IMPLIED

ajoute_transfsyntagme_l IDREFS #IMPLIED

transfsyntagme_l IDREFS #IMPLIED>

<!-- ModifPosition works in the same way as ModifConstruction;
in particular there is the same choice between a computational
mode (the result position is to be built from the position
pointed to on the skeleton and the information recorded here)
and a descriptive mode (the result position is already fully
filled in). -->

<!ELEMENT TransfSyntagme - O EMPTY>

<!ATTLIST TransfSyntagme

id ID #REQUIRED

%pGlose

syntagme_origine IDREF #REQUIRED

syntagme_resultat IDREF #IMPLIED

modifsyntagme IDREF #IMPLIED>

<!-- TransfSyntagme elements record the transformation relations
that hold between:
- phrases occupying the same position (terminal or not); the
  source and result phrases of the transformation are then
  both necessarily filled in,
- phrases put in correspondence during a higher-level
  transformation; in that case the result phrase is not
  necessarily filled in if one is inside a Construction
  transformation working in computational mode. -->

<!ENTITY % pModifSyntagme

"%pGlose

retire_trait_l IDREFS #IMPLIED

ajoute_trait_l IDREFS #IMPLIED">

<!ELEMENT ModifSyntagme_T - O EMPTY>

<!ATTLIST ModifSyntagme_T

id ID #REQUIRED

%pModifSyntagme>

<!ELEMENT ModifSyntagme_NT - O ((TransfPosition|TransfInsertion)*)>

<!ATTLIST ModifSyntagme_NT

id ID #REQUIRED

%pModifSyntagme

etiquettesynt (HERITAGE

|%pEtiquetteSynt_NT

|%pEtiquetteSynt_cust) HERITAGE

insereself NUMBER #IMPLIED

insereinsertion_l NUMBERS #IMPLIED

retire_position_l NUMBERS #IMPLIED

retire_insertion_l NUMBERS #IMPLIED

solidarite CDATA #IMPLIED

optionnalite IDREF #IMPLIED>

<!-- Modifying a phrase consists of:
- removing or adding features,
- and, for a non-terminal phrase:
  - changing the label,
  - changing the optionality, the solidarity, and any
    insertion points recorded,
  - modifying the positions and insertions of the rewriting,
    if there is one -->

<!ELEMENT TransfInsertion - O (CheminInsertion, CheminInsertion)>

<!-- The TransfInsertion element relates an insertion of the source
structural phrase to the corresponding insertion of the result
structural phrase. -->

<!ELEMENT CheminInsertion - O (CheminSyntagme?,InsertionBut)>

<!ELEMENT InsertionBut - O EMPTY>

<!ATTLIST InsertionBut

nieme_insertion NUMBER 0>

<!-- The nieme_insertion attribute selects an insertion by its rank
in the list insertion_l where it appears -->

<!ELEMENT ModifIntervConst - O EMPTY>

<!ATTLIST ModifIntervConst

id ID #REQUIRED

fonction IDREF #IMPLIED

roleth_l IDREFS #IMPLIED

retire_syntagme_t_l IDREFS #IMPLIED

ajoute_syntagme_t_l IDREFS #IMPLIED

transfsyntagme_l IDREFS #IMPLIED>

<!-- The ModifIntervConst element records the modifications
undergone, in transformation or in composition, by the
IntervConst content of the Self. -->

<!-- ************************************************************ -->

<!-- ***** RESTRICTIVE FEATURES ***** -->

<!-- ************************************************************ -->

<!-- Restrictive features specify the phrases they are combined
with
Ex: P[Conj : que][Mode : SUBJONCTIF]  => completive clause
    P[SsCatSynt : RELATIVE]           => relative clause
    P[Mode : INFINITIF]               => infinitive clause
    P[SsCatSynt : COORDONNE]          => coordinated clause
    P[SsCatSynt : SUBORDONNEE]        => subordinate clause
    SN[Nombre : PLURIEL]
    SN[Coref : I] -->

<!--******************** LEXICAL features ***********************-->

<!--************************************************************-->

<!ENTITY % pTrait_Lexical

"id ID #REQUIRED

valeur CDATA #IMPLIED

um IDREF #IMPLIED">

<!-- These features express a lexical restriction on a phrase,
- either by entering in the "valeur" field a character string
  representing the written form (this does not disambiguate
  homographs),
- or by referring to a morphological unit (by its identifier).
A mismatch is allowed (based on the gap between morpho-syntactic
category and functional category) between the category of the
phrase that carries the lexical feature and the category of the
Um referred to by that lexical feature
Ex: NOM[Lex : courageux[um : UM04]]
    with, in Morphology:
    Um[id : UM04 ; catgram : ADJECTIF[umg : courageux]] -->

<!ELEMENT Trait_Lex - O EMPTY>

<!ATTLIST Trait_Lex

%pTrait_Lexical

saturesynt (%pBooleen) OUI>

<!ELEMENT Trait_Introd - O EMPTY>

<!ATTLIST Trait_Introd

%pTrait_Lexical>

<!ELEMENT Trait_Prep - O EMPTY>

<!ATTLIST Trait_Prep

%pTrait_Lexical>

<!ELEMENT Trait_Conj - O EMPTY>

<!ATTLIST Trait_Conj

%pTrait_Lexical>

<!ELEMENT Trait_ProRel - O EMPTY>

<!ATTLIST Trait_ProRel

%pTrait_Lexical>

<!ELEMENT Trait_ProIntrog - O EMPTY>

<!ATTLIST Trait_ProIntrog

%pTrait_Lexical>

<!ENTITY % pTrait_RefLexical

"id ID #REQUIRED

nieme_cposition NUMBER 0

nieme_cposant NUMBER 1">

<!-- These features refer, by its coordinates, to the Um or Usyn
entering into the composition of the unit: nieme_cposition gives
the rank in the list of Composition elements carried by the Usyn,
and nieme_cposant the rank in the mixed list of
R_ComposeUm/R_ComposeUsyn carried by the Composition. The value 0
for nieme_cposition refers to all components of rank
nieme_cposant, regardless of the composition -->

<!ELEMENT Trait_RefLex - O EMPTY>

<!ATTLIST Trait_RefLex

%pTrait_RefLexical

saturesynt (%pBooleen) OUI>
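
<!-- Illustrative sketch, not part of the GENELEX model text: a
     first Composition alternative for "avoir admiration pour" and a
     Trait_RefLex pointing into it. All id values are hypothetical,
     the type attribute is left to its default, and ranks are
     assumed to be counted from 1 for both coordinates (the value 0
     on nieme_cposition being reserved for "all compositions").
     Under that assumption [RefLex : 1,2] designates the second
     component of the first composition, here UM_ADMIRATION.

     <Composition>
       <R_ComposeUm um="UM_AVOIR">
       <R_ComposeUm um="UM_ADMIRATION">
       <R_ComposeUm um="UM_POUR">
     </Composition>

     <Trait_RefLex id="TRL1" nieme_cposition="1" nieme_cposant="2">
-->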

<!ELEMENT Trait_RefIntrod - O EMPTY>

<!ATTLIST Trait_RefIntrod

%pTrait_RefLexical>

<!ELEMENT Trait_RefPrep - O EMPTY>

<!ATTLIST Trait_RefPrep

%pTrait_RefLexical>

<!ELEMENT Trait_RefConj - O EMPTY>

<!ATTLIST Trait_RefConj

%pTrait_RefLexical>

<!ELEMENT Trait_RefProRel - O EMPTY>

<!ATTLIST Trait_RefProRel

%pTrait_RefLexical>

<!ELEMENT Trait_RefProIntrog - O EMPTY>

<!ATTLIST Trait_RefProIntrog

%pTrait_RefLexical>

<!--******************* MORPHOLOGICAL features *******************-->

<!--************************************************************-->

<!ELEMENT Trait_Mode - O EMPTY>

<!ATTLIST Trait_Mode

id ID #REQUIRED

valeur (%pMode) INDICATIF>

<!ELEMENT Trait_Temps - O EMPTY>

<!ATTLIST Trait_Temps

id ID #REQUIRED

valeur (%pTemps|COMPOSE) PRESENT>

<!-- The tense feature expresses the tense restrictions attached
to certain turns of phrase.
Ex: être arrivé socialement
    V[Lex : arriver]
     [Temps : COMPOSE]
     [Aux : ETRE] -->

<!ELEMENT Trait_Personne - O EMPTY>

<!ATTLIST Trait_Personne

id ID #REQUIRED

valeur (%pPersonne) 3>

<!ELEMENT Trait_Genre - O EMPTY>

<!ATTLIST Trait_Genre

id ID #REQUIRED

valeur (%pGenre) MASCULIN>

<!ELEMENT Trait_Nombre - O EMPTY>

<!ATTLIST Trait_Nombre

id ID #REQUIRED

valeur (%pNombre) SINGULIER>

<!ELEMENT Trait_NombrePosseur - O EMPTY>

<!ATTLIST Trait_NombrePosseur

id ID #REQUIRED

valeur (%pNombrePosseur) SINGULIER_POSSEUR>

<!--******************* MORPHO-SYNTACTIC features ****************-->

<!--**************************************************************-->

<!ELEMENT Trait_SsCatMorph - O EMPTY>

<!ATTLIST Trait_SsCatMorph

id ID #REQUIRED

valeur (%pSsCatGram) COMMUN>

<!-- The values of the morphological sub-category features
are predefined -->

<!ELEMENT Trait_SsCatSynt - O EMPTY>

<!ATTLIST Trait_SsCatSynt

id ID #REQUIRED

valeur (%pSsCatSynt

|%pSsCatSynt_cust) COORDONNE>

<!-- Syntactic sub-category features may apply to terminal or
non-terminal categories and are user-defined, in addition to
certain values already provided in the GENELEX model.
Ex: un KILO de pommes
    N[SsCatSynt : DETERMINATIF]
    SN[SsCatSynt : DET_VIDE] -->

<!ELEMENT Trait_Aux - O EMPTY>

<!ATTLIST Trait_Aux

id ID #REQUIRED

valeur (%pAux) AVOIR

temps (SANS_T|%pTemps

|COMPOSE) SANS_T

mode (SANS_M|%pMode) SANS_M>

<!-- This feature associates with a given verb (the entry being
described, or a verb in the context of the entry) the auxiliary
that corresponds to a use and must be associated with it.
Ex: se lever (être levé)   //  lever (avoir levé)
    V[Lex : lever]             V[Lex : lever]
     [Aux : ETRE]               [Aux : AVOIR]
     [Pronominal : SE]
A Trait_Aux occurring together with a Trait_Temps whose value is
COMPOSE indicates that the auxiliary is necessarily present in
the use being described.
Ex: être arrivé
    V[Lex : arriver]
     [Aux : ETRE]
     [Temps : COMPOSE]
The temps and mode attributes specify, where necessary, the tense
and mood of the auxiliary itself.
Ex: étant donné
    V[Lex : donner]
     [Temps : COMPOSE]
     [Aux : ETRE[Mode : PARTICIPE][Temps : PRESENT]] -->
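
<!-- Illustrative sketch, not part of the GENELEX model text: the
     last example above written as feature elements referenced from
     a terminal phrase. All id values are hypothetical. PARTICIPE is
     assumed to be a value of the pMode entity, as the bracketed
     notation suggests. The etiquettesynt attribute is left out
     because the label entities are declared elsewhere.

     <Trait_Lex   id="T_LEX1"  valeur="donner">
     <Trait_Temps id="T_TPS1"  valeur="COMPOSE">
     <Trait_Aux   id="T_AUX1"  valeur="ETRE"
                  mode="PARTICIPE" temps="PRESENT">

     <Syntagme_T  id="SYNT_V1" trait_l="T_LEX1 T_TPS1 T_AUX1">
-->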

<!ELEMENT Trait_Pronominal - O EMPTY>

<!ATTLIST Trait_Pronominal

id ID #REQUIRED

valeur (%pPronominal) SE>

<!-- This feature associates with a given verb (the entry being
described, or a verb in the context of the entry) the
NON-REFERENTIAL preverbal particle that corresponds to a use and
must be associated with it (cf. "true" pronominal verbs).
Ex: s'en aller
    V[Lex : aller]
     [Pronominal : SE_EN] -->

<!ELEMENT Trait_Neg - O EMPTY>

<!ATTLIST Trait_Neg

id ID #REQUIRED

valeur (%pNeg) LIBRE>

<!-- The presence of a Trait_Neg indicates that the use described is
in the negative form; the valeur field may in addition specify a
restriction on the lexicalisation of the negation -->

<!ELEMENT Trait_Accord - O EMPTY>

<!ATTLIST Trait_Accord

id ID #REQUIRED

valeur (%pIJKL) I>

<!ELEMENT Trait_Passif - O EMPTY>

<!ATTLIST Trait_Passif

id ID #REQUIRED

valeur (%pBin) PLUS>

<!ELEMENT Trait_Tournure - O EMPTY>

<!ATTLIST Trait_Tournure

id ID #REQUIRED

valeur (%pTournure) INTERROGATIVE>

<!--******************* SYNTACTICO-SEMANTIC features ***********-->

<!--************************************************************-->

<!ELEMENT Trait_Coref - O EMPTY>

<!ATTLIST Trait_Coref

id ID #REQUIRED

valeur (%pIJKL) I>

<!-- Coreference must always be resolvable: if a feature with value
I is present, there is at least one other feature with value I
or NON_I answering it.
Coref features do not force the co-realisation of the phrases
that carry them; if this co-realisation is to be imposed, it is
done as usual by means of Conditions -->

<!--***************** SEMANTIC features ***********************-->

<!--************************************************************-->

<!ELEMENT Trait_Aspect - O EMPTY>

<!ATTLIST Trait_Aspect

id ID #REQUIRED

valeur (%pAspect) PROCESSIF>

<!ELEMENT Trait_Bin - O EMPTY>

<!ATTLIST Trait_Bin

id ID #REQUIRED

nom CDATA #REQUIRED

valeur (%pBin) PLUS>

<!-- This type of feature can be used, for example, to express
"denotational conditions",
Ex: animé, humain -->

<!ELEMENT Trait_Libre - O EMPTY>

<!ATTLIST Trait_Libre

id ID #REQUIRED

nom CDATA #REQUIRED

valeur CDATA #REQUIRED>

<!-- This type of feature can be used to specify semantic classes
or families
Ex: nom : classe
    valeur : vêtement -->

<!-- The Bin and Libre features can also be used for any other
features not predefined in this DTD -->

<!--******************** THEMATIC ROLE ********************-->

<!--************************************************************-->

<!ELEMENT RoleTh - O EMPTY>

<!ATTLIST RoleTh

id ID #REQUIRED

valeur (%pRoleTh_cust) AGENT>

<!--*********************** FUNCTION **************************-->

<!--************************************************************-->

<!ELEMENT Fonction - O EMPTY>

<!ATTLIST Fonction

id ID #REQUIRED

valeur (%pFonction_cust) TETE>

<!-- Entities suffixed with _cust are defined in a user-specific
"custom.ent" file; the user can thus add whatever attribute
values are needed -->

 

7. Constraints syntaxe.ctr

<!--Consortium GENELEX @(#) syntaxe.ctr 4.2@(#) 94/06/23 14:22:02 -->

<!--CONTRAINTE Usyn

combve TYPE CombVE

(description

|description_l) TYPE Description

transfdescription_l TYPE TransfDescription -->

<!--CONTRAINTE Description

self TYPE Self

construction TYPE Construction -->

<!--CONTRAINTE Self

(syntagme_nt_s

|syntagme_nt_s_l) TYPE Syntagme_NT_S

transfsyntagme_l TYPE TransfSyntagme

intervconst TYPE IntervConst

comportappele_l TYPE ComportAppele -->

<!--CONTRAINTE IntervConst

fonction TYPE Fonction

roleth_l TYPE RoleTh

syntagme_t_l TYPE Syntagme_T -->

<!--CONTRAINTE ComportAppele

fonction TYPE Fonction

roleth_l TYPE RoleTh

syntagme_t TYPE Syntagme_T -->

<!--CONTRAINTE Construction

optionnalite TYPE Optionnalite

trait_l TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre)

position_c_l TYPE Position_C -->

<!--CONTRAINTE Position_C

fonction TYPE Fonction

roleth_l TYPE RoleTh

syntagme_c_l TYPE (Syntagme_T|Syntagme_NT_C)

transfsyntagme_l TYPE TransfSyntagme -->

<!--CONTRAINTE Position_S

fonction TYPE Fonction

roleth_l TYPE RoleTh

syntagme_s_l TYPE (Syntagme_T|Syntagme_NT_S)

transfsyntagme_l TYPE TransfSyntagme -->

<!--CONTRAINTE Insertion

retire_syntagme_c_l TYPE (Syntagme_T|Syntagme_NT_C)

retire_transfsyntagme_l TYPE TransfSyntagme -->

<!--CONTRAINTE CheminSyntagme

syntagme TYPE (Syntagme_T|Syntagme_NT_C

|Syntagme_NT_S) -->

<!--CONTRAINTE SelectIntervConst

syntagme_t TYPE Syntagme_T -->

<!--CONTRAINTE Syntagme_T

trait_l TYPE (Trait_Lex|Trait_RefLex

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre) -->

<!--CONTRAINTE Syntagme_NT_C

optionnalite TYPE Optionnalite

trait_l TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre)

position_c_l TYPE Position_C -->

<!--CONTRAINTE Syntagme_NT_S

optionnalite TYPE Optionnalite

trait_l TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_RefLex

|Trait_RefIntrod

|Trait_RefPrep|Trait_RefConj

|Trait_RefProRel

|Trait_RefProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre)

position_s_l TYPE Position_S

insertion_l TYPE Insertion -->

<!--CONTRAINTE R_ComposeUm

um TYPE (Um_S|Um_Agg|Um_C) -->

<!--CONTRAINTE R_ComposeUsyn

usyn TYPE Usyn

mdc TYPE MdC -->

<!--CONTRAINTE HeritePosition

modifposition TYPE ModifPosition -->

<!--CONTRAINTE FiltreSelf

modifintervconst TYPE ModifIntervConst

modifsyntagme_nt TYPE ModifSyntagme_NT -->

<!--CONTRAINTE TransfUsyn

usyn_resultat TYPE Usyn -->

<!--CONTRAINTE TransfDescription

(description_origine

|description_resultat) TYPE Description -->

<!--CONTRAINTE ModifDescription

modifconstruction TYPE ModifConstruction

modifintervconst TYPE ModifIntervConst

modifsyntagme_nt TYPE ModifSyntagme_NT -->

<!--CONTRAINTE ModifConstruction

optionnalite TYPE Optionnalite

(retire_trait_l

|ajoute_trait_l) TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre) -->

<!--CONTRAINTE TransfPosition

modifposition TYPE ModifPosition -->

<!--CONTRAINTE ModifPosition

fonction TYPE Fonction

roleth_l TYPE RoleTh

(retire_syntagme_l

|ajoute_syntagme_l) TYPE (Syntagme_T|Syntagme_NT_C

|Syntagme_NT_S)

(retire_transfsyntagme_l

|ajoute_transfsyntagme_l

|transfsyntagme_l) TYPE TransfSyntagme -->

<!--CONTRAINTE TransfSyntagme

(syntagme_origine

|syntagme_resultat) TYPE (Syntagme_T|Syntagme_NT_C

|Syntagme_NT_S)

modifsyntagme TYPE (ModifSyntagme_T

|ModifSyntagme_NT) -->

<!--CONTRAINTE ModifSyntagme_T

(retire_trait_l

|ajoute_trait_l) TYPE (Trait_Lex|Trait_RefLex

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre) -->

<!--CONTRAINTE ModifSyntagme_NT

(retire_trait_l

|ajoute_trait_l) TYPE (Trait_Lex|Trait_Introd

|Trait_Prep|Trait_Conj

|Trait_ProRel

|Trait_ProIntrog

|Trait_RefLex

|Trait_RefIntrod

|Trait_RefPrep|Trait_RefConj

|Trait_RefProRel

|Trait_RefProIntrog

|Trait_Mode|Trait_Temps

|Trait_Personne|Trait_Genre

|Trait_Nombre

|Trait_NombrePosseur

|Trait_SsCatMorph

|Trait_SsCatSynt

|Trait_Aux|Trait_Pronominal

|Trait_Neg|Trait_Accord

|Trait_Passif|Trait_Tournure

|Trait_Coref|Trait_Aspect

|Trait_Bin|Trait_Libre)

optionnalite TYPE Optionnalite -->

<!--CONTRAINTE ModifIntervConst

fonction TYPE Fonction

roleth_l TYPE RoleTh

(retire_syntagme_t_l

|ajoute_syntagme_t_l) TYPE Syntagme_T

transfsyntagme_l TYPE TransfSyntagme -->

<!--CONTRAINTE (Trait_Lex|Trait_Introd|Trait_Prep|Trait_Conj

|Trait_ProRel|Trait_ProIntrog)

um TYPE (Um_S|Um_Agg|Um_C) -->

8. Entities syntaxe.ent

<!--Consortium GENELEX @(#) syntaxe.ent 4.1@(#) 94/06/23 14:19:20 -->

<!-- ********************* NOTE TO USERS *************************

Your comments on the DTD will be reviewed by the GENELEX

consortium, which will distribute any new version that may

result from them.

*************************************************************** -->

<!ENTITY % pEtiquetteSynt_T

"%pCatGram|e">

<!ENTITY % pEtiquetteSynt_NT

"P|Nbarre|SN|SV|SADJ|SADV|SP">

<!ENTITY % pSsCatSynt "RELATIVE|COMPLETIVE

|COORDONNE|SUBORDONNEE

|INTERROGATIVE_DRI|INTERROGATIVE_DRD

|COMPARATIF|SUPERLATIF

|TEMPS|LIEU|MANIERE|DEGRE|QUANTITE

|COPULE|DET_VIDE|DETERMINATIF">

<!ENTITY % pIJKL "I|J|K|L|NON_I|NON_J|NON_K|NON_L">

<!ENTITY % pAux "ETRE|AVOIR">

<!ENTITY % pPronominal "SE|LE|LA|LES|Y|EN

|SE_LE|SE_LA|SE_LES|SE_Y|SE_EN">

<!ENTITY % pNeg "LIBRE|NE_PAS|NE_PLUS|NE_JAMAIS|NE

|NE_GUERE|NE_POINT|NE_MAIS

|NE_QUE|NE_PAS_QUE|NE_PLUS_QUE

|NE_JAMAIS_QUE|NE_GUERE_QUE|NE_RIEN_QUE">

<!ENTITY % pAspect "PROCESSIF|RESULTATIF|STATIF">

<!ENTITY % pBin "PLUS|MOINS">

<!ENTITY % pTournure "INTERROGATIVE|EXCLAMATIVE">

<!ENTITY % pPortee "EXTERNE|INTERNE|INTERVENANT">

<!ENTITY % pTypeComposant

"APPELANT|APPELANT_APPELE|APPELE">

<!ENTITY % pDestination "EXTERIEUR|INTERIEUR">

<!ENTITY % pTypeListPos "OUVERTE|FERMEE">

 

 

9. DTD morpho.dtd

<!--Consortium GENELEX @(#) morpho.dtd 3.2@(#) 94/09/07 14:35:28 -->

<!-- ********************* NOTE TO USERS *************************

Your comments on the DTD will be reviewed by the GENELEX

consortium, which will distribute any new version that may

result from them.

*************************************************************** -->

<!ELEMENT GenelexMorpho - O (

(Um_S|Um_C|Um_Agg|Um_Aff)* &

Etymon* &

Mfg* &

Mfp* &

CombTM* &

Mfc* &

Comb_Comb*)>

<!-- ********************************************************* -->

<!-- ******* DEFINITION OF MORPHOLOGICAL UNITS *************** -->

<!-- ********************************************************* -->

<!ENTITY % pUmAtt

"id ID #REQUIRED

appellation CDATA #IMPLIED

attestation CDATA #IMPLIED

combve IDREF #IMPLIED

etymon_l IDREFS #IMPLIED">

<!ELEMENT Um_S - O ((Umg|Ump)+ & Derivation* & FormeBreve*)>

<!ATTLIST Um_S

%pUmAtt

catgram (SANS_C|%pCatGram) SANS_C

sscatgram (SANS_SC|%pSsCatGram) SANS_SC

autonomie (SANS_B|%pBooleen) SANS_B

usyn_l IDREFS #IMPLIED>

<!-- The content (Umg|Ump)+ requires a Simple Morphological Unit to

have at least one Graphic Unit or one Phonic Unit.

The Derivation content records any derivations that result in the

Um_S.

The FormeBreve content relates the Um_S in question to other Units

that are short forms of it.

usyn_l points to the various Usyn that describe the syntactic

behaviour(s) of the Um -->

<!ELEMENT Um_C - O (R_Compose+ & FormeBreve*)>

<!ATTLIST Um_C

%pUmAtt

catgram (SANS_C|%pCatGram) SANS_C

sscatgram (SANS_SC|%pSsCatGram) SANS_SC

usyn_l IDREFS #IMPLIED>

<!-- A Compound Morphological Unit has neither Umg nor Ump: its

graphic and phonemic forms are deduced from those of the units

that compose it. Each Component is indicated by an R_Compose

relation.

usyn_l points to the various Usyn that describe the syntactic

behaviour(s) of the Um -->

<!ELEMENT Um_Agg - O ((Umg|Ump)+ & R_Compose+)>

<!ATTLIST Um_Agg

%pUmAtt

obligatoire (SANS_B|%pBooleen) SANS_B>

<!-- An Agglutinated Morphological Unit uses the R_Compose relation

to indicate its incorporated ("agglutinating") elements.

The attribute obligatoire indicates whether use of the

agglutinated form is obligatory or optional, as opposed to use of

the corresponding expanded form. -->

<!ELEMENT Um_Aff - O ((Umg|Ump)+ & CatGram_Select* & CatGram_Result*

& Genre_Result*)>

<!ATTLIST Um_Aff

%pUmAtt

typaff (SANS_T|%pTypaff) SANS_T

usem_aff_l IDREFS #IMPLIED>

<!-- The attribute typaff records the type of an Affix

Morphological Unit; for an affix that takes its type only in a

derivation context, this attribute has the value SANS_T.

The contents CatGram_Select/Result and Genre_Result make it

possible to state, for an Affix Morphological Unit, any

restrictions on the grammatical category

and the gender of the resulting derived Units.

usem_aff_l points to the various Usem_Aff that describe the

meaning(s) of the Um -->

<!ELEMENT CatGram_Result - O EMPTY>

<!ATTLIST CatGram_Result

catgram (SANS_C|%pCatGram) SANS_C>

<!ELEMENT CatGram_Select - O EMPTY>

<!ATTLIST CatGram_Select

catgram (SANS_C|%pCatGram) SANS_C>

<!ELEMENT Genre_Result - O EMPTY>

<!ATTLIST Genre_Result

genre (SANS_G|%pGenre) SANS_G>

<!-- ********************************************************* -->

<!-- ******* GRAPHIC FORM / PHONIC FORM ********** -->

<!-- ********************************************************* -->

<!ENTITY % pUmgpAtt

"nieme NUMBER #IMPLIED

vedette (SANS_B|%pBooleen) SANS_B

appellation CDATA #IMPLIED

attestation CDATA #IMPLIED

combve IDREF #IMPLIED

mf IDREF #IMPLIED

corresp_l NUMBERS #IMPLIED">

<!ELEMENT Umg - O (Lib & Radg*)>

<!ELEMENT Ump - O (Lib & Radp*)>

<!ATTLIST (Umg|Ump)

%pUmgpAtt>

<!-- When a Unit has Graphic and/or Phonic variants, that is,

several Umg and/or Ump, each of these Umg and Ump carries a

nieme giving its variant number. The correspondence between

Umg and Ump is then made through the list of integers corresp_l.

If, in addition, a headword is to be singled out among these

variants, it is marked with the attribute vedette.

The mf field records the inflection mode: leaving the field empty

means that the inflection mode of the Umg/p in question is not

known. For Units that do not inflect

(prepositions), an empty mf is assigned. -->

<!ENTITY % pRadgpAtt

"nieme NUMBER #IMPLIED

contexte_var CDATA #IMPLIED">

<!ELEMENT (Radg|Radp) - O (Lib)>

<!ATTLIST (Radg|Radp)

%pRadgpAtt>

<!-- the radical (stem) has two uses:

- it is used by the Mfg/p to compute inflected forms.

A radical equal to the label of the umg/p is not

recorded as a radical element but only as the

label; it can nevertheless be referred to as

radical nieme 0

- it is used in derivation -->

<!-- ********************************************************* -->

<!-- ******* ETYMOLOGY ********** -->

<!-- ********************************************************* -->

<!ELEMENT Etymon - O (Lib?)>

<!ATTLIST Etymon

id ID #REQUIRED

langue CDATA #IMPLIED

sens CDATA #IMPLIED

date CDATA #IMPLIED

appellation CDATA #IMPLIED>

<!-- ********************************************************* -->

<!-- ******* GRAPHIC AND PHONIC INFLECTION MODE ********** -->

<!-- ********************************************************* -->

<!ENTITY % pMfAtt

"id ID #REQUIRED

%pGlose">

<!ELEMENT (Mfg|Mfp) - O (CombTM_Cff+)>

<!ATTLIST (Mfg|Mfp)

%pMfAtt>

<!ELEMENT CombTM_Cff - O (Cff+)>

<!ATTLIST CombTM_Cff

combtm IDREF #REQUIRED>

<!ELEMENT Cff - O (Retrait,Ajout)>

<!ATTLIST Cff

nieme NUMBER #IMPLIED

nieme_radgp NUMBER 0

contexte_var CDATA #IMPLIED

corresp_l NUMBERS #IMPLIED>

<!-- The attributes nieme and corresp_l relate any variants in the

computation of inflected forms to one another.

Ex: computation of je peux/je puis

The attribute nieme_radgp gives the Nth radical to select in order

to build the inflected form. The value 0 means that the label Lib

is referred to -->

<!ELEMENT (Lib|Ajout|Retrait) O O (#PCDATA)>

<!-- combination of morphological features -->

<!ELEMENT CombTM - O EMPTY>

<!ATTLIST CombTM

id ID #REQUIRED

mode (SANS_M|%pMode) SANS_M

temps (SANS_T|%pTemps) SANS_T

personne (SANS_P|%pPersonne) SANS_P

genre (SANS_G|%pGenre) SANS_G

nombre (SANS_N|%pNombre) SANS_N

nombreposseur (SANS_NP|

%pNombrePosseur) SANS_NP>

<!-- ********************************************************* -->

<!-- ******* MORPHOLOGICAL DERIVATION ********** -->

<!-- ********************************************************* -->

<!ELEMENT Derivation - O (RestrictUm* & R_Derive+)>

<!ATTLIST Derivation

appellation CDATA #IMPLIED

commentaire CDATA #IMPLIED>

<!-- The contained list of R_Derive indicates the various

components of a derivation.

Competing derivations are recorded by placing several

Derivation elements on the derived unit.

RestrictUm applies here to the derived unit. -->

<!ELEMENT R_Derive - O (RestrictUm*)>

<!ATTLIST R_Derive

ordre_lineaire NUMBER #IMPLIED

statut (SANS_S|%pStatut) SANS_S

retraitg CDATA #IMPLIED

retraitp CDATA #IMPLIED

um IDREF #REQUIRED>

<!-- The um field designates the derivation component.

RestrictUm applies here to the derivation component. -->

<!ELEMENT RestrictUm - O EMPTY>

<!ATTLIST RestrictUm

nieme_umg NUMBER #IMPLIED

nieme_radg NUMBER #IMPLIED

nieme_ump NUMBER #IMPLIED

nieme_radp NUMBER #IMPLIED>

<!-- In the context of a Morphological Unit, this element expresses

a restriction on that unit by selecting one of its graphic and/or

phonemic variants (or radicals) -->

<!-- ********************************************************* -->

<!-- ******* SHORT FORM ********** -->

<!-- ********************************************************* -->

<!ELEMENT FormeBreve - O EMPTY>

<!ATTLIST FormeBreve

typebref (SANS_T|%pTypeBref) SANS_T

um IDREF #REQUIRED>

<!-- The attribute um designates the unit to be recorded as a short

form of the unit carrying the FormeBreve relation. -->

<!-- ********************************************************* -->

<!-- ******* MORPHOLOGICAL COMPOUNDING ********** -->

<!-- ********************************************************* -->

<!ELEMENT R_Compose - O (RestrictUm*)>

<!ATTLIST R_Compose

ordre_lineaire NUMBER #IMPLIED

separg (ATTAQUE_G|%pSeparg) ATTAQUE_G

separp (ATTAQUE_P|%pSeparp) ATTAQUE_P

um IDREF #REQUIRED

mfc IDREF #IMPLIED>

<!-- The attribute um designates the component Um_S/Aff/Agg/C.

The attribute ordre_lineaire gives the position of the component

within the compound. The attributes separg/p give the list of

possible separators before the component. -->

<!-- inflection modes of compounds -->

<!ELEMENT Mfc - O EMPTY>

<!ATTLIST Mfc

id ID #REQUIRED

%pGlose

comb_comb_l IDREFS #REQUIRED>

<!ELEMENT Comb_Comb - O EMPTY>

<!ATTLIST Comb_Comb

id ID #REQUIRED

contexte_var CDATA #IMPLIED

combcpose IDREF #REQUIRED

combcposant_l IDREFS #REQUIRED>

<!-- The Comb_Comb element relates a combination of inflectional

features of the Compound to one combination of inflectional

features of the Component, or to several when the compound admits

inflection variants. The attribute contexte_var labels the

inflection variants of compounds.

Ex: des pare-soleil(s)

The plural of the compound is formed from the singular

(old spelling) or from the plural of the component soleil

(new spelling).

The labels "ancienne orthographe" (old spelling) and "nouvelle

orthographe" (new spelling) are recorded in the attribute contexte_var.

A separator is therefore needed between the zones of this CDATA,

the order of the zones matching the order of the IDREFS

in combcposant_l

-> "ancienne orthographe | nouvelle orthographe" -->

<!-- ********************************************************* -->

<!-- ******* SIMPLIFYING MECHANISMS ********** -->

<!-- ********************************************************* -->

<!-- The following SHORTREFs allow the start-tags of the ajout and

retrait elements to be omitted; these elements can then

appear in the tagged file as two character strings

separated by a comma -->

<!ENTITY e-s-ajout "<ajout>" >

<!SHORTREF s-ajout

"," e-s-ajout >

<!USEMAP s-ajout retrait >

<!-- In the case <cff>,s</> where the retrait element

contains no PCDATA, the USEMAP inserts <retrait><ajout> -->

<!ENTITY e-s-retrait "<retrait><ajout>" >

<!SHORTREF s-retrait

"," e-s-retrait >

<!USEMAP s-retrait Cff >
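
The Cff mechanism declared in morpho.dtd (a Retrait/Ajout pair applied to the radical selected by nieme_radgp, where 0 stands for the label Lib) can be illustrated with a minimal Python sketch. The DTD does not state how Retrait is applied; the sketch assumes it is a string removed from the end of the selected radical, with Ajout appended in its place. The function name apply_cff and all sample data are hypothetical.

```python
def apply_cff(lib, radicals, retrait, ajout, nieme_radgp=0):
    """Build one inflected form from a Cff (Retrait, Ajout) pair.

    lib         -- label (Lib) of the Umg/Ump
    radicals    -- dict mapping a radical number (nieme) to its label
    nieme_radgp -- radical to start from; 0 means the label Lib itself
    """
    base = lib if nieme_radgp == 0 else radicals[nieme_radgp]
    # Assumption: Retrait is stripped from the END of the base string.
    if retrait:
        if not base.endswith(retrait):
            raise ValueError(f"{base!r} does not end with {retrait!r}")
        base = base[: -len(retrait)]
    return base + ajout


# Hypothetical data, not taken from a GENELEX lexicon.
plural = apply_cff("cheval", {}, "al", "aux")   # Retrait and Ajout both filled
plain = apply_cff("table", {}, "", "s")         # empty Retrait, as in <cff>,s</>
stem_based = apply_cff("pouvoir", {1: "peu"}, "", "x", nieme_radgp=1)
```

Keeping the radical lookup separate from the Retrait/Ajout step mirrors the split between Radg/Radp elements and the Cff elements of an Mfg/Mfp.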

 

 

 

 

 

10. Constraints morpho.ctr

<!--Consortium GENELEX @(#) morpho.ctr 3.2@(#) 94/06/23 14:12:39 -->

<!--CONTRAINTE Um_S

combve TYPE CombVE

etymon_l TYPE Etymon

usyn_l TYPE Usyn -->

<!--CONTRAINTE Um_C

combve TYPE CombVE

etymon_l TYPE Etymon

usyn_l TYPE Usyn -->

<!--CONTRAINTE Um_Agg

combve TYPE CombVE

etymon_l TYPE Etymon -->

<!--CONTRAINTE Um_Aff

combve TYPE CombVE

etymon_l TYPE Etymon

usem_aff_l TYPE Usem_Aff -->

<!--CONTRAINTE Umg

combve TYPE CombVE

mf TYPE Mfg -->

<!--CONTRAINTE Ump

combve TYPE CombVE

mf TYPE Mfp -->

<!--CONTRAINTE CombTM_Cff

combtm TYPE CombTM -->

<!--CONTRAINTE R_Derive

um TYPE (Um_S|Um_Agg

|Um_Aff) -->

<!--CONTRAINTE FormeBreve

um TYPE (Um_S|Um_C

|Um_Agg|Um_Aff) -->

<!--CONTRAINTE R_Compose

um TYPE (Um_S|Um_C

|Um_Agg|Um_Aff)

mfc TYPE Mfc -->

<!--CONTRAINTE Mfc

comb_comb_l TYPE Comb_Comb -->

<!--CONTRAINTE Comb_Comb

combcpose TYPE CombTM

combcposant_l TYPE CombTM -->
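
Each CONTRAINTE entry above restricts the element types that an IDREF or IDREFS attribute may point to. A minimal Python sketch of such a check follows. It encodes only a subset of morpho.ctr, and the data layout (a dictionary from ids to element types), the function name check_refs and the sample ids are assumptions made for illustration, not part of the GENELEX tools.

```python
# Subset of morpho.ctr: element type -> {attribute: allowed target types}
CONSTRAINTS = {
    "Umg": {"combve": {"CombVE"}, "mf": {"Mfg"}},
    "Ump": {"combve": {"CombVE"}, "mf": {"Mfp"}},
    "R_Derive": {"um": {"Um_S", "Um_Agg", "Um_Aff"}},
    "Comb_Comb": {"combcpose": {"CombTM"}, "combcposant_l": {"CombTM"}},
}


def check_refs(elem_type, attrs, id_types):
    """Return the list of constraint violations for one element.

    attrs    -- attribute name -> string of space-separated ids
    id_types -- id -> element type of the element carrying that id
    """
    errors = []
    for attr, allowed in CONSTRAINTS.get(elem_type, {}).items():
        for ref in attrs.get(attr, "").split():
            target = id_types.get(ref)
            if target is None:
                errors.append(f"{attr}: unknown id {ref}")
            elif target not in allowed:
                errors.append(
                    f"{attr}: {ref} is a {target}, expected one of {sorted(allowed)}"
                )
    return errors


# Hypothetical ids for illustration.
id_types = {"mfg1": "Mfg", "mfp1": "Mfp", "c1": "CombTM", "c2": "CombTM", "umc1": "Um_C"}
ok = check_refs("Umg", {"mf": "mfg1"}, id_types)
bad = check_refs("Umg", {"mf": "mfp1"}, id_types)          # an Umg must point to an Mfg
bad_derive = check_refs("R_Derive", {"um": "umc1"}, id_types)  # Um_C is not allowed here
```

An SGML parser only verifies that an IDREF resolves to some ID; a typed check of this kind is what the separate .ctr files make possible.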

11. Entities morpho.ent

<!--Consortium GENELEX @(#) morpho.ent 3.1@(#) 94/06/23 14:23:22 -->

<!-- ********************* NOTE TO USERS *************************

Your comments on the DTD will be reviewed by the GENELEX

consortium, which will distribute any new version that may

result from them.

*************************************************************** -->

<!ENTITY % pBooleen "OUI|NON" >

<!ENTITY % pDatation "ARCHAIQUE|VIEILLI|MODERNE" >

<!ENTITY % pNiveauLgue "FAMILIER|VULGAIRE|ARGOTIQUE|POPULAIRE

|LITTERAIRE|SAVANT|STANDARD" >

<!ENTITY % pFrequence "RARE|COURANT" >

<!ENTITY % pCatGram "NOM|ADJECTIF|ADVERBE|VERBE|PREPOSITION

|CONJONCTION|INTERJECTION|DETERMINANT|PRONOM

|PARTICULE" >

<!ENTITY % pSsCatGram "PROPRE|COMMUN|POSSESSIF|DEMONSTRATIF

|PARTITIF|DEFINI|INDEFINI|CARDINAL|ORDINAL

|EXCLAMATIF|QUALIFICATIF|INTERROGATIF

|RELATIF|COMPLETIF|COORDINATION|SUBORDINATION

|PERSONNEL_FORT|PERSONNEL_FAIBLE|IMPERSONNEL

|COMPARATIF_EGALITE|COMPARATIF_SUPERIORITE

|COMPARATIF_INFERIORITE

|SUPERLATIF_SUPERIORITE|SUPERLATIF_INFERIORITE

|SUPERLATIF_ABSOLU">

<!ENTITY % pMode "INDICATIF|SUBJONCTIF|CONDITIONNEL|IMPERATIF

|INFINITIF|PARTICIPE" >

<!ENTITY % pTemps "PRESENT|IMPARFAIT|PASSE_SIMPLE|FUTUR|PASSE" >

<!ENTITY % pPersonne "1|2|3" >

<!ENTITY % pGenre "MASCULIN|FEMININ|NEUTRE" >

<!ENTITY % pNombre "SINGULIER|PLURIEL" >

<!ENTITY % pNombrePosseur

"SINGULIER_POSSEUR|PLURIEL_POSSEUR" >

<!ENTITY % pTypaff "PREFIXE|SUFFIXE|INFIXE" >

<!ENTITY % pStatut "%pTypaff|BASE" >

<!ENTITY % pTypeBref "ABREVIATION|SIGLE|ACRONYME" >

<!ENTITY % pSeparg "TIRET|APOSTROPHE|ESPACE|JOINTURE

|TIRET_ESPACE|TIRET_JOINTURE

|TIRET_ESPACE_JOINTURE

|APOSTROPHE_JOINTURE" >

<!ENTITY % pSeparp "LIAISON_t|LIAISON_z|LIAISON_k

|LIAISON_n|LIAISON_r|FRONTIERE_MOT" >
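
Several of these parameter entities are defined in terms of others (pStatut reuses pTypaff), and the DTDs combine them with a SANS_x default, as in (SANS_S|%pStatut). The following Python sketch shows how such a value list unfolds. It is only an illustration of the substitution, not an SGML parser; the expand function is hypothetical and the dictionary copies three entities from morpho.ent.

```python
import re

# Three entities copied from morpho.ent.
ENTITIES = {
    "pTypaff": "PREFIXE|SUFFIXE|INFIXE",
    "pStatut": "%pTypaff|BASE",
    "pBooleen": "OUI|NON",
}

# A reference is '%' + name, optionally closed by ';'.
_REF = re.compile(r"%([A-Za-z][\w.-]*);?")


def expand(text, entities):
    """Recursively replace every parameter entity reference in text."""
    return _REF.sub(lambda m: expand(entities[m.group(1)], entities), text)


# Value list of the statut attribute of R_Derive: (SANS_S|%pStatut)
statut_values = expand("SANS_S|%pStatut", ENTITIES).split("|")
```

A reference to an entity missing from the dictionary raises KeyError, which is a convenient way to spot entities that are declared in another file, such as pCatGram when it is used from syntaxe.ent.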