Foundations and Methods in Combinatorial and Statistical Data Analysis and Clustering, 1st Edition
Israël César Lerman

Advanced Information and Knowledge Processing

Series editors
Lakhmi C. Jain
Bournemouth University, Poole, UK, and
University of South Australia, Adelaide, Australia

Xindong Wu
University of Vermont

Information systems and intelligent knowledge processing are playing an increasing
role in business, science and technology. Recently, advanced information systems
have evolved to facilitate the co-evolution of human and information networks
within communities. These advanced information systems use various paradigms
including artificial intelligence, knowledge management, and neural science as well
as conventional information processing paradigms. The aim of this series is to
publish books on new designs and applications of advanced information and
knowledge processing paradigms in areas including but not limited to aviation,
business, security, education, engineering, health, management, and science. Books
in the series should have a strong focus on information processing—preferably
combined with, or extended by, new results from adjacent sciences. Proposals for
research monographs, reference books, coherently integrated multi-author edited
books, and handbooks will be considered for the series and each proposal will be
reviewed by the Series Editors, with additional reviews from the editorial board and
independent reviewers where appropriate. Titles published within the Advanced
Information and Knowledge Processing series are included in Thomson Reuters’
Book Citation Index.

More information about this series at http://www.springer.com/series/4738


Israël César Lerman
Department of Data Knowledge
and Management
University of Rennes 1, IRISA
Rennes, Ille-et-Vilaine
France

ISSN 1610-3947 ISSN 2197-8441 (electronic)


Advanced Information and Knowledge Processing
ISBN 978-1-4471-6791-4 ISBN 978-1-4471-6793-8 (eBook)
DOI 10.1007/978-1-4471-6793-8

Library of Congress Control Number: 2016931997

© Springer-Verlag London 2016

The author(s) has/have asserted their right(s) to be identified as the author(s) of this work in accordance
with the Copyright, Design and Patents Act 1988.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by SpringerNature


The registered company is Springer-Verlag London Ltd.

I dedicate this work to Rollande and our three daughters Sabine, Alix and Judith
and their children

Preface

Relative to the basic notions of a descriptive attribute (variable) and an object
described, there are two fundamental concepts in Data Analysis: association
between attributes and similarity between objects. Given the description of objects
by attributes, the goal of data analysis methods is to propose a reduced
representation of the data which preserves as accurately as possible the relationships
between attributes and between objects. There are mainly two types of methods:
factorial and clustering. Factorial methods are geometric. For these, the
compression structure is obtained from a system of synthetic axes, called factorial axes.
The most discriminant of them are retained and substituted for the original
axes. Then, the set of data units (objects and also attributes) is represented by a
cloud of points placed in the geometrical space, referred to the new system of axes.
Clustering methods are combinatorial. The compression structure consists of an
organized system of proximity clusters. In our terminology, an equivalent term for
clustering is classification.
In our approach, clustering (classification) is considered as a central tool in data
analysis. The extensive development of this principle has led to a very rich
methodology. From this standpoint, the first facet of clustering concerns the
organization of the attribute set. This enables us to discover the behavioural
tendencies and subtendencies of the population studied from a sample of it, the latter
defining the object set. The second facet concerns the proximity organization of the
object set or of a category set induced from it. Behaviour understanding is provided by
the first facet and management control by the second. Geometrical factorial
analysis is often considered as a special tool of data analysis for structuring the
attribute set. Clustering attributes is a non-classical subject in the literature on data
analysis. Generally, the methods proposed for this problem consist of adapting
methods created for clustering an object set. Our approach is essentially different:
it clearly distinguishes the two dual problems of attribute clustering and object
clustering.
This book provides a broad synthesis and systematic treatment of the area of
clustering and combinatorial data analysis. A new vision of this very active field is
given. The methodological principles are very new in the data mining field. All
types of data structures are clearly represented and can be handled in a precise way:
qualitative data of any sort, quantitative data and contingency data. The methods
invented have been validated by many important and large applications. Their
theoretical foundations are clearly and strongly established from three points of view:
logical, combinatorial and statistical. In this way, the respective rationales of the
distinct methods are clearly set up.
As expressed above, the special structure we are interested in for a reduced
representation of the data is that obtained by clustering methods. A non-hierarchical
clustering algorithm on a finite set E, endowed with a similarity index, produces a
partition on E, whereas a hierarchical clustering algorithm on E produces an
ordered partition chain on E. This book is dominated by hierarchical clustering.
However, methods of non-hierarchical clustering are also considered (see below).
In Chap. 1 we study some formal and combinatorial aspects of the sought
mathematical structure: partition or ordered chain of partitions. More particularly,
two sides are developed. The first is enumerative and consists of counting chains in
the partition lattice or counting specific subsets in the partition set. In order to relate
the partition type and the cardinality of the equivalence relation graph associated
with it, we are led to study the organized set of the partitions of an integer. The second
important side concerns the mathematical representation of a partition and, more
generally and importantly, of an ordered chain of partitions on a finite set E. Thereby,
the relationships between the latter structure and numerical (resp., ordinal)
ultrametric spaces are established. In fact, the whole algorithmic development of a given
clustering method depends on the representation adopted. We end Chap. 1 by
showing the transition between the formalization of symmetrical hierarchical
clustering and that of directed hierarchical clustering, where junctions between
clusters are directed according to a total (also called “linear”) order on E.
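To make the two sides of Chap. 1 concrete, the following minimal sketch counts the partitions of a small finite set and turns an ordered partition chain into a distance that satisfies the ultrametric inequality. The function names (bell_number, chain_to_ultrametric, is_ultrametric) and the choice of the level index as the distance value are illustrative assumptions, not the book's notation.

```python
from itertools import combinations, permutations


def bell_number(n):
    """Count the partitions of an n-element set using the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]  # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


def chain_to_ultrametric(chain):
    """Map an ordered partition chain to a distance.

    Assumption: the chain goes from the finest partition to the one-class
    partition. d(x, y) is the first level at which x and y share a class.
    """
    elements = sorted(set().union(*chain[0]))
    d = {(x, x): 0 for x in elements}
    for x, y in combinations(elements, 2):
        for level, partition in enumerate(chain):
            if any(x in block and y in block for block in partition):
                d[x, y] = d[y, x] = level
                break
    return elements, d


def is_ultrametric(elements, d):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for every triple of distinct elements."""
    return all(d[x, z] <= max(d[x, y], d[y, z])
               for x, y, z in permutations(elements, 3))


# Hypothetical four-element example: each partition refines the next one.
chain = [
    [{"a"}, {"b"}, {"c"}, {"d"}],
    [{"a", "b"}, {"c"}, {"d"}],
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b", "c", "d"}],
]
elements, d = chain_to_ultrametric(chain)
print(bell_number(4), d["a", "b"], d["a", "c"], is_ultrametric(elements, d))
```

Any strictly increasing relabelling of the levels would give another ultrametric with the same ordinal content, which is one way to read the numerical versus ordinal distinction made above.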
Our method is focused on ascendant agglomerative hierarchical clustering
(AAHC). However, non-hierarchical clustering plays an important role in the
compression of data representation. This methodology addresses the problem of
clustering an object set and not that of an attribute set. Its philosophy is different
from that of hierarchical clustering. In these conditions, we describe in Chap. 2 two
fundamental and essentially different methods of non-hierarchical clustering. These
reflect two important families of non-hierarchical clustering algorithms: the
“central” partitions of S. Régnier and the “dynamic clustering” of E.
Diday. The latter is derived from a generalization of the “allocating and centring”
k-means algorithm, defined by D.J. Hall and G.H. Ball (see the references of the
chapter concerned). This method is discussed in this chapter. In addition,
new theoretical and software developments are mentioned.
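The “allocating and centring” idea can be sketched as follows. This is the plain textbook k-means loop with squared Euclidean distance and user-supplied initial centres, given only as an illustration; it is not the dynamic clustering generalization, and the function name k_means and the sample points are assumptions made for the example.

```python
def k_means(points, centres, max_iter=100):
    """Alternate allocation and centring until the centres stop moving.

    points  : list of equal-length numeric tuples
    centres : list of initial centres (tuples of the same length)
    """
    centres = list(centres)
    clusters = []
    for _ in range(max_iter):
        # Allocation step: attach each point to its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(
                range(len(centres)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centres[i])),
            )
            clusters[nearest].append(p)
        # Centring step: replace each centre by the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centres = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters


# Hypothetical data: two well-separated groups in the plane.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centres, clusters = k_means(points, [(0, 0), (10, 10)])
print(centres)
print(clusters)
```

The result depends on the initial centres; here they are fixed so that the run is deterministic.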
For the mathematical data representation, the descriptive attributes are interpreted
in terms of relations on the object set. Thereby, categorical attributes of any sort are
represented faithfully. In these conditions, numerical attributes are defined as
valued relations. Classical approaches, by contrast, propose the converse reasoning by
assigning, more or less arbitrarily, numerical values to categories.
In Chap. 3 we describe the set theoretic and relational representation of the data
description. All types of data can be taken into account. Two description levels are
considered: objects and categories. For each of the levels, object description and
category description, two attribute types are considered depending on the arity
of the representative relation on the object set, unary or binary. Notice that the arity
of the representative relation associated with a given attribute can be greater than
two; this case is also considered in our development. Thus, in this framework, we
define several structured attributes arising in the observation of real data.
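As an illustration of reading attributes as relations on the object set, the sketch below encodes a Boolean attribute as a unary relation (a subset of objects), a nominal attribute as a symmetric binary relation (pairs of objects sharing a category) and an ordinal attribute as a directed binary relation. This is one plausible encoding chosen for the example; the function names and sample data are hypothetical, and the formalism of Chap. 3 may differ in detail.

```python
from itertools import combinations


def boolean_as_unary_relation(values):
    """Unary relation: the subset of objects where the Boolean attribute is true."""
    return {obj for obj, flag in values.items() if flag}


def nominal_as_binary_relation(values):
    """Symmetric binary relation: unordered pairs of objects sharing a category."""
    return {frozenset(pair) for pair in combinations(values, 2)
            if values[pair[0]] == values[pair[1]]}


def ordinal_as_binary_relation(values, order):
    """Directed binary relation: ordered pairs (x, y) where the category of x
    comes strictly before the category of y in the given order."""
    rank = {category: i for i, category in enumerate(order)}
    return {(x, y) for x in values for y in values
            if rank[values[x]] < rank[values[y]]}


# Hypothetical description of three objects by three attributes.
smoker = {"o1": True, "o2": False, "o3": True}
colour = {"o1": "red", "o2": "blue", "o3": "red"}
grade = {"o1": "low", "o2": "high", "o3": "medium"}

print(boolean_as_unary_relation(smoker))
print(nominal_as_binary_relation(colour))
print(ordinal_as_binary_relation(grade, ["low", "medium", "high"]))
```

No numerical code is assigned to the categories at any point: only the relations they induce on the objects are used.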
The fundamental concept of resemblance between data units (attributes, objects
or categories) is studied in Chaps. 4–7. It is based on a deep development of a
similarity notion between combinatorial structures. Invariance properties of
statistical nature are set up. These lead to a constructive and unified theory of the
resemblance notion. Classical association coefficients such as the Goodman and
Kruskal, Kendall and Yule coefficients are clearly situated in the framework of this
theory. Two options are considered for normalization of the association coefficients
between descriptive attributes: standard deviation and maximum. A probability
scale, associated with the first normalization, is built in order to compare association
coefficients between attributes or similarity indices between objects (resp.,
categories). This scale is obtained by associating independent random data with the
observed one, the random model respecting the general characteristics of the data
observed. This comparison technique is a part of the likelihood linkage analysis
(LLA) clustering method, where an observed value of a numerical similarity index is
assessed by how unlikely so large a value would be under the random model.
Well-known non-parametric statistical theorems are needed for the application of
this approach to the attribute comparison. New theorems are established. Based on
the same principle, an index of
implication between Boolean attributes is set up. Also, we show how partial
association coefficients between structured categorical attributes are built.
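The normalization by standard deviation and the associated probability scale can be sketched for two Boolean attributes. The sketch assumes a specific random model that is not spelled out in this preface: the two attributes are treated as independent random subsets with fixed sizes (a hypergeometric model), and the probability scale is taken as the standard normal distribution function applied to the standardized index. The function names are hypothetical.

```python
import math


def standardized_association(a, b):
    """Centre and reduce the raw index s = card(A ∩ B).

    Assumption: under independence with fixed card(A) and card(B), s follows a
    hypergeometric law with the mean and variance used below. Both attributes
    must be non-constant, otherwise the variance is zero.
    """
    n = len(a)
    n_a, n_b = sum(a), sum(b)
    s = sum(1 for x, y in zip(a, b) if x and y)
    mean = n_a * n_b / n
    variance = n_a * n_b * (n - n_a) * (n - n_b) / (n ** 2 * (n - 1))
    return (s - mean) / math.sqrt(variance)


def probability_scale(q):
    """Standard normal distribution function, used here as the probability scale."""
    return 0.5 * (1.0 + math.erf(q / math.sqrt(2.0)))


# Hypothetical Boolean attributes observed on eight objects.
a = [1, 1, 1, 1, 0, 0, 0, 0]
b = [1, 1, 1, 0, 0, 0, 0, 0]
q = standardized_association(a, b)
print(q, probability_scale(q))
```

A value of the scale close to 1 means that an overlap as large as the one observed would be very unlikely under the assumed independence model.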
Comparing the objects described is not equivalent to comparing the descriptive
attributes. We show in Chap. 7 how the LLA approach enables similarity indices
between objects, described by heterogeneous attributes of different types, to be
built. We also show how comparing categories is a specific task.
The fascinating concept of a “natural” cluster of objects cannot be defined
mathematically. Its realization in real cases is expected as a result derived from
the application of clustering algorithms. Such a cluster is interpreted intuitively.
However, it is important to define it as accurately as possible. This definition is
necessarily a statistical one. Nevertheless, the statistical formalization of a “natural”
cluster is very difficult. In Chap. 8 we address this concept. Statistical tools are
established for understanding the meaning of such a cluster. For this purpose, the initial
description is examined for all types of data. Thus, the analysis of a “natural”
cluster is essentially analytical. Another way consists of crossing the target
cluster associated with a “natural” cluster with known and discriminant clusters that
are logically disjoint from it but statistically linked to it. A “natural” cluster is a part of a “natural”
clustering. Generally, this statistical structure underlies real data. However, it is
important to test this hypothesis for the data treated. In these conditions,
“classifiability” testing hypotheses are proposed and studied.
Whereas Chap. 8 is focused on the intrinsic analysis of clustering, Chap. 9 is
devoted to comparing clusterings or clustering trees on the same finite set endowed
with a similarity or dissimilarity index. In this chapter very powerful tools are
established for this comparison. Here, the similarity data is either numerical or
ordinal. A detailed analysis of the comparison criteria for both types (numerical or
ordinal) is provided. The criteria proposed have a combinatorial and non-parametric
statistical nature and they are extremely general. They are established with respect
to a probabilistic independence hypothesis between similarity and clustering
structures. This enables us to establish significant and unbiased comparisons.
As mentioned above, AAHC is considered in this book as a main tool for data
analysis. Starting with similarities or distances between data units (see Chaps. 4–7),
we show in Chap. 10 how to build a classification tree on the data set by means of
an agglomerative technique. The ordinal notion of pairwise similarities is treated
first. The natural transition to a numerical version of this notion is then shown. Defining a
dissimilarity between disjoint subsets of the set to be clustered is a fundamental task
in agglomerative hierarchical clustering. This dissimilarity is established from the
pairwise dissimilarities of data units. Two families of dissimilarity indices are
studied. The first is classical and employs distances and weightings. The second is
defined from probabilistic indices obtained in the context of the LLA approach. The
numerical dissimilarity indices between disjoint subsets of the data set enable
comparisons to be made between the clusters merged. The algorithmic analysis
of the clustering tree construction is a very important problem. Fundamental results
for this problem are reported in this chapter. Thus, we describe some basic solutions
provided for agglomerative hierarchical clustering of large data sets. Their
computational complexities are expressed. We end this chapter by showing the
transition between the usual symmetric hierarchical clustering and the directed one, where
junctions between the branches of the hierarchical tree are compatible with a total
order on the set clustered.
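The agglomerative construction described above can be sketched with a deliberately naive algorithm. The sketch uses a classical linkage (the minimum pairwise dissimilarity between two clusters, that is, single linkage) rather than the probabilistic LLA indices, and it makes no attempt at the efficient solutions for large data sets mentioned in the text. The function name agglomerate and the sample data are assumptions made for the example.

```python
def agglomerate(elements, dissim, linkage=min):
    """Build an ordered partition chain by repeatedly merging the two closest clusters.

    dissim  : function giving the dissimilarity between two data units
    linkage : how pairwise dissimilarities are aggregated between two disjoint
              clusters (min gives single linkage, max gives complete linkage)
    """
    clusters = [frozenset([e]) for e in elements]
    chain = [list(clusters)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                value = linkage(dissim(x, y)
                                for x in clusters[i] for y in clusters[j])
                if best is None or value < best[0]:
                    best = (value, i, j)
        _, i, j = best
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        chain.append(list(clusters))
    return chain


# Hypothetical data units placed on a line; dissimilarity is the absolute gap.
position = {"a": 0.0, "b": 1.0, "c": 5.0, "d": 6.5}
chain = agglomerate(sorted(position), lambda x, y: abs(position[x] - position[y]))
for partition in chain:
    print([sorted(block) for block in partition])
```

Recomputing every between-cluster dissimilarity at each step costs on the order of n³ elementary operations or more, which is why the algorithmic analysis of the tree construction matters for large data sets.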
In Chap. 11 we begin by describing the Classification Hiérarchique par Analyse
de la Vraisemblance des Liens (CHAVL) software. A link for accessing this software
is given in the References section. According to the LLA methodology, CHAVL
performs the AAHC of a descriptive attribute set or,
dually, of a described object (resp., category) set, and does so for a large family of data
table structures. In this chapter the results obtained by the LLA method on many
real cases are reported. These come from different areas: psychosociology,
sociological surveys, biology, bioinformatics, image data processing and rural
economy. The LLA hierarchical clustering method is applied in order to discover
“natural” clusters and behavioural tendencies in the population observed. The
cluster interpretation is based on the coefficients developed in Chaps. 4–8. In some
of these cases, the LLA results are compared with those of the Ward hierarchical
clustering method. In order to illustrate the different facets of applying
the LLA method, some of the real cases processed are presented in sufficient
detail.
The book ends with Chap. 12, devoted to a general conclusion in which several
routes for future research are outlined. Moreover, the contribution of the
book to challenges and advances in cluster analysis is clearly specified. Further, in
this chapter, the position of the book’s content with respect to other books in the
same field is described.
The starting point of the project of this book was a revised and completed
English translation of the French book:
Classification et analyse ordinale des données
published, with the support of the CNRS, by Dunod (Paris) in 1981.
The progress of my research, the works I encountered and the considerable development
of the field concerned mean that a single volume cannot suffice to cover the
entire material of the French book.
In the book we propose here, symmetrical synthetic structures for summarizing
data are considered. For these structures—defined by partitions or partition chains—
if x and y are two elements of the set E to be organized, the role of x with respect to y is
identical to that of y with respect to x.
The different steps of the passage from the data table to the synthetic structure
(partition or partition chain) on E are minutely studied. Recall that the set E to be
clustered may be an attribute set or an object set (resp., a category set).
The book we propose is a new book. With respect to the earlier
French version, it corresponds to a new writing, a new design and a much larger scope and
potential. The intuitive introductions, the examples and the mathematical
formalization and analysis of the subjects treated permit the reader to understand in depth
the different approaches in data analysis and clustering. Special attention is devoted
to expressing the relationships between these approaches. More precisely, the
development provided in this book has the following general distinctive and related
features:
1. Mathematical and statistical foundations of combinatorial data analysis and
clustering;
2. Mathematical, formal conception and properties are set up in order to compare
different approaches in the field concerned;
3. Definition of new methods, guided by a few fundamental principles taking into
account the formal analysis;
4. Applying new methods to real data.
More specific distinctive features might be listed as follows:
• Formal descriptions and specific mathematical properties of the synthetic
structures sought in clustering (partitions, partition chains (symmetrical and
directed));
• Emphasizing data description by categorical attributes of different sorts (broad
scope);
• Interpreting descriptive attributes in terms of relations on the object set
described;
• Set theoretic representation of the relations defined by the descriptive attributes;
• Very clear typology of data description in the most general case;
• Development of a unified association coefficient notion (symmetrical and
asymmetrical) between descriptive attributes of different sorts, including all
types of categorical attributes;
• Development of a similarity notion between objects or categories for different
types of description, including all types of categorical attributes;
• Probabilistic similarity measures between objects, object clusters, categories,
category clusters, attributes, attribute clusters, …;
• Clustering numerical or categorical descriptive attributes of different kinds;
• Clustering data units (objects or categories) described by a mixing of descriptive
attribute types;
• Dual association between object clustering and attribute clustering;
• Seriation and clustering;
• Combinatorial and non-parametric statistical basis for the association
coefficients, similarity indices and criteria in clustering;
• Algorithmic studies.
In the part of the French book not taken up here, the synthetic structures
summarizing the data are of an asymmetrical nature. Ordinal considerations play a part. The
chapters concerned, which may constitute a second volume, are Chapters 6–10.
Let me briefly give the subject of each of them.
• Chapter 6: Principal component analysis and correspondence analysis;
• Chapter 7: Mathematical comparisons between factorial analysis and
classification methods;
• Chapter 8: From combinatorial and statistical seriation methods to a family of
cluster analysis methods;
• Chapter 9: Totally ordering the whole set of categories associated with a set of
ordinal categorical attributes;
• Chapter 10: Assignation problems in pattern recognition between geometrical
figures where the quality measure of the assignation has to be independent of
specific geometrical transformations applied on the figures concerned.
As indicated in the title of the book, our work refers to Combinatorial and
Statistical Data Analysis. The importance of this methodology has already been
underlined in the well-known article “Combinatorial Data Analysis” by Phipps
Arabie and Lawrence Hubert, published in 1992.
This book is not conceived a priori as a “textbook”. It is the result of the research
I have conducted since 1966, with many collaborators (see below). Thus the main orientation is
“research”. However, the latter is placed in the framework of the entire domain
concerned. Moreover, a very important part of this research is oriented towards the
foundation and synthesis of different methods in combinatorial data analysis and
clustering. Consequently, this book is a reference book. It will be very useful to
master’s and Ph.D. students. Large parts of this book can be taught to students of
computer science, statistics and mathematics, as I have done myself.
Let me now cite, in alphabetical order, the names of the different collaborators who
have worked with me and participated in this research. Most often, but not always,
this collaboration took place around the preparation of theses and subsequent
articles. I especially thank them.
The theses defended at the University of Rennes 1 can be consulted at the link
address: Sadoc.abes.fr/Recherche avancée.

Collaborators

Jérôme Azé, Helena Bacelar-Nicolaü, Kaddour Bachar, Jean-Louis Buard, Thierry
Chantrel, Isaac Cohen-Hallaleh, Jean-Louis Cotrieux, François Daudé, Aziz Faraj,
Jean-Paul Geffrault, Nadia Ghazzali, Régis Gras, Sylvie Guillaume, Ivan
Kojadinovic, Pascale Kuntz, Jean-Yves Lafaye, Georges Lecalvé, Alain Léger,
Henri Leredde, Jean Rémi Massé, Annie Moreau, Roger Ngouënet, Fernando
Nicolaü Da Costa, Mohammed Ouali-Allah, Philippe Peter, Joaquim Pinto Da
Costa, Annick Prod’Homme, Habibullah Rostam, Valérie Rouat, François Rouxel,
Abdel Rahmane Sbii, Basavaneppa Tallur and Philippe Villoing.

Acknowledgements

Before proceeding, I express my gratitude to Dan A. Simovici, Professor at the
University of Massachusetts Boston (Department of Computer Science), for having
encouraged me to propose a new, recast English version of my book mentioned
above, “Classification et analyse ordinale des données”, published by Dunod (Paris)
in 1981.
I especially thank Fionn Murtagh, Professor of Data Science at the University of
Derby (Department of Computing and Mathematics), for having distributed the
French book among classics in Clustering, in the framework of the International
Federation of Classification Societies and the Journal of Classification. These
books are now on the site:
http://www.brclasssoc.org.uk/books/index.html
I am grateful to the Institut de Recherche en Informatique et Systèmes Aléatoires
(IRISA), where a great part of the research underlying the book was carried
out. Special thanks to the Directors Bruno Arnaldi and Jean-Marc Jezequel for having
supported my project.
I also thank Gilles Lesventes, Director of the ISTIC UFR Informatique-Électronique,
for welcoming me in his institute and for encouraging me.
I owe a debt of gratitude to the publisher DUNOD for having published the initial
French book and for having returned the copyright to me.
I also have immense gratitude to the editor of the series in which this book is
published as well as to the editorial staff of Springer, particularly to Helen Desmond,
who has always been attentive to my requests. I also thank James Robinson for taking care
of the production process.
Validating new methods of data analysis on real cases is a fundamental task. In
this regard, I especially thank the researchers in computing science who have worked
with me and built the efficient and elegant software needed for applying the LLA
methodology. Let me cite
Henri Leredde, Philippe Peter, Mohamed Ouali-Allah, Kaddour Bachar, Ivan
Kojadinovic and Basavaneppa Tallur.

Philippe Louarn (INRIA-Rennes) has defined the general LaTeX structure with
respect to which I have composed this book. He helped me many times and his help
was always valuable. I am very grateful to him.
I cannot conclude these acknowledgments without special thanks to my
son-in-law Benjamin Enriquez (Professor of Mathematics at the University Louis
Pasteur of Strasbourg). I regularly used to inform him about the progress of my
writing. His encouragement and advice have always been very beneficial.
Contents

1 On Some Facets of the Partition Set of a Finite Set
1.1 Lattice of Partition Set of a Finite Set
1.1.1 Definition and General Properties
1.1.2 Countings
1.2 Partitions of an Integer
1.2.1 Generalities
1.2.2 Representations
1.3 Type of a Partition and Cardinality of the Associated Equivalence Binary Relation
1.4 Ultrametric Spaces and Partition Chain Representation
1.4.1 Definition and Properties of Ultrametric Spaces
1.4.2 Partition Lattice Chains of a Finite Set and the Associated Ultrametric Spaces
1.4.3 Partition Lattice Chains and the Associated Ultrametric Preordonances
1.4.4 Partition Hierarchies and Dendrograms
1.4.5 From a Symmetrical Binary Hierarchy to a Directed Binary Hierarchy
1.5 Polyhedral Representation of the Partition Set of a Finite Set
References

2 Two Methods of Non-hierarchical Clustering
2.1 Preamble
2.2 Central Partition Method
2.2.1 Data Structure and Clustering Criterion
2.2.2 Transfer Algorithm and Central Partition
2.2.3 Objects with the Same Representation
2.2.4 Statistical Asymptotic Analysis
2.2.5 Remarks on the Application of the Central Partition Method and Developments
2.3 Dynamic and Adaptative Clustering Method
2.3.1 Data Structure and Clustering Criterion
2.3.2 The K-Means Algorithm
2.3.3 Dynamic Cluster Algorithm
2.3.4 Following the Definition of the Algorithm
References

3 Structure and Mathematical Representation of Data
3.1 Objects, Categories and Attributes
3.2 Representation of the Attributes of Type I
3.2.1 The Boolean Attribute
3.2.2 The Numerical Attribute
3.2.3 Defining a Categorical Attribute from a Numerical One
3.3 Representation of the Attributes of Type II
3.3.1 The Nominal Categorical Attribute
3.3.2 The Ordinal Categorical Attribute
3.3.3 The Ranking Attribute
3.3.4 The Categorical Attribute Valuated by a Numerical Similarity
3.3.5 The Valuated Binary Relation Attribute
3.4 Representation of the Attributes of Type III
3.4.1 The Preordonance Categorical Attribute
3.4.2 The Taxonomic Categorical Attribute
3.4.3 The Taxonomic Preordonance Attribute
3.4.4 Coding the Different Attributes in Terms of Preordonance or Similarity Categorical Attributes
3.5 Attribute Representations When Describing a Set C of Categories
3.5.1 Introduction
3.5.2 Attributes of Type I
3.5.3 Nominal or Ordinal Categorical Attributes . . . . . . . . . 138
3.5.4 Ordinal (preordonance) or Numerical Similarity
Categorical Attributes . . . . . . . . . . . . . . . . . . . . . .. 142
3.5.5 The Data Table: A Tarski System T or a Statistical
System S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 143
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 146
4 Ordinal and Metrical Analysis of the Resemblance Notion . . .... 149
4.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 149
4.2 Formal Analysis in the Case of a Description
of an Object Set O by Attributes of Type I; Extensions . .... 152
4.2.1 Similarity Index in the Case of Boolean Data . . .... 152
4.2.2 Preordonance Associated with a Similarity Index
in the Case of Boolean Data . . . . . . . . . . . . . . .... 165
4.3 Extension of the Indices Defined in the Boolean Case to
Attributes of Type II or III . . . . . . . . . . . . . . . . . . . . . . . . . 178
4.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
4.3.2 Comparing Nominal Categorical Attributes . . . . . . . . 180
4.3.3 Comparing Ordinal Categorical Attributes . . . . . . . . . 183
4.3.4 Comparing Preordonance Categorical Attributes . . . . . 191
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
5 Comparing Attributes by Probabilistic and Statistical
Association I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.2 Comparing Attributes of Type I for an Object Set Description
by the Likelihood Linkage Analysis Approach . . . . . . . . . . . . 201
5.2.1 The Boolean Case . . . . . . . . . . . . . . . . . . . . . . . . . 201
5.2.2 Comparing Numerical Attributes
in the LLA approach . . . . . . . . . . . . . . . . . . . . . . . . 221
5.3 Comparing Attributes for a Description of a Set
of Categories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
5.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
5.3.2 Case of a Description by Boolean Attributes . . . . . . . 234
5.3.3 Comparing Distributions of Numerical, Ordinal
Categorical and Nominal Categorical Attributes . . . . . 242
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
6 Comparing Attributes by a Probabilistic and Statistical
Association II. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
6.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
6.2 Comparing Attributes of Type II for an Object Set Description;
the LLA Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
6.2.1 Introduction; Alternatives in Normalizing
Association Coefficients . . . . . . . . . . . . . . . . . . . . . 252
6.2.2 Comparing Two Ranking Attributes . . . . . . . . . . . . . 256
6.2.3 Comparing Two Nominal Categorical Attributes . . . . 261
6.2.4 Comparing Two Ordinal Categorical Attributes . . . . . 276
6.2.5 Comparing Two Valuated Binary Relation
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
6.2.6 From the Total Association to the Partial One . . . . . . 309
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
7 Comparing Objects or Categories Described by Attributes. . .... 325
7.1 Preamble. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 325
7.2 Comparing Objects or Categories by the LLA Method. . . .... 328
7.2.1 The Outline of the LLA Method for Comparing
Objects or Categories . . . . . . . . . . . . . . . . . . . .... 328
7.2.2 Similarity Index Between Objects Described by
Numerical or Boolean Attributes . . . . . . . . . . . .... 331
7.2.3 Similarity Index Between Objects Described
by Nominal or Ordinal Categorical Attributes . . . . .. 334
7.2.4 Similarity Index Between Objects Described
by Preordonance or Valuated Categorical
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 338
7.2.5 Similarity Index Between Objects Described
by Taxonomic Attributes. A Solution
for the Classification Consensus Problem . . . . . . . .. 341
7.2.6 Similarity Index Between Objects Described
by Mixed Attribute Types: Heterogeneous
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 344
7.2.7 The Goodall Similarity Index. . . . . . . . . . . . . . . . .. 345
7.2.8 Similarity Index Between Rows of a Juxtaposition
of Contingency Tables . . . . . . . . . . . . . . . . . . . . .. 349
7.2.9 Other Similarity Indices on the Row Set I
of a Contingency Table. . . . . . . . . . . . . . . . . . . . .. 353
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 355
8 The Notion of “Natural” Class, Tools for Its Interpretation.
The Classifiability Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
8.1 Introduction; Monothetic Class and Polythetic Class . . . . . . . . 357
8.1.1 The Intuitive Approaches of Beckner and Adanson;
from Beckner to Adanson . . . . . . . . . . . . . . . . . . . . 360
8.2 Discriminating a Cluster of Objects by a Descriptive
Attribute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
8.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
8.2.2 Case of Attributes of Type I: Numerical
and Boolean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
8.2.3 Discriminating a Partition by a Categorical
Attribute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
8.3 “Responsibility” Degree of an Object in an Attribute Cluster
Formation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
8.3.1 A is Composed of Attributes of Type I. . . . . . . . . . . 370
8.3.2 The Attribute Set A is Composed of Categorical
or Ranking Attributes . . . . . . . . . . . . . . . . . . . . . . . 375
8.4 Rows or Columns of Contingency Tables . . . . . . . . . . . . . . . 377
8.4.1 Case of a Single Contingency Table . . . . . . . . . . . . . 377
8.4.2 Case of an Horizontal Juxtaposition of Contingency
Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
8.5 On Two Ways of Measuring the “Importance”
of a Descriptive Attribute . . . . . . . . . . . . . . . . . . . . . . . . . . 382
8.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
8.5.2 Comparing Clustering “Importance” and Projective
“Importance” of a Descriptive Attribute. . . . . . . . . . . 386
8.6 Crossing Fuzzy Categorical Attributes or Fuzzy Classifications
(Clusterings) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
8.6.1 General Introduction . . . . . . . . . . . . . . . . . . . . . . . . 391
8.6.2 Crossing Net Classifications; Introduction
to Other Crossings . . . . . . . . . . . . . . . . . . . . . . . . . 394
8.6.3 Crossing a Net and a Fuzzy Dichotomous
Classifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
8.6.4 Crossing Two Fuzzy Dichotomous Classifications . . . 404
8.6.5 Crossing Two Typologies . . . . . . . . . . . . . . . . . . . . 408
8.6.6 Extension to Crossing Fuzzy Relational Categorical
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
8.7 Classifiability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
8.7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
8.7.2 Discrepancy Between the Preordonance Structure
and that Ultrametric, on a Data Set. . . . . . . . . . . . . . 420
8.7.3 Classifiability Distribution Under a Random
Hypothesis of Non-ultrametricity . . . . . . . . . . . . . . . 426
8.7.4 The Murtagh Contribution . . . . . . . . . . . . . . . . . . . . 431
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
9 Quality Measures in Clustering. . . . . . . . . . . . . . . . . . . . . . .... 435
9.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 435
9.2 The Direct Clustering Approach: An Example
of a Criterion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
9.2.1 General Presentation . . . . . . . . . . . . . . . . . . . . . . . . 438
9.2.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
9.3 Quality of a Partition Based on the Pairwise Similarities . . . . . 443
9.3.1 Criteria Based on a Data Preordonance . . . . . . . . . . . 444
9.3.2 Approximating a Symmetrical Binary Relation
by an Equivalence Relation: The Zahn Problem . .... 451
9.3.3 Comparing Two Basic Criteria . . . . . . . . . . . . . .... 456
9.3.4 Distribution of the Intersection Criterion
on the Partition Set with a Fixed Type . . . . . . . .... 468
9.3.5 Extensions of the Previous Criterion . . . . . . . . . .... 474
9.3.6 “Significant Levels” and “Significant Nodes”
of a Classification Tree . . . . . . . . . . . . . . . . . . .... 483
9.4 Measuring the Fitting Quality of a Partition Chain
(Classification Tree) . . . . . . . . . . . . . . . . . . . . . . . . . . .... 489
9.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .... 489
9.4.2 Generalization of the Set Theoretic and Metrical
Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 490
9.4.3 Distribution of the Cardinality of the Graph
Intersection Criterion . . . . . . . . . . . . . . . . . . . .... 493
9.4.4 Pure Ordinal Criteria: The Lateral Order
and the Lexicographic Order Criteria . . . . . . . . . . . . 502
9.4.5 Lexicographic Ranking and Inversion Number
Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
10 Building a Classification Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
10.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
10.2 “Lexicographic” Ordinal Algorithm . . . . . . . . . . . . . . . . . . . 519
10.2.1 Definition of an Ultrametric Preordonance Associated
with a Preordonance Data . . . . . . . . . . . . . . . . . . . . 519
10.2.2 Algorithm for Determining ωu Defined by the H
Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
10.2.3 Property of Optimality . . . . . . . . . . . . . . . . . . . . . . 523
10.2.4 Case Where ω Is a Total Ordonance . . . . . . . . . . . . . 524
10.3 Ascendant Agglomerative Hierarchical Clustering Algorithm;
Classical Aggregation Criteria . . . . . . . . . . . . . . . . . . . . . . . 527
10.3.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
10.3.2 “Single Linkage”, “Complete Linkage”
and “Average Linkage” Criteria . . . . . . . . . . . . . . . . 528
10.3.3 “Inertia Variation (or Ward) Criterion” . . . . . . . . . . . 530
10.3.4 From “Lexicographic” Ordinal Algorithm
to “Single Linkage” or “Maximal Link”
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
10.4 AAHC Algorithms; Likelihood Linkage Criteria . . . . . . . . . . . 535
10.4.1 Family of Criteria of the Maximal Likelihood
Linkage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
10.4.2 Minimal Likelihood Linkage and Average Likelihood
Linkage in the LLA Analysis . . . . . . . . . . . . . . . . . . 545
10.5 AAHC for Clustering Rows or Columns of a Contingency
Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
10.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
10.5.2 Chi Square Criterion: A Transposition of the Ward
Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
10.5.3 Mutual Information Criterion . . . . . . . . . . . . . . . . . . 552
10.6 Efficient Algorithms in Ascendant Agglomerative Hierarchical
Classification (Clustering) . . . . . . . . . . . . . . . . . . . . . . . . . . 555
10.6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
10.6.2 Complexity Considerations of the Basic AAHC
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
10.6.3 Reactualization Formulas in the Cases of Binary
and Multiple Aggregations . . . . . . . . . . . . . . . . . . . 560
10.6.4 Reducibility, Monotonic Criterion, Reducible
Neighborhoods and Reciprocal Nearest
Neighborhoods. . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
10.6.5 Ascendant Agglomerative Hierarchical Clustering
(AAHC) Under a Contiguity Constraint . . . . . . . . . . . 572
10.6.6 Ascendant Agglomerative Parallel Hierarchical
Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
11 Applying the LLA Method to Real Data . . . . . . . . . . . . . . . . . . . 583
11.1 Introduction: the CHAVL Software (Classification
Hiérarchique par Analyse de la Vraisemblance des Liens) . . . . 583
11.2 Real Data: Outline Presentation of Some Processings . . . . . . . 586
11.3 Types of Child Characters Through Children's Literature. . . . . 590
11.3.1 Preamble: Technical Data Sheet . . . . . . . . . . . . . . . . 590
11.3.2 General Objective and Data Description . . . . . . . . . . 591
11.3.3 Profiles Extracted from the Classification
Tree on A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
11.3.4 Developments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
11.3.5 Standardized Association Coefficient with Respect
to the Hypergeometric Model. . . . . . . . . . . . . . . . . . 596
11.3.6 Return to Individuals . . . . . . . . . . . . . . . . . . . . . . . 597
11.4 Dayhoff, Henikoffs and LLA Matrices for Comparing Proteic
Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
11.4.1 Preamble: Technical Data Sheet . . . . . . . . . . . . . . . . 600
11.4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
11.4.3 Construction of the Dayhoff Matrix . . . . . . . . . . . . . 603
11.4.4 The Henikoffs Matrix: Comparison with the Dayhoff
Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612
11.4.5 The LLA Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 616
11.4.6 LLA Similarity Index on a Set of Proteic Aligned
Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
11.4.7 Some Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
11.5 Specific Results in Clustering Categorical Attributes
by LLA Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
11.5.1 Structuring the Sets of Values of Categorical
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
11.5.2 From Total Associations Between Categorical
Attributes to Partial Ones . . . . . . . . . . . . . . . . . . . . 632
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
12 Conclusion and Thoughts for Future Works . . . . . . . . . . . . . . . . 639
12.1 Contribution to Challenges in Cluster Analysis . . . . . . . . . . . 639
12.2 Around Two Books Concerning Relational Aspects . . . . . . . . 641
12.3 Developments in the Framework of the LLA Approach . . . . . . 643
12.3.1 Principal Component Analysis . . . . . . . . . . . . . . . . . 643
12.3.2 Multidimensional Scaling . . . . . . . . . . . . . . . . . . . . 644
12.3.3 In What LLA Hierarchical Clustering Method
Is a Probabilistic Method? . . . . . . . . . . . . . . . . . . . . 645
12.3.4 Semi-supervised Hierarchical Classification . . . . . . . . 645
12.4 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
Chapter 1
On Some Facets of the Partition Set
of a Finite Set

As indicated in the Preface, we start by describing mathematically the structure
sought in clustering: a partition, or an ordered chain of partitions, on a finite
set E. In recent years a new structure has appeared in which each partition class is
a linearly ordered subset of E. This structure is introduced in Sect. 1.4.5.
The set E to be clustered may be a set of objects (resp., categories) or a set of
descriptive attributes (see Chap. 3). Following common usage, and without loss of
generality, we take E to be a set O of objects. Nonetheless, the generic notation E
is kept in Sect. 1.4.5.

1.1 Lattice of Partition Set of a Finite Set

1.1.1 Definition and General Properties

Let $O$ be a finite set of $n$ elements. A partition of $O$ is a set of nonempty,
mutually disjoint subsets of $O$ whose union is the entire set $O$. These subsets are
called the classes of the partition. For example, $\{o_a, o_b, o_c, o_d\}$,
$\{o_e, o_f\}$ and $\{o_g\}$ are the classes of the partition
$\{\{o_a, o_b, o_c, o_d\}, \{o_e, o_f\}, \{o_g\}\}$ of
$O = \{o_a, o_b, o_c, o_d, o_e, o_f, o_g\}$.
We denote by $\mathcal{P}(O)$ the set of partitions of $O$. With each element $P$ of
$\mathcal{P}(O)$ we associate a binary relation on $O$, which can also be denoted by
$P$ without ambiguity. It is defined as follows:

\[ \forall (x, y) \in O \times O, \quad x\,P\,y \iff x \text{ and } y \text{ are in the same class of the partition } P \]

© Springer-Verlag London 2016 1


I.C. Lerman, Foundations and Methods in Combinatorial and Statistical
Data Analysis and Clustering, Advanced Information and Knowledge Processing,
DOI 10.1007/978-1-4471-6793-8_1

$P$ is an equivalence relation, that is to say, it is

• reflexive: $(\forall x \in O),\ x\,P\,x$;
• symmetrical: $(\forall (x, y) \in O \times O),\ x\,P\,y \iff y\,P\,x$;
• and transitive: $(\forall (x, y, z) \in O \times O \times O),\ x\,P\,y \text{ and } y\,P\,z \Rightarrow x\,P\,z$.

Clearly, there is a bijective correspondence between $\mathcal{P}(O)$ and the set
$Eq(O)$ of equivalence relations on $O$. To simplify notation, we write $\mathcal{P}$
for $\mathcal{P}(O)$ below.
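The definition can be checked mechanically on the example above. The following Python sketch is an illustration only and is not part of the original text: the helper names `is_partition` and `same_class` are hypothetical, and the objects are encoded by the letters a to g.

```python
from itertools import product


def is_partition(classes, universe):
    """Return True if `classes` is a partition of `universe`."""
    classes = [frozenset(c) for c in classes]
    if any(len(c) == 0 for c in classes):
        return False  # every class must be nonempty
    union = frozenset().union(*classes)
    # Classes are mutually disjoint exactly when their sizes add up to
    # the size of their union; the union must also be the whole set.
    return sum(len(c) for c in classes) == len(union) and union == frozenset(universe)


def same_class(classes, x, y):
    """The relation x P y: x and y lie in the same class."""
    return any(x in c and y in c for c in classes)


O = list("abcdefg")
P = [set("abcd"), set("ef"), set("g")]

reflexive = all(same_class(P, x, x) for x in O)
symmetric = all(same_class(P, x, y) == same_class(P, y, x)
                for x, y in product(O, repeat=2))
transitive = all(same_class(P, x, z)
                 for x, y, z in product(O, repeat=3)
                 if same_class(P, x, y) and same_class(P, y, z))
```

The three flags correspond to the three properties listed above. A family of overlapping subsets, such as {a, b} and {b, c}, is rejected by `is_partition`.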
The graph of a binary relation $P$ on $O$ is the subset of $O \times O$ defined by

\[ Gr(P) = \{(x, y) \mid x \in O,\ y \in O \text{ and } x\,P\,y\} \tag{1.1.1} \]

When $P$ is the equivalence relation associated with a partition
$P = \{O_k \mid 1 \le k \le K\}$, $Gr(P)$ can be written as

\[ Gr(P) = \bigcup_{1 \le k \le K} O_k \times O_k \tag{1.1.2} \]

Figure 1.1 shows $Gr(P)$ in the case of the example above.
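The two expressions for the graph can be compared directly. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the objects of the example are encoded by the letters a to g. It builds Gr(P) once from definition (1.1.1) and once from the class-wise form (1.1.2).

```python
from itertools import product


def graph_from_relation(classes, universe):
    """Gr(P) as in (1.1.1): all pairs (x, y) such that x P y."""
    return {(x, y) for x, y in product(universe, repeat=2)
            if any(x in c and y in c for c in classes)}


def graph_from_classes(classes):
    """Gr(P) as in (1.1.2): the union of O_k x O_k over the classes O_k."""
    pairs = set()
    for c in classes:
        pairs.update(product(c, repeat=2))
    return pairs


O = list("abcdefg")
P = [set("abcd"), set("ef"), set("g")]
gr = graph_from_classes(P)
```

For the example partition both constructions give the same set of 21 ordered pairs, that is, 4² + 2² + 1², the sum of the squared class sizes.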


The inclusion relation between subsets of the Cartesian product $O \times O$ provides
an order relation $<$ on $\mathcal{P}$:

\[ \forall (P, P') \in \mathcal{P} \times \mathcal{P}, \quad P < P' \iff Gr(P) \subset Gr(P') \tag{1.1.3} \]

Fig. 1.1 Graph of the equivalence relation P (rows and columns indexed by oa, ob, oc, od, oe, of, og)

which means

\[ \forall (x, y) \in O \times O, \quad x\,P\,y \Rightarrow x\,P'\,y \]

In this case, $P$ is said to be finer than $P'$, or “$P$ is a refinement of $P'$”.
Thus, relative to the set $O = \{o_a, o_b, o_c, o_d, o_e, o_f, o_g\}$, the partition

\[ P = \{\{o_a, o_b, o_c, o_d\}, \{o_e, o_f\}, \{o_g\}\} \]

is finer than

\[ P' = \{\{o_a, o_b, o_c, o_d\}, \{o_e, o_f, o_g\}\}. \]
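The refinement order can be tested in two equivalent ways: by inclusion of graphs, as in (1.1.3), or by checking that every class of the finer partition is contained in some class of the coarser one. The Python sketch below is an illustration with hypothetical function names. It assumes that both arguments are partitions of the same set and reads the inclusion in (1.1.3) as non-strict.

```python
def graph(classes):
    """Graph of the equivalence relation associated with a partition."""
    return {(x, y) for c in classes for x in c for y in c}


def is_finer(p, q):
    """p is finer than q in the sense of (1.1.3): Gr(p) is included in Gr(q)."""
    return graph(p) <= graph(q)


def is_finer_by_classes(p, q):
    """Equivalent test: every class of p lies inside some class of q."""
    return all(any(set(c) <= set(d) for d in q) for c in p)


P = [set("abcd"), set("ef"), set("g")]
P_prime = [set("abcd"), set("efg")]
```

On the example, `is_finer(P, P_prime)` holds while `is_finer(P_prime, P)` does not, and the class-based test agrees in both directions.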

This order relation endows $\mathcal{P}$ with a lattice structure; that is to say,
every pair $\{P, P'\}$ of elements of $\mathcal{P}$ has in $\mathcal{P}$ a greatest
lower bound $P \wedge P'$ and a least upper bound $P \vee P'$.
$P \wedge P'$ can be defined by the graph of the associated equivalence relation

\[ Gr(P \wedge P') = Gr(P) \cap Gr(P') \]

where $Gr(P)$ (resp., $Gr(P')$) is the graph of the equivalence relation associated
with $P$ (resp., $P'$).
$P \vee P'$ can also be defined from its graph: $Gr(P \vee P')$ is the graph of the
transitive closure of the binary relation “$P$ or $P'$”. More explicitly, for any
$(x, y) \in O \times O$, $x\,(P \vee P')\,y$ if and only if there exists a sequence
$(z_0, z_1, \ldots, z_l)$ with $z_0 = x$, $z_l = y$ and such that $z_i\,P\,z_{i+1}$ or
$z_i\,P'\,z_{i+1}$ for $i = 0, 1, \ldots, l-1$.
Example. Relative to the set $O$ above, consider
$P = \{\{o_a, o_b, o_c, o_d\}, \{o_e, o_f\}, \{o_g\}\}$ and
$P' = \{\{o_a, o_b\}, \{o_c, o_d\}, \{o_e, o_f, o_g\}\}$. Then

\[ P \wedge P' = \{\{o_a, o_b\}, \{o_c, o_d\}, \{o_e, o_f\}, \{o_g\}\}, \qquad P \vee P' = \{\{o_a, o_b, o_c, o_d\}, \{o_e, o_f, o_g\}\} \]
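Both bounds can be computed directly from the classes. The Python sketch below is one possible implementation, not the author's: the meet is obtained as the set of nonempty pairwise intersections of classes (an equivalent reading of the intersection of the two graphs), and the join uses a small union-find structure to realise the transitive closure of “P or P'”. All function names are hypothetical.

```python
def meet(p, q):
    """Greatest lower bound: nonempty intersections of a class of p with a class of q."""
    return {frozenset(c) & frozenset(d) for c in p for d in q
            if frozenset(c) & frozenset(d)}


def join(p, q):
    """Least upper bound: classes of the transitive closure of 'p or q'."""
    parent = {}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for c in list(p) + list(q):
        members = list(c)
        for x in members:
            parent.setdefault(x, x)
        # Link every element of the class to its first element.
        for x in members[1:]:
            parent[find(x)] = find(members[0])

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return {frozenset(g) for g in groups.values()}


P = [set("abcd"), set("ef"), set("g")]
P_prime = [set("ab"), set("cd"), set("efg")]
lower = meet(P, P_prime)
upper = join(P, P_prime)
```

On the example, `lower` has the four classes {a, b}, {c, d}, {e, f}, {g} and `upper` has the two classes {a, b, c, d}, {e, f, g}, in agreement with the bounds stated above.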

Clearly, the lattice $\mathcal{P}$ depends only on the cardinality $n$ of $O$. The
smallest element of $\mathcal{P}$ is the finest partition, in which each class is a
“singleton” class containing exactly one element of $O$. The greatest element of
$\mathcal{P}$ is the least fine partition of $O$, which consists of a single class
equal to $O$ itself. The finest and least fine partitions are regarded as “trivial”
partitions. They are called the partition of singletons and the singleton partition,
and are denoted below by $P_s$ and $P_t$, respectively.
(a) A partition $P'$ covers a partition $P$ if and only if

1. $P < P'$;
2. $\{Q \mid Q \in \mathcal{P},\ P < Q < P'\} = \,]P, P'[\, = \emptyset$.
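For a very small set, the covering condition can be verified by brute force over all partitions. The Python sketch below is illustrative only: the function names are hypothetical, the objects are the letters a to d, and the order in conditions 1 and 2 is read as strict (proper inclusion of graphs), which is an assumption about the intended meaning.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of sets."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in set_partitions(rest):
        # Put `first` into each existing class in turn ...
        for i in range(len(p)):
            yield p[:i] + [p[i] | {first}] + p[i + 1:]
        # ... or into a new class of its own.
        yield p + [{first}]


def graph(classes):
    return {(x, y) for c in classes for x in c for y in c}


def strictly_finer(p, q):
    return graph(p) < graph(q)  # proper inclusion of graphs


def covers(q, p, universe):
    """True if q covers p: p < q and no partition lies strictly between them."""
    if not strictly_finer(p, q):
        return False
    return not any(strictly_finer(p, r) and strictly_finer(r, q)
                   for r in set_partitions(list(universe)))


O = list("abcd")
P_s = [{x} for x in O]  # partition of singletons
P_t = [set(O)]          # singleton partition
upper_covers = [q for q in set_partitions(O) if covers(q, P_s, O)]
```

With four objects, the partitions found to cover the partition of singletons are exactly those obtained by merging two singleton classes, while the singleton partition does not cover it because intermediate partitions exist.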
[2095] See Fleischhafen.

[2096] See Attest.

[2097] See Eltern.

[2098] See (regarding Kitt) Abort.

[2099] See Stadt.

[2100] See abbetteln.

[2101] See Blut.

[2102] See abbrennen.

[2103] See bezahlen.

[2104] See abzahlen.

[2105] Cf. also in Romani: pleisserpenn, i.e. “payment, wages”, = earnings
(Verdienst); see Liebich, p. 251 in conjunction with p. 152.

[2106] See Aas.

[2107] See anbrennen.

[2108] See aufschlagen.

[2109] See schlagen.

[2110] See Ast.

[2111] See anreden; cf. (regarding Romani) note 2087 above.

[2112] See (for all three expressions) abkaufen.

[2113] See handeln.

[2114] See ankleiden.

[2115] See anlachen.

[2116] See abbetteln.

[2117] See Adler and anbrennen.

[2118] See abgehen.

[2119] See belügen.

[2120] Cf. (regarding the use of the noun as an adjective) “Preliminary Remarks”,
p. 15, note 38, end.

[2121] See abbrennen.

[2122] See Adler and abbiegen, or Brücke.

[2123] See ermorden.

[2124] See besonnen.

[2125] See Konkurs.

[2126] Substantivized participle of verdibern; cf. “Preliminary Remarks”, p. 15,
note 36.

[2127] See arg.

[2128] See aberwitzig.

[2129] See angenehm.

[2130] See abschließen.

[2131] See Ärger.

[2132] See absterben.

[2133] See Amme.

[2134] See essen.

[2135] See (for all three expressions) Abendessen.

[2136] See Brücke.

[2137] See Mastpulver.

[2138] See Adler.

[2139] See Gewerbeschein.

[2140] See (regarding Kitt) Abort.

[2141] See Entenstall.

[2142] See Ei.

[2143] See Fleischhafen.

[2144] See abschießen.

[2145] See anbeten.

[2146] See Entenfuß.

[2147] See Angesicht.

[2148] See abbeißen.

[2149] See alljährlich.

[2150] See belügen.

[2151] See anreden.

[2152] See ansagen.

[2153] See abgeben.

[2154] See abschreiben.

[2155] See absingen.

[2156] See aufspielen.

[2157] See Stadt.

[2158] See alltäglich.

[2159] See Abend and Abort.

[2160] See Adler and Gendarm; cf. Bischof.

[2161] See (regarding Schrende) Frauenstube.

[2162] See Degen and anbrennen.

[2163] See Eisenbahnwagen.

[2164] See abfahren.

[2165] See Betrug.

[2166] See (for both expressions) Ananas.

[2167] See Haselnuß.

[2168] See Aschenbecher.

[2169] See Bauch.

[2170] See Aas and Filzlaus. Among the Gypsies (according to Liebich, p. 258 in
conjunction with p. 166) the bedbug is paraphrased as platti tschūw or lōli tschuw,
i.e. “flat” or “red louse”; cf. also “Preliminary Remarks”, p. 18, note 47.

[2171] See Brücke.

[2172] See (regarding Pflader- [pfladeren]) abwaschen.

[2173] See Bauernfrau.

[2174] See Abort.

[2175] See abbrühen.

[2176] See Hahn.

[2177] See Henne.

[2178] See Fleischhafen.

[2179] See Mühle.

[2180] See Metzelsuppe.

[2181] See Adler.

[2182] See daher (at the end) and davongehen.

[2183] See Chaussee.

[2184] See (regarding Bich) Almosen.

[2185] See abfahren.

[2186] See abgehen.

[2187] See abbeißen.

[2188] See anschauen.

[2189] See anfassen.

[2190] See ausstehlen.

[2191] See abtragen.

[2192] See böse Frau.

[2193] See Frau.

[2194] See Bauernfrau.

[2195] See Amme.

[2196] See (regarding Malfes) Frauenrock.

[2197] See Frucht and Apfelbaum.

[2198] See Flurschütz.

[2199] See Apfelwein.

[2200] See Ananas.

[2201] See (regarding Sore) Brücke.

[2202] See ausweinen.

[2203] See Bierglas.

[2204] See Abort.

[2205] See Fleischhafen.

[2206] See Baumholz.

[2207] See Apfelbaum.

[2208] See angenehm and Bäcker.

[2209] See (for both expressions) Frucht.

[2210] See (for both expressions) Aas.

[2211] See Dietrich and Adler.

[2212] See bewerfen.

[2213] See abfallen.

[2214] See brauchbares Kind and Bett. Cf. among the Gypsies (according to Liebich,
p. 260): tschawéskĕro schukklepenn, i.e. roughly “child's swing”, = cradle (Wiege).

[2215] See (for both expressions) Gasthaus.

[2216] See abwaschen.

[2217] See Aas.

[2218] See gebären.

[2219] See (regarding begerisch) absterben.

[2220] See (regarding Rande) Bauch.

[2221] See Hauswirt.

[2222] See Abort.

[2223] See Hammel and Augenbrauen. Romani agrees in substance; see Liebich, p. 261
(bakoréngere balla, i.e. “sheep hairs”, = wool); cf. also Finck, p. 49 (bakréskero
bal = sheep's wool).

[2224] See Ärger.

[2225] See absterben and Amtmann.

[2226] See (regarding Dupf-) stechen.

[2227] See Leberwurst.

[2228] See (regarding Achilerei) essen.

[2229] See Abendessen.

[2230] See Aas.

[2231] See Brücke.

[2232] See Metzelsuppe.

[2233] See arg.

[2234] See bezahlen; cf. also “Preliminary Remarks”, p. 15, note 35 (Bereime [and
Zeine] probably = substantivized infinitives).

[2235] See (for both expressions) abzahlen; cf. (regarding Zeine) also the end of the
preceding note.

[2236] See abbeißen.

[2237] See (regarding Sins) Amtmann.

[2238] See Aas.

[2239] See behext.

[2240] See Betrug.

[2241] See Amme.

[2242] See Entenfuß and Daumen.

[2243] This curious designation already occurs in the Dolm. der Gaunerspr. 93 (in the
form Schofnase and with the meaning “Groschen”); otherwise, to my knowledge, it is
unknown in Rotwelsch and in the secret languages. Whether it is a metaphor, or how
else the paraphrase is to be explained (perhaps after a ruler's head with a “sheep's
nose” on a coin [hypothesis of Dr. A. Landau, Vienna]), remains doubtful.

[2244] See (regarding Linz-) anschauen.

[2245] See Attest.

[2246] See Kaffee.

[2247] See (regarding Schottel) Aschenbecher.

[2248] See abbrennen and Apfelkern. Romani agrees in substance here as well; see
Liebich, p. 262 (chadschēdo parr, i.e. “burnt stone” = brick (Ziegelstein); synonym:
lōlo parr, i.e. “red stone”). Both expressions also appear in Liebich, p. 180 under
“Backstein”, whereas Wittich's Jenisch gives only Kittleskies for it; cf.
“Preliminary Remarks”, p. 19, note 48.

[2249] See Pfeife.

[2250] See Pfeife and abbiegen.

[2251] See (regarding Reiber) Beutel.

[2252] See (regarding Rande) Bauch.

[2253] See Löwenzahn.

[2254] On Sende = Gypsy (Zigeuner), cf. (from the related group of sources): Sulzer
Zigeunerliste 1787 (252: die Sende = the Gypsies); W.-B. des Konst. Hans 257 (die
Sente [likewise plural]); Schwäb. Gauner- u. Kundenspr. 77 (Sendo = Gypsy); Schwäb.
Händlerspr. (Lütz. [215]: Sĭndo). The word is also cited repeatedly, in various forms
(Sende, Sente, Sande, Sinde, etc.), in the other thieves' and vagrants' cant,
especially of the 19th century (since Pfister 1812 [206]), and has survived into
modern times; see Groß 494 [Sinte; secondary meaning: comrade]; Rabben 123 [Sinter;
here too with the secondary meaning: comrade, accomplice]; Ostwald [Ku.] 143 [here
separated: Sinde = Gypsy; Sinter = accomplice]. On the etymology of the word, which
was borrowed from Romani (cf. “Introduction”, p. 30) and whose origin is still
uncertain, see further Pott I, pp. 32 ff. in conjunction with Liebich, p. 7, note 1.
Among the German Gypsies the form is, according to most vocabularies, sínto (plural
sīnte); see (besides Pott, loc. cit. and II, p. 239, and Liebich, pp. 159 and 262)
also Miklosich, Beitr. III, p. 19 and Finck, p. 85; in Jühling, p. 226, by contrast:
Sendo, plural Sendi; feminine Sendaza; cf. Sendeaza = “people of the Sendi”. On the
Romani synonym rom (properly “man”) see above under “Frau” (note on Romane, at the
end). The likewise synonymous designation mānuš (“mānusch”), properly “human being”
(cf. further Pott II, p. 446; Liebich, pp. 145 and 262; Miklosich, Beitr. III, p. 15
and Denkschriften, vol. 27, p. 10; Jühling, p. 224; Finck, p. 72), is missing from
Wittich's “Jenisch”, although it occurs several times, in altered form, in the
Rotwelsch of the 19th century (since Pfister in Christensen 1814 [326]) and has also
entered the Swabian traders' language (see Lütz. [215]: Manischer = Gypsy; cf. 488:
mônisch = Gypsy (adjective)); cf. Fischer, Schwäb. W.-B. IV, col. 1440 and also
Archiv, vol. 59, pp. 263, 64.

[2255] See (regarding Fi(e)sel) Bettelbube.

[2256] See Haushund.

[2257] See Beischläferin.

[2258] See Eisenbahnwagen.

[2259] See Frauenstube.

[2260] See (regarding -pflanzer) anbrennen.

[2261] On Rochus cf. (from the related group of sources): Schwäb. Gaun.- u.
Kundenspr. 77 (Roches or Broches = anger); Schwäb. Händlerspr. 488 (here only the
adjective prouches = angry; cf. in Pfedelb. [215]: broches = defiant). In the rest of
Rotwelsch the word occurs predominantly as an adjective (see e.g. Pfister 1812 [286:
brooges = angry, hostile] and often thereafter; with a Latin ending, brochus = angry,
in Krünitz' Enzyklopädie 1820 [349]; in the Handthierka 1820 [354]: braukes = angry;
in Thiele 236 and Fröhlich 1851 [395]: brauges, which A.-L. 592 also has alongside
b[e]roges [= angry, raging], as does Groß 459 [= angry, enraged]), but occasionally
also as a noun (see A.-L. 592 and Groß 487: Roges = unrest, anger, raging; likewise
Ostwald 123 [meaning: anger]). Etymology: Rochus (Latinized, as it were), more
correctly Roges, derives from Hebrew rôgez = “unrest, anger”; the adjective b(e)roges
etc. comes from berôges, i.e. “in anger”. Cf. A.-L. 592 (under “Roges”) and 454
(under “Rogas”) in conjunction with Fischer, Schwäb. W.-B. I, col. 1433.

[2262] According to Fischer, Schwäb. W.-B. IV, col. 1519, no. 2, massig in Swabian (as in other southern German dialects, e.g. in Alsace) means roughly: unreliable, stubborn (said of people and animals, e.g. horses), obstinate, angry, furious, impetuous, wild, coarse, rude, sullen, disagreeable, quarrelsome, and more, and is also used as a noun for "a rough, coarse person". Etymologically the expression probably belongs to modern Hebrew mazziq = "evil spirit, ruinous being" (from the Hebrew root nâzaq [cf. A.-L., p. 410 under "Nĕsack"]), which entered Rotwelsch as Massig or Massik = devil and is also known (in the form Massing and with the same meaning) to the Swabian traders' language. See Dolm. der Gaunerspr. 100 (Massig = devil); Pfulld. J.-W.-B. 345 (Massik); Schwäb. Händlerspr. 487 (Massing). Cf. Fischer, loc. cit.

[2263] See "Arrest"; cf. "Gefängnis".

[2264] See "arg" and "Abort"; cf. "Arrest" as well as the "Einleitung", p. 28 and p. 25, note 61.

[2265] See (on Lehm) "Bäcker".

[2266] See "Fingerhut".

[2267] See "abbrühen".

[2268] See "abbrennen".

[2269] See (on Spreisle) "Baumholz".

[2270] See "abgehen".

[2271] See "anschauen".

[2272] See "abgeben".

[2273] See "aufschlagen".

[2274] See "Ast".

[2275] See "abschließen".

[2276] See "Dietrich", "Adler", and "Bauer". The Gypsies (according to Liebich, p. 264) paraphrase the concept somewhat more simply as dikkno gādscho, i.e. "small man"; cf. above under "Riese".

[2277] Compounds formed with Blauhanze (or -hanse) are: Blauhanzestöber = plum tree; Blauhanzekies, -brandling, and -g’finkelter or -soruf = plum stone (pit), plum cake, and plum brandy. Compare (from the related group of sources): Schwäb. Händlerspr. 488 (Blauhansen = plums, alongside the synonymous Blauhosen [which is also known to the Pfulld. J.-W.-B. 346 and, in the form Blohosen, already to the Dolm. der Gaunerspr. 102]); see also Metzer Jenisch 218 (Blauhänsche = plum). For attestations of Blauhan[n]se in Rotwelsch (of the 19th and 20th centuries) see Groß’ Archiv, vol. 51, p. 145, note 3. See ibid. on the etymology (a personification, as it were, through combination with the proper name Hans); cf. also Pott II, pp. 9 and 36, and Günther, Rotwelsch, p. 84. On Blauling = plum see above.

[2278] See (on Stöber) "Apfelbaum".

[2279] See "Apfelkern".

[2280] See "Apfelkuchen".

[2281] See (for both expressions) "Branntwein"; cf. (on G’finkelter) also "behext".

[2282] See "Brücke".

[2283] I already mentioned in my "Vorbemerkung" (p. 3) that the "Sprachproben" (language samples) were shortened considerably, for reasons given there in detail. They originally comprised 46 numbers, which could be reduced to 35; in addition, several cuts were made within individual numbers (see especially No. 25). In translating the Jenisch conversations into German I have as a matter of principle kept Wittich's wording as far as possible and only here and there given individual passages a somewhat more fluent form. The Jenisch text is an important supplement to the "Wörterbuch", not only because it uses a number of words that were originally missing there (and were added by me only afterwards, with the label "Spr."), but above all because only here do we learn how the individual words are used in a concrete sentence. Whereas the dictionary, for example, reveals something about the gender of nouns only quite exceptionally, here nouns regularly appear together with the (definite or indefinite) article, and thus with their gender marked. In numerous cases, however, this gender differs from the one usual in our common language (cf., e.g., der Galm = das Kind [child], der Funk = das Feuer [fire], der Flu[h]te = das Wasser [water; cf. die Flut], der Stichling = die Gabel [fork] [but, in agreement with German and French, die Furschet], die Model = das Mädchen [girl], die Kitt = das Haus [house], etc.). Occasionally usage also seems to waver. Thus No. 7 has der Sore = die Sache [the thing] (in agreement with, among others, the W.-B. des Konst. Hans [254]), whereas in another place (No. 26) the word is used as a feminine (pflanzte Sore = die gemachte Ware [the finished goods]), which is also the case in the more recent Gaunersprache (cf., e.g., Ω Σ in Z. V, 429 and Rabben 124). In individual cases, however, no article has been used where the German text would lead us to expect one, e.g. in No. 23 (Schefft Schnall nobis bibrisch? = Is the soup not cold?); No. 25 (Wo schefft Fehte? = Where is the hostel?; ... pflanzet Schaffel auf = ... open the barn; bohlet Säuftling in Rädling = put [properly: throw] the beds into the wagon; ... pflanzet Strauberts = ... do [your] hair [German: die Haare]), and especially where another noun, provided with the (definite or indefinite) article, already precedes; cf. e.g. No. 11 (Ich schniff’ ein Rande und Stenz = I take a sack and a stick along); No. 19 (mit der dof Beizere und Beizer ... = ... with the good landlady and the landlord ...); No. 25 (Linze die dof Latt und Klass = [Just] look at the fine hunting knife and the gun).
Although otherwise, as with Rotwelsch, the grammar and syntax of "Jenisch" in use follow in principle the general rules of our mother tongue, conversations conducted among people of the lower classes naturally contain a good many deviations from the written language in this respect as well.
I. First, two peculiarities of popular speech may be mentioned that are not confined to individual dialects but are probably spread throughout Germany, namely:
1. that "for the sake of emphasis negations can be doubled (even tripled) without cancelling one another" (Polle-Weise, Wie denkt das Volk über die Sprache?, 3rd ed., Leipzig 1904, p. 108; for details cf. also R. Hildebrand, Ges. Aufsätze, Leipzig 1890, pp. 214 ff.). Examples: No. 20 (... der kemeret nobis keine Stiebe ... = ... he buys no brushes ...) and No. 25 (... ich spann’ nobis kei Kenem = ... I see no louse);
2. the confusion of dative and accusative in the personal pronouns (i.e. mir for mich, dir for dich, etc., and vice versa). Example: No. 16 (Ich baus’ mir = ich fürchte mich [I am afraid]).
II. The following peculiarities, by contrast, are confined to the dialects, especially the southern German ones (Bavarian-Swabian dialect):
1. the use of the nominative instead of the accusative with nouns. For the reverse case (use of the accusative for the nominative), which also occurs e.g. in Swabian (see Fischer, Schwäb. W.-B. II, col. 579 under "ein", no. I: das ist einen guten Mann), there is to my knowledge no example in Wittich's Jenisch. For the first-named peculiarity, however, it contains, apart from a few uncertain cases (in which the indefinite article ein could possibly also be taken as the accusative of a neuter), several unambiguous ones, e.g. No. 11 (... vielleicht bestiebe mer ein Schmaler = ... vielleicht bekommen wir eine Katze [perhaps we will get a cat]), No. 18 (... spann’ sein dofer Oberman = ... schau seinen schönen Hut [look at his fine hat]), No. 24 (... ich schwäch’ ein Stielingsjohle = ... ich trinke einen Birnenmost [I drink a pear cider]; ... schwächt ... Gefinkelter = ... trinket ... Branntwein [drink ... brandy]), No. 25 (... ich bestieb’ ein Stumpf = ... ich bekomme einen Zorn [I get angry]; ... der Ruch pflanzt ein linker Giel = ... der Bauer macht einen wüsten Mund [the farmer pulls an ugly face]), etc.;
2. the use of the relative local adverb wo instead of the relative pronoun welcher (-e, -es) or der (die, das), on which cf. among others v. Schmid, Schwäb. W.-B., pp. 536/37 and Schmeller, Bayer. W.-B. II, col. 828 (under "wo", lit. c). Examples: No. 21 (... in dem Mochem, wo man spannt = in the village that one sees); No. 25 (... Ulme, wo kasperet = people who practice magic).
III. Partly likewise confined to the dialects, but partly also generally popular, are certain changes (especially shortenings) of various (short) word classes, which incidentally occur only alongside the written-German forms, namely: 1) of the (definite and [more often] the indefinite) article; see No. 11 (d’ Schmaler = the cats); No. 18 (in de’ Griffling = in the hand; auf ’em Kiebes = on the head); No. 19 (vor ’m Jahne = a year ago); No. 25 (s’ Glied = the son; in ’s Steinhäufle = into the town); but especially (concerning a’ = ein [einer, eine]; cf. v. Schmid, Schwäb. W.-B., p. 1 and Fischer, Schwäb. W.-B. II, col. 578): No. 24 (a’ jenisches Model; a’ jenischer Fiesel); No. 25 (a’ Schuberle; a’ Schafnas’; a’ Finkelmoss); 2) of the adjectival numeral pronoun kein (-ner, -ne) = kei’ (cf. Fischer, Schwäb. W.-B. IV, col. 310); see No. 25 (kei’ Kenem = no louse); 3) of the possessive pronoun mein (-ner, -ne) = mei’; see e.g. No. 11 (mei’ Keiluf); No. 14 (mei’ Patris); No. 15 (mei’ Moss); No. 35 (mei’ Kluper); 4) of the personal pronouns in combination with verbs; cf. e.g. a) du = d’; see e.g. No. 13 (bis d’ umbohlst = until you fall over); b) dir = der; see No. 27 (Schmus der nobis = sag’ dir[’s] nicht [won't tell you]); c) dich = te in the imperative form schupfte (for: schupf dich) = stop (be quiet), which occurs e.g. in Nos. 20 and 25 [repeatedly]; d) ihm = (e)m; see No. 20 (ich schmus em’s = I will tell him); e) sie (nom. and acc.) = s(e); see e.g. No. 23 (hauret se ...? = is she ...?); No. 25 (ich ... bukle s’ = ich trage sie [I carry her/it]); No. 28 (schniff se = nimm sie [take her/it]); No. 32 (gneist se lore ...?); f) es (nom. and acc.) = ’s; see Nos. 8, 9, 18 (s’ schefft or s’ hauret ein Sins = it is a gentleman; ich spann’s = I see it; er gneist’s = he notices it); Nos. 19, 25 (s’ hauret = it is), and others; g) man = mer (cf. v. Schmid, Schwäb. W.-B., p. 382 under "mer", no. 1; Fischer, Schwäb. W.-B. IV, col. 1433 under "man"; also Schmeller, Bayer. W.-B. I, col. 1642 under "mir", lit. c); see No. 22 (... da bestiebt mer nobis = ... there one gets nothing); h) wir = mer or (somewhat more rarely) mir (cf. v. Schmid, loc. cit., p. 382 under "mer", no. 2 and p. 533 under "wir"; Fischer, loc. cit. IV, col. 1433 under "man", at the end; Schmeller, loc. cit. I, col. 1641 under "mir", lit. b); examples: α) for mer: No. 11 (boste mer = we go; bestiebe mer = we get); No. 19 (ruedle mer ...? = shall we drive ...?; butte mer ...? = shall we eat ...?); No. 25 (Wo schlaunet mer = Where do we sleep?), and others; β) for mir: No. 25 (Dann [Jetzt] pfichet mir in Sauft[linge] = then (now) we go to bed; bostet mir = we go; pflanzet mir Blatt = we spend the night in the open; bestiebet mir = we get); i) ihr = er; see No. 25 (durmet er noch nobis? = are you [pl.] not asleep yet?); No. 27 (hauret er? = are you [pl.]?); k) euch = ich; see No. 25 (schupfet ich = be quiet; der Koele muss ich bukele = the devil must [shall] take you). Often the personal pronouns are also omitted altogether; see e.g. No. 4 (hauerst begerisch? = are you ill?); No. 6 (was sicherst? = what are you cooking?); No. 13 (in Nolle hauret = it [namely the cider] is in the jug); No. 25 (spannst nobis = do you see nothing; dann scheffte schiebes = then I will go away; pflanze = I make), and others.
IV. Various abbreviations through omission of the final syllables (letters) or of the initial syllables, in nouns, adjectives, adverbs, and especially verbs, likewise agree (like the cases under III) with the general or at least dialectal speech of the people at large. Examples: 1) shortening by omission of the final syllable -e (-en): a) in nouns: No. 25 (a’ Schafnas’); b) in adjectives: among others No. 16 (die jenisch Moss); No. 19 (mit der dof Beizere); No. 25 (in die dof Duft; die dof Latt), etc.; c) in adverbs: No. 11 and often (heut’ [Leile] = today [tonight]); d) in verbs: here this usage is so frequent for the first person present and the imperative that it almost appears to be the rule; nevertheless the fuller forms are also still found in these cases, sometimes immediately beside the shorter ones; cf. e.g. (for the first person present) No. 16 (Ich boste und beschrenk’ = I go and lock up) and (for the imperative) No. 28 (Pflanz’, doge mir ein Funkerle = Come on, give me a match); 2) shortening by omission of the initial syllable (ge-): in verbs (participles): No. 17 (’buttet = eaten); No. 25 (ein’bascht = bought; ’pflanzte Sore = finished goods; ’dalft = begged); No. 33 (’dogt = given), etc. The transition to this is formed by g’ instead of ge-; see e.g. No. 24 (g’schallet = sung); No. 25 (abg’schunde Gleis, g’sprunkt, g’hauret, etc.).
V. Finally, a special peculiarity of Wittich's Jenisch (probably also attributable to dialectal influence) is the use of the ending -et instead of the -en usual in written German in several verb forms, namely for the infinitive and for the first and third person plural of the present. Several examples of this are already found in the W.-B. des Konstanzer Hans ("Schmusereyen"), whose similarities to our Jenisch are striking in other respects too (cf. already "Vorbemerkung", p. 3, note 4, p. 6, and in this note above, p. 73, as well as further below note 2284 on the "Jenisch Schnadahüpfel"). Examples: 1) for the infinitive: a) in the W.-B. des Konst. Hans: 256 and 258 (z’ malochet = to plunder; z’ holchet = to run); 259 (z’ kahlet und z’ schwächet = to eat and to drink); b) in Wittich's Sprachproben: No. 12 (z’ schwächet = for drinking [to drink]); No. 21 (z’ biket und z’ schwächet = to eat and to drink); No. 25 (z’ buttet = to eat; z’ dalfet = to beg); 2) for the first person plural of the present: a) in the W.-B. des Konstanzer Hans: 256 (Holchet mir ...? = Are we coming ...?); b) in Wittich's Sprachproben: No. 11 (vielleicht bestiebe mer ... und spannet = perhaps we will get ... and see); No. 18 (dass wir ... schmuset = that we ... speak); ibid. (wir pfichet = we go); No. 19 (Schwächet und butte mer ...? = Shall we drink and eat ...?); No. 20 (Wir zeinet ... und schefften schiebes = we pay ... and go away); No. 25 (wir kemeret = we buy, etc.); ibid. ([already cited above under No. III, 3 lit. h as evidence for the use of mer and mir = wir]: Wo schlaunet mer?; Jetzt pfichet mir in Sauft; bostet mir; pflanzet mir Blatt; bestiebet mir); 3) for the third person plural of the present: a) in the W.-B. des Konst. Hans: 256 (... den Kochem, die schlaunet = ... the thieves, who are sleeping; S’e schmuset = they say; Jetzt schwächet s’e = Now they drink); 260 (... Grandscharrle schefftet lau und Prinzen schefftet lau schofel = ... the Hatschiers [guards] are good for nothing, and the gentlemen are not at all severe); b) in Wittich's Sprachproben: No. 4 (Buz und Scharle hauret ... dof = constable and village mayor are ... good); No. 25 (Durmet die Schrawiner? = Are the children asleep?; herles pfichet Ulme = people are coming here; die Horboge hauret am Kaim = the cows belong to the Jew), and others. The other deviations from the written language that occur hardly need special emphasis or explanation.
[2284] According to the dictionary, ni(e)sich and nillich mean both stupid and crazy.

[2285] The word schureles, which occurs here in combination with "wo" (for "woher", from where), I have not entered in the Jenisch-German dictionary, because it seems very difficult to give a suitable German rendering for it (without regard to the whole sentence). (The simple "her" would hardly be clear enough.) In the Swabian traders' language of Unterdeufstetten (213), schurles is used for "fort!" (away!). I also leave open whether this adverb, perhaps like the verb schurele(n), may still be connected with the noun Schure (Schurele), which has the character of a stopgap word, or must be explained in some other way.

[2286] A literal translation of this idiom does not seem really possible. It has therefore not been entered in the dictionary.

[2287] Wittich remarked in a note here that he refrained from translating these "Schnadahüpfel" because in part their sense can easily be worked out with the help of the Jenisch-German dictionary, while in part (as e.g. with No. 3) a rendering of the Jenisch obscenities in German hardly seems possible. I can only agree. The reasons why I have cut nothing from these "Schnadahüpfel", despite their coarse content, are given in my "Vorbemerkung", pp. 3, 4.

[2288] Numbers 1 and 2 (or 4) of the "Schnadahüpfel" agree strikingly (as already mentioned in the "Vorbemerkung", p. 3, note 4) almost entirely in content, and partly also in form, with "a few stanzas from Jauner-Lieder" printed at the end of the "Wörterbuch des Konstanzer Hans" of 1791 (in Kluge, Rotw. I, p. 260). Since Wittich assured me in reply to an inquiry that the W.-B. des Konstanzer Hans had been entirely unknown to him, one can hardly avoid assuming that these are old traditions from the heyday of the German Gaunertum (the criminal underworld), preserved into the present, which among the "Jenisch people" have undergone some changes only in outward form. The older version of No. 1 reads (according to Kluge, loc. cit.) as follows:

Ey lustig seyn Kanofer (die Diebe, Schorne)


Dann sia thun nichts als Schofle;
Wann sia kenne Rande fülla
Und brav mit der Sore springa.
Hei ja! Vi va!
Grandscharrle, was machst du da?

On Kanofer, which is also known to the Pfulld. J.-W.-B. 338 (Kanoffer = thief; cf. 339, 343, 345) and also occurs elsewhere in Rotwelsch, see Fischer, Schwäb. W.-B. IV, col. 193, who connects the word primarily with Jewish chonef = "hypocrite, swindler", chanufa = "hypocrisy" (cf. also Weigand in the "Intelligenzblatt für die Provinz Oberhessen", 1846, no. 74, p. 300 [under no. 13]), but adds that it is "probably (also) not without connection to ganfen, to steal".

[2289] On this number (and also on No. 4) compare the following version in the "Konstanzer Hans":

Schicksal, was hot auch der Kochem g’schmußt,


Wia er ist abgeholcht von dier?
Er hat g’schmußt: Wann er vom Schornen holch,
Scheft er gleich wieder zu mier.