Description |
1 online resource (226 pages) |
Series |
Mathematics in Mind |
|
Contents |
Intro -- Contents -- Part I Language as a Complex System -- 1 Introduction -- 1.1 Aims -- 1.2 Structure of This Book -- 1.3 Position of This Book -- 1.3.1 Statistical Universals as Computational Properties of Natural Language -- 1.3.2 A Holistic Approach to Language via Complex Systems Theory -- 1.4 Prospectus -- 2 Universals -- 2.1 Language Universals -- 2.2 Layers of Universals -- 2.3 Universal, Stylized Hypothesis, and Law -- 3 Language as a Complex System -- 3.1 Sequence and Corpus -- 3.1.1 Definition of Corpus -- 3.1.2 On Meaning -- 3.1.3 On Infinity -- 3.1.4 On Randomness |
|
3.2 Power Functions -- 3.3 Scale-Free Property: Statistical Self-Similarity -- 3.4 Complex Systems -- 3.5 Two Basic Random Processes -- Part II Property of Population -- 4 Relation Between Rank and Frequency -- 4.1 Zipf's Law -- 4.2 Scale-Free Property and Hapax Legomena -- 4.3 Monkey Text -- 4.4 Power Law of n-grams -- 4.5 Relative Rank-Frequency Distribution -- 5 Bias in Rank-Frequency Relation -- 5.1 Literary Texts -- 5.2 Speech, Music, Programs, and More -- 5.3 Deviations from Power Law -- 5.3.1 Scale -- 5.3.2 Speaker Maturity -- 5.3.3 Characters vs. Words -- 5.4 Nature of Deviations |
|
6 Related Statistical Universals -- 6.1 Density Function -- 6.2 Vocabulary Growth -- Part III Property of Sequences -- 7 Returns -- 7.1 Word Returns -- 7.2 Distribution of Return Interval Lengths -- 7.3 Exceedance Probability -- 7.4 Bias Underlying Return Intervals -- 7.5 Rare Words as a Set -- 7.6 Behavior of Rare Words -- 8 Long-Range Correlation -- 8.1 Long-Range Correlation Analysis -- 8.2 Mutual Information -- 8.3 Autocorrelation Function -- 8.4 Correlation of Word Intervals -- 8.5 Nonstationarity of Language -- 8.6 Weak Long-Range Correlation -- 9 Fluctuation -- 9.1 Fluctuation Analysis |
|
9.2 Taylor Analysis -- 9.3 Differences Between the Two Fluctuation Analyses -- 9.4 Dimensions of Linguistic Fluctuation -- 9.5 Relations Among Methods -- 10 Complexity -- 10.1 Complexity of Sequence -- 10.2 Entropy Rate -- 10.3 Hilberg's Ansatz -- 10.4 Computing Entropy Rate of Human Language -- 10.5 Reconsidering the Question of Entropy Rate -- Part IV Relation to Linguistic Elements and Structure -- 11 Articulation of Elements -- 11.1 Harris's Hypothesis -- 11.2 Information-Theoretic Reformulation -- 11.3 Accuracy of Articulation by Harris's Scheme -- 12 Word Meaning and Value |
|
12.1 Meaning as Use and Distributional Semantics -- 12.2 Weber-Fechner Law -- 12.3 Word Frequency and Familiarity -- 12.4 Vector Representation of Words -- 12.5 Compositionality of Meaning -- 12.6 Statistical Universals and Meaning -- 13 Size and Frequency -- 13.1 Zipf Abbreviation of Words -- 13.2 Compound Length and Frequency -- 14 Grammatical Structure and Long Memory -- 14.1 Simple Grammatical Framework -- 14.2 Phrase Structure Grammar -- 14.3 Long-Range Dependence in Sentences -- 14.4 Grammatical Structure and Long-Range Correlation -- 14.5 Nature of Long Memory Underlying Language |
Summary |
This volume explores the universal mathematical properties underlying big language data and possible reasons why such properties exist, revealing how we may be unconsciously mathematical in our language use. These properties are statistical and thus differ from the linguistic universals that help describe the variation among human languages, and they can only be identified over a large accumulation of usages. The book provides an overview of state-of-the-art findings on these statistical universals, with Zipf's law as a well-known example, and reconsiders the nature of language accordingly. Its main focus lies in explaining the property of long memory, which was discovered and studied more recently by borrowing concepts from complex systems theory. The statistical universals may not only be a precursor of language system formation; they also highlight the qualities of language that remain weak points in today's machine learning. In summary, this book provides an overview of language's global properties. It will be of interest to anyone engaged in fields related to language and computing or statistical analysis methods, with an emphasis on researchers and students in computational linguistics and natural language processing. While the book does apply mathematical concepts, every effort has been made to speak to a non-mathematical audience as well by communicating mathematical content intuitively, with concise examples taken from real texts. |
Notes |
Part V Mathematical Models |
Bibliography |
Includes bibliographical references and index |
Notes |
Print version record |
Subject |
Mathematical linguistics.
|
|
Computational linguistics.
|
|
Lingüística matemática
|
|
Lingüística matemàtica.
|
|
Lingüística computacional.
|
Genre/Form |
Llibres electrònics.
|
Form |
Electronic book
|
ISBN |
9783030593773 |
|
3030593770 |
|