Book Details
Uncertainty Modeling for Data Mining: A Label Semantics Approach (Chinese title: 基于不确定性建模的数据挖掘)
Author: Zengchang Qin; edited by Zengchang Qin and Yongchuan Tang
Publisher: Zhejiang University Press
Publication date: 2014-02-01
ISBN: 9787308121064
List price: ¥120.00
About the Book
Uncertainty modeling is an important research area in artificial intelligence. Real-world problems are complicated by uncertainty, and probability theory, fuzzy set theory, and other theories of uncertainty all aim to model it with mathematical tools in order to solve practical problems. The main content of this book comes from the authors' recent research on the foundations and applications of a new uncertainty theory. Label Semantics, motivated by how humans describe the natural world in language, models uncertainty with labels (words). Through research in recent years, the theory has grown into a distinctive framework by combining fuzzy set theory and probability theory. This book presents the theoretical foundations of uncertainty modeling based on Label Semantics together with its latest applications; in particular, the recently proposed prototype theory interpretation of Label Semantics closely integrates statistical learning with fuzzy set theory and opens a new path for the application of fuzzy theory.

The first author has worked on uncertainty modeling since 2001, collaborating with Professor Lawry, the founder of Label Semantics, on some of the foundational work in the theory. The second author collaborated with Professor Lawry in 2009 to propose the prototype theory interpretation, currently the most advanced work in this area. This book is the first monograph on Label Semantics published in China and the only monograph internationally that covers the theory's applications and latest developments. The theory presented here reflects recent international research results and can help drive the development of this field in China; moreover, since the theory is still evolving, it may draw more Chinese scientists into this line of research and thereby advance artificial intelligence research in China toward the international frontier.

Label Semantics is mainly concerned with modeling uncertain information and can be applied broadly to data mining problems. The book works through several concrete applications; for example, in flood forecasting, the geological and hydrological information of a river basin is used to generate highly interpretable rules for judging the likelihood of flooding. Other worked examples include function approximation, the sunspot cycle, and medical diagnosis.

The book focuses on the theoretical framework of Label Semantics and its relationship to other theories, emphasizing how it combines fuzzy theory and probability theory. It introduces, for the first time, the concept of transparency for data mining algorithms, presents a quantitative study of the trade-off between an algorithm's transparency and its performance, and compares the results with other commonly used algorithms. It also describes several "transparent" data mining models based on Label Semantics that the authors have proposed in recent years, with detailed algorithm descriptions and discussions of their strengths and weaknesses.

The book further gives the first account of the prototype theory interpretation of Label Semantics, which brings statistical learning methods into the modeling of uncertain systems. Detailed proofs are accompanied by worked examples and intuitive figures. The material is presented accessibly and is highly readable, so readers can gain a good understanding of the whole theoretical framework; definitions and proofs come with corresponding examples and solutions that deepen the reader's grasp of the concepts.

Unlike traditional fuzzy logic approaches, the book uses fuzzy sets to describe the vagueness of words (labels) in language and represents an uncertain concept by a probability distribution over the sets of possible labels. The theory provides a powerful mathematical tool for understanding uncertainty. This is the first Chinese monograph on this uncertainty theory. The book is organized as follows:
Preface: the foundations, significance, research progress, and application prospects of the framework.
Part I: Label Semantics theory.
Part II: Algorithms and applications based on Label Semantics.
Part III: The prototype theory reinterpretation of Label Semantics.
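To make the central idea concrete, the following minimal Python sketch illustrates the mechanism described above: each label's appropriateness for a numeric value is given by a fuzzy membership function, and a consonant mass assignment converts the ordered degrees into a probability distribution over sets of labels. The label names ("low", "medium", "high"), the triangular membership functions, and the helper function names are illustrative assumptions for this sketch, not definitions taken from the book.

```python
# Minimal sketch of Label Semantics-style uncertainty modeling (illustrative only).
# Appropriateness degrees of individual labels come from fuzzy membership functions;
# a consonant mass assignment then spreads probability over *sets* of labels.

def appropriateness(x, labels):
    """Return {label: degree in [0, 1]} for a numeric value x."""
    return {name: mu(x) for name, mu in labels.items()}

def consonant_mass_assignment(degrees):
    """Map ordered appropriateness degrees to masses on nested label sets."""
    ranked = sorted(degrees.items(), key=lambda kv: kv[1], reverse=True)
    masses, focal, prev = {}, [], 1.0
    for name, deg in ranked:
        if prev - deg > 0:
            masses[frozenset(focal)] = prev - deg   # mass on the current label set
        focal.append(name)
        prev = deg
    if prev > 0:
        masses[frozenset(focal)] = prev             # mass on all appropriate labels
    return masses

# Illustrative triangular memberships for a temperature-like variable (assumed shapes).
labels = {
    "low":    lambda x: max(0.0, min(1.0, (20 - x) / 10)),
    "medium": lambda x: max(0.0, 1 - abs(x - 20) / 10),
    "high":   lambda x: max(0.0, min(1.0, (x - 20) / 10)),
}

degs = appropriateness(17.0, labels)      # {'low': 0.3, 'medium': 0.7, 'high': 0.0}
print(degs)
print(consonant_mass_assignment(degs))    # probability mass over sets of labels
```

For x = 17, this sketch assigns mass 0.3 to the empty set, 0.4 to {medium}, and 0.3 to {medium, low}, showing how belief is spread over the sets of labels that might appropriately describe a value rather than over single fuzzy memberships.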
About the Authors
Zengchang Qin (author); Zengchang Qin and Yongchuan Tang (editors)
Table of Contents
1 Introduction
1.1 Types of Uncertainty
1.2 Uncertainty Modeling and Data Mining
1.3 Related Works
References
2 Induction and Learning
2.1 Introduction
2.2 Machine Learning
2.2.1 Searching in Hypothesis Space
2.2.2 Supervised Learning
2.2.3 Unsupervised Learning
2.2.4 Instance-Based Learning
2.3 Data Mining and Algorithms
2.3.1 Why Do We Need Data Mining?
2.3.2 How Do We Do Data Mining?
2.3.3 Artificial Neural Networks
2.3.4 Support Vector Machines
2.4 Measurement of Classifiers
2.4.1 ROC Analysis for Classification
2.4.2 Area Under the ROC Curve
2.5 Summary
References
3 Label Semantics Theory
3.1 Uncertainty Modeling with Labels
3.1.1 Fuzzy Logic
3.1.2 Computing with Words
3.1.3 Mass Assignment Theory
3.2 Label Semantics
3.2.1 Epistemic View of Label Semantics
3.2.2 Random Set Framework
3.2.3 Appropriateness Degrees
3.2.4 Assumptions for Data Analysis
3.2.5 Linguistic Translation
3.3 Fuzzy Discretization
3.3.1 Percentile-Based Discretization
3.3.2 Entropy-Based Discretization
3.4 Reasoning with Fuzzy Labels
3.4.1 Conditional Distribution Given Mass Assignments
3.4.2 Logical Expressions of Fuzzy Labels
3.4.3 Linguistic Interpretation of Appropriate Labels
3.4.4 Evidence Theory and Mass Assignment
3.5 Label Relations
3.6 Summary
References
4 Linguistic Decision Trees for Classification
4.1 Introduction
4.2 Tree Induction
4.2.1 Entropy
4.2.2 Soft Decision Trees
4.3 Linguistic Decision Trees for Classification
4.3.1 Branch Probability
4.3.2 Classification by LDT
4.3.3 Linguistic ID3 Algorithm
4.4 Experimental Studies
4.4.1 Influence of the Threshold
4.4.2 Overlapping Between Fuzzy Labels
4.5 Comparison Studies
4.6 Merging of Branches
4.6.1 Forward Merging Algorithm
4.6.2 Dual-Branch LDTs
4.6.3 Experimental Studies for Forward Merging
4.6.4 ROC Analysis for Forward Merging
4.7 Linguistic Reasoning
4.7.1 Linguistic Interpretation of an LDT
4.7.2 Linguistic Constraints
4.7.3 Classification of Fuzzy Data
4.8 Summary
References
5 Linguistic Decision Trees for Prediction
5.1 Prediction Trees
5.2 Linguistic Prediction Trees
5.2.1 Branch Evaluation
5.2.2 Defuzzification
5.2.3 Linguistic ID3 Algorithm for Prediction
5.2.4 Forward Branch Merging for Prediction
5.3 Experimental Studies
5.3.1 3D Surface Regression
5.3.2 Abalone and Boston Housing Problem
5.3.3 Prediction of Sunspots
5.3.4 Flood Forecasting
5.4 Query Evaluation
5.4.1 Single Queries
5.4.2 Compound Queries
5.5 ROC Analysis for Prediction
5.5.1 Predictors and Probabilistic Classifiers
5.5.2 AUC Value for Prediction
5.6 Summary
References
6 Bayesian Methods Based on Label Semantics
6.1 Introduction
6.2 Naive Bayes
6.2.1 Bayes Theorem
6.2.2 Fuzzy Naive Bayes
6.3 Fuzzy Semi-Naive Bayes
6.4 Online Fuzzy Bayesian Prediction
6.4.1 Bayesian Methods
6.4.2 Online Learning
6.5 Bayesian Estimation Trees
6.5.1 Bayesian Estimation Given an LDT
6.5.2 Bayesian Estimation from a Set of Trees
6.6 Experimental Studies
6.7 Summary
References
7 Unsupervised Learning with Label Semantics
7.1 Introduction
7.2 Non-Parametric Density Estimation
7.3 Clustering
7.3.1 Logical Distance
7.3.2 Clustering of Mixed Objects
7.4 Experimental Studies
7.4.1 Logical Distance Example
7.4.2 Images and Labels Clustering
7.5 Summary
References
8 Linguistic FOIL and Multiple Attribute Hierarchy for Decision Making
8.1 Introduction
8.2 Rule Induction
8.3 Multi-Dimensional Label Semantics
8.4 Linguistic FOIL
8.4.1 Information Heuristics for LFOIL
8.4.2 Linguistic Rule Generation
8.4.3 Class Probabilities Given a Rule Base
8.5 Experimental Studies
8.6 Multiple Attribute Decision Making
8.6.1 Linguistic Attribute Hierarchies
8.6.2 Information Propagation Using LDT
8.7 Summary
References
9 A Prototype Theory Interpretation of Label Semantics
9.1 Introduction
9.2 Prototype Semantics for Vague Concepts
9.2.1 Uncertainty Measures about the Similarity Neighborhoods Determined by Vague Concepts
9.2.2 Relating Prototype Theory and Label Semantics
9.2.3 Gaussian-Type Density Function
9.3 Vague Information Coarsening in Theory of Prototypes
9.4 Linguistic Inference Systems
9.5 Summary
References
10 Prototype Theory for Learning
10.1 Introduction
10.1.1 General Rule Induction Process
10.1.2 A Clustering Based Rule Coarsening
10.2 Linguistic Modeling of Time Series Predictions
10.2.1 Mackey-Glass Time Series Prediction
10.2.2 Prediction of Sunspots
10.3 Summary
References
11 Prototype-Based Rule Systems
11.1 Introduction
11.2 Prototype-Based IF-THEN Rules
11.3 Rule Induction Based on Data Clustering and Least-Square Regression
11.4 Rule Learning Using a Conjugate Gradient Algorithm
11.5 Applications in Prediction Problems
11.5.1 Surface Prediction
11.5.2 Mackey-Glass Time Series Prediction
11.5.3 Prediction of Sunspots
11.6 Summary
References
12 Information Cells and Information Cell Mixture Models
12.1 Introduction
12.2 Information Cell for Cognitive Representation of Vague Concept Semantics
12.3 Information Cell Mixture Model (ICMM) for Semantic Representation of Complex Concept
12.4 Learning Information Cell Mixture Model from Data Set
12.4.1 Objective Function Based on Positive Density Function
12.4.2 Updating Probability Distribution of Information Cells
12.4.3 Updating Density Functions of Information Cells
12.4.4 Information Cell Updating Algorithm
12.4.5 Learning Component Number of ICMM
12.5 Experimental Study
12.6 Summary
References